Reconstructing liver shape and position from MR image slices using an active shape model
NASA Astrophysics Data System (ADS)
Fenchel, Matthias; Thesen, Stefan; Schilling, Andreas
2008-03-01
We present an algorithm for fully automatic reconstruction of the 3D position, orientation and shape of the human liver from a sparsely covering set of n 2D MR slice images. Reconstructing the shape of an organ from slice images can be used for scan planning, for surgical planning or other purposes where 3D anatomical knowledge has to be inferred from sparse slices. The algorithm is based on adapting an active shape model of the liver surface to a given set of slice images. The active shape model is created from a training set of liver segmentations from a group of volunteers. The training set is set up with semi-manual segmentations of T1-weighted volumetric MR images. Searching for the optimal shape model that best fits the image data is done by maximizing a similarity measure based on local appearance at the surface. Two different algorithms for the active shape model search are proposed and compared: both algorithms seek to maximize the a-posteriori probability of the grey level appearance around the surface while constraining the surface to the space of valid shapes. The first algorithm works by using grey value profile statistics in the normal direction. The second algorithm uses average and variance images to calculate the local surface appearance on the fly. Both algorithms are validated by fitting the active shape model to abdominal 2D slice images and comparing the reconstructed shapes to the manual segmentations and to the results of active shape model searches from 3D image data. The results turn out to be promising and competitive with active shape model segmentations from 3D data.
Measuring Constraint-Set Utility for Partitional Clustering Algorithms
NASA Technical Reports Server (NTRS)
Davidson, Ian; Wagstaff, Kiri L.; Basu, Sugato
2006-01-01
Clustering with constraints is an active area of machine learning and data mining research. Previous empirical work has convincingly shown that adding constraints to clustering improves the performance of a variety of algorithms. However, in most of these experiments, results are averaged over different randomly chosen constraint sets from a given set of labels, thereby masking interesting properties of individual sets. We demonstrate that constraint sets vary significantly in how useful they are for constrained clustering; some constraint sets can actually decrease algorithm performance. We create two quantitative measures, informativeness and coherence, that can be used to identify useful constraint sets. We show that these measures can also help explain differences in performance for four particular constrained clustering algorithms.
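The two measures lend themselves to a compact illustration. The following is a minimal sketch, not the authors' code: it approximates informativeness as the fraction of constraints violated by an unconstrained k-means clustering, and coherence as a crude distance-based agreement between must-link and cannot-link pairs; both simplifications are assumptions made for illustration.

```python
# Sketch (not the authors' code) of simplified constraint-set utility measures.
# Constraints are (i, j) index pairs. Informativeness: fraction of constraints
# violated by a clustering obtained *without* constraints. Coherence: fraction
# of must-link pairs closer together than the average cannot-link pair.
import numpy as np
from sklearn.cluster import KMeans

def informativeness(X, must_links, cannot_links, k):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    violated = sum(labels[i] != labels[j] for i, j in must_links)
    violated += sum(labels[i] == labels[j] for i, j in cannot_links)
    total = len(must_links) + len(cannot_links)
    return violated / total if total else 0.0

def coherence(X, must_links, cannot_links):
    ml = [np.linalg.norm(X[i] - X[j]) for i, j in must_links]
    cl = [np.linalg.norm(X[i] - X[j]) for i, j in cannot_links]
    if not ml or not cl:
        return 1.0
    return float(np.mean([d < np.mean(cl) for d in ml]))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(5, 1, (20, 2))])
    must = [(0, 1), (2, 3), (20, 21)]
    cannot = [(0, 20), (1, 25)]
    print(informativeness(X, must, cannot, k=2), coherence(X, must, cannot))
```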
Automatic Earthquake Detection by Active Learning
NASA Astrophysics Data System (ADS)
Bergen, K.; Beroza, G. C.
2017-12-01
In recent years, advances in machine learning have transformed fields such as image recognition, natural language processing and recommender systems. Many of these performance gains have relied on the availability of large, labeled data sets to train high-accuracy models; labeled data sets are those for which each sample includes a target class label, such as waveforms tagged as either earthquakes or noise. Earthquake seismologists are increasingly leveraging machine learning and data mining techniques to detect and analyze weak earthquake signals in large seismic data sets. One of the challenges in applying machine learning to seismic data sets is the limited labeled data problem; learning algorithms need to be given examples of earthquake waveforms, but the number of known events, taken from earthquake catalogs, may be insufficient to build an accurate detector. Furthermore, earthquake catalogs are known to be incomplete, resulting in training data that may be biased towards larger events and contain inaccurate labels. This challenge is compounded by the class imbalance problem; the events of interest, earthquakes, are infrequent relative to noise in continuous data sets, and many learning algorithms perform poorly on rare classes. In this work, we investigate the use of active learning for automatic earthquake detection. Active learning is a type of semi-supervised machine learning that uses a human-in-the-loop approach to strategically supplement a small initial training set. The learning algorithm incorporates domain expertise through interaction between a human expert and the algorithm, with the algorithm actively posing queries to the user to improve detection performance. We demonstrate the potential of active machine learning to improve earthquake detection performance with limited available training data.
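The human-in-the-loop strategy described here can be sketched as a generic pool-based uncertainty-sampling loop. The code below is an illustrative assumption, not the authors' system: the seismologist is simulated by an oracle label array, and a logistic-regression detector stands in for whatever classifier is actually used.

```python
# Sketch of pool-based uncertainty sampling: start from a small labeled set,
# repeatedly query the examples the current detector is least certain about,
# and retrain. The "expert" is simulated by an array of true labels; in
# practice a seismologist would label the queried waveforms.
import numpy as np
from sklearn.linear_model import LogisticRegression

def active_learning_loop(X, oracle_labels, seed_idx, n_rounds=10, batch=5):
    labeled = list(seed_idx)
    pool = [i for i in range(len(X)) if i not in labeled]
    clf = LogisticRegression(max_iter=1000)
    for _ in range(n_rounds):
        clf.fit(X[labeled], oracle_labels[labeled])
        proba = clf.predict_proba(X[pool])[:, 1]
        uncertainty = -np.abs(proba - 0.5)           # closest to 0.5 = most uncertain
        query = [pool[i] for i in np.argsort(uncertainty)[-batch:]]
        labeled.extend(query)                         # expert labels the queried items
        pool = [i for i in pool if i not in query]
    return clf, labeled

rng = np.random.default_rng(0)
X = rng.normal(0, 1, (500, 10))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)         # stand-in for earthquake vs noise
model, labeled = active_learning_loop(X, y, seed_idx=range(20))
print(len(labeled), model.score(X, y))
```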
Active learning for clinical text classification: is it better than random sampling?
Figueroa, Rosa L; Zeng-Treitler, Qing; Ngo, Long H; Goryachev, Sergey; Wiechmann, Eduardo P
2012-01-01
This study explores active learning algorithms as a way to reduce the requirements for large training sets in medical text classification tasks. Three existing active learning algorithms (distance-based (DIST), diversity-based (DIV), and a combination of both (CMB)) were used to classify text from five datasets. The performance of these algorithms was compared to that of passive learning on the five datasets. We then conducted a novel investigation of the interaction between dataset characteristics and the performance results. Classification accuracy and area under receiver operating characteristics (ROC) curves for each algorithm at different sample sizes were generated. The performance of active learning algorithms was compared with that of passive learning using a weighted mean of paired differences. To determine why the performance varies on different datasets, we measured the diversity and uncertainty of each dataset using relative entropy and correlated the results with the performance differences. The DIST and CMB algorithms performed better than passive learning. With a statistical significance level set at 0.05, DIST outperformed passive learning in all five datasets, while CMB was found to be better than passive learning in four datasets. We found strong correlations between the dataset diversity and the DIV performance, as well as the dataset uncertainty and the performance of the DIST algorithm. For medical text classification, appropriate active learning algorithms can yield performance comparable to that of passive learning with considerably smaller training sets. In particular, our results suggest that DIV performs better on data with higher diversity and DIST on data with lower uncertainty.
Recognition of military-specific physical activities with body-fixed sensors.
Wyss, Thomas; Mäder, Urs
2010-11-01
The purpose of this study was to develop and validate an algorithm for recognizing military-specific, physically demanding activities using body-fixed sensors. To develop the algorithm, the first group of study participants (n = 15) wore body-fixed sensors capable of measuring acceleration, step frequency, and heart rate while completing six military-specific activities: walking, marching with backpack, lifting and lowering loads, lifting and carrying loads, digging, and running. The accuracy of the algorithm was tested in these isolated activities in a laboratory setting (n = 18) and in the context of daily military training routine (n = 24). The overall recognition rates during isolated activities and during daily military routine activities were 87.5% and 85.5%, respectively. We conclude that the algorithm adequately recognized six military-specific physical activities based on sensor data alone, both in a laboratory setting and in the military training environment. By recognizing the type of physical activity, this objective method provides additional information for military job descriptions.
The UXO Classification Demonstration at San Luis Obispo, CA
2010-09-01
2.17.2 Active Learning Training and Test Set: SIG also used active learning [12]. Active learning, an alternative approach for constructing a training set, is used in conjunction with either supervised or semi-supervised learning; the optimized algorithm was applied to only the unlabeled data in the test set.
Banda, Jorge A; Haydel, K Farish; Davila, Tania; Desai, Manisha; Bryson, Susan; Haskell, William L; Matheson, Donna; Robinson, Thomas N
2016-01-01
To examine the effects of accelerometer epoch lengths, wear time (WT) algorithms, and activity cut-points on estimates of WT, sedentary behavior (SB), and physical activity (PA). 268 7-11 year-olds with BMI ≥ 85th percentile for age and sex wore accelerometers on their right hips for 4-7 days. Data were processed and analyzed at epoch lengths of 1-, 5-, 10-, 15-, 30-, and 60-seconds. For each epoch length, WT minutes/day was determined using three common WT algorithms, and minutes/day and percent time spent in SB, light (LPA), moderate (MPA), and vigorous (VPA) PA were determined using five common activity cut-points. ANOVA tested differences in WT, SB, LPA, MPA, VPA, and MVPA when using the different epoch lengths, WT algorithms, and activity cut-points. WT minutes/day varied significantly by epoch length when using the NHANES WT algorithm (p < .0001), but did not vary significantly by epoch length when using the ≥ 20 minute consecutive zero or Choi WT algorithms. Minutes/day and percent time spent in SB, LPA, MPA, VPA, and MVPA varied significantly by epoch length for all sets of activity cut-points tested with all three WT algorithms (all p < .0001). Across all epoch lengths, minutes/day and percent time spent in SB, LPA, MPA, VPA, and MVPA also varied significantly across all sets of activity cut-points with all three WT algorithms (all p < .0001). The common practice of converting WT algorithms and activity cut-point definitions to match different epoch lengths may introduce significant errors. Estimates of SB and PA from studies that process and analyze data using different epoch lengths, WT algorithms, and/or activity cut-points are not comparable, potentially leading to very different results, interpretations, and conclusions, misleading research and public policy.
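A small sketch may help make the two processing choices concrete. The code below re-aggregates 1-second counts into longer epochs and applies per-minute cut-points scaled to the epoch length; the cut-point values and the scaling rule are placeholders for illustration, not the specific published cut-points compared in the study.

```python
# Sketch of the two processing choices the study varies: (1) re-aggregating
# raw 1-s activity counts into longer epochs and (2) classifying each epoch
# against intensity cut-points. Cut-point values below are placeholders.
import numpy as np

def reaggregate(counts_1s, epoch_len_s):
    """Sum 1-s counts into epochs of epoch_len_s seconds."""
    n = len(counts_1s) // epoch_len_s * epoch_len_s
    return counts_1s[:n].reshape(-1, epoch_len_s).sum(axis=1)

def classify_epochs(epoch_counts, epoch_len_s, cutpoints_per_min):
    """Label each epoch SB/LPA/MPA/VPA after scaling per-minute cut-points."""
    scale = epoch_len_s / 60.0                       # the common (error-prone) conversion
    sb, mpa, vpa = (c * scale for c in cutpoints_per_min)
    labels = np.full(len(epoch_counts), "LPA", dtype=object)
    labels[epoch_counts <= sb] = "SB"
    labels[(epoch_counts >= mpa) & (epoch_counts < vpa)] = "MPA"
    labels[epoch_counts >= vpa] = "VPA"
    return labels

counts = np.random.default_rng(1).poisson(30, size=3600)    # one hour of 1-s counts
for epoch in (1, 15, 60):
    ep = reaggregate(counts, epoch)
    lab = classify_epochs(ep, epoch, cutpoints_per_min=(100, 2296, 4012))
    print(epoch, {k: int((lab == k).sum()) for k in ("SB", "LPA", "MPA", "VPA")})
```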
Hemmateenejad, Bahram; Akhond, Morteza; Miri, Ramin; Shamsipur, Mojtaba
2003-01-01
A QSAR algorithm, principal component-genetic algorithm-artificial neural network (PC-GA-ANN), has been applied to a set of newly synthesized calcium channel blockers, which are of special interest because of their role in cardiac diseases. A data set of 124 1,4-dihydropyridines bearing different ester substituents at the C-3 and C-5 positions of the dihydropyridine ring and nitroimidazolyl, phenylimidazolyl, and methylsulfonylimidazolyl groups at the C-4 position with known Ca(2+) channel binding affinities was employed in this study. Ten different sets of descriptors (837 descriptors) were calculated for each molecule. The principal component analysis was used to compress the descriptor groups into principal components. The most significant descriptors of each set were selected and used as input for the ANN. The genetic algorithm (GA) was used for the selection of the best set of extracted principal components. A feed forward artificial neural network with a back-propagation of error algorithm was used to process the nonlinear relationship between the selected principal components and biological activity of the dihydropyridines. A comparison between PC-GA-ANN and routine PC-ANN shows that the first model yields better prediction ability.
An Optimal CDS Construction Algorithm with Activity Scheduling in Ad Hoc Networks
Penumalli, Chakradhar; Palanichamy, Yogesh
2015-01-01
A new energy-efficient optimal Connected Dominating Set (CDS) algorithm with activity scheduling for mobile ad hoc networks (MANETs) is proposed. This algorithm achieves energy efficiency by minimizing the Broadcast Storm Problem (BSP) while at the same time considering each node's remaining energy. The Connected Dominating Set is widely used as a virtual backbone or spine in mobile ad hoc networks (MANETs) or Wireless Sensor Networks (WSNs). The CDS of a graph representing a network has a significant impact on the efficient design of routing protocols in wireless networks. Here the CDS is constructed by a distributed algorithm with activity scheduling based on the unit disk graph (UDG) model. Node mobility and residual energy (RE) are considered as parameters in the construction of a stable, optimal, energy-efficient CDS. The performance is evaluated at various node densities, transmission ranges, and mobility rates. Theoretical analysis and simulation results of the algorithm are also presented and yield favorable results. PMID:26221627
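As a rough illustration of the kind of construction involved (not the paper's algorithm), the sketch below builds a unit disk graph and grows a connected dominating set greedily, preferring nodes that dominate many uncovered neighbours and breaking ties by residual energy.

```python
# Minimal sketch: unit-disk graph plus a greedy connected dominating set that
# prefers nodes covering many undominated neighbours, tie-broken by residual
# energy. Illustrative only; not the paper's distributed algorithm.
import itertools
import networkx as nx
import numpy as np

def unit_disk_graph(positions, radius):
    g = nx.Graph()
    g.add_nodes_from(range(len(positions)))
    for i, j in itertools.combinations(range(len(positions)), 2):
        if np.linalg.norm(positions[i] - positions[j]) <= radius:
            g.add_edge(i, j)
    return g

def greedy_cds(g, energy):
    start = max(g.nodes, key=lambda n: (g.degree[n], energy[n]))
    cds, dominated = {start}, {start} | set(g[start])
    while dominated != set(g.nodes):
        # only nodes adjacent to the current CDS keep it connected
        frontier = {v for u in cds for v in g[u]} - cds
        best = max(frontier,
                   key=lambda n: (len(set(g[n]) - dominated), energy[n]))
        cds.add(best)
        dominated |= {best} | set(g[best])
    return cds

rng = np.random.default_rng(0)
pos = rng.uniform(0, 100, size=(30, 2))
g = unit_disk_graph(pos, radius=30)
energy = rng.uniform(0.2, 1.0, size=30)
if nx.is_connected(g):
    print(sorted(greedy_cds(g, energy)))
else:
    print("graph not connected; re-draw positions")
```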
Linear regression models and k-means clustering for statistical analysis of fNIRS data.
Bonomini, Viola; Zucchelli, Lucia; Re, Rebecca; Ieva, Francesca; Spinelli, Lorenzo; Contini, Davide; Paganoni, Anna; Torricelli, Alessandro
2015-02-01
We propose a new algorithm, based on a linear regression model, to statistically estimate the hemodynamic activations in fNIRS data sets. The main concern guiding the algorithm development was the minimization of assumptions and approximations made on the data set for the application of statistical tests. Further, we propose a K-means method to cluster fNIRS data (i.e. channels) as activated or not activated. The methods were validated both on simulated and in vivo fNIRS data. A time domain (TD) fNIRS technique was preferred because of its high performances in discriminating cortical activation and superficial physiological changes. However, the proposed method is also applicable to continuous wave or frequency domain fNIRS data sets.
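The two-step idea can be sketched compactly. The code below is illustrative only, not the authors' implementation: each channel's time course is regressed on a boxcar task regressor plus a drift term, and k-means with k=2 is run on the fitted task amplitudes to split channels into activated and not activated.

```python
# Sketch: (1) per-channel linear regression of the fNIRS time course on a
# boxcar task regressor plus a linear drift term; (2) k-means with k=2 on the
# fitted activation amplitudes (betas) to separate activated channels.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
n_time, n_channels = 300, 20
task = np.zeros(n_time)
task[50:100] = task[150:200] = 1.0                   # boxcar task regressor
drift = np.linspace(0, 1, n_time)
X = np.column_stack([task, drift, np.ones(n_time)])  # design matrix

# simulate: half of the channels respond to the task
Y = rng.normal(0, 0.5, (n_time, n_channels))
Y[:, :10] += 2.0 * task[:, None]

betas, *_ = np.linalg.lstsq(X, Y, rcond=None)        # ordinary least squares
task_beta = betas[0]                                  # activation amplitude per channel

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(
    task_beta.reshape(-1, 1))
activated_cluster = np.argmax([task_beta[labels == c].mean() for c in (0, 1)])
print("activated channels:", np.where(labels == activated_cluster)[0])
```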
Bastian, Thomas; Maire, Aurélia; Dugas, Julien; Ataya, Abbas; Villars, Clément; Gris, Florence; Perrin, Emilie; Caritu, Yanis; Doron, Maeva; Blanc, Stéphane; Jallon, Pierre; Simon, Chantal
2015-03-15
"Objective" methods to monitor physical activity and sedentary patterns in free-living conditions are necessary to further our understanding of their impacts on health. In recent years, many software solutions capable of automatically identifying activity types from portable accelerometry data have been developed, with promising results in controlled conditions, but virtually no reports on field tests. An automatic classification algorithm initially developed using laboratory-acquired data (59 subjects engaging in a set of 24 standardized activities) to discriminate between 8 activity classes (lying, slouching, sitting, standing, walking, running, and cycling) was applied to data collected in the field. Twenty volunteers equipped with a hip-worn triaxial accelerometer performed at their own pace an activity set that included, among others, activities such as walking the streets, running, cycling, and taking the bus. Performances of the laboratory-calibrated classification algorithm were compared with those of an alternative version of the same model including field-collected data in the learning set. Despite good results in laboratory conditions, the performances of the laboratory-calibrated algorithm (assessed by confusion matrices) decreased for several activities when applied to free-living data. Recalibrating the algorithm with data closer to real-life conditions and from an independent group of subjects proved useful, especially for the detection of sedentary behaviors while in transports, thereby improving the detection of overall sitting (sensitivity: laboratory model = 24.9%; recalibrated model = 95.7%). Automatic identification methods should be developed using data acquired in free-living conditions rather than data from standardized laboratory activity sets only, and their limits carefully tested before they are used in field studies. Copyright © 2015 the American Physiological Society.
Machine learning of molecular properties: Locality and active learning
NASA Astrophysics Data System (ADS)
Gubaev, Konstantin; Podryabinkin, Evgeny V.; Shapeev, Alexander V.
2018-06-01
In recent years, machine learning techniques have shown great potential in various problems from a multitude of disciplines, including materials design and drug discovery. The high computational speed on the one hand and the accuracy comparable to that of density functional theory on the other hand make machine learning algorithms efficient for high-throughput screening through chemical and configurational space. However, the machine learning algorithms available in the literature require large training datasets to reach the chemical accuracy and also show large errors for the so-called outliers, i.e. out-of-sample molecules that are not well represented in the training set. In the present paper, we propose a new machine learning algorithm for predicting molecular properties that addresses these two issues: it is based on a local model of interatomic interactions providing high accuracy when trained on relatively small training sets and an active learning algorithm of optimally choosing the training set that significantly reduces the errors for the outliers. We compare our model to the other state-of-the-art algorithms from the literature on the widely used benchmark tests.
Automated assessment of cognitive health using smart home technologies.
Dawadi, Prafulla N; Cook, Diane J; Schmitter-Edgecombe, Maureen; Parsey, Carolyn
2013-01-01
The goal of this work is to develop intelligent systems to monitor the wellbeing of individuals in their home environments. This paper introduces a machine learning-based method to automatically predict activity quality in smart homes and automatically assess cognitive health based on activity quality. This paper describes an automated framework to extract a set of features from smart home sensor data that reflect the activity performance, or ability of an individual to complete an activity, and can be used as input to machine learning algorithms. Outputs from learning algorithms, including principal component analysis, support vector machine, and logistic regression, are used to quantify activity quality for a complex set of smart home activities and predict the cognitive health of participants. Smart home activity data was gathered from volunteer participants (n=263) who performed a complex set of activities in our smart home testbed. We compare our automated activity quality prediction and cognitive health prediction with direct observation scores and health assessment obtained from neuropsychologists. With all samples included, we obtained a statistically significant correlation (r=0.54) between direct observation scores and predicted activity quality. Similarly, using a support vector machine classifier, we obtained reasonable classification accuracy (area under the ROC curve=0.80, g-mean=0.73) in classifying participants into two different cognitive classes, dementia and cognitively healthy. The results suggest that it is possible to automatically quantify the task quality of smart home activities and perform limited assessment of the cognitive health of an individual if smart home activities are properly chosen and learning algorithms are appropriately trained.
Deep learning improves prediction of CRISPR-Cpf1 guide RNA activity.
Kim, Hui Kwon; Min, Seonwoo; Song, Myungjae; Jung, Soobin; Choi, Jae Woo; Kim, Younggwang; Lee, Sangeun; Yoon, Sungroh; Kim, Hyongbum Henry
2018-03-01
We present two algorithms to predict the activity of AsCpf1 guide RNAs. Indel frequencies for 15,000 target sequences were used in a deep-learning framework based on a convolutional neural network to train Seq-deepCpf1. We then incorporated chromatin accessibility information to create the better-performing DeepCpf1 algorithm for cell lines for which such information is available and show that both algorithms outperform previous machine learning algorithms on our own and published data sets.
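A sequence model of this kind consumes one-hot encoded target sequences. The sketch below shows only that input representation, under the assumption of a 4 x L encoding over the bases A, C, G, T; it is not the published network architecture, hyperparameters, or training data.

```python
# Sketch of the input representation consumed by a sequence-based CNN such as
# the one described above: each target sequence is one-hot encoded into a
# 4 x L matrix (A, C, G, T channels). Encoding step only; not the published
# DeepCpf1 architecture or training data.
import numpy as np

BASES = "ACGT"

def one_hot(seq):
    """Return a (4, len(seq)) float array; unknown bases become all-zero columns."""
    m = np.zeros((4, len(seq)), dtype=np.float32)
    for pos, base in enumerate(seq.upper()):
        idx = BASES.find(base)
        if idx >= 0:
            m[idx, pos] = 1.0
    return m

def encode_batch(seqs):
    """Stack equal-length sequences into a (batch, 4, L) tensor for a 1D CNN."""
    return np.stack([one_hot(s) for s in seqs])

batch = encode_batch(["TTTG" + "ACGT" * 5, "TTTC" + "TGCA" * 5])
print(batch.shape)   # (2, 4, 24)
```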
Hip and Wrist Accelerometer Algorithms for Free-Living Behavior Classification.
Ellis, Katherine; Kerr, Jacqueline; Godbole, Suneeta; Staudenmayer, John; Lanckriet, Gert
2016-05-01
Accelerometers are a valuable tool for objective measurement of physical activity (PA). Wrist-worn devices may improve compliance over standard hip placement, but more research is needed to evaluate their validity for measuring PA in free-living settings. Traditional cut-point methods for accelerometers can be inaccurate and need testing in free living with wrist-worn devices. In this study, we developed and tested the performance of machine learning (ML) algorithms for classifying PA types from both hip and wrist accelerometer data. Forty overweight or obese women (mean age = 55.2 ± 15.3 yr; BMI = 32.0 ± 3.7) wore two ActiGraph GT3X+ accelerometers (right hip, nondominant wrist; ActiGraph, Pensacola, FL) for seven free-living days. Wearable cameras captured ground truth activity labels. A classifier consisting of a random forest and hidden Markov model classified the accelerometer data into four activities (sitting, standing, walking/running, and riding in a vehicle). Free-living wrist and hip ML classifiers were compared with each other, with traditional accelerometer cut points, and with an algorithm developed in a laboratory setting. The ML classifier obtained average values of 89.4% and 84.6% balanced accuracy over the four activities using the hip and wrist accelerometer, respectively. In our data set with average values of 28.4 min of walking or running per day, the ML classifier predicted average values of 28.5 and 24.5 min of walking or running using the hip and wrist accelerometer, respectively. Intensity-based cut points and the laboratory algorithm significantly underestimated walking minutes. Our results demonstrate the superior performance of our PA-type classification algorithm, particularly in comparison with traditional cut points. Although the hip algorithm performed better, additional compliance achieved with wrist devices might justify using a slightly lower performing algorithm.
A novel algorithm for detecting active propulsion in wheelchair users following spinal cord injury.
Popp, Werner L; Brogioli, Michael; Leuenberger, Kaspar; Albisser, Urs; Frotzler, Angela; Curt, Armin; Gassert, Roger; Starkey, Michelle L
2016-03-01
Physical activity in wheelchair-bound individuals can be assessed by monitoring their mobility as this is one of the most intense upper extremity activities they perform. Current accelerometer-based approaches for describing wheelchair mobility do not distinguish between self- and attendant-propulsion and hence may overestimate total physical activity. The aim of this study was to develop and validate an inertial measurement unit based algorithm to monitor wheel kinematics and the type of wheelchair propulsion (self- or attendant-) within a "real-world" situation. Different sensor set-ups were investigated, ranging from a high precision set-up including four sensor modules with a relatively short measurement duration of 24 h, to a less precise set-up with only one module attached at the wheel exceeding one week of measurement because the gyroscope of the sensor was turned off. The "high-precision" algorithm distinguished self- and attendant-propulsion with accuracy greater than 93% whilst the long-term measurement set-up showed an accuracy of 82%. The estimation accuracy of kinematic parameters was greater than 97% for both set-ups. The possibility of having different sensor set-ups allows the use of the inertial measurement units as high precision tools for researchers as well as unobtrusive and simple tools for manual wheelchair users. Copyright © 2016 IPEM. Published by Elsevier Ltd. All rights reserved.
NASA Technical Reports Server (NTRS)
Lu, Yun-Chi; Chang, Hyo Duck; Krupp, Brian; Kumar, Ravindar; Swaroop, Anand
1992-01-01
Volume 3 assists Earth Observing System (EOS) investigators in locating required non-EOS data products by identifying their non-EOS input requirements and providing information on data sets available at various Distributed Active Archive Centers (DAACs), including those from Pathfinder Activities and Earth Probes. Volume 3 is intended to complement, not to duplicate, the EOSDIS Science Data Plan (SDP) by providing detailed data set information which was not presented in the SDP. Section 9 of this volume discusses the algorithm summary tables containing information on retrieval algorithms, expected outputs, and required input data. Section 10 describes the non-EOS input requirements of instrument teams and IDS investigators. Also described are the current and future data holdings of the original seven DAACs and data products planned from future missions and projects, including Earth Probes and Pathfinder Activities. Section 11 describes the sources of information used in compiling the data set information presented in this volume. A list of data set attributes used to describe the various data sets is presented in Section 12, along with their descriptions. Finally, Section 13 presents the SPSO's future plan to improve this report.
Ensemble Methods for Classification of Physical Activities from Wrist Accelerometry.
Chowdhury, Alok Kumar; Tjondronegoro, Dian; Chandran, Vinod; Trost, Stewart G
2017-09-01
To investigate whether the use of ensemble learning algorithms improve physical activity recognition accuracy compared to the single classifier algorithms, and to compare the classification accuracy achieved by three conventional ensemble machine learning methods (bagging, boosting, random forest) and a custom ensemble model comprising four algorithms commonly used for activity recognition (binary decision tree, k nearest neighbor, support vector machine, and neural network). The study used three independent data sets that included wrist-worn accelerometer data. For each data set, a four-step classification framework consisting of data preprocessing, feature extraction, normalization and feature selection, and classifier training and testing was implemented. For the custom ensemble, decisions from the single classifiers were aggregated using three decision fusion methods: weighted majority vote, naïve Bayes combination, and behavior knowledge space combination. Classifiers were cross-validated using leave-one subject out cross-validation and compared on the basis of average F1 scores. In all three data sets, ensemble learning methods consistently outperformed the individual classifiers. Among the conventional ensemble methods, random forest models provided consistently high activity recognition; however, the custom ensemble model using weighted majority voting demonstrated the highest classification accuracy in two of the three data sets. Combining multiple individual classifiers using conventional or custom ensemble learning methods can improve activity recognition accuracy from wrist-worn accelerometer data.
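Of the fusion rules compared, weighted majority voting is simple to illustrate. The sketch below is an assumption-level example: each base classifier votes for an activity class, votes are weighted by an illustrative per-classifier validation score, and the class with the largest weighted total wins.

```python
# Sketch of weighted majority voting, the decision-fusion rule that performed
# best for the custom ensemble described above. Weights here are illustrative
# validation scores, not the study's trained values.
from collections import defaultdict

def weighted_majority_vote(predictions, weights):
    """predictions: dict classifier_name -> predicted class label."""
    totals = defaultdict(float)
    for name, label in predictions.items():
        totals[label] += weights[name]
    return max(totals, key=totals.get)

preds = {"decision_tree": "walking", "knn": "sedentary",
         "svm": "walking", "neural_net": "household"}
w = {"decision_tree": 0.71, "knn": 0.64, "svm": 0.83, "neural_net": 0.78}
print(weighted_majority_vote(preds, w))   # "walking" (0.71 + 0.83 beats the rest)
```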
An investigation of new methods for estimating parameter sensitivities
NASA Technical Reports Server (NTRS)
Beltracchi, Todd J.; Gabriele, Gary A.
1988-01-01
Parameter sensitivity is defined as the estimation of changes in the modeling functions and the design variables due to small changes in the fixed parameters of the formulation. The methods currently available for estimating parameter sensitivities either require second-order information that is difficult to obtain or do not return reliable estimates for the derivatives. Additionally, all the methods assume that the set of active constraints does not change in a neighborhood of the estimation point. If the active set does in fact change, then any extrapolations based on these derivatives may be in error. The objective here is to investigate more efficient new methods for estimating parameter sensitivities when the active set changes. The new method is based on the recursive quadratic programming (RQP) method, used in conjunction with a differencing formula to produce estimates of the sensitivities. This is compared to existing methods and is shown to be very competitive in terms of the number of function evaluations required. In terms of accuracy, the method is shown to be equivalent to a modified version of the Kuhn-Tucker method, where the Hessian of the Lagrangian is estimated using the BFS method employed by the RQP algorithm. Initial testing on a test set with known sensitivities demonstrates that the method can accurately calculate the parameter sensitivity. To handle changes in the active set, a deflection algorithm is proposed for those cases where the new set of active constraints remains linearly independent. For those cases where dependencies occur, a directional derivative is proposed. A few simple examples are included for the algorithm, but extensive testing has not yet been performed.
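For orientation, the quantity being estimated can be approximated by brute force: re-solve a small constrained problem at perturbed parameter values and take a central difference. The sketch below does exactly that on a toy problem chosen for illustration; it is not the RQP-based method of the abstract, which avoids repeated re-solves.

```python
# Illustrative sketch (not the RQP-based method above): the parameter
# sensitivity dx*/dp can be approximated by re-solving a small constrained
# problem at p +/- h and taking a central difference. The toy problem below
# is an assumption made only for illustration.
import numpy as np
from scipy.optimize import minimize

def solve(p):
    """min (x0 - 2)^2 + (x1 - p)^2  s.t.  x0 + x1 <= 2  (active for p >= 0)."""
    cons = [{"type": "ineq", "fun": lambda x: 2.0 - x[0] - x[1]}]
    res = minimize(lambda x: (x[0] - 2.0) ** 2 + (x[1] - p) ** 2,
                   x0=np.zeros(2), constraints=cons, method="SLSQP")
    return res.x

def sensitivity(p, h=1e-3):
    """Central-difference estimate of dx*/dp."""
    return (solve(p + h) - solve(p - h)) / (2.0 * h)

print(sensitivity(1.0))   # analytic value is [-0.5, 0.5] while the constraint stays active
```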
A Random Forest-based ensemble method for activity recognition.
Feng, Zengtao; Mo, Lingfei; Li, Meng
2015-01-01
This paper presents a multi-sensor ensemble approach to human physical activity (PA) recognition, using random forest. We designed an ensemble learning algorithm, which integrates several independent Random Forest classifiers based on different sensor feature sets to build a more stable, more accurate and faster classifier for human activity recognition. To evaluate the algorithm, PA data collected from the PAMAP (Physical Activity Monitoring for Aging People), which is a standard, publicly available database, was utilized to train and test. The experimental results show that the algorithm is able to correctly recognize 19 PA types with an accuracy of 93.44%, while the training is faster than others. The ensemble classifier system based on the RF (Random Forest) algorithm can achieve high recognition accuracy and fast calculation.
Evaluation of a Didactic Method for the Active Learning of Greedy Algorithms
ERIC Educational Resources Information Center
Esteban-Sánchez, Natalia; Pizarro, Celeste; Velázquez-Iturbide, J. Ángel
2014-01-01
An evaluation of the educational effectiveness of a didactic method for the active learning of greedy algorithms is presented. The didactic method sets students structured-inquiry challenges to be addressed with a specific experimental method, supported by the interactive system GreedEx. This didactic method has been refined over several years of…
Active Learning with Irrelevant Examples
NASA Technical Reports Server (NTRS)
Wagstaff, Kiri; Mazzoni, Dominic
2009-01-01
An improved active learning method has been devised for training data classifiers. One example of a data classifier is the algorithm used by the United States Postal Service since the 1960s to recognize scans of handwritten digits for processing zip codes. Active learning algorithms enable rapid training with minimal investment of time on the part of human experts to provide training examples consisting of correctly classified (labeled) input data. They function by identifying which examples would be most profitable for a human expert to label. The goal is to maximize classifier accuracy while minimizing the number of examples the expert must label. Although there are several well-established methods for active learning, they may not operate well when irrelevant examples are present in the data set. That is, they may select an item for labeling that the expert simply cannot assign to any of the valid classes. In the context of classifying handwritten digits, the irrelevant items may include stray marks, smudges, and mis-scans. Querying the expert about these items results in wasted time or erroneous labels, if the expert is forced to assign the item to one of the valid classes. In contrast, the new algorithm provides a specific mechanism for avoiding querying the irrelevant items. This algorithm has two components: an active learner (which could be a conventional active learning algorithm) and a relevance classifier. The combination of these components yields a method, denoted Relevance Bias, that enables the active learner to avoid querying irrelevant data so as to increase its learning rate and efficiency when irrelevant items are present. The algorithm collects irrelevant data in a set of rejected examples, then trains the relevance classifier to distinguish between labeled (relevant) training examples and the rejected ones. The active learner combines its ranking of the items with the probability that they are relevant to yield a final decision about which item to present to the expert for labeling. Experiments on several data sets have demonstrated that the Relevance Bias approach significantly decreases the number of irrelevant items queried and also accelerates learning speed.
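The combination step can be sketched as follows. The code is illustrative only: an uncertainty score from the active learner is multiplied by the relevance classifier's probability that the item is relevant, which is one plausible way to realize the combination; the abstract does not specify the exact formula.

```python
# Sketch of the Relevance Bias idea: a query score from the active learner is
# combined with the probability that an item is relevant, so the learner avoids
# querying stray marks or smudges. Multiplying the two scores is an assumption.
import numpy as np
from sklearn.linear_model import LogisticRegression

def next_query(clf, relevance_clf, X_pool):
    proba = clf.predict_proba(X_pool)
    uncertainty = 1.0 - proba.max(axis=1)             # high = close to decision boundary
    p_relevant = relevance_clf.predict_proba(X_pool)[:, 1]
    score = uncertainty * p_relevant                   # down-weight likely-irrelevant items
    return int(np.argmax(score))

# The relevance classifier separates items the expert has labeled (relevant)
# from items the expert previously rejected as irrelevant.
rng = np.random.default_rng(0)
X_lab = rng.normal(0, 1, (40, 5)); y_lab = (X_lab[:, 0] > 0).astype(int)
X_rej = rng.normal(4, 1, (15, 5))                      # rejected, irrelevant items
clf = LogisticRegression(max_iter=1000).fit(X_lab, y_lab)
rel = LogisticRegression(max_iter=1000).fit(
    np.vstack([X_lab, X_rej]),
    np.r_[np.ones(len(X_lab)), np.zeros(len(X_rej))])
print(next_query(clf, rel, rng.normal(1, 2, (100, 5))))
```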
Markel, D; Naqa, I El; Freeman, C; Vallières, M
2012-06-01
To present a novel joint segmentation/registration framework for multimodality image-guided and adaptive radiotherapy. A major challenge to this framework is the sensitivity of many segmentation or registration algorithms to noise. Presented is a level set active contour based on the Jensen-Renyi (JR) divergence to achieve improved noise robustness in a multi-modality imaging space. It was found that the JR divergence, when used for segmentation, has improved robustness to noise compared to mutual information or other entropy-based metrics; the MI metric failed at around 2/3 the noise power at which the JR divergence failed. The JR divergence metric is useful for the task of joint segmentation/registration of multimodality images and shows improved results compared to entropy-based metrics. The algorithm can be easily modified to incorporate non-intensity-based images, which would allow applications in multi-modality and texture analysis. © 2012 American Association of Physicists in Medicine.
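The similarity measure itself is easy to state for discrete distributions. The sketch below computes the Jensen-Renyi divergence of a set of normalized histograms, which is the Renyi entropy of their weighted mixture minus the weighted sum of their individual Renyi entropies; the level set evolution that maximizes it over a contour is not shown.

```python
# Sketch of the Jensen-Renyi divergence for discrete intensity distributions:
# Renyi entropy of the weighted mixture minus the weighted sum of individual
# Renyi entropies. Alpha in (0, 1) is commonly used so the measure stays
# non-negative. The level-set contour evolution itself is not shown here.
import numpy as np

def renyi_entropy(p, alpha):
    p = p[p > 0]
    return np.log(np.sum(p ** alpha)) / (1.0 - alpha)

def jensen_renyi(dists, weights, alpha=0.5):
    dists = np.asarray(dists, dtype=float)
    weights = np.asarray(weights, dtype=float)
    mixture = weights @ dists
    return renyi_entropy(mixture, alpha) - np.sum(
        [w * renyi_entropy(p, alpha) for w, p in zip(weights, dists)])

inside = np.array([0.7, 0.2, 0.1])     # e.g. normalized histogram inside a contour
outside = np.array([0.1, 0.2, 0.7])    # histogram outside the contour
print(jensen_renyi([inside, outside], weights=[0.5, 0.5]))   # large when regions differ
print(jensen_renyi([inside, inside], weights=[0.5, 0.5]))    # ~0 for identical regions
```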
A Semisupervised Support Vector Machines Algorithm for BCI Systems
Qin, Jianzhao; Li, Yuanqing; Sun, Wei
2007-01-01
As an emerging technology, brain-computer interfaces (BCIs) bring us new communication interfaces which translate brain activities into control signals for devices like computers, robots, and so forth. In this study, we propose a semisupervised support vector machine (SVM) algorithm for brain-computer interface (BCI) systems, aiming at reducing the time-consuming training process. In this algorithm, we apply a semisupervised SVM for translating the features extracted from the electrical recordings of brain into control signals. This SVM classifier is built from a small labeled data set and a large unlabeled data set. Meanwhile, to reduce the time for training semisupervised SVM, we propose a batch-mode incremental learning method, which can also be easily applied to the online BCI systems. Additionally, it is suggested in many studies that common spatial pattern (CSP) is very effective in discriminating two different brain states. However, CSP needs a sufficient labeled data set. In order to overcome the drawback of CSP, we suggest a two-stage feature extraction method for the semisupervised learning algorithm. We apply our algorithm to two BCI experimental data sets. The offline data analysis results demonstrate the effectiveness of our algorithm. PMID:18368141
A developmental approach to learning causal models for cyber security
NASA Astrophysics Data System (ADS)
Mugan, Jonathan
2013-05-01
To keep pace with our adversaries, we must expand the scope of machine learning and reasoning to address the breadth of possible attacks. One approach is to employ an algorithm to learn a set of causal models that describes the entire cyber network and each host end node. Such a learning algorithm would run continuously on the system and monitor activity in real time. With a set of causal models, the algorithm could anticipate novel attacks, take actions to thwart them, and predict the second-order effects of those actions. Such a system would, however, face a flood of information, and the algorithm would have to determine which streams of that flood were relevant in which situations. This paper will present the results of efforts toward the application of a developmental learning algorithm to the problem of cyber security. The algorithm is modeled on the principles of human developmental learning and is designed to allow an agent to learn about the computer system in which it resides through active exploration. Children are flexible learners who acquire knowledge by actively exploring their environment and making predictions about what they will find [1, 2], and our algorithm is inspired by the work of the developmental psychologist Jean Piaget [3]. Piaget described how children construct knowledge in stages and learn new concepts on top of those they already know. Developmental learning allows our algorithm to focus on subsets of the environment that are most helpful for learning given its current knowledge. In experiments, the algorithm was able to learn the conditions for file exfiltration and use that knowledge to protect sensitive files.
Khan, Naveed; McClean, Sally; Zhang, Shuai; Nugent, Chris
2016-01-01
In recent years, smart phones with inbuilt sensors have become popular devices to facilitate activity recognition. The sensors capture a large amount of data, containing meaningful events, in a short period of time. The change points in this data are used to specify transitions to distinct events and can be used in various scenarios such as identifying change in a patient’s vital signs in the medical domain or requesting activity labels for generating real-world labeled activity datasets. Our work focuses on change-point detection to identify a transition from one activity to another. Within this paper, we extend our previous work on multivariate exponentially weighted moving average (MEWMA) algorithm by using a genetic algorithm (GA) to identify the optimal set of parameters for online change-point detection. The proposed technique finds the maximum accuracy and F_measure by optimizing the different parameters of the MEWMA, which subsequently identifies the exact location of the change point from an existing activity to a new one. Optimal parameter selection facilitates an algorithm to detect accurate change points and minimize false alarms. Results have been evaluated based on two real datasets of accelerometer data collected from a set of different activities from two users, with a high degree of accuracy from 99.4% to 99.8% and F_measure of up to 66.7%. PMID:27792177
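The statistic being tuned can be sketched directly. The code below computes the MEWMA statistic Z_t = lambda * x_t + (1 - lambda) * Z_{t-1} and raises an alarm when the associated T-squared value exceeds a threshold; the smoothing constant and threshold are illustrative stand-ins for the GA-optimized parameters.

```python
# Sketch of the multivariate EWMA (MEWMA) change-point statistic tuned by the
# genetic algorithm above. An alarm is raised when T^2 = Z' S_z^{-1} Z exceeds
# a threshold h. lam and h below are illustrative, not GA-optimized values.
import numpy as np

def mewma_alarms(X, lam=0.2, h=12.0):
    X = np.asarray(X, dtype=float)
    mean0 = X[:50].mean(axis=0)                   # baseline from an initial window
    cov0 = np.cov(X[:50].T)
    z = np.zeros(X.shape[1])
    alarms = []
    for t, x in enumerate(X):
        z = lam * (x - mean0) + (1 - lam) * z
        s_z = (lam / (2 - lam)) * (1 - (1 - lam) ** (2 * (t + 1))) * cov0
        t2 = z @ np.linalg.solve(s_z, z)
        if t2 > h:
            alarms.append(t)
    return alarms

rng = np.random.default_rng(0)
walk = rng.normal(0, 1, (200, 3))                 # accelerometer-like features, activity A
run = rng.normal(3, 1, (100, 3))                  # activity B starts at t = 200
print(mewma_alarms(np.vstack([walk, run]))[:5])   # alarms cluster shortly after t = 200
```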
Optimal Control Allocation with Load Sensor Feedback for Active Load Suppression
NASA Technical Reports Server (NTRS)
Miller, Christopher
2017-01-01
These slide sets describe the OCLA formulation and associated algorithms as a set of new technologies in the first practical application of load limiting flight control utilizing load feedback as a primary control measurement. Slide set one describes Experiment Development and slide set two describes Flight-Test Performance.
A variational approach to multi-phase motion of gas, liquid and solid based on the level set method
NASA Astrophysics Data System (ADS)
Yokoi, Kensuke
2009-07-01
We propose a simple and robust numerical algorithm to deal with multi-phase motion of gas, liquid and solid based on the level set method [S. Osher, J.A. Sethian, Fronts propagating with curvature-dependent speed: Algorithms based on Hamilton-Jacobi formulations, J. Comput. Phys. 79 (1988) 12; M. Sussman, P. Smereka, S. Osher, A level set approach for computing solutions to incompressible two-phase flow, J. Comput. Phys. 114 (1994) 146; J.A. Sethian, Level Set Methods and Fast Marching Methods, Cambridge University Press, 1999; S. Osher, R. Fedkiw, Level Set Methods and Dynamic Implicit Surfaces, Applied Mathematical Sciences, vol. 153, Springer, 2003]. In an Eulerian framework, to simulate the interaction between a moving solid object and an interfacial flow, we need to define at least two functions (level set functions) to distinguish three materials. In such simulations, the two functions generally overlap and/or disagree due to numerical errors such as numerical diffusion. In this paper, we resolved the problem using the idea of the active contour model [M. Kass, A. Witkin, D. Terzopoulos, Snakes: active contour models, International Journal of Computer Vision 1 (1988) 321; V. Caselles, R. Kimmel, G. Sapiro, Geodesic active contours, International Journal of Computer Vision 22 (1997) 61; G. Sapiro, Geometric Partial Differential Equations and Image Analysis, Cambridge University Press, 2001; R. Kimmel, Numerical Geometry of Images: Theory, Algorithms, and Applications, Springer-Verlag, 2003] introduced in the field of image processing.
Performance Analysis of the Probabilistic Multi-Hypothesis Tracking Algorithm on the SEABAR Data Sets
Hempel, Christian G.
2009-07-01
Pérez-Castillo, Yunierkis; Lazar, Cosmin; Taminau, Jonatan; Froeyen, Mathy; Cabrera-Pérez, Miguel Ángel; Nowé, Ann
2012-09-24
Computer-aided drug design has become an important component of the drug discovery process. Despite the advances in this field, there is not a unique modeling approach that can be successfully applied to solve the whole range of problems faced during QSAR modeling. Feature selection and ensemble modeling are active areas of research in ligand-based drug design. Here we introduce the GA(M)E-QSAR algorithm that combines the search and optimization capabilities of Genetic Algorithms with the simplicity of the Adaboost ensemble-based classification algorithm to solve binary classification problems. We also explore the usefulness of Meta-Ensembles trained with Adaboost and Voting schemes to further improve the accuracy, generalization, and robustness of the optimal Adaboost Single Ensemble derived from the Genetic Algorithm optimization. We evaluated the performance of our algorithm using five data sets from the literature and found that it is capable of yielding similar or better classification results to what has been reported for these data sets with a higher enrichment of active compounds relative to the whole actives subset when only the most active chemicals are considered. More important, we compared our methodology with state of the art feature selection and classification approaches and found that it can provide highly accurate, robust, and generalizable models. In the case of the Adaboost Ensembles derived from the Genetic Algorithm search, the final models are quite simple since they consist of a weighted sum of the output of single feature classifiers. Furthermore, the Adaboost scores can be used as ranking criterion to prioritize chemicals for synthesis and biological evaluation after virtual screening experiments.
Active Solution Space and Search on Job-shop Scheduling Problem
NASA Astrophysics Data System (ADS)
Watanabe, Masato; Ida, Kenichi; Gen, Mitsuo
In this paper we propose a new search method based on Genetic Algorithms for the Job-shop scheduling problem (JSP). In JSP, a coding method that represents job numbers to decide the priority for placing a job on the Gantt chart (called the ordinal representation with a priority) creates an active schedule by using left shifts. We first define an active solution: a solution that yields an active schedule without using left shifts, and we call the set of such solutions the active solution space. Next, we propose an algorithm named Genetic Algorithm with active solution space search (GA-asol), which can create an active solution while the solution is evaluated, in order to search the active solution space effectively. We applied it to some benchmark problems and compared it with other methods. The experimental results show good performance.
Shape-driven 3D segmentation using spherical wavelets.
Nain, Delphine; Haker, Steven; Bobick, Aaron; Tannenbaum, Allen
2006-01-01
This paper presents a novel active surface segmentation algorithm using a multiscale shape representation and prior. We define a parametric model of a surface using spherical wavelet functions and learn a prior probability distribution over the wavelet coefficients to model shape variations at different scales and spatial locations in a training set. Based on this representation, we derive a parametric active surface evolution using the multiscale prior coefficients as parameters for our optimization procedure to naturally include the prior in the segmentation framework. Additionally, the optimization method can be applied in a coarse-to-fine manner. We apply our algorithm to the segmentation of brain caudate nucleus, of interest in the study of schizophrenia. Our validation shows our algorithm is computationally efficient and outperforms the Active Shape Model algorithm by capturing finer shape details.
Algorithms of walking and stability for an anthropomorphic robot
NASA Astrophysics Data System (ADS)
Sirazetdinov, R. T.; Devaev, V. M.; Nikitina, D. V.; Fadeev, A. Y.; Kamalov, A. R.
2017-09-01
Autonomous movement of an anthropomorphic robot is considered as a superposition of a set of typical elements of movement - so-called patterns, each of which can be treated as an agent of a multi-agent system [1]. To control the AP-601 robot, an information and communication infrastructure has been created that forms such a multi-agent system; it allows algorithms for individual movement patterns to be developed and run in the system as a set of independently executing and interacting agents. The algorithms for lateral movement of the AP-601 series anthropomorphic robot with active stability provided by the stability pattern are presented.
NASA Technical Reports Server (NTRS)
Colliander, Andreas; Chan, Steven; Yueh, Simon; Cosh, Michael; Bindlish, Rajat; Jackson, Tom; Njoku, Eni
2010-01-01
Field experiment data sets that include coincident remote sensing measurements and in situ sampling will be valuable in the development and validation of the soil moisture algorithms of the NASA's future SMAP (Soil Moisture Active and Passive) mission. This paper presents an overview of the field experiment data collected from SGP99, SMEX02, CLASIC and SMAPVEX08 campaigns. Common in these campaigns were observations of the airborne PALS (Passive and Active L- and S-band) instrument, which was developed to acquire radar and radiometer measurements at low frequencies. The combined set of the PALS measurements and ground truth obtained from all these campaigns was under study. The investigation shows that the data set contains a range of soil moisture values collected under a limited number of conditions. The quality of both PALS and ground truth data meets the needs of the SMAP algorithm development and validation. The data set has already made significant impact on the science behind SMAP mission. The areas where complementing of the data would be most beneficial are also discussed.
Performance of Activity Classification Algorithms in Free-Living Older Adults.
Sasaki, Jeffer Eidi; Hickey, Amanda M; Staudenmayer, John W; John, Dinesh; Kent, Jane A; Freedson, Patty S
2016-05-01
The objective of this study is to compare activity type classification rates of machine learning algorithms trained on laboratory versus free-living accelerometer data in older adults. Thirty-five older adults (21 females and 14 males, 70.8 ± 4.9 yr) performed selected activities in the laboratory while wearing three ActiGraph GT3X+ activity monitors (in the dominant hip, wrist, and ankle; ActiGraph, LLC, Pensacola, FL). Monitors were initialized to collect raw acceleration data at a sampling rate of 80 Hz. Fifteen of the participants also wore GT3X+ in free-living settings and were directly observed for 2-3 h. Time- and frequency-domain features from acceleration signals of each monitor were used to train random forest (RF) and support vector machine (SVM) models to classify five activity types: sedentary, standing, household, locomotion, and recreational activities. All algorithms were trained on laboratory data (RFLab and SVMLab) and free-living data (RFFL and SVMFL) using 20-s signal sampling windows. Classification accuracy rates of both types of algorithms were tested on free-living data using a leave-one-out technique. Overall classification accuracy rates for the algorithms developed from laboratory data were between 49% (wrist) and 55% (ankle) for the SVMLab algorithms and 49% (wrist) to 54% (ankle) for the RFLab algorithms. The classification accuracy rates for SVMFL and RFFL algorithms ranged from 58% (wrist) to 69% (ankle) and from 61% (wrist) to 67% (ankle), respectively. Our algorithms developed on free-living accelerometer data were more accurate in classifying the activity type in free-living older adults than those on our algorithms developed on laboratory accelerometer data. Future studies should consider using free-living accelerometer data to train machine learning algorithms in older adults.
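The feature-extraction step can be sketched briefly. The code below cuts raw 80 Hz tri-axial acceleration into 20-second windows and computes a few time-domain and frequency-domain summaries per window; the particular features are illustrative, not the full feature set used in the study.

```python
# Sketch of the feature-extraction step: raw tri-axial acceleration sampled at
# 80 Hz is cut into 20-s windows and summarized by time-domain (mean, std) and
# frequency-domain (dominant frequency of the vector magnitude) features.
# The feature set is illustrative only.
import numpy as np

FS = 80                      # sampling rate (Hz)
WIN = 20 * FS                # 20-s windows

def window_features(acc):
    """acc: (n_samples, 3) raw acceleration; returns (n_windows, 8) features."""
    feats = []
    for start in range(0, len(acc) - WIN + 1, WIN):
        w = acc[start:start + WIN]
        vm = np.linalg.norm(w, axis=1)                      # vector magnitude
        spectrum = np.abs(np.fft.rfft(vm - vm.mean()))
        freqs = np.fft.rfftfreq(len(vm), d=1.0 / FS)
        dom_freq = freqs[np.argmax(spectrum)]
        feats.append(np.r_[w.mean(axis=0), w.std(axis=0), vm.mean(), dom_freq])
    return np.array(feats)

rng = np.random.default_rng(0)
t = np.arange(2 * WIN) / FS
walking = np.column_stack([np.sin(2 * np.pi * 2.0 * t),     # ~2 Hz gait component
                           0.3 * rng.normal(size=2 * WIN),
                           1.0 + 0.1 * rng.normal(size=2 * WIN)])
print(window_features(walking).shape)    # (2, 8)
```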
Jayaraman, Chandrasekaran; Mummidisetty, Chaithanya Krishna; Mannix-Slobig, Alannah; McGee Koch, Lori; Jayaraman, Arun
2018-03-13
Monitoring physical activity and leveraging wearable sensor technologies to facilitate active living in individuals with neurological impairment has been shown to yield benefits in terms of health and quality of living. In this context, accurate measurement of physical activity estimates from these sensors is vital. However, wearable sensor manufacturers generally only provide standard proprietary algorithms, based on healthy individuals, to estimate physical activity metrics, which may lead to inaccurate estimates in populations with neurological impairment such as stroke and incomplete spinal cord injury (iSCI). The main objective of this cross-sectional investigation was to evaluate the validity of physical activity estimates provided by standard proprietary algorithms for individuals with stroke and iSCI. Two research grade wearable sensors used in clinical settings were chosen, and the outcome metrics estimated using standard proprietary algorithms were validated against designated gold standard measures (Cosmed K4B2 for energy expenditure and metabolic equivalent and manual tallying for step counts). The influence of sensor location, sensor type and activity characteristics was also studied. 28 participants (Healthy (n = 10); incomplete SCI (n = 8); stroke (n = 10)) performed a spectrum of activities in a laboratory setting using two wearable sensors (ActiGraph and Metria-IH1) at different body locations. Manufacturer-provided standard proprietary algorithms estimated the step count, energy expenditure (EE) and metabolic equivalent (MET). These estimates were compared with the estimates from gold standard measures. For verifying validity, a series of Kruskal-Wallis ANOVA tests (Games-Howell multiple comparison for post-hoc analyses) were conducted to compare the mean rank and absolute agreement of outcome metrics estimated by each of the devices in comparison with the designated gold standard measurements. The sensor type, sensor location, activity characteristics and the population-specific condition influence the validity of estimation of physical activity metrics using standard proprietary algorithms. Implementing population-specific customized algorithms accounting for the influences of sensor location, type and activity characteristics for estimating physical activity metrics in individuals with stroke and iSCI could be beneficial.
Raja, Muhammad Asif Zahoor; Khan, Junaid Ali; Ahmad, Siraj-ul-Islam; Qureshi, Ijaz Mansoor
2012-01-01
A methodology for the solution of Painlevé equation-I is presented using a computational intelligence technique based on neural networks and particle swarm optimization hybridized with an active set algorithm. The mathematical model of the equation is developed with the help of a linear combination of feed-forward artificial neural networks that defines the unsupervised error of the model. This error is minimized subject to the availability of appropriate weights of the networks. The learning of the weights is carried out using the particle swarm optimization algorithm as a viable global search method, hybridized with an active set algorithm for rapid local convergence. The accuracy, convergence rate, and computational complexity of the scheme are analyzed based on a large number of independent runs and their comprehensive statistical analysis. Comparative studies of the results obtained are made with MATHEMATICA solutions, as well as with the variational iteration method and the homotopy perturbation method. PMID:22919371
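A minimal sketch of this general scheme (an ANN trial solution whose ODE residual defines an unsupervised error, minimized by a plain global-best PSO) is given below for Painlevé I, y'' = 6y² + x. The network size, collocation points, initial conditions, and PSO constants are assumptions; the authors' hybrid additionally refines the global-best solution with an active set method, which is only indicated in the closing comment.

```python
import numpy as np

rng = np.random.default_rng(0)
N_NEURONS, N_PARTICLES, N_ITER = 5, 40, 300
xs = np.linspace(0.0, 1.0, 20)            # collocation points
y0, dy0 = 0.0, 0.0                        # hypothetical initial conditions

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def ann(params, x):
    """Feed-forward ANN trial solution and its derivatives (analytic)."""
    a, w, b = params.reshape(3, N_NEURONS)
    s = sigmoid(np.outer(x, w) + b)
    y = s @ a
    dy = (s * (1 - s) * w) @ a
    d2y = (s * (1 - s) * (1 - 2 * s) * w ** 2) @ a
    return y, dy, d2y

def unsupervised_error(params):
    """Residual of Painleve-I (y'' = 6 y^2 + x) plus initial-condition terms."""
    y, dy, d2y = ann(params, xs)
    ode = np.mean((d2y - 6 * y ** 2 - xs) ** 2)
    y_a, dy_a, _ = ann(params, np.array([xs[0]]))
    return ode + (y_a[0] - y0) ** 2 + (dy_a[0] - dy0) ** 2

# Plain global-best PSO over the 3*N_NEURONS network weights.
dim = 3 * N_NEURONS
pos = rng.uniform(-1, 1, (N_PARTICLES, dim))
vel = np.zeros_like(pos)
pbest = pos.copy()
pbest_val = np.array([unsupervised_error(p) for p in pos])
gbest = pbest[pbest_val.argmin()].copy()
for _ in range(N_ITER):
    r1, r2 = rng.random((2, N_PARTICLES, dim))
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos += vel
    vals = np.array([unsupervised_error(p) for p in pos])
    improved = vals < pbest_val
    pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
    gbest = pbest[pbest_val.argmin()].copy()

print("best unsupervised error:", pbest_val.min())
# The paper additionally refines the global-best weights with an active-set
# type local solver, e.g. scipy.optimize.minimize(..., method="SLSQP").
```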
Pixel decomposition for tracking in low resolution videos
NASA Astrophysics Data System (ADS)
Govinda, Vivekanand; Ralph, Jason F.; Spencer, Joseph W.; Goulermas, John Y.; Yang, Lihua; Abbas, Alaa M.
2008-04-01
This paper describes a novel set of algorithms that allows indoor activity to be monitored using data from very low resolution imagers and other non-intrusive sensors. The objects are not resolved but activity may still be determined. This allows the use of such technology in sensitive environments where privacy must be maintained. Spectral un-mixing algorithms from remote sensing were adapted for this environment. These algorithms allow the fractional contributions from different colours within each pixel to be estimated and this is used to assist in the detection and monitoring of small objects or sub-pixel motion.
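The core idea, estimating the fractional contributions of known "pure" colours within a mixed pixel, can be illustrated with a non-negative least-squares unmixing step; the endmember spectra below are hypothetical and the example is not the authors' adapted remote-sensing algorithm.

```python
import numpy as np
from scipy.optimize import nnls

# Hypothetical endmember colours (RGB "pure" spectra): background carpet,
# a person's clothing, and a door. Columns are endmembers.
endmembers = np.array([[0.8, 0.7, 0.6],    # carpet
                       [0.1, 0.1, 0.5],    # clothing
                       [0.4, 0.2, 0.1]]).T # door; shape (3 bands, 3 endmembers)

def unmix(pixel):
    """Non-negative least-squares abundance estimate for one low-res pixel."""
    frac, _ = nnls(endmembers, pixel)
    return frac / max(frac.sum(), 1e-12)   # normalise to fractional contributions

# A mixed pixel that is mostly carpet with a small clothing contribution,
# as occurs when a person occupies only a fraction of a large pixel.
pixel = 0.8 * endmembers[:, 0] + 0.2 * endmembers[:, 1]
print(unmix(pixel))  # ~[0.8, 0.2, 0.0] -> sub-pixel presence of the person
```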
Atmospheric electricity/meteorology analysis
NASA Technical Reports Server (NTRS)
Goodman, Steven J.; Blakeslee, Richard; Buechler, Dennis
1993-01-01
This activity focuses on Lightning Imaging Sensor (LIS)/Lightning Mapper Sensor (LMS) algorithm development and applied research. Specifically, we are exploring the relationships between (1) global and regional lightning activity and rainfall, and (2) storm electrical development, physics, and the role of the environment. U.S. composite radar-rainfall maps and ground strike lightning maps are used to understand lightning-rainfall relationships at the regional scale. These observations are then compared to SSM/I brightness temperatures to simulate LIS/TRMM multi-sensor algorithm data sets. These data sets are supplied to the WETNET project archive. WSR88-D (NEXRAD) data are also used as they become available. The results of this study allow us to examine the information content from lightning imaging sensors in low-earth and geostationary orbits. Analysis of tropical and U.S. data sets continues. A neural network/sensor fusion algorithm is being refined for objectively associating lightning and rainfall with their parent storm systems. Total lightning data from interferometers are being used in conjunction with data from the national lightning network. A 6-year lightning/rainfall climatology has been assembled for LIS sampling studies.
Shape-Driven 3D Segmentation Using Spherical Wavelets
Nain, Delphine; Haker, Steven; Bobick, Aaron; Tannenbaum, Allen
2013-01-01
This paper presents a novel active surface segmentation algorithm using a multiscale shape representation and prior. We define a parametric model of a surface using spherical wavelet functions and learn a prior probability distribution over the wavelet coefficients to model shape variations at different scales and spatial locations in a training set. Based on this representation, we derive a parametric active surface evolution using the multiscale prior coefficients as parameters for our optimization procedure to naturally include the prior in the segmentation framework. Additionally, the optimization method can be applied in a coarse-to-fine manner. We apply our algorithm to the segmentation of brain caudate nucleus, of interest in the study of schizophrenia. Our validation shows our algorithm is computationally efficient and outperforms the Active Shape Model algorithm by capturing finer shape details. PMID:17354875
A method for feature selection of APT samples based on entropy
NASA Astrophysics Data System (ADS)
Du, Zhenyu; Li, Yihong; Hu, Jinsong
2018-05-01
By studying known APT attack events in depth, this paper proposes a feature selection method for APT samples and a logic expression generation algorithm, IOCG (Indicator of Compromise Generate). The algorithm can automatically generate machine-readable IOCs (Indicators of Compromise), addressing the limitations of existing IOCs, whose logical relationships are fixed, whose number of logical items cannot change, which are large in scale, and which cannot be generated as expressions from samples. At the same time, it reduces the time spent processing redundant and useless APT samples, improves the sharing rate of analysis information, and supports an active response to a complex and volatile APT attack situation. The samples were divided into an experimental set and a training set, and the algorithm was then used to generate the logical expressions of the training set with the IOC_Aware plug-in; the generated expressions were compared against the detection results. The experimental results show that the algorithm is effective and can improve the detection effect.
Palmisano, Pietro; Ziacchi, Matteo; Biffi, Mauro; Ricci, Renato P; Landolina, Maurizio; Zoni-Berisso, Massimo; Occhetta, Eraldo; Maglia, Giampiero; Botto, Gianluca; Padeletti, Luigi; Boriani, Giuseppe
2018-04-01
The purpose of this two-part consensus document is to provide specific suggestions (based on an extensive literature review) on appropriate pacemaker settings in relation to patients' clinical features. In part 2, criteria for pacemaker choice and programming in atrioventricular blocks and neurally mediated syncope are proposed. Atrioventricular blocks can be paroxysmal or persistent, isolated or associated with sinus node disease. Neurally mediated syncope can be related to carotid sinus syndrome or cardioinhibitory vasovagal syncope. In sinus rhythm with persistent atrioventricular block, we consider it appropriate to activate mode-switch algorithms and algorithms for auto-adaptive management of the ventricular pacing output. If the atrioventricular block is paroxysmal, in addition to the algorithms mentioned above, algorithms to maximize intrinsic atrioventricular conduction should be activated. When sinus node disease is associated with atrioventricular block, activation of the rate-responsive function in patients with chronotropic incompetence is appropriate. In permanent atrial fibrillation with atrioventricular block, algorithms for auto-adaptive management of the ventricular pacing output should be activated. If the atrioventricular block is persistent, activation of the rate-responsive function is appropriate. In carotid sinus syndrome, adequate rate hysteresis should be programmed. In vasovagal syncope, specialized sensing and pacing algorithms designed for reflex syncope prevention should be activated.
Classification of physical activities based on body-segments coordination.
Fradet, Laetitia; Marin, Frederic
2016-09-01
Numerous innovations based on connected objects and physical activity (PA) monitoring have been proposed. However, recognition of PAs requires a robust algorithm and methodology. The current study presents an innovative approach for PA recognition. It is based on the heuristic definition of postures and the use of body-segment coordination obtained through external sensors. The first part of this study presents the methodology required to define the set of accelerations that is most appropriate to represent the particular body-segment coordination involved in the chosen PAs (here walking, running, and cycling). For that purpose, subjects of different ages and heterogeneous physical conditions walked, ran, cycled, and performed daily activities at different paces. From the 3D motion capture, vertical and horizontal accelerations of 8 anatomical landmarks representative of the body were computed. Then, the 680 combinations of up to 3 accelerations were compared to identify the most appropriate set of accelerations to discriminate the PAs in terms of body-segment coordination. The discrimination was based on the maximal Hausdorff Distance obtained between the different sets of accelerations. The vertical accelerations of both knees demonstrated the best PA discrimination. The second step was the proof of concept, implementing the proposed algorithm to classify the PAs of a new group of subjects. The originality of the proposed algorithm is the possibility to use the subject's specific measures as reference data. With the proposed algorithm, 94% of the trials were correctly classified. In conclusion, our study proposed a flexible and extendable methodology. At the current stage, the algorithm has been shown to be valid for heterogeneous subjects, which suggests that it could be deployed in clinical or health-related applications regardless of the subjects' physical abilities or characteristics. Copyright © 2016 Elsevier Ltd. All rights reserved.
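A hedged sketch of the distance-based discrimination step is shown below: trials are compared to subject-specific reference trajectories with a symmetric Hausdorff distance and assigned to the closest activity. The signals and reference construction are synthetic stand-ins, not the study's marker set or protocol.

```python
import numpy as np
from scipy.spatial.distance import directed_hausdorff

rng = np.random.default_rng(0)

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two trajectories, each an
    (n_samples x n_signals) array, e.g. both knees' vertical accelerations."""
    return max(directed_hausdorff(a, b)[0], directed_hausdorff(b, a)[0])

def classify(trial, references):
    """Assign the activity whose subject-specific reference is closest."""
    return min(references, key=lambda act: hausdorff(trial, references[act]))

# Synthetic stand-ins: walking, running and cycling produce different
# amplitude/frequency patterns in the two knee signals.
t = np.linspace(0, 5, 500)[:, None]
make = lambda f, a: a * np.sin(2 * np.pi * f * t) + 0.05 * rng.normal(size=(500, 2))
references = {"walk": make(1.8, 1.0), "run": make(2.8, 3.0), "cycle": make(1.5, 0.5)}
trial = make(2.7, 2.9)                      # unlabeled trial, close to running
print(classify(trial, references))          # -> "run"
```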
An improved VSS NLMS algorithm for active noise cancellation
NASA Astrophysics Data System (ADS)
Sun, Yunzhuo; Wang, Mingjiang; Han, Yufei; Zhang, Congyan
2017-08-01
In this paper, an improved variable step size NLMS algorithm is proposed. NLMS has a fast convergence rate and low steady-state error compared to other traditional adaptive filtering algorithms. However, there is a contradiction between convergence speed and steady-state error that affects the performance of the NLMS algorithm. We therefore propose a new variable step size NLMS algorithm that dynamically changes the step size according to the current error and the iteration count. The proposed algorithm has a simple formulation and easily set parameters, and effectively resolves the contradiction in NLMS. The simulation results show that the proposed algorithm has good tracking ability, a fast convergence rate, and low steady-state error simultaneously.
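The following sketch shows a normalised LMS update with an illustrative variable step size that depends on the current error and the iteration index; the specific step-size rule is a stand-in, not the formula proposed in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def vss_nlms(x, d, n_taps=16, mu_max=1.0, mu_min=0.05, eps=1e-6):
    """NLMS with an illustrative variable step size that shrinks with the
    iteration index and with small instantaneous errors (a stand-in rule,
    not the exact formula proposed in the paper)."""
    w = np.zeros(n_taps)
    e = np.zeros(len(x))
    for n in range(n_taps - 1, len(x)):
        u = x[n - n_taps + 1:n + 1][::-1]            # most recent input vector
        e[n] = d[n] - w @ u
        decay = 1.0 / np.sqrt(1.0 + n / 500.0)       # decreases with iteration count
        mu = mu_min + (mu_max - mu_min) * (e[n] ** 2 / (e[n] ** 2 + 1.0)) * decay
        w += mu * e[n] * u / (u @ u + eps)           # normalised update
    return w, e

# Identify a hypothetical 16-tap acoustic path from noisy observations.
h_true = rng.normal(size=16)
x = rng.normal(size=5000)
d = np.convolve(x, h_true)[:len(x)] + 0.01 * rng.normal(size=len(x))
w, e = vss_nlms(x, d)
print("relative misalignment:", np.linalg.norm(w - h_true) / np.linalg.norm(h_true))
```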
Lieb, Florian; Stark, Hans-Georg; Thielemann, Christiane
2017-06-01
Spike detection from extracellular recordings is a crucial preprocessing step when analyzing neuronal activity. The decision whether a specific part of the signal is a spike or not is important for any kind of other subsequent preprocessing steps, like spike sorting or burst detection in order to reduce the classification of erroneously identified spikes. Many spike detection algorithms have already been suggested, all working reasonably well whenever the signal-to-noise ratio is large enough. When the noise level is high, however, these algorithms have a poor performance. In this paper we present two new spike detection algorithms. The first is based on a stationary wavelet energy operator and the second is based on the time-frequency representation of spikes. Both algorithms are more reliable than all of the most commonly used methods. The performance of the algorithms is confirmed by using simulated data, resembling original data recorded from cortical neurons with multielectrode arrays. In order to demonstrate that the performance of the algorithms is not restricted to only one specific set of data, we also verify the performance using a simulated publicly available data set. We show that both proposed algorithms have the best performance under all tested methods, regardless of the signal-to-noise ratio in both data sets. This contribution will redound to the benefit of electrophysiological investigations of human cells. Especially the spatial and temporal analysis of neural network communications is improved by using the proposed spike detection algorithms.
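As a rough illustration of wavelet-energy-based detection, the sketch below thresholds the smoothed energy of stationary-wavelet detail coefficients (PyWavelets) on a synthetic trace; the wavelet, decomposition level, and threshold are assumptions, and the code is not the authors' implementation of either proposed detector.

```python
import numpy as np
import pywt  # PyWavelets

rng = np.random.default_rng(0)

# Synthetic extracellular trace: unit-variance noise plus 25 negative spikes.
fs, n = 24000, 2 ** 15
trace = rng.normal(size=n)
spike_times = rng.choice(n - 40, size=25, replace=False)
for t in spike_times:
    trace[t:t + 20] -= 6 * np.hanning(20)

# Stationary wavelet transform; the detail coefficients of one chosen level
# emphasise the spike band while staying time-aligned with the raw signal.
coeffs = pywt.swt(trace, "sym5", level=3)
detail = coeffs[0][1]
energy = np.convolve(detail ** 2, np.ones(24) / 24, mode="same")   # smoothed energy

# Robust threshold from the median absolute deviation of the energy signal.
mad = np.median(np.abs(energy - np.median(energy)))
thr = np.median(energy) + 8 * mad
above = energy > thr
n_events = np.count_nonzero(np.diff(above.astype(int)) == 1)
print(f"{n_events} putative spike events detected (25 inserted)")
```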
VizieR Online Data Catalog: Gamma-ray AGN type determination (Hassan+, 2013)
NASA Astrophysics Data System (ADS)
Hassan, T.; Mirabal, N.; Contreras, J. L.; Oya, I.
2013-11-01
In this paper, we employ Support Vector Machines (SVMs) and Random Forest (RF) that embody two of the most robust supervised learning algorithms available today. We are interested in building classifiers that can distinguish between two AGN classes: BL Lacs and FSRQs. In the 2FGL, there is a total set of 1074 identified/associated AGN objects with the following labels: 'bzb' (BL Lacs), 'bzq' (FSRQs), 'agn' (other non-blazar AGN) and 'agu' (active galaxies of uncertain type). From this global set, we group the identified/associated blazars ('bzb' and 'bzq' labels) as the training/testing set of our algorithms. (2 data files).
Zhan, Xue-yan; Zhao, Na; Lin, Zhao-zhou; Wu, Zhi-sheng; Yuan, Rui-juan; Qiao, Yan-jiang
2014-12-01
The choice of an appropriate algorithm for calibration set selection is one of the key technologies for a good NIR quantitative model. Different algorithms exist for calibration set selection, such as the Random Sampling (RS) algorithm, the Conventional Selection (CS) algorithm, the Kennard-Stone (KS) algorithm and the Sample set Partitioning based on joint x-y distance (SPXY) algorithm, among others. However, systematic comparisons among these algorithms are lacking. NIR quantitative models to determine the asiaticoside content in Centella total glucosides were established in the present paper, for which 7 indexes were classified and selected, and the effects of the CS, KS and SPXY algorithms for calibration set selection on the accuracy and robustness of the NIR quantitative models were investigated. The accuracy indexes of NIR quantitative models with calibration sets selected by the SPXY algorithm were significantly different from those with calibration sets selected by the CS or KS algorithm, while the robustness indexes, such as RMSECV and |RMSEP-RMSEC|, were not significantly different. Therefore, the SPXY algorithm for calibration set selection can improve the predictive accuracy of NIR quantitative models to determine asiaticoside content in Centella total glucosides, and has no significant effect on the robustness of the models, which provides a reference for determining the appropriate algorithm for calibration set selection when NIR quantitative models are established for solid systems of traditional Chinese medicine.
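For reference, the Kennard-Stone selection scheme mentioned above can be sketched as below; SPXY follows the same greedy procedure but augments the spectral distance with a normalized distance in the response variable y. The data here are random stand-ins for NIR spectra scores.

```python
import numpy as np

def kennard_stone(X, n_select):
    """Kennard-Stone selection: start from the two most distant samples, then
    repeatedly add the candidate whose nearest selected sample is farthest away."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)   # pairwise distances
    i, j = np.unravel_index(d.argmax(), d.shape)
    selected = [i, j]
    while len(selected) < n_select:
        remaining = [k for k in range(len(X)) if k not in selected]
        min_to_sel = d[np.ix_(remaining, selected)].min(axis=1)
        selected.append(remaining[int(min_to_sel.argmax())])
    return selected

# Example: pick a 15-sample calibration set from 60 simulated spectra scores.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 5))
cal = kennard_stone(X, 15)
val = [k for k in range(60) if k not in cal]
print("calibration samples:", cal)
```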
SOLAR FLARE PREDICTION USING SDO/HMI VECTOR MAGNETIC FIELD DATA WITH A MACHINE-LEARNING ALGORITHM
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bobra, M. G.; Couvidat, S., E-mail: couvidat@stanford.edu
2015-01-10
We attempt to forecast M- and X-class solar flares using a machine-learning algorithm, called support vector machine (SVM), and four years of data from the Solar Dynamics Observatory's Helioseismic and Magnetic Imager, the first instrument to continuously map the full-disk photospheric vector magnetic field from space. Most flare forecasting efforts described in the literature use either line-of-sight magnetograms or a relatively small number of ground-based vector magnetograms. This is the first time a large data set of vector magnetograms has been used to forecast solar flares. We build a catalog of flaring and non-flaring active regions sampled from a database of 2071 active regions, comprised of 1.5 million active region patches of vector magnetic field data, and characterize each active region by 25 parameters. We then train and test the machine-learning algorithm and we estimate its performances using forecast verification metrics with an emphasis on the true skill statistic (TSS). We obtain relatively high TSS scores and overall predictive abilities. We surmise that this is partly due to fine-tuning the SVM for this purpose and also to an advantageous set of features that can only be calculated from vector magnetic field data. We also apply a feature selection algorithm to determine which of our 25 features are useful for discriminating between flaring and non-flaring active regions and conclude that only a handful are needed for good predictive abilities.
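The emphasized verification metric, the true skill statistic, is simply the hit rate minus the false-alarm rate computed from the confusion matrix; a small self-contained example with an invented toy forecast is given below.

```python
import numpy as np

def true_skill_statistic(y_true, y_pred):
    """TSS = hit rate - false alarm rate; unlike accuracy it is insensitive to
    the class imbalance between flaring and non-flaring active regions."""
    y_true, y_pred = np.asarray(y_true, bool), np.asarray(y_pred, bool)
    tp = np.sum(y_true & y_pred)
    fn = np.sum(y_true & ~y_pred)
    fp = np.sum(~y_true & y_pred)
    tn = np.sum(~y_true & ~y_pred)
    return tp / (tp + fn) - fp / (fp + tn)

# Toy forecast: 8 flaring regions, 92 quiet ones.
y_true = np.r_[np.ones(8), np.zeros(92)]
y_pred = np.r_[np.ones(7), np.zeros(1), np.ones(5), np.zeros(87)]
print(f"TSS = {true_skill_statistic(y_true, y_pred):.2f}")   # 7/8 - 5/92 ~ 0.82
```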
Automated system for analyzing the activity of individual neurons
NASA Technical Reports Server (NTRS)
Bankman, Isaac N.; Johnson, Kenneth O.; Menkes, Alex M.; Diamond, Steve D.; Oshaughnessy, David M.
1993-01-01
This paper presents a signal processing system that: (1) provides an efficient and reliable instrument for investigating the activity of neuronal assemblies in the brain; and (2) demonstrates the feasibility of generating the command signals of prostheses using the activity of relevant neurons in disabled subjects. The system operates online, in a fully automated manner and can recognize the transient waveforms of several neurons in extracellular neurophysiological recordings. Optimal algorithms for detection, classification, and resolution of overlapping waveforms are developed and evaluated. Full automation is made possible by an algorithm that can set appropriate decision thresholds and an algorithm that can generate templates on-line. The system is implemented with a fast IBM PC compatible processor board that allows on-line operation.
NASA Astrophysics Data System (ADS)
Jarvis, Jan; Haertelt, Marko; Hugger, Stefan; Butschek, Lorenz; Fuchs, Frank; Ostendorf, Ralf; Wagner, Joachim; Beyerer, Juergen
2017-04-01
In this work we present data analysis algorithms for detection of hazardous substances in hyperspectral observations acquired using active mid-infrared (MIR) backscattering spectroscopy. We present a novel background extraction algorithm based on the adaptive target generation process proposed by Ren and Chang called the adaptive background generation process (ABGP) that generates a robust and physically meaningful set of background spectra for operation of the well-known adaptive matched subspace detection (AMSD) algorithm. It is shown that the resulting AMSD-ABGP detection algorithm competes well with other widely used detection algorithms. The method is demonstrated in measurement data obtained by two fundamentally different active MIR hyperspectral data acquisition devices. A hyperspectral image sensor applicable in static scenes takes a wavelength sequential approach to hyperspectral data acquisition, whereas a rapid wavelength-scanning single-element detector variant of the same principle uses spatial scanning to generate the hyperspectral observation. It is shown that the measurement timescale of the latter is sufficient for the application of the data analysis algorithms even in dynamic scenarios.
Padilla, Jennifer E.; Liu, Wenyan; Seeman, Nadrian C.
2012-01-01
We introduce a hierarchical self assembly algorithm that produces the quasiperiodic patterns found in the Robinson tilings and suggest a practical implementation of this algorithm using DNA origami tiles. We modify the abstract Tile Assembly Model, (aTAM), to include active signaling and glue activation in response to signals to coordinate the hierarchical assembly of Robinson patterns of arbitrary size from a small set of tiles according to the tile substitution algorithm that generates them. Enabling coordinated hierarchical assembly in the aTAM makes possible the efficient encoding of the recursive process of tile substitution. PMID:23226722
Padilla, Jennifer E; Liu, Wenyan; Seeman, Nadrian C
2012-06-01
We introduce a hierarchical self assembly algorithm that produces the quasiperiodic patterns found in the Robinson tilings and suggest a practical implementation of this algorithm using DNA origami tiles. We modify the abstract Tile Assembly Model, (aTAM), to include active signaling and glue activation in response to signals to coordinate the hierarchical assembly of Robinson patterns of arbitrary size from a small set of tiles according to the tile substitution algorithm that generates them. Enabling coordinated hierarchical assembly in the aTAM makes possible the efficient encoding of the recursive process of tile substitution.
Markel, D; Naqa, I El
2012-06-01
Positron emission tomography (PET) presents a valuable resource for delineating the biological tumor volume (BTV) for image-guided radiotherapy. However, accurate and consistent image segmentation is a significant challenge within the context of PET, owing to its low spatial resolution and high levels of noise. Active contour methods based on level sets can be sensitive to noise and susceptible to failing in low contrast regions. Therefore, this work evaluates a novel active contour algorithm applied to the task of PET tumor segmentation. A novel active contour segmentation algorithm based on maximizing the Jensen-Renyi divergence between regions of interest was applied to the task of segmenting lesions in 7 patients with T3-T4 pharyngolaryngeal squamous cell carcinoma. The algorithm was implemented on an NVidia GeForce GTX 560M GPU. The cases were taken from the Louvain database, which includes contours of the macroscopically defined BTV drawn using histology of resected tissue. The images were pre-processed using denoising/deconvolution. The segmented volumes agreed well with the macroscopic contours, with an average concordance index and classification error of 0.6 ± 0.09 and 55 ± 16.5%, respectively. The algorithm in its present implementation requires approximately 0.5-1.3 sec per iteration and can reach convergence within 10-30 iterations. The Jensen-Renyi active contour method was shown to match, and in terms of concordance outperform, a variety of PET segmentation methods that have been previously evaluated using the same data. Further evaluation on a larger dataset along with performance optimization is necessary before clinical deployment. © 2012 American Association of Physicists in Medicine.
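A hedged sketch of the quantity being maximized is given below: the Jensen-Rényi divergence of the intensity histograms inside and outside a candidate contour (here with α = 2 and equal weights, both assumptions). The level-set contour evolution and the GPU implementation are not shown.

```python
import numpy as np

def renyi_entropy(p, alpha=2.0, eps=1e-12):
    p = p / (p.sum() + eps)
    return np.log((p ** alpha).sum() + eps) / (1.0 - alpha)

def jensen_renyi(p_list, weights=None, alpha=2.0):
    """JR divergence = Renyi entropy of the weighted mixture minus the weighted
    entropies; it is largest when the regions' histograms are well separated."""
    p_list = [p / p.sum() for p in p_list]
    w = np.full(len(p_list), 1.0 / len(p_list)) if weights is None else np.asarray(weights)
    mixture = sum(wi * pi for wi, pi in zip(w, p_list))
    return renyi_entropy(mixture, alpha) - sum(
        wi * renyi_entropy(pi, alpha) for wi, pi in zip(w, p_list))

# Intensity histograms inside vs. outside a candidate contour on a PET image.
bins = np.linspace(0, 1, 33)
inside  = np.histogram(np.random.default_rng(0).beta(5, 2, 2000), bins)[0].astype(float)
outside = np.histogram(np.random.default_rng(1).beta(2, 5, 2000), bins)[0].astype(float)
print("JR divergence:", jensen_renyi([inside, outside]))
```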
A robust return-map algorithm for general multisurface plasticity
Adhikary, Deepak P.; Jayasundara, Chandana T.; Podgorney, Robert K.; ...
2016-06-16
Three new contributions to the field of multisurface plasticity are presented for general situations with an arbitrary number of nonlinear yield surfaces with hardening or softening. A method for handling linearly dependent flow directions is described. A residual that can be used in a line search is defined. An algorithm that has been implemented and comprehensively tested is discussed in detail. Examples are presented to illustrate the computational cost of various components of the algorithm. The overall result is that a single Newton-Raphson iteration of the algorithm costs between 1.5 and 2 times that of an elastic calculation. Examples also illustrate the successful convergence of the algorithm in complicated situations. For example, without using the new contributions presented here, the algorithm fails to converge for approximately 50% of the trial stresses for a common geomechanical model of sedimentary rocks, while the current algorithm results in complete success. Since it involves no approximations, the algorithm is used to quantify the accuracy of an efficient, pragmatic, but approximate, algorithm used for sedimentary-rock plasticity in a commercial software package. Furthermore, the main weakness of the algorithm is identified as the difficulty of correctly choosing the set of initially active constraints in the general setting.
NASA Technical Reports Server (NTRS)
Shaffer, Scott; Dunbar, R. Scott; Hsiao, S. Vincent; Long, David G.
1989-01-01
The NASA Scatterometer, NSCAT, is an active spaceborne radar designed to measure the normalized radar backscatter coefficient (sigma0) of the ocean surface. These measurements can, in turn, be used to infer the surface vector wind over the ocean using a geophysical model function. Several ambiguous wind vectors result because of the nature of the model function. A median-filter-based ambiguity removal algorithm will be used by the NSCAT ground data processor to select the best wind vector from the set of ambiguous wind vectors. This process is commonly known as dealiasing or ambiguity removal. The baseline NSCAT ambiguity removal algorithm and the method used to select the set of optimum parameter values are described. An extensive simulation of the NSCAT instrument and ground data processor provides a means of testing the resulting tuned algorithm. This simulation generates the ambiguous wind-field vectors expected from the instrument as it orbits over a set of realistic mesoscale wind fields. The ambiguous wind field is then dealiased using the median-based ambiguity removal algorithm. Performance is measured by comparison of the unambiguous wind fields with the true wind fields. Results have shown that the median-filter-based ambiguity removal algorithm satisfies NSCAT mission requirements.
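A simplified stand-in for median-filter-based ambiguity removal is sketched below: each cell's selection is iteratively replaced by the ambiguity closest to the component-wise median of its neighbourhood. The grid size, window, initialization, and toy wind field are assumptions, not the tuned NSCAT parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

def median_filter_dealias(ambiguities, n_iter=10, win=3):
    """ambiguities: (rows, cols, n_amb, 2) array of ambiguous (u, v) wind vectors.
    Start from the first-ranked ambiguity, then repeatedly replace each cell's
    selection by the ambiguity closest to the median of its neighbourhood."""
    rows, cols, n_amb, _ = ambiguities.shape
    sel = ambiguities[:, :, 0, :].copy()
    half = win // 2
    for _ in range(n_iter):
        changed = False
        for r in range(rows):
            for c in range(cols):
                nb = sel[max(r - half, 0):r + half + 1, max(c - half, 0):c + half + 1]
                med = np.median(nb.reshape(-1, 2), axis=0)
                k = np.linalg.norm(ambiguities[r, c] - med, axis=1).argmin()
                if not np.allclose(sel[r, c], ambiguities[r, c, k]):
                    sel[r, c] = ambiguities[r, c, k]
                    changed = True
        if not changed:
            break
    return sel

# Toy field: a uniform true wind plus a directionally flipped alias in each cell.
true = np.dstack([np.full((12, 12), 6.0), np.full((12, 12), 2.0)])
amb = np.stack([true + rng.normal(0, 0.5, true.shape),
                -true + rng.normal(0, 0.5, true.shape)], axis=2)
flip = rng.random((12, 12)) < 0.3          # scramble the ranking in some cells
amb[flip] = amb[flip][:, ::-1]
sel = median_filter_dealias(amb)
print("fraction of cells matching the true direction:",
      np.mean(np.sign(sel[..., 0]) == np.sign(true[..., 0])))
```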
Learning Setting-Generalized Activity Models for Smart Spaces
Cook, Diane J.
2011-01-01
The data mining and pervasive computing technologies found in smart homes offer unprecedented opportunities for providing context-aware services, including health monitoring and assistance to individuals experiencing difficulties living independently at home. In order to provide these services, smart environment algorithms need to recognize and track activities that people normally perform as part of their daily routines. However, activity recognition has typically involved gathering and labeling large amounts of data in each setting to learn a model for activities in that setting. We hypothesize that generalized models can be learned for common activities that span multiple environment settings and resident types. We describe our approach to learning these models and demonstrate the approach using eleven CASAS datasets collected in seven environments. PMID:21461133
Fast, Safe, Propellant-Efficient Spacecraft Motion Planning Under Clohessy-Wiltshire-Hill Dynamics
NASA Technical Reports Server (NTRS)
Starek, Joseph A.; Schmerling, Edward; Maher, Gabriel D.; Barbee, Brent W.; Pavone, Marco
2016-01-01
This paper presents a sampling-based motion planning algorithm for real-time and propellant-optimized autonomous spacecraft trajectory generation in near-circular orbits. Specifically, this paper applies recent algorithmic advances from the field of robot motion planning to the problem of impulsively actuated, propellant-optimized rendezvous and proximity operations under the Clohessy-Wiltshire-Hill dynamics model. The approach calls upon a modified version of the FMT* algorithm to grow a set of feasible trajectories over a deterministic, low-dispersion set of sample points covering the free state space. To enforce safety, the tree is only grown over the subset of actively safe samples, from which there exists a feasible one-burn collision-avoidance maneuver that can safely circularize the spacecraft orbit along its coasting arc under a given set of potential thruster failures. Key features of the proposed algorithm include 1) theoretical guarantees in terms of trajectory safety and performance, 2) amenability to real-time implementation, and 3) generality, in the sense that a large class of constraints can be handled directly. As a result, the proposed algorithm offers the potential for widespread application, ranging from on-orbit satellite servicing to orbital debris removal and autonomous inspection missions.
A Methodology for the Hybridization Based in Active Components: The Case of cGA and Scatter Search.
Villagra, Andrea; Alba, Enrique; Leguizamón, Guillermo
2016-01-01
This work presents the results of a new methodology for hybridizing metaheuristics. By first locating the active components (parts) of one algorithm and then inserting them into a second one, we can build efficient and accurate optimization, search, and learning algorithms. This gives a concrete way of constructing new techniques that contrasts with the widespread ad hoc way of hybridizing. In this paper, the enhanced algorithm is a Cellular Genetic Algorithm (cGA), which has been successfully used in the past to find solutions to hard optimization problems. In order to extend and corroborate the use of active components as an emerging hybridization methodology, we propose here the use of active components taken from Scatter Search (SS) to improve the cGA. The results obtained over a varied set of benchmarks are highly satisfactory in efficacy and efficiency when compared with a standard cGA. Moreover, the proposed hybrid approach (i.e., cGA+SS) has shown encouraging results with regard to earlier applications of our methodology.
Raghuram, Jayaram; Miller, David J; Kesidis, George
2014-07-01
We propose a method for detecting anomalous domain names, with focus on algorithmically generated domain names which are frequently associated with malicious activities such as fast flux service networks, particularly for bot networks (or botnets), malware, and phishing. Our method is based on learning a (null hypothesis) probability model based on a large set of domain names that have been white listed by some reliable authority. Since these names are mostly assigned by humans, they are pronounceable, and tend to have a distribution of characters, words, word lengths, and number of words that are typical of some language (mostly English), and often consist of words drawn from a known lexicon. On the other hand, in the present day scenario, algorithmically generated domain names typically have distributions that are quite different from that of human-created domain names. We propose a fully generative model for the probability distribution of benign (white listed) domain names which can be used in an anomaly detection setting for identifying putative algorithmically generated domain names. Unlike other methods, our approach can make detections without considering any additional (latency producing) information sources, often used to detect fast flux activity. Experiments on a publicly available, large data set of domain names associated with fast flux service networks show encouraging results, relative to several baseline methods, with higher detection rates and low false positive rates.
Raghuram, Jayaram; Miller, David J.; Kesidis, George
2014-01-01
We propose a method for detecting anomalous domain names, with focus on algorithmically generated domain names which are frequently associated with malicious activities such as fast flux service networks, particularly for bot networks (or botnets), malware, and phishing. Our method is based on learning a (null hypothesis) probability model based on a large set of domain names that have been white listed by some reliable authority. Since these names are mostly assigned by humans, they are pronounceable, and tend to have a distribution of characters, words, word lengths, and number of words that are typical of some language (mostly English), and often consist of words drawn from a known lexicon. On the other hand, in the present day scenario, algorithmically generated domain names typically have distributions that are quite different from that of human-created domain names. We propose a fully generative model for the probability distribution of benign (white listed) domain names which can be used in an anomaly detection setting for identifying putative algorithmically generated domain names. Unlike other methods, our approach can make detections without considering any additional (latency producing) information sources, often used to detect fast flux activity. Experiments on a publicly available, large data set of domain names associated with fast flux service networks show encouraging results, relative to several baseline methods, with higher detection rates and low false positive rates. PMID:25685511
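A toy version of the generative idea, scoring a domain name by its per-character log-likelihood under a character-bigram model trained on white-listed names, is sketched below. The tiny whitelist, smoothing, and alphabet are assumptions; with a realistically large white list the separation between human-created and algorithmically generated names is much clearer than in this miniature example.

```python
import numpy as np

ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789-"
IDX = {ch: i for i, ch in enumerate(ALPHABET)}
START = len(ALPHABET)                         # start-of-name state

def train_bigram(whitelist, smoothing=1.0):
    """Character-bigram probabilities estimated from white-listed names."""
    counts = np.full((len(ALPHABET) + 1, len(ALPHABET)), smoothing)
    for name in whitelist:
        prev = START
        for ch in name.lower():
            if ch in IDX:
                counts[prev, IDX[ch]] += 1
                prev = IDX[ch]
    return counts / counts.sum(axis=1, keepdims=True)

def per_char_log_likelihood(name, P):
    prev, total, n = START, 0.0, 0
    for ch in name.lower():
        if ch in IDX:
            total += np.log(P[prev, IDX[ch]])
            prev, n = IDX[ch], n + 1
    return total / max(n, 1)

# A toy whitelist; the method in the paper is trained on a large corpus of
# white-listed, human-created domain names.
whitelist = ["google", "wikipedia", "weather", "university", "library",
             "photos", "travelblog", "kitchenrecipes", "news", "gardening"]
P = train_bigram(whitelist)
for name in ["universitylibrary", "xkqzjvhtrpwm1f"]:
    print(name, round(per_char_log_likelihood(name, P), 2))
# Names whose character statistics resemble the whitelist score higher
# (less negative); putative algorithmically generated names score lower.
```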
Digital codec for real-time processing of broadcast quality video signals at 1.8 bits/pixel
NASA Technical Reports Server (NTRS)
Shalkhauser, Mary JO; Whyte, Wayne A., Jr.
1989-01-01
The authors present the hardware implementation of a digital television bandwidth compression algorithm which processes standard NTSC (National Television Systems Committee) composite color television signals and produces broadcast-quality video in real time at an average of 1.8 b/pixel. The sampling rate used with this algorithm results in 768 samples over the active portion of each video line by 512 active video lines per video frame. The algorithm is based on differential pulse code modulation (DPCM), but additionally utilizes a nonadaptive predictor, nonuniform quantizer, and multilevel Huffman coder to reduce the data rate substantially below that achievable with straight DPCM. The nonadaptive predictor and multilevel Huffman coder combine to set this technique apart from prior-art DPCM encoding algorithms. The authors describe the data compression algorithm and the hardware implementation of the codec and provide performance results.
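A software sketch of the DPCM core (a previous-pixel predictor plus a fixed non-uniform quantizer inside the decoder-tracking loop) is given below; the quantizer levels and test line are invented, and the codec's actual nonadaptive predictor and multilevel Huffman coder are only noted in a comment.

```python
import numpy as np

def dpcm_encode(line, levels):
    """DPCM with a previous-pixel predictor and a fixed non-uniform quantizer.
    `levels` are the representative values for the prediction error."""
    recon_prev, codes = 0.0, []
    for pixel in line:
        err = pixel - recon_prev                     # prediction error
        k = int(np.argmin(np.abs(levels - err)))     # quantize the error
        codes.append(k)
        recon_prev = np.clip(recon_prev + levels[k], 0, 255)  # track the decoder
    return codes

def dpcm_decode(codes, levels):
    recon, prev = [], 0.0
    for k in codes:
        prev = np.clip(prev + levels[k], 0, 255)
        recon.append(prev)
    return np.array(recon)

# Coarse non-uniform quantizer: fine steps near zero, coarse steps for large errors.
levels = np.array([-60, -24, -8, -2, 0, 2, 8, 24, 60], dtype=float)
line = (128 + 60 * np.sin(np.linspace(0, 4, 768))).round()   # one 768-sample video line
codes = dpcm_encode(line, levels)
print("mean abs reconstruction error:",
      np.abs(dpcm_decode(codes, levels) - line).mean())
# In the hardware codec the quantizer indices are then entropy-coded with a
# multilevel Huffman coder to reach the average rate of 1.8 bits/pixel.
```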
Falls event detection using triaxial accelerometry and barometric pressure measurement.
Bianchi, Federico; Redmond, Stephen J; Narayanan, Michael R; Cerutti, Sergio; Celler, Branko G; Lovell, Nigel H
2009-01-01
A falls detection system, employing a Bluetooth-based wearable device, containing a triaxial accelerometer and a barometric pressure sensor, is described. The aim of this study is to evaluate the use of barometric pressure measurement, as a surrogate measure of altitude, to augment previously reported accelerometry-based falls detection algorithms. The accelerometry and barometric pressure signals obtained from the waist-mounted device are analyzed by a signal processing and classification algorithm to discriminate falls from activities of daily living. This falls detection algorithm has been compared to two existing algorithms which utilize accelerometry signals alone. A set of laboratory-based simulated falls, along with other tasks associated with activities of daily living (16 tests) were performed by 15 healthy volunteers (9 male and 6 female; age: 23.7 +/- 2.9 years; height: 1.74 +/- 0.11 m). The algorithm incorporating pressure information detected falls with the highest sensitivity (97.8%) and the highest specificity (96.7%).
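A minimal rule-of-thumb sketch of how barometric pressure can augment an accelerometer impact detector is shown below; the thresholds, sampling rate, and pressure-to-altitude conversion are illustrative assumptions and do not reproduce the classification algorithm evaluated in the study.

```python
import numpy as np

FS = 50  # Hz, hypothetical device sampling rate

def detect_fall(acc, pressure, impact_g=2.5, drop_m=0.4):
    """Flag a fall when a large acceleration impact is accompanied by a sustained
    altitude drop inferred from barometric pressure (~12 Pa per metre near sea level).
    Thresholds are illustrative, not the values used in the cited study."""
    mag = np.linalg.norm(acc, axis=1) / 9.81          # acceleration magnitude in g
    impact = np.flatnonzero(mag > impact_g)
    if impact.size == 0:
        return False
    i = impact[0]
    altitude = -(pressure - pressure[:FS].mean()) / 12.0   # metres, relative
    before = altitude[max(i - 2 * FS, 0):i].mean()
    after = altitude[i + FS:i + 3 * FS].mean()
    return (before - after) > drop_m

rng = np.random.default_rng(0)
n = 10 * FS
acc = rng.normal(0, 0.3, (n, 3)); acc[:, 2] += 9.81        # standing baseline
acc[5 * FS] += [0.0, 0.0, 35.0]                            # impact spike
pressure = 101325 + rng.normal(0, 1, n)
pressure[5 * FS:] += 0.9 * 12.0                            # ~0.9 m drop after impact
print("fall detected:", detect_fall(acc, pressure))
```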
NASA Astrophysics Data System (ADS)
Lieb, Florian; Stark, Hans-Georg; Thielemann, Christiane
2017-06-01
Objective. Spike detection from extracellular recordings is a crucial preprocessing step when analyzing neuronal activity. The decision whether a specific part of the signal is a spike or not is important for any kind of other subsequent preprocessing steps, like spike sorting or burst detection in order to reduce the classification of erroneously identified spikes. Many spike detection algorithms have already been suggested, all working reasonably well whenever the signal-to-noise ratio is large enough. When the noise level is high, however, these algorithms have a poor performance. Approach. In this paper we present two new spike detection algorithms. The first is based on a stationary wavelet energy operator and the second is based on the time-frequency representation of spikes. Both algorithms are more reliable than all of the most commonly used methods. Main results. The performance of the algorithms is confirmed by using simulated data, resembling original data recorded from cortical neurons with multielectrode arrays. In order to demonstrate that the performance of the algorithms is not restricted to only one specific set of data, we also verify the performance using a simulated publicly available data set. We show that both proposed algorithms have the best performance under all tested methods, regardless of the signal-to-noise ratio in both data sets. Significance. This contribution will redound to the benefit of electrophysiological investigations of human cells. Especially the spatial and temporal analysis of neural network communications is improved by using the proposed spike detection algorithms.
Fast online deconvolution of calcium imaging data
Zhou, Pengcheng; Paninski, Liam
2017-01-01
Fluorescent calcium indicators are a popular means for observing the spiking activity of large neuronal populations, but extracting the activity of each neuron from raw fluorescence calcium imaging data is a nontrivial problem. We present a fast online active set method to solve this sparse non-negative deconvolution problem. Importantly, the algorithm progresses through each time series sequentially from beginning to end, thus enabling real-time online estimation of neural activity during the imaging session. Our algorithm is a generalization of the pool adjacent violators algorithm (PAVA) for isotonic regression and inherits its linear-time computational complexity. We gain remarkable increases in processing speed: more than one order of magnitude compared to currently employed state of the art convex solvers relying on interior point methods. Unlike these approaches, our method can exploit warm starts; therefore optimizing model hyperparameters only requires a handful of passes through the data. A minor modification can further improve the quality of activity inference by imposing a constraint on the minimum spike size. The algorithm enables real-time simultaneous deconvolution of O(10^5) traces of whole-brain larval zebrafish imaging data on a laptop. PMID:28291787
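The underlying problem can be written as a sparse non-negative deconvolution with an exponential (AR(1)) kernel. The sketch below solves a naive batch version with non-negative least squares purely to illustrate the model; the paper's contribution is the fast online, linear-time pool-adjacent-violators-style solver, which is not reproduced here.

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(0)

# Simulate a calcium trace: sparse spikes convolved with an exponential kernel
# (decay gamma) plus noise.
T, gamma = 300, 0.95
s_true = (rng.random(T) < 0.05).astype(float)
K = gamma ** np.clip(np.subtract.outer(np.arange(T), np.arange(T)), 0, None)
K = np.tril(K)                                 # c = K @ s, lower-triangular Toeplitz
y = K @ s_true + 0.2 * rng.normal(size=T)

# Naive batch deconvolution: min ||y - K s||^2 subject to s >= 0.
s_hat, _ = nnls(K, y)
print("correlation with true spikes:", np.corrcoef(s_true, s_hat)[0, 1].round(2))
```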
de Araujo Furtado, Marcio; Zheng, Andy; Sedigh-Sarvestani, Madineh; Lumley, Lucille; Lichtenstein, Spencer; Yourick, Debra
2009-10-30
The organophosphorous compound soman is an acetylcholinesterase inhibitor that causes damage to the brain. Exposure to soman causes neuropathology as a result of prolonged and recurrent seizures. In the present study, long-term recordings of cortical EEG were used to develop an unbiased means to quantify measures of seizure activity in a large data set while excluding other signal types. Rats were implanted with telemetry transmitters and exposed to soman followed by treatment with therapeutics similar to those administered in the field after nerve agent exposure. EEG, activity and temperature were recorded continuously for a minimum of 2 days pre-exposure and 15 days post-exposure. A set of automatic MATLAB algorithms has been developed to remove artifacts and measure the characteristics of long-term EEG recordings. The algorithms use short-time Fourier transforms to compute the power spectrum of the signal for 2-s intervals. The spectrum is then divided into the delta, theta, alpha, and beta frequency bands. A linear fit to the power spectrum is used to distinguish normal EEG activity from artifacts and high amplitude spike wave activity. Changes in time spent in seizure over a prolonged period are a powerful indicator of the effects of novel therapeutics against seizures. A graphical user interface has been created that simultaneously plots the raw EEG in the time domain, the power spectrum, and the wavelet transform. Motor activity and temperature are associated with EEG changes. The accuracy of this algorithm is also verified against visual inspection of video recordings up to 3 days after exposure.
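A small Python stand-in for the band-power step (the study used MATLAB) is sketched below: a Welch spectrum on a 2-s segment, power summed in the delta/theta/alpha/beta bands, and a linear fit to the log-spectrum of the kind used to separate normal EEG from artifacts and spike-wave activity. The sampling rate and band edges are assumptions.

```python
import numpy as np
from scipy.signal import welch

FS = 250                      # Hz, hypothetical EEG sampling rate
BANDS = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 12), "beta": (12, 30)}

def band_powers(segment, fs=FS):
    """Band powers for one 2-s segment plus the slope of a linear fit to the
    log power spectrum (a simple artifact/spike-wave discriminant)."""
    freqs, pxx = welch(segment, fs=fs, nperseg=len(segment))
    powers = {name: pxx[(freqs >= lo) & (freqs < hi)].sum()
              for name, (lo, hi) in BANDS.items()}
    keep = (freqs >= 1) & (freqs <= 45)
    slope, _ = np.polyfit(freqs[keep], np.log10(pxx[keep] + 1e-20), 1)
    return powers, slope

rng = np.random.default_rng(0)
t = np.arange(2 * FS) / FS
segment = rng.normal(size=t.size) + 3 * np.sin(2 * np.pi * 6 * t)   # theta-dominated
powers, slope = band_powers(segment)
print({k: round(v, 2) for k, v in powers.items()}, "spectral slope:", round(slope, 3))
```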
Test of the Semi-Analytical Case 1 and Gelbstoff Case 2 SeaWiFS Algorithm with a Global Data Set
NASA Technical Reports Server (NTRS)
Carder, Kendall L.
1997-01-01
The algorithm-development activities at USF during the second half of 1997 have concentrated on data collection and theoretical modeling. Six abstracts were submitted for presentation at the AGU conference in San Diego, California during February 9-13, 1998. Four papers were submitted to JGR and Applied Optics for publication.
NASA Astrophysics Data System (ADS)
Liao, Wei-Cheng; Hong, Mingyi; Liu, Ya-Feng; Luo, Zhi-Quan
2014-08-01
In a densely deployed heterogeneous network (HetNet), the number of pico/micro base stations (BS) can be comparable with the number of the users. To reduce the operational overhead of the HetNet, proper identification of the set of serving BSs becomes an important design issue. In this work, we show that by jointly optimizing the transceivers and determining the active set of BSs, high system resource utilization can be achieved with only a small number of BSs. In particular, we provide formulations and efficient algorithms for such joint optimization problem, under the following two common design criteria: i) minimization of the total power consumption at the BSs, and ii) maximization of the system spectrum efficiency. In both cases, we introduce a nonsmooth regularizer to facilitate the activation of the most appropriate BSs. We illustrate the efficiency and the efficacy of the proposed algorithms via extensive numerical simulations.
A generalized LSTM-like training algorithm for second-order recurrent neural networks
Monner, Derek; Reggia, James A.
2011-01-01
The Long Short Term Memory (LSTM) is a second-order recurrent neural network architecture that excels at storing sequential short-term memories and retrieving them many time-steps later. LSTM's original training algorithm provides the important properties of spatial and temporal locality, which are missing from other training approaches, at the cost of limiting its applicability to a small set of network architectures. Here we introduce the Generalized Long Short-Term Memory (LSTM-g) training algorithm, which provides LSTM-like locality while being applicable without modification to a much wider range of second-order network architectures. With LSTM-g, all units have an identical set of operating instructions for both activation and learning, subject only to the configuration of their local environment in the network; this is in contrast to the original LSTM training algorithm, where each type of unit has its own activation and training instructions. When applied to LSTM architectures with peephole connections, LSTM-g takes advantage of an additional source of back-propagated error which can enable better performance than the original algorithm. Enabled by the broad architectural applicability of LSTM-g, we demonstrate that training recurrent networks engineered for specific tasks can produce better results than single-layer networks. We conclude that LSTM-g has the potential to both improve the performance and broaden the applicability of spatially and temporally local gradient-based training algorithms for recurrent neural networks. PMID:21803542
Petitjean, Michel
2017-10-01
Some major proteins families, such as carbonic anhydrases (CAs), have a conical cavity at the active site. No algorithm was available to compute conical cavities, so we needed to design one. The fast algorithm we designed let us show on a set of 717 CAs extracted from the PDB database that γ-CAs are characterized by active site cavity cone angles significantly larger than those of α-CAs and β-CAs: the generatrix-axis angles are greater than 60° for the γ-CAs while they are smaller than 50° for the other CAs. Free binaries of the CONICA software implementing the algorithm are available through a software repository at http://petitjeanmichel.free.fr/itoweb.petitjean.freeware.html. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
Robust non-rigid registration algorithm based on local affine registration
NASA Astrophysics Data System (ADS)
Wu, Liyang; Xiong, Lei; Du, Shaoyi; Bi, Duyan; Fang, Ting; Liu, Kun; Wu, Dongpeng
2018-04-01
To address the low precision and slow convergence of traditional point-set non-rigid registration algorithms on data with complex local deformations, this paper proposes a robust non-rigid registration algorithm based on local affine registration. The algorithm uses a hierarchical iterative method to complete the point set non-rigid registration from coarse to fine. In each iteration, the sub data point sets and sub model point sets are divided and the shape control points of each sub point set are updated. Then the control-point-guided affine ICP algorithm is used to solve the local affine transformation between the corresponding sub point sets. Next, the local affine transformation obtained in the previous step is used to update the sub data point sets and their shape control point sets. When the algorithm reaches the maximum iteration layer K, the loop ends and the updated sub data point sets are output. Experimental results demonstrate that the accuracy and convergence of our algorithm are greatly improved compared with traditional point set non-rigid registration algorithms.
NASA Astrophysics Data System (ADS)
Debats, Stephanie Renee
Smallholder farms dominate in many parts of the world, including Sub-Saharan Africa. These systems are characterized by small, heterogeneous, and often indistinct field patterns, requiring a specialized methodology to map agricultural landcover. In this thesis, we developed a benchmark labeled data set of high-resolution satellite imagery of agricultural fields in South Africa. We presented a new approach to mapping agricultural fields, based on efficient extraction of a vast set of simple, highly correlated, and interdependent features, followed by a random forest classifier. The algorithm achieved similar high performance across agricultural types, including spectrally indistinct smallholder fields, and demonstrated the ability to generalize across large geographic areas. In sensitivity analyses, we determined multi-temporal images provided greater performance gains than the addition of multi-spectral bands. We also demonstrated how active learning can be incorporated in the algorithm to create smaller, more efficient training data sets, which reduced computational resources, minimized the need for humans to hand-label data, and boosted performance. We designed a patch-based uncertainty metric to drive the active learning framework, based on the regular grid of a crowdsourcing platform, and demonstrated how subject matter experts can be replaced with fleets of crowdsourcing workers. Our active learning algorithm achieved similar performance as an algorithm trained with randomly selected data, but with 62% less data samples. This thesis furthers the goal of providing accurate agricultural landcover maps, at a scale that is relevant for the dominant smallholder class. Accurate maps are crucial for monitoring and promoting agricultural production. Furthermore, improved agricultural landcover maps will aid a host of other applications, including landcover change assessments, cadastral surveys to strengthen smallholder land rights, and constraints for crop modeling and famine prediction.
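A generic uncertainty-sampling loop of the kind described (a random forest queries the least-confident pool samples and grows the training set) is sketched below on synthetic data; it does not implement the thesis's patch-based uncertainty metric or the crowdsourcing grid.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=3000, n_features=20, random_state=0)
labeled = list(rng.choice(len(X), 50, replace=False))     # small seed training set
pool = [i for i in range(len(X)) if i not in labeled]

clf = RandomForestClassifier(n_estimators=100, random_state=0)
for round_ in range(10):
    clf.fit(X[labeled], y[labeled])
    # Uncertainty sampling: query the pool samples whose predicted class
    # probabilities are closest to 0.5 (least confident).
    proba = clf.predict_proba(X[pool])[:, 1]
    query = np.argsort(np.abs(proba - 0.5))[:25]
    newly = [pool[i] for i in query]
    labeled += newly                                       # labels supplied by annotators
    pool = [i for i in pool if i not in newly]
print("labeled samples used:", len(labeled),
      "accuracy on the remaining pool:", clf.score(X[pool], y[pool]).round(3))
```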
Parallel group independent component analysis for massive fMRI data sets.
Chen, Shaojie; Huang, Lei; Qiu, Huitong; Nebel, Mary Beth; Mostofsky, Stewart H; Pekar, James J; Lindquist, Martin A; Eloyan, Ani; Caffo, Brian S
2017-01-01
Independent component analysis (ICA) is widely used in the field of functional neuroimaging to decompose data into spatio-temporal patterns of co-activation. In particular, ICA has found wide usage in the analysis of resting state fMRI (rs-fMRI) data. Recently, a number of large-scale data sets have become publicly available that consist of rs-fMRI scans from thousands of subjects. As a result, efficient ICA algorithms that scale well to the increased number of subjects are required. To address this problem, we propose a two-stage likelihood-based algorithm for performing group ICA, which we denote Parallel Group Independent Component Analysis (PGICA). By utilizing the sequential nature of the algorithm and parallel computing techniques, we are able to efficiently analyze data sets from large numbers of subjects. We illustrate the efficacy of PGICA, which has been implemented in R and is freely available through the Comprehensive R Archive Network, through simulation studies and application to rs-fMRI data from two large multi-subject data sets, consisting of 301 and 779 subjects respectively.
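A toy two-stage illustration of group ICA is given below using scikit-learn's FastICA on two simulated subjects sharing common temporal sources; PGICA itself is a likelihood-based method implemented in R, so this sketch only gestures at the group structure.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
# Two subjects' simulated data sharing the same two temporal sources,
# mixed with subject-specific weights (a toy stand-in for group ICA).
t = np.linspace(0, 100, 500)
sources = np.c_[np.sin(0.3 * t), np.sign(np.sin(0.11 * t))]
subjects = [sources @ rng.normal(size=(2, 30)) + 0.05 * rng.normal(size=(500, 30))
            for _ in range(2)]

# Stage 1 (parallelisable across subjects) would reduce each subject separately;
# stage 2 concatenates and estimates group-level components.
group = np.hstack(subjects)
ica = FastICA(n_components=2, random_state=0)
recovered = ica.fit_transform(group)
print("best correlation of each recovered component with a true source:",
      np.abs(np.corrcoef(recovered.T, sources.T))[:2, 2:].max(axis=1).round(2))
```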
Dragas, Jelena; Jäckel, David; Hierlemann, Andreas; Franke, Felix
2017-01-01
Reliable real-time low-latency spike sorting with large data throughput is essential for studies of neural network dynamics and for brain-machine interfaces (BMIs), in which the stimulation of neural networks is based on the networks' most recent activity. However, the majority of existing multi-electrode spike-sorting algorithms are unsuited for processing high quantities of simultaneously recorded data. Recording from large neuronal networks using large high-density electrode sets (thousands of electrodes) imposes high demands on the data-processing hardware regarding computational complexity and data transmission bandwidth; this, in turn, entails demanding requirements in terms of chip area, memory resources and processing latency. This paper presents computational complexity optimization techniques, which facilitate the use of spike-sorting algorithms in large multi-electrode-based recording systems. The techniques are then applied to a previously published algorithm, on its own, unsuited for large electrode set recordings. Further, a real-time low-latency high-performance VLSI hardware architecture of the modified algorithm is presented, featuring a folded structure capable of processing the activity of hundreds of neurons simultaneously. The hardware is reconfigurable “on-the-fly” and adaptable to the nonstationarities of neuronal recordings. By transmitting exclusively spike time stamps and/or spike waveforms, its real-time processing offers the possibility of data bandwidth and data storage reduction. PMID:25415989
Dragas, Jelena; Jackel, David; Hierlemann, Andreas; Franke, Felix
2015-03-01
Reliable real-time low-latency spike sorting with large data throughput is essential for studies of neural network dynamics and for brain-machine interfaces (BMIs), in which the stimulation of neural networks is based on the networks' most recent activity. However, the majority of existing multi-electrode spike-sorting algorithms are unsuited for processing high quantities of simultaneously recorded data. Recording from large neuronal networks using large high-density electrode sets (thousands of electrodes) imposes high demands on the data-processing hardware regarding computational complexity and data transmission bandwidth; this, in turn, entails demanding requirements in terms of chip area, memory resources and processing latency. This paper presents computational complexity optimization techniques, which facilitate the use of spike-sorting algorithms in large multi-electrode-based recording systems. The techniques are then applied to a previously published algorithm, on its own, unsuited for large electrode set recordings. Further, a real-time low-latency high-performance VLSI hardware architecture of the modified algorithm is presented, featuring a folded structure capable of processing the activity of hundreds of neurons simultaneously. The hardware is reconfigurable “on-the-fly” and adaptable to the nonstationarities of neuronal recordings. By transmitting exclusively spike time stamps and/or spike waveforms, its real-time processing offers the possibility of data bandwidth and data storage reduction.
Prediction Study on Anti-Slide Control of Railway Vehicle Based on RBF Neural Networks
NASA Astrophysics Data System (ADS)
Yang, Lijun; Zhang, Jimin
During railway vehicle braking, the anti-slide control system detects the operating status of each wheel-set, e.g. speed difference and deceleration. Once the detected value on some wheel-set exceeds a pre-defined threshold, the brake effort on that wheel-set is adjusted automatically to avoid locking. This method guarantees safe operation of the vehicle and avoids wheel-set flats, but it cannot adapt to variations in rail adhesion. While wheel-sets slide, the operating status is a chaotic time series with a certain underlying law, and it can be predicted over a certain horizon from this law and experimental data. The predicted values can be used as input reference signals of the vehicle anti-slide control system, to judge and control the slide status of the wheel-sets. In this article, an RBF neural network is used to predict wheel-set slide status over multiple steps, with the weight vector adjusted by an online self-adaptive algorithm, and with the centers and normalizing parameters of the activation functions of the RBF hidden layer computed by the K-means clustering algorithm. Multi-step prediction simulations show that the predicted signal, with appropriate precision, can be used by the anti-slide system to actively trace and adjust the wheel-set slide tendency, so as to adapt to wheel-rail adhesion variation and reduce the risk of wheel-set locking.
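A hedged sketch of the prediction core (K-means-derived RBF centres and widths, linear output weights, one-step-ahead prediction that can be iterated for multi-step forecasts) is given below on a synthetic slip-like series; the signal, network size, and least-squares weight fit are assumptions, the last standing in for the online self-adaptive update described above.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Toy wheel-set slip signal and a lag embedding that predicts the next value
# from the previous `lag` samples.
t = np.arange(1200)
series = np.sin(0.07 * t) + 0.4 * np.sin(0.23 * t + 1.0) + 0.05 * rng.normal(size=t.size)
lag = 8
X = np.array([series[i:i + lag] for i in range(len(series) - lag)])
y = series[lag:]

# RBF layer: centres and widths from K-means clustering of the input vectors.
k = 25
km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X[:800])
centres = km.cluster_centers_
width = np.mean([np.linalg.norm(c1 - c2) for c1 in centres for c2 in centres]) / np.sqrt(2 * k)

def hidden(Xmat):
    d = np.linalg.norm(Xmat[:, None, :] - centres[None, :, :], axis=2)
    return np.exp(-(d / width) ** 2)

# Output weights by least squares (an online self-adaptive update such as
# recursive least squares could replace this in a real anti-slide controller).
W, *_ = np.linalg.lstsq(hidden(X[:800]), y[:800], rcond=None)
pred = hidden(X[800:]) @ W
print("one-step prediction RMSE:", np.sqrt(np.mean((pred - y[800:]) ** 2)).round(4))
```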
A Methodology for the Hybridization Based in Active Components: The Case of cGA and Scatter Search
Alba, Enrique; Leguizamón, Guillermo
2016-01-01
This work presents the results of a new methodology for hybridizing metaheuristics. By first locating the active components (parts) of one algorithm and then inserting them into a second one, we can build efficient and accurate optimization, search, and learning algorithms. This gives a concrete way of constructing new techniques that contrasts with the widespread ad hoc way of hybridizing. In this paper, the enhanced algorithm is a Cellular Genetic Algorithm (cGA), which has been successfully used in the past to find solutions to hard optimization problems. In order to extend and corroborate the use of active components as an emerging hybridization methodology, we propose here the use of active components taken from Scatter Search (SS) to improve the cGA. The results obtained over a varied set of benchmarks are highly satisfactory in efficacy and efficiency when compared with a standard cGA. Moreover, the proposed hybrid approach (i.e., cGA+SS) has shown encouraging results with regard to earlier applications of our methodology. PMID:27403153
An algorithm for the solution of dynamic linear programs
NASA Technical Reports Server (NTRS)
Psiaki, Mark L.
1989-01-01
The algorithm's objective is to efficiently solve Dynamic Linear Programs (DLP) by taking advantage of their special staircase structure. This algorithm constitutes a stepping stone to an improved algorithm for solving Dynamic Quadratic Programs, which, in turn, would make the nonlinear programming method of Successive Quadratic Programs more practical for solving trajectory optimization problems. The ultimate goal is to bring trajectory optimization solution speeds into the realm of real-time control. The algorithm exploits the staircase nature of the large constraint matrix of the equality-constrained DLPs encountered when solving inequality-constrained DLPs by an active set approach. A numerically stable, staircase QL factorization of the staircase constraint matrix is carried out starting from its last rows and columns. The resulting recursion is like the time-varying Riccati equation from multi-stage LQR theory. The resulting factorization increases the efficiency of all of the typical LP solution operations over that of a dense matrix LP code. At the same time numerical stability is ensured. The algorithm also takes advantage of dynamic programming ideas about the cost-to-go by relaxing active pseudo constraints in a backwards sweeping process. This further decreases the cost per update of the LP rank-1 updating procedure, although it may result in more changes of the active set than if pseudo constraints were relaxed in a non-stagewise fashion. The usual stability of closed-loop Linear/Quadratic optimally-controlled systems, if it carries over to strictly linear cost functions, implies that the savings due to reduced factor update effort may outweigh the cost of an increased number of updates. An aerospace example is presented in which a ground-to-ground rocket's distance is maximized. This example demonstrates the applicability of this class of algorithms to aerospace guidance. It also sheds light on the efficacy of the proposed pseudo constraint relaxation scheme.
A Round Robin evaluation of AMSR-E soil moisture retrievals
NASA Astrophysics Data System (ADS)
Mittelbach, Heidi; Hirschi, Martin; Nicolai-Shaw, Nadine; Gruber, Alexander; Dorigo, Wouter; de Jeu, Richard; Parinussa, Robert; Jones, Lucas A.; Wagner, Wolfgang; Seneviratne, Sonia I.
2014-05-01
Large-scale and long-term soil moisture observations based on remote sensing are promising data sets to investigate and understand various processes of the climate system including the water and biochemical cycles. Currently, the ESA Climate Change Initiative for soil moisture develops and evaluates a consistent global long-term soil moisture data set, which is based on merging passive and active remotely sensed soil moisture. Within this project an inter-comparison of algorithms for AMSR-E and ASCAT Level 2 products was conducted separately to assess the performance of different retrieval algorithms. Here we present the inter-comparison of AMSR-E Level 2 soil moisture products. These include the public data sets from the University of Montana (UMT), the Japan Aerospace Exploration Agency (JAXA), VU University Amsterdam (VUA; two algorithms) and the National Aeronautics and Space Administration (NASA). All participating algorithms are applied to the same AMSR-E Level 1 data set. Ascending and descending paths of scaled surface soil moisture are considered and evaluated separately in daily and monthly resolution over the 2007-2011 time period. Absolute values of soil moisture as well as their long-term anomalies (i.e. removing the mean seasonal cycle) and short-term anomalies (i.e. removing a five-week moving average) are evaluated. The evaluation is based on conventional measures like correlation and unbiased root-mean-square differences as well as on the application of the triple collocation method. As reference data set, surface soil moisture from 75 quality-controlled soil moisture sites of the International Soil Moisture Network (ISMN) is used, covering a wide range of vegetation density and climate conditions. For the application of the triple collocation method, surface soil moisture estimates from the Global Land Data Assimilation System are used as a third independent data set. We find that the participating algorithms generally display a better performance for the descending compared to the ascending paths. A first classification of the sites defined by geographical locations shows that the algorithms have a very similar average performance. Further classifications of the sites by land cover types and climate regions will be conducted, which might result in a more diverse performance of the algorithms.
On Connected Target k-Coverage in Heterogeneous Wireless Sensor Networks.
Yu, Jiguo; Chen, Ying; Ma, Liran; Huang, Baogui; Cheng, Xiuzhen
2016-01-15
Coverage and connectivity are two important performance evaluation indices for wireless sensor networks (WSNs). In this paper, we focus on the connected target k-coverage (CTCk) problem in heterogeneous wireless sensor networks (HWSNs). A centralized connected target k-coverage algorithm (CCTCk) and a distributed connected target k-coverage algorithm (DCTCk) are proposed so as to generate connected cover sets for energy-efficient connectivity and coverage maintenance. To be specific, our proposed algorithms aim at achieving minimum connected target k-coverage, where each target in the monitored region is covered by at least k active sensor nodes. In addition, these two algorithms strive to minimize the total number of active sensor nodes and guarantee that each sensor node is connected to a sink, such that the sensed data can be forwarded to the sink. Our theoretical analysis and simulation results show that our proposed algorithms outperform a state-of-the-art connected k-coverage protocol for HWSNs.
A semi-learning algorithm for noise rejection: an fNIRS study on ADHD children
NASA Astrophysics Data System (ADS)
Sutoko, Stephanie; Funane, Tsukasa; Katura, Takusige; Sato, Hiroki; Kiguchi, Masashi; Maki, Atsushi; Monden, Yukifumi; Nagashima, Masako; Yamagata, Takanori; Dan, Ippeita
2017-02-01
In pediatric studies, the quality of functional near-infrared spectroscopy (fNIRS) signals is often reduced by motion artifacts. These artifacts likely mislead brain functionality analysis, causing false discoveries. While noise correction methods and their performance have been investigated, these methods require several parameter assumptions that apparently result in noise overfitting. In contrast, the rejection of noisy signals serves as a preferable method because it maintains the originality of the signal waveform. Here, we describe a semi-learning algorithm to detect and eliminate noisy signals. The algorithm dynamically adjusts noise detection according to predetermined noise criteria, which are spikes, unusual activation values (averaged amplitude signals within the brain activation period), and high activation variances (among trials). The criteria were organized sequentially in the algorithm, and signals were assessed in order against each criterion. By initially setting an acceptable rejection rate, particular criteria causing excessive data rejection are neglected, whereas the others practically eliminate noise at tolerable rejection levels. fNIRS data measured during an attention response paradigm (oddball task) in children with attention deficit/hyperactivity disorder (ADHD) were utilized to evaluate and optimize the algorithm's performance. This algorithm successfully substituted for the visual noise identification performed in previous studies and consistently found significantly lower activation of the right prefrontal and parietal cortices in ADHD patients than in typically developing children. Thus, we conclude that the semi-learning algorithm confers more objective and standardized judgment for noise rejection and presents a promising alternative to visual noise rejection.
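A minimal sketch of the sequential, rate-capped rejection idea is given below; the criteria, thresholds and channel structure are illustrative assumptions rather than the authors' actual parameters.

```python
# Sketch of the sequential, rate-capped rejection idea (criteria, thresholds and
# channel structure are illustrative assumptions, not the authors' parameters).
import numpy as np

def reject_noisy_channels(signals, criteria, max_reject_rate=0.3):
    """signals: dict name -> 1-D array. criteria: ordered list of (label, test_fn),
    where test_fn(x) returns True if the channel should be rejected."""
    keep = dict(signals)
    for label, test in criteria:
        flagged = [name for name, x in keep.items() if test(x)]
        if len(flagged) / max(len(keep), 1) > max_reject_rate:
            continue                      # criterion rejects too much data; neglect it
        for name in flagged:              # otherwise eliminate the flagged channels
            keep.pop(name)
    return keep

criteria = [
    ("spike",     lambda x: np.max(np.abs(np.diff(x))) > 5 * np.std(x)),
    ("amplitude", lambda x: np.abs(x.mean()) > 3 * np.std(x)),
]
```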
Automated Ecological Assessment of Physical Activity: Advancing Direct Observation.
Carlson, Jordan A; Liu, Bo; Sallis, James F; Kerr, Jacqueline; Hipp, J Aaron; Staggs, Vincent S; Papa, Amy; Dean, Kelsey; Vasconcelos, Nuno M
2017-12-01
Technological advances provide opportunities for automating direct observations of physical activity, which allow for continuous monitoring and feedback. This pilot study evaluated the initial validity of computer vision algorithms for ecological assessment of physical activity. The sample comprised 6630 seconds per camera (three cameras in total) of video capturing up to nine participants engaged in sitting, standing, walking, and jogging in an open outdoor space while wearing accelerometers. Computer vision algorithms were developed to assess the number and proportion of people in sedentary, light, moderate, and vigorous activity, and group-based metabolic equivalents of tasks (MET)-minutes. Means and standard deviations (SD) of bias/difference values, and intraclass correlation coefficients (ICC) assessed the criterion validity compared to accelerometry separately for each camera. The number and proportion of participants sedentary and in moderate-to-vigorous physical activity (MVPA) had small biases (within 20% of the criterion mean) and the ICCs were excellent (0.82-0.98). Total MET-minutes were slightly underestimated by 9.3-17.1% and the ICCs were good (0.68-0.79). The standard deviations of the bias estimates were moderate-to-large relative to the means. The computer vision algorithms appeared to have acceptable sample-level validity (i.e., across a sample of time intervals) and are promising for automated ecological assessment of activity in open outdoor settings, but further development and testing is needed before such tools can be used in a diverse range of settings.
Utilization of Ancillary Data Sets for SMAP Algorithm Development and Product Generation
NASA Technical Reports Server (NTRS)
ONeill, P.; Podest, E.; Njoku, E.
2011-01-01
Algorithms being developed for the Soil Moisture Active Passive (SMAP) mission require a variety of both static and ancillary data. The selection of the most appropriate source for each ancillary data parameter is driven by a number of considerations, including accuracy, latency, availability, and consistency across all SMAP products and with SMOS (Soil Moisture Ocean Salinity). It is anticipated that initial selection of all ancillary datasets, which are needed for ongoing algorithm development activities on the SMAP algorithm testbed at JPL, will be completed within the year. These datasets will be updated as new or improved sources become available, and all selections and changes will be documented for the benefit of the user community. Wise choices in ancillary data will help to enable SMAP to provide new global measurements of soil moisture and freeze/thaw state at the targeted accuracy necessary to tackle hydrologically-relevant societal issues.
Experimental and analytical study of secondary path variations in active engine mounts
NASA Astrophysics Data System (ADS)
Hausberg, Fabian; Scheiblegger, Christian; Pfeffer, Peter; Plöchl, Manfred; Hecker, Simon; Rupp, Markus
2015-03-01
Active engine mounts (AEMs) provide an effective solution to further improve the acoustic and vibrational comfort of passenger cars. Typically, adaptive feedforward control algorithms, e.g., the filtered-x-least-mean-squares (FxLMS) algorithm, are applied to cancel disturbing engine vibrations. These algorithms require an accurate estimate of the AEM active dynamic characteristics, also known as the secondary path, in order to guarantee control performance and stability. This paper focuses on the experimental and theoretical study of secondary path variations in AEMs. The impact of three major influences, namely nonlinearity, change of preload and component temperature, on the AEM active dynamic characteristics is experimentally analyzed. The obtained test results are theoretically investigated with a linear AEM model which incorporates an appropriate description for elastomeric components. A special experimental set-up extends the model validation of the active dynamic characteristics to higher frequencies up to 400 Hz. The theoretical and experimental results show that significant secondary path variations are merely observed in the frequency range of the AEM actuator's resonance frequency. These variations mainly result from the change of the component temperature. As the stability of the algorithm is primarily affected by the actuator's resonance frequency, the findings of this paper facilitate the design of AEMs with simpler adaptive feedforward algorithms. From a practical point of view it may further be concluded that algorithmic countermeasures against instability are only necessary in the frequency range of the AEM actuator's resonance frequency.
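For reference, a single-channel, time-domain FxLMS update of the kind referred to above can be sketched as follows. All signals, filter lengths and the step size are illustrative assumptions; production AEM controllers are considerably more elaborate.

```python
# Minimal single-channel FxLMS sketch (illustrative; real AEM controllers are
# more elaborate). s_hat is the estimated secondary-path impulse response.
import numpy as np

def fxlms(x, d, s, s_hat, L=32, mu=1e-3):
    """x: reference (engine) signal, d: disturbance at the error sensor,
    s: true secondary path, s_hat: its estimate, L: controller length."""
    w = np.zeros(L)                        # adaptive feedforward filter
    xf = np.convolve(x, s_hat)[:len(x)]    # filtered-x signal
    y_buf = np.zeros(len(s))               # recent controller outputs fed to the plant
    e = np.zeros(len(x))
    for n in range(len(x)):
        x_vec = x[max(0, n - L + 1):n + 1][::-1]
        y = np.dot(w[:len(x_vec)], x_vec)              # actuator command
        y_buf = np.roll(y_buf, 1); y_buf[0] = y
        e[n] = d[n] + np.dot(s, y_buf)                 # residual at the error sensor
        xf_vec = xf[max(0, n - L + 1):n + 1][::-1]
        w[:len(xf_vec)] -= mu * e[n] * xf_vec          # FxLMS update
    return w, e
```

Stability of the update depends on how well s_hat matches the true secondary path, which is why the secondary-path variations studied here matter for controller design.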
Li, Mengshan; Zhang, Huaijing; Chen, Bingsheng; Wu, Yan; Guan, Lixin
2018-03-05
The pKa value of drugs is an important parameter in drug design and pharmacology. In this paper, an improved particle swarm optimization (PSO) algorithm was proposed based on population entropy diversity. In the improved algorithm, when the population entropy was higher than the set maximum threshold, the convergence strategy was adopted; when the population entropy was lower than the set minimum threshold, the divergence strategy was adopted; when the population entropy was between the maximum and minimum thresholds, the self-adaptive adjustment strategy was maintained. The improved PSO algorithm was applied in the training of a radial basis function artificial neural network (RBF ANN) model and the selection of molecular descriptors. A quantitative structure-activity relationship model based on an RBF ANN trained by the improved PSO algorithm was proposed to predict the pKa values of 74 kinds of neutral and basic drugs and then validated on another database containing 20 molecules. The validation results showed that the model had a good prediction performance. The absolute average relative error, root mean square error, and squared correlation coefficient were 0.3105, 0.0411, and 0.9685, respectively. The model can be used as a reference for exploring other quantitative structure-activity relationships.
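The strategy-switching idea can be sketched as follows; the entropy measure, thresholds and inertia values are illustrative assumptions, not the parameters used in the paper.

```python
# Sketch of entropy-based strategy switching in PSO (the entropy measure,
# thresholds and update constants are illustrative assumptions).
import numpy as np

def population_entropy(positions, bins=10):
    """Shannon entropy of particle positions, averaged over dimensions."""
    H = 0.0
    for d in range(positions.shape[1]):
        p, _ = np.histogram(positions[:, d], bins=bins)
        p = p / p.sum()
        p = p[p > 0]
        H += -(p * np.log(p)).sum()
    return H / positions.shape[1]

def choose_inertia(H, H_min=0.5, H_max=1.8):
    if H > H_max:      # high diversity: encourage convergence (small inertia)
        return 0.4
    if H < H_min:      # low diversity: encourage divergence (large inertia)
        return 0.9
    return 0.7         # otherwise keep the self-adaptive / default setting
```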
Bidirectional Active Learning: A Two-Way Exploration Into Unlabeled and Labeled Data Set.
Zhang, Xiao-Yu; Wang, Shupeng; Yun, Xiaochun
2015-12-01
In practical machine learning applications, human instruction is indispensable for model construction. To utilize the precious labeling effort effectively, active learning queries the user with selective sampling in an interactive way. Traditional active learning techniques merely focus on the unlabeled data set under a unidirectional exploration framework and suffer from model deterioration in the presence of noise. To address this problem, this paper proposes a novel bidirectional active learning algorithm that explores both the unlabeled and the labeled data sets simultaneously in a two-way process. For the acquisition of new knowledge, forward learning queries the most informative instances from the unlabeled data set. For the introspection of learned knowledge, backward learning detects the most suspiciously unreliable instances within the labeled data set. Under the two-way exploration framework, the generalization ability of the learning model can be greatly improved, which is demonstrated by the encouraging experimental results.
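One bidirectional round could be sketched as below, using uncertainty for the forward query and label plausibility for the backward check; these heuristics stand in for the paper's exact criteria, and all names are hypothetical.

```python
# Sketch of one bidirectional round (selection heuristics are illustrative, not
# the paper's exact criteria), using a probabilistic scikit-learn classifier.
import numpy as np
from sklearn.linear_model import LogisticRegression

def bidirectional_round(X_lab, y_lab, X_unlab):
    # y_lab assumed integer-coded 0..K-1, matching predict_proba column order.
    clf = LogisticRegression(max_iter=1000).fit(X_lab, y_lab)
    # Forward learning: query the most informative (smallest-margin) unlabeled point.
    p_unlab = np.sort(clf.predict_proba(X_unlab), axis=1)
    margin = p_unlab[:, -1] - p_unlab[:, -2]
    query_idx = int(np.argmin(margin))
    # Backward learning: flag the labeled point whose given label the model finds
    # least plausible (a candidate noisy label to re-examine).
    p_lab = clf.predict_proba(X_lab)[np.arange(len(y_lab)), y_lab]
    suspect_idx = int(np.argmin(p_lab))
    return query_idx, suspect_idx
```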
NASA Astrophysics Data System (ADS)
Nair, Binu M.; Diskin, Yakov; Asari, Vijayan K.
2012-10-01
We present an autonomous system capable of performing security check routines. The surveillance machine, the Clearpath Husky robotic platform, is equipped with three IP cameras with different orientations for the surveillance tasks of face recognition, human activity recognition, autonomous navigation and 3D reconstruction of its environment. Combining the computer vision algorithms onto a robotic machine has given birth to the Robust Artificial Intelligence-based Defense Electro-Robot (RAIDER). The end purpose of the RAIDER is to conduct a patrolling routine on a single floor of a building several times a day. As the RAIDER travels down the corridors, off-line algorithms use two of the RAIDER's side-mounted cameras to perform 3D reconstruction from monocular vision, updating a 3D model to the most current state of the indoor environment. Using frames from the front-mounted camera, positioned at human eye level, the system performs face recognition with real-time training of unknown subjects. A human activity recognition algorithm will also be implemented, in which each detected person is assigned to a set of action classes picked to classify ordinary and harmful student activities in a hallway setting. The system is designed to detect changes and irregularities within an environment as well as to familiarize itself with regular faces and actions in order to distinguish potentially dangerous behavior. In this paper, we present the various algorithms and their modifications which, when implemented on the RAIDER, serve the purpose of indoor surveillance.
Bourke, Alan K; Klenk, Jochen; Schwickert, Lars; Aminian, Kamiar; Ihlen, Espen A F; Mellone, Sabato; Helbostad, Jorunn L; Chiari, Lorenzo; Becker, Clemens
2016-08-01
Automatic fall detection will promote independent living and reduce the consequences of falls in the elderly by ensuring people can confidently live safely at home for longer. In laboratory studies, inertial sensor technology has been shown capable of distinguishing falls from normal activities. However, less than 7% of fall-detection algorithm studies have used fall data recorded from elderly people in real life. The FARSEEING project has compiled a database of real-life falls from elderly people, to gain new knowledge about fall events and to develop fall detection algorithms to combat the problems associated with falls. We have extracted 12 different kinematic, temporal and kinetic related features from a data set of 89 real-world falls and 368 activities of daily living. Using the extracted features we applied machine learning techniques and produced a selection of algorithms based on different feature combinations. The best algorithm employs 10 different features and produced a sensitivity of 0.88 and a specificity of 0.87 in classifying falls correctly. This algorithm can be used to distinguish real-world falls from normal activities of daily living with a sensor consisting of a tri-axial accelerometer and tri-axial gyroscope located at L5.
Joint reconstruction of activity and attenuation in Time-of-Flight PET: A Quantitative Analysis.
Rezaei, Ahmadreza; Deroose, Christophe M; Vahle, Thomas; Boada, Fernando; Nuyts, Johan
2018-03-01
Joint activity and attenuation reconstruction methods from time-of-flight (TOF) positron emission tomography (PET) data provide an effective solution to attenuation correction when no (or incomplete/inaccurate) information on the attenuation is available. One of the main barriers limiting their use in clinical practice is the lack of validation of these methods on a relatively large patient database. In this contribution, we aim at validating the activity reconstructions of the maximum likelihood activity reconstruction and attenuation registration (MLRR) algorithm on a whole-body patient data set. Furthermore, a partial validation (since the scale problem of the algorithm is avoided for now) of the maximum likelihood activity and attenuation reconstruction (MLAA) algorithm is also provided. We present a quantitative comparison of the joint reconstructions to the current clinical gold-standard maximum likelihood expectation maximization (MLEM) reconstruction with CT-based attenuation correction. Methods: The whole-body TOF-PET emission data of each patient data set is processed as a whole to reconstruct an activity volume covering all the acquired bed positions, which helps to reduce the problem of a scale per bed position in MLAA to a global scale for the entire activity volume. Three reconstruction algorithms are used: MLEM, MLRR and MLAA. A maximum likelihood (ML) scaling of the single scatter simulation (SSS) estimate to the emission data is used for scatter correction. The reconstruction results are then analyzed in different regions of interest. Results: The joint reconstructions of the whole-body patient data set provide better quantification in cases of PET and CT misalignment caused by patient and organ motion. Our quantitative analysis shows differences of -4.2% (±2.3%) and -7.5% (±4.6%) between the joint reconstructions of MLRR and MLAA, respectively, and MLEM, averaged over all regions of interest. Conclusion: Joint activity and attenuation estimation methods provide a useful means to estimate the tracer distribution in cases where CT-based attenuation images are subject to misalignments or are not available. With an accurate estimate of the scatter contribution in the emission measurements, the joint TOF-PET reconstructions are within clinically acceptable accuracy.
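For orientation, the MLEM baseline mentioned above performs a multiplicative update of the activity; a toy matrix-based sketch is shown below. The system matrix, additive term and dimensions are assumptions, and the attenuation/registration updates of MLRR and MLAA are not shown.

```python
# Toy MLEM sketch for emission reconstruction with a known system matrix A
# (attenuation folded into A, scatter/randoms as an additive term s). Purely
# illustrative; the MLRR/MLAA attenuation updates discussed above are not shown.
import numpy as np

def mlem(A, y, s, n_iter=50):
    """A: (n_bins, n_voxels) system matrix, y: measured counts,
    s: additive (scatter + randoms) estimate."""
    lam = np.ones(A.shape[1])              # initial activity estimate
    sens = A.sum(axis=0)                   # sensitivity image
    for _ in range(n_iter):
        expected = A @ lam + s
        ratio = y / np.maximum(expected, 1e-12)
        lam *= (A.T @ ratio) / np.maximum(sens, 1e-12)
    return lam
```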
Tengku Hashim, Tengku Juhana; Mohamed, Azah
2017-01-01
The growing interest in distributed generation (DG) in recent years has led to a number of generators being connected to a distribution system. The integration of DGs in a distribution system has resulted in a network known as an active distribution network, due to the existence of bidirectional power flow in the system. The voltage rise issue is one of the predominantly important technical issues to be addressed when DGs exist in an active distribution network. This paper presents the application of the backtracking search algorithm (BSA), which is a relatively new optimisation technique, to determine the optimal settings of coordinated voltage control in a distribution system. The coordinated voltage control considers power factor, on-load tap-changer and generation curtailment control to manage the voltage rise issue. A multi-objective function is formulated to minimise total losses and voltage deviation in a distribution system. The proposed BSA is compared with particle swarm optimisation (PSO) so as to evaluate its effectiveness in determining the optimal settings of power factor, tap-changer and percentage of active power generation to be curtailed. The load flow algorithm from MATPOWER is integrated in the MATLAB environment to solve the multi-objective optimisation problem. Both the BSA and PSO optimisation techniques have been tested on a radial 13-bus distribution system and the results show that the BSA performs better than PSO by providing a better fitness value and convergence rate.
MODIS infrared data applied to Popocatepetl's volcanic clouds
NASA Astrophysics Data System (ADS)
Rose, W. I.; Delgado-Granados, H.; Watson, I. M.; Matiella, M. A.; Escobar, D.; Gu, Y.
2003-04-01
Popocatepetl volcano, Mexico, has shown diverse activity for the past eight years and has been characterized by strong sulfur dioxide releases and ash eruptions of variable tropospheric height. We have begun study on the eruptive activity of December 2000 and January 2001, when Popocatepetl was showing prominent activity. MODIS data is abundant during this period, and we have applied a variety of algorithms which use infrared channels of MODIS and can potentially map and measure ash size and optical depth [Wen & Rose, 1994, J. Geophys. Res., 99, 5421-5431], sulfur dioxide mass [Realmuto et al., 1997, J. Geophys. Res., 102, 15057-15072; Prata et al., in press, AGU Volcanism & Atmosphere Monograph] and sulfate particle size and mass [Yu & Rose, 2000, AGU Monograph 116: 87-100]. Because of variable environmental conditions (clouds, winds) and characteristics of the activity (explosivity and rates of sulfur dioxide and ash releases), the data set studied offers a robust test of the various algorithms, and the data may also be compared to data collected as part of the volcanic monitoring effort, including COSPEC-based sulfur dioxide surveys. The data set will be used to evaluate which algorithms work best in various conditions. At abstract time work on the data is incomplete, but we expect that such data may provide information that is useful to the volcanologists studying Popocatepetl and the people who provide information for ash cloud hazards to aircraft.
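Ash-mapping algorithms of the Wen & Rose type start from the split-window brightness temperature difference, in which negative values indicate silicate ash; a minimal masking sketch is given below. The threshold and MODIS band assignment are illustrative assumptions.

```python
# Minimal sketch of the split-window (brightness-temperature-difference) ash flag
# that Wen & Rose-type retrievals start from; the threshold and band numbering
# here are assumptions for illustration.
import numpy as np

def ash_mask(bt11, bt12, threshold=-0.5):
    """bt11, bt12: brightness temperatures (K) near 11 and 12 micron
    (e.g. MODIS bands 31 and 32). Negative BTD suggests silicate ash."""
    btd = bt11 - bt12
    return btd < threshold
```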
Information filtering in evolving online networks
NASA Astrophysics Data System (ADS)
Chen, Bo-Lun; Li, Fen-Fen; Zhang, Yong-Jun; Ma, Jia-Lin
2018-02-01
Recommender systems use the records of users' activities and the profiles of both users and products to predict users' preferences in the future. Considerable work on recommendation algorithms has been published to address problems such as accuracy, diversity, congestion, cold-start, novelty, coverage and so on. However, most of this research did not consider the temporal effects of the information included in the users' historical data. For example, the segmentation of the training set and test set was completely random, which is entirely different from the real scenario in recommender systems. More seriously, all objects are treated the same, regardless of whether they are new, popular or obsolete products, and the same holds for users. These data processing methods lose useful information and mislead the understanding of the system's state. In this paper, we analyzed in detail the difference in network structure between the traditional random division method and the temporal division method on two benchmark data sets, Netflix and MovieLens. Then three classical recommendation algorithms, the Global Ranking method, Collaborative Filtering and the Mass Diffusion method, were employed. The results show that all these algorithms perform worse on all four key indicators (ranking score, precision, popularity and diversity) in the temporal scenario. Finally, we design a new recommendation algorithm based on both users' and objects' first appearance time in the system. Experimental results show that the new algorithm can greatly improve the accuracy and other metrics.
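The contrast between the two segmentation schemes can be sketched as follows; the record fields and split fraction are assumptions for illustration.

```python
# Sketch of the two segmentation schemes compared above, for a list of
# (user, item, rating, timestamp) records; field names are assumptions.
import random

def random_split(records, test_frac=0.1, seed=0):
    recs = records[:]
    random.Random(seed).shuffle(recs)
    cut = int(len(recs) * (1 - test_frac))
    return recs[:cut], recs[cut:]

def temporal_split(records, test_frac=0.1):
    recs = sorted(records, key=lambda r: r["timestamp"])   # oldest first
    cut = int(len(recs) * (1 - test_frac))
    return recs[:cut], recs[cut:]      # test set contains only the newest events
```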
ANALYTiC: An Active Learning System for Trajectory Classification.
Soares Junior, Amilcar; Renso, Chiara; Matwin, Stan
2017-01-01
The increasing availability and use of positioning devices has resulted in large volumes of trajectory data. However, semantic annotations for such data are typically added by domain experts, which is a time-consuming task. Machine-learning algorithms can help infer semantic annotations from trajectory data by learning from sets of labeled data. Specifically, active learning approaches can minimize the set of trajectories to be annotated while preserving good performance measures. The ANALYTiC web-based interactive tool visually guides users through this annotation process.
Active learning in the presence of unlabelable examples
NASA Technical Reports Server (NTRS)
Mazzoni, Dominic; Wagstaff, Kiri
2004-01-01
We propose a new active learning framework where the expert labeler is allowed to decline to label any example. This may be necessary because the true label is unknown or because the example belongs to a class that is not part of the real training problem. We show that within this framework, popular active learning algorithms (such as Simple) may perform worse than random selection because they make so many queries to the unlabelable class. We present a method by which any active learning algorithm can be modified to avoid unlabelable examples by training a second classifier to distinguish between the labelable and unlabelable classes. We also demonstrate the effectiveness of the method on two benchmark data sets and a real-world problem.
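The proposed modification can be sketched as a wrapper in which a second classifier gates the query selection; the uncertainty heuristic, gating threshold and names below are illustrative assumptions, not the authors' exact setup.

```python
# Sketch of an unlabelable-aware query step: a second classifier predicts whether
# an example is labelable, and the active learner only queries examples predicted
# labelable. Heuristics and names are illustrative assumptions.
import numpy as np
from sklearn.svm import SVC

def next_query(X_labeled, y_labeled, X_declined, X_pool):
    """X_labeled/y_labeled: examples the expert labeled; X_declined: examples the
    expert declined to label (assumed non-empty); X_pool: remaining candidates."""
    task_clf = SVC(probability=True).fit(X_labeled, y_labeled)
    X_gate = np.vstack([X_labeled, X_declined])
    y_gate = np.r_[np.ones(len(X_labeled)), np.zeros(len(X_declined))]
    gate_clf = SVC(probability=True).fit(X_gate, y_gate)
    uncertainty = 1.0 - task_clf.predict_proba(X_pool).max(axis=1)
    p_labelable = gate_clf.predict_proba(X_pool)[:, 1]   # column 1 = class "labelable"
    candidates = np.where(p_labelable > 0.5)[0]
    if len(candidates) == 0:                 # fall back to the whole pool if needed
        candidates = np.arange(len(X_pool))
    return int(candidates[np.argmax(uncertainty[candidates])])
```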
An Efficient Distributed Compressed Sensing Algorithm for Decentralized Sensor Network.
Liu, Jing; Huang, Kaiyu; Zhang, Guoxian
2017-04-20
We consider the joint sparsity Model 1 (JSM-1) in a decentralized scenario, where a number of sensors are connected through a network and there is no fusion center. A novel algorithm, named distributed compact sensing matrix pursuit (DCSMP), is proposed to exploit the computational and communication capabilities of the sensor nodes. In contrast to the conventional distributed compressed sensing algorithms adopting a random sensing matrix, the proposed algorithm focuses on the deterministic sensing matrices built directly on the real acquisition systems. The proposed DCSMP algorithm can be divided into two independent parts, the common and innovation support set estimation processes. The goal of the common support set estimation process is to obtain an estimated common support set by fusing the candidate support set information from an individual node and its neighboring nodes. In the following innovation support set estimation process, the measurement vector is projected into a subspace that is perpendicular to the subspace spanned by the columns indexed by the estimated common support set, to remove the impact of the estimated common support set. We can then search the innovation support set using an orthogonal matching pursuit (OMP) algorithm based on the projected measurement vector and projected sensing matrix. In the proposed DCSMP algorithm, the process of estimating the common component/support set is decoupled from that of estimating the innovation component/support set. Thus, an inaccurately estimated common support set will have no impact on estimating the innovation support set. It is proven that, under the condition that the estimated common support set contains the true common support set, the proposed algorithm can find the true innovation support set correctly. Moreover, since the innovation support set estimation process is independent of the common support set estimation process, there is no requirement on the cardinality of either set; thus, the proposed DCSMP algorithm is capable of tackling the unknown sparsity problem successfully.
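Since the innovation-support step relies on standard orthogonal matching pursuit, a generic OMP sketch is included below for reference; this is not the full DCSMP algorithm.

```python
# Generic orthogonal matching pursuit sketch (standard OMP, shown because the
# innovation-support step of DCSMP relies on it; not the full DCSMP algorithm).
import numpy as np

def omp(Phi, y, k):
    """Recover a k-sparse coefficient vector from y ~ Phi x (assumes k >= 1)."""
    residual = y.copy()
    support = []
    for _ in range(k):
        correlations = np.abs(Phi.T @ residual)
        correlations[support] = 0.0                 # do not re-pick chosen atoms
        support.append(int(np.argmax(correlations)))
        Phi_s = Phi[:, support]
        coef, *_ = np.linalg.lstsq(Phi_s, y, rcond=None)
        residual = y - Phi_s @ coef
    x = np.zeros(Phi.shape[1])
    x[support] = coef
    return x, sorted(support)
```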
Image segmentation algorithm based on improved PCNN
NASA Astrophysics Data System (ADS)
Chen, Hong; Wu, Chengdong; Yu, Xiaosheng; Wu, Jiahui
2017-11-01
A modified simplified Pulse Coupled Neural Network (PCNN) model is proposed in this article based on the simplified PCNN. Some work has been done to enrich this model, such as imposing restrictions on the inputs and improving the linking inputs and internal activity of the PCNN. A self-adaptive method for setting the linking coefficient and the threshold decay time constant is also proposed. Finally, we implemented an image segmentation algorithm for five test images based on the proposed simplified PCNN model and PSO. Experimental results demonstrate that this image segmentation algorithm performs much better than the SPCNN and Otsu methods.
Evaluation and Application of Satellite-Based Latent Heating Profile Estimation Methods
NASA Technical Reports Server (NTRS)
Olson, William S.; Grecu, Mircea; Yang, Song; Tao, Wei-Kuo
2004-01-01
In recent years, methods for estimating atmospheric latent heating vertical structure from both passive and active microwave remote sensing have matured to the point where quantitative evaluation of these methods is the next logical step. Two approaches for heating algorithm evaluation are proposed: First, application of heating algorithms to synthetic data, based upon cloud-resolving model simulations, can be used to test the internal consistency of heating estimates in the absence of systematic errors in physical assumptions. Second, comparisons of satellite-retrieved vertical heating structures to independent ground-based estimates, such as rawinsonde-derived analyses of heating, provide an additional test. The two approaches are complementary, since systematic errors in heating indicated by the second approach may be confirmed by the first. A passive microwave and a combined passive/active microwave heating retrieval algorithm are evaluated using the described approaches. In general, the passive microwave algorithm heating profile estimates are subject to biases due to the limited vertical heating structure information contained in the passive microwave observations. These biases may be partly overcome by including more environment-specific a priori information into the algorithm's database of candidate solution profiles. The combined passive/active microwave algorithm utilizes the much higher-resolution vertical structure information provided by spaceborne radar data to produce less biased estimates; however, the global spatio-temporal sampling by spaceborne radar is limited. In the present study, the passive/active microwave algorithm is used to construct a more physically-consistent and environment-specific set of candidate solution profiles for the passive microwave algorithm and to help evaluate errors in the passive algorithm's heating estimates. Although satellite estimates of latent heating are based upon instantaneous, footprint-scale data, suppression of random errors requires averaging to at least half-degree resolution. Analysis of mesoscale and larger space-time scale phenomena based upon passive and passive/active microwave heating estimates from TRMM, SSMI, and AMSR data will be presented at the conference.
Fast Dating Using Least-Squares Criteria and Algorithms.
To, Thu-Hien; Jung, Matthieu; Lycett, Samantha; Gascuel, Olivier
2016-01-01
Phylogenies provide a useful way to understand the evolutionary history of genetic samples, and data sets with more than a thousand taxa are becoming increasingly common, notably with viruses (e.g., human immunodeficiency virus (HIV)). Dating ancestral events is one of the first, essential goals with such data. However, current sophisticated probabilistic approaches struggle to handle data sets of this size. Here, we present very fast dating algorithms, based on a Gaussian model closely related to the Langley-Fitch molecular-clock model. We show that this model is robust to uncorrelated violations of the molecular clock. Our algorithms apply to serial data, where the tips of the tree have been sampled through times. They estimate the substitution rate and the dates of all ancestral nodes. When the input tree is unrooted, they can provide an estimate for the root position, thus representing a new, practical alternative to the standard rooting methods (e.g., midpoint). Our algorithms exploit the tree (recursive) structure of the problem at hand, and the close relationships between least-squares and linear algebra. We distinguish between an unconstrained setting and the case where the temporal precedence constraint (i.e., an ancestral node must be older than its daughter nodes) is accounted for. With rooted trees, the former is solved using linear algebra in linear computing time (i.e., proportional to the number of taxa), while the resolution of the latter, constrained setting, is based on an active-set method that runs in nearly linear time. With unrooted trees the computing time becomes (nearly) quadratic (i.e., proportional to the square of the number of taxa). In all cases, very large input trees (>10,000 taxa) can easily be processed and transformed into time-scaled trees. We compare these algorithms to standard methods (root-to-tip, the r8s version of the Langley-Fitch method, and BEAST). Using simulated data, we show that their estimation accuracy is similar to that of the most sophisticated methods, while their computing time is much faster. We apply these algorithms on a large data set comprising 1194 strains of Influenza virus from the pdm09 H1N1 Human pandemic. Again the results show that these algorithms provide a very fast alternative with results similar to those of other computer programs. These algorithms are implemented in the LSD software (least-squares dating), which can be downloaded from http://www.atgc-montpellier.fr/LSD/, along with all our data sets and detailed results. An Online Appendix, providing additional algorithm descriptions, tables, and figures can be found in the Supplementary Material available on Dryad at http://dx.doi.org/10.5061/dryad.968t3.
Human Purposive Movement Theory
2012-03-01
Presents the theory and provides examples of developmental and operational technologies that could use this theory in common settings. Subject terms: human activity, prediction of behavior, human algorithms, purposive movement theory.
Cornejo-Aragón, Luz G; Santos-Cuevas, Clara L; Ocampo-García, Blanca E; Chairez-Oria, Isaac; Diaz-Nieto, Lorenza; García-Quiroz, Janice
2017-01-01
The aim of this study was to develop a semi-automatic image processing algorithm (AIPA) based on the simultaneous information provided by X-ray and radioisotopic images to determine the biokinetic models of Tc-99m radiopharmaceuticals from the quantification of image radiation activity in murine models. These radioisotopic images were obtained by a CCD (charge-coupled device) camera coupled to an ultrathin phosphorous screen in a preclinical multimodal imaging system (Xtreme, Bruker). The AIPA consisted of different image processing methods for background, scattering and attenuation correction in the activity quantification. A set of parametric identification algorithms was used to obtain the biokinetic models that characterize the interaction between different tissues and the radiopharmaceuticals considered in the study. The set of biokinetic models corresponded to the Tc-99m biodistribution observed in different ex vivo studies. This fact confirmed the contribution of the semi-automatic image processing technique developed in this study.
Understanding unmet need: history, theory, and measurement.
Bradley, Sarah E K; Casterline, John B
2014-06-01
During the past two decades, estimates of unmet need have become an influential measure for assessing population policies and programs. This article recounts the evolution of the concept of unmet need, describes how demographic survey data have been used to generate estimates of its prevalence, and tests the sensitivity of these estimates to various assumptions in the unmet need algorithm. The algorithm uses a complex set of assumptions to identify women: who are sexually active, who are infecund, whose most recent pregnancy was unwanted, who wish to postpone their next birth, and who are postpartum amenorrheic. The sensitivity tests suggest that defensible alternative criteria for identifying four out of five of these subgroups of women would increase the estimated prevalence of unmet need. The exception is identification of married women who are sexually active; more accurate measurement of this subgroup would reduce the estimated prevalence of unmet need in most settings.
Tracking Activities in Complex Settings Using Smart Environment Technologies.
Singla, Geetika; Cook, Diane J; Schmitter-Edgecombe, Maureen
2009-01-01
The pervasive sensing technologies found in smart homes offer unprecedented opportunities for providing health monitoring and assistance to individuals experiencing difficulties living independently at home. A primary challenge that needs to be tackled to meet this need is the ability to recognize and track functional activities that people perform in their own homes and everyday settings. In this paper we look at approaches to perform real-time recognition of Activities of Daily Living. We enhance other related research efforts to develop approaches that are effective when activities are interrupted and interleaved. To evaluate the accuracy of our recognition algorithms we assess them using real data collected from participants performing activities in our on-campus smart apartment testbed.
Competitive learning with pairwise constraints.
Covões, Thiago F; Hruschka, Eduardo R; Ghosh, Joydeep
2013-01-01
Constrained clustering has been an active research topic since the last decade. Most studies focus on batch-mode algorithms. This brief introduces two algorithms for on-line constrained learning, named on-line linear constrained vector quantization error (O-LCVQE) and constrained rival penalized competitive learning (C-RPCL). The former is a variant of the LCVQE algorithm for on-line settings, whereas the latter is an adaptation of the (on-line) RPCL algorithm to deal with constrained clustering. The accuracy results, in terms of the normalized mutual information (NMI), from experiments with nine datasets show that the partitions induced by O-LCVQE are competitive with those found by the (batch-mode) LCVQE. Compared with this formidable baseline algorithm, it is surprising that C-RPCL can provide better partitions (in terms of the NMI) for most of the datasets. Also, experiments on a large dataset show that on-line algorithms for constrained clustering can significantly reduce the computational time.
Improved collaborative filtering recommendation algorithm of similarity measure
NASA Astrophysics Data System (ADS)
Zhang, Baofu; Yuan, Baoping
2017-05-01
The collaborative filtering recommendation algorithm is one of the most widely used recommendation algorithms in personalized recommender systems. The key is to find the nearest-neighbor set of the active user by using a similarity measure. However, traditional similarity measures mainly focus on the similarity over the users' common rating items, but ignore the relationship between the common rating items and all items the users rate. Moreover, because the rating matrix is very sparse, the traditional collaborative filtering recommendation algorithm is not highly efficient. In order to obtain better accuracy, based on consideration of the common preferences between users and of the differences in rating scale and in the scores of common items, this paper presents an improved similarity measure method, and based on this method, a collaborative filtering recommendation algorithm based on similarity improvement is proposed. Experimental results show that the algorithm can effectively improve the quality of recommendation, thus alleviating the impact of data sparseness.
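In the spirit of the improvement described above, a similarity that weights the correlation over co-rated items by how large that set is relative to all items either user has rated could look as follows; this is a sketch, not the paper's exact formula.

```python
# Illustrative similarity in the spirit described above: Pearson correlation over
# co-rated items, down-weighted when the co-rated set is small relative to all
# items either user has rated. A sketch, not the paper's proposed measure.
import numpy as np

def weighted_similarity(ratings_u, ratings_v):
    """ratings_u, ratings_v: dicts item_id -> rating."""
    common = set(ratings_u) & set(ratings_v)
    union = set(ratings_u) | set(ratings_v)
    if len(common) < 2:
        return 0.0
    ru = np.array([ratings_u[i] for i in common], dtype=float)
    rv = np.array([ratings_v[i] for i in common], dtype=float)
    if ru.std() == 0 or rv.std() == 0:
        return 0.0
    pearson = np.corrcoef(ru, rv)[0, 1]
    return (len(common) / len(union)) * pearson     # Jaccard-style weighting
```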
A ranking method for the concurrent learning of compounds with various activity profiles.
Dörr, Alexander; Rosenbaum, Lars; Zell, Andreas
2015-01-01
In this study, we present an SVM-based ranking algorithm for the concurrent learning of compounds with different activity profiles and their varying prioritization. To this end, a specific labeling of each compound was elaborated in order to infer virtual screening models against multiple targets. We compared the method with several state-of-the-art SVM classification techniques that are capable of inferring multi-target screening models on three chemical data sets (cytochrome P450s, dehydrogenases, and a trypsin-like protease data set) containing three different biological targets each. The experiments show that ranking-based algorithms give increased performance for single- and multi-target virtual screening. Moreover, compounds that do not completely fulfill the desired activity profile are still ranked higher than decoys or compounds with an entirely undesired profile, compared to other multi-target SVM methods. SVM-based ranking methods constitute a valuable approach for virtual screening in multi-target drug design. The utilization of such methods is most helpful when dealing with compounds with various activity profiles and when finding many ligands with an already perfectly matching activity profile is not to be expected.
Activity recognition using a single accelerometer placed at the wrist or ankle.
Mannini, Andrea; Intille, Stephen S; Rosenberger, Mary; Sabatini, Angelo M; Haskell, William
2013-11-01
Large physical activity surveillance projects such as the UK Biobank and NHANES are using wrist-worn accelerometer-based activity monitors that collect raw data. The goal is to increase wear time by asking subjects to wear the monitors on the wrist instead of the hip, and then to use information in the raw signal to improve activity type and intensity estimation. The purpose of this work was to obtain an algorithm to process wrist and ankle raw data and to classify behavior into four broad activity classes: ambulation, cycling, sedentary, and other activities. Participants (N = 33) wearing accelerometers on the wrist and ankle performed 26 daily activities. The accelerometer data were collected, cleaned, and preprocessed to extract features that characterize 2-, 4-, and 12.8-s data windows. Feature vectors encoding information about the frequency and intensity of motion, extracted from analysis of the raw signal, were used with a support vector machine classifier to identify a subject's activity. Results were compared with categories classified by a human observer. Algorithms were validated using a leave-one-subject-out strategy. The computational complexity of each processing step was also evaluated. With 12.8-s windows, the proposed strategy showed high classification accuracy for ankle data (95.0%) that decreased to 84.7% for wrist data. Shorter (4-s) windows only minimally decreased performance of the algorithm on the wrist, to 84.2%. A classification algorithm using 13 features shows good classification into the four classes given the complexity of the activities in the original data set. The algorithm is computationally efficient and could be implemented in real time on mobile devices with only 4-s latency.
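A window-based feature extraction and SVM classification pipeline of this general kind can be sketched as follows; the features shown are generic examples, not the paper's 13-feature set, and the sampling rate and window length are assumptions.

```python
# Sketch of window-based feature extraction and SVM classification for raw
# tri-axial accelerometer data (features are generic examples, not the paper's set).
import numpy as np
from sklearn.svm import SVC

def window_features(acc, fs=90, win_s=12.8):
    """acc: (n_samples, 3) raw accelerations. Returns one feature row per window."""
    n = int(win_s * fs)
    feats = []
    for start in range(0, len(acc) - n + 1, n):
        w = acc[start:start + n]
        mag = np.linalg.norm(w, axis=1)                    # acceleration magnitude
        spec = np.abs(np.fft.rfft(mag - mag.mean()))
        freqs = np.fft.rfftfreq(n, d=1.0 / fs)
        feats.append([mag.mean(), mag.std(),
                      freqs[np.argmax(spec)],              # dominant frequency
                      spec.max() / (spec.sum() + 1e-12)])  # spectral peak fraction
    return np.array(feats)

# Example usage (training data assumed available):
# clf = SVC(kernel="rbf").fit(window_features(train_acc), train_labels)
```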
Multiscale 3-D shape representation and segmentation using spherical wavelets.
Nain, Delphine; Haker, Steven; Bobick, Aaron; Tannenbaum, Allen
2007-04-01
This paper presents a novel multiscale shape representation and segmentation algorithm based on the spherical wavelet transform. This work is motivated by the need to compactly and accurately encode variations at multiple scales in the shape representation in order to drive the segmentation and shape analysis of deep brain structures, such as the caudate nucleus or the hippocampus. Our proposed shape representation can be optimized to compactly encode shape variations in a population at the needed scale and spatial locations, enabling the construction of more descriptive, nonglobal, nonuniform shape probability priors to be included in the segmentation and shape analysis framework. In particular, this representation addresses the shortcomings of techniques that learn a global shape prior at a single scale of analysis and cannot represent fine, local variations in a population of shapes in the presence of a limited dataset. Specifically, our technique defines a multiscale parametric model of surfaces belonging to the same population using a compact set of spherical wavelets targeted to that population. We further refine the shape representation by separating into groups wavelet coefficients that describe independent global and/or local biological variations in the population, using spectral graph partitioning. We then learn a prior probability distribution induced over each group to explicitly encode these variations at different scales and spatial locations. Based on this representation, we derive a parametric active surface evolution using the multiscale prior coefficients as parameters for our optimization procedure to naturally include the prior for segmentation. Additionally, the optimization method can be applied in a coarse-to-fine manner. We apply our algorithm to two different brain structures, the caudate nucleus and the hippocampus, of interest in the study of schizophrenia. We show: 1) a reconstruction task of a test set to validate the expressiveness of our multiscale prior and 2) a segmentation task. In the reconstruction task, our results show that for a given training set size, our algorithm significantly improves the approximation of shapes in a testing set over the Point Distribution Model, which tends to oversmooth data. In the segmentation task, our validation shows our algorithm is computationally efficient and outperforms the Active Shape Model algorithm, by capturing finer shape details.
Round Robin evaluation of soil moisture retrieval models for the MetOp-A ASCAT Instrument
NASA Astrophysics Data System (ADS)
Gruber, Alexander; Paloscia, Simonetta; Santi, Emanuele; Notarnicola, Claudia; Pasolli, Luca; Smolander, Tuomo; Pulliainen, Jouni; Mittelbach, Heidi; Dorigo, Wouter; Wagner, Wolfgang
2014-05-01
Global soil moisture observations are crucial to understand hydrologic processes, earth-atmosphere interactions and climate variability. ESA's Climate Change Initiative (CCI) project aims to create a globally consistent long-term soil moisture data set based on the merging of the best available active and passive satellite-based microwave sensors and retrieval algorithms. Within the CCI, a Round Robin evaluation of existing retrieval algorithms for both active and passive instruments was carried out. In this study we present the comparison of five different retrieval algorithms covering three different modelling principles applied to active MetOp-A ASCAT L1 backscatter data. These models include statistical models (Bayesian Regression and Support Vector Regression, provided by the Institute for Applied Remote Sensing, Eurac Research, Italy, and an Artificial Neural Network, provided by the Institute of Applied Physics, CNR-IFAC, Italy), a semi-empirical model (provided by the Finnish Meteorological Institute), and a change detection model (provided by the Vienna University of Technology). The algorithms were applied to L1 backscatter data within the period 2007-2011, resampled to a 12.5 km grid. The evaluation was performed over 75 globally distributed, quality-controlled in situ stations drawn from the International Soil Moisture Network (ISMN), using surface soil moisture data from the Global Land Data Assimilation System (GLDAS) Noah land surface model as a second independent reference. The temporal correlation between the data sets was analyzed and the random errors of the different algorithms were estimated using the triple collocation method. Absolute soil moisture values as well as soil moisture anomalies were considered, including both long-term anomalies from the mean seasonal cycle and short-term anomalies from a five-week moving average window. Results show a very high agreement between all five algorithms for most stations. A slight vegetation dependency of the errors and a spatial decorrelation of the performance patterns of the different algorithms were found. We conclude that future research should focus on understanding, combining and exploiting the advantages of all available modelling approaches rather than trying to optimize one approach to fit every possible condition.
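For reference, the covariance-based triple collocation estimator used for the random-error analysis can be sketched as follows, assuming the three collocated series have mutually independent errors and are linearly related to the underlying truth.

```python
# Covariance-based triple collocation sketch (standard estimator; assumes the
# three products have independent errors and are linearly related to the truth).
import numpy as np

def triple_collocation_errors(x, y, z):
    """x, y, z: collocated 1-D series (e.g. satellite, in situ, model).
    Returns the error variance of each product in its own units."""
    C = np.cov(np.vstack([x, y, z]))
    err_var_x = C[0, 0] - C[0, 1] * C[0, 2] / C[1, 2]
    err_var_y = C[1, 1] - C[0, 1] * C[1, 2] / C[0, 2]
    err_var_z = C[2, 2] - C[0, 2] * C[1, 2] / C[0, 1]
    return err_var_x, err_var_y, err_var_z
```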
The congestion control algorithm based on queue management of each node in mobile ad hoc networks
NASA Astrophysics Data System (ADS)
Wei, Yifei; Chang, Lin; Wang, Yali; Wang, Gaoping
2016-12-01
This paper proposes an active queue management mechanism that considers a node's own capability and its importance in the network when setting its queue threshold. As the network load increases, local congestion in a mobile ad hoc network may degrade network performance and increase the energy consumption of hot nodes, possibly to the point of failure. If low-energy nodes become congested by forwarding data packets, heavy packet loss will follow when they later act as source nodes. By adjusting the upper and lower limits that control the probability with which a node's buffer queue falls into the different congestion regions, nodes can adjust their data-forwarding responsibility according to their own situation. The proposed algorithm slows down the sending rate hop by hop along the transmission direction, from the congested node back to the source node, so as to prevent further congestion at the source. The simulation results show that the algorithm better exploits the forwarding ability of strong nodes, protects weak nodes, and can effectively alleviate network congestion.
Photometric Supernova Classification with Machine Learning
NASA Astrophysics Data System (ADS)
Lochner, Michelle; McEwen, Jason D.; Peiris, Hiranya V.; Lahav, Ofer; Winter, Max K.
2016-08-01
Automated photometric supernova classification has become an active area of research in recent years in light of current and upcoming imaging surveys such as the Dark Energy Survey (DES) and the Large Synoptic Survey Telescope, given that spectroscopic confirmation of type for all supernovae discovered will be impossible. Here, we develop a multi-faceted classification pipeline, combining existing and new approaches. Our pipeline consists of two stages: extracting descriptive features from the light curves and classification using a machine learning algorithm. Our feature extraction methods vary from model-dependent techniques, namely SALT2 fits, to more independent techniques that fit parametric models to curves, to a completely model-independent wavelet approach. We cover a range of representative machine learning algorithms, including naive Bayes, k-nearest neighbors, support vector machines, artificial neural networks, and boosted decision trees (BDTs). We test the pipeline on simulated multi-band DES light curves from the Supernova Photometric Classification Challenge. Using the commonly used area under the curve (AUC) of the Receiver Operating Characteristic as a metric, we find that the SALT2 fits and the wavelet approach, with the BDT algorithm, each achieve an AUC of 0.98, where 1 represents perfect classification. We find that a representative training set is essential for good classification, whatever the feature set or algorithm, with implications for spectroscopic follow-up. Importantly, we find that by using either the SALT2 or the wavelet feature sets with a BDT algorithm, accurate classification is possible purely from light curve data, without the need for any redshift information.
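The classification stage can be sketched with gradient-boosted trees and the ROC AUC as follows; feature extraction (SALT2 fits, parametric or wavelet features) is assumed to have been done already, and the library choice and parameters are illustrative assumptions.

```python
# Sketch of the classification stage: boosted decision trees on pre-extracted
# light-curve features, scored with the area under the ROC curve.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

def classify_supernovae(features, is_type_ia):
    """features: (n_objects, n_features) array; is_type_ia: binary labels (0/1)."""
    X_tr, X_te, y_tr, y_te = train_test_split(features, is_type_ia,
                                              test_size=0.3, random_state=0)
    bdt = GradientBoostingClassifier(n_estimators=200).fit(X_tr, y_tr)
    scores = bdt.predict_proba(X_te)[:, 1]
    return roc_auc_score(y_te, scores)      # 1.0 would be perfect classification
```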
A wavelet based method for automatic detection of slow eye movements: a pilot study.
Magosso, Elisa; Provini, Federica; Montagna, Pasquale; Ursino, Mauro
2006-11-01
Electro-oculographic (EOG) activity during the wake-sleep transition is characterized by the appearance of slow eye movements (SEM). The present work describes an algorithm for the automatic localisation of SEM events from EOG recordings. The algorithm is based on a wavelet multiresolution analysis of the difference between right and left EOG tracings, and includes three main steps: (i) wavelet decomposition down to 10 detail levels (i.e., 10 scales), using a Daubechies order-4 wavelet; (ii) computation of energy in 0.5s time steps at any level of decomposition; (iii) construction of a non-linear discriminant function expressing the relative energy of high-scale details to both high- and low-scale details. The main assumption is that the value of the discriminant function increases above a given threshold during SEM episodes due to energy redistribution toward higher scales. Ten EOG recordings from ten male patients with obstructive sleep apnea syndrome were used. All tracings included a period from pre-sleep wakefulness to stage 2 sleep. Two experts inspected the tracings separately to score SEMs. A reference set of SEM (gold standard) was obtained by joint examination by both experts. Parameters of the discriminant function were assigned on three tracings (design set) to minimize the disagreement between the system classification and classification by the two experts; the algorithm was then tested on the remaining seven tracings (test set). Results show that the agreement between the algorithm and the gold standard was 80.44+/-4.09%, the sensitivity of the algorithm was 67.2+/-7.37% and the selectivity 83.93+/-8.65%. However, most errors were not caused by an inability of the system to detect intervals with SEM activity against NON-SEM intervals, but were due to a different localisation of the beginning and end of some SEM episodes. The proposed method may be a valuable tool for computerized EOG analysis.
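To make the energy-ratio idea concrete, here is a minimal Python sketch using PyWavelets. It uses a simple linear energy ratio rather than the paper's exact non-linear discriminant, and the sampling rate, level split and step length are illustrative assumptions.

```python
import numpy as np
import pywt

def sem_discriminant(eog_diff, fs=128, step_s=0.5, levels=10, slow_levels=6):
    """Fraction of wavelet-detail energy carried by the slow (high-scale) levels,
    evaluated in 0.5 s steps; values near 1 suggest SEM-like slow activity."""
    n = len(eog_diff)
    coeffs = pywt.wavedec(eog_diff, 'db4', level=levels)   # [cA10, cD10, ..., cD1]
    # reconstruct each detail level back to signal length so energy is time-resolved
    detail_signals = [
        pywt.upcoef('d', coeffs[i], 'db4', level=levels - i + 1, take=n)
        for i in range(1, len(coeffs))                      # index 0 = slowest detail
    ]
    step = int(step_s * fs)
    scores = np.zeros(n // step)
    for k in range(len(scores)):
        sl = slice(k * step, (k + 1) * step)
        e = np.array([np.sum(d[sl] ** 2) for d in detail_signals])
        scores[k] = e[:slow_levels].sum() / (e.sum() + 1e-12)
    return scores   # threshold these scores to flag candidate SEM episodes
```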
Robust Group Sparse Beamforming for Multicast Green Cloud-RAN With Imperfect CSI
NASA Astrophysics Data System (ADS)
Shi, Yuanming; Zhang, Jun; Letaief, Khaled B.
2015-09-01
In this paper, we investigate the network power minimization problem for the multicast cloud radio access network (Cloud-RAN) with imperfect channel state information (CSI). The key observation is that network power minimization can be achieved by adaptively selecting active remote radio heads (RRHs) via controlling the group-sparsity structure of the beamforming vector. However, this yields a non-convex combinatorial optimization problem, for which we propose a three-stage robust group sparse beamforming algorithm. In the first stage, a quadratic variational formulation of the weighted mixed l1/l2-norm is proposed to induce the group-sparsity structure in the aggregated beamforming vector, which indicates those RRHs that can be switched off. A perturbed alternating optimization algorithm is then proposed to solve the resultant non-convex group-sparsity inducing optimization problem by exploiting its convex substructures. In the second stage, we propose a PhaseLift-based algorithm to solve the feasibility problem with a given active RRH set, which helps determine the active RRHs. Finally, the semidefinite relaxation (SDR) technique is adopted to determine the robust multicast beamformers. Simulation results demonstrate the convergence of the perturbed alternating optimization algorithm, as well as the effectiveness of the proposed algorithm to minimize the network power consumption for multicast Cloud-RAN.
NASA Astrophysics Data System (ADS)
Liu, Tao; Lyu, Mindong; Wang, Zixi; Yan, Shaoze
2018-02-01
Identifying orbit responses makes active protection easier to realize for active magnetic bearings (AMB) in case of touchdowns. This paper presents a method for identifying the orbit responses based on signal processing of rotor displacements during touchdowns. The recognition method consists of two major steps. First, combined rub and bouncing is distinguished from the other orbit responses by the mathematical expectation of the rotor's axis displacements, because during combined rub and bouncing the rotor is not always close to the touchdown bearings (TDB). Second, pendulum vibration and full rub are recognized from the Fourier spectrum of the horizontal displacement, as the frequency characteristics of the two responses differ. The principle of the whole identification algorithm is illustrated with two sets of signals generated by a dynamic model of the specific rotor-TDB system, and its generality is validated with four additional sets of signals. Robustness to noise is also tested by adding white noise of different strengths, with promising results. Because the mathematical expectation and the discrete Fourier transform are the algorithm's main calculations, its computational cost is low; it is fast and easily implemented and embedded in the AMB controller, which is of engineering value for the protection of AMBs during touchdowns.
NASA Astrophysics Data System (ADS)
Hashimoto, Hiroyuki; Takaguchi, Yusuke; Nakamura, Shizuka
Numerical instability and growing computation time as continuous optimization problems increase in size remain the major obstacles to applying the technique to practical industrial systems. This paper proposes an enhanced quadratic programming algorithm based on the interior point method, aimed mainly at improving calculation stability. The proposed method includes a dynamic estimation mechanism for active constraints on variables: it fixes variables that approach their upper or lower limits and later releases them as needed during the optimization process. This can be viewed as an algorithm-level integration of the active-set solution strategy into the interior point framework. We report numerical results on the commonly used CUTEr benchmark problems to show the effectiveness of the proposed method. Furthermore, test results on a large-sized ELD problem (Economic Load Dispatching in electric power supply scheduling) are described as a practical industrial application.
Characterizing volcanic activity: Application of freely-available webcams
NASA Astrophysics Data System (ADS)
Dehn, J.; Harrild, M.; Webley, P. W.
2017-12-01
In recent years, freely-available web-based cameras, or webcams, have become more readily available, allowing an increased level of monitoring at active volcanoes across the globe. While these cameras have been extensively used as qualitative tools, they provide a unique dataset to perform quantitative analyses of the changing behavior of a particular volcano within the camera's field of view. We focus on the multitude of these freely-available webcams and present a new algorithm to detect changes in volcanic activity using nighttime webcam data. Our approach uses a quick, efficient, and fully automated algorithm to identify changes in webcam data in near real-time, including techniques such as edge detection, Gaussian mixture models, and temporal/spatial statistical tests, which are applied to each target image. Often the image metadata (exposure, gain settings, aperture, focal length, etc.) are unknown, so we developed our algorithm to identify the quantity of volcanically incandescent pixels as well as the number of specific algorithm tests needed to detect thermal activity, instead of directly correlating brightness in the webcam to eruption temperatures. We compared our algorithm results to a manual analysis of webcam data for several volcanoes and determined a false detection rate of less than 3% for the automated approach. In our presentation, we describe the different tests integrated into our algorithm, lessons learned, and how we applied our method to several volcanoes across the North Pacific during its development and implementation. We will finish with a discussion on the global applicability of our approach and how to build a 24/7, 365 day a year tool that can be used as an additional data source for real-time analysis of volcanic activity.
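A minimal sketch of the kind of night-time incandescence test described here might look as follows. The channel thresholds, red/blue ratio and rolling-baseline statistics are illustrative assumptions, not the authors' calibrated tests.

```python
import numpy as np

def incandescent_pixel_count(frame_rgb, red_thresh=200, ratio_thresh=1.4):
    """Count pixels that look volcanically incandescent in a night-time webcam frame:
    bright in the red channel and strongly red relative to blue (hypothetical thresholds)."""
    frame = frame_rgb.astype(float)
    r, b = frame[..., 0], frame[..., 2]
    hot = (r > red_thresh) & (r > ratio_thresh * (b + 1.0))
    return int(hot.sum())

def flag_activity(counts, window=30, k=3.0):
    """Flag frames whose hot-pixel count exceeds a rolling mean + k*sigma baseline."""
    counts = np.asarray(counts, dtype=float)
    flags = np.zeros(len(counts), dtype=bool)
    for i in range(window, len(counts)):
        base = counts[i - window:i]
        flags[i] = counts[i] > base.mean() + k * base.std()
    return flags
```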
Fine-Tuning ADAS Algorithm Parameters for Optimizing Traffic ...
With the development of the Connected Vehicle technology that facilitates wireless communication among vehicles and road-side infrastructure, the Advanced Driver Assistance Systems (ADAS) can be adopted as an effective tool for accelerating traffic safety and mobility optimization at various highway facilities. To this end, the traffic management centers identify the optimal ADAS algorithm parameter set that enables the maximum improvement of the traffic safety and mobility performance, and broadcast the optimal parameter set wirelessly to individual ADAS-equipped vehicles. After adopting the optimal parameter set, the ADAS-equipped drivers become active agents in the traffic stream that work collectively and consistently to prevent traffic conflicts, lower the intensity of traffic disturbances, and suppress the development of traffic oscillations into heavy traffic jams. Successful implementation of this objective requires the analysis capability of capturing the impact of the ADAS on driving behaviors, and measuring traffic safety and mobility performance under the influence of the ADAS. To address this challenge, this research proposes a synthetic methodology that incorporates the ADAS-affected driving behavior modeling and state-of-the-art microscopic traffic flow modeling into a virtually simulated environment. Building on such an environment, the optimal ADAS algorithm parameter set is identified through an optimization programming framework to enable th
Patel, Sanjay R.; Weng, Jia; Rueschman, Michael; Dudley, Katherine A.; Loredo, Jose S.; Mossavar-Rahmani, Yasmin; Ramirez, Maricelle; Ramos, Alberto R.; Reid, Kathryn; Seiger, Ashley N.; Sotres-Alvarez, Daniela; Zee, Phyllis C.; Wang, Rui
2015-01-01
Study Objectives: While actigraphy is considered objective, the process of setting rest intervals to calculate sleep variables is subjective. We sought to evaluate the reproducibility of actigraphy-derived measures of sleep using a standardized algorithm for setting rest intervals. Design: Observational study. Setting: Community-based. Participants: A random sample of 50 adults aged 18–64 years free of severe sleep apnea participating in the Sueño sleep ancillary study to the Hispanic Community Health Study/Study of Latinos. Interventions: N/A. Measurements and Results: Participants underwent 7 days of continuous wrist actigraphy and completed daily sleep diaries. Studies were scored twice by each of two scorers. Rest intervals were set using a standardized hierarchical approach based on event marker, diary, light, and activity data. Sleep/wake status was then determined for each 30-sec epoch using a validated algorithm, and this was used to generate 11 variables: mean nightly sleep duration, nap duration, 24-h sleep duration, sleep latency, sleep maintenance efficiency, sleep fragmentation index, sleep onset time, sleep offset time, sleep midpoint time, standard deviation of sleep duration, and standard deviation of sleep midpoint. Intra-scorer intraclass correlation coefficients (ICCs) were high, ranging from 0.911 to 0.995 across all 11 variables. Similarly, inter-scorer ICCs were high, also ranging from 0.911 to 0.995, and mean inter-scorer differences were small. Bland-Altman plots did not reveal any systematic disagreement in scoring. Conclusions: With use of a standardized algorithm to set rest intervals, scoring of actigraphy for the purpose of generating a wide array of sleep variables is highly reproducible. Citation: Patel SR, Weng J, Rueschman M, Dudley KA, Loredo JS, Mossavar-Rahmani Y, Ramirez M, Ramos AR, Reid K, Seiger AN, Sotres-Alvarez D, Zee PC, Wang R. Reproducibility of a standardized actigraphy scoring algorithm for sleep in a US Hispanic/Latino population. SLEEP 2015;38(9):1497–1503. PMID:25845697
Fully convolutional network with cluster for semantic segmentation
NASA Astrophysics Data System (ADS)
Ma, Xiao; Chen, Zhongbi; Zhang, Jianlin
2018-04-01
Image semantic segmentation is currently an active research topic in computer vision and artificial intelligence, and the extensive research on deep neural networks for image recognition has greatly advanced its development. This paper puts forward a method that combines a fully convolutional network with the k-means clustering algorithm. The clustering algorithm, which uses the image's low-level features and initializes the cluster centers from a super-pixel segmentation, corrects the points with low reliability, which are likely to be misclassified, using the points with high reliability within each cluster region. This method refines the segmentation of the target contour and improves the accuracy of the image segmentation.
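A rough sketch of the clustering ingredient (k-means over low-level features with centres initialised from super-pixels) is shown below. It uses scikit-learn and scikit-image as generic stand-ins and omits the FCN and the reliability-based correction step described in the abstract.

```python
import numpy as np
from sklearn.cluster import KMeans
from skimage.segmentation import slic

def cluster_low_level_features(image, n_clusters=8):
    """K-means on per-pixel colour features, with cluster centres initialised from
    SLIC super-pixel means (an illustrative stand-in for the paper's initialisation)."""
    h, w, c = image.shape
    feats = image.reshape(-1, c).astype(float)
    segments = slic(image, n_segments=200, compactness=10)          # super-pixels
    sp_means = np.array([feats[segments.ravel() == s].mean(axis=0)
                         for s in np.unique(segments)])
    # pick n_clusters spread-out super-pixel means as initial centres
    init = sp_means[np.linspace(0, len(sp_means) - 1, n_clusters).astype(int)]
    km = KMeans(n_clusters=n_clusters, init=init, n_init=1).fit(feats)
    return km.labels_.reshape(h, w)
```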
Tests of a Semi-Analytical Case 1 and Gelbstoff Case 2 SeaWiFS Algorithm with a Global Data Set
NASA Technical Reports Server (NTRS)
Carder, Kendall L.; Hawes, Steve K.; Lee, Zhongping
1997-01-01
A semi-analytical algorithm was tested with a total of 733 points of either unpackaged or packaged-pigment data, with corresponding algorithm parameters for each data type. The 'unpackaged' type consisted of data sets that were generally consistent with the Case 1 CZCS algorithm and other well calibrated data sets. The 'packaged' type consisted of data sets apparently containing somewhat more packaged pigments, requiring modification of the absorption parameters of the model consistent with the CalCOFI study area. This resulted in two equally divided data sets. A more thorough scrutiny of these and other data sets using a semianalytical model requires improved knowledge of the phytoplankton and gelbstoff of the specific environment studied. Since the semi-analytical algorithm is dependent upon 4 spectral channels including the 412 nm channel, while most other algorithms are not, a means of testing data sets for consistency was sought. A numerical filter was developed to classify data sets into the above classes. The filter uses reflectance ratios, which can be determined from space. The sensitivity of such numerical filters to measurement errors resulting from atmospheric correction and sensor noise requires further study. The semi-analytical algorithm performed superbly on each of the data sets after classification, resulting in RMS1 errors of 0.107 and 0.121, respectively, for the unpackaged and packaged data-set classes, with little bias and slopes near 1.0. In combination, the RMS1 performance was 0.114. While these numbers appear rather sterling, one must bear in mind what mis-classification does to the results. Using an average or compromise parameterization on the modified global data set yielded an RMS1 error of 0.171, while using the unpackaged parameterization on the global evaluation data set yielded an RMS1 error of 0.284. So, without classification, the algorithm performs better globally using the average parameters than it does using the unpackaged parameters. Finally, the effects of even more extreme pigment packaging must be examined in order to improve algorithm performance at high latitudes. Note, however, that the North Sea and Mississippi River plume studies contributed data to the packaged and unpackaged classes, respectively, with little effect on algorithm performance. This suggests that gelbstoff-rich Case 2 waters do not seriously degrade performance of the semi-analytical algorithm.
NASA Astrophysics Data System (ADS)
Houchin, J. S.
2014-09-01
A common problem for the off-line validation of the calibration algorithms and algorithm coefficients is being able to run science data through the exact same software used for on-line calibration of that data. The Joint Polar Satellite System (JPSS) program solved part of this problem by making the Algorithm Development Library (ADL) available, which allows the operational algorithm code to be compiled and run on a desktop Linux workstation using flat file input and output. However, this solved only part of the problem, as the toolkit and methods to initiate the processing of data through the algorithms were geared specifically toward the algorithm developer, not the calibration analyst. In algorithm development mode, a limited number of sets of test data are staged for the algorithm once, and then run through the algorithm over and over as the software is developed and debugged. In calibration analyst mode, we are continually running new data sets through the algorithm, which requires significant effort to stage each of those data sets for the algorithm without additional tools. AeroADL solves this second problem by providing a set of scripts that wrap the ADL tools, providing efficient means to stage and process an input data set, to override static calibration coefficient look-up-tables (LUT) with experimental versions of those tables, and to manage a library containing multiple versions of each of the static LUT files in such a way that the correct set of LUTs required for each algorithm is automatically provided to the algorithm without analyst effort. Using AeroADL, The Aerospace Corporation's analyst team has demonstrated the ability to quickly and efficiently perform analysis tasks for both the VIIRS and OMPS sensors with minimal training on the software tools.
Reinforcement Learning for Weakly-Coupled MDPs and an Application to Planetary Rover Control
NASA Technical Reports Server (NTRS)
Bernstein, Daniel S.; Zilberstein, Shlomo
2003-01-01
Weakly-coupled Markov decision processes can be decomposed into subprocesses that interact only through a small set of bottleneck states. We study a hierarchical reinforcement learning algorithm designed to take advantage of this particular type of decomposability. To test our algorithm, we use a decision-making problem faced by autonomous planetary rovers. In this problem, a Mars rover must decide which activities to perform and when to traverse between science sites in order to make the best use of its limited resources. In our experiments, the hierarchical algorithm performs better than Q-learning in the early stages of learning, but unlike Q-learning it converges to a suboptimal policy. This suggests that it may be advantageous to use the hierarchical algorithm when training time is limited.
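For reference, the flat Q-learning baseline mentioned above can be written in a few lines. The tabular sketch below assumes a hypothetical environment object exposing reset() and step() methods, and the hyperparameters are generic rather than those used in the rover experiments.

```python
import numpy as np

def q_learning(env, n_states, n_actions, episodes=500, alpha=0.1, gamma=0.95, eps=0.1):
    """Flat tabular Q-learning (the comparison algorithm in the abstract).
    `env` is assumed to provide reset() -> state and step(a) -> (state, reward, done)."""
    Q = np.zeros((n_states, n_actions))
    for _ in range(episodes):
        s = env.reset()
        done = False
        while not done:
            # epsilon-greedy action selection
            a = np.random.randint(n_actions) if np.random.rand() < eps else int(Q[s].argmax())
            s2, r, done = env.step(a)
            target = r + gamma * (0.0 if done else Q[s2].max())
            Q[s, a] += alpha * (target - Q[s, a])
            s = s2
    return Q
```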
Efficient hyperspectral image segmentation using geometric active contour formulation
NASA Astrophysics Data System (ADS)
Albalooshi, Fatema A.; Sidike, Paheding; Asari, Vijayan K.
2014-10-01
In this paper, we present a new formulation of geometric active contours that embeds the local hyperspectral image information for an accurate object region and boundary extraction. We exploit a self-organizing map (SOM) unsupervised neural network to train our model. The segmentation process is achieved by the construction of a level set cost functional, in which the dynamic variable is the best matching unit (BMU) coming from the SOM map. In addition, we use Gaussian filtering to limit the deviation of the level set function from a signed distance function, which helps to eliminate the computationally expensive re-initialization step. By using the properties of the collective computational ability and energy convergence capability of the active contour model (ACM) energy functional, our method optimizes the geometric ACM energy functional with lower computational time and a smoother level set function. The proposed algorithm starts with feature extraction from raw hyperspectral images. In this step, the principal component analysis (PCA) transformation is employed, which helps in reducing dimensionality and selecting the best sets of significant spectral bands. Then the modified geometric level-set-based ACM is applied to the optimal number of spectral bands determined by the PCA. By introducing local significant spectral band information, our proposed method is able to keep the level set function close to a signed distance function, and therefore considerably reduces the need for the expensive re-initialization procedure. To verify the effectiveness of the proposed technique, we use real-life hyperspectral images and test our algorithm in varying textural regions. This framework can be easily adapted to different applications for object segmentation in aerial hyperspectral imagery.
On the precision of automated activation time estimation
NASA Technical Reports Server (NTRS)
Kaplan, D. T.; Smith, J. M.; Rosenbaum, D. S.; Cohen, R. J.
1988-01-01
We examined how the assignment of local activation times in epicardial and endocardial electrograms is affected by sampling rate, ambient signal-to-noise ratio, and sin(x)/x waveform interpolation. Algorithms used for the estimation of fiducial point locations included dV/dt_max and a matched filter detection algorithm. Test signals included epicardial and endocardial electrograms overlying both normal and infarcted regions of dog myocardium. Signal-to-noise levels were adjusted by combining known data sets with white noise "colored" to match the spectral characteristics of experimentally recorded noise. For typical signal-to-noise ratios and sampling rates, the template-matching algorithm provided the greatest precision in reproducibly estimating fiducial point location, and sin(x)/x interpolation allowed for an additional significant improvement. With few restrictions, combining these two techniques may allow for use of digitization rates below the Nyquist rate without significant loss of precision.
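The two fiducial-point estimators compared in the study can be sketched as follows. This is a simplified illustration (window handling, slope polarity conventions and template construction are assumptions), not the authors' implementation.

```python
import numpy as np

def activation_time_dvdt(electrogram, fs):
    """dV/dt-based estimator: activation time at the steepest deflection (|dV/dt| maximum)."""
    dv = np.diff(electrogram) * fs
    return np.argmax(np.abs(dv)) / fs            # seconds from start of window

def activation_time_matched(electrogram, template, fs):
    """Matched-filter (template-matching) estimator: activation time at the lag of
    maximum cross-correlation with a representative activation template."""
    xc = np.correlate(electrogram - electrogram.mean(),
                      template - template.mean(), mode='valid')
    return np.argmax(xc) / fs
```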
DNA Microarray Data Analysis: A Novel Biclustering Algorithm Approach
NASA Astrophysics Data System (ADS)
Tchagang, Alain B.; Tewfik, Ahmed H.
2006-12-01
Biclustering algorithms refer to a distinct class of clustering algorithms that perform simultaneous row-column clustering. Biclustering problems arise in DNA microarray data analysis, collaborative filtering, market research, information retrieval, text mining, electoral trends, exchange analysis, and so forth. When dealing with DNA microarray experimental data for example, the goal of biclustering algorithms is to find submatrices, that is, subgroups of genes and subgroups of conditions, where the genes exhibit highly correlated activities for every condition. In this study, we develop novel biclustering algorithms using basic linear algebra and arithmetic tools. The proposed biclustering algorithms can be used to search for all biclusters with constant values, biclusters with constant values on rows, biclusters with constant values on columns, and biclusters with coherent values from a set of data in a timely manner and without solving any optimization problem. We also show how one of the proposed biclustering algorithms can be adapted to identify biclusters with coherent evolution. The algorithms developed in this study discover all valid biclusters of each type, while almost all previous biclustering approaches will miss some.
Beretta, Lorenzo; Santaniello, Alessandro; van Riel, Piet L C M; Coenen, Marieke J H; Scorza, Raffaella
2010-08-06
Epistasis is recognized as a fundamental part of the genetic architecture of individuals. Several computational approaches have been developed to model gene-gene interactions in case-control studies, however, none of them is suitable for time-dependent analysis. Herein we introduce the Survival Dimensionality Reduction (SDR) algorithm, a non-parametric method specifically designed to detect epistasis in lifetime datasets. The algorithm requires neither specification about the underlying survival distribution nor about the underlying interaction model and proved satisfactorily powerful to detect a set of causative genes in synthetic epistatic lifetime datasets with a limited number of samples and high degree of right-censorship (up to 70%). The SDR method was then applied to a series of 386 Dutch patients with active rheumatoid arthritis that were treated with anti-TNF biological agents. Among a set of 39 candidate genes, none of which showed a detectable marginal effect on anti-TNF responses, the SDR algorithm did find that the rs1801274 SNP in the Fc gamma RIIa gene and the rs10954213 SNP in the IRF5 gene non-linearly interact to predict clinical remission after anti-TNF biologicals. Simulation studies and application in a real-world setting support the capability of the SDR algorithm to model epistatic interactions in candidate-genes studies in presence of right-censored data. http://sourceforge.net/projects/sdrproject/.
SortNet: learning to rank by a neural preference function.
Rigutini, Leonardo; Papini, Tiziano; Maggini, Marco; Scarselli, Franco
2011-09-01
Relevance ranking consists in sorting a set of objects with respect to a given criterion. However, in personalized retrieval systems, the relevance criteria usually vary among different users and may not be predefined. In this case, ranking algorithms that adapt their behavior from users' feedback must be devised. Two main approaches are proposed in the literature for learning to rank: the use of a scoring function, learned by examples, that evaluates a feature-based representation of each object yielding an absolute relevance score; and a pairwise approach, where a preference function is learned to determine the object that has to be ranked first in a given pair. In this paper, we present a preference learning method for learning to rank. A neural network, the comparative neural network (CmpNN), is trained from examples to approximate the comparison function for a pair of objects. The CmpNN adopts a particular architecture designed to implement the symmetries naturally present in a preference function. The learned preference function can be embedded as the comparator into a classical sorting algorithm to provide a global ranking of a set of objects. To improve the ranking performances, an active-learning procedure is devised that aims at selecting the most informative patterns in the training set. The proposed algorithm is evaluated on the LETOR dataset showing promising performances in comparison with other state-of-the-art algorithms.
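Once a pairwise preference function has been learned, embedding it in a standard sort is straightforward. The sketch below shows the idea with functools.cmp_to_key, where `prefer` is a hypothetical callable standing in for the trained CmpNN comparator.

```python
from functools import cmp_to_key

def rank_with_preference(objects, prefer):
    """Global ranking from a learned pairwise preference function, as in SortNet:
    prefer(a, b) returns a score > 0 if a should be ranked above b."""
    def cmp(a, b):
        s = prefer(a, b)
        return -1 if s > 0 else (1 if s < 0 else 0)   # a placed first when preferred
    return sorted(objects, key=cmp_to_key(cmp))

# toy usage: a dummy preference that prefers larger feature sums
ranking = rank_with_preference([[0.2, 0.1], [0.9, 0.4], [0.5, 0.5]],
                               prefer=lambda a, b: sum(a) - sum(b))
print(ranking)
```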
A flooding algorithm for multirobot exploration.
Cabrera-Mora, Flavio; Xiao, Jizhong
2012-06-01
In this paper, we present a multirobot exploration algorithm that aims at reducing the exploration time and minimizing the overall traverse distance of the robots by coordinating the movement of the robots performing the exploration. Modeling the environment as a tree, we consider a coordination model that restricts the number of robots allowed to traverse an edge and to enter a vertex during each step. This coordination is achieved in a decentralized manner by the robots using a set of active landmarks that are dropped by them at explored vertices. We mathematically analyze the algorithm on trees, obtaining its main properties and specifying its bounds on the exploration time. We also define three metrics of performance for multirobot algorithms. We simulate and compare the performance of this new algorithm with those of our multirobot depth first search (MR-DFS) approach presented in our recent paper and classic single-robot DFS.
Artificial metaplasticity neural network applied to credit scoring.
Marcano-Cedeño, Alexis; Marin-de-la-Barcena, A; Jimenez-Trillo, J; Piñuela, J A; Andina, D
2011-08-01
The assessment of the risk of default on credit is important for financial institutions. Different Artificial Neural Networks (ANN) have been suggested to tackle the credit scoring problem; however, the obtained error rates are often high. In the search for the best ANN algorithm for credit scoring, this paper contributes with the application of an ANN training algorithm inspired by the neurons' biological property of metaplasticity. This algorithm is especially efficient when few patterns of a class are available, or when information inherent to low probability events is crucial for a successful application, as weight updating is emphasized more for the less frequent activations than for the more frequent ones. Two well-known and readily available data sets, the Australian and German credit data sets, have been used to test the algorithm. The results obtained by the AMMLP have been shown to be superior to state-of-the-art classification algorithms in credit scoring.
A general optimality criteria algorithm for a class of engineering optimization problems
NASA Astrophysics Data System (ADS)
Belegundu, Ashok D.
2015-05-01
An optimality criteria (OC)-based algorithm for optimization of a general class of nonlinear programming (NLP) problems is presented. The algorithm is only applicable to problems where the objective and constraint functions satisfy certain monotonicity properties. For multiply constrained problems which satisfy these assumptions, the algorithm is attractive compared with existing NLP methods as well as prevalent OC methods, as the latter involve computationally expensive active set and step-size control strategies. The fixed point algorithm presented here is applicable not only to structural optimization problems but also to certain problems that occur in resource allocation and inventory models. Convergence aspects are discussed. The fixed point update or resizing formula is given physical significance, which brings out a strength and trim feature. The number of function evaluations remains independent of the number of variables, allowing the efficient solution of problems with a large number of variables.
Learning Instance-Specific Predictive Models
Visweswaran, Shyam; Cooper, Gregory F.
2013-01-01
This paper introduces a Bayesian algorithm for constructing predictive models from data that are optimized to predict a target variable well for a particular instance. This algorithm learns Markov blanket models, carries out Bayesian model averaging over a set of models to predict a target variable of the instance at hand, and employs an instance-specific heuristic to locate a set of suitable models to average over. We call this method the instance-specific Markov blanket (ISMB) algorithm. The ISMB algorithm was evaluated on 21 UCI data sets using five different performance measures and its performance was compared to that of several commonly used predictive algorithms, including naive Bayes, C4.5 decision tree, logistic regression, neural networks, k-Nearest Neighbor, Lazy Bayesian Rules, and AdaBoost. Over all the data sets, the ISMB algorithm performed better on average on all performance measures against all the comparison algorithms. PMID:25045325
CONEDEP: COnvolutional Neural network based Earthquake DEtection and Phase Picking
NASA Astrophysics Data System (ADS)
Zhou, Y.; Huang, Y.; Yue, H.; Zhou, S.; An, S.; Yun, N.
2017-12-01
We developed an automatic local earthquake detection and phase picking algorithm based on a fully convolutional neural network (FCN). The FCN algorithm detects and segments certain features (phases) in 3-component seismograms to realize efficient picking. We use the STA/LTA algorithm and a template matching algorithm to construct the training set from seismograms recorded 1 month before and after the Wenchuan earthquake. Precise P and S phases are identified and labeled to construct the training set. Noise data are produced by combining background noise and artificial synthetic noise to form a noise set of the same size as the signal set. Training is performed on GPUs to achieve efficient convergence. Our algorithm has significantly improved performance in terms of the detection rate and precision in comparison with STA/LTA and template matching algorithms.
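For context, a bare-bones STA/LTA trigger of the kind used here to build the training set can be written as follows. Window lengths and the trigger threshold are illustrative, and real picking pipelines add detriggering and coincidence logic on top of this.

```python
import numpy as np

def sta_lta(trace, fs, sta_s=1.0, lta_s=30.0, on_thresh=3.5):
    """Classic STA/LTA detector: ratio of a short-term to a long-term moving average
    of signal energy; samples above the threshold are candidate picks."""
    energy = trace.astype(float) ** 2
    sta_n, lta_n = int(sta_s * fs), int(lta_s * fs)
    sta = np.convolve(energy, np.ones(sta_n) / sta_n, mode='same')
    lta = np.convolve(energy, np.ones(lta_n) / lta_n, mode='same')
    ratio = sta / (lta + 1e-12)
    return ratio, np.where(ratio > on_thresh)[0]   # ratio trace and triggered sample indices
```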
Spreco, A; Eriksson, O; Dahlström, Ö; Timpka, T
2017-07-01
Methods for the detection of influenza epidemics and prediction of their progress have seldom been comparatively evaluated using prospective designs. This study aimed to perform a prospective comparative trial of algorithms for the detection and prediction of increased local influenza activity. Data on clinical influenza diagnoses recorded by physicians and syndromic data from a telenursing service were used. Five detection and three prediction algorithms previously evaluated in public health settings were calibrated and then evaluated over 3 years. When applied to diagnostic data, only detection using the Serfling regression method and prediction using the non-adaptive log-linear regression method showed acceptable performances during winter influenza seasons. For the syndromic data, none of the detection algorithms displayed a satisfactory performance, while non-adaptive log-linear regression was the best performing prediction method. We conclude that evidence was found that the available algorithms for influenza detection and prediction display satisfactory performance when applied to local diagnostic data during winter influenza seasons. When applied to local syndromic data, the evaluated algorithms did not display consistent performance. Further evaluations and research on combining these types of methods in public health information infrastructures for 'nowcasting' (integrated detection and prediction) of influenza activity are warranted.
NASA Technical Reports Server (NTRS)
Hlavka, Dennis L.; Palm, S. P.; Welton, E. J.; Hart, W. D.; Spinhirne, J. D.; McGill, M.; Mahesh, A.; Starr, David OC. (Technical Monitor)
2001-01-01
The Geoscience Laser Altimeter System (GLAS) is scheduled for launch on the ICESat satellite as part of the NASA EOS mission in 2002. GLAS will be used to perform high resolution surface altimetry and will also provide a continuously operating atmospheric lidar to profile clouds, aerosols, and the planetary boundary layer with horizontal and vertical resolution of 175 and 76.8 m, respectively. GLAS is the first active satellite atmospheric profiler to provide global coverage. Data products include direct measurements of the heights of aerosol and cloud layers, and the optical depth of transmissive layers. In this poster we provide an overview of the GLAS atmospheric data products, present a simulated GLAS data set, and show results from the simulated data set using the GLAS data processing algorithm. Optical results from the ER-2 Cloud Physics Lidar (CPL), which uses many of the same processing algorithms as GLAS, show algorithm performance with real atmospheric conditions during the Southern African Regional Science Initiative (SAFARI 2000).
Exact analytical solution of irreversible binary dynamics on networks.
Laurence, Edward; Young, Jean-Gabriel; Melnik, Sergey; Dubé, Louis J
2018-03-01
In binary cascade dynamics, the nodes of a graph are in one of two possible states (inactive, active), and nodes in the inactive state make an irreversible transition to the active state, as soon as their precursors satisfy a predetermined condition. We introduce a set of recursive equations to compute the probability of reaching any final state, given an initial state, and a specification of the transition probability function of each node. Because the naive recursive approach for solving these equations takes factorial time in the number of nodes, we also introduce an accelerated algorithm, built around a breadth-first search procedure. This algorithm solves the equations as efficiently as possible in exponential time.
Esposito, Fabrizio; Formisano, Elia; Seifritz, Erich; Goebel, Rainer; Morrone, Renato; Tedeschi, Gioacchino; Di Salle, Francesco
2002-07-01
Independent component analysis (ICA) has been successfully employed to decompose functional MRI (fMRI) time-series into sets of activation maps and associated time-courses. Several ICA algorithms have been proposed in the neural network literature. Applied to fMRI, these algorithms might lead to different spatial or temporal readouts of brain activation. We compared the two ICA algorithms that have been used so far for spatial ICA (sICA) of fMRI time-series: the Infomax (Bell and Sejnowski [1995]: Neural Comput 7:1004-1034) and the Fixed-Point (Hyvärinen [1999]: Adv Neural Inf Proc Syst 10:273-279) algorithms. We evaluated the Infomax- and Fixed Point-based sICA decompositions of simulated motor, and real motor and visual activation fMRI time-series using an ensemble of measures. Log-likelihood (McKeown et al. [1998]: Hum Brain Mapp 6:160-188) was used as a measure of how significantly the estimated independent sources fit the statistical structure of the data; receiver operating characteristics (ROC) and linear correlation analyses were used to evaluate the algorithms' accuracy of estimating the spatial layout and the temporal dynamics of simulated and real activations; cluster sizing calculations and an estimation of a residual gaussian noise term within the components were used to examine the anatomic structure of ICA components and for the assessment of noise reduction capabilities. Whereas both algorithms produced highly accurate results, the Fixed-Point outperformed the Infomax in terms of spatial and temporal accuracy as long as inferential statistics were employed as benchmarks. Conversely, the Infomax sICA was superior in terms of global estimation of the ICA model and noise reduction capabilities. Because of its adaptive nature, the Infomax approach appears to be better suited to investigate activation phenomena that are not predictable or adequately modelled by inferential techniques. Copyright 2002 Wiley-Liss, Inc.
Brain MRI Tumor Detection using Active Contour Model and Local Image Fitting Energy
NASA Astrophysics Data System (ADS)
Nabizadeh, Nooshin; John, Nigel
2014-03-01
Automatic abnormality detection in Magnetic Resonance Imaging (MRI) is an important issue in many diagnostic and therapeutic applications. Here an automatic brain tumor detection method is introduced that uses T1-weighted images and K. Zhang et al.'s active contour model driven by local image fitting (LIF) energy. Local image fitting energy obtains the local image information, which enables the algorithm to segment images with intensity inhomogeneities. An advantage of this method is that the LIF energy functional has less computational complexity than the local binary fitting (LBF) energy functional; moreover, it maintains the sub-pixel accuracy and boundary regularization properties. In Zhang's algorithm, a new level set method based on Gaussian filtering is used to implement the variational formulation, which is not only robust in preventing the energy functional from being trapped in a local minimum, but also effective in keeping the level set function regular. Experiments show that the proposed method achieves highly accurate brain tumor segmentation results.
Application of the artificial bee colony algorithm for solving the set covering problem.
Crawford, Broderick; Soto, Ricardo; Cuesta, Rodrigo; Paredes, Fernando
2014-01-01
The set covering problem is a formal model for many practical optimization problems. In the set covering problem the goal is to choose a subset of the columns of minimal cost that covers every row. Here, we present a novel application of the artificial bee colony algorithm to solve the non-unicost set covering problem. The artificial bee colony algorithm is a recent swarm metaheuristic technique based on the intelligent foraging behavior of honey bees. Experimental results show that our artificial bee colony algorithm is competitive in terms of solution quality with other recent metaheuristic approaches for the set covering problem.
Adaptive Batch Mode Active Learning.
Chakraborty, Shayok; Balasubramanian, Vineeth; Panchanathan, Sethuraman
2015-08-01
Active learning techniques have gained popularity to reduce human effort in labeling data instances for inducing a classifier. When faced with large amounts of unlabeled data, such algorithms automatically identify the exemplar and representative instances to be selected for manual annotation. More recently, there have been attempts toward a batch mode form of active learning, where a batch of data points is simultaneously selected from an unlabeled set. Real-world applications require adaptive approaches for batch selection in active learning, depending on the complexity of the data stream in question. However, the existing work in this field has primarily focused on static or heuristic batch size selection. In this paper, we propose two novel optimization-based frameworks for adaptive batch mode active learning (BMAL), where the batch size as well as the selection criteria are combined in a single formulation. We exploit gradient-descent-based optimization strategies as well as properties of submodular functions to derive the adaptive BMAL algorithms. The solution procedures have the same computational complexity as existing state-of-the-art static BMAL techniques. Our empirical results on the widely used VidTIMIT and the mobile biometric (MOBIO) data sets portray the efficacy of the proposed frameworks and also certify the potential of these approaches in being used for real-world biometric recognition applications.
Ahmadi, Mehdi; Shahlaei, Mohsen
2015-01-01
P2X7 antagonist activity for a set of 49 molecules of the P2X7 receptor antagonists, derivatives of purine, was modeled with the aid of chemometric and artificial intelligence techniques. The activity of these compounds was estimated by means of combination of principal component analysis (PCA), as a well-known data reduction method, genetic algorithm (GA), as a variable selection technique, and artificial neural network (ANN), as a non-linear modeling method. First, a linear regression, combined with PCA, (principal component regression) was operated to model the structure-activity relationships, and afterwards a combination of PCA and ANN algorithm was employed to accurately predict the biological activity of the P2X7 antagonist. PCA preserves as much of the information as possible contained in the original data set. Seven most important PC's to the studied activity were selected as the inputs of ANN box by an efficient variable selection method, GA. The best computational neural network model was a fully-connected, feed-forward model with 7-7-1 architecture. The developed ANN model was fully evaluated by different validation techniques, including internal and external validation, and chemical applicability domain. All validations showed that the constructed quantitative structure-activity relationship model suggested is robust and satisfactory.
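A compressed sketch of the modelling chain (standardise, project onto a few principal components, fit a small feed-forward network) is given below using scikit-learn. The synthetic data and the fixed choice of seven components stand in for the molecular descriptors and the GA-based component selection described in the abstract.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# synthetic stand-ins: 49 "molecules" with 120 descriptors and a continuous activity
rng = np.random.default_rng(0)
X = rng.normal(size=(49, 120))
y = X[:, :5].sum(axis=1) + rng.normal(scale=0.1, size=49)

# PCA keeps 7 components (the GA selection step is omitted here);
# the 7-neuron hidden layer mirrors the 7-7-1 architecture reported in the abstract
model = make_pipeline(StandardScaler(),
                      PCA(n_components=7),
                      MLPRegressor(hidden_layer_sizes=(7,), max_iter=5000, random_state=0))
print(cross_val_score(model, X, y, cv=5, scoring='r2'))
```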
NASA Astrophysics Data System (ADS)
Siregar, H.; Junaeti, E.; Hayatno, T.
2017-03-01
Correspondence is a routine activity in agencies and companies, which often set up a dedicated division to handle letter management. Since most letters are distributed through electronic media, they must be kept confidential to avoid undesirable disclosure. The security requirements can be met using cryptography and digital signatures. In this study, asymmetric and symmetric algorithms, namely RSA and AES, were added to the digital signature scheme to maintain data security. The RSA algorithm was used to generate the digital signature, while the AES algorithm was used to encrypt the message sent to the receiver. The results indicate that combining the AES and RSA algorithms with the digital signature meets four objectives of cryptography: secrecy, data integrity, authentication and non-repudiation.
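A minimal sketch of the sign-then-encrypt idea, using the pyca/cryptography package, is shown below. The paper does not specify padding or cipher modes, so RSA-PSS and AES-GCM are assumptions made purely for illustration.

```python
import os
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

letter = b"Confidential letter body"

# RSA: the sender signs the letter so the receiver can verify origin and integrity
private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
pss = padding.PSS(mgf=padding.MGF1(hashes.SHA256()), salt_length=padding.PSS.MAX_LENGTH)
signature = private_key.sign(letter, pss, hashes.SHA256())      # 256 bytes for a 2048-bit key

# AES (GCM mode assumed): the letter plus signature are encrypted for confidentiality
aes_key, nonce = AESGCM.generate_key(bit_length=256), os.urandom(12)
ciphertext = AESGCM(aes_key).encrypt(nonce, letter + signature, None)

# receiver side: decrypt, then verify the signature with the sender's public key
plaintext = AESGCM(aes_key).decrypt(nonce, ciphertext, None)
private_key.public_key().verify(plaintext[-256:], plaintext[:-256], pss, hashes.SHA256())
```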
A 4D biomechanical lung phantom for joint segmentation/registration evaluation
NASA Astrophysics Data System (ADS)
Markel, Daniel; Levesque, Ives; Larkin, Joe; Léger, Pierre; El Naqa, Issam
2016-10-01
At present, there exist few openly available methods for evaluation of simultaneous segmentation and registration algorithms. Such algorithms allow a combination of both techniques to track the tumor in complex settings such as adaptive radiotherapy. We have produced a quality assurance platform for evaluating this specific subset of algorithms using a preserved porcine lung, such that it is multi-modality compatible: positron emission tomography (PET), computed tomography (CT) and magnetic resonance imaging (MRI). A computer controlled respirator was constructed to pneumatically manipulate the lungs in order to replicate human breathing traces. A registration ground truth was provided using an in-house bifurcation tracking pipeline. Segmentation ground truth was provided by synthetic multi-compartment lesions to simulate biologically active tumor, background tissue and a necrotic core. The bifurcation tracking pipeline results were compared to digital deformations and used to evaluate three registration algorithms: diffeomorphic demons, fast-symmetric forces demons, and MiMVista's deformable registration tool. Three segmentation algorithms were evaluated: the Chan-Vese level sets method, a hybrid technique, and the multi-valued level sets algorithm. The respirator was able to replicate three separate breathing traces with a mean accuracy of 2-2.2%. Bifurcation tracking error was found to be sub-voxel when using human CT data for displacements up to 6.5 cm and approximately 1.5 voxel widths for displacements up to 3.5 cm for the porcine lungs. For the fast-symmetric, diffeomorphic and MiMVista registration algorithms, mean geometric errors were found to be 0.430+/-0.001, 0.416+/-0.001 and 0.605+/-0.002 voxel widths, respectively, using the vector field differences, and 0.4+/-0.2, 0.4+/-0.2 and 0.6+/-0.2 voxel widths using the bifurcation tracking pipeline. The proposed phantom was found sufficient for accurate evaluation of registration and segmentation algorithms. The use of the proposed automatically generated anatomical landmarks can eliminate the time and potential inaccuracy of manual landmark selection using expert observers.
Kaspi, Omer; Yosipof, Abraham; Senderowitz, Hanoch
2017-06-06
An important aspect of chemoinformatics and material-informatics is the use of machine learning algorithms to build Quantitative Structure Activity Relationship (QSAR) models. The RANdom SAmple Consensus (RANSAC) algorithm is a predictive modeling tool widely used in the image processing field for cleaning datasets from noise. RANSAC could be used as a "one stop shop" algorithm for developing and validating QSAR models, performing outlier removal, descriptor selection, model development and predictions for test set samples using an applicability domain. For "future" predictions (i.e., for samples not included in the original test set) RANSAC provides a statistical estimate for the probability of obtaining reliable predictions, i.e., predictions within a pre-defined number of standard deviations from the true values. In this work we describe the first application of RANSAC in material informatics, focusing on the analysis of solar cells. We demonstrate that for three datasets representing different metal oxide (MO) based solar cell libraries, RANSAC-derived models select descriptors previously shown to correlate with key photovoltaic properties and lead to good predictive statistics for these properties. These models were subsequently used to predict the properties of virtual solar cell libraries, highlighting interesting dependencies of PV properties on MO compositions.
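For a flavour of how RANSAC behaves as a regression-cleaning tool, scikit-learn's RANSACRegressor can be run on synthetic data with injected outliers. This generic sketch is not the authors' pipeline, which additionally handles descriptor selection and applicability-domain estimates.

```python
import numpy as np
from sklearn.linear_model import RANSACRegressor

# synthetic stand-in for a solar-cell library: descriptors X, photovoltaic property y,
# with a few corrupted samples playing the role of outliers
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
y = X @ np.array([1.5, -2.0, 0.7, 0.0, 3.0]) + rng.normal(scale=0.2, size=200)
y[:10] += 15.0                                        # inject outliers

ransac = RANSACRegressor(residual_threshold=1.0, random_state=0)  # linear base model by default
ransac.fit(X, y)
inliers = ransac.inlier_mask_                         # samples kept by the consensus set
print(inliers.sum(), ransac.score(X[inliers], y[inliers]))
```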
[Purity Detection Model Update of Maize Seeds Based on Active Learning].
Tang, Jin-ya; Huang, Min; Zhu, Qi-bing
2015-08-01
Seed purity reflects the degree to which a seed variety shows its typical consistent characteristics, so improving the reliability and accuracy of seed purity detection is important to guarantee seed quality. Hyperspectral imaging can capture the internal and external characteristics of seeds at the same time and has been widely used in nondestructive detection of agricultural products. The essence of nondestructive detection of agricultural products using hyperspectral imaging is to establish a mathematical model between the spectral information and the quality of the products. Since the spectral information is easily affected by the sample growth environment, the stability and generalization of the model weaken when test samples are harvested from a different origin or year. An active learning algorithm was investigated to add representative samples and expand the sample space of the original model, so as to enable rapid updating of the model. Random selection (RS) and the Kennard-Stone algorithm (KS) were used for comparison with the active learning algorithm. The experimental results indicated that, for different sample set division ratios (1:1, 3:1, 4:1), the 2010 maize seed purity detection model, after adding 40 samples from 2011 selected by the active learning algorithm, increased its prediction accuracy for the new 2011 samples from 47%, 33.75% and 49% to 98.89%, 98.33% and 98.33%, respectively. For the updated 2011 model, the prediction accuracy for the new 2010 samples increased by 50.83%, 54.58% and 53.75%, reaching 94.57%, 94.02% and 94.57%, after adding 56 new samples from 2010. Meanwhile, the model updated by the active learning algorithm performed better than those updated by RS and KS. Therefore, updating the maize seed purity detection model with an active learning algorithm is feasible.
JPSS Cryosphere Algorithms: Integration and Testing in Algorithm Development Library (ADL)
NASA Astrophysics Data System (ADS)
Tsidulko, M.; Mahoney, R. L.; Meade, P.; Baldwin, D.; Tschudi, M. A.; Das, B.; Mikles, V. J.; Chen, W.; Tang, Y.; Sprietzer, K.; Zhao, Y.; Wolf, W.; Key, J.
2014-12-01
JPSS is a next generation satellite system that is planned to be launched in 2017. The satellites will carry a suite of sensors that are already on board the Suomi National Polar-orbiting Partnership (S-NPP) satellite. The NOAA/NESDIS/STAR Algorithm Integration Team (AIT) works within the Algorithm Development Library (ADL) framework, which mimics the operational JPSS Interface Data Processing Segment (IDPS). The AIT contributes to the development, integration and testing of scientific algorithms employed in the IDPS. This presentation discusses cryosphere-related activities performed in ADL. The addition of a new ancillary data set - NOAA Global Multisensor Automated Snow/Ice data (GMASI) - with ADL code modifications is described. Preliminary GMASI impact on the gridded Snow/Ice product is estimated. Several modifications to the Ice Age algorithm, which mis-classifies ice type for certain areas/time periods, are tested in the ADL. Sensitivity runs for daytime, nighttime and the terminator zone are performed and presented. Comparisons between the original and modified versions of the Ice Age algorithm are also presented.
Experimental Performance of a Genetic Algorithm for Airborne Strategic Conflict Resolution
NASA Technical Reports Server (NTRS)
Karr, David A.; Vivona, Robert A.; Roscoe, David A.; DePascale, Stephen M.; Consiglio, Maria
2009-01-01
The Autonomous Operations Planner, a research prototype flight-deck decision support tool to enable airborne self-separation, uses a pattern-based genetic algorithm to resolve predicted conflicts between the ownship and traffic aircraft. Conflicts are resolved by modifying the active route within the ownship's flight management system according to a predefined set of maneuver pattern templates. The performance of this pattern-based genetic algorithm was evaluated in the context of batch-mode Monte Carlo simulations running over 3600 flight hours of autonomous aircraft in en-route airspace under conditions ranging from typical current traffic densities to several times that level. Encountering over 8900 conflicts during two simulation experiments, the genetic algorithm was able to resolve all but three conflicts, while maintaining a required time of arrival constraint for most aircraft. Actual elapsed running time for the algorithm was consistent with conflict resolution in real time. The paper presents details of the genetic algorithm's design, along with mathematical models of the algorithm's performance and observations regarding the effectiveness of using complementary maneuver patterns when multiple resolutions by the same aircraft were required.
Experimental Performance of a Genetic Algorithm for Airborne Strategic Conflict Resolution
NASA Technical Reports Server (NTRS)
Karr, David A.; Vivona, Robert A.; Roscoe, David A.; DePascale, Stephen M.; Consiglio, Maria
2009-01-01
The Autonomous Operations Planner, a research prototype flight-deck decision support tool to enable airborne self-separation, uses a pattern-based genetic algorithm to resolve predicted conflicts between the ownship and traffic aircraft. Conflicts are resolved by modifying the active route within the ownship's flight management system according to a predefined set of maneuver pattern templates. The performance of this pattern-based genetic algorithm was evaluated in the context of batch-mode Monte Carlo simulations running over 3600 flight hours of autonomous aircraft in en-route airspace under conditions ranging from typical current traffic densities to several times that level. Encountering over 8900 conflicts during two simulation experiments, the genetic algorithm was able to resolve all but three conflicts, while maintaining a required time of arrival constraint for most aircraft. Actual elapsed running time for the algorithm was consistent with conflict resolution in real time. The paper presents details of the genetic algorithm's design, along with mathematical models of the algorithm's performance and observations regarding the effectiveness of using complementary maneuver patterns when multiple resolutions by the same aircraft were required.
Asynchronous Gossip for Averaging and Spectral Ranking
NASA Astrophysics Data System (ADS)
Borkar, Vivek S.; Makhijani, Rahul; Sundaresan, Rajesh
2014-08-01
We consider two variants of the classical gossip algorithm. The first variant is a version of asynchronous stochastic approximation. We highlight a fundamental difficulty associated with the classical asynchronous gossip scheme, viz., that it may not converge to a desired average, and suggest an alternative scheme based on reinforcement learning that has guaranteed convergence to the desired average. We then discuss a potential application to a wireless network setting with simultaneous link activation constraints. The second variant is a gossip algorithm for distributed computation of the Perron-Frobenius eigenvector of a nonnegative matrix. While the first variant draws upon a reinforcement learning algorithm for an average cost controlled Markov decision problem, the second variant draws upon a reinforcement learning algorithm for risk-sensitive control. We then discuss potential applications of the second variant to ranking schemes, reputation networks, and principal component analysis.
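The classical pairwise gossip step that both variants build on can be sketched in a few lines: at every tick a random link is activated and its two endpoints replace their values with their mean, which conserves the network sum and drives all nodes toward the average. The ring topology, node count, and iteration budget below are illustrative assumptions, and the reinforcement-learning correction proposed in the paper for the asynchronous case is not shown.

```python
import numpy as np

rng = np.random.default_rng(1)

n = 20
values = rng.uniform(0.0, 10.0, size=n)          # initial node values
target = values.mean()                           # the average every node should reach
edges = [(i, (i + 1) % n) for i in range(n)]     # assumed ring topology

for _ in range(5000):
    i, j = edges[rng.integers(len(edges))]       # activate one random link
    m = 0.5 * (values[i] + values[j])            # both endpoints move to their mean,
    values[i] = values[j] = m                    # so the global sum is conserved

spread = values.max() - values.min()
print(f"target average {target:.4f}, node spread after gossip {spread:.2e}")
```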
ARTIST: A fully automated artifact rejection algorithm for single-pulse TMS-EEG data.
Wu, Wei; Keller, Corey J; Rogasch, Nigel C; Longwell, Parker; Shpigel, Emmanuel; Rolle, Camarin E; Etkin, Amit
2018-04-01
Concurrent single-pulse TMS-EEG (spTMS-EEG) is an emerging noninvasive tool for probing causal brain dynamics in humans. However, in addition to the common artifacts in standard EEG data, spTMS-EEG data suffer from enormous stimulation-induced artifacts, posing significant challenges to the extraction of neural information. Typically, neural signals are analyzed after a manual time-intensive and often subjective process of artifact rejection. Here we describe a fully automated algorithm for spTMS-EEG artifact rejection. A key step of this algorithm is to decompose the spTMS-EEG data into statistically independent components (ICs), and then train a pattern classifier to automatically identify artifact components based on knowledge of the spatio-temporal profile of both neural and artefactual activities. The autocleaned and hand-cleaned data yield qualitatively similar group evoked potential waveforms. The algorithm achieves a 95% IC classification accuracy referenced to expert artifact rejection performance, and does so across a large number of spTMS-EEG data sets (n = 90 stimulation sites), retains high accuracy across stimulation sites/subjects/populations/montages, and outperforms current automated algorithms. Moreover, the algorithm was superior to the artifact rejection performance of relatively novice individuals, who would be the likely users of spTMS-EEG as the technique becomes more broadly disseminated. In summary, our algorithm provides an automated, fast, objective, and accurate method for cleaning spTMS-EEG data, which can increase the utility of TMS-EEG in both clinical and basic neuroscience settings. © 2018 Wiley Periodicals, Inc.
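The general recipe (ICA decomposition followed by a pattern classifier over component features) can be sketched as below. The FastICA implementation, the three hand-picked component features, the random-forest classifier, and the synthetic data are all assumptions for illustration and are not the ARTIST pipeline or its trained classifier.

```python
import numpy as np
from sklearn.decomposition import FastICA
from sklearn.ensemble import RandomForestClassifier

def component_features(sources, mixing):
    """Simple spatio-temporal descriptors, one row per independent component."""
    feats = []
    for k in range(sources.shape[0]):
        s, topo = sources[k], mixing[:, k]
        feats.append([
            np.abs(s).max() / (np.abs(s).mean() + 1e-12),        # peakiness: stimulation artifacts are spiky
            np.var(np.diff(s)),                                   # high-frequency content
            np.abs(topo).max() / (np.abs(topo).mean() + 1e-12),   # focal vs. distributed scalp topography
        ])
    return np.asarray(feats)

# eeg: channels x samples (random stand-in for concatenated spTMS-EEG epochs).
rng = np.random.default_rng(2)
eeg = rng.normal(size=(32, 2000))
ica = FastICA(n_components=16, random_state=0)
sources = ica.fit_transform(eeg.T).T              # components x samples
X = component_features(sources, ica.mixing_)

# In practice the labels come from expert-marked components on a training corpus.
y = rng.integers(0, 2, size=X.shape[0])            # placeholder labels: 1 = artifact
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
is_artifact = clf.predict(X).astype(bool)

# Reconstruct the data with artifact components zeroed out (samples x channels).
cleaned = ica.inverse_transform(sources.T * ~is_artifact)
```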
Extreme Trust Region Policy Optimization for Active Object Recognition.
Liu, Huaping; Wu, Yupei; Sun, Fuchun
2018-06-01
In this brief, we develop a deep reinforcement learning method to actively recognize objects by choosing a sequence of actions for an active camera that helps to discriminate between the objects. The method is realized using trust region policy optimization, in which the policy is realized by an extreme learning machine and therefore leads to an efficient optimization algorithm. The experimental results on the publicly available data set show the advantages of the developed extreme trust region optimization method.
Chen, Hongda; Knebel, Phillip; Brenner, Hermann
2016-07-01
Search for biomarkers for early detection of cancer is a very active area of research, but most studies are done in clinical rather than screening settings. We aimed to empirically evaluate the role of study setting for early detection marker identification and validation. A panel of 92 candidate cancer protein markers was measured in 35 clinically identified colorectal cancer patients and 35 colorectal cancer patients identified at screening colonoscopy. For each case group, we selected 38 controls without colorectal neoplasms at screening colonoscopy. Single-, two- and three-marker combinations discriminating cases and controls were identified in each setting and subsequently validated in the alternative setting. In all scenarios, a higher number of predictive biomarkers were initially detected in the clinical setting, but a substantially lower proportion of identified biomarkers could subsequently be confirmed in the screening setting. Confirmation rates were 50.0%, 84.5%, and 74.2% for one-, two-, and three-marker algorithms identified in the screening setting and were 42.9%, 18.6%, and 25.7% for algorithms identified in the clinical setting. Validation of early detection markers of cancer in a true screening setting is important to limit the number of false-positive findings. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
OligoIS: Scalable Instance Selection for Class-Imbalanced Data Sets.
García-Pedrajas, Nicolás; Perez-Rodríguez, Javier; de Haro-García, Aida
2013-02-01
In current research, an enormous amount of information is constantly being produced, which poses a challenge for data mining algorithms. Many of the problems in extremely active research areas, such as bioinformatics, security and intrusion detection, or text mining, share the following two features: large data sets and class-imbalanced distribution of samples. Although many methods have been proposed for dealing with class-imbalanced data sets, most of these methods are not scalable to the very large data sets common to those research fields. In this paper, we propose a new approach to dealing with the class-imbalance problem that is scalable to data sets with many millions of instances and hundreds of features. This proposal is based on the divide-and-conquer principle combined with application of the selection process to balanced subsets of the whole data set. This divide-and-conquer principle allows the execution of the algorithm in linear time. Furthermore, the proposed method is easy to implement using a parallel environment and can work without loading the whole data set into memory. Using 40 class-imbalanced medium-sized data sets, we will demonstrate our method's ability to improve the results of state-of-the-art instance selection methods for class-imbalanced data sets. Using three very large data sets, we will show the scalability of our proposal to millions of instances and hundreds of features.
Ziacchi, Matteo; Palmisano, Pietro; Biffi, Mauro; Ricci, Renato P; Landolina, Maurizio; Zoni-Berisso, Massimo; Occhetta, Eraldo; Maglia, Giampiero; Botto, Gianluca; Padeletti, Luigi; Boriani, Giuseppe
2018-04-01
Modern pacemakers have an increasing number of programmable parameters and specific algorithms designed to optimize pacing therapy in relation to the individual characteristics of patients. When choosing the most appropriate pacemaker type and programming, the following variables must be taken into account: the type of bradyarrhythmia at the time of pacemaker implantation; the cardiac chamber requiring pacing, and the percentage of pacing actually needed to correct the rhythm disorder; the possible association of multiple rhythm disturbances and conduction diseases; the evolution of conduction disorders during follow-up. The goals of device programming are to preserve or restore the heart rate response to metabolic and hemodynamic demands; to maintain physiological conduction; to maximize device longevity; to detect, prevent, and treat atrial arrhythmia. In patients with sinus node disease, the optimal pacing mode is DDDR. Based on all the available evidence, in this setting, we consider appropriate the activation of the following algorithms: rate-responsive function in patients with chronotropic incompetence; algorithms to maximize intrinsic atrioventricular conduction in the absence of atrioventricular blocks; mode-switch algorithms; algorithms for autoadaptive management of the atrial pacing output; and algorithms for the prevention and treatment of atrial tachyarrhythmias in the subgroup of patients with atrial tachyarrhythmias/atrial fibrillation. The purpose of this two-part consensus document is to provide specific suggestions (based on an extensive literature review) on appropriate pacemaker setting in relation to patients' clinical features.
Li, Changyang; Wang, Xiuying; Eberl, Stefan; Fulham, Michael; Yin, Yong; Dagan Feng, David
2015-01-01
Automated and general medical image segmentation can be challenging because the foreground and the background may have complicated and overlapping density distributions in medical imaging. Conventional region-based level set algorithms often assume piecewise constant or piecewise smooth for segments, which are implausible for general medical image segmentation. Furthermore, low contrast and noise make identification of the boundaries between foreground and background difficult for edge-based level set algorithms. Thus, to address these problems, we suggest a supervised variational level set segmentation model to harness the statistical region energy functional with a weighted probability approximation. Our approach models the region density distributions by using the mixture-of-mixtures Gaussian model to better approximate real intensity distributions and distinguish statistical intensity differences between foreground and background. The region-based statistical model in our algorithm can intuitively provide better performance on noisy images. We constructed a weighted probability map on graphs to incorporate spatial indications from user input with a contextual constraint based on the minimization of contextual graphs energy functional. We measured the performance of our approach on ten noisy synthetic images and 58 medical datasets with heterogeneous intensities and ill-defined boundaries and compared our technique to the Chan-Vese region-based level set model, the geodesic active contour model with distance regularization, and the random walker model. Our method consistently achieved the highest Dice similarity coefficient when compared to the other methods.
Parallel Clustering Algorithm for Large-Scale Biological Data Sets
Wang, Minchao; Zhang, Wu; Ding, Wang; Dai, Dongbo; Zhang, Huiran; Xie, Hao; Chen, Luonan; Guo, Yike; Xie, Jiang
2014-01-01
Background: The recent explosion of biological data brings a great challenge for traditional clustering algorithms. With the increasing scale of data sets, much larger memory and longer runtime are required for cluster identification problems. The affinity propagation algorithm outperforms many other classical clustering algorithms and is widely applied in biological research. However, its time and space complexity become a great bottleneck when handling large-scale data sets. Moreover, the similarity matrix, whose construction takes a long runtime, is required before running the affinity propagation algorithm, since the algorithm clusters data sets based on the similarities between data pairs. Methods: Two types of parallel architectures are proposed in this paper to accelerate the similarity matrix construction and the affinity propagation algorithm. The memory-shared architecture is used to construct the similarity matrix, and the distributed system is taken for the affinity propagation algorithm, because of its large memory size and great computing capacity. An appropriate way of data partition and reduction is designed in our method, in order to minimize the global communication cost among processes. Results: A speedup of 100 is gained with 128 cores. The runtime is reduced from several hours to a few seconds, which indicates that the parallel algorithm is capable of handling large-scale data sets effectively. The parallel affinity propagation also achieves good performance when clustering large-scale gene data (microarray) and detecting families in large protein superfamilies. PMID:24705246
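The serial building blocks being parallelized here can be illustrated with scikit-learn's affinity propagation applied to a precomputed similarity matrix. The negative squared-Euclidean similarity, the toy expression matrix, and the damping value are assumptions; the sketch shows why the similarity matrix is the natural target for the memory-shared stage (each entry depends only on one pair of rows), but it does not reproduce the paper's parallel decomposition.

```python
import numpy as np
from sklearn.cluster import AffinityPropagation
from sklearn.metrics import pairwise_distances

rng = np.random.default_rng(3)
expression = rng.normal(size=(300, 40))              # genes x conditions (toy stand-in)

# Constructing the similarity matrix is the expensive, embarrassingly parallel step.
S = -pairwise_distances(expression, metric="sqeuclidean")

ap = AffinityPropagation(affinity="precomputed", damping=0.9, max_iter=500, random_state=0)
labels = ap.fit_predict(S)
print("clusters found:", len(ap.cluster_centers_indices_))
```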
Kupas, Katrin; Ultsch, Alfred; Klebe, Gerhard
2008-05-15
A new method to discover similar substructures in protein binding pockets, independently of sequence and folding patterns or secondary structure elements, is introduced. The solvent-accessible surface of a binding pocket, automatically detected as a depression on the protein surface, is divided into a set of surface patches. Each surface patch is characterized by its shape as well as by its physicochemical characteristics. Wavelets defined on surfaces are used for the description of the shape, as they have the great advantage of allowing a comparison at different resolutions. The number of coefficients to describe the wavelets can be chosen with respect to the size of the considered data set. The physicochemical characteristics of the patches are described by the assignment of the exposed amino acid residues to one or more of five different properties determinant for molecular recognition. A self-organizing neural network is used to project the high-dimensional feature vectors onto a two-dimensional layer of neurons, called a map. To find similarities between the binding pockets, in both geometrical and physicochemical features, a clustering of the projected feature vectors is performed using an automatic distance- and density-based clustering algorithm. The method was validated with a small training data set of 109 binding cavities originating from a set of enzymes covering 12 different EC numbers. A second test data set of 1378 binding cavities, extracted from enzymes of 13 different EC numbers, was then used to prove the discriminating power of the algorithm and to demonstrate its applicability to large-scale analyses. In all cases, members of the data set with the same EC number were placed into coherent regions on the map, with small distances between them. Different EC numbers are separated by large distances between the feature vectors. A third data set comprising three subfamilies of endopeptidases is used to demonstrate the ability of the algorithm to detect similar substructures between functionally related active sites. The algorithm can also be used to predict the function of novel proteins not considered in the training data set. © 2007 Wiley-Liss, Inc.
Observations on the Invalid Scoring Algorithm of "NASA" and Similar Consensus Tasks.
ERIC Educational Resources Information Center
Slevin, Dennis P.
1978-01-01
The NASA ranking task and similar ranking activities used to demonstrate the superiority of group thinking are examined. It is argued that the current scores cannot be used to prove the superiority of group-consensus decision making in either training or research settings. (Author)
Validation of SMAP surface soil moisture products with core validation sites
USDA-ARS?s Scientific Manuscript database
The NASA Soil Moisture Active Passive (SMAP) mission has utilized a set of core validation sites as the primary methodology in assessing the soil moisture retrieval algorithm performance. Those sites provide well-calibrated in situ soil moisture measurements within SMAP product grid pixels for diver...
Active subspace: toward scalable low-rank learning.
Liu, Guangcan; Yan, Shuicheng
2012-12-01
We address the scalability issues in low-rank matrix learning problems. Usually these problems resort to solving nuclear norm regularized optimization problems (NNROPs), which often suffer from high computational complexities if based on existing solvers, especially in large-scale settings. Based on the fact that the optimal solution matrix to an NNROP is often low rank, we revisit the classic mechanism of low-rank matrix factorization, based on which we present an active subspace algorithm for efficiently solving NNROPs by transforming large-scale NNROPs into small-scale problems. The transformation is achieved by factorizing the large solution matrix into the product of a small orthonormal matrix (active subspace) and another small matrix. Although such a transformation generally leads to nonconvex problems, we show that a suboptimal solution can be found by the augmented Lagrange alternating direction method. For the robust PCA (RPCA) (Candès, Li, Ma, & Wright, 2009) problem, a typical example of NNROPs, theoretical results verify the suboptimality of the solution produced by our algorithm. For the general NNROPs, we empirically show that our algorithm significantly reduces the computational complexity without loss of optimality.
Sharifahmadian, Ershad
2006-01-01
The set partitioning in hierarchical trees (SPIHT) algorithm is a very effective and computationally simple technique for image and signal compression. Here the author modified the algorithm so that it provides even better performance than the SPIHT algorithm. The enhanced set partitioning in hierarchical trees (ESPIHT) algorithm performs faster than the SPIHT algorithm. In addition, the proposed algorithm reduces the number of bits in a bit stream that is stored or transmitted. I applied it to the compression of multichannel ECG data. I also present a specific procedure based on the modified algorithm for more efficient compression of multichannel ECG data. This method was employed on selected records from the MIT-BIH arrhythmia database. According to the experiments, the proposed method attained significant results regarding the compression of multichannel ECG data. Furthermore, in order to compress a signal that is stored for a long time, the proposed multichannel compression method can be utilized efficiently.
Efficient bulk-loading of gridfiles
NASA Technical Reports Server (NTRS)
Leutenegger, Scott T.; Nicol, David M.
1994-01-01
This paper considers the problem of bulk-loading large data sets for the gridfile multiattribute indexing technique. We propose a rectilinear partitioning algorithm that heuristically seeks to minimize the size of the gridfile needed to ensure no bucket overflows. Empirical studies on both synthetic data sets and on data sets drawn from computational fluid dynamics applications demonstrate that our algorithm is very efficient, and is able to handle large data sets. In addition, we present an algorithm for bulk-loading data sets too large to fit in main memory. Utilizing a sort of the entire data set, it creates a gridfile without incurring any overflows.
Macromolecular target prediction by self-organizing feature maps.
Schneider, Gisbert; Schneider, Petra
2017-03-01
Rational drug discovery would greatly benefit from a more nuanced appreciation of the activity of pharmacologically active compounds against a diverse panel of macromolecular targets. Already, computational target-prediction models assist medicinal chemists in library screening, de novo molecular design, optimization of active chemical agents, drug re-purposing, in the spotting of potential undesired off-target activities, and in the 'de-orphaning' of phenotypic screening hits. The self-organizing map (SOM) algorithm has been employed successfully for these and other purposes. Areas covered: The authors recapitulate contemporary artificial neural network methods for macromolecular target prediction, and present the basic SOM algorithm at a conceptual level. Specifically, they highlight consensus target-scoring by the employment of multiple SOMs, and discuss the opportunities and limitations of this technique. Expert opinion: Self-organizing feature maps represent a straightforward approach to ligand clustering and classification. Some of the appeal lies in their conceptual simplicity and broad applicability domain. Despite known algorithmic shortcomings, this computational target prediction concept has been proven to work in prospective settings with high success rates. It represents a prototypic technique for future advances in the in silico identification of the modes of action and macromolecular targets of bioactive molecules.
Aziz, Omar; Musngi, Magnus; Park, Edward J; Mori, Greg; Robinovitch, Stephen N
2017-01-01
Falls are the leading cause of injury-related morbidity and mortality among older adults. Over 90 % of hip and wrist fractures and 60 % of traumatic brain injuries in older adults are due to falls. Another serious consequence of falls among older adults is the 'long lie' experienced by individuals who are unable to get up and remain on the ground for an extended period of time after a fall. Considerable research has been conducted over the past decade on the design of wearable sensor systems that can automatically detect falls and send an alert to care providers to reduce the frequency and severity of long lies. While most systems described to date incorporate threshold-based algorithms, machine learning algorithms may offer increased accuracy in detecting falls. In the current study, we compared the accuracy of these two approaches in detecting falls by conducting a comprehensive set of falling experiments with 10 young participants. Participants wore waist-mounted tri-axial accelerometers and simulated the most common causes of falls observed in older adults, along with near-falls and activities of daily living. The overall performance of five machine learning algorithms was greater than the performance of five threshold-based algorithms described in the literature, with support vector machines providing the highest combination of sensitivity and specificity.
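The two detector families compared in the study can be contrasted with a short sketch: a fixed-threshold rule on the acceleration magnitude versus a support vector machine over simple window features. The sampling rate, window length, thresholds, feature set, and the randomly generated training windows are all assumptions and not the protocol or the tuned detectors of the experiment.

```python
import numpy as np
from sklearn.svm import SVC

FS = 50  # assumed accelerometer sampling rate (Hz)

def windows(acc, length=2 * FS, step=FS):
    """Slide a 2 s window over a (n_samples, 3) acceleration trace."""
    for start in range(0, len(acc) - length + 1, step):
        yield acc[start:start + length]

def threshold_detector(win, upper=2.5, lower=0.6):
    """Classic rule: an impact spike preceded/followed by a free-fall dip in magnitude (in g)."""
    mag = np.linalg.norm(win, axis=1)
    return bool(mag.max() > upper and mag.min() < lower)

def features(win):
    """Feature vector for the learned detector."""
    mag = np.linalg.norm(win, axis=1)
    return [mag.max(), mag.min(), mag.mean(), mag.std(), np.abs(np.diff(mag)).max()]

# Training data would come from labelled simulated falls, near-falls and daily activities.
rng = np.random.default_rng(4)
train_windows = [rng.normal(1.0, 0.3, size=(2 * FS, 3)) for _ in range(200)]
train_labels = rng.integers(0, 2, size=200)          # placeholder labels: 1 = fall
clf = SVC(kernel="rbf").fit([features(w) for w in train_windows], train_labels)

trace = rng.normal(1.0, 0.3, size=(60 * FS, 3))       # one minute of test data
rule_hits = [threshold_detector(w) for w in windows(trace)]
svm_hits = clf.predict([features(w) for w in windows(trace)])
```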
A Target Coverage Scheduling Scheme Based on Genetic Algorithms in Directional Sensor Networks
Gil, Joon-Min; Han, Youn-Hee
2011-01-01
As a promising tool for monitoring the physical world, directional sensor networks (DSNs) consisting of a large number of directional sensors are attracting increasing attention. As directional sensors in DSNs have limited battery power and restricted angles of sensing range, maximizing the network lifetime while monitoring all the targets in a given area remains a challenge. A major technique to conserve the energy of directional sensors is to use a node wake-up scheduling protocol by which some sensors remain active to provide sensing services, while the others are inactive to conserve their energy. In this paper, we first address a Maximum Set Covers for DSNs (MSCD) problem, which is known to be NP-complete, and present a greedy algorithm-based target coverage scheduling scheme that can solve this problem by heuristics. This scheme is used as a baseline for comparison. We then propose a target coverage scheduling scheme based on a genetic algorithm that can find the optimal cover sets to extend the network lifetime while monitoring all targets by the evolutionary global search technique. To verify and evaluate these schemes, we conducted simulations and showed that the schemes can contribute to extending the network lifetime. Simulation results indicated that the genetic algorithm-based scheduling scheme had better performance than the greedy algorithm-based scheme in terms of maximizing network lifetime. PMID:22319387
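A toy genetic-algorithm sketch of the underlying idea, evolving a binary activation vector so that all targets stay covered while as few directional sensors as possible remain active, is given below. The random coverage matrix, the fitness weighting, and the GA operators (truncation selection, one-point crossover, bit-flip mutation) are illustrative assumptions and not the encoding or operators of the proposed scheduling scheme.

```python
import numpy as np

rng = np.random.default_rng(5)
n_sensors, n_targets = 30, 12

# covers[s, t] = True if sensor s, in its chosen direction, can sense target t (toy geometry).
covers = rng.random((n_sensors, n_targets)) < 0.2

def fitness(individual):
    """Reward full target coverage first, then prefer fewer active sensors (longer lifetime)."""
    active = individual.astype(bool)
    covered = covers[active].any(axis=0).sum()
    return covered * n_sensors - active.sum()

def evolve(pop_size=60, generations=200, p_mut=0.02):
    pop = rng.integers(0, 2, size=(pop_size, n_sensors))
    for _ in range(generations):
        scores = np.array([fitness(ind) for ind in pop])
        parents = pop[np.argsort(scores)[-pop_size // 2:]]          # truncation selection
        children = []
        while len(children) < pop_size:
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, n_sensors)                         # one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            child = child ^ (rng.random(n_sensors) < p_mut)          # bit-flip mutation
            children.append(child)
        pop = np.array(children)
    best = max(pop, key=fitness)
    return best, covers[best.astype(bool)].any(axis=0).all()

best, all_covered = evolve()
print("active sensors:", int(best.sum()), "all targets covered:", bool(all_covered))
```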
Real-time segmentation of burst suppression patterns in critical care EEG monitoring
Westover, M. Brandon; Shafi, Mouhsin M.; Ching, ShiNung; Chemali, Jessica J.; Purdon, Patrick L.; Cash, Sydney S.; Brown, Emery N.
2014-01-01
Objective: Develop a real-time algorithm to automatically discriminate suppressions from non-suppressions (bursts) in electroencephalograms of critically ill adult patients. Methods: A real-time method for segmenting adult ICU EEG data into bursts and suppressions is presented based on thresholding local voltage variance. Results are validated against manual segmentations by two experienced human electroencephalographers. We compare inter-rater agreement between manual EEG segmentations by experts with inter-rater agreement between human vs automatic segmentations, and investigate the robustness of segmentation quality to variations in algorithm parameter settings. We further compare the results of using these segmentations as input for calculating the burst suppression probability (BSP), a continuous measure of depth-of-suppression. Results: Automated segmentation was comparable to manual segmentation, i.e. algorithm-vs-human agreement was comparable to human-vs-human agreement, as judged by comparing raw EEG segmentations or the derived BSP signals. Results were robust to modest variations in algorithm parameter settings. Conclusions: Our automated method satisfactorily segments burst suppression data across a wide range of adult ICU EEG patterns. Performance is comparable to or exceeds that of manual segmentation by human electroencephalographers. Significance: Automated segmentation of burst suppression EEG patterns is an essential component of quantitative brain activity monitoring in critically ill and anesthetized adults. The segmentations produced by our algorithm provide a basis for accurate tracking of suppression depth. PMID:23891828
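The core of the method, thresholding a local voltage-variance estimate, can be sketched as follows. The sampling rate, window length, variance threshold, and the synthetic alternating burst/suppression trace are assumptions for illustration, not the validated parameter settings reported in the paper.

```python
import numpy as np

def segment_burst_suppression(eeg, fs=200, win_s=0.5, var_threshold=25.0):
    """Label each sample 1 (suppression) or 0 (burst) by thresholding local variance.

    eeg: 1-D EEG trace in microvolts; fs: sampling rate in Hz.
    """
    win = int(win_s * fs)
    # Centered moving variance via cumulative sums (O(n), so usable in real time).
    pad = np.pad(eeg, win // 2, mode="edge")
    csum = np.cumsum(np.insert(pad, 0, 0.0))
    csum2 = np.cumsum(np.insert(pad ** 2, 0, 0.0))
    mean = (csum[win:] - csum[:-win]) / win
    var = (csum2[win:] - csum2[:-win]) / win - mean ** 2
    var = var[:len(eeg)]
    return (var < var_threshold).astype(int)       # low variance -> suppression

# Toy trace: alternating high-amplitude bursts and near-flat suppressions.
rng = np.random.default_rng(6)
fs = 200
bursts = rng.normal(0, 30, size=fs)                # 1 s burst, sigma = 30 uV
supp = rng.normal(0, 2, size=fs)                   # 1 s suppression, sigma = 2 uV
eeg = np.concatenate([bursts, supp] * 10)
labels = segment_burst_suppression(eeg, fs=fs)
bsp = labels.mean()                                 # crude burst suppression probability
```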
Hybrid approach for detection of dental caries based on the methods FCM and level sets
NASA Astrophysics Data System (ADS)
Chaabene, Marwa; Ben Ali, Ramzi; Ejbali, Ridha; Zaied, Mourad
2017-03-01
This paper presents a new technique for the detection of dental caries, a bacterial disease that destroys the tooth structure. In our approach, we have developed a new segmentation method that combines the advantages of the fuzzy C-means (FCM) algorithm and the level set method. The results obtained by the FCM algorithm are used by the level set algorithm to reduce the influence of noise on each of these algorithms, to facilitate level set manipulation, and to lead to more robust segmentation. The sensitivity and specificity confirm the effectiveness of the proposed method for caries detection.
Code Verification Capabilities and Assessments in Support of ASC V&V Level 2 Milestone #6035
DOE Office of Scientific and Technical Information (OSTI.GOV)
Doebling, Scott William; Budzien, Joanne Louise; Ferguson, Jim Michael
This document provides a summary of the code verification activities supporting the FY17 Level 2 V&V milestone entitled “Deliver a Capability for V&V Assessments of Code Implementations of Physics Models and Numerical Algorithms in Support of Future Predictive Capability Framework Pegposts.” The physics validation activities supporting this milestone are documented separately. The objectives of this portion of the milestone are: 1) Develop software tools to support code verification analysis; 2) Document standard definitions of code verification test problems; and 3) Perform code verification assessments (focusing on error behavior of algorithms). This report and a set of additional standalone documents serve as the compilation of results demonstrating accomplishment of these objectives.
2010-01-01
Background: Epistasis is recognized as a fundamental part of the genetic architecture of individuals. Several computational approaches have been developed to model gene-gene interactions in case-control studies, however, none of them is suitable for time-dependent analysis. Herein we introduce the Survival Dimensionality Reduction (SDR) algorithm, a non-parametric method specifically designed to detect epistasis in lifetime datasets. Results: The algorithm requires neither specification about the underlying survival distribution nor about the underlying interaction model and proved satisfactorily powerful to detect a set of causative genes in synthetic epistatic lifetime datasets with a limited number of samples and high degree of right-censorship (up to 70%). The SDR method was then applied to a series of 386 Dutch patients with active rheumatoid arthritis who were treated with anti-TNF biological agents. Among a set of 39 candidate genes, none of which showed a detectable marginal effect on anti-TNF responses, the SDR algorithm did find that the rs1801274 SNP in the FcγRIIa gene and the rs10954213 SNP in the IRF5 gene non-linearly interact to predict clinical remission after anti-TNF biologicals. Conclusions: Simulation studies and application in a real-world setting support the capability of the SDR algorithm to model epistatic interactions in candidate-genes studies in presence of right-censored data. Availability: http://sourceforge.net/projects/sdrproject/ PMID:20691091
Identifying reliable independent components via split-half comparisons
Groppe, David M.; Makeig, Scott; Kutas, Marta
2011-01-01
Independent component analysis (ICA) is a family of unsupervised learning algorithms that have proven useful for the analysis of the electroencephalogram (EEG) and magnetoencephalogram (MEG). ICA decomposes an EEG/MEG data set into a basis of maximally temporally independent components (ICs) that are learned from the data. As with any statistic, a concern with using ICA is the degree to which the estimated ICs are reliable. An IC may not be reliable if ICA was trained on insufficient data, if ICA training was stopped prematurely or at a local minimum (for some algorithms), or if multiple global minima were present. Consequently, evidence of ICA reliability is critical for the credibility of ICA results. In this paper, we present a new algorithm for assessing the reliability of ICs based on applying ICA separately to split-halves of a data set. This algorithm improves upon existing methods in that it considers both IC scalp topographies and activations, uses a probabilistically interpretable threshold for accepting ICs as reliable, and requires applying ICA only three times per data set. As evidence of the method’s validity, we show that the method can perform comparably to more time intensive bootstrap resampling and depends in a reasonable manner on the amount of training data. Finally, using the method we illustrate the importance of checking the reliability of ICs by demonstrating that IC reliability is dramatically increased by removing the mean EEG at each channel for each epoch of data rather than the mean EEG in a prestimulus baseline. PMID:19162199
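The split-half idea can be sketched by decomposing each half of a recording separately and matching components through the similarity of their mixing-matrix columns (scalp topographies). FastICA, cosine similarity as the matching score, and the 0.9 acceptance threshold are assumptions; the published method uses a probabilistically interpretable criterion and also compares component activations, which this sketch omits.

```python
import numpy as np
from sklearn.decomposition import FastICA

def split_half_reliability(eeg, n_components=15, min_similarity=0.9):
    """Run ICA on each half of a channels-x-samples recording and match scalp topographies.

    Returns, for each first-half component, the best absolute cosine similarity between its
    mixing column and any second-half mixing column, plus a boolean reliability flag.
    """
    half = eeg.shape[1] // 2
    topos = []
    for part in (eeg[:, :half], eeg[:, half:]):
        ica = FastICA(n_components=n_components, random_state=0)
        ica.fit(part.T)
        topos.append(ica.mixing_ / np.linalg.norm(ica.mixing_, axis=0))
    match = np.abs(topos[0].T @ topos[1])       # pairwise |cosine| between unit-norm topographies
    best = match.max(axis=1)
    return best, best >= min_similarity

rng = np.random.default_rng(7)
eeg = rng.normal(size=(32, 20000))              # random stand-in for a continuous recording
best_sim, reliable = split_half_reliability(eeg)
print(f"{reliable.sum()} of {len(reliable)} components exceed the similarity threshold")
```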
Prediction of new bioactive molecules using a Bayesian belief network.
Abdo, Ammar; Leclère, Valérie; Jacques, Philippe; Salim, Naomie; Pupin, Maude
2014-01-27
Natural products and synthetic compounds are a valuable source of new small molecules leading to novel drugs to cure diseases. However, identifying new biologically active small molecules is still a challenge. In this paper, we introduce a new activity prediction approach using a Bayesian belief network for classification (BBNC). The roots of the network are the fragments composing a compound. The leaves are, on one side, the activities to predict and, on another side, the unknown compound. The activities are represented by sets of known compounds, and sets of inactive compounds are also used. We calculated a similarity between an unknown compound and each activity class. The most similar activity is assigned to the unknown compound. We applied this new approach to eight well-known data sets extracted from the literature and compared its performance to three classical machine learning algorithms. Experiments showed that BBNC provides interesting prediction rates (from 79% accuracy for highly diverse data sets to 99% for low-diversity ones) with a short calculation time. Experiments also showed that BBNC is particularly effective for homogeneous data sets but has been found to perform less well with structurally heterogeneous sets. However, it is important to stress that we believe that using several approaches whenever possible for activity prediction can often give a broader understanding of the data than using only one approach alone. Thus, BBNC is a useful addition to the computational chemist's toolbox.
Sparse Contextual Activation for Efficient Visual Re-Ranking.
Bai, Song; Bai, Xiang
2016-03-01
In this paper, we propose an extremely efficient algorithm for visual re-ranking. By considering the original pairwise distance in the contextual space, we develop a feature vector called sparse contextual activation (SCA) that encodes the local distribution of an image. Hence, re-ranking task can be simply accomplished by vector comparison under the generalized Jaccard metric, which has its theoretical meaning in the fuzzy set theory. In order to improve the time efficiency of re-ranking procedure, inverted index is successfully introduced to speed up the computation of generalized Jaccard metric. As a result, the average time cost of re-ranking for a certain query can be controlled within 1 ms. Furthermore, inspired by query expansion, we also develop an additional method called local consistency enhancement on the proposed SCA to improve the retrieval performance in an unsupervised manner. On the other hand, the retrieval performance using a single feature may not be satisfactory enough, which inspires us to fuse multiple complementary features for accurate retrieval. Based on SCA, a robust feature fusion algorithm is exploited that also preserves the characteristic of high time efficiency. We assess our proposed method in various visual re-ranking tasks. Experimental results on Princeton shape benchmark (3D object), WM-SRHEC07 (3D competition), YAEL data set B (face), MPEG-7 data set (shape), and Ukbench data set (image) manifest the effectiveness and efficiency of SCA.
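The generalized Jaccard score used here to compare activation vectors has a simple closed form, the sum of element-wise minima over the sum of element-wise maxima; a small sketch with made-up activation vectors follows. The vectors themselves and the convention for two all-zero inputs are illustrative assumptions.

```python
import numpy as np

def generalized_jaccard(a, b):
    """Generalized Jaccard similarity of two non-negative vectors: sum(min) / sum(max)."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    denom = np.maximum(a, b).sum()
    return np.minimum(a, b).sum() / denom if denom > 0 else 1.0

# Two sparse contextual activation vectors (toy values).
q = np.array([0.0, 0.6, 0.0, 0.3, 0.1])
d = np.array([0.1, 0.5, 0.0, 0.0, 0.4])
print(generalized_jaccard(q, d))        # re-ranking score: higher means more similar
```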
Neurodynamic evaluation of hearing aid features using EEG correlates of listening effort.
Bernarding, Corinna; Strauss, Daniel J; Hannemann, Ronny; Seidler, Harald; Corona-Strauss, Farah I
2017-06-01
In this study, we propose a novel estimate of listening effort using electroencephalographic data. This method is a translation of our past findings, gained from the evoked electroencephalographic activity, to the oscillatory EEG activity. To test this technique, electroencephalographic data from experienced hearing aid users with moderate hearing loss were recorded, wearing hearing aids. The investigated hearing aid settings were: a directional microphone combined with a noise reduction algorithm in a medium and a strong setting, the noise reduction setting turned off, and a setting using omnidirectional microphones without any noise reduction. The results suggest that the electroencephalographic estimate of listening effort seems to be a useful tool to map the exerted effort of the participants. In addition, the results indicate that a directional processing mode can reduce the listening effort in multitalker listening situations.
The improved Apriori algorithm based on matrix pruning and weight analysis
NASA Astrophysics Data System (ADS)
Lang, Zhenhong
2018-04-01
Drawing on matrix compression and weight analysis algorithms, this paper proposes an improved Apriori algorithm based on matrix pruning and weight analysis. After the transactional database is scanned only once, the algorithm constructs a Boolean transaction matrix. By counting the ones in the rows and columns of the matrix, infrequent item sets are pruned and a new candidate item set is formed. Then the item weights, the transaction weights, and the weighted support of the items are calculated, and the frequent item sets are obtained. The experimental result shows that the improved Apriori algorithm not only reduces the number of repeated scans of the database, but also improves the efficiency of data correlation mining.
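A compact sketch of the Boolean-matrix view of Apriori is given below: item supports are column sums, infrequent columns are pruned early, and itemset supports are row-wise conjunctions. The toy transaction matrix and minimum support are assumptions, and the weight-analysis step of the proposed algorithm is not reproduced.

```python
import numpy as np

def frequent_itemsets(bool_matrix, min_support=0.5):
    """Boolean-matrix Apriori sketch: transactions are rows, items are columns."""
    n_tx, n_items = bool_matrix.shape
    # Pruning: an item's support is simply its column sum, so infrequent columns drop out early.
    support_1 = bool_matrix.sum(axis=0) / n_tx
    result = {(i,): s for i, s in enumerate(support_1) if s >= min_support}

    current, k = list(result), 2
    while current:
        next_level = []
        # Join frequent (k-1)-itemsets whose union has exactly k items.
        candidates = set(tuple(sorted(set(a) | set(b)))
                         for a in current for b in current
                         if len(set(a) | set(b)) == k)
        for itemset in candidates:
            # Support of an itemset = fraction of rows where all its columns are 1.
            s = bool_matrix[:, list(itemset)].all(axis=1).mean()
            if s >= min_support:
                result[itemset] = s
                next_level.append(itemset)
        current, k = next_level, k + 1
    return result

tx = np.array([[1, 1, 0, 1],
               [1, 1, 1, 0],
               [0, 1, 1, 0],
               [1, 1, 0, 1]], dtype=bool)
print(frequent_itemsets(tx, min_support=0.5))
```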
Stålring, Jonna C; Carlsson, Lars A; Almeida, Pedro; Boyer, Scott
2011-07-28
Machine learning has a vast range of applications. In particular, advanced machine learning methods are routinely and increasingly used in quantitative structure activity relationship (QSAR) modeling. QSAR data sets often encompass tens of thousands of compounds and the size of proprietary, as well as public data sets, is rapidly growing. Hence, there is a demand for computationally efficient machine learning algorithms, easily available to researchers without extensive machine learning knowledge. In granting the scientific principles of transparency and reproducibility, Open Source solutions are increasingly acknowledged by regulatory authorities. Thus, an Open Source state-of-the-art high performance machine learning platform, interfacing multiple, customized machine learning algorithms for both graphical programming and scripting, to be used for large scale development of QSAR models of regulatory quality, is of great value to the QSAR community. This paper describes the implementation of the Open Source machine learning package AZOrange. AZOrange is specially developed to support batch generation of QSAR models in providing the full work flow of QSAR modeling, from descriptor calculation to automated model building, validation and selection. The automated work flow relies upon the customization of the machine learning algorithms and a generalized, automated model hyper-parameter selection process. Several high performance machine learning algorithms are interfaced for efficient data set specific selection of the statistical method, promoting model accuracy. Using the high performance machine learning algorithms of AZOrange does not require programming knowledge as flexible applications can be created, not only at a scripting level, but also in a graphical programming environment. AZOrange is a step towards meeting the needs for an Open Source high performance machine learning platform, supporting the efficient development of highly accurate QSAR models fulfilling regulatory requirements.
Developing a benchmark for emotional analysis of music.
Aljanaki, Anna; Yang, Yi-Hsuan; Soleymani, Mohammad
2017-01-01
The music emotion recognition (MER) field has rapidly expanded in the last decade. Many new methods and new audio features have been developed to improve the performance of MER algorithms. However, it is very difficult to compare the performance of the new methods because of the diversity of data representations and the scarcity of publicly available data. In this paper, we address these problems by creating a data set and a benchmark for MER. The data set that we release, a MediaEval Database for Emotional Analysis in Music (DEAM), is the largest available data set of dynamic annotations (valence and arousal annotations for 1,802 songs and song excerpts licensed under Creative Commons with 2 Hz time resolution). Using DEAM, we organized the 'Emotion in Music' task at the MediaEval Multimedia Evaluation Campaign from 2013 to 2015. The benchmark attracted, in total, 21 active teams to participate in the challenge. We analyze the results of the benchmark: the winning algorithms and feature-sets. We also describe the design of the benchmark, the evaluation procedures, and the data cleaning and transformations that we suggest. The results from the benchmark suggest that recurrent neural network based approaches combined with large feature-sets work best for dynamic MER.
Dou, Ying; Mi, Hong; Zhao, Lingzhi; Ren, Yuqiu; Ren, Yulin
2006-09-01
The application of the radial basis function (RBF) network, the second most popular type of artificial neural network (ANN), to the quantitative analysis of drugs has developed over the last decade. In this paper, the two components (aspirin and phenacetin) were simultaneously determined in compound aspirin tablets by using near-infrared (NIR) spectroscopy and RBF networks. The total database was randomly divided into a training set (50) and a testing set (17). Different preprocessing methods (standard normal variate (SNV), multiplicative scatter correction (MSC), first-derivative and second-derivative) were applied to the two sets of NIR spectra of compound aspirin tablets with different concentrations of the two active components and compared with each other. The RBF learning algorithm adopted the nearest-neighbor clustering algorithm (NNCA), and a cross-validation technique was used as the selection criterion. Results show that using RBF networks for the quantitative analysis of tablets is reliable, and the best RBF model was obtained with first-derivative spectra.
Jeong, Jeong-Won; Shin, Dae C; Do, Synho; Marmarelis, Vasilis Z
2006-08-01
This paper presents a novel segmentation methodology for automated classification and differentiation of soft tissues using multiband data obtained with the newly developed system of high-resolution ultrasonic transmission tomography (HUTT) for imaging biological organs. This methodology extends and combines two existing approaches: the L-level set active contour (AC) segmentation approach and the agglomerative hierarchical kappa-means approach for unsupervised clustering (UC). To prevent the trapping of the current iterative minimization AC algorithm in a local minimum, we introduce a multiresolution approach that applies the level set functions at successively increasing resolutions of the image data. The resulting AC clusters are subsequently rearranged by the UC algorithm that seeks the optimal set of clusters yielding the minimum within-cluster distances in the feature space. The presented results from Monte Carlo simulations and experimental animal-tissue data demonstrate that the proposed methodology outperforms other existing methods without depending on heuristic parameters and provides a reliable means for soft tissue differentiation in HUTT images.
Zhao, Jing; Zong, Haili
2018-01-01
In this paper, we propose parallel and cyclic iterative algorithms for solving the multiple-set split equality common fixed-point problem of firmly quasi-nonexpansive operators. We also combine the process of cyclic and parallel iterative methods and propose two mixed iterative algorithms. Our several algorithms do not need any prior information about the operator norms. Under mild assumptions, we prove weak convergence of the proposed iterative sequences in Hilbert spaces. As applications, we obtain several iterative algorithms to solve the multiple-set split equality problem.
Zhang, Yuting; Beenakker, Karel G M; Butala, Pankil M; Lin, Cheng-Chieh; Little, Thomas D C; Maier, Andrea B; Stijntjes, Marjon; Vartanian, Richard; Wagenaar, Robert C
2012-01-01
Changes in gait parameters have been shown to be an important indicator of several age-related cognitive and physical declines of older adults. In this paper we propose a method to monitor and analyze walking and cycling activities based on a triaxial accelerometer worn on one ankle. We use an algorithm that can (1) distinguish between static and dynamic functional activities, (2) detect walking and cycling events, (3) identify gait parameters, including step frequency, number of steps, number of walking periods, and total walking duration per day, and (4) evaluate cycling parameters, including cycling frequency, number of cycling periods, and total cycling duration. Our algorithm is evaluated against the triaxial accelerometer data obtained from a group of 297 middle-aged to older adults wearing an activity monitor on the right ankle for approximately one week while performing unconstrained daily activities in the home and community setting. The correlation coefficients between each of detected gait and cycling parameters on two weekdays are all statistically significant, ranging from 0.668 to 0.873. These results demonstrate good test-retest reliability of our method in monitoring walking and cycling activities and analyzing gait and cycling parameters. This algorithm is efficient and causal in time and thus implementable for real-time monitoring and feedback.
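Step counting from the magnitude of a worn accelerometer signal can be sketched with simple peak detection. The sampling rate, peak-height threshold, minimum step interval, and the sinusoidal toy trace are assumptions, not the validated ankle-worn algorithm evaluated in the study (and the cycling branch is omitted).

```python
import numpy as np
from scipy.signal import find_peaks

def gait_parameters(acc, fs=100.0, min_step_interval=0.3, min_height=1.2):
    """Estimate step count and step frequency from a (n_samples, 3) acceleration trace.

    Peaks in the acceleration magnitude (in g) separated by at least `min_step_interval`
    seconds are counted as steps.
    """
    mag = np.linalg.norm(acc, axis=1)
    peaks, _ = find_peaks(mag, height=min_height, distance=int(min_step_interval * fs))
    n_steps = len(peaks)
    duration = len(acc) / fs
    return {"steps": n_steps, "step_frequency_hz": n_steps / duration, "duration_s": duration}

# Toy trace: a 1.8 Hz gait rhythm riding on gravity, plus noise, on one axis.
rng = np.random.default_rng(8)
fs, seconds = 100.0, 30
t = np.arange(int(fs * seconds)) / fs
axis_signal = 1.0 + 0.5 * np.sin(2 * np.pi * 1.8 * t) + rng.normal(0, 0.05, t.size)
acc = np.stack([axis_signal, np.zeros_like(t), np.zeros_like(t)], axis=1)
print(gait_parameters(acc, fs=fs))
```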
NASA Astrophysics Data System (ADS)
Maltz, Jonathan S.
2000-11-01
We present an algorithm of reduced computational cost which is able to estimate kinetic model parameters directly from dynamic ECT sinograms made up of temporally inconsistent projections. The algorithm exploits the extreme degree of parameter redundancy inherent in linear combinations of the exponential functions which represent the modes of first-order compartmental systems. The singular value decomposition is employed to find a small set of orthogonal functions, the linear combinations of which are able to accurately represent all modes within the physiologically anticipated range in a given study. The reduced-dimension basis is formed as the convolution of this orthogonal set with a measured input function. The Moore-Penrose pseudoinverse is used to find coefficients of this basis. Algorithm performance is evaluated at realistic count rates using MCAT phantom and clinical 99mTc-teboroxime myocardial study data. Phantom data are modelled as originating from a Poisson process. For estimates recovered from a single slice projection set containing 2.5×105 total counts, recovered tissue responses compare favourably with those obtained using more computationally intensive methods. The corresponding kinetic parameter estimates (coefficients of the new basis) exhibit negligible bias, while parameter variances are low, falling within 30% of the Cramér-Rao lower bound.
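The basis construction can be sketched as follows: span the anticipated range of washout rates with many exponential modes, keep a few left singular vectors from their SVD, convolve them with the measured input function, and fit tissue curves with the Moore-Penrose pseudoinverse. The time grid, rate range, toy input function, and noise level are assumptions, and the sketch fits a reconstructed time-activity curve rather than working directly on temporally inconsistent projections as the paper does.

```python
import numpy as np

# Time grid of the dynamic study and an assumed measured input (blood) function.
t = np.linspace(0, 60, 121)                        # minutes
dt = t[1] - t[0]
input_fn = t * np.exp(-t / 4.0)                    # toy input function

# Span the physiologically anticipated range of washout rates with many exponential modes.
rates = np.logspace(-2.5, 0.5, 200)                # 1/min
modes = np.exp(-np.outer(t, rates))                # each column is one candidate mode

# SVD: a handful of left singular vectors represents every mode in this range accurately.
U, s, _ = np.linalg.svd(modes, full_matrices=False)
n_basis = int(np.searchsorted(np.cumsum(s**2) / np.sum(s**2), 0.9999)) + 1
basis = U[:, :n_basis]

# Reduced-dimension temporal basis = convolution of each basis vector with the input function.
conv_basis = np.array([np.convolve(input_fn, b)[:len(t)] * dt for b in basis.T]).T

# Fit the coefficients of a noisy tissue curve with the Moore-Penrose pseudoinverse.
rng = np.random.default_rng(9)
true_tissue = 0.3 * np.convolve(input_fn, np.exp(-0.15 * t))[:len(t)] * dt
measured = true_tissue + rng.normal(0, 0.01, size=t.size)
coeffs = np.linalg.pinv(conv_basis) @ measured     # estimated kinetic parameters (basis coefficients)
fitted = conv_basis @ coeffs
```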
Quantum algorithm for linear regression
NASA Astrophysics Data System (ADS)
Wang, Guoming
2017-07-01
We present a quantum algorithm for fitting a linear regression model to a given data set using the least-squares approach. Differently from previous algorithms which yield a quantum state encoding the optimal parameters, our algorithm outputs these numbers in the classical form. So by running it once, one completely determines the fitted model and then can use it to make predictions on new data at little cost. Moreover, our algorithm works in the standard oracle model, and can handle data sets with nonsparse design matrices. It runs in time poly(log2(N), d, κ, 1/ε), where N is the size of the data set, d is the number of adjustable parameters, κ is the condition number of the design matrix, and ε is the desired precision in the output. We also show that the polynomial dependence on d and κ is necessary. Thus, our algorithm cannot be significantly improved. Furthermore, we also give a quantum algorithm that estimates the quality of the least-squares fit (without computing its parameters explicitly). This algorithm runs faster than the one for finding this fit, and can be used to check whether the given data set qualifies for linear regression in the first place.
Discriminative structural approaches for enzyme active-site prediction.
Kato, Tsuyoshi; Nagano, Nozomi
2011-02-15
Predicting enzyme active-sites in proteins is an important issue not only for protein sciences but also for a variety of practical applications such as drug design. Because enzyme reaction mechanisms are based on the local structures of enzyme active-sites, various template-based methods that compare local structures in proteins have been developed to date. In comparing such local sites, a simple measurement, RMSD, has been used so far. This paper introduces new machine learning algorithms that refine the similarity/deviation for comparison of local structures. The similarity/deviation is applied to two types of applications, single template analysis and multiple template analysis. In the single template analysis, a single template is used as a query to search proteins for active sites, whereas a protein structure is examined as a query to discover the possible active-sites using a set of templates in the multiple template analysis. This paper experimentally illustrates that the machine learning algorithms effectively improve the similarity/deviation measurements for both the analyses.
Particle swarm optimization for programming deep brain stimulation arrays
NASA Astrophysics Data System (ADS)
Peña, Edgar; Zhang, Simeng; Deyo, Steve; Xiao, YiZi; Johnson, Matthew D.
2017-02-01
Objective. Deep brain stimulation (DBS) therapy relies on both precise neurosurgical targeting and systematic optimization of stimulation settings to achieve beneficial clinical outcomes. One recent advance to improve targeting is the development of DBS arrays (DBSAs) with electrodes segmented both along and around the DBS lead. However, increasing the number of independent electrodes creates the logistical challenge of optimizing stimulation parameters efficiently. Approach. Solving such complex problems with multiple solutions and objectives is well known to occur in biology, in which complex collective behaviors emerge out of swarms of individual organisms engaged in learning through social interactions. Here, we developed a particle swarm optimization (PSO) algorithm to program DBSAs using a swarm of individual particles representing electrode configurations and stimulation amplitudes. Using a finite element model of motor thalamic DBS, we demonstrate how the PSO algorithm can efficiently optimize a multi-objective function that maximizes predictions of axonal activation in regions of interest (ROI, cerebellar-receiving area of motor thalamus), minimizes predictions of axonal activation in regions of avoidance (ROA, somatosensory thalamus), and minimizes power consumption. Main results. The algorithm solved the multi-objective problem by producing a Pareto front. ROI and ROA activation predictions were consistent across swarms (<1% median discrepancy in axon activation). The algorithm was able to accommodate for (1) lead displacement (1 mm) with relatively small ROI (⩽9.2%) and ROA (⩽1%) activation changes, irrespective of shift direction; (2) reduction in maximum per-electrode current (by 50% and 80%) with ROI activation decreasing by 5.6% and 16%, respectively; and (3) disabling electrodes (n = 3 and 12) with ROI activation reduction by 1.8% and 14%, respectively. Additionally, comparison between PSO predictions and multi-compartment axon model simulations showed discrepancies of <1% between approaches. Significance. The PSO algorithm provides a computationally efficient way to program DBS systems especially those with higher electrode counts.
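To make the swarm-based search described above concrete, the sketch below shows a minimal particle swarm optimization loop. It is not the authors' DBS programming code: the scalar `score(x)` is a hypothetical stand-in for the weighted ROI/ROA/power objective, and all update constants are generic PSO defaults.

```python
import numpy as np

def pso_minimize(score, dim, n_particles=30, n_iters=100,
                 w=0.7, c1=1.5, c2=1.5, bounds=(0.0, 1.0)):
    """Minimal particle swarm optimizer (illustrative only).

    score -- callable mapping a parameter vector (e.g. per-electrode currents)
             to a scalar cost; a stand-in for a weighted ROI/ROA/power objective.
    """
    lo, hi = bounds
    rng = np.random.default_rng(0)
    x = rng.uniform(lo, hi, size=(n_particles, dim))   # particle positions
    v = np.zeros_like(x)                               # particle velocities
    pbest = x.copy()                                   # personal bests
    pbest_val = np.array([score(p) for p in x])
    gbest = pbest[np.argmin(pbest_val)].copy()         # global best

    for _ in range(n_iters):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = np.clip(x + v, lo, hi)
        vals = np.array([score(p) for p in x])
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = x[improved], vals[improved]
        gbest = pbest[np.argmin(pbest_val)].copy()
    return gbest, pbest_val.min()

# Toy usage: drive four "electrode amplitudes" toward a target activation pattern.
target = np.array([0.8, 0.1, 0.0, 0.3])
best, best_cost = pso_minimize(lambda x: np.sum((x - target) ** 2), dim=4)
```

In the paper a full Pareto front is produced rather than a single scalarized optimum; the loop above only illustrates the particle update mechanics.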
Probabilistic Common Spatial Patterns for Multichannel EEG Analysis
Chen, Zhe; Gao, Xiaorong; Li, Yuanqing; Brown, Emery N.; Gao, Shangkai
2015-01-01
Common spatial patterns (CSP) is a well-known spatial filtering algorithm for multichannel electroencephalogram (EEG) analysis. In this paper, we cast the CSP algorithm in a probabilistic modeling setting. Specifically, probabilistic CSP (P-CSP) is proposed as a generic EEG spatio-temporal modeling framework that subsumes the CSP and regularized CSP algorithms. The proposed framework enables us to resolve the overfitting issue of CSP in a principled manner. We derive statistical inference algorithms that can alleviate the issue of local optima. In particular, an efficient algorithm based on eigendecomposition is developed for maximum a posteriori (MAP) estimation in the case of isotropic noise. For more general cases, a variational algorithm is developed for group-wise sparse Bayesian learning for the P-CSP model and for automatically determining the model size. The two proposed algorithms are validated on a simulated data set. Their practical efficacy is also demonstrated by successful applications to single-trial classifications of three motor imagery EEG data sets and by the spatio-temporal pattern analysis of one EEG data set recorded in a Stroop color naming task. PMID:26005228
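For readers unfamiliar with the deterministic baseline that P-CSP generalizes, classic CSP filters can be obtained from the two class-averaged covariance matrices via a generalized eigendecomposition. The sketch below uses synthetic data and is not the paper's probabilistic inference.

```python
import numpy as np
from scipy.linalg import eigh

def csp_filters(trials_a, trials_b, n_pairs=2):
    """Classic CSP (not the probabilistic P-CSP of the paper).

    trials_a, trials_b -- arrays of shape (n_trials, n_channels, n_samples)
    Returns spatial filters that maximize variance for one class vs. the other.
    """
    def avg_cov(trials):
        covs = []
        for x in trials:
            c = x @ x.T
            covs.append(c / np.trace(c))   # trace-normalized per-trial covariance
        return np.mean(covs, axis=0)

    ca, cb = avg_cov(trials_a), avg_cov(trials_b)
    # Generalized eigenproblem: ca w = lambda (ca + cb) w
    vals, vecs = eigh(ca, ca + cb)
    order = np.argsort(vals)               # ascending eigenvalues
    picks = np.r_[order[:n_pairs], order[-n_pairs:]]   # both spectrum ends
    return vecs[:, picks].T                # (2*n_pairs, n_channels)

# Toy usage: 8 channels, 200 samples, 30 trials per class.
rng = np.random.default_rng(1)
A = rng.standard_normal((30, 8, 200))
B = rng.standard_normal((30, 8, 200)) * np.linspace(0.5, 1.5, 8)[None, :, None]
W = csp_filters(A, B)
features = np.log(np.var(W @ A[0], axis=1))   # log-variance features for one trial
```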
NASA Astrophysics Data System (ADS)
Zhang, D.; Zhang, W. Y.
2017-08-01
Evacuation planning is an important activity in disaster management. It has to be planned in advance due to the unpredictable occurrence of disasters, and it is necessary that the evacuation plans are as close as possible to the real evacuation work. However, evacuation planning is extremely challenging because of the inherent uncertainty of the required information. The problem studied here is a kind of vehicle routing problem based on public traffic evacuation. In this paper, the demand at each evacuation set point is a fuzzy number, and the routing selection for each point is based on a fuzzy credibility preference index. This paper proposes an approximate optimal solution for this problem by a genetic algorithm based on fuzzy reliability theory. Finally, the algorithm is applied to an optimization model, and the experimental results show that the algorithm is effective.
3-D CSEM data inversion algorithm based on simultaneously active multiple transmitters concept
NASA Astrophysics Data System (ADS)
Dehiya, Rahul; Singh, Arun; Gupta, Pravin Kumar; Israil, Mohammad
2017-05-01
We present an algorithm for efficient 3-D inversion of marine controlled-source electromagnetic data. The efficiency is achieved by exploiting the redundancy in data. The data redundancy is reduced by compressing the data through stacking of the response of transmitters which are in close proximity. This stacking is equivalent to synthesizing the data as if the multiple transmitters are simultaneously active. The redundancy in data, arising due to close transmitter spacing, has been studied through singular value analysis of the Jacobian formed in 1-D inversion. This study reveals that the transmitter spacing of 100 m, typically used in marine data acquisition, does result in redundancy in the data. In the proposed algorithm, the data are compressed through stacking which leads to both computational advantage and reduction in noise. The performance of the algorithm for noisy data is demonstrated through the studies on two types of noise, viz., uncorrelated additive noise and correlated non-additive noise. It is observed that in case of uncorrelated additive noise, up to a moderately high (10 percent) noise level the algorithm addresses the noise as effectively as the traditional full data inversion. However, when the noise level in the data is high (20 percent), the algorithm outperforms the traditional full data inversion in terms of data misfit. Similar results are obtained in case of correlated non-additive noise and the algorithm performs better if the level of noise is high. The inversion results of a real field data set are also presented to demonstrate the robustness of the algorithm. The significant computational advantage in all cases presented makes this algorithm a better choice.
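The data-compression idea, stacking the responses of transmitters that lie close together as if they were simultaneously active, can be sketched with a simple spatial grouping and averaging step. The bin width and array layout below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def stack_nearby_transmitters(tx_positions, responses, bin_width=500.0):
    """Average responses of transmitters that fall in the same spatial bin.

    tx_positions -- (n_tx,) along-line transmitter coordinates in metres
    responses    -- (n_tx, n_receivers) complex EM responses
    bin_width    -- stacking window width (illustrative value)
    Returns binned positions and stacked responses, reducing data volume and
    incoherent noise at the same time.
    """
    bins = np.floor(tx_positions / bin_width).astype(int)
    stacked_pos, stacked_resp = [], []
    for b in np.unique(bins):
        mask = bins == b
        stacked_pos.append(tx_positions[mask].mean())
        stacked_resp.append(responses[mask].mean(axis=0))
    return np.array(stacked_pos), np.array(stacked_resp)

# Toy usage: 40 transmitters spaced 100 m apart, 12 receivers.
rng = np.random.default_rng(0)
pos = np.arange(40) * 100.0
data = rng.standard_normal((40, 12)) + 1j * rng.standard_normal((40, 12))
pos_s, data_s = stack_nearby_transmitters(pos, data)   # 40 -> 8 effective sources
```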
A Genetic Algorithm Approach to Motion Sensor Placement in Smart Environments.
Thomas, Brian L; Crandall, Aaron S; Cook, Diane J
2016-04-01
Smart environments and ubiquitous computing technologies hold great promise for a wide range of real world applications. The medical community is particularly interested in high quality measurement of activities of daily living. With accurate computer modeling of older adults, decision support tools may be built to assist care providers. One aspect of effectively deploying these technologies is determining where the sensors should be placed in the home to effectively support these end goals. This work introduces and evaluates a set of approaches for generating sensor layouts in the home. These approaches range from the gold standard of human intuition-based placement to more advanced search algorithms, including Hill Climbing and Genetic Algorithms. The generated layouts are evaluated based on their ability to detect activities while minimizing the number of needed sensors. Sensor-rich environments can provide valuable insights about adults as they go about their lives. These sensors, once in place, provide information on daily behavior that can facilitate an aging-in-place approach to health care.
A Genetic Algorithm Approach to Motion Sensor Placement in Smart Environments
Thomas, Brian L.; Crandall, Aaron S.; Cook, Diane J.
2016-01-01
Smart environments and ubiquitous computing technologies hold great promise for a wide range of real world applications. The medical community is particularly interested in high quality measurement of activities of daily living. With accurate computer modeling of older adults, decision support tools may be built to assist care providers. One aspect of effectively deploying these technologies is determining where the sensors should be placed in the home to effectively support these end goals. This work introduces and evaluates a set of approaches for generating sensor layouts in the home. These approaches range from the gold standard of human intuition-based placement to more advanced search algorithms, including Hill Climbing and Genetic Algorithms. The generated layouts are evaluated based on their ability to detect activities while minimizing the number of needed sensors. Sensor-rich environments can provide valuable insights about adults as they go about their lives. These sensors, once in place, provide information on daily behavior that can facilitate an aging-in-place approach to health care. PMID:27453810
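A minimal genetic algorithm over binary sensor layouts illustrates the search strategy described above. The fitness function here (event coverage minus a per-sensor penalty) and all operator settings are hypothetical stand-ins for the paper's evaluation, not its actual code.

```python
import numpy as np

rng = np.random.default_rng(42)

def fitness(layout, coverage, penalty=0.05):
    """Hypothetical fitness: fraction of activity events seen by at least one
    active sensor, minus a penalty per sensor used."""
    covered = (coverage[:, layout.astype(bool)].sum(axis=1) > 0).mean()
    return covered - penalty * layout.sum()

def genetic_sensor_placement(coverage, pop_size=40, generations=100, p_mut=0.02):
    n_sites = coverage.shape[1]
    pop = rng.integers(0, 2, size=(pop_size, n_sites))
    for _ in range(generations):
        scores = np.array([fitness(ind, coverage) for ind in pop])
        # Binary tournament selection.
        idx_a = rng.integers(0, pop_size, pop_size)
        idx_b = rng.integers(0, pop_size, pop_size)
        parents = pop[np.where(scores[idx_a] >= scores[idx_b], idx_a, idx_b)]
        # Single-point crossover between consecutive parents.
        children = parents.copy()
        for i in range(0, pop_size - 1, 2):
            cut = rng.integers(1, n_sites)
            children[i, cut:], children[i + 1, cut:] = (
                parents[i + 1, cut:].copy(), parents[i, cut:].copy())
        # Bit-flip mutation.
        flip = rng.random(children.shape) < p_mut
        children[flip] = 1 - children[flip]
        pop = children
    scores = np.array([fitness(ind, coverage) for ind in pop])
    return pop[np.argmax(scores)], scores.max()

# Toy usage: 200 recorded activity events, 30 candidate sensor locations;
# coverage[i, j] = 1 if sensor j would have observed event i.
coverage = rng.integers(0, 2, size=(200, 30))
best_layout, best_score = genetic_sensor_placement(coverage)
```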
GASAKe: forecasting landslide activations by a genetic-algorithms based hydrological model
NASA Astrophysics Data System (ADS)
Terranova, O. G.; Gariano, S. L.; Iaquinta, P.; Iovine, G. G. R.
2015-02-01
GASAKe is a new hydrological model aimed at forecasting the triggering of landslides. The model is based on genetic-algorithms and allows thresholds of landslide activation to be obtained from the set of historical occurrences and from the rainfall series. GASAKe can be applied to either single landslides or sets of similar slope movements in a homogeneous environment. Calibration of the model is based on genetic-algorithms, and provides families of optimal, discretized solutions (kernels) that maximize the fitness function. Starting from the latter, the corresponding mobility functions (i.e. the predictive tools) can be obtained through convolution with the rain series. The base time of the kernel is related to the magnitude of the considered slope movement, as well as to the hydro-geological complexity of the site. Generally, smaller values are expected for shallow slope instabilities with respect to large-scale phenomena. Once validated, the model can be applied to estimate the timing of future landslide activations in the same study area, by employing recorded or forecasted rainfall series. Examples of application of GASAKe to a medium-scale slope movement (the Uncino landslide at San Fili, in Calabria, Southern Italy) and to a set of shallow landslides (in the Sorrento Peninsula, Campania, Southern Italy) are discussed. In both cases, a successful calibration of the model has been achieved, despite unavoidable uncertainties concerning the dates of landslide occurrence. In particular, for the Sorrento Peninsula case, a fitness of 0.81 has been obtained by calibrating the model against 10 dates of landslide activation; in the Uncino case, a fitness of 1 (i.e. neither missing nor false alarms) has been achieved against 5 activations. As for temporal validation, the experiments performed by considering the extra dates of landslide activation have also proved satisfactory. In view of early-warning applications for civil protection purposes, the capability of the model to simulate the occurrences of the Uncino landslide has been tested by means of a progressive, self-adaptive procedure. Finally, a sensitivity analysis has been performed by taking into account the main parameters of the model. The obtained results are quite promising, given the high performance of the model obtained against different types of slope instabilities, characterized by several historical activations. Nevertheless, further refinements are still needed for applications to landslide risk mitigation within early-warning and decision-support systems.
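The convolution step that turns a calibrated kernel and a rainfall series into a mobility function can be sketched in a few lines. The exponentially decaying kernel and the alarm threshold below are illustrative choices; in GASAKe the kernel shape is what the genetic algorithm actually evolves.

```python
import numpy as np

def mobility_function(rain_mm, kernel):
    """Convolve a daily rainfall series with a discretized kernel to obtain the
    mobility function that is compared against landslide activation dates."""
    kernel = np.asarray(kernel, dtype=float)
    kernel = kernel / kernel.sum()                  # normalize kernel weights
    # 'full' convolution truncated to the rainfall length keeps causality:
    # the value on day t only depends on rain on days <= t.
    return np.convolve(rain_mm, kernel)[: len(rain_mm)]

# Toy usage: one year of synthetic rainfall and a decaying 30-day kernel.
rng = np.random.default_rng(3)
rain = rng.gamma(shape=0.4, scale=10.0, size=365)
kernel = np.exp(-np.arange(30) / 10.0)
mob = mobility_function(rain, kernel)
alarm_days = np.where(mob > np.quantile(mob, 0.95))[0]   # illustrative threshold
```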
NASA Astrophysics Data System (ADS)
La Foy, Roderick; Vlachos, Pavlos
2011-11-01
An optimally designed MLOS tomographic reconstruction algorithm for use in 3D PIV and PTV applications is analyzed. Using a set of optimized reconstruction parameters, the reconstructions produced by the MLOS algorithm are shown to be comparable to reconstructions produced by the MART algorithm for a range of camera geometries, camera numbers, and particle seeding densities. The resultant velocity field error calculated using PIV and PTV algorithms is further minimized by applying both pre and post processing to the reconstructed data sets.
Scalable Nearest Neighbor Algorithms for High Dimensional Data.
Muja, Marius; Lowe, David G
2014-11-01
For many computer vision and machine learning problems, large training sets are key for good performance. However, the most computationally expensive part of many computer vision and machine learning algorithms consists of finding nearest neighbor matches to high dimensional vectors that represent the training data. We propose new algorithms for approximate nearest neighbor matching and evaluate and compare them with previous algorithms. For matching high dimensional features, we find two algorithms to be the most efficient: the randomized k-d forest and a new algorithm proposed in this paper, the priority search k-means tree. We also propose a new algorithm for matching binary features by searching multiple hierarchical clustering trees and show it outperforms methods typically used in the literature. We show that the optimal nearest neighbor algorithm and its parameters depend on the data set characteristics and describe an automated configuration procedure for finding the best algorithm to search a particular data set. In order to scale to very large data sets that would otherwise not fit in the memory of a single machine, we propose a distributed nearest neighbor matching framework that can be used with any of the algorithms described in the paper. All this research has been released as an open source library called fast library for approximate nearest neighbors (FLANN), which has been incorporated into OpenCV and is now one of the most popular libraries for nearest neighbor matching.
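Since the abstract notes that FLANN has been incorporated into OpenCV, the snippet below shows the typical way a randomized k-d tree index is used through cv2.FlannBasedMatcher. The index and search parameters are illustrative defaults, and the descriptors are random stand-ins for real image features.

```python
import numpy as np
import cv2

# Random float32 "descriptors" standing in for SIFT-like features.
rng = np.random.default_rng(0)
query_desc = rng.random((500, 128), dtype=np.float32)
train_desc = rng.random((5000, 128), dtype=np.float32)

# FLANN index: algorithm=1 selects the randomized k-d tree forest;
# 'trees' and 'checks' trade accuracy against speed (values are illustrative).
index_params = dict(algorithm=1, trees=5)
search_params = dict(checks=50)
matcher = cv2.FlannBasedMatcher(index_params, search_params)

# k-nearest-neighbour matches with Lowe's ratio test to keep distinctive ones.
matches = matcher.knnMatch(query_desc, train_desc, k=2)
good = [m for m, n in matches if m.distance < 0.75 * n.distance]
print(f"{len(good)} matches passed the ratio test")
```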
Survey and analysis of multiresolution methods for turbulence data
Pulido, Jesus; Livescu, Daniel; Woodring, Jonathan; ...
2015-11-10
This paper compares the effectiveness of various multi-resolution geometric representation methods, such as B-spline, Daubechies, Coiflet and Dual-tree wavelets, curvelets and surfacelets, to capture the structure of fully developed turbulence using a truncated set of coefficients. The turbulence dataset is obtained from a Direct Numerical Simulation of buoyancy driven turbulence on a 512^3 mesh, with an Atwood number A = 0.05 and a turbulent Reynolds number Re_t = 1800, and the methods are tested against quantities pertaining to both velocities and active scalar (density) fields and their derivatives, spectra, and the properties of constant density surfaces. The comparisons between the algorithms are given in terms of performance, accuracy, and compression properties. The results should provide useful information for multi-resolution analysis of turbulence, coherent feature extraction, compression for handling large datasets, as well as simulation algorithms based on multi-resolution methods. In conclusion, the final section provides recommendations for the best decomposition algorithms based on several metrics related to computational efficiency and preservation of turbulence properties using a reduced set of coefficients.
Fast Adapting Ensemble: A New Algorithm for Mining Data Streams with Concept Drift
Ortíz Díaz, Agustín; Ramos-Jiménez, Gonzalo; Frías Blanco, Isvani; Caballero Mota, Yailé; Morales-Bueno, Rafael
2015-01-01
The treatment of large data streams in the presence of concept drifts is one of the main challenges in the field of data mining, particularly when the algorithms have to deal with concepts that disappear and then reappear. This paper presents a new algorithm, called Fast Adapting Ensemble (FAE), which adapts very quickly to both abrupt and gradual concept drifts, and has been specifically designed to deal with recurring concepts. FAE processes the learning examples in blocks of the same size, but it does not have to wait for the batch to be complete in order to adapt its base classification mechanism. FAE incorporates a drift detector to improve the handling of abrupt concept drifts and stores a set of inactive classifiers that represent old concepts, which are activated very quickly when these concepts reappear. We compare our new algorithm with various well-known learning algorithms, taking into account common benchmark datasets. The experiments show promising results from the proposed algorithm (regarding accuracy and runtime), handling different types of concept drifts. PMID:25879051
Service-Aware Clustering: An Energy-Efficient Model for the Internet-of-Things
Bagula, Antoine; Abidoye, Ademola Philip; Zodi, Guy-Alain Lusilao
2015-01-01
Current generation wireless sensor routing algorithms and protocols have been designed based on a myopic routing approach, where the motes are assumed to have the same sensing and communication capabilities. Myopic routing is not a natural fit for the IoT, as it may lead to energy imbalance and subsequent short-lived sensor networks, routing the sensor readings over the most service-intensive sensor nodes, while leaving the least active nodes idle. This paper revisits the issue of energy efficiency in sensor networks to propose a clustering model where sensor devices’ service delivery is mapped into an energy awareness model, used to design a clustering algorithm that finds service-aware clustering (SAC) configurations in IoT settings. The performance evaluation reveals the relative energy efficiency of the proposed SAC algorithm compared to related routing algorithms in terms of energy consumption, the sensor nodes’ life span and its traffic engineering efficiency in terms of throughput and delay. These include the well-known low energy adaptive clustering hierarchy (LEACH) and LEACH-centralized (LEACH-C) algorithms, as well as the most recent algorithms, such as DECSA and MOCRN. PMID:26703619
Service-Aware Clustering: An Energy-Efficient Model for the Internet-of-Things.
Bagula, Antoine; Abidoye, Ademola Philip; Zodi, Guy-Alain Lusilao
2015-12-23
Current generation wireless sensor routing algorithms and protocols have been designed based on a myopic routing approach, where the motes are assumed to have the same sensing and communication capabilities. Myopic routing is not a natural fit for the IoT, as it may lead to energy imbalance and subsequent short-lived sensor networks, routing the sensor readings over the most service-intensive sensor nodes, while leaving the least active nodes idle. This paper revisits the issue of energy efficiency in sensor networks to propose a clustering model where sensor devices' service delivery is mapped into an energy awareness model, used to design a clustering algorithm that finds service-aware clustering (SAC) configurations in IoT settings. The performance evaluation reveals the relative energy efficiency of the proposed SAC algorithm compared to related routing algorithms in terms of energy consumption, the sensor nodes' life span and its traffic engineering efficiency in terms of throughput and delay. These include the well-known low energy adaptive clustering hierarchy (LEACH) and LEACH-centralized (LEACH-C) algorithms, as well as the most recent algorithms, such as DECSA and MOCRN.
Novel multimodality segmentation using level sets and Jensen-Rényi divergence.
Markel, Daniel; Zaidi, Habib; El Naqa, Issam
2013-12-01
Positron emission tomography (PET) is playing an increasing role in radiotherapy treatment planning. However, despite progress, robust algorithms for PET and multimodal image segmentation are still lacking, especially if the algorithm were extended to image-guided and adaptive radiotherapy (IGART). This work presents a novel multimodality segmentation algorithm using the Jensen-Rényi divergence (JRD) to evolve the geometric level set contour. The algorithm offers improved noise tolerance which is particularly applicable to segmentation of regions found in PET and cone-beam computed tomography. A steepest gradient ascent optimization method is used in conjunction with the JRD and a level set active contour to iteratively evolve a contour to partition an image based on statistical divergence of the intensity histograms. The algorithm is evaluated using PET scans of pharyngolaryngeal squamous cell carcinoma with the corresponding histological reference. The multimodality extension of the algorithm is evaluated using 22 PET/CT scans of patients with lung carcinoma and a physical phantom scanned under varying image quality conditions. The average concordance index (CI) of the JRD segmentation of the PET images was 0.56 with an average classification error of 65%. The segmentation of the lung carcinoma images had a maximum diameter relative error of 63%, 19.5%, and 14.8% when using CT, PET, and combined PET/CT images, respectively. The estimated maximal diameters of the gross tumor volume (GTV) showed a high correlation with the macroscopically determined maximal diameters, with a R(2) value of 0.85 and 0.88 using the PET and PET/CT images, respectively. Results from the physical phantom show that the JRD is more robust to image noise compared to mutual information and region growing. The JRD has shown improved noise tolerance compared to mutual information for the purpose of PET image segmentation. Presented is a flexible framework for multimodal image segmentation that can incorporate a large number of inputs efficiently for IGART.
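The core quantity driving the contour evolution, the Jensen-Rényi divergence between intensity histograms, can be written down compactly. The sketch below implements the standard JRD definition (Rényi entropy of the mixture minus the mean Rényi entropy) for equally weighted histograms with alpha in (0, 1), and is only an illustration, not the authors' segmentation pipeline.

```python
import numpy as np

def renyi_entropy(p, alpha=0.5, eps=1e-12):
    """Rényi entropy H_alpha(p) = 1/(1-alpha) * log(sum_i p_i^alpha)."""
    p = np.asarray(p, dtype=float)
    p = p / (p.sum() + eps)
    return np.log(np.sum(p ** alpha) + eps) / (1.0 - alpha)

def jensen_renyi_divergence(hists, alpha=0.5):
    """JRD for equally weighted histograms:
    H_alpha(mean of histograms) - mean of H_alpha(each histogram).
    Choosing alpha in (0, 1) keeps the divergence non-negative; larger values
    of the JRD mean the regions' intensity distributions differ more."""
    hists = [np.asarray(h, float) / max(h.sum(), 1e-12) for h in hists]
    mixture = np.mean(hists, axis=0)
    return renyi_entropy(mixture, alpha) - np.mean(
        [renyi_entropy(h, alpha) for h in hists])

# Toy usage: histograms of "inside" vs "outside" intensities for a contour.
rng = np.random.default_rng(0)
inside = np.histogram(rng.normal(5, 1, 4000), bins=64, range=(0, 10))[0]
outside = np.histogram(rng.normal(3, 1, 4000), bins=64, range=(0, 10))[0]
jrd = jensen_renyi_divergence([inside, outside])   # maximized by the evolving contour
```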
Stride search: A general algorithm for storm detection in high resolution climate data
Bosler, Peter Andrew; Roesler, Erika Louise; Taylor, Mark A.; ...
2015-09-08
This article discusses the problem of identifying extreme climate events such as intense storms within large climate data sets. The basic storm detection algorithm is reviewed, which splits the problem into two parts: a spatial search followed by a temporal correlation problem. Two specific implementations of the spatial search algorithm are compared. The commonly used grid point search algorithm is reviewed, and a new algorithm called Stride Search is introduced. Stride Search is designed to work at all latitudes, while grid point searches may fail in polar regions. Results from the two algorithms are compared for the application of tropical cyclone detection, and shown to produce similar results for the same set of storm identification criteria. The time required for both algorithms to search the same data set is compared. Furthermore, Stride Search's ability to search extreme latitudes is demonstrated for the case of polar low detection.
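The key difference from a grid point search is that Stride Search places candidate search centers at a roughly fixed great-circle spacing, so the longitudinal stride widens toward the poles. A minimal sketch of that center-point generation (with an illustrative 500 km spacing, not the paper's configuration) follows.

```python
import numpy as np

def stride_search_centers(stride_km, earth_radius_km=6371.0):
    """Generate search-circle centers with roughly uniform great-circle spacing:
    the latitude stride is constant, while the longitude stride grows as
    1/cos(latitude) so polar rows are not oversampled."""
    stride_deg = np.degrees(stride_km / earth_radius_km)
    centers = []
    lat = -90.0 + stride_deg / 2.0
    while lat < 90.0:
        # Longitude spacing that keeps ~stride_km between centers at this latitude.
        lon_stride = min(stride_deg / max(np.cos(np.radians(lat)), 1e-6), 360.0)
        lons = np.arange(-180.0, 180.0, lon_stride)
        centers.extend((lat, lon) for lon in lons)
        lat += stride_deg
    return centers

centers = stride_search_centers(stride_km=500.0)
# Each center would then be checked against storm criteria (e.g. vorticity,
# warm core) within its search radius; nearby detections are merged afterwards.
print(len(centers), "candidate centers")
```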
Fuzzy automata and pattern matching
NASA Technical Reports Server (NTRS)
Setzer, C. B.; Warsi, N. A.
1986-01-01
A wide-ranging search for articles and books concerned with fuzzy automata and syntactic pattern recognition is presented. A number of survey articles on image processing and feature detection were included. Hough's algorithm is presented to illustrate the way in which knowledge about an image can be used to interpret the details of the image. It was found that in hand generated pictures, the algorithm worked well on following the straight lines, but had great difficulty turning corners. An algorithm was developed which produces a minimal finite automaton recognizing a given finite set of strings. One difficulty of the construction is that, in some cases, this minimal automaton is not unique for a given set of strings and a given maximum length. This algorithm compares favorably with other inference algorithms. More importantly, the algorithm produces an automaton with a rigorously described relationship to the original set of strings that does not depend on the algorithm itself.
Distributed Sleep Scheduling in Wireless Sensor Networks via Fractional Domatic Partitioning
NASA Astrophysics Data System (ADS)
Schumacher, André; Haanpää, Harri
We consider setting up sleep scheduling in sensor networks. We formulate the problem as an instance of the fractional domatic partition problem and obtain a distributed approximation algorithm by applying linear programming approximation techniques. Our algorithm is an application of the Garg-Könemann (GK) scheme that requires solving an instance of the minimum weight dominating set (MWDS) problem as a subroutine. Our two main contributions are a distributed implementation of the GK scheme for the sleep-scheduling problem and a novel asynchronous distributed algorithm for approximating MWDS based on a primal-dual analysis of Chvátal's set-cover algorithm. We evaluate our algorithm with
Effective traffic features selection algorithm for cyber-attacks samples
NASA Astrophysics Data System (ADS)
Li, Yihong; Liu, Fangzheng; Du, Zhenyu
2018-05-01
Building on the study of defense schemes against network attacks, this paper proposes an effective traffic feature selection algorithm based on k-means++ clustering to deal with the high dimensionality of the traffic features extracted from cyber-attack samples. First, the algorithm divides the original feature set into an attack traffic feature set and a background traffic feature set by clustering. Then, it calculates the variation in clustering performance after removing a given feature. Finally, the degree of distinctiveness of each feature vector is evaluated according to this result; a feature vector is considered effective if its degree of distinctiveness exceeds a set threshold. The purpose of this paper is to select the effective features from the extracted original feature set, thereby reducing the dimensionality of the features and the space-time overhead of subsequent detection. The experimental results show that the proposed algorithm is feasible and has advantages over other selection algorithms.
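The "remove a feature, re-cluster, measure the change" idea can be sketched with scikit-learn, whose KMeans uses k-means++ initialization by default. The silhouette score used here as the clustering-performance measure and the selection threshold are illustrative assumptions, not necessarily the paper's exact criterion.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def feature_distinctiveness(X, n_clusters=2, random_state=0):
    """Score each feature by how much clustering quality degrades when the
    feature is removed (a stand-in for the paper's exact criterion)."""
    base = silhouette_score(X, KMeans(n_clusters, n_init=10,
                                      random_state=random_state).fit_predict(X))
    scores = []
    for j in range(X.shape[1]):
        Xr = np.delete(X, j, axis=1)
        s = silhouette_score(Xr, KMeans(n_clusters, n_init=10,
                                        random_state=random_state).fit_predict(Xr))
        scores.append(base - s)      # large drop => feature j is distinctive
    return np.array(scores)

# Toy usage: 2 informative traffic features plus 8 noise features.
rng = np.random.default_rng(0)
informative = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(4, 1, (100, 2))])
X = np.hstack([informative, rng.normal(0, 1, (200, 8))])
scores = feature_distinctiveness(X)
selected = np.where(scores > 0.01)[0]   # illustrative threshold
```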
Reference set design for relational modeling of fuzzy systems
NASA Astrophysics Data System (ADS)
Lapohos, Tibor; Buchal, Ralph O.
1994-10-01
One of the keys to the successful relational modeling of fuzzy systems is the proper design of fuzzy reference sets. This has been discussed throughout the literature. In the frame of modeling a stochastic system, we analyze the problem numerically. First, we briefly describe the relational model and present the performance of the modeling in the most trivial case: the reference sets are triangle shaped. Next, we present a known fuzzy reference set generator algorithm (FRSGA) which is based on the fuzzy c-means (Fc-M) clustering algorithm. In the second section of this chapter we improve the previous FRSGA by adding a constraint to the Fc-M algorithm (modified Fc-M or MFc-M): two cluster centers are forced to coincide with the domain limits. This is needed to obtain properly shaped extreme linguistic reference values. We apply this algorithm to uniformly discretized domains of the variables involved. The fuzziness of the reference sets produced by both Fc-M and MFc-M is determined by a parameter, which in our experiments is modified iteratively. Each time, a new model is created and its performance analyzed. For certain algorithm parameter values both of these two algorithms have shortcomings. To eliminate the drawbacks of these two approaches, we develop a completely new generator algorithm for reference sets which we call Polyline. This algorithm and its performance are described in the last section. In all three cases, the modeling is performed for a variety of operators used in the inference engine and two defuzzification methods. Therefore our results depend neither on the system model order nor the experimental setup.
Kirtania, Kawnish; Bhattacharya, Sankar
2012-03-01
Apart from capturing carbon dioxide, fresh water algae can be used to produce biofuel. To assess the energy potential of Chlorococcum humicola, the alga's pyrolytic behavior was studied at heating rates of 5-20 K/min in a thermobalance. To model the weight loss characteristics, an algorithm was developed based on the distributed activation energy model and applied to experimental data to extract the kinetics of the decomposition process. When the kinetic parameters estimated by this method were applied to another set of experimental data which had not been used to estimate the parameters, the model was capable of predicting the pyrolysis behavior in the new set of data with an R² value of 0.999479. The slow weight loss that took place at the end of the pyrolysis process was also accounted for by the proposed algorithm, which is capable of predicting the pyrolysis kinetics of C. humicola at different heating rates.
Radiofrequency pulse design in parallel transmission under strict temperature constraints.
Boulant, Nicolas; Massire, Aurélien; Amadon, Alexis; Vignaud, Alexandre
2014-09-01
To gain radiofrequency (RF) pulse performance by directly addressing the temperature constraints, as opposed to the specific absorption rate (SAR) constraints, in parallel transmission at ultra-high field. The magnitude least-squares RF pulse design problem under hard SAR constraints was solved repeatedly by using the virtual observation points and an active-set algorithm. The SAR constraints were updated at each iteration based on the result of a thermal simulation. The numerical study was performed for an SAR-demanding and simplified time of flight sequence using B1 and ΔB0 maps obtained in vivo on a human brain at 7T. The proposed adjustment of the SAR constraints combined with an active-set algorithm provided higher flexibility in RF pulse design within a reasonable time. The modifications of those constraints acted directly upon the thermal response as desired. Although further confidence in the thermal models is needed, this study shows that RF pulse design under strict temperature constraints is within reach, allowing better RF pulse performance and faster acquisitions at ultra-high fields at the cost of higher sequence complexity.
Variable Scheduling to Mitigate Channel Losses in Energy-Efficient Body Area Networks
Tselishchev, Yuriy; Boulis, Athanassios; Libman, Lavy
2012-01-01
We consider a typical body area network (BAN) setting in which sensor nodes send data to a common hub regularly on a TDMA basis, as defined by the emerging IEEE 802.15.6 BAN standard. To reduce transmission losses caused by the highly dynamic nature of the wireless channel around the human body, we explore variable TDMA scheduling techniques that allow the order of transmissions within each TDMA round to be decided on the fly, rather than being fixed in advance. Using a simple Markov model of the wireless links, we devise a number of scheduling algorithms that can be performed by the hub, which aim to maximize the expected number of successful transmissions in a TDMA round, and thereby significantly reduce transmission losses as compared with a static TDMA schedule. Importantly, these algorithms do not require a priori knowledge of the statistical properties of the wireless channels, and the reliability improvement is achieved entirely via shuffling the order of transmissions among devices, and does not involve any additional energy consumption (e.g., retransmissions). We evaluate these algorithms directly on an experimental set of traces obtained from devices strapped to human subjects performing regular daily activities, and confirm that the benefits of the proposed variable scheduling algorithms extend to this practical setup as well. PMID:23202183
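The scheduling idea, re-ordering transmissions according to each link's predicted chance of success, can be illustrated with a two-state (Gilbert-Elliott) channel model. The transition probabilities below are illustrative and not fitted to the BAN traces used in the paper; the full algorithms also re-decide the order after every slot rather than once per round.

```python
import numpy as np

def schedule_round(last_state, p_stay_good=0.9, p_become_good=0.3):
    """Order nodes for the next TDMA round by their predicted probability of a
    good channel under a simple two-state (Gilbert-Elliott) link model."""
    last_state = np.asarray(last_state)
    # One-step prediction of P(channel is good) for each node.
    p_good = np.where(last_state == 1, p_stay_good, p_become_good)
    # Most promising links transmit first; in the full schemes the order is
    # refined after every slot as new successes and failures are observed.
    return np.argsort(-p_good)

# Toy usage: last observed outcome per node (1 = last transmission succeeded).
last_state = np.array([1, 0, 1, 1, 0, 0])
print(schedule_round(last_state))   # nodes with likely-good channels come first
```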
Generalization of some hidden subgroup algorithms for input sets of arbitrary size
NASA Astrophysics Data System (ADS)
Poslu, Damla; Say, A. C. Cem
2006-05-01
We consider the problem of generalizing some quantum algorithms so that they will work on input domains whose cardinalities are not necessarily powers of two. When analyzing the algorithms we assume that generating superpositions of arbitrary subsets of basis states whose cardinalities are not necessarily powers of two perfectly is possible. We have taken Ballhysa's model as a template and have extended it to Chi, Kim and Lee's generalizations of the Deutsch-Jozsa algorithm and to Simon's algorithm. With perfectly equal superpositions of input sets of arbitrary size, Chi, Kim and Lee's generalized Deutsch-Jozsa algorithms, both for evenly-distributed and evenly-balanced functions, worked with the one-sided error property. For Simon's algorithm, the success probability of the generalized algorithm with equiprobable superpositions is the same as that of the original for input sets of arbitrary cardinality, since the property that, for a 2-to-1 function, the measured strings all have dot product zero with the string being searched for is not lost.
Ensemble of hybrid genetic algorithm for two-dimensional phase unwrapping
NASA Astrophysics Data System (ADS)
Balakrishnan, D.; Quan, C.; Tay, C. J.
2013-06-01
Phase unwrapping is the final and trickiest step in any phase retrieval technique. Phase unwrapping by artificial intelligence methods (optimization algorithms) such as the hybrid genetic algorithm, reverse simulated annealing, particle swarm optimization, and minimum cost matching has shown better results than conventional phase unwrapping methods. In this paper, an ensemble of hybrid genetic algorithms with parallel populations is proposed to solve the branch-cut phase unwrapping problem. In a single-population hybrid genetic algorithm, the selection, cross-over and mutation operators are applied to obtain a new population in every generation, and the parameters and choice of operators affect the performance of the hybrid genetic algorithm. The ensemble of hybrid genetic algorithms makes it possible to use different parameter sets and different choices of operators simultaneously. Each population uses a different set of parameters, and the offspring of each population compete against the offspring of all other populations, which use different sets of parameters. The effectiveness of the proposed algorithm is demonstrated by phase unwrapping examples, and the advantages of the proposed method are discussed.
Flowgen: Flowchart-based documentation for C++ codes
NASA Astrophysics Data System (ADS)
Kosower, David A.; Lopez-Villarejo, J. J.
2015-11-01
We present the Flowgen tool, which generates flowcharts from annotated C++ source code. The tool generates a set of interconnected high-level UML activity diagrams, one for each function or method in the C++ sources. It provides a simple and visual overview of complex implementations of numerical algorithms. Flowgen is complementary to the widely-used Doxygen documentation tool. The ultimate aim is to render complex C++ computer codes accessible, and to enhance collaboration between programmers and algorithm or science specialists. We describe the tool and a proof-of-concept application to the VINCIA plug-in for simulating collisions at CERN's Large Hadron Collider.
2011-01-01
Background Gene regulatory networks play essential roles in living organisms to control growth, keep internal metabolism running and respond to external environmental changes. Understanding the connections and the activity levels of regulators is important for the research of gene regulatory networks. While relevance score based algorithms that reconstruct gene regulatory networks from transcriptome data can infer genome-wide gene regulatory networks, they are unfortunately prone to false positive results. Transcription factor activities (TFAs) quantitatively reflect the ability of the transcription factor to regulate target genes. However, classic relevance score based gene regulatory network reconstruction algorithms use models that do not include the TFA layer, thus missing a key regulatory element. Results This work integrates TFA prediction algorithms with relevance score based network reconstruction algorithms to reconstruct gene regulatory networks with improved accuracy over classic relevance score based algorithms. This method is called Gene expression and Transcription factor activity based Relevance Network (GTRNetwork). Different combinations of TFA prediction algorithms and relevance score functions have been applied to find the most efficient combination. When the integrated GTRNetwork method was applied to E. coli data, the reconstructed genome-wide gene regulatory network predicted 381 new regulatory links. This reconstructed gene regulatory network, including the predicted new regulatory links, shows promising biological significance. Many of the new links are verified by known TF binding site information, and many other links can be verified from the literature and databases such as EcoCyc. The reconstructed gene regulatory network is applied to a recent transcriptome analysis of E. coli during isobutanol stress. In addition to the 16 significantly changed TFAs detected in the original paper, another 7 significantly changed TFAs have been detected by using our reconstructed network. Conclusions The GTRNetwork algorithm introduces the hidden layer TFA into classic relevance score-based gene regulatory network reconstruction processes. Integrating the TFA biological information with regulatory network reconstruction algorithms significantly improves detection of new links and reduces the rate of false positives. The application of GTRNetwork on E. coli gene transcriptome data gives a set of potential regulatory links with promising biological significance for isobutanol stress and other conditions. PMID:21668997
Neuroprosthetic Decoder Training as Imitation Learning.
Merel, Josh; Carlson, David; Paninski, Liam; Cunningham, John P
2016-05-01
Neuroprosthetic brain-computer interfaces function via an algorithm which decodes neural activity of the user into movements of an end effector, such as a cursor or robotic arm. In practice, the decoder is often learned by updating its parameters while the user performs a task. When the user's intention is not directly observable, recent methods have demonstrated value in training the decoder against a surrogate for the user's intended movement. Here we show that training a decoder in this way is a novel variant of an imitation learning problem, where an oracle or expert is employed for supervised training in lieu of direct observations, which are not available. Specifically, we describe how a generic imitation learning meta-algorithm, dataset aggregation (DAgger), can be adapted to train a generic brain-computer interface. By deriving existing learning algorithms for brain-computer interfaces in this framework, we provide a novel analysis of regret (an important metric of learning efficacy) for brain-computer interfaces. This analysis allows us to characterize the space of algorithmic variants and bounds on their regret rates. Existing approaches for decoder learning have been performed in the cursor control setting, but the available design principles for these decoders are such that it has been impossible to scale them to naturalistic settings. Leveraging our findings, we then offer an algorithm that combines imitation learning with optimal control, which should allow for training of arbitrary effectors for which optimal control can generate goal-oriented control. We demonstrate this novel and general BCI algorithm with simulated neuroprosthetic control of a 26 degree-of-freedom model of an arm, a sophisticated and realistic end effector.
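The dataset-aggregation idea can be conveyed with a short, generic DAgger-style training loop. The `simulate_trial` and `oracle_intent` callables below are hypothetical placeholders for closed-loop data collection and the surrogate intention signal; the ridge decoder is an illustrative choice, not the paper's decoder.

```python
import numpy as np
from sklearn.linear_model import Ridge

def dagger_train_decoder(simulate_trial, oracle_intent, n_iters=5):
    """Generic DAgger-style loop for decoder training (illustrative sketch).

    simulate_trial(decoder) -> neural activity recorded while the current
                               decoder drives the effector, shape (T, n_units)
    oracle_intent(activity) -> surrogate 'intended' velocities, shape (T, n_dims)
    The dataset is aggregated across iterations and the decoder refit each time.
    """
    X_all, Y_all = [], []
    decoder = Ridge(alpha=1.0)
    # Initialize with one trial collected without a trained decoder.
    activity = simulate_trial(None)
    X_all.append(activity); Y_all.append(oracle_intent(activity))
    decoder.fit(np.vstack(X_all), np.vstack(Y_all))
    for _ in range(n_iters):
        activity = simulate_trial(decoder)      # roll out the current decoder
        X_all.append(activity)                  # aggregate the new states...
        Y_all.append(oracle_intent(activity))   # ...labeled by the oracle
        decoder.fit(np.vstack(X_all), np.vstack(Y_all))
    return decoder

# Toy usage with synthetic neural data and a linear 'ground truth' mapping.
rng = np.random.default_rng(0)
W_true = rng.standard_normal((20, 2))
simulate = lambda dec: rng.standard_normal((200, 20))
oracle = lambda act: act @ W_true
decoder = dagger_train_decoder(simulate, oracle)
```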
Cost-effectiveness of WHO-Recommended Algorithms for TB Case Finding at Ethiopian HIV Clinics.
Adelman, Max W; McFarland, Deborah A; Tsegaye, Mulugeta; Aseffa, Abraham; Kempker, Russell R; Blumberg, Henry M
2018-01-01
The World Health Organization (WHO) recommends active tuberculosis (TB) case finding and a rapid molecular diagnostic test (Xpert MTB/RIF) to detect TB among people living with HIV (PLHIV) in high-burden settings. Information on the cost-effectiveness of these recommended strategies is crucial for their implementation. We conducted a model-based cost-effectiveness analysis comparing 2 algorithms for TB screening and diagnosis at Ethiopian HIV clinics: (1) WHO-recommended symptom screen combined with Xpert for PLHIV with a positive symptom screen and (2) current recommended practice algorithm (CRPA; based on symptom screening, smear microscopy, and clinical TB diagnosis). Our primary outcome was US$ per disability-adjusted life-year (DALY) averted. Secondary outcomes were additional true-positive diagnoses, and false-negative and false-positive diagnoses averted. Compared with CRPA, combining a WHO-recommended symptom screen with Xpert was highly cost-effective (incremental cost of $5 per DALY averted). Among a cohort of 15 000 PLHIV with a TB prevalence of 6% (900 TB cases), this algorithm detected 8 more true-positive cases than CRPA, and averted 2045 false-positive and 8 false-negative diagnoses compared with CRPA. The WHO-recommended algorithm was marginally costlier ($240 000) than CRPA ($239 000). In sensitivity analysis, the symptom screen/Xpert algorithm was dominated at low Xpert sensitivity (66%). In this model-based analysis, combining a WHO-recommended symptom screen with Xpert for TB diagnosis among PLHIV was highly cost-effective ($5 per DALY averted) and more sensitive than CRPA in a high-burden, resource-limited setting.
Molecular system identification for enzyme directed evolution and design
NASA Astrophysics Data System (ADS)
Guan, Xiangying; Chakrabarti, Raj
2017-09-01
The rational design of chemical catalysts requires methods for the measurement of free energy differences in the catalytic mechanism for any given catalyst Hamiltonian. The scope of experimental learning algorithms that can be applied to catalyst design would also be expanded by the availability of such methods. Methods for catalyst characterization typically either estimate apparent kinetic parameters that do not necessarily correspond to free energy differences in the catalytic mechanism or measure individual free energy differences that are not sufficient for establishing the relationship between the potential energy surface and catalytic activity. Moreover, in order to enhance the duty cycle of catalyst design, statistically efficient methods for the estimation of the complete set of free energy differences relevant to the catalytic activity based on high-throughput measurements are preferred. In this paper, we present a theoretical and algorithmic system identification framework for the optimal estimation of free energy differences in solution phase catalysts, with a focus on one- and two-substrate enzymes. This framework, which can be automated using programmable logic, prescribes a choice of feasible experimental measurements and manipulated input variables that identify the complete set of free energy differences relevant to the catalytic activity and minimize the uncertainty in these free energy estimates for each successive Hamiltonian design. The framework also employs decision-theoretic logic to determine when model reduction can be applied to improve the duty cycle of high-throughput catalyst design. Automation of the algorithm using fluidic control systems is proposed, and applications of the framework to the problem of enzyme design are discussed.
A new approach based on the median filter to T-wave detection in ECG signal.
Kholkhal, Mourad; Bereksi Reguig, Fethi
2014-07-01
The electrocardiogram (ECG) is one of the most widely used signals in the diagnosis of heart disease. It contains different waves which directly correlate with heart activity, and various methods have been used to detect these waves and thereby support the diagnosis of heart activity. This paper focuses in particular on the detection of the T-wave, which represents the repolarization phase of the heart activity. The proposed approach is based on an algorithmic procedure that detects the T-wave using several filters, including mean and median filters. The proposed algorithm is implemented and tested on a set of ECG recordings taken from the European STT, MITBIH and MITBIH ST databases. The results are found to be very satisfactory in terms of sensitivity, predictivity and error compared to other works in the field.
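A common way median filters enter ECG processing is baseline removal before searching for the T-wave after each QRS complex. The window lengths, search interval and synthetic signal below are illustrative assumptions rather than the paper's parameters.

```python
import numpy as np
from scipy.signal import medfilt

def detect_t_wave(ecg, fs, qrs_samples):
    """Illustrative T-wave detector built around median filtering.

    ecg         -- 1-D ECG signal
    fs          -- sampling frequency in Hz
    qrs_samples -- indices of detected R peaks (assumed already available)
    """
    # Remove baseline wander with two cascaded median filters (odd windows).
    w1 = int(0.2 * fs) | 1          # ~200 ms window
    w2 = int(0.6 * fs) | 1          # ~600 ms window: leaves mostly the baseline
    baseline = medfilt(medfilt(ecg, w1), w2)
    clean = ecg - baseline

    t_peaks = []
    for r in qrs_samples:
        lo = r + int(0.15 * fs)      # search window after the QRS complex
        hi = min(r + int(0.45 * fs), len(clean))
        if lo < hi:
            t_peaks.append(lo + int(np.argmax(np.abs(clean[lo:hi]))))
    return np.array(t_peaks)

# Toy usage: drifting baseline plus periodic spiky "beats".
fs = 250
t = np.arange(0, 10, 1 / fs)
ecg = 0.3 * np.sin(2 * np.pi * 0.2 * t) + np.sin(2 * np.pi * 1.0 * t) ** 31
r_peaks = np.arange(int(0.25 * fs), len(t), fs)   # crude stand-in for QRS detection
print(detect_t_wave(ecg, fs, r_peaks)[:5])
```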
One cutting plane algorithm using auxiliary functions
NASA Astrophysics Data System (ADS)
Zabotin, I. Ya; Kazaeva, K. E.
2016-11-01
We propose an algorithm from the class of cutting methods for solving a convex programming problem. The algorithm is characterized by the construction of approximations using auxiliary functions instead of the objective function, where each auxiliary function is based on the exterior penalty function. In the proposed algorithm, the admissible set and the epigraph of each auxiliary function are embedded into polyhedral sets, so the iteration points are found by solving linear programming problems. We discuss the implementation of the algorithm and prove its convergence.
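A minimal Kelley-type cutting-plane loop conveys the general idea of replacing a convex function by polyhedral approximations and solving an LP at every iteration. This sketch cuts the plain objective on a box rather than the authors' penalty-based auxiliary functions, so it only illustrates the cutting-method framework.

```python
import numpy as np
from scipy.optimize import linprog

def kelley_cutting_plane(f, grad, bounds, x0, n_iters=30):
    """Kelley's cutting-plane method for min f(x) over a box (illustration only).
    Each cut f(xk) + g.(x - xk) <= t becomes a row of the LP  min t."""
    n = len(x0)
    cuts_A, cuts_b = [], []
    x = np.array(x0, dtype=float)
    for _ in range(n_iters):
        fx, g = f(x), grad(x)
        # Cut in LP form: g.x - t <= g.xk - f(xk)
        cuts_A.append(np.append(g, -1.0))
        cuts_b.append(g @ x - fx)
        res = linprog(c=np.append(np.zeros(n), 1.0),
                      A_ub=np.array(cuts_A), b_ub=np.array(cuts_b),
                      bounds=list(bounds) + [(None, None)], method="highs")
        x = res.x[:n]
    return x, f(x)

# Toy usage: a smooth convex function on the box [-2, 2]^2.
f = lambda x: (x[0] - 1.0) ** 2 + 2.0 * (x[1] + 0.5) ** 2
grad = lambda x: np.array([2.0 * (x[0] - 1.0), 4.0 * (x[1] + 0.5)])
x_opt, f_opt = kelley_cutting_plane(f, grad, bounds=[(-2, 2), (-2, 2)], x0=[0.0, 0.0])
```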
On a programming language for graph algorithms
NASA Technical Reports Server (NTRS)
Rheinboldt, W. C.; Basili, V. R.; Mesztenyi, C. K.
1971-01-01
An algorithmic language, GRAAL, is presented for describing and implementing graph algorithms of the type primarily arising in applications. The language is based on a set algebraic model of graph theory which defines the graph structure in terms of morphisms between certain set algebraic structures over the node set and arc set. GRAAL is modular in the sense that the user specifies which of these mappings are available with any graph. This allows flexibility in the selection of the storage representation for different graph structures. In line with its set theoretic foundation, the language introduces sets as a basic data type and provides for the efficient execution of all set and graph operators. At present, GRAAL is defined as an extension of ALGOL 60 (revised) and its formal description is given as a supplement to the syntactic and semantic definition of ALGOL. Several typical graph algorithms are written in GRAAL to illustrate various features of the language and to show its applicability.
NASA Astrophysics Data System (ADS)
Hori, Yasuaki; Yasuno, Yoshiaki; Sakai, Shingo; Matsumoto, Masayuki; Sugawara, Tomoko; Madjarova, Violeta; Yamanari, Masahiro; Makita, Shuichi; Yasui, Takeshi; Araki, Tsutomu; Itoh, Masahide; Yatagai, Toyohiko
2006-03-01
A set of fully automated algorithms that is specialized for analyzing a three-dimensional optical coherence tomography (OCT) volume of human skin is reported. The algorithm set first determines the skin surface of the OCT volume, and a depth-oriented algorithm provides the mean epidermal thickness, distribution map of the epidermis, and a segmented volume of the epidermis. Subsequently, an en face shadowgram is produced by an algorithm to visualize the infundibula in the skin with high contrast. The population and occupation ratio of the infundibula are provided by a histogram-based thresholding algorithm and a distance mapping algorithm. En face OCT slices at constant depths from the sample surface are extracted, and the histogram-based thresholding algorithm is again applied to these slices, yielding a three-dimensional segmented volume of the infundibula. The dermal attenuation coefficient is also calculated from the OCT volume in order to evaluate the skin texture. The algorithm set examines swept-source OCT volumes of the skins of several volunteers, and the results show the high stability, portability and reproducibility of the algorithm.
An extended affinity propagation clustering method based on different data density types.
Zhao, XiuLi; Xu, WeiXiang
2015-01-01
The affinity propagation (AP) algorithm is a novel clustering method that does not require the user to specify initial cluster centers in advance; it regards all data points equally as potential exemplars (cluster centers) and forms clusters solely from the degree of similarity among the data points. In many cases, however, there are areas of different density within the same data set, which means that the data set is not distributed homogeneously. In such a situation the AP algorithm cannot group the data points into ideal clusters. In this paper, we propose an extended AP clustering algorithm to deal with this problem. There are two steps in our method: first, the data set is partitioned into several data density types according to the nearest-neighbor distance of each data point; then the AP clustering method is applied separately to group the data points into clusters within each data density type. Two experiments are carried out to evaluate the performance of our algorithm: one uses an artificial data set and the other a real seismic data set. The experimental results show that our algorithm obtains groups more accurately than OPTICS and the AP clustering algorithm itself.
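The two-step structure, a density-based partition followed by affinity propagation within each group, can be sketched with scikit-learn. The median-based density split and the two-group partition below are illustrative choices, not the paper's exact procedure.

```python
import numpy as np
from sklearn.cluster import AffinityPropagation
from sklearn.neighbors import NearestNeighbors

def extended_ap(X, density_threshold=None, random_state=0):
    """Two-step clustering in the spirit of the extended AP algorithm:
    (1) split points into density groups via the nearest-neighbour distance,
    (2) run affinity propagation separately inside each group."""
    nn = NearestNeighbors(n_neighbors=2).fit(X)
    d_nn = nn.kneighbors(X)[0][:, 1]               # distance to nearest neighbour
    thr = np.median(d_nn) if density_threshold is None else density_threshold
    labels = np.full(len(X), -1)
    next_label = 0
    for group in (d_nn <= thr, d_nn > thr):        # dense group, then sparse group
        if group.sum() < 2:
            continue
        ap = AffinityPropagation(random_state=random_state).fit(X[group])
        labels[group] = ap.labels_ + next_label
        next_label = labels.max() + 1
    return labels

# Toy usage: one tight blob and one diffuse blob.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.1, (100, 2)), rng.normal(5, 2.0, (100, 2))])
labels = extended_ap(X)
```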
Procedure of Partitioning Data Into Number of Data Sets or Data Group - A Review
NASA Astrophysics Data System (ADS)
Kim, Tai-Hoon
The goal of clustering is to decompose a dataset into similar groups based on an objective function. Several well-established clustering algorithms exist for data clustering; their objective is to divide the data points of the feature space into a number of groups (or classes) so that a predefined set of criteria is satisfied. This article presents a comparative study of the effectiveness and efficiency of traditional data clustering algorithms. To evaluate the performance of the clustering algorithms, the Minkowski score is used here on different data sets.
Computationally efficient algorithm for Gaussian Process regression in case of structured samples
NASA Astrophysics Data System (ADS)
Belyaev, M.; Burnaev, E.; Kapushev, Y.
2016-04-01
Surrogate modeling is widely used in many engineering problems. Data sets often have a Cartesian product structure (for instance, a factorial design of experiments with missing points), in which case the size of the data set can be very large. Therefore, one of the most popular approximation algorithms, Gaussian Process regression, can hardly be applied due to its computational complexity. In this paper, a computationally efficient approach for constructing Gaussian Process regression for data sets with Cartesian product structure is presented. Efficiency is achieved by exploiting the special structure of the data set and operations with tensors. The proposed algorithm has low computational as well as memory complexity compared to existing algorithms. We also introduce a regularization procedure that takes into account the anisotropy of the data set and avoids degeneracy of the regression model.
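For a complete Cartesian grid the kernel matrix factors as a Kronecker product, so the GP solve can be done with small per-dimension eigendecompositions instead of one huge matrix. The sketch below shows that trick for a 2-D full factorial design; it is a simplified illustration under the assumption of no missing grid points and does not include the paper's anisotropy-aware regularization.

```python
import numpy as np

def rbf(a, b, lengthscale=1.0):
    """Squared-exponential kernel between two 1-D coordinate arrays."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def fit_kron_gp(x1, x2, Y, noise=1e-2):
    """GP regression on a full grid x1 x x2 using K = K1 kron K2.
    Y has shape (len(x1), len(x2)); returns the weight tensor alpha."""
    K1, K2 = rbf(x1, x1), rbf(x2, x2)
    l1, Q1 = np.linalg.eigh(K1)
    l2, Q2 = np.linalg.eigh(K2)
    # (Q1 kron Q2)^T y computed with small matrix products via reshaping.
    Z = Q1.T @ Y @ Q2
    # Divide by the eigenvalues of K1 kron K2 + noise*I, then map back.
    alpha = Q1 @ (Z / (np.outer(l1, l2) + noise)) @ Q2.T
    return alpha

def predict_kron_gp(x1, x2, alpha, x1_new, x2_new):
    """Posterior mean on a new grid x1_new x x2_new."""
    return rbf(x1_new, x1) @ alpha @ rbf(x2_new, x2).T

# Toy usage: 30 x 40 factorial design, noisy samples of a smooth function.
x1, x2 = np.linspace(0, 5, 30), np.linspace(0, 5, 40)
X1, X2 = np.meshgrid(x1, x2, indexing="ij")
Y = np.sin(X1) * np.cos(X2) + 0.05 * np.random.default_rng(0).standard_normal(X1.shape)
alpha = fit_kron_gp(x1, x2, Y)
Y_hat = predict_kron_gp(x1, x2, alpha, x1, x2)   # reconstruct on the same grid
print(float(np.mean((Y_hat - np.sin(X1) * np.cos(X2)) ** 2)))
```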
Breast boundary detection with active contours
NASA Astrophysics Data System (ADS)
Balic, I.; Goyal, P.; Roy, O.; Duric, N.
2014-03-01
Ultrasound tomography is a modality that can be used to image various characteristics of the breast, such as sound speed, attenuation, and reflectivity. In the considered setup, the breast is immersed in water and scanned along the coronal axis from the chest wall to the nipple region. To improve image visualization, it is desirable to remove the water background. To this end, the 3D boundary of the breast must be accurately estimated. We present an iterative algorithm based on active contours that automatically detects the boundary of a breast using a 3D stack of attenuation images obtained from an ultrasound tomography scanner. We build upon an existing method to design an algorithm that is fast, fully automated, and reliable. We demonstrate the effectiveness of the proposed technique using clinical data sets.
Pattern-set generation algorithm for the one-dimensional multiple stock sizes cutting stock problem
NASA Astrophysics Data System (ADS)
Cui, Yaodong; Cui, Yi-Ping; Zhao, Zhigang
2015-09-01
A pattern-set generation algorithm (PSG) for the one-dimensional multiple stock sizes cutting stock problem (1DMSSCSP) is presented. The solution process contains two stages. In the first stage, the PSG solves the residual problems repeatedly to generate the patterns in the pattern set, where each residual problem is solved by the column-generation approach, and each pattern is generated by solving a single large object placement problem. In the second stage, the integer linear programming model of the 1DMSSCSP is solved using a commercial solver, where only the patterns in the pattern set are considered. The computational results of benchmark instances indicate that the PSG outperforms existing heuristic algorithms and rivals the exact algorithm in solution quality.
NASA Technical Reports Server (NTRS)
Chang, H.
1976-01-01
A computer program using Lemke, Salkin and Spielberg's Set Covering Algorithm (SCA) to optimize a traffic model problem in the Scheduling Algorithm for Mission Planning and Logistics Evaluation (SAMPLE) was documented. SCA forms a submodule of SAMPLE and provides for input and output, subroutines, and an interactive feature for performing the optimization and arranging the results in a readily understandable form for output.
Detection of person borne IEDs using multiple cooperative sensors
NASA Astrophysics Data System (ADS)
MacIntosh, Scott; Deming, Ross; Hansen, Thorkild; Kishan, Neel; Tang, Ling; Shea, Jing; Lang, Stephen
2011-06-01
The use of multiple cooperative sensors for the detection of person-borne IEDs is investigated. The purpose of the effort is to evaluate the performance benefits of adding multiple sensor data streams into an aided threat detection algorithm, and to provide a quantitative analysis of which sensor data combinations improve overall detection performance. Testing includes both mannequins and human subjects with simulated suicide bomb devices of various configurations, materials, sizes and metal content. Aided threat recognition algorithms are being developed to test the detection performance of individual sensors against combined, fused sensor inputs. Sensors investigated include active and passive millimeter wave imaging systems, passive infrared, 3-D profiling sensors and acoustic imaging. The paper describes the experimental set-up and outlines the methodology behind a decision fusion algorithm based on the concept of a "body model".
Tai, David; Fang, Jianwen
2012-08-27
The large sizes of today's chemical databases require efficient algorithms to perform similarity searches. It can be very time consuming to compare two large chemical databases. This paper seeks to build upon existing research efforts by describing a novel strategy for accelerating existing search algorithms for comparing large chemical collections. The quest for efficiency has focused on developing better indexing algorithms by creating heuristics for searching individual chemical against a chemical library by detecting and eliminating needless similarity calculations. For comparing two chemical collections, these algorithms simply execute searches for each chemical in the query set sequentially. The strategy presented in this paper achieves a speedup upon these algorithms by indexing the set of all query chemicals so redundant calculations that arise in the case of sequential searches are eliminated. We implement this novel algorithm by developing a similarity search program called Symmetric inDexing or SymDex. SymDex shows over a 232% maximum speedup compared to the state-of-the-art single query search algorithm over real data for various fingerprint lengths. Considerable speedup is even seen for batch searches where query set sizes are relatively small compared to typical database sizes. To the best of our knowledge, SymDex is the first search algorithm designed specifically for comparing chemical libraries. It can be adapted to most, if not all, existing indexing algorithms and shows potential for accelerating future similarity search algorithms for comparing chemical databases.
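The basic operation being accelerated, fingerprint similarity search between two chemical collections, can be illustrated with Tanimoto similarity plus the standard bit-count pruning bound T(A, B) <= min(|A|, |B|) / max(|A|, |B|). This popcount pruning is a common indexing trick and is shown only as an illustration; it is not necessarily the exact strategy used inside SymDex.

```python
import numpy as np

def tanimoto(a, b):
    """Tanimoto similarity between two binary fingerprints (1-D 0/1 arrays)."""
    c = np.sum(a & b)
    return c / (np.sum(a) + np.sum(b) - c + 1e-12)

def compare_libraries(query_fps, db_fps, threshold=0.7):
    """Find all (query, database) pairs with Tanimoto >= threshold, skipping
    pairs whose popcounts already violate min/max >= threshold."""
    q_counts = query_fps.sum(axis=1)
    d_counts = db_fps.sum(axis=1)
    hits = []
    for i, (q, qc) in enumerate(zip(query_fps, q_counts)):
        # Only database fingerprints with compatible popcounts can pass.
        lo, hi = qc * threshold, qc / threshold
        candidates = np.where((d_counts >= lo) & (d_counts <= hi))[0]
        for j in candidates:
            s = tanimoto(q, db_fps[j])
            if s >= threshold:
                hits.append((i, j, s))
    return hits

# Toy usage: 100 query and 1000 database fingerprints of length 512.
rng = np.random.default_rng(0)
query = (rng.random((100, 512)) < 0.1).astype(np.uint8)
db = (rng.random((1000, 512)) < 0.1).astype(np.uint8)
print(len(compare_libraries(query, db, threshold=0.4)), "pairs above threshold")
```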
NASA Astrophysics Data System (ADS)
Moneta, Diana; Mora, Paolo; Viganò, Giacomo; Alimonti, Gianluca
2014-12-01
The diffusion of Distributed Generation (DG) based on Renewable Energy Sources (RES) requires new strategies to ensure reliable and economic operation of the distribution networks and to support the diffusion of DG itself. An advanced algorithm (DISCoVER - DIStribution Company VoltagE Regulator) is being developed to optimize the operation of active networks by means of advanced voltage control based on several types of regulation. Starting from forecasted load and generation, real on-field measurements, technical constraints and costs for each resource, the algorithm generates for each time period a set of commands for controllable resources that guarantees achievement of the technical goals while minimizing the overall cost. Before the controller is integrated into the telecontrol system of real networks, a complete simulation phase has been started in order to validate the proper behaviour of the algorithm and to identify possible critical conditions. The first step concerns the definition of a wide range of "case studies", i.e. combinations of network topology, technical constraints and targets, load and generation profiles, and resource "costs" that define a valid context in which to test the algorithm, with particular focus on battery and RES management. First results obtained from the simulation activity on test networks (based on real MV grids) and actual battery characteristics are given, together with prospective performance on real case applications.
Cao, Yuan-Yuan; Su, Yan-Gang; Bai, Jin; Wang, Wei; Wang, Jing-Feng; Qin, Sheng-Mei; Ge, Jun-Bo
2015-01-01
Loss of left ventricular (LV) capture may lead to deterioration of heart failure in patients with cardiac resynchronization therapy (CRT). Timely recognition of loss of LV capture is important in clinical practice. A total of 422 electrocardiograms were acquired and analyzed from 53 CRT patients at 8 different pacing settings (LV only, right ventricle [RV] only, and biventricular [BV] pacing with LV preactivation of 60, 40, 20, and 0 milliseconds and RV preactivation of 20 and 40 milliseconds). A modified Ammann algorithm, formed by adding a third step (presence of a Q, q, or QS wave) to the original 2-step Ammann algorithm, and a QRS axis shift method were devised to identify loss of LV capture. The accuracy of the modified Ammann algorithm was significantly higher than that of the original Ammann algorithm (78.9% vs. 69.1%, P < 0.001). The accuracy of the axis shift method was 66.4%, significantly lower than the modified Ammann algorithm (P < 0.001) and similar to the original one (P = 0.412). However, among the ECGs with a QRS axis shift, 96.8% were correctly classified. LV preactivation or simultaneous BV activation, and an LV lead positioned in a nonposterior or noninferior wall, increased the accuracies of the modified Ammann algorithm and the QRS axis shift method. The accuracy of the modified Ammann algorithm is greatly improved. The QRS axis shift method can help diagnose loss of LV capture. LV preactivation or simultaneous BV activation, and an LV lead positioned in a nonposterior or noninferior wall, can increase the diagnostic power of the modified Ammann algorithm and the QRS axis shift method. © 2014 Wiley Periodicals, Inc.
Inverse consistent non-rigid image registration based on robust point set matching
2014-01-01
Background Robust point matching (RPM) has been extensively used in non-rigid registration of images to robustly register two sets of image points. However, except at the control points, RPM cannot estimate a consistent correspondence between two images because RPM is a unidirectional image matching approach. Therefore, improving image registration based on RPM is an important issue. Methods In our work, a consistent image registration approach based on point set matching is proposed to incorporate the property of inverse consistency and improve registration accuracy. Instead of estimating only the forward transformation between the source and target point sets, as in state-of-the-art RPM algorithms, the forward and backward transformations between two point sets are estimated concurrently in our algorithm. Inverse consistency constraints are introduced into the cost function of RPM, and the fuzzy correspondences between two point sets are estimated based on both the forward and backward transformations simultaneously. A modified consistent landmark thin-plate spline registration is discussed in detail to find the forward and backward transformations during the optimization of RPM. The similarity of image content is also incorporated into point matching in order to improve image matching. Results Synthetic data sets and medical images are employed to demonstrate and validate the performance of our approach. The inverse consistent errors of our algorithm are smaller than those of RPM. In particular, the topology of the transformations is well preserved by our algorithm even for large deformations between point sets. Moreover, the distance errors of our algorithm are similar to those of RPM, and they maintain a downward trend as a whole, which demonstrates the convergence of our algorithm. The registration errors for image registration are also evaluated. Again, our algorithm achieves lower registration errors for the same number of iterations. The determinant of the Jacobian matrix of the deformation field is used to analyse the smoothness of the forward and backward transformations. The forward and backward transformations estimated by our algorithm are smooth for small deformations. For registration of lung slices and individual brain slices, both large and small determinants of the Jacobian matrix of the deformation fields are observed. Conclusions The results indicate the improvement of the proposed algorithm in bi-directional image registration and the decrease of the inverse consistent errors of the forward and the reverse transformations between two images. PMID:25559889
Analysis of miRNA expression profile based on SVM algorithm
NASA Astrophysics Data System (ADS)
Ting-ting, Dai; Chang-ji, Shan; Yan-shou, Dong; Yi-duo, Bian
2018-05-01
Based on a miRNA expression profile data set, a new data mining algorithm, tSVM-KNN (t-statistic with support vector machine - k-nearest neighbor), is proposed. The idea of the algorithm is as follows: first, feature selection on the data set is carried out with a unified measurement method; second, the SVM-KNN algorithm, which combines a support vector machine (SVM) with a k-nearest neighbor (KNN) classifier, is used as the classifier. Simulation results show that the SVM-KNN algorithm has better classification ability than SVM or KNN alone. In terms of the number of miRNA "tags" and recognition accuracy, the tSVM-KNN algorithm needs only 5 miRNAs to obtain 96.08% classification accuracy; compared with similar algorithms, tSVM-KNN has obvious advantages.
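As a rough illustration of the two-stage idea, the sketch below ranks features by a two-sample t-statistic and then lets an SVM handle confident samples while falling back to KNN near the decision boundary. The margin threshold, k, and feature count are illustrative assumptions, not the tuned values from this study.

```python
# Minimal tSVM-KNN-style pipeline sketch for a two-class expression matrix X
# (samples x features) with labels y in {0, 1}; parameters are placeholders.
import numpy as np
from scipy import stats
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier

def select_by_t_statistic(X, y, n_features=5):
    """Rank features by absolute two-sample t-statistic and keep the top n."""
    t, _ = stats.ttest_ind(X[y == 0], X[y == 1], axis=0, equal_var=False)
    return np.argsort(-np.abs(t))[:n_features]

def svm_knn_predict(X_train, y_train, X_test, margin=0.5, k=5):
    """SVM decides confident samples; borderline samples fall back to KNN."""
    svm = SVC(kernel="linear").fit(X_train, y_train)
    knn = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    scores = svm.decision_function(X_test)
    pred = svm.predict(X_test)
    borderline = np.abs(scores) < margin          # low-confidence region
    if borderline.any():
        pred[borderline] = knn.predict(X_test[borderline])
    return pred
```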
Continuous Glucose Monitoring Enables the Detection of Losses in Infusion Set Actuation (LISAs)
Howsmon, Daniel P.; Cameron, Faye; Baysal, Nihat; Ly, Trang T.; Forlenza, Gregory P.; Maahs, David M.; Buckingham, Bruce A.; Hahn, Juergen; Bequette, B. Wayne
2017-01-01
Reliable continuous glucose monitoring (CGM) enables a variety of advanced technology for the treatment of type 1 diabetes. In addition to artificial pancreas algorithms that use CGM to automate continuous subcutaneous insulin infusion (CSII), CGM can also inform fault detection algorithms that alert patients to problems in CGM or CSII. Losses in infusion set actuation (LISAs) can adversely affect clinical outcomes, resulting in hyperglycemia due to impaired insulin delivery. Prolonged hyperglycemia may lead to diabetic ketoacidosis—a serious metabolic complication in type 1 diabetes. Therefore, an algorithm for the detection of LISAs based on CGM and CSII signals was developed to improve patient safety. The LISA detection algorithm is trained retrospectively on data from 62 infusion set insertions from 20 patients. The algorithm collects glucose and insulin data, and computes relevant fault metrics over two different sliding windows; an alarm sounds when these fault metrics are exceeded. With the chosen algorithm parameters, the LISA detection strategy achieved a sensitivity of 71.8% and issued 0.28 false positives per day on the training data. Validation on two independent data sets confirmed that similar performance is seen on data that was not used for training. The developed algorithm is able to effectively alert patients to possible infusion set failures in open-loop scenarios, with limited evidence of its extension to closed-loop scenarios. PMID:28098839
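A minimal sketch of a sliding-window fault monitor in this spirit follows: insulin delivery that is not accompanied by a fall in glucose over two window lengths raises an alarm. The window sizes, the specific metric, and the thresholds are illustrative assumptions, not the trained parameters reported above.

```python
# Hedged sketch of a two-window CGM/CSII fault monitor (illustrative only).
from collections import deque

class InfusionFaultMonitor:
    def __init__(self, short_len=12, long_len=36,
                 insulin_thresh=2.0, glucose_rise_thresh=30.0):
        self.glucose_short = deque(maxlen=short_len)   # e.g. 1 h of 5-min CGM
        self.glucose_long = deque(maxlen=long_len)     # e.g. 3 h of 5-min CGM
        self.insulin_long = deque(maxlen=long_len)
        self.insulin_thresh = insulin_thresh           # units delivered
        self.glucose_rise_thresh = glucose_rise_thresh # mg/dL

    def update(self, glucose, insulin):
        """Add one CGM/CSII sample; return True if a possible LISA is flagged."""
        self.glucose_short.append(glucose)
        self.glucose_long.append(glucose)
        self.insulin_long.append(insulin)
        if len(self.glucose_long) < self.glucose_long.maxlen:
            return False                               # not enough history yet
        rise_short = self.glucose_short[-1] - self.glucose_short[0]
        rise_long = self.glucose_long[-1] - self.glucose_long[0]
        insulin_total = sum(self.insulin_long)
        # Fault metric: sustained glucose rise despite substantial insulin.
        return (insulin_total > self.insulin_thresh
                and rise_short > 0
                and rise_long > self.glucose_rise_thresh)
```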
Kernel-Based Sensor Fusion With Application to Audio-Visual Voice Activity Detection
NASA Astrophysics Data System (ADS)
Dov, David; Talmon, Ronen; Cohen, Israel
2016-12-01
In this paper, we address the problem of multiple-view data fusion in the presence of noise and interferences. Recent studies have approached this problem using kernel methods, relying in particular on a product of kernels constructed separately for each view. From a graph theory point of view, we analyze this fusion approach in a discrete setting. More specifically, based on a statistical model for the connectivity between data points, we propose an algorithm for the selection of the kernel bandwidth, a parameter which, as we show, has important implications for the robustness of this fusion approach to interferences. Then, we consider the fusion of audio-visual speech signals measured by a single microphone and by a video camera pointed at the face of the speaker. Specifically, we address the task of voice activity detection, i.e., the detection of speech and non-speech segments, in the presence of structured interferences such as keyboard taps and office noise. We propose an algorithm for voice activity detection based on the audio-visual signal. Simulation results show that the proposed algorithm outperforms competing fusion and voice activity detection approaches. In addition, we demonstrate that a proper selection of the kernel bandwidth indeed leads to improved performance.
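For orientation, a minimal sketch of the product-of-kernels construction that this approach builds on is given below. The Gaussian affinities and fixed bandwidth values are placeholders, since the bandwidth selection rule is exactly what the paper contributes.

```python
# Product-of-kernels fusion of two views (illustrative sketch).
import numpy as np

def gaussian_kernel(X, eps):
    """Pairwise Gaussian affinity matrix for one view (samples in rows)."""
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-d2 / eps)

def fused_kernel(X_audio, X_video, eps_audio=1.0, eps_video=1.0):
    """Element-wise product of the per-view kernels (a common-graph fusion)."""
    K1 = gaussian_kernel(X_audio, eps_audio)
    K2 = gaussian_kernel(X_video, eps_video)
    return K1 * K2
```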
Wave front sensing for next generation earth observation telescope
NASA Astrophysics Data System (ADS)
Delvit, J.-M.; Thiebaut, C.; Latry, C.; Blanchet, G.
2017-09-01
High resolution observation systems are highly dependent on optics quality and are usually designed to be nearly diffraction limited. Such performance allows the Nyquist frequency to be set closer to the cut-off frequency or, equivalently, the pupil diameter to be minimized for a given ground sampling distance target. Up to now, defocus is the only aberration that is allowed to evolve slowly and that may be corrected in flight, using an open-loop correction based upon ground estimation and upload of a refocusing command. For instance, Pleiades satellite defocus is assessed from star acquisitions and refocusing is done by thermal actuation of the M2 mirror. Next generation systems under study at CNES should include active optics in order to correct evolving aberrations beyond defocus, due for instance to variable in-orbit thermal conditions. Active optics relies on aberration estimation through an onboard Wave Front Sensor (WFS). One option is to use a Shack-Hartmann sensor. The Shack-Hartmann wave-front sensor can be used on extended scenes (unknown landscapes). A wave-front computation algorithm must then be implemented on board the satellite to provide the wave-front error measure for the control loop. In the worst-case scenario, this measure must be computed before each image acquisition. A robust and fast shift estimation algorithm between Shack-Hartmann images is therefore needed to fulfill this requirement. A fast gradient-based algorithm using optical flow with a Lucas-Kanade method has been studied and implemented on an electronic device developed by CNES. Measurement accuracy depends on the Wave Front Error (WFE), the landscape frequency content, the number of searched aberrations, the a priori knowledge of high-order aberrations and the characteristics of the sensor. CNES has carried out a full-scale sensitivity analysis over the whole parameter set with the internally developed algorithm.
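The shift estimation step can be illustrated with a single-step Lucas-Kanade solve between a reference sub-aperture image and a shifted one, a stand-in for the fast gradient-based algorithm mentioned above; the on-board version is presumably iterative and hardened against noise and WFE.

```python
# Single-step global optical-flow shift estimate (small-shift assumption).
import numpy as np

def estimate_shift(ref, img):
    """Solve the 2x2 optical-flow normal equations for a global (dx, dy)."""
    gy, gx = np.gradient(ref.astype(float))          # image gradients
    gt = img.astype(float) - ref.astype(float)       # temporal difference
    A = np.array([[np.sum(gx * gx), np.sum(gx * gy)],
                  [np.sum(gx * gy), np.sum(gy * gy)]])
    b = -np.array([np.sum(gx * gt), np.sum(gy * gt)])
    return np.linalg.solve(A, b)   # (dx, dy) in pixels, valid for small shifts
```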
Sinha Gregory, Naina; Seley, Jane Jeffrie; Gerber, Linda M; Tang, Chin; Brillon, David
2016-12-01
More than one-third of hospitalized patients have hyperglycemia. Despite evidence that improving glycemic control leads to better outcomes, achieving recognized targets remains a challenge. The objective of this study was to evaluate the implementation of a computerized insulin order set and titration algorithm on rates of hypoglycemia and overall inpatient glycemic control. A prospective observational study evaluating the impact of a glycemic order set and titration algorithm in an academic medical center in non-critical care medical and surgical inpatients. The initial intervention was hospital-wide implementation of a comprehensive insulin order set. The secondary intervention was initiation of an insulin titration algorithm in two pilot medicine inpatient units. Point of care testing blood glucose reports were analyzed. These reports included rates of hypoglycemia (BG < 70 mg/dL) and hyperglycemia (BG >200 mg/dL in phase 1, BG > 180 mg/dL in phase 2). In the first phase of the study, implementation of the insulin order set was associated with decreased rates of hypoglycemia (1.92% vs 1.61%; p < 0.001) and increased rates of hyperglycemia (24.02% vs 27.27%; p < 0.001) from 2010 to 2011. In the second phase, addition of a titration algorithm was associated with decreased rates of hypoglycemia (2.57% vs 1.82%; p = 0.039) and increased rates of hyperglycemia (31.76% vs 41.33%; p < 0.001) from 2012 to 2013. A comprehensive computerized insulin order set and titration algorithm significantly decreased rates of hypoglycemia. This significant reduction in hypoglycemia was associated with increased rates of hyperglycemia. Hardwiring the algorithm into the electronic medical record may foster adoption.
Machine learning methods in chemoinformatics
Mitchell, John B O
2014-01-01
Machine learning algorithms are generally developed in computer science or adjacent disciplines and find their way into chemical modeling by a process of diffusion. Though particular machine learning methods are popular in chemoinformatics and quantitative structure–activity relationships (QSAR), many others exist in the technical literature. This discussion is methods-based and focused on some algorithms that chemoinformatics researchers frequently use. It makes no claim to be exhaustive. We concentrate on methods for supervised learning, predicting the unknown property values of a test set of instances, usually molecules, based on the known values for a training set. Particularly relevant approaches include Artificial Neural Networks, Random Forest, Support Vector Machine, k-Nearest Neighbors and naïve Bayes classifiers. WIREs Comput Mol Sci 2014, 4:468–481. How to cite this article: WIREs Comput Mol Sci 2014, 4:468–481. doi:10.1002/wcms.1183 PMID:25285160
Distributed Cognition (DCOG): Foundations for a Computational Associative Memory Model
2006-08-01
retrieved through context and attention. Instead of looking up an old exemplar in memory, we simulate it by activating the same (or almost the same) set of...association, regardless of current activation. We restrict our attention here to immediate links; although the algorithm can be generalized to include...recognition. A shift in attention may, by creating a new context, trigger the recognition of a concept. A detailed example of the use of attention in
Multi Robot Path Planning for Budgeted Active Perception with Self-Organising Maps
2016-10-04
Multi-Robot Path Planning for Budgeted Active Perception with Self-Organising Maps. Graeme Best, Jan Faigl and Robert Fitch. Abstract: We propose a...optimise paths for a multi-robot team that aims to maximally observe a set of nodes in the environment. The selected nodes are observed by visiting...regions, each node has an observation reward, and the robots are constrained by travel budgets. The SOM algorithm jointly selects and allocates nodes
Modeling of skin cancer dermatoscopy images
NASA Astrophysics Data System (ADS)
Iralieva, Malica B.; Myakinin, Oleg O.; Bratchenko, Ivan A.; Zakharov, Valery P.
2018-04-01
A cancer identified early is more likely to respond effectively to treatment and is less expensive to treat as well. Dermatoscopy is one of the general diagnostic techniques for early skin cancer detection; it allows in vivo evaluation of the colors and microstructures of skin lesions. Digital phantoms with known properties are required during the development of new instruments in order to compare a sample's features with data from the instrument. An algorithm for modeling skin cancer images is proposed in this paper. The steps of the algorithm include shape setting, texture generation, texture addition and normal skin background setting. A Gaussian represents the shape; texture generation based on a fractal noise algorithm then accounts for the spatial chromophore distributions, while the colormap applied to the values corresponds to the spectral properties. Finally, a normal skin image simulated by a mixed Monte Carlo method using a special online tool is added as the background. Varying the Asymmetry, Border, Color and Diameter settings is shown to correspond fully to the ABCD clinical recognition algorithm. The asymmetry is specified by setting different standard deviation values of the Gaussian in different parts of the image. The noise amplitude is increased to set the irregular-borders score. The standard deviation is changed to determine the size of the lesion. Colors are set by changing the colormap. An algorithm for simulating different structural elements is still required to match other recognition algorithms.
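A compact sketch of this phantom construction follows: a Gaussian shape, multi-octave (fractal-style) noise for the texture, and a placeholder background. The octave count, amplitudes, and flat background value are assumptions; the paper's background comes from a Monte Carlo skin simulation.

```python
# Synthetic lesion phantom sketch: Gaussian shape x fractal-style texture.
import numpy as np

def fractal_noise(shape, octaves=4, rng=None):
    """Sum of progressively finer random layers, a simple fractal-noise stand-in."""
    rng = rng or np.random.default_rng(0)
    noise = np.zeros(shape)
    for o in range(octaves):
        step = 2 ** (octaves - o)
        coarse = rng.random((shape[0] // step + 1, shape[1] // step + 1))
        layer = np.kron(coarse, np.ones((step, step)))[:shape[0], :shape[1]]
        noise += layer / (2 ** o)
    return noise / noise.max()

def lesion_phantom(size=256, sigma_x=40.0, sigma_y=25.0, noise_amp=0.3):
    """Asymmetry via different sigmas; border irregularity via noise amplitude."""
    y, x = np.mgrid[:size, :size] - size / 2
    shape = np.exp(-(x**2 / (2 * sigma_x**2) + y**2 / (2 * sigma_y**2)))
    texture = 1.0 + noise_amp * (fractal_noise((size, size)) - 0.5)
    lesion = shape * texture
    background = 0.8 * np.ones((size, size))   # placeholder for simulated skin
    return np.where(lesion > 0.1, lesion, background)
```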
Strong convergence of an extragradient-type algorithm for the multiple-sets split equality problem.
Zhao, Ying; Shi, Luoyi
2017-01-01
This paper introduces a new extragradient-type method to solve the multiple-sets split equality problem (MSSEP). Under some suitable conditions, the strong convergence of the algorithm is verified in infinite-dimensional Hilbert spaces. Moreover, several numerical results are given to show the effectiveness of our algorithm.
Image-processing algorithms for inspecting characteristics of hybrid rice seed
NASA Astrophysics Data System (ADS)
Cheng, Fang; Ying, Yibin
2004-03-01
Incompletely closed glumes, germ and disease are three characteristics of hybrid rice seed. Image-processing algorithms developed to detect these seed characteristics are presented in this paper. The rice seed used for this study involved five varieties: Jinyou402, Shanyou10, Zhongyou207, Jiayou and IIyou. The algorithms were implemented on a 5*600 image set, a 4*400 image set and another 5*600 image set, respectively. The image sets included black-background images, white-background images and images of both sides of the rice seed. Results show that the algorithm for inspecting seeds with incompletely closed glumes, based on the Radon transform, achieved an accuracy of 96% for normal seeds, 92% for seeds with fine fissures and 87% for seeds with unclosed glumes; the algorithm for inspecting seeds germinated on the panicle, based on PCA and an ANN, achieved an average accuracy of 98% for normal seeds and 88% for germinated seeds; and the algorithm for inspecting diseased seeds, based on color features, achieved an accuracy of 92% for normal and healthy seeds, 95% for spot-diseased seeds and 83% for severely diseased seeds.
Final Report: Sampling-Based Algorithms for Estimating Structure in Big Data.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Matulef, Kevin Michael
The purpose of this project was to develop sampling-based algorithms to discover hidden structure in massive data sets. Inferring structure in large data sets is an increasingly common task in many critical national security applications. These data sets come from myriad sources, such as network traffic, sensor data, and data generated by large-scale simulations. They are often so large that traditional data mining techniques are time consuming or even infeasible. To address this problem, we focus on a class of algorithms that do not compute an exact answer, but instead use sampling to compute an approximate answer using fewer resources. The particular class of algorithms that we focus on are streaming algorithms, so called because they are designed to handle high-throughput streams of data. Streaming algorithms have only a small amount of working storage - much less than the size of the full data stream - so they must necessarily use sampling to approximate the correct answer. We present two results: * A streaming algorithm called HyperHeadTail, which estimates the degree distribution of a graph (i.e., the distribution of the number of connections for each node in a network). The degree distribution is a fundamental graph property, but prior work on estimating the degree distribution in a streaming setting was impractical for many real-world applications. We improve upon prior work by developing an algorithm that can handle streams with repeated edges, and graph structures that evolve over time. * An algorithm for the task of maintaining a weighted subsample of items in a stream, when the items must be sampled according to their weight, and the weights are dynamically changing. To our knowledge, this is the first such algorithm designed for dynamically evolving weights. We expect it may be useful as a building block for other streaming algorithms on dynamic data sets.
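For context, the classical fixed-weight building block is weighted reservoir sampling (the Efraimidis-Spirakis A-Res scheme), sketched below. The report's contribution addresses dynamically changing weights, which this reference sketch does not handle; it is included only to illustrate the basic weighted-subsample mechanism.

```python
# Weighted reservoir sampling with static weights (A-Res), for reference only.
import heapq
import random

def weighted_reservoir(stream, k):
    """Keep k items, each retained with probability proportional to its weight.

    `stream` yields (item, weight) pairs with weight > 0.
    """
    heap = []  # min-heap of (key, seq, item); the smallest key is evicted first
    for seq, (item, weight) in enumerate(stream):
        key = random.random() ** (1.0 / weight)
        entry = (key, seq, item)
        if len(heap) < k:
            heapq.heappush(heap, entry)
        elif key > heap[0][0]:
            heapq.heapreplace(heap, entry)
    return [item for _, _, item in heap]
```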
Vector Quantization Algorithm Based on Associative Memories
NASA Astrophysics Data System (ADS)
Guzmán, Enrique; Pogrebnyak, Oleksiy; Yáñez, Cornelio; Manrique, Pablo
This paper presents a vector quantization algorithm for image compression based on extended associative memories. The proposed algorithm is divided into two stages. First, an associative network is generated by applying the learning phase of the extended associative memories (EAM) between a codebook generated by the LBG algorithm and a training set. This associative network is named the EAM-codebook and represents a new codebook which is used in the next stage; the EAM-codebook establishes a relation between the training set and the LBG codebook. Second, the vector quantization process is performed by means of the recalling stage of the EAM, using the EAM-codebook as the associative memory. This process generates the set of class indices to which each input vector belongs. With respect to the LBG algorithm, the main advantages offered by the proposed algorithm are high processing speed and low demand on resources (system memory); results on image compression and quality are presented.
Determining open cluster membership. A Bayesian framework for quantitative member classification
NASA Astrophysics Data System (ADS)
Stott, Jonathan J.
2018-01-01
Aims: My goal is to develop a quantitative algorithm for assessing open cluster membership probabilities. The algorithm is designed to work with single-epoch observations. In its simplest form, only one set of program images and one set of reference images are required. Methods: The algorithm is based on a two-stage joint astrometric and photometric assessment of cluster membership probabilities. The probabilities were computed within a Bayesian framework using any available prior information. Where possible, the algorithm emphasizes simplicity over mathematical sophistication. Results: The algorithm was implemented and tested against three observational fields using published survey data. M 67 and NGC 654 were selected as cluster examples while a third, cluster-free, field was used for the final test data set. The algorithm shows good quantitative agreement with the existing surveys and has a false-positive rate significantly lower than the astrometric or photometric methods used individually.
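A toy sketch of the underlying Bayesian combination is given below: per-star astrometric and photometric likelihoods under member and field models are merged with a prior via Bayes' rule. The Gaussian member model, the broad field model, and all numerical values are illustrative assumptions rather than survey-fitted quantities.

```python
# Two-term Bayesian membership probability (illustrative models and values).
from scipy import stats

def membership_probability(pm, colour_offset, prior_member=0.3,
                           pm_cluster=(0.0, 0.5), pm_field=(0.0, 5.0),
                           iso_sigma=0.05, iso_field_width=1.0):
    """Posterior P(member | proper motion, colour offset from the isochrone)."""
    # Astrometric term: proper motion relative to the cluster mean.
    l_ast_m = stats.norm.pdf(pm, *pm_cluster)
    l_ast_f = stats.norm.pdf(pm, *pm_field)
    # Photometric term: colour offset from the cluster isochrone.
    l_pho_m = stats.norm.pdf(colour_offset, 0.0, iso_sigma)
    l_pho_f = 1.0 / iso_field_width          # roughly uniform field colours
    num = prior_member * l_ast_m * l_pho_m
    den = num + (1.0 - prior_member) * l_ast_f * l_pho_f
    return num / den
```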
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bosler, Peter Andrew; Roesler, Erika Louise; Taylor, Mark A.
This article discusses the problem of identifying extreme climate events such as intense storms within large climate data sets. The basic storm detection algorithm is reviewed, which splits the problem into two parts: a spatial search followed by a temporal correlation problem. Two specific implementations of the spatial search algorithm are compared. The commonly used grid point search algorithm is reviewed, and a new algorithm called Stride Search is introduced. Stride Search is designed to work at all latitudes, while grid point searches may fail in polar regions. Results from the two algorithms are compared for the application of tropical cyclone detection, and shown to produce similar results for the same set of storm identification criteria. The time required for both algorithms to search the same data set is compared. Furthermore, Stride Search's ability to search extreme latitudes is demonstrated for the case of polar low detection.
Optic disc segmentation: level set methods and blood vessels inpainting
NASA Astrophysics Data System (ADS)
Almazroa, A.; Sun, Weiwei; Alodhayb, Sami; Raahemifar, Kaamran; Lakshminarayanan, Vasudevan
2017-03-01
Segmenting the optic disc (OD) is an important and essential step in creating a frame of reference for diagnosing optic nerve head (ONH) pathology such as glaucoma. Therefore, a reliable OD segmentation technique is necessary for automatic screening of ONH abnormalities. The main contribution of this paper is in presenting a novel OD segmentation algorithm based on applying a level set method on a localized OD image. To prevent the blood vessels from interfering with the level set process, an inpainting technique is applied. The algorithm is evaluated using a new retinal fundus image dataset called RIGA (Retinal Images for Glaucoma Analysis). In the case of low quality images, a double level set is applied in which the first level set is considered to be a localization for the OD. Five hundred and fifty images are used to test the algorithm accuracy as well as its agreement with manual markings by six ophthalmologists. The accuracy of the algorithm in marking the optic disc area and centroid is 83.9%, and the best agreement is observed between the results of the algorithm and manual markings in 379 images.
A Modified MinMax k-Means Algorithm Based on PSO.
Wang, Xiaoyan; Bai, Yanping
The MinMax k-means algorithm is widely used to tackle the effect of bad initialization by minimizing the maximum intra-cluster error. Two parameters, the exponent parameter and the memory parameter, are involved in its execution. Since different parameters lead to different clustering errors, it is crucial to choose appropriate parameters. In the original algorithm, a practical framework is given that extends MinMax k-means to automatically adapt the exponent parameter to the data set. It has been believed that if the maximum exponent parameter has been set, then the programme can reach the lowest intra-cluster errors. However, our experiments show that this is not always correct. In this paper, we modify the MinMax k-means algorithm with PSO to determine the parameter values that allow the algorithm to attain the lowest clustering errors. The proposed clustering method is tested on several popular data sets in different initial situations and is compared to the k-means algorithm and the original MinMax k-means algorithm. The experimental results indicate that our proposed algorithm can reach the lowest clustering errors automatically.
Exploring Representativeness and Informativeness for Active Learning.
Du, Bo; Wang, Zengmao; Zhang, Lefei; Zhang, Liangpei; Liu, Wei; Shen, Jialie; Tao, Dacheng
2017-01-01
How can we find a general way to choose the most suitable samples for training a classifier, even with very limited prior information? Active learning, which can be regarded as an iterative optimization procedure, plays a key role in constructing a refined training set to improve classification performance in a variety of applications, such as text analysis, image recognition, and social network modeling. Although combining representativeness and informativeness of samples has been proven promising for active sampling, state-of-the-art methods perform well only under certain data structures. Can we then find a way to fuse the two active sampling criteria without any assumption on the data? This paper proposes a general active learning framework that effectively fuses the two criteria. Inspired by a two-sample discrepancy problem, triple measures are elaborately designed to guarantee that the query samples not only possess the representativeness of the unlabeled data but also reveal the diversity of the labeled data. Any appropriate similarity measure can be employed to construct the triple measures. Meanwhile, an uncertainty measure is leveraged to generate the informativeness criterion, which can be carried out in different ways. Rooted in this framework, a practical active learning algorithm is proposed, which exploits a radial basis function together with the estimated probabilities to construct the triple measures, and a modified best-versus-second-best strategy to construct the uncertainty measure. Experimental results on benchmark datasets demonstrate that our algorithm consistently achieves superior performance over state-of-the-art active learning algorithms.
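The best-versus-second-best (margin) idea behind the uncertainty term can be sketched in a few lines: samples whose top two class probabilities are closest are the most uncertain and are queried first. The probability model and the "modified" part of the strategy are abstracted away here.

```python
# Best-versus-second-best uncertainty sketch for query selection.
import numpy as np

def bvsb_uncertainty(proba):
    """proba: (n_samples, n_classes) predicted probabilities.

    Returns an uncertainty score in [0, 1]; higher means more uncertain.
    """
    top2 = np.sort(proba, axis=1)[:, -2:]      # second-best and best
    margin = top2[:, 1] - top2[:, 0]
    return 1.0 - margin

def select_queries(proba, n_queries=10):
    """Pick the indices of the most uncertain unlabeled samples."""
    return np.argsort(-bvsb_uncertainty(proba))[:n_queries]
```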
Lai, Fu-Jou; Chang, Hong-Tsun; Huang, Yueh-Min; Wu, Wei-Sheng
2014-01-01
Eukaryotic transcriptional regulation is known to be highly connected through networks of cooperative transcription factors (TFs). Measuring the cooperativity of TFs is helpful for understanding the biological relevance of these TFs in regulating genes. Recent advances in computational techniques have led to various predictions of cooperative TF pairs in yeast. As each algorithm integrated different data resources and was developed based on a different rationale, each possessed its own merits and claimed to outperform the others. However, such claims were prone to subjectivity because each algorithm was compared with only a few other algorithms, using a small set of performance indices. This motivated us to propose a series of indices to objectively evaluate the prediction performance of existing algorithms, and based on the proposed performance indices, we conducted a comprehensive performance evaluation. We collected 14 sets of predicted cooperative TF pairs (PCTFPs) in yeast from 14 existing algorithms in the literature. Using the eight performance indices we adopted or proposed, the cooperativity of each PCTFP was measured, and for each performance index a ranking score according to the mean cooperativity of the set was given to each set of PCTFPs under evaluation. The ranking scores of a set of PCTFPs were seen to vary with different performance indices, implying that an algorithm used in predicting cooperative TF pairs has strengths in some respects but may have weaknesses in others. We finally made a comprehensive ranking of these 14 sets. The results showed that Wang J's study obtained the best performance evaluation on the prediction of cooperative TF pairs in yeast. In this study, we adopted or proposed eight performance indices to make a comprehensive performance evaluation of the prediction results of 14 existing cooperative TF identification algorithms. Most importantly, these indices can easily be applied to measure the performance of new algorithms developed in the future, thus expediting progress in this research field.
MacRae, J; Darlow, B; McBain, L; Jones, O; Stubbe, M; Turner, N; Dowell, A
2015-08-21
To develop a natural language processing software inference algorithm to classify the content of primary care consultations using electronic health record Big Data and subsequently test the algorithm's ability to estimate the prevalence and burden of childhood respiratory illness in primary care. Algorithm development and validation study. To classify consultations, the algorithm is designed to interrogate clinical narrative entered as free text, diagnostic (Read) codes created and medications prescribed on the day of the consultation. Thirty-six consenting primary care practices from a mixed urban and semirural region of New Zealand. Three independent sets of 1200 child consultation records were randomly extracted from a data set of all general practitioner consultations in participating practices between 1 January 2008-31 December 2013 for children under 18 years of age (n=754,242). Each consultation record within these sets was independently classified by two expert clinicians as respiratory or non-respiratory, and subclassified according to respiratory diagnostic categories to create three 'gold standard' sets of classified records. These three gold standard record sets were used to train, test and validate the algorithm. Sensitivity, specificity, positive predictive value and F-measure were calculated to illustrate the algorithm's ability to replicate judgements of expert clinicians within the 1200 record gold standard validation set. The algorithm was able to identify respiratory consultations in the 1200 record validation set with a sensitivity of 0.72 (95% CI 0.67 to 0.78) and a specificity of 0.95 (95% CI 0.93 to 0.98). The positive predictive value of algorithm respiratory classification was 0.93 (95% CI 0.89 to 0.97). The positive predictive value of the algorithm classifying consultations as being related to specific respiratory diagnostic categories ranged from 0.68 (95% CI 0.40 to 1.00; other respiratory conditions) to 0.91 (95% CI 0.79 to 1.00; throat infections). A software inference algorithm that uses primary care Big Data can accurately classify the content of clinical consultations. This algorithm will enable accurate estimation of the prevalence of childhood respiratory illness in primary care and resultant service utilisation. The methodology can also be applied to other areas of clinical care. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.
Delaunay based algorithm for finding polygonal voids in planar point sets
NASA Astrophysics Data System (ADS)
Alonso, R.; Ojeda, J.; Hitschfeld, N.; Hervías, C.; Campusano, L. E.
2018-01-01
This paper presents a new algorithm to find under-dense regions, called voids, inside a 2D point set. The algorithm starts from terminal-edges (local longest edges) in a Delaunay triangulation and builds the largest possible low-density terminal-edge regions around them. A terminal-edge region can represent either an entire void or part of a void (a subvoid). Using artificial data sets, the case of voids that are detected as several adjacent subvoids is analyzed, and four subvoid-joining criteria are proposed and evaluated. Since this work is motivated by the search for a more robust, effective and efficient algorithm to find 3D cosmological voids, the evaluation of the joining criteria considers this context. However, the design of the algorithm permits its adaptation to the requirements of any similar application.
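A small sketch of the seeding step, locating terminal-edges (edges that are the longest edge of every triangle sharing them) in a 2D Delaunay triangulation, is given below; the region-growing and subvoid-joining stages described above are not shown.

```python
# Terminal-edge detection in a 2D Delaunay triangulation (sketch).
import numpy as np
from scipy.spatial import Delaunay
from collections import defaultdict

def terminal_edges(points):
    points = np.asarray(points, dtype=float)
    tri = Delaunay(points)
    longest_of = defaultdict(list)   # edge -> [is_longest_in_triangle, ...]
    for simplex in tri.simplices:
        edges = [(simplex[i], simplex[j]) for i, j in ((0, 1), (1, 2), (0, 2))]
        lengths = [np.linalg.norm(points[a] - points[b]) for a, b in edges]
        longest = int(np.argmax(lengths))
        for k, (a, b) in enumerate(edges):
            longest_of[tuple(sorted((a, b)))].append(k == longest)
    # A terminal-edge is the longest edge of all (one or two) incident triangles.
    return [edge for edge, flags in longest_of.items() if all(flags)]
```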
Rough sets and Laplacian score based cost-sensitive feature selection
Yu, Shenglong; Zhao, Hong
2018-01-01
Cost-sensitive feature selection learning is an important preprocessing step in machine learning and data mining. Recently, most existing cost-sensitive feature selection algorithms are heuristic algorithms, which evaluate the importance of each feature individually and select features one by one. Obviously, these algorithms do not consider the relationship among features. In this paper, we propose a new algorithm for minimal cost feature selection called the rough sets and Laplacian score based cost-sensitive feature selection. The importance of each feature is evaluated by both rough sets and Laplacian score. Compared with heuristic algorithms, the proposed algorithm takes into consideration the relationship among features with locality preservation of Laplacian score. We select a feature subset with maximal feature importance and minimal cost when cost is undertaken in parallel, where the cost is given by three different distributions to simulate different applications. Different from existing cost-sensitive feature selection algorithms, our algorithm simultaneously selects out a predetermined number of “good” features. Extensive experimental results show that the approach is efficient and able to effectively obtain the minimum cost subset. In addition, the results of our method are more promising than the results of other cost-sensitive feature selection algorithms. PMID:29912884
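The Laplacian score half of the feature-importance measure can be sketched as follows: features that best preserve the local neighbourhood structure of the data (small score) are preferred. The rough-set term and the cost-aware selection loop of the method are not reproduced, and the k-NN affinity graph is an assumption.

```python
# Laplacian score per feature (lower is better); illustrative sketch.
import numpy as np
from sklearn.neighbors import kneighbors_graph

def laplacian_scores(X, n_neighbors=5, t=1.0):
    """Return one score per feature (columns of X)."""
    W = kneighbors_graph(X, n_neighbors, mode="distance").toarray()
    W = np.where(W > 0, np.exp(-W ** 2 / t), 0.0)   # heat-kernel affinities
    W = np.maximum(W, W.T)                          # symmetrise the graph
    D = np.diag(W.sum(axis=1))
    L = D - W
    d = np.diag(D)
    scores = []
    for r in range(X.shape[1]):
        f = X[:, r].astype(float)
        f = f - (f @ d) / d.sum()                   # remove the weighted mean
        scores.append((f @ L @ f) / (f @ (d * f) + 1e-12))
    return np.array(scores)
```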
Twagirumukiza, M; Van Bortel, L M
2011-01-01
Hypertension is emerging in many developing nations as a leading cause of cardiovascular mortality, morbidity and disability in adults. In sub-Saharan African (SSA) countries it has particular features: it occurs in young and active adults, results in severe complications dominated by heart failure, and takes place in limited-resource settings in which an individual's access to treatment (affordability) is very limited. Within this context of constrained economic conditions, the greatest gains for SSA in controlling the hypertension epidemic lie in its prevention. Attempts should be made to detect hypertensive patients early, before irreversible organ damage becomes apparent, and to provide them with the best possible and affordable non-pharmacological and pharmacological treatment. Efforts should therefore be made for detection and early management at the community level, and in this context a standardized management algorithm can help in the rational use of available resources. Although many international and regional guidelines have been published, they cannot be applied in SSA settings because national economies and patients' ability to pay do not allow access to the advocated treatments. In addition, none of them suggests a clear management algorithm for limited-resource settings at the community level. In line with available data and an analysis of existing guidelines, the present work suggests a practical algorithm for the management of hypertension at the community level that takes treatment affordability into account.
Using Ensemble Decisions and Active Selection to Improve Low-Cost Labeling for Multi-View Data
NASA Technical Reports Server (NTRS)
Rebbapragada, Umaa; Wagstaff, Kiri L.
2011-01-01
This paper seeks to improve low-cost labeling in terms of training set reliability (the fraction of correctly labeled training items) and test set performance for multi-view learning methods. Co-training is a popular multiview learning method that combines high-confidence example selection with low-cost (self) labeling. However, co-training with certain base learning algorithms significantly reduces training set reliability, causing an associated drop in prediction accuracy. We propose the use of ensemble labeling to improve reliability in such cases. We also discuss and show promising results on combining low-cost ensemble labeling with active (low-confidence) example selection. We unify these example selection and labeling strategies under collaborative learning, a family of techniques for multi-view learning that we are developing for distributed, sensor-network environments.
Esteban, Santiago; Rodríguez Tablado, Manuel; Peper, Francisco; Mahumud, Yamila S; Ricci, Ricardo I; Kopitowski, Karin; Terrasa, Sergio
2017-01-01
Precision medicine requires extremely large samples. Electronic health records (EHR) are thought to be a cost-effective source of data for that purpose. Phenotyping algorithms help reduce classification errors, making EHR a more reliable source of information for research. Four algorithm development strategies for classifying patients according to their diabetes status (diabetic; non-diabetic; inconclusive) were tested: a codes-only algorithm; a Boolean algorithm; four statistical learning algorithms; and six stacked generalization meta-learners. The best performing algorithms within each strategy were tested on the validation set. The stacked generalization algorithm yielded the highest Kappa coefficient on the validation set (0.95, 95% CI 0.91 to 0.98). The implementation of these algorithms allows data from thousands of patients to be exploited accurately, greatly reducing the cost of constructing retrospective cohorts for research.
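A minimal sketch of a stacked-generalization phenotyping classifier is shown below: base learners fit on EHR-derived features feed their out-of-fold predictions to a meta-learner. The specific base models, the feature encoding, and the label encoding are illustrative assumptions, not the study's configuration.

```python
# Stacked-generalization phenotyping sketch (illustrative base/meta models).
from sklearn.ensemble import StackingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

def build_phenotyping_model():
    base_learners = [
        ("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
        ("svm", SVC(probability=True, random_state=0)),
        ("lr", LogisticRegression(max_iter=1000)),
    ]
    # The meta-learner combines the base learners' out-of-fold predictions.
    return StackingClassifier(estimators=base_learners,
                              final_estimator=LogisticRegression(max_iter=1000),
                              stack_method="predict_proba", cv=5)

# Usage (X: numeric feature matrix derived from the EHR; y: labels
# {diabetic, non-diabetic, inconclusive} encoded as integers):
#   model = build_phenotyping_model()
#   model.fit(X_train, y_train)
#   y_pred = model.predict(X_validation)
```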
An Evaluation of an Algorithm for Linear Inequalities and Its Applications
NASA Technical Reports Server (NTRS)
Jurgensen, J.
1973-01-01
An algorithm is presented for obtaining a solution alpha to a set of inequalities A alpha > 0, where A is an N x m matrix and alpha is an m-vector. If the set of inequalities is consistent, then the algorithm is guaranteed to arrive at a solution in a finite number of steps. Also, if a negative vector is obtained during the iteration, then the initial set of inequalities is inconsistent and the iteration is terminated.
HECLIB. Volume 2: HECDSS Subroutines Programmer’s Manual
1991-05-01
algorithm and hierarchical design for database accesses. This algorithm provides quick access to data sets and an efficient means of adding new data set...Description of How DSS Works DSS version 6 utilizes a modified hash algorithm based upon the pathname to store and retrieve data. This structure allows...balancing disk space and record access times. A variation in this algorithm is for "stable" files. In a stable file, a hash table is not utilized
Finite Set Control Transcription for Optimal Control Applications
2009-05-01
[List of figures: 1.1 The Parameters of x; 2.1 Categories of Optimization Algorithms] ...Programming (NLP) algorithm, such as SNOPT (hereafter, called the optimizer). The Finite Set Control Transcription (FSCT) method is essentially a...artificial neural networks, genetic algorithms, or combinations thereof for analysis. Indeed, an actual biological neural network is an example of
GSNFS: Gene subnetwork biomarker identification of lung cancer expression data.
Doungpan, Narumol; Engchuan, Worrawat; Chan, Jonathan H; Meechai, Asawin
2016-12-05
Gene expression has been used to identify disease gene biomarkers, but there are ongoing challenges. Single gene or gene-set biomarkers are inadequate to provide sufficient understanding of complex disease mechanisms and the relationship among those genes. Network-based methods have thus been considered for inferring the interaction within a group of genes to further study the disease mechanism. Recently, the Gene-Network-based Feature Set (GNFS), which is capable of handling case-control and multiclass expression for gene biomarker identification, has been proposed, partly taking network topology into account. However, its performance relies on a greedy search for building subnetworks and thus requires further improvement. In this work, we establish a new approach named Gene Sub-Network-based Feature Selection (GSNFS) by implementing the GNFS framework with two proposed searching and scoring algorithms, namely gene-set-based (GS) search and parent-node-based (PN) search, to identify subnetworks. An additional dataset is used to validate the results. The two proposed searching algorithms of the GSNFS method for subnetwork expansion are concerned with the degree of connectivity and the scoring scheme for building subnetworks and their topology. For each iteration of expansion, the neighbour genes of the current subnetwork whose expression data improve the overall subnetwork score are recruited. While the GS search calculates the subnetwork score using the activity score of the current subnetwork and the gene expression values of its neighbours, the PN search uses the expression value of the corresponding parent of each neighbour gene. Four lung cancer expression datasets were used for subnetwork identification. In addition, the use of pathway data and protein-protein interactions as network data, in order to consider the interactions among significant genes, is discussed. Classification was performed to compare the performance of the identified gene subnetworks with three subnetwork identification algorithms. The two searching algorithms resulted in better classification and gene/gene-set agreement compared to the original greedy search of the GNFS method. The lung cancer subnetworks identified using the proposed searching algorithms resulted in an improvement in cross-dataset validation and an increase in the consistency of findings between two independent datasets. A homogeneity measurement of the datasets was conducted to assess dataset compatibility in cross-dataset validation. The lung cancer dataset with higher homogeneity showed a better result when using the GS search, while the dataset with low homogeneity showed a better result when using the PN search. The 10-fold cross-dataset validation on the independent lung cancer datasets showed higher classification performance of the proposed algorithms when compared with the greedy search in the original GNFS method. The proposed searching algorithms provide a higher number of genes in the subnetwork expansion step than the greedy algorithm. As a result, the performance of the subnetworks identified by the GSNFS method was improved in terms of classification performance and gene/gene-set level agreement, depending on the homogeneity of the datasets used in the analysis. Some common genes obtained from the four datasets using the different searching algorithms are genes known to play a role in lung cancer.
The improvement of classification performance and the gene/gene-set level agreement, and the biological relevance indicated the effectiveness of the GSNFS method for gene subnetwork identification using expression data.
SETTER: web server for RNA structure comparison
Čech, Petr; Svozil, Daniel; Hoksza, David
2012-01-01
The recent discoveries of regulatory non-coding RNAs changed our view of RNA as a simple information transfer molecule. Understanding the architecture and function of active RNA molecules requires methods for comparing and analyzing their 3D structures. While structural alignment of short RNAs is achievable in a reasonable amount of time, large structures represent a much bigger challenge. Here, we present the SETTER web server for RNA structure pairwise comparison utilizing the SETTER (SEcondary sTructure-based TERtiary Structure Similarity Algorithm) algorithm. The SETTER method divides an RNA structure into a set of non-overlapping structural elements called generalized secondary structure units (GSSUs). The SETTER algorithm scales as O(n²) with the size of a GSSU and as O(n) with the number of GSSUs in the structure. This scaling gives SETTER its high speed, as the average size of the GSSU remains constant irrespective of the size of the structure. However, the favorable speed of the algorithm does not compromise its accuracy. The SETTER web server together with the stand-alone implementation of the SETTER algorithm are freely accessible at http://siret.cz/setter. PMID:22693209
Efficient irregular wavefront propagation algorithms on Intel® Xeon Phi™
Gomes, Jeremias M.; Teodoro, George; de Melo, Alba; Kong, Jun; Kurc, Tahsin; Saltz, Joel H.
2016-01-01
We investigate the execution of the Irregular Wavefront Propagation Pattern (IWPP), a fundamental computing structure used in several image analysis operations, on the Intel® Xeon Phi™ co-processor. An efficient implementation of IWPP on the Xeon Phi is a challenging problem because of IWPP’s irregularity and the use of atomic instructions in the original IWPP algorithm to resolve race conditions. On the Xeon Phi, the use of SIMD and vectorization instructions is critical to attain high performance. However, SIMD atomic instructions are not supported. Therefore, we propose a new IWPP algorithm that can take advantage of the supported SIMD instruction set. We also evaluate an alternate storage container (priority queue) to track active elements in the wavefront in an effort to improve the parallel algorithm efficiency. The new IWPP algorithm is evaluated with Morphological Reconstruction and Imfill operations as use cases. Our results show performance improvements of up to 5.63× on top of the original IWPP due to vectorization. Moreover, the new IWPP achieves speedups of 45.7× and 1.62×, respectively, as compared to efficient CPU and GPU implementations. PMID:27298591
Neuroprosthetic Decoder Training as Imitation Learning
Merel, Josh; Paninski, Liam; Cunningham, John P.
2016-01-01
Neuroprosthetic brain-computer interfaces function via an algorithm which decodes neural activity of the user into movements of an end effector, such as a cursor or robotic arm. In practice, the decoder is often learned by updating its parameters while the user performs a task. When the user’s intention is not directly observable, recent methods have demonstrated value in training the decoder against a surrogate for the user’s intended movement. Here we show that training a decoder in this way is a novel variant of an imitation learning problem, where an oracle or expert is employed for supervised training in lieu of direct observations, which are not available. Specifically, we describe how a generic imitation learning meta-algorithm, dataset aggregation (DAgger), can be adapted to train a generic brain-computer interface. By deriving existing learning algorithms for brain-computer interfaces in this framework, we provide a novel analysis of regret (an important metric of learning efficacy) for brain-computer interfaces. This analysis allows us to characterize the space of algorithmic variants and bounds on their regret rates. Existing approaches for decoder learning have been performed in the cursor control setting, but the available design principles for these decoders are such that it has been impossible to scale them to naturalistic settings. Leveraging our findings, we then offer an algorithm that combines imitation learning with optimal control, which should allow for training of arbitrary effectors for which optimal control can generate goal-oriented control. We demonstrate this novel and general BCI algorithm with simulated neuroprosthetic control of a 26 degree-of-freedom model of an arm, a sophisticated and realistic end effector. PMID:27191387
Heuristics for Multiobjective Optimization of Two-Sided Assembly Line Systems
Jawahar, N.; Ponnambalam, S. G.; Sivakumar, K.; Thangadurai, V.
2014-01-01
Products such as cars, trucks, and heavy machinery are assembled on two-sided assembly lines. Assembly line balancing has a significant impact on the performance and productivity of flow-line manufacturing systems and has been an active research area for several decades. This paper addresses the line balancing problem of a two-sided assembly line in which tasks are to be assigned to the left (L) side, the right (R) side, or either side (denoted E). Two objectives, the minimum number of workstations and the minimum unbalance time among workstations, are considered for balancing the assembly line. There are two approaches to solving a multiobjective optimization problem: the first combines all the objectives into a single composite function or moves all but one objective to the constraint set; the second determines the Pareto optimal solution set. This paper proposes two heuristics to evolve an optimal Pareto front for the TALBP under consideration: an Enumerative Heuristic Algorithm (EHA) to handle problems of small and medium size, and a Simulated Annealing Algorithm (SAA) for large-sized problems. The proposed approaches are illustrated with example problems and their performances are compared on a set of test problems. PMID:24790568
Isaacson, M D; Srinivasan, S; Lloyd, L L
2010-01-01
MathSpeak is a set of rules for the non-ambiguous speaking of mathematical expressions. These rules have been incorporated into a computerised module that translates printed mathematics into the non-ambiguous MathSpeak form for synthetic speech rendering. Differences between individual utterances produced with the translator module are difficult to discern because of insufficient pausing between utterances; hence, the purpose of this study was to develop an algorithm for improving the synthetic speech rendering of MathSpeak. To improve synthetic speech renderings, an algorithm for inserting pauses was developed based upon recordings of middle and high school math teachers speaking mathematical expressions. Efficacy testing of this algorithm was conducted with college students without disabilities and with high school and college students with visual impairments. Parameters measured included reception accuracy, short-term memory retention, MathSpeak processing capacity and various rankings concerning the quality of the synthetic speech renderings. All parameters measured showed statistically significant improvements when the algorithm was used. The algorithm improves the quality and information processing capacity of synthetic speech renderings of MathSpeak. This increases the capacity of individuals with print disabilities to perform mathematical activities and to successfully fulfill science, technology, engineering and mathematics academic and career objectives.
Synchronization Algorithms for Co-Simulation of Power Grid and Communication Networks
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ciraci, Selim; Daily, Jeffrey A.; Agarwal, Khushbu
2014-09-11
The ongoing modernization of power grids consists of integrating them with communication networks in order to achieve robust and resilient control of grid operations. To understand the operation of the new smart grid, one approach is to use simulation software. Unfortunately, current power grid simulators at best utilize inadequate approximations to simulate communication networks, if at all. Cooperative simulation of specialized power grid and communication network simulators promises to more accurately reproduce the interactions of real smart grid deployments. However, co-simulation is a challenging problem. A co-simulation must manage the exchange of information, including the synchronization of simulator clocks, between all simulators while maintaining adequate computational performance. This paper describes two new conservative algorithms for reducing the overhead of time synchronization, namely Active Set Conservative and Reactive Conservative. We provide a detailed analysis of their performance characteristics with respect to the current state of the art including both conservative and optimistic synchronization algorithms. In addition, we provide guidelines for selecting the appropriate synchronization algorithm based on the requirements of the co-simulation. The newly proposed algorithms are shown to achieve as much as 14% and 63% improvement, respectively, over the existing conservative algorithm.
Sampling solution traces for the problem of sorting permutations by signed reversals
2012-01-01
Background Traditional algorithms to solve the problem of sorting by signed reversals output just one optimal solution while the space of all optimal solutions can be huge. A so-called trace represents a group of solutions which share the same set of reversals that must be applied to sort the original permutation following a partial ordering. By using traces, we can therefore represent the set of optimal solutions in a more compact way. Algorithms for enumerating the complete set of traces of solutions were developed. However, due to their exponential complexity, their practical use is limited to small permutations. A partial enumeration of traces is a sampling of the complete set of traces and can be an alternative for the study of distinct evolutionary scenarios of large permutations. Ideally, the sampling should be done uniformly from the space of all optimal solutions. This is, however, conjectured to be ♯P-complete. Results We propose and evaluate three algorithms for producing a sampling of the complete set of traces that instead can be shown in practice to preserve some of the characteristics of the space of all solutions. The first algorithm (RA) performs the construction of traces through a random selection of reversals on the list of optimal 1-sequences. The second algorithm (DFALT) consists of a slight modification of an algorithm that performs the complete enumeration of traces. Finally, the third algorithm (SWA) is based on a sliding window strategy to improve the enumeration of traces. All proposed algorithms were able to enumerate traces for permutations with up to 200 elements. Conclusions We analysed the distribution of the enumerated traces with respect to their height and average reversal length. Various works indicate that the reversal length can be an important aspect in genome rearrangements. The algorithms RA and SWA show a tendency to lose traces with high average reversal length. Such traces are however rare, and qualitatively our results show that, for testable-sized permutations, the algorithms DFALT and SWA produce distributions which approximate the reversal length distributions observed with a complete enumeration of the set of traces. PMID:22704580
Digital tripwire: a small automated human detection system
NASA Astrophysics Data System (ADS)
Fischer, Amber D.; Redd, Emmett; Younger, A. Steven
2009-05-01
A low-cost, lightweight, easily deployable imaging sensor that can dependably discriminate threats from other activities within its field of view and, only then, alert the distant duty officer by transmitting a visual confirmation of the threat would be a valuable asset to modern defense. At present, current solutions suffer from a multitude of deficiencies in size, cost, and power endurance, but most notably an inability to assess an image and conclude that it contains a threat. The human attention span cannot maintain critical surveillance over banks of displays constantly conveying such images from the field. DigitalTripwire is a small, self-contained, automated human-detection system capable of running for 1-5 days on two AA batteries. To achieve such long endurance, the DigitalTripwire system utilizes an FPGA designed with sleep functionality. The system uses robust vision algorithms, such as an innovative, partially unsupervised background-modeling algorithm, which employ several data reduction strategies to operate in real time and achieve high detection rates. When it detects human activity, either mounted or dismounted, it sends an alert including images to notify the command center. In this paper, we describe the hardware and software design of the DigitalTripwire system. In addition, we provide detection and false alarm rates across several challenging data sets, demonstrating the performance of the vision algorithms in autonomously analyzing the video stream and classifying moving objects into four primary categories: dismounted human, vehicle, non-human, or unknown.
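The paper's background-modeling algorithm is not described in enough detail here to reproduce; as a rough illustration of the general approach, a running-average background model with simple thresholding might look like the sketch below (all parameter values and the synthetic frames are assumptions).

```python
# Generic running-average background model with foreground thresholding
# (illustrative only; not the DigitalTripwire background-modeling algorithm).
import numpy as np

def update_background(background, frame, alpha=0.05):
    """Exponentially weighted running average of grayscale frames."""
    return (1.0 - alpha) * background + alpha * frame

def foreground_mask(background, frame, threshold=25.0):
    """Pixels that deviate strongly from the background model."""
    return np.abs(frame.astype(np.float32) - background) > threshold

rng = np.random.default_rng(0)
background = rng.uniform(0, 255, size=(120, 160)).astype(np.float32)
frame = background.copy()
frame[40:80, 60:100] += 80.0              # a synthetic "intruder" region
mask = foreground_mask(background, frame)
print(mask.sum(), "foreground pixels")     # roughly 40*40 = 1600
background = update_background(background, frame)
```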
Expanded Processing Techniques for EMI Systems
2012-07-01
...possible to perform better target detection using physics-based algorithms and the entire data set, rather than simulating a simpler data set and mapping... [Figure 4.25: Plots of simulated MetalMapper data for two oblate spheroidal targets]
Jeonghee Kim; Parnell, Claire; Wichmann, Thomas; DeWeerth, Stephen P
2016-08-01
Assessments of tremor characteristics by movement disorder physicians are usually done at single time points in clinic settings, so that the description of the tremor does not take into account the dependence of the tremor on specific behavioral situations. Moreover, treatment-induced changes in tremor or behavior cannot be quantitatively tracked for extended periods of time. We developed a wearable tremor measurement system with tremor and activity recognition algorithms for long-term upper limb behavior tracking, to characterize tremor and treatment effects in patients' daily lives. In this pilot study, we collected arm movement sensor data from three healthy participants using a wrist device that included a 3-axis accelerometer and a 3-axis gyroscope, and classified tremor and activities within scenario tasks that resembled real-life situations. Our results show that the system was able to classify tremor and activities with 89.71% and 74.48% accuracy, respectively, during the scenario tasks. From these results, we expect to extend our tremor and activity measurement to longer time periods.
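The features used by the recognition algorithms are not specified here; one plausible ingredient is the relative band power of the wrist signal in a tremor band. The sketch below assumes a roughly 4-8 Hz band and uses SciPy's Welch estimator; both choices are illustrative assumptions rather than the paper's method.

```python
# Illustrative feature: relative band power of a single-axis wrist
# angular-velocity signal in an assumed 4-8 Hz tremor band.
import numpy as np
from scipy.signal import welch

def tremor_band_ratio(signal, fs, band=(4.0, 8.0)):
    freqs, psd = welch(signal, fs=fs, nperseg=min(len(signal), 256))
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    return psd[in_band].sum() / psd.sum()

fs = 100.0
t = np.arange(0, 10, 1 / fs)
gyro_x = np.sin(2 * np.pi * 6 * t) + 0.1 * np.random.default_rng(0).standard_normal(len(t))
print(round(tremor_band_ratio(gyro_x, fs), 2))  # close to 1.0 for a 6 Hz oscillation
```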
Yu, Qiang; Wei, Dingbang; Huo, Hongwei
2018-06-18
Given a set of t n-length DNA sequences, q satisfying 0 < q ≤ 1, and l and d satisfying 0 ≤ d < l < n, the quorum planted motif search (qPMS) finds l-length strings that occur in at least qt input sequences with up to d mismatches and is mainly used to locate transcription factor binding sites in DNA sequences. Existing qPMS algorithms have been able to efficiently process small standard datasets (e.g., t = 20 and n = 600), but they are too time consuming to process large DNA datasets, such as ChIP-seq datasets that contain thousands of sequences or more. We analyze the effects of t and q on the time performance of qPMS algorithms and find that a large t or a small q causes a longer computation time. Based on this information, we improve the time performance of existing qPMS algorithms by selecting a sample sequence set D' with a small t and a large q from the large input dataset D and then executing qPMS algorithms on D'. A sample sequence selection algorithm named SamSelect is proposed. The experimental results on both simulated and real data show (1) that SamSelect can select D' efficiently and (2) that the qPMS algorithms executed on D' can find implanted or real motifs in a significantly shorter time than when executed on D. We improve the ability of existing qPMS algorithms to process large DNA datasets from the perspective of selecting high-quality sample sequence sets so that the qPMS algorithms can find motifs in a short time in the selected sample sequence set D', rather than take an unfeasibly long time to search the original sequence set D. Our motif discovery method is an approximate algorithm.
Correlated Noise: How it Breaks NMF, and What to Do About It.
Plis, Sergey M; Potluru, Vamsi K; Lane, Terran; Calhoun, Vince D
2011-01-12
Non-negative matrix factorization (NMF) is a problem of decomposing multivariate data into a set of features and their corresponding activations. When applied to experimental data, NMF has to cope with noise, which is often highly correlated. We show that correlated noise can break the Donoho and Stodden separability conditions of a dataset and a regular NMF algorithm will fail to decompose it, even when given the freedom to represent the noise as a separate feature. To cope with this issue, we present an algorithm for NMF with a generalized least squares objective function (glsNMF) and derive multiplicative updates for the method, together with a proof of their convergence. The new algorithm successfully recovers the true representation from the noisy data. Robust performance can make glsNMF a valuable tool for analyzing empirical data.
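The glsNMF updates themselves are derived in the paper; to illustrate the kind of multiplicative update being generalized, here is a sketch of the standard Euclidean (Lee-Seung) NMF updates, which glsNMF extends with a generalized least squares weighting. The data and rank below are synthetic.

```python
# Standard Euclidean NMF multiplicative updates (Lee & Seung); glsNMF replaces
# the unweighted least-squares objective with a generalized least-squares one,
# but the multiplicative form of the updates is analogous.
import numpy as np

def nmf(X, rank, n_iter=500, eps=1e-9, seed=0):
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, rank))
    H = rng.random((rank, m))
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

rng = np.random.default_rng(1)
X = rng.random((20, 4)) @ rng.random((4, 30))   # nonnegative, rank-4 data
W, H = nmf(X, rank=4)
print(round(np.linalg.norm(X - W @ H) / np.linalg.norm(X), 4))  # small residual
```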
Combined rule extraction and feature elimination in supervised classification.
Liu, Sheng; Patel, Ronak Y; Daga, Pankaj R; Liu, Haining; Fu, Gang; Doerksen, Robert J; Chen, Yixin; Wilkins, Dawn E
2012-09-01
There are a vast number of biology-related research problems involving a combination of multiple sources of data to achieve a better understanding of the underlying problems. It is important to select and interpret the most important information from these sources. It is therefore beneficial to have an algorithm that can simultaneously extract rules and select features for better interpretation of the predictive model. We propose an efficient algorithm, Combined Rule Extraction and Feature Elimination (CRF), based on 1-norm regularized random forests. CRF simultaneously extracts a small number of rules generated by random forests and selects important features. We applied CRF to several drug activity prediction and microarray data sets. CRF is capable of producing performance comparable with state-of-the-art prediction algorithms using a small number of decision rules. Some of the decision rules are biologically significant.
Multimodal Hierarchical Dirichlet Process-Based Active Perception by a Robot
Taniguchi, Tadahiro; Yoshino, Ryo; Takano, Toshiaki
2018-01-01
In this paper, we propose an active perception method for recognizing object categories based on the multimodal hierarchical Dirichlet process (MHDP). The MHDP enables a robot to form object categories using multimodal information, e.g., visual, auditory, and haptic information, which can be observed by performing actions on an object. However, performing many actions on a target object requires a long time. In a real-time scenario, i.e., when the time is limited, the robot has to determine the set of actions that is most effective for recognizing a target object. We propose an active perception for MHDP method that uses the information gain (IG) maximization criterion and lazy greedy algorithm. We show that the IG maximization criterion is optimal in the sense that the criterion is equivalent to a minimization of the expected Kullback–Leibler divergence between a final recognition state and the recognition state after the next set of actions. However, a straightforward calculation of IG is practically impossible. Therefore, we derive a Monte Carlo approximation method for IG by making use of a property of the MHDP. We also show that the IG has submodular and non-decreasing properties as a set function because of the structure of the graphical model of the MHDP. Therefore, the IG maximization problem is reduced to a submodular maximization problem. This means that greedy and lazy greedy algorithms are effective and have a theoretical justification for their performance. We conducted an experiment using an upper-torso humanoid robot and a second one using synthetic data. The experimental results show that the method enables the robot to select a set of actions that allow it to recognize target objects quickly and accurately. The numerical experiment using the synthetic data shows that the proposed method can work appropriately even when the number of actions is large and a set of target objects involves objects categorized into multiple classes. The results support our theoretical outcomes. PMID:29872389
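The lazy greedy selection referred to above can be sketched generically for any monotone submodular set function; the MHDP-specific Monte Carlo information-gain estimate is not reproduced, and the toy coverage function below is only for illustration.

```python
# Generic lazy greedy selection for a monotone submodular set function
# (illustrative; the information gain used with the MHDP is approximated
# by Monte Carlo in the paper and is not reproduced here).
import heapq

def lazy_greedy(candidates, gain, budget):
    """gain(selected, a) returns the marginal gain of adding action a."""
    selected = []
    # Max-heap of (negative stale gain, action); initial gains are optimistic.
    heap = [(-gain([], a), a) for a in candidates]
    heapq.heapify(heap)
    while heap and len(selected) < budget:
        neg_stale, a = heapq.heappop(heap)
        fresh = gain(selected, a)
        if heap and fresh < -heap[0][0]:
            heapq.heappush(heap, (-fresh, a))   # gain went stale; re-queue
        else:
            selected.append(a)                  # still the best; take it
    return selected

# Toy example: set coverage (a monotone submodular function).
sets = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}, "d": {1, 7}}
def coverage_gain(selected, a):
    covered = set().union(*(sets[s] for s in selected)) if selected else set()
    return len(sets[a] - covered)

print(lazy_greedy(list(sets), coverage_gain, budget=2))  # ['c', 'a']
```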
Automated video-based detection of nocturnal convulsive seizures in a residential care setting.
Geertsema, Evelien E; Thijs, Roland D; Gutter, Therese; Vledder, Ben; Arends, Johan B; Leijten, Frans S; Visser, Gerhard H; Kalitzin, Stiliyan N
2018-06-01
People with epilepsy need assistance and are at risk of sudden death when having convulsive seizures (CS). Automated real-time seizure detection systems can help alert caregivers, but wearable sensors are not always tolerated. We determined algorithm settings and investigated detection performance of a video algorithm to detect CS in a residential care setting. The algorithm calculates power in the 2-6 Hz range relative to the 0.5-12.5 Hz range in group velocity signals derived from video-sequence optical flow. A detection threshold was found using a training set consisting of video-electroencephalography (EEG) recordings of 72 CS. A test set consisting of 24 full nights of 12 new subjects in residential care and additional recordings of 50 CS selected randomly was used to estimate performance. All data were analyzed retrospectively. The start and end of CS (generalized clonic and tonic-clonic seizures) and other seizures considered desirable to detect (long generalized tonic, hyperkinetic, and other major seizures) were annotated. The detection threshold was set to the value that obtained 97% sensitivity in the training set. Sensitivity, latency, and false detection rate (FDR) per night were calculated in the test set. A seizure was detected when the algorithm output exceeded the threshold continuously for 2 seconds. With the detection threshold determined in the training set, all CS were detected in the test set (100% sensitivity). Latency was ≤10 seconds in 78% of detections. Three of five hyperkinetic and six of nine other major seizures were detected. Median FDR was 0.78 per night and no false detections occurred in 9 of 24 nights. Our algorithm could improve safety unobtrusively by automated real-time detection of CS in video registrations, with an acceptable latency and FDR. The algorithm can also detect some other motor seizures requiring assistance. © 2018 The Authors. Epilepsia published by Wiley Periodicals, Inc. on behalf of International League Against Epilepsy.
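A minimal sketch of the stated detection rule (2-6 Hz power relative to 0.5-12.5 Hz power, sustained above a threshold for 2 seconds) is given below; the optical-flow extraction of the group velocity signal is not shown, and the threshold, frame rate and windowing values are assumptions rather than the trained values from the paper.

```python
# Minimal sketch of the band-power-ratio rule described above. The threshold
# value here is arbitrary (the paper derives it from a training set).
import numpy as np
from scipy.signal import welch

def band_power(signal, fs, lo, hi):
    freqs, psd = welch(signal, fs=fs, nperseg=min(len(signal), int(2 * fs)))
    return psd[(freqs >= lo) & (freqs <= hi)].sum()

def detect(signal, fs, threshold=0.6, window_s=2.0, hold_s=2.0):
    """True if the 2-6 Hz relative power stays above threshold for hold_s."""
    win, hop = int(window_s * fs), int(0.5 * fs)
    above = []
    for start in range(0, len(signal) - win + 1, hop):
        seg = signal[start:start + win]
        ratio = band_power(seg, fs, 2.0, 6.0) / band_power(seg, fs, 0.5, 12.5)
        above.append(ratio > threshold)
    need, run = int(hold_s * fs / hop), 0
    for flag in above:
        run = run + 1 if flag else 0
        if run >= need:
            return True
    return False

fs = 25.0                                      # assumed video frame rate
t = np.arange(0, 20, 1 / fs)
clonic_like = np.sin(2 * np.pi * 4 * t)        # sustained 4 Hz oscillation
print(detect(clonic_like, fs))                 # True
```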
Comparative Evaluation of Different Optimization Algorithms for Structural Design Applications
NASA Technical Reports Server (NTRS)
Patnaik, Surya N.; Coroneos, Rula M.; Guptill, James D.; Hopkins, Dale A.
1996-01-01
Non-linear programming algorithms play an important role in structural design optimization. Fortunately, several algorithms with computer codes are available. At NASA Lewis Research Center, a project was initiated to assess the performance of eight different optimizers through the development of a computer code, CometBoards. This paper summarizes the conclusions of that research. CometBoards was employed to solve sets of small, medium and large structural problems, using the eight different optimizers on a Cray-YMP8E/8128 computer. The reliability and efficiency of the optimizers were determined from their performance on these problems. For small problems, the performance of most of the optimizers could be considered adequate. For large problems, however, three optimizers (two sequential quadratic programming routines, DNCONG of IMSL and SQP of IDESIGN, along with the Sequential Unconstrained Minimization Technique, SUMT) outperformed the others. At optimum, most optimizers captured an identical number of active displacement and frequency constraints, but the number of active stress constraints differed among the optimizers. This discrepancy can be attributed to singularity conditions in the optimization, and alleviating it can improve the efficiency of the optimizers.
Performance Trend of Different Algorithms for Structural Design Optimization
NASA Technical Reports Server (NTRS)
Patnaik, Surya N.; Coroneos, Rula M.; Guptill, James D.; Hopkins, Dale A.
1996-01-01
Nonlinear programming algorithms play an important role in structural design optimization. Fortunately, several algorithms with computer codes are available. At NASA Lewis Research Center, a project was initiated to assess the performance of different optimizers through the development of a computer code, CometBoards. This paper summarizes the conclusions of that research. CometBoards was employed to solve sets of small, medium and large structural problems, using different optimizers on a Cray-YMP8E/8128 computer. The reliability and efficiency of the optimizers were determined from their performance on these problems. For small problems, the performance of most of the optimizers could be considered adequate. For large problems, however, three optimizers (two sequential quadratic programming routines, DNCONG of IMSL and SQP of IDESIGN, along with the sequential unconstrained minimization technique SUMT) outperformed the others. At optimum, most optimizers captured an identical number of active displacement and frequency constraints, but the number of active stress constraints differed among the optimizers. This discrepancy can be attributed to singularity conditions in the optimization, and alleviating it can improve the efficiency of the optimizers.
Current algorithmic solutions for peptide-based proteomics data generation and identification.
Hoopmann, Michael R; Moritz, Robert L
2013-02-01
Peptide-based proteomic data sets are ever increasing in size and complexity. These data sets provide computational challenges when attempting to quickly analyze spectra and obtain correct protein identifications. Database search and de novo algorithms must consider high-resolution MS/MS spectra and alternative fragmentation methods. Protein inference is a tricky problem when analyzing large data sets of degenerate peptide identifications. Combining multiple algorithms for improved peptide identification puts significant strain on computational systems when investigating large data sets. This review highlights some of the recent developments in peptide and protein identification algorithms for analyzing shotgun mass spectrometry data when encountering the aforementioned hurdles. Also explored are the roles that analytical pipelines, public spectral libraries, and cloud computing play in the evolution of peptide-based proteomics. Copyright © 2012 Elsevier Ltd. All rights reserved.
Silva, Adão; Gameiro, Atílio
2014-01-01
We present in this work a low-complexity algorithm to solve the sum rate maximization problem in multiuser MIMO broadcast channels with downlink beamforming. Our approach decouples the user selection problem from the resource allocation problem, and its main goal is to create a set of quasi-orthogonal users. The proposed algorithm exploits physical metrics of the wireless channels that can be easily computed, in such a way that the null-space projection power can be approximated efficiently. Based on the derived metrics, we present a mathematical model that describes the dynamics of the user selection process, which casts the user selection problem as an integer linear program. Numerical results show that our approach is highly efficient at forming groups of quasi-orthogonal users when compared to previously proposed algorithms in the literature. Our user selection algorithm achieves a large portion (90%) of the optimum user selection sum rate for a moderate number of active users. PMID:24574928
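The paper's channel metrics and integer linear program are not reproduced here; as a generic illustration of building a quasi-orthogonal user group by null-space projection, a greedy semi-orthogonal selection might look like the following sketch (the channel matrix is synthetic).

```python
# Generic greedy quasi-orthogonal user selection (in the spirit of
# semi-orthogonal user selection; not the paper's algorithm). At each step
# the user whose channel has the largest component orthogonal to the
# already-selected channels is added.
import numpy as np

def select_users(H, n_select):
    """H: (n_users, n_tx_antennas) complex channel matrix."""
    selected, basis = [], []
    for _ in range(n_select):
        best_user, best_gain = None, -1.0
        for u in range(H.shape[0]):
            if u in selected:
                continue
            residual = H[u].copy()
            for b in basis:                      # project out selected directions
                residual -= (b.conj() @ residual) * b
            gain = np.linalg.norm(residual)
            if gain > best_gain:
                best_user, best_gain = u, gain
        selected.append(best_user)
        new_dir = H[best_user].copy()
        for b in basis:
            new_dir -= (b.conj() @ new_dir) * b
        basis.append(new_dir / np.linalg.norm(new_dir))
    return selected

rng = np.random.default_rng(0)
H = (rng.standard_normal((8, 4)) + 1j * rng.standard_normal((8, 4))) / np.sqrt(2)
print(select_users(H, n_select=3))
```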
An Annotation Agnostic Algorithm for Detecting Nascent RNA Transcripts in GRO-Seq.
Azofeifa, Joseph G; Allen, Mary A; Lladser, Manuel E; Dowell, Robin D
2017-01-01
We present a fast and simple algorithm to detect nascent RNA transcription in global nuclear run-on sequencing (GRO-seq). GRO-seq is a relatively new protocol that captures nascent transcripts from actively engaged polymerase, providing a direct read-out on bona fide transcription. Most traditional assays, such as RNA-seq, measure steady state RNA levels which are affected by transcription, post-transcriptional processing, and RNA stability. GRO-seq data, however, presents unique analysis challenges that are only beginning to be addressed. Here, we describe a new algorithm, Fast Read Stitcher (FStitch), that takes advantage of two popular machine-learning techniques, hidden Markov models and logistic regression, to classify which regions of the genome are transcribed. Given a small user-defined training set, our algorithm is accurate, robust to varying read depth, annotation agnostic, and fast. Analysis of GRO-seq data without a priori need for annotation uncovers surprising new insights into several aspects of the transcription process.
NASA Astrophysics Data System (ADS)
Niknam, Taher; Kavousifard, Abdollah; Tabatabaei, Sajad; Aghaei, Jamshid
2011-10-01
In this paper a new multiobjective modified honey bee mating optimization (MHBMO) algorithm is presented to investigate the distribution feeder reconfiguration (DFR) problem considering renewable energy sources (RESs) (photovoltaics, fuel cells and wind energy) connected to the distribution network. The objective functions to be minimized are the electrical active power losses, the voltage deviations, the total electrical energy costs and the total emissions of RESs and substations. During the optimization process, the proposed algorithm finds a set of non-dominated (Pareto) optimal solutions which are stored in an external memory called the repository. Since the objective functions investigated are not the same, a fuzzy clustering algorithm is utilized to keep the size of the repository within the specified limits. Moreover, a fuzzy-based decision maker is adopted to select the 'best' compromise solution among the non-dominated optimal solutions of the multiobjective optimization problem. To demonstrate the feasibility and effectiveness of the proposed algorithm, two standard distribution test systems are used as case studies.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Marchal, Rémi; Carbonnière, Philippe; Pouchan, Claude
2015-01-22
The study of atomic clusters has become an increasingly active area of research in recent years because of the fundamental interest in studying a completely new area that can bridge the gap between atomic and solid state physics. Due to their specific properties, such compounds are of great interest in the field of nanotechnology [1,2]. Here, we present our GSAM algorithm, based on a DFT exploration of the PES, to find the low-lying isomers of such compounds. This algorithm includes the generation of an initial set of structures from which the most relevant are selected. Moreover, an optimization process, called raking optimization, able to discard step by step all configurations that are not physically reasonable, has been implemented to reduce the computational cost of this algorithm. Structural properties of Ga_nAs_m clusters will be presented as an illustration of the method.
Compound Event Barrier Coverage in Wireless Sensor Networks under Multi-Constraint Conditions.
Zhuang, Yaoming; Wu, Chengdong; Zhang, Yunzhou; Jia, Zixi
2016-12-24
It is important to monitor compound events via barrier coverage in wireless sensor networks (WSNs). Compound event barrier coverage (CEBC) is a novel coverage problem. Unlike traditional coverage problems, the data for compound event barrier coverage come from different types of sensors and are subject to multiple constraints under complex conditions in real-world applications. The main objective of this paper is to design an efficient algorithm for complex conditions that can combine the compound event confidence. Moreover, a multiplier method based on an active-set strategy (ASMP) is proposed to optimize the multiple constraints in compound event barrier coverage. The algorithm can calculate the coverage ratio efficiently and allocate sensor resources reasonably in compound event barrier coverage. The proposed algorithm can simplify complex problems to reduce the computational load of the network and improve network efficiency. The simulation results demonstrate that the proposed algorithm is more effective and efficient than existing methods, especially in the allocation of sensor resources.
Applying active learning to supervised word sense disambiguation in MEDLINE.
Chen, Yukun; Cao, Hongxin; Mei, Qiaozhu; Zheng, Kai; Xu, Hua
2013-01-01
The aim of this study was to assess whether active learning strategies can be integrated with supervised word sense disambiguation (WSD) methods, thus reducing the number of annotated samples while keeping or improving the quality of disambiguation models. We developed support vector machine (SVM) classifiers to disambiguate 197 ambiguous terms and abbreviations in the MSH WSD collection. Three different uncertainty sampling-based active learning algorithms were implemented with the SVM classifiers and were compared with a passive learner (PL) based on random sampling. For each ambiguous term and each learning algorithm, a learning curve that plots the accuracy computed from the test set as a function of the number of annotated samples used in the model was generated. The area under the learning curve (ALC) was used as the primary metric for evaluation. Our experiments demonstrated that active learners (ALs) significantly outperformed the PL, showing better performance for 177 out of 197 (89.8%) WSD tasks. Further analysis showed that to achieve an average accuracy of 90%, the PL needed 38 annotated samples, while the ALs needed only 24, a 37% reduction in annotation effort. Moreover, we analyzed cases where active learning algorithms did not achieve superior performance and identified three causes: (1) poor models in the early learning stage; (2) easy WSD cases; and (3) difficult WSD cases, which provide useful insight for future improvements. This study demonstrated that integrating active learning strategies with supervised WSD methods could effectively reduce annotation cost and improve the disambiguation models.
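A minimal least-confidence uncertainty-sampling loop of the kind compared above can be sketched as follows; scikit-learn and a synthetic data set are assumed, and the study's feature extraction and the MSH WSD data are not reproduced.

```python
# Minimal least-confidence uncertainty-sampling loop with an SVM classifier
# (illustrative sketch, not the study's exact setup).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
rng = np.random.default_rng(0)
labeled = list(rng.choice(len(X), size=10, replace=False))   # small seed set
unlabeled = [i for i in range(len(X)) if i not in labeled]

for _ in range(20):                                 # 20 rounds of querying
    clf = SVC(probability=True, random_state=0).fit(X[labeled], y[labeled])
    probs = clf.predict_proba(X[unlabeled])
    confidence = probs.max(axis=1)
    query = unlabeled[int(np.argmin(confidence))]   # least confident sample
    labeled.append(query)                           # "annotate" it
    unlabeled.remove(query)

print(len(labeled), "samples annotated")
```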
Microarray missing data imputation based on a set theoretic framework and biological knowledge.
Gan, Xiangchao; Liew, Alan Wee-Chung; Yan, Hong
2006-01-01
Gene expressions measured using microarrays usually suffer from the missing value problem. However, in many data analysis methods, a complete data matrix is required. Although existing missing value imputation algorithms have shown good performance in dealing with missing values, they also have their limitations. For example, some algorithms perform well only when strong local correlation exists in the data, while others provide the best estimate when the data are dominated by global structure. In addition, these algorithms do not take into account any biological constraint in their imputation. In this paper, we propose a set theoretic framework based on projection onto convex sets (POCS) for missing data imputation. POCS allows us to incorporate different types of a priori knowledge about missing values into the estimation process. The main idea of POCS is to formulate every piece of prior knowledge into a corresponding convex set and then use a convergence-guaranteed iterative procedure to obtain a solution in the intersection of all these sets. In this work, we design several convex sets, taking into consideration the biological characteristics of the data: the first set mainly exploits the local correlation structure among genes in microarray data, while the second set captures the global correlation structure among arrays. The third set (actually a series of sets) exploits the biological phenomenon of synchronization loss in microarray experiments. In cyclic systems, synchronization loss is a common phenomenon, and we construct a series of sets based on this phenomenon for our POCS imputation algorithm. Experiments show that our algorithm can achieve a significant reduction of error compared to the KNNimpute, SVDimpute and LSimpute methods.
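A generic POCS iteration can be sketched with two simple convex sets, one fixing the observed entries and one bounding the deviation from a prior profile; the paper's gene-specific sets are not reproduced, and the data below are made up.

```python
# Generic POCS (projection onto convex sets) iteration for missing-value
# estimation (illustrative; not the paper's gene-specific convex sets).
# C1 fixes the observed entries; C2 is a ball around a prior estimate.
# Alternating projections converge to a point in the intersection when
# the intersection is non-empty.
import numpy as np

def project_observed(x, observed_idx, observed_val):
    y = x.copy()
    y[observed_idx] = observed_val
    return y

def project_ball(x, center, radius):
    d = np.linalg.norm(x - center)
    return x if d <= radius else center + (x - center) * (radius / d)

def pocs_impute(observed_idx, observed_val, center, radius, n_iter=100):
    x = center.copy()
    for _ in range(n_iter):
        x = project_ball(project_observed(x, observed_idx, observed_val), center, radius)
    return x

center = np.array([1.0, 1.0, 1.0, 1.0])          # prior profile
obs_idx = np.array([0, 2])
obs_val = np.array([1.5, 0.5])                   # known expression values
x = pocs_impute(obs_idx, obs_val, center, radius=1.0)
print(np.round(x, 3))                            # missing entries stay near the prior
```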
Data reduction using cubic rational B-splines
NASA Technical Reports Server (NTRS)
Chou, Jin J.; Piegl, Les A.
1992-01-01
A geometric method is proposed for fitting rational cubic B-spline curves to data that represent smooth curves including intersection or silhouette lines. The algorithm is based on the convex hull and the variation diminishing properties of Bezier/B-spline curves. The algorithm has the following structure: it tries to fit one Bezier segment to the entire data set and if it is impossible it subdivides the data set and reconsiders the subset. After accepting the subset the algorithm tries to find the longest run of points within a tolerance and then approximates this set with a Bezier cubic segment. The algorithm uses this procedure repeatedly to the rest of the data points until all points are fitted. It is concluded that the algorithm delivers fitting curves which approximate the data with high accuracy even in cases with large tolerances.
NASA Astrophysics Data System (ADS)
Dubovik, O.; Litvinov, P.; Lapyonok, T.; Herman, M.; Fedorenko, A.; Lopatin, A.; Goloub, P.; Ducos, F.; Aspetsberger, M.; Planer, W.; Federspiel, C.
2013-12-01
Over the last few years we have been developing the GRASP (Generalized Retrieval of Aerosol and Surface Properties) algorithm, designed for the enhanced characterization of aerosol properties from spectral, multi-angular polarimetric remote sensing observations. The concept of GRASP essentially relies on the accumulated positive research heritage from previous remote sensing aerosol retrieval developments, in particular those from the AERONET and POLDER retrieval activities. The details of the algorithm are described by Dubovik et al. (Atmos. Meas. Tech., 4, 975-1018, 2011). GRASP retrieves the properties of both aerosol and land surface reflectance in cloud-free environments. It is based on highly advanced statistically optimized fitting and deduces nearly 50 unknowns for each observed site. The algorithm derives a similar set of aerosol parameters as AERONET, including the detailed particle size distribution, the spectrally dependent complex index of refraction and the fraction of non-spherical particles. The algorithm uses detailed aerosol and surface models and fully accounts for all multiple interactions of scattered solar light with aerosol, gases and the underlying surface. All calculations are done on-line without using traditional look-up tables. In addition, the algorithm uses the new multi-pixel retrieval concept - a simultaneous fitting of a large group of pixels with additional constraints limiting the time variability of surface properties and spatial variability of aerosol properties. This principle is expected to result in higher consistency and accuracy of aerosol products compared to conventional approaches, especially over bright surfaces where the information content of satellite observations with respect to aerosol properties is limited. GRASP is a highly versatile algorithm that allows input from both satellite and ground-based measurements. It also has essential flexibility in measurement processing. For example, if the observation data set includes spectral measurements of both total intensity and polarization, the algorithm can be easily set to use either total intensity or polarization, as well as both of them in the same retrieval. Using this feature of the algorithm design we have studied the relative importance of total intensity and polarization measurements for retrieving different parameters of aerosol. In this presentation, we present the quantitative assessment of the improvements in aerosol retrievals associated with the addition of polarimetric measurements to intensity-only observations. The study has been performed using satellite measurements by the POLDER/PARASOL polarimeter and ground-based measurements by new generation AERONET sun/sky-radiometers implementing measurements of polarization at each spectral channel.
Estimating meme fitness in adaptive memetic algorithms for combinatorial problems.
Smith, J E
2012-01-01
Among the most promising and active research areas in heuristic optimisation is the field of adaptive memetic algorithms (AMAs). These gain much of their reported robustness by adapting the probability with which each of a set of local improvement operators is applied, according to an estimate of their current value to the search process. This paper addresses the issue of how the current value should be estimated. Assuming the estimate occurs over several applications of a meme, we consider whether the extreme or mean improvements should be used, and whether this aggregation should be global, or local to some part of the solution space. To investigate these issues, we use the well-established COMA framework that coevolves the specification of a population of memes (representing different local search algorithms) alongside a population of candidate solutions to the problem at hand. Two very different memetic algorithms are considered: the first using adaptive operator pursuit to adjust the probabilities of applying a fixed set of memes, and a second which applies genetic operators to dynamically adapt and create memes and their functional definitions. For the latter, especially on combinatorial problems, credit assignment mechanisms based on historical records, or on notions of landscape locality, will have limited application, and it is necessary to estimate the value of a meme via some form of sampling. The results on a set of binary encoded combinatorial problems show that both methods are very effective, and that for some problems it is necessary to use thousands of variables in order to tease apart the differences between different reward schemes. However, for both memetic algorithms, a significant pattern emerges that reward based on mean improvement is better than that based on extreme improvement. This contradicts recent findings from adapting the parameters of operators involved in global evolutionary search. The results also show that local reward schemes outperform global reward schemes in combinatorial spaces, unlike in continuous spaces. An analysis of evolving meme behaviour is used to explain these findings.
Greedy Algorithms for Nonnegativity-Constrained Simultaneous Sparse Recovery
Kim, Daeun; Haldar, Justin P.
2016-01-01
This work proposes a family of greedy algorithms to jointly reconstruct a set of vectors that are (i) nonnegative and (ii) simultaneously sparse with a shared support set. The proposed algorithms generalize previous approaches that were designed to impose these constraints individually. Similar to previous greedy algorithms for sparse recovery, the proposed algorithms iteratively identify promising support indices. In contrast to previous approaches, the support index selection procedure has been adapted to prioritize indices that are consistent with both the nonnegativity and shared support constraints. Empirical results demonstrate for the first time that the combined use of simultaneous sparsity and nonnegativity constraints can substantially improve recovery performance relative to existing greedy algorithms that impose less signal structure. PMID:26973368
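In the spirit of the proposed family (though not the paper's exact algorithms), a greedy recovery with a shared support and nonnegative coefficients can be sketched as follows, using SciPy's nonnegative least squares for the per-column refit; the dictionary and signals are synthetic.

```python
# Sketch in the spirit of nonnegativity-constrained simultaneous orthogonal
# matching pursuit (illustrative only). At each iteration the dictionary atom
# with the largest summed positive correlation with the residuals joins the
# shared support, and each column is re-fit on that support with NNLS.
import numpy as np
from scipy.optimize import nnls

def nn_simultaneous_omp(A, Y, sparsity):
    """A: (m, n) dictionary; Y: (m, k) measurements; shared support of size `sparsity`."""
    support, R = [], Y.copy()
    for _ in range(sparsity):
        corr = np.clip(A.T @ R, 0, None).sum(axis=1)    # only positive correlations count
        corr[support] = -np.inf
        support.append(int(np.argmax(corr)))
        X_s = np.column_stack([nnls(A[:, support], Y[:, j])[0] for j in range(Y.shape[1])])
        R = Y - A[:, support] @ X_s
    X = np.zeros((A.shape[1], Y.shape[1]))
    X[support, :] = X_s
    return X

rng = np.random.default_rng(0)
A = rng.standard_normal((30, 50))
X_true = np.zeros((50, 3))
X_true[[4, 17, 42], :] = rng.random((3, 3)) + 0.5       # shared, nonnegative support
Y = A @ X_true
X_hat = nn_simultaneous_omp(A, Y, sparsity=3)
print(sorted(np.flatnonzero(X_hat.sum(axis=1)).tolist()))  # ideally [4, 17, 42]
```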
Integrated Network Decompositions and Dynamic Programming for Graph Optimization (INDDGO)
DOE Office of Scientific and Technical Information (OSTI.GOV)
The INDDGO software package offers a set of tools for finding exact solutions to graph optimization problems via tree decompositions and dynamic programming algorithms. Currently the framework offers serial and parallel (distributed memory) algorithms for finding tree decompositions and solving the maximum weighted independent set problem. The parallel dynamic programming algorithm is implemented on top of the MADNESS task-based runtime.
Improving depth maps of plants by using a set of five cameras
NASA Astrophysics Data System (ADS)
Kaczmarek, Adam L.
2015-03-01
Obtaining high-quality depth maps and disparity maps with the use of a stereo camera is a challenging task for some kinds of objects. The quality of these maps can be improved by taking advantage of a larger number of cameras. Research on the use of a set of five cameras to obtain disparity maps is presented. The set consists of a central camera and four side cameras. An algorithm for making disparity maps called multiple similar areas (MSA) is introduced. The algorithm was specially designed for the set of five cameras. Experiments were performed with the MSA algorithm and the stereo matching algorithm based on the sum of sum of squared differences (sum of SSD, SSSD) measure. Moreover, the following measures were included in the experiments: sum of absolute differences (SAD), zero-mean SAD (ZSAD), zero-mean SSD (ZSSD), locally scaled SAD (LSAD), locally scaled SSD (LSSD), normalized cross correlation (NCC), and zero-mean NCC (ZNCC). The presented algorithms were applied to images of plants. Making depth maps of plants is difficult because parts of leaves are similar to each other. The potential usability of the described algorithms is especially high in agricultural applications such as robotic fruit harvesting.
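The MSA algorithm itself is not reproduced here; the SSD matching cost that the SSSD baseline accumulates over the camera pairs can be illustrated for a single rectified pair as follows (window size, disparity range, and the synthetic images are assumptions).

```python
# Basic SSD block matching along a rectified scanline pair (illustrative;
# the five-camera SSSD baseline sums such costs over several camera pairs).
import numpy as np

def ssd_disparity(left, right, x, y, half=3, max_disp=16):
    """Disparity at (x, y) of `left` found by minimizing SSD against `right`."""
    patch = left[y - half:y + half + 1, x - half:x + half + 1].astype(np.float32)
    best_d, best_cost = 0, np.inf
    for d in range(max_disp + 1):
        if x - d - half < 0:
            break
        cand = right[y - half:y + half + 1, x - d - half:x - d + half + 1].astype(np.float32)
        cost = np.sum((patch - cand) ** 2)
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d

rng = np.random.default_rng(0)
right = rng.uniform(0, 255, size=(64, 96))
left = np.roll(right, 5, axis=1)                # synthetic pair with a 5-pixel shift
print(ssd_disparity(left, right, x=48, y=32))   # 5
```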
Quantum Algorithm for K-Nearest Neighbors Classification Based on the Metric of Hamming Distance
NASA Astrophysics Data System (ADS)
Ruan, Yue; Xue, Xiling; Liu, Heng; Tan, Jianing; Li, Xi
2017-11-01
The k-nearest neighbors (KNN) algorithm is a common classification algorithm, and also a sub-routine in various complicated machine learning tasks. In this paper, we present a quantum algorithm (QKNN) for implementing this algorithm based on the metric of Hamming distance. We put forward a quantum circuit for computing the Hamming distance between the testing sample and each feature vector in the training set. Taking advantage of this method, we realize a good analog of the classical KNN algorithm by setting a distance threshold value t to select the k nearest neighbors. As a result, QKNN achieves O(n³) performance, which depends only on the dimension of the feature vectors, and high classification accuracy, outperforming Lloyd's algorithm (Lloyd et al. 2013) and Wiebe's algorithm (Wiebe et al. 2014).
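A classical reference implementation of k-nearest-neighbor classification under Hamming distance, which is what QKNN emulates, is sketched below; the quantum circuit and the threshold-based neighbor selection are not reproduced, and the tiny training set is made up.

```python
# Classical reference for k-nearest-neighbor classification under Hamming
# distance on binary feature vectors (illustrative only).
import numpy as np
from collections import Counter

def hamming(a, b):
    return int(np.count_nonzero(a != b))

def knn_hamming(train_X, train_y, x, k=3):
    dists = [hamming(row, x) for row in train_X]
    nearest = np.argsort(dists)[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]

train_X = np.array([[0, 0, 1, 1], [0, 1, 1, 1], [1, 0, 0, 0], [1, 1, 0, 0]])
train_y = np.array(["A", "A", "B", "B"])
print(knn_hamming(train_X, train_y, np.array([0, 0, 1, 0])))   # "A"
```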
NASA Astrophysics Data System (ADS)
Alpatov, Boris; Babayan, Pavel; Ershov, Maksim; Strotov, Valery
2016-10-01
This paper describes the implementation of an orientation estimation algorithm in an FPGA-based vision system. An approach to estimating the orientation of objects lacking axial symmetry is proposed. The suggested algorithm is intended to estimate the orientation of a specific known 3D object based on its 3D model. The proposed orientation estimation algorithm consists of two stages: learning and estimation. The learning stage is devoted to exploring the studied object. Using the 3D model, we gather a set of training images by capturing the 3D model from viewpoints evenly distributed on a sphere. The distribution of points on the sphere follows the geosphere principle. The gathered training image set is used for calculating descriptors, which are used in the estimation stage of the algorithm. The estimation stage focuses on matching an observed image descriptor against the training image descriptors. The experimental research was performed using a set of images of an Airbus A380. The proposed orientation estimation algorithm showed good accuracy in all case studies. The real-time performance of the algorithm in the FPGA-based vision system was demonstrated.
Algorithms for Discovery of Multiple Markov Boundaries
Statnikov, Alexander; Lytkin, Nikita I.; Lemeire, Jan; Aliferis, Constantin F.
2013-01-01
Algorithms for Markov boundary discovery from data constitute an important recent development in machine learning, primarily because they offer a principled solution to the variable/feature selection problem and give insight on local causal structure. Over the last decade many sound algorithms have been proposed to identify a single Markov boundary of the response variable. Even though faithful distributions and, more broadly, distributions that satisfy the intersection property always have a single Markov boundary, other distributions/data sets may have multiple Markov boundaries of the response variable. The latter distributions/data sets are common in practical data-analytic applications, and there are several reasons why it is important to induce multiple Markov boundaries from such data. However, there are currently no sound and efficient algorithms that can accomplish this task. This paper describes a family of algorithms TIE* that can discover all Markov boundaries in a distribution. The broad applicability as well as efficiency of the new algorithmic family is demonstrated in an extensive benchmarking study that involved comparison with 26 state-of-the-art algorithms/variants in 15 data sets from a diversity of application domains. PMID:25285052
Nonrigid Image Registration in Digital Subtraction Angiography Using Multilevel B-Spline
2013-01-01
We address the problem of motion artifact reduction in digital subtraction angiography (DSA) using image registration techniques. Most registration algorithms proposed for application in DSA have been designed for peripheral and cerebral angiography images, in which we mainly deal with global rigid motions. These algorithms did not yield good results when applied to coronary angiography images because of the complex nonrigid motions that exist in this type of angiography image. Multiresolution and iterative algorithms have been proposed to cope with this problem, but they are associated with high computational cost, which makes them unacceptable for real-time clinical applications. In this paper we propose a nonrigid image registration algorithm for coronary angiography images that is significantly faster than multiresolution and iterative blocking methods and outperforms competing algorithms evaluated on the same data sets. This algorithm is based on a sparse set of matched feature point pairs, and the elastic registration is performed by means of multilevel B-spline image warping. Experimental results with several clinical data sets demonstrate the effectiveness of our approach. PMID:23971026
A GENERAL ALGORITHM FOR THE CONSTRUCTION OF CONTOUR PLOTS
NASA Technical Reports Server (NTRS)
Johnson, W.
1994-01-01
The graphical presentation of experimentally or theoretically generated data sets frequently involves the construction of contour plots. A general computer algorithm has been developed for the construction of contour plots. The algorithm provides for efficient and accurate contouring with a modular approach which allows flexibility in modifying the algorithm for special applications. The algorithm accepts as input data values at a set of points irregularly distributed over a plane. The algorithm is based on an interpolation scheme in which the points in the plane are connected by straight line segments to form a set of triangles. In general, the data is smoothed using a least-squares-error fit of the data to a bivariate polynomial. To construct the contours, interpolation along the edges of the triangles is performed, using the bivariate polynomial if data smoothing was performed. Once the contour points have been located, the contour may be drawn. This program is written in FORTRAN IV for batch execution and has been implemented on an IBM 360 series computer with a central memory requirement of approximately 100K of 8-bit bytes. This computer algorithm was developed in 1981.
Pruning Rogue Taxa Improves Phylogenetic Accuracy: An Efficient Algorithm and Webservice
Aberer, Andre J.; Krompass, Denis; Stamatakis, Alexandros
2013-01-01
The presence of rogue taxa (rogues) in a set of trees can frequently have a negative impact on the results of a bootstrap analysis (e.g., the overall support in consensus trees). We introduce an efficient graph-based algorithm for rogue taxon identification as well as an interactive webservice implementing this algorithm. Compared with our previous method, the new algorithm is up to 4 orders of magnitude faster, while returning qualitatively identical results. Because of this significant improvement in scalability, the new algorithm can now identify substantially more complex and compute-intensive rogue taxon constellations. On a large and diverse collection of real-world data sets, we show that our method yields better supported reduced/pruned consensus trees than any competing rogue taxon identification method. Using the parallel version of our open-source code, we successfully identified rogue taxa in a set of 100 trees with 116 334 taxa each. For simulated data sets, we show that when removing/pruning rogue taxa with our method from a tree set, we consistently obtain bootstrap consensus trees as well as maximum-likelihood trees that are topologically closer to the respective true trees. PMID:22962004
NASA Astrophysics Data System (ADS)
Chen, D. M.; Clapp, R. G.; Biondi, B.
2006-12-01
Ricksep is a freely available interactive viewer for multi-dimensional data sets. The viewer is very useful for simultaneous display of multiple data sets from different viewing angles, animation of movement along a path through the data space, and selection of local regions for data processing and information extraction. Several new viewing features are added to enhance the program's functionality in the following three aspects. First, two new data synthesis algorithms are created to adaptively combine information from a data set with mostly high-frequency content, such as seismic data, and another data set with mainly low-frequency content, such as velocity data. Using the algorithms, these two data sets can be synthesized into a single data set which resembles the high-frequency data set on a local scale and at the same time resembles the low-frequency data set on a larger scale. As a result, the originally separated high- and low-frequency details can now be more accurately and conveniently studied together. Second, a projection algorithm is developed to display paths through the data space. Paths are geophysically important because they represent wells into the ground. Two difficulties often associated with tracking paths are that they normally cannot be seen clearly inside multi-dimensional spaces and depth information is lost along the direction of projection when ordinary projection techniques are used. The new algorithm projects samples along the path in three orthogonal directions and effectively restores important depth information by using variable projection parameters which are functions of the distance away from the path. Multiple paths in the data space can be generated using different character symbols as positional markers, and users can easily create, modify, and view paths in real time. Third, a viewing history list is implemented which enables Ricksep's users to create, edit and save a recipe for the sequence of viewing states. Then, the recipe can be loaded into an active Ricksep session, after which the user can navigate to any state in the sequence and modify the sequence from that state. Typical uses of this feature are undoing and redoing viewing commands and animating a sequence of viewing states. A theoretical discussion is carried out and several examples using real seismic data are provided to show how these new Ricksep features provide more convenient, accurate ways to manipulate multi-dimensional data sets.
NASA Astrophysics Data System (ADS)
Babayan, Pavel; Smirnov, Sergey; Strotov, Valery
2017-10-01
This paper describes an aerial object recognition algorithm for on-board and stationary vision systems. The suggested algorithm is intended to recognize objects of a specific kind using a set of reference objects defined by 3D models. The proposed algorithm is based on building an outer contour descriptor. The algorithm consists of two stages: learning and recognition. The learning stage is devoted to exploring the reference objects. Using the 3D models, we build a database containing training images by rendering each 3D model from viewpoints evenly distributed on a sphere. The distribution of points on the sphere follows the geosphere principle. The gathered training image set is used for calculating descriptors, which are used in the recognition stage of the algorithm. The recognition stage focuses on estimating the similarity between the captured object and the reference objects by matching an observed image descriptor against the reference object descriptors. The experimental research was performed using a set of models of aircraft of different types (airplanes, helicopters, UAVs). The proposed algorithm showed good accuracy in all case studies. The real-time performance of the algorithm in an FPGA-based vision system was demonstrated.
A Modified MinMax k-Means Algorithm Based on PSO
2016-01-01
The MinMax k-means algorithm is widely used to tackle the effect of bad initialization by minimizing the maximum intra-cluster error. Two parameters, the exponent parameter and the memory parameter, are involved in its execution. Since different parameter values lead to different clustering errors, it is crucial to choose appropriate parameters. In the original algorithm, a practical framework is given. This framework extends MinMax k-means to automatically adapt the exponent parameter to the data set. It has been believed that once the maximum exponent parameter is set, the programme can reach the lowest intra-cluster errors. However, our experiments show that this is not always correct. In this paper, we modify the MinMax k-means algorithm with PSO to determine the parameter values that enable the algorithm to attain the lowest clustering errors. The proposed clustering method is tested on several popular data sets in several different initial situations and is compared to the k-means algorithm and the original MinMax k-means algorithm. The experimental results indicate that our proposed algorithm can reach the lowest clustering errors automatically. PMID:27656201
Syndromic Algorithms for Detection of Gambiense Human African Trypanosomiasis in South Sudan
Palmer, Jennifer J.; Surur, Elizeous I.; Goch, Garang W.; Mayen, Mangar A.; Lindner, Andreas K.; Pittet, Anne; Kasparian, Serena; Checchi, Francesco; Whitty, Christopher J. M.
2013-01-01
Background Active screening by mobile teams is considered the best method for detecting human African trypanosomiasis (HAT) caused by Trypanosoma brucei gambiense but the current funding context in many post-conflict countries limits this approach. As an alternative, non-specialist health care workers (HCWs) in peripheral health facilities could be trained to identify potential cases who need testing based on their symptoms. We explored the predictive value of syndromic referral algorithms to identify symptomatic cases of HAT among a treatment-seeking population in Nimule, South Sudan. Methodology/Principal Findings Symptom data from 462 patients (27 cases) presenting for a HAT test via passive screening over a 7 month period were collected to construct and evaluate over 14,000 four item syndromic algorithms considered simple enough to be used by peripheral HCWs. For comparison, algorithms developed in other settings were also tested on our data, and a panel of expert HAT clinicians were asked to make referral decisions based on the symptom dataset. The best performing algorithms consisted of three core symptoms (sleep problems, neurological problems and weight loss), with or without a history of oedema, cervical adenopathy or proximity to livestock. They had a sensitivity of 88.9–92.6%, a negative predictive value of up to 98.8% and a positive predictive value in this context of 8.4–8.7%. In terms of sensitivity, these out-performed more complex algorithms identified in other studies, as well as the expert panel. The best-performing algorithm is predicted to identify about 9/10 treatment-seeking HAT cases, though only 1/10 patients referred would test positive. Conclusions/Significance In the absence of regular active screening, improving referrals of HAT patients through other means is essential. Systematic use of syndromic algorithms by peripheral HCWs has the potential to increase case detection and would increase their participation in HAT programmes. The algorithms proposed here, though promising, should be validated elsewhere. PMID:23350005
Aswad, Miran; Rayan, Mahmoud; Abu-Lafi, Saleh; Falah, Mizied; Raiyn, Jamal; Abdallah, Ziyad; Rayan, Anwar
2018-01-01
The aim was to index natural products for less expensive preventive or curative anti-inflammatory therapeutic drugs. A set of 441 anti-inflammatory drugs representing the active domain and 2892 natural products representing the inactive domain was used to construct a predictive model for bioactivity-indexing purposes. The model for indexing the natural products for potential anti-inflammatory activity was constructed using the iterative stochastic elimination algorithm (ISE). ISE is capable of differentiating between active and inactive anti-inflammatory molecules. By applying the prediction model to a mixed set of active/inactive substances, we managed to capture 38% of the anti-inflammatory drugs in the top 1% of the screened set of chemicals, yielding an enrichment factor of 38. Ten natural products that scored highly as potential anti-inflammatory drug candidates are disclosed. Searching PubMed revealed that only three molecules (Moupinamide, Capsaicin, and Hypaphorine) out of the ten were tested and reported as anti-inflammatory. The other seven phytochemicals await evaluation of their anti-inflammatory activity in the wet lab. The proposed anti-inflammatory model can be utilized for the virtual screening of large chemical databases and for indexing natural products for potential anti-inflammatory activity.
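For reference, the enrichment factor quoted above follows from recovering 38% of the actives in the top 1% of the ranked list (0.38 / 0.01 = 38). The following generic sketch computes it from scores and labels; it is not part of the ISE implementation.

```python
def enrichment_factor(scores, labels, top_fraction=0.01):
    """Enrichment factor: hit rate among the top-ranked fraction of a
    screened set divided by the overall hit rate.
    scores: higher = more likely active; labels: 1 = active, 0 = inactive."""
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    n_top = max(1, int(len(ranked) * top_fraction))
    actives_top = sum(lab for _, lab in ranked[:n_top])
    hit_rate_top = actives_top / n_top
    hit_rate_all = sum(labels) / len(ranked)
    return hit_rate_top / hit_rate_all
```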
NASA Astrophysics Data System (ADS)
Fu, Lin; Hu, Xiangyu Y.; Adams, Nikolaus A.
2017-12-01
We propose efficient single-step formulations for reinitialization and extending algorithms, which are critical components of level-set based interface-tracking methods. The level-set field is reinitialized with a single-step (non-iterative) "forward tracing" algorithm. A minimum set of cells is defined that describes the interface, and reinitialization employs only data from these cells. Fluid states are extrapolated or extended across the interface by a single-step "backward tracing" algorithm. Both algorithms, which are motivated by analogy to ray-tracing, avoid multiple block-boundary data exchanges that are inevitable for iterative reinitialization and extending approaches within a parallel-computing environment. The single-step algorithms are combined with a multi-resolution conservative sharp-interface method and validated by a wide range of benchmark test cases. We demonstrate that the proposed reinitialization method achieves second-order accuracy in conserving the volume of each phase. The interface location is invariant to reapplication of the single-step reinitialization. Generally, we observe smaller absolute errors than for standard iterative reinitialization on the same grid. The computational efficiency is higher than for the standard and typical high-order iterative reinitialization methods. We observe a 2- to 6-times efficiency improvement over the standard method for serial execution. The proposed single-step extending algorithm, which is commonly employed for assigning data to ghost cells with ghost-fluid or conservative interface interaction methods, shows about a 10-times efficiency improvement over the standard method while maintaining the same accuracy. Despite their simplicity, the proposed algorithms offer an efficient and robust alternative to iterative reinitialization and extending methods for level-set based multi-phase simulations.
The genetic algorithm: A robust method for stress inversion
NASA Astrophysics Data System (ADS)
Thakur, Prithvi; Srivastava, Deepak C.; Gupta, Pravin K.
2017-01-01
The stress inversion of geological or geophysical observations is a nonlinear problem. In most existing methods, it is solved by linearization, under certain assumptions. These linear algorithms not only oversimplify the problem but also are vulnerable to entrapment of the solution in a local optimum. We propose the use of a nonlinear heuristic technique, the genetic algorithm, which searches for the global optimum without making any linearizing assumption or simplification. The algorithm mimics the natural evolutionary processes of selection, crossover and mutation, and minimizes a composite misfit function in searching for the global optimum, the fittest stress tensor. The validity and efficacy of the algorithm are demonstrated by a series of tests on synthetic and natural fault-slip observations in different tectonic settings and also in situations where the observations are noisy. It is shown that the genetic algorithm is superior to other commonly practised methods, in particular, in those tectonic settings where none of the principal stresses is directed vertically and/or the given data set is noisy.
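A minimal sketch of the selection-crossover-mutation loop is given below. It minimizes a stand-in objective (a simple sphere function) over a four-parameter vector, whereas the paper minimizes a composite misfit between observed and predicted fault-slip data; the operator choices and constants are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(42)

def objective(x):
    # Stand-in objective; the stress inversion would instead evaluate a
    # composite misfit for a reduced stress tensor (orientation + ratio).
    return float(np.sum(x ** 2))

def genetic_algorithm(dim=4, pop_size=60, n_gen=200, bounds=(-1.0, 1.0),
                      p_cross=0.9, p_mut=0.1, sigma=0.1):
    lo, hi = bounds
    pop = rng.uniform(lo, hi, (pop_size, dim))
    for _ in range(n_gen):
        fit = np.array([objective(ind) for ind in pop])
        # tournament selection: keep the fitter of two random individuals
        idx = rng.integers(0, pop_size, (pop_size, 2))
        parents = pop[np.where(fit[idx[:, 0]] < fit[idx[:, 1]], idx[:, 0], idx[:, 1])]
        # uniform crossover between consecutive parents
        children = parents.copy()
        for i in range(0, pop_size - 1, 2):
            if rng.random() < p_cross:
                mask = rng.random(dim) < 0.5
                children[i, mask], children[i + 1, mask] = \
                    parents[i + 1, mask], parents[i, mask]
        # Gaussian mutation
        mut = rng.random(children.shape) < p_mut
        children[mut] += rng.normal(0.0, sigma, mut.sum())
        pop = np.clip(children, lo, hi)
    fit = np.array([objective(ind) for ind in pop])
    return pop[np.argmin(fit)], float(fit.min())

best_params, best_misfit = genetic_algorithm()
```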
Optimization of genomic selection training populations with a genetic algorithm
USDA-ARS?s Scientific Manuscript database
In this article, we derive a computationally efficient statistic to measure the reliability of estimates of genetic breeding values for a fixed set of genotypes based on a given training set of genotypes and phenotypes. We adopt a genetic algorithm scheme to find a training set of certain size from ...
Advancements to the planogram frequency–distance rebinning algorithm
Champley, Kyle M; Raylman, Raymond R; Kinahan, Paul E
2010-01-01
In this paper we consider the task of image reconstruction in positron emission tomography (PET) with the planogram frequency–distance rebinning (PFDR) algorithm. The PFDR algorithm is a rebinning algorithm for PET systems with panel detectors. The algorithm is derived in the planogram coordinate system which is a native data format for PET systems with panel detectors. A rebinning algorithm averages over the redundant four-dimensional set of PET data to produce a three-dimensional set of data. Images can be reconstructed from this rebinned three-dimensional set of data. This process enables one to reconstruct PET images more quickly than reconstructing directly from the four-dimensional PET data. The PFDR algorithm is an approximate rebinning algorithm. We show that implementing the PFDR algorithm followed by the (ramp) filtered backprojection (FBP) algorithm in linogram coordinates from multiple views reconstructs a filtered version of our image. We develop an explicit formula for this filter which can be used to achieve exact reconstruction by means of a modified FBP algorithm applied to the stack of rebinned linograms and can also be used to quantify the errors introduced by the PFDR algorithm. This filter is similar to the filter in the planogram filtered backprojection algorithm derived by Brasse et al. The planogram filtered backprojection and exact reconstruction with the PFDR algorithm require complete projections which can be completed with a reprojection algorithm. The PFDR algorithm is similar to the rebinning algorithm developed by Kao et al. By expressing the PFDR algorithm in detector coordinates, we provide a comparative analysis between the two algorithms. Numerical experiments using both simulated data and measured data from a positron emission mammography/tomography (PEM/PET) system are performed. Images are reconstructed by PFDR+FBP (PFDR followed by 2D FBP reconstruction), PFDRX (PFDR followed by the modified FBP algorithm for exact reconstruction) and planogram filtered backprojection image reconstruction algorithms. We show that the PFDRX algorithm produces images that are nearly as accurate as images reconstructed with the planogram filtered backprojection algorithm and more accurate than images reconstructed with the PFDR+FBP algorithm. Both the PFDR+FBP and PFDRX algorithms provide a dramatic improvement in computation time over the planogram filtered backprojection algorithm. PMID:20436790
Fast detection of vascular plaque in optical coherence tomography images using a reduced feature set
NASA Astrophysics Data System (ADS)
Prakash, Ammu; Ocana Macias, Mariano; Hewko, Mark; Sowa, Michael; Sherif, Sherif
2018-03-01
Optical coherence tomography (OCT) images are capable of detecting vascular plaque by using the full set of 26 Haralick textural features and a standard K-means clustering algorithm. However, the use of the full set of 26 textural features is computationally expensive and may not be feasible for real-time implementation. In this work, we identified a reduced set of 3 textural features which characterize vascular plaque and used a generalized Fuzzy C-means clustering algorithm. Our work involves three steps: 1) the reduction of the full set of 26 textural features to a reduced set of 3 textural features by using a genetic algorithm (GA) optimization method, 2) the implementation of an unsupervised generalized clustering algorithm (Fuzzy C-means) on the reduced feature space, and 3) the validation of our results using histology and actual photographic images of vascular plaque. Our results show an excellent match with histology and actual photographic images of vascular tissue. Therefore, our results could provide an efficient pre-clinical tool for the detection of vascular plaque in real-time OCT imaging.
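Step 2 above, clustering in the reduced three-feature space, can be sketched with a compact standard fuzzy c-means loop. This is a generic implementation, not the authors' generalized variant, and the parameter values are illustrative.

```python
import numpy as np

def fuzzy_c_means(X, n_clusters=2, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Standard fuzzy c-means; X is (n_samples, n_features), e.g. the three
    selected Haralick texture features per image patch."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)                    # fuzzy memberships
    for _ in range(n_iter):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]   # weighted centroids
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        U_new = 1.0 / np.sum((d[:, :, None] / d[:, None, :]) ** (2 / (m - 1)), axis=2)
        if np.max(np.abs(U_new - U)) < tol:
            U = U_new
            break
        U = U_new
    return centers, U
```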
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lee, Youngrok
2013-05-15
Heterogeneity exists in a data set when samples from different classes are merged into the data set. Finite mixture models can be used to represent a survival time distribution on a heterogeneous patient group by the proportions of each class and by the survival time distribution within each class as well. The heterogeneous data set cannot be explicitly decomposed into homogeneous subgroups unless all the samples are precisely labeled by their origin classes; such impossibility of decomposition is a barrier to overcome for estimating finite mixture models. The expectation-maximization (EM) algorithm has been used to obtain maximum likelihood estimates of finite mixture models by soft-decomposition of heterogeneous samples without labels for a subset or the entire set of data. In medical surveillance databases we can find partially labeled data, that is, while not completely unlabeled there is only imprecise information about class values. In this study we propose new EM algorithms that take advantage of such partial labels, and thus incorporate more information than traditional EM algorithms. We particularly propose four variants of the EM algorithm named EM-OCML, EM-PCML, EM-HCML and EM-CPCML, each of which assumes a specific mechanism of missing class values. We conducted a simulation study on exponential survival trees with five classes and showed that the advantages of incorporating a substantial amount of partially labeled data can be highly significant. We also showed that model selection based on AIC values works fairly well to select the best proposed algorithm on each specific data set. A case study on a real-world data set of gastric cancer provided by the Surveillance, Epidemiology and End Results (SEER) program showed a superiority of EM-CPCML to not only the other proposed EM algorithms but also conventional supervised, unsupervised and semi-supervised learning algorithms.
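As a toy illustration of EM with partial labels, the sketch below fits a two-component exponential mixture of survival times and simply clamps the responsibilities of labeled samples. This is one simple mechanism for using partial labels, not any of the four variants (EM-OCML, EM-PCML, EM-HCML, EM-CPCML) proposed in the study.

```python
import numpy as np

def em_exponential_mixture(t, labels=None, n_iter=200):
    """EM for a 2-component exponential mixture of survival times t.
    labels: 0/1 for samples with a known class, -1 if unlabeled.
    Labeled samples get hard responsibilities; unlabeled ones stay soft."""
    t = np.asarray(t, float)
    if labels is None:
        labels = np.full(len(t), -1)
    pi = np.array([0.5, 0.5])
    lam = np.array([1.0 / t.mean(), 2.0 / t.mean()])
    for _ in range(n_iter):
        # E-step: responsibilities under exponential densities lam*exp(-lam*t)
        dens = pi * lam * np.exp(-np.outer(t, lam))
        r = dens / dens.sum(axis=1, keepdims=True)
        for c in (0, 1):                       # clamp known labels
            r[labels == c] = np.eye(2)[c]
        # M-step: update mixing proportions and rate parameters
        nk = r.sum(axis=0)
        pi = nk / len(t)
        lam = nk / (r * t[:, None]).sum(axis=0)
    return pi, lam, r
```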
Leveraging Python Interoperability Tools to Improve Sapphire's Usability
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gezahegne, A; Love, N S
2007-12-10
The Sapphire project at the Center for Applied Scientific Computing (CASC) develops and applies an extensive set of data mining algorithms for the analysis of large data sets. Sapphire's algorithms are currently available as a set of C++ libraries. However many users prefer higher level scripting languages such as Python for their ease of use and flexibility. In this report, we evaluate four interoperability tools for the purpose of wrapping Sapphire's core functionality with Python. Exposing Sapphire's functionality through a Python interface would increase its usability and connect its algorithms to existing Python tools.
Koa-Wing, Michael; Nakagawa, Hiroshi; Luther, Vishal; Jamil-Copley, Shahnaz; Linton, Nick; Sandler, Belinda; Qureshi, Norman; Peters, Nicholas S; Davies, D Wyn; Francis, Darrel P; Jackman, Warren; Kanagaratnam, Prapa
2015-11-15
Ripple Mapping (RM) is designed to overcome the limitations of existing isochronal 3D mapping systems by representing the intracardiac electrogram as a dynamic bar on a surface bipolar voltage map that changes in height according to the electrogram voltage-time relationship, relative to a fiduciary point. We tested the hypothesis that standard approaches to atrial tachycardia CARTO™ activation maps were inadequate for RM creation and interpretation. From the results, we aimed to develop an algorithm to optimize RMs for future prospective testing on a clinical RM platform. CARTO-XP™ activation maps from atrial tachycardia ablations were reviewed by two blinded assessors on an off-line RM workstation. Ripple Maps were graded according to a diagnostic confidence scale (Grade I - high confidence with clear pattern of activation through to Grade IV - non-diagnostic). The RM-based diagnoses were corroborated against the clinical diagnoses. 43 RMs from 14 patients were classified as Grade I (5 [11.5%]); Grade II (17 [39.5%]); Grade III (9 [21%]) and Grade IV (12 [28%]). Causes of low gradings/errors included the following: insufficient chamber point density; window-of-interest<100% of cycle length (CL); <95% tachycardia CL mapped; variability of CL and/or unstable fiducial reference marker; and suboptimal bar height and scar settings. A data collection and map interpretation algorithm has been developed to optimize Ripple Maps in atrial tachycardias. This algorithm requires prospective testing on a real-time clinical platform. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
An early-biomarker algorithm predicts lethal graft-versus-host disease and survival
Hartwell, Matthew J.; Özbek, Umut; Holler, Ernst; Major-Monfried, Hannah; Reddy, Pavan; Aziz, Mina; Hogan, William J.; Ayuk, Francis; Efebera, Yvonne A.; Hexner, Elizabeth O.; Bunworasate, Udomsak; Qayed, Muna; Ordemann, Rainer; Wölfl, Matthias; Mielke, Stephan; Chen, Yi-Bin; Devine, Steven; Jagasia, Madan; Kitko, Carrie L.; Litzow, Mark R.; Kröger, Nicolaus; Locatelli, Franco; Morales, George; Nakamura, Ryotaro; Reshef, Ran; Rösler, Wolf; Weber, Daniela; Yanik, Gregory A.; Levine, John E.; Ferrara, James L.M.
2017-01-01
BACKGROUND. No laboratory test can predict the risk of nonrelapse mortality (NRM) or severe graft-versus-host disease (GVHD) after hematopoietic cellular transplantation (HCT) prior to the onset of GVHD symptoms. METHODS. Patient blood samples on day 7 after HCT were obtained from a multicenter set of 1,287 patients, and 620 samples were assigned to a training set. We measured the concentrations of 4 GVHD biomarkers (ST2, REG3α, TNFR1, and IL-2Rα) and used them to model 6-month NRM using rigorous cross-validation strategies to identify the best algorithm that defined 2 distinct risk groups. We then applied the final algorithm in an independent test set (n = 309) and validation set (n = 358). RESULTS. A 2-biomarker model using ST2 and REG3α concentrations identified patients with a cumulative incidence of 6-month NRM of 28% in the high-risk group and 7% in the low-risk group (P < 0.001). The algorithm performed equally well in the test set (33% vs. 7%, P < 0.001) and the multicenter validation set (26% vs. 10%, P < 0.001). Sixteen percent, 17%, and 20% of patients were at high risk in the training, test, and validation sets, respectively. GVHD-related mortality was greater in high-risk patients (18% vs. 4%, P < 0.001), as was severe gastrointestinal GVHD (17% vs. 8%, P < 0.001). The same algorithm can be successfully adapted to define 3 distinct risk groups at GVHD onset. CONCLUSION. A biomarker algorithm based on a blood sample taken 7 days after HCT can consistently identify a group of patients at high risk for lethal GVHD and NRM. FUNDING. The National Cancer Institute, American Cancer Society, and the Doris Duke Charitable Foundation. PMID:28194439
A relational learning approach to Structure-Activity Relationships in drug design toxicity studies.
Camacho, Rui; Pereira, Max; Costa, Vítor Santos; Fonseca, Nuno A; Adriano, Carlos; Simões, Carlos J V; Brito, Rui M M
2011-09-16
It has been recognized that the development of new therapeutic drugs is a complex and expensive process. A large number of factors affect the activity in vivo of putative candidate molecules and the propensity for causing adverse and toxic effects is recognized as one of the major hurdles behind the current "target-rich, lead-poor" scenario. Structure-Activity Relationship (SAR) studies, using relational Machine Learning (ML) algorithms, have already been shown to be very useful in the complex process of rational drug design. Despite the ML successes, human expertise is still of the utmost importance in the drug development process. An iterative process and tight integration between the models developed by ML algorithms and the know-how of medicinal chemistry experts would be a very useful symbiotic approach. In this paper we describe a software tool that achieves that goal--iLogCHEM. The tool allows the use of Relational Learners in the task of identifying molecules or molecular fragments with potential to produce toxic effects, and thus help in stream-lining drug design in silico. It also allows the expert to guide the search for useful molecules without the need to know the details of the algorithms used. The models produced by the algorithms may be visualized using a graphical interface, that is of common use amongst researchers in structural biology and medicinal chemistry. The graphical interface enables the expert to provide feedback to the learning system. The developed tool has also facilities to handle the similarity bias typical of large chemical databases. For that purpose the user can filter out similar compounds when assembling a data set. Additionally, we propose ways of providing background knowledge for Relational Learners using the results of Graph Mining algorithms. Copyright 2011 The Author(s). Published by Journal of Integrative Bioinformatics.
Simple, Scalable, Script-based, Science Processor for Measurements - Data Mining Edition (S4PM-DME)
NASA Astrophysics Data System (ADS)
Pham, L. B.; Eng, E. K.; Lynnes, C. S.; Berrick, S. W.; Vollmer, B. E.
2005-12-01
The S4PM-DME is the Goddard Earth Sciences Distributed Active Archive Center's (GES DAAC) web-based data mining environment. The S4PM-DME replaces the Near-line Archive Data Mining (NADM) system with a better web environment and a richer set of production rules. S4PM-DME enables registered users to submit and execute custom data mining algorithms. The S4PM-DME system uses the GES DAAC-developed Simple Scalable Script-based Science Processor for Measurements (S4PM) to automate tasks and perform the actual data processing. A web interface allows the user to access the S4PM-DME system. The user first develops personalized data mining algorithms on his/her home platform and then uploads them to the S4PM-DME system. Algorithms in the C and FORTRAN languages are currently supported. The user-developed algorithm is automatically audited for any potential security problems before it is installed within the S4PM-DME system and made available to the user. Once the algorithm has been installed, the user can promote the algorithm to the "operational" environment. From here the user can search and order the data available in the GES DAAC archive for his/her science algorithm. The user can also set up a processing subscription. The subscription will automatically process new data as it becomes available in the GES DAAC archive. The generated mined data products are then made available for FTP pickup. The benefits of using S4PM-DME are 1) to decrease the downloading time it typically takes a user to transfer GES DAAC data to his/her system, thus off-loading heavy network traffic, 2) to free up the load on the user's system, and 3) to utilize the rich and abundant ocean and atmosphere data from the MODIS and AIRS instruments available from the GES DAAC.
González-Recio, O; Jiménez-Montero, J A; Alenda, R
2013-01-01
In the next few years, with the advent of high-density single nucleotide polymorphism (SNP) arrays and genome sequencing, genomic evaluation methods will need to deal with a large number of genetic variants and an increasing sample size. The boosting algorithm is a machine-learning technique that may alleviate the drawbacks of dealing with such large data sets. This algorithm combines different predictors in a sequential manner with some shrinkage on them; each predictor is applied consecutively to the residuals from the committee formed by the previous ones to form a final prediction based on a subset of covariates. Here, a detailed description is provided and examples using a toy data set are included. A modification of the algorithm called "random boosting" was proposed to increase predictive ability and decrease computation time of genome-assisted evaluation in large data sets. Random boosting uses a random selection of markers to add a subsequent weak learner to the predictive model. These modifications were applied to a real data set composed of 1,797 bulls genotyped for 39,714 SNP. Deregressed proofs of 4 yield traits and 1 type trait from January 2009 routine evaluations were used as dependent variables. A 2-fold cross-validation scenario was implemented. Sires born before 2005 were used as a training sample (1,576 and 1,562 for production and type traits, respectively), whereas younger sires were used as a testing sample to evaluate predictive ability of the algorithm on yet-to-be-observed phenotypes. Comparison with the original algorithm was provided. The predictive ability of the algorithm was measured as Pearson correlations between observed and predicted responses. Further, estimated bias was computed as the average difference between observed and predicted phenotypes. The results showed that the modification of the original boosting algorithm could be run in 1% of the time used with the original algorithm and with negligible differences in accuracy and bias. This modification may be used to speed up the computation of genome-assisted evaluation in large data sets such as those obtained from consortia. Copyright © 2013 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
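The random boosting idea, adding each weak learner on a random subset of markers and shrinking its contribution, can be sketched roughly as follows. The choice of depth-1 regression trees, the shrinkage value and the marker fraction are illustrative and are not the settings used in the study.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def random_boosting(X, y, n_rounds=500, shrinkage=0.05, marker_fraction=0.1, seed=0):
    """L2-boosting where each weak learner sees only a random subset of the
    SNP markers (columns of X); learner type and settings are illustrative."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    pred = np.zeros(n)
    ensemble = []                              # (marker subset, fitted learner)
    for _ in range(n_rounds):
        cols = rng.choice(p, size=max(1, int(p * marker_fraction)), replace=False)
        tree = DecisionTreeRegressor(max_depth=1)
        tree.fit(X[:, cols], y - pred)         # fit the committee's residuals
        pred += shrinkage * tree.predict(X[:, cols])
        ensemble.append((cols, tree))
    return ensemble

def predict(ensemble, X, shrinkage=0.05):
    pred = np.zeros(X.shape[0])
    for cols, tree in ensemble:
        pred += shrinkage * tree.predict(X[:, cols])
    return pred
```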
Learning overcomplete representations from distributed data: a brief review
NASA Astrophysics Data System (ADS)
Raja, Haroon; Bajwa, Waheed U.
2016-05-01
Most of the research on dictionary learning has focused on developing algorithms under the assumption that data is available at a centralized location. But often the data is not available at a centralized location due to practical constraints like data aggregation costs, privacy concerns, etc. Using centralized dictionary learning algorithms may not be the optimal choice in such settings. This motivates the design of dictionary learning algorithms that consider the distributed nature of data as one of the problem variables. Just like in centralized settings, the distributed dictionary learning problem can be posed in more than one way depending on the problem setup. The most notable distinguishing features are the online versus batch nature of data and the representative versus discriminative nature of the dictionaries. In this paper, several distributed dictionary learning algorithms that are designed to tackle different problem setups are reviewed. One of these algorithms is cloud K-SVD, which solves the dictionary learning problem for batch data in distributed settings. One distinguishing feature of cloud K-SVD is that it has been shown to converge to its centralized counterpart, namely, the K-SVD solution. On the other hand, no such guarantees are provided for other distributed dictionary learning algorithms. Convergence of cloud K-SVD to the centralized K-SVD solution means that problems solvable by K-SVD in centralized settings can now be solved in distributed settings with similar performance. Finally, cloud K-SVD is used as an example to show the advantages that are attainable by deploying distributed dictionary learning algorithms for real-world distributed data sets.
Han, Kuk-Il; Kim, Do-Hwi; Choi, Jun-Hyuk; Kim, Tae-Kuk
2018-04-20
The threat of detection by infrared (IR) signals is higher than that by other signals such as radar or sonar because an object detected by an IR sensor cannot easily recognize its detection status. Recently, research on actively reducing the IR signal has been conducted to control the IR signal by adjusting the surface temperature of the object. In this paper, we propose an active IR stealth algorithm to synchronize the IR signals from the object and the background around the object. The proposed method includes the repulsive particle swarm optimization statistical algorithm to estimate the IR stealth surface temperature, which results in synchronization between the IR signals from the object and the surrounding background by setting the inverse distance weighted contrast radiant intensity (CRI) equal to zero. We tested the IR stealth performance in the mid-wavelength infrared (MWIR) and long-wavelength infrared (LWIR) bands for a test plate located at three different positions in a forest scene to verify the proposed method. Our results show that the inverse distance weighted active IR stealth technique proposed in this study is an effective method for reducing the contrast radiant intensity between the object and background by up to 32% as compared to the previous method, which used the CRI determined as the simple signal difference between the object and the background.
Density-based cluster algorithms for the identification of core sets
NASA Astrophysics Data System (ADS)
Lemke, Oliver; Keller, Bettina G.
2016-10-01
The core-set approach is a discretization method for Markov state models of complex molecular dynamics. Core sets are disjoint metastable regions in the conformational space, which need to be known prior to the construction of the core-set model. We propose to use density-based cluster algorithms to identify the cores. We compare three different density-based cluster algorithms: the CNN, the DBSCAN, and the Jarvis-Patrick algorithm. While the core-set models based on the CNN and DBSCAN clustering are well-converged, constructing core-set models based on the Jarvis-Patrick clustering cannot be recommended. In a well-converged core-set model, the number of core sets is up to an order of magnitude smaller than the number of states in a conventional Markov state model with comparable approximation error. Moreover, using the density-based clustering one can extend the core-set method to systems which are not strongly metastable. This is important for the practical application of the core-set method because most biologically interesting systems are only marginally metastable. The key point is to perform a hierarchical density-based clustering while monitoring the structure of the metric matrix which appears in the core-set method. We test this approach on a molecular-dynamics simulation of a highly flexible 14-residue peptide. The resulting core-set models have a high spatial resolution and can distinguish between conformationally similar yet chemically different structures, such as register-shifted hairpin structures.
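A minimal sketch of the density-based step, assuming scikit-learn's DBSCAN and a hypothetical low-dimensional projection of the trajectory frames; the CNN clustering actually recommended by the authors and the hierarchical monitoring of the metric matrix are not shown.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Hypothetical input: MD trajectory frames projected onto a few collective
# variables (e.g. two slow coordinates), shape (n_frames, n_dims).
frames = np.random.default_rng(1).normal(size=(5000, 2))

# Density-based clustering: dense regions become candidate core sets, while
# low-density frames are labeled -1 (noise) and remain unassigned.
labels = DBSCAN(eps=0.3, min_samples=50).fit_predict(frames)
core_sets = {c: np.where(labels == c)[0] for c in set(labels) if c != -1}
```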
NASA Astrophysics Data System (ADS)
Englander, J. G.; Brodrick, P. G.; Brandt, A. R.
2015-12-01
Fugitive emissions from oil and gas extraction have become a greater concern with the recent increases in development of shale hydrocarbon resources. There are significant gaps in the tools and research used to estimate fugitive emissions from oil and gas extraction. Two approaches exist for quantifying these emissions: atmospheric (or 'top down') studies, which measure methane fluxes remotely, or inventory-based ('bottom up') studies, which aggregate leakage rates on an equipment-specific basis. Bottom-up studies require counting or estimating how many devices might be leaking (called an 'activity count'), as well as how much each device might leak on average (an 'emissions factor'). In a real-world inventory, there is uncertainty in both activity counts and emissions factors. Even at the well level there are significant disagreements in data reporting. For example, some prior studies noted a ~5x difference in the number of reported well completions in the United States between EPA and private data sources. The purpose of this work is to address activity count uncertainty by using machine learning algorithms to classify oilfield surface facilities using high-resolution spatial imagery. This method can help estimate venting and fugitive emissions sources from regions where reporting of oilfield equipment is incomplete or non-existent. This work will utilize high resolution satellite imagery to count well pads in the Bakken oil field of North Dakota. This initial study examines an area of ~2,000 km2 with ~1000 well pads. We compare different machine learning classification techniques, and explore the impact of training set size, input variables, and image segmentation settings to develop efficient and robust techniques identifying well pads. We discuss the tradeoffs inherent to different classification algorithms, and determine the optimal algorithms for oilfield feature detection. In the future, the results of this work will be leveraged to be provide activity counts of oilfield surface equipment including tanks, pumpjacks, and holding ponds.
Stacked Denoising Autoencoders Applied to Star/Galaxy Classification
NASA Astrophysics Data System (ADS)
Qin, Hao-ran; Lin, Ji-ming; Wang, Jun-yi
2017-04-01
In recent years, the deep learning algorithm, with its strong adaptability, high accuracy, and structural complexity, has become more and more popular, but it has not yet been used in astronomy. In order to solve the problem that the star/galaxy classification accuracy is high for the bright source set but low for the faint source set of the Sloan Digital Sky Survey (SDSS) data, we introduced a new deep learning algorithm, namely the SDA (stacked denoising autoencoder) neural network with the dropout fine-tuning technique, which can greatly improve robustness and anti-noise performance. We randomly selected bright source sets and faint source sets from the SDSS DR12 and DR7 data with spectroscopic measurements, and preprocessed them. Then, we randomly selected training sets and testing sets without replacement from the bright source sets and faint source sets. Finally, using these training sets, we trained SDA models for the bright sources and faint sources in the SDSS DR7 and DR12, respectively. We compared the test result of the SDA model on the DR12 testing set with the test results of the Library for Support Vector Machines (LibSVM), J48 decision tree, Logistic Model Tree (LMT), Support Vector Machine (SVM), Logistic Regression, and Decision Stump algorithms, and compared the test result of the SDA model on the DR7 testing set with the test results of six kinds of decision trees. The experiments show that the SDA has a better classification accuracy than the other machine learning algorithms for the faint source sets of DR7 and DR12. In particular, when the completeness function is used as the evaluation index, the correctness rate of the SDA improves by about 15% relative to the decision tree algorithms for the faint source set of SDSS DR7.
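A single denoising-autoencoder layer of the kind stacked in an SDA can be sketched as below, assuming a TensorFlow/Keras environment. The feature dimension, noise level and training settings are illustrative stand-ins; the greedy layer-wise stacking and dropout fine-tuning described above are only indicated in the comments.

```python
import numpy as np
from tensorflow.keras import layers, models

# Toy stand-in for photometric features of sources (e.g. magnitudes, colors).
X = np.random.default_rng(0).normal(size=(1000, 10)).astype("float32")

# One denoising autoencoder layer: corrupt the input, learn to reconstruct it.
# A stacked version would train several such layers greedily, then fine-tune
# the encoder stack with a softmax output and dropout for classification.
inp = layers.Input(shape=(10,))
noisy = layers.GaussianNoise(0.2)(inp)                 # input corruption
code = layers.Dense(6, activation="relu")(noisy)       # encoder
recon = layers.Dense(10, activation="linear")(code)    # decoder
dae = models.Model(inp, recon)
dae.compile(optimizer="adam", loss="mse")
dae.fit(X, X, epochs=5, batch_size=64, verbose=0)
```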
Developing an Enhanced Lightning Jump Algorithm for Operational Use
NASA Technical Reports Server (NTRS)
Schultz, Christopher J.; Petersen, Walter A.; Carey, Lawrence D.
2009-01-01
Overall Goals: 1. Build on the lightning jump framework set through previous studies. 2. Understand what typically occurs in nonsevere convection with respect to increases in lightning. 3. Ultimately develop a lightning jump algorithm for use on the Geostationary Lightning Mapper (GLM). Four lightning jump algorithm configurations were developed (2σ, 3σ, Threshold 10 and Threshold 8). Five algorithms were tested on a population of 47 nonsevere and 38 severe thunderstorms. Results indicate that the 2σ algorithm performed best over the entire thunderstorm sample set, with a POD of 87%, a FAR of 35%, a CSI of 59% and an HSS of 75%.
System and method for resolving gamma-ray spectra
Gentile, Charles A.; Perry, Jason; Langish, Stephen W.; Silber, Kenneth; Davis, William M.; Mastrovito, Dana
2010-05-04
A system for identifying radionuclide emissions is described. The system includes at least one processor for processing output signals from a radionuclide detecting device, at least one training algorithm run by the at least one processor for analyzing data derived from at least one set of known sample data from the output signals, at least one classification algorithm derived from the training algorithm for classifying unknown sample data, wherein the at least one training algorithm analyzes the at least one sample data set to derive at least one rule used by said classification algorithm for identifying at least one radionuclide emission detected by the detecting device.
Ouyang, Qin; Zhao, Jiewen; Chen, Quansheng
2015-01-01
The non-sugar solids (NSS) content is one of the most important nutrition indicators of Chinese rice wine. This study proposed a rapid method for the measurement of NSS content in Chinese rice wine using near infrared (NIR) spectroscopy. We also systematically studied efficient spectral variable selection algorithms for the modeling. A new algorithm of synergy interval partial least squares with competitive adaptive reweighted sampling (Si-CARS-PLS) was proposed for modeling. The performance of the final model was back-evaluated using the root mean square error of calibration (RMSEC) and correlation coefficient (Rc) in the calibration set, and similarly tested by the root mean square error of prediction (RMSEP) and correlation coefficient (Rp) in the prediction set. The optimum model by the Si-CARS-PLS algorithm was achieved when 7 PLS factors and 18 variables were included, and the results were as follows: Rc=0.95 and RMSEC=1.12 in the calibration set, Rp=0.95 and RMSEP=1.22 in the prediction set. In addition, the Si-CARS-PLS algorithm showed its superiority when compared with the commonly used algorithms in multivariate calibration. This work demonstrated that the NIR spectroscopy technique combined with a suitable multivariate calibration algorithm has a high potential for rapid measurement of NSS content in Chinese rice wine. Copyright © 2015 Elsevier B.V. All rights reserved.
Algorithm That Synthesizes Other Algorithms for Hashing
NASA Technical Reports Server (NTRS)
James, Mark
2010-01-01
An algorithm that includes a collection of several subalgorithms has been devised as a means of synthesizing still other algorithms (which could include computer code) that utilize hashing to determine whether an element (typically, a number or other datum) is a member of a set (typically, a list of numbers). Each subalgorithm synthesizes an algorithm (e.g., a block of code) that maps a static set of key hashes to a somewhat linear monotonically increasing sequence of integers. The goal in formulating this mapping is to cause the length of the sequence thus generated to be as close as practicable to the original length of the set and thus to minimize gaps between the elements. The advantage of the approach embodied in this algorithm is that it completely avoids the traditional approach of hash-key look-ups that involve either secondary hash generation and look-up or further searching of a hash table for a desired key in the event of collisions. This algorithm guarantees that it will never be necessary to perform a search or to generate a secondary key in order to determine whether an element is a member of a set. This algorithm further guarantees that any algorithm that it synthesizes can be executed in constant time. To enforce these guarantees, the subalgorithms are formulated to employ a set of techniques, each of which works very effectively covering a certain class of hash-key values. These subalgorithms are of two types, summarized as follows: Given a list of numbers, try to find one or more solutions in which, if each number is shifted to the right by a constant number of bits and then masked with a rotating mask that isolates a set of bits, a unique number is thereby generated. In a variant of the foregoing procedure, omit the masking. Try various combinations of shifting, masking, and/or offsets until the solutions are found. From the set of solutions, select the one that provides the greatest compression for the representation and is executable in the minimum amount of time. Given a list of numbers, try to find one or more solutions in which, if each number is compressed by use of the modulo function by some value, then a unique value is generated.
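A toy sketch of the first type of subalgorithm, searching shift-and-mask combinations until every key in the set maps to a distinct value, is given below. The key list and search bounds are illustrative, and the real synthesizer additionally considers offsets, compression of the representation and execution time.

```python
def synthesize_hash(keys, max_shift=32, mask_bits=8):
    """Search shift/mask combinations until every key maps to a distinct
    value, so membership can later be tested without collision handling."""
    for shift in range(max_shift):
        for width in range(1, mask_bits + 1):
            mask = (1 << width) - 1
            mapped = [(k >> shift) & mask for k in keys]
            if len(set(mapped)) == len(keys):        # all unique: perfect hash
                return shift, mask, dict(zip(mapped, keys))
    return None

keys = [1024, 2051, 4099, 8212, 16405]               # illustrative key set
result = synthesize_hash(keys)
if result:
    shift, mask, table = result
    def contains(x):
        # Constant-time membership test: hash, then confirm the stored key.
        return table.get((x >> shift) & mask) == x
```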
A study on the application of topic models to motif finding algorithms.
Basha Gutierrez, Josep; Nakai, Kenta
2016-12-22
Topic models are statistical algorithms which try to discover the structure of a set of documents according to the abstract topics contained in them. Here we try to apply this approach to the discovery of the structure of the transcription factor binding sites (TFBS) contained in a set of biological sequences, which is a fundamental problem in molecular biology research for the understanding of transcriptional regulation. Here we present two methods that make use of topic models for motif finding. First, we developed an algorithm in which a set of biological sequences is treated as text documents, and the k-mers contained in them as words, in order to build a correlated topic model (CTM) and iteratively reduce its perplexity. We also used the perplexity measurement of CTMs to improve our previous algorithm based on a genetic algorithm and several statistical coefficients. The algorithms were tested with 56 data sets from four different species and compared to 14 other methods by the use of several coefficients both at nucleotide and site level. The results of our first approach showed a performance comparable to the other methods studied, especially at site level and in sensitivity scores, in which it scored better than any of the 14 existing tools. In the case of our previous algorithm, the new approach with the addition of the perplexity measurement clearly outperformed all of the other methods in sensitivity, both at nucleotide and site level, and in overall performance at site level. The statistics obtained show that the performance of a motif finding method based on the use of a CTM is satisfactory enough to conclude that the application of topic models is a valid method for developing motif finding algorithms. Moreover, the addition of topic models to a previously developed method dramatically increased its performance, suggesting that this combined algorithm can be a useful tool to successfully predict motifs in different kinds of sets of DNA sequences.
Contour-based object orientation estimation
NASA Astrophysics Data System (ADS)
Alpatov, Boris; Babayan, Pavel
2016-04-01
Real-time object orientation estimation is a relevant problem in computer vision today. In this paper we propose an approach to estimate the orientation of objects lacking axial symmetry. The proposed algorithm is intended to estimate the orientation of a specific known 3D object, so a 3D model is required for learning. The proposed orientation estimation algorithm consists of two stages: learning and estimation. The learning stage is devoted to exploring the studied object. Using the 3D model, we can gather a set of training images by capturing the 3D model from viewpoints evenly distributed on a sphere. The sphere points are distributed according to the geosphere principle, which minimizes the training image set. The gathered training image set is used for calculating descriptors, which will be used in the estimation stage of the algorithm. The estimation stage focuses on the matching process between an observed image descriptor and the training image descriptors. The experimental research was performed using a set of images of an Airbus A380. The proposed orientation estimation algorithm showed good accuracy (mean error value less than 6°) in all case studies. The real-time performance of the algorithm was also demonstrated.
Tang, Liang; Zhu, Yongfeng; Fu, Qiang
2017-01-01
Waveform sets with good correlation and/or stopband properties have received extensive attention and been widely used in multiple-input multiple-output (MIMO) radar. In this paper, we aim at designing unimodular waveform sets with good correlation and stopband properties. To formulate the problem, we construct two criteria to measure the correlation and stopband properties and then establish an unconstrained problem in the frequency domain. After deducing the phase gradient and the step size, an efficient gradient-based algorithm with monotonicity is proposed to minimize the objective function directly. For the design problem without considering the correlation weights, we develop a simplified algorithm, which only requires a few fast Fourier transform (FFT) operations and is more efficient. Because both of the algorithms can be implemented via the FFT operations and the Hadamard product, they are computationally efficient and can be used to design waveform sets with a large waveform number and waveform length. Numerical experiments show that the proposed algorithms can provide better performance than the state-of-the-art algorithms in terms of the computational complexity. PMID:28468308
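The FFT-based machinery referred to above can be illustrated by evaluating the correlation properties of a waveform set: the sketch below computes peak auto/cross-correlation sidelobes for a random-phase unimodular set via zero-padded FFTs. It is only an illustration of the FFT-based evaluation, not the authors' gradient-based design criterion or weighting.

```python
import numpy as np

def peak_correlation_sidelobe(W):
    """Peak auto/cross-correlation sidelobe of a waveform set computed via FFT.
    W: (M, N) complex unimodular waveforms."""
    M, N = W.shape
    F = np.fft.fft(W, 2 * N, axis=1)                 # zero-padded spectra
    peak = 0.0
    for p in range(M):
        for q in range(M):
            r = np.abs(np.fft.ifft(F[p] * np.conj(F[q])))   # correlation sequence
            if p == q:
                r[0] = 0.0                           # ignore the autocorrelation mainlobe
            peak = max(peak, float(r.max()))
    return peak

# Example: a random-phase unimodular set of M = 4 waveforms, length N = 128.
rng = np.random.default_rng(0)
W = np.exp(1j * 2 * np.pi * rng.random((4, 128)))
print(peak_correlation_sidelobe(W))
```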
Sea Ice Mass Reconciliation Exercise (SIMRE) for altimetry derived sea ice thickness data sets
NASA Astrophysics Data System (ADS)
Hendricks, S.; Haas, C.; Tsamados, M.; Kwok, R.; Kurtz, N. T.; Rinne, E. J.; Uotila, P.; Stroeve, J.
2017-12-01
Satellite altimetry is the primary remote sensing data source for retrieval of Arctic sea-ice thickness. Observational data sets are available from current and previous missions, namely ESA's Envisat and CryoSat as well as NASA ICESat. In addition, freeboard results have been published from the earlier ESA ERS missions and candidates for new data products are the Sentinel-3 constellation, the CNES AltiKa mission and NASA laser altimeter successor ICESat-2. With all the different aspects of sensor type and orbit configuration, all missions have unique properties. In addition, thickness retrieval algorithms have evolved over time and data centers have developed different strategies. These strategies may vary in choice of auxiliary data sets, algorithm parts and product resolution and masking. The Sea Ice Mass Reconciliation Exercise (SIMRE) is a project by the sea-ice radar altimetry community to bridge the challenges of comparing data sets across missions and algorithms. The ESA Arctic+ research program facilitates this project with the objective to collect existing data sets and to derive a reconciled estimate of Arctic sea ice mass balance. Starting with CryoSat-2 products, we compare results from different data centers (UCL, AWI, NASA JPL & NASA GSFC) at full resolution along selected orbits with independent ice thickness estimates. Three regions representative of first-year ice, multiyear ice and mixed ice conditions are used to compare the difference in thickness and thickness change between products over the seasonal cycle. We present first results and provide an outline for the further development of SIMRE activities. The methodology for comparing data sets is designed to be extendible and the project is open to contributions by interested groups. Model results of sea ice thickness will be added in a later phase of the project to extend the scope of SIMRE beyond EO products.
A new improved artificial bee colony algorithm for ship hull form optimization
NASA Astrophysics Data System (ADS)
Huang, Fuxin; Wang, Lijue; Yang, Chi
2016-04-01
The artificial bee colony (ABC) algorithm is a relatively new swarm intelligence-based optimization algorithm. Its simplicity of implementation, relatively few parameter settings and promising optimization capability have made it widely used in different fields. However, it suffers from slow convergence due to its solution search equation. Here, a new solution search equation based on a combination of an elite solution pool and a block perturbation scheme is proposed to improve the performance of the algorithm. In addition, two different solution search equations are used by employed bees and onlooker bees to balance the exploration and exploitation of the algorithm. The developed algorithm is validated on a set of well-known numerical benchmark functions. It is then applied to optimize two ship hull forms for minimum resistance. The test results show that the proposed improved ABC algorithm can outperform the ABC algorithm in most of the tested problems.
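The flavor of the solution search step can be sketched as follows: the standard ABC update perturbs one dimension of a food source toward a random neighbor, and the hypothetical elite_pool argument indicates where an elite-solution base term could enter. This is not the paper's exact equation or its block perturbation scheme.

```python
import numpy as np

rng = np.random.default_rng(0)

def abc_candidate(X, i, elite_pool=None):
    """Standard ABC solution search: perturb one dimension of food source i
    relative to a random neighbor k. If an elite pool is supplied, an elite
    solution is used as the base term (a rough stand-in for elite-pool-based
    search equations; the paper's formulation differs in detail)."""
    n, d = X.shape
    k = rng.choice([j for j in range(n) if j != i])
    j = rng.integers(d)
    base = X[i] if elite_pool is None else elite_pool[rng.integers(len(elite_pool))]
    v = X[i].copy()
    v[j] = base[j] + rng.uniform(-1, 1) * (X[i, j] - X[k, j])
    return v
```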
Veta, Mitko; Johannes van Diest, Paul; van Ginneken, Bram; Karssemeijer, Nico; Litjens, Geert; van der Laak, Jeroen A. W. M.; Hermsen, Meyke; Manson, Quirine F; Balkenhol, Maschenka; Geessink, Oscar; Stathonikos, Nikolaos; van Dijk, Marcory CRF; Bult, Peter; Beca, Francisco; Beck, Andrew H; Wang, Dayong; Khosla, Aditya; Gargeya, Rishab; Irshad, Humayun; Zhong, Aoxiao; Dou, Qi; Li, Quanzheng; Chen, Hao; Lin, Huang-Jing; Heng, Pheng-Ann; Haß, Christian; Bruni, Elia; Wong, Quincy; Halici, Ugur; Öner, Mustafa Ümit; Cetin-Atalay, Rengul; Berseth, Matt; Khvatkov, Vitali; Vylegzhanin, Alexei; Kraus, Oren; Shaban, Muhammad; Rajpoot, Nasir; Awan, Ruqayya; Sirinukunwattana, Korsuk; Qaiser, Talha; Tsang, Yee-Wah; Tellez, David; Annuscheit, Jonas; Hufnagl, Peter; Valkonen, Mira; Kartasalo, Kimmo; Latonen, Leena; Ruusuvuori, Pekka; Liimatainen, Kaisa; Albarqouni, Shadi; Mungal, Bharti; George, Ami; Demirci, Stefanie; Navab, Nassir; Watanabe, Seiryo; Seno, Shigeto; Takenaka, Yoichi; Matsuda, Hideo; Ahmady Phoulady, Hady; Kovalev, Vassili; Kalinovsky, Alexander; Liauchuk, Vitali; Bueno, Gloria; Fernandez-Carrobles, M. Milagro; Serrano, Ismael; Deniz, Oscar; Racoceanu, Daniel; Venâncio, Rui
2017-01-01
Importance Application of deep learning algorithms to whole-slide pathology images can potentially improve diagnostic accuracy and efficiency. Objective Assess the performance of automated deep learning algorithms at detecting metastases in hematoxylin and eosin–stained tissue sections of lymph nodes of women with breast cancer and compare it with pathologists’ diagnoses in a diagnostic setting. Design, Setting, and Participants Researcher challenge competition (CAMELYON16) to develop automated solutions for detecting lymph node metastases (November 2015-November 2016). A training data set of whole-slide images from 2 centers in the Netherlands with (n = 110) and without (n = 160) nodal metastases verified by immunohistochemical staining were provided to challenge participants to build algorithms. Algorithm performance was evaluated in an independent test set of 129 whole-slide images (49 with and 80 without metastases). The same test set of corresponding glass slides was also evaluated by a panel of 11 pathologists with time constraint (WTC) from the Netherlands to ascertain likelihood of nodal metastases for each slide in a flexible 2-hour session, simulating routine pathology workflow, and by 1 pathologist without time constraint (WOTC). Exposures Deep learning algorithms submitted as part of a challenge competition or pathologist interpretation. Main Outcomes and Measures The presence of specific metastatic foci and the absence vs presence of lymph node metastasis in a slide or image using receiver operating characteristic curve analysis. The 11 pathologists participating in the simulation exercise rated their diagnostic confidence as definitely normal, probably normal, equivocal, probably tumor, or definitely tumor. Results The area under the receiver operating characteristic curve (AUC) for the algorithms ranged from 0.556 to 0.994. The top-performing algorithm achieved a lesion-level, true-positive fraction comparable with that of the pathologist WOTC (72.4% [95% CI, 64.3%-80.4%]) at a mean of 0.0125 false-positives per normal whole-slide image. For the whole-slide image classification task, the best algorithm (AUC, 0.994 [95% CI, 0.983-0.999]) performed significantly better than the pathologists WTC in a diagnostic simulation (mean AUC, 0.810 [range, 0.738-0.884]; P < .001). The top 5 algorithms had a mean AUC that was comparable with the pathologist interpreting the slides in the absence of time constraints (mean AUC, 0.960 [range, 0.923-0.994] for the top 5 algorithms vs 0.966 [95% CI, 0.927-0.998] for the pathologist WOTC). Conclusions and Relevance In the setting of a challenge competition, some deep learning algorithms achieved better diagnostic performance than a panel of 11 pathologists participating in a simulation exercise designed to mimic routine pathology workflow; algorithm performance was comparable with an expert pathologist interpreting whole-slide images without time constraints. Whether this approach has clinical utility will require evaluation in a clinical setting. PMID:29234806
Trend discrepancies among three best track data sets of western North Pacific tropical cyclones
NASA Astrophysics Data System (ADS)
Song, Jin-Jie; Wang, Yuan; Wu, Liguang
2010-06-01
The hot debate over the influence of global warming on tropical cyclone (TC) activity in the western North Pacific over the past several decades is partly due to the diversity of TC data sets used in recent publications. This study investigates differences of track, intensity, frequency, and the associated long-term trends for those TCs that were simultaneously recorded by the best track data sets of the Joint Typhoon Warning Center (JTWC), the Regional Specialized Meteorological Center (RSMC) Tokyo, and the Shanghai Typhoon Institute (STI). Though the differences in TC tracks among these data sets are negligibly small, the JTWC data set tends to classify TCs of category 2-3 as category 4-5, leading to an upward trend in the annual frequency of category 4-5 TCs and the annual accumulated power dissipation index, as reported by Webster et al. (2005) and Emanuel (2005). This trend and potential destructiveness over the period 1977-2007 are found only with the JTWC data set, but downward trends are apparent in the RSMC and STI data sets. It is concluded that the different algorithms used in determining TC intensity may cause the trend discrepancies of TC activity in the western North Pacific.
Saeedi, Ramyar; Purath, Janet; Venkatasubramanian, Krishna; Ghasemzadeh, Hassan
2014-01-01
Mobile wearable sensors have demonstrated great potential in a broad range of applications in healthcare and wellness. These technologies are known for their potential to revolutionize the way next generation medical services are supplied and consumed by providing more effective interventions, improving health outcomes, and substantially reducing healthcare costs. Despite these potentials, utilization of these sensor devices is currently limited to lab settings and in highly controlled clinical trials. A major obstacle in widespread utilization of these systems is that the sensors need to be used in predefined locations on the body in order to provide accurate outcomes such as type of physical activity performed by the user. This has reduced users' willingness to utilize such technologies. In this paper, we propose a novel signal processing approach that leverages feature selection algorithms for accurate and automatic localization of wearable sensors. Our results based on real data collected using wearable motion sensors demonstrate that the proposed approach can perform sensor localization with 98.4% accuracy which is 30.7% more accurate than an approach without a feature selection mechanism. Furthermore, utilizing our node localization algorithm aids the activity recognition algorithm to achieve 98.8% accuracy (an increase from 33.6% for the system without node localization).
Sariyar, M; Borg, A; Pommerening, K
2012-10-01
Supervised record linkage methods often require a clerical review to gain informative training data. Active learning means to actively prompt the user to label data with special characteristics in order to minimise the review costs. We conducted an empirical evaluation to investigate whether a simple active learning strategy using binary comparison patterns is sufficient or if string metrics together with a more sophisticated algorithm are necessary to achieve high accuracies with a small training set. Based on medical registry data with different numbers of attributes, we used active learning to acquire training sets for classification trees, which were then used to classify the remaining data. Active learning for binary patterns means that every distinct comparison pattern represents a stratum from which one item is sampled. Active learning for patterns consisting of the Levenshtein string metric values uses an iterative process where the most informative and representative examples are added to the training set. In this context, we extended the active learning strategy by Sarawagi and Bhamidipaty (2002). On the original data set, active learning based on binary comparison patterns leads to the best results. When dropping four or six attributes, using string metrics leads to better results. In both cases, not more than 200 manually reviewed training examples are necessary. In record linkage applications where only forename, name and birthday are available as attributes, we suggest the sophisticated active learning strategy based on string metrics in order to achieve highly accurate results. We recommend the simple strategy if more attributes are available, as in our study. In both cases, active learning significantly reduces the amount of manual involvement in training data selection compared to usual record linkage settings. Copyright © 2012 Elsevier Inc. All rights reserved.
Snyder, David A; Montelione, Gaetano T
2005-06-01
An important open question in the field of NMR-based biomolecular structure determination is how best to characterize the precision of the resulting ensemble of structures. Typically, the RMSD, as minimized in superimposing the ensemble of structures, is the preferred measure of precision. However, the presence of poorly determined atomic coordinates and multiple "RMSD-stable domains"--locally well-defined regions that are not aligned in global superimpositions--complicate RMSD calculations. In this paper, we present a method, based on a novel, structurally defined order parameter, for identifying a set of core atoms to use in determining superimpositions for RMSD calculations. In addition we present a method for deciding whether to partition that core atom set into "RMSD-stable domains" and, if so, how to determine partitioning of the core atom set. We demonstrate our algorithm and its application in calculating statistically sound RMSD values by applying it to a set of NMR-derived structural ensembles, superimposing each RMSD-stable domain (or the entire core atom set, where appropriate) found in each protein structure under consideration. A parameter calculated by our algorithm using a novel, kurtosis-based criterion, the epsilon-value, is a measure of precision of the superimposition that complements the RMSD. In addition, we compare our algorithm with previously described algorithms for determining core atom sets. The methods presented in this paper for biomolecular structure superimposition are quite general, and have application in many areas of structural bioinformatics and structural biology.
Clustering Binary Data in the Presence of Masking Variables
ERIC Educational Resources Information Center
Brusco, Michael J.
2004-01-01
A number of important applications require the clustering of binary data sets. Traditional nonhierarchical cluster analysis techniques, such as the popular K-means algorithm, can often be successfully applied to these data sets. However, the presence of masking variables in a data set can impede the ability of the K-means algorithm to recover the…
Research on distributed heterogeneous data PCA algorithm based on cloud platform
NASA Astrophysics Data System (ADS)
Zhang, Jin; Huang, Gang
2018-05-01
Principal component analysis (PCA) of heterogeneous data sets can address the problem of limited scalability in centralized data processing. In order to reduce the generation of intermediate data and error components in distributed heterogeneous data sets, a principal component analysis algorithm for heterogeneous data sets on a cloud platform is proposed. The algorithm performs the eigenvalue computation using Householder tridiagonalization and QR factorization, and calculates the error component of the heterogeneous database associated with the public key to obtain the intermediate data set and the lost information. Experiments on distributed DBM heterogeneous data sets show that the proposed method is feasible and reliable in terms of execution time and accuracy.
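The distributed, cloud-based part of the algorithm is not reproduced here; the minimal sketch below only shows the core eigenvalue step of PCA on a single centred data block, relying on numpy's symmetric eigensolver (which internally uses Householder tridiagonalization followed by QR-type iteration, the ingredients named in the abstract). The data and the number of retained components are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for one site's slice of a distributed heterogeneous data set.
X = rng.normal(size=(1000, 20))
X -= X.mean(axis=0)

# Covariance eigendecomposition; LAPACK's symmetric solver used by eigh
# reduces the matrix to tridiagonal form with Householder reflections and
# then applies QR-type iteration.
cov = (X.T @ X) / (len(X) - 1)
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]
components = eigvecs[:, order[:5]]          # top-5 principal axes
scores = X @ components                     # projected (reduced) data
print(eigvals[order[:5]], scores.shape)
```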
Inferring gene expression from ribosomal promoter sequences, a crowdsourcing approach
Meyer, Pablo; Siwo, Geoffrey; Zeevi, Danny; Sharon, Eilon; Norel, Raquel; Segal, Eran; Stolovitzky, Gustavo; Siwo, Geoffrey; Rider, Andrew K.; Tan, Asako; Pinapati, Richard S.; Emrich, Scott; Chawla, Nitesh; Ferdig, Michael T.; Tung, Yi-An; Chen, Yong-Syuan; Chen, Mei-Ju May; Chen, Chien-Yu; Knight, Jason M.; Sahraeian, Sayed Mohammad Ebrahim; Esfahani, Mohammad Shahrokh; Dreos, Rene; Bucher, Philipp; Maier, Ezekiel; Saeys, Yvan; Szczurek, Ewa; Myšičková, Alena; Vingron, Martin; Klein, Holger; Kiełbasa, Szymon M.; Knisley, Jeff; Bonnell, Jeff; Knisley, Debra; Kursa, Miron B.; Rudnicki, Witold R.; Bhattacharjee, Madhuchhanda; Sillanpää, Mikko J.; Yeung, James; Meysman, Pieter; Rodríguez, Aminael Sánchez; Engelen, Kristof; Marchal, Kathleen; Huang, Yezhou; Mordelet, Fantine; Hartemink, Alexander; Pinello, Luca; Yuan, Guo-Cheng
2013-01-01
The Gene Promoter Expression Prediction challenge consisted of predicting gene expression from promoter sequences in a previously unknown experimentally generated data set. The challenge was presented to the community in the framework of the sixth Dialogue for Reverse Engineering Assessments and Methods (DREAM6), a community effort to evaluate the status of systems biology modeling methodologies. Nucleotide-specific promoter activity was obtained by measuring fluorescence from promoter sequences fused upstream of a gene for yellow fluorescence protein and inserted in the same genomic site of yeast Saccharomyces cerevisiae. Twenty-one teams submitted results predicting the expression levels of 53 different promoters from yeast ribosomal protein genes. Analysis of participant predictions shows that accurate values for low-expressed and mutated promoters were difficult to obtain, although in the latter case, only when the mutation induced a large change in promoter activity compared to the wild-type sequence. As in previous DREAM challenges, we found that aggregation of participant predictions provided robust results, but did not fare better than the three best algorithms. Finally, this study not only provides a benchmark for the assessment of methods predicting activity of a specific set of promoters from their sequence, but it also shows that the top performing algorithm, which used machine-learning approaches, can be improved by the addition of biological features such as transcription factor binding sites. PMID:23950146
Muscillo, Rossana; Conforto, Silvia; Schmid, Maurizio; Caselli, Paolo; D'Alessio, Tommaso
2007-01-01
In the context of tele-monitoring, great interest is presently devoted to physical activity, mainly of elderly people or people with disabilities. In this context, many researchers have studied the recognition of activities of daily living by using accelerometers. The present work proposes a novel algorithm for activity recognition that accounts for variability in movement speed by using dynamic programming. This objective is realized by means of a matching and recognition technique that determines the distance between the input signal and a set of previously defined templates. Two different approaches are presented here, one based on Dynamic Time Warping (DTW) and the other based on Derivative Dynamic Time Warping (DDTW). The algorithm was applied to the recognition of gait, climbing stairs, and descending stairs, using a biaxial accelerometer placed on the shin. The results for DDTW, obtained by using only one sensor channel on the shin, showed an average recognition score of 95%, higher than the values obtained with DTW (around 85%). Both DTW and DDTW consistently show higher classification rates than classical Linear Time Warping (LTW).
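A minimal sketch of the two template-matching distances mentioned above is given below. The quadratic-time dynamic program is the textbook DTW formulation, and the derivative variant simply warps numerically estimated derivatives (np.gradient) rather than the exact derivative estimate of the original DDTW formulation; the template and query are synthetic stand-ins for accelerometer signals.

```python
import numpy as np

def dtw_distance(a, b):
    """Classic dynamic-programming DTW between two 1-D sequences."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

def ddtw_distance(a, b):
    """Derivative DTW: align estimated local derivatives instead of raw samples."""
    return dtw_distance(np.gradient(np.asarray(a, float)),
                        np.gradient(np.asarray(b, float)))

template = np.sin(np.linspace(0, 4 * np.pi, 80))        # e.g. a gait template
query = 1.1 * np.sin(np.linspace(0, 4 * np.pi, 120))    # same movement, slower
print(dtw_distance(template, query), ddtw_distance(template, query))
```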
Construction of CASCI-type wave functions for very large active spaces.
Boguslawski, Katharina; Marti, Konrad H; Reiher, Markus
2011-06-14
We present a procedure to construct a configuration-interaction expansion containing arbitrary excitations from an underlying full-configuration-interaction-type wave function defined for a very large active space. Our procedure is based on the density-matrix renormalization group (DMRG) algorithm that provides the necessary information in terms of the eigenstates of the reduced density matrices to calculate the coefficient of any basis state in the many-particle Hilbert space. Since the dimension of the Hilbert space scales binomially with the size of the active space, a sophisticated Monte Carlo sampling routine is employed. This sampling algorithm can also construct such configuration-interaction-type wave functions from any other type of tensor network states. The configuration-interaction information obtained serves several purposes. It yields a qualitatively correct description of the molecule's electronic structure, it allows us to analyze DMRG wave functions converged for the same molecular system but with different parameter sets (e.g., different numbers of active-system (block) states), and it can be considered a balanced reference for the application of a subsequent standard multi-reference configuration-interaction method.
A learning approach to the bandwidth multicolouring problem
NASA Astrophysics Data System (ADS)
Akbari Torkestani, Javad
2016-05-01
In this article, a generalisation of the vertex colouring problem known as the bandwidth multicolouring problem (BMCP), in which a set of colours is assigned to each vertex such that the difference between the colours assigned to a vertex and those assigned to its neighbours is never less than a predefined threshold, is considered. It is shown that the proposed method can be applied to solve the bandwidth colouring problem (BCP) as well. BMCP is known to be NP-hard in graph theory, and so a large number of approximation solutions, as well as exact algorithms, have been proposed to solve it. In this article, two learning automata-based approximation algorithms are proposed for estimating a near-optimal solution to the BMCP. We show, for the first proposed algorithm, that by choosing a proper learning rate, the algorithm finds the optimal solution with a probability close enough to unity. Moreover, we compute the worst-case time complexity of the first algorithm for finding a 1/(1-ɛ) optimal solution to the given problem. The main advantage of this method is that a trade-off between the running time of the algorithm and the colour set size (colouring optimality) can be made, also by a proper choice of the learning rate. Finally, it is shown that the running time of the proposed algorithm is independent of the graph size, and so it is a scalable algorithm for large graphs. The second proposed algorithm is compared with some well-known colouring algorithms and the results show the efficiency of the proposed algorithm in terms of the colour set size and running time.
The center for causal discovery of biomedical knowledge from big data
Bahar, Ivet; Becich, Michael J; Benos, Panayiotis V; Berg, Jeremy; Espino, Jeremy U; Glymour, Clark; Jacobson, Rebecca Crowley; Kienholz, Michelle; Lee, Adrian V; Lu, Xinghua; Scheines, Richard
2015-01-01
The Big Data to Knowledge (BD2K) Center for Causal Discovery is developing and disseminating an integrated set of open source tools that support causal modeling and discovery of biomedical knowledge from large and complex biomedical datasets. The Center integrates teams of biomedical and data scientists focused on the refinement of existing and the development of new constraint-based and Bayesian algorithms based on causal Bayesian networks, the optimization of software for efficient operation in a supercomputing environment, and the testing of algorithms and software developed using real data from 3 representative driving biomedical projects: cancer driver mutations, lung disease, and the functional connectome of the human brain. Associated training activities provide both biomedical and data scientists with the knowledge and skills needed to apply and extend these tools. Collaborative activities with the BD2K Consortium further advance causal discovery tools and integrate tools and resources developed by other centers. PMID:26138794
A Novel and Simple Spike Sorting Implementation.
Petrantonakis, Panagiotis C; Poirazi, Panayiota
2017-04-01
Monitoring the activity of multiple individual neurons that fire spikes in the vicinity of an electrode, namely performing a Spike Sorting (SS) procedure, comprises one of the most important tools for contemporary neuroscience in order to reverse-engineer the brain. As recording electrode technology rapidly evolves by integrating thousands of electrodes in a confined spatial setting, the algorithms that are used to monitor individual neurons from recorded signals have to become even more reliable and computationally efficient. In this work, we propose a novel framework of the SS approach in which a single-step processing of the raw (unfiltered) extracellular signal is sufficient for both the detection and sorting of the activity of individual neurons. Despite its simplicity, the proposed approach exhibits comparable performance with state-of-the-art approaches, especially for spike detection in noisy signals, and paves the way for a new family of SS algorithms with the potential for multi-recording, fast, on-chip implementations.
Ranking online quality and reputation via the user activity
NASA Astrophysics Data System (ADS)
Liu, Xiao-Lu; Guo, Qiang; Hou, Lei; Cheng, Can; Liu, Jian-Guo
2015-10-01
How to design an accurate algorithm for ranking the object quality and user reputation is of importance for online rating systems. In this paper we present an improved iterative algorithm for online ranking object quality and user reputation in terms of the user degree (IRUA), where the user's reputation is measured by his/her rating vector, the corresponding objects' quality vector and the user degree. The experimental results for the empirical networks show that the AUC values of the IRUA algorithm can reach 0.9065 and 0.8705 in Movielens and Netflix data sets, respectively, which is better than the results generated by the traditional iterative ranking methods. Meanwhile, the results for the synthetic networks indicate that user degree should be considered in real rating systems due to users' rating behaviors. Moreover, we find that enhancing or reducing the influences of the large-degree users could produce more accurate reputation ranking lists.
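To make the iterative quality/reputation idea concrete, here is a toy sketch in which object quality is a reputation-weighted average of ratings and user reputation depends on agreement with that quality, scaled by user degree. This is a simplified illustration under stated assumptions, not the published IRUA update rule or its parameters.

```python
import numpy as np

def iterative_ranking(R, n_iter=50, eps=1e-8):
    """Toy iterative quality/reputation ranking on a rating matrix R
    (users x objects, 0 = no rating); the degree weighting is a
    simplification of the IRUA idea, not the published update rule."""
    rated = R > 0
    user_degree = rated.sum(axis=1)
    reputation = np.ones(R.shape[0])
    for _ in range(n_iter):
        # Object quality: reputation-weighted average of received ratings.
        w = reputation[:, None] * rated
        quality = (w * R).sum(axis=0) / (w.sum(axis=0) + eps)
        # User reputation: agreement with current quality, scaled by degree.
        err = ((R - quality[None, :]) * rated) ** 2
        reputation = user_degree / (err.sum(axis=1) + 1.0)
        reputation /= reputation.max() + eps
    return quality, reputation

rng = np.random.default_rng(1)
true_q = rng.uniform(1, 5, size=30)
R = np.clip(true_q[None, :] + rng.normal(0, 0.5, size=(100, 30)), 1, 5)
R[rng.random(R.shape) < 0.6] = 0.0          # sparse ratings
quality, reputation = iterative_ranking(R)
print("correlation with true quality:", np.corrcoef(quality, true_q)[0, 1])
```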
Ehteshami Bejnordi, Babak; Veta, Mitko; Johannes van Diest, Paul; van Ginneken, Bram; Karssemeijer, Nico; Litjens, Geert; van der Laak, Jeroen A W M; Hermsen, Meyke; Manson, Quirine F; Balkenhol, Maschenka; Geessink, Oscar; Stathonikos, Nikolaos; van Dijk, Marcory Crf; Bult, Peter; Beca, Francisco; Beck, Andrew H; Wang, Dayong; Khosla, Aditya; Gargeya, Rishab; Irshad, Humayun; Zhong, Aoxiao; Dou, Qi; Li, Quanzheng; Chen, Hao; Lin, Huang-Jing; Heng, Pheng-Ann; Haß, Christian; Bruni, Elia; Wong, Quincy; Halici, Ugur; Öner, Mustafa Ümit; Cetin-Atalay, Rengul; Berseth, Matt; Khvatkov, Vitali; Vylegzhanin, Alexei; Kraus, Oren; Shaban, Muhammad; Rajpoot, Nasir; Awan, Ruqayya; Sirinukunwattana, Korsuk; Qaiser, Talha; Tsang, Yee-Wah; Tellez, David; Annuscheit, Jonas; Hufnagl, Peter; Valkonen, Mira; Kartasalo, Kimmo; Latonen, Leena; Ruusuvuori, Pekka; Liimatainen, Kaisa; Albarqouni, Shadi; Mungal, Bharti; George, Ami; Demirci, Stefanie; Navab, Nassir; Watanabe, Seiryo; Seno, Shigeto; Takenaka, Yoichi; Matsuda, Hideo; Ahmady Phoulady, Hady; Kovalev, Vassili; Kalinovsky, Alexander; Liauchuk, Vitali; Bueno, Gloria; Fernandez-Carrobles, M Milagro; Serrano, Ismael; Deniz, Oscar; Racoceanu, Daniel; Venâncio, Rui
2017-12-12
Application of deep learning algorithms to whole-slide pathology images can potentially improve diagnostic accuracy and efficiency. The objective was to assess the performance of automated deep learning algorithms at detecting metastases in hematoxylin and eosin-stained tissue sections of lymph nodes of women with breast cancer and to compare it with pathologists' diagnoses in a diagnostic setting. The study was a researcher challenge competition (CAMELYON16) to develop automated solutions for detecting lymph node metastases (November 2015-November 2016). A training data set of whole-slide images from 2 centers in the Netherlands with (n = 110) and without (n = 160) nodal metastases verified by immunohistochemical staining was provided to challenge participants to build algorithms. Algorithm performance was evaluated in an independent test set of 129 whole-slide images (49 with and 80 without metastases). The same test set of corresponding glass slides was also evaluated by a panel of 11 pathologists with time constraint (WTC) from the Netherlands to ascertain the likelihood of nodal metastases for each slide in a flexible 2-hour session, simulating routine pathology workflow, and by 1 pathologist without time constraint (WOTC). The compared approaches were deep learning algorithms submitted as part of the challenge competition and pathologist interpretation. The main outcomes were the presence of specific metastatic foci and the absence vs presence of lymph node metastasis in a slide or image, assessed using receiver operating characteristic curve analysis. The 11 pathologists participating in the simulation exercise rated their diagnostic confidence as definitely normal, probably normal, equivocal, probably tumor, or definitely tumor. The area under the receiver operating characteristic curve (AUC) for the algorithms ranged from 0.556 to 0.994. The top-performing algorithm achieved a lesion-level, true-positive fraction comparable with that of the pathologist WOTC (72.4% [95% CI, 64.3%-80.4%]) at a mean of 0.0125 false-positives per normal whole-slide image. For the whole-slide image classification task, the best algorithm (AUC, 0.994 [95% CI, 0.983-0.999]) performed significantly better than the pathologists WTC in a diagnostic simulation (mean AUC, 0.810 [range, 0.738-0.884]; P < .001). The top 5 algorithms had a mean AUC that was comparable with the pathologist interpreting the slides in the absence of time constraints (mean AUC, 0.960 [range, 0.923-0.994] for the top 5 algorithms vs 0.966 [95% CI, 0.927-0.998] for the pathologist WOTC). In the setting of a challenge competition, some deep learning algorithms achieved better diagnostic performance than a panel of 11 pathologists participating in a simulation exercise designed to mimic routine pathology workflow; algorithm performance was comparable with an expert pathologist interpreting whole-slide images without time constraints. Whether this approach has clinical utility will require evaluation in a clinical setting.
Can we make a carpet smart enough to detect falls?
Muheidat, Fadi; Tyrer, Harry W
2016-08-01
In this paper, we have enhanced the smart carpet, a floor-based personnel detection system, to detect falls using a faster but low-cost processor. Our hardware front end reads 128 sensors, which output a voltage when a person walks or falls on the carpet. The processor is a Jetson TK1, which provides more computing power than before. We generated a dataset with volunteers who walked and fell in order to test our algorithms. The data obtained allowed us to examine data frames (a frame is a single scan of the carpet sensors) read from the data acquisition system. We applied different algorithms and techniques, varying the window size in number of frames (WS ≥ 1) and the threshold (TH), to build our data set, which was later used with machine learning to help decide whether a fall occurred. We then used the dataset obtained from applying a set of fall detection algorithms, together with the video recorded during the fall pattern experiments, to train a set of classifiers under multiple test options using the Weka framework. We measured the sensitivity, specificity, and other metrics of the system for intelligent detection of falls. Results showed that the computational intelligence techniques detect falls with 96.2% accuracy, 81% sensitivity, and 97.8% specificity. In addition to fall detection, we developed a database system and web applications to retain these data for years. We can display these data in real time, for all activities on the carpet, allowing extensive data analysis at any time in the future.
Machine Learning Method for Pattern Recognition in Volcano Seismic Spectra
NASA Astrophysics Data System (ADS)
Radic, V.; Unglert, K.; Jellinek, M.
2016-12-01
Variations in the spectral content of volcano seismicity related to changes in volcanic activity are commonly identified manually in spectrograms. However, long time series of monitoring data at volcano observatories require tools to facilitate automated and rapid processing. Techniques such as Self-Organizing Maps (SOM), Principal Component Analysis (PCA) and clustering methods can help to quickly and automatically identify important patterns related to impending eruptions. In this study we develop and evaluate an algorithm applied on a set of synthetic volcano seismic spectra as well as observed spectra from Kılauea Volcano, Hawai`i. Our goal is to retrieve a set of known spectral patterns that are associated with dominant phases of volcanic tremor before, during, and after periods of volcanic unrest. The algorithm is based on training a SOM on the spectra and then identifying local maxima and minima on the SOM 'topography'. The topography is derived from the first two PCA modes so that the maxima represent the SOM patterns that carry most of the variance in the spectra. Patterns identified in this way reproduce the known set of spectra. Our results show that, regardless of the level of white noise in the spectra, the algorithm can accurately reproduce the characteristic spectral patterns and their occurrence in time. The ability to rapidly classify spectra of volcano seismic data without prior knowledge of the character of the seismicity at a given volcanic system holds great potential for real time or near-real time applications, and thus ultimately for eruption forecasting.
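A minimal sketch of the dimensionality-reduction step on synthetic spectra is shown below; it covers only the PCA part (the first two modes that define the SOM "topography"), while the SOM training and maxima/minima identification are omitted. The synthetic tremor patterns and noise level are assumptions made only for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
freqs = np.linspace(0.5, 10, 200)                       # Hz axis of each spectrum

# Two synthetic "tremor" spectral patterns plus white noise stand in for
# the known test set described in the abstract.
pattern_a = np.exp(-0.5 * ((freqs - 2.0) / 0.3) ** 2)
pattern_b = np.exp(-0.5 * ((freqs - 5.5) / 0.5) ** 2)
mix = rng.random(500)[:, None]
spectra = (mix * pattern_a + (1 - mix) * pattern_b
           + 0.05 * rng.normal(size=(500, 200)))

# The first two PCA modes summarise the dominant spectral variability; in the
# study these modes define the surface on which SOM patterns are ranked.
pca = PCA(n_components=2).fit(spectra)
scores = pca.transform(spectra)
print(pca.explained_variance_ratio_, scores.shape)
```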
NASA Astrophysics Data System (ADS)
Akhmedova, Sh; Semenkin, E.
2017-02-01
Previously, a meta-heuristic approach, called Co-Operation of Biology-Related Algorithms or COBRA, for solving real-parameter optimization problems was introduced and described. COBRA’s basic idea consists of a cooperative work of five well-known bionic algorithms such as Particle Swarm Optimization, the Wolf Pack Search, the Firefly Algorithm, the Cuckoo Search Algorithm and the Bat Algorithm, which were chosen due to the similarity of their schemes. The performance of this meta-heuristic was evaluated on a set of test functions and its workability was demonstrated. Thus it was established that the idea of the algorithms’ cooperative work is useful. However, it is unclear which bionic algorithms should be included in this cooperation and how many of them. Therefore, the five above-listed algorithms and additionally the Fish School Search algorithm were used for the development of five different modifications of COBRA by varying the number of component-algorithms. These modifications were tested on the same set of functions and the best of them was found. Ways of further improving the COBRA algorithm are then discussed.
BOREAS RSS-7 Regional LAI and FPAR Images From 10-Day AVHRR-LAC Composites
NASA Technical Reports Server (NTRS)
Hall, Forrest G. (Editor); Nickeson, Jaime (Editor); Chen, Jing; Cihlar, Josef
2000-01-01
The BOReal Ecosystem-Atmosphere Study Remote Sensing Science (BOREAS RSS-7) team collected various data sets to develop and validate an algorithm to allow the retrieval of the spatial distribution of Leaf Area Index (LAI) from remotely sensed images. Advanced Very High Resolution Radiometer (AVHRR) level-4c 10-day composite Normalized Difference Vegetation Index (NDVI) images produced at CCRS were used to produce images of LAI and the Fraction of Photosynthetically Active Radiation (FPAR) absorbed by plant canopies for the three summer IFCs in 1994 across the BOREAS region. The algorithms were developed based on ground measurements and Landsat Thematic Mapper (TM) images. The data are stored in binary image format files.
Solving LP Relaxations of Large-Scale Precedence Constrained Problems
NASA Astrophysics Data System (ADS)
Bienstock, Daniel; Zuckerberg, Mark
We describe new algorithms for solving linear programming relaxations of very large precedence constrained production scheduling problems. We present theory that motivates a new set of algorithmic ideas that can be employed on a wide range of problems; on data sets arising in the mining industry our algorithms prove effective on problems with many millions of variables and constraints, obtaining provably optimal solutions in a few minutes of computation.
NASA Astrophysics Data System (ADS)
Cheng, Jun; Zhang, Jun; Tian, Jinwen
2015-12-01
Based on a deep analysis of the LiveWire interactive boundary extraction algorithm, a new algorithm focusing on improving the speed of the LiveWire algorithm is proposed in this paper. Firstly, the Haar wavelet transform is carried out on the input image, and the boundary is extracted on the low-resolution image obtained from the wavelet transform. Secondly, the LiveWire shortest path is calculated with a direction-guided search over the control point set, utilizing the spatial relationship between the two control points that users provide in real time. Thirdly, the search order of the points adjacent to the starting node is set in advance, and an ordinary queue instead of a priority queue is used as the storage pool of the points when optimizing their shortest-path values, thus reducing the complexity of the algorithm from O(n²) to O(n). Finally, a region-iterative backward projection method based on neighborhood pixel polling is used to convert the dual-pixel boundary of the reconstructed image to a single-pixel boundary after the inverse Haar wavelet transform. The algorithm proposed in this paper combines the advantage of the Haar wavelet transform, which offers fast image decomposition and reconstruction and is more consistent with the texture features of the image, with the advantage of the optimal path search based on the control point set direction search, which reduces the time complexity of the original algorithm. The algorithm therefore improves the speed of interactive boundary extraction while reflecting the boundary information of the image more comprehensively, which plays a large role in improving the execution efficiency and the robustness of the algorithm.
Onboard Science and Applications Algorithm for Hyperspectral Data Reduction
NASA Technical Reports Server (NTRS)
Chien, Steve A.; Davies, Ashley G.; Silverman, Dorothy; Mandl, Daniel
2012-01-01
An onboard processing mission concept is under development for a possible Direct Broadcast capability for the HyspIRI mission, a hyperspectral remote sensing mission under consideration for launch in the next decade. The concept would intelligently subsample the data spectrally and spatially, as well as generate science products onboard, to enable return of key rapid-response science and applications information despite limited downlink bandwidth. This rapid data delivery concept focuses on wildfires and volcanoes as primary applications, but also extends to vegetation, coastal flooding, dust, and snow/ice applications. Operationally, the HyspIRI team would define a set of spatial regions of interest where specific algorithms would be executed. For example, known coastal areas would have certain products or bands downlinked, ocean areas might have other bands downlinked, and during fire seasons other areas would be processed for active fire detections. Ground operations would automatically generate the mission plans specifying the highest priority tasks executable within onboard computation, setup, and data downlink constraints. The spectral bands of the TIR (thermal infrared) instrument can accurately detect the thermal signature of fires and send down alerts, as well as the thermal and VSWIR (visible to short-wave infrared) data corresponding to the active fires. Active volcanism also produces a distinctive thermal signature that can be detected onboard to enable spatial subsampling. Onboard algorithms and ground-based algorithms suitable for onboard deployment are mature. On HyspIRI, the algorithm would perform a table-driven temperature inversion from several spectral TIR bands, and then trigger downlink of the entire spectrum for each of the hot pixels identified. Ocean and coastal applications include sea surface temperature (using a small spectral subset of TIR data, but requiring considerable ancillary data), and ocean color applications to track biological activity such as harmful algal blooms. Measuring surface water extent to track flooding is another rapid response product leveraging VSWIR spectral information.
Robust algorithm for aligning two-dimensional chromatograms.
Gros, Jonas; Nabi, Deedar; Dimitriou-Christidis, Petros; Rutler, Rebecca; Arey, J Samuel
2012-11-06
Comprehensive two-dimensional gas chromatography (GC × GC) chromatograms typically exhibit run-to-run retention time variability. Chromatogram alignment is often a desirable step prior to further analysis of the data, for example, in studies of environmental forensics or weathering of complex mixtures. We present a new algorithm for aligning whole GC × GC chromatograms. This technique is based on alignment points that have locations indicated by the user both in a target chromatogram and in a reference chromatogram. We applied the algorithm to two sets of samples. First, we aligned the chromatograms of twelve compositionally distinct oil spill samples, all analyzed using the same instrument parameters. Second, we applied the algorithm to two compositionally distinct wastewater extracts analyzed using two different instrument temperature programs, thus involving larger retention time shifts than the first sample set. For both sample sets, the new algorithm performed favorably compared to two other available alignment algorithms: that of Pierce, K. M.; Wood, Lianna F.; Wright, B. W.; Synovec, R. E. (Anal. Chem. 2005, 77, 7735-7743) and 2-D COW from Zhang, D.; Huang, X.; Regnier, F. E.; Zhang, M. (Anal. Chem. 2008, 80, 2664-2671). The new algorithm achieves the best matches of retention times for test analytes, avoids some artifacts which result from the other alignment algorithms, and incurs the least modification of quantitative signal information.
Data depth based clustering analysis
Jeong, Myeong-Hun; Cai, Yaping; Sullivan, Clair J.; ...
2016-01-01
This paper proposes a new algorithm for identifying patterns within data, based on data depth. Such a clustering analysis has an enormous potential to discover previously unknown insights from existing data sets. Many clustering algorithms already exist for this purpose. However, most algorithms are not affine invariant. Therefore, they must operate with different parameters after the data sets are rotated, scaled, or translated. Further, most clustering algorithms, based on Euclidean distance, can be sensitive to noise because they have no global perspective. Parameter selection also significantly affects the clustering results of each algorithm. Unlike many existing clustering algorithms, the proposed algorithm, called data depth based clustering analysis (DBCA), is able to detect coherent clusters after the data sets are affine transformed without changing a parameter. It is also robust to noise because data depth can measure centrality and outlyingness of the underlying data. Further, it can generate relatively stable clusters by varying the parameter. The experimental comparison with the leading state-of-the-art alternatives demonstrates that the proposed algorithm outperforms DBSCAN and HDBSCAN in terms of affine invariance, and exceeds or matches the robustness to noise of DBSCAN or HDBSCAN. The robustness to parameter selection is also demonstrated through the case study of clustering Twitter data.
NASA Technical Reports Server (NTRS)
Thadani, S. G.
1977-01-01
The Maximum Likelihood Estimation of Signature Transformation (MLEST) algorithm is used to obtain maximum likelihood estimates (MLE) of affine transformation. The algorithm has been evaluated for three sets of data: simulated (training and recognition segment pairs), consecutive-day (data gathered from Landsat images), and geographical-extension (large-area crop inventory experiment) data sets. For each set, MLEST signature extension runs were made to determine MLE values and the affine-transformed training segment signatures were used to classify the recognition segments. The classification results were used to estimate wheat proportions at 0 and 1% threshold values.
Directly Reconstructing Principal Components of Heterogeneous Particles from Cryo-EM Images
Tagare, Hemant D.; Kucukelbir, Alp; Sigworth, Fred J.; Wang, Hongwei; Rao, Murali
2015-01-01
Structural heterogeneity of particles can be investigated by their three-dimensional principal components. This paper addresses the question of whether, and with what algorithm, the three-dimensional principal components can be directly recovered from cryo-EM images. The first part of the paper extends the Fourier slice theorem to covariance functions showing that the three-dimensional covariance, and hence the principal components, of a heterogeneous particle can indeed be recovered from two-dimensional cryo-EM images. The second part of the paper proposes a practical algorithm for reconstructing the principal components directly from cryo-EM images without the intermediate step of calculating covariances. This algorithm is based on maximizing the (posterior) likelihood using the Expectation-Maximization algorithm. The last part of the paper applies this algorithm to simulated data and to two real cryo-EM data sets: a data set of the 70S ribosome with and without Elongation Factor-G (EF-G), and a data set of the influenza virus RNA-dependent RNA polymerase (RdRP). The first principal component of the 70S ribosome data set reveals the expected conformational changes of the ribosome as the EF-G binds and unbinds. The first principal component of the RdRP data set reveals a conformational change in the two dimers of the RdRP. PMID:26049077
Parallel fuzzy connected image segmentation on GPU
Zhuge, Ying; Cao, Yong; Udupa, Jayaram K.; Miller, Robert W.
2011-01-01
Purpose: Image segmentation techniques using fuzzy connectedness (FC) principles have shown their effectiveness in segmenting a variety of objects in several large applications. However, one challenge in these algorithms has been their excessive computational requirements when processing large image datasets. Nowadays, commodity graphics hardware provides a highly parallel computing environment. In this paper, the authors present a parallel fuzzy connected image segmentation algorithm implementation on NVIDIA's Compute Unified Device Architecture (CUDA) platform for segmenting medical image data sets. Methods: In the FC algorithm, there are two major computational tasks: (i) computing the fuzzy affinity relations and (ii) computing the fuzzy connectedness relations. These two tasks are implemented as CUDA kernels and executed on GPU. A dramatic improvement in speed for both tasks is achieved as a result. Results: Our experiments based on three data sets of small, medium, and large data size demonstrate the efficiency of the parallel algorithm, which achieves a speed-up factor of 24.4x, 18.1x, and 10.3x, correspondingly, for the three data sets on the NVIDIA Tesla C1060 over the implementation of the algorithm on CPU, and takes 0.25, 0.72, and 15.04 s, correspondingly, for the three data sets. Conclusions: The authors developed a parallel algorithm of the widely used fuzzy connected image segmentation method on the NVIDIA GPUs, which are far more cost- and speed-effective than both cluster of workstations and multiprocessing systems. A near-interactive speed of segmentation has been achieved, even for the large data set. PMID:21859037
NASA Technical Reports Server (NTRS)
Yang, Wenze; Huang, Dong; Tan, Bin; Stroeve, Julienne C.; Shabanov, Nikolay V.; Knyazikhin, Yuri; Nemani, Ramakrishna R.; Myneni, Ranga B.
2006-01-01
The analysis of two years of Collection 3 and five years of Collection 4 Terra Moderate Resolution Imaging Spectroradiometer (MODIS) Leaf Area Index (LAI) and Fraction of Photosynthetically Active Radiation (FPAR) data sets is presented in this article with the goal of understanding product quality with respect to version (Collection 3 versus 4), algorithm (main versus backup), snow (snow-free versus snow on the ground), and cloud (cloud-free versus cloudy) conditions. Retrievals from the main radiative transfer algorithm increased from 55% in Collection 3 to 67% in Collection 4 due to algorithm refinements and improved inputs. Anomalously high LAI/FPAR values observed in Collection 3 product in some vegetation types were corrected in Collection 4. The problem of reflectance saturation and too few main algorithm retrievals in broadleaf forests persisted in Collection 4. The spurious seasonality in needleleaf LAI/FPAR fields was traced to fewer reliable input data and retrievals during the boreal winter period. About 97% of the snow covered pixels were processed by the backup Normalized Difference Vegetation Index-based algorithm. Similarly, a majority of retrievals under cloudy conditions were obtained from the backup algorithm. For these reasons, the users are advised to consult the quality flags accompanying the LAI and FPAR product.
Cloud and aerosol studies using combined CPL and MAS data
NASA Astrophysics Data System (ADS)
Vaughan, Mark A.; Rodier, Sharon; Hu, Yongxiang; McGill, Matthew J.; Holz, Robert E.
2004-11-01
Current uncertainties in the role of aerosols and clouds in the Earth's climate system limit our abilities to model the climate system and predict climate change. These limitations are due primarily to difficulties of adequately measuring aerosols and clouds on a global scale. The A-train satellites (Aqua, CALIPSO, CloudSat, PARASOL, and Aura) will provide an unprecedented opportunity to address these uncertainties. The various active and passive sensors of the A-train will use a variety of measurement techniques to provide comprehensive observations of the multi-dimensional properties of clouds and aerosols. However, to fully achieve the potential of this ensemble requires a robust data analysis framework to optimally and efficiently map these individual measurements into a comprehensive set of cloud and aerosol physical properties. In this work we introduce the Multi-Instrument Data Analysis and Synthesis (MIDAS) project, whose goal is to develop a suite of physically sound and computationally efficient algorithms that will combine active and passive remote sensing data in order to produce improved assessments of aerosol and cloud radiative and microphysical properties. These algorithms include (a) the development of an intelligent feature detection algorithm that combines inputs from both active and passive sensors, and (b) identifying recognizable multi-instrument signatures related to aerosol and cloud type derived from clusters of image pixels and the associated vertical profile information. Classification of these signatures will lead to the automated identification of aerosol and cloud types. Testing of these new algorithms is done using currently existing and readily available active and passive measurements from the Cloud Physics Lidar and the MODIS Airborne Simulator, which simulate, respectively, the CALIPSO and MODIS A-train instruments.
CT liver volumetry using geodesic active contour segmentation with a level-set algorithm
NASA Astrophysics Data System (ADS)
Suzuki, Kenji; Epstein, Mark L.; Kohlbrenner, Ryan; Obajuluwa, Ademola; Xu, Jianwu; Hori, Masatoshi; Baron, Richard
2010-03-01
Automatic liver segmentation on CT images is challenging because the liver often abuts other organs of a similar density. Our purpose was to develop an accurate automated liver segmentation scheme for measuring liver volumes. We developed an automated volumetry scheme for the liver in CT based on a 5-step schema. First, an anisotropic smoothing filter was applied to portal-venous phase CT images to remove noise while preserving the liver structure, followed by an edge enhancer to enhance the liver boundary. By using the boundary-enhanced image as a speed function, a fast-marching algorithm generated an initial surface that roughly estimated the liver shape. A geodesic-active-contour segmentation algorithm coupled with level-set contour evolution refined the initial surface so as to more precisely fit the liver boundary. The liver volume was calculated based on the refined liver surface. Hepatic CT scans of eighteen prospective liver donors were obtained under a liver transplant protocol with a multi-detector CT system. Automated liver volumes obtained were compared with those manually traced by a radiologist, used as "gold standard." The mean liver volume obtained with our scheme was 1,520 cc, whereas the mean manual volume was 1,486 cc, with a mean absolute difference of 104 cc (7.0%). CT liver volumetrics based on an automated scheme agreed excellently with "gold standard" manual volumetrics (intra-class correlation coefficient was 0.95) with no statistically significant difference (p(F<=f)=0.32), and required substantially less completion time. Our automated scheme provides an efficient and accurate way of measuring liver volumes.
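For readers who want to experiment with a geodesic-active-contour/level-set segmentation of this kind, the sketch below uses scikit-image's morphological approximation on a sample image; the Gaussian smoothing, the hand-placed circular initial level set, and all parameter values are stand-ins for the anisotropic filtering, fast-marching initialization, and CT data described in the abstract.

```python
import numpy as np
from skimage import data, img_as_float
from skimage.filters import gaussian
from skimage.segmentation import (morphological_geodesic_active_contour,
                                  inverse_gaussian_gradient)

# Example image stands in for a portal-venous phase CT slice; anisotropic
# smoothing and fast-marching initialisation are replaced by a Gaussian
# filter and a hand-placed circular level set.
image = img_as_float(data.coins())
smoothed = gaussian(image, sigma=2)
speed = inverse_gaussian_gradient(smoothed)   # boundary-enhanced speed function

rows, cols = np.ogrid[:image.shape[0], :image.shape[1]]
init_ls = ((rows - 100) ** 2 + (cols - 180) ** 2 < 40 ** 2).astype(np.int8)

# Morphological approximation of geodesic-active-contour / level-set evolution;
# balloon=-1 shrinks the contour onto the nearby object boundary.
mask = morphological_geodesic_active_contour(speed, 200, init_ls,
                                             smoothing=1, balloon=-1)
print("segmented area (pixels):", mask.sum())   # area in this 2-D demo
```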
Support Vector Machine algorithm for regression and classification
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yu, Chenggang; Zavaljevski, Nela
2001-08-01
The software is an implementation of the Support Vector Machine (SVM) algorithm that was invented and developed by Vladimir Vapnik and his co-workers at AT&T Bell Laboratories. The specific implementation reported here is an Active Set method for solving a quadratic optimization problem that forms the major part of any SVM program. The implementation is tuned to specific constraints generated in the SVM learning. Thus, it is more efficient than general-purpose quadratic optimization programs. A decomposition method has been implemented in the software that enables processing large data sets. The size of the learning data is virtually unlimited by the capacity of the computer's physical memory. The software is flexible and extensible. Two upper bounds are implemented to regulate the SVM learning for classification, which allow users to adjust the false positive and false negative rates. The software can be used either as a standalone, general-purpose SVM regression or classification program, or be embedded into a larger software system.
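The Active Set quadratic-programming solver itself is not sketched here; the short example below only illustrates the effect of the two upper bounds described above by training scikit-learn's SVC with different class weights on the box constraint, which shifts the balance between false positives and false negatives. The data and weights are arbitrary assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=2000, n_features=20,
                           weights=[0.8, 0.2], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Two different upper bounds (class weights on the box constraint C) shift the
# trade-off between false positives and false negatives.
for w in ({0: 1, 1: 1}, {0: 1, 1: 5}):
    clf = SVC(kernel="rbf", C=1.0, class_weight=w).fit(X_tr, y_tr)
    tn, fp, fn, tp = confusion_matrix(y_te, clf.predict(X_te)).ravel()
    print(w, "false positives:", fp, "false negatives:", fn)
```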
Compressed/reconstructed test images for CRAF/Cassini
NASA Technical Reports Server (NTRS)
Dolinar, S.; Cheung, K.-M.; Onyszchuk, I.; Pollara, F.; Arnold, S.
1991-01-01
A set of compressed, then reconstructed, test images submitted to the Comet Rendezvous Asteroid Flyby (CRAF)/Cassini project is presented as part of its evaluation of near lossless high compression algorithms for representing image data. A total of seven test image files were provided by the project. The seven test images were compressed, then reconstructed with high quality (root mean square error of approximately one or two gray levels on an 8 bit gray scale), using discrete cosine transforms or Hadamard transforms and efficient entropy coders. The resulting compression ratios varied from about 2:1 to about 10:1, depending on the activity or randomness in the source image. This was accomplished without any special effort to optimize the quantizer or to introduce special postprocessing to filter the reconstruction errors. A more complete set of measurements, showing the relative performance of the compression algorithms over a wide range of compression ratios and reconstruction errors, shows that additional compression is possible at a small sacrifice in fidelity.
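As a rough, hedged illustration of the transform-quantize-reconstruct cycle described above (without the entropy coder or any quantizer optimization), the sketch below applies a 2-D DCT to a smooth synthetic image, quantizes the coefficients at several step sizes, and reports the RMS reconstruction error in gray levels. The image and step sizes are assumptions, not the CRAF/Cassini test data.

```python
import numpy as np
from scipy.fft import dctn, idctn

# Smooth synthetic 8-bit-style image standing in for a spacecraft test image.
xx, yy = np.meshgrid(np.linspace(0, 1, 64), np.linspace(0, 1, 64))
image = 128 + 100 * np.sin(6 * xx) * np.cos(4 * yy)

def compress_reconstruct(img, q_step):
    """Transform, uniformly quantize, inverse transform (entropy coder omitted)."""
    coeffs = dctn(img, norm="ortho")
    quantized = np.round(coeffs / q_step)
    recon = idctn(quantized * q_step, norm="ortho")
    rms = np.sqrt(np.mean((img - recon) ** 2))     # error in gray levels
    return recon, np.count_nonzero(quantized), rms

for q in (4, 16, 64):
    _, kept, rms = compress_reconstruct(image, q)
    print(f"step={q:2d}  nonzero coefficients={kept:4d}  RMS error={rms:.2f}")
```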
GPU accelerated fuzzy connected image segmentation by using CUDA.
Zhuge, Ying; Cao, Yong; Miller, Robert W
2009-01-01
Image segmentation techniques using fuzzy connectedness principles have shown their effectiveness in segmenting a variety of objects in several large applications in recent years. However, one problem of these algorithms has been their excessive computational requirements when processing large image datasets. Nowadays commodity graphics hardware provides high parallel computing power. In this paper, we present a parallel fuzzy connected image segmentation algorithm on Nvidia's Compute Unified Device Architecture (CUDA) platform for segmenting large medical image data sets. Our experiments based on three data sets with small, medium, and large data size demonstrate the efficiency of the parallel algorithm, which achieves a speed-up factor of 7.2x, 7.3x, and 14.4x, correspondingly, for the three data sets over the sequential implementation of fuzzy connected image segmentation algorithm on CPU.
A Review On Missing Value Estimation Using Imputation Algorithm
NASA Astrophysics Data System (ADS)
Armina, Roslan; Zain, Azlan Mohd; Azizah Ali, Nor; Sallehuddin, Roselina
2017-09-01
The presence of missing values in a data set has always been a major problem for precise prediction. Methods for imputing missing values need to minimize the effect of incomplete data sets on the prediction model. Many algorithms have been proposed as countermeasures for the missing value problem. In this review, we provide a comprehensive analysis of existing imputation algorithms, focusing on the techniques used and on the use of global or local information of the data sets for missing value estimation. In addition, validation methods for imputation results and ways to measure the performance of imputation algorithms are also described. The objective of this review is to highlight possible improvements to existing methods, and it is hoped that this review gives the reader a better understanding of trends in imputation methods.
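To illustrate the global-versus-local distinction discussed above, the following sketch compares mean imputation (a global statistic) with k-nearest-neighbour imputation (local information) on synthetic data with values removed at random; the data, missingness rate, and k are assumptions chosen only for demonstration.

```python
import numpy as np
from sklearn.impute import KNNImputer, SimpleImputer

rng = np.random.default_rng(0)
complete = rng.normal(size=(200, 5))
complete[:, 4] = 2 * complete[:, 0] + 0.1 * rng.normal(size=200)  # correlated column

data = complete.copy()
mask = rng.random(data.shape) < 0.1          # 10% of values missing at random
data[mask] = np.nan

# Local information: each missing entry is filled from its k nearest rows.
knn_filled = KNNImputer(n_neighbors=5).fit_transform(data)
# Global information: every missing entry gets the column mean.
mean_filled = SimpleImputer(strategy="mean").fit_transform(data)

for name, filled in (("kNN", knn_filled), ("mean", mean_filled)):
    rmse = np.sqrt(np.mean((filled[mask] - complete[mask]) ** 2))
    print(name, "imputation RMSE:", round(rmse, 3))
```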
Using the Chandra Source-Finding Algorithm to Automatically Identify Solar X-ray Bright Points
NASA Technical Reports Server (NTRS)
Adams, Mitzi L.; Tennant, A.; Cirtain, J. M.
2009-01-01
This poster details a technique of bright point identification that is used to find sources in Chandra X-ray data. The algorithm, part of a program called LEXTRCT, searches for regions of a given size that are above a minimum signal to noise ratio. The algorithm allows selected pixels to be excluded from the source-finding, thus allowing exclusion of saturated pixels (from flares and/or active regions). For Chandra data the noise is determined by photon counting statistics, whereas solar telescopes typically integrate a flux. Thus the calculated signal-to-noise ratio is incorrect, but we find we can scale the number to get reasonable results. For example, Nakakubo and Hara (1998) find 297 bright points in a September 11, 1996 Yohkoh image; with judicious selection of signal-to-noise ratio, our algorithm finds 300 sources. To further assess the efficacy of the algorithm, we analyze a SOHO/EIT image (195 Angstroms) and compare results with those published in the literature (McIntosh and Gurman, 2005). Finally, we analyze three sets of data from Hinode, representing different parts of the decline to minimum of the solar cycle.
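A minimal sketch of box-based source finding above a signal-to-noise threshold, with a mask for excluded (e.g., saturated) pixels, is given below; the box size, threshold, scaled noise estimate, and injected bright points are illustrative assumptions rather than the LEXTRCT implementation.

```python
import numpy as np
from scipy.ndimage import uniform_filter, label

rng = np.random.default_rng(0)
image = rng.normal(100.0, 5.0, size=(256, 256))            # flux image with noise
image[120:126, 60:66] += 40.0                              # injected bright point
image[30:36, 200:206] += 60.0                              # injected bright point

box = 5
local_sum = uniform_filter(image, size=box) * box**2       # flux in each box
background = np.median(image) * box**2
noise = np.std(image) * box                                # scaled noise, as in the text
snr = (local_sum - background) / noise

exclude = np.zeros_like(image, dtype=bool)                 # e.g. saturated pixels
candidates = (snr > 5) & ~exclude
labels, n_sources = label(candidates)
print("sources found:", n_sources)
```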
Computerized scoring algorithms for the Autobiographical Memory Test.
Takano, Keisuke; Gutenbrunner, Charlotte; Martens, Kris; Salmon, Karen; Raes, Filip
2018-02-01
Reduced specificity of autobiographical memories is a hallmark of depressive cognition. Autobiographical memory (AM) specificity is typically measured by the Autobiographical Memory Test (AMT), in which respondents are asked to describe personal memories in response to emotional cue words. Due to this free descriptive responding format, the AMT relies on experts' hand scoring for subsequent statistical analyses. This manual coding potentially impedes research activities in big data analytics such as large epidemiological studies. Here, we propose computerized algorithms to automatically score AM specificity for the Dutch (adult participants) and English (youth participants) versions of the AMT by using natural language processing and machine learning techniques. The algorithms showed reliable performances in discriminating specific and nonspecific (e.g., overgeneralized) autobiographical memories in independent testing data sets (area under the receiver operating characteristic curve > .90). Furthermore, outcome values of the algorithms (i.e., decision values of support vector machines) showed a gradient across similar (e.g., specific and extended memories) and different (e.g., specific memory and semantic associates) categories of AMT responses, suggesting that, for both adults and youth, the algorithms well capture the extent to which a memory has features of specific memories. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
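A toy sketch of the general approach, scoring free-text memory descriptions with a linear classifier whose decision value acts as a graded specificity score, is shown below; the example texts, labels, TF-IDF features, and linear SVM are assumptions and do not correspond to the Dutch/English AMT models or their reported performance.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy memory descriptions with hypothetical specificity labels
# (1 = specific single-day memory, 0 = overgeneral/extended/semantic).
texts = [
    "the day I graduated and we had dinner at the harbour",
    "I always feel nervous before exams",
    "last Tuesday my dog ran away in the park",
    "summers at my grandparents' house every year",
    "the morning my sister called to say she was engaged",
    "being happy whenever friends visit",
]
labels = [1, 0, 1, 0, 1, 0]

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
model.fit(texts, labels)

# The SVM decision value plays the role of the graded specificity score:
# larger values look more like specific memories.
print(model.decision_function(["yesterday we walked to the new cafe after class"]))
```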
Noise-enhanced convolutional neural networks.
Audhkhasi, Kartik; Osoba, Osonde; Kosko, Bart
2016-06-01
Injecting carefully chosen noise can speed convergence in the backpropagation training of a convolutional neural network (CNN). The Noisy CNN algorithm speeds training on average because the backpropagation algorithm is a special case of the generalized expectation-maximization (EM) algorithm and because such carefully chosen noise always speeds up the EM algorithm on average. The CNN framework gives a practical way to learn and recognize images because backpropagation scales with training data. It has only linear time complexity in the number of training samples. The Noisy CNN algorithm finds a special separating hyperplane in the network's noise space. The hyperplane arises from the likelihood-based positivity condition that noise-boosts the EM algorithm. The hyperplane cuts through a uniform-noise hypercube or Gaussian ball in the noise space depending on the type of noise used. Noise chosen from above the hyperplane speeds training on average. Noise chosen from below slows it on average. The algorithm can inject noise anywhere in the multilayered network. Adding noise to the output neurons reduced the average per-iteration training-set cross entropy by 39% on a standard MNIST image test set of handwritten digits. It also reduced the average per-iteration training-set classification error by 47%. Adding noise to the hidden layers can also reduce these performance measures. The noise benefit is most pronounced for smaller data sets because the largest EM hill-climbing gains tend to occur in the first few iterations. This noise effect can assist random sampling from large data sets because it allows a smaller random sample to give the same or better performance than a noiseless sample gives. Copyright © 2015 Elsevier Ltd. All rights reserved.
The centroidal algorithm in molecular similarity and diversity calculations on confidential datasets
NASA Astrophysics Data System (ADS)
Trepalin, Sergey; Osadchiy, Nikolay
2005-09-01
Chemical structure provides an exhaustive description of a compound, but it is often proprietary and thus an impediment in the exchange of information. For example, structure disclosure is often needed for the selection of the most similar or dissimilar compounds. The authors propose a centroidal algorithm based on structural fragments (screens) that can be used efficiently for similarity and diversity selections without disclosing structures from the reference set. For increased security, the authors recommend that such a set contain at least some tens of structures. Analysis of reverse engineering feasibility showed that the problem difficulty grows as the screen radius decreases. The algorithm is illustrated with concrete calculations on known steroidal, quinoline, and quinazoline drugs. We also investigate the problem of scaffold identification in a combinatorial library dataset. The results show that relatively small screens of radius equal to 2 bond lengths perform well in similarity sorting, while radius-4 screens yield better results in diversity sorting. The software implementation of the algorithm takes an SDF file with a reference set and generates screens of various radii, which are subsequently used for the similarity and diversity sorting of external SDFs. Since reverse engineering of the reference set molecules from their screens has the same difficulty as the RSA asymmetric encryption algorithm, generated screens can be stored openly without further encryption. This approach ensures that an end user transfers only a set of structural fragments and no other data. Like other encryption algorithms, the centroid algorithm cannot give a 100% guarantee of protecting a chemical structure from the dataset, but the probability of identifying an initial structure is very small, of the order of 10^-40 in typical cases.
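The sketch below illustrates the idea of similarity sorting against a reference set using only fragment bit vectors, with RDKit Morgan fingerprints of radius 2 standing in for the paper's radius-2 screens; the reference SMILES, fingerprint size, and external compounds are hypothetical.

```python
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem

# Hypothetical reference set (SMILES in place of the SDF input); Morgan
# fingerprints of radius 2 stand in for the paper's radius-2 screens.
reference_smiles = ["CCO", "c1ccccc1O", "CC(=O)Oc1ccccc1C(=O)O", "CCN(CC)CC"]
reference_fps = [
    AllChem.GetMorganFingerprintAsBitVect(Chem.MolFromSmiles(s), 2, nBits=1024)
    for s in reference_smiles
]

def similarity_to_reference(smiles):
    """Max Tanimoto similarity of an external compound to the reference set;
    only bit vectors, not structures, need to be shared."""
    fp = AllChem.GetMorganFingerprintAsBitVect(Chem.MolFromSmiles(smiles), 2, nBits=1024)
    return max(DataStructs.TanimotoSimilarity(fp, ref) for ref in reference_fps)

external = ["CCOC(=O)C", "c1ccccc1", "CCCCCCCC"]
print(sorted(external, key=similarity_to_reference, reverse=True))
```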
An Enhanced TIMESAT Algorithm for Estimating Vegetation Phenology Metrics from MODIS Data
NASA Technical Reports Server (NTRS)
Tan, Bin; Morisette, Jeffrey T.; Wolfe, Robert E.; Gao, Feng; Ederer, Gregory A.; Nightingale, Joanne; Pedelty, Jeffrey A.
2012-01-01
An enhanced TIMESAT algorithm was developed for retrieving vegetation phenology metrics from 250 m and 500 m spatial resolution Moderate Resolution Imaging Spectroradiometer (MODIS) vegetation indexes (VI) over North America. MODIS VI data were pre-processed using snow-cover and land surface temperature data, and temporally smoothed with the enhanced TIMESAT algorithm. An objective third derivative test was applied to define key phenology dates and retrieve a set of phenology metrics. This algorithm has been applied to two MODIS VIs: Normalized Difference Vegetation Index (NDVI) and Enhanced Vegetation Index (EVI). In this paper, we describe the algorithm and use EVI as an example to compare three sets of TIMESAT algorithm/MODIS VI combinations: a) original TIMESAT algorithm with original MODIS VI, b) original TIMESAT algorithm with pre-processed MODIS VI, and c) enhanced TIMESAT and pre-processed MODIS VI. All retrievals were compared with ground phenology observations, some made available through the National Phenology Network. Our results show that for MODIS data in middle to high latitude regions, snow and land surface temperature information is critical in retrieving phenology metrics from satellite observations. The results also show that the enhanced TIMESAT algorithm can better accommodate growing season start and end dates that vary significantly from year to year. The TIMESAT algorithm improvements contribute to more spatial coverage and more accurate retrievals of the phenology metrics. Among three sets of TIMESAT/MODIS VI combinations, the start of the growing season metric predicted by the enhanced TIMESAT algorithm using pre-processed MODIS VIs has the best associations with ground observed vegetation greenup dates.
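As a rough illustration of the third-derivative test described above, the sketch below smooths a toy vegetation-index time series and reports the dates where the third derivative of the smoothed curve changes sign. It is only a schematic stand-in for the enhanced TIMESAT processing (no snow-cover or land surface temperature pre-processing), and the choice of smoothing filter, window, and sampling interval are assumptions for illustration.

```python
import numpy as np
from scipy.signal import savgol_filter

def phenology_dates(doy, vi, window=11, polyorder=3):
    """Smooth a vegetation-index series and apply a third-derivative test.

    doy : day-of-year sample times (evenly spaced composites assumed)
    vi  : vegetation index values (e.g. EVI) at those times
    Returns candidate transition dates where the third derivative crosses zero,
    i.e. where the curvature of the seasonal curve changes fastest.
    """
    smooth = savgol_filter(vi, window_length=window, polyorder=polyorder)
    step = np.mean(np.diff(doy))
    d3 = np.gradient(np.gradient(np.gradient(smooth, step), step), step)
    # zero crossings of the third derivative mark key phenology dates
    signs = np.sign(d3)
    crossings = np.where(np.diff(signs) != 0)[0]
    return doy[crossings]

# toy seasonal curve sampled every 8 days
doy = np.arange(1, 365, 8)
vi = 0.2 + 0.5 * np.exp(-((doy - 190) / 60.0) ** 2)
print(phenology_dates(doy, vi))
```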
NASA Astrophysics Data System (ADS)
Hortos, William S.
2009-05-01
In previous work by the author, parameters across network protocol layers were selected as features in supervised algorithms that detect and identify certain intrusion attacks on wireless ad hoc sensor networks (WSNs) carrying multisensor data. The algorithms improved the residual performance of the intrusion prevention measures provided by any dynamic key-management schemes and trust models implemented among network nodes. The approach of this paper does not train algorithms on the signature of known attack traffic, but, instead, the approach is based on unsupervised anomaly detection techniques that learn the signature of normal network traffic. Unsupervised learning does not require the data to be labeled or to be purely of one type, i.e., normal or attack traffic. The approach can be augmented to add any security attributes and quantified trust levels, established during data exchanges among nodes, to the set of cross-layer features from the WSN protocols. A two-stage framework is introduced for the security algorithms to overcome the problems of input size and resource constraints. The first stage is an unsupervised clustering algorithm which reduces the payload of network data packets to a tractable size. The second stage is a traditional anomaly detection algorithm based on a variation of support vector machines (SVMs), whose efficiency is improved by the availability of data in the packet payload. In the first stage, selected algorithms are adapted to WSN platforms to meet system requirements for simple parallel distributed computation, distributed storage and data robustness. A set of mobile software agents, acting like an ant colony in securing the WSN, are distributed at the nodes to implement the algorithms. The agents move among the layers involved in the network response to the intrusions at each active node and trustworthy neighborhood, collecting parametric values and executing assigned decision tasks. This minimizes the need to move large amounts of audit-log data through resource-limited nodes and locates routines closer to that data. Performance of the unsupervised algorithms is evaluated against the network intrusions of black hole, flooding, Sybil and other denial-of-service attacks in simulations of published scenarios. Results for scenarios with intentionally malfunctioning sensors show the robustness of the two-stage approach to intrusion anomalies.
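A minimal sketch of the general two-stage idea, unsupervised clustering to compress traffic features followed by an anomaly detector trained only on normal traffic, is shown below using scikit-learn stand-ins (k-means and a one-class SVM). It is not the paper's agent-based WSN implementation, and the feature dimensions, cluster count, and SVM parameters are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Stage 0: cross-layer feature vectors for "normal" traffic (toy data)
normal = rng.normal(0.0, 1.0, size=(500, 8))

# Stage 1: unsupervised clustering reduces each packet/window to a small code
kmeans = KMeans(n_clusters=16, n_init=10, random_state=0).fit(normal)
codes = kmeans.transform(normal)          # distances to the 16 cluster centres

# Stage 2: a one-class SVM learns the signature of normal traffic only
detector = OneClassSVM(nu=0.05, kernel="rbf", gamma="scale").fit(codes)

# At run time: flag windows whose cluster-distance profile looks anomalous
suspect = rng.normal(4.0, 1.0, size=(5, 8))            # e.g. flooding-like burst
labels = detector.predict(kmeans.transform(suspect))   # -1 = anomaly, +1 = normal
print(labels)
```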
A fast bottom-up algorithm for computing the cut sets of noncoherent fault trees
DOE Office of Scientific and Technical Information (OSTI.GOV)
Corynen, G.C.
1987-11-01
An efficient procedure for finding the cut sets of large fault trees has been developed. Designed to address coherent or noncoherent systems, dependent events, shared or common-cause events, the method - called SHORTCUT - is based on a fast algorithm for transforming a noncoherent tree into a quasi-coherent tree (COHERE), and on a new algorithm for reducing cut sets (SUBSET). To assure sufficient clarity and precision, the procedure is discussed in the language of simple sets, which is also developed in this report. Although the new method has not yet been fully implemented on the computer, we report theoretical worst-case estimates of its computational complexity. 12 refs., 10 figs.
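For readers unfamiliar with cut-set computation, the sketch below shows a bottom-up evaluation of a small coherent AND/OR fault tree with subset-based reduction of the intermediate cut sets (the role played by SUBSET). The noncoherent-to-quasi-coherent transformation (COHERE) is not reproduced, and the tree encoding and example are illustrative assumptions.

```python
def minimize(cut_sets):
    """Drop any cut set that is a superset of another (subset reduction)."""
    sets = sorted({frozenset(c) for c in cut_sets}, key=len)
    kept = []
    for c in sets:
        if not any(k <= c for k in kept):
            kept.append(c)
    return kept

def cut_sets(node):
    """Bottom-up cut sets of a coherent fault tree.

    A node is either a basic-event string, ("OR", children) or ("AND", children).
    """
    if isinstance(node, str):
        return [frozenset([node])]
    gate, children = node
    child_sets = [cut_sets(ch) for ch in children]
    if gate == "OR":
        result = [c for cs in child_sets for c in cs]
    elif gate == "AND":
        result = [frozenset()]
        for cs in child_sets:
            result = [a | b for a in result for b in cs]
    else:
        raise ValueError(gate)
    return minimize(result)

# TOP = (A AND B) OR (A AND C) OR B
tree = ("OR", [("AND", ["A", "B"]), ("AND", ["A", "C"]), "B"])
print([sorted(c) for c in cut_sets(tree)])   # [['B'], ['A', 'C']]
```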
Active machine learning-driven experimentation to determine compound effects on protein patterns.
Naik, Armaghan W; Kangas, Joshua D; Sullivan, Devin P; Murphy, Robert F
2016-02-03
High throughput screening determines the effects of many conditions on a given biological target. Currently, to estimate the effects of those conditions on other targets requires either strong modeling assumptions (e.g. similarities among targets) or separate screens. Ideally, data-driven experimentation could be used to learn accurate models for many conditions and targets without doing all possible experiments. We have previously described an active machine learning algorithm that can iteratively choose small sets of experiments to learn models of multiple effects. We now show that, with no prior knowledge and with liquid handling robotics and automated microscopy under its control, this learner accurately learned the effects of 48 chemical compounds on the subcellular localization of 48 proteins while performing only 29% of all possible experiments. The results represent the first practical demonstration of the utility of active learning-driven biological experimentation in which the set of possible phenotypes is unknown in advance.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cox, R.G.
Much controversy surrounds government regulation of routing and scheduling of Hazardous Materials Transportation (HMT). Increases in operating costs must be balanced against expected benefits from local HMT bans and curfews when promulgating or preempting HMT regulations. Algorithmic approaches for evaluating HMT routing and scheduling regulatory policy are described. A review of current US HMT regulatory policy is presented to provide a context for the analysis. Next, a multiobjective shortest path algorithm to find the set of efficient routes under conflicting objectives is presented. This algorithm generates all efficient routes under any partial ordering in a single pass through the network. Also, scheduling algorithms are presented to estimate the travel time delay due to HMT curfews along a route. Algorithms are presented assuming either deterministic or stochastic travel times between curfew cities and also possible rerouting to avoid such cities. These algorithms are applied to the case study of US highway transport of spent nuclear fuel from reactors to permanent repositories. Two data sets were used. One data set included the US Interstate Highway System (IHS) network with reactor locations, possible repository sites, and 150 heavily populated areas (HPAs). The other data set contained estimates of the population residing within 0.5 miles of the IHS in the Eastern US. Curfew delay is dramatically reduced by optimally scheduling departure times unless inter-HPA travel times are highly uncertain. Rerouting shipments to avoid HPAs is a less efficient approach to reducing delay.
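The multiobjective route-finding step can be illustrated with a small label-correcting search that keeps only Pareto-nondominated (time, population-exposure) labels at each node. This is a minimal sketch of the general technique rather than the report's single-pass algorithm; the graph, objective names, and weights are made-up values for illustration.

```python
from collections import deque

def dominates(a, b):
    """True if label a is at least as good as b in every objective and differs."""
    return all(x <= y for x, y in zip(a, b)) and a != b

def pareto_shortest_paths(graph, source, target):
    """Label-correcting search keeping non-dominated (time, exposure) labels.

    graph: {node: [(neighbor, (time, exposure)), ...]}
    Returns the set of efficient objective vectors from source to target.
    """
    labels = {node: set() for node in graph}
    labels[source] = {(0, 0)}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v, (t, e) in graph[u]:
            for lt, le in list(labels[u]):
                cand = (lt + t, le + e)
                if any(dominates(old, cand) or old == cand for old in labels[v]):
                    continue
                labels[v] = {old for old in labels[v] if not dominates(cand, old)}
                labels[v].add(cand)
                queue.append(v)
    return labels[target]

graph = {
    "A": [("B", (2, 5)), ("C", (3, 1))],
    "B": [("D", (2, 1))],
    "C": [("D", (2, 2))],
    "D": [],
}
print(pareto_shortest_paths(graph, "A", "D"))  # {(4, 6), (5, 3)}
```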
Muñoz, Mario A; Smith-Miles, Kate A
2017-01-01
This article presents a method for the objective assessment of an algorithm's strengths and weaknesses. Instead of examining the performance of only one or more algorithms on a benchmark set, or generating custom problems that maximize the performance difference between two algorithms, our method quantifies both the nature of the test instances and the algorithm performance. Our aim is to gather information about possible phase transitions in performance, that is, the points in which a small change in problem structure produces algorithm failure. The method is based on the accurate estimation and characterization of the algorithm footprints, that is, the regions of instance space in which good or exceptional performance is expected from an algorithm. A footprint can be estimated for each algorithm and for the overall portfolio. Therefore, we select a set of features to generate a common instance space, which we validate by constructing a sufficiently accurate prediction model. We characterize the footprints by their area and density. Our method identifies complementary performance between algorithms, quantifies the common features of hard problems, and locates regions where a phase transition may lie.
Fast-Solving Quasi-Optimal LS-S3VM Based on an Extended Candidate Set.
Ma, Yuefeng; Liang, Xun; Kwok, James T; Li, Jianping; Zhou, Xiaoping; Zhang, Haiyan
2018-04-01
The semisupervised least squares support vector machine (LS-S3VM) is an important enhancement of least squares support vector machines in semisupervised learning. Given that most data collected from the real world are without labels, semisupervised approaches are more applicable than standard supervised approaches. Although a few training methods for LS-S3VM exist, the problem of deriving the optimal decision hyperplane efficiently and effectually has not been solved. In this paper, a fully weighted model of LS-S3VM is proposed, and a simple integer programming (IP) model is introduced through an equivalent transformation to solve the model. Based on the distances between the unlabeled data and the decision hyperplane, a new indicator is designed to represent the possibility that the label of an unlabeled datum should be reversed in each iteration during training. Using the indicator, we construct an extended candidate set consisting of the indices of unlabeled data with high possibilities, which integrates more information from unlabeled data. Our algorithm is degenerated into a special scenario of the previous algorithm when the extended candidate set is reduced into a set with only one element. Two strategies are utilized to determine the descent directions based on the extended candidate set. Furthermore, we developed a novel method for locating a good starting point based on the properties of the equivalent IP model. Combined with the extended candidate set and the carefully computed starting point, a fast algorithm to solve LS-S3VM quasi-optimally is proposed. The choice of quasi-optimal solutions results in low computational cost and avoidance of overfitting. Experiments show that our algorithm equipped with the two designed strategies is more effective than other algorithms in at least one of the following three aspects: 1) computational complexity; 2) generalization ability; and 3) flexibility. However, our algorithm and other algorithms have similar levels of performance in the remaining aspects.
List-Based Simulated Annealing Algorithm for Traveling Salesman Problem.
Zhan, Shi-hua; Lin, Juan; Zhang, Ze-jun; Zhong, Yi-wen
2016-01-01
Simulated annealing (SA) is a popular intelligent optimization algorithm which has been successfully applied in many fields. Parameter setting is a key factor in its performance, but it is also tedious work. To simplify parameter setting, we present a list-based simulated annealing (LBSA) algorithm to solve the traveling salesman problem (TSP). The LBSA algorithm uses a novel list-based cooling schedule to control the decrease of temperature. Specifically, a list of temperatures is created first, and then the maximum temperature in the list is used by the Metropolis acceptance criterion to decide whether to accept a candidate solution. The temperature list is adapted iteratively according to the topology of the solution space of the problem. The effectiveness and the parameter sensitivity of the list-based cooling schedule are illustrated through benchmark TSP instances. The LBSA algorithm, whose performance is robust on a wide range of parameter values, shows competitive performance compared with some other state-of-the-art algorithms.
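A simplified sketch of the list-based cooling idea follows: the Metropolis test always uses the current maximum temperature in the list, and after each inner loop that maximum is replaced by the mean temperature implied by the uphill moves actually accepted, so the list adapts to the instance. The neighborhood move here is a plain 2-opt reversal, and the list length, iteration counts, and example instance are illustrative assumptions rather than the paper's settings.

```python
import math
import random

def tour_length(tour, dist):
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))

def lbsa_tsp(dist, list_len=40, outer_iters=200, inner_iters=100, seed=1):
    """Simplified list-based simulated annealing for the TSP."""
    rng = random.Random(seed)
    n = len(dist)
    tour = list(range(n))
    rng.shuffle(tour)
    best, best_len = tour[:], tour_length(tour, dist)
    # initial temperature list from the cost changes of random 2-opt moves
    temps = []
    for _ in range(list_len):
        i, j = sorted(rng.sample(range(n), 2))
        cand = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]
        temps.append(abs(tour_length(cand, dist) - best_len) + 1e-6)
    temps.sort()
    for _ in range(outer_iters):
        t_max = temps[-1]
        accepted = []
        for _ in range(inner_iters):
            i, j = sorted(rng.sample(range(n), 2))
            cand = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]   # 2-opt reversal
            delta = tour_length(cand, dist) - tour_length(tour, dist)
            if delta <= 0:
                tour = cand
            else:
                r = rng.random()
                if r > 0.0 and r < math.exp(-delta / t_max):       # Metropolis with list maximum
                    tour = cand
                    accepted.append(-delta / math.log(r))
            cur_len = tour_length(tour, dist)
            if cur_len < best_len:
                best, best_len = tour[:], cur_len
        if accepted:                                                # adapt the temperature list
            temps[-1] = sum(accepted) / len(accepted)
            temps.sort()
    return best, best_len

cities = [(0, 0), (0, 2), (1, 3), (2, 2), (2, 0), (1, -1)]
dist = [[math.dist(a, b) for b in cities] for a in cities]
print(lbsa_tsp(dist))
```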
A unified framework for image retrieval using keyword and visual features.
Jing, Feng; Li, Mingling; Zhang, Hong-Jiang; Zhang, Bo
2005-07-01
In this paper, a unified image retrieval framework based on both keyword annotations and visual features is proposed. In this framework, a set of statistical models are built based on visual features of a small set of manually labeled images to represent semantic concepts and used to propagate keywords to other unlabeled images. These models are updated periodically when more images implicitly labeled by users become available through relevance feedback. In this sense, the keyword models serve the function of accumulation and memorization of knowledge learned from user-provided relevance feedback. Furthermore, two sets of effective and efficient similarity measures and relevance feedback schemes are proposed for query by keyword scenario and query by image example scenario, respectively. Keyword models are combined with visual features in these schemes. In particular, a new, entropy-based active learning strategy is introduced to improve the efficiency of relevance feedback for query by keyword. Furthermore, a new algorithm is proposed to estimate the keyword features of the search concept for query by image example. It is shown to be more appropriate than two existing relevance feedback algorithms. Experimental results demonstrate the effectiveness of the proposed framework.
NASA Astrophysics Data System (ADS)
Lorsakul, Auranuch; Andersson, Emilia; Vega Harring, Suzana; Sade, Hadassah; Grimm, Oliver; Bredno, Joerg
2017-03-01
Multiplex-brightfield immunohistochemistry (IHC) staining and quantitative measurement of multiple biomarkers can support therapeutic targeting of carcinoma-associated fibroblasts (CAF). This paper presents an automated digital-pathology solution to simultaneously analyze multiple biomarker expressions within a single tissue section stained with an IHC duplex assay. Our method was verified against ground truth provided by expert pathologists. In the first stage, the automated method quantified epithelial-carcinoma cells expressing cytokeratin (CK) using robust nucleus detection and supervised cell-by-cell classification algorithms with a combination of nucleus and contextual features. Using fibroblast activation protein (FAP) as biomarker for CAFs, the algorithm was trained, based on ground truth obtained from pathologists, to automatically identify tumor-associated stroma using a supervised-generation rule. The algorithm reported distance to nearest neighbor in the populations of tumor cells and activated-stromal fibroblasts as a whole-slide measure of spatial relationships. A total of 45 slides from six indications (breast, pancreatic, colorectal, lung, ovarian, and head-and-neck cancers) were included for training and verification. CK-positive cells detected by the algorithm were verified by a pathologist with good agreement (R2=0.98) to ground-truth count. For the area occupied by FAP-positive cells, the inter-observer agreement between two sets of ground-truth measurements was R2=0.93 whereas the algorithm reproduced the pathologists' areas with R2=0.96. The proposed methodology enables automated image analysis to measure spatial relationships of cells stained in an IHC-multiplex assay. Our proof-of-concept results show an automated algorithm can be trained to reproduce the expert assessment and provide quantitative readouts that potentially support a cutoff determination in hypothesis testing related to CAF-targeting-therapy decisions.
QSAR models based on quantum topological molecular similarity.
Popelier, P L A; Smith, P J
2006-07-01
A new method called quantum topological molecular similarity (QTMS) was fairly recently proposed [J. Chem. Inf. Comp. Sc., 41, 2001, 764] to construct a variety of medicinal, ecological and physical organic QSAR/QSPRs. The QTMS method uses quantum chemical topology (QCT) to define electronic descriptors drawn from modern ab initio wave functions of geometry-optimised molecules. It was shown that the current abundance of computing power can be utilised to inject realistic descriptors into QSAR/QSPRs. In this article we study seven datasets of medicinal interest: the dissociation constants (pKa) for a set of substituted imidazolines, the pKa of imidazoles, the ability of a set of indole derivatives to displace [3H]flunitrazepam from binding to bovine cortical membranes, the influenza inhibition constants for a set of benzimidazoles, the interaction constants for a set of amides and the enzyme liver alcohol dehydrogenase, the natriuretic activity of sulphonamide carbonic anhydrase inhibitors, and the toxicity of a series of benzyl alcohols. A partial least squares analysis in conjunction with a genetic algorithm delivered excellent models. They are also able to highlight the active site of the ligand or the molecule whose structure determines the activity. The advantages and limitations of QTMS are discussed.
Buckingham, Bruce A; Cameron, Fraser; Calhoun, Peter; Maahs, David M; Wilson, Darrell M; Chase, H Peter; Bequette, B Wayne; Lum, John; Sibayan, Judy; Beck, Roy W; Kollman, Craig
2013-08-01
Nocturnal hypoglycemia is a common problem with type 1 diabetes. In the home setting, we conducted a pilot study to evaluate the safety of a system consisting of an insulin pump and continuous glucose monitor communicating wirelessly with a bedside computer running an algorithm that temporarily suspends insulin delivery when hypoglycemia is predicted. After the run-in phase, a 21-night randomized trial was conducted in which each night was randomly assigned 2:1 to have either the predictive low-glucose suspend (PLGS) system active (intervention night) or inactive (control night). Three predictive algorithm versions were studied sequentially during the study for a total of 252 intervention and 123 control nights. The trial included 19 participants 18-56 years old with type 1 diabetes (hemoglobin A1c level of 6.0-7.7%) who were current users of the MiniMed Paradigm® REAL-Time Revel™ System and Sof-sensor® glucose sensor (Medtronic Diabetes, Northridge, CA). With the final algorithm, pump suspension occurred on 53% of 77 intervention nights. Mean morning glucose level was 144±48 mg/dL on the 77 intervention nights versus 133±57 mg/dL on the 37 control nights, with morning blood ketones >0.6 mmol/L following one intervention night. Overnight hypoglycemia was lower on intervention than control nights, with at least one value ≤70 mg/dL occurring on 16% versus 30% of nights, respectively, with the final algorithm. This study demonstrated that the PLGS system in the home setting is safe and feasible. The preliminary efficacy data appear promising with the final algorithm reducing nocturnal hypoglycemia by almost 50%.
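As context for how a predictive suspend rule of this kind can operate, the sketch below extrapolates recent sensor glucose forward with a straight-line fit and flags a suspension when the projection falls below a threshold. It is a generic illustration only, not any of the study's three algorithm versions, and the 30-minute horizon and 80 mg/dL threshold are assumed values.

```python
import numpy as np

def plgs_decision(times_min, glucose_mgdl, horizon_min=30, threshold_mgdl=80):
    """Suggest suspending basal insulin if glucose is projected below threshold.

    A straight line is fit to the most recent sensor readings and extrapolated
    `horizon_min` minutes ahead; a real controller would also enforce maximum
    suspension time and resumption rules.
    """
    slope, intercept = np.polyfit(times_min, glucose_mgdl, 1)
    projected = slope * (times_min[-1] + horizon_min) + intercept
    return projected < threshold_mgdl, projected

# sensor readings every 5 minutes, trending downward
t = np.array([0, 5, 10, 15, 20, 25])
g = np.array([140, 132, 125, 117, 110, 103])
suspend, projected = plgs_decision(t, g)
print(suspend, round(projected, 1))
```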
Quantification of intensity variations in functional MR images using rotated principal components
NASA Astrophysics Data System (ADS)
Backfrieder, W.; Baumgartner, R.; Sámal, M.; Moser, E.; Bergmann, H.
1996-08-01
In functional MRI (fMRI), the changes in cerebral haemodynamics related to stimulated neural brain activity are measured using standard clinical MR equipment. Small intensity variations in fMRI data have to be detected and distinguished from non-neural effects by careful image analysis. Based on multivariate statistics we describe an algorithm involving oblique rotation of the most significant principal components for an estimation of the temporal and spatial distribution of the stimulated neural activity over the whole image matrix. This algorithm takes advantage of strong local signal variations. A mathematical phantom was designed to generate simulated data for the evaluation of the method. In simulation experiments, the potential of the method to quantify small intensity changes, especially when processing data sets containing multiple sources of signal variations, was demonstrated. In vivo fMRI data collected in both visual and motor stimulation experiments were analysed, showing a proper location of the activated cortical regions within well known neural centres and an accurate extraction of the activation time profile. The suggested method yields accurate absolute quantification of in vivo brain activity without the need of extensive prior knowledge and user interaction.
Centroids evaluation of the images obtained with the conical null-screen corneal topographer
NASA Astrophysics Data System (ADS)
Osorio-Infante, Arturo I.; Armengol-Cruz, Victor de Emanuel; Campos-García, Manuel; Cossio-Guerrero, Cesar; Marquez-Flores, Jorge; Díaz-Uribe, José Rufino
2016-09-01
In this work, we propose some algorithms to recover the centroids of the resultant image obtained with a conical null-screen-based corneal topographer. With these algorithms, we obtain the regions of interest (ROIs) of the original image and, using an image-processing algorithm, we calculate the geometric centroid of each ROI. In order to improve our algorithm's performance, we use different settings of null-screen targets, changing their size and number. We also improved the illumination system to avoid inhomogeneous zones in the corneal images. Finally, we report some corneal topographic measurements with the best setting we found.
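A minimal sketch of the centroid computation itself is given below: pixels above a threshold are grouped into connected regions of interest, and each region's intensity-weighted centroid is obtained from its image moments. The thresholding, labeling, and synthetic test image are illustrative assumptions using scipy.ndimage, not the authors' processing chain.

```python
import numpy as np
from scipy import ndimage

def spot_centroids(image, threshold):
    """Return intensity-weighted centroids (row, col) of bright spots.

    Pixels above `threshold` are grouped into connected regions of interest;
    the centroid of each region is its first-order image moment divided by
    the total intensity (zeroth moment) inside that region.
    """
    mask = image > threshold
    labels, n = ndimage.label(mask)
    return ndimage.center_of_mass(image, labels, list(range(1, n + 1)))

# synthetic image with two Gaussian spots
yy, xx = np.mgrid[0:64, 0:64]
img = np.exp(-((yy - 20) ** 2 + (xx - 15) ** 2) / 20.0)
img += np.exp(-((yy - 45) ** 2 + (xx - 40) ** 2) / 20.0)
print(spot_centroids(img, threshold=0.2))
```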
A Feature and Algorithm Selection Method for Improving the Prediction of Protein Structural Class.
Ni, Qianwu; Chen, Lei
2017-01-01
Correct prediction of protein structural class is beneficial to investigation of protein functions, regulations and interactions. In recent years, several computational methods have been proposed in this regard. However, based on various features, it is still a great challenge to select a proper classification algorithm and extract essential features to participate in classification. In this study, a feature and algorithm selection method was presented for improving the accuracy of protein structural class prediction. The amino acid compositions and physiochemical features were adopted to represent features, and thirty-eight machine learning algorithms collected in Weka were employed. All features were first analyzed by a feature selection method, minimum redundancy maximum relevance (mRMR), producing a feature list. Then, several feature sets were constructed by adding features in the list one by one. For each feature set, the thirty-eight algorithms were executed on a dataset in which proteins were represented by features in the set. The predicted classes yielded by these algorithms and the true class of each protein were collected to construct a dataset, which was analyzed by the mRMR method, yielding an algorithm list. From the algorithm list, algorithms were added one by one to build an ensemble prediction model. Finally, we selected the ensemble prediction model with the best performance as the optimal ensemble prediction model. Experimental results indicate that the constructed model is much superior to models using a single algorithm and to other models that only adopt the feature selection procedure or the algorithm selection procedure. The feature selection and algorithm selection procedures are genuinely helpful for building an ensemble prediction model that can yield a better performance.
NASA Technical Reports Server (NTRS)
Cao, Fang; Fichot, Cedric G.; Hooker, Stanford B.; Miller, William L.
2014-01-01
Photochemical processes driven by high-energy ultraviolet radiation (UVR) in inshore, estuarine, and coastal waters play an important role in global biogeochemical cycles and biological systems. A key to modeling photochemical processes in these optically complex waters is an accurate description of the vertical distribution of UVR in the water column, which can be obtained using the diffuse attenuation coefficients of downwelling irradiance (Kd(λ)). The SeaUV/SeaUVc algorithms (Fichot et al., 2008) can accurately retrieve Kd(λ) (at 320, 340, 380, 412, 443 and 490 nm) in oceanic and coastal waters using multispectral remote sensing reflectances (Rrs(λ), SeaWiFS bands). However, the SeaUV/SeaUVc algorithms are currently not optimized for use in optically complex, inshore waters, where they tend to severely underestimate Kd(λ). Here, a new training data set of optical properties collected in optically complex, inshore waters was used to re-parameterize the published SeaUV/SeaUVc algorithms, resulting in improved Kd(λ) retrievals for turbid, estuarine waters. Although the updated SeaUV/SeaUVc algorithms perform best in optically complex waters, the published SeaUV/SeaUVc models still perform well in most coastal and oceanic waters. Therefore, we propose a composite set of SeaUV/SeaUVc algorithms, optimized for Kd(λ) retrieval in almost all marine systems, ranging from oceanic to inshore waters. The composite algorithm set can retrieve Kd from ocean color with good accuracy across this wide range of water types (e.g., within 13% mean relative error for Kd(340)). A validation step using three independent, in situ data sets indicates that the composite SeaUV/SeaUVc can generate accurate Kd values from 320 to 490 nm using satellite imagery on a global scale. Taking advantage of the inherent benefits of our statistical methods, we pooled the validation data with the training set, obtaining an optimized composite model for estimating Kd(λ) at UV wavelengths for almost all marine waters. This optimized composite set of SeaUV/SeaUVc algorithms will provide the optical community with improved ability to quantify the role of solar UV radiation in photochemical and photobiological processes in the ocean.
NASA Astrophysics Data System (ADS)
Anders, Niels; Keesstra, Saskia; Masselink, Rens
2014-05-01
Unmanned Aerial Systems (UAS) are becoming popular tools in the geosciences due to improving technology and processing/analysis techniques. They can potentially fill the gap between spaceborne or manned aircraft remote sensing and terrestrial remote sensing, both in terms of spatial and temporal resolution. In this study we analyze a multi-temporal data set that was acquired with a fixed-wing UAS in an agricultural catchment (2 sq. km) in Navarra, Spain. The goal of this study is to register soil erosion activity after one year of agricultural activity. The aircraft was equipped with a Panasonic GX1 16MP pocket camera with a 20 mm lens to capture normal JPEG RGB images. The data set consisted of two sets of imagery acquired in the end of February in 2013 and 2014 after harvesting. The raw images were processed using Agisoft Photoscan Pro which includes the structure-from-motion (SfM) and multi-view stereopsis (MVS) algorithms producing digital surface models and orthophotos of both data sets. A discussion is presented that is focused on the suitability of multi-temporal UAS data and SfM/MVS processing for quantifying soil loss, mapping the distribution of eroded materials and analyzing re-occurrences of rill patterns after plowing.
NASA Astrophysics Data System (ADS)
Stützer, K.; Bert, C.; Enghardt, W.; Helmbrecht, S.; Parodi, K.; Priegnitz, M.; Saito, N.; Fiedler, F.
2013-08-01
In-beam positron emission tomography (PET) has been proven to be a reliable technique in ion beam radiotherapy for the in situ and non-invasive evaluation of the correct dose deposition in static tumour entities. In the presence of intra-fractional target motion an appropriate time-resolved (four-dimensional, 4D) reconstruction algorithm has to be used to avoid reconstructed activity distributions suffering from motion-related blurring artefacts and to allow for a dedicated dose monitoring. Four-dimensional reconstruction algorithms from diagnostic PET imaging that can properly handle the typically low counting statistics of in-beam PET data have been adapted and optimized for the characteristics of the double-head PET scanner BASTEI installed at GSI Helmholtzzentrum Darmstadt, Germany (GSI). Systematic investigations with moving radioactive sources demonstrate the more effective reduction of motion artefacts by applying a 4D maximum likelihood expectation maximization (MLEM) algorithm instead of the retrospective co-registration of phasewise reconstructed quasi-static activity distributions. Further 4D MLEM results are presented from in-beam PET measurements of irradiated moving phantoms which verify the accessibility of relevant parameters for the dose monitoring of intra-fractionally moving targets. From in-beam PET listmode data sets acquired together with a motion surrogate signal, valuable images can be generated by the 4D MLEM reconstruction for different motion patterns and motion-compensated beam delivery techniques.
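As background for the MLEM reconstructions discussed above, the sketch below implements the basic static MLEM update that 4D variants extend with motion phases or motion-transformed system matrices. The system matrix, data sizes, and iteration count are toy stand-ins, not the geometry of the BASTEI scanner.

```python
import numpy as np

def mlem(A, y, n_iters=50, eps=1e-12):
    """Basic MLEM reconstruction.

    A is the system matrix (detector bins x voxels), y the measured counts.
    4D variants apply the same multiplicative update per motion phase or with
    motion-transformed system matrices.
    """
    sensitivity = A.sum(axis=0)                 # sum_i A_ij, normalisation per voxel
    x = np.ones(A.shape[1])                     # start from a uniform activity image
    for _ in range(n_iters):
        forward = A @ x + eps                   # expected counts given current image
        backproj = A.T @ (y / forward)          # back-project the measured/expected ratio
        x = x / (sensitivity + eps) * backproj  # multiplicative MLEM update
    return x

rng = np.random.default_rng(0)
A = rng.random((40, 10))                        # toy system matrix
true_activity = rng.random(10) * 5
y = rng.poisson(A @ true_activity)              # noisy measured data
print(np.round(mlem(A, y), 2))
```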
Selecting Power-Efficient Signal Features for a Low-Power Fall Detector.
Wang, Changhong; Redmond, Stephen J; Lu, Wei; Stevens, Michael C; Lord, Stephen R; Lovell, Nigel H
2017-11-01
Falls are a serious threat to the health of older people. A wearable fall detector can automatically detect the occurrence of a fall and alert a caregiver or an emergency response service so they may deliver immediate assistance, improving the chances of recovering from fall-related injuries. One constraint of such a wearable technology is its limited battery life. Thus, minimization of power consumption is an important design concern, all the while maintaining satisfactory accuracy of the fall detection algorithms implemented on the wearable device. This paper proposes an approach for selecting power-efficient signal features such that the minimum desirable fall detection accuracy is assured. Using data collected in simulated falls, simulated activities of daily living, and real free-living trials, all using young volunteers, the proposed approach selects four features from a set of ten commonly used features, providing a power saving of 75.3%, while limiting the error rate of a binary classification decision tree fall detection algorithm to 7.1%.
NASA Astrophysics Data System (ADS)
Zhang, Yuli; Han, Jun; Weng, Xinqian; He, Zhongzhu; Zeng, Xiaoyang
This paper presents an Application Specific Instruction-set Processor (ASIP) for the SHA-3 BLAKE algorithm family, created by instruction set extensions (ISE) of a RISC (reduced instruction set computer) processor. With a design space exploration for this ASIP to increase the performance and reduce the area cost, we accomplish an efficient hardware and software implementation of the BLAKE algorithm. The special instructions and their well-matched hardware function unit improve the calculation of the key section of the algorithm, namely the G-functions. Also, relaxing the time constraint of the special function unit can decrease its hardware cost, while keeping the high data throughput of the processor. Evaluation results reveal the ASIP achieves 335 Mbps and 176 Mbps for BLAKE-256 and BLAKE-512. The extra area cost is only 8.06k equivalent gates. The proposed ASIP outperforms several software approaches on various platforms in cycles per byte. In fact, both the high throughput and the low hardware cost achieved by this programmable processor are comparable to those of ASIC implementations.
Preconditioning 2D Integer Data for Fast Convex Hull Computations.
Cadenas, José Oswaldo; Megson, Graham M; Luengo Hendriks, Cris L
2016-01-01
In order to accelerate computing the convex hull on a set of n points, a heuristic procedure is often applied to reduce the number of points to a set of s points, s ≤ n, which also contains the same hull. We present an algorithm to precondition 2D data with integer coordinates bounded by a box of size p × q before building a 2D convex hull, with three distinct advantages. First, we prove that under the condition min(p, q) ≤ n the algorithm executes in time within O(n); second, no explicit sorting of data is required; and third, the reduced set of s points forms a simple polygonal chain and thus can be directly pipelined into an O(n) time convex hull algorithm. This paper empirically evaluates and quantifies the speed up gained by preconditioning a set of points by a method based on the proposed algorithm before using common convex hull algorithms to build the final hull. A speedup factor of at least four is consistently found from experiments on various datasets when the condition min(p, q) ≤ n holds; the smaller the ratio min(p, q)/n is in the dataset, the greater the speedup factor achieved.
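The column-extremes idea behind this kind of preconditioning can be sketched briefly: for every occupied x column only the lowest and highest y values can be hull vertices, so everything in between is discarded, and walking the lower extremes left-to-right and the upper extremes right-to-left yields a simple polygonal chain. The sketch below uses bucket arrays indexed by x to avoid an explicit sort; it illustrates the principle only and is not the published O(n) algorithm.

```python
import random

def precondition(points, p):
    """Reduce integer points to per-column extremes before a 2D convex hull.

    Points have integer x in [0, p). For every occupied x column only the
    lowest and highest y can be hull vertices, so all other points are dropped.
    Bucket arrays indexed by x avoid an explicit sort of the input.
    """
    lo = [None] * p
    hi = [None] * p
    for x, y in points:
        if lo[x] is None or y < lo[x]:
            lo[x] = y
        if hi[x] is None or y > hi[x]:
            hi[x] = y
    # lower extremes left to right, then upper extremes right to left:
    # the result is a simple polygonal chain containing the hull vertices
    chain = [(x, lo[x]) for x in range(p) if lo[x] is not None]
    chain += [(x, hi[x]) for x in reversed(range(p))
              if hi[x] is not None and hi[x] != lo[x]]
    return chain

random.seed(0)
pts = [(random.randrange(50), random.randrange(50)) for _ in range(1000)]
reduced = precondition(pts, 50)
print(len(pts), "->", len(reduced))
```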
Zafar, Raheel; Kamel, Nidal; Naufal, Mohamad; Malik, Aamir Saeed; Dass, Sarat C; Ahmad, Rana Fayyaz; Abdullah, Jafri M; Reza, Faruque
2017-01-01
Decoding of human brain activity has always been a primary goal in neuroscience, especially with functional magnetic resonance imaging (fMRI) data. In recent years, the convolutional neural network (CNN) has become a popular method for the extraction of features due to its higher accuracy; however, it needs a lot of computation and training data. In this study, an algorithm is developed using multivariate pattern analysis (MVPA) and a modified CNN to decode the behavior of the brain for different images with a limited data set. Selection of significant features is an important part of fMRI data analysis, since it reduces the computational burden and improves the prediction performance; significant features are selected using a t-test. MVPA uses machine learning algorithms to classify different brain states and helps in prediction during the task. A general linear model (GLM) is used to find the unknown parameters of every individual voxel, and the classification is done using a multi-class support vector machine (SVM). The proposed MVPA-CNN based algorithm is compared with a region of interest (ROI) based method and with MVPA-based estimated values. The proposed method showed better overall accuracy (68.6%) compared to ROI (61.88%) and estimation values (64.17%).
Bourke, Alan K; van de Ven, Pepijn W J; Chaya, Amy E; OLaighin, Gearóid M; Nelson, John
2008-01-01
A fall detection system and algorithm, incorporated into a custom designed garment, has been developed. The developed fall detection system uses a tri-axial accelerometer, microcontroller, battery and Bluetooth module. This sensor is attached to a custom designed vest, designed to be worn by the elderly person under clothing. The fall detection algorithm was developed and incorporates both impact and posture detection capability. The vest and fall algorithm were tested on young healthy subjects performing normal activities of daily living (ADL) and falls onto crash mats, while wearing the vest and sensor. Results show that falls can be distinguished from normal activities with a sensitivity >90% and a specificity of >99%, from a total data set of 264 falls and 165 normal ADL. By incorporating the fall-detection sensor into a custom designed garment it is anticipated that greater compliance when wearing a fall-detection system can be achieved, which will help reduce the incidence of the long-lie when falls occur in the elderly population. However, further long-term testing using elderly subjects is required to validate the system's performance.
2010-01-01
Background In bioinformatics it is common to search for a pattern of interest in a potentially large set of rather short sequences (upstream gene regions, proteins, exons, etc.). Although many methodological approaches allow practitioners to compute the distribution of a pattern count in a random sequence generated by a Markov source, no specific developments have taken into account the counting of occurrences in a set of independent sequences. We aim to address this problem by deriving efficient approaches and algorithms to perform these computations both for low and high complexity patterns in the framework of homogeneous or heterogeneous Markov models. Results The latest advances in the field allowed us to use a technique of optimal Markov chain embedding based on deterministic finite automata to introduce three innovative algorithms. Algorithm 1 is the only one able to deal with heterogeneous models. It also permits to avoid any product of convolution of the pattern distribution in individual sequences. When working with homogeneous models, Algorithm 2 yields a dramatic reduction in the complexity by taking advantage of previous computations to obtain moment generating functions efficiently. In the particular case of low or moderate complexity patterns, Algorithm 3 exploits power computation and binary decomposition to further reduce the time complexity to a logarithmic scale. All these algorithms and their relative interest in comparison with existing ones were then tested and discussed on a toy-example and three biological data sets: structural patterns in protein loop structures, PROSITE signatures in a bacterial proteome, and transcription factors in upstream gene regions. On these data sets, we also compared our exact approaches to the tempting approximation that consists in concatenating the sequences in the data set into a single sequence. Conclusions Our algorithms prove to be effective and able to handle real data sets with multiple sequences, as well as biological patterns of interest, even when the latter display a high complexity (PROSITE signatures for example). In addition, these exact algorithms allow us to avoid the edge effect observed under the single sequence approximation, which leads to erroneous results, especially when the marginal distribution of the model displays a slow convergence toward the stationary distribution. We end up with a discussion on our method and on its potential improvements. PMID:20205909
Nuel, Gregory; Regad, Leslie; Martin, Juliette; Camproux, Anne-Claude
2010-01-26
In bioinformatics it is common to search for a pattern of interest in a potentially large set of rather short sequences (upstream gene regions, proteins, exons, etc.). Although many methodological approaches allow practitioners to compute the distribution of a pattern count in a random sequence generated by a Markov source, no specific developments have taken into account the counting of occurrences in a set of independent sequences. We aim to address this problem by deriving efficient approaches and algorithms to perform these computations both for low and high complexity patterns in the framework of homogeneous or heterogeneous Markov models. The latest advances in the field allowed us to use a technique of optimal Markov chain embedding based on deterministic finite automata to introduce three innovative algorithms. Algorithm 1 is the only one able to deal with heterogeneous models. It also permits to avoid any product of convolution of the pattern distribution in individual sequences. When working with homogeneous models, Algorithm 2 yields a dramatic reduction in the complexity by taking advantage of previous computations to obtain moment generating functions efficiently. In the particular case of low or moderate complexity patterns, Algorithm 3 exploits power computation and binary decomposition to further reduce the time complexity to a logarithmic scale. All these algorithms and their relative interest in comparison with existing ones were then tested and discussed on a toy-example and three biological data sets: structural patterns in protein loop structures, PROSITE signatures in a bacterial proteome, and transcription factors in upstream gene regions. On these data sets, we also compared our exact approaches to the tempting approximation that consists in concatenating the sequences in the data set into a single sequence. Our algorithms prove to be effective and able to handle real data sets with multiple sequences, as well as biological patterns of interest, even when the latter display a high complexity (PROSITE signatures for example). In addition, these exact algorithms allow us to avoid the edge effect observed under the single sequence approximation, which leads to erroneous results, especially when the marginal distribution of the model displays a slow convergence toward the stationary distribution. We end up with a discussion on our method and on its potential improvements.
Li, Yang; Li, Guoqing; Wang, Zhenhao
2015-01-01
In order to overcome the problems of poor understandability of the pattern recognition-based transient stability assessment (PRTSA) methods, a new rule extraction method based on extreme learning machine (ELM) and an improved Ant-miner (IAM) algorithm is presented in this paper. First, the basic principles of ELM and Ant-miner algorithm are respectively introduced. Then, based on the selected optimal feature subset, an example sample set is generated by the trained ELM-based PRTSA model. And finally, a set of classification rules are obtained by IAM algorithm to replace the original ELM network. The novelty of this proposal is that transient stability rules are extracted from an example sample set generated by the trained ELM-based transient stability assessment model by using IAM algorithm. The effectiveness of the proposed method is shown by the application results on the New England 39-bus power system and a practical power system--the southern power system of Hebei province.
Algorithm Sorts Groups Of Data
NASA Technical Reports Server (NTRS)
Evans, J. D.
1987-01-01
For efficient sorting, algorithm finds set containing minimum or maximum most significant data. Sets of data sorted as desired. Sorting process simplified by reduction of each multielement set of data to single representative number. First, each set of data expressed as polynomial with suitably chosen base, using elements of set as coefficients. Most significant element placed in term containing largest exponent. Base selected by examining range in value of data elements. Resulting series summed to yield single representative number. Numbers easily sorted, and each such number converted back to original set of data by successive division. Program written in BASIC.
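The representative-number trick can be illustrated compactly: each group's elements become the digits of a numeral in a base larger than any element, so comparing the resulting single numbers orders the groups by their most significant element, and successive division recovers the original groups. The original program was written in BASIC; the version below is an illustrative Python sketch with made-up data.

```python
def encode(group, base):
    """Collapse a group of non-negative integers (most significant first) into
    one representative number: the group becomes the digits of a base-`base`
    numeral, so ordering the numbers orders the groups lexicographically."""
    value = 0
    for element in group:
        value = value * base + element
    return value

def decode(value, base, length):
    """Recover the original group by successive division."""
    out = []
    for _ in range(length):
        value, remainder = divmod(value, base)
        out.append(remainder)
    return out[::-1]

groups = [[3, 7, 2], [3, 5, 9], [1, 8, 8]]
base = max(max(g) for g in groups) + 1        # base exceeds the data range
keys = sorted(encode(g, base) for g in groups)
print([decode(k, base, 3) for k in keys])     # groups sorted by most significant element
```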
PCA-LBG-based algorithms for VQ codebook generation
NASA Astrophysics Data System (ADS)
Tsai, Jinn-Tsong; Yang, Po-Yuan
2015-04-01
Vector quantisation (VQ) codebooks are generated by combining principal component analysis (PCA) algorithms with Linde-Buzo-Gray (LBG) algorithms. All training vectors are grouped according to the projected values of the principal components. The PCA-LBG-based algorithms include (1) PCA-LBG-Median, which selects the median vector of each group, (2) PCA-LBG-Centroid, which adopts the centroid vector of each group, and (3) PCA-LBG-Random, which randomly selects a vector of each group. The LBG algorithm finds a codebook based on the better vectors sent to an initial codebook by the PCA. The PCA performs an orthogonal transformation to convert a set of potentially correlated variables into a set of variables that are not linearly correlated. Because the orthogonal transformation efficiently distinguishes test image vectors, the proposed PCA-LBG-based algorithm is expected to outperform conventional algorithms in designing VQ codebooks. The experimental results confirm that the proposed PCA-LBG-based algorithms indeed obtain better results compared to existing methods reported in the literature.
A feasibility study of using event-related potential as a biometrics.
Yih-Choung Yu; Sicheng Wang; Gabel, Lisa A
2016-08-01
The use of an individual's neural response to stimuli (the event-related potential or ERP) has potential as a biometric because it is highly resistant to fraud relative to other conventional authentication systems. P300 is an ERP in human electroencephalography (EEG) that occurs in response to an oddball stimulus when an individual is actively engaged in a target detection task. Because P300 is consistently detectable from almost every subject, it is considered a potential signal for biometric applications. This paper presents a feasibility study of using topological plots of P300 as a biometric in subject authentication. The variation in latency and location of P300 response of 24 participants performing the P300Speller task were studied. Data sets from four participants were used for algorithm training; data from the other 20 participants were used as imposters for algorithm validation. The result showed that the algorithm was able to correctly identify three out of these four participants. Validation test also proved that the algorithm was able to reject 95% of the imposters for those three authenticated participants.
Evaluation of a Smartphone-based Human Activity Recognition System in a Daily Living Environment.
Lemaire, Edward D; Tundo, Marco D; Baddour, Natalie
2015-12-11
An evaluation method that includes continuous activities in a daily-living environment was developed for Wearable Mobility Monitoring Systems (WMMS) that attempt to recognize user activities. Participants performed a pre-determined set of daily living actions within a continuous test circuit that included mobility activities (walking, standing, sitting, lying, ascending/descending stairs), daily living tasks (combing hair, brushing teeth, preparing food, eating, washing dishes), and subtle environment changes (opening doors, using an elevator, walking on inclines, traversing staircase landings, walking outdoors). To evaluate WMMS performance on this circuit, fifteen able-bodied participants completed the tasks while wearing a smartphone at their right front pelvis. The WMMS application used smartphone accelerometer and gyroscope signals to classify activity states. A gold standard comparison data set was created by video-recording each trial and manually logging activity onset times. Gold standard and WMMS data were analyzed offline. Three classification sets were calculated for each circuit: (i) mobility or immobility, (ii) sit, stand, lie, or walking, and (iii) sit, stand, lie, walking, climbing stairs, or small standing movement. Sensitivities, specificities, and F-scores for activity categorization and changes-of-state were calculated. The mobile versus immobile classification set had a sensitivity of 86.30% ± 7.2% and specificity of 98.96% ± 0.6%, while the second prediction set had a sensitivity of 88.35% ± 7.80% and specificity of 98.51% ± 0.62%. For the third classification set, sensitivity was 84.92% ± 6.38% and specificity was 98.17% ± 0.62%. F1 scores for the first, second and third classification sets were 86.17 ± 6.3, 80.19 ± 6.36, and 78.42 ± 5.96, respectively. This demonstrates that WMMS performance depends on the evaluation protocol in addition to the algorithms. The demonstrated protocol can be used and tailored for evaluating human activity recognition systems in rehabilitation medicine, where mobility monitoring may be beneficial in clinical decision-making.
Analysis of Multivariate Experimental Data Using A Simplified Regression Model Search Algorithm
NASA Technical Reports Server (NTRS)
Ulbrich, Norbert M.
2013-01-01
A new regression model search algorithm was developed that may be applied to both general multivariate experimental data sets and wind tunnel strain-gage balance calibration data. The algorithm is a simplified version of a more complex algorithm that was originally developed for the NASA Ames Balance Calibration Laboratory. The new algorithm performs regression model term reduction to prevent overfitting of data. It has the advantage that it needs only about one tenth of the original algorithm's CPU time for the completion of a regression model search. In addition, extensive testing showed that the prediction accuracy of math models obtained from the simplified algorithm is similar to the prediction accuracy of math models obtained from the original algorithm. The simplified algorithm, however, cannot guarantee that search constraints related to a set of statistical quality requirements are always satisfied in the optimized regression model. Therefore, the simplified algorithm is not intended to replace the original algorithm. Instead, it may be used to generate an alternate optimized regression model of experimental data whenever the application of the original search algorithm fails or requires too much CPU time. Data from a machine calibration of NASA's MK40 force balance is used to illustrate the application of the new search algorithm.
Analysis of Multivariate Experimental Data Using A Simplified Regression Model Search Algorithm
NASA Technical Reports Server (NTRS)
Ulbrich, Norbert Manfred
2013-01-01
A new regression model search algorithm was developed in 2011 that may be used to analyze both general multivariate experimental data sets and wind tunnel strain-gage balance calibration data. The new algorithm is a simplified version of a more complex search algorithm that was originally developed at the NASA Ames Balance Calibration Laboratory. The new algorithm has the advantage that it needs only about one tenth of the original algorithm's CPU time for the completion of a search. In addition, extensive testing showed that the prediction accuracy of math models obtained from the simplified algorithm is similar to the prediction accuracy of math models obtained from the original algorithm. The simplified algorithm, however, cannot guarantee that search constraints related to a set of statistical quality requirements are always satisfied in the optimized regression models. Therefore, the simplified search algorithm is not intended to replace the original search algorithm. Instead, it may be used to generate an alternate optimized regression model of experimental data whenever the application of the original search algorithm either fails or requires too much CPU time. Data from a machine calibration of NASA's MK40 force balance is used to illustrate the application of the new regression model search algorithm.
2015-06-01
Postprocessing of Voxel-Based Topologies for Additive Manufacturing Using the Computational Geometry Algorithms Library (CGAL). Postprocessing of 3-dimensional (3-D) topologies that are defined as a set of voxels is carried out using the Computational Geometry Algorithms Library (CGAL), which provides computational geometry algorithms, several of which are suited to the task. The work flow described in this report involves first defining a set of ...
Jimenez-Del-Toro, Oscar; Muller, Henning; Krenn, Markus; Gruenberg, Katharina; Taha, Abdel Aziz; Winterstein, Marianne; Eggel, Ivan; Foncubierta-Rodriguez, Antonio; Goksel, Orcun; Jakab, Andras; Kontokotsios, Georgios; Langs, Georg; Menze, Bjoern H; Salas Fernandez, Tomas; Schaer, Roger; Walleyo, Anna; Weber, Marc-Andre; Dicente Cid, Yashin; Gass, Tobias; Heinrich, Mattias; Jia, Fucang; Kahl, Fredrik; Kechichian, Razmig; Mai, Dominic; Spanier, Assaf B; Vincent, Graham; Wang, Chunliang; Wyeth, Daniel; Hanbury, Allan
2016-11-01
Variations in the shape and appearance of anatomical structures in medical images are often relevant radiological signs of disease. Automatic tools can help automate parts of this manual process. A cloud-based evaluation framework is presented in this paper including results of benchmarking current state-of-the-art medical imaging algorithms for anatomical structure segmentation and landmark detection: the VISCERAL Anatomy benchmarks. The algorithms are implemented in virtual machines in the cloud where participants can only access the training data and can be run privately by the benchmark administrators to objectively compare their performance in an unseen common test set. Overall, 120 computed tomography and magnetic resonance patient volumes were manually annotated to create a standard Gold Corpus containing a total of 1295 structures and 1760 landmarks. Ten participants contributed with automatic algorithms for the organ segmentation task, and three for the landmark localization task. Different algorithms obtained the best scores in the four available imaging modalities and for subsets of anatomical structures. The annotation framework, resulting data set, evaluation setup, results and performance analysis from the three VISCERAL Anatomy benchmarks are presented in this article. Both the VISCERAL data set and Silver Corpus generated with the fusion of the participant algorithms on a larger set of non-manually-annotated medical images are available to the research community.
Guturu, Parthasarathy; Dantu, Ram
2008-06-01
Many graph- and set-theoretic problems, because of their tremendous application potential and theoretical appeal, have been well investigated by the researchers in complexity theory and were found to be NP-hard. Since the combinatorial complexity of these problems does not permit exhaustive searches for optimal solutions, only near-optimal solutions can be explored using either various problem-specific heuristic strategies or metaheuristic global-optimization methods, such as simulated annealing, genetic algorithms, etc. In this paper, we propose a unified evolutionary algorithm (EA) to the problems of maximum clique finding, maximum independent set, minimum vertex cover, subgraph and double subgraph isomorphism, set packing, set partitioning, and set cover. In the proposed approach, we first map these problems onto the maximum clique-finding problem (MCP), which is later solved using an evolutionary strategy. The proposed impatient EA with probabilistic tabu search (IEA-PTS) for the MCP integrates the best features of earlier successful approaches with a number of new heuristics that we developed to yield a performance that advances the state of the art in EAs for the exploration of the maximum cliques in a graph. Results of experimentation with the 37 DIMACS benchmark graphs and comparative analyses with six state-of-the-art algorithms, including two from the smaller EA community and four from the larger metaheuristics community, indicate that the IEA-PTS outperforms the EAs with respect to a Pareto-lexicographic ranking criterion and offers competitive performance on some graph instances when individually compared to the other heuristic algorithms. It has also successfully set a new benchmark on one graph instance. On another benchmark suite called Benchmarks with Hidden Optimal Solutions, IEA-PTS ranks second, after a very recent algorithm called COVER, among its peers that have experimented with this suite.
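The mapping step can be sketched in a few lines: problems such as maximum independent set are cast as maximum clique problems on a transformed graph (here, the complement graph), after which any clique-search procedure applies. The sketch below pairs that mapping with a cheap greedy clique heuristic as a stand-in for the evolutionary search; the graph encoding and the example are illustrative assumptions.

```python
from itertools import combinations

def complement(n, edges):
    """Complement graph: a maximum independent set in G is a maximum clique in G-bar."""
    edge_set = {frozenset(e) for e in edges}
    return [(u, v) for u, v in combinations(range(n), 2)
            if frozenset((u, v)) not in edge_set]

def greedy_clique(n, edges):
    """Cheap stand-in for the evolutionary clique search: repeatedly add the
    vertex adjacent to every vertex already in the clique."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    clique = []
    for v in sorted(range(n), key=lambda v: -len(adj[v])):
        if all(v in adj[u] for u in clique):
            clique.append(v)
    return clique

# independent set of a 5-cycle via a clique in its complement
n, cycle = 5, [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
print(greedy_clique(n, complement(n, cycle)))   # e.g. [0, 2], a non-adjacent pair
```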
A review of estimation of distribution algorithms in bioinformatics
Armañanzas, Rubén; Inza, Iñaki; Santana, Roberto; Saeys, Yvan; Flores, Jose Luis; Lozano, Jose Antonio; Peer, Yves Van de; Blanco, Rosa; Robles, Víctor; Bielza, Concha; Larrañaga, Pedro
2008-01-01
Evolutionary search algorithms have become an essential asset in the algorithmic toolbox for solving high-dimensional optimization problems across a broad range of bioinformatics problems. Genetic algorithms, the most well-known and representative evolutionary search technique, have been the subject of the major part of such applications. Estimation of distribution algorithms (EDAs) offer a novel evolutionary paradigm that constitutes a natural and attractive alternative to genetic algorithms. They make use of a probabilistic model, learnt from the promising solutions, to guide the search process. In this paper, we set out a basic taxonomy of EDA techniques, underlining the nature and complexity of the probabilistic model of each EDA variant. We review a set of innovative works that make use of EDA techniques to solve challenging bioinformatics problems, emphasizing the EDA paradigm's potential for further research in this domain. PMID:18822112
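To make the EDA loop concrete, the sketch below implements the simplest flavor, a univariate marginal model over bit strings (UMDA/PBIL style) applied to the toy OneMax function: sample a population from the model, select the best individuals, and re-estimate the marginals from them. Real bioinformatics applications typically use richer probabilistic models (trees or Bayesian networks); every parameter here is an illustrative assumption.

```python
import numpy as np

def umda_onemax(n_bits=30, pop_size=100, select=30, n_gens=50, seed=0):
    """Univariate EDA: sample a population from independent Bernoulli marginals,
    select the best individuals, and re-estimate the marginals from them."""
    rng = np.random.default_rng(seed)
    p = np.full(n_bits, 0.5)                       # the probabilistic model
    for _ in range(n_gens):
        pop = (rng.random((pop_size, n_bits)) < p).astype(int)
        fitness = pop.sum(axis=1)                  # OneMax: count of ones
        elite = pop[np.argsort(fitness)[-select:]]
        p = elite.mean(axis=0)
        p = np.clip(p, 0.02, 0.98)                 # keep some diversity
    return p, fitness.max()

p, best = umda_onemax()
print(best, np.round(p, 2))
```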
Comparing multiple turbulence restoration algorithms performance on noisy anisoplanatic imagery
NASA Astrophysics Data System (ADS)
Rucci, Michael A.; Hardie, Russell C.; Dapore, Alexander J.
2017-05-01
In this paper, we compare the performance of multiple turbulence mitigation algorithms to restore imagery degraded by atmospheric turbulence and camera noise. In order to quantify and compare algorithm performance, imaging scenes were simulated by applying noise and varying levels of turbulence. For the simulation, a Monte-Carlo wave optics approach is used to simulate the spatially and temporally varying turbulence in an image sequence. A Poisson-Gaussian noise mixture model is then used to add noise to the observed turbulence image set. These degraded image sets are processed with three separate restoration algorithms: Lucky Look imaging, bispectral speckle imaging, and a block matching method with restoration filter. These algorithms were chosen because they incorporate different approaches and processing techniques. The results quantitatively show how well the algorithms are able to restore the simulated degraded imagery.
Fast Steerable Principal Component Analysis
Zhao, Zhizhen; Shkolnisky, Yoel; Singer, Amit
2016-01-01
Cryo-electron microscopy nowadays often requires the analysis of hundreds of thousands of 2-D images as large as a few hundred pixels in each direction. Here, we introduce an algorithm that efficiently and accurately performs principal component analysis (PCA) for a large set of 2-D images, and, for each image, the set of its uniform rotations in the plane and their reflections. For a dataset consisting of n images of size L × L pixels, the computational complexity of our algorithm is O(nL³ + L⁴), while existing algorithms take O(nL⁴). The new algorithm computes the expansion coefficients of the images in a Fourier–Bessel basis efficiently using the nonuniform fast Fourier transform. We compare the accuracy and efficiency of the new algorithm with traditional PCA and existing algorithms for steerable PCA. PMID:27570801
Stride search: A general algorithm for storm detection in high-resolution climate data
Bosler, Peter A.; Roesler, Erika L.; Taylor, Mark A.; ...
2016-04-13
This study discusses the problem of identifying extreme climate events such as intense storms within large climate data sets. The basic storm detection algorithm is reviewed, which splits the problem into two parts: a spatial search followed by a temporal correlation problem. Two specific implementations of the spatial search algorithm are compared: the commonly used grid point search algorithm is reviewed, and a new algorithm called Stride Search is introduced. The Stride Search algorithm is defined independently of the spatial discretization associated with a particular data set. Results from the two algorithms are compared for the application of tropical cyclone detection, and shown to produce similar results for the same set of storm identification criteria. Differences between the two algorithms arise for some storms due to their different definition of search regions in physical space. The physical space associated with each Stride Search region is constant, regardless of data resolution or latitude, and Stride Search is therefore capable of searching all regions of the globe in the same manner. Stride Search's ability to search high latitudes is demonstrated for the case of polar low detection. Wall clock time required for Stride Search is shown to be smaller than a grid point search of the same data, and the relative speed up associated with Stride Search increases as resolution increases.
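The key property described above, search regions defined by a fixed physical size rather than by grid points, can be illustrated with a short sketch. The function below lays out region centres with an approximately constant great-circle spacing, widening the longitude stride toward the poles; the spacing rule and the function name are illustrative assumptions, not the published Stride Search implementation.

```python
import numpy as np

def stride_search_centers(radius_km, earth_radius_km=6371.0, lat_limit=90.0):
    """Lay out search-region centres with roughly constant physical spacing.

    Illustrative sketch only: centres are spaced by one region radius in
    latitude, and the longitude stride is widened by 1/cos(lat) so that each
    region covers a similar physical area regardless of latitude.
    """
    dlat = np.degrees(radius_km / earth_radius_km)   # latitude stride in degrees
    centers = []
    lat = -lat_limit + dlat / 2.0
    while lat < lat_limit:
        # widen the longitude stride toward the poles; clamp very near the poles
        dlon = dlat / max(np.cos(np.radians(lat)), 1e-3)
        lon = 0.0
        while lon < 360.0:
            centers.append((lat, lon))
            lon += dlon
        lat += dlat
    return centers

# e.g. ~500 km regions: far fewer centres at high latitude than a uniform
# latitude-longitude grid point search would have to visit.
print(len(stride_search_centers(500.0)))
```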
Any Two Learning Algorithms Are (Almost) Exactly Identical
NASA Technical Reports Server (NTRS)
Wolpert, David H.
2000-01-01
This paper shows that if one is provided with a loss function, it can be used in a natural way to specify a distance measure quantifying the similarity of any two supervised learning algorithms, even non-parametric algorithms. Intuitively, this measure gives the fraction of targets and training sets for which the expected performance of the two algorithms differs significantly. Bounds on the value of this distance are calculated for the case of binary outputs and 0-1 loss, indicating that any two learning algorithms are almost exactly identical for such scenarios. As an example, for any two algorithms A and B, even for small input spaces and training sets, for less than 2e(-50) of all targets will the difference between A's and B's generalization performance exceed 1%. In particular, this is true if B is bagging applied to A, or boosting applied to A. These bounds can be viewed alternatively as telling us, for example, that the simple English phrase 'I expect that algorithm A will generalize from the training set with an accuracy of at least 75% on the rest of the target' conveys 20,000 bytes of information concerning the target. The paper ends by discussing some of the subtleties of extending the distance measure to give a full (non-parametric) differential geometry of the manifold of learning algorithms.
Siuly; Li, Yan; Paul Wen, Peng
2014-03-01
Motor imagery (MI) task classification provides an important basis for designing brain-computer interface (BCI) systems. If the MI tasks are reliably distinguished through identifying typical patterns in electroencephalography (EEG) data, people with motor disabilities could communicate with a device by composing sequences of these mental states. In our earlier study, we developed a cross-correlation based logistic regression (CC-LR) algorithm for the classification of MI tasks for BCI applications, but its performance was not satisfactory. This study develops a modified version of the CC-LR algorithm exploring a suitable feature set that can improve the performance. The modified CC-LR algorithm uses the C3 electrode channel (in the international 10-20 system) as a reference channel for the cross-correlation (CC) technique and applies three diverse feature sets separately, as the input to the logistic regression (LR) classifier. The present algorithm investigates which feature set is the best to characterize the distribution of MI task-based EEG data. This study also provides an insight into how to select a reference channel for the CC technique with EEG signals considering the anatomical structure of the human brain. The proposed algorithm is compared with eight of the most recently reported well-known methods including the BCI III Winner algorithm. The findings of this study indicate that the modified CC-LR algorithm has potential to improve the identification performance of MI tasks in BCI systems. The results demonstrate that the proposed technique provides a classification improvement over the existing methods tested. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
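As a rough illustration of the cross-correlation plus logistic-regression pipeline described above, the sketch below cross-correlates each trial of a chosen channel with the C3 reference channel, summarises the cross-correlogram with a handful of statistics, and feeds them to a logistic regression classifier. The specific statistics, normalisation, and preprocessing are assumptions for illustration, not the exact feature sets evaluated in the study.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def cc_features(trial, ref):
    """Cross-correlate one EEG trial with the reference channel (e.g. C3) and
    summarise the cross-correlogram with a few simple statistics."""
    cc = np.correlate(trial - trial.mean(), ref - ref.mean(), mode="full")
    cc = cc / (np.std(trial) * np.std(ref) * len(ref))   # crude normalisation
    return [cc.mean(), cc.std(), cc.max(), cc.min(), np.median(cc), np.sum(cc ** 2)]

def fit_cc_lr(trials, ref_trials, labels):
    """trials, ref_trials: (n_trials, n_samples) arrays; labels: MI class per trial."""
    X = np.array([cc_features(t, r) for t, r in zip(trials, ref_trials)])
    return LogisticRegression(max_iter=1000).fit(X, labels)
```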
Addison, Paul S; Wang, Rui; Uribe, Alberto A; Bergese, Sergio D
2015-06-01
DPOP (∆POP or Delta-POP) is a non-invasive parameter which measures the strength of respiratory modulations present in the pulse oximetry photoplethysmogram (pleth) waveform. It has been proposed as a non-invasive surrogate parameter for pulse pressure variation (PPV) used in the prediction of the response to volume expansion in hypovolemic patients. Many groups have reported on the DPOP parameter and its correlation with PPV using various semi-automated algorithmic implementations. The study reported here demonstrates the performance gains made by adding increasingly sophisticated signal processing components to a fully automated DPOP algorithm. A DPOP algorithm was coded and its performance systematically enhanced through a series of code module alterations and additions. Each algorithm iteration was tested on data from 20 mechanically ventilated OR patients. Correlation coefficients and ROC curve statistics were computed at each stage. For the purposes of the analysis we split the data into a manually selected 'stable' region subset of the data containing relatively noise free segments and a 'global' set incorporating the whole data record. Performance gains were measured in terms of correlation against PPV measurements in OR patients undergoing controlled mechanical ventilation. Through increasingly advanced pre-processing and post-processing enhancements to the algorithm, the correlation coefficient between DPOP and PPV improved from a baseline value of R = 0.347 to R = 0.852 for the stable data set, and, correspondingly, R = 0.225 to R = 0.728 for the more challenging global data set. Marked gains in algorithm performance are achievable for manually selected stable regions of the signals using relatively simple algorithm enhancements. Significant additional algorithm enhancements, including a correction for low perfusion values, were required before similar gains were realised for the more challenging global data set.
A greedy algorithm for species selection in dimension reduction of combustion chemistry
NASA Astrophysics Data System (ADS)
Hiremath, Varun; Ren, Zhuyin; Pope, Stephen B.
2010-09-01
Computational calculations of combustion problems involving large numbers of species and reactions with a detailed description of the chemistry can be very expensive. Numerous dimension reduction techniques have been developed in the past to reduce the computational cost. In this paper, we consider the rate controlled constrained-equilibrium (RCCE) dimension reduction method, in which a set of constrained species is specified. For a given number of constrained species, the 'optimal' set of constrained species is that which minimizes the dimension reduction error. The direct determination of the optimal set is computationally infeasible, and instead we present a greedy algorithm which aims at determining a 'good' set of constrained species; that is, one leading to near-minimal dimension reduction error. The partially-stirred reactor (PaSR) involving methane premixed combustion with chemistry described by the GRI-Mech 1.2 mechanism containing 31 species is used to test the algorithm. Results on dimension reduction errors for different sets of constrained species are presented to assess the effectiveness of the greedy algorithm. It is shown that the first four constrained species selected using the proposed greedy algorithm produce lower dimension reduction error than constraints on the major species: CH4, O2, CO2 and H2O. It is also shown that the first ten constrained species selected using the proposed greedy algorithm produce a non-increasing dimension reduction error with every additional constrained species; and produce the lowest dimension reduction error in many cases tested over a wide range of equivalence ratios, pressures and initial temperatures.
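The greedy selection loop itself is simple and can be sketched generically as below. Here `reduction_error` stands in for a full RCCE/PaSR evaluation of the dimension reduction error for a candidate constraint set, which in practice is the expensive part and is assumed to be supplied by the user.

```python
def greedy_constrained_species(all_species, n_select, reduction_error):
    """Greedy forward selection: at each step, add the species whose inclusion
    gives the lowest dimension-reduction error for the current set.
    `reduction_error(species_list)` is assumed to return a scalar error."""
    selected = []
    for _ in range(n_select):
        best, best_err = None, float("inf")
        for s in all_species:
            if s in selected:
                continue
            err = reduction_error(selected + [s])
            if err < best_err:
                best, best_err = s, err
        selected.append(best)
    return selected
```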
A high-performance spatial database based approach for pathology imaging algorithm evaluation
Wang, Fusheng; Kong, Jun; Gao, Jingjing; Cooper, Lee A.D.; Kurc, Tahsin; Zhou, Zhengwen; Adler, David; Vergara-Niedermayr, Cristobal; Katigbak, Bryan; Brat, Daniel J.; Saltz, Joel H.
2013-01-01
Background: Algorithm evaluation provides a means to characterize variability across image analysis algorithms, validate algorithms by comparison with human annotations, combine results from multiple algorithms for performance improvement, and facilitate algorithm sensitivity studies. The sizes of images and image analysis results in pathology image analysis pose significant challenges in algorithm evaluation. We present an efficient parallel spatial database approach to model, normalize, manage, and query large volumes of analytical image result data. This provides an efficient platform for algorithm evaluation. Our experiments with a set of brain tumor images demonstrate the application, scalability, and effectiveness of the platform. Context: The paper describes an approach and platform for evaluation of pathology image analysis algorithms. The platform facilitates algorithm evaluation through a high-performance database built on the Pathology Analytic Imaging Standards (PAIS) data model. Aims: (1) Develop a framework to support algorithm evaluation by modeling and managing analytical results and human annotations from pathology images; (2) Create a robust data normalization tool for converting, validating, and fixing spatial data from algorithm or human annotations; (3) Develop a set of queries to support data sampling and result comparisons; (4) Achieve high performance computation capacity via a parallel data management infrastructure, parallel data loading and spatial indexing optimizations in this infrastructure. Materials and Methods: We have considered two scenarios for algorithm evaluation: (1) algorithm comparison where multiple result sets from different methods are compared and consolidated; and (2) algorithm validation where algorithm results are compared with human annotations. We have developed a spatial normalization toolkit to validate and normalize spatial boundaries produced by image analysis algorithms or human annotations. The validated data were formatted based on the PAIS data model and loaded into a spatial database. To support efficient data loading, we have implemented a parallel data loading tool that takes advantage of multi-core CPUs to accelerate data injection. The spatial database manages both geometric shapes and image features or classifications, and enables spatial sampling, result comparison, and result aggregation through expressive structured query language (SQL) queries with spatial extensions. To provide scalable and efficient query support, we have employed a shared nothing parallel database architecture, which distributes data homogenously across multiple database partitions to take advantage of parallel computation power and implements spatial indexing to achieve high I/O throughput. Results: Our work proposes a high performance, parallel spatial database platform for algorithm validation and comparison. This platform was evaluated by storing, managing, and comparing analysis results from a set of brain tumor whole slide images. The tools we develop are open source and available to download. Conclusions: Pathology image algorithm validation and comparison are essential to iterative algorithm development and refinement. One critical component is the support for queries involving spatial predicates and comparisons. In our work, we develop an efficient data model and parallel database approach to model, normalize, manage and query large volumes of analytical image result data. 
Our experiments demonstrate that the data partitioning strategy and the grid-based indexing result in good data distribution across database nodes and reduce I/O overhead in spatial join queries through parallel retrieval of relevant data and quick subsetting of datasets. The set of tools in the framework provide a full pipeline to normalize, load, manage and query analytical results for algorithm evaluation. PMID:23599905
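As a small illustration of the kind of spatial comparison such queries express (computed here with shapely in Python rather than inside a spatial database), the sketch below reports overlap area, the Jaccard index, and centroid distance between an algorithm-produced boundary and a human annotation. The polygon coordinates are made up for the example and the metric set is illustrative, not the platform's query suite.

```python
from shapely.geometry import Polygon

def region_agreement(algo_poly, human_poly):
    """Compare one algorithm-produced boundary with one human annotation."""
    inter = algo_poly.intersection(human_poly).area
    union = algo_poly.union(human_poly).area
    return {
        "overlap_area": inter,
        "jaccard": inter / union if union > 0 else 0.0,
        "centroid_dist": algo_poly.centroid.distance(human_poly.centroid),
    }

algo = Polygon([(0, 0), (10, 0), (10, 10), (0, 10)])
human = Polygon([(2, 2), (12, 2), (12, 12), (2, 12)])
print(region_agreement(algo, human))
```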
A new algorithm to construct phylogenetic networks from trees.
Wang, J
2014-03-06
Developing appropriate methods for constructing phylogenetic networks from tree sets is an important problem, and much research is currently being undertaken in this area. BIMLR is an algorithm that constructs phylogenetic networks from tree sets. The algorithm can construct a much simpler network than other available methods. Here, we introduce an improved version of the BIMLR algorithm, QuickCass. QuickCass changes the selection strategy of the labels of leaves below the reticulate nodes, i.e., the nodes with an indegree of at least 2 in BIMLR. We show that QuickCass can construct simpler phylogenetic networks than BIMLR. Furthermore, we show that QuickCass is a polynomial-time algorithm when the output network that is constructed by QuickCass is binary.
Cota-Ruiz, Juan; Rosiles, Jose-Gerardo; Sifuentes, Ernesto; Rivas-Perea, Pablo
2012-01-01
This research presents a distributed and formula-based bilateration algorithm that can be used to provide an initial set of locations. In this scheme, each node uses distance estimates to anchors to solve a set of circle-circle intersection (CCI) problems through a purely geometric formulation. The resulting CCIs are processed to pick those that cluster together, and their average is taken to produce an initial node location. The algorithm is compared in terms of accuracy and computational complexity with a least-squares localization algorithm based on the Levenberg-Marquardt methodology. Results in accuracy vs. computational performance show that the bilateration algorithm is competitive with well-known optimized localization algorithms.
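A compact sketch of the core geometric step follows: each pair of anchor circles is intersected in closed form, and the resulting intersection points are reduced to an initial position estimate. The final step here (keep the half of the points closest to the median, then average) is a simplified stand-in for the paper's clustering rule, not its exact procedure.

```python
import numpy as np
from itertools import combinations

def circle_intersections(p0, r0, p1, r1):
    """Return the 0, 1, or 2 intersection points of two circles (closed form)."""
    p0, p1 = np.asarray(p0, float), np.asarray(p1, float)
    d = np.linalg.norm(p1 - p0)
    if d == 0 or d > r0 + r1 or d < abs(r0 - r1):
        return []                                  # no usable intersection
    a = (r0 ** 2 - r1 ** 2 + d ** 2) / (2 * d)
    h = np.sqrt(max(r0 ** 2 - a ** 2, 0.0))
    mid = p0 + a * (p1 - p0) / d
    perp = np.array([-(p1 - p0)[1], (p1 - p0)[0]]) / d
    return [mid + h * perp, mid - h * perp]

def bilateration_estimate(anchors, ranges):
    """Initial node location: solve all pairwise CCIs, keep the tight cluster,
    and average (the clustering rule here is a simplified stand-in)."""
    pts = []
    for i, j in combinations(range(len(anchors)), 2):
        pts += circle_intersections(anchors[i], ranges[i], anchors[j], ranges[j])
    pts = np.array(pts)
    dist = np.linalg.norm(pts - np.median(pts, axis=0), axis=1)
    keep = pts[dist <= np.percentile(dist, 50)]
    return keep.mean(axis=0)
```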
Directly reconstructing principal components of heterogeneous particles from cryo-EM images.
Tagare, Hemant D; Kucukelbir, Alp; Sigworth, Fred J; Wang, Hongwei; Rao, Murali
2015-08-01
Structural heterogeneity of particles can be investigated by their three-dimensional principal components. This paper addresses the question of whether, and with what algorithm, the three-dimensional principal components can be directly recovered from cryo-EM images. The first part of the paper extends the Fourier slice theorem to covariance functions showing that the three-dimensional covariance, and hence the principal components, of a heterogeneous particle can indeed be recovered from two-dimensional cryo-EM images. The second part of the paper proposes a practical algorithm for reconstructing the principal components directly from cryo-EM images without the intermediate step of calculating covariances. This algorithm is based on maximizing the posterior likelihood using the Expectation-Maximization algorithm. The last part of the paper applies this algorithm to simulated data and to two real cryo-EM data sets: a data set of the 70S ribosome with and without Elongation Factor-G (EF-G), and a data set of the influenza virus RNA dependent RNA Polymerase (RdRP). The first principal component of the 70S ribosome data set reveals the expected conformational changes of the ribosome as the EF-G binds and unbinds. The first principal component of the RdRP data set reveals a conformational change in the two dimers of the RdRP. Copyright © 2015 Elsevier Inc. All rights reserved.
Stargate GTM: Bridging Descriptor and Activity Spaces.
Gaspar, Héléna A; Baskin, Igor I; Marcou, Gilles; Horvath, Dragos; Varnek, Alexandre
2015-11-23
Predicting the activity profile of a molecule or discovering structures possessing a specific activity profile are two important goals in chemoinformatics, which could be achieved by bridging activity and molecular descriptor spaces. In this paper, we introduce the "Stargate" version of the Generative Topographic Mapping approach (S-GTM) in which two different multidimensional spaces (e.g., structural descriptor space and activity space) are linked through a common 2D latent space. In the S-GTM algorithm, the manifolds are trained simultaneously in two initial spaces using the probabilities in the 2D latent space calculated as a weighted geometric mean of probability distributions in both spaces. S-GTM has the following interesting features: (1) activities are involved during the training procedure; therefore, the method is supervised, unlike conventional GTM; (2) using molecular descriptors of a given compound as input, the model predicts a whole activity profile, and (3) using an activity profile as input, areas populated by relevant chemical structures can be detected. To assess the performance of S-GTM prediction models, a descriptor space (ISIDA descriptors) of a set of 1325 GPCR ligands was related to a B-dimensional (B = 1 or 8) activity space corresponding to pKi values for eight different targets. S-GTM outperforms conventional GTM for individual activities and performs similarly to the Lasso multitask learning algorithm, although it is still slightly less accurate than the Random Forest method.
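The coupling step described above, latent-grid probabilities formed as a weighted geometric mean of the per-space distributions, reduces to a few lines. The sketch below assumes the two responsibility arrays have already been computed for each latent node, and the weight `w` balancing the two spaces is a free parameter; it is an illustration of the stated idea, not the S-GTM training code.

```python
import numpy as np

def coupled_responsibilities(p_desc, p_act, w=0.5):
    """Weighted geometric mean of latent-node probabilities from the descriptor
    space and the activity space, renormalised per sample.
    p_desc, p_act: arrays of shape (n_samples, n_latent_nodes)."""
    p = (p_desc ** w) * (p_act ** (1.0 - w))
    return p / p.sum(axis=1, keepdims=True)
```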
Gulshan, Varun; Peng, Lily; Coram, Marc; Stumpe, Martin C; Wu, Derek; Narayanaswamy, Arunachalam; Venugopalan, Subhashini; Widner, Kasumi; Madams, Tom; Cuadros, Jorge; Kim, Ramasamy; Raman, Rajiv; Nelson, Philip C; Mega, Jessica L; Webster, Dale R
2016-12-13
Deep learning is a family of computational methods that allow an algorithm to program itself by learning from a large set of examples that demonstrate the desired behavior, removing the need to specify rules explicitly. Application of these methods to medical imaging requires further assessment and validation. To apply deep learning to create an algorithm for automated detection of diabetic retinopathy and diabetic macular edema in retinal fundus photographs. A specific type of neural network optimized for image classification called a deep convolutional neural network was trained using a retrospective development data set of 128 175 retinal images, which were graded 3 to 7 times for diabetic retinopathy, diabetic macular edema, and image gradability by a panel of 54 US licensed ophthalmologists and ophthalmology senior residents between May and December 2015. The resultant algorithm was validated in January and February 2016 using 2 separate data sets, both graded by at least 7 US board-certified ophthalmologists with high intragrader consistency. Deep learning-trained algorithm. The sensitivity and specificity of the algorithm for detecting referable diabetic retinopathy (RDR), defined as moderate and worse diabetic retinopathy, referable diabetic macular edema, or both, were generated based on the reference standard of the majority decision of the ophthalmologist panel. The algorithm was evaluated at 2 operating points selected from the development set, one selected for high specificity and another for high sensitivity. The EyePACS-1 data set consisted of 9963 images from 4997 patients (mean age, 54.4 years; 62.2% women; prevalence of RDR, 683/8878 fully gradable images [7.8%]); the Messidor-2 data set had 1748 images from 874 patients (mean age, 57.6 years; 42.6% women; prevalence of RDR, 254/1745 fully gradable images [14.6%]). For detecting RDR, the algorithm had an area under the receiver operating curve of 0.991 (95% CI, 0.988-0.993) for EyePACS-1 and 0.990 (95% CI, 0.986-0.995) for Messidor-2. Using the first operating cut point with high specificity, for EyePACS-1, the sensitivity was 90.3% (95% CI, 87.5%-92.7%) and the specificity was 98.1% (95% CI, 97.8%-98.5%). For Messidor-2, the sensitivity was 87.0% (95% CI, 81.1%-91.0%) and the specificity was 98.5% (95% CI, 97.7%-99.1%). Using a second operating point with high sensitivity in the development set, for EyePACS-1 the sensitivity was 97.5% and specificity was 93.4% and for Messidor-2 the sensitivity was 96.1% and specificity was 93.9%. In this evaluation of retinal fundus photographs from adults with diabetes, an algorithm based on deep machine learning had high sensitivity and specificity for detecting referable diabetic retinopathy. Further research is necessary to determine the feasibility of applying this algorithm in the clinical setting and to determine whether use of the algorithm could lead to improved care and outcomes compared with current ophthalmologic assessment.
Active Noise Control Experiments using Sound Energy Flux
NASA Astrophysics Data System (ADS)
Krause, Uli
2015-03-01
This paper reports on the latest results concerning the active noise control approach using net flow of acoustic energy. The test set-up consists of two loudspeakers simulating the engine noise and two smaller loudspeakers which belong to the active noise system. The system is completed by two acceleration sensors and one microphone per loudspeaker. The microphones are located in the near sound field of the loudspeakers. The control algorithm including the update equation of the feed-forward controller is introduced. Numerical simulations are performed with a comparison to a state of the art method minimising the radiated sound power. The proposed approach is experimentally validated.
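For context, the sketch below shows a conventional filtered-x LMS feed-forward loop, the usual baseline for this kind of controller. It is not the paper's update equation, which instead minimises the net acoustic energy flux estimated from the microphone/accelerometer pairs rather than the squared pressure at an error microphone; `sec_path` is an assumed FIR model of the loudspeaker-to-microphone path.

```python
import numpy as np

def fxlms(reference, disturbance, sec_path, n_taps=64, mu=1e-3):
    """Generic filtered-x LMS feed-forward ANC loop (baseline illustration only)."""
    w = np.zeros(n_taps)                  # feed-forward controller taps
    x_buf = np.zeros(n_taps)              # reference history seen by the controller
    xf_buf = np.zeros(n_taps)             # filtered-reference history for the update
    xs_buf = np.zeros(len(sec_path))      # reference history for the secondary path
    y_buf = np.zeros(len(sec_path))       # control-signal history for the secondary path
    errors = []
    for x, d in zip(reference, disturbance):
        x_buf = np.r_[x, x_buf[:-1]]
        y = w @ x_buf                     # anti-noise sample sent to the control source
        y_buf = np.r_[y, y_buf[:-1]]
        e = d + sec_path @ y_buf          # residual at the error microphone
        xs_buf = np.r_[x, xs_buf[:-1]]
        xf = sec_path @ xs_buf            # reference filtered through the secondary path
        xf_buf = np.r_[xf, xf_buf[:-1]]
        w = w - mu * e * xf_buf           # LMS update of the feed-forward controller
        errors.append(e)
    return np.asarray(errors)
```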
NASA Astrophysics Data System (ADS)
Nishizuka, N.; Sugiura, K.; Kubo, Y.; Den, M.; Watari, S.; Ishii, M.
2017-02-01
We developed a flare prediction model using machine learning, which is optimized to predict the maximum class of flares occurring in the following 24 hr. Machine learning is used to devise algorithms that can learn from and make decisions on a huge amount of data. We used solar observation data during the period 2010-2015, such as vector magnetograms, ultraviolet (UV) emission, and soft X-ray emission taken by the Solar Dynamics Observatory and the Geostationary Operational Environmental Satellite. We detected active regions (ARs) from the full-disk magnetogram, from which ˜60 features were extracted with their time differentials, including magnetic neutral lines, the current helicity, the UV brightening, and the flare history. After standardizing the feature database, we fully shuffled and randomly separated it into two for training and testing. To investigate which algorithm is best for flare prediction, we compared three machine-learning algorithms: the support vector machine, k-nearest neighbors (k-NN), and extremely randomized trees. The prediction score, the true skill statistic, was higher than 0.9 with a fully shuffled data set, which is higher than that for human forecasts. It was found that k-NN has the highest performance among the three algorithms. The ranking of the feature importance showed that previous flare activity is most effective, followed by the length of magnetic neutral lines, the unsigned magnetic flux, the area of UV brightening, and the time differentials of features over 24 hr, all of which are strongly correlated with the flux emergence dynamics in an AR.
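The skill score quoted above, the true skill statistic, is simply the hit rate minus the false-alarm rate and can be computed for any of the compared classifiers; the sketch below is a generic implementation, not code from the study.

```python
import numpy as np

def true_skill_statistic(y_true, y_pred):
    """TSS = TP/(TP+FN) - FP/(FP+TN): 1 is a perfect forecast, 0 is no skill."""
    y_true, y_pred = np.asarray(y_true, bool), np.asarray(y_pred, bool)
    tp = np.sum(y_true & y_pred)
    fn = np.sum(y_true & ~y_pred)
    fp = np.sum(~y_true & y_pred)
    tn = np.sum(~y_true & ~y_pred)
    return tp / (tp + fn) - fp / (fp + tn)

# e.g. scoring a k-NN flare/no-flare prediction against a labelled test split
# (hypothetical names): print(true_skill_statistic(y_test, knn.predict(X_test)))
```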
Development of Personalized Urination Recognition Technology Using Smart Bands.
Eun, Sung-Jong; Whangbo, Taeg-Keun; Park, Dong Kyun; Kim, Khae-Hawn
2017-04-01
This study collected and analyzed activity data sensed through smart bands worn by patients in order to resolve the clinical issues posed by using voiding charts. By developing a smart band-based algorithm for recognizing urination activity in patients, this study aimed to explore the feasibility of urination monitoring systems. This study aimed to develop an algorithm that recognizes urination based on a patient's posture and changes in posture. Motion data was obtained from a smart band on the arm. An algorithm that recognizes the 3 stages of urination (forward movement, urination, backward movement) was developed based on data collected from a 3-axis accelerometer and from tilt angle data. Real-time data were acquired from the smart band, and for data corresponding to a certain duration, the absolute value of the signals was calculated and then compared with the set threshold value to determine the occurrence of vibration signals. In feature extraction, the most essential information describing each pattern was identified after analyzing the characteristics of the data. The results of the feature extraction process were sorted using a classifier to detect urination. An experiment was carried out to assess the performance of the recognition technology proposed in this study. The final accuracy of the algorithm was calculated based on clinical guidelines for urologists. The experiment showed a high average accuracy of 90.4%, proving the robustness of the proposed algorithm. The proposed urination recognition technology draws on acceleration data and tilt angle data collected via a smart band; these data were then analyzed using a classifier after comparative analyses with standardized feature patterns.
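The first stage described above, a windowed comparison of the absolute acceleration against a set threshold, might look like the sketch below. The window length and the threshold value are placeholders rather than the study's calibrated settings, and the later stages (feature extraction and classification of the three urination phases) are omitted.

```python
import numpy as np

def detect_motion_events(acc_xyz, fs, window_s=2.0, threshold=1.2):
    """Flag windows whose 3-axis acceleration magnitude exceeds a set threshold.
    acc_xyz: (n_samples, 3) accelerometer data in g; threshold of 1.2 g is an
    illustrative assumption."""
    mag = np.linalg.norm(acc_xyz, axis=1)          # |a| per sample
    win = int(window_s * fs)
    flags = []
    for start in range(0, len(mag) - win, win):
        flags.append(np.max(np.abs(mag[start:start + win])) > threshold)
    return np.array(flags)   # True where the window shows vibration/movement
```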
Ko, Gene M; Garg, Rajni; Bailey, Barbara A; Kumar, Sunil
2016-01-01
Quantitative structure-activity relationship (QSAR) models can be used as a predictive tool for virtual screening of chemical libraries to identify novel drug candidates. The aims of this paper were to report the results of a study performed for descriptor selection, QSAR model development, and virtual screening for identifying novel HIV-1 integrase inhibitor drug candidates. First, three evolutionary algorithms were compared for descriptor selection: differential evolution-binary particle swarm optimization (DE-BPSO), binary particle swarm optimization, and genetic algorithms. Next, three QSAR models were developed from an ensemble of multiple linear regression, partial least squares, and extremely randomized trees models. A comparison of the performances of three evolutionary algorithms showed that DE-BPSO has a significant improvement over the other two algorithms. QSAR models developed in this study were used in consensus as a predictive tool for virtual screening of the NCI Open Database containing 265,242 compounds to identify potential novel HIV-1 integrase inhibitors. Six compounds were predicted to be highly active (pIC50 > 6) by each of the three models. The use of a hybrid evolutionary algorithm (DE-BPSO) for descriptor selection and QSAR model development in drug design is a novel approach. Consensus modeling may provide better predictivity by taking into account a broader range of chemical properties within the data set conducive for inhibition that may be missed by an individual model. The six compounds identified provide novel drug candidate leads in the design of next generation HIV-1 integrase inhibitors targeting drug resistant mutant viruses.
Anti AIDS drug design with the help of neural networks
NASA Astrophysics Data System (ADS)
Tetko, I. V.; Tanchuk, V. Yu.; Luik, A. I.
1995-04-01
Artificial neural networks were used to analyze and predict human immunodeficiency virus type 1 reverse transcriptase inhibitors. The training and control set included 44 molecules (most of them well-known substances such as AZT, TIBO, dde, etc.). The biological activities of the molecules were taken from the literature and rated into two classes, active and inactive compounds, according to their values. We used topological indices as molecular parameters. The four most informative parameters (out of 46) were chosen using cluster analysis and an original input-parameter estimation procedure, and were used to predict activities of both the control and new (synthesized in our institute) molecules. We applied a pruning network algorithm and network ensembles to obtain the final classifier and avoid chance correlation. An increase in the networks' generalization of the data from the control set was observed when using the aforementioned methods. The prognosis of new molecules revealed one molecule as possibly active, which was confirmed by further biological tests. The compound was as active as AZT and an order of magnitude less toxic. The active compound is currently being evaluated in preclinical trials as a possible drug for anti-AIDS therapy.
NASA Astrophysics Data System (ADS)
Lin, Geng; Guan, Jian; Feng, Huibin
2018-06-01
The positive influence dominating set problem is a variant of the minimum dominating set problem, and has many applications in social networks. It is NP-hard and has received increasing attention. Various methods have been proposed to solve the positive influence dominating set problem. However, most of the existing work focused on greedy algorithms, and the solution quality needs to be improved. In this paper, we formulate the minimum positive influence dominating set problem as an integer linear program (ILP), and propose an ILP-based memetic algorithm (ILPMA) for solving the problem. The ILPMA integrates a greedy randomized adaptive construction procedure, a crossover operator, a repair operator, and a tabu search procedure. The performance of ILPMA is validated on nine real-world social networks with up to 36,692 nodes. The results show that ILPMA significantly improves the solution quality, and is robust.
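A minimal version of the ILP at the heart of such an approach can be sketched with PuLP, under the common definition that every vertex must have at least half of its neighbours inside the set; the paper's exact formulation and its memetic components (construction, crossover, repair, tabu search) are not reproduced here.

```python
import math
import pulp

def pids_ilp(adj):
    """ILP sketch for a minimum positive influence dominating set.
    adj: dict mapping each vertex to the set of its neighbours (symmetric).
    Constraint (assumed definition): each vertex needs >= ceil(deg/2) neighbours in the set."""
    prob = pulp.LpProblem("PIDS", pulp.LpMinimize)
    x = {v: pulp.LpVariable(f"x_{v}", cat="Binary") for v in adj}
    prob += pulp.lpSum(x.values())                       # minimise the set size
    for v, nbrs in adj.items():
        prob += pulp.lpSum(x[u] for u in nbrs) >= math.ceil(len(nbrs) / 2)
    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    return [v for v in adj if x[v].value() == 1]

# tiny example: a path a-b-c; an optimal solution has two vertices, e.g. {a, b}
print(pids_ilp({"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}))
```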
Ultra-high resolution computed tomography imaging
Paulus, Michael J.; Sari-Sarraf, Hamed; Tobin, Jr., Kenneth William; Gleason, Shaun S.; Thomas, Jr., Clarence E.
2002-01-01
A method for ultra-high resolution computed tomography imaging, comprising the steps of: focusing a high energy particle beam, for example x-rays or gamma-rays, onto a target object; acquiring a 2-dimensional projection data set representative of the target object; generating a corrected projection data set by applying a deconvolution algorithm, having an experimentally determined transfer function, to the 2-dimensional data set; storing the corrected projection data set; incrementally rotating the target object through an angle of approximately 180 degrees, and after each incremental rotation, repeating the radiating, acquiring, generating and storing steps; and, after the rotating step, applying a cone-beam algorithm, for example a modified tomographic reconstruction algorithm, to the corrected projection data sets to generate a 3-dimensional image. The size of the spot focus of the beam is reduced to not greater than approximately 1 micron, and even to not greater than approximately 0.5 microns.
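The projection-correction step can be illustrated with a generic Wiener-style frequency-domain deconvolution, sketched below. The transfer function is represented by a measured point-spread function `psf`, and the regularisation constant `eps` is an assumption; this is a stand-in for the patent's deconvolution step, not its specific algorithm.

```python
import numpy as np

def deconvolve_projection(projection, psf, eps=1e-3):
    """Wiener-style deconvolution of one 2-D projection with a measured PSF."""
    P = np.fft.fft2(projection)
    H = np.fft.fft2(np.fft.ifftshift(psf), s=projection.shape)  # transfer function
    corrected = np.fft.ifft2(P * np.conj(H) / (np.abs(H) ** 2 + eps))
    return np.real(corrected)
```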
Automatable algorithms to identify nonmedical opioid use using electronic data: a systematic review.
Canan, Chelsea; Polinski, Jennifer M; Alexander, G Caleb; Kowal, Mary K; Brennan, Troyen A; Shrank, William H
2017-11-01
Improved methods to identify nonmedical opioid use can help direct health care resources to individuals who need them. Automated algorithms that use large databases of electronic health care claims or records for surveillance are a potential means to achieve this goal. In this systematic review, we reviewed the utility, attempts at validation, and application of such algorithms to detect nonmedical opioid use. We searched PubMed and Embase for articles describing automatable algorithms that used electronic health care claims or records to identify patients or prescribers with likely nonmedical opioid use. We assessed algorithm development, validation, and performance characteristics and the settings where they were applied. Study variability precluded a meta-analysis. Of 15 included algorithms, 10 targeted patients, 2 targeted providers, 2 targeted both, and 1 identified medications with high abuse potential. Most patient-focused algorithms (67%) used prescription drug claims and/or medical claims, with diagnosis codes of substance abuse and/or dependence as the reference standard. Eleven algorithms were developed via regression modeling. Four used natural language processing, data mining, audit analysis, or factor analysis. Automated algorithms can facilitate population-level surveillance. However, there is no true gold standard for determining nonmedical opioid use. Users must recognize the implications of identifying false positives and, conversely, false negatives. Few algorithms have been applied in real-world settings. Automated algorithms may facilitate identification of patients and/or providers most likely to need more intensive screening and/or intervention for nonmedical opioid use. Additional implementation research in real-world settings would clarify their utility. © The Author 2017. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com
Fast graph-based relaxed clustering for large data sets using minimal enclosing ball.
Qian, Pengjiang; Chung, Fu-Lai; Wang, Shitong; Deng, Zhaohong
2012-06-01
Although graph-based relaxed clustering (GRC) is one of the spectral clustering algorithms with straightforwardness and self-adaptability, it is sensitive to the parameters of the adopted similarity measure and also has high time complexity O(N³), which severely weakens its usefulness for large data sets. In order to overcome these shortcomings, after introducing certain constraints for GRC, an enhanced version of GRC [constrained GRC (CGRC)] is proposed to increase the robustness of GRC to the parameters of the adopted similarity measure, and accordingly, a novel algorithm called fast GRC (FGRC) based on CGRC is developed in this paper by using the core-set-based minimal enclosing ball approximation. A distinctive advantage of FGRC is that its asymptotic time complexity is linear with the data set size N. At the same time, FGRC also inherits the straightforwardness and self-adaptability from GRC, making the proposed FGRC a fast and effective clustering algorithm for large data sets. The advantages of FGRC are validated by various benchmarking and real data sets.
Multiprocessor sparse L/U decomposition with controlled fill-in
NASA Technical Reports Server (NTRS)
Alaghband, G.; Jordan, H. F.
1985-01-01
Generation of the maximal compatibles of pivot elements for a class of small sparse matrices is studied. The algorithm involves a binary tree search and has a complexity exponential in the order of the matrix. Different strategies for selection of a set of compatible pivots based on the Markowitz criterion are investigated. The competing issues of parallelism and fill-in generation are studied and results are provided. A technique for obtaining an ordered compatible set directly from the ordered incompatible table is given. This technique generates a set of compatible pivots with the property of generating few fills. A new heuristic algorithm is then proposed that combines the idea of an ordered compatible set with a limited binary tree search to generate several sets of compatible pivots in linear time. Finally, an elimination set to reduce the matrix is selected. Parameters are suggested to obtain a balance between parallelism and fill-ins. Results of applying the proposed algorithms on several large application matrices are presented and analyzed.
Turbo-SMT: Parallel Coupled Sparse Matrix-Tensor Factorizations and Applications
Papalexakis, Evangelos E.; Faloutsos, Christos; Mitchell, Tom M.; Talukdar, Partha Pratim; Sidiropoulos, Nicholas D.; Murphy, Brian
2016-01-01
How can we correlate the neural activity in the human brain as it responds to typed words, with properties of these terms (like 'edible', 'fits in hand')? In short, we want to find latent variables that jointly explain both the brain activity and the behavioral responses. This is one of many settings of the Coupled Matrix-Tensor Factorization (CMTF) problem. Can we enhance any CMTF solver, so that it can operate on potentially very large datasets that may not fit in main memory? We introduce Turbo-SMT, a meta-method capable of doing exactly that: it boosts the performance of any CMTF algorithm (with gains of up to 65-fold), parallelizes it, and produces sparse and interpretable solutions. Additionally, we improve upon ALS, the work-horse algorithm for CMTF, with respect to efficiency and robustness to missing values. We apply Turbo-SMT to BrainQ, a dataset consisting of a (nouns, brain voxels, human subjects) tensor and a (nouns, properties) matrix, with coupling along the nouns dimension. Turbo-SMT is able to find meaningful latent variables, as well as to predict brain activity with competitive accuracy. Finally, we demonstrate the generality of Turbo-SMT by applying it to a Facebook dataset (users, 'friends', wall-postings); there, Turbo-SMT spots spammer-like anomalies. PMID:27672406
Tuzun, Burak; Yavuz, Sevtap Caglar; Sabanci, Nazmiye; Saripinar, Emin
2018-05-13
In the present work, pharmacophore identification and biological activity prediction for 86 pyrazole pyridine carboxylic acid derivatives were made using the electron conformational genetic algorithm approach which was introduced as a 4D-QSAR analysis by us in recent years. In the light of the data obtained from quantum chemical calculations at HF/6-311 G** level, the electron conformational matrices of congruity (ECMC) were constructed by EMRE software. Comparing the matrices, electron conformational submatrix of activity (ECSA, Pha) was revealed that are common for these compounds within a minimum tolerance. A parameter pool was generated considering the obtained pharmacophore. To determine the theoretical biological activity of molecules and identify the best subset of variables affecting bioactivities, we used the nonlinear least square regression method and genetic algorithm. The results obtained in this study are in good agreement with the experimental data presented in the literature. The model for training and test sets attained by the optimum 12 parameters gave highly satisfactory results with R2training= 0.889, q2=0.839 and SEtraining=0.066, q2ext1 = 0.770, q2ext2 = 0.750, q2ext3=0.824, ccctr = 0.941, ccctest = 0.869 and cccall = 0.927. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
Making predictions in a changing world-inference, uncertainty, and learning.
O'Reilly, Jill X
2013-01-01
To function effectively, brains need to make predictions about their environment based on past experience, i.e., they need to learn about their environment. The algorithms by which learning occurs are of interest to neuroscientists, both in their own right (because they exist in the brain) and as a tool to model participants' incomplete knowledge of task parameters and hence, to better understand their behavior. This review focusses on a particular challenge for learning algorithms-how to match the rate at which they learn to the rate of change in the environment, so that they use as much observed data as possible whilst disregarding irrelevant, old observations. To do this algorithms must evaluate whether the environment is changing. We discuss the concepts of likelihood, priors and transition functions, and how these relate to change detection. We review expected and estimation uncertainty, and how these relate to change detection and learning rate. Finally, we consider the neural correlates of uncertainty and learning. We argue that the neural correlates of uncertainty bear a resemblance to neural systems that are active when agents actively explore their environments, suggesting that the mechanisms by which the rate of learning is set may be subject to top down control (in circumstances when agents actively seek new information) as well as bottom up control (by observations that imply change in the environment).
On local search for bi-objective knapsack problems.
Liefooghe, Arnaud; Paquete, Luís; Figueira, José Rui
2013-01-01
In this article, a local search approach is proposed for three variants of the bi-objective binary knapsack problem, with the aim of maximizing the total profit and minimizing the total weight. First, an experimental study on a given structural property of connectedness of the efficient set is conducted. Based on this property, a local search algorithm is proposed and its performance is compared to exact algorithms in terms of runtime and quality metrics. The experimental results indicate that this simple local search algorithm is able to find a representative set of optimal solutions in most of the cases, and in much less time than exact algorithms.
Quantum algorithms for topological and geometric analysis of data
Lloyd, Seth; Garnerone, Silvano; Zanardi, Paolo
2016-01-01
Extracting useful information from large data sets can be a daunting task. Topological methods for analysing data sets provide a powerful technique for extracting such information. Persistent homology is a sophisticated tool for identifying topological features and for determining how such features persist as the data is viewed at different scales. Here we present quantum machine learning algorithms for calculating Betti numbers—the numbers of connected components, holes and voids—in persistent homology, and for finding eigenvectors and eigenvalues of the combinatorial Laplacian. The algorithms provide an exponential speed-up over the best currently known classical algorithms for topological data analysis. PMID:26806491
Detection and Tracking of Moving Objects with Real-Time Onboard Vision System
NASA Astrophysics Data System (ADS)
Erokhin, D. Y.; Feldman, A. B.; Korepanov, S. E.
2017-05-01
Detection of moving objects in a video sequence received from a moving video sensor is one of the most important problems in computer vision. The main purpose of this work is to develop a set of algorithms that can detect and track moving objects in a real-time computer vision system. This set includes three main parts: an algorithm for estimation and compensation of geometric transformations of images, an algorithm for detection of moving objects, and an algorithm for tracking the detected objects and predicting their positions. The results can be applied to onboard vision systems for aircraft, including small and unmanned aircraft.
Distributed Function Mining for Gene Expression Programming Based on Fast Reduction.
Deng, Song; Yue, Dong; Yang, Le-chan; Fu, Xiong; Feng, Ya-zhou
2016-01-01
For high-dimensional and massive data sets, traditional centralized gene expression programming (GEP) or improved algorithms lead to increased run-time and decreased prediction accuracy. To solve this problem, this paper proposes a new improved algorithm called distributed function mining for gene expression programming based on fast reduction (DFMGEP-FR). In DFMGEP-FR, fast attribution reduction in binary search algorithms (FAR-BSA) is proposed to quickly find the optimal attribution set, and the function consistency replacement algorithm is given to solve integration of the local function model. Thorough comparative experiments for DFMGEP-FR, centralized GEP and the parallel gene expression programming algorithm based on simulated annealing (parallel GEPSA) are included in this paper. For the waveform, mushroom, connect-4 and musk datasets, the comparative results show that the average time-consumption of DFMGEP-FR drops by 89.09%, 88.85%, 85.79% and 93.06%, respectively, in contrast to centralized GEP and by 12.5%, 8.42%, 9.62% and 13.75%, respectively, compared with parallel GEPSA. Six well-studied UCI test data sets demonstrate the efficiency and capability of our proposed DFMGEP-FR algorithm for distributed function mining.
NASA Astrophysics Data System (ADS)
Zhang, Tianran; Wooster, Martin
2016-04-01
Until recently, crop residues have been the second largest industrial waste product produced in China, and field-based burning of crop residues is considered to remain extremely widespread, with impacts on air quality and potential negative effects on health and public transportation. However, due to the small size and perhaps short-lived nature of the individual burns, the extent of the activity and its spatial variability remain somewhat unclear. Satellite EO data have been used to gauge the timing and magnitude of Chinese crop burning, but current approaches very likely miss significant amounts of the activity because the individual burned areas are too small to detect with frequently acquired moderate spatial resolution data such as MODIS. The Visible Infrared Imaging Radiometer Suite (VIIRS) on board the Suomi-NPP (National Polar-orbiting Partnership) satellite, launched in October 2011, has a set of multi-spectral channels providing full global coverage at 375 m nadir spatial resolution. It is expected that the 375 m spatial resolution "I-band" imagery provided by VIIRS will allow active fires to be detected that are ~10× smaller than those that can be detected by MODIS. In this study a new small-fire detection algorithm is built based on the VIIRS I-band global fire detection algorithm and the hot-spot detection algorithm developed for the BIRD satellite mission. VIIRS I-band imagery will be used to identify agricultural fire activity across Eastern China. A 30 m spatial resolution global land cover map is used for false-alarm masking. Ground-based validation is performed using images taken from a UAV. The fire detection results are compared with the active fire product from the long-standing MODIS sensor onboard the TERRA and AQUA satellites, which shows that small fires missed by the traditional MODIS fire product may account for over 1/3 of the total fire energy in Eastern China.
Asymmetric bagging and feature selection for activities prediction of drug molecules.
Li, Guo-Zheng; Meng, Hao-Hua; Lu, Wen-Cong; Yang, Jack Y; Yang, Mary Qu
2008-05-28
Activities of drug molecules can be predicted by QSAR (quantitative structure-activity relationship) models, which avoid the high cost and long cycle of traditional experimental methods. Given that the number of drug molecules with positive activity is much smaller than the number of negatives, it is important to predict molecular activities with this unbalanced situation in mind. Here, asymmetric bagging and feature selection are introduced into the problem, and asymmetric bagging of support vector machines (asBagging) is proposed for predicting drug activities in order to treat the unbalanced problem. At the same time, the features extracted from the structures of drug molecules affect the prediction accuracy of QSAR models. Therefore, a novel algorithm named PRIFEAB is proposed, which applies an embedded feature selection method to remove redundant and irrelevant features for asBagging. Numerical experimental results on a data set of molecular activities show that asBagging improves the AUC and sensitivity values, and that PRIFEAB with feature selection further helps to improve the prediction ability. Asymmetric bagging can thus help to improve the prediction accuracy for drug molecule activities, and this can be further improved by performing feature selection to select relevant features from the drug molecule data sets.
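A bare-bones version of asymmetric bagging of SVMs follows: every positive (active) molecule is kept in each bag while the negatives are bootstrapped down to the same size, and the bagged SVMs vote. The base-learner settings are illustrative rather than the paper's, and the PRIFEAB embedded feature-selection step is not shown.

```python
import numpy as np
from sklearn.svm import SVC

def asymmetric_bagging(X, y, n_estimators=10, random_state=0):
    """Asymmetric bagging sketch for unbalanced activity data.
    y is assumed to be 0/1 with 1 = active; returns a majority-vote predictor."""
    rng = np.random.default_rng(random_state)
    pos, neg = np.where(y == 1)[0], np.where(y == 0)[0]
    models = []
    for _ in range(n_estimators):
        sampled_neg = rng.choice(neg, size=len(pos), replace=True)
        idx = np.concatenate([pos, sampled_neg])        # balanced bag
        models.append(SVC().fit(X[idx], y[idx]))
    def predict(X_new):
        votes = np.mean([m.predict(X_new) for m in models], axis=0)
        return (votes >= 0.5).astype(int)
    return predict
```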
Multi-particle phase space integration with arbitrary set of singularities in CompHEP
NASA Astrophysics Data System (ADS)
Kovalenko, D. N.; Pukhov, A. E.
1997-02-01
We describe an algorithm of multi-particle phase space integration for collision and decay processes realized in CompHEP package version 3.2. In the framework of this algorithm it is possible to regularize an arbitrary set of singularities caused by virtual particle propagators. The algorithm is based on the method of the recursive representation of kinematics and on the multichannel Monte Carlo approach. CompHEP package is available by WWW: http://theory.npi.msu.su/pukhov/comphep.html
2012-09-03
... practice to solve these initial value problems. Additionally, the predictor/corrector methods are combined with adaptive stepsize and adaptive ... for implementing a numerical path tracking algorithm is to decide which predictor/corrector method to employ, how large to take the step ∆t, and what ... the endgame algorithm. Output: a steady-state solution. Set ǫ = 1; while ǫ >= ǫ_end, set the stepsize ∆ǫ by using the adaptive stepsize control algorithm.
Full glowworm swarm optimization algorithm for whole-set orders scheduling in single machine.
Yu, Zhang; Yang, Xiaomei
2013-01-01
By analyzing the characteristics of the whole-set orders problem and combining it with the theory of glowworm swarm optimization, a new glowworm swarm optimization algorithm for scheduling is proposed. A new hybrid encoding scheme combining two-dimensional encoding and random-key encoding is given. In order to enhance the capability of optimal searching and speed up the convergence rate, a dynamically changing step strategy is integrated into the algorithm. Furthermore, experimental results demonstrate its feasibility and efficiency.
Algorithm Updates for the Fourth SeaWiFS Data Reprocessing
NASA Technical Reports Server (NTRS)
Hooker, Stanford, B. (Editor); Firestone, Elaine R. (Editor); Patt, Frederick S.; Barnes, Robert A.; Eplee, Robert E., Jr.; Franz, Bryan A.; Robinson, Wayne D.; Feldman, Gene Carl; Bailey, Sean W.
2003-01-01
The efforts to improve the data quality for the Sea-viewing Wide Field-of-view Sensor (SeaWiFS) data products have continued, following the third reprocessing of the global data set in May 2000. Analyses have been ongoing to address all aspects of the processing algorithms, particularly the calibration methodologies, atmospheric correction, and data flagging and masking. All proposed changes were subjected to rigorous testing, evaluation and validation. The results of these activities culminated in the fourth reprocessing, which was completed in July 2002. The algorithm changes, which were implemented for this reprocessing, are described in the chapters of this volume. Chapter 1 presents an overview of the activities leading up to the fourth reprocessing, and summarizes the effects of the changes. Chapter 2 describes the modifications to the on-orbit calibration, specifically the focal plane temperature correction and the temporal dependence. Chapter 3 describes the changes to the vicarious calibration, including the stray light correction to the Marine Optical Buoy (MOBY) data and improved data screening procedures. Chapter 4 describes improvements to the near-infrared (NIR) band correction algorithm. Chapter 5 describes changes to the atmospheric correction and the oceanic property retrieval algorithms, including out-of-band corrections, NIR noise reduction, and handling of unusual conditions. Chapter 6 describes various changes to the flags and masks, to increase the number of valid retrievals, improve the detection of the flag conditions, and add new flags. Chapter 7 describes modifications to the level-1a and level-3 algorithms, to improve the navigation accuracy, correct certain types of spacecraft time anomalies, and correct a binning logic error. Chapter 8 describes the algorithm used to generate the SeaWiFS photosynthetically available radiation (PAR) product. Chapter 9 describes a coupled ocean-atmosphere model, which is used in one of the changes described in Chapter 4. Finally, Chapter 10 describes a comparison of results from the third and fourth reprocessings along the U.S. Northeast coast.
Optimizing human activity patterns using global sensitivity analysis.
Fairchild, Geoffrey; Hickmann, Kyle S; Mniszewski, Susan M; Del Valle, Sara Y; Hyman, James M
2014-12-01
Implementing realistic activity patterns for a population is crucial for modeling, for example, disease spread, supply and demand, and disaster response. Using the dynamic activity simulation engine, DASim, we generate schedules for a population that capture regular (e.g., working, eating, and sleeping) and irregular activities (e.g., shopping or going to the doctor). We use the sample entropy (SampEn) statistic to quantify a schedule's regularity for a population. We show how to tune an activity's regularity by adjusting SampEn, thereby making it possible to realistically design activities when creating a schedule. The tuning process sets up a computationally intractable high-dimensional optimization problem. To reduce the computational demand, we use Bayesian Gaussian process regression to compute global sensitivity indices and identify the parameters that have the greatest effect on the variance of SampEn. We use the harmony search (HS) global optimization algorithm to locate global optima. Our results show that HS combined with global sensitivity analysis can efficiently tune the SampEn statistic with few search iterations. We demonstrate how global sensitivity analysis can guide statistical emulation and global optimization algorithms to efficiently tune activities and generate realistic activity patterns. Though our tuning methods are applied to dynamic activity schedule generation, they are general and represent a significant step in the direction of automated tuning and optimization of high-dimensional computer simulations.
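The SampEn statistic used above has a standard definition; the sketch below is a plain implementation of that definition for a one-dimensional series (not the DASim code), with r expressed as a fraction of the series standard deviation, which is an assumed convention.

```python
import numpy as np

def sample_entropy(x, m=2, r=0.2):
    """SampEn(m, r) of a 1-D sequence: -ln(A/B), where B counts template pairs of
    length m within tolerance and A counts pairs of length m+1 (self-matches excluded)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    tol = r * x.std()

    def count_matches(length):
        # use the same number of templates (n - m) for lengths m and m + 1
        templates = np.array([x[i:i + length] for i in range(n - m)])
        count = 0
        for i in range(len(templates)):
            # Chebyshev distance to all later templates
            dist = np.max(np.abs(templates[i + 1:] - templates[i]), axis=1)
            count += np.sum(dist <= tol)
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    return -np.log(a / b) if a > 0 and b > 0 else np.inf
```

Lower SampEn indicates a more regular schedule, which is the quantity the tuning procedure adjusts.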
GASS-WEB: a web server for identifying enzyme active sites based on genetic algorithms.
Moraes, João P A; Pappa, Gisele L; Pires, Douglas E V; Izidoro, Sandro C
2017-07-03
Enzyme active sites are important and conserved functional regions of proteins whose identification can be an invaluable step toward protein function prediction. Most of the existing methods for this task are based on active site similarity and have limitations, including performing only exact matches on template residues, imposing template size restraints, and being unable to find inter-domain active sites. To fill this gap, we proposed GASS-WEB, a user-friendly web server that uses GASS (Genetic Active Site Search), a method based on an evolutionary algorithm to search for similar active sites in proteins. GASS-WEB can be used under two different scenarios: (i) given a protein of interest, to match a set of specific active site templates; or (ii) given an active site template, to look for it in a database of protein structures. The method has been shown to be very effective on a range of experiments and was able to correctly identify >90% of the catalogued active sites from the Catalytic Site Atlas. It also managed to achieve a Matthews correlation coefficient of 0.63 using the Critical Assessment of protein Structure Prediction (CASP 10) dataset. In our analysis, GASS ranked fourth among 18 methods. GASS-WEB is freely available at http://gass.unifei.edu.br/. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Rastgarpour, Maryam; Shanbehzadeh, Jamshid
2014-01-01
Researchers have recently applied an integrative approach to automate medical image segmentation, benefiting from available methods while eliminating their disadvantages. Intensity inhomogeneity is a challenging and open problem in this area that has received less attention from this approach. It has considerable effects on segmentation accuracy. This paper proposes a new kernel-based fuzzy level set algorithm that uses an integrative approach to deal with this problem. It can directly evolve from the initial level set obtained by Gaussian Kernel-Based Fuzzy C-Means (GKFCM). The controlling parameters of level set evolution are also estimated from the results of GKFCM. Moreover, the proposed algorithm is enhanced with locally regularized evolution based on an image model that describes the composition of real-world images, in which intensity inhomogeneity is assumed to be a component of the image. Such improvements make level set manipulation easier and lead to more robust segmentation under intensity inhomogeneity. The proposed algorithm has valuable benefits including automation, invariance to intensity inhomogeneity, and high accuracy. Performance evaluation of the proposed algorithm was carried out on medical images from different modalities. The results confirm its effectiveness for medical image segmentation.
Motion compensation using origin ensembles in awake small animal positron emission tomography
NASA Astrophysics Data System (ADS)
Gillam, John E.; Angelis, Georgios I.; Kyme, Andre Z.; Meikle, Steven R.
2017-02-01
In emission tomographic imaging, the stochastic origin ensembles algorithm provides unique information regarding the detected counts given the measured data. Precision in both voxel and region-wise parameters may be determined for a single data set based on the posterior distribution of the count density, allowing uncertainty estimates to be allocated to quantitative measures. Uncertainty estimates are of particular importance in awake animal neurological and behavioral studies for which head motion, unique for each acquired data set, perturbs the measured data. Motion compensation can be conducted when rigid head pose is measured during the scan. However, errors in pose measurements used for compensation can degrade the data and hence quantitative outcomes. In this investigation, motion compensation and detector resolution models were incorporated into the basic origin ensembles algorithm and an efficient approach to computation was developed. The approach was validated against maximum likelihood expectation maximisation and tested using simulated data. The resultant algorithm was then used to analyse quantitative uncertainty in regional activity estimates arising from changes in pose measurement precision. Finally, the posterior covariance acquired from a single data set was used to describe correlations between regions of interest, providing information about pose measurement precision that may be useful in system analysis and design. The investigation demonstrates the use of origin ensembles as a powerful framework for evaluating statistical uncertainty of voxel and regional estimates. While in this investigation rigid motion was considered in the context of awake animal PET, the extension to arbitrary motion may provide clinical utility where respiratory or cardiac motion perturbs the measured data.
Evaluation of prostate segmentation algorithms for MRI: the PROMISE12 challenge
Litjens, Geert; Toth, Robert; van de Ven, Wendy; Hoeks, Caroline; Kerkstra, Sjoerd; van Ginneken, Bram; Vincent, Graham; Guillard, Gwenael; Birbeck, Neil; Zhang, Jindang; Strand, Robin; Malmberg, Filip; Ou, Yangming; Davatzikos, Christos; Kirschner, Matthias; Jung, Florian; Yuan, Jing; Qiu, Wu; Gao, Qinquan; Edwards, Philip “Eddie”; Maan, Bianca; van der Heijden, Ferdinand; Ghose, Soumya; Mitra, Jhimli; Dowling, Jason; Barratt, Dean; Huisman, Henkjan; Madabhushi, Anant
2014-01-01
Prostate MRI image segmentation has been an area of intense research due to the increased use of MRI as a modality for the clinical workup of prostate cancer. Segmentation is useful for various tasks, e.g. to accurately localize prostate boundaries for radiotherapy or to initialize multi-modal registration algorithms. In the past, it has been difficult for research groups to evaluate prostate segmentation algorithms on multi-center, multi-vendor and multi-protocol data. Especially because we are dealing with MR images, image appearance, resolution and the presence of artifacts are affected by differences in scanners and/or protocols, which in turn can have a large influence on algorithm accuracy. The Prostate MR Image Segmentation (PROMISE12) challenge was set up to allow a fair and meaningful comparison of segmentation methods on the basis of performance and robustness. In this work we will discuss the initial results of the online PROMISE12 challenge, and the results obtained in the live challenge workshop hosted by the MICCAI2012 conference. In the challenge, 100 prostate MR cases from 4 different centers were included, with differences in scanner manufacturer, field strength and protocol. A total of 11 teams from academic research groups and industry participated. Algorithms showed a wide variety in methods and implementation, including active appearance models, atlas registration and level sets. Evaluation was performed using boundary and volume based metrics which were combined into a single score relating the metrics to human expert performance. The winners of the challenge were the algorithms by teams Imorphics and ScrAutoProstate, with scores of 85.72 and 84.29 overall. Both algorithms were significantly better than all other algorithms in the challenge (p < 0.05) and had efficient implementations with run times of 8 minutes and 3 seconds per case, respectively. Overall, active appearance model based approaches seemed to outperform other approaches like multi-atlas registration, both on accuracy and computation time. Although average algorithm performance was good to excellent and the Imorphics algorithm outperformed the second observer on average, we showed that algorithm combination might lead to further improvement, indicating that optimal performance for prostate segmentation is not yet obtained. All results are available online at http://promise12.grand-challenge.org/. PMID:24418598
An efficient, scalable, and adaptable framework for solving generic systems of level-set PDEs
Mosaliganti, Kishore R.; Gelas, Arnaud; Megason, Sean G.
2013-01-01
In the last decade, level-set methods have been actively developed for applications in image registration, segmentation, tracking, and reconstruction. However, the development of a wide variety of level-set PDEs and their numerical discretization schemes, coupled with hybrid combinations of PDE terms, stopping criteria, and reinitialization strategies, has created a software logistics problem. In the absence of an integrative design, current toolkits support only specific types of level-set implementations which restrict future algorithm development since extensions require significant code duplication and effort. In the new NIH/NLM Insight Toolkit (ITK) v4 architecture, we implemented a level-set software design that is flexible to different numerical (continuous, discrete, and sparse) and grid representations (point, mesh, and image-based). Given that a generic PDE is a summation of different terms, we used a set of linked containers to which level-set terms can be added or deleted at any point in the evolution process. This container-based approach allows the user to explore and customize terms in the level-set equation at compile-time in a flexible manner. The framework is optimized so that repeated computations of common intensity functions (e.g., gradient and Hessians) across multiple terms are eliminated. The framework further enables the evolution of multiple level-sets for multi-object segmentation and processing of large datasets. For doing so, we restrict level-set domains to subsets of the image domain and use multithreading strategies to process groups of subdomains or level-set functions. Users can also select from a variety of reinitialization policies and stopping criteria. Finally, we developed a visualization framework that shows the evolution of a level-set in real-time to help guide algorithm development and parameter optimization. We demonstrate the power of our new framework using confocal microscopy images of cells in a developing zebrafish embryo. PMID:24501592
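A toy illustration of the container-based design described above, not the actual ITK v4 API: PDE terms live in a list-like container and are summed when the update is computed, so terms can be added or removed without touching the evolution loop. The specific terms below are simplified stand-ins.

```python
import numpy as np

class LevelSetTerm:
    """One term of a generic level-set PDE (illustrative only)."""
    def __init__(self, weight=1.0):
        self.weight = weight
    def update(self, phi):
        raise NotImplementedError

class PropagationTerm(LevelSetTerm):
    """Constant propagation: weight * |grad(phi)| (sign of weight sets direction)."""
    def update(self, phi):
        gy, gx = np.gradient(phi)
        return self.weight * np.sqrt(gx ** 2 + gy ** 2)

class SmoothingTerm(LevelSetTerm):
    """Crude Laplacian smoothing as a stand-in for a curvature term."""
    def update(self, phi):
        lap = (np.roll(phi, 1, 0) + np.roll(phi, -1, 0) +
               np.roll(phi, 1, 1) + np.roll(phi, -1, 1) - 4 * phi)
        return self.weight * lap

def evolve(phi, terms, dt=0.1, iterations=50):
    """phi is a 2-D signed-distance-like array; the update is the sum over all terms."""
    for _ in range(iterations):
        phi = phi + dt * sum(term.update(phi) for term in terms)
    return phi

# Terms can be appended or removed without changing the evolution loop.
terms = [PropagationTerm(weight=-1.0), SmoothingTerm(weight=0.2)]
```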
An efficient, scalable, and adaptable framework for solving generic systems of level-set PDEs.
Mosaliganti, Kishore R; Gelas, Arnaud; Megason, Sean G
2013-01-01
In the last decade, level-set methods have been actively developed for applications in image registration, segmentation, tracking, and reconstruction. However, the development of a wide variety of level-set PDEs and their numerical discretization schemes, coupled with hybrid combinations of PDE terms, stopping criteria, and reinitialization strategies, has created a software logistics problem. In the absence of an integrative design, current toolkits support only specific types of level-set implementations which restrict future algorithm development since extensions require significant code duplication and effort. In the new NIH/NLM Insight Toolkit (ITK) v4 architecture, we implemented a level-set software design that is flexible to different numerical (continuous, discrete, and sparse) and grid representations (point, mesh, and image-based). Given that a generic PDE is a summation of different terms, we used a set of linked containers to which level-set terms can be added or deleted at any point in the evolution process. This container-based approach allows the user to explore and customize terms in the level-set equation at compile-time in a flexible manner. The framework is optimized so that repeated computations of common intensity functions (e.g., gradient and Hessians) across multiple terms are eliminated. The framework further enables the evolution of multiple level-sets for multi-object segmentation and processing of large datasets. For doing so, we restrict level-set domains to subsets of the image domain and use multithreading strategies to process groups of subdomains or level-set functions. Users can also select from a variety of reinitialization policies and stopping criteria. Finally, we developed a visualization framework that shows the evolution of a level-set in real-time to help guide algorithm development and parameter optimization. We demonstrate the power of our new framework using confocal microscopy images of cells in a developing zebrafish embryo.
Melanoma segmentation based on deep learning.
Zhang, Xiaoqing
2017-12-01
Malignant melanoma is one of the most deadly forms of skin cancer, which is one of the world's fastest-growing cancers. Early diagnosis and treatment are critical. In this study, a neural network structure is utilized to construct a broad and accurate basis for the diagnosis of skin cancer, thereby reducing screening errors. The technique is able to improve the efficacy of identifying normally indistinguishable lesions (such as pigment spots) versus clinically unknown lesions, and to ultimately improve the diagnostic accuracy. In the field of medical imaging, in general, using neural networks for image segmentation is relatively rare. The existing traditional machine-learning neural network algorithms still cannot completely solve the problem of information loss, nor precisely delineate the boundary area. We use an improved neural network framework, described herein, to achieve efficacious feature learning and satisfactory segmentation of melanoma images. The architecture of the network includes multiple convolution layers, dropout layers, softmax layers, multiple filters, and activation functions. The number of data sets can be increased via rotation of the training set. A non-linear activation function (such as ReLU or ELU) is employed to alleviate the problem of gradient disappearance, and RMSprop/Adam are incorporated to minimize the loss function. A batch normalization layer is added between the convolution layer and the activation layer to solve the problem of gradient disappearance and explosion. Experiments, described herein, show that our improved neural network architecture achieves higher accuracy for segmentation of melanoma images as compared with existing processes.
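A minimal Keras-style sketch of the kind of architecture described (convolutions, batch normalization placed between convolution and activation, ReLU/ELU activations, dropout, a softmax output, and an Adam/RMSprop optimizer). The layer counts, filter sizes, and input shape are illustrative assumptions, not the paper's network.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_segmentation_cnn(input_shape=(128, 128, 3), n_classes=2):
    """Illustrative per-pixel classifier with BatchNorm between convolution and activation."""
    inputs = layers.Input(shape=input_shape)
    x = inputs
    for filters in (32, 64, 128):
        x = layers.Conv2D(filters, 3, padding="same")(x)
        x = layers.BatchNormalization()(x)   # between convolution and activation
        x = layers.Activation("relu")(x)     # ELU is a drop-in alternative
        x = layers.Dropout(0.25)(x)
    outputs = layers.Conv2D(n_classes, 1, padding="same", activation="softmax")(x)
    model = models.Model(inputs, outputs)
    model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
                  loss="sparse_categorical_crossentropy")
    return model
```

Training-set rotation (data augmentation) can be added with standard image-augmentation utilities.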
NASA Astrophysics Data System (ADS)
Zhu, Qiao; Yue, Jun-Zhou; Liu, Wei-Qun; Wang, Xu-Dong; Chen, Jun; Hu, Guang-Di
2017-04-01
This work is focused on the active vibration control of a piezoelectric cantilever beam, where an adaptive feedforward controller (AFC) is utilized to reject vibration with unknown multiple frequencies. First, the experimental setup and its mathematical model are introduced. Because the channel between the disturbance and the vibration output is unknown in practice, the concept of an equivalent input disturbance (EID) is employed to place an equivalent disturbance in the input channel. In this situation, vibration control can be achieved by setting the control input to be the identified EID. Then, for an EID with known multiple frequencies, the AFC is introduced to perfectly reject the vibration, but it is sensitive to the frequencies. To identify the unknown EID frequencies accurately in the presence of random disturbances and un-modeled nonlinear dynamics, a time-frequency-analysis (TFA) method is employed. Consequently, a TFA-based AFC algorithm is proposed for active vibration control with unknown frequencies. Finally, four cases are given to illustrate the efficiency of the proposed TFA-based AFC algorithm by experiment.
NASA Astrophysics Data System (ADS)
Millard, R. C.; Seaver, G.
1990-12-01
A 27-term index of refraction algorithm for pure and sea waters has been developed using four experimental data sets of differing accuracies. They cover the range 500-700 nm in wavelength, 0-30°C in temperature, 0-40 psu in salinity, and 0-11,000 db in pressure. The index of refraction algorithm has an accuracy that varies from 0.4 ppm for pure water at atmospheric pressure to 80 ppm at high pressures, but preserves the accuracy of each original data set. This algorithm is a significant improvement over existing descriptions as it is in analytical form with a better and more carefully defined accuracy. A salinometer algorithm with the same uncertainty has been created by numerically inverting the index algorithm using the Newton-Raphson method. The 27-term index algorithm was used to generate a pseudo-data set at the sodium D wavelength (589.26 nm) from which a 6-term densitometer algorithm was constructed. The densitometer algorithm also produces salinity as an intermediate step in the salinity inversion. The densitometer residuals have a standard deviation of 0.049 kg m⁻³ which is not accurate enough for most oceanographic applications. However, the densitometer algorithm was used to explore the sensitivity of density from this technique to temperature and pressure uncertainties. To achieve a deep ocean densitometer of 0.001 kg m⁻³ accuracy would require the index of refraction to have an accuracy of 0.3 ppm, the temperature an accuracy of 0.01°C and the pressure 1 db. Our assessment of the currently available index of refraction measurements finds that only the data for fresh water at atmospheric pressure produce an algorithm satisfactory for oceanographic use (density to 0.4 ppm). The data base for the algorithm at higher pressures and various salinities requires an order of magnitude or better improvement in index measurement accuracy before the resultant density accuracy will be comparable to the currently available oceanographic algorithm.
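The salinometer construction described above, i.e. numerically inverting the index algorithm with the Newton-Raphson method, can be sketched as follows. The function refractive_index stands in for the 27-term fit, whose coefficients are not reproduced here, and the derivative with respect to salinity is taken by finite differences.

```python
def invert_for_salinity(n_obs, temperature, pressure, wavelength,
                        refractive_index, s0=35.0, tol=1e-10, max_iter=50):
    """Newton-Raphson inversion of an index-of-refraction algorithm for salinity (sketch)."""
    s = s0
    ds = 1e-3  # finite-difference step in salinity (psu)
    for _ in range(max_iter):
        f = refractive_index(s, temperature, pressure, wavelength) - n_obs
        dfds = (refractive_index(s + ds, temperature, pressure, wavelength) -
                refractive_index(s - ds, temperature, pressure, wavelength)) / (2 * ds)
        step = f / dfds
        s -= step
        if abs(step) < tol:
            break
    return s
```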
Bron, Esther E; Smits, Marion; van der Flier, Wiesje M; Vrenken, Hugo; Barkhof, Frederik; Scheltens, Philip; Papma, Janne M; Steketee, Rebecca M E; Méndez Orellana, Carolina; Meijboom, Rozanna; Pinto, Madalena; Meireles, Joana R; Garrett, Carolina; Bastos-Leite, António J; Abdulkadir, Ahmed; Ronneberger, Olaf; Amoroso, Nicola; Bellotti, Roberto; Cárdenas-Peña, David; Álvarez-Meza, Andrés M; Dolph, Chester V; Iftekharuddin, Khan M; Eskildsen, Simon F; Coupé, Pierrick; Fonov, Vladimir S; Franke, Katja; Gaser, Christian; Ledig, Christian; Guerrero, Ricardo; Tong, Tong; Gray, Katherine R; Moradi, Elaheh; Tohka, Jussi; Routier, Alexandre; Durrleman, Stanley; Sarica, Alessia; Di Fatta, Giuseppe; Sensi, Francesco; Chincarini, Andrea; Smith, Garry M; Stoyanov, Zhivko V; Sørensen, Lauge; Nielsen, Mads; Tangaro, Sabina; Inglese, Paolo; Wachinger, Christian; Reuter, Martin; van Swieten, John C; Niessen, Wiro J; Klein, Stefan
2015-05-01
Algorithms for computer-aided diagnosis of dementia based on structural MRI have demonstrated high performance in the literature, but are difficult to compare as different data sets and methodology were used for evaluation. In addition, it is unclear how the algorithms would perform on previously unseen data, and thus, how they would perform in clinical practice when there is no real opportunity to adapt the algorithm to the data at hand. To address these comparability, generalizability and clinical applicability issues, we organized a grand challenge that aimed to objectively compare algorithms based on a clinically representative multi-center data set. Using clinical practice as the starting point, the goal was to reproduce the clinical diagnosis. Therefore, we evaluated algorithms for multi-class classification of three diagnostic groups: patients with probable Alzheimer's disease, patients with mild cognitive impairment and healthy controls. The diagnosis based on clinical criteria was used as reference standard, as it was the best available reference despite its known limitations. For evaluation, a previously unseen test set was used consisting of 354 T1-weighted MRI scans with the diagnoses blinded. Fifteen research teams participated with a total of 29 algorithms. The algorithms were trained on a small training set (n=30) and optionally on data from other sources (e.g., the Alzheimer's Disease Neuroimaging Initiative, the Australian Imaging Biomarkers and Lifestyle flagship study of aging). The best performing algorithm yielded an accuracy of 63.0% and an area under the receiver-operating-characteristic curve (AUC) of 78.8%. In general, the best performances were achieved using feature extraction based on voxel-based morphometry or a combination of features that included volume, cortical thickness, shape and intensity. The challenge is open for new submissions via the web-based framework: http://caddementia.grand-challenge.org. Copyright © 2015 Elsevier Inc. All rights reserved.
Providing QoS through machine-learning-driven adaptive multimedia applications.
Ruiz, Pedro M; Botía, Juan A; Gómez-Skarmeta, Antonio
2004-06-01
We investigate the optimization of the quality of service (QoS) offered by real-time multimedia adaptive applications through machine learning algorithms. These applications are able to adapt their internal settings (e.g., video sizes, audio and video codecs, among others) in real time to the unpredictably changing capacity of the network. Traditional adaptive applications just select a set of settings that consumes less than the available bandwidth. We propose a novel approach in which the selected set of settings is the one that offers the best user-perceived QoS among all the combinations that satisfy the bandwidth restrictions. We use a genetic algorithm to decide when to trigger the adaptation process depending on the network conditions (i.e., loss-rate, jitter, etc.). Additionally, the selection of the new set of settings is done according to a set of rules which model the user-perceived QoS. These rules are learned using the SLIPPER rule induction algorithm over a set of examples extracted from scores provided by real users. We will demonstrate that the proposed approach guarantees a good user-perceived QoS even when the network conditions are constantly changing.
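A toy sketch of the selection idea: among all candidate setting combinations that satisfy the bandwidth restriction, pick the one with the highest modeled user-perceived QoS. The candidate settings, bandwidth costs, and scoring function below are made-up placeholders standing in for the learned SLIPPER rules.

```python
import itertools

# Hypothetical candidate settings with rough bandwidth costs (kbit/s).
VIDEO_SIZES = {"qcif": 80, "cif": 256, "4cif": 768}
VIDEO_CODEC_FACTOR = {"h261": 1.0, "h263": 0.8}
AUDIO_CODECS = {"gsm": 13, "g711": 64}

def perceived_qos(size, vcodec, acodec):
    """Stand-in for the rule-based model of user-perceived QoS."""
    return (VIDEO_SIZES[size] * 0.01 + (1.5 - VIDEO_CODEC_FACTOR[vcodec])
            + AUDIO_CODECS[acodec] * 0.02)

def best_settings(available_bw):
    best, best_score = None, float("-inf")
    for size, vcodec, acodec in itertools.product(VIDEO_SIZES, VIDEO_CODEC_FACTOR, AUDIO_CODECS):
        bw = VIDEO_SIZES[size] * VIDEO_CODEC_FACTOR[vcodec] + AUDIO_CODECS[acodec]
        if bw <= available_bw:                    # bandwidth restriction
            score = perceived_qos(size, vcodec, acodec)
            if score > best_score:
                best, best_score = (size, vcodec, acodec), score
    return best

print(best_settings(available_bw=300))
```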
Anizan, Nadège; Carlier, Thomas; Hindorf, Cecilia; Barbet, Jacques; Bardiès, Manuel
2012-02-13
Noninvasive multimodality imaging is essential for preclinical evaluation of the biodistribution and pharmacokinetics of radionuclide therapy and for monitoring tumor response. Imaging with nonstandard positron-emission tomography [PET] isotopes such as 124I is promising in that context but requires accurate activity quantification. The decay scheme of 124I implies an optimization of both acquisition settings and correction processing. The PET scanner investigated in this study was the Inveon PET/CT system dedicated to small animal imaging. The noise equivalent count rate [NECR], the scatter fraction [SF], and the gamma-prompt fraction [GF] were used to determine the best acquisition parameters for mouse- and rat-sized phantoms filled with 124I. An image-quality phantom as specified by the National Electrical Manufacturers Association NU 4-2008 protocol was acquired and reconstructed with two-dimensional filtered back projection, 2D ordered-subset expectation maximization [2DOSEM], and 3DOSEM with maximum a posteriori [3DOSEM/MAP] algorithms, with and without attenuation correction, scatter correction, and gamma-prompt correction (weighted uniform distribution subtraction). Optimal energy windows were established for the rat phantom (390 to 550 keV) and the mouse phantom (400 to 590 keV) by combining the NECR, SF, and GF results. The coincidence time window had no significant impact regarding the NECR curve variation. Activity concentration of 124I measured in the uniform region of an image-quality phantom was underestimated by 9.9% for the 3DOSEM/MAP algorithm with attenuation and scatter corrections, and by 23% with the gamma-prompt correction. Attenuation, scatter, and gamma-prompt corrections decreased the residual signal in the cold insert. The optimal energy windows were chosen with the NECR, SF, and GF evaluation. Nevertheless, an image quality and an activity quantification assessment were required to establish the most suitable reconstruction algorithm and corrections for 124I small animal imaging.
A jazz-based approach for optimal setting of pressure reducing valves in water distribution networks
NASA Astrophysics Data System (ADS)
De Paola, Francesco; Galdiero, Enzo; Giugni, Maurizio
2016-05-01
This study presents a model for valve setting in water distribution networks (WDNs), with the aim of reducing the level of leakage. The approach is based on the harmony search (HS) optimization algorithm. The HS mimics a jazz improvisation process able to find the best solutions, in this case corresponding to valve settings in a WDN. The model also interfaces with the improved version of a popular hydraulic simulator, EPANET 2.0, to check the hydraulic constraints and to evaluate the performances of the solutions. Penalties are introduced in the objective function in case of violation of the hydraulic constraints. The model is applied to two case studies, and the obtained results in terms of pressure reductions are comparable with those of competitive metaheuristic algorithms (e.g. genetic algorithms). The results demonstrate the suitability of the HS algorithm for water network management and optimization.
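The basic harmony search loop (harmony memory, memory-considering rate, pitch adjustment) can be sketched as below; the objective is a placeholder and is not the EPANET-coupled leakage-plus-penalty function used in the study.

```python
import random

def harmony_search(objective, bounds, hms=20, hmcr=0.9, par=0.3,
                   bandwidth=0.05, iterations=2000):
    """Minimize `objective` over box `bounds` (list of (lo, hi)) with basic harmony search."""
    memory = [[random.uniform(lo, hi) for lo, hi in bounds] for _ in range(hms)]
    scores = [objective(h) for h in memory]
    for _ in range(iterations):
        new = []
        for d, (lo, hi) in enumerate(bounds):
            if random.random() < hmcr:                    # memory consideration
                value = random.choice(memory)[d]
                if random.random() < par:                 # pitch adjustment
                    value += random.uniform(-1, 1) * bandwidth * (hi - lo)
            else:                                         # random selection
                value = random.uniform(lo, hi)
            new.append(min(max(value, lo), hi))
        new_score = objective(new)
        worst = max(range(hms), key=lambda i: scores[i])
        if new_score < scores[worst]:                     # replace the worst harmony
            memory[worst], scores[worst] = new, new_score
    best = min(range(hms), key=lambda i: scores[i])
    return memory[best], scores[best]

# e.g. valve settings minimizing a (hypothetical) leakage-plus-penalty objective:
# settings, cost = harmony_search(leakage_with_penalties, [(0.0, 10.0)] * n_valves)
```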
The Construction of 3-d Neutral Density for Arbitrary Data Sets
NASA Astrophysics Data System (ADS)
Riha, S.; McDougall, T. J.; Barker, P. M.
2014-12-01
The Neutral Density variable allows inference of water pathways from thermodynamic properties in the global ocean, and is therefore an essential component of global ocean circulation analysis. The widely used algorithm for the computation of Neutral Density yields accurate results for data sets which are close to the observed climatological ocean. Long-term numerical climate simulations, however, often generate a significant drift from present-day climate, which renders the existing algorithm inaccurate. To remedy this problem, new algorithms which operate on arbitrary data have been developed, which may potentially be used to compute Neutral Density during runtime of a numerical model. We review existing approaches for the construction of Neutral Density in arbitrary data sets, detail their algorithmic structure, and present an analysis of the computational cost for implementations on a single-CPU computer. We discuss possible strategies for the implementation in state-of-the-art numerical models, with a focus on distributed computing environments.
Survey of PRT Vehicle Management Algorithms
DOT National Transportation Integrated Search
1974-01-01
The document summarizes the results of a literature survey of state of the art vehicle management algorithms applicable to Personal Rapid Transit Systems(PRT). The surveyed vehicle management algorithms are organized into a set of five major componen...
A collaborative filtering recommendation algorithm based on weighted SimRank and social trust
NASA Astrophysics Data System (ADS)
Su, Chang; Zhang, Butao
2017-05-01
Collaborative filtering is one of the most widely used recommendation technologies, but the data sparsity and cold start problems of collaborative filtering algorithms are difficult to solve effectively. To alleviate the problem of data sparsity in collaborative filtering, firstly, a weighted improved SimRank algorithm is proposed to compute the rating similarity between users in the rating data set. The improved SimRank can find more nearest neighbors for target users according to the transmissibility of rating similarity. Then, we build a trust network and introduce the calculation of trust degree in the trust relationship data set. Finally, we combine rating similarity and trust into a comprehensive similarity in order to find more appropriate nearest neighbors for the target user. Experimental results show that the algorithm proposed in this paper effectively improves the recommendation precision of collaborative filtering.
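A minimal sketch of the final combination step: a rating-based similarity and a trust degree are blended into one comprehensive similarity used to rank neighbors. The blending weight and the dictionary inputs are assumptions; the paper's exact combination may differ.

```python
def comprehensive_similarity(rating_sim, trust, alpha=0.5):
    """Blend rating similarity and trust (both dicts: user id -> score in [0, 1])."""
    users = set(rating_sim) | set(trust)
    return {u: alpha * rating_sim.get(u, 0.0) + (1 - alpha) * trust.get(u, 0.0)
            for u in users}

def nearest_neighbors(rating_sim, trust, k=10, alpha=0.5):
    """Return the k candidate users with the highest combined similarity."""
    combined = comprehensive_similarity(rating_sim, trust, alpha)
    return sorted(combined, key=combined.get, reverse=True)[:k]
```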
Application of Least Mean Square Algorithms to Spacecraft Vibration Compensation
NASA Technical Reports Server (NTRS)
Woodard, Stanley E.; Nagchaudhuri, Abhijit
1998-01-01
This paper describes the application of the Least Mean Square (LMS) algorithm in tandem with the Filtered-X Least Mean Square algorithm for controlling a science instrument's line-of-sight pointing. Pointing error is caused by a periodic disturbance and spacecraft vibration. A least mean square algorithm is used on-orbit to produce the transfer function between the instrument's servo-mechanism and error sensor. The result is a set of adaptive transversal filter weights tuned to the transfer function. The Filtered-X LMS algorithm, which is an extension of the LMS, tunes a set of transversal filter weights to the transfer function between the disturbance source and the servo-mechanism's actuation signal. The servo-mechanism's resulting actuation counters the disturbance response and thus maintains accurate science instrument pointing. A simulation model of the Upper Atmosphere Research Satellite is used to demonstrate the algorithms.
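An illustrative Filtered-X LMS update in the spirit of the description above; the filter length, step size, sign convention, and secondary-path estimate are assumptions, not the flight implementation.

```python
import numpy as np

def fxlms(reference, error_signal_fn, secondary_path, n_taps=32, mu=1e-3):
    """Toy Filtered-X LMS loop.

    reference: samples correlated with the periodic disturbance.
    error_signal_fn(y, k): measured pointing error at step k given control output y
        (stands in for the real servo/structure).
    secondary_path: FIR estimate of the servo-to-sensor transfer function, e.g.
        identified beforehand with ordinary LMS.
    """
    w = np.zeros(n_taps)                       # adaptive control filter weights
    x_buf = np.zeros(n_taps)                   # raw reference history
    fx_buf = np.zeros(n_taps)                  # reference filtered by secondary path
    path_buf = np.zeros(len(secondary_path))
    for k in range(len(reference)):
        x_buf = np.roll(x_buf, 1); x_buf[0] = reference[k]
        path_buf = np.roll(path_buf, 1); path_buf[0] = reference[k]
        fx_buf = np.roll(fx_buf, 1); fx_buf[0] = secondary_path @ path_buf
        y = w @ x_buf                          # actuation signal
        e = error_signal_fn(y, k)              # residual pointing error
        w = w - mu * e * fx_buf                # Filtered-X LMS weight update
    return w
```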
LSA SAF Meteosat FRP products - Part 1: Algorithms, product contents, and analysis
NASA Astrophysics Data System (ADS)
Wooster, M. J.; Roberts, G.; Freeborn, P. H.; Xu, W.; Govaerts, Y.; Beeby, R.; He, J.; Lattanzio, A.; Fisher, D.; Mullen, R.
2015-11-01
Characterizing changes in landscape fire activity at better than hourly temporal resolution is achievable using thermal observations of actively burning fires made from geostationary Earth Observation (EO) satellites. Over the last decade or more, a series of research and/or operational "active fire" products have been developed from geostationary EO data, often with the aim of supporting biomass burning fuel consumption and trace gas and aerosol emission calculations. Such Fire Radiative Power (FRP) products are generated operationally from Meteosat by the Land Surface Analysis Satellite Applications Facility (LSA SAF) and are available freely every 15 min in both near-real-time and archived form. These products map the location of actively burning fires and characterize their rates of thermal radiative energy release (FRP), which is believed to be proportional to rates of biomass consumption and smoke emission. The FRP-PIXEL product contains the full spatio-temporal resolution FRP data set derivable from the SEVIRI (Spinning Enhanced Visible and Infrared Imager) imager onboard Meteosat at a 3 km spatial sampling distance (decreasing away from the west African sub-satellite point), whilst the FRP-GRID product is an hourly summary at 5° grid resolution that includes simple bias adjustments for meteorological cloud cover and regional underestimation of FRP caused primarily by underdetection of low FRP fires. Here we describe the enhanced geostationary Fire Thermal Anomaly (FTA) detection algorithm used to deliver these products and detail the methods used to generate the atmospherically corrected FRP and per-pixel uncertainty metrics. Using SEVIRI scene simulations and real SEVIRI data, including from a period of Meteosat-8 "special operations", we describe certain sensor and data pre-processing characteristics that influence SEVIRI's active fire detection and FRP measurement capability, and use these to specify parameters in the FTA algorithm and to make recommendations for the forthcoming Meteosat Third Generation operations in relation to active fire measures. We show that the current SEVIRI FTA algorithm is able to discriminate actively burning fires covering down to 10⁻⁴ of a pixel and that it appears more sensitive to fire than other algorithms used to generate many widely exploited active fire products. Finally, we briefly illustrate the information contained within the current Meteosat FRP-PIXEL and FRP-GRID products, providing example analyses for both individual fires and multi-year regional-scale fire activity; the companion paper (Roberts et al., 2015) provides a full product performance evaluation and a demonstration of product use within components of the Copernicus Atmosphere Monitoring Service (CAMS).
Equal Area Logistic Estimation for Item Response Theory
NASA Astrophysics Data System (ADS)
Lo, Shih-Ching; Wang, Kuo-Chang; Chang, Hsin-Li
2009-08-01
Item response theory (IRT) models use logistic functions exclusively as item response functions (IRFs). Applications of IRT models require obtaining the set of values for logistic function parameters that best fit an empirical data set. However, success in obtaining such a set of values does not guarantee that the constructs they represent actually exist, for the adequacy of a model is not sustained by the possibility of estimating parameters. In this study, an equal-area-based two-parameter logistic model estimation algorithm is proposed. Two theorems are given to prove that the results of the algorithm are equivalent to the results of fitting the data by the logistic model. Numerical results are presented to show the stability and accuracy of the algorithm.
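For reference, the two-parameter logistic IRF being estimated has the standard form P(theta) = 1 / (1 + exp(-a(theta - b))); a small sketch (some formulations also include an additional scaling constant, omitted here):

```python
import numpy as np

def two_parameter_logistic(theta, a, b):
    """Probability of a correct response under the 2PL model.

    theta: examinee ability; a: item discrimination; b: item difficulty.
    """
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

# Example: an item with discrimination 1.2 and difficulty 0.5
abilities = np.linspace(-3, 3, 7)
print(two_parameter_logistic(abilities, a=1.2, b=0.5))
```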
List-Based Simulated Annealing Algorithm for Traveling Salesman Problem
Zhan, Shi-hua; Lin, Juan; Zhang, Ze-jun
2016-01-01
The simulated annealing (SA) algorithm is a popular intelligent optimization algorithm which has been successfully applied in many fields. Parameter setting is a key factor for its performance, but it is also tedious work. To simplify parameter setting, we present a list-based simulated annealing (LBSA) algorithm to solve the traveling salesman problem (TSP). The LBSA algorithm uses a novel list-based cooling schedule to control the decrease of temperature. Specifically, a list of temperatures is created first, and then the maximum temperature in the list is used by the Metropolis acceptance criterion to decide whether to accept a candidate solution. The temperature list is adapted iteratively according to the topology of the solution space of the problem. The effectiveness and the parameter sensitivity of the list-based cooling schedule are illustrated through benchmark TSP instances. The LBSA algorithm, whose performance is robust over a wide range of parameter values, shows competitive performance compared with some other state-of-the-art algorithms. PMID:27034650
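A condensed sketch of the list-based cooling idea for the TSP (2-opt neighborhood, Metropolis acceptance driven by the maximum temperature in the list). The seeding rule and list update below are simplified assumptions and differ in detail from the published LBSA procedure.

```python
import math, random

def tour_length(tour, dist):
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))

def list_based_sa(dist, list_size=100, iterations=20000):
    n = len(dist)
    tour = list(range(n)); random.shuffle(tour)
    cost = tour_length(tour, dist)
    temps = []
    for _ in range(list_size):                       # seed the temperature list
        i, j = sorted(random.sample(range(n), 2))
        cand = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]
        delta = abs(tour_length(cand, dist) - cost)
        temps.append(max(delta, 1e-9) / math.log(2)) # assumed seeding rule
    for _ in range(iterations):
        t = max(temps)                               # max temperature drives acceptance
        i, j = sorted(random.sample(range(n), 2))
        cand = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]
        delta = tour_length(cand, dist) - cost
        if delta < 0:
            tour, cost = cand, cost + delta
        else:
            r = random.random()
            if 0 < r < math.exp(-delta / t):
                tour, cost = cand, cost + delta
                # adapt the list: replace the max with the (lower) temperature
                # implied by this acceptance, so the list cools over time
                temps.remove(t)
                temps.append(max(-delta / math.log(r), 1e-12))
    return tour, cost
```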
Approximation algorithms for planning and control
NASA Technical Reports Server (NTRS)
Boddy, Mark; Dean, Thomas
1989-01-01
A control system operating in a complex environment will encounter a variety of different situations, with varying amounts of time available to respond to critical events. Ideally, such a control system will do the best possible with the time available. In other words, its responses should approximate those that would result from having unlimited time for computation, where the degree of the approximation depends on the amount of time it actually has. There exist approximation algorithms for a wide variety of problems. Unfortunately, the solution to any reasonably complex control problem will require solving several computationally intensive problems. Algorithms for successive approximation are a subclass of the class of anytime algorithms, algorithms that return answers for any amount of computation time, where the answers improve as more time is allotted. An architecture is described for allocating computation time to a set of anytime algorithms, based on expectations regarding the value of the answers they return. The architecture described is quite general, producing optimal schedules for a set of algorithms under widely varying conditions.
Comparison of three methods for materials identification and mapping with imaging spectroscopy
NASA Technical Reports Server (NTRS)
Clark, Roger N.; Swayze, Gregg; Boardman, Joe; Kruse, Fred
1993-01-01
We compare three mapping analysis tools for imaging spectroscopy data. The purpose of this comparison is to understand the advantages and disadvantages of each algorithm so that others can better choose the best algorithm or combination of algorithms for a particular problem. The three algorithms are: (1) the spectral-feature modified least squares mapping algorithm of Clark et al. (1990, 1991): programs mbandmap and tricorder; (2) the Spectral Angle Mapper algorithm (Boardman, 1993) found in the CU CSES SIPS package; and (3) the Expert System of Kruse et al. (1993). The comparison uses a ground-calibrated 1990 AVIRIS scene of 400 by 410 pixels over Cuprite, Nevada. Along with the test data set is a spectral library of 38 minerals. Each algorithm is tested with the same AVIRIS data set and spectral library. Field work has confirmed the presence of many of these minerals in the AVIRIS scene (Swayze et al. 1992).
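Of the three, the Spectral Angle Mapper is the simplest to state: each pixel is scored by the angle between its spectrum and a library reference spectrum, and assigned to the closest reference. A minimal sketch (not the SIPS implementation):

```python
import numpy as np

def spectral_angle(pixel_spectrum, reference_spectrum):
    """Angle (radians) between a pixel spectrum and a library reference."""
    p = np.asarray(pixel_spectrum, dtype=float)
    r = np.asarray(reference_spectrum, dtype=float)
    cos_angle = np.dot(p, r) / (np.linalg.norm(p) * np.linalg.norm(r))
    return np.arccos(np.clip(cos_angle, -1.0, 1.0))

def classify_pixel(pixel_spectrum, library):
    """Return the library entry (e.g. mineral name) with the smallest spectral angle."""
    return min(library, key=lambda name: spectral_angle(pixel_spectrum, library[name]))
```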
Genetic algorithms for the vehicle routing problem
NASA Astrophysics Data System (ADS)
Volna, Eva
2016-06-01
The Vehicle Routing Problem (VRP) is one of the most challenging combinatorial optimization tasks. The problem consists in designing the optimal set of routes for a fleet of vehicles in order to serve a given set of customers. Evolutionary algorithms are general iterative algorithms for combinatorial optimization. These algorithms have been found to be very effective and robust in solving numerous problems from a wide range of application domains. The problem is known to be NP-hard; hence many heuristic procedures for its solution have been suggested. For such problems it is often desirable to obtain approximate solutions, provided they can be found fast enough and are sufficiently accurate for the purpose. In this paper we have performed an experimental study that indicates the suitability of genetic algorithms for the vehicle routing problem.
Dimitrakopoulou, Konstantina; Vrahatis, Aristidis G; Wilk, Esther; Tsakalidis, Athanasios K; Bezerianos, Anastasios
2013-09-01
The increasing flow of short time series microarray experiments for the study of dynamic cellular processes poses the need for efficient clustering tools. These tools must deal with three primary issues: first, to consider the multi-functionality of genes; second, to evaluate the similarity of the relative change of amplitude in the time domain rather than the absolute values; third, to cope with the constraints of conventional clustering algorithms such as the assignment of the appropriate cluster number. To address these, we propose OLYMPUS, a novel unsupervised clustering algorithm that integrates the Differential Evolution (DE) method into the Fuzzy Short Time Series (FSTS) algorithm, with the aim of efficiently utilizing the population information of the former to enhance the performance of the latter. Our hybrid approach provides sets of genes that enable the deciphering of distinct phases in dynamic cellular processes. We demonstrated the efficiency of OLYMPUS on synthetic as well as experimental data. The discriminative power of OLYMPUS provided clusters that refined the current perspective on the dynamics of host response mechanisms to Influenza A (H1N1). Our kinetic model sets a timeline for several pathways and cell populations implicated in the host response for which no timeline had previously been assigned (e.g. cell cycle, homeostasis). Regarding the activity of B cells, our approach revealed that some antibody-related mechanisms remain activated until day 60 post infection. The Matlab codes for implementing OLYMPUS, as well as example datasets, are freely accessible via the Web (http://biosignal.med.upatras.gr/wordpress/biosignal/). Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Ice_Sheets_CCI: Essential Climate Variables for the Greenland Ice Sheet
NASA Astrophysics Data System (ADS)
Forsberg, R.; Sørensen, L. S.; Khan, A.; Aas, C.; Evansberget, D.; Adalsteinsdottir, G.; Mottram, R.; Andersen, S. B.; Ahlstrøm, A.; Dall, J.; Kusk, A.; Merryman, J.; Hvidberg, C.; Khvorostovsky, K.; Nagler, T.; Rott, H.; Scharrer, M.; Shepard, A.; Ticconi, F.; Engdahl, M.
2012-04-01
As part of the ESA Climate Change Initiative (www.esa-cci.org) a long-term project "ice_sheets_cci" started January 1, 2012, in addition to the existing 11 projects already generating Essential Climate Variables (ECV) for the Global Climate Observing System (GCOS). The "ice_sheets_cci" goal is to generate a consistent, long-term and timely set of key climate parameters for the Greenland ice sheet, to maximize the impact of European satellite data on climate research, from missions such as ERS, Envisat and the future Sentinel satellites. The climate parameters to be provided, at first in a research context, and in the longer perspective by a routine production system, would be grids of Greenland ice sheet elevation changes from radar altimetry, ice velocity from repeat-pass SAR data, as well as time series of marine-terminating glacier calving front locations and grounding lines for floating-front glaciers. The ice_sheets_cci project will involve a broad interaction of the relevant cryosphere and climate communities, first through user consultations and specifications, and later in 2012 optional participation in "best" algorithm selection activities, where prototype climate parameter variables for selected regions and time frames will be produced and validated using an objective set of criteria ("Round-Robin intercomparison"). This comparative algorithm selection activity will be completely open, and we invite all interested scientific groups with relevant experience to participate. The results of the "Round Robin" exercise will form the algorithmic basis for the future ECV production system. First prototype results will be generated and validated by early 2014. The poster will show the planned outline of the project and some early prototype results.
Egocentric daily activity recognition via multitask clustering.
Yan, Yan; Ricci, Elisa; Liu, Gaowen; Sebe, Nicu
2015-10-01
Recognizing human activities from videos is a fundamental research problem in computer vision. Recently, there has been a growing interest in analyzing human behavior from data collected with wearable cameras. First-person cameras continuously record several hours of their wearers' life. To cope with this vast amount of unlabeled and heterogeneous data, novel algorithmic solutions are required. In this paper, we propose a multitask clustering framework for the analysis of activities of daily living from visual data gathered from wearable cameras. Our intuition is that, even if the data are not annotated, it is possible to exploit the fact that the tasks of recognizing everyday activities of multiple individuals are related, since typically people perform the same actions in similar environments (e.g., people working in an office often read and write documents). In our framework, rather than clustering data from different users separately, we propose to look for clustering partitions which are coherent among related tasks. In particular, two novel multitask clustering algorithms, derived from a common optimization problem, are introduced. Our experimental evaluation, conducted both on synthetic data and on publicly available first-person vision data sets, shows that the proposed approach outperforms several single-task and multitask learning methods.
A cluster analysis on road traffic accidents using genetic algorithms
NASA Astrophysics Data System (ADS)
Saharan, Sabariah; Baragona, Roberto
2017-04-01
The analysis of road traffic accidents is increasingly important because of accident costs and public road safety. The availability of large data sets makes the study of factors that affect the frequency and severity of accidents viable. However, the data are often highly unbalanced and overlapped. We deal with the data set of the road traffic accidents recorded in Christchurch, New Zealand, from 2000-2009, with a total of 26440 accidents. The data are binary, with 50 road traffic accident factors and four levels of severity. We used a genetic algorithm for the analysis because, in the presence of a large unbalanced data set, standard clustering such as the k-means algorithm may not be suitable for the task. The genetic algorithm based clustering for unknown K (GCUK) has been used to identify the factors associated with accidents of different levels of severity. The results provided us with an interesting insight into the relationship between factors and accident severity level and suggest that the two main factors that contribute to fatal accidents are "Speed greater than 60 km/h" and "Did not see other people until it was too late". A comparison with the k-means algorithm and the independent component analysis is performed to validate the results.
Preconditioning 2D Integer Data for Fast Convex Hull Computations
2016-01-01
In order to accelerate computing the convex hull on a set of n points, a heuristic procedure is often applied to reduce the number of points to a set of s points, s ≤ n, which also contains the same hull. We present an algorithm to precondition 2D data with integer coordinates bounded by a box of size p × q before building a 2D convex hull, with three distinct advantages. First, we prove that under the condition min(p, q) ≤ n the algorithm executes in time within O(n); second, no explicit sorting of data is required; and third, the reduced set of s points forms a simple polygonal chain and thus can be directly pipelined into an O(n) time convex hull algorithm. This paper empirically evaluates and quantifies the speedup gained by preconditioning a set of points by a method based on the proposed algorithm before using common convex hull algorithms to build the final hull. A speedup factor of at least four is consistently found from experiments on various datasets when the condition min(p, q) ≤ n holds; the smaller the ratio min(p, q)/n is in the dataset, the greater the speedup factor achieved. PMID:26938221
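One way to realize this kind of O(n) preconditioning for integer points bounded by a p × q box is to bucket points by x coordinate and keep only the minimum and maximum y in each column: points strictly between those extremes cannot be hull vertices, and walking the columns in index order yields a simple chain without an explicit sort. This is an illustrative sketch, not necessarily the paper's exact procedure.

```python
def precondition(points):
    """Reduce integer 2-D points to per-column y-extremes, preserving the convex hull."""
    col_min, col_max = {}, {}
    for x, y in points:                              # single O(n) pass, no sorting
        if x not in col_min or y < col_min[x]:
            col_min[x] = y
        if x not in col_max or y > col_max[x]:
            col_max[x] = y
    lo, hi = min(col_min), max(col_min)
    xs = [x for x in range(lo, hi + 1) if x in col_min]   # column order, O(p)
    lower = [(x, col_min[x]) for x in xs]
    upper = [(x, col_max[x]) for x in reversed(xs)]
    return lower + upper                             # a simple polygonal chain
```

The reduced chain can then be fed directly to a linear-time hull routine such as the monotone-chain algorithm.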
GreedyMAX-type Algorithms for the Maximum Independent Set Problem
NASA Astrophysics Data System (ADS)
Borowiecki, Piotr; Göring, Frank
A maximum independent set problem for a simple graph G = (V,E) is to find the largest subset of pairwise nonadjacent vertices. The problem is known to be NP-hard and it is also hard to approximate. Within this article we introduce a non-negative integer valued function p defined on the vertex set V(G) and called a potential function of a graph G, while P(G) = max_{v ∈ V(G)} p(v) is called a potential of G. For any graph P(G) ≤ Δ(G), where Δ(G) is the maximum degree of G. Moreover, Δ(G) - P(G) may be arbitrarily large. A potential of a vertex lets us get a closer insight into the properties of its neighborhood which leads to the definition of the family of GreedyMAX-type algorithms having the classical GreedyMAX algorithm as their origin. We establish a lower bound 1/(P + 1) for the performance ratio of GreedyMAX-type algorithms which favorably compares with the bound 1/(Δ + 1) known to hold for GreedyMAX. The cardinality of an independent set generated by any GreedyMAX-type algorithm is at least ∑_{v ∈ V(G)} (p(v)+1)^(-1), which strengthens the bounds of Turán and Caro-Wei stated in terms of vertex degrees.
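For context, a commonly stated form of the classical GreedyMAX scheme that these algorithms generalize is: repeatedly delete a vertex of maximum current degree until no edges remain; the surviving vertices are pairwise nonadjacent. A sketch under that assumed formulation:

```python
def greedy_max_independent_set(adj):
    """GreedyMAX on a graph given as {vertex: set(neighbors)} (illustrative sketch)."""
    adj = {v: set(nbrs) for v, nbrs in adj.items()}  # work on a copy
    while any(adj[v] for v in adj):
        v = max(adj, key=lambda u: len(adj[u]))      # vertex of maximum degree
        for u in adj[v]:
            adj[u].discard(v)
        del adj[v]                                   # delete it from the graph
    return set(adj)                                  # remaining vertices are independent
```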
NASA Astrophysics Data System (ADS)
De Smedt, Isabelle; Richter, Andreas; Beirle, Steffen; Danckaert, Thomas; Van Roozendael, Michel; Yu, Huan; Bösch, Tim; Hilboll, Andreas; Peters, Enno; Doerner, Steffen; Wagner, Thomas; Wang, Yang; Lorente, Alba; Eskes, Henk; Van Geffen, Jos; Boersma, Folkert
2016-04-01
One of the main goals of the QA4ECV project is to define community best practices for the generation of multi-decadal ECV data records from satellite instruments. QA4ECV will develop retrieval algorithms for the Land ECVs surface albedo, leaf area index (LAI), and fraction of active photosynthetic radiation (fAPAR), as well as for the Atmosphere ECV ozone and aerosol precursors nitrogen dioxide (NO2), formaldehyde (HCHO), and carbon monoxide (CO). Here we assess best practices and provide recommendations for the retrieval of HCHO. Best practices are established based on (1) a detailed intercomparison exercise between the QA4ECV partners for each specific algorithm processing step, (2) the feasibility of implementation, and (3) the requirement to generate consistent multi-sensor multi-decadal data records. We propose a fitting window covering the 328.5-346 nm spectral interval for the morning sensors (GOME, SCIAMACHY and GOME-2) and an extension to 328.5-359 nm for OMI and GOME-2, allowed by improved quality of the recorded spectra. A high level of consistency between group algorithms is found when the retrieval settings are carefully aligned. However, the retrieval of slant columns is highly sensitive to any change in the selected settings. The use of a mean background radiance as DOAS reference spectrum allows for a stabilization of the retrievals. A background correction based on the reference sector method is recommended for implementation in the QA4ECV HCHO algorithm as it further reduces retrieval uncertainties. HCHO AMFs using different radiative transfer codes show a good overall consistency when harmonized settings are used. As for NO2, it is proposed to use a priori HCHO profiles from the TM5 model. These are provided on a 1°x1° latitude-longitude grid.
NASA Astrophysics Data System (ADS)
Bleier, T.; Heraud, J. A.; Dunson, J. C.
2015-12-01
QuakeFinder (QF) and its international collaborators have installed and currently maintain 165 three-axis induction magnetometer instrument sites in California, Peru, Taiwan, Greece, Chile and Sumatra. The data from these instruments are being analyzed for pre-quake signatures. This analysis consists of both private research by QuakeFinder, and institutional collaborators (PUCP in Peru, NCU in Taiwan, PUCC in Chile, NOA in Greece, Syiah Kuala University in Indonesia, LASP at U of Colo., Stanford, and USGS). Recently, NASA Hq and QuakeFinder tried a new approach to help with the analysis of this huge (50+TB) data archive. A collaboration with Apirio/TopCoder, Harvard University, Amazon, QuakeFinder, and NASA Hq. resulted in an open algorithm development contest called "Quest for Quakes" in which contestants (freelance algorithm developers) attempted to identify quakes from a subset of the QuakeFinder data (3TB). The contest included a $25K prize pool, and contained 100 cases where earthquakes (and null sets) included data from up to 5 remote sites, near and far from quakes greater than M4. These data sets were made available through Amazon.com to hundreds of contestants over a two week contest period. In a more traditional approach, several new algorithms were tried by actively sharing the QF data with universities over a longer period. These algorithms included Principal Component Analysis-PCA and deep neural networks in an effort to automatically identify earthquake signals within typical, noise-filled environments. This presentation examines the pros and cons of employing these two approaches, from both logistical and scientific perspectives.
NASA Technical Reports Server (NTRS)
Lee, Zhong-Ping; Carder, Kendall L.
2001-01-01
A multi-band analytical (MBA) algorithm is developed to retrieve absorption and backscattering coefficients for optically deep waters, which can be applied to data from past and current satellite sensors, as well as data from hyperspectral sensors. This MBA algorithm applies a remote-sensing reflectance model derived from the Radiative Transfer Equation, and values of absorption and backscattering coefficients are analytically calculated from values of remote-sensing reflectance. There are only limited empirical relationships involved in the algorithm, which implies that this MBA algorithm could be applied to a wide dynamic range of waters. Applying the algorithm to a simulated non-"Case 1" data set, which has no relation to the development of the algorithm, the percentage error for the total absorption coefficient at 440 nm, a(440), is approximately 12% for a range of 0.012 - 2.1 per meter (approximately 6% for a(440) less than approximately 0.3 per meter), while a traditional band-ratio approach returns a percentage error of approximately 30%. Applying it to a field data set ranging from 0.025 to 2.0 per meter, the result for a(440) is very close to that using a full spectrum optimization technique (9.6% difference). Compared to the optimization approach, the MBA algorithm cuts the computation time dramatically with only a small sacrifice in accuracy, making it suitable for processing large data sets such as satellite images. Significant improvements over empirical algorithms have also been achieved in retrieving the optical properties of optically deep waters.
Ciesielski, Krzysztof Chris; Udupa, Jayaram K.
2011-01-01
In the current vast image segmentation literature, there seems to be considerable redundancy among algorithms, while there is a serious lack of methods that would allow their theoretical comparison to establish their similarity, equivalence, or distinctness. In this paper, we make an attempt to fill this gap. To accomplish this goal, we argue that: (1) every digital segmentation algorithm A should have a well defined continuous counterpart M_A, referred to as its model, which constitutes an asymptotic of A when image resolution goes to infinity; (2) the equality of two such models M_A and M_A′ establishes a theoretical (asymptotic) equivalence of their digital counterparts A and A′. Such a comparison is of full theoretical value only when, for each involved algorithm A, its model M_A is proved to be an asymptotic of A. So far, such proofs do not appear anywhere in the literature, even in the case of algorithms introduced as digitizations of continuous models, like level set segmentation algorithms. The main goal of this article is to explore a line of investigation for formally pairing the digital segmentation algorithms with their asymptotic models, justifying such relations with mathematical proofs, and using the results to compare the segmentation algorithms in this general theoretical framework. As a first step towards this general goal, we prove here that the gradient based thresholding model M_∇ is the asymptotic for the fuzzy connectedness segmentation algorithm of Udupa and Samarasekera used with gradient based affinity A_∇. We also argue that, in a sense, M_∇ is the asymptotic for the original front propagation level set algorithm of Malladi, Sethian, and Vemuri, thus establishing a theoretical equivalence between these two specific algorithms. Experimental evidence of this last equivalence is also provided. PMID:21442014
A study on the performance comparison of metaheuristic algorithms on the learning of neural networks
NASA Astrophysics Data System (ADS)
Lai, Kee Huong; Zainuddin, Zarita; Ong, Pauline
2017-08-01
The learning or training process of neural networks entails the task of finding the optimal set of parameters, which includes translation vectors, dilation parameters, synaptic weights, and bias terms. Apart from the traditional gradient descent-based methods, metaheuristic methods can also be used for this learning purpose. Since the inception of the genetic algorithm half a century ago, the last decade witnessed the explosion of a variety of novel metaheuristic algorithms, such as the harmony search algorithm, bat algorithm, and whale optimization algorithm. Despite the proof of the no free lunch theorem in the discipline of optimization, a survey of the machine learning literature gives contrasting results. Some researchers report that certain metaheuristic algorithms are superior to the others, whereas some others argue that different metaheuristic algorithms give comparable performance. As such, this paper aims to investigate whether a certain metaheuristic algorithm will outperform the others. In this work, three metaheuristic algorithms, namely genetic algorithms, particle swarm optimization, and the harmony search algorithm, are considered. The algorithms are incorporated in the learning of neural networks and their classification results on the benchmark UCI machine learning data sets are compared. It is found that all three metaheuristic algorithms give similar and comparable performance, as captured in the average overall classification accuracy. The results corroborate the findings reported in the works of previous researchers. Several recommendations are given, which include the need for statistical analysis to verify the results and further theoretical work to support the obtained empirical results.
Optimizing human activity patterns using global sensitivity analysis
Hickmann, Kyle S.; Mniszewski, Susan M.; Del Valle, Sara Y.; Hyman, James M.
2014-01-01
Implementing realistic activity patterns for a population is crucial for modeling, for example, disease spread, supply and demand, and disaster response. Using the dynamic activity simulation engine, DASim, we generate schedules for a population that capture regular (e.g., working, eating, and sleeping) and irregular activities (e.g., shopping or going to the doctor). We use the sample entropy (SampEn) statistic to quantify a schedule’s regularity for a population. We show how to tune an activity’s regularity by adjusting SampEn, thereby making it possible to realistically design activities when creating a schedule. The tuning process sets up a computationally intractable high-dimensional optimization problem. To reduce the computational demand, we use Bayesian Gaussian process regression to compute global sensitivity indices and identify the parameters that have the greatest effect on the variance of SampEn. We use the harmony search (HS) global optimization algorithm to locate global optima. Our results show that HS combined with global sensitivity analysis can efficiently tune the SampEn statistic with few search iterations. We demonstrate how global sensitivity analysis can guide statistical emulation and global optimization algorithms to efficiently tune activities and generate realistic activity patterns. Though our tuning methods are applied to dynamic activity schedule generation, they are general and represent a significant step in the direction of automated tuning and optimization of high-dimensional computer simulations. PMID:25580080
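Sample entropy itself is a standard statistic; a minimal sketch of its computation follows, in case the definition is unfamiliar. The embedding dimension m and tolerance r are common defaults, not values taken from the paper.

```python
import numpy as np

def sample_entropy(x, m=2, r=0.2):
    """SampEn(m, r) = -ln(A / B), where B counts template pairs of length m
    within Chebyshev distance r * std(x), A counts the same for length m + 1,
    and self-matches are excluded (Richman-Moorman formulation)."""
    x = np.asarray(x, dtype=float)
    tol = r * np.std(x)
    n = len(x)

    def count_matches(length):
        # Use the same number of templates (n - m) for lengths m and m + 1.
        templates = np.array([x[i:i + length] for i in range(n - m)])
        count = 0
        for i in range(len(templates)):
            dist = np.max(np.abs(templates[i + 1:] - templates[i]), axis=1)
            count += np.sum(dist <= tol)
        return count

    B = count_matches(m)
    A = count_matches(m + 1)
    return -np.log(A / B) if A > 0 and B > 0 else np.inf

# A regular signal yields a lower SampEn than an irregular one.
t = np.arange(1000)
print(sample_entropy(np.sin(0.1 * t)))                              # more regular
print(sample_entropy(np.random.default_rng(0).normal(size=1000)))   # less regular
```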
Optimizing human activity patterns using global sensitivity analysis
Fairchild, Geoffrey; Hickmann, Kyle S.; Mniszewski, Susan M.; ...
2013-12-10
Implementing realistic activity patterns for a population is crucial for modeling, for example, disease spread, supply and demand, and disaster response. Using the dynamic activity simulation engine, DASim, we generate schedules for a population that capture regular (e.g., working, eating, and sleeping) and irregular activities (e.g., shopping or going to the doctor). We use the sample entropy (SampEn) statistic to quantify a schedule’s regularity for a population. We show how to tune an activity’s regularity by adjusting SampEn, thereby making it possible to realistically design activities when creating a schedule. The tuning process sets up a computationally intractable high-dimensional optimization problem. To reduce the computational demand, we use Bayesian Gaussian process regression to compute global sensitivity indices and identify the parameters that have the greatest effect on the variance of SampEn. Here we use the harmony search (HS) global optimization algorithm to locate global optima. Our results show that HS combined with global sensitivity analysis can efficiently tune the SampEn statistic with few search iterations. We demonstrate how global sensitivity analysis can guide statistical emulation and global optimization algorithms to efficiently tune activities and generate realistic activity patterns. Finally, though our tuning methods are applied to dynamic activity schedule generation, they are general and represent a significant step in the direction of automated tuning and optimization of high-dimensional computer simulations.
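Both versions of this abstract lean on harmony search for the global optimization step. Below is a minimal, generic harmony search sketch on a toy objective; the harmony memory size, HMCR/PAR settings, and sphere objective are illustrative assumptions, not the SampEn tuning problem itself.

```python
import numpy as np

rng = np.random.default_rng(1)

def harmony_search(objective, dim, bounds, n_iter=2000,
                   hms=20, hmcr=0.9, par=0.3, bw=0.05):
    """Generic harmony search: keep a memory of candidate solutions and
    improvise new ones by recombining remembered values (rate hmcr),
    occasionally pitch-adjusting them (rate par), or sampling at random."""
    lo, hi = bounds
    memory = rng.uniform(lo, hi, size=(hms, dim))
    fitness = np.array([objective(h) for h in memory])
    for _ in range(n_iter):
        new = np.empty(dim)
        for j in range(dim):
            if rng.random() < hmcr:
                new[j] = memory[rng.integers(hms), j]      # memory consideration
                if rng.random() < par:
                    new[j] += bw * rng.uniform(-1, 1)      # pitch adjustment
            else:
                new[j] = rng.uniform(lo, hi)               # random selection
        new = np.clip(new, lo, hi)
        f = objective(new)
        worst = np.argmax(fitness)
        if f < fitness[worst]:                             # replace worst harmony
            memory[worst], fitness[worst] = new, f
    best = np.argmin(fitness)
    return memory[best], fitness[best]

# Toy objective: minimize the sphere function over a 5-dimensional box.
best_x, best_f = harmony_search(lambda x: np.sum(x ** 2), dim=5, bounds=(-5.0, 5.0))
print(best_x, best_f)
```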
Scheduling Algorithm for Mission Planning and Logistics Evaluation (SAMPLE)
NASA Technical Reports Server (NTRS)
Dupnick, E.; Wiggins, D.
1980-01-01
The scheduling algorithm for mission planning and logistics evaluation (SAMPLE) is presented. Two major subsystems are included: the mission payloads program and the set covering program. Formats and parameter definitions for the payload data set (payload model), feasible combination file, and traffic model are documented.
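The abstract names a set covering program as one of SAMPLE's subsystems. As background only, the classical set covering problem behind that name can be illustrated with the standard greedy heuristic below; the universe and subsets are hypothetical, and this is not SAMPLE's actual implementation.

```python
def greedy_set_cover(universe, subsets):
    """Standard greedy heuristic for set covering: repeatedly pick the subset
    that covers the most still-uncovered elements."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best = max(subsets, key=lambda name: len(subsets[name] & uncovered))
        if not subsets[best] & uncovered:
            raise ValueError("universe cannot be covered by the given subsets")
        chosen.append(best)
        uncovered -= subsets[best]
    return chosen

# Hypothetical example: cover requirements {1..6} with candidate payload groupings.
subsets = {"A": {1, 2, 3}, "B": {2, 4}, "C": {3, 4, 5}, "D": {5, 6}}
print(greedy_set_cover(range(1, 7), subsets))
```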
Anomaly Detection in Large Sets of High-Dimensional Symbol Sequences
NASA Technical Reports Server (NTRS)
Budalakoti, Suratna; Srivastava, Ashok N.; Akella, Ram; Turkov, Eugene
2006-01-01
This paper addresses the problem of detecting and describing anomalies in large sets of high-dimensional symbol sequences. The approach uses unsupervised clustering of sequences, with the normalized longest common subsequence (LCS) as the similarity measure, followed by detailed analysis of outliers to detect anomalies. Because the LCS measure is expensive to compute, the first part of the paper discusses existing algorithms with low time complexity, such as the Hunt-Szymanski algorithm. We then discuss why these algorithms often do not work well in practice and present a new hybrid algorithm for computing the LCS that, in our tests, outperforms the Hunt-Szymanski algorithm by a factor of five. The second part of the paper presents new algorithms for outlier analysis that provide comprehensible indicators as to why a particular sequence was deemed an outlier. The algorithms give the analyst a coherent description of how an anomalous sequence differs from more typical sequences. The algorithms we present are general and domain-independent, and we discuss their application to anomaly detection in related areas.
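The normalized LCS similarity the abstract refers to can be illustrated with the textbook dynamic programming recurrence below. This is the standard O(len(a) * len(b)) computation, not the paper's hybrid low-complexity algorithm, and normalizing by the geometric mean of the sequence lengths is one common choice rather than necessarily the paper's exact definition.

```python
import numpy as np

def lcs_length(a, b):
    """Length of the longest common subsequence via the standard DP recurrence."""
    dp = np.zeros((len(a) + 1, len(b) + 1), dtype=int)
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i, j] = dp[i - 1, j - 1] + 1 if x == y else max(dp[i - 1, j], dp[i, j - 1])
    return int(dp[len(a), len(b)])

def normalized_lcs(a, b):
    """LCS length normalized by the geometric mean of the sequence lengths,
    giving a similarity in [0, 1]."""
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / (len(a) * len(b)) ** 0.5

# Example on short symbol sequences (e.g., tokenized event logs).
print(normalized_lcs(list("ABCBDAB"), list("BDCABA")))   # LCS length 4, similarity ~0.62
```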