An Extraction Method of an Informative DOM Node from a Web Page by Using Layout Information
NASA Astrophysics Data System (ADS)
Tsuruta, Masanobu; Masuyama, Shigeru
We propose a method for extracting the informative DOM node from a Web page as preprocessing for Web content mining. Our proposed method, LM, uses layout data of DOM nodes generated by a generic Web browser, together with a learning set consisting of hundreds of Web pages annotated with their informative DOM nodes. Our method does not require large-scale crawling of the whole Web site to which the target Web page belongs. We design LM so that it uses the information in the learning set more efficiently than the existing method that uses the same learning set. By experiments, we evaluate the methods obtained by combining a method for extracting the informative DOM node (either the proposed method or an existing one) with the existing noise elimination methods: Heur, which removes advertisements and link lists by some heuristics, and CE, which removes DOM nodes that also appear in other Web pages of the Web site to which the target Web page belongs. Experimental results show that 1) LM outperforms the other methods for extracting the informative DOM node, and 2) the combination method (LM, {CE(10), Heur}) based on LM (precision: 0.755, recall: 0.826, F-measure: 0.746) outperforms the other combination methods.
Zhao, Huaqing; Rebbeck, Timothy R; Mitra, Nandita
2009-12-01
Confounding due to population stratification (PS) arises when differences in both allele and disease frequencies exist in a population of mixed racial/ethnic subpopulations. Genomic control, structured association, principal components analysis (PCA), and multidimensional scaling (MDS) approaches have been proposed to address this bias using genetic markers. However, confounding due to PS can also be due to non-genetic factors. Propensity scores are widely used to address confounding in observational studies but have not been adapted to deal with PS in genetic association studies. We propose a genomic propensity score (GPS) approach to correct for bias due to PS that considers both genetic and non-genetic factors. We compare the GPS method with PCA and MDS using simulation studies. Our results show that GPS can adequately adjust and consistently correct for bias due to PS. Under no/mild, moderate, and severe PS, GPS yielded estimates with bias close to 0 (mean=-0.0044, standard error=0.0087). Under moderate or severe PS, the GPS method consistently outperforms the PCA method in terms of bias, coverage probability (CP), and type I error. Under moderate PS, the GPS method consistently outperforms the MDS method in terms of CP. PCA maintains relatively high power compared to both MDS and GPS methods under the simulated situations. GPS and MDS are comparable in terms of statistical properties such as bias, type I error, and power. The GPS method provides a novel and robust tool for obtaining less-biased estimates of genetic associations that can consider both genetic and non-genetic factors. 2009 Wiley-Liss, Inc.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tuo, Rui; Wu, C. F. Jeff
Many computer models contain unknown parameters which need to be estimated using physical observations. Furthermore, the calibration method based on Gaussian process models may lead to unreasonable estimates for imperfect computer models. In this work, we extend their study to calibration problems with stochastic physical data. We propose a novel method, called the L2 calibration, and show its semiparametric efficiency. The conventional method of ordinary least squares is also studied. Theoretical analysis shows that it is consistent but not efficient. Numerical examples show that the proposed method outperforms the existing ones.
Jackowski, Konrad; Krawczyk, Bartosz; Woźniak, Michał
2014-05-01
Currently, methods of combined classification are the focus of intense research. A properly designed group of combined classifiers exploiting knowledge gathered in a pool of elementary classifiers can successfully outperform a single classifier. There are two essential issues to consider when creating combined classifiers: how to establish the most comprehensive pool and how to design a fusion model that allows for taking full advantage of the collected knowledge. In this work, we address these issues and propose AdaSS+, a training algorithm dedicated to compound classifier systems that effectively exploits local specialization of the elementary classifiers. The training procedure consists of two phases. The first phase detects the classifier competencies and adjusts the respective fusion parameters. The second phase boosts classification accuracy by elevating the degree of local specialization. The quality of the proposed algorithm is evaluated on the basis of a wide range of computer experiments that show that AdaSS+ can outperform the original method and several reference classifiers.
Comparison of Classification Methods for Detecting Emotion from Mandarin Speech
NASA Astrophysics Data System (ADS)
Pao, Tsang-Long; Chen, Yu-Te; Yeh, Jun-Heng
It is said that technology comes out of humanity. What is humanity? The very definition of humanity is emotion. Emotion is the basis for all human expression and the underlying theme behind everything that is done, said, thought or imagined. If computers can perceive and respond to human emotion, human-computer interaction will become more natural. Several classifiers have been adopted for automatically assigning an emotion category, such as anger, happiness or sadness, to a speech utterance. These classifiers were designed independently and tested on various emotional speech corpora, making it difficult to compare and evaluate their performance. In this paper, we first compared several popular classification methods and evaluated their performance by applying them to a Mandarin speech corpus consisting of five basic emotions, including anger, happiness, boredom, sadness and neutral. The extracted feature streams contain MFCC, LPCC, and LPC. The experimental results show that the proposed WD-MKNN classifier achieves an accuracy of 81.4% for the 5-class emotion recognition and outperforms other classification techniques, including KNN, MKNN, DW-KNN, LDA, QDA, GMM, HMM, SVM, and BPNN. Then, to verify the advantage of the proposed method, we compared these classifiers by applying them to another Mandarin expressive speech corpus consisting of two emotions. The experimental results still show that the proposed WD-MKNN outperforms the others.
Efficient calibration for imperfect computer models
Tuo, Rui; Wu, C. F. Jeff
2015-12-01
Many computer models contain unknown parameters which need to be estimated using physical observations. Furthermore, the calibration method based on Gaussian process models may lead to unreasonable estimates for imperfect computer models. In this work, we extend their study to calibration problems with stochastic physical data. We propose a novel method, called the L2 calibration, and show its semiparametric efficiency. The conventional method of ordinary least squares is also studied. Theoretical analysis shows that it is consistent but not efficient. Numerical examples show that the proposed method outperforms the existing ones.
NASA Astrophysics Data System (ADS)
Stas, Michiel; Dong, Qinghan; Heremans, Stien; Zhang, Beier; Van Orshoven, Jos
2016-08-01
This paper compares two machine learning techniques to predict regional winter wheat yields. The models, based on Boosted Regression Trees (BRT) and Support Vector Machines (SVM), are constructed from Normalized Difference Vegetation Indices (NDVI) derived from low-resolution SPOT VEGETATION satellite imagery. Three types of NDVI-related predictors were used: Single NDVI, Incremental NDVI and Targeted NDVI. BRT and SVM were first used to select features with high relevance for predicting the yield. Although the exact selections differed between the prefectures, certain periods with high influence scores for multiple prefectures could be identified. The same period of high influence, stretching from March to June, was detected by both machine learning methods. After feature selection, BRT and SVM models were applied to the subset of selected features for actual yield forecasting. Whereas both machine learning methods returned very low prediction errors, BRT seems to slightly but consistently outperform SVM.
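The abstract above describes a two-stage workflow (tree-based feature screening followed by regression on the selected NDVI features) without implementation details. The sketch below illustrates that workflow with scikit-learn, using gradient boosting as a stand-in for BRT and an RBF-kernel SVR for the SVM; the synthetic NDVI matrix, yield values, and importance threshold are all hypothetical placeholders.

```python
# Hypothetical sketch: NDVI-based yield forecasting with BRT-style gradient
# boosting and an SVM regressor, mirroring the two-stage workflow (feature
# selection, then prediction) described above. Data and thresholds are made up.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.random((120, 36))          # e.g. 36 dekadal NDVI predictors per region-year
y = 2.0 + X[:, 8] + 0.5 * X[:, 15] + 0.1 * rng.standard_normal(120)  # synthetic yield

# Stage 1: screen predictors with a boosted regression tree model.
brt = GradientBoostingRegressor(n_estimators=300, learning_rate=0.05, max_depth=3)
brt.fit(X, y)
keep = brt.feature_importances_ > 1.0 / X.shape[1]   # hypothetical relevance cut-off
X_sel = X[:, keep]

# Stage 2: forecast yield from the selected NDVI features with both learners.
svm = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.05))
for name, model in [("BRT", brt), ("SVM", svm)]:
    rmse = -cross_val_score(model, X_sel, y, cv=5,
                            scoring="neg_root_mean_squared_error").mean()
    print(f"{name}: cross-validated RMSE = {rmse:.3f}")
```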
Yang, Jian; Zhang, David; Yang, Jing-Yu; Niu, Ben
2007-04-01
This paper develops an unsupervised discriminant projection (UDP) technique for dimensionality reduction of high-dimensional data in small sample size cases. UDP can be seen as a linear approximation of a multimanifolds-based learning framework which takes into account both the local and nonlocal quantities. UDP characterizes the local scatter as well as the nonlocal scatter, seeking to find a projection that simultaneously maximizes the nonlocal scatter and minimizes the local scatter. This characteristic makes UDP more intuitive and more powerful than the most up-to-date method, Locality Preserving Projection (LPP), which considers only the local scatter for clustering or classification tasks. The proposed method is applied to face and palm biometrics and is examined using the Yale, FERET, and AR face image databases and the PolyU palmprint database. The experimental results show that UDP consistently outperforms LPP and PCA and outperforms LDA when the training sample size per class is small. This demonstrates that UDP is a good choice for real-world biometrics applications.
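The scatter-based criterion described above can be cast as a generalized eigenvalue problem: choose projection directions with large nonlocal scatter and small local scatter. The following sketch is an illustrative simplification (k-nearest-neighbor locality, naive dense scatter matrices), not the authors' exact UDP formulation.

```python
# Illustrative sketch of a UDP-style projection: build local/nonlocal scatter
# matrices from a k-NN adjacency and solve the generalized eigenproblem
# S_nonlocal w = lambda * S_local w. Simplified; not the paper's exact recipe.
import numpy as np
from scipy.linalg import eigh
from sklearn.neighbors import kneighbors_graph

def udp_projection(X, n_components=2, k=5, reg=1e-6):
    n, d = X.shape
    A = kneighbors_graph(X, k, mode="connectivity", include_self=False).toarray()
    A = np.maximum(A, A.T)                      # symmetric local adjacency
    diffs = X[:, None, :] - X[None, :, :]       # pairwise differences (n, n, d)
    outer = np.einsum("ijk,ijl->ijkl", diffs, diffs)
    S_local = np.tensordot(A, outer, axes=([0, 1], [0, 1])) / (2.0 * n * n)
    S_nonlocal = np.tensordot(1.0 - A, outer, axes=([0, 1], [0, 1])) / (2.0 * n * n)
    S_local += reg * np.eye(d)                  # regularize for small-sample cases
    # Largest generalized eigenvalues give directions with large nonlocal
    # scatter and small local scatter.
    w, V = eigh(S_nonlocal, S_local)
    return V[:, ::-1][:, :n_components]

X = np.random.default_rng(1).random((60, 10))
W = udp_projection(X)
print(W.shape)   # (10, 2) projection matrix
```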
Sinusoidal Analysis-Synthesis of Audio Using Perceptual Criteria
NASA Astrophysics Data System (ADS)
Painter, Ted; Spanias, Andreas
2003-12-01
This paper presents a new method for the selection of sinusoidal components for use in compact representations of narrowband audio. The method consists of ranking and selecting the most perceptually relevant sinusoids. The idea behind the method is to maximize the matching between the auditory excitation pattern associated with the original signal and the corresponding auditory excitation pattern associated with the modeled signal that is being represented by a small set of sinusoidal parameters. The proposed component-selection methodology is shown to outperform the maximum signal-to-mask ratio selection strategy in terms of subjective quality.
Consistent Adjoint Driven Importance Sampling using Space, Energy and Angle
DOE Office of Scientific and Technical Information (OSTI.GOV)
Peplow, Douglas E.; Mosher, Scott W; Evans, Thomas M
2012-08-01
For challenging radiation transport problems, hybrid methods combine the accuracy of Monte Carlo methods with the global information present in deterministic methods. One of the most successful hybrid methods is CADIS (Consistent Adjoint Driven Importance Sampling). This method uses a deterministic adjoint solution to construct a biased source distribution and consistent weight windows to optimize a specific tally in a Monte Carlo calculation. The method has been implemented into transport codes using just the spatial and energy information from the deterministic adjoint and has been used in many applications to compute tallies with much higher figures-of-merit than analog calculations. CADIS also outperforms user-supplied importance values, which usually take long periods of user time to develop. This work extends CADIS to develop weight windows that are a function of the position, energy, and direction of the Monte Carlo particle. Two types of consistent source biasing are presented: one method that biases the source in space and energy while preserving the original directional distribution and one method that biases the source in space, energy, and direction. Seven simple example problems are presented which compare the use of the standard space/energy CADIS with the new space/energy/angle treatments.
Disease gene classification with metagraph representations.
Kircali Ata, Sezin; Fang, Yuan; Wu, Min; Li, Xiao-Li; Xiao, Xiaokui
2017-12-01
Protein-protein interaction (PPI) networks play an important role in studying the functional roles of proteins, including their association with diseases. However, protein interaction networks are not sufficient without the support of additional biological knowledge for proteins such as their molecular functions and biological processes. To complement and enrich PPI networks, we propose to exploit biological properties of individual proteins. More specifically, we integrate keywords describing protein properties into the PPI network, and construct a novel PPI-Keywords (PPIK) network consisting of both proteins and keywords as two different types of nodes. As disease proteins tend to have similar topological characteristics on the PPIK network, we further propose to represent proteins with metagraphs. Different from a traditional network motif or subgraph, a metagraph can capture a particular topological arrangement involving the interactions/associations between both proteins and keywords. Based on the novel metagraph representations for proteins, we further build classifiers for disease protein classification through supervised learning. Our experiments on three different PPI databases demonstrate that the proposed method consistently improves disease protein prediction across various classifiers, by 15.3% in AUC on average. It outperforms the baselines including the diffusion-based methods (e.g., RWR) and the module-based methods by 13.8-32.9% for overall disease protein prediction. For predicting breast cancer genes, it outperforms RWR, PRINCE and the module-based baselines by 6.6-14.2%. Finally, our predictions also turn out to have better correlations with literature findings from PubMed. Copyright © 2017 Elsevier Inc. All rights reserved.
Hybrid Power Management for Office Equipment
NASA Astrophysics Data System (ADS)
Gingade, Ganesh P.
Office machines (such as printers, scanners, fax machines, and copiers) can consume significant amounts of power, yet few studies have been devoted to power management of office equipment. Most office machines have sleep modes to save power. Power management of these machines is usually timeout-based: a machine sleeps after being idle long enough. Setting the timeout duration can be difficult: if it is too long, the machine wastes power during idleness; if it is too short, the machine sleeps too soon and too often, and the wakeup delay can significantly degrade productivity. Thus, power management is a tradeoff between saving energy and keeping response times short. Many power management policies have been published, and one policy may outperform another in some scenarios; there is no definite conclusion as to which policy is always better. This thesis describes two methods for office equipment power management. The first method adaptively reduces power based on a constraint on the wakeup delay. The second method is a hybrid with multiple candidate policies that selects the most appropriate power management policy. Using six months of request traces from 18 different offices, we demonstrate that the hybrid policy outperforms the individual policies. We also discover that power management based on business hours does not produce consistent energy savings.
Mean-Reverting Portfolio With Budget Constraint
NASA Astrophysics Data System (ADS)
Zhao, Ziping; Palomar, Daniel P.
2018-05-01
This paper considers the mean-reverting portfolio design problem arising from statistical arbitrage in the financial markets. We first propose a general problem formulation aimed at finding a portfolio of underlying component assets by optimizing a mean-reversion criterion characterizing the mean-reversion strength, taking into consideration the variance of the portfolio and an investment budget constraint. Then several specific problems are considered based on the general formulation, and efficient algorithms are proposed. Numerical results on both synthetic and market data show that our proposed mean-reverting portfolio design methods can generate consistent profits and outperform the traditional design methods and the benchmark methods in the literature.
Automatic Authorship Detection Using Textual Patterns Extracted from Integrated Syntactic Graphs
Gómez-Adorno, Helena; Sidorov, Grigori; Pinto, David; Vilariño, Darnes; Gelbukh, Alexander
2016-01-01
We apply the integrated syntactic graph feature extraction methodology to the task of automatic authorship detection. This graph-based representation allows integrating different levels of language description into a single structure. We extract textual patterns based on features obtained from shortest path walks over integrated syntactic graphs and apply them to determine the authors of documents. On average, our method outperforms the state-of-the-art approaches and gives consistently high results across different corpora, unlike existing methods. Our results show that our textual patterns are useful for the task of authorship attribution. PMID:27589740
Robust phase retrieval of complex-valued object in phase modulation by hybrid Wirtinger flow method
NASA Astrophysics Data System (ADS)
Wei, Zhun; Chen, Wen; Yin, Tiantian; Chen, Xudong
2017-09-01
This paper presents a robust iterative algorithm, known as hybrid Wirtinger flow (HWF), for phase retrieval (PR) of complex objects from noisy diffraction intensities. Numerical simulations indicate that the HWF method consistently outperforms conventional PR methods in terms of both accuracy and convergence rate in multiple phase modulations. The proposed algorithm is also more robust to low oversampling ratios, loose constraints, and noisy environments. Furthermore, compared with traditional Wirtinger flow, sample complexity is largely reduced. It is expected that the proposed HWF method will find applications in the rapidly growing coherent diffractive imaging field for high-quality image reconstruction with multiple modulations, as well as other disciplines where PR is needed.
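For readers unfamiliar with the baseline, the sketch below implements plain Wirtinger flow (spectral initialization plus gradient descent on the intensity-matching loss) on a synthetic Gaussian measurement model; the hybrid modifications proposed in the paper are not reproduced, and the step-size schedule here is a common heuristic rather than the authors' choice.

```python
# Minimal sketch of plain Wirtinger flow for phase retrieval from intensity
# measurements b = |A z|^2, with a synthetic complex Gaussian A. The hybrid
# variant described above adds further ingredients not reproduced here.
import numpy as np

rng = np.random.default_rng(0)
n, m = 64, 8 * 64
z_true = rng.standard_normal(n) + 1j * rng.standard_normal(n)
A = (rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))) / np.sqrt(2)
b = np.abs(A @ z_true) ** 2

# Spectral initialization: leading eigenvector of (1/m) * A^H diag(b) A.
Y = (A.conj().T * b) @ A / m
eigvals, eigvecs = np.linalg.eigh(Y)
z = eigvecs[:, -1] * np.sqrt(b.mean())
z0_norm2 = np.linalg.norm(z) ** 2

# Gradient descent on f(z) = (1/2m) * sum((|a_i^H z|^2 - b_i)^2).
for t in range(500):
    Az = A @ z
    grad = A.conj().T @ ((np.abs(Az) ** 2 - b) * Az) / m
    mu = min(1.0 - np.exp(-t / 330.0), 0.2)      # heuristic step-size ramp
    z = z - (mu / z0_norm2) * grad

# The solution is defined only up to a global phase; align before comparing.
phase = np.vdot(z, z_true) / abs(np.vdot(z, z_true))
print("relative error:", np.linalg.norm(z * phase - z_true) / np.linalg.norm(z_true))
```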
Sengupta Chattopadhyay, Amrita; Hsiao, Ching-Lin; Chang, Chien Ching; Lian, Ie-Bin; Fann, Cathy S J
2014-01-01
Identifying susceptibility genes that influence complex diseases is extremely difficult because loci often influence the disease state through genetic interactions. Numerous approaches to detect disease-associated SNP-SNP interactions have been developed, but none consistently generates high-quality results under different disease scenarios. Using summarizing techniques to combine a number of existing methods may provide a solution to this problem. Here we used three popular non-parametric methods (Gini, absolute probability difference (APD), and entropy) to develop two novel summary scores, namely the principal component score (PCS) and the Z-sum score (ZSS), with which to predict disease-associated genetic interactions. We used a simulation study to compare the performance of the non-parametric scores, the summary scores, the scaled-sum score (SSS; used in polymorphism interaction analysis (PIA)), and multifactor dimensionality reduction (MDR). The non-parametric methods achieved high power, but no non-parametric method outperformed all others under a variety of epistatic scenarios. PCS and ZSS, however, outperformed MDR. PCS, ZSS and SSS displayed controlled type-I errors (<0.05) compared to GS, APDS, and ES (>0.05). A real data study using the Genetic Analysis Workshop 16 (GAW 16) rheumatoid arthritis dataset identified a number of interesting SNP-SNP interactions. © 2013 Elsevier B.V. All rights reserved.
Human Splice-Site Prediction with Deep Neural Networks.
Naito, Tatsuhiko
2018-04-18
Accurate splice-site prediction is essential to delineate gene structures from sequence data. Several computational techniques have been applied to create systems to predict canonical splice sites. For classification tasks, deep neural networks (DNNs) have achieved record-breaking results and often outperformed other supervised learning techniques. In this study, a new method of splice-site prediction using DNNs was proposed. The proposed system receives an input sequence and returns an answer as to whether it is a splice site. The length of the input is 140 nucleotides, with the consensus sequence (i.e., "GT" and "AG" for the donor and acceptor sites, respectively) in the middle. Each input sequence is applied to the pretrained DNN model, which determines the probability that the input is a splice site. The model consists of convolutional layers and bidirectional long short-term memory network layers. The pretraining and validation were conducted using the data set tested in previously reported methods. The performance evaluation results showed that the proposed method can outperform the previous methods. In addition, the patterns learned by the DNNs were visualized as position frequency matrices (PFMs). Some of the PFMs were very similar to the consensus sequence. The trained DNN model and the brief source code for the prediction system are uploaded. Further improvement is expected to follow the further development of DNNs.
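The architecture is described only at a high level (convolutional layers followed by bidirectional LSTM layers over a 140-nucleotide input). A PyTorch sketch consistent with that description is given below; the layer widths, kernel size, and pooling are guesses for illustration, not the published configuration.

```python
# Hypothetical PyTorch sketch of a splice-site classifier in the spirit of the
# description above: one-hot DNA (140 nt x 4 bases) -> Conv1d -> BiLSTM -> sigmoid.
# Layer widths are illustrative, not the published configuration.
import torch
import torch.nn as nn

class SpliceSiteNet(nn.Module):
    def __init__(self, n_filters=64, lstm_hidden=32):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(4, n_filters, kernel_size=9, padding=4),
            nn.ReLU(),
            nn.MaxPool1d(2),
        )
        self.lstm = nn.LSTM(n_filters, lstm_hidden, batch_first=True,
                            bidirectional=True)
        self.head = nn.Linear(2 * lstm_hidden, 1)

    def forward(self, x):             # x: (batch, 4, 140) one-hot sequence
        h = self.conv(x)              # (batch, n_filters, 70)
        h = h.transpose(1, 2)         # (batch, 70, n_filters) for the LSTM
        _, (h_n, _) = self.lstm(h)    # h_n: (2, batch, lstm_hidden)
        h_cat = torch.cat([h_n[0], h_n[1]], dim=1)
        return torch.sigmoid(self.head(h_cat)).squeeze(1)  # P(splice site)

model = SpliceSiteNet()
dummy = torch.zeros(8, 4, 140)
dummy[:, 0, :] = 1.0                  # fake all-"A" batch just to check shapes
print(model(dummy).shape)             # torch.Size([8])
```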
A Cross-Lingual Similarity Measure for Detecting Biomedical Term Translations
Bollegala, Danushka; Kontonatsios, Georgios; Ananiadou, Sophia
2015-01-01
Bilingual dictionaries for technical terms such as biomedical terms are an important resource for machine translation systems as well as for humans who would like to understand a concept described in a foreign language. Often a biomedical term is first proposed in English and later it is manually translated to other languages. Despite the fact that there are large monolingual lexicons of biomedical terms, only a fraction of those term lexicons are translated to other languages. Manually compiling large-scale bilingual dictionaries for technical domains is a challenging task because it is difficult to find a sufficiently large number of bilingual experts. We propose a cross-lingual similarity measure for detecting most similar translation candidates for a biomedical term specified in one language (source) from another language (target). Specifically, a biomedical term in a language is represented using two types of features: (a) intrinsic features that consist of character n-grams extracted from the term under consideration, and (b) extrinsic features that consist of unigrams and bigrams extracted from the contextual windows surrounding the term under consideration. We propose a cross-lingual similarity measure using each of those feature types. First, to reduce the dimensionality of the feature space in each language, we propose prototype vector projection (PVP)—a non-negative lower-dimensional vector projection method. Second, we propose a method to learn a mapping between the feature spaces in the source and target language using partial least squares regression (PLSR). The proposed method requires only a small number of training instances to learn a cross-lingual similarity measure. The proposed PVP method outperforms popular dimensionality reduction methods such as the singular value decomposition (SVD) and non-negative matrix factorization (NMF) in a nearest neighbor prediction task. Moreover, our experimental results covering several language pairs such as English–French, English–Spanish, English–Greek, and English–Japanese show that the proposed method outperforms several other feature projection methods in biomedical term translation prediction tasks. PMID:26030738
District Learning Tied to Student Learning
ERIC Educational Resources Information Center
McFadden, Ledyard
2009-01-01
Winners and finalists for the annual Broad Prize for Urban Education have consistently outperformed peer districts serving similar student populations. What makes the difference? These districts consistently demonstrate a learning loop that influences the district's ability to learn, which ultimately influences student opportunities to learn.…
Gene expression inference with deep learning.
Chen, Yifei; Li, Yi; Narayan, Rajiv; Subramanian, Aravind; Xie, Xiaohui
2016-06-15
Large-scale gene expression profiling has been widely used to characterize cellular states in response to various disease conditions, genetic perturbations, etc. Although the cost of whole-genome expression profiles has been dropping steadily, generating a compendium of expression profiling over thousands of samples is still very expensive. Recognizing that gene expressions are often highly correlated, researchers from the NIH LINCS program have developed a cost-effective strategy of profiling only ∼1000 carefully selected landmark genes and relying on computational methods to infer the expression of remaining target genes. However, the computational approach adopted by the LINCS program is currently based on linear regression (LR), limiting its accuracy since it does not capture complex nonlinear relationship between expressions of genes. We present a deep learning method (abbreviated as D-GEX) to infer the expression of target genes from the expression of landmark genes. We used the microarray-based Gene Expression Omnibus dataset, consisting of 111K expression profiles, to train our model and compare its performance to those from other methods. In terms of mean absolute error averaged across all genes, deep learning significantly outperforms LR with 15.33% relative improvement. A gene-wise comparative analysis shows that deep learning achieves lower error than LR in 99.97% of the target genes. We also tested the performance of our learned model on an independent RNA-Seq-based GTEx dataset, which consists of 2921 expression profiles. Deep learning still outperforms LR with 6.57% relative improvement, and achieves lower error in 81.31% of the target genes. D-GEX is available at https://github.com/uci-cbcl/D-GEX CONTACT: xhx@ics.uci.edu Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
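As a toy illustration of the landmark-to-target inference task, the sketch below compares a multi-output neural-network regressor with the linear-regression baseline on synthetic data; the dimensions (50 landmark and 200 target genes here) and architecture are placeholders, far smaller than D-GEX's actual setting.

```python
# Minimal sketch of landmark-to-target expression inference: a multi-output
# neural network regressor versus the linear-regression baseline, on synthetic
# stand-in data (the real setting maps ~1000 landmarks to ~10,000 targets).
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
n, n_landmark, n_target = 2000, 50, 200
X = rng.standard_normal((n, n_landmark))                       # landmark genes
W = rng.standard_normal((n_landmark, n_target))
y = np.tanh(X @ W) + 0.1 * rng.standard_normal((n, n_target))  # nonlinear targets

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

lr = LinearRegression().fit(X_tr, y_tr)
mlp = MLPRegressor(hidden_layer_sizes=(256, 256), max_iter=300,
                   random_state=0).fit(X_tr, y_tr)

for name, model in [("LR", lr), ("MLP", mlp)]:
    mae = mean_absolute_error(y_te, model.predict(X_te))
    print(f"{name}: MAE averaged over target genes = {mae:.4f}")
```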
Gene expression inference with deep learning
Chen, Yifei; Li, Yi; Narayan, Rajiv; Subramanian, Aravind; Xie, Xiaohui
2016-01-01
Motivation: Large-scale gene expression profiling has been widely used to characterize cellular states in response to various disease conditions, genetic perturbations, etc. Although the cost of whole-genome expression profiles has been dropping steadily, generating a compendium of expression profiling over thousands of samples is still very expensive. Recognizing that gene expressions are often highly correlated, researchers from the NIH LINCS program have developed a cost-effective strategy of profiling only ∼1000 carefully selected landmark genes and relying on computational methods to infer the expression of remaining target genes. However, the computational approach adopted by the LINCS program is currently based on linear regression (LR), limiting its accuracy since it does not capture complex nonlinear relationship between expressions of genes. Results: We present a deep learning method (abbreviated as D-GEX) to infer the expression of target genes from the expression of landmark genes. We used the microarray-based Gene Expression Omnibus dataset, consisting of 111K expression profiles, to train our model and compare its performance to those from other methods. In terms of mean absolute error averaged across all genes, deep learning significantly outperforms LR with 15.33% relative improvement. A gene-wise comparative analysis shows that deep learning achieves lower error than LR in 99.97% of the target genes. We also tested the performance of our learned model on an independent RNA-Seq-based GTEx dataset, which consists of 2921 expression profiles. Deep learning still outperforms LR with 6.57% relative improvement, and achieves lower error in 81.31% of the target genes. Availability and implementation: D-GEX is available at https://github.com/uci-cbcl/D-GEX. Contact: xhx@ics.uci.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26873929
Text-line extraction in handwritten Chinese documents based on an energy minimization framework.
Koo, Hyung Il; Cho, Nam Ik
2012-03-01
Text-line extraction in unconstrained handwritten documents remains a challenging problem due to nonuniform character scale, spatially varying text orientation, and the interference between text lines. In order to address these problems, we propose a new cost function that considers the interactions between text lines and the curvilinearity of each text line. Precisely, we achieve this goal by introducing normalized measures for them, which are based on an estimated line spacing. We also present an optimization method that exploits the properties of our cost function. Experimental results on a database consisting of 853 handwritten Chinese document images have shown that our method achieves a detection rate of 99.52% and an error rate of 0.32%, which outperforms conventional methods.
Parallel tempering for the traveling salesman problem
DOE Office of Scientific and Technical Information (OSTI.GOV)
Percus, Allon; Wang, Richard; Hyman, Jeffrey
We explore the potential of parallel tempering as a combinatorial optimization method, applying it to the traveling salesman problem. We compare simulation results of parallel tempering with a benchmark implementation of simulated annealing, and study how different choices of parameters affect the relative performance of the two methods. We find that a straightforward implementation of parallel tempering can outperform simulated annealing in several crucial respects. When parameters are chosen appropriately, both methods yield close approximations to the actual minimum distance for an instance with 200 nodes. However, parallel tempering yields more consistently accurate results when a series of independent simulations is performed. Our results suggest that parallel tempering might offer a simple but powerful alternative to simulated annealing for combinatorial optimization problems.
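The study above reports results but not pseudocode; the following is a generic parallel-tempering sketch for the TSP, with Metropolis 2-opt moves inside each replica and periodic swap attempts between neighboring temperatures. The temperature ladder, sweep count, and swap interval are arbitrary illustrative choices, not the parameters investigated in the paper.

```python
# Generic parallel-tempering sketch for the TSP: several replicas run
# Metropolis sampling with 2-opt moves at different temperatures, and
# neighboring replicas periodically attempt to swap tours.
import math
import random

def tour_length(tour, dist):
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))

def two_opt_delta(tour, i, j, dist):
    a, b = tour[i], tour[(i + 1) % len(tour)]
    c, d = tour[j], tour[(j + 1) % len(tour)]
    return dist[a][c] + dist[b][d] - dist[a][b] - dist[c][d]

def parallel_tempering_tsp(dist, temps=(0.5, 1.0, 2.0, 4.0), sweeps=2000, seed=0):
    rng = random.Random(seed)
    n = len(dist)
    tours = [rng.sample(range(n), n) for _ in temps]
    lengths = [tour_length(t, dist) for t in tours]
    for sweep in range(sweeps):
        for k, T in enumerate(temps):          # Metropolis 2-opt move per replica
            i, j = sorted(rng.sample(range(n), 2))
            delta = two_opt_delta(tours[k], i, j, dist)
            if delta < 0 or rng.random() < math.exp(-delta / T):
                tours[k][i + 1:j + 1] = reversed(tours[k][i + 1:j + 1])
                lengths[k] += delta
        if sweep % 10 == 0:                    # attempt swaps between neighbors
            for k in range(len(temps) - 1):
                arg = (lengths[k] - lengths[k + 1]) * (1 / temps[k] - 1 / temps[k + 1])
                if arg >= 0 or rng.random() < math.exp(arg):
                    tours[k], tours[k + 1] = tours[k + 1], tours[k]
                    lengths[k], lengths[k + 1] = lengths[k + 1], lengths[k]
    best = min(range(len(temps)), key=lambda k: lengths[k])
    return tours[best], lengths[best]

# Tiny random instance to exercise the sketch.
rng = random.Random(1)
pts = [(rng.random(), rng.random()) for _ in range(30)]
dist = [[math.dist(p, q) for q in pts] for p in pts]
print("best length found:", round(parallel_tempering_tsp(dist)[1], 3))
```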
NASA Technical Reports Server (NTRS)
Simpson, Timothy W.
1998-01-01
The use of response surface models and kriging models is compared for approximating non-random, deterministic computer analyses. After discussing the traditional response surface approach for constructing polynomial approximation models, kriging is presented as an alternative statistics-based approximation method for the design and analysis of computer experiments. Both approximation methods are applied to the multidisciplinary design and analysis of an aerospike nozzle, which consists of a computational fluid dynamics model and a finite element analysis model. Error analysis of the response surface and kriging models is performed, along with a graphical comparison of the approximations. Four optimization problems are formulated and solved using both approximation models. While neither approximation technique consistently outperforms the other in this example, the kriging models using only a constant for the underlying global model and a Gaussian correlation function perform as well as the second-order polynomial response surface models.
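The two approximation types compared above can be reproduced in miniature with scikit-learn: a second-order polynomial response surface versus a kriging-style Gaussian process with a constant mean and a Gaussian (RBF) correlation function. The test function and sample sizes below are stand-ins for the aerospike-nozzle analyses, which are not reproduced here.

```python
# Illustrative comparison of a quadratic response surface with a kriging-style
# Gaussian process (constant trend, Gaussian/RBF correlation) on a cheap
# deterministic test function standing in for the expensive analyses.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import ConstantKernel, RBF
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

def simulator(X):                       # deterministic stand-in for the analyses
    return np.sin(3 * X[:, 0]) + X[:, 1] ** 2

rng = np.random.default_rng(0)
X_train, X_test = rng.random((30, 2)), rng.random((200, 2))
y_train, y_test = simulator(X_train), simulator(X_test)

rsm = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
krig = GaussianProcessRegressor(
    kernel=ConstantKernel() * RBF(length_scale=[0.3, 0.3]), normalize_y=True)

for name, model in [("quadratic response surface", rsm), ("kriging (RBF)", krig)]:
    model.fit(X_train, y_train)
    rmse = np.sqrt(mean_squared_error(y_test, model.predict(X_test)))
    print(f"{name}: RMSE = {rmse:.4f}")
```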
Ensemble Methods for Classification of Physical Activities from Wrist Accelerometry.
Chowdhury, Alok Kumar; Tjondronegoro, Dian; Chandran, Vinod; Trost, Stewart G
2017-09-01
To investigate whether the use of ensemble learning algorithms improves physical activity recognition accuracy compared to single classifier algorithms, and to compare the classification accuracy achieved by three conventional ensemble machine learning methods (bagging, boosting, random forest) and a custom ensemble model comprising four algorithms commonly used for activity recognition (binary decision tree, k nearest neighbor, support vector machine, and neural network). The study used three independent data sets that included wrist-worn accelerometer data. For each data set, a four-step classification framework consisting of data preprocessing, feature extraction, normalization and feature selection, and classifier training and testing was implemented. For the custom ensemble, decisions from the single classifiers were aggregated using three decision fusion methods: weighted majority vote, naïve Bayes combination, and behavior knowledge space combination. Classifiers were cross-validated using leave-one-subject-out cross-validation and compared on the basis of average F1 scores. In all three data sets, ensemble learning methods consistently outperformed the individual classifiers. Among the conventional ensemble methods, random forest models provided consistently high activity recognition; however, the custom ensemble model using weighted majority voting demonstrated the highest classification accuracy in two of the three data sets. Combining multiple individual classifiers using conventional or custom ensemble learning methods can improve activity recognition accuracy from wrist-worn accelerometer data.
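The custom ensemble described above can be approximated with scikit-learn's VotingClassifier by weighting each base learner with its hold-out F1 score. The sketch below uses a synthetic dataset in place of the accelerometer features, and the weighting scheme is one simple reading of "weighted majority vote", not the study's exact fusion rule.

```python
# Sketch of a weighted-majority-vote ensemble: four base classifiers are
# weighted by their validation F1 scores and combined with VotingClassifier.
# A synthetic dataset stands in for the accelerometer features, and for
# brevity the same hold-out split is reused for weighting and evaluation.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1500, n_features=30, n_informative=10,
                           n_classes=4, n_clusters_per_class=1, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

base = {
    "tree": DecisionTreeClassifier(max_depth=8, random_state=0),
    "knn": make_pipeline(StandardScaler(), KNeighborsClassifier(7)),
    "svm": make_pipeline(StandardScaler(), SVC(random_state=0)),
    "nn": make_pipeline(StandardScaler(), MLPClassifier(max_iter=500, random_state=0)),
}

# Weight each classifier by its hold-out macro F1 score (one simple choice).
weights = []
for name, clf in base.items():
    clf.fit(X_tr, y_tr)
    weights.append(f1_score(y_val, clf.predict(X_val), average="macro"))

ensemble = VotingClassifier(estimators=list(base.items()), voting="hard",
                            weights=weights)
ensemble.fit(X_tr, y_tr)
print("ensemble F1:", f1_score(y_val, ensemble.predict(X_val), average="macro"))
```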
Investigating Gender Differences in Reading
ERIC Educational Resources Information Center
Logan, Sarah; Johnston, Rhona
2010-01-01
Girls consistently outperform boys on tests of reading comprehension, although the reason for this is not clear. In this review, differences between boys and girls in areas relating to reading will be investigated as possible explanations for consistent gender differences in reading attainment. The review will examine gender differences within the…
A Dictionary Learning Method with Total Generalized Variation for MRI Reconstruction
Lu, Hongyang; Wei, Jingbo; Wang, Yuhao; Deng, Xiaohua
2016-01-01
Reconstructing images from their noisy and incomplete measurements is always a challenge, especially for medical MR images with important details and features. This work proposes a novel dictionary learning model that integrates two sparse regularization methods: the total generalized variation (TGV) approach and adaptive dictionary learning (DL). In the proposed method, the TGV selectively regularizes different image regions at different levels, largely avoiding oil-painting artifacts. At the same time, the dictionary learning adaptively represents the image features sparsely and effectively recovers details of images. The proposed model is solved by a variable splitting technique and the alternating direction method of multipliers. Extensive simulation experimental results demonstrate that the proposed method consistently recovers MR images efficiently and outperforms the current state-of-the-art approaches in terms of higher PSNR and lower HFEN values. PMID:27110235
A Dictionary Learning Method with Total Generalized Variation for MRI Reconstruction.
Lu, Hongyang; Wei, Jingbo; Liu, Qiegen; Wang, Yuhao; Deng, Xiaohua
2016-01-01
Reconstructing images from their noisy and incomplete measurements is always a challenge, especially for medical MR images with important details and features. This work proposes a novel dictionary learning model that integrates two sparse regularization methods: the total generalized variation (TGV) approach and adaptive dictionary learning (DL). In the proposed method, the TGV selectively regularizes different image regions at different levels, largely avoiding oil-painting artifacts. At the same time, the dictionary learning adaptively represents the image features sparsely and effectively recovers details of images. The proposed model is solved by a variable splitting technique and the alternating direction method of multipliers. Extensive simulation experimental results demonstrate that the proposed method consistently recovers MR images efficiently and outperforms the current state-of-the-art approaches in terms of higher PSNR and lower HFEN values.
A New Shape Description Method Using Angular Radial Transform
NASA Astrophysics Data System (ADS)
Lee, Jong-Min; Kim, Whoi-Yul
Shape is one of the primary low-level image features in content-based image retrieval. In this paper we propose a new shape description method that consists of a rotationally invariant angular radial transform descriptor (IARTD). The IARTD is a feature vector that combines the magnitude and aligned phases of the angular radial transform (ART) coefficients. A phase correction scheme is employed to produce the aligned phase so that the IARTD is invariant to rotation. The distance between two IARTDs is defined by combining differences in the magnitudes and aligned phases. In an experiment using the MPEG-7 shape dataset, the proposed method outperforms existing methods; the average BEP of the proposed method is 57.69%, while the average BEPs of the invariant Zernike moments descriptor and the traditional ART are 41.64% and 36.51%, respectively.
Genetic Algorithms Evolve Optimized Transforms for Signal Processing Applications
2005-04-01
coefficient sets describing inverse transforms and matched forward/inverse transform pairs that consistently outperform wavelets for image compression and reconstruction applications under conditions subject to quantization error.
Cluster Correspondence Analysis.
van de Velden, M; D'Enza, A Iodice; Palumbo, F
2017-03-01
A method is proposed that combines dimension reduction and cluster analysis for categorical data by simultaneously assigning individuals to clusters and optimal scaling values to categories in such a way that a single between variance maximization objective is achieved. In a unified framework, a brief review of alternative methods is provided and we show that the proposed method is equivalent to GROUPALS applied to categorical data. Performance of the methods is appraised by means of a simulation study. The results of the joint dimension reduction and clustering methods are compared with the so-called tandem approach, a sequential analysis of dimension reduction followed by cluster analysis. The tandem approach is conjectured to perform worse when variables are added that are unrelated to the cluster structure. Our simulation study confirms this conjecture. Moreover, the results of the simulation study indicate that the proposed method also consistently outperforms alternative joint dimension reduction and clustering methods.
Consistently Sampled Correlation Filters with Space Anisotropic Regularization for Visual Tracking
Shi, Guokai; Xu, Tingfa; Luo, Jiqiang; Li, Yuankun
2017-01-01
Most existing correlation filter-based tracking algorithms, which use fixed patches and cyclic shifts as training and detection measures, assume that the training samples are reliable and ignore the inconsistencies between training samples and detection samples. We propose to construct and study a consistently sampled correlation filter with space anisotropic regularization (CSSAR) to solve these two problems simultaneously. Our approach constructs a spatiotemporally consistent sample strategy to alleviate the redundancies in training samples caused by the cyclical shifts, eliminate the inconsistencies between training samples and detection samples, and introduce space anisotropic regularization to constrain the correlation filter for alleviating drift caused by occlusion. Moreover, an optimization strategy based on the Gauss-Seidel method was developed for obtaining robust and efficient online learning. Both qualitative and quantitative evaluations demonstrate that our tracker outperforms state-of-the-art trackers in object tracking benchmarks (OTBs). PMID:29231876
Performance of time-series methods in forecasting the demand for red blood cell transfusion.
Pereira, Arturo
2004-05-01
Planning the future blood collection efforts must be based on adequate forecasts of transfusion demand. In this study, univariate time-series methods were investigated for their performance in forecasting the monthly demand for RBCs at one tertiary-care, university hospital. Three time-series methods were investigated: autoregressive integrated moving average (ARIMA), the Holt-Winters family of exponential smoothing models, and one neural-network-based method. The time series consisted of the monthly demand for RBCs from January 1988 to December 2002 and was divided into two segments: the older one was used to fit or train the models, and the younger to test for the accuracy of predictions. Performance was compared across forecasting methods by calculating goodness-of-fit statistics, the percentage of months in which forecast-based supply would have met the RBC demand (coverage rate), and the outdate rate. The RBC transfusion series was best fitted by a seasonal ARIMA(0,1,1)(0,1,1)(12) model. Over 1-year time horizons, forecasts generated by ARIMA or exponential smoothing laid within the +/- 10 percent interval of the real RBC demand in 79 percent of months (62% in the case of neural networks). The coverage rate for the three methods was 89, 91, and 86 percent, respectively. Over 2-year time horizons, exponential smoothing largely outperformed the other methods. Predictions by exponential smoothing laid within the +/- 10 percent interval of real values in 75 percent of the 24 forecasted months, and the coverage rate was 87 percent. Over 1-year time horizons, predictions of RBC demand generated by ARIMA or exponential smoothing are accurate enough to be of help in the planning of blood collection efforts. For longer time horizons, exponential smoothing outperforms the other forecasting methods.
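For illustration, the two classical forecasters compared above can be fit with statsmodels: a seasonal ARIMA(0,1,1)(0,1,1)12 model and an additive Holt-Winters exponential smoothing model, each trained on the older segment of a monthly series and scored on the held-out final year. The series below is simulated, not the hospital's transfusion data.

```python
# Sketch of seasonal ARIMA and Holt-Winters forecasting of a monthly demand
# series, scored by the share of forecasted months within +/- 10% of actual.
import numpy as np
import pandas as pd
from statsmodels.tsa.holtwinters import ExponentialSmoothing
from statsmodels.tsa.statespace.sarimax import SARIMAX

rng = np.random.default_rng(0)
months = pd.date_range("1988-01-01", "2002-12-01", freq="MS")
trend = np.linspace(900, 1100, len(months))
season = 60 * np.sin(2 * np.pi * months.month / 12)
demand = pd.Series(trend + season + rng.normal(0, 30, len(months)), index=months)

train, test = demand[:-12], demand[-12:]

sarima = SARIMAX(train, order=(0, 1, 1), seasonal_order=(0, 1, 1, 12)).fit(disp=False)
hw = ExponentialSmoothing(train, trend="add", seasonal="add",
                          seasonal_periods=12).fit()

for name, fc in [("SARIMA", sarima.forecast(12)), ("Holt-Winters", hw.forecast(12))]:
    within = (np.abs(fc.values - test.values) / test.values <= 0.10).mean()
    print(f"{name}: {within:.0%} of forecasted months within +/- 10% of actual")
```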
Generalized Fourier analyses of the advection-diffusion equation - Part II: two-dimensional domains
NASA Astrophysics Data System (ADS)
Voth, Thomas E.; Martinez, Mario J.; Christon, Mark A.
2004-07-01
Part I of this work presents a detailed multi-methods comparison of the spatial errors associated with the one-dimensional finite difference, finite element and finite volume semi-discretizations of the scalar advection-diffusion equation. In Part II we extend the analysis to two-dimensional domains and also consider the effects of wave propagation direction and grid aspect ratio on the phase speed, and the discrete and artificial diffusivities. The observed dependence of dispersive and diffusive behaviour on propagation direction makes comparison of methods more difficult relative to the one-dimensional results. For this reason, integrated (over propagation direction and wave number) error and anisotropy metrics are introduced to facilitate comparison among the various methods. With respect to these metrics, the consistent mass Galerkin and consistent mass control-volume finite element methods, and their streamline upwind derivatives, exhibit comparable accuracy, and generally out-perform their lumped mass counterparts and finite-difference based schemes. While this work can only be considered a first step in a comprehensive multi-methods analysis and comparison, it serves to identify some of the relative strengths and weaknesses of multiple numerical methods in a common mathematical framework. Published in 2004 by John Wiley & Sons, Ltd.
An ℓ2,1 norm regularized multi-kernel learning for false positive reduction in lung nodule CAD.
Cao, Peng; Liu, Xiaoli; Zhang, Jian; Li, Wei; Zhao, Dazhe; Huang, Min; Zaiane, Osmar
2017-03-01
The aim of this paper is to describe a novel algorithm for false positive reduction in lung nodule computer-aided detection (CAD). We describe a new CT lung CAD method which aims to detect solid nodules. Specifically, we propose a multi-kernel classifier with an ℓ2,1 norm regularizer for heterogeneous feature fusion and selection at the feature subset level, and design two efficient strategies to optimize the kernel weights in the non-smooth ℓ2,1-regularized multiple kernel learning algorithm. The first optimization algorithm adapts a proximal gradient method for solving the ℓ2,1 norm of the kernel weights and uses an accelerated method based on FISTA; the second one employs an iterative scheme based on an approximate gradient descent method. The results demonstrate that the FISTA-style accelerated proximal descent method is efficient for the ℓ2,1 norm formulation of multiple kernel learning, with a theoretical guarantee on the convergence rate. Moreover, the experimental results demonstrate the effectiveness of the proposed methods in terms of geometric mean (G-mean) and area under the ROC curve (AUC), and the proposed approach significantly outperforms the competing methods. It exhibits remarkable advantages in both the heterogeneous feature subset fusion and classification phases. Compared with fusion strategies at the feature level and decision level, the proposed ℓ2,1 norm multi-kernel learning algorithm is able to accurately fuse complementary and heterogeneous feature sets and automatically prune irrelevant and redundant feature subsets to form a more discriminative feature set, leading to promising classification performance. Moreover, the proposed algorithm consistently outperforms comparable classification approaches in the literature. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
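The central computational step named above, handling the non-smooth ℓ2,1 term, is a group soft-thresholding proximal operator. The snippet below shows that operator inside a bare proximal-gradient (ISTA-style) loop on a plain least-squares loss; it is a schematic of the optimization idea under simplified assumptions, not the authors' multi-kernel CAD pipeline.

```python
# Schematic of the l2,1 proximal operator and a plain proximal-gradient loop:
# each row of W (e.g. the weights tied to one kernel/feature subset) is shrunk
# as a group, which drives irrelevant groups exactly to zero. The quadratic
# loss used here is a stand-in for the actual multi-kernel objective.
import numpy as np

def prox_l21(W, tau):
    """Row-wise group soft-thresholding: prox of tau * sum_g ||W[g, :]||_2."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - tau / np.maximum(norms, 1e-12))
    return scale * W

def proximal_gradient(X, Y, lam=0.1, step=None, iters=300):
    n, d = X.shape
    W = np.zeros((d, Y.shape[1]))
    if step is None:
        step = 1.0 / np.linalg.norm(X, 2) ** 2        # conservative step size
    for _ in range(iters):
        grad = X.T @ (X @ W - Y) / n                  # gradient of 0.5/n * ||XW - Y||^2
        W = prox_l21(W - step * grad, step * lam)
    return W

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 40))
W_true = np.zeros((40, 3))
W_true[:5] = rng.standard_normal((5, 3))              # only 5 active groups
Y = X @ W_true + 0.01 * rng.standard_normal((200, 3))
W_hat = proximal_gradient(X, Y, lam=0.05)
print("rows selected:", int((np.linalg.norm(W_hat, axis=1) > 1e-8).sum()))
```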
Identifying Wrist Fracture Patients with High Accuracy by Automatic Categorization of X-ray Reports
de Bruijn, Berry; Cranney, Ann; O’Donnell, Siobhan; Martin, Joel D.; Forster, Alan J.
2006-01-01
The authors performed this study to determine the accuracy of several text classification methods to categorize wrist x-ray reports. We randomly sampled 751 textual wrist x-ray reports. Two expert reviewers rated the presence (n = 301) or absence (n = 450) of an acute fracture of wrist. We developed two information retrieval (IR) text classification methods and a machine learning method using a support vector machine (TC-1). In cross-validation on the derivation set (n = 493), TC-1 outperformed the two IR based methods and six benchmark classifiers, including Naive Bayes and a Neural Network. In the validation set (n = 258), TC-1 demonstrated consistent performance with 93.8% accuracy; 95.5% sensitivity; 92.9% specificity; and 87.5% positive predictive value. TC-1 was easy to implement and superior in performance to the other classification methods. PMID:16929046
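The abstract does not specify TC-1's configuration, so the snippet below is only a generic example of the same family of approach: bag-of-words TF-IDF features feeding a linear support vector machine, with invented placeholder reports.

```python
# Generic sketch of SVM-based radiology-report classification (TF-IDF features
# + linear SVM). The two example reports are invented; the study's TC-1
# configuration and corpus are not reproduced here.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

reports = [
    "Acute transverse fracture of the distal radius with dorsal angulation.",
    "No acute fracture or dislocation. Soft tissues unremarkable.",
] * 50                                  # placeholder corpus
labels = [1, 0] * 50                    # 1 = acute wrist fracture present

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2), min_df=2),
                    LinearSVC(C=1.0))
acc = cross_val_score(clf, reports, labels, cv=5).mean()
print(f"cross-validated accuracy: {acc:.3f}")
```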
Real-time anomaly detection for very short-term load forecasting
DOE Office of Scientific and Technical Information (OSTI.GOV)
Luo, Jian; Hong, Tao; Yue, Meng
Although recent load information is critical to very short-term load forecasting (VSTLF), power companies often have difficulties in collecting the most recent load values accurately and in a timely manner for VSTLF applications. This paper tackles the problem of real-time anomaly detection in the most recent load information used by VSTLF. It proposes a model-based anomaly detection method that consists of two components, a dynamic regression model and an adaptive anomaly threshold. The case study is developed using data from ISO New England. This paper demonstrates that the proposed method significantly outperforms three other anomaly detection methods, including two methods commonly used in the field and one state-of-the-art method used by a winning team of the Global Energy Forecasting Competition 2014. Lastly, a general anomaly detection framework is proposed for future research.
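A minimal sketch of the two-component idea, a dynamic regression model plus an adaptive residual threshold, is shown below on simulated hourly load. The regressors (recent lags and hour-of-day terms), the rolling window, and the 4-sigma rule are assumptions for illustration, not the model reported in the paper.

```python
# Schematic anomaly detector for incoming load readings: (1) a dynamic
# regression model predicts the next value from recent lags and hour-of-day
# terms, and (2) an adaptive threshold on the rolling residual spread flags
# suspicious readings. All modeling choices here are illustrative assumptions.
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
idx = pd.date_range("2024-01-01", periods=24 * 60, freq="h")
load = 100 + 20 * np.sin(2 * np.pi * idx.hour / 24) + rng.normal(0, 2, len(idx))
load[-100] += 40                               # inject one corrupted reading

df = pd.DataFrame({"load": load}, index=idx)
for lag in (1, 2, 24):
    df[f"lag{lag}"] = df["load"].shift(lag)
df["hour_sin"] = np.sin(2 * np.pi * df.index.hour / 24)
df["hour_cos"] = np.cos(2 * np.pi * df.index.hour / 24)
df = df.dropna()

X, y = df.drop(columns="load"), df["load"]
model = LinearRegression().fit(X.iloc[:-200], y.iloc[:-200])   # fit on older data

resid = y.iloc[-200:] - model.predict(X.iloc[-200:])
sigma = resid.rolling(48, min_periods=24).std().shift(1)       # adaptive spread
flags = resid.abs() > 4 * sigma                                # assumed 4-sigma rule
print("flagged timestamps:", list(resid.index[flags]))
```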
Real-time anomaly detection for very short-term load forecasting
Luo, Jian; Hong, Tao; Yue, Meng
2018-01-06
Although recent load information is critical to very short-term load forecasting (VSTLF), power companies often have difficulties in collecting the most recent load values accurately and in a timely manner for VSTLF applications. This paper tackles the problem of real-time anomaly detection in the most recent load information used by VSTLF. It proposes a model-based anomaly detection method that consists of two components, a dynamic regression model and an adaptive anomaly threshold. The case study is developed using data from ISO New England. This paper demonstrates that the proposed method significantly outperforms three other anomaly detection methods, including two methods commonly used in the field and one state-of-the-art method used by a winning team of the Global Energy Forecasting Competition 2014. Lastly, a general anomaly detection framework is proposed for future research.
Exploiting salient semantic analysis for information retrieval
NASA Astrophysics Data System (ADS)
Luo, Jing; Meng, Bo; Quan, Changqin; Tu, Xinhui
2016-11-01
Recently, many Wikipedia-based methods have been proposed to improve the performance of different natural language processing (NLP) tasks, such as semantic relatedness computation, text classification and information retrieval. Among these methods, salient semantic analysis (SSA) has been proven to be an effective way to generate conceptual representation for words or documents. However, its feasibility and effectiveness in information retrieval is mostly unknown. In this paper, we study how to efficiently use SSA to improve the information retrieval performance, and propose a SSA-based retrieval method under the language model framework. First, SSA model is adopted to build conceptual representations for documents and queries. Then, these conceptual representations and the bag-of-words (BOW) representations can be used in combination to estimate the language models of queries and documents. The proposed method is evaluated on several standard text retrieval conference (TREC) collections. Experiment results on standard TREC collections show the proposed models consistently outperform the existing Wikipedia-based retrieval methods.
Identification of an Efficient Gene Expression Panel for Glioblastoma Classification
Zelaya, Ivette; Laks, Dan R.; Zhao, Yining; Kawaguchi, Riki; Gao, Fuying; Kornblum, Harley I.; Coppola, Giovanni
2016-01-01
We present here a novel genetic algorithm-based random forest (GARF) modeling technique that enables a reduction in the complexity of large gene disease signatures to highly accurate, greatly simplified gene panels. When applied to 803 glioblastoma multiforme samples, this method allowed the 840-gene Verhaak et al. gene panel (the standard in the field) to be reduced to a 48-gene classifier, while retaining 90.91% classification accuracy, and outperforming the best available alternative methods. Additionally, using this approach we produced a 32-gene panel which allows for better consistency between RNA-seq and microarray-based classifications, improving cross-platform classification retention from 69.67% to 86.07%. A webpage producing these classifications is available at http://simplegbm.semel.ucla.edu. PMID:27855170
Community Detection on the GPU
DOE Office of Scientific and Technical Information (OSTI.GOV)
Naim, Md; Manne, Fredrik; Halappanavar, Mahantesh
We present and evaluate a new GPU algorithm based on the Louvain method for community detection. Our algorithm is the first for this problem that parallelizes the access to individual edges. In this way we can fine tune the load balance when processing networks with nodes of highly varying degrees. This is achieved by scaling the number of threads assigned to each node according to its degree. Extensive experiments show that we obtain speedups up to a factor of 270 compared to the sequential algorithm. The algorithm consistently outperforms other recent shared memory implementations and is only one order of magnitude slower than the current fastest parallel Louvain method running on a Blue Gene/Q supercomputer using more than 500K threads.
Minimal entropy approximation for cellular automata
NASA Astrophysics Data System (ADS)
Fukś, Henryk
2014-02-01
We present a method for the construction of approximate orbits of measures under the action of cellular automata which is complementary to the local structure theory. The local structure theory is based on the idea of Bayesian extension, that is, construction of a probability measure consistent with given block probabilities and maximizing entropy. If instead of maximizing entropy one minimizes it, one can develop another method for the construction of approximate orbits, at the heart of which is the iteration of finite-dimensional maps, called minimal entropy maps. We present numerical evidence that the minimal entropy approximation sometimes outperforms the local structure theory in characterizing the properties of cellular automata. The density response curve for elementary CA rule 26 is used to illustrate this claim.
NASA Astrophysics Data System (ADS)
Woldegiorgis, Befekadu Taddesse; van Griensven, Ann; Pereira, Fernando; Bauwens, Willy
2017-06-01
Most common numerical solutions used in CSTR-based in-stream water quality simulators are susceptible to instabilities and/or solution inconsistencies. Usually, they cope with instability problems by adopting computationally expensive small time steps. However, some simulators use fixed computation time steps and hence do not have the flexibility to do so. This paper presents a novel quasi-analytical solution for CSTR-based water quality simulators of an unsteady system. The robustness of the new method is compared with the commonly used fourth-order Runge-Kutta methods, the Euler method and three versions of the SWAT model (SWAT2012, SWAT-TCEQ, and ESWAT). The performance of each method is tested for different hypothetical experiments. Besides the hypothetical data, a real case study is used for comparison. The growth factors we derived as stability measures for the different methods and the R-factor—considered as a consistency measure—turned out to be very useful for determining the most robust method. The new method outperformed all the numerical methods used in the hypothetical comparisons. The application for the Zenne River (Belgium) shows that the new method provides stable and consistent BOD simulations whereas the SWAT2012 model is shown to be unstable for the standard daily computation time step. The new method unconditionally simulates robust solutions. Therefore, it is a reliable scheme for CSTR-based water quality simulators that use first-order reaction formulations.
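As background to why a closed-form treatment can be unconditionally stable, consider a single CSTR with first-order decay and inputs held constant over a computation time step; its mass balance integrates exactly (generic notation, not the paper's):

\frac{dC}{dt} = \frac{Q}{V}\bigl(C_{\mathrm{in}} - C\bigr) - kC,
\qquad
C(t_0 + \Delta t) = C_{\infty} + \bigl(C(t_0) - C_{\infty}\bigr)\, e^{-(Q/V + k)\,\Delta t},
\qquad
C_{\infty} = \frac{(Q/V)\,C_{\mathrm{in}}}{Q/V + k}.

Because the exponential factor stays below one for any positive time step, stepping a reactor with a relation of this form cannot blow up, in contrast to explicit schemes such as Euler or Runge-Kutta, whose stability depends on the step size.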
A Reconstructed Discontinuous Galerkin Method for the Euler Equations on Arbitrary Grids
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hong Luo; Luqing Luo; Robert Nourgaliev
2012-11-01
A reconstruction-based discontinuous Galerkin (RDG(P1P2)) method, a variant of the P1P2 method, is presented for the solution of the compressible Euler equations on arbitrary grids. In this method, an in-cell reconstruction, designed to enhance the accuracy of the discontinuous Galerkin method, is used to obtain a quadratic polynomial solution (P2) from the underlying linear polynomial (P1) discontinuous Galerkin solution using a least-squares method. The stencils used in the reconstruction involve only the von Neumann neighborhood (face-neighboring cells) and are compact and consistent with the underlying DG method. The developed RDG method is used to compute a variety of flow problems on arbitrary meshes to demonstrate its accuracy, efficiency, robustness, and versatility. The numerical results indicate that this RDG(P1P2) method is third-order accurate, and outperforms the third-order DG method (DG(P2)) in terms of both computing costs and storage requirements.
Emerging From Water: Underwater Image Color Correction Based on Weakly Supervised Color Transfer
NASA Astrophysics Data System (ADS)
Li, Chongyi; Guo, Jichang; Guo, Chunle
2018-03-01
Underwater vision suffers from severe degradation due to selective attenuation and scattering when light propagates through water. Such degradation not only affects the quality of underwater images but also limits the ability of vision tasks. Different from existing methods, which either ignore the wavelength dependency of the attenuation or assume a specific spectral profile, we tackle the color distortion problem of underwater images from a new view. In this letter, we propose a weakly supervised color transfer method to correct color distortion, which relaxes the need for paired underwater images for training and allows the underwater images to have been taken in unknown locations. Inspired by Cycle-Consistent Adversarial Networks, we design a multi-term loss function including an adversarial loss, a cycle consistency loss, and an SSIM (Structural Similarity Index Measure) loss, which keeps the content and structure of the corrected result the same as those of the input while the color appears as if the image had been taken without the water. Experiments on underwater images captured under diverse scenes show that our method produces visually pleasing results and even outperforms state-of-the-art methods.
Microblog sentiment analysis using social and topic context.
Zou, Xiaomei; Yang, Jing; Zhang, Jianpei
2018-01-01
Analyzing massive user-generated microblogs is crucial in many fields, attracting many researchers. However, it is very challenging to process such noisy and short microblogs. Most prior works only use the text to identify sentiment polarity and assume that microblogs are independent and identically distributed, which ignores the fact that microblogs are networked data. Therefore, their performance is usually not satisfactory. Inspired by two sociological theories (sentimental consistency and emotional contagion), in this paper we propose a new method combining social context and topic context to analyze microblog sentiment. In particular, different from previous work using direct user relations, we introduce structure similarity context into the social context and propose a method to measure structure similarity. In addition, we introduce topic context to model the semantic relations between microblogs. Social context and topic context are combined through the Laplacian matrix of the graph built from these contexts, and Laplacian regularization is added to the microblog sentiment analysis model. Experimental results on two real Twitter datasets demonstrate that our proposed model consistently and significantly outperforms baseline methods.
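In generic notation (ours, not necessarily the paper's), the combination described above amounts to adding a graph-smoothness penalty to the supervised sentiment loss:

\min_{\mathbf{f}} \; \sum_{i \in \mathcal{L}} \ell(f_i, y_i) \; + \; \lambda\, \mathbf{f}^{\top} L\, \mathbf{f},
\qquad
\mathbf{f}^{\top} L\, \mathbf{f} \; = \; \tfrac{1}{2} \sum_{i,j} w_{ij}\,(f_i - f_j)^2,

where f_i is the predicted sentiment score of microblog i, \mathcal{L} indexes the labeled microblogs, L = D - W is the Laplacian of the graph whose edge weights w_{ij} combine the social context (direct relations and structure similarity) and the topic context, and D is the diagonal degree matrix. The penalty pushes connected microblogs toward similar sentiment scores, in line with sentimental consistency and emotional contagion.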
Li, Wei; Cao, Peng; Zhao, Dazhe; Wang, Junbo
2016-01-01
Computer aided detection (CAD) systems can assist radiologists by offering a second opinion on early diagnosis of lung cancer. Classification and feature representation play critical roles in false-positive reduction (FPR) in lung nodule CAD. We design a deep convolutional neural network method for nodule classification, which has the advantages of automatically learned representations and strong generalization ability. A network structure specified for nodule images is proposed to recognize three types of nodules, that is, solid, semisolid, and ground glass opacity (GGO). The deep convolutional neural networks are trained on 62,492 region-of-interest (ROI) samples including 40,772 nodules and 21,720 nonnodules from the Lung Image Database Consortium (LIDC) database. Experimental results demonstrate the effectiveness of the proposed method in terms of sensitivity and overall accuracy and that it consistently outperforms the competing methods.
Retinal artery-vein classification via topology estimation
Estrada, Rolando; Allingham, Michael J.; Mettu, Priyatham S.; Cousins, Scott W.; Tomasi, Carlo; Farsiu, Sina
2015-01-01
We propose a novel, graph-theoretic framework for distinguishing arteries from veins in a fundus image. We make use of the underlying vessel topology to better classify small and midsized vessels. We extend our previously proposed tree topology estimation framework by incorporating expert, domain-specific features to construct a simple, yet powerful global likelihood model. We efficiently maximize this model by iteratively exploring the space of possible solutions consistent with the projected vessels. We tested our method on four retinal datasets and achieved classification accuracies of 91.0%, 93.5%, 91.7%, and 90.9%, outperforming existing methods. Our results show the effectiveness of our approach, which is capable of analyzing the entire vasculature, including peripheral vessels, in wide field-of-view fundus photographs. This topology-based method is a potentially important tool for diagnosing diseases with retinal vascular manifestation. PMID:26068204
Zhang, Geng; Wang, Shuang; Li, Libo; Hu, Xiuqing; Hu, Bingliang
2016-11-01
The lunar spectrum has been used in radiometric calibration and sensor stability monitoring for spaceborne optical sensors. A ground-based large-aperture static image spectrometer (LASIS) can be used to acquire the lunar spectral image for lunar radiance model improvement when the moon orbits over its viewing field. The lunar orbiting behavior is not consistent with the desired scanning speed and direction of LASIS. To correctly extract interferograms from the obtained data, a translation correction method based on image correlation is proposed. This method registers the frames to a reference frame to reduce accumulative errors. Furthermore, we propose a circle-matching-based approach to achieve even higher accuracy during observation of the full moon. To demonstrate the effectiveness of our approaches, experiments are run on true lunar observation data. The results show that the proposed approaches outperform the state-of-the-art methods.
Inconsistent-handed advantage in episodic memory extends to paragraph-level materials.
Prichard, Eric C; Christman, Stephen D
2017-09-01
Past research using handedness as a proxy for functional access to the right hemisphere demonstrates that individuals who are mixed/inconsistently handed outperform strong/consistently handed individuals when performing episodic recall tasks. However, research has generally been restricted to stimuli presented in a list format. In the present paper, we present two studies in which participants were presented with paragraph-level material and then asked to recall material from the passages. The first study was based on a classic study looking at retroactive interference with prose materials. The second was modelled on a classic experiment looking at perspective taking and the content of memory. In both studies, the classic effects were replicated and the general finding that mixed/inconsistent-handers outperform strong/consistent-handers was replicated. This suggests that considering degree of handedness may be an empirically useful means of reducing error variance in paradigms looking at memory for prose level material.
Bakhshaei, Mahsa; Henderson, Rita Isabel
2016-07-28
In French-language secondary schools in Quebec, among all immigrant-origin students, those originating from South Asia have the highest dropout rate. However, girls belonging to this group consistently outperform their male peers of similar ethnic background. This stirs questions about the reasons for this relative outperformance and its linkage with overall wellbeing among these girls. A mixed methods approach guided data collection. It involved in-depth interviews with female and male students of South Asian origin (n = 19) and with individuals holding educational roles in the lives of youth (n = 25). An additional anonymous questionnaire aggregated parent perspectives (n = 36), though this article focuses primarily on qualitative lessons. This article shows three main reasons for why South Asian female adolescents in Quebec French-language secondary schools outperform their male counterparts in schooling attainment: parental expectations after migration, socialization at home, and relationships at school. According to our findings, academic perseverance among these girls does not necessarily translate into their improved wellbeing or their involvement in an advantageous process of acculturation. This study highlights that although gender, ethnicity, and class can create an interlocking system of oppression in certain social spheres for a specific group of women, it can emerge as advantageous in other contexts for the same group. This provides educational policy makers, as well as school and community workers, with guidance and avenues for action that can promote the wellbeing of immigrant-origin girls through involvement in beneficial processes of acculturation aligned with their improved academic performance.
Cao, Buwen; Deng, Shuguang; Qin, Hua; Ding, Pingjian; Chen, Shaopeng; Li, Guanghui
2018-06-15
High-throughput technology has generated large-scale protein interaction data, which is crucial to our understanding of biological organisms. Many complex identification algorithms have been developed to determine protein complexes. However, these methods are only suitable for dense protein interaction networks, because their capabilities decrease rapidly when applied to sparse protein-protein interaction (PPI) networks. In this study, based on penalized matrix decomposition (PMD), a novel method of penalized matrix decomposition for the identification of protein complexes (i.e., PMDpc) was developed to detect protein complexes in the human protein interaction network. This method mainly consists of three steps. First, the adjacency matrix of the protein interaction network is normalized. Second, the normalized matrix is decomposed into three factor matrices. The PMDpc method can detect protein complexes in sparse PPI networks by imposing appropriate constraints on the factor matrices. Finally, the results of our method are compared with those of other methods in the human PPI network. Experimental results show that our method can not only outperform classical algorithms, such as CFinder, ClusterONE, RRW, HC-PIN, and PCE-FR, but can also achieve an ideal overall performance in terms of a composite score consisting of F-measure, accuracy (ACC), and the maximum matching ratio (MMR).
On the Reliability of Individual Brain Activity Networks.
Cassidy, Ben; Bowman, F DuBois; Rae, Caroline; Solo, Victor
2018-02-01
There is intense interest in fMRI research on whole-brain functional connectivity; however, two fundamental issues are still unresolved: the impact of spatiotemporal data resolution (spatial parcellation and temporal sampling) and the impact of the network construction method on the reliability of functional brain networks. In particular, the impact of spatiotemporal data resolution on the resulting connectivity findings has not been sufficiently investigated. In fact, a number of studies have already observed that functional networks often give different conclusions across different parcellation scales. If the interpretations from functional networks are inconsistent across spatiotemporal scales, then the whole validity of the functional network paradigm is called into question. This paper investigates the consistency of resting state network structure when using different temporal sampling or spatial parcellation, or different methods for constructing the networks. To pursue this, we develop a novel network comparison framework based on persistent homology from topological data analysis. We use the new network comparison tools to characterize the spatial and temporal scales under which consistent functional networks can be constructed. The methods are illustrated on Human Connectome Project data, showing that the DISCOH2 network construction method outperforms other approaches at most spatiotemporal data resolutions.
Advanced lattice Boltzmann scheme for high-Reynolds-number magneto-hydrodynamic flows
NASA Astrophysics Data System (ADS)
De Rosis, Alessandro; Lévêque, Emmanuel; Chahine, Robert
2018-06-01
Is the lattice Boltzmann method suitable for numerically investigating high-Reynolds-number magneto-hydrodynamic (MHD) flows? It is shown that a standard approach based on the Bhatnagar-Gross-Krook (BGK) collision operator rapidly yields unstable simulations as the Reynolds number increases. In order to circumvent this limitation, it is suggested here to perform the collision procedure in the space of central moments for the fluid dynamics. Therefore, a hybrid lattice Boltzmann scheme is introduced, which couples a central-moment scheme for the velocity with a BGK scheme for the space-and-time evolution of the magnetic field. This method outperforms the standard approach in terms of stability, allowing us to simulate high-Reynolds-number MHD flows with non-unitary Prandtl number while maintaining accuracy and physical consistency.
Automatic comic page image understanding based on edge segment analysis
NASA Astrophysics Data System (ADS)
Liu, Dong; Wang, Yongtao; Tang, Zhi; Li, Luyuan; Gao, Liangcai
2013-12-01
Comic page image understanding aims to analyse the layout of the comic page images by detecting the storyboards and identifying the reading order automatically. It is the key technique to produce the digital comic documents suitable for reading on mobile devices. In this paper, we propose a novel comic page image understanding method based on edge segment analysis. First, we propose an efficient edge point chaining method to extract Canny edge segments (i.e., contiguous chains of Canny edge points) from the input comic page image; second, we propose a top-down scheme to detect line segments within each obtained edge segment; third, we develop a novel method to detect the storyboards by selecting the border lines and further identify the reading order of these storyboards. The proposed method is performed on a data set consisting of 2000 comic page images from ten printed comic series. The experimental results demonstrate that the proposed method achieves satisfactory results on different comics and outperforms the existing methods.
Robust Methods for Moderation Analysis with a Two-Level Regression Model.
Yang, Miao; Yuan, Ke-Hai
2016-01-01
Moderation analysis has many applications in social sciences. Most widely used estimation methods for moderation analysis assume that errors are normally distributed and homoscedastic. When these assumptions are not met, the results from a classical moderation analysis can be misleading. For more reliable moderation analysis, this article proposes two robust methods with a two-level regression model when the predictors do not contain measurement error. One method is based on maximum likelihood with Student's t distribution and the other is based on M-estimators with Huber-type weights. An algorithm for obtaining the robust estimators is developed. Consistent estimates of standard errors of the robust estimators are provided. The robust approaches are compared against normal-distribution-based maximum likelihood (NML) with respect to power and accuracy of parameter estimates through a simulation study. Results show that the robust approaches outperform NML under various distributional conditions. Application of the robust methods is illustrated through a real data example. An R program is developed and documented to facilitate the application of the robust methods.
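The Huber-type M-estimator can be illustrated with a small iteratively reweighted least squares (IRLS) sketch. The tuning constant c = 1.345 and the MAD-based scale estimate are conventional choices, and the single-level regression with an interaction term below is a simplified stand-in for the article's two-level model, not the authors' exact estimator.

```python
import numpy as np

def huber_weights(r, c=1.345):
    a = np.abs(r)
    w = np.ones_like(a)
    w[a > c] = c / a[a > c]          # downweight large standardized residuals
    return w

def irls_huber(X, y, n_iter=50, c=1.345):
    beta = np.linalg.lstsq(X, y, rcond=None)[0]       # OLS start
    for _ in range(n_iter):
        r = y - X @ beta
        s = np.median(np.abs(r)) / 0.6745 + 1e-12     # robust scale (MAD)
        w = huber_weights(r / s, c)
        beta = np.linalg.solve(X.T @ (X * w[:, None]), X.T @ (w * y))
    return beta

# toy moderation model: y = b0 + b1*x + b2*m + b3*x*m + heteroscedastic error
rng = np.random.default_rng(0)
x, m = rng.standard_normal(200), rng.standard_normal(200)
y = 1.0 + 0.5 * x + 0.3 * m + 0.4 * x * m + rng.standard_normal(200) * (1 + 0.5 * np.abs(m))
X = np.column_stack([np.ones(200), x, m, x * m])
print(irls_huber(X, y))
```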
NASA Astrophysics Data System (ADS)
Zhou, Wei-Xing; Sornette, Didier
2007-07-01
We have recently introduced the “thermal optimal path” (TOP) method to investigate the real-time lead-lag structure between two time series. The TOP method consists in searching for a robust noise-averaged optimal path of the distance matrix along which the two time series have the greatest similarity. Here, we generalize the TOP method by introducing a more general definition of distance which takes into account possible regime shifts between positive and negative correlations. This generalization to track possible changes of correlation signs is able to identify possible transitions from one convention (or consensus) to another. Numerical simulations on synthetic time series verify that the new TOP method performs as expected even in the presence of substantial noise. We then apply it to investigate changes of convention in the dependence structure between the historical volatilities of the USA inflation rate and economic growth rate. Several measures show that the new TOP method significantly outperforms standard cross-correlation methods.
Brain tumor classification and segmentation using sparse coding and dictionary learning.
Salman Al-Shaikhli, Saif Dawood; Yang, Michael Ying; Rosenhahn, Bodo
2016-08-01
This paper presents a novel fully automatic framework for multi-class brain tumor classification and segmentation using a sparse coding and dictionary learning method. The proposed framework consists of two steps: classification and segmentation. The classification of the brain tumors is based on brain topology and texture. The segmentation is based on voxel values of the image data. Using K-SVD, two types of dictionaries are learned from the training data and their associated ground truth segmentation: feature dictionary and voxel-wise coupled dictionaries. The feature dictionary consists of global image features (topological and texture features). The coupled dictionaries consist of coupled information: gray scale voxel values of the training image data and their associated label voxel values of the ground truth segmentation of the training data. For quantitative evaluation, the proposed framework is evaluated using different metrics. The segmentation results of the brain tumor segmentation (MICCAI-BraTS-2013) database are evaluated using five different metric scores, which are computed using the online evaluation tool provided by the BraTS-2013 challenge organizers. Experimental results demonstrate that the proposed approach achieves an accurate brain tumor classification and segmentation and outperforms the state-of-the-art methods.
Multiscale site-response mapping: A case study of Parkfield, California
Thompson, E.M.; Baise, L.G.; Kayen, R.E.; Morgan, E.C.; Kaklamanos, J.
2011-01-01
The scale of previously proposed methods for mapping site response ranges from global coverage down to individual urban regions. Typically, spatial coverage and accuracy are inversely related. We use the densely spaced strong-motion stations in Parkfield, California, to estimate the accuracy of different site-response mapping methods and demonstrate a method for integrating multiple site-response estimates from the site to the global scale. This method is simply a weighted mean of a suite of different estimates, where the weights are the inverse of the variance of the individual estimates. Thus, the dominant site-response model varies in space as a function of the accuracy of the different models. For mapping applications, site-response models should be judged in terms of both spatial coverage and the degree of correlation with observed amplifications. Performance varies with period, but in general the Parkfield data show that: (1) where a velocity profile is available, the square-root-of-impedance (SRI) method outperforms the measured VS30 (30 m divided by the S-wave travel time to 30 m depth) and (2) where velocity profiles are unavailable, the topographic slope method outperforms surficial geology for short periods, but geology outperforms slope at longer periods. We develop new equations to estimate site response from topographic slope, derived from the Next Generation Attenuation (NGA) database.
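The weighting scheme reduces to an inverse-variance weighted mean, as in the minimal sketch below; the example amplification estimates and variances are made-up illustrative values, not data from the study.

```python
import numpy as np

def combine_estimates(estimates, variances):
    """Inverse-variance weighted mean of several site-response estimates."""
    w = 1.0 / np.asarray(variances, dtype=float)
    return float(np.sum(w * np.asarray(estimates, dtype=float)) / np.sum(w))

# e.g. SRI-, Vs30- and slope-based amplification estimates at one site
print(combine_estimates([1.8, 2.1, 1.6], [0.04, 0.09, 0.25]))
```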
Yang, Tzuhsiung; Berry, John F
2018-06-04
The computation of nuclear second derivatives of the energy, or the nuclear Hessian, is an essential routine in quantum chemical investigations of ground and transition states, thermodynamic calculations, and molecular vibrations. Analytic nuclear Hessian computations require the resolution of costly coupled-perturbed self-consistent field (CP-SCF) equations, while numerical differentiation of analytic first derivatives has an unfavorable 6N (N = number of atoms) prefactor. Herein, we present a new method in which grid computing is used to accelerate and/or enable the evaluation of the nuclear Hessian via numerical differentiation: NUMFREQ@Grid. Nuclear Hessians were successfully evaluated by NUMFREQ@Grid at the DFT level as well as using RIJCOSX-ZORA-MP2 or RIJCOSX-ZORA-B2PLYP for a set of linear polyacenes with systematically increasing size. For the larger members of this group, NUMFREQ@Grid was found to outperform the wall clock time of analytic Hessian evaluation; at the MP2 or B2PLYP levels, these Hessians cannot even be evaluated analytically. We also evaluated a 156-atom catalytically relevant open-shell transition metal complex and found that NUMFREQ@Grid is faster (7.7 times shorter wall clock time) and less demanding (4.4 times lower memory requirement) than an analytic Hessian. Capitalizing on the capabilities of parallel grid computing, NUMFREQ@Grid can outperform analytic methods in terms of wall time, memory requirements, and treatable system size. The NUMFREQ@Grid method presented herein demonstrates how grid computing can be used to facilitate embarrassingly parallel computational procedures and is a pioneer for future implementations.
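A hedged sketch of the underlying numerical idea: a central-difference Hessian in which every matrix entry depends only on a handful of independent function evaluations, so the work can be farmed out to a grid. Differentiating energies twice (rather than analytic gradients once, as NUMFREQ@Grid does) and the step size h are simplifying assumptions for illustration.

```python
import numpy as np

def numerical_hessian(f, x, h=1e-4):
    """Central-difference Hessian; each (i, j) entry is independent, so the
    four function calls per entry are embarrassingly parallel across a grid."""
    n = len(x)
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(i, n):
            e_i = np.zeros(n); e_i[i] = h
            e_j = np.zeros(n); e_j[j] = h
            H[i, j] = (f(x + e_i + e_j) - f(x + e_i - e_j)
                       - f(x - e_i + e_j) + f(x - e_i - e_j)) / (4.0 * h * h)
            H[j, i] = H[i, j]
    return H

# toy energy surface f(x) = x0^2 + 3*x0*x1 + 2*x1^2, exact Hessian [[2, 3], [3, 4]]
print(numerical_hessian(lambda v: v[0]**2 + 3*v[0]*v[1] + 2*v[1]**2, np.array([0.1, -0.2])))
```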
Validity of Willingness to Pay Measures under Preference Uncertainty
Braun, Carola; Rehdanz, Katrin; Schmidt, Ulrich
2016-01-01
Recent studies in the marketing literature developed a new method for eliciting willingness to pay (WTP) with an open-ended elicitation format: the Range-WTP method. In contrast to the traditional approach of eliciting WTP as a single value (Point-WTP), Range-WTP explicitly allows for preference uncertainty in responses. The aim of this paper is to apply Range-WTP to the domain of contingent valuation and to test for its theoretical validity and robustness in comparison to the Point-WTP. Using data from two novel large-scale surveys on the perception of solar radiation management (SRM), a little-known technique for counteracting climate change, we compare the performance of both methods in the field. In addition to the theoretical validity (i.e. the degree to which WTP values are consistent with theoretical expectations), we analyse the test-retest reliability and stability of our results over time. Our evidence suggests that the Range-WTP method clearly outperforms the Point-WTP method. PMID:27096163
Infrared and visible image fusion with spectral graph wavelet transform.
Yan, Xiang; Qin, Hanlin; Li, Jia; Zhou, Huixin; Zong, Jing-guo
2015-09-01
Infrared and visible image fusion technique is a popular topic in image analysis because it can integrate complementary information and obtain reliable and accurate description of scenes. Multiscale transform theory as a signal representation method is widely used in image fusion. In this paper, a novel infrared and visible image fusion method is proposed based on spectral graph wavelet transform (SGWT) and bilateral filter. The main novelty of this study is that SGWT is used for image fusion. On the one hand, source images are decomposed by SGWT in its transform domain. The proposed approach not only effectively preserves the details of different source images, but also excellently represents the irregular areas of the source images. On the other hand, a novel weighted average method based on bilateral filter is proposed to fuse low- and high-frequency subbands by taking advantage of spatial consistency of natural images. Experimental results demonstrate that the proposed method outperforms seven recently proposed image fusion methods in terms of both visual effect and objective evaluation metrics.
Combining multiple tools outperforms individual methods in gene set enrichment analyses.
Alhamdoosh, Monther; Ng, Milica; Wilson, Nicholas J; Sheridan, Julie M; Huynh, Huy; Wilson, Michael J; Ritchie, Matthew E
2017-02-01
Gene set enrichment (GSE) analysis allows researchers to efficiently extract biological insight from long lists of differentially expressed genes by interrogating them at a systems level. In recent years, there has been a proliferation of GSE analysis methods and hence it has become increasingly difficult for researchers to select an optimal GSE tool based on their particular dataset. Moreover, the majority of GSE analysis methods do not allow researchers to simultaneously compare gene set level results between multiple experimental conditions. The ensemble of gene set enrichment analyses (EGSEA) is a method developed for RNA-sequencing data that combines results from twelve algorithms and calculates collective gene set scores to improve the biological relevance of the highest ranked gene sets. EGSEA's gene set database contains around 25 000 gene sets from sixteen collections. It has multiple visualization capabilities that allow researchers to view gene sets at various levels of granularity. EGSEA has been tested on simulated data and on a number of human and mouse datasets and, based on biologists' feedback, consistently outperforms the individual tools that have been combined. Our evaluation demonstrates the superiority of the ensemble approach for GSE analysis, and its utility to effectively and efficiently extrapolate biological functions and potential involvement in disease processes from lists of differentially regulated genes. EGSEA is available as an R package at http://www.bioconductor.org/packages/EGSEA/ . The gene set collections are available in the R package EGSEAdata from http://www.bioconductor.org/packages/EGSEAdata/ . monther.alhamdoosh@csl.com.au mritchie@wehi.edu.au. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
Example-Based Image Colorization Using Locality Consistent Sparse Representation.
Bo Li; Fuchen Zhao; Zhuo Su; Xiangguo Liang; Yu-Kun Lai; Rosin, Paul L
2017-11-01
Image colorization aims to produce a natural looking color image from a given gray-scale image, which remains a challenging problem. In this paper, we propose a novel example-based image colorization method exploiting a new locality consistent sparse representation. Given a single reference color image, our method automatically colorizes the target gray-scale image by sparse pursuit. For efficiency and robustness, our method operates at the superpixel level. We extract low-level intensity features, mid-level texture features, and high-level semantic features for each superpixel, which are then concatenated to form its descriptor. The collection of feature vectors for all the superpixels from the reference image composes the dictionary. We formulate colorization of target superpixels as a dictionary-based sparse reconstruction problem. Inspired by the observation that superpixels with similar spatial location and/or feature representation are likely to match spatially close regions from the reference image, we further introduce a locality promoting regularization term into the energy formulation, which substantially improves the matching consistency and subsequent colorization results. Target superpixels are colorized based on the chrominance information from the dominant reference superpixels. Finally, to further improve coherence while preserving sharpness, we develop a new edge-preserving filter for chrominance channels with the guidance from the target gray-scale image. To the best of our knowledge, this is the first work on sparse pursuit image colorization from single reference images. Experimental results demonstrate that our colorization method outperforms the state-of-the-art methods, both visually and quantitatively using a user study.
Non-linear eigensolver-based alternative to traditional SCF methods
NASA Astrophysics Data System (ADS)
Gavin, B.; Polizzi, E.
2013-05-01
The self-consistent procedure in electronic structure calculations is revisited using a highly efficient and robust algorithm for solving the non-linear eigenvector problem, i.e., H(ψ)ψ = Eψ. This new scheme is derived from a generalization of the FEAST eigenvalue algorithm to account for the non-linearity of the Hamiltonian with the occupied eigenvectors. Using a series of numerical examples and the density functional theory Kohn-Sham model, it will be shown that our approach can outperform the traditional SCF mixing-scheme techniques by providing a higher convergence rate, convergence to the correct solution regardless of the choice of the initial guess, and a significant reduction of the eigenvalue solve time in simulations.
Sim, K S; Teh, V; Tey, Y C; Kho, T K
2016-11-01
This paper introduces a new technique to improve Scanning Electron Microscope (SEM) image quality, which we name sub-blocking multiple peak histogram equalization (SUB-B-MPHE) with a convolution operator. Using this new technique, the modified MPHE performs better than the original MPHE. In addition, the sub-blocking method incorporates a convolution operator that helps to remove the blocking effect in SEM images after applying the new technique. Hence, by using the convolution operator, the blocking effect is effectively removed by properly distributing suitable pixel values over the whole image. Overall, SUB-B-MPHE with convolution outperforms the other methods. SCANNING 38:492-501, 2016. © 2015 Wiley Periodicals, Inc.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Aliaga, José I., E-mail: aliaga@uji.es; Alonso, Pedro; Badía, José M.
We introduce a new iterative Krylov subspace-based eigensolver for the simulation of macromolecular motions on desktop multithreaded platforms equipped with multicore processors and, possibly, a graphics accelerator (GPU). The method consists of two stages, with the original problem first reduced into a simpler band-structured form by means of a high-performance compute-intensive procedure. This is followed by a memory-intensive but low-cost Krylov iteration, which is off-loaded to be computed on the GPU by means of an efficient data-parallel kernel. The experimental results reveal the performance of the new eigensolver. Concretely, when applied to the simulation of macromolecules with a few thousand degrees of freedom and the number of eigenpairs to be computed is small to moderate, the new solver outperforms other methods implemented as part of high-performance numerical linear algebra packages for multithreaded architectures.
Sensitivity-based virtual fields for the non-linear virtual fields method
NASA Astrophysics Data System (ADS)
Marek, Aleksander; Davis, Frances M.; Pierron, Fabrice
2017-09-01
The virtual fields method is an approach to inversely identify material parameters using full-field deformation data. In this manuscript, a new set of automatically-defined virtual fields for non-linear constitutive models has been proposed. These new sensitivity-based virtual fields reduce the influence of noise on the parameter identification. The sensitivity-based virtual fields were applied to a numerical example involving small strain plasticity; however, the general formulation derived for these virtual fields is applicable to any non-linear constitutive model. To quantify the improvement offered by these new virtual fields, they were compared with stiffness-based and manually defined virtual fields. The proposed sensitivity-based virtual fields were consistently able to identify plastic model parameters and outperform the stiffness-based and manually defined virtual fields when the data was corrupted by noise.
ERIC Educational Resources Information Center
Feldman, Sandra
2004-01-01
One of the most comprehensive studies to highlight the achievement gap between students in the United States and those in many other countries is the Third International Mathematics and Science Study (TIMSS). TIMSS showed that a handful of countries (with Japan near the top) consistently outperformed the others studied (including the United…
A deep convolutional neural network for recognizing foods
NASA Astrophysics Data System (ADS)
Jahani Heravi, Elnaz; Habibi Aghdam, Hamed; Puig, Domenec
2015-12-01
Controlling food intake is an efficient way for each person to tackle the obesity problem in countries worldwide. This is achievable by developing a smartphone application that is able to recognize foods and compute their calories. State-of-the-art methods are chiefly based on hand-crafted feature extraction methods such as HOG and Gabor. Recent advances in large-scale object recognition datasets such as ImageNet have revealed that deep Convolutional Neural Networks (CNN) possess more representation power than the hand-crafted features. The main challenge with CNNs is to find the appropriate architecture for each problem. In this paper, we propose a deep CNN which consists of 769,988 parameters. Our experiments show that the proposed CNN outperforms the state-of-the-art methods and improves the best result of traditional methods by 17%. Moreover, using an ensemble of two CNNs that have been trained two different times, we are able to improve the classification performance by 21.5%.
Calibrated Multivariate Regression with Application to Neural Semantic Basis Discovery.
Liu, Han; Wang, Lie; Zhao, Tuo
2015-08-01
We propose a calibrated multivariate regression method named CMR for fitting high dimensional multivariate regression models. Compared with existing methods, CMR calibrates regularization for each regression task with respect to its noise level so that it simultaneously attains improved finite-sample performance and tuning insensitiveness. Theoretically, we provide sufficient conditions under which CMR achieves the optimal rate of convergence in parameter estimation. Computationally, we propose an efficient smoothed proximal gradient algorithm with a worst-case numerical rate of convergence O(1/ε), where ε is a pre-specified accuracy of the objective function value. We conduct thorough numerical simulations to illustrate that CMR consistently outperforms other high dimensional multivariate regression methods. We also apply CMR to solve a brain activity prediction problem and find that it is as competitive as a handcrafted model created by human experts. The R package camel implementing the proposed method is available on the Comprehensive R Archive Network http://cran.r-project.org/web/packages/camel/.
Script-independent text line segmentation in freestyle handwritten documents.
Li, Yi; Zheng, Yefeng; Doermann, David; Jaeger, Stefan; Li, Yi
2008-08-01
Text line segmentation in freestyle handwritten documents remains an open document analysis problem. Curvilinear text lines and small gaps between neighboring text lines present a challenge to algorithms developed for machine printed or hand-printed documents. In this paper, we propose a novel approach based on density estimation and a state-of-the-art image segmentation technique, the level set method. From an input document image, we estimate a probability map, where each element represents the probability that the underlying pixel belongs to a text line. The level set method is then exploited to determine the boundary of neighboring text lines by evolving an initial estimate. Unlike connected component based methods ( [1], [2] for example), the proposed algorithm does not use any script-specific knowledge. Extensive quantitative experiments on freestyle handwritten documents with diverse scripts, such as Arabic, Chinese, Korean, and Hindi, demonstrate that our algorithm consistently outperforms previous methods [1]-[3]. Further experiments show the proposed algorithm is robust to scale change, rotation, and noise.
Ávila-Arcos, María C.; Cappellini, Enrico; Romero-Navarro, J. Alberto; Wales, Nathan; Moreno-Mayar, J. Víctor; Rasmussen, Morten; Fordyce, Sarah L.; Montiel, Rafael; Vielle-Calzada, Jean-Philippe; Willerslev, Eske; Gilbert, M. Thomas P.
2011-01-01
The development of second-generation sequencing technologies has greatly benefitted the field of ancient DNA (aDNA). Its application can be further exploited by the use of targeted capture-enrichment methods to overcome restrictions posed by low endogenous and contaminating DNA in ancient samples. We tested the performance of Agilent's SureSelect and Mycroarray's MySelect in-solution capture systems on Illumina sequencing libraries built from ancient maize to identify key factors influencing aDNA capture experiments. High levels of clonality as well as the presence of multiple-copy sequences in the capture targets led to biases in the data regardless of the capture method. Neither method consistently outperformed the other in terms of average target enrichment, and no obvious difference was observed either when two tiling designs were compared. In addition to demonstrating the plausibility of capturing aDNA from ancient plant material, our results also enable us to provide useful recommendations for those planning targeted-sequencing on aDNA. PMID:22355593
Optimal consistency in microRNA expression analysis using reference-gene-based normalization.
Wang, Xi; Gardiner, Erin J; Cairns, Murray J
2015-05-01
Normalization of high-throughput molecular expression profiles secures differential expression analysis between samples of different phenotypes or biological conditions, and facilitates comparison between experimental batches. While the same general principles apply to microRNA (miRNA) normalization, there is mounting evidence that global shifts in their expression patterns occur in specific circumstances, which pose a challenge for normalizing miRNA expression data. As an alternative to global normalization, which has the propensity to flatten large trends, normalization against constitutively expressed reference genes presents an advantage through their relative independence. Here we investigated the performance of reference-gene-based (RGB) normalization for differential miRNA expression analysis of microarray expression data, and compared the results with other normalization methods, including: quantile, variance stabilization, robust spline, simple scaling, rank invariant, and Loess regression. The comparative analyses were executed using miRNA expression in tissue samples derived from subjects with schizophrenia and non-psychiatric controls. We proposed a consistency criterion for evaluating methods by examining the overlapping of differentially expressed miRNAs detected using different partitions of the whole data. Based on this criterion, we found that RGB normalization generally outperformed global normalization methods. Thus we recommend the application of RGB normalization for miRNA expression data sets, and believe that this will yield a more consistent and useful readout of differentially expressed miRNAs, particularly in biological conditions characterized by large shifts in miRNA expression.
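A minimal sketch of reference-gene-based normalization, assuming expression values already on a log scale and a user-supplied set of reference-gene rows; each sample is simply shifted by its mean reference-gene level. The toy data and the choice of three reference rows are illustrative assumptions, not the study's pipeline.

```python
import numpy as np

def rgb_normalize(log_expr, ref_idx):
    """log_expr: genes x samples matrix (log scale); ref_idx: rows of reference genes."""
    ref_level = log_expr[ref_idx, :].mean(axis=0)   # one scaling factor per sample
    return log_expr - ref_level[None, :]

rng = np.random.default_rng(0)
expr = rng.normal(loc=8.0, scale=1.0, size=(100, 6))
expr[:, 3:] += 0.7                                  # a global shift in one batch
norm = rgb_normalize(expr, ref_idx=[0, 1, 2])
print(norm[[0, 1, 2], :].mean(axis=0).round(6))     # reference level is now zero per sample
```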
A permutation-based non-parametric analysis of CRISPR screen data.
Jia, Gaoxiang; Wang, Xinlei; Xiao, Guanghua
2017-07-19
Clustered regularly-interspaced short palindromic repeats (CRISPR) screens are usually implemented in cultured cells to identify genes with critical functions. Although several methods have been developed or adapted to analyze CRISPR screening data, no single specific algorithm has gained popularity. Thus, rigorous procedures are needed to overcome the shortcomings of existing algorithms. We developed a Permutation-Based Non-Parametric Analysis (PBNPA) algorithm, which computes p-values at the gene level by permuting sgRNA labels, and thus it avoids restrictive distributional assumptions. Although PBNPA is designed to analyze CRISPR data, it can also be applied to analyze genetic screens implemented with siRNAs or shRNAs and drug screens. We compared the performance of PBNPA with competing methods on simulated data as well as on real data. PBNPA outperformed recent methods designed for CRISPR screen analysis, as well as methods used for analyzing other functional genomics screens, in terms of Receiver Operating Characteristics (ROC) curves and False Discovery Rate (FDR) control for simulated data under various settings. Remarkably, the PBNPA algorithm showed better consistency and FDR control on published real data as well. PBNPA yields more consistent and reliable results than its competitors, especially when the data quality is low. R package of PBNPA is available at: https://cran.r-project.org/web/packages/PBNPA/ .
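The label-permutation idea can be sketched as follows; the mean sgRNA score as the gene-level statistic, the one-sided lower tail, and the toy scores are simplifying assumptions rather than PBNPA's exact test statistic.

```python
import numpy as np

def permutation_pvalue(gene_labels, sgrna_scores, target_gene, n_perm=10000, rng=None):
    """Gene-level p-value by permuting sgRNA-to-gene labels (one-sided, lower tail)."""
    rng = np.random.default_rng(rng)
    gene_labels = np.asarray(gene_labels)
    scores = np.asarray(sgrna_scores, dtype=float)
    k = int(np.sum(gene_labels == target_gene))
    observed = scores[gene_labels == target_gene].mean()
    null = np.array([rng.choice(scores, size=k, replace=False).mean() for _ in range(n_perm)])
    return (1 + np.sum(null <= observed)) / (n_perm + 1)

labels = ["geneA"] * 4 + ["geneB"] * 4 + ["geneC"] * 4
scores = [-2.1, -1.8, -2.4, -1.9, 0.1, -0.2, 0.3, 0.0, 0.2, -0.1, 0.4, 0.1]
print(permutation_pvalue(labels, scores, "geneA", n_perm=5000, rng=0))
```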
Calibrating random forests for probability estimation.
Dankowski, Theresa; Ziegler, Andreas
2016-09-30
Probabilities can be consistently estimated using random forests. It is, however, unclear how random forests should be updated to make predictions for other centers or at different time points. In this work, we present two approaches for updating random forests for probability estimation. The first method has been proposed by Elkan and may be used for updating any machine learning approach yielding consistent probabilities, so-called probability machines. The second approach is a new strategy specifically developed for random forests. Using the terminal nodes, which represent conditional probabilities, the random forest is first translated to logistic regression models. These are, in turn, used for re-calibration. The two updating strategies were compared in a simulation study and are illustrated with data from the German Stroke Study Collaboration. In most simulation scenarios, both methods led to similar improvements. In the simulation scenario in which the stricter assumptions of Elkan's method were not met, the logistic regression-based re-calibration approach for random forests outperformed Elkan's method. It also performed better on the stroke data than Elkan's method. The strength of Elkan's method is its general applicability to any probability machine. However, if the strict assumptions underlying this approach are not met, the logistic regression-based approach is preferable for updating random forests for probability estimation. © 2016 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.
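A hedged sketch of the re-calibration idea using scikit-learn: a logistic regression is fitted to the logit of the forest's predicted probabilities on data from a new center, as a simplified Platt-style stand-in for the terminal-node-based translation described in the paper. The simulated "old" and "new" center data are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_old = rng.standard_normal((500, 5))
y_old = (X_old[:, 0] + rng.standard_normal(500) > 0).astype(int)
X_new = rng.standard_normal((200, 5)) + 0.3           # shifted population at the new center
y_new = (X_new[:, 0] + rng.standard_normal(200) > 0).astype(int)

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_old, y_old)
p = np.clip(rf.predict_proba(X_new)[:, 1], 1e-6, 1 - 1e-6)
logit_p = np.log(p / (1 - p)).reshape(-1, 1)
recal = LogisticRegression().fit(logit_p, y_new)      # re-calibrate on the new center
p_updated = recal.predict_proba(logit_p)[:, 1]
print(p_updated[:5].round(3))
```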
Prediction of fatigue-related driver performance from EEG data by deep Riemannian model.
Hajinoroozi, Mehdi; Jianqiu Zhang; Yufei Huang
2017-07-01
Prediction of drivers' drowsy and alert states is important for safety purposes. The prediction of drivers' drowsy and alert states from electroencephalography (EEG) using shallow and deep Riemannian methods is presented. For shallow Riemannian methods, the minimum distance to Riemannian mean (mdm) and the Log-Euclidean metric are investigated, where it is shown that the Log-Euclidean metric outperforms the mdm algorithm. In addition, the SPDNet, a deep Riemannian model that takes the EEG covariance matrix as the input, is investigated. It is shown that SPDNet outperforms all tested shallow and deep classification methods. The performance of SPDNet is 6.02% and 2.86% higher than the best performance of the conventional Euclidean classifiers and shallow Riemannian models, respectively.
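The Log-Euclidean metric between two symmetric positive definite (SPD) covariance matrices can be sketched directly via an eigendecomposition-based matrix logarithm; the random matrices below stand in for EEG epoch covariances and are purely illustrative.

```python
import numpy as np

def spd_log(C):
    w, V = np.linalg.eigh(C)          # C is symmetric positive definite
    return (V * np.log(w)) @ V.T      # matrix logarithm via eigendecomposition

def log_euclidean_distance(A, B):
    return float(np.linalg.norm(spd_log(A) - spd_log(B), ord="fro"))

rng = np.random.default_rng(0)
X1, X2 = rng.standard_normal((200, 8)), rng.standard_normal((200, 8))
C1, C2 = np.cov(X1, rowvar=False), np.cov(X2, rowvar=False)
print(log_euclidean_distance(C1, C2))
```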
2013-01-01
Background Differential gene expression (DGE) analysis is commonly used to reveal the deregulated molecular mechanisms of complex diseases. However, traditional DGE analysis (e.g., the t test or the rank sum test) tests each gene independently without considering interactions between them. Top-ranked differentially regulated genes prioritized by the analysis may not directly relate to the coherent molecular changes underlying complex diseases. Joint analyses of co-expression and DGE have been applied to reveal the deregulated molecular modules underlying complex diseases. Most of these methods consist of separate steps: first identifying gene-gene relationships under the studied phenotype and then integrating them with gene expression changes for prioritizing signature genes, or vice versa. A method is warranted that can simultaneously consider gene-gene co-expression strength and the corresponding expression level changes so that both types of information can be leveraged optimally. Results In this paper, we develop a gene module based method for differential gene expression analysis, named network-based differential gene expression (nDGE) analysis, a one-step integrative process for prioritizing deregulated genes and grouping them into gene modules. We demonstrate that nDGE outperforms existing methods in prioritizing deregulated genes and discovering deregulated gene modules using simulated data sets. When tested on a series of smoker and non-smoker lung adenocarcinoma data sets, we show that the top differentially regulated genes identified by the rank sum test in different sets are not consistent, while the top ranked genes defined by nDGE in different data sets significantly overlap. nDGE results suggest that a differentially regulated gene module, which is enriched for cell cycle related genes and E2F1 targeted genes, plays a role in the molecular differences between smoker and non-smoker lung adenocarcinoma. Conclusions In this paper, we develop nDGE to prioritize deregulated genes and group them into gene modules by simultaneously considering gene expression level changes and gene-gene co-regulations. When applied to both simulated and empirical data, nDGE outperforms the traditional DGE method. More specifically, when applied to smoker and non-smoker lung cancer sets, nDGE results illustrate the molecular differences between smoker and non-smoker lung cancer. PMID:24341432
Gender Inequalities in the Transition to College
ERIC Educational Resources Information Center
Buchmann, Claudia
2009-01-01
Background: In terms of high school graduation, college entry, and persistence to earning a college degree, young women now consistently outperform their male peers. Yet most research on gender inequalities in education continues to focus on aspects of education where women trail men, such as women's underrepresentation at top-tier institutions…
Combining Pear Ester with Codlemone Improves Management of Codling Moth
USDA-ARS?s Scientific Manuscript database
Several management approaches utilizing pear ester combined with codlemone have been developed in the first 10 years after the discovery of this ripe pear fruit volatile’s kairomonal activity for larvae and both sexes of codling moth. These include a lure that consistently outperforms other high loa...
ERIC Educational Resources Information Center
Ho, Sophia Shi-Huei; Peng, Michael Yao-Ping
2016-01-01
Changes in social systems demonstrate that various structural disadvantages have jointly led to increasing competition among higher education institutions (HEIs) in many countries, especially Taiwan. Institutional administrators must recognize the need to understand how to improve performance and consistently outperform other institutions.…
Cross-cultural aspect of the Group Embedded Figures Test: norms for Turkish eighth graders.
Cakan, Mehtap
2003-10-01
The Group Embedded Figures Test was administered to 206 Turkish (123 boys versus 83 girls) eighth grade students. Distribution characteristics, item analysis, reliability, and internal consistency are presented. No sex differences on subsections or the full scale were found. Socioeconomic status as indicated by parental education was significantly associated with the cognitive style scores of the students. Subjects whose fathers had a higher education outperformed those whose fathers had less education. No significant differences in students' means were found among groups whose mothers had low, middle, and high education. The Turkish sample showed the same performance as a 5th grade American sample, and Canadian 8th graders outperformed the Turkish participants. The practice effects are also discussed.
Channel characteristics and coordination in three-echelon dual-channel supply chain
NASA Astrophysics Data System (ADS)
Saha, Subrata
2016-02-01
We explore the impact of channel structure on the manufacturer, the distributor, the retailer and the entire supply chain by considering three different channel structures, both with and without coordination. These structures include a traditional retail channel and two manufacturer direct channels with and without consistent pricing. By comparing the performance of the manufacturer, the distributor, the retailer, and the entire supply chain in the three different supply chain structures, it is established analytically that, under some conditions, a dual channel can outperform a single retail channel; as a consequence, a coordination mechanism is developed that not only coordinates the dual channel but also outperforms the non-cooperative single retail channel. All the analytical results are further analysed through numerical examples.
Prediction of heterotrimeric protein complexes by two-phase learning using neighboring kernels
2014-01-01
Background Protein complexes play important roles in biological systems such as gene regulatory networks and metabolic pathways. Most methods for predicting protein complexes try to find protein complexes with size more than three. It, however, is known that protein complexes with smaller sizes occupy a large part of whole complexes for several species. In our previous work, we developed a method with several feature space mappings and the domain composition kernel for prediction of heterodimeric protein complexes, which outperforms existing methods. Results We propose methods for prediction of heterotrimeric protein complexes by extending techniques in the previous work on the basis of the idea that most heterotrimeric protein complexes are not likely to share the same protein with each other. We make use of the discriminant function in support vector machines (SVMs), and design novel feature space mappings for the second phase. As the second classifier, we examine SVMs and relevance vector machines (RVMs). We perform 10-fold cross-validation computational experiments. The results suggest that our proposed two-phase methods and SVM with the extended features outperform the existing method NWE, which was reported to outperform other existing methods such as MCL, MCODE, DPClus, CMC, COACH, RRW, and PPSampler for prediction of heterotrimeric protein complexes. Conclusions We propose two-phase prediction methods with the extended features, the domain composition kernel, SVMs and RVMs. The two-phase method with the extended features and the domain composition kernel using SVM as the second classifier is particularly useful for prediction of heterotrimeric protein complexes. PMID:24564744
Toward link predictability of complex networks
Lü, Linyuan; Pan, Liming; Zhou, Tao; Zhang, Yi-Cheng; Stanley, H. Eugene
2015-01-01
The organization of real networks usually embodies both regularities and irregularities, and, in principle, the former can be modeled. The extent to which the formation of a network can be explained coincides with our ability to predict missing links. To understand network organization, we should be able to estimate link predictability. We assume that the regularity of a network is reflected in the consistency of structural features before and after a random removal of a small set of links. Based on the perturbation of the adjacency matrix, we propose a universal structural consistency index that is free of prior knowledge of network organization. Extensive experiments on disparate real-world networks demonstrate that (i) structural consistency is a good estimation of link predictability and (ii) a derivative algorithm outperforms state-of-the-art link prediction methods in both accuracy and robustness. This analysis has further applications in evaluating link prediction algorithms and monitoring sudden changes in evolving network mechanisms. It will provide unique fundamental insights into the above-mentioned academic research fields, and will foster the development of advanced information filtering technologies of interest to information technology practitioners. PMID:25659742
Vessel Enhancement and Segmentation of 4D CT Lung Image Using Stick Tensor Voting
NASA Astrophysics Data System (ADS)
Cong, Tan; Hao, Yang; Jingli, Shi; Xuan, Yang
2016-12-01
Vessel enhancement and segmentation plays a significant role in medical image analysis. This paper proposes a novel vessel enhancement and segmentation method for 4D CT lung images using a stick tensor voting algorithm, which focuses on addressing the vessel distortion issue of the vessel enhancement diffusion (VED) method. Furthermore, the enhanced results are easily segmented using level-set segmentation. In our method, firstly, vessels are filtered using Frangi's filter to reduce intrapulmonary noise and extract rough blood vessels. Secondly, the stick tensor voting algorithm is employed to estimate the correct direction along the vessel. Then the estimated direction along the vessel is used as the anisotropic diffusion direction of the vessel in the VED algorithm, which makes the intensity diffusion of points located at the vessel wall consistent with the directions of vessels and enhances the tubular features of vessels. Finally, vessels can be extracted from the enhanced image by applying a level-set segmentation method. A number of experimental results show that our method outperforms the traditional VED method in vessel enhancement and results in satisfactory segmented vessels.
Franklin, Brandon M.; Xiang, Lin; Collett, Jason A.; Rhoads, Megan K.
2015-01-01
Student populations are diverse such that different types of learners struggle with traditional didactic instruction. Problem-based learning has existed for several decades, but there is still controversy regarding the optimal mode of instruction to ensure success at all levels of students' past achievement. The present study addressed this problem by dividing students into the following three instructional groups for an upper-level course in animal physiology: traditional lecture-style instruction (LI), guided problem-based instruction (GPBI), and open problem-based instruction (OPBI). Student performance was measured by three summative assessments consisting of 50% multiple-choice questions and 50% short-answer questions as well as a final overall course assessment. The present study also examined how students of different academic achievement histories performed under each instructional method. When student achievement levels were not considered, the effects of instructional methods on student outcomes were modest; OPBI students performed moderately better on short-answer exam questions than both LI and GPBI groups. High-achieving students showed no difference in performance for any of the instructional methods on any metric examined. In students with low-achieving academic histories, OPBI students largely outperformed LI students on all metrics (short-answer exam: P < 0.05, d = 1.865; multiple-choice question exam: P < 0.05, d = 1.166; and final score: P < 0.05, d = 1.265). They also outperformed GPBI students on short-answer exam questions (P < 0.05, d = 1.109) but not multiple-choice exam questions (P = 0.071, d = 0.716) or final course outcome (P = 0.328, d = 0.513). These findings strongly suggest that typically low-achieving students perform at a higher level under OPBI as long as the proper support systems (formative assessment and scaffolding) are provided to encourage student success. PMID:26628656
Counting malaria parasites with a two-stage EM-based algorithm using crowdsourced data.
Cabrera-Bean, Margarita; Pages-Zamora, Alba; Diaz-Vilor, Carles; Postigo-Camps, Maria; Cuadrado-Sanchez, Daniel; Luengo-Oroz, Miguel Angel
2017-07-01
Worldwide malaria eradication is currently one of the WHO's main global goals. In this work, we focus on the use of human-machine interaction strategies for low-cost, fast, and reliable malaria diagnosis based on a crowdsourced approach. The technical problem addressed consists of detecting spots in images even under very harsh conditions in which positive objects are very similar to some artifacts. The clicks or tags delivered by several annotators labeling an image are modeled as a robust finite mixture, and techniques based on the Expectation-Maximization (EM) algorithm are proposed for accurately counting malaria parasites on thick blood smears obtained by microscopic Giemsa-stained techniques. This approach outperforms other traditional methods, as shown through experimentation with real data.
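A hedged sketch of the counting idea: crowdsourced click coordinates are clustered with an EM-fitted Gaussian mixture (scikit-learn), and the number of components, chosen here by BIC, is taken as the parasite count. This is a simplified stand-in for the paper's two-stage robust mixture; the synthetic click data and the BIC-based model selection are illustrative assumptions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)
true_centers = np.array([[20.0, 30.0], [60.0, 55.0], [80.0, 15.0]])
clicks = np.vstack([c + rng.normal(scale=2.0, size=(15, 2)) for c in true_centers])

best, best_bic = None, np.inf
for k in range(1, 7):                       # pick the number of parasites by BIC
    gm = GaussianMixture(n_components=k, random_state=0).fit(clicks)
    if gm.bic(clicks) < best_bic:
        best, best_bic = gm, gm.bic(clicks)
print("estimated parasite count:", best.n_components)
```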
A multistage gene normalization system integrating multiple effective methods.
Li, Lishuang; Liu, Shanshan; Li, Lihua; Fan, Wenting; Huang, Degen; Zhou, Huiwei
2013-01-01
Gene/protein recognition and normalization is an important preliminary step for many biological text mining tasks. In this paper, we present a multistage gene normalization system which consists of four major subtasks: pre-processing, dictionary matching, ambiguity resolution and filtering. For the first subtask, we apply the gene mention tagger developed in our earlier work, which achieves an F-score of 88.42% on the BioCreative II GM testing set. In the stage of dictionary matching, the exact matching and approximate matching between gene names and the EntrezGene lexicon have been combined. For the ambiguity resolution subtask, we propose a semantic similarity disambiguation method based on Munkres' Assignment Algorithm. At the last step, a filter based on Wikipedia has been built to remove the false positives. Experimental results show that the presented system can achieve an F-score of 90.1%, outperforming most of the state-of-the-art systems.
A hybrid skull-stripping algorithm based on adaptive balloon snake models
NASA Astrophysics Data System (ADS)
Liu, Hung-Ting; Sheu, Tony W. H.; Chang, Herng-Hua
2013-02-01
Skull-stripping is one of the most important preprocessing steps in neuroimage analysis. We proposed a hybrid algorithm based on an adaptive balloon snake model to handle this challenging task. The proposed framework consists of two stages: first, the fuzzy possibilistic c-means (FPCM) is used for voxel clustering, which provides a labeled image for the snake contour initialization. In the second stage, the contour is initialized outside the brain surface based on the FPCM result and evolves under the guidance of the balloon snake model, which drives the contour with an adaptive inward normal force to capture the boundary of the brain. The similarity indices indicate that our method outperformed the BSE and BET methods in skull-stripping the MR image volumes in the IBSR data set. Experimental results show the effectiveness of this new scheme and potential applications in a wide variety of skull-stripping applications.
Kang, Dongwan D.; Froula, Jeff; Egan, Rob; ...
2015-01-01
Grouping large genomic fragments assembled from shotgun metagenomic sequences to deconvolute complex microbial communities, or metagenome binning, enables the study of individual organisms and their interactions. Because of the complex nature of these communities, existing metagenome binning methods often miss a large number of microbial species. In addition, most of the tools are not scalable to large datasets. Here we introduce automated software called MetaBAT that integrates empirical probabilistic distances of genome abundance and tetranucleotide frequency for accurate metagenome binning. MetaBAT outperforms alternative methods in accuracy and computational efficiency on both synthetic and real metagenome datasets. Lastly, it automatically forms hundreds of high-quality genome bins on a very large assembly consisting of millions of contigs in a matter of hours on a single node. MetaBAT is open source software and available at https://bitbucket.org/berkeleylab/metabat.
Automatic Summarization as a Combinatorial Optimization Problem
NASA Astrophysics Data System (ADS)
Hirao, Tsutomu; Suzuki, Jun; Isozaki, Hideki
We derived the oracle summary with the highest ROUGE score that can be achieved by integrating sentence extraction with sentence compression from the reference abstract. The analysis of the oracle revealed that summarization systems have to assign an appropriate compression rate to each sentence in the document. In accordance with this observation, this paper formulates summarization as a combinatorial optimization problem: selecting the set of sentences that maximizes the sum of sentence scores from a pool consisting of sentences with various compression rates, subject to a length constraint. The score of a sentence is defined by its compression rate, content words and positional information. The parameters for the compression rates and positional information are optimized by minimizing the loss between the scores of oracles and those of candidates. Results on the TSC-2 corpus showed that our method outperformed previous systems with statistical significance.
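Not the authors' formulation, but a minimal sketch of the kind of combinatorial selection involved: a multiple-choice knapsack that picks at most one compressed variant per sentence so as to maximize total score within a length budget. The candidate pool, scores, and lengths are invented placeholders.

```python
def select_summary(candidates, budget):
    """Multiple-choice knapsack over compressed sentence variants.

    candidates: list over sentences; each entry is a list of
                (length, score, text) variants of that sentence.
    budget:     maximum total summary length.
    Returns (score, chosen_texts).
    """
    NEG = float("-inf")
    # best[l] = (score, choices) over a prefix of sentences with total length l
    best = [(0.0, [])] + [(NEG, [])] * budget
    for variants in candidates:
        new = list(best)  # skipping this sentence is always allowed
        for length, score, text in variants:
            for l in range(budget - length, -1, -1):
                base_score, base_sel = best[l]
                if base_score == NEG:
                    continue
                cand = (base_score + score, base_sel + [text])
                if cand[0] > new[l + length][0]:
                    new[l + length] = cand
        best = new
    return max(best, key=lambda t: t[0])

# Toy pool: two sentences, each with a full and a compressed variant.
pool = [
    [(9, 3.0, "full sentence one"), (5, 2.2, "short one")],
    [(8, 2.5, "full sentence two"), (4, 1.8, "short two")],
]
score, summary = select_summary(pool, budget=10)
print(score, summary)
```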
A Review of Depth and Normal Fusion Algorithms
Štolc, Svorad; Pock, Thomas
2018-01-01
Geometric surface information such as depth maps and surface normals can be acquired by various methods such as stereo light fields, shape from shading and photometric stereo techniques. We compare several algorithms which deal with the combination of depth with surface normal information in order to reconstruct a refined depth map. The reasons for performance differences are examined from the perspective of alternative formulations of surface normals for depth reconstruction. We review and analyze methods in a systematic way. Based on our findings, we introduce a new generalized fusion method, which is formulated as a least squares problem and outperforms previous methods in the depth error domain by introducing a novel normal weighting that performs closer to the geodesic distance measure. Furthermore, a novel method is introduced based on Total Generalized Variation (TGV) which further outperforms previous approaches in terms of the geodesic normal distance error and maintains comparable quality in the depth error domain. PMID:29389903
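To make the least-squares idea concrete, here is a minimal sketch of fusing a noisy depth map with gradients derived from surface normals by solving a sparse linear system. The regularization weight, the finite-difference discretization, and the use of SciPy are illustrative assumptions, not the generalized fusion method or its geodesic-based weighting described above.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import lsqr

def fuse_depth_normals(depth, normals, lam=10.0):
    """Least-squares fusion: keep z close to the measured depth while its
    finite-difference gradients match (p, q) = (-nx/nz, -ny/nz) from normals."""
    h, w = depth.shape
    n = h * w
    idx = np.arange(n).reshape(h, w)

    def diff_op(axis):
        # Forward differences along x (axis=1) or y (axis=0) as a sparse matrix.
        rows, cols, vals = [], [], []
        it = idx[:, :-1] if axis == 1 else idx[:-1, :]
        shift = 1 if axis == 1 else w
        flat = it.ravel()
        for r, i in enumerate(flat):
            rows += [r, r]; cols += [i, i + shift]; vals += [-1.0, 1.0]
        return sp.csr_matrix((vals, (rows, cols)), shape=(flat.size, n))

    Dx, Dy = diff_op(1), diff_op(0)
    p = (-normals[..., 0] / normals[..., 2])[:, :-1].ravel()
    q = (-normals[..., 1] / normals[..., 2])[:-1, :].ravel()

    A = sp.vstack([sp.identity(n), np.sqrt(lam) * Dx, np.sqrt(lam) * Dy])
    b = np.concatenate([depth.ravel(), np.sqrt(lam) * p, np.sqrt(lam) * q])
    return lsqr(A.tocsr(), b)[0].reshape(h, w)

# Toy usage: a tilted plane with noisy depth but clean normals.
h, w = 16, 16
yy, xx = np.mgrid[0:h, 0:w].astype(float)
depth = 0.1 * xx + np.random.default_rng(0).normal(0, 0.05, (h, w))
normals = np.dstack([np.full((h, w), -0.1), np.zeros((h, w)), np.ones((h, w))])
normals /= np.linalg.norm(normals, axis=2, keepdims=True)
fused = fuse_depth_normals(depth, normals)
```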
New Method of Calculating a Multiplication by using the Generalized Bernstein-Vazirani Algorithm
NASA Astrophysics Data System (ADS)
Nagata, Koji; Nakamura, Tadao; Geurdes, Han; Batle, Josep; Abdalla, Soliman; Farouk, Ahmed
2018-06-01
We present a new method of more speedily calculating a multiplication by using the generalized Bernstein-Vazirani algorithm and many parallel quantum systems. Given the set of real values $a_1, a_2, a_3, \ldots, a_N$ and a function $g: \mathbf{R} \to \{0,1\}$, we determine the values $g(a_1), g(a_2), g(a_3), \ldots, g(a_N)$ simultaneously. The speed of determining the values is shown to outperform the classical case by a factor of $N$. Next, we consider the result as a number in binary representation, $M_1 = (g(a_1), g(a_2), g(a_3), \ldots, g(a_N))$. By using $M$ parallel quantum systems, we obtain $M$ such numbers in binary representation simultaneously. The speed of obtaining the $M$ numbers is shown to outperform the classical case by a factor of $M$. Finally, we calculate the product $M_1 \times M_2 \times \cdots \times M_M$. The speed of obtaining the product is shown to outperform the classical case by a factor of $N \times M$.
NASA Astrophysics Data System (ADS)
De Ridder, Simon; Vandermarliere, Benjamin; Ryckebusch, Jan
2016-11-01
A framework based on generalized hierarchical random graphs (GHRGs) for the detection of change points in the structure of temporal networks has recently been developed by Peel and Clauset (2015 Proc. 29th AAAI Conf. on Artificial Intelligence). We build on this methodology and extend it to also include the versatile stochastic block models (SBMs) as a parametric family for reconstructing the empirical networks. We apply five different techniques for change point detection to prototypical temporal networks, both empirical and synthetic. We find that none of the considered methods can consistently outperform the others when it comes to detecting and locating the expected change points in empirical temporal networks. With respect to the precision and recall of the detected change points, we find that the method based on a degree-corrected SBM has better recall properties than the other dedicated methods, especially for sparse networks and smaller sliding time window widths.
Product component genealogy modeling and field-failure prediction
King, Caleb; Hong, Yili; Meeker, William Q.
2016-04-13
Many industrial products consist of multiple components that are necessary for system operation. There is an abundance of literature on modeling the lifetime of such components through competing risks models. During the life-cycle of a product, it is common for there to be incremental design changes to improve reliability, to reduce costs, or due to changes in availability of certain part numbers. These changes can affect product reliability but are often ignored in system lifetime modeling. By incorporating this information about changes in part numbers over time (information that is readily available in most production databases), better accuracy can be achieved in predicting time to failure, thus yielding more accurate field-failure predictions. This paper presents methods for estimating parameters and predictions for this generational model and a comparison with existing methods through the use of simulation. Our results indicate that the generational model has important practical advantages and outperforms the existing methods in predicting field failures.
A hybrid method for classifying cognitive states from fMRI data.
Parida, S; Dehuri, S; Cho, S-B; Cacha, L A; Poznanski, R R
2015-09-01
Functional magnetic resonance imaging (fMRI) makes it possible to detect brain activity in order to elucidate cognitive states. The complex nature of fMRI data requires understanding of the analyses applied to produce possible avenues for developing models of cognitive-state classification and improving brain activity prediction. While many classification models for fMRI data analysis have been developed, in this paper we present a novel hybrid technique that combines the best attributes of genetic algorithms (GAs) and an ensemble decision tree technique and consistently outperforms the other methods currently used for cognitive-state classification. Specifically, this paper illustrates the combined use of a decision-tree ensemble and GAs for feature selection through an extensive simulation study and discusses the classification performance with respect to fMRI data. We show that our proposed method achieves a significant reduction in the number of features with a clear edge in classification accuracy over an ensemble of decision trees alone.
Adav, Sunil S; Lee, Duu-Jong
2008-06-15
Extracellular polymeric substances (EPS) were extracted from aerobic granules of compact interior structure using seven extraction methods. Ultrasound followed by the chemical reagents formamide and NaOH outperformed the other methods in extracting EPS from aerobic granules of compact interior. The collected EPS revealed no contamination by intracellular substances and consisted mainly of proteins, polysaccharides, humic substances and lipids. The quantity of extracted proteins exhibited a weak correlation with the quantity of extracted carbohydrates but no correlation with the quantity of extracted humic substances. The total proteins/total polysaccharides (PN/PS) ratios for sludge flocs were approximately 0.9 regardless of extraction method. Protein content was significantly enriched in the granules, producing a PN/PS ratio of 3.4-6.2. This experimental result correlated with observations made using the excitation-emission matrix (EEM) and confocal laser scanning microscopy techniques. However, a detailed study disproved the use of EEM results as a quantitative index of extracted EPS from sludge flocs or from granules.
Collaer, Marcia L; Reimers, Stian; Manning, John T
2007-04-01
We investigated whether performance on a visuospatial line judgment task, the Judgment of Line Angle and Position-15 test (JLAP-15), showed evidence of sensitivity to early sex steroid exposure by examining how it related to sex, as well as to sexual orientation and 2D:4D digit ratios. Participants were drawn from a large Internet study with over 250,000 participants. In the main sample (ages 12-58 years), males outperformed females on the JLAP-15, showing a moderate effect size for sex. In agreement with a prenatal sex hormone hypothesis, line judgment accuracy in adults related to 2D:4D and sexual orientation, both of which are postulated to be influenced by early steroids. In both sexes, better visuospatial performance was associated with lower (more male-typical) digit ratios. For men, heterosexual participants outperformed homosexual/bisexual participants on the JLAP-15 and, for women, homosexual/bisexual participants outperformed heterosexual participants. In children aged 8-10 years, presumed to be a largely prepubertal group, boys also outperformed girls. These findings are consistent with the hypothesis that visuospatial ability is influenced by early sex steroids, although they do not rule out alternative explanations or additional influences. More broadly, such results support a prenatal sex hormone hypothesis that degree of androgen exposure may influence the neural circuitry underlying cognition (visuospatial ability) and sexual orientation as well as aspects of somatic (digit ratio) development.
Perceptions of Mathematics Curricula and Teaching in China
ERIC Educational Resources Information Center
Moy, Robert; Peverly, Stephen T.
2005-01-01
China and other East Asian countries (Japan, South Korea, Taiwan) have consistently outperformed the United States and other Western countries in mathematics achievement. As part of a Fulbright-sponsored trip to China in the Summer of 2002, a New York City public school teacher and a trainer of school psychologists offer their impressions of some…
Cultural Differences in Early Math Skills among U.S., Taiwanese, Dutch, and Peruvian Preschoolers
ERIC Educational Resources Information Center
Paik, Jae H.; van Gelderen, Loes; Gonzales, Manuel; de Jong, Peter F.; Hayes, Michael
2011-01-01
East Asian children have consistently outperformed children from other nations on mathematical tests. However, most previous cross-cultural studies mainly compared East Asian countries and the United States and have largely ignored cultures from other parts of the world. The present study explored cultural differences in young children's early…
The Academic Impacts of Attending a KIPP Charter School in Arkansas
ERIC Educational Resources Information Center
Rose, Caleb P.
2013-01-01
KIPP Delta College Preparatory School (KIPP: DCPS), an open-enrollment charter school, opened in 2002 in Helena, Arkansas. Since its opening, KIPP: DCPS students have consistently outperformed their peers in the Helena/West Helena School district, and moreover, recent test scores suggest that white students and minority students are achieving at…
Regional frequency analysis of extreme rainfalls using partial L moments method
NASA Astrophysics Data System (ADS)
Zakaria, Zahrahtul Amani; Shabri, Ani
2013-07-01
An approach based on regional frequency analysis using L moments and LH moments is revisited in this study. Subsequently, an alternative regional frequency analysis using the partial L moments (PL moments) method is employed, and a new relationship for homogeneity analysis is developed. The results were then compared with those obtained using the methods of L moments and LH moments of order two. The Selangor catchment, consisting of 37 sites and located on the west coast of Peninsular Malaysia, is chosen as a case study. PL moments for the generalized extreme value (GEV), generalized logistic (GLO), and generalized Pareto distributions were derived and used to develop the regional frequency analysis procedure. The PL moment ratio diagram and the Z test were employed to determine the best-fit distribution. Comparison of the three approaches showed that the GLO and GEV distributions were identified as suitable for representing the statistical properties of extreme rainfall in Selangor. Monte Carlo simulation used for performance evaluation shows that the method of PL moments outperforms the L and LH moments methods for the estimation of large-return-period events.
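For readers unfamiliar with L moments, the sketch below computes sample L moments from probability-weighted moments. Partial L moments additionally censor values below a threshold; the censoring shown here is only schematic and is an assumption, not the paper's exact formulation, as are the invented rainfall values.

```python
import numpy as np
from math import comb

def sample_l_moments(x):
    """First four sample L moments (l1..l4) via probability-weighted moments."""
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    # b_r = (1/n) * sum_{j=r+1..n} [C(j-1, r) / C(n-1, r)] * x_(j), x ascending
    b = [sum(comb(j - 1, r) / comb(n - 1, r) * x[j - 1]
             for j in range(r + 1, n + 1)) / n for r in range(4)]
    l1 = b[0]
    l2 = 2 * b[1] - b[0]
    l3 = 6 * b[2] - 6 * b[1] + b[0]
    l4 = 20 * b[3] - 30 * b[2] + 12 * b[1] - b[0]
    return l1, l2, l3, l4

def sample_partial_l_moments(x, threshold):
    """Schematic 'partial' variant: drop observations below a threshold
    before computing the L moments (an illustrative simplification)."""
    x = np.asarray(x, dtype=float)
    return sample_l_moments(x[x >= threshold])

# Toy usage with annual maximum rainfall depths (mm); values are invented.
amax = np.array([62.0, 75.5, 58.1, 90.3, 104.2, 70.7, 85.0, 66.4, 120.9, 79.3])
l1, l2, l3, l4 = sample_l_moments(amax)
print("L-CV:", l2 / l1, "L-skewness:", l3 / l2)
print("partial l1 above 70 mm:", sample_partial_l_moments(amax, 70.0)[0])
```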
Robust Sex Differences in Jigsaw Puzzle Solving—Are Boys Really Better in Most Visuospatial Tasks?
Kocijan, Vid; Horvat, Marina; Majdic, Gregor
2017-01-01
Sex differences are consistently reported in different visuospatial tasks, with men usually performing better in mental rotation tests while women are better on tests for memory of object locations. In the present study, we investigated sex differences in solving jigsaw puzzles in children. In total, 22 boys and 24 girls were tested using a custom-built tablet application presenting a jigsaw puzzle consisting of 25 pieces and featuring three different pictures. Girls outperformed boys in solving jigsaw puzzles regardless of the picture. Girls were faster than boys in solving the puzzle, made fewer incorrect moves with the pieces of the puzzle, and spent less time moving the pieces around the tablet. It appears that the strategy of solving the jigsaw puzzle was the main factor affecting differences in success, as girls tended to solve the puzzle more systematically while boys made more trial-and-error attempts, thus having more incorrect moves with the puzzle pieces. Results of this study suggest a very robust sex difference in solving jigsaw puzzles, with girls outperforming boys by a large margin. PMID:29109682
Brosch, Tom; Tang, Lisa Y W; Youngjin Yoo; Li, David K B; Traboulsee, Anthony; Tam, Roger
2016-05-01
We propose a novel segmentation approach based on deep 3D convolutional encoder networks with shortcut connections and apply it to the segmentation of multiple sclerosis (MS) lesions in magnetic resonance images. Our model is a neural network that consists of two interconnected pathways, a convolutional pathway, which learns increasingly more abstract and higher-level image features, and a deconvolutional pathway, which predicts the final segmentation at the voxel level. The joint training of the feature extraction and prediction pathways allows for the automatic learning of features at different scales that are optimized for accuracy for any given combination of image types and segmentation task. In addition, shortcut connections between the two pathways allow high- and low-level features to be integrated, which enables the segmentation of lesions across a wide range of sizes. We have evaluated our method on two publicly available data sets (MICCAI 2008 and ISBI 2015 challenges) with the results showing that our method performs comparably to the top-ranked state-of-the-art methods, even when only relatively small data sets are available for training. In addition, we have compared our method with five freely available and widely used MS lesion segmentation methods (EMS, LST-LPA, LST-LGA, Lesion-TOADS, and SLS) on a large data set from an MS clinical trial. The results show that our method consistently outperforms these other methods across a wide range of lesion sizes.
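As a rough, self-contained illustration of an encoder-decoder with a shortcut connection between the two pathways (not the paper's network, its depth, or its training procedure), here is a tiny sketch; the layer sizes and the use of PyTorch are assumptions.

```python
import torch
import torch.nn as nn

class TinyEncoderDecoder3D(nn.Module):
    """Minimal 3D conv encoder-decoder with one shortcut connection,
    illustrating high/low-level feature fusion for voxel-wise scoring."""
    def __init__(self, in_channels=1, features=8):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv3d(in_channels, features, 3, padding=1), nn.ReLU())
        self.down = nn.Conv3d(features, features * 2, 3, stride=2, padding=1)
        self.up = nn.ConvTranspose3d(features * 2, features, 2, stride=2)
        # Shortcut: concatenate low-level encoder features with upsampled ones.
        self.dec1 = nn.Sequential(nn.Conv3d(features * 2, features, 3, padding=1), nn.ReLU())
        self.head = nn.Conv3d(features, 1, 1)  # voxel-wise lesion score

    def forward(self, x):
        e1 = self.enc1(x)
        bottleneck = torch.relu(self.down(e1))
        u = self.up(bottleneck)
        d = self.dec1(torch.cat([u, e1], dim=1))
        return torch.sigmoid(self.head(d))

# Toy forward pass on a small random volume (batch, channel, D, H, W).
net = TinyEncoderDecoder3D()
out = net(torch.randn(1, 1, 16, 16, 16))
print(out.shape)  # torch.Size([1, 1, 16, 16, 16])
```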
The performance of the spatiotemporal Kalman filter and LORETA in seizure onset localization.
Hamid, Laith; Sarabi, Masoud; Japaridze, Natia; Wiegand, Gert; Heute, Ulrich; Stephani, Ulrich; Galka, Andreas; Siniatchkin, Michael
2015-08-01
The assumption of spatial-smoothness is often used to solve the bioelectric inverse problem during electroencephalographic (EEG) source imaging, e.g., in low resolution electromagnetic tomography (LORETA). Since the EEG data show a temporal structure, the combination of the temporal-smoothness and the spatial-smoothness constraints may improve the solution of the EEG inverse problem. This study investigates the performance of the spatiotemporal Kalman filter (STKF) method, which is based on spatial and temporal smoothness, in the localization of a focal seizure's onset and compares its results to those of LORETA. The main finding of the study was that the STKF with an autoregressive model of order two significantly outperformed LORETA in the accuracy and consistency of the localization, provided that the source space consists of a whole-brain volumetric grid. In the future, these promising results will be confirmed using data from more patients and performing statistical analyses on the results. Furthermore, the effects of the temporal smoothness constraint will be studied using different types of focal seizures.
Assessment of tautomer distribution using the condensed reaction graph approach
NASA Astrophysics Data System (ADS)
Gimadiev, T. R.; Madzhidov, T. I.; Nugmanov, R. I.; Baskin, I. I.; Antipin, I. S.; Varnek, A.
2018-03-01
We report the first direct QSPR modeling of equilibrium constants of tautomeric transformations (log K_T) in different solvents and at different temperatures, which does not require an intermediate assessment of acidity (basicity) constants for all tautomeric forms. The key step of the modeling consists in merging the two tautomers into one single molecular graph (a "condensed reaction graph"), which makes it possible to compute molecular descriptors characterizing the entire equilibrium. The support vector regression method was used to build the models. The training set consisted of 785 transformations belonging to 11 types of tautomeric reactions, with equilibrium constants measured in different solvents and at different temperatures. The models obtained perform well both in cross-validation (Q² = 0.81, RMSE = 0.7 log K_T units) and on two external test sets. Benchmarking studies demonstrate that our models outperform results obtained with DFT B3LYP/6-311++G(d,p) and the ChemAxon Tautomerizer, which is applicable only in water at room temperature.
Orthogonal Procrustes Analysis for Dictionary Learning in Sparse Linear Representation.
Grossi, Giuliano; Lanzarotti, Raffaella; Lin, Jianyi
2017-01-01
In the sparse representation model, the design of overcomplete dictionaries plays a key role for the effectiveness and applicability in different domains. Recent research has produced several dictionary learning approaches and has shown that dictionaries learnt from data examples significantly outperform structured ones, e.g., wavelet transforms. In this context, learning consists in adapting the dictionary atoms to a set of training signals in order to promote a sparse representation that minimizes the reconstruction error. Finding the best-fitting dictionary remains a very difficult task, leaving the question still open. A well-established heuristic for tackling this problem is an iterative alternating scheme, adopted for instance in the well-known K-SVD algorithm. Essentially, it consists in repeating two stages: the former promotes sparse coding of the training set and the latter adapts the dictionary to reduce the error. In this paper we present R-SVD, a new method that, while maintaining the alternating scheme, adopts Orthogonal Procrustes analysis to update the dictionary atoms suitably arranged into groups. Comparative experiments on synthetic data prove the effectiveness of R-SVD with respect to well-known dictionary learning algorithms such as K-SVD, ILS-DLA and the online method OSDL. Moreover, experiments on natural data such as ECG compression, EEG sparse representation, and image modeling confirm R-SVD's robustness and wide applicability.
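A minimal sketch of the alternating idea: sparse coding followed by an orthogonal Procrustes update that rotates the dictionary to better fit the data. Applying the rotation to the whole dictionary (rather than to groups of atoms, as R-SVD does) and using scikit-learn's OMP for the coding stage are simplifying assumptions.

```python
import numpy as np
from sklearn.linear_model import orthogonal_mp

def procrustes_dictionary_learning(Y, n_atoms, sparsity, n_iter=20, seed=0):
    """Alternate OMP sparse coding with an orthogonal Procrustes dictionary update.

    Y: (n_features, n_samples) training signals, one signal per column.
    Returns the learnt dictionary D (unit-norm columns) and the sparse codes X.
    """
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((Y.shape[0], n_atoms))
    D /= np.linalg.norm(D, axis=0)
    for _ in range(n_iter):
        # Sparse coding stage: X approximately solves Y ~ D X with few nonzeros.
        X = orthogonal_mp(D, Y, n_nonzero_coefs=sparsity)
        # Dictionary update stage: orthogonal R minimizing ||Y - R D X||_F,
        # i.e. the orthogonal Procrustes problem, solved via an SVD.
        M = D @ X
        U, _, Vt = np.linalg.svd(Y @ M.T)
        D = (U @ Vt) @ D          # rotation preserves unit-norm atoms
    X = orthogonal_mp(D, Y, n_nonzero_coefs=sparsity)
    return D, X

# Toy usage: signals generated from a random 2-sparse model.
rng = np.random.default_rng(1)
true_D = rng.standard_normal((16, 24)); true_D /= np.linalg.norm(true_D, axis=0)
codes = np.zeros((24, 200))
for j in range(200):
    codes[rng.choice(24, 2, replace=False), j] = rng.standard_normal(2)
Y = true_D @ codes
D, X = procrustes_dictionary_learning(Y, n_atoms=24, sparsity=2)
print(np.linalg.norm(Y - D @ X) / np.linalg.norm(Y))
```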
SCOUT: simultaneous time segmentation and community detection in dynamic networks
Hulovatyy, Yuriy; Milenković, Tijana
2016-01-01
Many evolving complex real-world systems can be modeled via dynamic networks. An important problem in dynamic network research is community detection, which finds groups of topologically related nodes. Typically, this problem is approached by assuming either that each time point has a distinct community organization or that all time points share a single community organization. The reality likely lies between these two extremes. To find the compromise, we consider community detection in the context of the problem of segment detection, which identifies contiguous time periods with consistent network structure. Consequently, we formulate a combined problem of segment community detection (SCD), which simultaneously partitions the network into contiguous time segments with consistent community organization and finds this community organization for each segment. To solve SCD, we introduce SCOUT, an optimization framework that explicitly considers both segmentation quality and partition quality. SCOUT addresses limitations of existing methods that can be adapted to solve SCD, which consider only one of segmentation quality or partition quality. In a thorough evaluation, SCOUT outperforms the existing methods in terms of both accuracy and computational complexity. We apply SCOUT to biological network data to study human aging. PMID:27881879
Application of Poisson random effect models for highway network screening.
Jiang, Ximiao; Abdel-Aty, Mohamed; Alamili, Samer
2014-02-01
In recent years, Bayesian random effect models that account for the temporal and spatial correlations of crash data have become popular in traffic safety research. This study employs random effect Poisson log-normal models for crash risk hotspot identification. Both the temporal and spatial correlations of crash data were considered. The Potential for Safety Improvement (PSI) was adopted as a measure of crash risk. Using the fatal and injury crashes that occurred on urban 4-lane divided arterials from 2006 to 2009 in the Central Florida area, the random effect approaches were compared to the traditional Empirical Bayesian (EB) method and the conventional Bayesian Poisson log-normal model. A series of method examination tests was conducted to evaluate the performance of the different approaches. These tests include the previously developed site consistency test, method consistency test, total rank difference test, and modified total score test, as well as the newly proposed total safety performance measure difference test. Results show that the Bayesian Poisson model accounting for both temporal and spatial random effects (PTSRE) outperforms the model with only a temporal random effect, and both are superior to the conventional Poisson log-normal model (PLN) and the EB model in fitting the crash data. Additionally, the method evaluation tests indicate that the PTSRE model is significantly superior to the PLN model and the EB model in consistently identifying hotspots during successive time periods. The results suggest that the PTSRE model is a superior alternative for road site crash risk hotspot identification. Copyright © 2013 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Müller, H.; Haberlandt, U.
2018-01-01
Rainfall time series of high temporal resolution and spatial density are crucial for urban hydrology. The multiplicative random cascade model can be used for temporal rainfall disaggregation of daily data to generate such time series. Here, the uniform splitting approach with a branching number of 3 in the first disaggregation step is applied. To achieve a final resolution of 5 min, subsequent steps after disaggregation are necessary. Three modifications at different disaggregation levels are tested in this investigation (uniform splitting at Δt = 15 min, linear interpolation at Δt = 7.5 min and Δt = 3.75 min). Results are compared both with observations and with an often-used approach based on the assumption that time steps with Δt = 5.625 min, as resulting if a branching number of 2 is applied throughout, can be replaced with Δt = 5 min (called the 1280 min approach). Spatial consistence is implemented in the disaggregated time series using a resampling algorithm. In total, 24 recording stations in Lower Saxony, Northern Germany, with a 5 min resolution have been used for the validation of the disaggregation procedure. The urban-hydrological suitability is tested with an artificial combined sewer system of about 170 hectares. The results show that all three variations outperform the 1280 min approach regarding reproduction of wet spell duration, average intensity, fraction of dry intervals and lag-1 autocorrelation. Extreme values with durations of 5 min are also better represented. For durations of 1 h, all approaches show only slight deviations from the observed extremes. The applied resampling algorithm is capable of achieving sufficient spatial consistence. The effects on the urban hydrological simulations are significant. Without spatial consistence, flood volumes of manholes and combined sewer overflow are strongly underestimated. After resampling, results using disaggregated time series as input are in the range of those using observed time series. The best overall performance regarding rainfall statistics is obtained by the method in which the disaggregation process ends at time steps of 7.5 min duration, deriving the 5 min time steps by linear interpolation. With subsequent resampling, this method leads to a good representation of manhole flooding and combined sewer overflow volume in hydrological simulations and outperforms the 1280 min approach.
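A minimal sketch of the multiplicative-cascade idea: a daily total is split recursively among b sub-intervals with random weights that sum to one, and the final step to 5 min is done by linear interpolation of the cumulative series. The weight distribution, branching choices, and interpolation step are illustrative assumptions, not the calibrated model described above.

```python
import numpy as np

def cascade_disaggregate(total, levels, branching, rng):
    """Recursively split a rainfall total into branching**levels sub-intervals
    using random multiplicative weights (a uniform Dirichlet split per step)."""
    amounts = np.array([total], dtype=float)
    for _ in range(levels):
        weights = rng.dirichlet(np.ones(branching), size=amounts.size)
        amounts = (amounts[:, None] * weights).ravel()
    return amounts

rng = np.random.default_rng(42)
daily_total_mm = 24.0

# Step 1: 1440 min -> 3 x 480 min, then repeated halving down to 7.5 min steps.
coarse = cascade_disaggregate(daily_total_mm, levels=1, branching=3, rng=rng)
fine = np.concatenate([cascade_disaggregate(a, levels=6, branching=2, rng=rng)
                       for a in coarse])          # 3 * 2**6 = 192 steps of 7.5 min

# Step 2: interpolate cumulative rainfall to a 5 min grid (288 steps per day).
t_fine = np.arange(1, fine.size + 1) * 7.5
cum_5min = np.interp(np.arange(1, 289) * 5.0,
                     np.concatenate([[0.0], t_fine]),
                     np.concatenate([[0.0], np.cumsum(fine)]))
rain_5min = np.diff(np.concatenate([[0.0], cum_5min]))
print(rain_5min.sum())  # mass is conserved: ~24.0 mm
```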
Good match exploration for infrared face recognition
NASA Astrophysics Data System (ADS)
Yang, Changcai; Zhou, Huabing; Sun, Sheng; Liu, Renfeng; Zhao, Ji; Ma, Jiayi
2014-11-01
Establishing good feature correspondence is a critical prerequisite and a challenging task for infrared (IR) face recognition. Recent studies revealed that the scale invariant feature transform (SIFT) descriptor outperforms other local descriptors for feature matching. However, it only uses local appearance information for matching, and hence inevitably leads to a number of false matches. To address this issue, this paper explores global structure information (GSI) among SIFT correspondences, and proposes a new method SIFT-GSI for good match exploration. This is achieved by fitting a smooth mapping function for the underlying correct matches, which involves softassign and deterministic annealing. Quantitative comparisons with state-of-the-art methods on a publicly available IR human face database demonstrate that SIFT-GSI significantly outperforms other methods for feature matching, and hence it is able to improve the reliability of IR face recognition systems.
Automatic bladder segmentation from CT images using deep CNN and 3D fully connected CRF-RNN.
Xu, Xuanang; Zhou, Fugen; Liu, Bo
2018-03-19
An automatic approach for bladder segmentation from computed tomography (CT) images is highly desirable in clinical practice. It is a challenging task, since the bladder usually shows large variations of appearance and low soft-tissue contrast in CT images. In this study, we present a deep learning-based approach which involves a convolutional neural network (CNN) and a 3D fully connected conditional random field recurrent neural network (CRF-RNN) to perform accurate bladder segmentation. We also propose a novel preprocessing method, called dual-channel preprocessing, to further advance the segmentation performance of our approach. The presented approach works as follows: first, we apply our proposed preprocessing method to the input CT image and obtain a dual-channel image which consists of the CT image and an enhanced bladder density map. Second, we exploit a CNN to predict a coarse voxel-wise bladder score map on this dual-channel image. Finally, a 3D fully connected CRF-RNN refines the coarse bladder score map and produces the final fine-localized segmentation result. We compare our approach to the state-of-the-art V-net on a clinical dataset. Results show that our approach achieves superior segmentation accuracy, outperforming the V-net by a significant margin. The Dice Similarity Coefficient of our approach (92.24%) is 8.12% higher than that of the V-net. Moreover, the bladder probability maps produced by our approach present sharper boundaries and more accurate localizations compared with those of the V-net. Our approach achieves higher segmentation accuracy than the state-of-the-art method on clinical data. Both the dual-channel processing and the 3D fully connected CRF-RNN contribute to this improvement. The united deep network composed of the CNN and 3D CRF-RNN also outperforms a system where the CRF model acts as a post-processing method disconnected from the CNN.
Spindel, Jennifer; Begum, Hasina; Akdemir, Deniz; Virk, Parminder; Collard, Bertrand; Redoña, Edilberto; Atlin, Gary; Jannink, Jean-Luc; McCouch, Susan R.
2015-01-01
Genomic Selection (GS) is a new breeding method in which genome-wide markers are used to predict the breeding value of individuals in a breeding population. GS has been shown to improve breeding efficiency in dairy cattle and several crop plant species, and here we evaluate for the first time its efficacy for breeding inbred lines of rice. We performed a genome-wide association study (GWAS) in conjunction with five-fold GS cross-validation on a population of 363 elite breeding lines from the International Rice Research Institute's (IRRI) irrigated rice breeding program and herein report the GS results. The population was genotyped with 73,147 markers using genotyping-by-sequencing. The training population, statistical method used to build the GS model, number of markers, and trait were varied to determine their effect on prediction accuracy. For all three traits, genomic prediction models outperformed prediction based on pedigree records alone. Prediction accuracies ranged from 0.31 and 0.34 for grain yield and plant height to 0.63 for flowering time. Analyses using subsets of the full marker set suggest that using one marker every 0.2 cM is sufficient for genomic selection in this collection of rice breeding materials. RR-BLUP was the best performing statistical method for grain yield where no large effect QTL were detected by GWAS, while for flowering time, where a single very large effect QTL was detected, the non-GS multiple linear regression method outperformed GS models. For plant height, in which four mid-sized QTL were identified by GWAS, random forest produced the most consistently accurate GS models. Our results suggest that GS, informed by GWAS interpretations of genetic architecture and population structure, could become an effective tool for increasing the efficiency of rice breeding as the costs of genotyping continue to decline. PMID:25689273
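As a rough illustration of genomic prediction in the RR-BLUP spirit, the sketch below fits a ridge regression on marker genotypes and scores prediction accuracy as the correlation between predicted and observed phenotypes in five-fold cross-validation. The marker coding, ridge penalty, and simulated data are assumptions for illustration, not the study's pipeline or its datasets.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
n_lines, n_markers = 300, 2000

# Simulated breeding lines: markers coded -1/0/1, phenotype from 50 small-effect QTL.
X = rng.integers(-1, 2, size=(n_lines, n_markers)).astype(float)
true_effects = np.zeros(n_markers)
true_effects[rng.choice(n_markers, 50, replace=False)] = rng.normal(0, 0.5, 50)
y = X @ true_effects + rng.normal(0, 1.0, n_lines)

# Five-fold cross-validation; accuracy = correlation(predicted, observed).
accs = []
for train, test in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    model = Ridge(alpha=n_markers)  # heavy shrinkage of many small marker effects
    model.fit(X[train], y[train])
    pred = model.predict(X[test])
    accs.append(np.corrcoef(pred, y[test])[0, 1])
print("mean prediction accuracy:", round(float(np.mean(accs)), 2))
```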
A Hybrid Method for Pancreas Extraction from CT Image Based on Level Set Methods
Tan, Hanqing; Fujita, Hiroshi
2013-01-01
This paper proposes a novel semiautomatic method to extract the pancreas from abdominal CT images. Traditional level set and region-growing methods, which require the initial contour to be located near the final object boundary, suffer from leakage into the tissues neighboring the pancreas region. The proposed method consists of a customized fast-marching level set method, which generates an optimal initial pancreas region to address the sensitivity of level set methods to the initial contour location, and a modified distance-regularized level set method, which extracts the pancreas accurately. The novelty of our method lies in the proper selection and combination of level set methods; furthermore, an energy-decrement algorithm and an energy-tune algorithm are proposed to reduce the negative impact of the bonding force caused by connected tissue whose intensity is similar to that of the pancreas. As a result, our method overcomes the shortcoming of over-segmentation at weak boundaries and can accurately extract the pancreas from CT images. The proposed method is compared to five other state-of-the-art medical image segmentation methods on a CT image dataset containing abdominal images from 10 patients. The evaluation results demonstrate that our method outperforms the other methods, achieving higher accuracy and less false segmentation in pancreas extraction. PMID:24066016
ACT Participation and Performance for Montgomery County Public Schools Students [2013]. Memorandum
ERIC Educational Resources Information Center
Sanderson, Geoffrey T.
2013-01-01
The Montgomery County Public Schools (MCPS) Class of 2013 consistently outperformed graduates across Maryland and the nation on all sections of the ACT, according to the ACT, Inc. annual report released Wednesday, August 21, 2013. In 2013, 29 percent of MCPS graduates took the ACT exam. According to the ACT, Inc. report, ACT participation among…
Is South Korea a Case of High-Stakes Testing Gone Too Far? Information Capsule. Volume 1107
ERIC Educational Resources Information Center
Blazer, Christie
2012-01-01
South Korea's students consistently outperform their counterparts in almost every country in reading and math. Experts have concluded, however, that the South Korean education system has produced students who score well on tests, but fall short on creativity and innovative thinking. They blame these shortcomings on schools' emphasis on rote…
Starting Young: Massachusetts Birth-3rd Grade Policies That Support Children's Literacy Development
ERIC Educational Resources Information Center
Cook, Shayna; Bornfreund, Laura
2015-01-01
Massachusetts is one of a handful of states that is often recognized as a leader in public education, and for good reason. The Commonwealth consistently outperforms most states on national reading and math tests and often leads the pack in education innovations. "Starting Young: Massachusetts Birth-3rd Grade Policies that Support Children's…
From the Delta Banks to the Upper Ranks: An Evaluation of KIPP Charter Schools in Rural Arkansas
ERIC Educational Resources Information Center
Rose, Caleb P.; Maranto, Robert; Ritter, Gary W.
2017-01-01
Knowledge is Power Program Delta College Preparatory School (KIPP DCPS), an open-enrollment charter school, opened in 2002 in Helena, Arkansas. KIPP DCPS students have consistently outperformed their peers from neighboring districts on year-end student achievement scores, and KIPP's national reputation led Arkansas lawmakers to exempt KIPP from…
South Dakota Department of Education 2010 Annual Report
ERIC Educational Resources Information Center
South Dakota Department of Education, 2010
2010-01-01
South Dakota has many things to be proud of: Its students consistently outperform their peers on national assessments. The state has a high graduation rate, and it ranks among the top states in the nation for students going on to postsecondary. Credit for these achievements goes to the state's local school districts. This annual report covers key…
Multiple imputation of rainfall missing data in the Iberian Mediterranean context
NASA Astrophysics Data System (ADS)
Miró, Juan Javier; Caselles, Vicente; Estrela, María José
2017-11-01
Given the increasing need for complete rainfall data networks, diverse methods for filling gaps in observed precipitation series have been proposed in recent years, progressively more advanced than traditional approaches. The present study consisted of validating 10 methods (6 linear, 2 non-linear and 2 hybrid) that allow multiple imputation, i.e., filling the missing data of multiple incomplete series at the same time in a dense network of neighboring stations. These were applied to daily and monthly rainfall in two sectors of the Júcar River Basin Authority (eastern Iberian Peninsula), which is characterized by high spatial irregularity and difficulty of rainfall estimation. A classification of precipitation according to its genetic origin was applied as pre-processing, and a quantile-mapping adjustment as a post-processing technique. The results showed in general a better performance for the non-linear and hybrid methods, with the non-linear PCA (NLPCA) method considerably outperforming the Self-Organizing Maps (SOM) method within the non-linear approaches. Among the linear methods, the Regularized Expectation-Maximization (RegEM) method was the best, but far behind NLPCA. Applying EOF filtering as post-processing of NLPCA (the hybrid approach) yielded the best results.
Cyclic Solvent Vapor Annealing for Rapid, Robust Vertical Orientation of Features in BCP Thin Films
NASA Astrophysics Data System (ADS)
Paradiso, Sean; Delaney, Kris; Fredrickson, Glenn
2015-03-01
Methods for reliably controlling block copolymer self assembly have seen much attention over the past decade as new applications for nanostructured thin films emerge in the fields of nanopatterning and lithography. While solvent assisted annealing techniques are established as flexible and simple methods for achieving long range order, solvent annealing alone exhibits a very weak thermodynamic driving force for vertically orienting domains with respect to the free surface. To address the desire for oriented features, we have investigated a cyclic solvent vapor annealing (CSVA) approach that combines the mobility benefits of solvent annealing with selective stress experienced by structures oriented parallel to the free surface as the film is repeatedly swollen with solvent and dried. Using dynamical self-consistent field theory (DSCFT) calculations, we establish the conditions under which the method significantly outperforms both static and cyclic thermal annealing and implicate the orientation selection as a consequence of the swelling/deswelling process. Our results suggest that CSVA may prove to be a potent method for the rapid formation of highly ordered, vertically oriented features in block copolymer thin films.
NASA Astrophysics Data System (ADS)
Chang, Yu Min; Lu, Nien Hua; Wu, Tsung Chiang
2005-06-01
This study applies 3D laser scanning technology to develop a high-precision measuring system for the digital survey of historical buildings. It outperformed other methods in obtaining abundant high-precision measuring points and computing the data instantly. In this study, the Pei-tien Temple, a Chinese Taoist temple in southern Taiwan famous for its highly intricate architecture and more than 300-year history, was adopted as the target to demonstrate the high accuracy and efficiency of this system. Using the French-made MENSI GS-100 laser scanner, numerous measuring points were precisely plotted to present the plane map, vertical map and 3D map of the property. Accuracies of 0.1-1 mm in the digital data have consistently been achieved for the historical heritage measurement.
Stirman, Shannon Wiltsey; Gamarra, Jennifer; Bartlett, Brooke; Calloway, Amber; Gutner, Cassidy
2017-12-01
This review describes methods used to examine the modifications and adaptations to evidence-based psychological treatments (EBPTs), assesses what is known about the impact of modifications and adaptations to EBPTs, and makes recommendations for future research and clinical care. One hundred eight primary studies and three meta-analyses were identified. All studies examined planned adaptations, and many simultaneously investigated multiple types of adaptations. With the exception of studies on adding or removing specific EBPT elements, few studies compared adapted EBPTs to the original protocols. There was little evidence that adaptations in the studies were detrimental, but there was also limited consistent evidence that adapted protocols outperformed the original protocols, with the exception of adding components to EBPTs. Implications for EBPT delivery and future research are discussed.
The MIMIC Method with Scale Purification for Detecting Differential Item Functioning
ERIC Educational Resources Information Center
Wang, Wen-Chung; Shih, Ching-Lin; Yang, Chih-Chien
2009-01-01
This study implements a scale purification procedure onto the standard MIMIC method for differential item functioning (DIF) detection and assesses its performance through a series of simulations. It is found that the MIMIC method with scale purification (denoted as M-SP) outperforms the standard MIMIC method (denoted as M-ST) in controlling…
Wang, Yong-Cui; Wang, Yong; Yang, Zhi-Xia; Deng, Nai-Yang
2011-06-20
Enzymes are known as the largest class of proteins and their functions are usually annotated by the Enzyme Commission (EC), which uses a hierarchical structure, i.e., four numbers separated by periods, to classify the function of enzymes. Automatically categorizing an enzyme into the EC hierarchy is crucial to understanding its specific molecular mechanism. In this paper, we introduce two key improvements in predicting enzyme function within a machine learning framework. One is an efficient sequence encoding method for representing the given proteins. The second is a structure-based prediction method with low computational complexity. In particular, we propose to use the conjoint triad feature (CTF) to represent the given protein sequences by considering not only the composition of amino acids but also the neighbor relationships in the sequence. Then we develop a support vector machine (SVM)-based method, named SVMHL (SVM for hierarchy labels), to output enzyme function by fully considering the hierarchical structure of the EC. The experimental results show that SVMHL with the CTF outperforms SVMHL with the amino acid composition (AAC) feature both in predictive accuracy and Matthews correlation coefficient (MCC). In addition, SVMHL with the CTF obtains accuracy and MCC values ranging from 81% to 98% and 0.82 to 0.98 when predicting the first three EC digits on a low-homology enzyme dataset. We further demonstrate that our method outperforms methods which do not take account of the hierarchical relationship among enzyme categories, as well as alternative methods which incorporate prior knowledge about inter-class relationships. Our structure-based prediction model, SVMHL with the CTF, reduces the computational complexity and outperforms the alternative approaches in enzyme function prediction. Therefore our new method will be a useful tool for the enzyme function prediction community.
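A minimal sketch of a conjoint-triad-style encoding: amino acids are mapped to seven classes and the frequencies of all class triads form a 343-dimensional feature vector. The particular grouping shown is one commonly used scheme and should be treated as an assumption here, as is feeding the features to a generic flat scikit-learn SVM rather than the hierarchical SVMHL classifier.

```python
import numpy as np
from sklearn.svm import SVC

# One commonly used 7-class grouping of the 20 amino acids (assumed here).
GROUPS = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
AA_TO_CLASS = {aa: i for i, g in enumerate(GROUPS) for aa in g}

def conjoint_triad_features(sequence):
    """Return a 7*7*7 = 343-dimensional normalized triad-frequency vector."""
    classes = [AA_TO_CLASS[aa] for aa in sequence if aa in AA_TO_CLASS]
    counts = np.zeros(7 * 7 * 7)
    for a, b, c in zip(classes, classes[1:], classes[2:]):
        counts[a * 49 + b * 7 + c] += 1
    return counts / max(counts.sum(), 1.0)

# Toy usage: encode short sequences and train a flat (non-hierarchical) SVM.
seqs = ["MKTAYIAKQR", "GGGAVVLLK", "MDEHHRKRC", "PLFYTSQWNA"]
y = [0, 0, 1, 1]            # placeholder top-level EC labels
X = np.vstack([conjoint_triad_features(s) for s in seqs])
clf = SVC(kernel="rbf").fit(X, y)
print(clf.predict([conjoint_triad_features("MKKAVILRDE")]))
```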
Naveros, Francisco; Luque, Niceto R; Garrido, Jesús A; Carrillo, Richard R; Anguita, Mancia; Ros, Eduardo
2015-07-01
Time-driven simulation methods in traditional CPU architectures perform well and precisely when simulating small-scale spiking neural networks. Nevertheless, they still have drawbacks when simulating large-scale systems. Conversely, event-driven simulation methods in CPUs and time-driven simulation methods in graphic processing units (GPUs) can outperform CPU time-driven methods under certain conditions. With this performance improvement in mind, we have developed an event-and-time-driven spiking neural network simulator suitable for a hybrid CPU-GPU platform. Our neural simulator is able to efficiently simulate bio-inspired spiking neural networks consisting of different neural models, which can be distributed heterogeneously in both small layers and large layers or subsystems. For the sake of efficiency, the low-activity parts of the neural network can be simulated in CPU using event-driven methods while the high-activity subsystems can be simulated in either CPU (a few neurons) or GPU (thousands or millions of neurons) using time-driven methods. In this brief, we have undertaken a comparative study of these different simulation methods. For benchmarking the different simulation methods and platforms, we have used a cerebellar-inspired neural-network model consisting of a very dense granular layer and a Purkinje layer with a smaller number of cells (according to biological ratios). Thus, this cerebellar-like network includes a dense diverging neural layer (increasing the dimensionality of its internal representation and sparse coding) and a converging neural layer (integration) similar to many other biologically inspired and also artificial neural networks.
Reference point detection for camera-based fingerprint image based on wavelet transformation.
Khalil, Mohammed S
2015-04-30
Fingerprint recognition systems essentially require core-point detection prior to fingerprint matching. The core point is used as a reference point for aligning the fingerprint with a template database. When processing a larger fingerprint database, it is necessary to consider the core point during feature extraction. Numerous core-point detection methods are available and have been reported in the literature. However, these methods are generally applied to scanner-based images. Hence, this paper explores the feasibility of applying a core-point detection method to fingerprint images obtained using a camera phone. The proposed method utilizes a discrete wavelet transform to extract the ridge information from a color image. The performance of the proposed method is evaluated in terms of accuracy and consistency. These two indicators are calculated automatically by comparing the method's output with the defined core points. The proposed method was tested on two data sets, collected in controlled and uncontrolled environments from 13 different subjects. In the controlled environment, the proposed method achieved a detection rate of 82.98%; in the uncontrolled environment, it yielded a detection rate of 78.21%. The proposed method yields promising results on the collected image database and outperforms an existing method.
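The following is a rough sketch of the general idea of using a discrete wavelet transform to emphasize ridge detail and then taking the location of maximum detail energy as a reference point. The wavelet choice, the energy measure, and the smoothing window are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np
import pywt
from scipy.ndimage import uniform_filter

def detect_reference_point(gray_image, wavelet="db2", window=9):
    """Locate a candidate reference point as the maximum of smoothed
    wavelet-detail energy (a crude stand-in for core-point detection)."""
    cA, (cH, cV, cD) = pywt.dwt2(gray_image.astype(float), wavelet)
    energy = cH ** 2 + cV ** 2 + cD ** 2
    energy = uniform_filter(energy, size=window)
    r, c = np.unravel_index(np.argmax(energy), energy.shape)
    return 2 * r, 2 * c   # map back to approximate full-resolution coordinates

# Toy usage with a synthetic ridge-like radial pattern.
yy, xx = np.mgrid[0:128, 0:128]
img = np.sin(0.4 * np.sqrt((xx - 64) ** 2 + (yy - 64) ** 2))
print(detect_reference_point(img))
```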
NASA Astrophysics Data System (ADS)
Fresnay, S.; Ponte, A. L.; Le Gentil, S.; Le Sommer, J.
2018-03-01
Several methods that reconstruct the three-dimensional ocean dynamics from sea level are presented and evaluated in the Gulf Stream region with a 1/60° realistic numerical simulation. The use of sea level is motivated by its better correlation with interior pressure or quasi-geostrophic potential vorticity (PV) compared to sea surface temperature and sea surface salinity, and, by its observability via satellite altimetry. The simplest method of reconstruction relies on a linear estimation of pressure at depth from sea level. Another method consists in linearly estimating PV from sea level first and then performing a PV inversion. The last method considered, labeled SQG for surface quasi-geostrophy, relies on a PV inversion but assumes no PV anomalies. The first two methods show comparable skill at levels above -800 m. They moderately outperform SQG which emphasizes the difficulty of estimating interior PV from surface variables. Over the 250-1,000 m depth range, the three methods skillfully reconstruct pressure at wavelengths between 500 and 200 km whereas they exhibit a rapid loss of skill between 200 and 100 km wavelengths. Applicability to a real case scenario and leads for improvements are discussed.
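To illustrate an SQG-type reconstruction in its simplest form, the sketch below propagates sea level downward spectrally, assuming zero interior PV and constant stratification, so that each Fourier mode of the surface streamfunction decays as exp(N|k|z/f). The parameter values and the doubly periodic FFT treatment are illustrative assumptions, not the configurations evaluated in the study.

```python
import numpy as np

def sqg_streamfunction_at_depth(ssh, dx, z, N=2e-3, f0=8e-5, g=9.81):
    """Estimate the geostrophic streamfunction at depth z (z <= 0, in m) from
    sea surface height on a doubly periodic grid, under simple SQG assumptions
    (zero interior PV, constant buoyancy frequency N)."""
    ny, nx = ssh.shape
    kx = 2 * np.pi * np.fft.fftfreq(nx, d=dx)
    ky = 2 * np.pi * np.fft.fftfreq(ny, d=dx)
    K = np.sqrt(kx[None, :] ** 2 + ky[:, None] ** 2)
    psi_surf_hat = (g / f0) * np.fft.fft2(ssh)     # surface streamfunction
    decay = np.exp(N * K * z / f0)                 # mode-wise vertical decay
    return np.real(np.fft.ifft2(psi_surf_hat * decay))

# Toy usage: a single Gaussian SSH anomaly on a 256 km x 256 km domain.
n, dx = 128, 2000.0
yy, xx = np.mgrid[0:n, 0:n] * dx
ssh = 0.1 * np.exp(-(((xx - 128e3) ** 2 + (yy - 128e3) ** 2) / (40e3) ** 2))
psi_500m = sqg_streamfunction_at_depth(ssh, dx, z=-500.0)
```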
Functional Parallel Factor Analysis for Functions of One- and Two-dimensional Arguments.
Choi, Ji Yeh; Hwang, Heungsun; Timmerman, Marieke E
2018-03-01
Parallel factor analysis (PARAFAC) is a useful multivariate method for decomposing three-way data that consist of three different types of entities simultaneously. This method estimates trilinear components, each of which is a low-dimensional representation of a set of entities, often called a mode, to explain the maximum variance of the data. Functional PARAFAC permits the entities in different modes to be smooth functions or curves, varying over a continuum, rather than a collection of unconnected responses. The existing functional PARAFAC methods handle functions of a one-dimensional argument (e.g., time) only. In this paper, we propose a new extension of functional PARAFAC for handling three-way data whose responses are sequenced along both a two-dimensional domain (e.g., a plane with x- and y-axis coordinates) and a one-dimensional argument. Technically, the proposed method combines PARAFAC with basis function expansion approximations, using a set of piecewise quadratic finite element basis functions for estimating two-dimensional smooth functions and a set of one-dimensional basis functions for estimating one-dimensional smooth functions. In a simulation study, the proposed method appeared to outperform the conventional PARAFAC. We apply the method to EEG data to demonstrate its empirical usefulness.
Interleaved EPI diffusion imaging using SPIRiT-based reconstruction with virtual coil compression.
Dong, Zijing; Wang, Fuyixue; Ma, Xiaodong; Zhang, Zhe; Dai, Erpeng; Yuan, Chun; Guo, Hua
2018-03-01
To develop a novel diffusion imaging reconstruction framework based on iterative self-consistent parallel imaging reconstruction (SPIRiT) for multishot interleaved echo planar imaging (iEPI), with computation acceleration by virtual coil compression. As a general approach for autocalibrating parallel imaging, SPIRiT improves the performance of traditional generalized autocalibrating partially parallel acquisitions (GRAPPA) methods in that the formulation with self-consistency is better conditioned, suggesting SPIRiT to be a better candidate for k-space-based reconstruction. In this study, a general SPIRiT framework is adopted to incorporate both coil sensitivity and phase variation information as virtual coils and is then applied to 2D navigated iEPI diffusion imaging. To reduce the reconstruction time when using a large number of coils and shots, a novel shot-coil compression method is proposed for computation acceleration in Cartesian sampling. Simulations and in vivo experiments were conducted to evaluate the performance of the proposed method. Compared with conventional coil compression, the shot-coil compression achieved higher compression rates with reduced errors. The simulation and in vivo experiments demonstrate that the SPIRiT-based reconstruction outperformed the existing method, realigned GRAPPA, and provided superior images with reduced artifacts. The SPIRiT-based reconstruction with virtual coil compression is a reliable method for high-resolution iEPI diffusion imaging. Magn Reson Med 79:1525-1531, 2018. © 2017 International Society for Magnetic Resonance in Medicine.
An Unscented Kalman-Particle Hybrid Filter for Space Object Tracking
NASA Astrophysics Data System (ADS)
Raihan A. V, Dilshad; Chakravorty, Suman
2018-03-01
Optimal and consistent estimation of the state of space objects is pivotal to surveillance and tracking applications. However, probabilistic estimation of space objects is made difficult by the non-Gaussianity and nonlinearity associated with orbital mechanics. In this paper, we present an unscented Kalman-particle hybrid filtering framework for recursive Bayesian estimation of space objects. The hybrid filtering scheme is designed to provide accurate and consistent estimates when measurements are sparse without incurring a large computational cost. It employs an unscented Kalman filter (UKF) for estimation when measurements are available. When the target is outside the field of view (FOV) of the sensor, it updates the state probability density function (PDF) via a sequential Monte Carlo method. The hybrid filter addresses the problem of particle depletion through a suitably designed filter transition scheme. To assess the performance of the hybrid filtering approach, we consider two test cases of space objects that are assumed to undergo full three-dimensional orbital motion under the effects of J2 and atmospheric drag perturbations. It is demonstrated that the hybrid filters can furnish fast, accurate and consistent estimates outperforming standard UKF and particle filter (PF) implementations.
Schall, Marina; Martiny, Sarah E; Goetz, Thomas; Hall, Nathan C
2016-05-01
Although expressing positive emotions is typically socially rewarded, in the present work, we predicted that people suppress positive emotions and thereby experience social benefits when outperformed others are present. We tested our predictions in three experimental studies with high school students. In Studies 1 and 2, we manipulated the type of social situation (outperformance vs. non-outperformance) and assessed suppression of positive emotions. In both studies, individuals reported suppressing positive emotions more in outperformance situations than in non-outperformance situations. In Study 3, we manipulated the social situation (outperformance vs. non-outperformance) as well as the videotaped person's expression of positive emotions (suppression vs. expression). The findings showed that when outperforming others, individuals were indeed evaluated more positively when they suppressed rather than expressed their positive emotions, and demonstrate the importance of the specific social situation with respect to the effects of suppression. © 2016 by the Society for Personality and Social Psychology, Inc.
Optic disk localization by a robust fusion method
NASA Astrophysics Data System (ADS)
Zhang, Jielin; Yin, Fengshou; Wong, Damon W. K.; Liu, Jiang; Baskaran, Mani; Cheng, Ching-Yu; Wong, Tien Yin
2013-02-01
Optic disk localization plays an important role in developing computer-aided diagnosis (CAD) systems for ocular diseases such as glaucoma, diabetic retinopathy and age-related macular degeneration. In this paper, we propose an intelligent fusion of methods for the localization of the optic disk in retinal fundus images. Three different approaches are developed to detect the location of the optic disk separately. The first method is the maximum vessel crossing method, which finds the region with the greatest number of blood vessel crossing points. The second one is the multichannel thresholding method, targeting the area with the highest intensity. The final method searches the vertical and horizontal regions-of-interest separately on the basis of blood vessel structure and neighborhood entropy profile. Finally, these three methods are combined using an intelligent fusion method to improve the overall accuracy. The proposed algorithm was tested on the STARE database and the ORIGAlight database, each consisting of images with various pathologies. The preliminary results reach an accuracy of 81.5% on the STARE database and 99% on the ORIGAlight database. The proposed method outperforms each individual approach as well as a state-of-the-art method that utilizes an intensity-based approach. These results demonstrate the high potential of this method for use in retinal CAD systems.
Kurvers, Ralf H J M; de Zoete, Annemarie; Bachman, Shelby L; Algra, Paul R; Ostelo, Raymond
2018-01-01
Diagnosing the causes of low back pain is a challenging task, prone to errors. A novel approach to increase diagnostic accuracy in medical decision making is collective intelligence, which refers to the ability of groups to outperform individual decision makers in solving problems. We investigated whether combining the independent ratings of chiropractors, chiropractic radiologists and medical radiologists can improve diagnostic accuracy when interpreting diagnostic images of the lumbosacral spine. Evaluations were obtained from two previously published studies: study 1 consisted of 13 raters independently rating 300 lumbosacral radiographs; study 2 consisted of 14 raters independently rating 100 lumbosacral magnetic resonance images. In both studies, raters evaluated the presence of "abnormalities", which are indicators of a serious health risk and warrant immediate further examination. We combined independent decisions of raters using a majority rule which takes as final diagnosis the decision of the majority of the group. We compared the performance of the majority rule to the performance of single raters. Our results show that with increasing group size (i.e., increasing the number of independent decisions) both sensitivity and specificity increased in both data-sets, with groups consistently outperforming single raters. These results were found for radiographs and MR image reading alike. Our findings suggest that combining independent ratings can improve the accuracy of lumbosacral diagnostic image reading.
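The majority rule itself is easy to reproduce. The sketch below combines synthetic binary ratings ("abnormality present" or absent) from an assumed panel of raters and reports sensitivity and specificity as the group grows; the rater accuracies and the tie-breaking choice are illustrative assumptions, not values from the original studies.

```python
import numpy as np

rng = np.random.default_rng(0)
n_images, n_raters = 300, 13
truth = rng.random(n_images) < 0.3                    # synthetic gold standard
# Assume each rater detects abnormalities with sensitivity 0.75 and specificity 0.85.
ratings = np.where(truth[None, :],
                   rng.random((n_raters, n_images)) < 0.75,
                   rng.random((n_raters, n_images)) < 0.15)

def sens_spec(pred, truth):
    sensitivity = (pred & truth).sum() / truth.sum()
    specificity = (~pred & ~truth).sum() / (~truth).sum()
    return round(sensitivity, 3), round(specificity, 3)

for k in (1, 5, 9, 13):                               # increasing group size
    majority = ratings[:k].sum(axis=0) * 2 > k        # strict majority; ties count as negative
    print(k, sens_spec(majority, truth))
```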
G3//BMK and Its Application to Calculation of Bond Dissociation Enthalpies.
Zheng, Wen-Rui; Fu, Yao; Guo, Qing-Xiang
2008-08-01
On the basis of systematic examinations it was found that the BMK functional significantly outperformed the other popular density functional theory methods including B3LYP, B3P86, KMLYP, MPW1P86, O3LYP, and X3LYP for the calculation of bond dissociation enthalpies (BDEs). However, it was also found that even the BMK functional might dramatically fail in predicting the BDEs of some chemical bonds. To solve this problem, a new composite ab initio method named G3//BMK was developed by combining the strengths of both the G3 theory and BMK. G3//BMK was found to outperform the G3 and G3//B3LYP methods. It could accurately predict the BDEs of diverse types of chemical bonds in various organic molecules within a precision of ca. 1.2 kcal/mol.
ERIC Educational Resources Information Center
Dyson, Dana D.
2009-01-01
Reforms in American public education have not resolved the wide academic performance gap between students attending schools in poor, large, urban centers versus schools in wealthier areas. Aggregate performance outcomes on state achievement tests reveal that some school districts consistently outperform others, a few fluctuate over time but are…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wagner, John C; Peplow, Douglas E.; Mosher, Scott W
2014-01-01
This paper presents a new hybrid (Monte Carlo/deterministic) method for increasing the efficiency of Monte Carlo calculations of distributions, such as flux or dose rate distributions (e.g., mesh tallies), as well as responses at multiple localized detectors and spectra. This method, referred to as Forward-Weighted CADIS (FW-CADIS), is an extension of the Consistent Adjoint Driven Importance Sampling (CADIS) method, which has been used for more than a decade to very effectively improve the efficiency of Monte Carlo calculations of localized quantities, e.g., flux, dose, or reaction rate at a specific location. The basis of this method is the development of an importance function that represents the importance of particles to the objective of uniform Monte Carlo particle density in the desired tally regions. Implementation of this method utilizes the results from a forward deterministic calculation to develop a forward-weighted source for a deterministic adjoint calculation. The resulting adjoint function is then used to generate consistent space- and energy-dependent source biasing parameters and weight windows that are used in a forward Monte Carlo calculation to obtain more uniform statistical uncertainties in the desired tally regions. The FW-CADIS method has been implemented and demonstrated within the MAVRIC sequence of SCALE and the ADVANTG/MCNP framework. Application of the method to representative, real-world problems, including calculation of dose rate and energy dependent flux throughout the problem space, dose rates in specific areas, and energy spectra at multiple detectors, is presented and discussed. Results of the FW-CADIS method and other recently developed global variance reduction approaches are also compared, and the FW-CADIS method outperformed the other methods in all cases considered.
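For orientation, the CADIS source-biasing arithmetic that FW-CADIS builds on can be written out directly: with a source density q(x) and an adjoint (importance) function, the biased source is proportional to their product and particles are born with weight equal to the response estimate divided by the adjoint, so that weight times importance stays constant. The one-dimensional mesh and functions below are made up purely for illustration.

```python
import numpy as np

# Illustrative 1D mesh with an assumed source distribution and adjoint (importance) function.
x = np.linspace(0.0, 10.0, 50)
q = np.exp(-0.5 * (x - 2.0) ** 2)          # unnormalized source density (assumed shape)
q /= q.sum()
adjoint = np.exp(0.3 * x)                  # importance grows toward a distant detector

# CADIS-style relations: response estimate R, biased source pdf, and birth weights.
R = np.sum(q * adjoint)                    # discrete analogue of R = integral of q * adjoint
q_biased = q * adjoint / R                 # biased source (sums to 1 by construction)
weights = q / q_biased                     # birth weight = R / adjoint, keeping the game fair
print(np.allclose(weights, R / adjoint))   # True: weight times importance is constant (= R)
```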
Optimal design of a beam-based dynamic vibration absorber using fixed-points theory
NASA Astrophysics Data System (ADS)
Hua, Yingyu; Wong, Waion; Cheng, Li
2018-05-01
The addition of a dynamic vibration absorber (DVA) to a vibrating structure could provide an economic solution for vibration suppressions if the absorber is properly designed and located onto the structure. A common design of the DVA is a sprung mass because of its simple structure and low cost. However, the vibration suppression performance of this kind of DVA is limited by the ratio between the absorber mass and the mass of the primary structure. In this paper, a beam-based DVA (beam DVA) is proposed and optimized for minimizing the resonant vibration of a general structure. The vibration suppression performance of the proposed beam DVA depends on the mass ratio, the flexural rigidity and length of the beam. In comparison with the traditional sprung mass DVA, the proposed beam DVA shows more flexibility in vibration control design because it has more design parameters. With proper design, the beam DVA's vibration suppression capability can outperform that of the traditional DVA under the same mass constraint. The general approach is illustrated using a benchmark cantilever beam as an example. The receptance theory is introduced to model the compound system consisting of the host beam and the attached beam-based DVA. The model is validated through comparisons with the results from Abaqus as well as the Transfer Matrix method (TMM) method. Fixed-points theory is then employed to derive the analytical expressions for the optimum tuning ratio and damping ratio of the proposed beam absorber. A design guideline is then presented to choose the parameters of the beam absorber. Comparisons are finally presented between the beam absorber and the traditional DVA in terms of the vibration suppression effect. It is shown that the proposed beam absorber can outperform the traditional DVA by following this proposed guideline.
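For reference, the classical fixed-points result for the traditional sprung-mass DVA attached to an undamped primary structure gives the optimum tuning ratio and damping ratio in closed form as functions of the mass ratio. The snippet below evaluates those classical formulas for a few assumed mass ratios; the beam-DVA expressions derived in the paper are different and are not reproduced here.

```python
import math

def classical_dva_optimum(mu):
    """Den Hartog fixed-points optimum for a sprung-mass DVA on an undamped
    single-degree-of-freedom primary system, mu = absorber mass / primary mass."""
    f_opt = 1.0 / (1.0 + mu)                              # optimum tuning (frequency) ratio
    zeta_opt = math.sqrt(3.0 * mu / (8.0 * (1.0 + mu)))   # optimum absorber damping ratio
    return f_opt, zeta_opt

for mu in (0.02, 0.05, 0.10):                             # assumed mass ratios
    f_opt, zeta_opt = classical_dva_optimum(mu)
    print(f"mu = {mu:.2f}: f_opt = {f_opt:.4f}, zeta_opt = {zeta_opt:.4f}")
```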
Binding SNOMED CT terms to archetype elements. Establishing a baseline of results.
Berges, I; Bermudez, J; Illarramendi, A
2015-01-01
This article is part of the Focus Theme of Methods of Information in Medicine on "Managing Interoperability and Complexity in Health Systems". The proliferation of archetypes as a means to represent information in Electronic Health Records has raised the need to bind terminological codes - such as SNOMED CT codes - to their elements, in order to identify them univocally. However, the large size of the terminologies makes it difficult to perform this task manually. The objective is to establish a baseline of results for the aforementioned problem by using off-the-shelf string comparison-based techniques, against which results from more complex techniques could be evaluated. Nine typed comparison methods were evaluated for binding using a set of 487 archetype elements. Their recall was calculated, and Friedman and Nemenyi tests were applied in order to assess whether any of the methods outperformed the others. Using the qGrams method along with the 'Text' information piece of archetype elements outperforms the other methods if a level of confidence of 90% is considered. A recall of 25.26% is obtained if just one SNOMED CT term is retrieved for each archetype element. This recall rises to 50.51% and 75.56% if 10 and 100 terms are retrieved, respectively, which represents a reduction of more than 99.99% of the SNOMED CT code set. The baseline has been established following the above-mentioned results. Moreover, it has been observed that although string comparison-based methods do not outperform more sophisticated techniques, they can still be an alternative for providing a reduced set of candidate terms for each archetype element, from which the ultimate term can be chosen later in the more-than-likely manual supervision task.
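A minimal sketch of the kind of q-gram comparison evaluated here scores a candidate SNOMED CT term against an archetype element's 'Text' field with a Dice coefficient over character trigrams. The exact q-gram variant, padding, and preprocessing used in the study are not given in the abstract, so those details are assumptions.

```python
from collections import Counter

def qgrams(s, q=3):
    """Multiset of character q-grams of a padded, lower-cased string."""
    s = '#' * (q - 1) + s.lower() + '#' * (q - 1)
    return Counter(s[i:i + q] for i in range(len(s) - q + 1))

def qgram_similarity(a, b, q=3):
    """Dice-style similarity over q-gram multisets (1.0 means identical)."""
    ga, gb = qgrams(a, q), qgrams(b, q)
    overlap = sum((ga & gb).values())
    return 2.0 * overlap / (sum(ga.values()) + sum(gb.values()))

# Rank illustrative candidate terms against an archetype element's 'Text' value.
element_text = "body temperature"
candidates = ["Body temperature (observable entity)", "Blood pressure", "Temperature of skin"]
print(sorted(candidates, key=lambda t: qgram_similarity(element_text, t), reverse=True))
```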
Methods of Reverberation Mapping. I. Time-lag Determination by Measures of Randomness
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chelouche, Doron; Pozo-Nuñez, Francisco; Zucker, Shay, E-mail: doron@sci.haifa.ac.il, E-mail: francisco.pozon@gmail.com, E-mail: shayz@post.tau.ac.il
A class of methods for measuring time delays between astronomical time series is introduced in the context of quasar reverberation mapping, which is based on measures of randomness or complexity of the data. Several distinct statistical estimators are considered that do not rely on polynomial interpolations of the light curves nor on their stochastic modeling, and do not require binning in correlation space. Methods based on von Neumann’s mean-square successive-difference estimator are found to be superior to those using other estimators. An optimized von Neumann scheme is formulated, which better handles sparsely sampled data and outperforms current implementations of discrete correlation function methods. This scheme is applied to existing reverberation data of varying quality, and consistency with previously reported time delays is found. In particular, the size–luminosity relation of the broad-line region in quasars is recovered with a scatter comparable to that obtained by other works, yet with fewer assumptions made concerning the process underlying the variability. The proposed method for time-lag determination is particularly relevant for irregularly sampled time series, and in cases where the process underlying the variability cannot be adequately modeled.
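The core idea can be sketched compactly: for each trial lag, shift one light curve in time, merge the two series, order them by time, and evaluate the von Neumann mean-square successive-difference statistic of the combined fluxes; the minimizing lag marks the candidate delay. Flux normalization, error weighting, and the optimizations described in the paper are omitted, so treat this as a simplified illustration only.

```python
import numpy as np

def von_neumann_lag(t1, f1, t2, f2, lags):
    """Trial lag minimizing the von Neumann (mean-square successive difference)
    statistic of the combined, time-ordered light curve."""
    f1 = (f1 - f1.mean()) / f1.std()          # crude normalization onto a common scale
    f2 = (f2 - f2.mean()) / f2.std()
    stats = []
    for lag in lags:
        t = np.concatenate([t1, t2 - lag])    # shift the echo light curve back by the trial lag
        f = np.concatenate([f1, f2])
        order = np.argsort(t)
        stats.append(np.mean(np.diff(f[order]) ** 2))
    return lags[int(np.argmin(stats))], np.array(stats)

# Synthetic example: the second series echoes the first with a 12-day delay.
rng = np.random.default_rng(0)
t1, t2 = np.sort(rng.uniform(0, 200, 80)), np.sort(rng.uniform(0, 200, 70))
signal = lambda t: np.sin(0.09 * t) + 0.4 * np.sin(0.23 * t)
f1 = signal(t1) + 0.05 * rng.standard_normal(t1.size)
f2 = signal(t2 - 12.0) + 0.05 * rng.standard_normal(t2.size)
best_lag, _ = von_neumann_lag(t1, f1, t2, f2, np.arange(-30.0, 30.5, 0.5))
print("recovered lag:", best_lag)
```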
Data decomposition method for parallel polygon rasterization considering load balancing
NASA Astrophysics Data System (ADS)
Zhou, Chen; Chen, Zhenjie; Liu, Yongxue; Li, Feixue; Cheng, Liang; Zhu, A.-xing; Li, Manchun
2015-12-01
It is essential to adopt parallel computing technology to rapidly rasterize massive polygon data. In parallel rasterization, it is difficult to design an effective data decomposition method. Conventional methods ignore load balancing of polygon complexity in parallel rasterization and thus fail to achieve high parallel efficiency. In this paper, a novel data decomposition method based on polygon complexity (DMPC) is proposed. First, four factors that possibly affect the rasterization efficiency were investigated. Then, a metric represented by the boundary number and raster pixel number in the minimum bounding rectangle was developed to calculate the complexity of each polygon. Using this metric, polygons were rationally allocated according to the polygon complexity, and each process could achieve balanced loads of polygon complexity. To validate the efficiency of DMPC, it was used to parallelize different polygon rasterization algorithms and tested on different datasets. Experimental results showed that DMPC could effectively parallelize polygon rasterization algorithms. Furthermore, the implemented parallel algorithms with DMPC could achieve good speedup ratios of at least 15.69 and generally outperformed conventional decomposition methods in terms of parallel efficiency and load balancing. In addition, the results showed that DMPC exhibited consistently better performance for different spatial distributions of polygons.
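A simplified version of the idea can be sketched as follows: score each polygon with a complexity proxy built from its boundary vertex count and the raster pixel count of its minimum bounding rectangle, then hand polygons to processes greedily so that the summed complexity per process stays balanced. The weighting of the two terms and the greedy allocation are assumptions; the paper's exact metric and scheduling may differ.

```python
import heapq

def polygon_complexity(n_boundary_points, mbr, cell_size, alpha=1.0, beta=1.0):
    """Complexity proxy: weighted sum of boundary vertex count and MBR raster pixel count."""
    xmin, ymin, xmax, ymax = mbr
    pixels = max(1, round((xmax - xmin) / cell_size)) * max(1, round((ymax - ymin) / cell_size))
    return alpha * n_boundary_points + beta * pixels

def allocate(polygons, n_procs):
    """Greedy longest-processing-time allocation: each polygon (most complex first)
    goes to the currently least loaded process."""
    heap = [(0.0, p) for p in range(n_procs)]          # (accumulated load, process id)
    heapq.heapify(heap)
    assignment = {p: [] for p in range(n_procs)}
    for poly_id, cost in sorted(polygons, key=lambda item: -item[1]):
        load, proc = heapq.heappop(heap)
        assignment[proc].append(poly_id)
        heapq.heappush(heap, (load + cost, proc))
    return assignment

# Illustrative polygons as (id, complexity) pairs built with the proxy above.
polys = [(i, polygon_complexity(50 + 10 * i, (0, 0, i + 1, 2), 0.1)) for i in range(12)]
print(allocate(polys, n_procs=4))
```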
Ruan, Peiying; Hayashida, Morihiro; Maruyama, Osamu; Akutsu, Tatsuya
2013-01-01
Since many proteins express their functional activity by interacting with other proteins and forming protein complexes, it is very useful to identify sets of proteins that form complexes. For that purpose, many methods for predicting protein complexes from protein-protein interactions have been developed, such as MCL, MCODE, RNSC, PCP, RRW, and NWE. These methods have dealt only with complexes of size greater than three, because they are often based on the density of subgraphs. However, heterodimeric protein complexes, which consist of two distinct proteins, account for a large fraction of known complexes according to several comprehensive databases. In this paper, we propose several feature space mappings from protein-protein interaction data, in which each interaction is weighted based on its reliability. Furthermore, we make use of prior knowledge on protein domains to develop additional feature space mappings: a domain composition kernel and its combination kernel with our proposed features. We perform ten-fold cross-validation computational experiments. The results suggest that our proposed kernel considerably outperforms the naive Bayes-based method, which is the best existing method for predicting heterodimeric protein complexes. PMID:23776458
A novel visual saliency detection method for infrared video sequences
NASA Astrophysics Data System (ADS)
Wang, Xin; Zhang, Yuzhen; Ning, Chen
2017-12-01
Infrared video applications such as target detection and recognition, moving target tracking, and so forth can benefit a lot from visual saliency detection, which is essentially a method to automatically localize the "important" content in videos. In this paper, a novel visual saliency detection method for infrared video sequences is proposed. Specifically, for infrared video saliency detection, both the spatial saliency and temporal saliency are considered. For spatial saliency, we adopt a mutual consistency-guided spatial cues combination-based method to capture the regions with obvious luminance contrast and contour features. For temporal saliency, a multi-frame symmetric difference approach is proposed to discriminate salient moving regions of interest from background motions. Then, the spatial saliency and temporal saliency are combined to compute the spatiotemporal saliency using an adaptive fusion strategy. Besides, to highlight the spatiotemporal salient regions uniformly, a multi-scale fusion approach is embedded into the spatiotemporal saliency model. Finally, a Gestalt theory-inspired optimization algorithm is designed to further improve the reliability of the final saliency map. Experimental results demonstrate that our method outperforms many state-of-the-art saliency detection approaches for infrared videos under various backgrounds.
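The temporal component rests on symmetric frame differencing, which is easy to illustrate: a pixel counts as salient motion only if it differs from frames both before and after the current one, which suppresses the uncovered-background ghosts of ordinary two-frame differencing. The frame offset and threshold below are arbitrary, and the spatial saliency, fusion, and Gestalt optimization stages of the paper are not shown.

```python
import numpy as np

def symmetric_difference_saliency(frames, k=2, thresh=25):
    """Temporal saliency masks for a stack of grayscale frames of shape (T, H, W).

    A pixel is flagged only when it differs from both the frame k steps earlier
    and the frame k steps later, i.e. it belongs to a genuinely moving region.
    """
    frames = frames.astype(np.float32)
    saliency = np.zeros(frames.shape, dtype=bool)
    for t in range(k, frames.shape[0] - k):
        back = np.abs(frames[t] - frames[t - k])
        forward = np.abs(frames[t] - frames[t + k])
        saliency[t] = np.minimum(back, forward) > thresh
    return saliency

# Toy sequence: a bright block drifting across a noisy background.
rng = np.random.default_rng(0)
frames = rng.normal(60, 3, size=(10, 64, 64))
for t in range(10):
    frames[t, 20:30, 5 * t:5 * t + 10] += 120
print(symmetric_difference_saliency(frames).sum(axis=(1, 2)))   # salient pixels per frame
```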
Adaptive Modeling Procedure Selection by Data Perturbation.
Zhang, Yongli; Shen, Xiaotong
2015-10-01
Many procedures have been developed to deal with the high-dimensional problem that is emerging in various business and economics areas. To evaluate and compare these procedures, modeling uncertainty caused by model selection and parameter estimation has to be assessed and integrated into a modeling process. To do this, a data perturbation method estimates the modeling uncertainty inherited in a selection process by perturbing the data. Critical to data perturbation is the size of perturbation, as the perturbed data should resemble the original dataset. To account for the modeling uncertainty, we derive the optimal size of perturbation, which adapts to the data, the model space, and other relevant factors in the context of linear regression. On this basis, we develop an adaptive data-perturbation method that, unlike its nonadaptive counterpart, performs well in different situations. This leads to a data-adaptive model selection method. Both theoretical and numerical analysis suggest that the data-adaptive model selection method adapts to distinct situations in that it yields consistent model selection and optimal prediction, without knowing which situation exists a priori. The proposed method is applied to real data from the commodity market and outperforms its competitors in terms of price forecasting accuracy.
Tenzer, S; Peters, B; Bulik, S; Schoor, O; Lemmel, C; Schatz, M M; Kloetzel, P-M; Rammensee, H-G; Schild, H; Holzhütter, H-G
2005-05-01
Epitopes presented by major histocompatibility complex (MHC) class I molecules are selected by a multi-step process. Here we present the first computational prediction of this process based on in vitro experiments characterizing proteasomal cleavage, transport by the transporter associated with antigen processing (TAP) and MHC class I binding. Our novel prediction method for proteasomal cleavages outperforms existing methods when tested on in vitro cleavage data. The analysis of our predictions for a new dataset consisting of 390 endogenously processed MHC class I ligands from cells with known proteasome composition shows that the immunological advantage of switching from constitutive to immunoproteasomes is mainly to suppress the creation of peptides in the cytosol that TAP cannot transport. Furthermore, we show that proteasomes are unlikely to generate MHC class I ligands with a C-terminal lysine residue, suggesting processing of these ligands by a different protease that may be tripeptidyl-peptidase II (TPPII).
C-fuzzy variable-branch decision tree with storage and classification error rate constraints
NASA Astrophysics Data System (ADS)
Yang, Shiueng-Bien
2009-10-01
The C-fuzzy decision tree (CFDT), which is based on the fuzzy C-means algorithm, has recently been proposed. The CFDT is grown by selecting the nodes to be split according to its classification error rate. However, the CFDT design does not consider the classification time taken to classify the input vector. Thus, the CFDT can be improved. We propose a new C-fuzzy variable-branch decision tree (CFVBDT) with storage and classification error rate constraints. The design of the CFVBDT consists of two phases: growing and pruning. The CFVBDT is grown by selecting the nodes to be split according to the classification error rate and the classification time in the decision tree. Additionally, the pruning method selects the nodes to prune based on the storage requirement and the classification time of the CFVBDT. Furthermore, the number of branches of each internal node is variable in the CFVBDT. Experimental results indicate that the proposed CFVBDT outperforms the CFDT and other methods.
Metal Oxide Gas Sensor Drift Compensation Using a Two-Dimensional Classifier Ensemble
Liu, Hang; Chu, Renzhi; Tang, Zhenan
2015-01-01
Sensor drift is the most challenging problem in gas sensing at present. We propose a novel two-dimensional classifier ensemble strategy to solve the gas discrimination problem, regardless of the gas concentration, with high accuracy over extended periods of time. This strategy is appropriate for multi-class classifiers that consist of combinations of pairwise classifiers, such as support vector machines. We compare the performance of the strategy with those of competing methods in an experiment based on a public dataset that was compiled over a period of three years. The experimental results demonstrate that the two-dimensional ensemble outperforms the other methods considered. Furthermore, we propose a pre-aging process inspired by that applied to the sensors to improve the stability of the classifier ensemble. The experimental results demonstrate that the weight of each multi-class classifier model in the ensemble remains fairly static before and after the addition of new classifier models to the ensemble, when a pre-aging procedure is applied. PMID:25942640
Surface Estimation, Variable Selection, and the Nonparametric Oracle Property.
Storlie, Curtis B; Bondell, Howard D; Reich, Brian J; Zhang, Hao Helen
2011-04-01
Variable selection for multivariate nonparametric regression is an important, yet challenging, problem due, in part, to the infinite dimensionality of the function space. An ideal selection procedure should be automatic, stable, easy to use, and have desirable asymptotic properties. In particular, we define a selection procedure to be nonparametric oracle (np-oracle) if it consistently selects the correct subset of predictors and at the same time estimates the smooth surface at the optimal nonparametric rate, as the sample size goes to infinity. In this paper, we propose a model selection procedure for nonparametric models, and explore the conditions under which the new method enjoys the aforementioned properties. Developed in the framework of smoothing spline ANOVA, our estimator is obtained via solving a regularization problem with a novel adaptive penalty on the sum of functional component norms. Theoretical properties of the new estimator are established. Additionally, numerous simulated and real examples further demonstrate that the new approach substantially outperforms other existing methods in the finite sample setting.
Just how good an investment is the biopharmaceutical sector?
Thakor, Richard T; Anaya, Nicholas; Zhang, Yuwei; Vilanilam, Christian; Siah, Kien Wei; Wong, Chi Heem; Lo, Andrew W
2017-12-01
Uncertainty surrounding the risk and reward of investments in biopharmaceutical companies poses a challenge to those interested in funding such enterprises. Using data on publicly traded stocks, we track the performance of 1,066 biopharmaceutical companies from 1930 to 2015, the most comprehensive financial analysis of this sector to date. Our systematic exploration of methods for distinguishing biotech and pharmaceutical companies yields a dynamic, more accurate classification method. We find that the performance of the biotech sector is highly sensitive to the presence of a few outlier companies, and confirm that nearly all biotech companies are loss-making enterprises, exhibiting high stock volatility. In contrast, since 2000, pharmaceutical companies have become increasingly profitable, with risk-adjusted returns consistently outperforming the market. The performance of all biopharmaceutical companies is subject not only to factors arising from their drug pipelines (idiosyncratic risk), but also to general economic conditions (systematic risk). The risk associated with returns has profound implications both for patterns of investment and for funding innovation in biomedical R&D.
Robust model-based analysis of single-particle tracking experiments with Spot-On.
Hansen, Anders S; Woringer, Maxime; Grimm, Jonathan B; Lavis, Luke D; Tjian, Robert; Darzacq, Xavier
2018-01-04
Single-particle tracking (SPT) has become an important method to bridge biochemistry and cell biology since it allows direct observation of protein binding and diffusion dynamics in live cells. However, accurately inferring information from SPT studies is challenging due to biases in both data analysis and experimental design. To address analysis bias, we introduce 'Spot-On', an intuitive web-interface. Spot-On implements a kinetic modeling framework that accounts for known biases, including molecules moving out-of-focus, and robustly infers diffusion constants and subpopulations from pooled single-molecule trajectories. To minimize inherent experimental biases, we implement and validate stroboscopic photo-activation SPT (spaSPT), which minimizes motion-blur bias and tracking errors. We validate Spot-On using experimentally realistic simulations and show that Spot-On outperforms other methods. We then apply Spot-On to spaSPT data from live mammalian cells spanning a wide range of nuclear dynamics and demonstrate that Spot-On consistently and robustly infers subpopulation fractions and diffusion constants. © 2018, Hansen et al.
Identifying online user reputation of user-object bipartite networks
NASA Astrophysics Data System (ADS)
Liu, Xiao-Lu; Liu, Jian-Guo; Yang, Kai; Guo, Qiang; Han, Jing-Ti
2017-02-01
Identifying online user reputation based on the rating information of the user-object bipartite networks is important for understanding online user collective behaviors. Based on the Bayesian analysis, we present a parameter-free algorithm for ranking online user reputation, where the user reputation is calculated based on the probability that their ratings are consistent with the main part of all user opinions. The experimental results show that the AUC values of the presented algorithm could reach 0.8929 and 0.8483 for the MovieLens and Netflix data sets, respectively, which is better than the results generated by the CR and IARR methods. Furthermore, the experimental results for different user groups indicate that the presented algorithm outperforms the iterative ranking methods in both ranking accuracy and computation complexity. Moreover, the results for the synthetic networks show that the computation complexity of the presented algorithm is a linear function of the network size, which suggests that the presented algorithm is very effective and efficient for the large scale dynamic online systems.
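The intuition behind such reputation scores can be sketched without the full Bayesian machinery: rate each user by how often their ratings agree with the consensus (here simply the mean rating) of the objects they rated. The consensus definition and the agreement tolerance below are simplifications for illustration, not the presented parameter-free algorithm.

```python
import numpy as np
from collections import defaultdict

# Illustrative (user, object, rating) triples on a 1-5 scale.
ratings = [(u, 'a', 5) for u in range(4)] + [(u, 'b', 4) for u in range(4)]
ratings += [(4, 'a', 1), (4, 'b', 2)]                 # user 4 rates against the majority

# Consensus opinion per object: the mean rating over all users.
by_object = defaultdict(list)
for user, obj, r in ratings:
    by_object[obj].append(r)
consensus = {obj: np.mean(rs) for obj, rs in by_object.items()}

# Reputation: fraction of a user's ratings within one point of the consensus.
by_user = defaultdict(list)
for user, obj, r in ratings:
    by_user[user].append(abs(r - consensus[obj]) <= 1.0)
reputation = {user: float(np.mean(hits)) for user, hits in by_user.items()}
print(reputation)                                     # user 4 receives the lowest score
```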
Security Event Recognition for Visual Surveillance
NASA Astrophysics Data System (ADS)
Liao, W.; Yang, C.; Yang, M. Ying; Rosenhahn, B.
2017-05-01
With the rapidly increasing deployment of surveillance cameras, reliable methods for automatically analyzing surveillance video and recognizing special events are demanded by a range of practical applications. This paper proposes a novel and effective framework for security event analysis in surveillance videos. First, a convolutional neural network (CNN) framework is used to detect objects of interest in the given videos. Second, the owners of the objects are recognized and monitored in real-time as well. If anyone moves an object, the system verifies whether that person is its owner. If not, the event is further analyzed and classified into one of two different scenes: moving the object away or stealing it. To validate the proposed approach, a new video dataset consisting of various scenarios is constructed for these more complex tasks. For comparison purposes, experiments are also carried out on benchmark databases for the related task of abandoned luggage detection. The experimental results show that the proposed approach outperforms the state-of-the-art methods and is effective in recognizing complex security events.
An experimental phylogeny to benchmark ancestral sequence reconstruction
Randall, Ryan N.; Radford, Caelan E.; Roof, Kelsey A.; Natarajan, Divya K.; Gaucher, Eric A.
2016-01-01
Ancestral sequence reconstruction (ASR) is a still-burgeoning method that has revealed many key mechanisms of molecular evolution. One criticism of the approach is an inability to validate its algorithms within a biological context as opposed to a computer simulation. Here we build an experimental phylogeny using the gene of a single red fluorescent protein to address this criticism. The evolved phylogeny consists of 19 operational taxonomic units (leaves) and 17 ancestral bifurcations (nodes) that display a wide variety of fluorescent phenotypes. The 19 leaves then serve as ‘modern' sequences that we subject to ASR analyses using various algorithms and to benchmark against the known ancestral genotypes and ancestral phenotypes. We confirm computer simulations that show all algorithms infer ancient sequences with high accuracy, yet we also reveal wide variation in the phenotypes encoded by incorrectly inferred sequences. Specifically, Bayesian methods incorporating rate variation significantly outperform the maximum parsimony criterion in phenotypic accuracy. Subsampling of extant sequences had minor effect on the inference of ancestral sequences. PMID:27628687
Discriminative parameter estimation for random walks segmentation.
Baudin, Pierre-Yves; Goodman, Danny; Kumrnar, Puneet; Azzabou, Noura; Carlier, Pierre G; Paragios, Nikos; Kumar, M Pawan
2013-01-01
The Random Walks (RW) algorithm is one of the most efficient and easy-to-use probabilistic segmentation methods. By combining contrast terms with prior terms, it provides accurate segmentations of medical images in a fully automated manner. However, one of the main drawbacks of using the RW algorithm is that its parameters have to be hand-tuned. We propose a novel discriminative learning framework that estimates the parameters using a training dataset. The main challenge we face is that the training samples are not fully supervised. Specifically, they provide a hard segmentation of the images, instead of a probabilistic segmentation. We overcome this challenge by treating the optimal probabilistic segmentation that is compatible with the given hard segmentation as a latent variable. This allows us to employ the latent support vector machine formulation for parameter estimation. We show that our approach significantly outperforms the baseline methods on a challenging dataset consisting of real clinical 3D MRI volumes of skeletal muscles.
A Hybrid alldifferent-Tabu Search Algorithm for Solving Sudoku Puzzles.
Soto, Ricardo; Crawford, Broderick; Galleguillos, Cristian; Paredes, Fernando; Norero, Enrique
2015-01-01
The Sudoku problem is a well-known logic-based puzzle of combinatorial number-placement. It consists in filling an n² × n² grid, composed of n columns, n rows, and n subgrids, each one containing distinct integers from 1 to n². Such a puzzle belongs to the NP-complete collection of problems, to which there exist diverse exact and approximate methods able to solve it. In this paper, we propose a new hybrid algorithm that smartly combines a classic tabu search procedure with the alldifferent global constraint from the constraint programming world. The alldifferent constraint is known to be efficient for domain filtering in the presence of constraints that must be pairwise different, which are exactly the kind of constraints that Sudokus own. This ability clearly alleviates the work of the tabu search, resulting in a faster and more robust approach for solving Sudokus. We illustrate interesting experimental results where our proposed algorithm outperforms the best results previously reported by hybrids and approximate methods.
Floden, Evan W; Tommaso, Paolo D; Chatzou, Maria; Magis, Cedrik; Notredame, Cedric; Chang, Jia-Ming
2016-07-08
The PSI/TM-Coffee web server performs multiple sequence alignment (MSA) of proteins by combining homology extension with a consistency based alignment approach. Homology extension is performed with Position Specific Iterative (PSI) BLAST searches against a choice of redundant and non-redundant databases. The main novelty of this server is to allow databases of reduced complexity to rapidly perform homology extension. This server also gives the possibility to use transmembrane proteins (TMPs) reference databases to allow even faster homology extension on this important category of proteins. Aside from an MSA, the server also outputs topological prediction of TMPs using the HMMTOP algorithm. Previous benchmarking of the method has shown this approach outperforms the most accurate alignment methods such as MSAProbs, Kalign, PROMALS, MAFFT, ProbCons and PRALINE™. The web server is available at http://tcoffee.crg.cat/tmcoffee. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
A Robust Statistics Approach to Minimum Variance Portfolio Optimization
NASA Astrophysics Data System (ADS)
Yang, Liusha; Couillet, Romain; McKay, Matthew R.
2015-12-01
We study the design of portfolios under a minimum risk criterion. The performance of the optimized portfolio relies on the accuracy of the estimated covariance matrix of the portfolio asset returns. For large portfolios, the number of available market returns is often of similar order to the number of assets, so that the sample covariance matrix performs poorly as a covariance estimator. Additionally, financial market data often contain outliers which, if not correctly handled, may further corrupt the covariance estimation. We address these shortcomings by studying the performance of a hybrid covariance matrix estimator based on Tyler's robust M-estimator and on Ledoit-Wolf's shrinkage estimator while assuming samples with heavy-tailed distribution. Employing recent results from random matrix theory, we develop a consistent estimator of (a scaled version of) the realized portfolio risk, which is minimized by optimizing online the shrinkage intensity. Our portfolio optimization method is shown via simulations to outperform existing methods both for synthetic and real market data.
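As a simplified reference point, the sketch below builds global minimum variance weights from a Ledoit-Wolf shrinkage covariance estimate (scikit-learn) of synthetic heavy-tailed returns; the Tyler M-estimator hybrid and the online tuning of the shrinkage intensity developed in the paper are not implemented here.

```python
import numpy as np
from sklearn.covariance import LedoitWolf

def gmv_weights(cov):
    """Global minimum variance weights: w = inv(Sigma) 1 / (1' inv(Sigma) 1)."""
    ones = np.ones(cov.shape[0])
    w = np.linalg.solve(cov, ones)
    return w / w.sum()

# Synthetic heavy-tailed returns with the number of assets comparable to the sample size.
rng = np.random.default_rng(0)
n_obs, n_assets = 250, 100
returns = rng.standard_t(df=4, size=(n_obs, n_assets)) * 0.01

w_sample = gmv_weights(np.cov(returns, rowvar=False))
w_shrunk = gmv_weights(LedoitWolf().fit(returns).covariance_)
print("weights sum to one:", round(w_sample.sum(), 6), round(w_shrunk.sum(), 6))
print("largest |weight| (sample vs shrinkage):",
      round(np.abs(w_sample).max(), 4), round(np.abs(w_shrunk).max(), 4))
```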
Orthogonal Procrustes Analysis for Dictionary Learning in Sparse Linear Representation
Grossi, Giuliano; Lin, Jianyi
2017-01-01
In the sparse representation model, the design of overcomplete dictionaries plays a key role for the effectiveness and applicability in different domains. Recent research has produced several dictionary learning approaches, and it has been shown that dictionaries learnt from data examples significantly outperform structured ones, e.g. wavelet transforms. In this context, learning consists in adapting the dictionary atoms to a set of training signals in order to promote a sparse representation that minimizes the reconstruction error. Finding the best fitting dictionary remains a very difficult task, leaving the question still open. A well-established heuristic method for tackling this problem is an iterative alternating scheme, adopted for instance in the well-known K-SVD algorithm. Essentially, it consists in repeating two stages; the former promotes sparse coding of the training set and the latter adapts the dictionary to reduce the error. In this paper we present R-SVD, a new method that, while maintaining the alternating scheme, adopts the Orthogonal Procrustes analysis to update the dictionary atoms suitably arranged into groups. Comparative experiments on synthetic data prove the effectiveness of R-SVD with respect to well known dictionary learning algorithms such as K-SVD, ILS-DLA and the online method OSDL. Moreover, experiments on natural data such as ECG compression, EEG sparse representation, and image modeling confirm R-SVD’s robustness and wide applicability. PMID:28103283
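The Procrustes step at the heart of such an update can be illustrated compactly: given a group of dictionary atoms D (as columns) and a target matrix T derived from the current sparse codes and residual, the orthogonal matrix R minimizing ||R D - T||_F follows from an SVD and rotates the whole group at once. How T is actually formed in R-SVD is not shown here; the snippet only verifies the rotation step on synthetic data.

```python
import numpy as np

def procrustes_rotation(D_group, T_group):
    """Orthogonal matrix R minimizing ||R @ D_group - T_group||_F,
    obtained in closed form from the SVD of T_group @ D_group.T."""
    U, _, Vt = np.linalg.svd(T_group @ D_group.T)
    return U @ Vt

# Sanity check: recover a known rotation applied to a group of atoms (columns of D).
rng = np.random.default_rng(0)
D = rng.standard_normal((16, 5))                      # 5 atoms of dimension 16
Q, _ = np.linalg.qr(rng.standard_normal((16, 16)))    # ground-truth orthogonal transform
T = Q @ D                                             # target the group should be rotated onto
R = procrustes_rotation(D, T)
print(np.allclose(R @ D, T))                          # True: the group is mapped exactly onto T
```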
Iwata, Hiroaki; Sawada, Ryusuke; Mizutani, Sayaka; Yamanishi, Yoshihiro
2015-02-23
Drug repositioning, or the application of known drugs to new indications, is a challenging issue in pharmaceutical science. In this study, we developed a new computational method to predict unknown drug indications for systematic drug repositioning in a framework of supervised network inference. We defined a descriptor for each drug-disease pair based on the phenotypic features of drugs (e.g., medicinal effects and side effects) and various molecular features of diseases (e.g., disease-causing genes, diagnostic markers, disease-related pathways, and environmental factors) and constructed a statistical model to predict new drug-disease associations for a wide range of diseases in the International Classification of Diseases. Our results show that the proposed method outperforms previous methods in terms of accuracy and applicability, and its performance does not depend on drug chemical structure similarity. Finally, we performed a comprehensive prediction of a drug-disease association network consisting of 2349 drugs and 858 diseases and described biologically meaningful examples of newly predicted drug indications for several types of cancers and nonhereditary diseases.
Wu, Zhenyu; Guo, Yang; Lin, Wenfang; Yu, Shuyang; Ji, Yang
2018-04-05
Predictive maintenance plays an important role in modern Cyber-Physical Systems (CPSs) and data-driven methods have been a worthwhile direction for Prognostics Health Management (PHM). However, two main challenges have significant influences on the traditional fault diagnostic models: one is that extracting hand-crafted features from multi-dimensional sensors with internal dependencies depends too much on expertise knowledge; the other is that imbalance pervasively exists among faulty and normal samples. As deep learning models have proved to be good methods for automatic feature extraction, the objective of this paper is to study an optimized deep learning model for imbalanced fault diagnosis for CPSs. Thus, this paper proposes a weighted Long Recurrent Convolutional LSTM model with sampling policy (wLRCL-D) to deal with these challenges. The model consists of 2-layer CNNs, 2-layer inner LSTMs and 2-Layer outer LSTMs, with under-sampling policy and weighted cost-sensitive loss function. Experiments are conducted on PHM 2015 challenge datasets, and the results show that wLRCL-D outperforms other baseline methods.
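A heavily simplified PyTorch sketch of this kind of architecture (1D convolutions over multi-sensor windows feeding a stacked LSTM, trained with a class-weighted loss) is given below. Layer sizes, the number of layers, and the class weights are assumptions for illustration; they do not reproduce the exact wLRCL-D configuration, its inner/outer LSTM split, its sampling policy, or the PHM 2015 data handling.

```python
import torch
import torch.nn as nn

class ConvLSTMClassifier(nn.Module):
    """CNN feature extractor over multi-sensor windows followed by a stacked LSTM."""
    def __init__(self, n_sensors, n_classes, hidden=64):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv1d(n_sensors, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=5, padding=2), nn.ReLU(),
        )
        self.lstm = nn.LSTM(input_size=64, hidden_size=hidden, num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):                       # x: (batch, n_sensors, time)
        h = self.cnn(x)                         # (batch, 64, time)
        h, _ = self.lstm(h.transpose(1, 2))     # (batch, time, hidden)
        return self.head(h[:, -1])              # classify from the last time step

model = ConvLSTMClassifier(n_sensors=8, n_classes=3)
class_weights = torch.tensor([0.2, 1.0, 1.0])   # down-weight the (assumed) majority normal class
criterion = nn.CrossEntropyLoss(weight=class_weights)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

x = torch.randn(16, 8, 128)                     # a toy batch of sensor windows
y = torch.randint(0, 3, (16,))
optimizer.zero_grad()
loss = criterion(model(x), y)
loss.backward()
optimizer.step()
print(float(loss))
```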
An adaptive block-based fusion method with LUE-SSIM for multi-focus images
NASA Astrophysics Data System (ADS)
Zheng, Jianing; Guo, Yongcai; Huang, Yukun
2016-09-01
Because of the lenses' limited depth of field, digital cameras are incapable of acquiring an all-in-focus image of objects at varying distances in a scene. Multi-focus image fusion techniques can effectively solve this problem. However, block-based multi-focus image fusion methods often suffer from blocking artifacts. An adaptive block-based fusion method based on lifting undistorted-edge structural similarity (LUE-SSIM) is put forward. In this method, the image quality metric LUE-SSIM is first proposed, which utilizes the characteristics of the human visual system (HVS) and structural similarity (SSIM) to make the metric consistent with human visual perception. A particle swarm optimization (PSO) algorithm, which selects LUE-SSIM as the objective function, is used to optimize the block size for constructing the fused image. Experimental results on the LIVE image database show that LUE-SSIM outperforms SSIM on Gaussian defocus blur image quality assessment. Besides, a multi-focus image fusion experiment is carried out to verify our proposed image fusion method in terms of visual and quantitative evaluation. The results show that the proposed method performs better than some other block-based methods, especially in reducing the blocking artifacts of the fused image, and that our method can effectively preserve the undistorted-edge details in the focus regions of the source images.
Efficient video-equipped fire detection approach for automatic fire alarm systems
NASA Astrophysics Data System (ADS)
Kang, Myeongsu; Tung, Truong Xuan; Kim, Jong-Myon
2013-01-01
This paper proposes an efficient four-stage approach that automatically detects fire using video capabilities. In the first stage, an approximate median method is used to detect video frame regions involving motion. In the second stage, a fuzzy c-means-based clustering algorithm is employed to extract candidate regions of fire from all of the movement-containing regions. In the third stage, a gray level co-occurrence matrix is used to extract texture parameters by tracking red-colored objects in the candidate regions. These texture features are, subsequently, used as inputs of a back-propagation neural network to distinguish between fire and nonfire. Experimental results indicate that the proposed four-stage approach outperforms other fire detection algorithms in terms of consistently increasing the accuracy of fire detection in both indoor and outdoor test videos.
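The first stage relies on the approximate median method, which keeps a running background estimate by nudging each background pixel one step toward the current frame and flags pixels that deviate from that background by more than a threshold. A minimal sketch, with arbitrary step size and threshold:

```python
import numpy as np

def approximate_median_motion(frames, step=1.0, thresh=30):
    """Yield a boolean motion mask per frame using the approximate median method."""
    background = frames[0].astype(np.float32)
    for frame in frames[1:]:
        frame = frame.astype(np.float32)
        # Nudge the background one step toward the current frame (approximate median update).
        background += step * np.sign(frame - background)
        yield np.abs(frame - background) > thresh

# Toy sequence: a static noisy background with a bright moving block.
rng = np.random.default_rng(0)
frames = rng.normal(80, 2, size=(30, 48, 48))
for t in range(30):
    frames[t, 10:18, t:t + 8] += 100
masks = list(approximate_median_motion(frames))
print([int(m.sum()) for m in masks[:5]])        # moving pixels detected in the first frames
```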
Adaptive filtering with the self-organizing map: a performance comparison.
Barreto, Guilherme A; Souza, Luís Gustavo M
2006-01-01
In this paper we provide an in-depth evaluation of the SOM as a feasible tool for nonlinear adaptive filtering. A comprehensive survey of existing SOM-based and related architectures for learning input-output mappings is carried out and the application of these architectures to nonlinear adaptive filtering is formulated. Then, we introduce two simple procedures for building RBF-based nonlinear filters using the Vector-Quantized Temporal Associative Memory (VQTAM), a recently proposed method for learning dynamical input-output mappings using the SOM. The aforementioned SOM-based adaptive filters are compared with standard FIR/LMS and FIR/LMS-Newton linear transversal filters, as well as with powerful MLP-based filters in nonlinear channel equalization and inverse modeling tasks. The obtained results in both tasks indicate that SOM-based filters can consistently outperform powerful MLP-based ones.
Corporate Health and Wellness and the Financial Bottom Line
Conradie, Christina Susanna; van der Merwe Smit, Eon; Malan, Daniel Pieter
2016-01-01
Objective: The research objective was to test the hypothesis that corporate health and wellness contributed positively to South African companies’ financial results. Methods: The past share market performance of eligible healthy companies, based on Discovery's Healthy Company Index, was tracked under three investment scenarios and compared with the market performance on the basis of the JSE FTSE All Share Index. Results: The evidence supports the hypothesis that a culture of health and wellness provides a financial advantage, in so far as the portfolio of healthy companies consistently outperformed the market over the selected simulations. Conclusions: Given the limitations of the investigation, namely small sample size, the brevity of the period of investigation, and the reliance on accessibility sampling, the research provides the first and preliminary evidence supportive of the direct financial benefits of companies’ wellness programs. PMID:26849271
NASA Astrophysics Data System (ADS)
Macian-Sorribes, Hector; Pulido-Velazquez, Manuel; Tilmant, Amaury
2015-04-01
Stochastic programming methods are better suited to deal with the inherent uncertainty of inflow time series in water resource management. However, one of the most important hurdles in their practical implementation is the lack of generalized Decision Support System (DSS) shells, which are usually based on a deterministic approach. The purpose of this contribution is to present a general-purpose DSS shell, named Explicit Stochastic Programming Advanced Tool (ESPAT), able to build and solve stochastic programming problems for most water resource systems. It implements a hydro-economic approach, optimizing the total system benefits as the sum of the benefits obtained by each user. It has been coded using GAMS, and implements a Microsoft Excel interface with a GAMS-Excel link that allows the user to introduce the required data and recover the results. Therefore, no GAMS skills are required to run the program. The tool is divided into four modules according to its capabilities: 1) the ESPATR module, which performs stochastic optimization procedures in surface water systems using a Stochastic Dual Dynamic Programming (SDDP) approach; 2) the ESPAT_RA module, which optimizes coupled surface-groundwater systems using a modified SDDP approach; 3) the ESPAT_SDP module, capable of performing stochastic optimization procedures in small-size surface systems using a standard SDP approach; and 4) the ESPAT_DET module, which implements a deterministic programming procedure using non-linear programming, able to solve deterministic optimization problems in complex surface-groundwater river basins. The case study of the Mijares river basin (Spain) is used to illustrate the method. It consists of two reservoirs in series, one aquifer and four agricultural demand sites currently managed using historical (XIV century) rights, which give priority to the most traditional irrigation district over the XX century agricultural developments. Its size makes it possible to use either the SDP or the SDDP methods. The independent use of surface water and groundwater can be examined with and without the aquifer. The ESPAT_DET, ESPATR and ESPAT_SDP modules were executed for the surface system, while the ESPAT_RA and the ESPAT_DET modules were run for the surface-groundwater system. The surface system's results show similar performance for the ESPAT_SDP and ESPATR modules, which outperform the current policies but are in turn outperformed by the ESPAT_DET results, which have the advantage of perfect foresight. The surface-groundwater system's results show a robust situation in which the differences between the modules' results and the current policies are smaller, due to the use of pumped groundwater for the XX century crops when surface water is scarce. The results are realistic, with the deterministic optimization outperforming the stochastic one, which in turn outperforms the current policies, showing that the tool is able to stochastically optimize river-aquifer water resource systems. We are currently working on the application of these tools to the analysis of changes in system operation under global change conditions. ACKNOWLEDGEMENT: This study has been partially supported by the IMPADAPT project (CGL2013-48424-C2-1-R) with Spanish MINECO (Ministerio de Economía y Competitividad) funds.
Deep Learning for Automated Extraction of Primary Sites From Cancer Pathology Reports.
Qiu, John X; Yoon, Hong-Jun; Fearn, Paul A; Tourassi, Georgia D
2018-01-01
Pathology reports are a primary source of information for cancer registries which process high volumes of free-text reports annually. Information extraction and coding is a manual, labor-intensive process. In this study, we investigated deep learning and a convolutional neural network (CNN), for extracting ICD-O-3 topographic codes from a corpus of breast and lung cancer pathology reports. We performed two experiments, using a CNN and a more conventional term frequency vector approach, to assess the effects of class prevalence and inter-class transfer learning. The experiments were based on a set of 942 pathology reports with human expert annotations as the gold standard. CNN performance was compared against a more conventional term frequency vector space approach. We observed that the deep learning models consistently outperformed the conventional approaches in the class prevalence experiment, resulting in micro- and macro-F score increases of up to 0.132 and 0.226, respectively, when class labels were well populated. Specifically, the best performing CNN achieved a micro-F score of 0.722 over 12 ICD-O-3 topography codes. Transfer learning provided a consistent but modest performance boost for the deep learning methods but trends were contingent on the CNN method and cancer site. These encouraging results demonstrate the potential of deep learning for automated abstraction of pathology reports.
RELIC: a novel dye-bias correction method for Illumina Methylation BeadChip.
Xu, Zongli; Langie, Sabine A S; De Boever, Patrick; Taylor, Jack A; Niu, Liang
2017-01-03
The Illumina Infinium HumanMethylation450 BeadChip and its successor, the Infinium MethylationEPIC BeadChip, have been extensively utilized in epigenome-wide association studies. Both arrays use two fluorescent dyes (Cy3-green/Cy5-red) to measure methylation level at CpG sites. However, performance differences between dyes can result in biased estimates of methylation levels. Here we describe a novel method, called REgression on Logarithm of Internal Control probes (RELIC), to correct for dye bias on the whole array by utilizing the intensity values of paired internal control probes that monitor the two color channels. We evaluate the method in several datasets against other widely used dye-bias correction methods. Results on data quality improvement showed that RELIC correction statistically significantly outperforms alternative dye-bias correction methods. We incorporated the method into the R package ENmix, which is freely available from the Bioconductor website ( https://www.bioconductor.org/packages/release/bioc/html/ENmix.html ). RELIC is an efficient and robust method to correct for dye bias in Illumina Methylation BeadChip data. It outperforms other alternative methods and is conveniently implemented in the R package ENmix to facilitate DNA methylation studies.
ERIC Educational Resources Information Center
Blackburn, Angelique Michelle
2013-01-01
Bilinguals sometimes outperform age-matched monolinguals on non-language tasks involving cognitive control. But the bilingual advantage is not consistently found in every experiment and may reflect specific attributes of the bilinguals tested. The goal of this dissertation was to determine if the way in which bilinguals use language, specifically…
1999-12-01
strategies that lead to sustained competitive advantage (set of factors or capabilities that allows firms to consistently outperform their rivals). This...a strategy-supportive culture, creating an effective organizational structure, redirecting marketing efforts, preparing budgets, developing and...evaluates if strategies are working well. This is important since external and internal factors are constantly changing. Three fundamental strategy
ERIC Educational Resources Information Center
De Goede, Maartje; Postma, Albert
2008-01-01
Object-location memory is the only spatial task where female subjects have been shown to outperform males. This result is not consistent across all studies, and may be due to the combination of the multi-component structure of object location memory with the conditions under which different studies were done. Possible gender differences in object…
MINE: Module Identification in Networks
2011-01-01
Background Graphical models of network associations are useful for both visualizing and integrating multiple types of association data. Identifying modules, or groups of functionally related gene products, is an important challenge in analyzing biological networks. However, existing tools to identify modules are insufficient when applied to dense networks of experimentally derived interaction data. To address this problem, we have developed an agglomerative clustering method that is able to identify highly modular sets of gene products within highly interconnected molecular interaction networks. Results MINE outperforms MCODE, CFinder, NEMO, SPICi, and MCL in identifying non-exclusive, high modularity clusters when applied to the C. elegans protein-protein interaction network. The algorithm generally achieves superior geometric accuracy and modularity for annotated functional categories. In comparison with the most closely related algorithm, MCODE, the top clusters identified by MINE are consistently of higher density and MINE is less likely to designate overlapping modules as a single unit. MINE offers a high level of granularity with a small number of adjustable parameters, enabling users to fine-tune cluster results for input networks with differing topological properties. Conclusions MINE was created in response to the challenge of discovering high quality modules of gene products within highly interconnected biological networks. The algorithm allows a high degree of flexibility and user-customisation of results with few adjustable parameters. MINE outperforms several popular clustering algorithms in identifying modules with high modularity and obtains good overall recall and precision of functional annotations in protein-protein interaction networks from both S. cerevisiae and C. elegans. PMID:21605434
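For readers unfamiliar with module identification, the sketch below runs a generic greedy modularity clustering on a toy interaction graph with networkx; it is a stand-in for the kind of output MINE produces, not the MINE algorithm itself.

```python
# Generic module identification on a toy interaction graph via greedy
# modularity maximization; illustrative only, not the MINE algorithm.
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities, modularity

G = nx.Graph()
G.add_edges_from([("A", "B"), ("B", "C"), ("A", "C"),   # one dense module
                  ("D", "E"), ("E", "F"), ("D", "F"),   # a second dense module
                  ("C", "D")])                          # weak inter-module link

modules = greedy_modularity_communities(G)
for i, m in enumerate(modules):
    print(f"module {i}: {sorted(m)}")
print("modularity:", round(modularity(G, modules), 3))
```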
Stochastic gradient ascent outperforms gamers in the Quantum Moves game
NASA Astrophysics Data System (ADS)
Sels, Dries
2018-04-01
In a recent work on quantum state preparation, Sørensen and co-workers [Nature (London) 532, 210 (2016), 10.1038/nature17620] explore the possibility of using video games to help design quantum control protocols. The authors present a game called "Quantum Moves" (https://www.scienceathome.org/games/quantum-moves/) in which gamers have to move an atom from A to B by means of optical tweezers. They report that, "players succeed where purely numerical optimization fails." Moreover, by harnessing the player strategies, they can "outperform the most prominent established numerical methods." The aim of this Rapid Communication is to analyze the problem in detail and show that those claims are untenable. In fact, without any prior knowledge and starting from a random initial seed, a simple stochastic local optimization method finds near-optimal solutions which outperform all players. Counterdiabatic driving can even be used to generate protocols without resorting to numeric optimization. The analysis results in an accurate analytic estimate of the quantum speed limit which, apart from zero-point motion, is shown to be entirely classical in nature. The latter might explain why gamers are reasonably good at the game. A simple modification of the BringHomeWater challenge is proposed to test this hypothesis.
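The stochastic local optimization the author describes can be illustrated with a few lines of random-perturbation hill climbing over a discretized control path; the objective below is a placeholder, not the actual Quantum Moves fidelity.

```python
# Minimal sketch of stochastic local optimization (random-perturbation hill
# climbing) over a discretized control path; the objective is a stand-in.
import numpy as np

rng = np.random.default_rng(1)

def objective(path):
    # Placeholder: prefer smooth paths that end at the target position 1.0.
    smoothness = -np.sum(np.diff(path) ** 2)
    endpoint = -(path[-1] - 1.0) ** 2
    return smoothness + endpoint

path = rng.uniform(0, 1, size=50)          # random initial seed
best = objective(path)
for _ in range(5000):
    trial = path + rng.normal(0, 0.02, size=path.size)   # local stochastic move
    value = objective(trial)
    if value > best:                        # greedy ascent: keep only improvements
        path, best = trial, value
print("best objective:", round(best, 4))
```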
Liu, Bin; Jin, Min; Zeng, Pan
2015-10-01
The identification of gene-phenotype relationships is very important for the treatment of human diseases. Studies have shown that genes causing the same or similar phenotypes tend to interact with each other in a protein-protein interaction (PPI) network. Thus, many identification methods based on the PPI network model have achieved good results. However, in the PPI network, some interactions between the proteins encoded by a candidate gene and the proteins encoded by known disease genes are very weak. Therefore, some studies have combined the PPI network with other genomic information and reported good predictive performances. However, we believe that the results could be further improved. In this paper, we propose a new method that uses the semantic similarity between the candidate gene and known disease genes to set the initial probability vector of a random walk with restart algorithm in a human PPI network. The effectiveness of our method was demonstrated by leave-one-out cross-validation, and the experimental results indicated that our method outperformed other methods. Additionally, our method can predict new causative genes of multifactor diseases, including Parkinson's disease, breast cancer and obesity. The top predictions were good and consistent with the findings in the literature, which further illustrates the effectiveness of our method. Copyright © 2015 Elsevier Inc. All rights reserved.
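The core of the method, a random walk with restart whose restart vector is seeded by semantic similarity, can be sketched in a few lines; the small adjacency matrix and similarity scores below are illustrative values, not data from the paper.

```python
# Sketch of a random walk with restart (RWR) on a tiny PPI-like network; the
# restart vector p0 would come from semantic similarity scores in practice.
import numpy as np

A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)       # adjacency of 4 proteins
W = A / A.sum(axis=0, keepdims=True)            # column-normalized transition matrix

p0 = np.array([0.7, 0.2, 0.1, 0.0])             # initial probabilities from semantic similarity
p0 = p0 / p0.sum()
r = 0.5                                         # restart probability

p = p0.copy()
for _ in range(100):
    p_next = (1 - r) * W @ p + r * p0
    if np.linalg.norm(p_next - p, 1) < 1e-10:
        break
    p = p_next
print(np.round(p, 4))                           # steady-state ranking of candidate genes
```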
Does Video-Autotutorial Instruction Improve College Student Achievement?
ERIC Educational Resources Information Center
Fisher, K. M.; And Others
1977-01-01
Compares student achievement in an upper-division college introductory course taught by the video-autotutorial method with that in two comparable courses taught by the lecture-discussion method. Pre-post tests of 623 students reveal that video-autotutorial students outperform lecture/discussion participants at all ability levels and that in…
Ou, Yangming; Resnick, Susan M.; Gur, Ruben C.; Gur, Raquel E.; Satterthwaite, Theodore D.; Furth, Susan; Davatzikos, Christos
2016-01-01
Atlas-based automated anatomical labeling is a fundamental tool in medical image segmentation, as it defines regions of interest for subsequent analysis of structural and functional image data. The extensive investigation of multi-atlas warping and fusion techniques over the past 5 or more years has clearly demonstrated the advantages of consensus-based segmentation. However, the common approach is to use multiple atlases with a single registration method and parameter set, which is not necessarily optimal for every individual scan, anatomical region, and problem/data-type. Different registration criteria and parameter sets yield different solutions, each providing complementary information. Herein, we present a consensus labeling framework that generates a broad ensemble of labeled atlases in target image space via the use of several warping algorithms, regularization parameters, and atlases. The label fusion integrates two complementary sources of information: a local similarity ranking to select locally optimal atlases and a boundary modulation term to refine the segmentation consistently with the target image's intensity profile. The ensemble approach consistently outperforms segmentations using individual warping methods alone, achieving high accuracy on several benchmark datasets. The MUSE methodology has been used for processing thousands of scans from various datasets, producing robust and consistent results. MUSE is publicly available both as a downloadable software package, and as an application that can be run on the CBICA Image Processing Portal (https://ipp.cbica.upenn.edu), a web based platform for remote processing of medical images. PMID:26679328
Multiratio fusion change detection with adaptive thresholding
NASA Astrophysics Data System (ADS)
Hytla, Patrick C.; Balster, Eric J.; Vasquez, Juan R.; Neuroth, Robert M.
2017-04-01
A ratio-based change detection method known as multiratio fusion (MRF) is proposed and tested. The MRF framework builds on other change detection components proposed in this work: dual ratio (DR) and multiratio (MR). The DR method involves two ratios coupled with adaptive thresholds to maximize detected changes and minimize false alarms. The use of two ratios is shown to outperform the single ratio case when the means of the image pairs are not equal. MR change detection builds on the DR method by including negative imagery to produce four total ratios with adaptive thresholds. Inclusion of negative imagery is shown to improve detection sensitivity and to boost detection performance in certain target and background cases. MRF further expands this concept by fusing together the ratio outputs using a routine in which detections must be verified by two or more ratios to be classified as a true changed pixel. The proposed method is tested with synthetically generated test imagery and real datasets with results compared to other methods found in the literature. DR is shown to significantly outperform the standard single ratio method. MRF produces excellent change detection results that exhibit up to a 22% performance improvement over other methods from the literature at low false-alarm rates.
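A toy version of the dual-ratio idea is sketched below: two reciprocal ratios, each with an adaptive threshold, fused so that a pixel flagged by either ratio is marked as changed. The threshold rule (mean plus a multiple of the standard deviation of the log-ratio) and the OR-style fusion are simplifying assumptions, not the exact DR/MRF formulation.

```python
# Toy dual-ratio change detection with adaptive thresholds; simplified illustration.
import numpy as np

rng = np.random.default_rng(2)
img1 = rng.normal(100, 5, size=(64, 64))
img2 = img1.copy()
img2[20:30, 20:30] += 40                      # inserted change

eps = 1e-6
r1 = (img2 + eps) / (img1 + eps)              # forward ratio: sensitive to increases
r2 = (img1 + eps) / (img2 + eps)              # reverse ratio: sensitive to decreases

def adaptive_mask(ratio, k=3.0):
    log_r = np.log(ratio)
    return log_r > log_r.mean() + k * log_r.std()   # data-driven threshold

changes = adaptive_mask(r1) | adaptive_mask(r2)      # fuse the two ratio detections
print("changed pixels detected:", int(changes.sum()))
```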
Color correction with blind image restoration based on multiple images using a low-rank model
NASA Astrophysics Data System (ADS)
Li, Dong; Xie, Xudong; Lam, Kin-Man
2014-03-01
We present a method that can handle the color correction of multiple photographs with blind image restoration simultaneously and automatically. We prove that the local colors of a set of images of the same scene exhibit the low-rank property locally both before and after a color-correction operation. This property allows us to correct all kinds of errors in an image under a low-rank matrix model without particular priors or assumptions. The possible errors may be caused by changes of viewpoint, large illumination variations, gross pixel corruptions, partial occlusions, etc. Furthermore, a new iterative soft-segmentation method is proposed for local color transfer using color influence maps. Because the correct color information and the spatial information of images can be recovered using the low-rank model, more precise color correction and many other image-restoration tasks, including image denoising, image deblurring, and gray-scale image colorizing, can be performed simultaneously. Experiments have verified that our method can achieve consistent and promising results on uncontrolled real photographs acquired from the Internet and that it outperforms current state-of-the-art methods.
Protein binding hot spots prediction from sequence only by a new ensemble learning method.
Hu, Shan-Shan; Chen, Peng; Wang, Bing; Li, Jinyan
2017-10-01
Hot spots are the interfacial core areas of binding proteins and have been used as targets in drug design. Experimental methods for locating hot spot areas are costly in both time and expense. Recently, in silico computational methods have been widely used for hot spot prediction through sequence or structure characterization. Because the structural information of proteins is not always available, hot spot identification from amino acid sequences alone is more useful for real-life applications. This work proposes a new sequence-based model that combines physicochemical features with the relative accessible surface area of amino acid sequences for hot spot prediction. The model consists of 83 classifiers based on the IBk (instance-based k-nearest neighbour) algorithm, where instances are encoded by important properties extracted from a total of 544 properties in the AAindex1 (Amino Acid Index) database. The top-performing classifiers are then selected to form an ensemble by a majority voting technique. The ensemble classifier outperforms state-of-the-art computational methods, yielding an F1 score of 0.80 on the benchmark binding interface database (BID) test set. http://www2.ahu.edu.cn/pchen/web/HotspotEC.htm .
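The ensemble construction can be illustrated with a small sketch: several instance-based (k-NN) classifiers combined by hard majority voting. The random features stand in for the AAindex-derived properties, and varying k across base learners is just one simple way to obtain diverse ensemble members; it is not the paper's selection scheme.

```python
# Minimal sketch of a majority-voting ensemble of instance-based (k-NN) classifiers.
# Features and labels are random placeholders for illustration.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import VotingClassifier

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 12))                # residue feature vectors (placeholder)
y = rng.integers(0, 2, size=200)              # 1 = hot spot, 0 = non-hot spot

estimators = [(f"knn{k}", KNeighborsClassifier(n_neighbors=k)) for k in (3, 5, 7, 9, 11)]
ensemble = VotingClassifier(estimators=estimators, voting="hard")   # majority vote
ensemble.fit(X, y)
print(ensemble.predict(X[:5]))
```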
NASA Astrophysics Data System (ADS)
Guerrout, EL-Hachemi; Ait-Aoudia, Samy; Michelucci, Dominique; Mahiou, Ramdane
2018-05-01
Many routine medical examinations produce images of patients suffering from various pathologies. With the huge number of medical images, manual analysis and interpretation has become a tedious task, and automatic image segmentation has therefore become essential for diagnosis assistance. Segmentation consists in dividing the image into homogeneous and significant regions. We use hidden Markov random fields (HMRF) to model the segmentation problem, which leads to a classical function minimisation problem. The Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm is one of the most powerful methods for solving unconstrained optimisation problems. In this paper, we investigate the combination of the HMRF model and the BFGS algorithm to perform segmentation. The proposed method shows very good segmentation results compared with well-known approaches. The tests are conducted on brain magnetic resonance image databases (BrainWeb and IBSR) widely used for objective comparison of results. The well-known Dice coefficient (DC) was used as the similarity metric. The experimental results show that, in many cases, our proposed method approaches perfect segmentation with a Dice coefficient above 0.9. Moreover, it generally outperforms the other methods in the tests conducted.
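As a concrete illustration of casting segmentation as unconstrained minimization solved by BFGS, the sketch below minimizes a simplified energy (a data-fidelity term plus a smoothness penalty on a 1-D field) with scipy; the HMRF energy used by the authors is more elaborate.

```python
# Sketch of minimizing a segmentation-style energy with BFGS (scipy); the
# energy below is a simplified stand-in for the HMRF objective.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(4)
observed = np.concatenate([rng.normal(0, 0.3, 50), rng.normal(2, 0.3, 50)])  # noisy 1-D image
beta = 5.0                                                                    # smoothness weight

def energy(x):
    data_term = np.sum((x - observed) ** 2)          # fidelity to the observations
    smooth_term = beta * np.sum(np.diff(x) ** 2)     # neighboring values should agree
    return data_term + smooth_term

res = minimize(energy, x0=np.zeros_like(observed), method="BFGS")
print("converged:", res.success, "energy:", round(res.fun, 2))
```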
Still-to-video face recognition in unconstrained environments
NASA Astrophysics Data System (ADS)
Wang, Haoyu; Liu, Changsong; Ding, Xiaoqing
2015-02-01
Face images from video sequences captured in unconstrained environments usually contain several kinds of variations, e.g. pose, facial expression, illumination, image resolution and occlusion. Motion blur and compression artifacts also deteriorate recognition performance. Moreover, in various practical systems such as law enforcement, video surveillance and e-passport identification, only a single still image per person is enrolled as the gallery set. Many existing methods may fail to work due to variations in face appearance and the limited number of available gallery samples. In this paper, we propose a novel approach for still-to-video face recognition in unconstrained environments. By assuming that faces from still images and video frames share the same identity space, a regularized least squares regression method is utilized to tackle the multi-modality problem. Regularization terms based on heuristic assumptions are introduced to avoid overfitting. To deal with the single-image-per-person problem, we exploit face variations learned from training sets to synthesize virtual samples for the gallery. We adopt a learning algorithm that combines an affine/convex hull-based approach with regularization to match image sets. Experimental results on a real-world dataset consisting of unconstrained video sequences demonstrate that our method outperforms state-of-the-art methods by a clear margin.
Wavelet median denoising of ultrasound images
NASA Astrophysics Data System (ADS)
Macey, Katherine E.; Page, Wyatt H.
2002-05-01
Ultrasound images are contaminated with both additive and multiplicative noise, which is modeled by Gaussian and speckle noise respectively. Distinguishing small features such as fallopian tubes in the female genital tract in the noisy environment is problematic. A new method for noise reduction, Wavelet Median Denoising, is presented. Wavelet Median Denoising consists of performing a standard noise reduction technique, median filtering, in the wavelet domain. The new method is tested on 126 images, comprised of 9 original images each with 14 levels of Gaussian or speckle noise. Results for both separable and non-separable wavelets are evaluated, relative to soft-thresholding in the wavelet domain, using the signal-to-noise ratio and subjective assessment. The performance of Wavelet Median Denoising is comparable to that of soft-thresholding. Both methods are more successful in removing Gaussian noise than speckle noise. Wavelet Median Denoising outperforms soft-thresholding for a larger number of cases of speckle noise reduction than of Gaussian noise reduction. Noise reduction is more successful using non-separable wavelets than separable wavelets. When both methods are applied to ultrasound images obtained from a phantom of the female genital tract a small improvement is seen; however, a substantial improvement is required prior to clinical use.
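The method itself is easy to prototype: decompose the image with a 2-D wavelet transform, median-filter the detail coefficients, and reconstruct. The wavelet family, decomposition level, and filter size below are illustrative choices, not the settings used in the study.

```python
# Sketch of wavelet median denoising: median-filter the detail coefficients of a
# 2-D wavelet decomposition and reconstruct. Parameters are illustrative.
import numpy as np
import pywt
from scipy.ndimage import median_filter

rng = np.random.default_rng(5)
image = rng.normal(0, 1, size=(128, 128)) + 5.0        # noisy test image

coeffs = pywt.wavedec2(image, wavelet="db2", level=2)
filtered = [coeffs[0]]                                  # keep the approximation coefficients
for detail_level in coeffs[1:]:
    filtered.append(tuple(median_filter(d, size=3) for d in detail_level))

denoised = pywt.waverec2(filtered, wavelet="db2")
print(denoised.shape)
```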
NASA Astrophysics Data System (ADS)
Bainbridge, Matthew B.; Webb, John K.
2017-06-01
A new and automated method is presented for the analysis of high-resolution absorption spectra. Three established numerical methods are unified into one `artificial intelligence' process: a genetic algorithm (Genetic Voigt Profile FIT, gvpfit); non-linear least-squares with parameter constraints (vpfit); and Bayesian model averaging (BMA). The method has broad application but here we apply it specifically to the problem of measuring the fine structure constant at high redshift. For this we need objectivity and reproducibility. gvpfit is also motivated by the importance of obtaining a large statistical sample of measurements of Δα/α. Interactive analyses are both time-consuming and complex, and automation makes obtaining a large sample feasible. In contrast to previous methodologies, we use BMA to derive results using a large set of models and show that this procedure is more robust than a human picking a single preferred model, since BMA avoids the systematic uncertainties associated with model choice. Numerical simulations provide stringent tests of the whole process and we show using both real and simulated spectra that the unified automated fitting procedure outperforms a human interactive analysis. The method should be invaluable in the context of future instrumentation like ESPRESSO on the VLT and indeed future ELTs. We apply the method to the z_abs = 1.8389 absorber towards the z_em = 2.145 quasar J110325-264515. The derived constraint of Δα/α = 3.3 ± 2.9 × 10⁻⁶ is consistent with no variation and also consistent with the tentative spatial variation reported in Webb et al. and King et al.
DeepQA: improving the estimation of single protein model quality with deep belief networks.
Cao, Renzhi; Bhattacharya, Debswapna; Hou, Jie; Cheng, Jianlin
2016-12-05
Protein quality assessment (QA) useful for ranking and selecting protein models has long been viewed as one of the major challenges for protein tertiary structure prediction. Especially, estimating the quality of a single protein model, which is important for selecting a few good models out of a large model pool consisting of mostly low-quality models, is still a largely unsolved problem. We introduce a novel single-model quality assessment method DeepQA based on deep belief network that utilizes a number of selected features describing the quality of a model from different perspectives, such as energy, physicochemical characteristics, and structural information. The deep belief network is trained on several large datasets consisting of models from the Critical Assessment of Protein Structure Prediction (CASP) experiments, several publicly available datasets, and models generated by our in-house ab initio method. Our experiments demonstrate that deep belief network has better performance compared to Support Vector Machines and Neural Networks on the protein model quality assessment problem, and our method DeepQA achieves the state-of-the-art performance on CASP11 dataset. It also outperformed two well-established methods in selecting good outlier models from a large set of models of mostly low quality generated by ab initio modeling methods. DeepQA is a useful deep learning tool for protein single model quality assessment and protein structure prediction. The source code, executable, document and training/test datasets of DeepQA for Linux are freely available to non-commercial users at http://cactus.rnet.missouri.edu/DeepQA/ .
Multigrid contact detection method
NASA Astrophysics Data System (ADS)
He, Kejing; Dong, Shoubin; Zhou, Zhaoyao
2007-03-01
Contact detection is a general problem in many physical simulations. This work presents an O(N) multigrid method for general contact detection problems (MGCD), integrating the multigrid idea with contact detection. Both the time complexity and memory consumption of the MGCD are O(N). Unlike other methods, whose efficiencies are influenced strongly by the object size distribution, the performance of MGCD is insensitive to the object size distribution. We compare the MGCD with the no binary search (NBS) method and the multilevel boxing method in three dimensions for both time complexity and memory consumption. For objects of similar size, the MGCD is as good as the NBS method, and both outperform the multilevel boxing method regarding memory consumption. For objects of diverse size, the MGCD outperforms both the NBS method and the multilevel boxing method. We use the MGCD to solve the contact detection problem for a granular simulation system based on the discrete element method. From this granular simulation, we obtain the packing density of monosize packing and of binary packing with a size ratio of 10. The packing density for monosize particles is 0.636. For binary packing with a size ratio of 10, when the number of small particles is 300 times the number of big particles, a maximal packing density of 0.824 is achieved.
MATRIX DISCRIMINANT ANALYSIS WITH APPLICATION TO COLORIMETRIC SENSOR ARRAY DATA
Suslick, Kenneth S.
2014-01-01
With the rapid development of nano-technology, a “colorimetric sensor array” (CSA), referred to as an optical electronic nose, has been developed for the identification of toxicants. Unlike traditional sensors which rely on a single chemical interaction, CSA can measure multiple chemical interactions by using chemo-responsive dyes. The color changes of the chemo-responsive dyes are recorded before and after exposure to toxicants and serve as a template for classification. The color changes are digitized in the form of a matrix with rows representing dye effects and columns representing the spectrum of colors. Thus, matrix-classification methods are highly desirable. In this article, we develop a novel classification method, matrix discriminant analysis (MDA), which is a generalization of linear discriminant analysis (LDA) for data in matrix form. By incorporating the intrinsic matrix structure of the data in discriminant analysis, the proposed method can improve the CSA’s sensitivity and, more importantly, its specificity. A penalized MDA method, PMDA, is also introduced to further incorporate sparsity structure in the discriminant function. Numerical studies suggest that the proposed MDA and PMDA methods outperform LDA and other competing discriminant methods for matrix predictors. The asymptotic consistency of MDA is also established. R code and data are available online as supplementary material. PMID:26783371
Multiagent scheduling method with earliness and tardiness objectives in flexible job shops.
Wu, Zuobao; Weng, Michael X
2005-04-01
Flexible job-shop scheduling problems are an important extension of the classical job-shop scheduling problems and present additional complexity, which is mainly due to the considerable overlap in the capacities of modern machines. Classical scheduling methods are generally incapable of addressing such capacity overlap. We propose a multiagent scheduling method with job earliness and tardiness objectives in a flexible job-shop environment. The earliness and tardiness objectives are consistent with the just-in-time production philosophy, which has attracted significant attention in both industry and the academic community. A new job-routing and sequencing mechanism is proposed. In this mechanism, two kinds of jobs are defined to distinguish jobs with one operation left from jobs with more than one operation left, and different criteria are proposed to route these two kinds of jobs. Job sequencing makes it possible to hold a job that would otherwise be completed too early, and two heuristic algorithms for job sequencing are developed to deal with the two kinds of jobs. The computational experiments show that the proposed multiagent scheduling method significantly outperforms existing scheduling methods in the literature. In addition, the proposed method is quite fast: the simulation time to find a complete schedule with over 2000 jobs on ten machines is less than 1.5 min.
An Exact Algorithm to Compute the Double-Cut-and-Join Distance for Genomes with Duplicate Genes.
Shao, Mingfu; Lin, Yu; Moret, Bernard M E
2015-05-01
Computing the edit distance between two genomes is a basic problem in the study of genome evolution. The double-cut-and-join (DCJ) model has formed the basis for most algorithmic research on rearrangements over the last few years. The edit distance under the DCJ model can be computed in linear time for genomes without duplicate genes, while the problem becomes NP-hard in the presence of duplicate genes. In this article, we propose an integer linear programming (ILP) formulation to compute the DCJ distance between two genomes with duplicate genes. We also provide an efficient preprocessing approach to simplify the ILP formulation while preserving optimality. Comparison on simulated genomes demonstrates that our method outperforms MSOAR in computing the edit distance, especially when the genomes contain long duplicated segments. We also apply our method to assign orthologous gene pairs among human, mouse, and rat genomes, where once again our method outperforms MSOAR.
LAMDA at TREC CDS track 2015: Clinical Decision Support Track
2015-11-20
outperforms all the other vector space models supported by Elasticsearch. MetaMap is the online tool that maps biomedical text to the Metathesaurus, and...cases. The medical knowledge consists of 700,000 biomedical documents supported by PubMed Central [3], which is an online digital database freely...Science Research Program through the National Research Foundation (NRF) of Korea funded by the Ministry of Science, ICT, and Future Planning (MSIP
Rich Representations with Exposed Semantics for Deep Visual Reasoning
2016-06-01
Gupta, Abhinav; Hebert, Martial; Aminoff, Elissa; Park, HyunSoo; Forsyth, David; Shi, Jianbo; Tarr, Michael ...approach performs equally well or better when compared to the state-of-the-art. We also expanded our research on perceiving the interactions between...cross-instance correspondences. We demonstrated that our end-to-end trained ConvNet supervised by cycle-consistency outperforms state-of-the-art
Comparison of DNA preservation methods for environmental bacterial community samples
Gray, Michael A.; Pratte, Zoe A.; Kellogg, Christina A.
2013-01-01
Field collections of environmental samples, for example corals, for molecular microbial analyses present distinct challenges. The lack of laboratory facilities in remote locations is common, and preservation of microbial community DNA for later study is critical. A particular challenge is keeping samples frozen in transit. Five nucleic acid preservation methods that do not require cold storage were compared for effectiveness over time and ease of use. Mixed microbial communities of known composition were created and preserved by DNAgard™, RNAlater®, DMSO–EDTA–salt (DESS), FTA® cards, and FTA Elute® cards. Automated ribosomal intergenic spacer analysis and clone libraries were used to detect specific changes in the faux communities over weeks and months of storage. A previously known bias in FTA® cards that results in lower recovery of pure cultures of Gram-positive bacteria was also detected in mixed community samples. There appears to be a uniform bias across all five preservation methods against microorganisms with high G + C DNA. Overall, the liquid-based preservatives (DNAgard™, RNAlater®, and DESS) outperformed the card-based methods. No single liquid method clearly outperformed the others, leaving method choice to be based on experimental design, field facilities, shipping constraints, and allowable cost.
Computation-aware algorithm selection approach for interlaced-to-progressive conversion
NASA Astrophysics Data System (ADS)
Park, Sang-Jun; Jeon, Gwanggil; Jeong, Jechang
2010-05-01
We discuss deinterlacing results in a computationally constrained and varied environment. The proposed computation-aware algorithm selection approach (CASA) for fast interlaced-to-progressive conversion consists of three methods: the line-averaging (LA) method for plain regions, the modified edge-based line-averaging (MELA) method for medium regions, and the proposed covariance-based adaptive deinterlacing (CAD) method for complex regions. CASA uses two criteria, mean-squared error (MSE) and CPU time, for assigning the method. The principal idea of CAD is the correspondence between the high- and low-resolution covariances: we estimate the local covariance coefficients from an interlaced image using Wiener filtering theory and then use these optimal minimum-MSE interpolation coefficients to obtain a deinterlaced image. The CAD method, though more robust than most known methods, is not very fast compared to the others. To alleviate this issue, we propose an adaptive selection approach that uses a fast deinterlacing algorithm rather than only the CAD algorithm. This hybrid approach switches between the conventional schemes (LA and MELA) and CAD to reduce the overall computational load; a reliable condition for switching between the schemes was derived from an extensive set of initial training processes. The results of computer simulations show that the proposed methods outperform a number of methods presented in the literature.
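The simplest of the three components, line averaging, can be sketched directly: each missing line of a field is interpolated as the mean of its vertical neighbors. The MELA and CAD stages, and the MSE/CPU-time selection logic, are not reproduced here.

```python
# Minimal sketch of line-averaging (LA) deinterlacing: missing lines are filled
# with the average of the known lines above and below.
import numpy as np

def line_average_deinterlace(field, top_field=True):
    h, w = field.shape
    frame = np.zeros((2 * h, w), dtype=float)
    offset = 0 if top_field else 1
    frame[offset::2] = field                      # copy the known field lines
    for row in range(1 - offset, 2 * h, 2):       # interpolate the missing lines
        above = frame[row - 1] if row - 1 >= 0 else frame[row + 1]
        below = frame[row + 1] if row + 1 < 2 * h else frame[row - 1]
        frame[row] = 0.5 * (above + below)
    return frame

field = np.arange(12, dtype=float).reshape(4, 3)  # toy 4-line field
print(line_average_deinterlace(field))
```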
Classification of hyperspectral imagery with neural networks: comparison to conventional tools
NASA Astrophysics Data System (ADS)
Merényi, Erzsébet; Farrand, William H.; Taranik, James V.; Minor, Timothy B.
2014-12-01
Efficient exploitation of hyperspectral imagery is of great importance in remote sensing. Artificial intelligence approaches have been receiving favorable reviews for classification of hyperspectral data because the complexity of such data challenges the limitations of many conventional methods. Artificial neural networks (ANNs) were shown to outperform traditional classifiers in many situations. However, studies that use the full spectral dimensionality of hyperspectral images to classify a large number of surface covers are scarce, if not non-existent. We advocate the need for methods that can handle the full dimensionality and a large number of classes to retain the discovery potential and the ability to discriminate classes with subtle spectral differences. We demonstrate that such a method exists in the family of ANNs. We compare the maximum likelihood, Mahalanobis distance, minimum distance, spectral angle mapper, and a hybrid ANN classifier for real hyperspectral AVIRIS data, using the full spectral resolution to map 23 cover types and using a small training set. Rigorous evaluation of the classification accuracies shows that the ANN outperforms the other methods and achieves ≈90% accuracy on test data.
Linden, Ariel
2017-08-01
When a randomized controlled trial is not feasible, health researchers typically use observational data and rely on statistical methods to adjust for confounding when estimating treatment effects. These methods generally fall into 3 categories: (1) estimators based on a model for the outcome using conventional regression adjustment; (2) weighted estimators based on the propensity score (ie, a model for the treatment assignment); and (3) "doubly robust" (DR) estimators that model both the outcome and propensity score within the same framework. In this paper, we introduce a new DR estimator that utilizes marginal mean weighting through stratification (MMWS) as the basis for weighted adjustment. This estimator may prove more accurate than treatment effect estimators because MMWS has been shown to be more accurate than other models when the propensity score is misspecified. We therefore compare the performance of this new estimator to other commonly used treatment effects estimators. Monte Carlo simulation is used to compare the DR-MMWS estimator to regression adjustment, 2 weighted estimators based on the propensity score and 2 other DR methods. To assess performance under varied conditions, we vary the level of misspecification of the propensity score model as well as misspecify the outcome model. Overall, DR estimators generally outperform methods that model one or the other components (eg, propensity score or outcome). The DR-MMWS estimator outperforms all other estimators when both the propensity score and outcome models are misspecified and performs equally as well as other DR estimators when only the propensity score is misspecified. Health researchers should consider using DR-MMWS as the principal evaluation strategy in observational studies, as this estimator appears to outperform other estimators in its class. © 2017 John Wiley & Sons, Ltd.
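A rough sketch of the weighting scheme is given below: estimate propensity scores, stratify them into quantiles, weight each unit so that the within-stratum treatment distribution matches the marginal one, and then run a weighted outcome regression. The weight formula w = Pr(T = t) * n_stratum / n_(t, stratum) follows one common description of MMWS and, like the simulated data, should be read as an assumption rather than the paper's exact DR-MMWS estimator.

```python
# Sketch of MMWS-style weighting followed by a weighted outcome regression;
# the weight formula and data-generating process are illustrative assumptions.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression, LinearRegression

rng = np.random.default_rng(6)
n = 2000
x = rng.normal(size=(n, 3))                              # baseline covariates
t = rng.binomial(1, 1 / (1 + np.exp(-x[:, 0])))          # confounded treatment
y = 2.0 * t + x @ np.array([1.0, 0.5, -0.5]) + rng.normal(size=n)

ps = LogisticRegression().fit(x, t).predict_proba(x)[:, 1]
strata = pd.qcut(ps, q=5, labels=False)                  # propensity-score quintiles

df = pd.DataFrame({"t": t, "stratum": strata})
p_marginal = df["t"].mean()
counts = df.groupby(["stratum", "t"]).size()
n_stratum = df.groupby("stratum").size()

w = np.array([
    (p_marginal if ti == 1 else 1 - p_marginal) * n_stratum[s] / counts[(s, ti)]
    for s, ti in zip(strata, t)
])

# Weighted regression of the outcome on treatment (covariates could be added for DR).
effect = LinearRegression().fit(t.reshape(-1, 1), y, sample_weight=w).coef_[0]
print("estimated treatment effect:", round(effect, 3))
```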
NASA Astrophysics Data System (ADS)
Krestyannikov, E.; Tohka, J.; Ruotsalainen, U.
2008-06-01
This paper presents a novel statistical approach for joint estimation of regions-of-interest (ROIs) and the corresponding time-activity curves (TACs) from dynamic positron emission tomography (PET) brain projection data. It is based on optimizing the joint objective function that consists of a data log-likelihood term and two penalty terms reflecting the available a priori information about the human brain anatomy. The developed local optimization strategy iteratively updates both the ROI and TAC parameters and is guaranteed to monotonically increase the objective function. The quantitative evaluation of the algorithm is performed with numerically and Monte Carlo-simulated dynamic PET brain data of the 11C-Raclopride and 18F-FDG tracers. The results demonstrate that the method outperforms the existing sequential ROI quantification approaches in terms of accuracy, and can noticeably reduce the errors in TACs arising due to the finite spatial resolution and ROI delineation.
Wang, Yue; Adalý, Tülay; Kung, Sun-Yuan; Szabo, Zsolt
2007-01-01
This paper presents a probabilistic neural network based technique for unsupervised quantification and segmentation of brain tissues from magnetic resonance images. It is shown that this problem can be solved by distribution learning and relaxation labeling, resulting in an efficient method that may be particularly useful in quantifying and segmenting abnormal brain tissues where the number of tissue types is unknown and the distributions of tissue types heavily overlap. The new technique uses suitable statistical models for both the pixel and context images and formulates the problem in terms of model-histogram fitting and global consistency labeling. The quantification is achieved by probabilistic self-organizing mixtures and the segmentation by a probabilistic constraint relaxation network. The experimental results show the efficient and robust performance of the new algorithm and that it outperforms the conventional classification based approaches. PMID:18172510
Highway traffic estimation of improved precision using the derivative-free nonlinear Kalman Filter
NASA Astrophysics Data System (ADS)
Rigatos, Gerasimos; Siano, Pierluigi; Zervos, Nikolaos; Melkikh, Alexey
2015-12-01
The paper proves that the PDE dynamic model of highway traffic is differentially flat and, by applying spatial discretization, shows that the model can be transformed into an equivalent linear canonical state-space form. For the latter representation of the traffic dynamics, state estimation is performed with the Derivative-free nonlinear Kalman Filter. The proposed filter consists of the Kalman Filter recursion applied to the transformed state-space model of the highway traffic. Moreover, it makes use of an inverse transformation, based again on differential flatness theory, which makes it possible to obtain estimates of the state variables of the initial nonlinear PDE model. By avoiding approximate linearizations and the truncation of nonlinear terms from the PDE model of the traffic dynamics, the proposed filtering method outperforms, in terms of accuracy, other nonlinear estimators such as the Extended Kalman Filter. The article's theoretical findings are confirmed through simulation experiments.
Machine Learning for Treatment Assignment: Improving Individualized Risk Attribution
Weiss, Jeremy; Kuusisto, Finn; Boyd, Kendrick; Liu, Jie; Page, David
2015-01-01
Clinical studies model the average treatment effect (ATE), but apply this population-level effect to future individuals. Due to recent developments of machine learning algorithms with useful statistical guarantees, we argue instead for modeling the individualized treatment effect (ITE), which has better applicability to new patients. We compare ATE-estimation using randomized and observational analysis methods against ITE-estimation using machine learning, and describe how the ITE theoretically generalizes to new population distributions, whereas the ATE may not. On a synthetic data set of statin use and myocardial infarction (MI), we show that a learned ITE model improves true ITE estimation and outperforms the ATE. We additionally argue that ITE models should be learned with a consistent, nonparametric algorithm from unweighted examples and show experiments in favor of our argument using our synthetic data model and a real data set of D-penicillamine use for primary biliary cirrhosis. PMID:26958271
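One simple nonparametric route to an ITE estimate is a "T-learner": fit separate outcome models for treated and control units and take the difference of their predictions. The sketch below uses random forests and simulated data as an illustration of the idea, not the authors' exact experimental setup.

```python
# Sketch of ITE estimation with a T-learner (two random-forest outcome models);
# the data-generating process is a placeholder.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(7)
n = 3000
x = rng.normal(size=(n, 4))                       # patient covariates
t = rng.binomial(1, 0.5, size=n)                  # randomized treatment for simplicity
true_ite = 1.0 + x[:, 0]                          # effect varies across individuals
y = x[:, 1] + t * true_ite + rng.normal(size=n)

m1 = RandomForestRegressor(n_estimators=200, random_state=0).fit(x[t == 1], y[t == 1])
m0 = RandomForestRegressor(n_estimators=200, random_state=0).fit(x[t == 0], y[t == 0])
ite_hat = m1.predict(x) - m0.predict(x)

print("ATE estimate:", round(ite_hat.mean(), 3))
print("correlation with true ITE:", round(np.corrcoef(ite_hat, true_ite)[0, 1], 3))
```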
SOUNET: Self-Organized Underwater Wireless Sensor Network.
Kim, Hee-Won; Cho, Ho-Shin
2017-02-02
In this paper, we propose an underwater wireless sensor network (UWSN) named SOUNET where sensor nodes form and maintain a tree-topological network for data gathering in a self-organized manner. After network topology discovery via packet flooding, the sensor nodes consistently update their parent node to ensure the best connectivity by referring to the time-varying neighbor tables. Such a persistent and self-adaptive method leads to high network connectivity without any centralized control, even when sensor nodes are added or unexpectedly lost. Furthermore, malfunctions that frequently happen in self-organized networks such as node isolation and closed loop are resolved in a simple way. Simulation results show that SOUNET outperforms other conventional schemes in terms of network connectivity, packet delivery ratio (PDR), and energy consumption throughout the network. In addition, we performed an experiment at the Gyeongcheon Lake in Korea using commercial underwater modems to verify that SOUNET works well in a real environment.
Warmerdam, G; Vullings, R; Van Pul, C; Andriessen, P; Oei, S G; Wijn, P
2013-01-01
Non-invasive fetal electrocardiography (ECG) can be used for prolonged monitoring of the fetal heart rate (FHR). However, the signal-to-noise-ratio (SNR) of non-invasive ECG recordings is often insufficient for reliable detection of the FHR. To overcome this problem, source separation techniques can be used to enhance the fetal ECG. This study uses a physiology-based source separation (PBSS) technique that has already been demonstrated to outperform widely used blind source separation techniques. Despite the relatively good performance of PBSS in enhancing the fetal ECG, PBSS is still susceptible to artifacts. In this study an augmented PBSS technique is developed to reduce the influence of artifacts. The performance of the developed method is compared to PBSS on multi-channel non-invasive fetal ECG recordings. Based on this comparison, the developed method is shown to outperform PBSS for the enhancement of the fetal ECG.
Hayashi, Ryusuke; Watanabe, Osamu; Yokoyama, Hiroki; Nishida, Shin'ya
2017-06-01
Characterization of the functional relationship between sensory inputs and neuronal or observers' perceptual responses is one of the fundamental goals of systems neuroscience and psychophysics. Conventional methods, such as reverse correlation and spike-triggered data analyses are limited in their ability to resolve complex and inherently nonlinear neuronal/perceptual processes because these methods require input stimuli to be Gaussian with a zero mean. Recent studies have shown that analyses based on a generalized linear model (GLM) do not require such specific input characteristics and have advantages over conventional methods. GLM, however, relies on iterative optimization algorithms and its calculation costs become very expensive when estimating the nonlinear parameters of a large-scale system using large volumes of data. In this paper, we introduce a new analytical method for identifying a nonlinear system without relying on iterative calculations and yet also not requiring any specific stimulus distribution. We demonstrate the results of numerical simulations, showing that our noniterative method is as accurate as GLM in estimating nonlinear parameters in many cases and outperforms conventional, spike-triggered data analyses. As an example of the application of our method to actual psychophysical data, we investigated how different spatiotemporal frequency channels interact in assessments of motion direction. The nonlinear interaction estimated by our method was consistent with findings from previous vision studies and supports the validity of our method for nonlinear system identification.
Possible world based consistency learning model for clustering and classifying uncertain data.
Liu, Han; Zhang, Xianchao; Zhang, Xiaotong
2018-06-01
The possible-worlds approach has been shown to be effective for handling various types of data uncertainty in uncertain data management. However, few uncertain data clustering and classification algorithms have been proposed based on possible worlds. Moreover, existing possible world based algorithms suffer from the following issues: (1) they deal with each possible world independently and ignore the consistency principle across different possible worlds; (2) they require an extra post-processing procedure to obtain the final result, so their effectiveness relies heavily on the post-processing method and their efficiency is also poor. In this paper, we propose a novel possible world based consistency learning model for uncertain data, which can be extended both for clustering and for classifying uncertain data. This model utilizes the consistency principle to learn a consensus affinity matrix for uncertain data, which makes full use of the information across different possible worlds and thereby improves clustering and classification performance. Meanwhile, the model imposes a new rank constraint on the Laplacian matrix of the consensus affinity matrix, ensuring that the number of connected components in the consensus affinity matrix is exactly equal to the number of classes. This also means that the clustering and classification results can be obtained directly without any post-processing procedure. Furthermore, for the clustering and classification tasks, we derive efficient optimization methods to solve the proposed model. Experimental results on real benchmark datasets and real-world uncertain datasets show that the proposed model outperforms state-of-the-art uncertain data clustering and classification algorithms in effectiveness and performs competitively in efficiency. Copyright © 2018 Elsevier Ltd. All rights reserved.
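The rank constraint mentioned above rests on a standard spectral fact: the multiplicity of the zero eigenvalue of a graph Laplacian equals the number of connected components, so forcing rank(L) = n - k yields exactly k clusters without post-processing. The toy check below illustrates this on a two-block affinity matrix.

```python
# Small check: zero eigenvalues of a graph Laplacian count connected components.
import numpy as np

A = np.array([[0, 1, 0, 0],
              [1, 0, 0, 0],
              [0, 0, 0, 1],
              [0, 0, 1, 0]], dtype=float)      # affinity matrix with 2 blocks
L = np.diag(A.sum(axis=1)) - A                 # unnormalized graph Laplacian
eigvals = np.linalg.eigvalsh(L)
print("zero eigenvalues (= components):", int(np.sum(eigvals < 1e-10)))
```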
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhang, Hanming; Wang, Linyuan; Li, Lei
2016-06-15
Purpose: Metal artifact reduction (MAR) is a major problem and a challenging issue in x-ray computed tomography (CT) examinations. Iterative reconstruction from sinograms unaffected by metals shows promising potential in detail recovery. This reconstruction has been the subject of much research in recent years. However, conventional iterative reconstruction methods easily introduce new artifacts around metal implants because of incomplete data reconstruction and inconsistencies in practical data acquisition. Hence, this work aims at developing a method to suppress newly introduced artifacts and improve the image quality around metal implants for the iterative MAR scheme. Methods: The proposed method consists of two steps based on the general iterative MAR framework. An uncorrected image is initially reconstructed, and the corresponding metal trace is obtained. The iterative reconstruction method is then used to reconstruct images from the unaffected sinogram. In the reconstruction step of this work, an iterative strategy utilizing unmatched projector/backprojector pairs is used. A ramp filter is introduced into the back-projection procedure to restrain the inconsistency components in low frequencies and generate more reliable images of the regions around metals. Furthermore, a constrained total variation (TV) minimization model is also incorporated to enhance efficiency. The proposed strategy is implemented based on an iterative FBP and an alternating direction minimization (ADM) scheme, respectively. The developed algorithms are referred to as “iFBP-TV” and “TV-FADM,” respectively. Two projection-completion-based MAR methods and three iterative MAR methods are performed simultaneously for comparison. Results: The proposed method performs reasonably on both simulation and real CT-scanned datasets. This approach could reduce streak metal artifacts effectively and avoid the mentioned effects in the vicinity of the metals. The improvements are evaluated by inspecting regions of interest and by comparing the root-mean-square errors, normalized mean absolute distance, and universal quality index metrics of the images. Both iFBP-TV and TV-FADM methods outperform other counterparts in all cases. Unlike the conventional iterative methods, the proposed strategy utilizing unmatched projector/backprojector pairs shows excellent performance in detail preservation and prevention of the introduction of new artifacts. Conclusions: Qualitative and quantitative evaluations of experimental results indicate that the developed method outperforms classical MAR algorithms in suppressing streak artifacts and preserving the edge structural information of the object. In particular, structures lying close to metals can be gradually recovered because of the reduction of artifacts caused by inconsistency effects.
Guo, Wei-Li; Huang, De-Shuang
2017-08-22
Transcription factors (TFs) are DNA-binding proteins that have a central role in regulating gene expression. Identification of DNA-binding sites of TFs is a key task in understanding transcriptional regulation, cellular processes and disease. Chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) enables genome-wide identification of in vivo TF binding sites. However, it is still difficult to map every TF in every cell line owing to cost and biological material availability, which poses an enormous obstacle for integrated analysis of gene regulation. To address this problem, we propose a novel computational approach, TFBSImpute, for predicting additional TF binding profiles by leveraging information from available ChIP-seq TF binding data. TFBSImpute fuses the dataset into a 3-mode tensor and imputes missing TF binding signals via simultaneous completion of multiple TF binding matrices with positional consistency. We show that signals predicted by our method achieve overall similarity with experimental data and that TFBSImpute significantly outperforms baseline approaches, by assessing the performance of imputation methods against observed ChIP-seq TF binding profiles. In addition, motif analysis shows that TFBSImpute performs better in capturing binding motifs enriched in observed data compared with baselines, indicating that the higher performance of TFBSImpute is not simply due to averaging related samples. We anticipate that our approach will constitute a useful complement to experimental mapping of TF binding, which is beneficial for further study of regulation mechanisms and disease.
Improving cerebellar segmentation with statistical fusion
NASA Astrophysics Data System (ADS)
Plassard, Andrew J.; Yang, Zhen; Prince, Jerry L.; Claassen, Daniel O.; Landman, Bennett A.
2016-03-01
The cerebellum is a somatotopically organized central component of the central nervous system, well known to be involved in motor coordination and with increasingly recognized roles in cognition and planning. Recent work in multiatlas labeling has created methods that offer the potential for fully automated 3-D parcellation of the cerebellar lobules and vermis (which are organizationally equivalent to cortical gray matter areas). This work explores the trade-offs of using different statistical fusion techniques and post hoc optimizations in two datasets with distinct imaging protocols. We offer a novel fusion technique by extending the ideas of the Selective and Iterative Method for Performance Level Estimation (SIMPLE) to a patch-based performance model. We demonstrate the effectiveness of our algorithm, Non-Local SIMPLE, for segmentation of a mixed population of healthy subjects and patients with severe cerebellar anatomy. Under the first imaging protocol, we show that Non-Local SIMPLE outperforms previous gold-standard segmentation techniques. In the second imaging protocol, we show that Non-Local SIMPLE outperforms previous gold standard techniques but is outperformed by a non-locally weighted vote with the deeper population of atlases available. This work advances the state of the art in open source cerebellar segmentation algorithms and offers the opportunity for routinely including cerebellar segmentation in magnetic resonance imaging studies that acquire whole brain T1-weighted volumes with approximately 1 mm isotropic resolution.
Xia, Jiaqi; Peng, Zhenling; Qi, Dawei; Mu, Hongbo; Yang, Jianyi
2017-03-15
Protein fold classification is a critical step in protein structure prediction. There are two possible ways to classify protein folds: template-based fold assignment and ab-initio prediction using machine learning algorithms. Combining both solutions to improve prediction accuracy had not been explored before. We developed two algorithms, HH-fold and SVM-fold, for protein fold classification. HH-fold is a template-based fold assignment algorithm using the HHsearch program. SVM-fold is a support vector machine-based ab-initio classification algorithm, in which a comprehensive set of features is extracted from three complementary sequence profiles. These two algorithms are then combined, resulting in the ensemble approach TA-fold. We performed a comprehensive assessment of the proposed methods by comparing them with ab-initio methods and template-based threading methods on six benchmark datasets. An accuracy of 0.799 was achieved by TA-fold on the DD dataset, which consists of proteins from 27 folds; this represents an improvement of 5.4-11.7% over ab-initio methods. After updating this dataset to include more proteins in the same folds, the accuracy increased to 0.971. In addition, TA-fold achieved >0.9 accuracy on a large dataset consisting of 6451 proteins from 184 folds. Experiments on the LE dataset show that TA-fold consistently outperforms other threading methods at the family, superfamily and fold levels. The success of TA-fold is attributed to the combination of template-based fold assignment and ab-initio classification using features from complementary sequence profiles that contain rich evolutionary information. http://yanglab.nankai.edu.cn/TA-fold/. yangjy@nankai.edu.cn or mhb-506@163.com. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
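The combination logic can be sketched as a simple decision rule: accept the template-based assignment when the template hit is confident, and otherwise fall back to the machine-learning prediction. The confidence threshold and data structures below are placeholders, not the published TA-fold parameters.

```python
# Sketch of combining template-based fold assignment with an ab-initio fallback;
# threshold and scores are illustrative assumptions.
from dataclasses import dataclass
from typing import Optional

@dataclass
class TemplateHit:
    fold: str
    probability: float          # e.g. an HHsearch hit probability

def assign_fold(hit: Optional[TemplateHit], svm_prediction: str,
                threshold: float = 0.9) -> str:
    """Combine template-based assignment with ab-initio classification."""
    if hit is not None and hit.probability >= threshold:
        return hit.fold                     # trust the confident template hit
    return svm_prediction                   # otherwise fall back to the classifier

print(assign_fold(TemplateHit("a.1", 0.97), svm_prediction="b.40"))   # -> a.1
print(assign_fold(TemplateHit("a.1", 0.40), svm_prediction="b.40"))   # -> b.40
```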
Correcting for deformation in skin-based marker systems.
Alexander, E J; Andriacchi, T P
2001-03-01
A new technique is described that reduces error due to skin movement artifact in the opto-electronic measurement of in vivo skeletal motion. This work builds on a previously described point cluster technique marker set and estimation algorithm by extending the transformation equations to the general deformation case using a set of activity-dependent deformation models. Skin deformation during activities of daily living are modeled as consisting of a functional form defined over the observation interval (the deformation model) plus additive noise (modeling error). The method is described as an interval deformation technique. The method was tested using simulation trials with systematic and random components of deformation error introduced into marker position vectors. The technique was found to substantially outperform methods that require rigid-body assumptions. The method was tested in vivo on a patient fitted with an external fixation device (Ilizarov). Simultaneous measurements from markers placed on the Ilizarov device (fixed to bone) were compared to measurements derived from skin-based markers. The interval deformation technique reduced the errors in limb segment pose estimate by 33 and 25% compared to the classic rigid-body technique for position and orientation, respectively. This newly developed method has demonstrated that by accounting for the changing shape of the limb segment, a substantial improvement in the estimates of in vivo skeletal movement can be achieved.
Multiconstrained gene clustering based on generalized projections
2010-01-01
Background Gene clustering for annotating gene functions is one of the fundamental issues in bioinformatics. The best clustering solution is often regularized by multiple constraints such as gene expressions, Gene Ontology (GO) annotations and gene network structures. How to integrate multiple constraints to obtain an optimal clustering solution remains an unsolved problem. Results We propose a novel multiconstrained gene clustering (MGC) method within the generalized projection onto convex sets (POCS) framework used widely in image reconstruction. Each constraint is formulated as a corresponding set. The generalized projector iteratively projects the clustering solution onto these sets in order to find a consistent solution included in the intersection set that satisfies all constraints. Compared with previous MGC methods, POCS can integrate multiple constraints of a different nature without distorting the original constraints. To evaluate the clustering solution, we also propose a new performance measure referred to as Gene Log Likelihood (GLL) that considers genes having more than one function and hence belonging to more than one cluster. Comparative experimental results show that our POCS-based gene clustering method outperforms current state-of-the-art MGC methods. Conclusions The POCS-based MGC method can successfully combine multiple constraints of a different nature for gene clustering. Also, the proposed GLL is an effective performance measure for the soft clustering solutions. PMID:20356386
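The POCS machinery itself is easy to demonstrate: starting from an arbitrary point, alternately project onto each constraint set until the iterate lies in their intersection. The two toy sets below (a box and a hyperplane) stand in for the expression, GO, and network constraint sets used for gene clustering.

```python
# Toy alternating projections onto convex sets (POCS): a box and a hyperplane.
import numpy as np

def project_box(x, lo=0.0, hi=1.0):
    return np.clip(x, lo, hi)                    # projection onto {lo <= x <= hi}

def project_hyperplane(x, a, b):
    return x - (a @ x - b) / (a @ a) * a         # projection onto {x : a.x = b}

a = np.array([1.0, 1.0, 1.0])
b = 1.5
x = np.array([5.0, -3.0, 2.0])                   # arbitrary starting point
for _ in range(200):
    x = project_hyperplane(project_box(x), a, b)

print(np.round(x, 4), "sum =", round(float(a @ x), 4))
```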
The Edge Detectors Suitable for Retinal OCT Image Segmentation
Yang, Jing; Gao, Qian; Zhou, Sheng
2017-01-01
Retinal layer thickness measurement offers important information for reliable diagnosis of retinal diseases and for the evaluation of disease development and medical treatment responses. This task critically depends on the accurate edge detection of the retinal layers in OCT images. Here, we intended to search for the most suitable edge detectors for the retinal OCT image segmentation task. The three most promising edge detection algorithms were identified in the related literature: the Canny edge detector, the two-pass method, and the EdgeFlow technique. The quantitative evaluation results show that the two-pass method consistently outperforms the Canny detector and the EdgeFlow technique in delineating the retinal layer boundaries in the OCT images. In addition, the mean localization deviation metrics show that the two-pass method caused the smallest edge shifting problem. These findings suggest that the two-pass method is the best among the three algorithms for detecting retinal layer boundaries. The overall better performance of the Canny and two-pass methods over the EdgeFlow technique implies that the OCT images contain more intensity gradient information than texture changes along the retinal layer boundaries. The results will guide our future efforts in the quantitative analysis of retinal OCT images for the effective use of OCT technologies in the field of ophthalmology. PMID:29065594
Estimation of retinal vessel caliber using model fitting and random forests
NASA Astrophysics Data System (ADS)
Araújo, Teresa; Mendonça, Ana Maria; Campilho, Aurélio
2017-03-01
Retinal vessel caliber changes are associated with several major diseases, such as diabetes and hypertension. These caliber changes can be evaluated using eye fundus images. However, the clinical assessment is tiresome and prone to errors, motivating the development of automatic methods. An automatic method based on vessel cross-section intensity profile model fitting for the estimation of vessel caliber in retinal images is herein proposed. First, vessels are segmented from the image, vessel centerlines are detected and individual segments are extracted and smoothed. Intensity profiles are extracted perpendicularly to the vessel, and the profile lengths are determined. Then, model fitting is applied to the smoothed profiles. A novel parametric model (DoG-L7) is used, consisting of a Difference-of-Gaussians multiplied by a line, which is able to describe profile asymmetry. Finally, the parameters of the best-fit model are used for determining the vessel width through regression using ensembles of bagged regression trees with random sampling of the predictors (random forests). The method is evaluated on the REVIEW public dataset. A precision close to that of the observers is achieved, outperforming other state-of-the-art methods. The method is robust and reliable for width estimation in images with pathologies and artifacts, with performance independent of the range of diameters.
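For concreteness, a hedged sketch of fitting a Difference-of-Gaussians-times-line profile model by least squares is given below; the exact DoG-L7 parameterization and the random-forest width regression are not specified in the abstract, so the seven-parameter form and the synthetic profile here are assumptions.

import numpy as np
from scipy.optimize import curve_fit

def dog_line(x, a1, s1, a2, s2, mu, m, c):
    dog = a1 * np.exp(-0.5 * ((x - mu) / s1) ** 2) - a2 * np.exp(-0.5 * ((x - mu) / s2) ** 2)
    return dog * (m * x + c)                 # the line factor lets the model describe profile asymmetry

x = np.linspace(-10, 10, 81)                 # hypothetical cross-section coordinates (pixels)
profile = dog_line(x, 1.0, 3.0, 0.6, 1.5, 0.5, 0.02, 1.0) + 0.01 * np.random.randn(x.size)
p0 = [1.0, 2.0, 0.5, 1.0, 0.0, 0.0, 1.0]     # rough initial guess for the seven parameters
params, _ = curve_fit(dog_line, x, profile, p0=p0)
print(params)                                # best-fit parameters, later fed to a width regressor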
A new family of Polak-Ribiere-Polyak conjugate gradient method with the strong-Wolfe line search
NASA Astrophysics Data System (ADS)
Ghani, Nur Hamizah Abdul; Mamat, Mustafa; Rivaie, Mohd
2017-08-01
The conjugate gradient (CG) method is an important technique in unconstrained optimization, due to its effectiveness and low memory requirements. The focus of this paper is to introduce a new CG method for solving large scale unconstrained optimization. Theoretical proofs show that the new method fulfills the sufficient descent condition if the strong Wolfe-Powell inexact line search is used. Besides, computational results show that our proposed method outperforms other existing CG methods.
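The following Python sketch shows a generic Polak-Ribiere-Polyak CG iteration paired with SciPy's strong-Wolfe line search; it illustrates the framework only and does not implement the paper's new CG coefficient.

import numpy as np
from scipy.optimize import line_search, rosen, rosen_der

def prp_cg(f, grad, x0, max_iter=500, tol=1e-8):
    x, g = x0.copy(), grad(x0)
    d = -g                                    # initial steepest-descent direction
    for _ in range(max_iter):
        alpha = line_search(f, grad, x, d, gfk=g, c1=1e-4, c2=0.1)[0]
        if alpha is None:                     # line search failed; restart along -g
            d, alpha = -g, 1e-3
        x_new = x + alpha * d
        g_new = grad(x_new)
        if np.linalg.norm(g_new) < tol:
            return x_new
        beta = max(g_new @ (g_new - g) / (g @ g), 0.0)   # PRP+ coefficient
        d = -g_new + beta * d
        x, g = x_new, g_new
    return x

print(prp_cg(rosen, rosen_der, np.array([-1.2, 1.0])))   # converges near (1, 1)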
Gauge-free cluster variational method by maximal messages and moment matching.
Domínguez, Eduardo; Lage-Castellanos, Alejandro; Mulet, Roberto; Ricci-Tersenghi, Federico
2017-04-01
We present an implementation of the cluster variational method (CVM) as a message passing algorithm. The kind of message passing algorithm used for CVM, usually named generalized belief propagation (GBP), is a generalization of the belief propagation algorithm in the same way that CVM is a generalization of the Bethe approximation for estimating the partition function. However, the connection between fixed points of GBP and the extremal points of the CVM free energy is usually not a one-to-one correspondence because of the existence of a gauge transformation involving the GBP messages. Our contribution is twofold. First, we propose a way of defining messages (fields) in a generic CVM approximation, such that messages arrive on a given region from all its ancestors, and not only from its direct parents, as in the standard parent-to-child GBP. We call this approach maximal messages. Second, we focus on the case of binary variables, reinterpreting the messages as fields enforcing the consistency between the moments of the local (marginal) probability distributions. We provide a precise rule to enforce all consistencies, avoiding any redundancy, that would otherwise lead to a gauge transformation on the messages. This moment matching method is gauge free, i.e., it guarantees that the resulting GBP is not gauge invariant. We apply our maximal messages and moment matching GBP to obtain an analytical expression for the critical temperature of the Ising model in general dimensions at the level of plaquette CVM. The values obtained outperform Bethe estimates, and are comparable with loop corrected belief propagation equations. The method allows for a straightforward generalization to disordered systems.
A comparison of methods for the analysis of binomial clustered outcomes in behavioral research.
Ferrari, Alberto; Comelli, Mario
2016-12-01
In behavioral research, data consisting of a per-subject proportion of "successes" and "failures" over a finite number of trials often arise. These clustered binary data are usually non-normally distributed, which can distort inference if the usual general linear model is applied and sample size is small. A number of more advanced methods are available, but they are often technically challenging and a comparative assessment of their performances in behavioral setups has not been performed. We studied the performances of some methods applicable to the analysis of proportions; namely linear regression, Poisson regression, beta-binomial regression and Generalized Linear Mixed Models (GLMMs). We report on a simulation study evaluating power and Type I error rate of these models in hypothetical scenarios met by behavioral researchers; in addition, we describe results from the application of these methods to data from real experiments. Our results show that, while GLMMs are powerful instruments for the analysis of clustered binary outcomes, beta-binomial regression can outperform them in a range of scenarios. Linear regression gave results consistent with the nominal level of significance, but was overall less powerful. Poisson regression, instead, mostly led to anticonservative inference. GLMMs and beta-binomial regression are generally more powerful than linear regression; yet linear regression is robust to model misspecification in some conditions, whereas Poisson regression suffers heavily from violations of the assumptions when used to model proportion data. We conclude by providing directions to behavioral scientists dealing with clustered binary data and small sample sizes. Copyright © 2016 Elsevier B.V. All rights reserved.
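As a hedged illustration of the beta-binomial approach discussed above (not the authors' implementation), the sketch below fits a beta-binomial regression by maximum likelihood on simulated trial counts; the mean/precision parameterization and the single covariate are assumptions.

import numpy as np
from scipy.optimize import minimize
from scipy.special import expit
from scipy.stats import betabinom

rng = np.random.default_rng(0)
n_trials = np.full(60, 20)                              # 20 trials per subject
x = rng.normal(size=60)                                 # one hypothetical covariate
p_true = expit(-0.5 + 1.0 * x)
successes = rng.binomial(n_trials, p_true)              # (over-dispersion omitted for brevity)

def neg_loglik(theta):
    b0, b1, log_phi = theta
    mu, phi = expit(b0 + b1 * x), np.exp(log_phi)       # mean and precision parameterization
    return -betabinom.logpmf(successes, n_trials, mu * phi, (1 - mu) * phi).sum()

fit = minimize(neg_loglik, x0=[0.0, 0.0, 1.0], method="Nelder-Mead")
print(fit.x)                                            # intercept, slope, log-precision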
Optimal lattice-structured materials
Messner, Mark C.
2016-07-09
This paper describes a method for optimizing the mesostructure of lattice-structured materials. These materials are periodic arrays of slender members resembling efficient, lightweight macroscale structures like bridges and frame buildings. Current additive manufacturing technologies can assemble lattice structures with length scales ranging from nanometers to millimeters. Previous work demonstrates that lattice materials have excellent stiffness- and strength-to-weight scaling, outperforming natural materials. However, there are currently no methods for producing optimal mesostructures that consider the full space of possible 3D lattice topologies. The inverse homogenization approach for optimizing the periodic structure of lattice materials requires a parameterized, homogenized material model describing the response of an arbitrary structure. This work develops such a model, starting with a method for describing the long-wavelength, macroscale deformation of an arbitrary lattice. The work combines the homogenized model with a parameterized description of the total design space to generate a parameterized model. Finally, the work describes an optimization method capable of producing optimal mesostructures. Several examples demonstrate the optimization method. One of these examples produces an elastically isotropic, maximally stiff structure, here called the isotruss, that arguably outperforms the anisotropic octet truss topology.
Koutsoukas, Alexios; Monaghan, Keith J; Li, Xiaoli; Huan, Jun
2017-06-28
In recent years, research in artificial neural networks has resurged, now under the deep-learning umbrella, and grown extremely popular. The recently reported success of DL techniques in crowd-sourced QSAR and predictive toxicology competitions has showcased these methods as powerful tools in drug-discovery and toxicology research. The aim of this work was twofold: first, a large number of hyper-parameter configurations was explored to investigate how they affect the performance of DNNs and could act as starting points when tuning DNNs; second, their performance was compared to popular methods widely employed in the field of cheminformatics, namely Naïve Bayes, k-nearest neighbor, random forest and support vector machines. Moreover, the robustness of the machine learning methods to different levels of artificially introduced noise was assessed. The open-source Caffe deep-learning framework and modern NVidia GPU units were utilized to carry out this study, allowing a large number of DNN configurations to be explored. We show that feed-forward deep neural networks are capable of achieving strong classification performance and outperform shallow methods across diverse activity classes when optimized. Hyper-parameters that were found to play a critical role are the activation function, dropout regularization, number of hidden layers and number of neurons. When compared to the other methods, tuned DNNs were found to statistically outperform them, with p value <0.01 based on the Wilcoxon statistical test. DNNs achieved MCC values on average 0.149 higher than NB, 0.092 higher than kNN, 0.052 higher than SVM with a linear kernel, 0.021 higher than RF and finally 0.009 higher than SVM with a radial basis function kernel. When exploring robustness to noise, non-linear methods were found to perform well when dealing with low levels of noise, lower than or equal to 20%; however, when dealing with higher levels of noise, higher than 30%, the Naïve Bayes method was found to perform well and, at the highest noise level of 50%, even outperform more sophisticated methods across several datasets.
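A toy scikit-learn comparison in the same spirit is sketched below (the paper used Caffe-based DNNs with dropout on GPUs; the network sizes, dataset, and cross-validation settings here are placeholders).

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import make_scorer, matthews_corrcoef
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=1000, n_features=50, n_informative=15, random_state=0)
mcc = make_scorer(matthews_corrcoef)
models = {
    "DNN (MLP)": MLPClassifier(hidden_layer_sizes=(256, 128, 64), activation="relu",
                               alpha=1e-4, max_iter=500, random_state=0),
    "RF": RandomForestClassifier(n_estimators=200, random_state=0),
    "NB": GaussianNB(),
    "kNN": KNeighborsClassifier(),
    "SVM (RBF)": SVC(kernel="rbf"),
}
for name, model in models.items():            # mean MCC over 5-fold cross-validation
    print(name, cross_val_score(model, X, y, cv=5, scoring=mcc).mean())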
Are Informant Reports of Personality More Internally Consistent Than Self Reports of Personality?
Balsis, Steve; Cooper, Luke D; Oltmanns, Thomas F
2015-08-01
The present study examined whether informant-reported personality was more or less internally consistent than self-reported personality in an epidemiological community sample (n = 1,449). Results indicated that across the 5 NEO (Neuroticism-Extraversion-Openness) personality factors and the 10 personality disorder trait dimensions, informant reports tended to be more internally consistent than self reports, as indicated by equal or higher Cronbach's alpha scores and higher average interitem correlations. In addition, the informant reports collectively outperformed the self reports for predicting responses on a global measure of health, indicating that the informant reports are not only more reliable than self reports, but they can also be useful in predicting an external criterion. Collectively these findings indicate that informant reports tend to have greater internal consistency than self reports. © The Author(s) 2014.
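For reference, the two internal-consistency statistics cited above can be computed as follows (a generic numpy sketch on a hypothetical respondents-by-items score matrix, unrelated to the study's data).

import numpy as np

def cronbach_alpha(items):                        # items: shape (n_respondents, k_items)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars / total_var)

def average_interitem_r(items):
    r = np.corrcoef(items, rowvar=False)          # item-by-item correlation matrix
    return r[np.triu_indices_from(r, k=1)].mean()

scores = np.random.default_rng(1).integers(1, 6, size=(200, 10)).astype(float)
print(cronbach_alpha(scores), average_interitem_r(scores))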
Personalized recommendation based on unbiased consistence
NASA Astrophysics Data System (ADS)
Zhu, Xuzhen; Tian, Hui; Zhang, Ping; Hu, Zheng; Zhou, Tao
2015-08-01
Recently, in physical dynamics, mass-diffusion-based recommendation algorithms on bipartite networks have provided an efficient solution by automatically pushing possibly relevant items to users according to their past preferences. However, traditional mass-diffusion-based algorithms just focus on unidirectional mass diffusion from objects that have been collected to those which should be recommended, resulting in a biased causal similarity estimation and not-so-good performance. In this letter, we argue that in many cases a user's interests are stable, and thus bidirectional mass diffusion abilities, whether originating from objects that have been collected or from those which should be recommended, should be consistently powerful, showing unbiased consistence. We further propose a consistence-based mass diffusion algorithm via bidirectional diffusion against biased causality, outperforming state-of-the-art recommendation algorithms on disparate real data sets, including Netflix, MovieLens, Amazon and Rate Your Music.
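The sketch below implements standard unidirectional mass diffusion (ProbS) on a toy user-item bipartite network; the paper's bidirectional "unbiased consistence" variant is not reproduced, and the symmetric scoring shown at the end is only our rough illustration of the bidirectional idea.

import numpy as np

A = np.array([[1, 1, 0, 0],          # toy user-item adoption matrix (users x items)
              [1, 0, 1, 0],
              [0, 1, 1, 1]], dtype=float)
k_user, k_item = A.sum(axis=1), A.sum(axis=0)

# items -> users -> items diffusion, normalized by user and item degrees
W = (A.T @ (A / k_user[:, None])) / k_item[None, :]

u = 0                                 # recommend for the first user
f0 = A[u]                             # initial resource sits on items already collected
scores_forward = W @ f0
scores_bidirectional = 0.5 * (W + W.T) @ f0      # crude symmetric/bidirectional variant (assumption)
scores_bidirectional[A[u] == 1] = -np.inf        # mask items the user already has
print(np.argsort(scores_bidirectional)[::-1])    # ranked candidate items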
A Comparison of the Analytic Hierarchy Process and the Geometric Mean Procedure for Ratio Scaling
1987-12-01
hierarchy process (AHP) has attracted much attention, having been successfully applied in such diverse areas as marketing (Wind & Saaty, 1980...political science (Saaty & Bennett, 1977), and the measurement of subjective probabilities (Yager, 1979). However, the AHP has had its share of criticism...of Williams and Crawford, shows the percentage of cases (matrices) in which the GM procedure outperformed the AHP. In the "almost consistent case
Reshaping the Mind: The Benefits of Bilingualism
Bialystok, Ellen
2015-01-01
Studies have shown that bilingual individuals consistently outperform their monolingual counterparts on tasks involving executive control. The present paper reviews some of the evidence for this conclusion and relates the findings to the effect of bilingualism on cognitive organisation and to conceptual issues in the structure of executive control. Evidence for the protective effect of bilingualism against Alzheimer’s disease is presented with some speculation about the reason for that protection. PMID:21910523
Mutual information of optical communication in phase-conjugating Gaussian channels
NASA Astrophysics Data System (ADS)
Schäfermeier, Clemens; Andersen, Ulrik L.
2018-03-01
In all practical communication channels, the code word consists of Gaussian states and the measurement strategy is often a Gaussian detector such as homodyning or heterodyning. We investigate the communication performance using a phase-conjugated alphabet and joint Gaussian detection in a phase-insensitive amplifying channel. We find that a communication scheme consisting of a phase-conjugating alphabet of coherent states and a joint detection strategy significantly outperforms a standard coherent-state strategy based on individual detection. Moreover, we show that the performance can be further enhanced by using entanglement and that the performance is completely independent of the gain of the phase-insensitively amplifying channel.
MAI-free performance of PMU-OFDM transceiver in time-variant environment
NASA Astrophysics Data System (ADS)
Tadjpour, Layla; Tsai, Shang-Ho; Kuo, C.-C. J.
2005-06-01
An approximately MAI-free multi-user OFDM transceiver was introduced by Tsai, Lin and Kuo to reduce the multi-access interference (MAI) due to the carrier frequency offset (CFO) to a negligible amount via precoding. In this work, we investigate the performance of this precoded multi-user (PMU) OFDM system in a time-variant channel environment. We analyze and compare the MAI effect caused by time-variant channels in the PMU-OFDM and the OFDMA systems. Generally speaking, the MAI effect consists of two parts. The first part is due to the loss of orthogonality among subchannels for all users, while the second part is due to the CFO effect caused by the Doppler shift. Simulation results show that, although OFDMA outperforms the PMU-OFDM transceiver in a fast time-variant environment without CFO, PMU-OFDM outperforms OFDMA in a slow time-variant channel via the use of M/2 symmetric or anti-symmetric codewords of M Hadamard-Walsh codes.
Application of a New Resampling Method to SEM: A Comparison of S-SMART with the Bootstrap
ERIC Educational Resources Information Center
Bai, Haiyan; Sivo, Stephen A.; Pan, Wei; Fan, Xitao
2016-01-01
Among the commonly used resampling methods of dealing with small-sample problems, the bootstrap enjoys the widest applications because it often outperforms its counterparts. However, the bootstrap still has limitations when its operations are contemplated. Therefore, the purpose of this study is to examine an alternative, new resampling method…
Resampling and Distribution of the Product Methods for Testing Indirect Effects in Complex Models
ERIC Educational Resources Information Center
Williams, Jason; MacKinnon, David P.
2008-01-01
Recent advances in testing mediation have found that certain resampling methods and tests based on the mathematical distribution of 2 normal random variables substantially outperform the traditional "z" test. However, these studies have primarily focused only on models with a single mediator and 2 component paths. To address this limitation, a…
Chen, Hongyu; Martin, Bronwen; Daimon, Caitlin M; Maudsley, Stuart
2013-01-01
Text mining is rapidly becoming an essential technique for the annotation and analysis of large biological data sets. Biomedical literature currently increases at a rate of several thousand papers per week, making automated information retrieval methods the only feasible method of managing this expanding corpus. With the increasing prevalence of open-access journals and constant growth of publicly-available repositories of biomedical literature, literature mining has become much more effective with respect to the extraction of biomedically-relevant data. In recent years, text mining of popular databases such as MEDLINE has evolved from basic term-searches to more sophisticated natural language processing techniques, indexing and retrieval methods, structural analysis and integration of literature with associated metadata. In this review, we will focus on Latent Semantic Indexing (LSI), a computational linguistics technique increasingly used for a variety of biological purposes. It is noted for its ability to consistently outperform benchmark Boolean text searches and co-occurrence models at information retrieval and its power to extract indirect relationships within a data set. LSI has been used successfully to formulate new hypotheses, generate novel connections from existing data, and validate empirical data.
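A minimal latent semantic indexing sketch in Python (TF-IDF plus truncated SVD) of the kind described above is shown below; the corpus, query, and number of latent dimensions are toy assumptions.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

docs = ["p53 regulates apoptosis in tumor cells",
        "apoptosis pathways and cancer progression",
        "protein folding in the endoplasmic reticulum"]
tfidf = TfidfVectorizer(stop_words="english")
X = tfidf.fit_transform(docs)
lsi = TruncatedSVD(n_components=2, random_state=0)       # latent "concept" space
X_lsi = lsi.fit_transform(X)

query = tfidf.transform(["cancer cell death"])
sims = cosine_similarity(lsi.transform(query), X_lsi)    # indirect matches can still score well
print(sims)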
Parametrizing linear generalized Langevin dynamics from explicit molecular dynamics simulations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gottwald, Fabian; Karsten, Sven; Ivanov, Sergei D., E-mail: sergei.ivanov@uni-rostock.de
2015-06-28
Fundamental understanding of complex dynamics in many-particle systems on the atomistic level is of utmost importance. Often the systems of interest are of macroscopic size but can be partitioned into a few important degrees of freedom which are treated most accurately and others which constitute a thermal bath. Particular attention in this respect attracts the linear generalized Langevin equation, which can be rigorously derived by means of a linear projection technique. Within this framework, a complicated interaction with the bath can be reduced to a single memory kernel. This memory kernel in turn is parametrized for a particular system studied, usually by means of time-domain methods based on explicit molecular dynamics data. Here, we discuss that this task is more naturally achieved in frequency domain and develop a Fourier-based parametrization method that outperforms its time-domain analogues. Very surprisingly, the widely used rigid bond method turns out to be inappropriate in general. Importantly, we show that the rigid bond approach leads to a systematic overestimation of relaxation times, unless the system under study consists of a harmonic bath bi-linearly coupled to the relevant degrees of freedom.
Regularized estimation of Euler pole parameters
NASA Astrophysics Data System (ADS)
Aktuğ, Bahadir; Yildirim, Ömer
2013-07-01
Euler vectors provide a unified framework to quantify the relative or absolute motions of tectonic plates through various geodetic and geophysical observations. With the advent of space geodesy, Euler parameters of several relatively small plates have been determined through the velocities derived from the space geodesy observations. However, the available data are usually insufficient in number and quality to estimate both the Euler vector components and the Euler pole parameters reliably. Since Euler vectors are defined globally in an Earth-centered Cartesian frame, estimation with the limited geographic coverage of the local/regional geodetic networks usually results in highly correlated vector components. In the case of estimating the Euler pole parameters directly, the situation is even worse, and the position of the Euler pole is nearly collinear with the magnitude of the rotation rate. In this study, a new method, which consists of an analytical derivation of the covariance matrix of the Euler vector in an ideal network configuration, is introduced and a regularized estimation method specifically tailored for estimating the Euler vector is presented. The results show that the proposed method outperforms the least squares estimation in terms of the mean squared error.
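To make the estimation problem concrete, the sketch below sets up the standard v = ω × r observation equations from site velocities and solves them with a simple Tikhonov-regularized least squares; this generic ridge form is an assumption and not the specific regularization scheme developed in the paper.

import numpy as np

def design_matrix(r):                       # rows of r are site position vectors
    blocks = [np.array([[0.0, z, -y],
                        [-z, 0.0, x],
                        [y, -x, 0.0]]) for x, y, z in r]
    return np.vstack(blocks)                # one 3x3 block per site: v = A @ omega

rng = np.random.default_rng(0)
sites = rng.normal(size=(6, 3))
sites /= np.linalg.norm(sites, axis=1, keepdims=True)
omega_true = np.array([0.1, -0.2, 0.3])     # hypothetical Euler vector
A = design_matrix(sites)
v = A @ omega_true + 0.01 * rng.normal(size=A.shape[0])

lam = 1e-3                                  # regularization strength (assumed, would be tuned)
omega_hat = np.linalg.solve(A.T @ A + lam * np.eye(3), A.T @ v)
print(omega_hat)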
Li, Ji; Hu, Guoqing; Zhou, Yonghong; Zou, Chong; Peng, Wei; Alam Sm, Jahangir
2016-10-14
A piezo-resistive pressure sensor is made of silicon, the nature of which is considerably influenced by ambient temperature. The effect of temperature should be eliminated during the working period in expectation of linear output. To deal with this issue, an approach consisting of a hybrid kernel Least Squares Support Vector Machine (LSSVM) optimized by a chaotic ions motion algorithm is presented. To achieve the learning and generalization needed for excellent performance, a hybrid kernel function, constructed from a local kernel, the Radial Basis Function (RBF) kernel, and a global kernel, the polynomial kernel, is incorporated into the Least Squares Support Vector Machine. The chaotic ions motion algorithm is introduced to find the best hyper-parameters of the Least Squares Support Vector Machine. Temperature data from a calibration experiment are used to validate the proposed method. With attention to algorithm robustness and engineering applications, the compensation results show that the proposed scheme outperforms the other compared methods on several performance measures, namely the maximum absolute relative error, the minimum absolute relative error, and the mean and variance of the averaged value over fifty runs. Furthermore, the proposed temperature compensation approach lays a foundation for more extensive research.
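A simplified Python sketch of LSSVM regression with a hybrid (RBF plus polynomial) kernel is given below; the mixing weight and kernel hyper-parameters are assumptions, and the chaotic ions motion optimizer used in the paper to tune them is not reproduced.

import numpy as np

def hybrid_kernel(X1, X2, w=0.7, gamma_rbf=1.0, degree=2, c0=1.0):
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    rbf = np.exp(-gamma_rbf * d2)                  # local (RBF) kernel
    poly = (X1 @ X2.T + c0) ** degree              # global (polynomial) kernel
    return w * rbf + (1 - w) * poly

def lssvm_fit(X, y, gam=10.0):                     # solve the LSSVM linear system for (b, alpha)
    n = len(y)
    K = hybrid_kernel(X, X)
    M = np.block([[np.zeros((1, 1)), np.ones((1, n))],
                  [np.ones((n, 1)), K + np.eye(n) / gam]])
    sol = np.linalg.solve(M, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]

def lssvm_predict(Xtr, b, alpha, Xte):
    return hybrid_kernel(Xte, Xtr) @ alpha + b

temp = np.linspace(-20.0, 80.0, 40)                # hypothetical ambient temperatures
X = ((temp - temp.mean()) / temp.std()).reshape(-1, 1)
y = 0.02 * temp ** 2 - temp + np.random.default_rng(2).normal(0.0, 2.0, 40)
b, alpha = lssvm_fit(X, y)
print(lssvm_predict(X, b, alpha, X[:3]))           # compensated output at the first temperatures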
Alfaro-Núñez, Alonzo; Gilbert, M Thomas P
2014-09-01
The Chelonid fibropapilloma-associated herpesvirus (CFPHV) is hypothesized to be the causative agent of fibropapillomatosis, a neoplastic disease in sea turtles, given its consistent detection by PCR in fibropapilloma tumours. CFPHV has also been detected recently by PCR in tissue samples from clinically healthy turtles (those not exhibiting fibropapilloma tumours), thus presumably representing latent infections of the pathogen. Given that template copy numbers of viruses in latent infections can be very low, extremely sensitive PCR assays are needed to optimize detection efficiency. In this study, the efficiency of several PCR assays designed for CFPHV detection is explored and compared to a previously published method. The results show that adoption of a triplet set of singleplex PCR assays outperforms other methods, with an approximately 3-fold increase in detection success in comparison to the standard assay. Thus, a new assay for the detection of CFPHV DNA markers is presented, and adoption of its methodology is recommended in future CFPHV screens among sea turtles. Copyright © 2014 Elsevier B.V. All rights reserved.
Estimation of signal-dependent noise level function in transform domain via a sparse recovery model.
Yang, Jingyu; Gan, Ziqiao; Wu, Zhaoyang; Hou, Chunping
2015-05-01
This paper proposes a novel algorithm to estimate the noise level function (NLF) of signal-dependent noise (SDN) from a single image based on the sparse representation of NLFs. Noise level samples are estimated from the high-frequency discrete cosine transform (DCT) coefficients of nonlocal-grouped low-variation image patches. Then, an NLF recovery model based on the sparse representation of NLFs under a trained basis is constructed to recover NLF from the incomplete noise level samples. Confidence levels of the NLF samples are incorporated into the proposed model to promote reliable samples and weaken unreliable ones. We investigate the behavior of the estimation performance with respect to the block size, sampling rate, and confidence weighting. Simulation results on synthetic noisy images show that our method outperforms existing state-of-the-art schemes. The proposed method is evaluated on real noisy images captured by three types of commodity imaging devices, and shows consistently excellent SDN estimation performance. The estimated NLFs are incorporated into two well-known denoising schemes, nonlocal means and BM3D, and show significant improvements in denoising SDN-polluted images.
Scholz, Matthew; Lo, Chien -Chi; Chain, Patrick S. G.
2014-10-01
Assembly of metagenomic samples is a very complex process, with algorithms designed to address sequencing platform-specific issues (read length, data volume, and/or community complexity), while also faced with genomes that differ greatly in nucleotide compositional biases and in abundance. To address these issues, we have developed a post-assembly process: MetaGenomic Assembly by Merging (MeGAMerge). We compare this process to the performance of several assemblers, using both real and in-silico generated samples of different community composition and complexity. MeGAMerge consistently outperforms individual assembly methods, producing larger contigs with an increased number of predicted genes, without replication of data. MeGAMerge contigs are supported by read mapping and contig alignment data, when using synthetically-derived and real metagenomic data, as well as by gene prediction analyses and similarity searches. Ultimately, MeGAMerge is a flexible method that generates improved metagenome assemblies, with the ability to accommodate upcoming sequencing platforms, as well as present and future assembly algorithms.
Tang, Rongnian; Chen, Xupeng; Li, Chuang
2018-05-01
Near-infrared spectroscopy is an efficient, low-cost technology that has potential as an accurate method of detecting the nitrogen content of natural rubber leaves. The successive projections algorithm (SPA) is a widely used variable selection method for multivariate calibration, which uses projection operations to select a variable subset with minimum multi-collinearity. However, due to the fluctuation of correlation between variables, high collinearity may still exist among non-adjacent variables of the subset obtained by basic SPA. Based on an analysis of the correlation matrix of the spectral data, this paper proposes a correlation-based SPA (CB-SPA) that applies the successive projections algorithm in regions with consistent correlation. The results show that CB-SPA can select variable subsets with more valuable variables and less multi-collinearity. Meanwhile, models established with the CB-SPA subsets outperform those established with basic SPA subsets in predicting nitrogen content in terms of both cross-validation and external prediction. Moreover, CB-SPA is assured to be more efficient, for the time cost of its selection procedure is one-twelfth that of basic SPA.
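The basic projection step of SPA can be sketched as follows (a generic implementation; the correlation-based region partitioning that distinguishes CB-SPA is not shown, and the spectra here are random placeholders).

import numpy as np

def spa(X, n_select, start=0):
    Xp = X.astype(float).copy()                  # candidate columns get progressively projected
    selected = [start]
    for _ in range(n_select - 1):
        xk = Xp[:, selected[-1]]
        # project every column onto the orthogonal complement of the last selected column
        Xp = Xp - np.outer(xk, xk @ Xp) / (xk @ xk)
        norms = np.linalg.norm(Xp, axis=0)
        norms[selected] = -1.0                   # never re-select a variable
        selected.append(int(norms.argmax()))
    return selected

spectra = np.random.default_rng(3).normal(size=(50, 200))   # 50 samples x 200 wavelengths
print(spa(spectra, n_select=8))                              # indices of the selected variables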
Akita, Yasuyuki; Baldasano, Jose M; Beelen, Rob; Cirach, Marta; de Hoogh, Kees; Hoek, Gerard; Nieuwenhuijsen, Mark; Serre, Marc L; de Nazelle, Audrey
2014-04-15
In recognition that intraurban exposure gradients may be as large as between-city variations, recent air pollution epidemiologic studies have become increasingly interested in capturing within-city exposure gradients. In addition, because of the rapidly accumulating health data, recent studies also need to handle large study populations distributed over large geographic domains. Even though several modeling approaches have been introduced, a consistent modeling framework capturing within-city exposure variability and applicable to large geographic domains is still missing. To address these needs, we proposed a modeling framework based on the Bayesian Maximum Entropy method that integrates monitoring data and outputs from existing air quality models based on Land Use Regression (LUR) and Chemical Transport Models (CTM). The framework was applied to estimate the yearly average NO2 concentrations over the region of Catalunya in Spain. By jointly accounting for the global scale variability in the concentration from the output of CTM and the intraurban scale variability through LUR model output, the proposed framework outperformed more conventional approaches.
NASA Astrophysics Data System (ADS)
Hu, Yan-Yan; Li, Dong-Sheng
2016-01-01
Hyperspectral images (HSI) consist of many closely spaced bands carrying most of the object information. However, due to their high dimensionality and high data volume, it is hard to obtain satisfactory classification performance. In order to reduce HSI data dimensionality in preparation for high classification accuracy, it is proposed to combine a band selection method based on artificial immune systems (AIS) with a hybrid-kernel support vector machine (SVM-HK) algorithm. After comparing different kernels for hyperspectral analysis, the approach mixes the radial basis function kernel (RBF-K) with the sigmoid kernel (Sig-K) and applies the optimized hybrid kernels in SVM classifiers. The SVM-HK algorithm is then used to guide the band selection of an improved version of the AIS. The AIS is composed of clonal selection and elite antibody mutation, including an evaluation process with an optional index factor (OIF). Classification experiments were performed on a San Diego Naval Base scene acquired by AVIRIS; the results on this HRS dataset show that the method is able to efficiently remove band redundancy while outperforming the traditional SVM classifier.
Adaptive metric learning with deep neural networks for video-based facial expression recognition
NASA Astrophysics Data System (ADS)
Liu, Xiaofeng; Ge, Yubin; Yang, Chao; Jia, Ping
2018-01-01
Video-based facial expression recognition has become increasingly important for plenty of applications in the real world. Although numerous efforts have been made for the single sequence, how to balance the complex distribution of intra- and interclass variations well between sequences has remained a great difficulty in this area. We propose the adaptive (N+M)-tuplet clusters loss function and optimize it together with the softmax loss in the training phase. The variations introduced by personal attributes are alleviated using similarity measurements of multiple samples in the feature space, with many fewer comparisons than conventional deep metric learning approaches, which enables metric calculations for large data applications (e.g., videos). Both the spatial and temporal relations are well explored by a unified framework that consists of an Inception-ResNet network with long short-term memory and a structure of two fully connected layer branches. Our proposed method has been evaluated with three well-known databases, and the experimental results show that our method outperforms many state-of-the-art approaches.
XQ-NLM: Denoising Diffusion MRI Data via x-q Space Non-Local Patch Matching.
Chen, Geng; Wu, Yafeng; Shen, Dinggang; Yap, Pew-Thian
2016-10-01
Noise is a major issue influencing quantitative analysis in diffusion MRI. The effects of noise can be reduced by repeated acquisitions, but this leads to long acquisition times that can be unrealistic in clinical settings. For this reason, post-acquisition denoising methods have been widely used to improve SNR. Among existing methods, non-local means (NLM) has been shown to produce good image quality with edge preservation. However, currently the application of NLM to diffusion MRI has been mostly focused on the spatial space (i.e., the x-space), despite the fact that diffusion data live in a combined space consisting of the x-space and the q-space (i.e., the space of wavevectors). In this paper, we propose to extend NLM to both x-space and q-space. We show how patch-matching, as required in NLM, can be performed concurrently in x-q space with the help of azimuthal equidistant projection and rotation invariant features. Extensive experiments on both synthetic and real data confirm that the proposed x-q space NLM (XQ-NLM) outperforms the classic NLM.
Deep Learning for Automated Extraction of Primary Sites from Cancer Pathology Reports
Qiu, John; Yoon, Hong-Jun; Fearn, Paul A.; ...
2017-05-03
Pathology reports are a primary source of information for cancer registries, which process high volumes of free-text reports annually. Information extraction and coding is a manual, labor-intensive process. In this study we investigated deep learning with a convolutional neural network (CNN) for extracting ICD-O-3 topographic codes from a corpus of breast and lung cancer pathology reports. We performed two experiments, using a CNN and a more conventional term frequency vector approach, to assess the effects of class prevalence and inter-class transfer learning. The experiments were based on a set of 942 pathology reports with human expert annotations as the gold standard. CNN performance was compared against a more conventional term frequency vector space approach. We observed that the deep learning models consistently outperformed the conventional approaches in the class prevalence experiment, resulting in micro- and macro-F score increases of up to 0.132 and 0.226, respectively, when class labels were well populated. Specifically, the best performing CNN achieved a micro-F score of 0.722 over 12 ICD-O-3 topography codes. Transfer learning provided a consistent but modest performance boost for the deep learning methods, but trends were contingent on the CNN method and cancer site. Finally, these encouraging results demonstrate the potential of deep learning for automated abstraction of pathology reports.
A fast fully constrained geometric unmixing of hyperspectral images
NASA Astrophysics Data System (ADS)
Zhou, Xin; Li, Xiao-run; Cui, Jian-tao; Zhao, Liao-ying; Zheng, Jun-peng
2014-11-01
A great challenge in hyperspectral image analysis is decomposing a mixed pixel into a collection of endmembers and their corresponding abundance fractions. This paper presents an improved implementation of the Barycentric Coordinate approach to unmix hyperspectral images, integrated with the Most-Negative Remove Projection method to meet the abundance sum-to-one constraint (ASC) and abundance non-negativity constraint (ANC). The original barycentric coordinate approach interprets the endmember unmixing problem as a simplex volume ratio problem, which is solved by calculating the determinants of two augmented matrices. One consists of all the endmembers and the other consists of the to-be-unmixed pixel and all the endmembers except the one corresponding to the specific abundance that is to be estimated. In this paper, we first modify the Barycentric Coordinate algorithm by bringing in the Matrix Determinant Lemma to simplify the unmixing process, so that the calculation contains only linear matrix and vector operations. Thus, the per-pixel matrix determinant calculation required by the original algorithm is avoided. By the end of this step, the estimated abundances meet the ASC constraint. Then, the Most-Negative Remove Projection method is used to make the abundance fractions meet the full constraints. The algorithm is demonstrated on both synthetic and real images. The resulting algorithm yields abundance maps that are similar to those obtained by FCLS, while its runtime is superior owing to its computational simplicity.
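A simplified sketch of barycentric (sum-to-one constrained) abundance estimation with a most-negative-removal step is shown below; this is our plain reading of the approach and not the authors' Matrix Determinant Lemma implementation.

import numpy as np

def barycentric_abundances(x, E):
    # E: bands x p endmember matrix; append a row of ones to enforce sum(a) = 1 (ASC)
    A = np.vstack([E, np.ones((1, E.shape[1]))])
    b = np.concatenate([x, [1.0]])
    a, *_ = np.linalg.lstsq(A, b, rcond=None)
    return a

def fully_constrained(x, E):
    active = list(range(E.shape[1]))
    a = barycentric_abundances(x, E)
    while a.min() < 0 and len(active) > 1:       # drop the most negative abundance and re-solve
        active.pop(int(a.argmin()))
        a = barycentric_abundances(x, E[:, active])
    out = np.zeros(E.shape[1])
    out[active] = a
    return out

E = np.random.default_rng(4).uniform(size=(30, 4))     # 30 bands, 4 endmembers
x = E @ np.array([0.5, 0.3, 0.2, 0.0])                 # synthetic mixed pixel
print(fully_constrained(x, E))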
Circular Mixture Modeling of Color Distribution for Blind Stain Separation in Pathology Images.
Li, Xingyu; Plataniotis, Konstantinos N
2017-01-01
In digital pathology, to address color variation and histological component colocalization in pathology images, stain decomposition is usually performed preceding spectral normalization and tissue component segmentation. This paper examines the problem of stain decomposition, which is naturally a nonnegative matrix factorization (NMF) problem in algebra, and introduces a systematic and analytical solution consisting of a circular color analysis module and an NMF-based computation module. Unlike the paradigm of existing stain decomposition algorithms, where stain proportions are computed from estimated stain spectra using a matrix inverse operation directly, the introduced solution estimates stain spectra and stain depths via probabilistic reasoning individually. Since the proposed method pays extra attention to achromatic pixels in color analysis and to stain co-occurrence in pixel clustering, it achieves consistent and reliable stain decomposition with minimum decomposition residue. In particular, aware of the periodic and angular nature of hue, we propose the use of a circular von Mises mixture model to analyze the hue distribution, and provide a complete color-based pixel soft-clustering solution to address color mixing introduced by stain overlap. This innovation, combined with saturation-weighted computation, makes our study effective for weak stains and broad-spectrum stains. Extensive experimentation on multiple public pathology datasets suggests that our approach outperforms state-of-the-art blind stain separation methods in terms of decomposition effectiveness.
Finding Dantzig Selectors with a Proximity Operator based Fixed-point Algorithm
2014-11-01
experiments showed that this method usually outperforms the method in [2] in terms of CPU time while producing solutions of comparable quality. The...method proposed in [19]. To alleviate the difficulty caused by the subproblem without a closed form solution, a linearized ADM was proposed for the...a closed form solution, but the β-related subproblem does not and is solved approximately by using the nonmonotone gradient method in [18].
High precision automated face localization in thermal images: oral cancer dataset as test case
NASA Astrophysics Data System (ADS)
Chakraborty, M.; Raman, S. K.; Mukhopadhyay, S.; Patsa, S.; Anjum, N.; Ray, J. G.
2017-02-01
Automated face detection is the pivotal step in computer vision aided facial medical diagnosis and biometrics. This paper presents an automatic, subject-adaptive framework for accurate face detection in the long infrared spectrum on our database for oral cancer detection, consisting of malignant, precancerous and normal subjects of varied age groups. Previous work on oral cancer detection using Digital Infrared Thermal Imaging (DITI) reveals that patients and normal subjects differ significantly in their facial thermal distribution. Therefore, it is a challenging task to formulate a completely adaptive framework to veraciously localize the face from such a subject-specific modality. Our model consists of first extracting the most probable facial regions by minimum error thresholding, followed by ingenious adaptive methods to leverage the horizontal and vertical projections of the segmented thermal image. Additionally, the model incorporates our domain knowledge of exploiting the temperature difference between strategic locations of the face. To the best of our knowledge, this is the pioneering work on detecting faces in thermal facial images comprising both patients and normal subjects. Previous works on face detection have not specifically targeted automated medical diagnosis; the face bounding boxes returned by those algorithms are thus loose and not apt for further medical automation. Our algorithm significantly outperforms contemporary face detection algorithms in terms of commonly used metrics for evaluating face detection accuracy. Since our method has been tested on a challenging dataset consisting of both patients and normal subjects of diverse age groups, it can be seamlessly adapted to any DITI guided facial healthcare or biometric application.
Walia, Rasna R; Xue, Li C; Wilkins, Katherine; El-Manzalawy, Yasser; Dobbs, Drena; Honavar, Vasant
2014-01-01
Protein-RNA interactions are central to essential cellular processes such as protein synthesis and regulation of gene expression and play roles in human infectious and genetic diseases. Reliable identification of protein-RNA interfaces is critical for understanding the structural bases and functional implications of such interactions and for developing effective approaches to rational drug design. Sequence-based computational methods offer a viable, cost-effective way to identify putative RNA-binding residues in RNA-binding proteins. Here we report two novel approaches: (i) HomPRIP, a sequence homology-based method for predicting RNA-binding sites in proteins; (ii) RNABindRPlus, a new method that combines predictions from HomPRIP with those from an optimized Support Vector Machine (SVM) classifier trained on a benchmark dataset of 198 RNA-binding proteins. Although highly reliable, HomPRIP cannot make predictions for the unaligned parts of query proteins and its coverage is limited by the availability of close sequence homologs of the query protein with experimentally determined RNA-binding sites. RNABindRPlus overcomes these limitations. We compared the performance of HomPRIP and RNABindRPlus with that of several state-of-the-art predictors on two test sets, RB44 and RB111. On a subset of proteins for which homologs with experimentally determined interfaces could be reliably identified, HomPRIP outperformed all other methods achieving an MCC of 0.63 on RB44 and 0.83 on RB111. RNABindRPlus was able to predict RNA-binding residues of all proteins in both test sets, achieving an MCC of 0.55 and 0.37, respectively, and outperforming all other methods, including those that make use of structure-derived features of proteins. More importantly, RNABindRPlus outperforms all other methods for any choice of tradeoff between precision and recall. An important advantage of both HomPRIP and RNABindRPlus is that they rely on readily available sequence and sequence-derived features of RNA-binding proteins. A webserver implementation of both methods is freely available at http://einstein.cs.iastate.edu/RNABindRPlus/.
A foreground object features-based stereoscopic image visual comfort assessment model
NASA Astrophysics Data System (ADS)
Jin, Xin; Jiang, G.; Ying, H.; Yu, M.; Ding, S.; Peng, Z.; Shao, F.
2014-11-01
Since stereoscopic images provide observers with both a realistic and a potentially uncomfortable viewing experience, it is necessary to investigate the determinants of visual discomfort. Considering that the foreground object draws most attention when humans observe stereoscopic images, this paper proposes a new foreground object based visual comfort assessment (VCA) metric. In the first place, a suitable segmentation method is applied to the disparity map and the foreground object is ascertained as the one having the largest average disparity. In the second place, three visual features, namely the average disparity, average width and spatial complexity of the foreground object, are computed from the perspective of visual attention. Nevertheless, an object's width and complexity do not influence the perception of visual comfort as consistently as disparity does. In accordance with this psychological phenomenon, we divide the whole set of images into four categories on the basis of different disparity and width, and in the third place apply four different models to more precisely predict visual comfort. Experimental results show that the proposed VCA metric outperforms other existing metrics and can achieve a high consistency between objective and subjective visual comfort scores. The Pearson Linear Correlation Coefficient (PLCC) and Spearman Rank Order Correlation Coefficient (SROCC) are over 0.84 and 0.82, respectively.
Integrative Analysis of Prognosis Data on Multiple Cancer Subtypes
Liu, Jin; Huang, Jian; Zhang, Yawei; Lan, Qing; Rothman, Nathaniel; Zheng, Tongzhang; Ma, Shuangge
2014-01-01
In cancer research, profiling studies have been extensively conducted, searching for genes/SNPs associated with prognosis. Cancer is diverse. Examining the similarity and difference in the genetic basis of multiple subtypes of the same cancer can lead to a better understanding of their connections and distinctions. Classic meta-analysis methods analyze each subtype separately and then compare analysis results across subtypes. Integrative analysis methods, in contrast, analyze the raw data on multiple subtypes simultaneously and can outperform meta-analysis methods. In this study, prognosis data on multiple subtypes of the same cancer are analyzed. An AFT (accelerated failure time) model is adopted to describe survival. The genetic basis of multiple subtypes is described using the heterogeneity model, which allows a gene/SNP to be associated with prognosis of some subtypes but not others. A compound penalization method is developed to identify genes that contain important SNPs associated with prognosis. The proposed method has an intuitive formulation and is realized using an iterative algorithm. Asymptotic properties are rigorously established. Simulation shows that the proposed method has satisfactory performance and outperforms a penalization-based meta-analysis method and a regularized thresholding method. An NHL (non-Hodgkin lymphoma) prognosis study with SNP measurements is analyzed. Genes associated with the three major subtypes, namely DLBCL, FL, and CLL/SLL, are identified. The proposed method identifies genes that are different from alternatives and have important implications and satisfactory prediction performance. PMID:24766212
JPEG2000 vs. full frame wavelet packet compression for smart card medical records.
Leehan, Joaquín Azpirox; Lerallut, Jean-Francois
2006-01-01
This paper describes a comparison among different compression methods to be used in the context of electronic health records in the newer version of "smart cards". The JPEG2000 standard is compared to a full-frame wavelet packet compression method at high (33:1 and 50:1) compression rates. Results show that the full-frame method outperforms the JPEG2K standard qualitatively and quantitatively.
Horton, Bethany Jablonski; Wages, Nolan A.; Conaway, Mark R.
2016-01-01
Toxicity probability interval designs have received increasing attention as a dose-finding method in recent years. In this study, we compared the two-stage, likelihood-based continual reassessment method (CRM), modified toxicity probability interval (mTPI), and the Bayesian optimal interval design (BOIN) in order to evaluate each method's performance in dose selection for Phase I trials. We use several summary measures to compare the performance of these methods, including percentage of correct selection (PCS) of the true maximum tolerable dose (MTD), allocation of patients to doses at and around the true MTD, and an accuracy index. This index is an efficiency measure that describes the entire distribution of MTD selection and patient allocation by taking into account the distance between the true probability of toxicity at each dose level and the target toxicity rate. The simulation study considered a broad range of toxicity curves and various sample sizes. When considering PCS, we found that CRM outperformed the two competing methods in most scenarios, followed by BOIN, then mTPI. We observed a similar trend when considering the accuracy index for dose allocation, where CRM most often outperformed both the mTPI and BOIN. These trends were more pronounced with increasing number of dose levels. PMID:27435150
Comparison of DNA preservation methods for environmental bacterial community samples.
Gray, Michael A; Pratte, Zoe A; Kellogg, Christina A
2013-02-01
Field collections of environmental samples, for example corals, for molecular microbial analyses present distinct challenges. The lack of laboratory facilities in remote locations is common, and preservation of microbial community DNA for later study is critical. A particular challenge is keeping samples frozen in transit. Five nucleic acid preservation methods that do not require cold storage were compared for effectiveness over time and ease of use. Mixed microbial communities of known composition were created and preserved by DNAgard(™), RNAlater(®), DMSO-EDTA-salt (DESS), FTA(®) cards, and FTA Elute(®) cards. Automated ribosomal intergenic spacer analysis and clone libraries were used to detect specific changes in the faux communities over weeks and months of storage. A previously known bias in FTA(®) cards that results in lower recovery of pure cultures of Gram-positive bacteria was also detected in mixed community samples. There appears to be a uniform bias across all five preservation methods against microorganisms with high G + C DNA. Overall, the liquid-based preservatives (DNAgard(™), RNAlater(®), and DESS) outperformed the card-based methods. No single liquid method clearly outperformed the others, leaving method choice to be based on experimental design, field facilities, shipping constraints, and allowable cost. © 2012 Federation of European Microbiological Societies. Published by Blackwell Publishing Ltd. All rights reserved.
SSVEP recognition using common feature analysis in brain-computer interface.
Zhang, Yu; Zhou, Guoxu; Jin, Jing; Wang, Xingyu; Cichocki, Andrzej
2015-04-15
Canonical correlation analysis (CCA) has been successfully applied to steady-state visual evoked potential (SSVEP) recognition for brain-computer interface (BCI) application. Although the CCA method outperforms the traditional power spectral density analysis through multi-channel detection, it requires additionally pre-constructed reference signals of sine-cosine waves. It is likely to encounter overfitting in using a short time window since the reference signals include no features from training data. We consider that a group of electroencephalogram (EEG) data trials recorded at a certain stimulus frequency on a same subject should share some common features that may bear the real SSVEP characteristics. This study therefore proposes a common feature analysis (CFA)-based method to exploit the latent common features as natural reference signals in using correlation analysis for SSVEP recognition. Good performance of the CFA method for SSVEP recognition is validated with EEG data recorded from ten healthy subjects, in contrast to CCA and a multiway extension of CCA (MCCA). Experimental results indicate that the CFA method significantly outperformed the CCA and the MCCA methods for SSVEP recognition in using a short time window (i.e., less than 1s). The superiority of the proposed CFA method suggests it is promising for the development of a real-time SSVEP-based BCI. Copyright © 2014 Elsevier B.V. All rights reserved.
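For context, the baseline CCA recognizer that CFA is compared against can be sketched as follows; the sampling rate, stimulus frequencies, harmonic count, and the synthetic EEG segment are assumptions.

import numpy as np
from sklearn.cross_decomposition import CCA

def reference_signals(freq, fs, n_samples, n_harmonics=2):
    t = np.arange(n_samples) / fs
    refs = []
    for h in range(1, n_harmonics + 1):          # sine-cosine references at each harmonic
        refs += [np.sin(2 * np.pi * h * freq * t), np.cos(2 * np.pi * h * freq * t)]
    return np.column_stack(refs)

def classify_ssvep(eeg, fs, candidate_freqs):
    # eeg: samples x channels segment; pick the frequency whose references correlate best
    scores = []
    for f in candidate_freqs:
        Y = reference_signals(f, fs, eeg.shape[0])
        cca = CCA(n_components=1).fit(eeg, Y)
        u, v = cca.transform(eeg, Y)
        scores.append(np.corrcoef(u[:, 0], v[:, 0])[0, 1])
    return candidate_freqs[int(np.argmax(scores))]

fs, freqs = 250, [8.0, 10.0, 12.0, 15.0]
t = np.arange(fs) / fs                            # a 1 s window
eeg = np.column_stack([np.sin(2 * np.pi * 10.0 * t + p) for p in (0.0, 0.4, 0.8)])
eeg += 0.5 * np.random.default_rng(5).normal(size=eeg.shape)
print(classify_ssvep(eeg, fs, freqs))             # expected: 10.0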
NASA Astrophysics Data System (ADS)
Kukkonen, M.; Maltamo, M.; Packalen, P.
2017-08-01
Image matching is emerging as a compelling alternative to airborne laser scanning (ALS) as a data source for forest inventory and management. There is currently an open discussion in the forest inventory community about whether, and to what extent, the new method can be applied to practical inventory campaigns. This paper aims to contribute to this discussion by comparing two different image matching algorithms (Semi-Global Matching [SGM] and Next-Generation Automatic Terrain Extraction [NGATE]) and ALS in a typical managed boreal forest environment in southern Finland. Spectral features from unrectified aerial images were included in the modeling and the potential of image matching in areas without a high resolution digital terrain model (DTM) was also explored. Plot level predictions for total volume, stem number, basal area, height of basal area median tree and diameter of basal area median tree were modeled using an area-based approach. Plot level dominant tree species were predicted using a random forest algorithm, also using an area-based approach. The statistical difference between the error rates from different datasets was evaluated using a bootstrap method. Results showed that ALS outperformed image matching with every forest attribute, even when a high resolution DTM was used for height normalization and spectral information from images was included. Dominant tree species classification with image matching achieved accuracy levels similar to ALS regardless of the resolution of the DTM when spectral metrics were used. Neither of the image matching algorithms consistently outperformed the other, but there were noticeably different error rates depending on the parameter configuration, spectral band, resolution of DTM, or response variable. This study showed that image matching provides reasonable point cloud data for forest inventory purposes, especially when a high resolution DTM is available and information from the understory is redundant.
A computational approach to compare regression modelling strategies in prediction research.
Pajouheshnia, Romin; Pestman, Wiebe R; Teerenstra, Steven; Groenwold, Rolf H H
2016-08-25
It is often unclear which approach to fitting, assessing and adjusting a model will yield the most accurate prediction model. We present an extension of an approach for comparing modelling strategies in linear regression to the setting of logistic regression and demonstrate its application in clinical prediction research. A framework for comparing logistic regression modelling strategies by their likelihoods was formulated using a wrapper approach. Five different modelling strategies, including simple shrinkage methods, were compared in four empirical data sets to illustrate the concept of a priori strategy comparison. Simulations were performed in both randomly generated data and empirical data to investigate the influence of data characteristics on strategy performance. We applied the comparison framework in a case study setting. Optimal strategies were selected based on the results of a priori comparisons in a clinical data set, and the performance of models built according to each strategy was assessed using the Brier score and calibration plots. The performance of modelling strategies was highly dependent on the characteristics of the development data in both linear and logistic regression settings. A priori comparisons in four empirical data sets found that no strategy consistently outperformed the others. The percentage of times that a model adjustment strategy outperformed a logistic model ranged from 3.9% to 94.9%, depending on the strategy and data set. However, in our case study setting the a priori selection of optimal methods did not result in detectable improvement in model performance when assessed in an external data set. The performance of prediction modelling strategies is a data-dependent process and can be highly variable between data sets within the same clinical domain. A priori strategy comparison can be used to determine an optimal logistic regression modelling strategy for a given data set before selecting a final modelling approach.
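The general idea of an a priori strategy comparison, scoring several candidate modelling strategies on the development data before committing to one, can be sketched as below. The specific strategies, dataset, and scoring rules are illustrative assumptions, not the authors' wrapper framework.

```python
# Minimal sketch of comparing candidate logistic-regression strategies a priori
# via cross-validated log-likelihood and Brier score; strategies and scores are
# illustrative choices, not the paper's exact framework.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=500, n_features=10, n_informative=4, random_state=0)

strategies = {
    "maximum likelihood": LogisticRegression(C=1e6, max_iter=1000),   # effectively unpenalized
    "ridge (L2 shrinkage)": LogisticRegression(C=0.1, max_iter=1000),
    "lasso (L1 shrinkage)": LogisticRegression(penalty="l1", solver="liblinear", C=0.1),
}

for name, clf in strategies.items():
    model = make_pipeline(StandardScaler(), clf)
    loglik = cross_val_score(model, X, y, cv=10, scoring="neg_log_loss").mean()
    brier = -cross_val_score(model, X, y, cv=10, scoring="neg_brier_score").mean()
    print(f"{name:22s}  mean log-likelihood per obs: {loglik:.3f}   Brier: {brier:.3f}")
```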
Reshaping the mind: the benefits of bilingualism.
Bialystok, Ellen
2011-12-01
Studies have shown that bilingual individuals consistently outperform their monolingual counterparts on tasks involving executive control. The present paper reviews some of the evidence for this conclusion and relates the findings to the effect of bilingualism on cognitive organisation and to conceptual issues in the structure of executive control. Evidence for the protective effect of bilingualism against Alzheimer's disease is presented with some speculation about the reason for that protection. PsycINFO Database Record (c) 2011 APA, all rights reserved.
Accurate and reproducible functional maps in 127 human cell types via 2D genome segmentation
Hardison, Ross C.
2017-01-01
The Roadmap Epigenomics Consortium has published whole-genome functional annotation maps in 127 human cell types by integrating data from studies of multiple epigenetic marks. These maps have been widely used for studying gene regulation in cell type-specific contexts and predicting the functional impact of DNA mutations on disease. Here, we present a new map of functional elements produced by applying a method called IDEAS on the same data. The method has several unique advantages and outperforms existing methods, including that used by the Roadmap Epigenomics Consortium. Using five categories of independent experimental datasets, we compared the IDEAS and Roadmap Epigenomics maps. While the overall concordance between the two maps is high, the maps differ substantially in the prediction details and in their consistency of annotation of a given genomic position across cell types. The annotation from IDEAS is uniformly more accurate than the Roadmap Epigenomics annotation and the improvement is substantial based on several criteria. We further introduce a pipeline that improves the reproducibility of functional annotation maps. Thus, we provide a high-quality map of candidate functional regions across 127 human cell types and compare the quality of different annotation methods in order to facilitate biomedical research in epigenomics. PMID:28973456
Vision-Aided RAIM: A New Method for GPS Integrity Monitoring in Approach and Landing Phase
Fu, Li; Zhang, Jun; Li, Rui; Cao, Xianbin; Wang, Jinling
2015-01-01
In the 1980s, Global Positioning System (GPS) receiver autonomous integrity monitoring (RAIM) was proposed to provide the integrity of a navigation system by checking the consistency of GPS measurements. However, during the approach and landing phase of a flight path, where GPS visibility conditions are often poor, the performance of the existing RAIM method may not meet the stringent aviation requirements for availability and integrity due to insufficient observations. To solve this problem, a new RAIM method, named vision-aided RAIM (VA-RAIM), is proposed for GPS integrity monitoring in the approach and landing phase. By introducing landmarks as pseudo-satellites, the VA-RAIM enriches the navigation observations to improve the performance of RAIM. In the method, a computer vision system photographs and matches these landmarks to obtain additional measurements for navigation. The challenge, however, is that such additional measurements may suffer from vision errors. To ensure the reliability of the vision measurements, a GPS-based calibration algorithm is presented to reduce the time-invariant part of the vision errors. Then, the calibrated vision measurements are integrated with the GPS observations for integrity monitoring. Simulation results show that the VA-RAIM outperforms the conventional RAIM with a higher level of availability and fault detection rate. PMID:26378533
A Class of Prediction-Correction Methods for Time-Varying Convex Optimization
NASA Astrophysics Data System (ADS)
Simonetto, Andrea; Mokhtari, Aryan; Koppel, Alec; Leus, Geert; Ribeiro, Alejandro
2016-09-01
This paper considers unconstrained convex optimization problems with time-varying objective functions. We propose algorithms with a discrete time-sampling scheme to find and track the solution trajectory based on prediction and correction steps, while sampling the problem data at a constant rate of $1/h$, where $h$ is the length of the sampling interval. The prediction step is derived by analyzing the iso-residual dynamics of the optimality conditions. The correction step adjusts for the distance between the current prediction and the optimizer at each time step, and consists of either one or multiple gradient or Newton steps, which correspond to the gradient trajectory tracking (GTT) and Newton trajectory tracking (NTT) algorithms, respectively. Under suitable conditions, we establish that the asymptotic error incurred by both proposed methods behaves as $O(h^2)$, and in some cases as $O(h^4)$, which outperforms the state-of-the-art $O(h)$ error bound of correction-only methods with gradient correction steps. Moreover, when the characteristics of the objective function variation are not available, we propose approximate gradient and Newton tracking algorithms (AGT and ANT, respectively) that still attain these asymptotic error bounds. Numerical simulations demonstrate the practical utility of the proposed methods and show that they improve upon existing techniques by several orders of magnitude.
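A toy numerical sketch of the prediction-correction idea is given below for a time-varying quadratic whose optimizer is known in closed form, so the tracking error can be checked. The finite-difference approximation of the time derivative of the gradient is in the spirit of the approximate (AGT-like) variants; the objective, step sizes, and sampling interval are invented for illustration.

```python
# Toy sketch of prediction-correction tracking for a time-varying quadratic
# f(x; t) = 0.5 * ||x - a(t)||^2, whose optimizer is x*(t) = a(t).
# The time derivative of the gradient is approximated by finite differences;
# all constants below are invented.
import numpy as np

h = 0.05                       # sampling interval
steps = 200
alpha = 0.8                    # gradient correction step size
a = lambda t: np.array([np.cos(t), np.sin(t)])    # moving target
grad = lambda x, t: x - a(t)                       # gradient of f(.; t)
hess_inv = np.eye(2)                               # Hessian of f is the identity

x = np.zeros(2)
errs = []
for k in range(1, steps):
    t_prev, t_now, t_next = (k - 1) * h, k * h, (k + 1) * h
    # Prediction: follow the estimated drift of the optimizer.
    d_grad_dt = (grad(x, t_now) - grad(x, t_prev)) / h
    x = x - h * hess_inv @ d_grad_dt
    # Correction: one gradient step on the newly sampled objective.
    x = x - alpha * grad(x, t_next)
    errs.append(np.linalg.norm(x - a(t_next)))

print(f"asymptotic tracking error ~ {np.mean(errs[-50:]):.2e}")
```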
NASA Astrophysics Data System (ADS)
Choi, Jinhyeok; Kim, Hyeonjin
2016-12-01
To improve the efficacy of undersampled MRI, a method of designing adaptive sampling functions is proposed that is simple to implement on an MR scanner and yet effectively improves the performance of the sampling functions. An approximation of the energy distribution of an image (E-map) is estimated from highly undersampled k-space data acquired in a prescan and efficiently recycled in the main scan. An adaptive probability density function (PDF) is generated by combining the E-map with a modeled PDF. A set of candidate sampling functions are then prepared from the adaptive PDF, among which the one with maximum energy is selected as the final sampling function. To validate its computational efficiency, the proposed method was implemented on an MR scanner, and its robust performance in Fourier-transform (FT) MRI and compressed sensing (CS) MRI was tested by simulations and in a cherry tomato. The proposed method consistently outperforms the conventional modeled PDF approach for undersampling ratios of 0.2 or higher in both FT-MRI and CS-MRI. To fully benefit from undersampled MRI, it is preferable that the design of adaptive sampling functions be performed online immediately before the main scan. In this way, the proposed method may further improve the efficacy of the undersampled MRI.
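One plausible reading of the recipe described above, blending a prescan-derived energy map with a modeled variable-density PDF, drawing candidate masks, and keeping the one with maximum captured energy, is sketched below in 1D. The PDF shape, mixing weight, sizes, and simulated prescan are assumptions only, not the authors' implementation.

```python
# Rough sketch of building an adaptive 1D phase-encode sampling function:
# blend an energy map (E-map) from a prescan with a modeled variable-density PDF,
# draw candidate masks, keep the one capturing the most energy.
# The PDF model, mixing weight, and sizes are illustrative assumptions only.
import numpy as np

rng = np.random.default_rng(1)
n_pe, ratio, n_candidates = 256, 0.3, 50           # phase encodes, sampling ratio, candidates

# E-map: normalized energy per phase-encode line from a (here, simulated) prescan k-space.
prescan_kspace = rng.normal(size=(n_pe, 64)) / (1 + np.abs(np.arange(n_pe) - n_pe // 2))[:, None]
e_map = np.sum(np.abs(prescan_kspace) ** 2, axis=1)
e_map /= e_map.sum()

# Modeled PDF: generic variable-density profile peaked at the k-space centre.
dist = np.abs(np.arange(n_pe) - n_pe // 2) / (n_pe // 2)
model_pdf = (1 - dist) ** 4
model_pdf /= model_pdf.sum()

# Adaptive PDF: convex combination of E-map and modeled PDF (weight is an assumption).
adaptive_pdf = 0.5 * e_map + 0.5 * model_pdf
adaptive_pdf /= adaptive_pdf.sum()

# Draw candidate masks from the adaptive PDF and keep the one with maximum energy.
n_keep = int(ratio * n_pe)
best_mask, best_energy = None, -np.inf
for _ in range(n_candidates):
    picked = rng.choice(n_pe, size=n_keep, replace=False, p=adaptive_pdf)
    energy = e_map[picked].sum()
    if energy > best_energy:
        best_mask, best_energy = np.sort(picked), energy

print(f"selected {best_mask.size} lines capturing {100 * best_energy:.1f}% of prescan energy")
```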
Comparison of time-series registration methods in breast dynamic infrared imaging
NASA Astrophysics Data System (ADS)
Riyahi-Alam, S.; Agostini, V.; Molinari, F.; Knaflitz, M.
2015-03-01
Automated motion reduction in dynamic infrared imaging is in demand in clinical applications, since movement disarranges the time-temperature series of each pixel, thus originating thermal artifacts that might bias the clinical decision. All previously proposed registration methods are feature-based algorithms requiring manual intervention. The aim of this work is to optimize the registration strategy specifically for Breast Dynamic Infrared Imaging and to make it user-independent. We implemented and evaluated three different 3D time-series registration methods, (1) linear affine, (2) non-linear B-spline, and (3) Demons, applied to 12 datasets of healthy breast thermal images. The results are evaluated through normalized mutual information, with average values of 0.70 ± 0.03, 0.74 ± 0.03 and 0.81 ± 0.09 (out of 1) for affine, B-spline and Demons registration, respectively, as well as through breast boundary overlap and the Jacobian determinant of the deformation field. The statistical analysis of the results showed that the symmetric diffeomorphic Demons registration method also outperforms the others, yielding the best breast alignment and non-negative Jacobian values, which guarantee image similarity and anatomical consistency of the transformation, owing to homologous forces that shorten the pixel geometric disparities across all frames. We propose Demons registration as an effective technique for time-series dynamic infrared registration, to stabilize the local temperature oscillation.
Invariant 2D object recognition using the wavelet transform and structured neural networks
NASA Astrophysics Data System (ADS)
Khalil, Mahmoud I.; Bayoumi, Mohamed M.
1999-03-01
This paper applies the dyadic wavelet transform and the structured neural networks approach to recognize 2D objects under translation, rotation, and scale transformation. Experimental results are presented and compared with traditional methods. The experimental results showed that this refined technique successfully classified the objects and outperformed some traditional methods especially in the presence of noise.
ERIC Educational Resources Information Center
Boger, Zvi; Kuflik, Tsvi; Shoval, Peretz; Shapira, Bracha
2001-01-01
Discussion of information filtering (IF) and information retrieval focuses on the use of an artificial neural network (ANN) as an alternative method for both IF and term selection and compares its effectiveness to that of traditional methods. Results show that the ANN relevance prediction out-performs the prediction of an IF system. (Author/LRW)
Measuring Link-Resolver Success: Comparing 360 Link with a Local Implementation of WebBridge
ERIC Educational Resources Information Center
Herrera, Gail
2011-01-01
This study reviewed link resolver success comparing 360 Link and a local implementation of WebBridge. Two methods were used: (1) comparing article-level access and (2) examining technical issues for 384 randomly sampled OpenURLs. Google Analytics was used to collect user-generated OpenURLs. For both methods, 360 Link out-performed the local…
Waytowich, Nicholas R.; Lawhern, Vernon J.; Bohannon, Addison W.; Ball, Kenneth R.; Lance, Brent J.
2016-01-01
Recent advances in signal processing and machine learning techniques have enabled the application of Brain-Computer Interface (BCI) technologies to fields such as medicine, industry, and recreation; however, BCIs still suffer from the requirement of frequent calibration sessions due to the intra- and inter-individual variability of brain-signals, which makes calibration suppression through transfer learning an area of increasing interest for the development of practical BCI systems. In this paper, we present an unsupervised transfer method (spectral transfer using information geometry, STIG), which ranks and combines unlabeled predictions from an ensemble of information geometry classifiers built on data from individual training subjects. The STIG method is validated in both off-line and real-time feedback analysis during a rapid serial visual presentation task (RSVP). For detection of single-trial, event-related potentials (ERPs), the proposed method can significantly outperform existing calibration-free techniques as well as outperform traditional within-subject calibration techniques when limited data is available. This method demonstrates that unsupervised transfer learning for single-trial detection in ERP-based BCIs can be achieved without the requirement of costly training data, representing a step-forward in the overall goal of achieving a practical user-independent BCI system. PMID:27713685
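A compact sketch of the spectral ensemble idea that STIG builds on, estimating each source-subject classifier's reliability from the covariance structure of its unlabeled predictions and weighting the ensemble accordingly, is shown below. This is a simplified illustration under stated assumptions (sign fixing, plain eigenvector weighting, synthetic data), not the full STIG pipeline.

```python
# Simplified sketch of a spectral unsupervised ensemble: weight classifiers by the
# leading eigenvector of their prediction covariance on unlabeled data.
import numpy as np

def spectral_combine(pred):
    """pred: (n_classifiers, n_samples) array of +/-1 predictions on unlabeled data."""
    q = np.cov(pred)                       # prediction covariance matrix
    eigvals, eigvecs = np.linalg.eigh(q)
    v = eigvecs[:, -1]                     # leading eigenvector ~ classifier reliabilities
    if v.sum() < 0:                        # fix sign ambiguity (assumes most are better than chance)
        v = -v
    return np.sign(v @ pred)               # weighted vote

# Synthetic check: 10 "subjects" with different accuracies labeling 500 trials.
rng = np.random.default_rng(0)
truth = rng.choice([-1, 1], size=500)
accuracies = rng.uniform(0.55, 0.85, size=10)
pred = np.array([np.where(rng.random(truth.size) < acc, truth, -truth) for acc in accuracies])

combined = spectral_combine(pred)
print("best single classifier:", max((p == truth).mean() for p in pred))
print("spectral ensemble:     ", (combined == truth).mean())
```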
Community-based Inquiry Improves Critical Thinking in General Education Biology
Faiola, Celia L.; Johnson, James E.; Kurtz, Martha J.
2008-01-01
National stakeholders are becoming increasingly concerned about the inability of college graduates to think critically. Research shows that, while both faculty and students deem critical thinking essential, only a small fraction of graduates can demonstrate the thinking skills necessary for academic and professional success. Many faculty are considering nontraditional teaching methods that incorporate undergraduate research because they more closely align with the process of doing investigative science. This study compared a research-focused teaching method called community-based inquiry (CBI) with traditional lecture/laboratory in general education biology to discover which method would elicit greater gains in critical thinking. Results showed significant critical-thinking gains in the CBI group but decreases in a traditional group and a mixed CBI/traditional group. Prior critical-thinking skill, instructor, and ethnicity also significantly influenced critical-thinking gains, with nearly all ethnicities in the CBI group outperforming peers in both the mixed and traditional groups. Females, who showed decreased critical thinking in traditional courses relative to males, outperformed their male counterparts in CBI courses. Through the results of this study, it is hoped that faculty who value both research and critical thinking will consider using the CBI method. PMID:18765755
Fourment, Mathieu; Holmes, Edward C
2014-07-24
Early methods for estimating divergence times from gene sequence data relied on the assumption of a molecular clock. More sophisticated methods were created to model rate variation and used auto-correlation of rates, local clocks, or the so called "uncorrelated relaxed clock" where substitution rates are assumed to be drawn from a parametric distribution. In the case of Bayesian inference methods the impact of the prior on branching times is not clearly understood, and if the amount of data is limited the posterior could be strongly influenced by the prior. We develop a maximum likelihood method--Physher--that uses local or discrete clocks to estimate evolutionary rates and divergence times from heterochronous sequence data. Using two empirical data sets we show that our discrete clock estimates are similar to those obtained by other methods, and that Physher outperformed some methods in the estimation of the root age of an influenza virus data set. A simulation analysis suggests that Physher can outperform a Bayesian method when the real topology contains two long branches below the root node, even when evolution is strongly clock-like. These results suggest it is advisable to use a variety of methods to estimate evolutionary rates and divergence times from heterochronous sequence data. Physher and the associated data sets used here are available online at http://code.google.com/p/physher/.
Tkachenko, Pavlo; Kriukova, Galyna; Aleksandrova, Marharyta; Chertov, Oleg; Renard, Eric; Pereverzyev, Sergei V
2016-10-01
Nocturnal hypoglycemia (NH) is common in patients with insulin-treated diabetes. Despite the risk associated with NH, there are only a few methods aiming at the prediction of such events based on intermittent blood glucose monitoring data and none has been validated for clinical use. Here we propose a method of combining several predictors into a new one that will perform at the level of the best involved one, or even outperform all individual candidates. The idea of the method is to use a recently developed strategy for aggregating ranking algorithms. The method has been calibrated and tested on data extracted from clinical trials, performed in the European FP7-funded project DIAdvisor. Then we have tested the proposed approach on other datasets to show the portability of the method. This feature of the method allows its simple implementation in the form of a diabetic smartphone app. On the considered datasets the proposed approach exhibits good performance in terms of sensitivity, specificity and predictive values. Moreover, the resulting predictor automatically performs at the level of the best involved method or even outperforms it. We propose a strategy for a combination of NH predictors that leads to a method exhibiting a reliable performance and the potential for everyday use by any patient who performs self-monitoring of blood glucose. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Mao, Wenzhi; Kaya, Cihan; Dutta, Anindita; Horovitz, Amnon; Bahar, Ivet
2015-06-15
With rapid accumulation of sequence data on several species, extracting rational and systematic information from multiple sequence alignments (MSAs) is becoming increasingly important. Currently, there is a plethora of computational methods for investigating coupled evolutionary changes in pairs of positions along the amino acid sequence, and making inferences on structure and function. Yet, the significance of coevolution signals remains to be established. Also, a large number of false positives (FPs) arise from insufficient MSA size, phylogenetic background and indirect couplings. Here, a set of 16 pairs of non-interacting proteins is thoroughly examined to assess the effectiveness and limitations of different methods. The analysis shows that recent computationally expensive methods designed to remove biases from indirect couplings outperform others in detecting tertiary structural contacts as well as eliminating intermolecular FPs; whereas traditional methods such as mutual information benefit from refinements such as shuffling, while being highly efficient. Computations repeated with 2,330 pairs of protein families from the Negatome database corroborated these results. Finally, using a training dataset of 162 families of proteins, we propose a combined method that outperforms existing individual methods. Overall, the study provides simple guidelines towards the choice of suitable methods and strategies based on available MSA size and computing resources. Software is freely available through the Evol component of ProDy API. © The Author 2015. Published by Oxford University Press.
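The mutual-information-with-shuffling baseline mentioned above can be sketched as follows: score a pair of MSA columns by their mutual information and compare it with a column-shuffled null. Gap handling, sequence weighting, and the exact shuffle refinement used in practice are simplified assumptions here.

```python
# Minimal sketch of mutual information between two MSA columns with a shuffle-based null.
import numpy as np
from collections import Counter

def mutual_information(col_a, col_b):
    n = len(col_a)
    pa, pb = Counter(col_a), Counter(col_b)
    pab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), c in pab.items():
        p_ab = c / n
        mi += p_ab * np.log2(p_ab / ((pa[a] / n) * (pb[b] / n)))
    return mi

def shuffled_zscore(col_a, col_b, n_shuffles=200, seed=0):
    rng = np.random.default_rng(seed)
    observed = mutual_information(col_a, col_b)
    null = [mutual_information(col_a, rng.permutation(col_b)) for _ in range(n_shuffles)]
    return (observed - np.mean(null)) / (np.std(null) + 1e-12)

# Toy MSA columns: column b partially tracks column a (simulated coevolution).
rng = np.random.default_rng(1)
col_a = rng.choice(list("ACDE"), size=300)
col_b = np.where(rng.random(300) < 0.6, col_a, rng.choice(list("ACDE"), size=300))
print(f"z-score vs shuffled null: {shuffled_zscore(list(col_a), list(col_b)):.1f}")
```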
Combining Biomarkers Linearly and Nonlinearly for Classification Using the Area Under the ROC Curve
Fong, Youyi; Yin, Shuxin; Huang, Ying
2016-01-01
In biomedical studies, it is often of interest to classify or predict a subject's disease status based on a variety of biomarker measurements. A commonly used classification criterion is the AUC (area under the receiver operating characteristic curve). Many methods have been proposed to optimize approximated empirical AUC criteria, but there are two limitations to the existing methods. First, most methods are only designed to find the best linear combination of biomarkers, which may not perform well when there is strong nonlinearity in the data. Second, many existing linear combination methods use gradient-based algorithms to find the best marker combination, which often result in sub-optimal local solutions. In this paper, we address these two problems by proposing a new kernel-based AUC optimization method called Ramp AUC (RAUC). This method approximates the empirical AUC loss function with a ramp function, and finds the best combination by a difference-of-convex-functions algorithm. We show that as a linear combination method, RAUC leads to a consistent and asymptotically normal estimator of the linear marker combination when the data are generated from a semiparametric generalized linear model, just as the Smoothed AUC method (SAUC) does. Through simulation studies and real data examples, we demonstrate that RAUC outperforms SAUC in finding the best linear marker combinations, and can successfully capture nonlinear patterns in the data to achieve better classification performance. We illustrate our method with a dataset from a recent HIV vaccine trial. PMID:27058981
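The ramp approximation of the empirical AUC for a linear marker combination can be illustrated as below. The ramp width and the absence of the difference-of-convex optimizer are simplifying assumptions; this only evaluates a generic surrogate objective for a few candidate weight vectors.

```python
# Small sketch of a ramp-approximated empirical AUC for a linear marker combination.
import numpy as np

def ramp_auc(weights, x_pos, x_neg, s=1.0):
    """Surrogate of empirical AUC: fraction of (pos, neg) pairs ranked correctly,
    with the 0/1 indicator replaced by a ramp of width s (an assumed shape)."""
    diffs = x_pos @ weights[:, None] - (x_neg @ weights[:, None]).T   # all pairwise score differences
    return np.clip(diffs / s + 0.5, 0.0, 1.0).mean()

rng = np.random.default_rng(0)
x_pos = rng.normal(loc=[1.0, 0.5], size=(100, 2))
x_neg = rng.normal(loc=[0.0, 0.0], size=(150, 2))

for w in [np.array([1.0, 0.0]), np.array([0.7, 0.7]), np.array([0.0, 1.0])]:
    print(w, f"ramp-AUC = {ramp_auc(w, x_pos, x_neg):.3f}")
```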
Cyclosporine ophthalmic emulsions for the treatment of dry eye: a review of the clinical evidence
Ames, Philip; Galor, Anat
2015-01-01
Dry eye has gained recognition as a public health problem given its high prevalence, morbidity and cost implications. Although dry eye is common and affects patients’ quality of life, only one medication, cyclosporine 0.05% emulsion, has been approved by the US FDA for its treatment. In this review, we summarize the basic science and clinical data regarding the use of cyclosporine in the treatment of dry eye. Randomized controlled trials showed that cyclosporine emulsion outperformed vehicles in the majority of trials, consistently decreasing corneal staining and increasing Schirmer scores. Symptom improvement was more variable, however, with ocular dryness shown to be the most consistently improved symptom over vehicle. PMID:25960865
Iterative Nonlocal Total Variation Regularization Method for Image Restoration
Xu, Huanyu; Sun, Quansen; Luo, Nan; Cao, Guo; Xia, Deshen
2013-01-01
In this paper, a Bregman iteration based total variation image restoration algorithm is proposed. Based on the Bregman iteration, the algorithm splits the original total variation problem into sub-problems that are easy to solve. Moreover, non-local regularization is introduced into the proposed algorithm, and a method to choose the non-local filter parameter locally and adaptively is proposed. Experiment results show that the proposed algorithms outperform some other regularization methods. PMID:23776560
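For orientation, a split-Bregman total-variation denoising baseline can be run with scikit-image as shown below. The paper's additions (nonlocal regularization with a locally and adaptively chosen filter parameter) are not reproduced here, and the weight value is an arbitrary choice.

```python
# Split-Bregman TV denoising baseline with scikit-image (not the paper's algorithm).
import numpy as np
from skimage import data, img_as_float
from skimage.restoration import denoise_tv_bregman

rng = np.random.default_rng(0)
clean = img_as_float(data.camera())
noisy = np.clip(clean + rng.normal(scale=0.1, size=clean.shape), 0, 1)

# Larger weight -> less smoothing; the value below is an arbitrary choice.
restored = denoise_tv_bregman(noisy, weight=5.0, isotropic=True)
print("MSE noisy   :", float(np.mean((noisy - clean) ** 2)))
print("MSE restored:", float(np.mean((restored - clean) ** 2)))
```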
An Efficient Augmented Lagrangian Method with Applications to Total Variation Minimization
2012-08-17
Based on the classic augmented Lagrangian multiplier method, we propose, analyze and test an algorithm for solving a class of equality-constrained non-smooth optimization problems (chiefly but not necessarily convex) … significantly outperforming several state-of-the-art solvers on most tested problems. The resulting MATLAB solver, called TVAL3, has been posted online [23].
A case study of alternative site response explanatory variables in Parkfield, California
Thompson, E.M.; Baise, L.G.; Kayen, R.E.; Morgan, E.C.; Kaklamanos, J.
2011-01-01
The combination of densely-spaced strong-motion stations in Parkfield, California, and spectral analysis of surface waves (SASW) profiles provides an ideal dataset for assessing the accuracy of different site response explanatory variables. We judge accuracy in terms of spatial coverage and correlation with observations. The performance of the alternative models is period-dependent, but generally we observe that: (1) where a profile is available, the square-root-of-impedance method outperforms VS30 (average S-wave velocity to 30 m depth), and (2) where a profile is unavailable, the topographic-slope method outperforms surficial geology. The fundamental site frequency is a valuable site response explanatory variable, though less valuable than VS30. However, given the expense and difficulty of obtaining reliable estimates of VS30 and the relative ease with which the fundamental site frequency can be computed, the fundamental site frequency may prove to be a valuable site response explanatory variable for many applications. © 2011 ASCE.
Derivation of global vegetation biophysical parameters from EUMETSAT Polar System
NASA Astrophysics Data System (ADS)
García-Haro, Francisco Javier; Campos-Taberner, Manuel; Muñoz-Marí, Jordi; Laparra, Valero; Camacho, Fernando; Sánchez-Zapero, Jorge; Camps-Valls, Gustau
2018-05-01
This paper presents the algorithm developed in LSA-SAF (Satellite Application Facility for Land Surface Analysis) for the derivation of global vegetation parameters from the AVHRR (Advanced Very High Resolution Radiometer) sensor on board MetOp (Meteorological-Operational) satellites forming the EUMETSAT (European Organization for the Exploitation of Meteorological Satellites) Polar System (EPS). The suite of LSA-SAF EPS vegetation products includes the leaf area index (LAI), the fractional vegetation cover (FVC), and the fraction of absorbed photosynthetically active radiation (FAPAR). LAI, FAPAR, and FVC characterize the structure and the functioning of vegetation and are key parameters for a wide range of land-biosphere applications. The algorithm is based on a hybrid approach that blends the generalization capabilities offered by physical radiative transfer models with the accuracy and computational efficiency of machine learning methods. One major feature is the implementation of multi-output retrieval methods able to jointly and more consistently estimate all the biophysical parameters at the same time. We propose a multi-output Gaussian process regression (GPRmulti), which outperforms other considered methods over PROSAIL (coupling of PROSPECT and SAIL (Scattering by Arbitrary Inclined Leaves) radiative transfer models) EPS simulations. The global EPS products include uncertainty estimates taking into account the uncertainty captured by the retrieval method and input errors propagation. A sensitivity analysis is performed to assess several sources of uncertainties in retrievals and maximize the positive impact of modeling the noise in training simulations. The paper discusses initial validation studies and provides details about the characteristics and overall quality of the products, which can be of interest to assist the successful use of the data by a broad user's community. The consistent generation and distribution of the EPS vegetation products will constitute a valuable tool for monitoring of earth surface dynamic processes.
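A toy sketch of a multi-output Gaussian process regression in the spirit of GPRmulti is given below: one model jointly predicts LAI, FVC and FAPAR from reflectances. The training pairs are random surrogates, not PROSAIL simulations, and the kernel choice is an assumption.

```python
# Toy multi-output GP regression: jointly predict LAI, FVC, FAPAR from reflectances.
# Training data are random surrogates, not PROSAIL simulations.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
n_train, n_bands = 300, 4
reflectance = rng.uniform(0, 0.6, size=(n_train, n_bands))

# Surrogate "truth": three correlated biophysical parameters derived from the bands.
lai = 6 * reflectance[:, 3] / (reflectance[:, 2] + 0.1)
fvc = np.clip(lai / 6, 0, 1)
fapar = 0.95 * fvc
targets = np.column_stack([lai, fvc, fapar]) + 0.01 * rng.normal(size=(n_train, 3))

kernel = RBF(length_scale=np.ones(n_bands)) + WhiteKernel(noise_level=1e-4)
gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(reflectance, targets)

new_pixel = rng.uniform(0, 0.6, size=(1, n_bands))
mean, std = gpr.predict(new_pixel, return_std=True)
print("LAI, FVC, FAPAR estimates:", np.round(mean, 3), "uncertainty:", np.round(std, 3))
```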
Liu, Peigui; Elshall, Ahmed S.; Ye, Ming; ...
2016-02-05
Evaluating the marginal likelihood is the most critical and computationally expensive task when conducting Bayesian model averaging to quantify parametric and model uncertainties. The evaluation is commonly done by using Laplace approximations to evaluate semianalytical expressions of the marginal likelihood or by using Monte Carlo (MC) methods to evaluate the arithmetic or harmonic mean of a joint likelihood function. This study introduces a new MC method, i.e., thermodynamic integration, which has not been attempted in environmental modeling. Instead of using samples only from the prior parameter space (as in arithmetic mean evaluation) or the posterior parameter space (as in harmonic mean evaluation), the thermodynamic integration method uses samples generated gradually from the prior to the posterior parameter space. This is done through a path sampling that conducts Markov chain Monte Carlo simulation with different power coefficient values applied to the joint likelihood function. The thermodynamic integration method is evaluated using three analytical functions by comparing the method with two variants of the Laplace approximation method and three MC methods, including the nested sampling method that was recently introduced into environmental modeling. The thermodynamic integration method outperforms the other methods in terms of accuracy, convergence, and consistency. The thermodynamic integration method is also applied to a synthetic case of groundwater modeling with four alternative models. The application shows that model probabilities obtained using the thermodynamic integration method improve the predictive performance of Bayesian model averaging. As a result, the thermodynamic integration method is mathematically rigorous, and its MC implementation is computationally general for a wide range of environmental problems.
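A didactic sketch of thermodynamic integration (path sampling) is given below for a tiny conjugate model whose exact marginal likelihood is known, so the estimate can be checked. The temperature schedule, chain length, and proposal scale are arbitrary choices, not values from the study.

```python
# Thermodynamic integration for y_i ~ N(mu, 1) with prior mu ~ N(0, 1):
# log marginal likelihood = integral over beta of E_beta[log L(mu)],
# where beta is the power applied to the likelihood.
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(loc=0.7, size=30)
n = y.size

log_lik = lambda mu: -0.5 * np.sum((y - mu) ** 2) - 0.5 * n * np.log(2 * np.pi)
log_prior = lambda mu: -0.5 * mu ** 2 - 0.5 * np.log(2 * np.pi)

def power_posterior_mean_loglik(beta, n_iter=4000, step=0.5):
    """Mean log-likelihood under the power posterior p(mu|y,beta) ~ L(mu)^beta p(mu)."""
    mu, samples = 0.0, []
    for _ in range(n_iter):
        prop = mu + step * rng.normal()
        log_ratio = beta * (log_lik(prop) - log_lik(mu)) + log_prior(prop) - log_prior(mu)
        if np.log(rng.random()) < log_ratio:
            mu = prop
        samples.append(log_lik(mu))
    return np.mean(samples[n_iter // 2:])          # discard burn-in

# Power coefficients from prior (beta=0) to posterior (beta=1), denser near 0.
betas = np.linspace(0, 1, 21) ** 3
expectations = [power_posterior_mean_loglik(b) for b in betas]
log_ml_ti = np.trapz(expectations, betas)

# Exact marginal likelihood for this conjugate model, for comparison.
quad = y @ y - (y.sum() ** 2) / (n + 1)
log_ml_exact = -0.5 * (n * np.log(2 * np.pi) + np.log(n + 1) + quad)
print(f"thermodynamic integration: {log_ml_ti:.2f}   exact: {log_ml_exact:.2f}")
```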
Experimental comparison of manufacturing techniques of toughened and nanoreinforced polyamides
NASA Astrophysics Data System (ADS)
Siengchin, S.; Bergmann, C.; Dangtungee, R.
2011-11-01
Composites consisting of polyamide-6 (PA-6), nitrile rubber (NBR), and sodium fluorohectorite (FH) or alumina silicate (Sungloss; SG) were produced by different techniques with latex precompounding. Their tensile and thermomechanical properties were determined by using tensile tests and a dynamic-mechanical analysis, performed at various temperatures. The PA-6/NBR composite systems produced by the direct melt compounding outperformed those obtained by using the masterbatch technique with respect to the strength and ductility, but the latter ones had a higher storage modulus.
A reconsideration of negative ratings for network-based recommendation
NASA Astrophysics Data System (ADS)
Hu, Liang; Ren, Liang; Lin, Wenbin
2018-01-01
Recommendation algorithms based on bipartite networks have become increasingly popular, thanks to their accuracy and flexibility. Currently, many of these methods ignore users' negative ratings. In this work, we propose a method to exploit negative ratings for the network-based inference algorithm. We find that negative ratings play a positive role regardless of sparsity of data sets. Furthermore, we improve the efficiency of our method and compare it with the state-of-the-art algorithms. Experimental results show that the present method outperforms the existing algorithms.
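The standard network-based inference (mass diffusion) recommender on a user-item bipartite matrix can be sketched as below. Treating disliked items as negative initial resource, as the code does, is only one plausible reading of "exploiting negative ratings" and is flagged as an assumption.

```python
# Sketch of network-based inference (mass diffusion) on a toy user-item matrix.
import numpy as np

# Rows = users, columns = items; +1 liked, -1 disliked, 0 unrated (toy data).
R = np.array([
    [ 1,  1,  0,  0, -1],
    [ 1,  0,  1,  0,  0],
    [ 0,  1,  1,  1,  0],
    [-1,  0,  0,  1,  1],
])
A = (R != 0).astype(float)                 # interaction (collection) matrix
k_user = A.sum(axis=1, keepdims=True)      # user degrees
k_item = A.sum(axis=0, keepdims=True)      # item degrees

# Item-to-item transfer matrix W[j, i]: share of item i's initial resource that
# lands on item j after the item -> user -> item diffusion.
W = (A / k_user).T @ (A / k_item)

def recommend(user, use_negative=True):
    # Assumption: disliked items inject negative initial resource.
    f0 = R[user].astype(float) if use_negative else A[user]
    scores = W @ f0
    scores[A[user] > 0] = -np.inf          # hide items the user already rated
    return np.argsort(scores)[::-1]

print("ranking for user 0:", recommend(user=0))
```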
Local Intrinsic Dimension Estimation by Generalized Linear Modeling.
Hino, Hideitsu; Fujiki, Jun; Akaho, Shotaro; Murata, Noboru
2017-07-01
We propose a method for intrinsic dimension estimation. By fitting the power of distance from an inspection point and the number of samples included inside a ball with a radius equal to the distance, to a regression model, we estimate the goodness of fit. Then, by using the maximum likelihood method, we estimate the local intrinsic dimension around the inspection point. The proposed method is shown to be comparable to conventional methods in global intrinsic dimension estimation experiments. Furthermore, we experimentally show that the proposed method outperforms a conventional local dimension estimation method.
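A simplified relative of this idea, regressing the log of the neighbour count on the log of the radius around an inspection point, is sketched below; it is not the authors' GLM/maximum-likelihood formulation.

```python
# Simplified local intrinsic dimension estimate: the number of samples inside a
# ball of radius r around the inspection point scales roughly as r^d, so the
# slope of log(count) vs log(radius) estimates d.
import numpy as np

def local_dimension(data, inspection_point, k_max=100):
    dists = np.sort(np.linalg.norm(data - inspection_point, axis=1))
    radii = dists[1:k_max]                 # distance to the j-th nearest neighbour
    counts = np.arange(1, k_max)           # j samples inside that radius
    slope, _ = np.polyfit(np.log(radii), np.log(counts), deg=1)
    return slope

# Points on the unit sphere in R^3 (intrinsic dimension 2) embedded in 3D space.
rng = np.random.default_rng(0)
points = rng.normal(size=(5000, 3))
points /= np.linalg.norm(points, axis=1, keepdims=True)
print(f"estimated local dimension: {local_dimension(points, points[0]):.2f}")
```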
NASA Astrophysics Data System (ADS)
Chen, X.; Kumar, M.; Basso, S.; Marani, M.
2017-12-01
Storage-discharge (S-Q) relations are widely used to derive watershed properties and predict streamflow responses. These relations are often obtained using different recession analysis methods, which vary in recession period identification criteria and Q vs. -dQ/dt fitting scheme. Although previous studies have indicated that different recession analysis methods can result in significantly different S-Q relations and subsequently derived hydrological variables, this observation has often been overlooked and S-Q relations have been used in as is form. This study evaluated the effectiveness of four recession analysis methods in obtaining the characteristic S-Q relation and reconstructing the streamflow. Results indicate that while some methods generally performed better than others, none of them consistently outperformed the others. Even the best-performing method could not yield accurate reconstructed streamflow time series and its PDFs in some watersheds, implying that either derived S-Q relations might not be reliable or S-Q relations cannot be used for hydrological simulations. Notably, accuracy of the methods is influenced by the extent of scatter in the ln(-dQ/dt) vs. ln(Q) plot. In addition, the derived S-Q relation was very sensitive to the criteria used for identifying recession periods. This result raises a warning sign against indiscriminate application of recession analysis methods and derived S-Q relations for watershed characterizations or hydrologic simulations. Thorough evaluation of representativeness of the derived S-Q relation should be performed before it is used for hydrologic analysis.
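One common recession-analysis recipe, identifying sustained declining-flow periods and fitting ln(-dQ/dt) = ln(a) + b ln(Q), is sketched below on synthetic streamflow. The identification criteria and fitting scheme are arbitrary choices, which is precisely the sensitivity the abstract warns about.

```python
# Minimal recession analysis: extract sustained declining-flow segments and fit
# -dQ/dt = a * Q^b in log space. Selection criteria and fitting scheme are assumptions.
import numpy as np

def recession_fit(q, min_length=5):
    dq = np.diff(q)
    declining = dq < 0
    q_pts, dqdt_pts = [], []
    start = None
    for i, flag in enumerate(np.append(declining, False)):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start >= min_length:                  # keep only sustained recessions
                q_pts.extend(q[start:i])
                dqdt_pts.extend(dq[start:i])
            start = None
    q_pts, dqdt_pts = np.array(q_pts), np.array(dqdt_pts)
    b, ln_a = np.polyfit(np.log(q_pts), np.log(-dqdt_pts), deg=1)
    return np.exp(ln_a), b

# Synthetic streamflow: exponential recessions (b ~ 1) interrupted by storm pulses.
rng = np.random.default_rng(0)
q = [10.0]
for t in range(400):
    q.append(q[-1] * 0.97 + (20.0 if rng.random() < 0.03 else 0.0))
a, b = recession_fit(np.array(q))
print(f"fitted recession:  -dQ/dt = {a:.3f} * Q^{b:.2f}")
```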
Improved Statistical Methods Enable Greater Sensitivity in Rhythm Detection for Genome-Wide Data
Hutchison, Alan L.; Maienschein-Cline, Mark; Chiang, Andrew H.; Tabei, S. M. Ali; Gudjonson, Herman; Bahroos, Neil; Allada, Ravi; Dinner, Aaron R.
2015-01-01
Robust methods for identifying patterns of expression in genome-wide data are important for generating hypotheses regarding gene function. To this end, several analytic methods have been developed for detecting periodic patterns. We improve one such method, JTK_CYCLE, by explicitly calculating the null distribution such that it accounts for multiple hypothesis testing and by including non-sinusoidal reference waveforms. We term this method empirical JTK_CYCLE with asymmetry search, and we compare its performance to JTK_CYCLE with Bonferroni and Benjamini-Hochberg multiple hypothesis testing correction, as well as to five other methods: cyclohedron test, address reduction, stable persistence, ANOVA, and F24. We find that ANOVA, F24, and JTK_CYCLE consistently outperform the other three methods when data are limited and noisy; empirical JTK_CYCLE with asymmetry search gives the greatest sensitivity while controlling for the false discovery rate. Our analysis also provides insight into experimental design and we find that, for a fixed number of samples, better sensitivity and specificity are achieved with higher numbers of replicates than with higher sampling density. Application of the methods to detecting circadian rhythms in a metadataset of microarrays that quantify time-dependent gene expression in whole heads of Drosophila melanogaster reveals annotations that are enriched among genes with highly asymmetric waveforms. These include a wide range of oxidation reduction and metabolic genes, as well as genes with transcripts that have multiple splice forms. PMID:25793520
Post-boosting of classification boundary for imbalanced data using geometric mean.
Du, Jie; Vong, Chi-Man; Pun, Chi-Man; Wong, Pak-Kin; Ip, Weng-Fai
2017-12-01
In this paper, a novel imbalance learning method for binary classes is proposed, named Post-Boosting of classification boundary for Imbalanced data (PBI), which can significantly improve the performance of any trained neural network (NN) classification boundary. The procedure of PBI simply consists of two steps: an (imbalanced) NN learning method is first applied to produce a classification boundary, which is then adjusted by PBI under the geometric mean (G-mean) criterion. For imbalanced data, the geometric mean of the accuracies of the minority and majority classes is considered, which is statistically more suitable than plain accuracy. PBI also has the following advantages over traditional imbalance methods: (i) PBI can significantly improve the classification accuracy on the minority class while improving or maintaining that on the majority class; (ii) PBI is suitable for large data, even with a high imbalance ratio (up to 0.001). For the evaluation of (i), a new metric called the Majority loss/Minority advance ratio (MMR) is proposed, which evaluates the loss ratio of the majority class to the minority class. Experiments have been conducted for PBI and several imbalance learning methods over benchmark datasets of different sizes, imbalance ratios, and dimensionalities. By analyzing the experimental results, PBI is shown to outperform the other imbalance learning methods on almost all datasets. Copyright © 2017 Elsevier Ltd. All rights reserved.
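The G-mean criterion, and a simple post hoc adjustment of an already-trained classifier's decision threshold to maximize it, can be illustrated as below. Threshold tuning is only one plausible reading of adjusting the boundary after training, not the authors' exact PBI procedure; the dataset and network are invented.

```python
# G-mean criterion plus a simple post-hoc threshold adjustment of a trained classifier.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

def g_mean(y_true, y_pred):
    sens = np.mean(y_pred[y_true == 1] == 1)     # minority-class accuracy
    spec = np.mean(y_pred[y_true == 0] == 0)     # majority-class accuracy
    return np.sqrt(sens * spec)

X, y = make_classification(n_samples=5000, weights=[0.95, 0.05], flip_y=0.02, random_state=0)
X_tr, X_tmp, y_tr, y_tmp = train_test_split(X, y, test_size=0.4, stratify=y, random_state=0)
X_val, X_te, y_val, y_te = train_test_split(X_tmp, y_tmp, test_size=0.5, stratify=y_tmp, random_state=0)

clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0).fit(X_tr, y_tr)

# Tune the decision threshold on a validation split to maximize G-mean.
proba_val = clf.predict_proba(X_val)[:, 1]
thresholds = np.linspace(0.01, 0.99, 99)
best_t = max(thresholds, key=lambda t: g_mean(y_val, (proba_val >= t).astype(int)))

proba_te = clf.predict_proba(X_te)[:, 1]
print(f"test G-mean, default 0.5 threshold: {g_mean(y_te, (proba_te >= 0.5).astype(int)):.3f}")
print(f"test G-mean, adjusted threshold {best_t:.2f}: {g_mean(y_te, (proba_te >= best_t).astype(int)):.3f}")
```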
Thrasher, James F.; Reid, Jessica L.; Hammond, David
2016-01-01
Introduction: Studies examining cigarette package pictorial health warning label (HWL) content have primarily used designs that do not allow determination of effectiveness after repeated, naturalistic exposure. This research aimed to determine the predictive and external validity of a pre-market evaluation study of pictorial HWLs. Methods: Data were analyzed from: (1) a pre-market convenience sample of 544 adult smokers who participated in field experiments in Mexico City before pictorial HWL implementation (September 2010); and (2) a post-market population-based representative sample of 1765 adult smokers in the Mexican administration of the International Tobacco Control Policy Evaluation Survey after pictorial HWL implementation. Participants in both samples rated six HWLs that appeared on cigarette packs, and also ranked HWLs with four different themes. Mixed effects models were estimated for each sample to assess ratings of relative effectiveness for the six HWLs, and to assess which HWL themes were ranked as the most effective. Results: Pre- and post-market data showed similar relative ratings across the six HWLs, with the least and most effective HWLs consistently differentiated from other HWLs. Models predicting rankings of HWL themes in the post-market sample indicated: (1) pictorial HWLs were ranked as more effective than text-only HWLs; (2) HWLs with both graphic and “lived experience” content outperformed symbolic content; and (3) testimonial content significantly outperformed didactic content. Pre-market data showed a similar pattern of results, but with fewer statistically significant findings. Conclusions: The study suggests well-designed pre-market studies can have predictive and external validity, helping regulators select HWL content. PMID:26377516
Weisman de Mamani, Amy; Suro, Giulia
2015-01-01
Objective: Caring for a family member with schizophrenia often results in high degrees of self-conscious emotions (shame and guilt/self-blame), burden, and other serious mental health consequences. Research suggests that ethnic and cultural factors strongly influence the manner in which family members respond to mental illness. Research further indicates that certain cultural practices and values (spirituality, collectivism) may assist family members in coping with the self-conscious emotions and burden associated with caregiving. With this in mind, we have developed a family-focused, culturally informed treatment for schizophrenia (CIT-S). Method: Using a sample of 113 caregivers of patients with schizophrenia (60% Hispanic, 28.2% Caucasian, 8% African American and 3.8% “Other”), we assessed the ability of CIT-S to reduce self-conscious emotions and caregiver burden above and beyond a three-session psychoeducation (PSY-ED) control condition. We further examined whether self-conscious emotions mediated the relationship between treatment type and caregiver burden. Results: In line with expectations, CIT-S was found to outperform PSY-ED in reducing guilt/self-blame and caregiver burden. Furthermore, consistent with hypotheses, reductions in guilt/self-blame were found to mediate the changes observed between treatment type and caregiver burden. While caregivers in both treatment groups demonstrated significant post-treatment reductions in shame, CIT-S was not found to outperform PSY-ED in reducing levels of this construct. Conclusions: Results suggest that caregivers of patients with schizophrenia may respond well to a treatment that specifically taps into their cultural beliefs, values, and behaviors in helping them cope with schizophrenia in a loved one. Study implications and future directions are discussed. PMID:26654115
Evaluating performance of stormwater sampling approaches using a dynamic watershed model.
Ackerman, Drew; Stein, Eric D; Ritter, Kerry J
2011-09-01
Accurate quantification of stormwater pollutant levels is essential for estimating overall contaminant discharge to receiving waters. Numerous sampling approaches exist that attempt to balance accuracy against the costs associated with the sampling method. This study employs a novel and practical approach of evaluating the accuracy of different stormwater monitoring methodologies using stormflows and constituent concentrations produced by a fully validated continuous simulation watershed model. A major advantage of using a watershed model to simulate pollutant concentrations is that a large number of storms representing a broad range of conditions can be applied in testing the various sampling approaches. Seventy-eight distinct methodologies were evaluated by "virtual samplings" of 166 simulated storms of varying size, intensity and duration, representing 14 years of storms in Ballona Creek near Los Angeles, California. The 78 methods can be grouped into four general strategies: volume-paced compositing, time-paced compositing, pollutograph sampling, and microsampling. The performance of each sampling strategy was evaluated by comparing (1) the median relative error between the virtually sampled and the true modeled event mean concentration (EMC) of each storm (accuracy), (2) the median absolute deviation about the median (MAD) of the relative error (precision), and (3) the percentage of storms where sampling methods were within 10% of the true EMC (combined measure of accuracy and precision). Finally, costs associated with site setup, sampling, and laboratory analysis were estimated for each method. Pollutograph sampling consistently outperformed the other three methods both in terms of accuracy and precision, but was the most costly method evaluated. Time-paced sampling consistently underestimated, while volume-paced sampling overestimated, the storm EMCs. Microsampling performance approached that of pollutograph sampling at a substantial cost savings. The most efficient method for routine stormwater monitoring, in terms of a balance between performance and cost, was volume-paced microsampling with variable sample pacing to ensure that the entirety of the storm was captured. Pollutograph sampling is recommended if the data are to be used for detailed analysis of runoff dynamics.
A novel approach to identifying regulatory motifs in distantly related genomes
Van Hellemont, Ruth; Monsieurs, Pieter; Thijs, Gert; De Moor, Bart; Van de Peer, Yves; Marchal, Kathleen
2005-01-01
Although proven successful in the identification of regulatory motifs, phylogenetic footprinting methods still show some shortcomings. To assess these difficulties, most apparent when applying phylogenetic footprinting to distantly related organisms, we developed a two-step procedure that combines the advantages of sequence alignment and motif detection approaches. The results on well-studied benchmark datasets indicate that the presented method outperforms other methods when the sequences become either too long or too heterogeneous in size. PMID:16420672
Efficient mouse genome engineering by CRISPR-EZ technology.
Modzelewski, Andrew J; Chen, Sean; Willis, Brandon J; Lloyd, K C Kent; Wood, Joshua A; He, Lin
2018-06-01
CRISPR/Cas9 technology has transformed mouse genome editing with unprecedented precision, efficiency, and ease; however, the current practice of microinjecting CRISPR reagents into pronuclear-stage embryos remains rate-limiting. We thus developed CRISPR ribonucleoprotein (RNP) electroporation of zygotes (CRISPR-EZ), an electroporation-based technology that outperforms pronuclear and cytoplasmic microinjection in efficiency, simplicity, cost, and throughput. In C57BL/6J and C57BL/6N mouse strains, CRISPR-EZ achieves 100% delivery of Cas9/single-guide RNA (sgRNA) RNPs, facilitating indel mutations (insertions or deletions), exon deletions, point mutations, and small insertions. In a side-by-side comparison in the high-throughput KnockOut Mouse Project (KOMP) pipeline, CRISPR-EZ consistently outperformed microinjection. Here, we provide an optimized protocol covering sgRNA synthesis, embryo collection, RNP electroporation, mouse generation, and genotyping strategies. Using CRISPR-EZ, a graduate-level researcher with basic embryo-manipulation skills can obtain genetically modified mice in 6 weeks. Altogether, CRISPR-EZ is a simple, economic, efficient, and high-throughput technology that is potentially applicable to other mammalian species.
An algorithm for direct causal learning of influences on patient outcomes.
Rathnam, Chandramouli; Lee, Sanghoon; Jiang, Xia
2017-01-01
This study aims at developing and introducing a new algorithm, called direct causal learner (DCL), for learning the direct causal influences of a single target. We applied it to both simulated and real clinical and genome wide association study (GWAS) datasets and compared its performance to classic causal learning algorithms. The DCL algorithm learns the causes of a single target from passive data using Bayesian-scoring, instead of using independence checks, and a novel deletion algorithm. We generate 14,400 simulated datasets and measure the number of datasets for which DCL correctly and partially predicts the direct causes. We then compare its performance with the constraint-based path consistency (PC) and conservative PC (CPC) algorithms, the Bayesian-score based fast greedy search (FGS) algorithm, and the partial ancestral graphs algorithm fast causal inference (FCI). In addition, we extend our comparison of all five algorithms to both a real GWAS dataset and real breast cancer datasets over various time-points in order to observe how effective they are at predicting the causal influences of Alzheimer's disease and breast cancer survival. DCL consistently outperforms FGS, PC, CPC, and FCI in discovering the parents of the target for the datasets simulated using a simple network. Overall, DCL predicts significantly more datasets correctly (McNemar's test significance: p<0.0001) than any of the other algorithms for these network types. For example, when assessing overall performance (simple and complex network results combined), DCL correctly predicts approximately 1400 more datasets than the top FGS method, 1600 more datasets than the top CPC method, 4500 more datasets than the top PC method, and 5600 more datasets than the top FCI method. Although FGS did correctly predict more datasets than DCL for the complex networks, and DCL correctly predicted only a few more datasets than CPC for these networks, there is no significant difference in performance between these three algorithms for this network type. However, when we use a more continuous measure of accuracy, we find that all the DCL methods are able to better partially predict more direct causes than FGS and CPC for the complex networks. In addition, DCL consistently had faster runtimes than the other algorithms. In the application to the real datasets, DCL identified rs6784615, located on the NISCH gene, and rs10824310, located on the PRKG1 gene, as direct causes of late onset Alzheimer's disease (LOAD) development. In addition, DCL identified ER category as a direct predictor of breast cancer mortality within 5 years, and HER2 status as a direct predictor of 10-year breast cancer mortality. These predictors have been identified in previous studies to have a direct causal relationship with their respective phenotypes, supporting the predictive power of DCL. When the other algorithms discovered predictors from the real datasets, these predictors were either also found by DCL or could not be supported by previous studies. Our results show that DCL outperforms FGS, PC, CPC, and FCI in almost every case, demonstrating its potential to advance causal learning. Furthermore, our DCL algorithm effectively identifies direct causes in the LOAD and Metabric GWAS datasets, which indicates its potential for clinical applications. Copyright © 2016 Elsevier B.V. All rights reserved.
The processing of emotional prosody and semantics in schizophrenia: relationship to gender and IQ.
Scholten, M R M; Aleman, A; Kahn, R S
2008-06-01
Female patients with schizophrenia are less impaired in social life than male patients. Because social impairment in schizophrenia has been found to be associated with deficits in emotion recognition, we examined whether the female advantage in processing emotional prosody and semantics is preserved in schizophrenia. Forty-eight patients (25 males, 23 females) and 46 controls (23 males, 23 females) were assessed using an emotional language task (in which healthy women generally outperform healthy men), consisting of 96 sentences in four conditions: (1) neutral-content/emotional-tone (happy, sad, angry or anxious); (2) neutral-tone/emotional-content; (3) emotional-tone/incongruous emotional-content; and (4) emotional-content/incongruous emotional-tone. Participants had to ignore the emotional-content in the third condition and the emotional-tone in the fourth condition. In addition, participants were assessed with a visuospatial task (in which healthy men typically excel). Correlation coefficients were computed for associations between emotional language data, visuospatial data, IQ measures and patient variables. Overall, on the emotional language task, patients made more errors than control subjects, and women outperformed men across diagnostic groups. Controlling for IQ revealed a significant effect on task performance in all groups, especially in the incongruent tasks. On the rotation task, healthy men outperformed healthy women, but male patients, female patients and female controls obtained similar scores. The advantage in emotional prosodic and semantic processing in healthy women is preserved in schizophrenia, whereas the male advantage in visuospatial processing is lost. These findings may explain, in part, why social functioning is less compromised in women with schizophrenia than in men.
Registration of organs with sliding interfaces and changing topologies
NASA Astrophysics Data System (ADS)
Berendsen, Floris F.; Kotte, Alexis N. T. J.; Viergever, Max A.; Pluim, Josien P. W.
2014-03-01
Smoothness and continuity assumptions on the deformation field in deformable image registration do not hold for applications where the imaged objects have sliding interfaces. Recent extensions to deformable image registration that accommodate for sliding motion of organs are limited to sliding motion along approximately planar surfaces or cannot model sliding that changes the topological configuration in case of multiple organs. We propose a new extension to free-form image registration that is not limited in this way. Our method uses a transformation model that consists of uniform B-spline transformations for each organ region separately, which is based on segmentation of one image. Since this model can create overlapping regions or gaps between regions, we introduce a penalty term that minimizes this undesired effect. The penalty term acts on the surfaces of the organ regions and is optimized simultaneously with the image similarity. To evaluate our method registrations were performed on publicly available inhale-exhale CT scans for which performances of other methods are known. Target registration errors are computed on dense landmark sets that are available with these datasets. On these data our method outperforms the other methods in terms of target registration error and, where applicable, also in terms of overlap and gap volumes. The approximation of the other methods of sliding motion along planar surfaces is reasonably well suited for the motion present in the lung data. The ability of our method to handle sliding along curved boundaries and for changing region topology configurations was demonstrated on synthetic images.
A Deep Learning Approach to on-Node Sensor Data Analytics for Mobile or Wearable Devices.
Ravi, Daniele; Wong, Charence; Lo, Benny; Yang, Guang-Zhong
2017-01-01
The increasing popularity of wearable devices in recent years means that a diverse range of physiological and functional data can now be captured continuously for applications in sports, wellbeing, and healthcare. This wealth of information requires efficient methods of classification and analysis where deep learning is a promising technique for large-scale data analytics. While deep learning has been successful in implementations that utilize high-performance computing platforms, its use on low-power wearable devices is limited by resource constraints. In this paper, we propose a deep learning methodology, which combines features learned from inertial sensor data together with complementary information from a set of shallow features to enable accurate and real-time activity classification. The design of this combined method aims to overcome some of the limitations present in a typical deep learning framework where on-node computation is required. To optimize the proposed method for real-time on-node computation, spectral domain preprocessing is used before the data are passed onto the deep learning framework. The classification accuracy of our proposed deep learning approach is evaluated against state-of-the-art methods using both laboratory and real world activity datasets. Our results show the validity of the approach on different human activity datasets, outperforming other methods, including the two methods used within our combined pipeline. We also demonstrate that the computation times for the proposed method are consistent with the constraints of real-time on-node processing on smartphones and a wearable sensor platform.
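The flavor of the preprocessing idea, turning each inertial window into spectral-domain features plus a few shallow statistical features before classification, is sketched below. The data, window length, feature set, and small classifier are invented stand-ins, not the paper's deep on-node pipeline.

```python
# Flavor-only sketch: spectral-domain (FFT) features plus shallow statistical
# features per accelerometer window, fed to a small classifier.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

def window_features(window):
    """window: (n_samples, 3) accelerometer window -> 1D feature vector."""
    spectral = np.abs(np.fft.rfft(window, axis=0))[1:17].ravel()   # low-frequency magnitudes
    shallow = np.concatenate([window.mean(axis=0), window.std(axis=0),
                              np.abs(np.diff(window, axis=0)).mean(axis=0)])
    return np.concatenate([spectral, shallow])

# Synthetic "walking" (periodic) vs "sitting" (noise-only) windows at 50 Hz.
rng = np.random.default_rng(0)
def make_window(active):
    t = np.arange(128) / 50.0
    base = np.sin(2 * np.pi * 2.0 * t)[:, None] if active else 0.0
    return base + 0.3 * rng.normal(size=(128, 3))

X = np.array([window_features(make_window(i % 2 == 0)) for i in range(600)])
y = np.array([i % 2 for i in range(600)])
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0).fit(X_tr, y_tr)
print(f"held-out accuracy: {clf.score(X_te, y_te):.3f}")
```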
Scalable video transmission over Rayleigh fading channels using LDPC codes
NASA Astrophysics Data System (ADS)
Bansal, Manu; Kondi, Lisimachos P.
2005-03-01
In this paper, we investigate an important problem of efficiently utilizing the available resources for video transmission over wireless channels while maintaining a good decoded video quality and resilience to channel impairments. Our system consists of the video codec based on 3-D set partitioning in hierarchical trees (3-D SPIHT) algorithm and employs two different schemes using low-density parity check (LDPC) codes for channel error protection. The first method uses the serial concatenation of the constant-rate LDPC code and rate-compatible punctured convolutional (RCPC) codes. Cyclic redundancy check (CRC) is used to detect transmission errors. In the other scheme, we use the product code structure consisting of a constant rate LDPC/CRC code across the rows of the `blocks' of source data and an erasure-correction systematic Reed-Solomon (RS) code as the column code. In both the schemes introduced here, we use fixed-length source packets protected with unequal forward error correction coding ensuring a strictly decreasing protection across the bitstream. A Rayleigh flat-fading channel with additive white Gaussian noise (AWGN) is modeled for the transmission. The rate-distortion optimization algorithm is developed and carried out for the selection of source coding and channel coding rates using Lagrangian optimization. The experimental results demonstrate the effectiveness of this system under different wireless channel conditions and both the proposed methods (LDPC+RCPC/CRC and RS+LDPC/CRC) outperform the more conventional schemes such as those employing RCPC/CRC.
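The Lagrangian selection step can be illustrated with a toy sketch; the candidate rate/distortion numbers below are hypothetical and stand in for the 3-D SPIHT and LDPC/RCPC operating points.

```python
# Toy sketch of Lagrangian rate-distortion selection over candidate
# (source rate, channel code rate) pairs; all numbers are hypothetical.
import numpy as np

# Each candidate: (source_bits_per_pixel, channel_code_rate, expected_distortion)
candidates = [
    (0.25, 1/3, 80.0), (0.25, 1/2, 95.0),
    (0.50, 1/3, 55.0), (0.50, 1/2, 60.0),
    (1.00, 1/3, 30.0), (1.00, 1/2, 42.0),
]

def total_rate(src_rate, code_rate):
    # Channel bits transmitted per pixel = source bits / channel code rate.
    return src_rate / code_rate

def best_for_lambda(lmbda):
    # Minimise the Lagrangian cost J = D + lambda * R.
    costs = [d + lmbda * total_rate(r, c) for (r, c, d) in candidates]
    return candidates[int(np.argmin(costs))]

def pick_under_budget(rate_budget, lambdas=np.logspace(-2, 3, 200)):
    # Sweep lambda and keep the lowest-distortion choice that meets the budget.
    feasible = []
    for lmbda in lambdas:
        r, c, d = best_for_lambda(lmbda)
        if total_rate(r, c) <= rate_budget:
            feasible.append((d, r, c))
    return min(feasible) if feasible else None

print(pick_under_budget(rate_budget=2.0))
```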
Michailidis, George
2014-01-01
Reconstructing transcriptional regulatory networks is an important task in functional genomics. Data obtained from experiments that perturb genes by knockouts or RNA interference contain useful information for addressing this reconstruction problem. However, such data can be limited in size and/or are expensive to acquire. On the other hand, observational data of the organism in steady state (e.g., wild-type) are more readily available, but their informational content is inadequate for the task at hand. We develop a computational approach to appropriately utilize both data sources for estimating a regulatory network. The proposed approach is based on a three-step algorithm to estimate the underlying directed but cyclic network, that uses as input both perturbation screens and steady state gene expression data. In the first step, the algorithm determines causal orderings of the genes that are consistent with the perturbation data, by combining an exhaustive search method with a fast heuristic that in turn couples a Monte Carlo technique with a fast search algorithm. In the second step, for each obtained causal ordering, a regulatory network is estimated using a penalized likelihood based method, while in the third step a consensus network is constructed from the highest scored ones. Extensive computational experiments show that the algorithm performs well in reconstructing the underlying network and clearly outperforms competing approaches that rely only on a single data source. Further, it is established that the algorithm produces a consistent estimate of the regulatory network. PMID:24586224
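A sketch of the second step only, under strong simplifications (linear models, a single given ordering, hypothetical data); the L1-penalised regression below stands in for the penalized likelihood method.

```python
# Sketch of the second step only, under strong simplifications: given one causal
# ordering, estimate sparse regulatory edges by an L1-penalised regression of each
# gene on the genes preceding it in the ordering. Names and data are hypothetical.
import numpy as np
from sklearn.linear_model import Lasso

def network_for_ordering(expr, ordering, alpha=0.1):
    """expr: samples x genes steady-state expression; ordering: list of gene indices."""
    p = expr.shape[1]
    W = np.zeros((p, p))                      # W[i, j] = estimated effect of gene i on gene j
    for pos, target in enumerate(ordering):
        parents = ordering[:pos]
        if not parents:
            continue
        model = Lasso(alpha=alpha).fit(expr[:, parents], expr[:, target])
        W[parents, target] = model.coef_
    return W

rng = np.random.default_rng(1)
expr = rng.standard_normal((60, 8))           # hypothetical data: 60 samples, 8 genes
print(network_for_ordering(expr, ordering=list(range(8))))
```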
Human Activity Recognition by Combining a Small Number of Classifiers.
Nazabal, Alfredo; Garcia-Moreno, Pablo; Artes-Rodriguez, Antonio; Ghahramani, Zoubin
2016-09-01
We consider the problem of daily human activity recognition (HAR) using multiple wireless inertial sensors, and specifically, HAR systems with a very low number of sensors, each one providing an estimation of the performed activities. We propose new Bayesian models to combine the output of the sensors. The models are based on a soft-output combination of individual classifiers to deal with the small number of sensors. We also incorporate the dynamic nature of human activities as a first-order homogeneous Markov chain. We develop both inductive and transductive inference methods for each model to be employed in supervised and semisupervised situations, respectively. Using different real HAR databases, we compare our classifiers combination models against a single classifier that employs all the signals from the sensors. Our models consistently exhibit a reduction in the error rate and increased robustness against sensor failures. Our models also outperform other classifiers combination models that consider neither soft outputs nor the Markovian structure of human activities.
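A minimal sketch of the two ingredients named above, not the authors' Bayesian models: per-sensor soft outputs are fused by multiplying class posteriors, and a first-order Markov forward filter smooths the fused sequence; all probabilities and transition values are placeholders.

```python
# Minimal sketch: fuse per-sensor soft outputs by multiplying class posteriors,
# then smooth over time with a first-order Markov forward filter.
import numpy as np

def fuse_soft_outputs(sensor_probs):
    """sensor_probs: (n_sensors, T, n_classes) class posteriors -> (T, n_classes)."""
    logp = np.sum(np.log(sensor_probs + 1e-12), axis=0)
    p = np.exp(logp - logp.max(axis=1, keepdims=True))
    return p / p.sum(axis=1, keepdims=True)

def forward_filter(obs_probs, transition, prior):
    """First-order HMM-style forward filtering of the fused per-frame posteriors."""
    T, K = obs_probs.shape
    filtered = np.zeros((T, K))
    belief = prior * obs_probs[0]
    filtered[0] = belief / belief.sum()
    for t in range(1, T):
        belief = (filtered[t - 1] @ transition) * obs_probs[t]
        filtered[t] = belief / belief.sum()
    return filtered

rng = np.random.default_rng(0)
probs = rng.dirichlet(np.ones(4), size=(3, 50))     # 3 sensors, 50 frames, 4 activities
A = np.full((4, 4), 0.05) + 0.8 * np.eye(4)         # sticky, self-favouring transitions
A /= A.sum(axis=1, keepdims=True)
smoothed = forward_filter(fuse_soft_outputs(probs), A, prior=np.full(4, 0.25))
print(smoothed.argmax(axis=1))
```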
Gaussian mixed model in support of semiglobal matching leveraged by ground control points
NASA Astrophysics Data System (ADS)
Ma, Hao; Zheng, Shunyi; Li, Chang; Li, Yingsong; Gui, Li
2017-04-01
Semiglobal matching (SGM) has been widely applied in large aerial images because of its good tradeoff between complexity and robustness. The concept of ground control points (GCPs) is adopted to make SGM more robust. We model the effect of GCPs as two data terms for stereo matching between high-resolution aerial epipolar images in an iterative scheme. One term based on GCPs is formulated by a Gaussian mixture model, which strengthens the relation between GCPs and the pixels to be estimated and encodes some degree of consistency between them with respect to disparity values. Another term depends on pixel-wise confidence, and we further design a confidence updating equation based on three rules. With this confidence-based term, the assignment of disparity can be heuristically selected among disparity search ranges during the iteration process. Several iterations are sufficient to bring out satisfactory results according to our experiments. Experimental results validate that the proposed method outperforms surface reconstruction, a representative variant of SGM that performs excellently on aerial images.
Assessing the performance of winter footwear using a new maximum achievable incline method.
Hsu, Jennifer; Li, Yue; Dutta, Tilak; Fernie, Geoff
2015-09-01
More informative tests of winter footwear performance are required in order to identify footwear that will prevent injurious slips and falls on icy conditions. In this study, eight participants tested four styles of winter boots on smooth wet ice. The surface was progressively tilted to create increasing longitudinal and cross-slopes until participants could no longer continue standing or walking. Maximum achievable incline angles provided consistent measures of footwear slip resistance and demonstrated better resolution than mechanical tests. One footwear outsole material and tread combination outperformed the others on wet ice allowing participants to successfully walk on steep longitudinal slopes of 17.5° ± 1.9° (mean ± SD). By further exploiting the methodology to include additional surfaces and contaminants, such tests could be used to optimize tread designs and materials that are ideal for reducing the risk of slips and falls. Copyright © 2015 Elsevier Ltd and The Ergonomics Society. All rights reserved.
Real-Time Visual Tracking through Fusion Features
Ruan, Yang; Wei, Zhenzhong
2016-01-01
Due to their high speed, correlation filters for object tracking have begun to receive increasing attention. Traditional object trackers based on correlation filters typically use a single type of feature. In this paper, we attempt to integrate multiple feature types to improve the performance, and we propose a new DD-HOG fusion feature that consists of discriminative descriptors (DDs) and histograms of oriented gradients (HOG). However, fusion features as multi-vector descriptors cannot be directly used in prior correlation filters. To overcome this difficulty, we propose a multi-vector correlation filter (MVCF) that can directly convolve with a multi-vector descriptor to obtain a single-channel response that indicates the location of an object. Experiments on the CVPR2013 tracking benchmark with the evaluation of state-of-the-art trackers show the effectiveness and speed of the proposed method. Moreover, we show that our MVCF tracker, which uses the DD-HOG descriptor, outperforms the structure-preserving object tracker (SPOT) in multi-object tracking because of its high speed and its ability to address heavy occlusion. PMID:27347951
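A toy sketch of the single-channel response idea: per-channel filters are applied in the Fourier domain and summed into one response map whose peak marks the target. The filters below are random placeholders rather than a learned MVCF, and the DD-HOG feature extraction is not reproduced.

```python
# Toy sketch of a multi-channel correlation response: per-channel filtering in the
# Fourier domain, summed over channels into a single response map.
import numpy as np

def correlation_response(feature_channels, filter_channels):
    F = np.fft.fft2(feature_channels, axes=(-2, -1))
    H = np.fft.fft2(filter_channels, axes=(-2, -1))
    response = np.fft.ifft2(np.sum(np.conj(H) * F, axis=0)).real
    return response

rng = np.random.default_rng(0)
features = rng.standard_normal((31, 64, 64))   # hypothetical multi-vector descriptor channels
filters = rng.standard_normal((31, 64, 64))    # placeholder filters (not learned)
resp = correlation_response(features, filters)
print(np.unravel_index(resp.argmax(), resp.shape))   # peak location = estimated target position
```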
Internal resonance and low frequency vibration energy harvesting
NASA Astrophysics Data System (ADS)
Yang, Wei; Towfighian, Shahrzad
2017-09-01
A nonlinear vibration energy harvester with internal resonance is presented. The proposed harvester consists of two cantilevers, each with a permanent magnet on its tip. One cantilever has a piezoelectric layer at its base. When the magnetic force is applied, this two-degrees-of-freedom nonlinear vibration system exhibits the internal resonance phenomenon, which broadens the frequency bandwidth compared to a linear system. Three coupled partial differential equations are obtained to predict the dynamic behavior of the nonlinear energy harvester. The perturbation method of multiple scales is used to solve the equations. Results from experiments done at different vibration levels with varying distances between the magnets validate the mathematical model. Experiments and simulations show the design outperforms the linear system by doubling the frequency bandwidth. The frequency response of the output voltage is studied for different system parameters. The optimal load resistance is obtained for the maximum power in the internal resonance case. The results demonstrate that a design combining internal resonance and magnetic nonlinearity improves the efficiency of energy harvesting.
Detection of exudates in fundus images using a Markovian segmentation model.
Harangi, Balazs; Hajdu, Andras
2014-01-01
Diabetic retinopathy (DR) is one of the most common causes of vision loss in developed countries. In the early stages of DR, signs such as exudates appear in retinal images. An automatic screening system must be capable of detecting these signs properly so that treatment of patients may begin in time. The appearance of exudates shows a rich variety in shape and size, making automatic detection more challenging. We propose an approach for the automatic segmentation of exudates consisting of a candidate extraction step followed by exact contour detection and region-wise classification. More specifically, we extract possible exudate candidates using grayscale morphology, and their proper shape is determined by a Markovian segmentation model considering edge information. Finally, we label the candidates as true or false by an optimally adjusted SVM classifier. For testing purposes, we considered the publicly available database DiaretDB1, where the proposed method outperformed several state-of-the-art exudate detectors.
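A sketch of the candidate-extraction step only, assuming a SciPy white top-hat with a hypothetical structuring-element size and threshold; the Markovian contour refinement and SVM classification are not reproduced.

```python
# Sketch of grayscale-morphology candidate extraction: a white top-hat highlights
# small bright structures (exudate candidates); size and threshold are placeholders.
import numpy as np
from scipy import ndimage

def exudate_candidates(green_channel, selem_size=15, rel_threshold=0.2):
    tophat = ndimage.white_tophat(green_channel.astype(float), size=selem_size)
    mask = tophat > rel_threshold * tophat.max()
    labels, n = ndimage.label(mask)
    return labels, n

rng = np.random.default_rng(0)
fake_fundus_green = rng.random((256, 256))        # placeholder for a fundus green channel
labels, n = exudate_candidates(fake_fundus_green)
print("candidate regions:", n)
```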
Gerend, Mary A.; Shepherd, Janet E.
2012-01-01
Background: Although theories of health behavior have guided thousands of studies, relatively few studies have compared these theories against one another. Purpose: The purpose of the current study was to compare two classic theories of health behavior—the Health Belief Model (HBM) and the Theory of Planned Behavior (TPB)—in their prediction of human papillomavirus (HPV) vaccination. Methods: After watching a gain-framed, loss-framed, or control video, women (N=739) ages 18–26 completed a survey assessing HBM and TPB constructs. HPV vaccine uptake was assessed ten months later. Results: Although the message framing intervention had no effect on vaccine uptake, support was observed for both the TPB and HBM. Nevertheless, the TPB consistently outperformed the HBM. Key predictors of uptake included subjective norms, self-efficacy, and vaccine cost. Conclusions: Despite the observed advantage of the TPB, findings revealed considerable overlap between the two theories and highlighted the importance of proximal versus distal predictors of health behavior. PMID:22547155
NASA Technical Reports Server (NTRS)
Jones, Denise R.; Parrish, Russell V.
1990-01-01
A piloted simulation study was conducted comparing three different input methods for interfacing to a large screen, multiwindow, whole flight deck display for management of transport aircraft systems. The thumball concept utilized a miniature trackball embedded in a conventional side arm controller. The multifunction control throttle and stick (MCTAS) concept employed a thumb switch located in the throttle handle. The touch screen concept provided data entry through a capacitive touch screen installed on the display surface. The objective and subjective results obtained indicate that, with present implementations, the thumball concept was the most appropriate for interfacing with aircraft systems/subsystems presented on a large screen display. Not unexpectedly, the completion time differences between the three concepts varied with the task being performed, although the thumball implementation consistently outperformed the other two concepts. However, pilot suggestions for improved implementations of the MCTAS and touch screen concepts could reduce some of these differences.
NASA Astrophysics Data System (ADS)
Selouani, Sid-Ahmed; O'Shaughnessy, Douglas
2003-12-01
Limiting the decrease in performance due to acoustic environment changes remains a major challenge for continuous speech recognition (CSR) systems. We propose a novel approach which combines the Karhunen-Loève transform (KLT) in the mel-frequency domain with a genetic algorithm (GA) to enhance the data representing corrupted speech. The idea consists of projecting noisy speech parameters onto the space generated by the genetically optimized principal axes obtained from the KLT. The enhanced parameters increase the recognition rate for highly interfering noise environments. The proposed hybrid technique, when included in the front-end of an HTK-based CSR system, outperforms the conventional recognition process in severe interfering car noise environments for a wide range of signal-to-noise ratios (SNRs), from 16 dB down to low SNR values. We also show the effectiveness of the KLT-GA method in recognizing speech subject to telephone channel degradations.
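A sketch of the KLT (PCA) projection step alone, with the genetic-algorithm optimisation of the axes omitted and the feature data replaced by placeholders.

```python
# Sketch of the KLT (PCA) projection step: noisy mel-domain feature vectors are
# projected onto the leading principal axes and reconstructed; the GA optimisation
# of those axes is not reproduced here.
import numpy as np

def klt_enhance(features, n_components=8):
    mean = features.mean(axis=0)
    centered = features - mean
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    axes = eigvecs[:, np.argsort(eigvals)[::-1][:n_components]]   # leading principal axes
    return centered @ axes @ axes.T + mean                        # rank-reduced reconstruction

rng = np.random.default_rng(0)
noisy_mfcc = rng.standard_normal((500, 13))       # placeholder noisy cepstral frames
print(klt_enhance(noisy_mfcc).shape)
```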
Spherical hashing: binary code embedding with hyperspheres.
Heo, Jae-Pil; Lee, Youngwoon; He, Junfeng; Chang, Shih-Fu; Yoon, Sung-Eui
2015-11-01
Many binary code embedding schemes have been actively studied recently, since they can provide efficient similarity search, and compact data representations suitable for handling large scale image databases. Existing binary code embedding techniques encode high-dimensional data by using hyperplane-based hashing functions. In this paper we propose a novel hypersphere-based hashing function, spherical hashing, to map more spatially coherent data points into a binary code compared to hyperplane-based hashing functions. We also propose a new binary code distance function, spherical Hamming distance, tailored for our hypersphere-based binary coding scheme, and design an efficient iterative optimization process to achieve both balanced partitioning for each hash function and independence between hashing functions. Furthermore, we generalize spherical hashing to support various similarity measures defined by kernel functions. Our extensive experiments show that our spherical hashing technique significantly outperforms state-of-the-art techniques based on hyperplanes across various benchmarks with sizes ranging from one to 75 million GIST, BoW and VLAD descriptors. The performance gains are consistent and large, up to 100 percent improvement over the second-best tested method. These results confirm the unique merits of using hyperspheres to encode proximity regions in high-dimensional spaces. Finally, our method is intuitive and easy to implement.
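A minimal sketch of the encoding and the distance, with random (not optimised) sphere centers and radii; the ratio form of the spherical Hamming distance used here is the commonly cited one and should be read as an assumption.

```python
# Sketch of hypersphere-based encoding: bit k is set when a point falls inside the
# k-th sphere. Centers and radii are random placeholders, not the optimised ones;
# the distance is the commonly cited ratio of differing bits to commonly set bits.
import numpy as np

def spherical_encode(X, centers, radii):
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return (dists <= radii).astype(np.uint8)              # (n_points, n_bits)

def spherical_hamming(b1, b2, eps=1e-9):
    differing = np.count_nonzero(b1 != b2)
    common_ones = np.count_nonzero(b1 & b2)
    return differing / (common_ones + eps)

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 64))                       # hypothetical GIST-like descriptors
centers = X[rng.choice(len(X), size=32, replace=False)]   # 32-bit code
radii = np.full(32, np.sqrt(64.0))                        # placeholder radii
codes = spherical_encode(X, centers, radii)
print(spherical_hamming(codes[0], codes[1]))
```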
Automated lung tumor segmentation for whole body PET volume based on novel downhill region growing
NASA Astrophysics Data System (ADS)
Ballangan, Cherry; Wang, Xiuying; Eberl, Stefan; Fulham, Michael; Feng, Dagan
2010-03-01
We propose an automated lung tumor segmentation method for whole body PET images based on a novel downhill region growing (DRG) technique, which regards homogeneous tumor hotspots as 3D monotonically decreasing functions. The method has three major steps: thoracic slice extraction with K-means clustering of the slice features; hotspot segmentation with DRG; and decision tree analysis based hotspot classification. To overcome the common problem of leakage into adjacent hotspots in automated lung tumor segmentation, DRG employs the tumors' SUV monotonicity features. DRG also uses gradient magnitude of tumors' SUV to improve tumor boundary definition. We used 14 PET volumes from patients with primary NSCLC for validation. The thoracic region extraction step achieved good and consistent results for all patients despite marked differences in size and shape of the lungs and the presence of large tumors. The DRG technique was able to avoid the problem of leakage into adjacent hotspots and produced a volumetric overlap fraction of 0.61 +/- 0.13 which outperformed four other methods where the overlap fraction varied from 0.40 +/- 0.24 to 0.59 +/- 0.14. Of the 18 tumors in 14 NSCLC studies, 15 lesions were classified correctly, 2 were false negative and 15 were false positive.
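A toy sketch of the downhill idea on a synthetic volume; the seed selection, threshold and neighbourhood are placeholders, and the slice-extraction and classification stages are omitted.

```python
# Toy sketch of downhill region growing: starting from a local-maximum seed, a voxel
# joins the region only if its value does not exceed the neighbour it was reached
# from, so hotspots are treated as monotonically decreasing functions and leakage
# into adjacent hotspots is discouraged.
import numpy as np
from collections import deque

def downhill_region_grow(volume, seed, floor=0.05):
    region = np.zeros(volume.shape, dtype=bool)
    region[seed] = True
    queue = deque([seed])
    offsets = [(1,0,0), (-1,0,0), (0,1,0), (0,-1,0), (0,0,1), (0,0,-1)]
    while queue:
        z, y, x = queue.popleft()
        for dz, dy, dx in offsets:
            nb = (z + dz, y + dy, x + dx)
            if all(0 <= nb[i] < volume.shape[i] for i in range(3)) and not region[nb]:
                if floor <= volume[nb] <= volume[z, y, x]:   # only grow "downhill"
                    region[nb] = True
                    queue.append(nb)
    return region

# Synthetic volume with two nearby Gaussian "hotspots".
zz, yy, xx = np.mgrid[0:40, 0:40, 0:40]
vol = np.exp(-((zz-12)**2 + (yy-20)**2 + (xx-20)**2) / 30.0) \
    + np.exp(-((zz-28)**2 + (yy-20)**2 + (xx-20)**2) / 30.0)
seed = np.unravel_index(np.argmax(vol), vol.shape)
print(downhill_region_grow(vol, seed).sum(), "voxels grown from the seed")
```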
Peng, Hui; Lan, Chaowang; Liu, Yuansheng; Liu, Tao; Blumenstein, Michael; Li, Jinyan
2017-10-03
Disease-related protein-coding genes have been widely studied, but disease-related non-coding genes remain largely unknown. This work introduces a new vector to represent diseases, and applies the newly vectorized data in a positive-unlabeled learning algorithm to predict and rank disease-related long non-coding RNA (lncRNA) genes. This novel vector representation for diseases consists of two sub-vectors: the first is composed of 45 elements characterizing the information entropies of the disease gene distribution over 45 chromosome substructures. This idea is supported by our observation that some substructures (e.g., the chromosome 6 p-arm) are highly preferred by disease-related protein coding genes, while some (e.g., the 21 p-arm) are not favored at all. The second sub-vector is 30-dimensional, characterizing the distribution of disease-gene-enriched KEGG pathways in comparison with our manually created pathway groups. The second sub-vector complements the first to differentiate between various diseases. Our prediction method outperforms the state-of-the-art methods on benchmark datasets for prioritizing disease-related lncRNA genes. The method also works well when only the sequence information of an lncRNA gene is known, or even when a given disease has no currently recognized long non-coding genes.
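A sketch of the first sub-vector idea under one plausible reading (each element is the entropy contribution of one substructure); the substructure counts below are placeholders.

```python
# Sketch of the entropy sub-vector: for one disease, summarise how its genes are
# distributed over chromosome substructures. One plausible reading: element i is
# the entropy contribution -p_i*log2(p_i) of substructure i. Counts are placeholders.
import numpy as np

def entropy_subvector(counts_per_substructure):
    counts = np.asarray(counts_per_substructure, dtype=float)
    p = counts / counts.sum()
    return -p * np.log2(np.where(p > 0, p, 1.0))      # length-45 vector, zeros where p == 0

concentrated = [12, 9, 0, 0, 1] + [0] * 40            # genes piled on a few substructures
spread = [1] * 45                                     # genes spread uniformly
print(entropy_subvector(concentrated).sum(), entropy_subvector(spread).sum())
```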
PepMapper: a collaborative web tool for mapping epitopes from affinity-selected peptides.
Chen, Wenhan; Guo, William W; Huang, Yanxin; Ma, Zhiqiang
2012-01-01
Epitope mapping from affinity-selected peptides has become popular in epitope prediction, and correspondingly many Web-based tools have been developed in recent years. However, the performance of these tools varies in different circumstances. To address this problem, we employed an ensemble approach that incorporates two popular Web tools, MimoPro and Pep-3D-Search, to take advantage of the strengths of both methods and give users more options for their specific epitope-peptide mapping purposes. The combined operation of Union finds as many associated peptides as possible from both methods, which increases sensitivity in finding potential epitopic regions on a given antigen surface. The combined operation of Intersection achieves, to some extent, mutual verification between the two methods and hence increases the likelihood of locating the genuine epitopic region on a given antigen in relation to the interacting peptides. The Consistency between Intersection and Union is an indirect sufficient condition to assess the likelihood of successful peptide-epitope mapping. On average over 27 tests, the combined operations of PepMapper outperformed either MimoPro or Pep-3D-Search alone. Therefore, PepMapper is another multipurpose mapping tool for epitope prediction from affinity-selected peptides. The Web server can be freely accessed at: http://informatics.nenu.edu.cn/PepMapper/
Peng, Huan-Kai; Marculescu, Radu
2015-01-01
Objective: Social media exhibit rich yet distinct temporal dynamics which cover a wide range of different scales. In order to study this complex dynamics, two fundamental questions revolve around (1) the signatures of social dynamics at different time scales, and (2) the way in which these signatures interact and form higher-level meanings. Method: In this paper, we propose the Recursive Convolutional Bayesian Model (RCBM) to address both of these fundamental questions. The key idea behind our approach consists of constructing a deep-learning framework using specialized convolution operators that are designed to exploit the inherent heterogeneity of social dynamics. RCBM’s runtime and convergence properties are guaranteed by formal analyses. Results: Experimental results show that the proposed method outperforms the state-of-the-art approaches both in terms of solution quality and computational efficiency. Indeed, by applying the proposed method on two social network datasets, Twitter and Yelp, we are able to identify the compositional structures that can accurately characterize the complex social dynamics from these two social media. We further show that identifying these patterns can enable new applications such as anomaly detection and improved social dynamics forecasting. Finally, our analysis offers new insights on understanding and engineering social media dynamics, with direct applications to opinion spreading and online content promotion. PMID:25830775
Satija, Udit; Ramkumar, Barathram; Sabarimalai Manikandan, M
2017-02-01
Automatic electrocardiogram (ECG) signal enhancement has become a crucial pre-processing step in most ECG signal analysis applications. In this Letter, the authors propose an automated noise-aware dictionary learning-based generalised ECG signal enhancement framework which can automatically learn the dictionaries based on the ECG noise type for effective representation of ECG signal and noises, and can reduce the computational load of sparse representation-based ECG enhancement system. The proposed framework consists of noise detection and identification, noise-aware dictionary learning, sparse signal decomposition and reconstruction. The noise detection and identification is performed based on the moving average filter, first-order difference, and temporal features such as number of turning points, maximum absolute amplitude, zero crossings, and autocorrelation features. The representation dictionary is learned based on the type of noise identified in the previous stage. The proposed framework is evaluated using noise-free and noisy ECG signals. Results demonstrate that the proposed method can significantly reduce computational load as compared with conventional dictionary learning-based ECG denoising approaches. Further, comparative results show that the method outperforms existing methods in automatically removing noises such as baseline wander, power-line interference, muscle artefacts and their combinations without distorting the morphological content of local waves of ECG signal.
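A sketch of the temporal features named for the noise detection stage; the moving-average window and the placeholder signal are assumptions, and the rules mapping features to a noise type are not reproduced.

```python
# Sketch of the noise-detection features named in the abstract (turning points,
# maximum absolute amplitude, zero crossings, autocorrelation), computed on a
# moving-average-filtered, first-differenced excerpt.
import numpy as np

def noise_features(x, ma_window=5):
    x = np.asarray(x, dtype=float)
    smoothed = np.convolve(x, np.ones(ma_window) / ma_window, mode="same")
    diff = np.diff(smoothed)
    turning_points = int(np.sum(np.diff(np.sign(diff)) != 0))
    max_abs_amplitude = float(np.max(np.abs(smoothed)))
    zero_crossings = int(np.sum(np.diff(np.sign(smoothed - smoothed.mean())) != 0))
    ac = np.correlate(diff - diff.mean(), diff - diff.mean(), mode="full")
    lag1_autocorr = float(ac[len(ac) // 2 + 1] / ac[len(ac) // 2])
    return dict(turning_points=turning_points,
                max_abs_amplitude=max_abs_amplitude,
                zero_crossings=zero_crossings,
                lag1_autocorr=lag1_autocorr)

t = np.linspace(0, 2, 720)
clean_like = 0.1 * np.sin(2 * np.pi * 1.2 * t)                 # placeholder "ECG" excerpt
noisy_like = clean_like + 0.05 * np.random.default_rng(0).standard_normal(t.size)
print(noise_features(clean_like), noise_features(noisy_like))
```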
Application of Wavelet Transform for PDZ Domain Classification
Daqrouq, Khaled; Alhmouz, Rami; Balamesh, Ahmed; Memic, Adnan
2015-01-01
PDZ domains have been identified as part of an array of signaling proteins that are often unrelated, except for the well-conserved structural PDZ domain they contain. These domains have been linked to many disease processes including common avian influenza, as well as very rare conditions such as Fraser and Usher syndromes. Historically, based on the interactions and the nature of bonds they form, PDZ domains have most often been classified into one of three classes (class I, class II and others - class III), a classification that depends directly on their binding partner. In this study, we report on three unique feature extraction approaches based on the bigram and trigram occurrence and existence rearrangements within the domain's primary amino acid sequences to assist PDZ domain classification. Feature extraction methods based on the wavelet packet transform (WPT) and Shannon entropy, denoted wavelet entropy (WE), were proposed. Using 115 unique human and mouse PDZ domains, the existence rearrangement approach yielded a high recognition rate (78.34%), which outperformed our occurrence rearrangement-based method; with a validation technique, the recognition rate was 81.41%. The method reported for PDZ domain classification from primary sequences proved to be an encouraging approach for obtaining consistent classification results. We anticipate that by increasing the database size, we can further improve feature extraction and correct classification. PMID:25860375
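A sketch of the wavelet-entropy feature using the PyWavelets package; the numeric encoding of the amino-acid sequence below is a hypothetical stand-in for the bigram/trigram rearrangement step.

```python
# Sketch of a wavelet-entropy feature: map a primary sequence to a numeric signal,
# decompose with a wavelet packet transform, and summarise with the Shannon entropy
# of subband energies. The amino-acid encoding is a hypothetical placeholder.
import numpy as np
import pywt

AA = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {a: i + 1 for i, a in enumerate(AA)}          # hypothetical numeric encoding

def wavelet_entropy(sequence, wavelet="db4", level=3):
    signal = np.array([AA_INDEX.get(a, 0) for a in sequence], dtype=float)
    wp = pywt.WaveletPacket(data=signal, wavelet=wavelet, mode="symmetric", maxlevel=level)
    energies = np.array([np.sum(np.square(node.data))
                         for node in wp.get_level(level, order="freq")])
    p = energies / energies.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

print(wavelet_entropy("MKVLAAGICQDEFGHIKLMNPQRSTVWYACD" * 2))
```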
FGWAS: Functional genome wide association analysis.
Huang, Chao; Thompson, Paul; Wang, Yalin; Yu, Yang; Zhang, Jingwen; Kong, Dehan; Colen, Rivka R; Knickmeyer, Rebecca C; Zhu, Hongtu
2017-10-01
Functional phenotypes (e.g., subcortical surface representation), which commonly arise in imaging genetic studies, have been used to detect putative genes for complexly inherited neuropsychiatric and neurodegenerative disorders. However, existing statistical methods largely ignore the functional features (e.g., functional smoothness and correlation). The aim of this paper is to develop a functional genome-wide association analysis (FGWAS) framework to efficiently carry out whole-genome analyses of functional phenotypes. FGWAS consists of three components: a multivariate varying coefficient model, a global sure independence screening procedure, and a test procedure. Compared with the standard multivariate regression model, the multivariate varying coefficient model explicitly models the functional features of functional phenotypes through the integration of smooth coefficient functions and functional principal component analysis. Statistically, compared with existing methods for genome-wide association studies (GWAS), FGWAS can substantially boost the detection power for discovering important genetic variants influencing brain structure and function. Simulation studies show that FGWAS outperforms existing GWAS methods for searching sparse signals in an extremely large search space, while controlling for the family-wise error rate. We have successfully applied FGWAS to large-scale analysis of data from the Alzheimer's Disease Neuroimaging Initiative for 708 subjects, 30,000 vertices on the left and right hippocampal surfaces, and 501,584 SNPs. Copyright © 2017 Elsevier Inc. All rights reserved.
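A sketch of the kind of multivariate varying coefficient model described, with notation assumed here rather than taken from the paper.

```latex
% Sketch of a multivariate varying coefficient model for a functional phenotype;
% notation is assumed, not quoted from the paper.
\[
  y_{ij}(s) \;=\; \mathbf{x}_i^{\top}\boldsymbol{\beta}_j(s) \;+\; \eta_{ij}(s) \;+\; \epsilon_{ij}(s),
  \qquad s \in \mathcal{S},
\]
% y_{ij}(s): j-th functional phenotype of subject i at surface location s;
% beta_j(s): smooth coefficient functions of genetic and clinical covariates x_i;
% eta_{ij}(s): subject-specific spatial process, captured via functional PCA;
% epsilon_{ij}(s): measurement error.
```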
Zhao, Yitian; Zheng, Yalin; Liu, Yonghuai; Yang, Jian; Zhao, Yifan; Chen, Duanduan; Wang, Yongtian
2017-01-01
Leakage in retinal angiography currently is a key feature for confirming the activities of lesions in the management of a wide range of retinal diseases, such as diabetic maculopathy and paediatric malarial retinopathy. This paper proposes a new saliency-based method for the detection of leakage in fluorescein angiography. A superpixel approach is firstly employed to divide the image into meaningful patches (or superpixels) at different levels. Two saliency cues, intensity and compactness, are then proposed for the estimation of the saliency map of each individual superpixel at each level. The saliency maps at different levels over the same cues are fused using an averaging operator. The two saliency maps over different cues are fused using a pixel-wise multiplication operator. Leaking regions are finally detected by thresholding the saliency map followed by a graph-cut segmentation. The proposed method has been validated using the only two publicly available datasets: one for malarial retinopathy and the other for diabetic retinopathy. The experimental results show that it outperforms one of the latest competitors and performs as well as a human expert for leakage detection and outperforms several state-of-the-art methods for saliency detection.
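A minimal sketch of the two fusion rules stated above (averaging across levels, pixel-wise multiplication across cues), with random maps standing in for the superpixel saliency estimates.

```python
# Sketch of the fusion rules: per-cue maps from different superpixel levels are
# averaged, and the two cue maps are fused by pixel-wise multiplication before
# thresholding. Maps below are random placeholders.
import numpy as np

def fuse_levels(maps_per_level):
    return np.mean(np.stack(maps_per_level), axis=0)

def fuse_cues(intensity_map, compactness_map):
    fused = intensity_map * compactness_map
    return fused / (fused.max() + 1e-12)

rng = np.random.default_rng(0)
levels = 3
intensity = fuse_levels([rng.random((128, 128)) for _ in range(levels)])
compactness = fuse_levels([rng.random((128, 128)) for _ in range(levels)])
saliency = fuse_cues(intensity, compactness)
leakage_mask = saliency > 0.5          # threshold before the graph-cut refinement
print(leakage_mask.mean())
```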
Schwartzkopf, Wade C; Bovik, Alan C; Evans, Brian L
2005-12-01
Traditional chromosome imaging has been limited to grayscale images, but recently a 5-fluorophore combinatorial labeling technique (M-FISH) was developed wherein each class of chromosomes binds with a different combination of fluorophores. This results in a multispectral image, where each class of chromosomes has distinct spectral components. In this paper, we develop new methods for automatic chromosome identification by exploiting the multispectral information in M-FISH chromosome images and by jointly performing chromosome segmentation and classification. We (1) develop a maximum-likelihood hypothesis test that uses multispectral information, together with conventional criteria, to select the best segmentation possibility; (2) use this likelihood function to combine chromosome segmentation and classification into a robust chromosome identification system; and (3) show that the proposed likelihood function can also be used as a reliable indicator of errors in segmentation, errors in classification, and chromosome anomalies, which can be indicators of radiation damage, cancer, and a wide variety of inherited diseases. We show that the proposed multispectral joint segmentation-classification method outperforms past grayscale segmentation methods when decomposing touching chromosomes. We also show that it outperforms past M-FISH classification techniques that do not use segmentation information.
Detect2Rank: Combining Object Detectors Using Learning to Rank.
Karaoglu, Sezer; Yang Liu; Gevers, Theo
2016-01-01
Object detection is an important research area in the field of computer vision. Many detection algorithms have been proposed. However, each object detector relies on specific assumptions of the object appearance and imaging conditions. As a consequence, no algorithm can be considered universal. With the large variety of object detectors, the subsequent question is how to select and combine them. In this paper, we propose a framework to learn how to combine object detectors. The proposed method uses (single) detectors like Deformable Part Models, Color Names and Ensemble of Exemplar-SVMs, and exploits their correlation by high-level contextual features to yield a combined detection list. Experiments on the PASCAL VOC07 and VOC10 data sets show that the proposed method significantly outperforms single object detectors, DPM (8.4%), CN (6.8%) and EES (17.0%) on VOC07 and DPM (6.5%), CN (5.5%) and EES (16.2%) on VOC10. We show with an experiment that there are no constraints on the type of the detector. The proposed method outperforms (2.4%) the state-of-the-art object detector (RCNN) on VOC07 when Regions with Convolutional Neural Network is combined with other detectors used in this paper.
Robust regression on noisy data for fusion scaling laws
DOE Office of Scientific and Technical Information (OSTI.GOV)
Verdoolaege, Geert, E-mail: geert.verdoolaege@ugent.be; Laboratoire de Physique des Plasmas de l'ERM - Laboratorium voor Plasmafysica van de KMS
2014-11-15
We introduce the method of geodesic least squares (GLS) regression for estimating fusion scaling laws. Based on straightforward principles, the method is easily implemented, yet it clearly outperforms established regression techniques, particularly in cases of significant uncertainty on both the response and predictor variables. We apply GLS for estimating the scaling of the L-H power threshold, resulting in estimates for ITER that are somewhat higher than predicted earlier.
Holtzclaw, Dan J
2017-02-01
Previously published research for a single metropolitan market (Austin, Texas) found that periodontists fare poorly on the Yelp website for nearly all measured metrics, including average star ratings, number of reviews, review removal rate, and evaluations by "elite" Yelp users. The purpose of the current study is to confirm or refute these findings by expanding datasets to additional metropolitan markets of various sizes and geographic locations. A total of 6,559 Yelp reviews were examined for general dentists, endodontists, pediatric dentists, oral surgeons, orthodontists, and periodontists in small (Austin, Texas), medium (Seattle, Washington), and large (New York City, New York) metropolitan markets. Numerous review characteristics were evaluated, including: 1) total number of reviews; 2) average star rating; 3) review filtering rate; and 4) number of reviews by Yelp members with elite status. Results were compared in multiple ways to determine whether statistically significant differences existed. In all metropolitan markets, periodontists were outperformed by all other dental specialties for all measured Yelp metrics in this study. Intermetropolitan comparisons of periodontal practices showed no statistically significant differences. Periodontists were outperformed consistently by all other dental specialties in every measured metric on the Yelp website. These results were consistent and repeated in all three metropolitan markets evaluated in this study. Poor performance of periodontists on Yelp may be related to the age profile of patients in the typical periodontal practice. This may result in inadvertently biased filtering of periodontal reviews and subsequently poor performance in multiple other categories.
Integrative Analysis of High-throughput Cancer Studies with Contrasted Penalization
Shi, Xingjie; Liu, Jin; Huang, Jian; Zhou, Yong; Shia, BenChang; Ma, Shuangge
2015-01-01
In cancer studies with high-throughput genetic and genomic measurements, integrative analysis provides a way to effectively pool and analyze heterogeneous raw data from multiple independent studies and outperforms “classic” meta-analysis and single-dataset analysis. When marker selection is of interest, the genetic basis of multiple datasets can be described using the homogeneity model or the heterogeneity model. In this study, we consider marker selection under the heterogeneity model, which includes the homogeneity model as a special case and can be more flexible. Penalization methods have been developed in the literature for marker selection. This study advances from the published ones by introducing the contrast penalties, which can accommodate the within- and across-dataset structures of covariates/regression coefficients and, by doing so, further improve marker selection performance. Specifically, we develop a penalization method that accommodates the across-dataset structures by smoothing over regression coefficients. An effective iterative algorithm, which calls an inner coordinate descent iteration, is developed. Simulation shows that the proposed method outperforms the benchmark with more accurate marker identification. The analysis of breast cancer and lung cancer prognosis studies with gene expression measurements shows that the proposed method identifies genes different from those using the benchmark and has better prediction performance. PMID:24395534
2014-01-01
Background: The DerSimonian and Laird approach (DL) is widely used for random effects meta-analysis, but this often results in inappropriate type I error rates. The method described by Hartung, Knapp, Sidik and Jonkman (HKSJ) is known to perform better when trials of similar size are combined. However, evidence in realistic situations, where one trial might be much larger than the other trials, is lacking. We aimed to evaluate the relative performance of the DL and HKSJ methods when studies of different sizes are combined, and to develop a simple method to convert DL results to HKSJ results. Methods: We evaluated the performance of the HKSJ versus DL approach in simulated meta-analyses of 2–20 trials with varying sample sizes and between-study heterogeneity, and allowing trials to have various sizes, e.g. 25% of the trials being 10 times larger than the smaller trials. We also compared the number of “positive” (statistically significant at p < 0.05) findings using empirical data of recent meta-analyses with ≥ 3 studies of interventions from the Cochrane Database of Systematic Reviews. Results: The simulations showed that the HKSJ method consistently resulted in more adequate error rates than the DL method. When the significance level was 5%, the HKSJ error rates at most doubled, whereas for DL they could be over 30%. DL, and, far less so, HKSJ had more inflated error rates when the combined studies had unequal sizes and between-study heterogeneity. The empirical data from 689 meta-analyses showed that 25.1% of the significant findings for the DL method were non-significant with the HKSJ method. DL results can be easily converted into HKSJ results. Conclusions: Our simulations showed that the HKSJ method consistently results in more adequate error rates than the DL method, especially when the number of studies is small, and can easily be applied routinely in meta-analyses. Even with the HKSJ method, extra caution is needed when there are ≤ 5 studies of very unequal sizes. PMID:24548571
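A minimal sketch of both estimators on hypothetical study data, following the standard DL and HKSJ formulas (normal-based interval for DL, t-based interval on k-1 degrees of freedom for HKSJ).

```python
# Sketch of DerSimonian-Laird pooling and the HKSJ variance adjustment on
# hypothetical effect sizes and variances.
import numpy as np
from scipy import stats

def dl_hksj(effects, variances, alpha=0.05):
    y, v = np.asarray(effects, float), np.asarray(variances, float)
    k = len(y)
    w = 1.0 / v
    mu_fixed = np.sum(w * y) / np.sum(w)
    Q = np.sum(w * (y - mu_fixed) ** 2)
    tau2 = max(0.0, (Q - (k - 1)) / (np.sum(w) - np.sum(w ** 2) / np.sum(w)))
    w_star = 1.0 / (v + tau2)
    mu = np.sum(w_star * y) / np.sum(w_star)
    se_dl = np.sqrt(1.0 / np.sum(w_star))
    ci_dl = mu + np.array([-1, 1]) * stats.norm.ppf(1 - alpha / 2) * se_dl
    # HKSJ: weighted squared deviations around the random-effects mean, t-based CI.
    se_hksj = np.sqrt(np.sum(w_star * (y - mu) ** 2) / ((k - 1) * np.sum(w_star)))
    ci_hksj = mu + np.array([-1, 1]) * stats.t.ppf(1 - alpha / 2, df=k - 1) * se_hksj
    return mu, ci_dl, ci_hksj

effects = [0.30, 0.10, 0.45, 0.20, 0.05]          # hypothetical log odds ratios
variances = [0.05, 0.04, 0.20, 0.06, 0.01]        # one precise (large) and several smaller trials
print(dl_hksj(effects, variances))
```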
NASA Astrophysics Data System (ADS)
Hanasoge, Shravan; Agarwal, Umang; Tandon, Kunj; Koelman, J. M. Vianney A.
2017-09-01
Determining the pressure differential required to achieve a desired flow rate in a porous medium requires solving Darcy's law, a Laplace-like equation, with a spatially varying tensor permeability. In various scenarios, the permeability coefficient is sampled at high spatial resolution, which makes solving Darcy's equation numerically prohibitively expensive. As a consequence, much effort has gone into creating upscaled or low-resolution effective models of the coefficient while ensuring that the estimated flow rate is well reproduced, bringing to the fore the classic tradeoff between computational cost and numerical accuracy. Here we perform a statistical study to characterize the relative success of upscaling methods on a large sample of permeability coefficients that are above the percolation threshold. We introduce a technique based on mode-elimination renormalization group theory (MG) to build coarse-scale permeability coefficients. Comparing the results with coefficients upscaled using other methods, we find that MG is consistently more accurate, particularly due to its ability to address the tensorial nature of the coefficients. MG places a low computational demand, in the manner in which we have implemented it, and accurate flow-rate estimates are obtained when using MG-upscaled permeabilities that approach or are beyond the percolation threshold.
Global Contrast Based Salient Region Detection.
Cheng, Ming-Ming; Mitra, Niloy J; Huang, Xiaolei; Torr, Philip H S; Hu, Shi-Min
2015-03-01
Automatic estimation of salient object regions across images, without any prior assumption or knowledge of the contents of the corresponding scenes, enhances many computer vision and computer graphics applications. We introduce a regional contrast based salient object detection algorithm, which simultaneously evaluates global contrast differences and spatial weighted coherence scores. The proposed algorithm is simple, efficient, naturally multi-scale, and produces full-resolution, high-quality saliency maps. These saliency maps are further used to initialize a novel iterative version of GrabCut, namely SaliencyCut, for high quality unsupervised salient object segmentation. We extensively evaluated our algorithm using traditional salient object detection datasets, as well as a more challenging Internet image dataset. Our experimental results demonstrate that our algorithm consistently outperforms 15 existing salient object detection and segmentation methods, yielding higher precision and better recall rates. We also show that our algorithm can be used to efficiently extract salient object masks from Internet images, enabling effective sketch-based image retrieval (SBIR) via simple shape comparisons. Despite such noisy internet images, where the saliency regions are ambiguous, our saliency guided image retrieval achieves a superior retrieval rate compared with state-of-the-art SBIR methods, and additionally provides important target object region information.
NASA Astrophysics Data System (ADS)
Santra, Tapesh; Delatola, Eleni Ioanna
2016-07-01
The presence of considerable noise and missing data points makes analysis of mass-spectrometry (MS) based proteomic data a challenging task. The missing values in MS data are caused by the inability of MS machines to reliably detect proteins whose abundances fall below the detection limit. We developed a Bayesian algorithm that exploits this knowledge and uses missing data points as a complementary source of information to the observed protein intensities in order to find differentially expressed proteins by analysing MS based proteomic data. We compared its accuracy with many other methods using several simulated datasets. It consistently outperformed other methods. We then used it to analyse proteomic screens of a breast cancer (BC) patient cohort. It revealed large differences between the proteomic landscapes of triple negative and Luminal A, which are the most and least aggressive types of BC. Unexpectedly, the majority of these differences could be attributed to the direct transcriptional activity of only seven transcription factors, some of which are known to be inactive in triple negative BC. We also identified two new proteins which significantly correlated with the survival of BC patients, and therefore may have potential diagnostic/prognostic value.
A predictive machine learning approach for microstructure optimization and materials design
NASA Astrophysics Data System (ADS)
Liu, Ruoqian; Kumar, Abhishek; Chen, Zhengzhang; Agrawal, Ankit; Sundararaghavan, Veera; Choudhary, Alok
2015-06-01
This paper addresses an important materials engineering question: How can one identify the complete space (or as much of it as possible) of microstructures that are theoretically predicted to yield the desired combination of properties demanded by a selected application? We present a problem involving design of magnetoelastic Fe-Ga alloy microstructure for enhanced elastic, plastic and magnetostrictive properties. While theoretical models for computing properties given the microstructure are known for this alloy, inversion of these relationships to obtain microstructures that lead to desired properties is challenging, primarily due to the high dimensionality of microstructure space, multi-objective design requirements and non-uniqueness of solutions. These challenges render traditional search-based optimization methods ineffective in terms of both search efficiency and result optimality. In this paper, a route to address these challenges using a machine learning methodology is proposed. A systematic framework consisting of random data generation, feature selection and classification algorithms is developed. Experiments with five design problems that involve identification of microstructures that satisfy both linear and nonlinear property constraints show that our framework outperforms traditional optimization methods, with the average running time reduced by as much as 80% and with optimality that would not be achieved otherwise.
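A sketch of the three framework stages (random data generation, feature selection, classification) with a toy property rule standing in for the actual microstructure-property models.

```python
# Sketch of the framework's three stages: random candidate generation, feature
# selection and classification; the "meets all property constraints" label below is
# a hypothetical stand-in for the real microstructure-property computations.
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.random((2000, 20))                         # random candidate microstructure descriptors
y = ((X[:, 0] + 0.5 * X[:, 3] > 0.9) & (X[:, 7] < 0.6)).astype(int)   # toy constraint rule

model = make_pipeline(SelectKBest(f_classif, k=8),
                      RandomForestClassifier(n_estimators=200, random_state=0))
model.fit(X, y)

# Screen fresh random candidates and keep those predicted to satisfy the constraints.
candidates = rng.random((500, 20))
selected = candidates[model.predict(candidates) == 1]
print(len(selected), "candidate microstructures retained")
```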
Mane, Vijay Mahadeo; Jadhav, D V
2017-05-24
Diabetic retinopathy (DR) is the most common diabetic eye disease. Doctors use various test methods to detect DR, but the limited availability of test methods and the need for domain experts pose a challenge for automatic DR detection. To address this, a variety of algorithms has been developed in the literature. In this paper, we propose a system consisting of a novel sparking process and a holoentropy-based decision tree for automatic classification of DR images to further improve the effectiveness. The sparking process algorithm is developed for automatic segmentation of blood vessels through the estimation of an optimal threshold. The holoentropy-enabled decision tree is newly developed for automatic classification of retinal images into normal or abnormal using hybrid features that preserve disease-level patterns beyond the signal level of the features. The effectiveness of the proposed system is analyzed using the standard fundus image databases DIARETDB0 and DIARETDB1 for sensitivity, specificity and accuracy. The proposed system yields sensitivity, specificity and accuracy values of 96.72%, 97.01% and 96.45%, respectively. The experimental results reveal that the proposed technique outperforms existing algorithms.
Body Fat Percentage Prediction Using Intelligent Hybrid Approaches
Shao, Yuehjen E.
2014-01-01
Excess body fat often leads to obesity. Obesity is typically associated with serious medical diseases, such as cancer, heart disease, and diabetes. Accordingly, knowing the body fat percentage is extremely important since it affects everyone's health. Although there are several ways to measure the body fat percentage (BFP), the accurate methods are often associated with hassle and/or high costs. Traditional single-stage approaches may use certain body measurements or explanatory variables to predict the BFP. Diverging from existing approaches, this study proposes new intelligent hybrid approaches to obtain fewer explanatory variables, and the proposed forecasting models are able to effectively predict the BFP. The proposed hybrid models consist of multiple regression (MR), artificial neural network (ANN), multivariate adaptive regression splines (MARS), and support vector regression (SVR) techniques. The first stage of the modeling includes the use of MR and MARS to obtain fewer but more important sets of explanatory variables. In the second stage, the remaining important variables serve as inputs for the other forecasting methods. A real dataset was used to demonstrate the development of the proposed hybrid models. The prediction results revealed that the proposed hybrid schemes outperformed the typical, single-stage forecasting models. PMID:24723804
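A sketch of the two-stage idea with stand-in components: a sparse linear model for variable reduction followed by SVR, on placeholder data (the original study uses MR/MARS for stage one and several stage-two models).

```python
# Sketch of a two-stage hybrid: stage 1 keeps the more important body measurements
# via a sparse linear model, stage 2 fits a nonlinear model (SVR) on what remains.
# The data and the specific components are placeholders.
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.feature_selection import SelectFromModel
from sklearn.svm import SVR
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.standard_normal((300, 13))                               # 13 hypothetical body measurements
bfp = 20 + 3 * X[:, 0] - 2 * X[:, 4] + rng.normal(0, 1, 300)     # toy body fat percentage

two_stage = make_pipeline(
    StandardScaler(),
    SelectFromModel(LassoCV(cv=5)),        # stage 1: variable reduction
    SVR(kernel="rbf", C=10.0),             # stage 2: forecasting model
)
two_stage.fit(X, bfp)
print(two_stage.predict(X[:5]))
```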
NASA Astrophysics Data System (ADS)
Hild, Kenneth E.; Alleva, Giovanna; Nagarajan, Srikantan; Comani, Silvia
2007-01-01
In this study we compare the performance of six independent components analysis (ICA) algorithms on 16 real fetal magnetocardiographic (fMCG) datasets for the application of extracting the fetal cardiac signal. We also compare the extraction results for real data with the results previously obtained for synthetic data. The six ICA algorithms are FastICA, CubICA, JADE, Infomax, MRMI-SIG and TDSEP. The results obtained using real fMCG data indicate that the FastICA method consistently outperforms the others in regard to separation quality and that the performance of an ICA method that uses temporal information suffers in the presence of noise. These two results confirm the previous results obtained using synthetic fMCG data. There were also two notable differences between the studies based on real and synthetic data. The differences are that all six ICA algorithms are independent of gestational age and sensor dimensionality for synthetic data, but depend on gestational age and sensor dimensionality for real data. It is possible to explain these differences by assuming that the number of point sources needed to completely explain the data is larger than the dimensionality used in the ICA extraction.
Clevert, Djork-Arné; Mitterecker, Andreas; Mayr, Andreas; Klambauer, Günter; Tuefferd, Marianne; De Bondt, An; Talloen, Willem; Göhlmann, Hinrich; Hochreiter, Sepp
2011-07-01
Cost-effective oligonucleotide genotyping arrays like the Affymetrix SNP 6.0 are still the predominant technique to measure DNA copy number variations (CNVs). However, CNV detection methods for microarrays overestimate both the number and the size of CNV regions and, consequently, suffer from a high false discovery rate (FDR). A high FDR means that many CNVs are wrongly detected and therefore not associated with a disease in a clinical study, though correction for multiple testing takes them into account and thereby decreases the study's discovery power. For controlling the FDR, we propose a probabilistic latent variable model, 'cn.FARMS', which is optimized by a Bayesian maximum a posteriori approach. cn.FARMS controls the FDR through the information gain of the posterior over the prior. The prior represents the null hypothesis of copy number 2 for all samples from which the posterior can only deviate by strong and consistent signals in the data. On HapMap data, cn.FARMS clearly outperformed the two most prevalent methods with respect to sensitivity and FDR. The software cn.FARMS is publicly available as a R package at http://www.bioinf.jku.at/software/cnfarms/cnfarms.html.
Improved image alignment method in application to X-ray images and biological images.
Wang, Ching-Wei; Chen, Hsiang-Chou
2013-08-01
Alignment of medical images is a vital component of a large number of applications throughout the clinical track of events; not only within clinical diagnostic settings, but prominently so in the area of planning, consummation and evaluation of surgical and radiotherapeutical procedures. However, image registration of medical images is challenging because of variations in data appearance, imaging artifacts and complex data deformation problems. Hence, the aim of this study is to develop a robust image alignment method for medical images. An improved image registration method is proposed, and the method is evaluated with two types of medical data, including biological microscopic tissue images and dental X-ray images, and compared with five state-of-the-art image registration techniques. The experimental results show that the presented method consistently performs well on both types of medical images, achieving 88.44 and 88.93% averaged registration accuracies for biological tissue images and X-ray images, respectively, and outperforms the benchmark methods. Based on Tukey's honestly significant difference test and Fisher's least significant difference test, the presented method performs significantly better than all existing methods (P ≤ 0.001) for tissue image alignment, and for the X-ray image registration, the proposed method performs significantly better than the two benchmark b-spline approaches (P < 0.001). The software implementation of the presented method and the data used in this study are made publicly available for scientific communities to use (http://www-o.ntust.edu.tw/∼cweiwang/ImprovedImageRegistration/). cweiwang@mail.ntust.edu.tw.
Connecting clinical and actuarial prediction with rule-based methods.
Fokkema, Marjolein; Smits, Niels; Kelderman, Henk; Penninx, Brenda W J H
2015-06-01
Meta-analyses comparing the accuracy of clinical versus actuarial prediction have shown actuarial methods to outperform clinical methods, on average. However, actuarial methods are still not widely used in clinical practice, and there has been a call for the development of actuarial prediction methods for clinical practice. We argue that rule-based methods may be more useful than the linear main effect models usually employed in prediction studies, from a data and decision analytic perspective as well as a practical one. In addition, decision rules derived with rule-based methods can be represented as fast and frugal trees, which, unlike main effects models, can be used in a sequential fashion, reducing the number of cues that have to be evaluated before making a prediction. We illustrate the usability of rule-based methods by applying RuleFit, an algorithm for deriving decision rules for classification and regression problems, to a dataset on prediction of the course of depressive and anxiety disorders from Penninx et al. (2011). The RuleFit algorithm provided a model consisting of 2 simple decision rules, requiring evaluation of only 2 to 4 cues. Predictive accuracy of the 2-rule model was very similar to that of a logistic regression model incorporating 20 predictor variables, originally applied to the dataset. In addition, the 2-rule model required, on average, evaluation of only 3 cues. Therefore, the RuleFit algorithm appears to be a promising method for creating decision tools that are less time consuming and easier to apply in psychological practice, and with accuracy comparable to traditional actuarial methods. (c) 2015 APA, all rights reserved.
NASA Astrophysics Data System (ADS)
Passow, Christian; Donner, Reik
2017-04-01
Quantile mapping (QM) is an established concept that allows one to correct systematic biases in multiple quantiles of the distribution of a climatic observable. It shows remarkable results in correcting biases in historical simulations against observational data and outperforms simpler correction methods that adjust only the mean or variance. Since it has been shown that bias correction of future predictions or scenario runs with basic QM can result in misleading trends in the projection, adjusted, trend-preserving versions of QM were introduced in the form of detrended quantile mapping (DQM) and quantile delta mapping (QDM) (Cannon, 2015, 2016). Still, all previous versions and applications of QM-based bias correction rely on the assumption of time-independent quantiles over the investigated period, which can be misleading in the context of a changing climate. Here, we propose a novel combination of linear quantile regression (QR) with the classical QM method to introduce a consistent, time-dependent and trend-preserving approach to bias correction for historical and future projections. Since QR is a regression method, it is possible to estimate quantiles at the same resolution as the given data and to include trends or other dependencies. We demonstrate the performance of the new method of linear regression quantile mapping (RQM) in correcting biases of temperature and precipitation products from historical runs (1959 - 2005) of the COSMO model in climate mode (CCLM) from the Euro-CORDEX ensemble relative to gridded E-OBS data of the same spatial and temporal resolution. A thorough comparison with established bias correction methods highlights the strengths and potential weaknesses of the new RQM approach. References: A.J. Cannon, S.R. Sorbie, T.Q. Murdock: Bias Correction of GCM Precipitation by Quantile Mapping - How Well Do Methods Preserve Changes in Quantiles and Extremes? Journal of Climate, 28, 6038, 2015; A.J. Cannon: Multivariate Bias Correction of Climate Model Outputs - Matching Marginal Distributions and Inter-variable Dependence Structure. Journal of Climate, 29, 7045, 2016.
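A minimal sketch of the basic QM building block (empirical quantile transfer); the trend-aware DQM/QDM/RQM variants discussed above are not reproduced, and the data are random placeholders.

```python
# Sketch of basic quantile mapping: empirical quantiles of the model series are
# mapped onto empirical quantiles of the observations over a historical period, and
# the resulting transfer is applied to future model values.
import numpy as np

def quantile_map(model_hist, obs_hist, model_future, n_quantiles=100):
    q = np.linspace(0.01, 0.99, n_quantiles)
    model_q = np.quantile(model_hist, q)
    obs_q = np.quantile(obs_hist, q)
    # Correct each future model value using the historical model -> observation transfer.
    return np.interp(model_future, model_q, obs_q)

rng = np.random.default_rng(0)
obs_hist = rng.gamma(shape=2.0, scale=3.0, size=5000)        # hypothetical precipitation
model_hist = rng.gamma(shape=2.0, scale=2.4, size=5000)      # biased model climate
model_future = rng.gamma(shape=2.2, scale=2.4, size=5000)
corrected = quantile_map(model_hist, obs_hist, model_future)
print(obs_hist.mean(), model_future.mean(), corrected.mean())
```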
Quintela-del-Río, Alejandro; Francisco-Fernández, Mario
2011-02-01
The study of extreme values and prediction of ozone data is an important topic of research when dealing with environmental problems. Classical extreme value theory is usually used in air-pollution studies. It consists of fitting a parametric generalised extreme value (GEV) distribution to a data set of extreme values, and using the estimated distribution to compute return levels and other quantities of interest. Here, we propose to estimate these values using nonparametric functional data methods. Functional data analysis is a relatively new statistical methodology that generally deals with data consisting of curves or multi-dimensional variables. In this paper, we use this technique, jointly with nonparametric curve estimation, to provide alternatives to the usual parametric statistical tools. The nonparametric estimators are applied to real samples of maximum ozone values obtained from several monitoring stations belonging to the Automatic Urban and Rural Network (AURN) in the UK. The results show that the nonparametric estimators work satisfactorily, outperforming the classical parametric estimators. Functional data analysis is also used to predict stratospheric ozone concentrations. We show an application using the data set of mean monthly ozone concentrations in Arosa, Switzerland, and the results are compared with those obtained by classical time series (ARIMA) analysis. Copyright © 2010 Elsevier Ltd. All rights reserved.
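A minimal sketch of the classical parametric route discussed above, assuming scipy: a GEV distribution is fitted to annual maxima and a 10-year return level is read off, alongside a crude nonparametric empirical quantile for comparison. The simulated ozone maxima are placeholders, not the AURN data.

```python
import numpy as np
from scipy.stats import genextreme

rng = np.random.default_rng(0)
annual_maxima = rng.gumbel(loc=80.0, scale=10.0, size=40)  # surrogate ozone maxima

# Classical parametric route: fit a GEV and read off the 10-year return level.
shape, loc, scale = genextreme.fit(annual_maxima)
gev_10yr = genextreme.isf(1.0 / 10.0, shape, loc, scale)

# A crude nonparametric counterpart: the empirical 1 - 1/T quantile.
empirical_10yr = np.quantile(annual_maxima, 1.0 - 1.0 / 10.0)
print(gev_10yr, empirical_10yr)
```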
He, Bo; Zhang, Shujing; Yan, Tianhong; Zhang, Tao; Liang, Yan; Zhang, Hongjin
2011-01-01
Mobile autonomous systems are very important for marine scientific investigation and military applications. Many algorithms have been studied to deal with the computational efficiency problem posed by large-scale simultaneous localization and mapping (SLAM) and its related accuracy and consistency requirements. Among these methods, submap-based SLAM is one of the more effective. By combining the strengths of two popular mapping algorithms, the Rao-Blackwellised particle filter (RBPF) and the extended information filter (EIF), this paper presents combined SLAM, an efficient submap-based solution to the SLAM problem in large-scale environments. RBPF-SLAM is used to produce local maps, which are periodically fused into an EIF-SLAM algorithm. RBPF-SLAM avoids linearization of the robot model during operation and provides robust data association, while EIF-SLAM improves overall computational speed and avoids the tendency of RBPF-SLAM to be over-confident. In order to further improve computational speed in a real-time environment, a binary-tree-based decision-making strategy is introduced. Simulation experiments show that the proposed combined SLAM algorithm significantly outperforms currently existing algorithms in terms of accuracy and consistency, as well as computational efficiency. Finally, the combined SLAM algorithm is experimentally validated in a real environment using the Victoria Park dataset.
Methodology to estimate the relative pressure field from noisy experimental velocity data
NASA Astrophysics Data System (ADS)
Bolin, C. D.; Raguin, L. G.
2008-11-01
The determination of intravascular pressure fields is important to the characterization of cardiovascular pathology. We present a two-stage method that solves the inverse problem of estimating the relative pressure field from noisy velocity fields measured by phase contrast magnetic resonance imaging (PC-MRI) on an irregular domain with limited spatial resolution, and includes a filter for the experimental noise. For the pressure calculation, the Poisson pressure equation is solved by embedding the irregular flow domain into a regular domain. To lessen the propagation of the noise inherent to the velocity measurements, three filters - a median filter and two physics-based filters - are evaluated using a 2-D Couette flow. The two physics-based filters outperform the median filter for the estimation of the relative pressure field for realistic signal-to-noise ratios (SNR = 5 to 30). The most accurate pressure field results from a filter that applies in a least-squares sense three constraints simultaneously: consistency between measured and filtered velocity fields, a divergence-free condition, and additional smoothness conditions. This filter leads to a 5-fold gain in accuracy for the estimated relative pressure field compared to the unfiltered case, in conditions consistent with PC-MRI of the carotid artery: SNR = 5, 20 x 20 discretized flow domain (25 x 25 computational domain).
Searching for transcription factor binding sites in vector spaces
2012-01-01
Background Computational approaches to transcription factor binding site identification have been actively researched in the past decade. Learning from known binding sites, new binding sites of a transcription factor in unannotated sequences can be identified. A number of search methods have been introduced over the years. However, one can rarely find one single method that performs the best on all the transcription factors. Instead, to identify the best method for a particular transcription factor, one usually has to compare a handful of methods. Hence, it is highly desirable for a method to perform automatic optimization for individual transcription factors. Results We proposed to search for transcription factor binding sites in vector spaces. This framework allows us to identify the best method for each individual transcription factor. We further introduced two novel methods, the negative-to-positive vector (NPV) and optimal discriminating vector (ODV) methods, to construct query vectors to search for binding sites in vector spaces. Extensive cross-validation experiments showed that the proposed methods significantly outperformed the ungapped likelihood under positional background method, a state-of-the-art method, and the widely-used position-specific scoring matrix method. We further demonstrated that motif subtypes of a TF can be readily identified in this framework and two variants called the kNPV and kODV methods benefited significantly from motif subtype identification. Finally, independent validation on ChIP-seq data showed that the ODV and NPV methods significantly outperformed the other compared methods. Conclusions We conclude that the proposed framework is highly flexible. It enables the two novel methods to automatically identify a TF-specific subspace to search for binding sites. Implementations are available as source code at: http://biogrid.engr.uconn.edu/tfbs_search/. PMID:23244338
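The sketch below illustrates the general idea of a negative-to-positive query vector in a k-mer vector space: the query points from the centroid of background sequences toward the centroid of known binding sites, and candidate windows are scored by projection onto it. The exact encoding and scoring of the published NPV/ODV methods may differ; this is a simplified illustration with toy sequences.

```python
import numpy as np
from itertools import product

def kmer_vector(seq, k=3, alphabet="ACGT"):
    """Encode a sequence as a unit-normalized k-mer count vector."""
    index = {"".join(p): i for i, p in enumerate(product(alphabet, repeat=k))}
    v = np.zeros(len(index))
    for i in range(len(seq) - k + 1):
        v[index[seq[i:i + k]]] += 1
    return v / max(np.linalg.norm(v), 1e-12)

def npv_query(binding_sites, background_seqs, k=3):
    """Negative-to-positive vector: from the background centroid to the binding-site centroid."""
    pos = np.mean([kmer_vector(s, k) for s in binding_sites], axis=0)
    neg = np.mean([kmer_vector(s, k) for s in background_seqs], axis=0)
    return pos - neg

# Candidate windows are scored by their projection onto the query vector.
query = npv_query(["ACGTACGT", "ACGTTCGT"], ["GGGGCCCC", "TTTTGGGG"])
score = float(np.dot(query, kmer_vector("ACGTGCGT")))
print(score)
```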
Li, Xingyu; Plataniotis, Konstantinos N
2015-07-01
In digital histopathology, tasks of segmentation and disease diagnosis are achieved by quantitative analysis of image content. However, color variation in image samples makes it challenging to produce reliable results. This paper introduces a complete normalization scheme to address the problem of color variation in histopathology images jointly caused by inconsistent biopsy staining and nonstandard imaging conditions. Method: Different from existing normalization methods that either address partial causes of color variation or lump them together, our method identifies causes of color variation based on a microscopic imaging model and addresses inconsistency in biopsy imaging and staining by an illuminant normalization module and a spectral normalization module, respectively. In evaluation, we use two public datasets that are representative of histopathology images commonly received in clinics to examine the proposed method from the aspects of robustness to system settings, performance consistency against achromatic pixels, and normalization effectiveness in terms of histological information preservation. As the saturation-weighted statistics proposed in this study generate stable and reliable color cues for stain normalization, our scheme is robust to system parameters and insensitive to image content and achromatic colors. Extensive experimentation suggests that our approach outperforms state-of-the-art normalization methods, as the proposed method is the only approach that succeeds in preserving histological information after normalization. The proposed color normalization solution would be useful to mitigate the effects of color variation in pathology images on subsequent quantitative analysis.
DeepPap: Deep Convolutional Networks for Cervical Cell Classification.
Zhang, Ling; Le Lu; Nogues, Isabella; Summers, Ronald M; Liu, Shaoxiong; Yao, Jianhua
2017-11-01
Automation-assisted cervical screening via Pap smear or liquid-based cytology (LBC) is a highly effective cell imaging based cancer detection tool, where cells are partitioned into "abnormal" and "normal" categories. However, the success of most traditional classification methods relies on the presence of accurate cell segmentations. Despite sixty years of research in this field, accurate segmentation remains a challenge in the presence of cell clusters and pathologies. Moreover, previous classification methods are only built upon the extraction of hand-crafted features, such as morphology and texture. This paper addresses these limitations by proposing a method to directly classify cervical cells-without prior segmentation-based on deep features, using convolutional neural networks (ConvNets). First, the ConvNet is pretrained on a natural image dataset. It is subsequently fine-tuned on a cervical cell dataset consisting of adaptively resampled image patches coarsely centered on the nuclei. In the testing phase, aggregation is used to average the prediction scores of a similar set of image patches. The proposed method is evaluated on both Pap smear and LBC datasets. Results show that our method outperforms previous algorithms in classification accuracy (98.3%), area under the curve (0.99) values, and especially specificity (98.3%), when applied to the Herlev benchmark Pap smear dataset and evaluated using five-fold cross validation. Similar superior performances are also achieved on the HEMLBC (H&E stained manual LBC) dataset. Our method is promising for the development of automation-assisted reading systems in primary cervical screening.
An Active Patch Model for Real World Texture and Appearance Classification
Mao, Junhua; Zhu, Jun; Yuille, Alan L.
2014-01-01
This paper addresses the task of natural texture and appearance classification. Our goal is to develop a simple and intuitive method that performs at the state of the art on datasets ranging from homogeneous texture (e.g., material texture), to less homogeneous texture (e.g., the fur of animals), and to inhomogeneous texture (the appearance patterns of vehicles). Our method uses a bag-of-words model where the features are based on a dictionary of active patches. Active patches are raw intensity patches which can undergo spatial transformations (e.g., rotation and scaling) and adjust themselves to best match the image regions. The dictionary of active patches is required to be compact and representative, in the sense that we can use it to approximately reconstruct the images that we want to classify. We propose a probabilistic model to quantify the quality of image reconstruction and design a greedy learning algorithm to obtain the dictionary. We classify images using the occurrence frequency of the active patches. Feature extraction is fast (about 100 ms per image) using the GPU. The experimental results show that our method improves the state of the art on a challenging material texture benchmark dataset (KTH-TIPS2). To test our method on less homogeneous or inhomogeneous images, we construct two new datasets consisting of appearance image patches of animals and vehicles cropped from the PASCAL VOC dataset. Our method outperforms competing methods on these datasets. PMID:25531013
Yang, Yong; Tong, Song; Huang, Shuying; Lin, Pan
2014-01-01
This paper presents a novel framework for the fusion of multi-focus images explicitly designed for visual sensor network (VSN) environments. Multi-scale-based fusion methods can often obtain fused images with good visual effect. However, because of defects in the fusion rules, it is almost impossible to completely avoid the loss of useful information in the resulting fused images. The proposed fusion scheme can be divided into two processes: initial fusion and final fusion. The initial fusion is based on a dual-tree complex wavelet transform (DTCWT). The Sum-Modified-Laplacian (SML)-based visual contrast and SML are employed to fuse the low- and high-frequency coefficients, respectively, and an initial composite image is obtained. In the final fusion process, the image block residuals technique and consistency verification are used to detect the focused areas, and then a decision map is obtained. This map guides construction of the final fused image. The performance of the proposed method was extensively tested on a number of multi-focus images, including no-reference images, reference images, and images with different noise levels. The experimental results clearly indicate that the proposed method outperformed various state-of-the-art fusion methods, in terms of both subjective and objective evaluations, and is more suitable for VSNs. PMID:25587878
Yang, Yong; Tong, Song; Huang, Shuying; Lin, Pan
2014-11-26
This paper presents a novel framework for the fusion of multi-focus images explicitly designed for visual sensor network (VSN) environments. Multi-scale-based fusion methods can often obtain fused images with good visual effect. However, because of defects in the fusion rules, it is almost impossible to completely avoid the loss of useful information in the resulting fused images. The proposed fusion scheme can be divided into two processes: initial fusion and final fusion. The initial fusion is based on a dual-tree complex wavelet transform (DTCWT). The Sum-Modified-Laplacian (SML)-based visual contrast and SML are employed to fuse the low- and high-frequency coefficients, respectively, and an initial composite image is obtained. In the final fusion process, the image block residuals technique and consistency verification are used to detect the focused areas, and then a decision map is obtained. This map guides construction of the final fused image. The performance of the proposed method was extensively tested on a number of multi-focus images, including no-reference images, reference images, and images with different noise levels. The experimental results clearly indicate that the proposed method outperformed various state-of-the-art fusion methods, in terms of both subjective and objective evaluations, and is more suitable for VSNs.
Efficient mining of association rules for the early diagnosis of Alzheimer's disease
NASA Astrophysics Data System (ADS)
Chaves, R.; Górriz, J. M.; Ramírez, J.; Illán, I. A.; Salas-Gonzalez, D.; Gómez-Río, M.
2011-09-01
In this paper, a novel technique based on association rules (ARs) is presented in order to find relations among activated brain areas in single photon emission computed tomography (SPECT) imaging. In this sense, the aim of this work is to discover associations among attributes which characterize the perfusion patterns of normal subjects and to make use of them for the early diagnosis of Alzheimer's disease (AD). Firstly, voxel-as-feature-based activation estimation methods are used to find the tridimensional activated brain regions of interest (ROIs) for each patient. Secondly, these ROIs serve as input for mining ARs with minimum support and confidence among activation blocks, using a set of controls. In this context, the support and confidence measures are related to the proportion of functional areas which are singularly and mutually activated across the brain. Finally, we perform image classification by comparing the number of ARs verified by each subject under test to a given threshold that depends on the number of previously mined rules. Several classification experiments were carried out in order to evaluate the proposed methods using a SPECT database that consists of 41 controls (NOR) and 56 AD patients labeled by trained physicians. The proposed methods were validated by means of the leave-one-out cross-validation strategy, yielding up to 94.87% classification accuracy, thus outperforming recently developed methods for computer-aided diagnosis of AD.
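As a reminder of the two AR measures used above, the sketch below computes support and confidence over a toy Boolean activation matrix (subjects by ROIs); the matrix entries and the example rule are invented for illustration, not values from the SPECT study.

```python
import numpy as np

# Toy activation matrix: rows are control subjects, columns are brain ROIs,
# entries indicate whether the ROI is activated above threshold.
activations = np.array([[1, 1, 0, 1],
                        [1, 1, 1, 1],
                        [1, 0, 1, 1],
                        [1, 1, 0, 1]], dtype=bool)

def support(cols):
    """Fraction of subjects in which all ROIs in `cols` are jointly activated."""
    return activations[:, cols].all(axis=1).mean()

def confidence(antecedent, consequent):
    """Of the subjects activating the antecedent ROIs, the fraction also activating the consequent."""
    return support(antecedent + consequent) / support(antecedent)

print(support([0, 1]))           # support of {ROI0, ROI1} -> 0.75
print(confidence([0, 1], [3]))   # confidence of rule {ROI0, ROI1} -> {ROI3} -> 1.0
```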
PockDrug-Server: a new web server for predicting pocket druggability on holo and apo proteins
Hussein, Hiba Abi; Borrel, Alexandre; Geneix, Colette; Petitjean, Michel; Regad, Leslie; Camproux, Anne-Claude
2015-01-01
Predicting a protein pocket's ability to bind drug-like molecules with high affinity, i.e. druggability, is of major interest in the target identification phase of drug discovery. Therefore, pocket druggability investigations represent a key step of compound clinical progression projects. Currently, computational druggability prediction models are attached to one unique pocket estimation method despite pocket estimation uncertainties. In this paper, we propose ‘PockDrug-Server’ to predict pocket druggability, efficient on both (i) estimated pockets guided by the ligand proximity (extracted by proximity to a ligand from a holo protein structure) and (ii) estimated pockets based solely on protein structure information (based on amino atoms that form the surface of potential binding cavities). PockDrug-Server provides consistent druggability results using different pocket estimation methods. It is robust with respect to pocket boundary and estimation uncertainties, and thus efficient using apo pockets that are challenging to estimate. It clearly distinguishes druggable from less druggable pockets using different estimation methods and outperformed recent druggability models for apo pockets. It can be carried out from one or a set of apo/holo proteins using different pocket estimation methods proposed by our web server or from any pocket previously estimated by the user. PockDrug-Server is publicly available at: http://pockdrug.rpbs.univ-paris-diderot.fr. PMID:25956651
MR Image Reconstruction Using Block Matching and Adaptive Kernel Methods.
Schmidt, Johannes F M; Santelli, Claudio; Kozerke, Sebastian
2016-01-01
An approach to Magnetic Resonance (MR) image reconstruction from undersampled data is proposed. Undersampling artifacts are removed using an iterative thresholding algorithm applied to nonlinearly transformed image block arrays. Each block array is transformed using kernel principal component analysis where the contribution of each image block to the transform depends in a nonlinear fashion on the distance to other image blocks. Elimination of undersampling artifacts is achieved by conventional principal component analysis in the nonlinear transform domain, projection onto the main components and back-mapping into the image domain. Iterative image reconstruction is performed by interleaving the proposed undersampling artifact removal step and gradient updates enforcing consistency with acquired k-space data. The algorithm is evaluated using retrospectively undersampled MR cardiac cine data and compared to k-t SPARSE-SENSE, block matching with spatial Fourier filtering and k-t ℓ1-SPIRiT reconstruction. Evaluation of image quality and root-mean-squared-error (RMSE) reveal improved image reconstruction for up to 8-fold undersampled data with the proposed approach relative to k-t SPARSE-SENSE, block matching with spatial Fourier filtering and k-t ℓ1-SPIRiT. In conclusion, block matching and kernel methods can be used for effective removal of undersampling artifacts in MR image reconstruction and outperform methods using standard compressed sensing and ℓ1-regularized parallel imaging methods.
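A toy sketch of the interleaving described above: a denoising step alternated with a k-space data-consistency step. Plain magnitude soft-thresholding stands in for the kernel-PCA block-matching denoiser, so this illustrates the reconstruction loop only, not the published algorithm; the phantom, mask, and parameters are assumptions.

```python
import numpy as np

def toy_recon(kspace, mask, n_iter=50, lam=0.02):
    """Toy iterative recon: alternate a denoising step with k-space data consistency.

    Magnitude soft-thresholding stands in for the kernel-PCA block-matching
    denoiser described in the abstract.
    """
    img = np.fft.ifft2(np.where(mask, kspace, 0))
    for _ in range(n_iter):
        mag = np.abs(img)
        img = img / np.maximum(mag, 1e-12) * np.maximum(mag - lam, 0)  # denoise
        k = np.fft.fft2(img)
        k[mask] = kspace[mask]           # enforce consistency with acquired samples
        img = np.fft.ifft2(k)
    return np.abs(img)

rng = np.random.default_rng(0)
truth = np.zeros((64, 64)); truth[24:40, 24:40] = 1.0
full_k = np.fft.fft2(truth)
mask = rng.random((64, 64)) < 0.3        # roughly 3-fold undersampling
recon = toy_recon(full_k, mask)
```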
Enhanced Regulatory Sequence Prediction Using Gapped k-mer Features
Mohammad-Noori, Morteza; Beer, Michael A.
2014-01-01
Oligomers of length k, or k-mers, are convenient and widely used features for modeling the properties and functions of DNA and protein sequences. However, k-mers suffer from the inherent limitation that if the parameter k is increased to resolve longer features, the probability of observing any specific k-mer becomes very small, and k-mer counts approach a binary variable, with most k-mers absent and a few present once. Thus, any statistical learning approach using k-mers as features becomes susceptible to noisy training set k-mer frequencies once k becomes large. To address this problem, we introduce alternative feature sets using gapped k-mers, a new classifier, gkm-SVM, and a general method for robust estimation of k-mer frequencies. To make the method applicable to large-scale genome wide applications, we develop an efficient tree data structure for computing the kernel matrix. We show that compared to our original kmer-SVM and alternative approaches, our gkm-SVM predicts functional genomic regulatory elements and tissue specific enhancers with significantly improved accuracy, increasing the precision by up to a factor of two. We then show that gkm-SVM consistently outperforms kmer-SVM on human ENCODE ChIP-seq datasets, and further demonstrate the general utility of our method using a Naïve-Bayes classifier. Although developed for regulatory sequence analysis, these methods can be applied to any sequence classification problem. PMID:25033408
Enhanced regulatory sequence prediction using gapped k-mer features.
Ghandi, Mahmoud; Lee, Dongwon; Mohammad-Noori, Morteza; Beer, Michael A
2014-07-01
Oligomers of length k, or k-mers, are convenient and widely used features for modeling the properties and functions of DNA and protein sequences. However, k-mers suffer from the inherent limitation that if the parameter k is increased to resolve longer features, the probability of observing any specific k-mer becomes very small, and k-mer counts approach a binary variable, with most k-mers absent and a few present once. Thus, any statistical learning approach using k-mers as features becomes susceptible to noisy training set k-mer frequencies once k becomes large. To address this problem, we introduce alternative feature sets using gapped k-mers, a new classifier, gkm-SVM, and a general method for robust estimation of k-mer frequencies. To make the method applicable to large-scale genome wide applications, we develop an efficient tree data structure for computing the kernel matrix. We show that compared to our original kmer-SVM and alternative approaches, our gkm-SVM predicts functional genomic regulatory elements and tissue specific enhancers with significantly improved accuracy, increasing the precision by up to a factor of two. We then show that gkm-SVM consistently outperforms kmer-SVM on human ENCODE ChIP-seq datasets, and further demonstrate the general utility of our method using a Naïve-Bayes classifier. Although developed for regulatory sequence analysis, these methods can be applied to any sequence classification problem.
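The sketch below shows one straightforward way to enumerate gapped k-mer features: every l-long window contributes all ways of keeping k informative positions, with the remaining l - k positions treated as wildcards. gkm-SVM itself avoids this explicit enumeration by computing the kernel with a tree data structure; this sketch is only meant to make the feature definition concrete, and the parameter values are illustrative.

```python
from collections import Counter
from itertools import combinations

def gapped_kmer_features(seq, l=6, k=4):
    """Count gapped k-mers: each l-long window contributes every choice of k
    informative positions; the other l - k positions are wildcards."""
    counts = Counter()
    for i in range(len(seq) - l + 1):
        window = seq[i:i + l]
        for keep in combinations(range(l), k):
            feature = tuple((p, window[p]) for p in keep)
            counts[feature] += 1
    return counts

feats = gapped_kmer_features("ACGTACGTGGCA", l=6, k=4)
print(len(feats))  # number of distinct gapped k-mer features in this toy sequence
```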
Rank-based pooling for deep convolutional neural networks.
Shi, Zenglin; Ye, Yangdong; Wu, Yunpeng
2016-11-01
Pooling is a key mechanism in deep convolutional neural networks (CNNs) which helps to achieve translation invariance. Numerous studies, both empirical and theoretical, show that pooling consistently boosts the performance of CNNs. The conventional pooling methods operate on activation values. In this work, we alternatively propose rank-based pooling. It is derived from the observation that the ranking of activations is invariant under changes of activation values within a pooling region, and thus a rank-based pooling operation may achieve more robust performance. In addition, the reasonable usage of ranks can avoid the scale problems encountered by value-based methods. The novel pooling mechanism can be regarded as an instance of weighted pooling where a weighted sum of activations is used to generate the pooling output. This pooling mechanism can also be realized as rank-based average pooling (RAP), rank-based weighted pooling (RWP) and rank-based stochastic pooling (RSP) according to different weighting strategies. As another major contribution, we present a novel criterion to analyze the discriminant ability of various pooling methods, which is heavily under-researched in the machine learning and computer vision communities. Experimental results on several image benchmarks show that rank-based pooling outperforms the existing pooling methods in classification performance. We further demonstrate better performance on CIFAR datasets by integrating RSP into Network-in-Network. Copyright © 2016 Elsevier Ltd. All rights reserved.
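A small numpy sketch of two rank-based pooling variants named above: rank-based average pooling over the top-t ranked activations, and a rank-weighted pooling with a geometric weighting scheme. The particular weight formula and parameter values are illustrative assumptions, not necessarily the exact choices made in the paper.

```python
import numpy as np

def rank_based_average_pooling(region, t=2):
    """RAP: average the t highest-ranked activations in the pooling region."""
    flat = np.sort(region.ravel())[::-1]
    return flat[:t].mean()

def rank_based_weighted_pooling(region, alpha=0.7):
    """RWP-style pooling: weight activations by a geometric function of their rank."""
    flat = np.sort(region.ravel())[::-1]
    ranks = np.arange(1, flat.size + 1)
    weights = alpha * (1 - alpha) ** (ranks - 1)
    return np.sum(weights * flat) / np.sum(weights)

region = np.array([[0.1, 0.9],
                   [0.4, 0.6]])
print(rank_based_average_pooling(region, t=2))   # 0.75
print(rank_based_weighted_pooling(region))
```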
PockDrug-Server: a new web server for predicting pocket druggability on holo and apo proteins.
Hussein, Hiba Abi; Borrel, Alexandre; Geneix, Colette; Petitjean, Michel; Regad, Leslie; Camproux, Anne-Claude
2015-07-01
Predicting a protein pocket's ability to bind drug-like molecules with high affinity, i.e. druggability, is of major interest in the target identification phase of drug discovery. Therefore, pocket druggability investigations represent a key step of compound clinical progression projects. Currently, computational druggability prediction models are attached to one unique pocket estimation method despite pocket estimation uncertainties. In this paper, we propose 'PockDrug-Server' to predict pocket druggability, efficient on both (i) estimated pockets guided by the ligand proximity (extracted by proximity to a ligand from a holo protein structure) and (ii) estimated pockets based solely on protein structure information (based on amino atoms that form the surface of potential binding cavities). PockDrug-Server provides consistent druggability results using different pocket estimation methods. It is robust with respect to pocket boundary and estimation uncertainties, and thus efficient using apo pockets that are challenging to estimate. It clearly distinguishes druggable from less druggable pockets using different estimation methods and outperformed recent druggability models for apo pockets. It can be carried out from one or a set of apo/holo proteins using different pocket estimation methods proposed by our web server or from any pocket previously estimated by the user. PockDrug-Server is publicly available at: http://pockdrug.rpbs.univ-paris-diderot.fr. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Phylogenomics of plant genomes: a methodology for genome-wide searches for orthologs in plants
Conte, Matthieu G; Gaillard, Sylvain; Droc, Gaetan; Perin, Christophe
2008-01-01
Background Gene ortholog identification is now a major objective for mining the increasing amount of sequence data generated by complete or partial genome sequencing projects. Comparative and functional genomics urgently need a method for ortholog detection to reduce gene function inference and to aid in the identification of conserved or divergent genetic pathways between several species. As gene functions change during evolution, reconstructing the evolutionary history of genes should be a more accurate way to differentiate orthologs from paralogs. Phylogenomics takes into account phylogenetic information from high-throughput genome annotation and is the most straightforward way to infer orthologs. However, procedures for automatic detection of orthologs are still scarce and suffer from several limitations. Results We developed a procedure for ortholog prediction between Oryza sativa and Arabidopsis thaliana. Firstly, we established an efficient method to cluster A. thaliana and O. sativa full proteomes into gene families. Then, we developed an optimized phylogenomics pipeline for ortholog inference. We validated the full procedure using test sets of orthologs and paralogs to demonstrate that our method outperforms pairwise methods for ortholog predictions. Conclusion Our procedure achieved a high level of accuracy in predicting ortholog and paralog relationships. Phylogenomic predictions for all validated gene families in both species were easily achieved and we can conclude that our methodology outperforms similarly based methods. PMID:18426584
DOE Office of Scientific and Technical Information (OSTI.GOV)
Niu, S; Zhang, Y; Ma, J
Purpose: To investigate iterative reconstruction via prior image constrained total generalized variation (PICTGV) for spectral computed tomography (CT) using fewer projections while achieving greater image quality. Methods: The proposed PICTGV method is formulated as an optimization problem, which balances the data fidelity and the prior image constrained total generalized variation of reconstructed images in one framework. The PICTGV method is based on structure correlations among images in the energy domain and uses high-quality images to guide the reconstruction of energy-specific images. In the PICTGV method, the high-quality image is reconstructed from all detector-collected X-ray signals and is referred to as the broad-spectrum image. Distinct from existing reconstruction methods that operate on the first-order derivative of the images, the PICTGV method incorporates higher-order derivatives. An alternating optimization algorithm is used to minimize the PICTGV objective function. We evaluate the performance of PICTGV on noise and artifact suppression using phantom studies and compare the method with the conventional filtered back-projection method as well as a TGV-based method without a prior image. Results: On the digital phantom, the proposed method outperforms the existing TGV method in terms of noise reduction, artifact suppression, and edge detail preservation. Compared to that obtained by the TGV-based method without a prior image, the relative root mean square error in the images reconstructed by the proposed method is reduced by over 20%. Conclusion: The authors propose an iterative reconstruction via prior image constrained total generalized variation for spectral CT. We have also developed an alternating optimization algorithm and numerically demonstrated the merits of our approach. Results show that the proposed PICTGV method outperforms the TGV method for spectral CT.
Wang, Lei; You, Zhu-Hong; Chen, Xing; Li, Jian-Qiang; Yan, Xin; Zhang, Wei; Huang, Yu-An
2017-01-01
Protein-protein interactions (PPIs) are not only critical components of various biological processes in cells, but also key to understanding the mechanisms leading to healthy and diseased states in organisms. However, it is time-consuming and cost-intensive to identify the interactions among proteins using biological experiments. Hence, how to develop a more efficient computational method rapidly became an attractive topic in the post-genomic era. In this paper, we propose a novel method for inferring protein-protein interactions from protein amino acid sequences only. Specifically, each protein amino acid sequence is first transformed into a Position-Specific Scoring Matrix (PSSM) generated by multiple sequence alignment; the Pseudo-PSSM is then used to extract feature descriptors. Finally, an ensemble Rotation Forest (RF) learning system is trained to predict and recognize PPIs based solely on protein sequence features. When the proposed method was applied to three benchmark data sets (Yeast, H. pylori, and an independent dataset) for predicting PPIs, it achieved good average accuracies of 98.38%, 89.75%, and 96.25%, respectively. In order to further evaluate the prediction performance, we also compared the proposed method with other methods using the same benchmark data sets. The experimental results demonstrate that the proposed method consistently outperforms other state-of-the-art methods. Therefore, our method is effective and robust and can be taken as a useful tool in exploring and discovering new relationships between proteins. A web server is made publicly available at the URL http://202.119.201.126:8888/PsePSSM/ for academic use. PMID:28029645
ERIC Educational Resources Information Center
Montalvo, Edris J.
2013-01-01
In many public U.S. universities, Hispanic undergraduates are underrepresented in terms of enrollment and graduation. This mixed-method geographical study investigated whether some public universities outperform others in recruiting and retaining Hispanic undergraduates. The quantitative findings showed that the effect of financial aid and…
A new diffusion-inhibited oxidation-resistant coating for superalloys
NASA Technical Reports Server (NTRS)
Gedwill, M. A.; Glasgow, T. K.; Levine, S. R.
1981-01-01
A concept for enhanced protection of superalloys consists of adding an oxidation- and diffusion-resistant cermet layer between the superalloy and the outer oxidation-resistant metallic alloy coating. Such a duplex coating was compared with a physical-vapor-deposited (PVD) NiCrAlY coating in cyclic oxidation at 1150 C. The substrate alloy was MA 754 - an oxide-dispersion-strengthened superalloy that is difficult to coat. The duplex coating, applied by plasma spraying, outperformed the PVD coating on the basis of weight change and both macroscopic and metallographic observations.
Can Diastat Grafts Meet the Challenges of Daily Punctures?
Chandran, Prem K G; Messer, Diane; Sidwell, Richard A; Stubbs, David H; Nish, Andrew D
1997-01-01
To determine whether Diastat grafts can meet the challenges of daily needle punctures required for home hemodialysis (HD), a retrospective analysis was performed on the experience with 47 grafts placed in 44 patients receiving HD three times a week. The control group consisted of 17 patients who received 17 stretch polytetrafluoroethylene (s-PTFE) grafts. Apart from their ability to better contain bleeding after needle withdrawal, in all measures of longevity the Diastat grafts were outperformed by the s-PTFE grafts. No more direct data exist to address the original challenge.
Experimental problem solving: An instructional improvement field experiment
NASA Astrophysics Data System (ADS)
Ross, John A.; Maynes, Florence J.
An instructional program based on expert-novice differences in experimental problem-solving performance was taught to grade 6 students (N = 265). Classes of students were randomly assigned to conditions in a delayed treatment design. Performance was assessed with multiple-choice and open-ended measures of specific transfer. Between group comparisons using pretest scores as a covariate showed that treatment condition students consistently outperformed controls; similar results were revealed in the within group comparisons. The achievement of the early treatment group did not decline in tests administered one month after the posttest.
Lajnef, Tarek; Chaibi, Sahbi; Ruby, Perrine; Aguera, Pierre-Emmanuel; Eichenlaub, Jean-Baptiste; Samet, Mounir; Kachouri, Abdennaceur; Jerbi, Karim
2015-07-30
Sleep staging is a critical step in a range of electrophysiological signal processing pipelines used in clinical routine as well as in sleep research. Although the results currently achievable with automatic sleep staging methods are promising, there is a need for improvement, especially given the time-consuming and tedious nature of visual sleep scoring. Here we propose a sleep staging framework that consists of a multi-class support vector machine (SVM) classification based on a decision tree approach. The performance of the method was evaluated using polysomnographic data from 15 subjects (electroencephalogram (EEG), electrooculogram (EOG) and electromyogram (EMG) recordings). The decision tree, or dendrogram, was obtained using a hierarchical clustering technique, and a wide range of time- and frequency-domain features were extracted. Feature selection was carried out using forward sequential selection, and classification was evaluated using k-fold cross-validation. The dendrogram-based SVM (DSVM) achieved mean specificity, sensitivity and overall accuracy of 0.92, 0.74 and 0.88 respectively, compared to expert visual scoring. Restricting DSVM classification to data where both experts' scoring was consistent (76.73% of the data) led to a mean specificity, sensitivity and overall accuracy of 0.94, 0.82 and 0.92 respectively. The DSVM framework outperforms classification with the more standard multi-class "one-against-all" SVM and linear discriminant analysis. The promising results of the proposed methodology suggest that it may be a valuable alternative to existing automatic methods and that it could accelerate visual scoring by providing a robust starting hypnogram that can be further fine-tuned by expert inspection. Copyright © 2015 Elsevier B.V. All rights reserved.
Psoriasis image representation using patch-based dictionary learning for erythema severity scoring.
George, Yasmeen; Aldeen, Mohammad; Garnavi, Rahil
2018-06-01
Psoriasis is a chronic skin disease which can be life-threatening. Accurate severity scoring helps dermatologists to decide on the treatment. In this paper, we present a semi-supervised computer-aided system for automatic erythema severity scoring in psoriasis images. Firstly, the unsupervised stage includes a novel image representation method. We construct a dictionary, which is then used in the sparse representation for local feature extraction. To acquire the final image representation vector, an aggregation method is exploited over the local features. Secondly, the supervised phase is where various multi-class machine learning (ML) classifiers are trained for erythema severity scoring. Finally, we compare the proposed system with two popular unsupervised feature extractor methods, namely the bag of visual words model (BoVWs) and the AlexNet pretrained model. Root mean square error (RMSE) and F1 score are used as performance measures for the learned dictionaries and the trained ML models, respectively. A psoriasis image set consisting of 676 images is used in this study. Experimental results demonstrate that the use of the proposed procedure can provide a setup where erythema scoring is accurate and consistent. Also, it is revealed that dictionaries with a large number of atoms and small patch sizes yield the most representative erythema severity features. Further, random forest (RF) outperforms other classifiers with an F1 score of 0.71, followed by support vector machine (SVM) and boosting with scores of 0.66 and 0.64, respectively. Furthermore, the conducted comparative studies confirm the effectiveness of the proposed approach, with improvements of 9% and 12% over BoVWs- and AlexNet-based features, respectively. Crown Copyright © 2018. Published by Elsevier Ltd. All rights reserved.
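A minimal sketch of the unsupervised stage described above, assuming scikit-learn: patches are extracted, a dictionary is learned, and the sparse codes are mean-pooled into an image-level descriptor that a classifier such as a random forest could consume. The patch size, number of atoms, and pooling choice are placeholders rather than the settings used in the study.

```python
import numpy as np
from sklearn.feature_extraction.image import extract_patches_2d
from sklearn.decomposition import MiniBatchDictionaryLearning

rng = np.random.default_rng(0)
image = rng.random((128, 128))                      # stand-in for a lesion image

# Unsupervised stage: extract local patches, learn a dictionary, sparse-code the patches.
patches = extract_patches_2d(image, (8, 8), max_patches=500, random_state=0)
X = patches.reshape(len(patches), -1)
X = X - X.mean(axis=1, keepdims=True)

dico = MiniBatchDictionaryLearning(n_components=64, alpha=1.0, random_state=0)
codes = dico.fit(X).transform(X)

# Aggregate local codes into one image-level descriptor (mean pooling here);
# this vector would then feed a supervised classifier for severity scoring.
image_descriptor = np.abs(codes).mean(axis=0)
print(image_descriptor.shape)  # (64,)
```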
Smoothing of climate time series revisited
NASA Astrophysics Data System (ADS)
Mann, Michael E.
2008-08-01
We present an easily implemented method for smoothing climate time series, generalizing upon an approach previously described by Mann (2004). The method adaptively weights the three lowest order time series boundary constraints to optimize the fit with the raw time series. We apply the method to the instrumental global mean temperature series from 1850-2007 and to various surrogate global mean temperature series from 1850-2100 derived from the CMIP3 multimodel intercomparison project. These applications demonstrate that the adaptive method systematically out-performs certain widely used default smoothing methods, and is more likely to yield accurate assessments of long-term warming trends.
Monte Carlo sampling in diffusive dynamical systems
NASA Astrophysics Data System (ADS)
Tapias, Diego; Sanders, David P.; Altmann, Eduardo G.
2018-05-01
We introduce a Monte Carlo algorithm to efficiently compute transport properties of chaotic dynamical systems. Our method exploits the importance sampling technique that favors trajectories in the tail of the distribution of displacements, where deviations from a diffusive process are most prominent. We search for initial conditions using a proposal that correlates states in the Markov chain constructed via a Metropolis-Hastings algorithm. We show that our method outperforms the direct sampling method and also Metropolis-Hastings methods with alternative proposals. We test our general method through numerical simulations in 1D (box-map) and 2D (Lorentz gas) systems.
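A rough sketch of the sampling idea: a Metropolis-Hastings chain over initial conditions whose target is biased toward large displacements, with a proposal that perturbs (and therefore correlates with) the current state. The toy 1D map, the exponential bias, and all parameter values are illustrative assumptions, not the systems or weights used in the paper.

```python
import numpy as np

def displacement(x0, a=1.1, n_steps=500):
    """Displacement of a toy 1D diffusive map x -> x + a*sin(2*pi*x) after n_steps."""
    x = x0
    for _ in range(n_steps):
        x = x + a * np.sin(2.0 * np.pi * x)
    return x - x0

def mh_tail_sampler(n_samples, beta=0.05, step=1e-4, seed=0):
    """Metropolis-Hastings over initial conditions, biased toward large displacements."""
    rng = np.random.default_rng(seed)
    x = rng.random()
    d = displacement(x)
    samples = []
    for _ in range(n_samples):
        x_new = x + step * rng.normal()        # proposal correlated with the current state
        d_new = displacement(x_new)
        if rng.random() < np.exp(beta * (abs(d_new) - abs(d))):
            x, d = x_new, d_new
        # store the state, its displacement, and the importance weight 1/bias
        samples.append((x, d, np.exp(-beta * abs(d))))
    return samples

chain = mh_tail_sampler(200)
```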
Probability genotype imputation method and integrated weighted lasso for QTL identification.
Demetrashvili, Nino; Van den Heuvel, Edwin R; Wit, Ernst C
2013-12-30
Many QTL studies have two common features: (1) there is often missing marker information, and (2) among the many markers involved in the biological process only a few are causal. In statistics, the second issue falls under the headings "sparsity" and "causal inference". The goal of this work is to develop a two-step statistical methodology for QTL mapping for markers with binary genotypes. The first step introduces a novel imputation method for missing genotypes. The outcomes of the proposed imputation method are probabilities, which serve as weights in the second step, a weighted lasso. The sparse phenotype inference is employed to select a set of predictive markers for the trait of interest. Simulation studies validate the proposed methodology under a wide range of realistic settings. Furthermore, the methodology outperforms alternative imputation and variable selection methods in such studies. The methodology was applied to an Arabidopsis experiment containing 69 markers for 165 recombinant inbred lines of an F8 generation. The results confirm previously identified regions; however, several new markers are also found. On the basis of the inferred ROC behavior these markers show good potential for being real, especially for the germination trait Gmax. Our imputation method shows higher accuracy in terms of sensitivity and specificity compared to an alternative imputation method. Also, the proposed weighted lasso outperforms commonly practiced multiple regression as well as the traditional lasso and adaptive lasso with three weighting schemes. This means that under realistic missing data settings this methodology can be used for QTL identification.
Kang, Guangliang; Du, Li; Zhang, Hong
2016-06-22
The growing complexity of biological experiment design based on high-throughput RNA sequencing (RNA-seq) is calling for more accommodating statistical tools. We focus on differential expression (DE) analysis using RNA-seq data in the presence of multiple treatment conditions. We propose a novel method, multiDE, for facilitating DE analysis using RNA-seq read count data with multiple treatment conditions. The read count is assumed to follow a log-linear model incorporating two factors (i.e., condition and gene), where an interaction term is used to quantify the association between gene and condition. The number of degrees of freedom is reduced to one through a first-order decomposition of the interaction, leading to a dramatic power improvement in testing DE genes when the number of conditions is greater than two. In our simulation studies, multiDE outperformed the benchmark methods (i.e., edgeR and DESeq2) even when the underlying model was severely misspecified, and the power gain increased with the number of conditions. In the application to two real datasets, multiDE identified more biologically meaningful DE genes than the benchmark methods. An R package implementing multiDE is available publicly at http://homepage.fudan.edu.cn/zhangh/softwares/multiDE . When the number of conditions is two, multiDE performs comparably with the benchmark methods. When the number of conditions is greater than two, multiDE outperforms the benchmark methods.
Lai, Zongying; Zhang, Xinlin; Guo, Di; Du, Xiaofeng; Yang, Yonggui; Guo, Gang; Chen, Zhong; Qu, Xiaobo
2018-05-03
Multi-contrast images in magnetic resonance imaging (MRI) provide abundant contrast information reflecting the characteristics of the internal tissues of human bodies, and thus have been widely utilized in clinical diagnosis. However, long acquisition times limit the application of multi-contrast MRI. One efficient way to accelerate data acquisition is to under-sample the k-space data and then reconstruct images with a sparsity constraint. However, images are compromised at high acceleration factors if they are reconstructed individually. We aim to improve the images with a jointly sparse reconstruction and a graph-based redundant wavelet transform (GBRWT). First, a sparsifying transform, GBRWT, is trained to reflect the similarity of tissue structures in multi-contrast images. Second, joint multi-contrast image reconstruction is formulated as an ℓ2,1-norm optimization problem under GBRWT representations. Third, the optimization problem is numerically solved using a derived alternating direction method. Experimental results on synthetic and in vivo MRI data demonstrate that the proposed joint reconstruction method can achieve lower reconstruction errors and better preserve image structures than the compared joint reconstruction methods. Besides, the proposed method outperforms single-image reconstruction with a joint sparsity constraint of multi-contrast images. The proposed method explores the joint sparsity of multi-contrast MRI images under the graph-based redundant wavelet transform and realizes joint sparse reconstruction of multi-contrast images. Experiments demonstrate that the proposed method outperforms the compared joint reconstruction methods as well as individual reconstructions. With this high-quality image reconstruction method, it is possible to achieve high acceleration factors by exploiting the complementary information provided by multi-contrast MRI.
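To make the ℓ2,1 coupling concrete, the sketch below implements the proximal operator of the ℓ2,1 norm, which shrinks each row of a multi-contrast coefficient matrix jointly; it is the kind of building block a solver such as an alternating direction method would call, not the full reconstruction, and the example values are invented.

```python
import numpy as np

def prox_l21(X, tau):
    """Proximal operator of tau * ||X||_{2,1}.

    Each row of X holds the same transform-domain coefficient across contrasts;
    rows are shrunk toward zero jointly, which is what couples the contrasts
    in a jointly sparse reconstruction.
    """
    row_norms = np.linalg.norm(X, axis=1, keepdims=True)
    shrink = np.maximum(1.0 - tau / np.maximum(row_norms, 1e-12), 0.0)
    return X * shrink

coeffs = np.array([[3.0, 4.0],     # strong in both contrasts: kept (scaled to [2.4, 3.2])
                   [0.1, -0.2]])   # weak in both contrasts: zeroed out
print(prox_l21(coeffs, tau=1.0))
```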
Banić, Nikola; Lončarić, Sven
2015-11-01
Removing the influence of illumination on image colors and adjusting the brightness across the scene are important image enhancement problems. This is achieved by applying adequate color constancy and brightness adjustment methods. One of the earliest models to deal with both of these problems was the Retinex theory. Some of the Retinex implementations tend to give high-quality results by performing local operations, but they are computationally relatively slow. One of the recent Retinex implementations is light random sprays Retinex (LRSR). In this paper, a new method is proposed for brightness adjustment and color correction that overcomes the main disadvantages of LRSR. There are three main contributions of this paper. First, a concept of memory sprays is proposed to reduce the number of LRSR's per-pixel operations to a constant regardless of the parameter values, thereby enabling a fast Retinex-based local image enhancement. Second, an effective remapping of image intensities is proposed that results in significantly higher quality. Third, the problem of LRSR's halo effect is significantly reduced by using an alternative illumination processing method. The proposed method enables a fast Retinex-based image enhancement by processing Retinex paths in a constant number of steps regardless of the path size. Due to the halo effect removal and remapping of the resulting intensities, the method outperforms many of the well-known image enhancement methods in terms of resulting image quality. The results are presented and discussed. It is shown that the proposed method outperforms most of the tested methods in terms of image brightness adjustment, color correction, and computational speed.
A comparison of cosegregation analysis methods for the clinical setting.
Rañola, John Michael O; Liu, Quanhui; Rosenthal, Elisabeth A; Shirts, Brian H
2018-04-01
Quantitative cosegregation analysis can help evaluate the pathogenicity of genetic variants. However, genetics professionals without statistical training often use simple methods, reporting only qualitative findings. We evaluate the potential utility of quantitative cosegregation in the clinical setting by comparing three methods. One thousand pedigrees each were simulated for benign and pathogenic variants in BRCA1 and MLH1 using United States historical demographic data to produce pedigrees similar to those seen in the clinic. These pedigrees were analyzed using two robust methods, full likelihood Bayes factors (FLB) and cosegregation likelihood ratios (CSLR), and a simpler method, counting meioses. Both FLB and CSLR outperform counting meioses when dealing with pathogenic variants, though counting meioses is not far behind. For benign variants, FLB and CSLR greatly outperform counting meioses, which is unable to generate evidence in favor of benign variants. Comparing FLB and CSLR, we find that the two methods perform similarly, indicating that quantitative results from either of these methods could be combined in multifactorial calculations. Combining quantitative information will be important, as isolated use of cosegregation in single families will yield classification for less than 1% of variants. To encourage wider use of robust cosegregation analysis, we present a website ( http://www.analyze.myvariant.org ) which implements the CSLR, FLB, and Counting Meioses methods for ATM, BRCA1, BRCA2, CHEK2, MEN1, MLH1, MSH2, MSH6, and PMS2. We also present an R package, CoSeg, which performs the CSLR analysis on any gene with user-supplied parameters. Future variant classification guidelines should allow nuanced inclusion of cosegregation evidence against pathogenicity.
Peel, D; Waples, R S; Macbeth, G M; Do, C; Ovenden, J R
2013-03-01
Theoretical models are often applied to population genetic data sets without fully considering the effect of missing data. Researchers can deal with missing data by removing individuals that have failed to yield genotypes and/or by removing loci that have failed to yield allelic determinations, but despite their best efforts, most data sets still contain some missing data. As a consequence, realized sample size differs among loci, and this poses a problem for unbiased methods that must explicitly account for random sampling error. One commonly used solution for the calculation of contemporary effective population size (Ne) is to calculate the effective sample size as an unweighted mean or harmonic mean across loci. This is not ideal because it fails to account for the fact that loci with different numbers of alleles have different information content. Here we consider this problem for genetic estimators of contemporary effective population size (Ne). To evaluate bias and precision of several statistical approaches for dealing with missing data, we simulated populations with known Ne and various degrees of missing data. Across all scenarios, one method of correcting for missing data (fixed-inverse variance-weighted harmonic mean) consistently performed the best for both single-sample and two-sample (temporal) methods of estimating Ne and outperformed some methods currently in widespread use. The approach adopted here may be a starting point to adjust other population genetics methods that include per-locus sample size components. © 2012 Blackwell Publishing Ltd.
NASA Technical Reports Server (NTRS)
Crawford, Winifred C.
2010-01-01
The AMU created new logistic regression equations in an effort to increase the skill of the Objective Lightning Forecast Tool developed in Phase II (Lambert 2007). One equation was created for each of five sub-seasons based on the daily lightning climatology instead of by month as was done in Phase II. The assumption was that these equations would capture the physical attributes that contribute to thunderstorm formation more so than monthly equations. However, the SS values in Section 5.3.2 showed that the Phase III equations had worse skill than the Phase II equations and, therefore, will not be transitioned into operations. The current Objective Lightning Forecast Tool developed in Phase II will continue to be used operationally in MIDDS. Three warm seasons were added to the Phase II dataset to increase the POR from 17 to 20 years (1989-2008), and data for October were included since the daily climatology showed lightning occurrence extending into that month. None of the three methods tested to determine the start of the subseason in each individual year were able to discern the start dates with consistent accuracy. Therefore, the start dates were determined by the daily climatology shown in Figure 10 and were the same in every year. The procedures used to create the predictors and develop the equations were identical to those in Phase II. The equations were made up of one to three predictors. TI and the flow regime probabilities were the top predictors followed by 1-day persistence, then VT and Ll. Each equation outperformed four other forecast methods by 7-57% using the verification dataset, but the new equations were outperformed by the Phase II equations in every sub-season. The reason for the degradation may be due to the fact that the same sub-season start dates were used in every year. It is likely there was overlap of sub-season days at the beginning and end of each defined sub-season in each individual year, which could very well affect equation performance.
Pressman, Alice R.; Avins, Andrew L.; Hubbard, Alan; Satariano, William A.
2014-01-01
Background There is a paucity of literature comparing Bayesian analytic techniques with traditional approaches for analyzing clinical trials using real trial data. Methods We compared Bayesian and frequentist group sequential methods using data from two published clinical trials. We chose two widely accepted frequentist rules, O'Brien–Fleming and Lan–DeMets, and conjugate Bayesian priors. Using the nonparametric bootstrap, we estimated a sampling distribution of stopping times for each method. Because current practice dictates the preservation of an experiment-wise false positive rate (Type I error), we approximated these error rates for our Bayesian and frequentist analyses with the posterior probability of detecting an effect in a simulated null sample. Thus, for the data-generating distribution represented by these trials, we were able to compare the relative performance of these techniques. Results No final outcomes differed from those of the original trials. However, the timing of trial termination differed substantially by method and varied by trial. For one trial, group sequential designs of either type dictated early stopping of the study. In the other, stopping times were dependent upon the choice of spending function and prior distribution. Conclusions Results indicate that trialists ought to consider Bayesian methods in addition to traditional approaches for analysis of clinical trials. Though findings from this small sample did not demonstrate that either method consistently outperforms the other, they did suggest the need to replicate these comparisons using data from varied clinical trials in order to determine the conditions under which the different methods would be most efficient. PMID:21453792
Chen, Yikai; Wang, Kai; Xu, Chengcheng; Shi, Qin; He, Jie; Li, Peiqing; Shi, Ting
2018-05-19
To overcome the limitations of previous highway alignment safety evaluation methods, this article presents a highway alignment safety evaluation method based on fault tree analysis (FTA) and the characteristics of vehicle safety boundaries, within the framework of dynamic modeling of the driver-vehicle-road system. Approaches for categorizing the vehicle failure modes while driving on highways and the corresponding safety boundaries were comprehensively investigated based on vehicle system dynamics theory. Then, an overall crash probability model was formulated based on FTA, considering the risks of 3 failure modes: losing steering capability, losing track-holding capability, and rear-end collision. The proposed method was implemented on a highway segment between Bengbu and Nanjing in China. A driver-vehicle-road multibody dynamics model was developed based on the 3D alignments of the Bengbu to Nanjing section of the Ning-Luo expressway using Carsim, and dynamics indices such as sideslip angle and yaw rate were obtained. Then, the average crash probability of each road section was calculated with a fixed-length method. Finally, the average crash probability was validated against the crash frequency per kilometer to demonstrate the accuracy of the proposed method. The results of the regression and correlation analyses indicated good consistency between the safety evaluation results and the crash data, and showed that the proposed method outperformed the safety evaluation methods used in previous studies. The proposed method has the potential to be used in practical engineering applications to identify crash-prone locations and alignment deficiencies on highways in the planning and design phases, as well as those in service.
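Assuming the three failure modes enter the top OR gate of the fault tree independently, the overall crash probability combines as sketched below; the numeric probabilities are placeholders, not values from the Bengbu-Nanjing case study.

```python
# Top-level OR gate of the fault tree: the section-level crash probability
# combines the three (assumed independent) failure modes.

def overall_crash_probability(p_steering, p_track_holding, p_rear_end):
    """P(crash) = 1 - product of the survival probabilities of each failure mode."""
    return 1.0 - (1.0 - p_steering) * (1.0 - p_track_holding) * (1.0 - p_rear_end)

print(overall_crash_probability(1e-4, 5e-5, 2e-4))  # roughly 3.5e-4
```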
Zou, Meng; Liu, Zhaoqi; Zhang, Xiang-Sun; Wang, Yong
2015-10-15
In prognosis and survival studies, an important goal is to identify multi-biomarker panels with predictive power using molecular characteristics or clinical observations. Such analysis is often challenged by censored, small-sample-size, but high-dimensional genomic profiles or clinical data. Therefore, sophisticated models and algorithms are in pressing need. In this study, we propose a novel Area Under Curve (AUC) optimization method for multi-biomarker panel identification named Nearest Centroid Classifier for AUC optimization (NCC-AUC). Our method is motivated by the connection between the AUC score for classification accuracy evaluation and Harrell's concordance index in survival analysis. This connection allows us to convert the survival time regression problem into a binary classification problem. An optimization model is then formulated to directly maximize the AUC while minimizing the number of selected features used to construct a predictor in the nearest centroid classifier framework. NCC-AUC shows strong performance on both genomic data from breast cancer and clinical data from stage IB Non-Small-Cell Lung Cancer (NSCLC). For the genomic data, NCC-AUC outperforms Support Vector Machine (SVM) and Support Vector Machine-based Recursive Feature Elimination (SVM-RFE) in classification accuracy. It tends to select a multi-biomarker panel with low average redundancy and enriched biological meaning. NCC-AUC also separates low- and high-risk cohorts more significantly than the widely used Cox model (Cox proportional-hazards regression model) and the L1-Cox model (L1-penalized Cox model). These performance gains of NCC-AUC are quite robust across 5 subtypes of breast cancer. Further, in an independent clinical dataset, NCC-AUC outperforms SVM and SVM-RFE in predictive accuracy and is consistently better than the Cox model and L1-Cox model in grouping patients into high- and low-risk categories. In summary, NCC-AUC provides a rigorous optimization framework to systematically reveal multi-biomarker panels from genomic and clinical data. It can serve as a useful tool to identify prognostic biomarkers for survival analysis. NCC-AUC is available at http://doc.aporc.org/wiki/NCC-AUC. ywang@amss.ac.cn Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
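The sketch below shows only the two ingredients named in the method's title, nearest-centroid scoring and a rank-based AUC, on simulated data; NCC-AUC itself couples these in an optimization model with feature selection, which is not reproduced here.

```python
import numpy as np

def auc_score(scores, labels):
    """Rank-based AUC: probability that a random positive scores above a random negative."""
    pos, neg = scores[labels == 1], scores[labels == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

def centroid_scores(X, y):
    """Nearest-centroid decision values: distance to the low-risk centroid minus
    distance to the high-risk centroid, so larger means closer to the high-risk group."""
    c1, c0 = X[y == 1].mean(axis=0), X[y == 0].mean(axis=0)
    return np.linalg.norm(X - c0, axis=1) - np.linalg.norm(X - c1, axis=1)

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (30, 5)), rng.normal(1, 1, (30, 5))])
y = np.array([0] * 30 + [1] * 30)
print(auc_score(centroid_scores(X, y), y))
```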
Fitness differentials amongst schools: how are they related to school sector?
Olds, T; Tomkinson, G; Baker, S
2003-09-01
Data on the performance fitness of 50,385 Australian students aged between 12 and 15 years were used to determine whether students differed in physical fitness according to school sector (independent vs government vs Catholic). Students were tested between 1995 and 2001 as part of the Australian Sports Commission's Talent Search program. The results of the 20 m shuttle run (20mSRT), vertical jump and 40 m sprint tests were selected as being representative of aerobic, explosive and anaerobic performance. All results were expressed as age- and gender-specific z-scores. MANOVA showed that independent school students outperformed students from the Catholic and government sectors on the selected tests for both boys and girls (p < 0.0001). In the 20mSRT, the difference amounted to 0.28-0.43 SDs. In the sprint and jump tests, independent school students were superior by 0.05-0.17 SDs. A proxy for socio-economic status (SES) explained about 90% of the differences between sectors, with high SES schools consistently outperforming low SES schools. Nonetheless, even when SES was factored in, sectoral differences remained significant. Insofar as fitness is related to school activities, these findings raise equity concerns in Australian school physical education.
Implicit negotiation beliefs and performance: experimental and longitudinal evidence.
Kray, Laura J; Haselhuhn, Michael P
2007-07-01
The authors argue that implicit negotiation beliefs, which speak to the expected malleability of negotiating ability, affect performance in dyadic negotiations. They expected negotiators who believe negotiating attributes are malleable (incremental theorists) to outperform negotiators who believe negotiating attributes are fixed (entity theorists). In Study 1, they gathered evidence of convergent and discriminant validity for the implicit negotiation belief construct. In Study 2, they examined the impact of implicit beliefs on the achievement goals that negotiators pursue. In Study 3, they explored the causal role of implicit beliefs on negotiation performance by manipulating negotiators' implicit beliefs within dyads. They also identified perceived ability as a moderator of the link between implicit negotiation beliefs and performance. In Study 4, they measured negotiators' beliefs in a classroom setting and examined how these beliefs affected negotiation performance and overall performance in the course 15 weeks later. Across all performance measures, incremental theorists outperformed entity theorists. Consistent with the authors' hypotheses, incremental theorists captured more of the bargaining surplus and were more integrative than their entity theorist counterparts, suggesting implicit theories are important determinants of how negotiators perform. Implications and future directions are discussed. Copyright 2007 APA, all rights reserved.
Comparative assessment of three standardized robotic surgery training methods.
Hung, Andrew J; Jayaratna, Isuru S; Teruya, Kara; Desai, Mihir M; Gill, Inderbir S; Goh, Alvin C
2013-10-01
To evaluate three standardized robotic surgery training methods, inanimate, virtual reality and in vivo, for their construct validity. To explore the concept of cross-method validity, where the relative performance of each method is compared. Robotic surgical skills were prospectively assessed in 49 participating surgeons who were classified as follows: 'novice/trainee': urology residents, previous experience <30 cases (n = 38) and 'experts': faculty surgeons, previous experience ≥30 cases (n = 11). Three standardized, validated training methods were used: (i) structured inanimate tasks; (ii) virtual reality exercises on the da Vinci Skills Simulator (Intuitive Surgical, Sunnyvale, CA, USA); and (iii) a standardized robotic surgical task in a live porcine model with performance graded by the Global Evaluative Assessment of Robotic Skills (GEARS) tool. A Kruskal-Wallis test was used to evaluate performance differences between novices and experts (construct validity). Spearman's correlation coefficient (ρ) was used to measure the association of performance across inanimate, simulation and in vivo methods (cross-method validity). Novice and expert surgeons had previously performed a median (range) of 0 (0-20) and 300 (30-2000) robotic cases, respectively (P < 0.001). Construct validity: experts consistently outperformed residents with all three methods (P < 0.001). Cross-method validity: overall performance of inanimate tasks significantly correlated with virtual reality robotic performance (ρ = -0.7, P < 0.001) and in vivo robotic performance based on GEARS (ρ = -0.8, P < 0.0001). Virtual reality performance and in vivo tissue performance were also found to be strongly correlated (ρ = 0.6, P < 0.001). We propose the novel concept of cross-method validity, which may provide a method of evaluating the relative value of various forms of skills education and assessment. We externally confirmed the construct validity of each featured training tool. © 2013 BJU International.
A performance comparison of two small rocket nozzles
NASA Technical Reports Server (NTRS)
Arrington, Lynn A.; Reed, Brian D.; Rivera, Angel, Jr.
1996-01-01
An experimental study was conducted on two small rockets (110 N thrust class) to directly compare a standard conical nozzle with a bell nozzle optimized for maximum thrust using the Rao method. In large rockets, with throat Reynolds numbers of greater than 1 × 10^5, bell nozzles outperform conical nozzles. In rockets with throat Reynolds numbers below 1 × 10^5, however, test results have been ambiguous. An experimental program was conducted to test two small nozzles at two different fuel film cooling percentages and three different chamber pressures. Test results showed that for the throat Reynolds number range from 2 × 10^4 to 4 × 10^4, the bell nozzle outperformed the conical nozzle. Thrust coefficients for the bell nozzle were approximately 4 to 12 percent higher than those obtained with the conical nozzle. As expected, testing showed that lowering the fuel film cooling increased performance for both nozzle types.
[Gender differences in cognitive functions and influence of sex hormones].
Torres, A; Gómez-Gil, E; Vidal, A; Puig, O; Boget, T; Salamero, M
2006-01-01
To review scientific evidence on gender differences in cognitive functions and the influence of sex hormones on cognitive performance. Systematic search of related studies identified in Medline. Women outperform men on verbal fluency, perceptual speed tasks, fine motor skills, verbal memory and verbal learning. Men outperform women on visuospatial ability, mathematical problem solving and visual memory. No gender differences in attention and working memory are found. Researchers distinguish four methods to investigate hormonal influence on cognitive performance: a) patients with hormonal disorders; b) neuroimaging in individuals during hormone administration; c) women during different phases of the menstrual cycle; and d) patients receiving hormonal treatment (idiopathic hypogonadotropic hypogonadism, postmenopausal women and transsexuals). The findings mostly suggest an influence of sex hormones on some cognitive functions, but they are not conclusive because of the limitations and scarcity of the studies. There are gender differences in cognitive functions. Sex hormones seem to influence cognitive performance.
AUC-Maximizing Ensembles through Metalearning.
LeDell, Erin; van der Laan, Mark J; Petersen, Maya
2016-05-01
Area Under the ROC Curve (AUC) is often used to measure the performance of an estimator in binary classification problems. An AUC-maximizing classifier can have significant advantages in cases where ranking correctness is valued or if the outcome is rare. In a Super Learner ensemble, maximization of the AUC can be achieved by the use of an AUC-maximizing metalearning algorithm. We discuss an implementation of an AUC-maximization technique that is formulated as a nonlinear optimization problem. We also evaluate the effectiveness of a large number of different nonlinear optimization algorithms to maximize the cross-validated AUC of the ensemble fit. The results provide evidence that AUC-maximizing metalearners can, and often do, out-perform non-AUC-maximizing metalearning methods, with respect to ensemble AUC. The results also demonstrate that as the level of imbalance in the training data increases, the Super Learner ensemble outperforms the top base algorithm by a larger degree.
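A compact illustration of the metalearning idea, under assumed toy data rather than the Super Learner implementation itself: cross-validated base-learner predictions are combined with simplex-constrained weights chosen by a derivative-free optimizer that directly maximizes AUC.

```python
# Illustrative sketch (assumed setup, not the Super Learner package): choose
# convex weights over base-learner cross-validated predictions by directly
# maximizing the ensemble AUC with a derivative-free optimizer.
import numpy as np
from sklearn.metrics import roc_auc_score
from scipy.optimize import minimize

rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=500)                      # binary outcome
Z = np.column_stack([                                 # fake CV predictions from 3 base learners
    y * 0.6 + rng.normal(0, 0.4, 500),
    y * 0.3 + rng.normal(0, 0.5, 500),
    rng.normal(0, 1, 500),
])

def neg_auc(w):
    w = np.abs(w)
    w = w / w.sum()                                   # project onto the simplex
    return -roc_auc_score(y, Z @ w)

res = minimize(neg_auc, x0=np.ones(Z.shape[1]) / Z.shape[1], method="Nelder-Mead")
w = np.abs(res.x) / np.abs(res.x).sum()
print("ensemble weights:", np.round(w, 3), "| ensemble AUC:", round(-res.fun, 3))
```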
A family of chaotic pure analog coding schemes based on baker's map function
NASA Astrophysics Data System (ADS)
Liu, Yang; Li, Jing; Lu, Xuanxuan; Yuen, Chau; Wu, Jun
2015-12-01
This paper considers a family of pure analog coding schemes constructed from dynamic systems which are governed by chaotic functions: the baker's map function and its variants. Various decoding methods, including maximum likelihood (ML), minimum mean square error (MMSE), and mixed ML-MMSE decoding algorithms, have been developed for these novel encoding schemes. The proposed mirrored baker's and single-input baker's analog codes provide balanced protection against fold errors (large distortion) and weak distortion, and outperform the classical chaotic analog coding and analog joint source-channel coding schemes in the literature. Compared to the conventional digital communication system, where quantization and digital error correction codes are used, the proposed analog coding system has graceful performance evolution, low decoding latency, and no quantization noise. Numerical results show that under the same bandwidth expansion, the proposed analog system outperforms the digital ones over a wide signal-to-noise ratio (SNR) range.
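A rough sketch of the encoding/decoding principle, not the paper's mirrored or single-input codes: one analog sample is expanded into several iterates of a baker's-type (bit-shift) map, and a brute-force ML decoder searches for the source value that best explains the noisy block. The map form, block length, and noise level are assumptions.

```python
# Illustrative analog bandwidth expansion with a baker's-type (bit-shift) map.
import numpy as np

def bakers_map(x):
    return np.mod(2.0 * x, 1.0)            # 1-D bit-shift form of the baker's map

def encode(x0, n_iter):
    """Encode one analog sample x0 in [0, 1) as n_iter chaotic iterates."""
    out = [x0]
    for _ in range(n_iter - 1):
        out.append(bakers_map(out[-1]))
    return np.array(out)

x0 = 0.3217
codeword = encode(x0, n_iter=5)
noisy = codeword + np.random.normal(0, 0.01, codeword.size)   # AWGN channel

# crude ML decoder: grid-search the source value that best explains the noisy block
grid = np.linspace(0, 1, 10001, endpoint=False)
errs = [np.sum((encode(g, 5) - noisy) ** 2) for g in grid]
print("ML estimate of x0:", grid[int(np.argmin(errs))])
```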
New technology in postfire rehab
Joe Sabel
2007-01-01
PAM-12™ is a recycled office paper byproduct made into a spreadable mulch with added Water Soluble Polyacrylamide (WSPAM), a previously difficult polymer to apply. PAM-12 is extremely versatile and can be applied through several methods. In a field test, PAM-12 outperformed straw in every targeted performance area: erosion control, improving soil hydrophobicity, and...
RNA-seq mixology: designing realistic control experiments to compare protocols and analysis methods
Holik, Aliaksei Z.; Law, Charity W.; Liu, Ruijie; Wang, Zeya; Wang, Wenyi; Ahn, Jaeil; Asselin-Labat, Marie-Liesse; Smyth, Gordon K.
2017-01-01
Carefully designed control experiments provide a gold standard for benchmarking different genomics research tools. A shortcoming of many gene expression control studies is that replication involves profiling the same reference RNA sample multiple times. This leads to low, pure technical noise that is atypical of regular studies. To achieve a more realistic noise structure, we generated an RNA-sequencing mixture experiment using two cell lines of the same cancer type. Variability was added by extracting RNA from independent cell cultures and degrading particular samples. The systematic gene expression changes induced by this design allowed benchmarking of different library preparation kits (standard poly-A versus total RNA with Ribozero depletion) and analysis pipelines. Data generated using the total RNA kit had more signal for introns and various RNA classes (ncRNA, snRNA, snoRNA) and less variability after degradation. For differential expression analysis, voom with quality weights marginally outperformed other popular methods, while for differential splicing, DEXSeq was simultaneously the most sensitive and the most inconsistent method. For sample deconvolution analysis, DeMix outperformed IsoPure convincingly. Our RNA-sequencing data set provides a valuable resource for benchmarking different protocols and data pre-processing workflows. The extra noise mimics routine lab experiments more closely, ensuring any conclusions are widely applicable. PMID:27899618
AUC-Maximized Deep Convolutional Neural Fields for Protein Sequence Labeling.
Wang, Sheng; Sun, Siqi; Xu, Jinbo
2016-09-01
Deep Convolutional Neural Networks (DCNN) have shown excellent performance in a variety of machine learning tasks. This paper presents Deep Convolutional Neural Fields (DeepCNF), an integration of DCNN with Conditional Random Field (CRF), for sequence labeling with an imbalanced label distribution. The widely-used training methods, such as maximum-likelihood and maximum labelwise accuracy, do not work well on imbalanced data. To handle this, we present a new training algorithm called maximum-AUC for DeepCNF. That is, we train DeepCNF by directly maximizing the empirical Area Under the ROC Curve (AUC), which is an unbiased measurement for imbalanced data. To fulfill this, we formulate AUC in a pairwise ranking framework, approximate it by a polynomial function and then apply a gradient-based procedure to optimize it. Our experimental results confirm that maximum-AUC greatly outperforms the other two training methods on 8-state secondary structure prediction and disorder prediction, since their label distributions are highly imbalanced, and performs similarly to the other two training methods on solvent accessibility prediction, which has three equally-distributed labels. Furthermore, our experimental results show that our AUC-trained DeepCNF models greatly outperform existing popular predictors of these three tasks. The data and software related to this paper are available at https://github.com/realbigws/DeepCNF_AUC.
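A toy version of the pairwise-ranking view of AUC used for training, not DeepCNF itself: the 0/1 ranking indicator is replaced by a smooth surrogate (a sigmoid here, where the paper uses a polynomial approximation) and a linear scorer is updated by gradient ascent.

```python
# Sketch of AUC maximization in a pairwise ranking framework on synthetic data.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -60, 60)))

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 5))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 1, size=300) > 0).astype(int)
pos, neg = X[y == 1], X[y == 0]

w, lr = np.zeros(X.shape[1]), 0.3
for _ in range(200):
    diffs = (pos @ w)[:, None] - (neg @ w)[None, :]   # score(pos_i) - score(neg_j)
    s = sigmoid(diffs)
    g = s * (1.0 - s)                                 # derivative of the smooth surrogate
    grad = (g.sum(axis=1) @ pos - g.sum(axis=0) @ neg) / g.size
    w += lr * grad                                    # ascend the smoothed AUC

diffs = (pos @ w)[:, None] - (neg @ w)[None, :]
print("empirical AUC (fraction of correctly ranked pairs):", round((diffs > 0).mean(), 3))
```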
An algorithm to track laboratory zebrafish shoals.
Feijó, Gregory de Oliveira; Sangalli, Vicenzo Abichequer; da Silva, Isaac Newton Lima; Pinho, Márcio Sarroglia
2018-05-01
In this paper, a semi-automatic multi-object tracking method to track a group of unmarked zebrafish is proposed. This method can handle partial occlusion cases, maintaining the correct identity of each individual. For every object, we extracted a set of geometric features to be used in the two main stages of the algorithm. The first stage selected the best candidate, based both on the blobs identified in the image and the estimate generated by a Kalman Filter instance. In the second stage, if the same candidate-blob is selected by two or more instances, a blob-partitioning algorithm takes place in order to split this blob and reestablish the instances' identities. If the algorithm cannot determine the identity of a blob, a manual intervention is required. This procedure was compared against a manual labeled ground truth on four video sequences with different numbers of fish and spatial resolution. The performance of the proposed method is then compared against two well-known zebrafish tracking methods found in the literature: one that treats occlusion scenarios and one that only tracks fish that are not in occlusion. Based on the data set used, the proposed method outperforms the first method in correctly separating fish in occlusion, increasing its efficiency by at least 8.15% of the cases. As for the second, the proposed method outperformed it overall in some of the tested videos, especially those with lower image quality, because the second method requires high-spatial-resolution images, which is not a requirement for the proposed method. Yet, the proposed method was able to separate fish involved in occlusion and correctly assign its identity in up to 87.85% of the cases, without accounting for user intervention. Copyright © 2018 Elsevier Ltd. All rights reserved.
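A minimal constant-velocity Kalman filter of the kind each fish instance would maintain to predict its next position and score candidate blobs; the state layout and noise covariances are assumed values, not the authors' settings.

```python
# Constant-velocity Kalman filter predict/update for one tracked fish (illustrative values).
import numpy as np

dt = 1.0                                   # one frame
F = np.array([[1, 0, dt, 0],               # state: [x, y, vx, vy]
              [0, 1, 0, dt],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], dtype=float)
H = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0]], dtype=float)  # only the blob centroid is observed
Q = np.eye(4) * 1e-2                       # process noise (assumed)
R = np.eye(2) * 4.0                        # measurement noise in pixels^2 (assumed)

def predict(x, P):
    return F @ x, F @ P @ F.T + Q

def update(x, P, z):
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)         # Kalman gain
    x = x + K @ (z - H @ x)
    P = (np.eye(4) - K @ H) @ P
    return x, P

x, P = np.array([10.0, 20.0, 1.0, 0.5]), np.eye(4)
x, P = predict(x, P)                       # predicted position used to pick the best blob
x, P = update(x, P, z=np.array([11.2, 20.4]))   # centroid of the selected blob
print(x)
```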
Intermediate Cognitive Phenotypes in Bipolar Disorder
Langenecker, Scott A.; Saunders, Erika F.H.; Kade, Allison M.; Ransom, Michael T.; McInnis, Melvin G.
2013-01-01
Background Intermediate cognitive phenotypes (ICPs) are measurable and quantifiable states that may be objectively assessed using a standardized method, and can be integrated into association studies, including genetic, biochemical, clinical, and imaging based correlates. The present study used neuropsychological measures as ICPs, with factor scores in executive functioning, attention, memory, fine motor function, and emotion processing, similar to prior work in schizophrenia. Methods Healthy control subjects (HC, n=34) and euthymic (E, n=66), depressed (D, n=43), or hypomanic/mixed (HM, n=13) patients with bipolar disorder (BD) were assessed with neuropsychological tests. These were from eight domains consistent with previous literature: auditory memory, visual memory, processing speed with interference resolution, verbal fluency and processing speed, conceptual reasoning and set-shifting, inhibitory control, emotion processing, and fine motor dexterity. Results Of the eight factor scores, the HC group outperformed the E group in three (Processing Speed with Interference Resolution, Visual Memory, Fine Motor Dexterity), the D group in seven (all except Inhibitory Control), and the HM group in four (Inhibitory Control, Processing Speed with Interference Resolution, Fine Motor Dexterity, and Auditory Memory). Limitations The HM group was relatively small, thus effects of this phase of illness may have been underestimated. Effects of medication could not be fully controlled without a randomized, double-blind, placebo-controlled study. Conclusions Use of the factor scores can assist in determining ICPs for BD and related disorders, and may provide more specific targets for development of new treatments. We highlight strong ICPs (Processing Speed with Interference Resolution, Visual Memory, Fine Motor Dexterity) for further study, consistent with the existing literature. PMID:19800130
Ambers, Angie; Wiley, Rachel; Novroski, Nicole; Budowle, Bruce
2018-01-01
Previous studies have shown that nylon flocked swabs outperform traditional fiber swabs in DNA recovery due to their innovative design and lack of internal absorbent core to entrap cellular materials. The microFLOQ® Direct swab, a miniaturized version of the 4N6 FLOQSwab®, has a small swab head that is treated with a lysing agent which allows for direct amplification and DNA profiling from sample collection to final result in less than two hours. Additionally, the microFLOQ® system subsamples only a minute portion of a stain and preserves the vast majority of the sample for subsequent testing or re-analysis, if desired. The efficacy of direct amplification of DNA from dilute bloodstains, saliva stains, and touch samples was evaluated using microFLOQ® Direct swabs and the GlobalFiler™ Express system. Comparisons were made to traditional methods to assess the robustness of this alternate workflow. Controlled studies with 1:19 and 1:99 dilutions of bloodstains and saliva stains consistently yielded higher STR peak heights than standard methods with 1 ng input DNA from the same samples. Touch samples from common items yielded single source and mixed profiles that were consistent with primary users of the objects. With this novel methodology/workflow, no sample loss occurs and therefore more template DNA is available during amplification. This approach may have important implications for analysis of low quantity and/or degraded samples that plague forensic casework. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.
SpeCond: a method to detect condition-specific gene expression
2011-01-01
Transcriptomic studies routinely measure expression levels across numerous conditions. These datasets allow identification of genes that are specifically expressed in a small number of conditions. However, there are currently no statistically robust methods for identifying such genes. Here we present SpeCond, a method to detect condition-specific genes that outperforms alternative approaches. We apply the method to a dataset of 32 human tissues to determine 2,673 specifically expressed genes. An implementation of SpeCond is freely available as a Bioconductor package at http://www.bioconductor.org/packages/release/bioc/html/SpeCond.html. PMID:22008066
An image retrieval framework for real-time endoscopic image retargeting.
Ye, Menglong; Johns, Edward; Walter, Benjamin; Meining, Alexander; Yang, Guang-Zhong
2017-08-01
Serial endoscopic examinations of a patient are important for early diagnosis of malignancies in the gastrointestinal tract. However, retargeting for optical biopsy is challenging due to extensive tissue variations between examinations, requiring the method to be tolerant to these changes whilst enabling real-time retargeting. This work presents an image retrieval framework for inter-examination retargeting. We propose both a novel image descriptor tolerant of long-term tissue changes and a novel descriptor matching method in real time. The descriptor is based on histograms generated from regional intensity comparisons over multiple scales, offering stability over long-term appearance changes at the higher levels, whilst remaining discriminative at the lower levels. The matching method then learns a hashing function using random forests, to compress the string and allow for fast image comparison by a simple Hamming distance metric. A dataset that contains 13 in vivo gastrointestinal videos was collected from six patients, representing serial examinations of each patient, which includes videos captured with significant time intervals. Precision-recall for retargeting shows that our new descriptor outperforms a number of alternative descriptors, whilst our hashing method outperforms a number of alternative hashing approaches. We have proposed a novel framework for optical biopsy in serial endoscopic examinations. A new descriptor, combined with a novel hashing method, achieves state-of-the-art retargeting, with validation on in vivo videos from six patients. Real-time performance also allows for practical integration without disturbing the existing clinical workflow.
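A sketch of the retrieval step only: once descriptors are hashed to short binary codes (random hyperplanes stand in here for the paper's forest-learned hashing function), frames can be compared with a cheap Hamming distance. Dimensions and code length are assumptions.

```python
# Binary-code retrieval with a Hamming distance metric (random-projection hash
# used as a stand-in for the learned, forest-based hashing).
import numpy as np

rng = np.random.default_rng(0)
n_bits, dim = 64, 256
planes = rng.normal(size=(n_bits, dim))                  # assumed hashing hyperplanes

def hash_code(descriptor):
    return (planes @ descriptor > 0).astype(np.uint8)

def hamming(a, b):
    return int(np.count_nonzero(a != b))

database = [rng.normal(size=dim) for _ in range(1000)]   # stored examination frames
codes = [hash_code(d) for d in database]

query = database[42] + rng.normal(0, 0.05, dim)          # slightly changed re-observation
q = hash_code(query)
best = min(range(len(codes)), key=lambda i: hamming(q, codes[i]))
print("retargeted frame index:", best)
```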
Iso-geometric analysis for neutron diffusion problems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hall, S. K.; Eaton, M. D.; Williams, M. M. R.
Iso-geometric analysis can be viewed as a generalisation of the finite element method. It permits the exact representation of a wider range of geometries including conic sections. This is possible due to the use of concepts employed in computer-aided design. The underlying mathematical representations from computer-aided design are used to capture both the geometry and approximate the solution. In this paper the neutron diffusion equation is solved using iso-geometric analysis. The practical advantages are highlighted by looking at the problem of a circular fuel pin in a square moderator. For this problem the finite element method requires the geometry to be approximated. This leads to errors in the shape and size of the interface between the fuel and the moderator. In contrast, iso-geometric analysis allows the interface to be represented exactly. It is found that, due to a cancellation of errors, the finite element method converges more quickly than iso-geometric analysis for this problem. A fuel pin in a vacuum was then considered as this problem is highly sensitive to the leakage across the interface. In this case iso-geometric analysis greatly outperforms the finite element method. Due to the improvement in the representation of the geometry, iso-geometric analysis can outperform traditional finite element methods. It is proposed that the use of iso-geometric analysis on neutron transport problems will allow deterministic solutions to be obtained for exact geometries, something that is currently only possible with Monte Carlo techniques.
Li, You; Heavican, Tayla B.; Vellichirammal, Neetha N.; Iqbal, Javeed
2017-01-01
The RNA-Seq technology has revolutionized transcriptome characterization not only by accurately quantifying gene expression, but also by identifying novel transcripts such as chimeric fusion transcripts. The 'fusion' or 'chimeric' transcripts have improved the diagnosis and prognosis of several tumors, and have led to the development of novel therapeutic regimens. Fusion transcript detection is currently accomplished by several software packages, primarily relying on sequence alignment algorithms. The alignment of sequencing reads from fusion transcript loci in cancer genomes can be highly challenging due to the incorrect mapping induced by genomic alterations, thereby limiting the performance of alignment-based fusion transcript detection methods. Here, we developed a novel alignment-free method, ChimeRScope, which accurately predicts fusion transcripts based on the gene fingerprint (k-mer) profiles of the RNA-Seq paired-end reads. Results on published datasets and in-house cancer cell line datasets followed by experimental validations demonstrate that ChimeRScope consistently outperforms other popular methods irrespective of the read lengths and sequencing depth. More importantly, results on our in-house datasets show that ChimeRScope is a better tool that is capable of identifying novel fusion transcripts with potential oncogenic functions. ChimeRScope is accessible as standalone software at (https://github.com/ChimeRScope/ChimeRScope/wiki) or via the Galaxy web-interface at (https://galaxy.unmc.edu/). PMID:28472320
Segmenting Continuous Motions with Hidden Semi-markov Models and Gaussian Processes
Nakamura, Tomoaki; Nagai, Takayuki; Mochihashi, Daichi; Kobayashi, Ichiro; Asoh, Hideki; Kaneko, Masahide
2017-01-01
Humans divide perceived continuous information into segments to facilitate recognition. For example, humans can segment speech waves into recognizable morphemes. Analogously, continuous motions are segmented into recognizable unit actions. People can divide continuous information into segments without using explicit segment points. This capacity for unsupervised segmentation is also useful for robots, because it enables them to flexibly learn languages, gestures, and actions. In this paper, we propose a Gaussian process-hidden semi-Markov model (GP-HSMM) that can divide continuous time series data into segments in an unsupervised manner. Our proposed method consists of a generative model based on the hidden semi-Markov model (HSMM), the emission distributions of which are Gaussian processes (GPs). Continuous time series data is generated by connecting segments generated by the GP. Segmentation can be achieved by using forward filtering-backward sampling to estimate the model's parameters, including the lengths and classes of the segments. In an experiment using the CMU motion capture dataset, we tested GP-HSMM with motion capture data containing simple exercise motions; the results of this experiment showed that the proposed GP-HSMM was comparable with other methods. We also conducted an experiment using karate motion capture data, which is more complex than exercise motion capture data; in this experiment, the segmentation accuracy of GP-HSMM was 0.92, which outperformed other methods. PMID:29311889
Relation extraction for biological pathway construction using node2vec.
Kim, Munui; Baek, Seung Han; Song, Min
2018-06-13
Systems biology is an important field for understanding whole biological mechanisms composed of interactions between biological components. One approach for understanding complex and diverse mechanisms is to analyze biological pathways. However, because these pathways consist of important interactions and information on these interactions is disseminated in a large number of biomedical reports, text-mining techniques are essential for extracting these relationships automatically. In this study, we applied node2vec, an algorithmic framework for feature learning in networks, for relationship extraction. To this end, we extracted genes from paper abstracts using pkde4j, a text-mining tool for detecting entities and relationships. Using the extracted genes, a co-occurrence network was constructed and node2vec was used with the network to generate a latent representation. To demonstrate the efficacy of node2vec in extracting relationships between genes, performance was evaluated for gene-gene interactions involved in a type 2 diabetes pathway. Moreover, we compared the results of node2vec to those of baseline methods such as co-occurrence and DeepWalk. Node2vec outperformed existing methods in detecting relationships in the type 2 diabetes pathway, demonstrating that this method is appropriate for capturing the relatedness between pairs of biological entities involved in biological pathways. The results demonstrated that node2vec is useful for automatic pathway construction.
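An illustrative end-to-end toy pipeline under assumed data (not the authors' pkde4j output): genes co-occurring in an abstract are linked, node sequences are produced by random walks, and the nodes are embedded with gensim's Word2Vec (gensim 4.x API assumed). Unbiased walks give DeepWalk; node2vec additionally biases the walks with its return and in-out parameters p and q.

```python
# Toy co-occurrence network + random-walk embedding for gene-gene relatedness.
import random
import networkx as nx
from gensim.models import Word2Vec

abstracts = [["INS", "INSR", "IRS1"], ["IRS1", "PIK3CA", "AKT1"], ["AKT1", "GLUT4"]]
G = nx.Graph()
for genes in abstracts:                        # co-occurrence within an abstract = edge
    for i in range(len(genes)):
        for j in range(i + 1, len(genes)):
            G.add_edge(genes[i], genes[j])

def random_walk(g, start, length=10):
    walk = [start]
    while len(walk) < length:
        walk.append(random.choice(list(g.neighbors(walk[-1]))))
    return walk

walks = [random_walk(G, n) for n in G.nodes() for _ in range(20)]
model = Word2Vec(walks, vector_size=32, window=3, min_count=1, sg=1)

# relatedness of a candidate gene pair = cosine similarity of their embeddings
print(model.wv.similarity("INS", "AKT1"))
```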
NASA Astrophysics Data System (ADS)
Fei, Baowei; Lu, Guolan; Wang, Xu; Zhang, Hongzheng; Little, James V.; Magliocca, Kelly R.; Chen, Amy Y.
2017-02-01
We are developing label-free hyperspectral imaging (HSI) for tumor margin assessment. The HSI data hypercube (x, y, λ) consists of a series of high-resolution images of the same field of view that are acquired at different wavelengths. Every pixel on the HSI image has an optical spectrum. We developed preprocessing and classification methods for HSI data. We used spectral features from HSI data for the classification of cancer and benign tissue. We collected surgical tissue specimens from 16 human patients who underwent head and neck (H&N) cancer surgery. We acquired HSI, autofluorescence images, and fluorescence images with 2-NBDG and proflavine from the specimens. Digitized histologic slides were examined by an H&N pathologist. The hyperspectral imaging and classification method was able to distinguish between cancer and normal tissue from the oral cavity with an average accuracy of 90 ± 8%, sensitivity of 89 ± 9%, and specificity of 91 ± 6%. For tissue specimens from the thyroid, the method achieved an average accuracy of 94 ± 6%, sensitivity of 94 ± 6%, and specificity of 95 ± 6%. Hyperspectral imaging outperformed autofluorescence imaging or fluorescence imaging with vital dye (2-NBDG or proflavine). This study suggests that label-free hyperspectral imaging has great potential for tumor margin assessment in surgical tissue specimens of H&N cancer patients. Further development of the hyperspectral imaging technology is warranted for its application in image-guided surgery.
Normalization of T2W-MRI prostate images using Rician a priori
NASA Astrophysics Data System (ADS)
Lemaître, Guillaume; Rastgoo, Mojdeh; Massich, Joan; Vilanova, Joan C.; Walker, Paul M.; Freixenet, Jordi; Meyer-Baese, Anke; Mériaudeau, Fabrice; Martí, Robert
2016-03-01
Prostate cancer is reported to be the second most frequently diagnosed cancer of men in the world. In practice, diagnosis can be affected by multiple factors, which reduces the chance of detecting potential lesions. In the last decades, new imaging techniques mainly based on MRI have been developed in conjunction with Computer-Aided Diagnosis (CAD) systems to help radiologists with such diagnosis. CAD systems are usually designed as a sequential process consisting of four stages: pre-processing, segmentation, registration and classification. As a pre-processing step, image normalization is a critical and important part of the chain in order to design a robust classifier and overcome inter-patient intensity variations. However, little attention has been dedicated to the normalization of T2W-Magnetic Resonance Imaging (MRI) prostate images. In this paper, we propose two methods to normalize T2W-MRI prostate images: (i) based on a Rician a priori and (ii) based on a Square-Root Slope Function (SRSF) representation which does not make any assumption regarding the Probability Density Function (PDF) of the data. A comparison with the state-of-the-art methods is also provided. The normalization of the data is assessed by comparing the alignment of the patient PDFs in both qualitative and quantitative manners. In both evaluations, the normalization using the Rician a priori outperforms the other state-of-the-art methods.
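A simplified sketch of the Rician-prior idea: fit a Rician distribution to one patient's T2W intensities and map the intensities through the fitted CDF so that patients share a common scale. This follows the spirit of method (i) but is not necessarily the authors' exact transform; the scipy parameterization and synthetic data are assumptions.

```python
# Fit a Rician to one patient's intensities and normalize via the fitted CDF.
import numpy as np
from scipy import stats

# synthetic stand-in for one patient's prostate T2W intensities
intensities = stats.rice.rvs(b=2.5, loc=0, scale=120.0, size=20000, random_state=0)

b, loc, scale = stats.rice.fit(intensities, floc=0)                 # fix loc at 0 for stability
normalized = stats.rice.cdf(intensities, b, loc=loc, scale=scale)   # mapped to [0, 1]

print("fitted b, scale:", round(b, 3), round(scale, 1),
      "| normalized range:", round(normalized.min(), 3), "-", round(normalized.max(), 3))
```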
Precipitation phase separation schemes in the Naqu River basin, eastern Tibetan plateau
NASA Astrophysics Data System (ADS)
Liu, Shaohua; Yan, Denghua; Qin, Tianling; Weng, Baisha; Lu, Yajing; Dong, Guoqiang; Gong, Boya
2018-01-01
Precipitation phase has a profound influence on the hydrological processes in the Naqu River basin, eastern Tibetan plateau. However, there are only six meteorological stations with precipitation phase observations (rainfall/snowfall/sleet) before 1979 within and around the basin. In order to separate snowfall from precipitation, a new separation scheme with an S-shaped curve of snowfall proportion as an exponential function of daily mean temperature was developed. The determinations of critical temperatures in the single/two temperature threshold (STT/TTT2) methods were explored accordingly; the temperature corresponding to the 50 % snowfall proportion (SP50 temperature) is an efficient critical temperature for the STT, and the two critical temperatures in TTT2 can be determined based on the exponential function and the SP50 temperature. Then, different separation schemes were evaluated in separating snowfall from precipitation in the Naqu River basin. The results show that the S-shaped curve methods outperform other separation schemes. Although the STT and TTT2 slightly underestimate and overestimate the snowfall when the temperature is above and below the SP50 temperature, respectively, the monthly and annual separation snowfalls are generally consistent with the observed snowfalls. On the whole, the S-shaped curve methods, STT, and TTT2 perform well in separating snowfall from precipitation, with the Pearson correlation coefficient of annual separation snowfall above 0.8, and provide possible approaches to separate snowfall from precipitation for hydrological modelling.
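A hedged stand-in for the S-shaped separation scheme, assuming a logistic form for the snowfall proportion as a function of daily mean temperature; T50 plays the role of the SP50 temperature, and the parameter values are illustrative, not the calibrated ones. The single-threshold (STT) rule is recovered by treating all precipitation as snow whenever the fraction exceeds 0.5, i.e. below T50.

```python
# Assumed logistic S-curve: f(T) = 1 / (1 + exp(k * (T - T50))).
import numpy as np

def snow_fraction(t_mean, t50=1.0, k=1.5):
    return 1.0 / (1.0 + np.exp(k * (t_mean - t50)))

def partition(precip_mm, t_mean, t50=1.0, k=1.5):
    f = snow_fraction(t_mean, t50, k)
    return f * precip_mm, (1.0 - f) * precip_mm      # (snowfall, rainfall)

t = np.array([-5.0, 0.0, 1.0, 3.0, 8.0])             # daily mean temperature, degrees C
p = np.array([4.0, 2.0, 6.0, 3.0, 10.0])             # daily precipitation, mm
snow, rain = partition(p, t)
print(np.round(snow, 2), np.round(rain, 2))
```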
Chen, Wansu; Shi, Jiaxiao; Qian, Lei; Azen, Stanley P
2014-06-26
To estimate relative risks or risk ratios for common binary outcomes, the most popular model-based methods are the robust (also known as modified) Poisson and the log-binomial regression. Of the two methods, it is believed that the log-binomial regression yields more efficient estimators because it is maximum likelihood based, while the robust Poisson model may be less affected by outliers. Evidence to support the robustness of robust Poisson models in comparison with log-binomial models is very limited. In this study a simulation was conducted to evaluate the performance of the two methods in several scenarios where outliers existed. The findings indicate that for data coming from a population where the relationship between the outcome and the covariate was in a simple form (e.g. log-linear), the two models yielded comparable biases and mean square errors. However, if the true relationship contained a higher order term, the robust Poisson models consistently outperformed the log-binomial models even when the level of contamination is low. The robust Poisson models are more robust (or less sensitive) to outliers compared to the log-binomial models when estimating relative risks or risk ratios for common binary outcomes. Users should be aware of the limitations when choosing appropriate models to estimate relative risks or risk ratios.
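The two estimators being compared can be written down in a few lines; the sketch below uses toy data and assumes the statsmodels >= 0.14 API (capitalized Log link class).

```python
# Robust ("modified") Poisson versus log-binomial regression for a relative risk.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.normal(size=2000)
p = np.clip(np.exp(-1.5 + 0.4 * x), 0, 0.95)     # true log-linear risk model
y = rng.binomial(1, p)
X = sm.add_constant(x)

# robust Poisson: Poisson GLM with sandwich (HC0) standard errors
rr_poisson = sm.GLM(y, X, family=sm.families.Poisson()).fit(cov_type="HC0")

# log-binomial: binomial GLM with a log link (may fail to converge in practice)
rr_logbin = sm.GLM(y, X, family=sm.families.Binomial(link=sm.families.links.Log())).fit()

print("RR per unit x, robust Poisson :", round(np.exp(rr_poisson.params[1]), 3))
print("RR per unit x, log-binomial   :", round(np.exp(rr_logbin.params[1]), 3))
```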
NASA Astrophysics Data System (ADS)
Sun, Biao; Zhao, Wenfeng; Zhu, Xinshan
2017-06-01
Objective. Data compression is crucial for resource-constrained wireless neural recording applications with limited data bandwidth, and compressed sensing (CS) theory has successfully demonstrated its potential in neural recording applications. In this paper, an analytical, training-free CS recovery method, termed group weighted analysis ℓ1-minimization (GWALM), is proposed for wireless neural recording. Approach. The GWALM method consists of three parts: (1) the analysis model is adopted to enforce sparsity of the neural signals, therefore overcoming the drawbacks of conventional synthesis models and enhancing the recovery performance. (2) A multi-fractional-order difference matrix is constructed as the analysis operator, thus avoiding the dictionary learning procedure and reducing the need for previously acquired data and computational complexities. (3) By exploiting the statistical properties of the analysis coefficients, a group weighting approach is developed to enhance the performance of analysis ℓ1-minimization. Main results. Experimental results on synthetic and real datasets reveal that the proposed approach outperforms state-of-the-art CS-based methods in terms of both spike recovery quality and classification accuracy. Significance. Energy and area efficiency of the GWALM make it an ideal candidate for resource-constrained, large scale wireless neural recording applications. The training-free feature of the GWALM further improves its robustness to spike shape variation, thus making it more practical for long term wireless neural recording.
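A bare-bones analysis ℓ1 recovery problem showing the structure of the optimization, written with CVXPY; it uses an unweighted first-order difference operator and a generic noise bound, whereas GWALM uses a multi-fractional-order difference operator and group weights. All sizes and the test waveform are assumptions.

```python
# Analysis-model l1 recovery: minimize ||D x||_1 subject to ||A x - y||_2 <= eps.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
n, m = 256, 96                                  # signal length, number of CS measurements
t = np.arange(n)
signal = np.exp(-0.5 * ((t - 120) / 6.0) ** 2)  # a smooth spike-like waveform

A = rng.normal(size=(m, n)) / np.sqrt(m)        # random sensing matrix
y = A @ signal + 0.01 * rng.normal(size=m)

D = np.eye(n) - np.eye(n, k=1)                  # first-order difference analysis operator
x = cp.Variable(n)
problem = cp.Problem(cp.Minimize(cp.norm1(D @ x)),
                     [cp.norm(A @ x - y, 2) <= 0.02 * np.sqrt(m)])
problem.solve()

print("relative reconstruction error:",
      np.linalg.norm(x.value - signal) / np.linalg.norm(signal))
```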
Zou, Zhengxia; Shi, Zhenwei
2018-03-01
We propose a new paradigm for target detection in high resolution aerial remote sensing images under small target priors. Previous remote sensing target detection methods frame the detection as learning of a detection model plus inference of class labels and bounding-box coordinates. Instead, we formulate it from a Bayesian view: at the inference stage, the detection model is adaptively updated to maximize its posterior, which is determined by both training and observation. We call this paradigm "random access memories (RAM)." In this paradigm, "memories" can be interpreted as any model distribution learned from training data and "random access" means accessing memories and randomly adjusting the model at the detection phase to obtain better adaptivity to any unseen distribution of test data. By leveraging some of the latest detection techniques, e.g., deep Convolutional Neural Networks and multi-scale anchors, experimental results on a public remote sensing target detection data set show that our method outperforms several other state-of-the-art methods. We also introduce a new data set, "LEarning, VIsion and Remote sensing laboratory (LEVIR)", which is one order of magnitude larger than other data sets of this field. LEVIR consists of a large set of Google Earth images, with over 22,000 images and 10,000 independently labeled targets. RAM gives a noticeable upgrade in accuracy (a mean average precision improvement of 1%-4%) over our baseline detectors with acceptable computational overhead.
Comparative modeling without implicit sequence alignments.
Kolinski, Andrzej; Gront, Dominik
2007-10-01
The number of known protein sequences is about thousand times larger than the number of experimentally solved 3D structures. For more than half of the protein sequences a close or distant structural analog could be identified. The key starting point in a classical comparative modeling is to generate the best possible sequence alignment with a template or templates. With decreasing sequence similarity, the number of errors in the alignments increases and these errors are the main causes of the decreasing accuracy of the molecular models generated. Here we propose a new approach to comparative modeling, which does not require the implicit alignment - the model building phase explores geometric, evolutionary and physical properties of a template (or templates). The proposed method requires prior identification of a template, although the initial sequence alignment is ignored. The model is built using a very efficient reduced representation search engine CABS to find the best possible superposition of the query protein onto the template represented as a 3D multi-featured scaffold. The criteria used include: sequence similarity, predicted secondary structure consistency, local geometric features and hydrophobicity profile. For more difficult cases, the new method qualitatively outperforms existing schemes of comparative modeling. The algorithm unifies de novo modeling, 3D threading and sequence-based methods. The main idea is general and could be easily combined with other efficient modeling tools as Rosetta, UNRES and others.
Adaptive Spot Detection With Optimal Scale Selection in Fluorescence Microscopy Images.
Basset, Antoine; Boulanger, Jérôme; Salamero, Jean; Bouthemy, Patrick; Kervrann, Charles
2015-11-01
Accurately detecting subcellular particles in fluorescence microscopy is of primary interest for further quantitative analysis such as counting, tracking, or classification. Our primary goal is to segment vesicles likely to share nearly the same size in fluorescence microscopy images. Our method termed adaptive thresholding of Laplacian of Gaussian (LoG) images with autoselected scale (ATLAS) automatically selects the optimal scale corresponding to the most frequent spot size in the image. Four criteria are proposed and compared to determine the optimal scale in a scale-space framework. Then, the segmentation stage amounts to thresholding the LoG of the intensity image. In contrast to other methods, the threshold is locally adapted given a probability of false alarm (PFA) specified by the user for the whole set of images to be processed. The local threshold is automatically derived from the PFA value and local image statistics estimated in a window whose size is not a critical parameter. We also propose a new data set for benchmarking, consisting of six collections of one hundred images each, which exploits backgrounds extracted from real microscopy images. We have carried out an extensive comparative evaluation on several data sets with ground-truth, which demonstrates that ATLAS outperforms existing methods. ATLAS does not need any fine parameter tuning and requires very low computation time. Convincing results are also reported on real total internal reflection fluorescence microscopy images.
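A condensed sketch of the ATLAS idea rather than the authors' implementation: pick the scale with the strongest normalized LoG response (one possible criterion among the four compared), then threshold the LoG image. A global quantile stands in for the locally adaptive, PFA-derived threshold.

```python
# Multi-scale LoG spot detection with automatic scale selection (simplified).
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(0)
image = rng.normal(0, 1, (256, 256))
for cy, cx in rng.integers(10, 246, size=(40, 2)):      # synthetic spots of radius ~3 px
    yy, xx = np.ogrid[:256, :256]
    image += 5.0 * np.exp(-((yy - cy) ** 2 + (xx - cx) ** 2) / (2 * 3.0 ** 2))

scales = np.linspace(1.0, 6.0, 11)
responses = [-(s ** 2) * ndimage.gaussian_laplace(image, s) for s in scales]  # bright spots -> positive
best = int(np.argmax([r.max() for r in responses]))      # scale-selection criterion (one of several possible)
log_best = responses[best]

mask = log_best > np.quantile(log_best, 0.995)           # stand-in for the PFA-derived local threshold
labels, n_spots = ndimage.label(mask)
print("selected sigma:", scales[best], "| spots detected:", n_spots)
```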
Mining the archives: a cross-platform analysis of gene ...
Formalin-fixed paraffin-embedded (FFPE) tissue samples represent a potentially invaluable resource for genomic research into the molecular basis of disease. However, use of FFPE samples in gene expression studies has been limited by technical challenges resulting from degradation of nucleic acids. Here we evaluated gene expression profiles derived from fresh-frozen (FRO) and FFPE mouse liver tissues using two DNA microarray protocols and two whole transcriptome sequencing (RNA-seq) library preparation methodologies. The ribo-depletion protocol outperformed the other three methods by having the highest correlations of differentially expressed genes (DEGs) and best overlap of pathways between FRO and FFPE groups. We next tested the effect of sample time in formalin (18 hours or 3 weeks) on gene expression profiles. Hierarchical clustering of the datasets indicated that test article treatment, and not preservation method, was the main driver of gene expression profiles. Meta- and pathway analyses indicated that biological responses were generally consistent for 18-hour and 3-week FFPE samples compared to FRO samples. However, clear erosion of signal intensity with time in formalin was evident, and DEG numbers differed by platform and preservation method. Lastly, we investigated the effect of age in FFPE block on genomic profiles. RNA-seq analysis of 8-, 19-, and 26-year-old control blocks using the ribo-depletion protocol resulted in comparable quality metrics, inc
Pressman, Alice R; Avins, Andrew L; Hubbard, Alan; Satariano, William A
2011-07-01
There is a paucity of literature comparing Bayesian analytic techniques with traditional approaches for analyzing clinical trials using real trial data. We compared Bayesian and frequentist group sequential methods using data from two published clinical trials. We chose two widely accepted frequentist rules, O'Brien-Fleming and Lan-DeMets, and conjugate Bayesian priors. Using the nonparametric bootstrap, we estimated a sampling distribution of stopping times for each method. Because current practice dictates the preservation of an experiment-wise false positive rate (Type I error), we approximated these error rates for our Bayesian and frequentist analyses with the posterior probability of detecting an effect in a simulated null sample. Thus for the data-generated distribution represented by these trials, we were able to compare the relative performance of these techniques. No final outcomes differed from those of the original trials. However, the timing of trial termination differed substantially by method and varied by trial. For one trial, group sequential designs of either type dictated early stopping of the study. In the other, stopping times were dependent upon the choice of spending function and prior distribution. Results indicate that trialists ought to consider Bayesian methods in addition to traditional approaches for analysis of clinical trials. Though findings from this small sample did not demonstrate either method to consistently outperform the other, they did suggest the need to replicate these comparisons using data from varied clinical trials in order to determine the conditions under which the different methods would be most efficient. Copyright © 2011 Elsevier Inc. All rights reserved.
Lee, Jinwoo; Chung, Koohong; Kang, Seungmo
2016-12-01
Two different methods for addressing the regression to the mean (RTM) phenomenon, the EB and CRP methods, were evaluated using empirical data from 110 miles of freeway located in California. CRP outperformed the EB method in estimating collision frequencies at selected high collision concentration locations (HCCLs). Findings indicate that the performance of the EB method can be markedly affected when the SPF is biased, while the performance of CRP remains much less affected. The CRP method was more effective in addressing RTM. Published by Elsevier Ltd.
Exponential integrators in time-dependent density-functional calculations
NASA Astrophysics Data System (ADS)
Kidd, Daniel; Covington, Cody; Varga, Kálmán
2017-12-01
The integrating factor and exponential time differencing methods are implemented and tested for solving the time-dependent Kohn-Sham equations. Popular time propagation methods used in physics, as well as other robust numerical approaches, are compared to these exponential integrator methods in order to judge the relative merit of the computational schemes. We determine an improvement in accuracy of multiple orders of magnitude when describing dynamics driven primarily by a nonlinear potential. For cases of dynamics driven by a time-dependent external potential, the accuracy of the exponential integrator methods is less enhanced, but they still match or outperform the best of the conventional methods tested.
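A toy illustration of exponential time propagation for a small static model Hamiltonian, standing in for the time-dependent Kohn-Sham orbitals (atomic units assumed). For a static linear operator the matrix exponential gives the exact one-step propagator, which is the property the integrating-factor and ETD schemes exploit for the linear part.

```python
# Exponential (matrix-exponential) propagation of a wavefunction on a 1-D grid.
import numpy as np
from scipy.linalg import expm

n, dt = 64, 0.01
x = np.linspace(-5, 5, n)
dx = x[1] - x[0]

# finite-difference kinetic energy plus a harmonic potential
T = (-0.5 / dx**2) * (np.diag(np.full(n - 1, 1.0), 1)
                      + np.diag(np.full(n - 1, 1.0), -1)
                      - 2.0 * np.eye(n))
H = T + np.diag(0.5 * x**2)

U = expm(-1j * H * dt)                   # exact one-step propagator for a static H
psi = np.exp(-0.5 * (x - 1.0) ** 2).astype(complex)
psi /= np.linalg.norm(psi)

for _ in range(1000):                    # norm is conserved to machine precision
    psi = U @ psi
print("norm after 1000 steps:", np.linalg.norm(psi))
```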
Crichton, Gamal; Guo, Yufan; Pyysalo, Sampo; Korhonen, Anna
2018-05-21
Link prediction in biomedical graphs has several important applications including predicting Drug-Target Interactions (DTI), Protein-Protein Interaction (PPI) prediction and Literature-Based Discovery (LBD). It can be done using a classifier to output the probability of link formation between nodes. Recently several works have used neural networks to create node representations which allow rich inputs to neural classifiers. Preliminary work has been done on this and reports promising results. However, these studies did not use realistic settings like time-slicing, evaluate performance with comprehensive metrics or explain when or why neural network methods outperform. We investigated how inputs from four node representation algorithms affect the performance of a neural link predictor on random- and time-sliced biomedical graphs of real-world sizes (∼ 6 million edges) containing information relevant to DTI, PPI and LBD. We compared the performance of the neural link predictor to those of established baselines and report performance across five metrics. In random- and time-sliced experiments, when the neural network methods were able to learn good node representations and there was a negligible number of disconnected nodes, those approaches outperformed the baselines. In the smallest graph (∼ 15,000 edges) and in larger graphs with approximately 14% disconnected nodes, baselines such as Common Neighbours proved a justifiable choice for link prediction. At low recall levels (∼ 0.3) the approaches were mostly equal, but at higher recall levels across all nodes and for average performance at individual nodes, neural network approaches were superior. Analysis showed that neural network methods performed well on links between nodes with no previous common neighbours; potentially the most interesting links. Additionally, while neural network methods benefit from large amounts of data, they require considerable amounts of computational resources to utilise them. Our results indicate that when there is enough data for the neural network methods to use and there is a negligible number of disconnected nodes, those approaches outperform the baselines. At low recall levels the approaches are mostly equal, but at higher recall levels and for average performance at individual nodes, neural network approaches are superior. Performance at nodes without common neighbours, which indicate more unexpected and perhaps more useful links, accounts for this.
Model-based sensor-less wavefront aberration correction in optical coherence tomography.
Verstraete, Hans R G W; Wahls, Sander; Kalkman, Jeroen; Verhaegen, Michel
2015-12-15
Several sensor-less wavefront aberration correction methods that correct nonlinear wavefront aberrations by maximizing the optical coherence tomography (OCT) signal are tested on an OCT setup. A conventional coordinate search method is compared to two model-based optimization methods. The first model-based method takes advantage of the well-known optimization algorithm (NEWUOA) and utilizes a quadratic model. The second model-based method (DONE) is new and utilizes a random multidimensional Fourier-basis expansion. The model-based algorithms achieve lower wavefront errors with up to ten times fewer measurements. Furthermore, the newly proposed DONE method outperforms the NEWUOA method significantly. The DONE algorithm is tested on OCT images and shows a significantly improved image quality.
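The essence of model-based sensorless correction in a few lines, not NEWUOA or DONE themselves: probe one aberration mode at a few amplitudes, fit a quadratic model to the image-quality metric, and jump to the model's optimum. The metric function is a made-up stand-in for the OCT signal strength.

```python
# One model-based correction step for a single aberration mode (illustrative only).
import numpy as np

true_opt = 0.7                                     # unknown optimal mode amplitude

def metric(a):
    # locally quadratic image-quality metric with measurement noise (assumed)
    return 1.0 - (a - true_opt) ** 2 + np.random.normal(0, 0.005)

probes = np.array([-0.5, 0.0, 0.5])                # probe amplitudes
values = np.array([metric(a) for a in probes])

c2, c1, c0 = np.polyfit(probes, values, deg=2)     # quadratic model of the metric
a_star = -c1 / (2 * c2)                            # vertex of the fitted parabola
print("estimated optimal amplitude:", round(a_star, 3))
```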
Gene Ranking of RNA-Seq Data via Discriminant Non-Negative Matrix Factorization.
Jia, Zhilong; Zhang, Xiang; Guan, Naiyang; Bo, Xiaochen; Barnes, Michael R; Luo, Zhigang
2015-01-01
RNA-sequencing is rapidly becoming the method of choice for studying the full complexity of transcriptomes, however with increasing dimensionality, accurate gene ranking is becoming increasingly challenging. This paper proposes an accurate and sensitive gene ranking method that implements discriminant non-negative matrix factorization (DNMF) for RNA-seq data. To the best of our knowledge, this is the first work to explore the utility of DNMF for gene ranking. When incorporating Fisher's discriminant criteria and setting the reduced dimension as two, DNMF learns two factors to approximate the original gene expression data, abstracting the up-regulated or down-regulated metagene by using the sample label information. The first factor denotes all the genes' weights of the two metagenes as the additive combination of all genes, while the second learned factor represents the expression values of the two metagenes. In the gene ranking stage, all the genes are ranked as a descending sequence according to the differential values of the metagene weights. Leveraging the nature of NMF and Fisher's criterion, DNMF can robustly boost the gene ranking performance. The Area Under the Curve analysis of differential expression on two benchmarking tests of four RNA-seq data sets with similar phenotypes showed that our proposed DNMF-based gene ranking method outperforms other widely used methods. Moreover, the Gene Set Enrichment Analysis also showed that DNMF outperforms the others. DNMF is also computationally efficient, substantially outperforming all other benchmarked methods. Consequently, we suggest DNMF is an effective method for the analysis of differential gene expression and gene ranking for RNA-seq data.
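A stand-in sketch using plain NMF from scikit-learn (DNMF additionally folds Fisher's discriminant criterion and the sample labels into the factorization): factorize the genes-by-samples matrix with rank two and rank genes by the difference of their two metagene weights. Data shapes and values are synthetic.

```python
# Rank-2 NMF gene ranking by differential metagene weight (plain NMF, not DNMF).
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
n_genes, n_samples = 500, 20
expr = rng.gamma(2.0, 1.0, size=(n_genes, n_samples))
expr[:25, 10:] *= 4.0                    # first 25 genes up-regulated in the second group

model = NMF(n_components=2, init="nndsvda", max_iter=500, random_state=0)
W = model.fit_transform(expr)            # (genes x 2): gene weights of the two metagenes
H = model.components_                    # (2 x samples): metagene expression per sample

ranking = np.argsort(-np.abs(W[:, 0] - W[:, 1]))   # descending differential weight
print("top-ranked genes:", ranking[:10])
```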
Spatial interpolation of monthly mean air temperature data for Latvia
NASA Astrophysics Data System (ADS)
Aniskevich, Svetlana
2016-04-01
Temperature data with high spatial resolution are essential for appropriate and high-quality analysis of local characteristics. The surface observation station network in Latvia currently consists of 22 stations recording daily air temperature; thus, in order to analyze very specific and local features in the spatial distribution of temperature values across the whole of Latvia, a high-quality spatial interpolation method is required. Until now, inverse distance weighted interpolation was used for the interpolation of air temperature data at the meteorological and climatological service of the Latvian Environment, Geology and Meteorology Centre, and no additional topographical information was taken into account. This method made it almost impossible to reasonably assess the actual temperature gradient and distribution between the observation points. During this project a new interpolation method was applied and tested, considering auxiliary explanatory parameters. In order to spatially interpolate monthly mean temperature values, kriging with external drift was used over a grid of 1 km resolution, which contains parameters such as 5 km mean elevation, continentality, distance from the Gulf of Riga and the Baltic Sea, biggest lakes and rivers, and population density. As the most appropriate of these parameters, based on a complex situation analysis, mean elevation and continentality were chosen. In order to validate interpolation results, several statistical indicators of the differences between predicted values and the values actually observed were used. Overall, the introduced model visually and statistically outperforms the previous interpolation method and provides a meteorologically reasonable result, taking into account factors that influence the spatial distribution of the monthly mean temperature.
Lin, Dongyun; Sun, Lei; Toh, Kar-Ann; Zhang, Jing Bo; Lin, Zhiping
2018-05-01
Automated biomedical image classification confronts challenges such as high noise levels, image blur, illumination variation and complicated geometric correspondence among the various categorical biomedical patterns encountered in practice. To handle these challenges, we propose a cascade method consisting of two stages for biomedical image classification. At stage 1, we propose a confidence-score-based classification rule with a reject option for a preliminary decision using the support vector machine (SVM). The testing images going through stage 1 are separated into two groups based on their confidence scores. Those testing images with sufficiently high confidence scores are classified at stage 1, while the others with low confidence scores are rejected and fed to stage 2. At stage 2, the rejected images from stage 1 are first processed by a subspace analysis technique called eigenfeature regularization and extraction (ERE), and then classified by another SVM trained in the transformed subspace learned by ERE. At both stages, images are represented using two types of local features, SIFT and SURF, which are encoded using various bag-of-words (BoW) models to handle biomedical patterns with and without geometric correspondence, respectively. Extensive experiments are implemented to evaluate the proposed method on three benchmark real-world biomedical image datasets. The proposed method significantly outperforms several competing state-of-the-art methods in terms of classification accuracy. Copyright © 2018 Elsevier Ltd. All rights reserved.
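A minimal sketch of the two-stage reject-option cascade follows; the synthetic features, the 0.8 confidence threshold, and the use of a second plain SVM in place of the ERE-subspace classifier are all illustrative assumptions.

```python
# Two-stage SVM cascade with a reject option on stage-1 confidence scores.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 50))            # stand-in for BoW-encoded SIFT/SURF features
y = rng.integers(0, 3, size=300)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

stage1 = SVC(kernel="rbf", probability=True, random_state=0).fit(X_tr, y_tr)
proba = stage1.predict_proba(X_te)
conf = proba.max(axis=1)                  # confidence score per test image
accept = conf >= 0.8                      # reject threshold (assumed value)

y_pred = np.empty_like(y_te)
if accept.any():                          # confident images decided at stage 1
    y_pred[accept] = stage1.classes_[proba[accept].argmax(axis=1)]

# Rejected images go to stage 2; a second SVM stands in for the classifier
# trained in the ERE-transformed subspace.
stage2 = SVC(kernel="rbf", random_state=0).fit(X_tr, y_tr)
if (~accept).any():
    y_pred[~accept] = stage2.predict(X_te[~accept])
```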
LiDAR change detection using building models
NASA Astrophysics Data System (ADS)
Kim, Angela M.; Runyon, Scott C.; Jalobeanu, Andre; Esterline, Chelsea H.; Kruse, Fred A.
2014-06-01
Terrestrial LiDAR scans of building models collected with a FARO Focus3D and a RIEGL VZ-400 were used to investigate point-to-point and model-to-model LiDAR change detection. LiDAR data were scaled, decimated, and georegistered to mimic real-world airborne collects. Two physical building models were used to explore various aspects of the change detection process. The first model was a 1:250-scale representation of the Naval Postgraduate School campus in Monterey, CA, constructed from Lego blocks and scanned in a laboratory setting using both the FARO and RIEGL. The second model, at 1:8 scale, consisted of large cardboard boxes placed outdoors and scanned from rooftops of adjacent buildings using the RIEGL. A point-to-point change detection scheme was applied directly to the point-cloud datasets. In the model-to-model change detection scheme, changes were detected by comparing Digital Surface Models (DSMs). The use of physical models allowed analysis of the effects of changes in scanner and scanning geometry, and of the performance of the change detection methods on different types of change, including building collapse or subsidence, construction, and shifts in location. Results indicate that at low false-alarm rates, the point-to-point method slightly outperforms the model-to-model method. The point-to-point method is less sensitive to misregistration errors in the data. Best results are obtained when the baseline and change datasets are collected using the same LiDAR system and collection geometry.
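The model-to-model branch reduces to differencing two rasterised Digital Surface Models; a minimal sketch with synthetic point clouds, a max-value-per-cell DSM, and an arbitrary 2 m change threshold follows.

```python
# Model-to-model change detection by DSM differencing (synthetic point clouds).
import numpy as np

def rasterize_dsm(points, cell=1.0, shape=(100, 100)):
    """Keep the highest return per grid cell (simple max-value DSM)."""
    dsm = np.full(shape, np.nan)
    ix = (points[:, 0] / cell).astype(int).clip(0, shape[0] - 1)
    iy = (points[:, 1] / cell).astype(int).clip(0, shape[1] - 1)
    for i, j, z in zip(ix, iy, points[:, 2]):
        if np.isnan(dsm[i, j]) or z > dsm[i, j]:
            dsm[i, j] = z
    return dsm

rng = np.random.default_rng(0)
baseline = rng.uniform(0, 100, size=(5000, 3))   # x, y, z
changed = baseline.copy()
changed[:500, 2] += 5.0                          # simulate new construction

diff = rasterize_dsm(changed) - rasterize_dsm(baseline)
change_mask = np.abs(diff) > 2.0                 # flag cells exceeding 2 m of change
print("changed cells:", int(np.sum(change_mask)))
```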
Liu, An-An; Li, Kang; Kanade, Takeo
2012-02-01
We propose a semi-Markov model trained in a max-margin learning framework for mitosis event segmentation in large-scale time-lapse phase contrast microscopy image sequences of stem cell populations. Our method consists of three steps. First, we apply a constrained optimization based microscopy image segmentation method that exploits phase contrast optics to extract candidate subsequences in the input image sequence that contains mitosis events. Then, we apply a max-margin hidden conditional random field (MM-HCRF) classifier learned from human-annotated mitotic and nonmitotic sequences to classify each candidate subsequence as a mitosis or not. Finally, a max-margin semi-Markov model (MM-SMM) trained on manually-segmented mitotic sequences is utilized to reinforce the mitosis classification results, and to further segment each mitosis into four predefined temporal stages. The proposed method outperforms the event-detection CRF model recently reported by Huh as well as several other competing methods in very challenging image sequences of multipolar-shaped C3H10T1/2 mesenchymal stem cells. For mitosis detection, an overall precision of 95.8% and a recall of 88.1% were achieved. For mitosis segmentation, the mean and standard deviation for the localization errors of the start and end points of all mitosis stages were well below 1 and 2 frames, respectively. In particular, an overall temporal location error of 0.73 ± 1.29 frames was achieved for locating daughter cell birth events.
Binet, Rachel; Deer, Deanne M; Uhlfelder, Samantha J
2014-06-01
Faster detection of contaminated foods can prevent adulterated foods from being consumed and minimize the risk of an outbreak of foodborne illness. A sensitive molecular detection method is especially important for Shigella because ingestion of as few as 10 of these bacterial pathogens can cause disease. The objectives of this study were to compare the ability of four DNA extraction methods to detect Shigella in six types of produce post-enrichment, and to evaluate a new and rapid conventional multiplex assay that targets the Shigella ipaH, virB and mxiC virulence genes. This assay can detect fewer than two Shigella cells in pure culture, even when the pathogen is mixed with background microflora, and it can also differentiate natural Shigella strains from a control strain and eliminate false positive results due to accidental laboratory contamination. The four DNA extraction methods (boiling, PrepMan Ultra [Applied Biosystems], InstaGene Matrix [Bio-Rad], DNeasy Tissue kit [Qiagen]) detected 1.6 × 10³ Shigella CFU/ml post-enrichment, requiring ∼18 doublings from one cell in 25 g of produce pre-enrichment. Lower sensitivity was obtained in some cases, depending on produce type and extraction method. The InstaGene Matrix was the most consistent and sensitive, and the multiplex assay accurately detected Shigella in less than 90 min, outperforming, to the best of our knowledge, molecular assays currently in place for this pathogen. Published by Elsevier Ltd.
Classification of burn wounds using support vector machines
NASA Astrophysics Data System (ADS)
Acha, Begona; Serrano, Carmen; Palencia, Sergio; Murillo, Juan Jose
2004-05-01
The purpose of this work is to improve a previous method developed by the authors for the classification of burn wounds by depth. The inputs of the system are color and texture information, as these are the characteristics observed by physicians in order to give a diagnosis. Our previous work consisted of segmenting the burn wound from the rest of the image and classifying the burn by its depth. In this paper we focus on the classification problem only. We previously proposed a Fuzzy-ARTMAP neural network (NN); here we take advantage of more powerful classification tools such as Support Vector Machines (SVMs). We apply a five-fold cross-validation scheme to divide the database into training and validation sets. Then, we apply a feature selection method for each classifier, which gives us the set of features that yields the smallest classification error for that classifier. The features used for classification are first-order statistical parameters extracted from the L*, u* and v* color components of the image. The feature selection algorithms used are the Sequential Forward Selection (SFS) and the Sequential Backward Selection (SBS) methods. As the data in this problem are not linearly separable, the SVM was trained with several different kernels. The validation process shows that the SVM with a Gaussian kernel of variance 1 outperforms the rest of the classifiers, yielding a classification error rate of 0.7%, whereas the Fuzzy-ARTMAP NN attained 1.6%.
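A minimal sketch of this classification stage follows, pairing an RBF (Gaussian) kernel SVM with scikit-learn's sequential forward feature selection under five-fold cross-validation; the feature matrix, labels, and the number of selected features are synthetic stand-ins for the L*u*v* first-order statistics.

```python
# SVM with sequential forward feature selection, evaluated by 5-fold CV.
import numpy as np
from sklearn.svm import SVC
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(250, 12))        # toy stand-in for L*, u*, v* statistics
y = rng.integers(0, 3, size=250)      # three burn depths (illustrative)

svm = SVC(kernel="rbf")               # Gaussian kernel
sfs = SequentialFeatureSelector(svm, n_features_to_select=5, direction="forward", cv=5)
model = make_pipeline(StandardScaler(), sfs, svm)
scores = cross_val_score(model, X, y, cv=5)   # five-fold cross-validation
print("mean accuracy:", scores.mean())
```

Switching direction="forward" to "backward" gives the SBS counterpart.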
What Does Stock Ownership Breadth Measure?
Choi, James J.; Jin, Li; Yan, Hongjun
2013-01-01
Using holdings data on a representative sample of all Shanghai Stock Exchange investors, we show that increases in ownership breadth (the fraction of market participants who own a stock) predict low returns: highest change quintile stocks underperform lowest quintile stocks by 23% per year. Small retail investors drive this result. Retail ownership breadth increases appear to be correlated with overpricing. Among institutional investors, however, the opposite holds: Stocks in the top decile of wealth-weighted institutional breadth change outperform the bottom decile by 8% per year, consistent with prior work that interprets breadth as a measure of short-sales constraints. PMID:24764801
High-sensitivity silicon nanowire phototransistors
NASA Astrophysics Data System (ADS)
Tan, Siew Li; Zhao, Xingyan; Dan, Yaping
2014-08-01
Silicon nanowires (SiNWs) have emerged as a promising material for high-sensitivity photodetection in the UV, visible and near-infrared spectral ranges. In this work, we demonstrate novel planar SiNW phototransistors on silicon-on-insulator (SOI) substrate using CMOS-compatible processes. The device consists of a bipolar transistor structure with an optically-injected base region. The electronic and optical properties of the SiNW phototransistors are investigated. Preliminary simulation and experimental results show that nanowire geometry, doping densities and surface states have considerable effects on the device performance, and that a device with optimized parameters can potentially outperform conventional Si photodetectors.
Experimental Investigation of Spatially-Periodic Scalar Patterns in an Inline Mixer
NASA Astrophysics Data System (ADS)
Baskan, Ozge; Speetjens, Michel F. M.; Clercx, Herman J. H.
2015-11-01
Spatially persisting patterns with exponentially decaying intensities form during the downstream evolution of passive scalars in three-dimensional (3D) spatially periodic flows, due to the coupled effect of the chaotic nature of the flow and the diffusivity of the material. This has been investigated in many computational and theoretical studies on 3D spatially-periodic flow fields. However, in the limit of zero diffusivity, the evolution of the scalar fields produces finer structures that can only be captured by experiments, owing to limitations of the computational tools. Our study employs state-of-the-art experimental methods to analyze the evolution of a 3D advective scalar field in a representative inline mixer, the Quatro static mixer. The experimental setup consists of an optically accessible test section with transparent internal elements, accommodating a pressure-driven pipe flow and equipped with 3D Laser-Induced Fluorescence. The results reveal that the continuous process of stretching and folding of material creates finer structures as the flow progresses, which is an indicator of chaotic advection, and the experiments outperform the simulations by revealing a far greater level of detail.
SODa: an Mn/Fe superoxide dismutase prediction and design server.
Kwasigroch, Jean Marc; Wintjens, René; Gilis, Dimitri; Rooman, Marianne
2008-06-02
Superoxide dismutases (SODs) are ubiquitous metalloenzymes that play an important role in the defense of aerobic organisms against oxidative stress, by converting reactive oxygen species into nontoxic molecules. We focus here on the SOD family that uses Fe or Mn as cofactor. The SODa webtool http://babylone.ulb.ac.be/soda predicts if a target sequence corresponds to an Fe/Mn SOD. If so, it predicts the metal ion specificity (Fe, Mn or cambialistic) and the oligomerization mode (dimer or tetramer) of the target. In addition, SODa proposes a list of residue substitutions likely to improve the predicted preferences for the metal cofactor and oligomerization mode. The method is based on residue fingerprints, consisting of residues conserved in SOD sequences or typical of SOD subgroups, and of interaction fingerprints, containing residue pairs that are in contact in SOD structures. SODa is shown to outperform and to be more discriminative than traditional techniques based on pairwise sequence alignments. Moreover, the fact that it proposes selected mutations makes it a valuable tool for rational protein design.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Webb-Robertson, Bobbie-Jo M.; Wiberg, Holli K.; Matzke, Melissa M.
In this review, we apply selected imputation strategies to label-free liquid chromatography–mass spectrometry (LC–MS) proteomics datasets to evaluate their accuracy with respect to metrics of variance and classification. We evaluate several commonly used imputation approaches for their individual merits and discuss the caveats of each approach with respect to the example LC–MS proteomics data. In general, local similarity-based approaches, such as the regularized expectation maximization and least-squares adaptive algorithms, yield the best overall performance with respect to metrics of accuracy and robustness. However, no single algorithm consistently outperforms the remaining approaches, and in some cases performing classification without imputation yielded the most accurate classification. Thus, because the mechanisms of missing data in proteomics are complex and vary from peptide to protein, no individual method is a single solution for imputation. In summary, on the basis of the observations in this review, the goal for imputation in the field of computational proteomics should be to develop new approaches that work generically for this data type, along with new strategies to guide users in selecting the best imputation for their dataset and analysis objectives.
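The comparison pattern described above can be sketched with scikit-learn, using its built-in imputers as rough stand-ins for the regularized-EM and least-squares adaptive algorithms evaluated in the review; data, missingness rate, and the downstream classifier are synthetic assumptions.

```python
# Compare imputation strategies by downstream classification accuracy.
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import SimpleImputer, KNNImputer, IterativeImputer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 40))                 # toy peptide-abundance matrix
X[rng.random(X.shape) < 0.2] = np.nan          # ~20% values missing at random
y = rng.integers(0, 2, size=100)

imputers = {
    "mean": SimpleImputer(strategy="mean"),
    "knn": KNNImputer(n_neighbors=5),
    "iterative": IterativeImputer(random_state=0),   # EM-like, model-based
}
for name, imp in imputers.items():
    pipe = make_pipeline(imp, RandomForestClassifier(random_state=0))
    acc = cross_val_score(pipe, X, y, cv=5).mean()   # imputation refit inside each fold
    print(f"{name}: accuracy = {acc:.3f}")
```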
Yi, Chucai; Tian, Yingli
2012-09-01
In this paper, we propose a novel framework to extract text regions from scene images with complex backgrounds and multiple text appearances. This framework consists of three main steps: boundary clustering (BC), stroke segmentation, and string fragment classification. In BC, we propose a new bigram-color-uniformity-based method to model both text and attachment surface, and cluster edge pixels based on color pairs and spatial positions into boundary layers. Then, stroke segmentation is performed at each boundary layer by color assignment to extract character candidates. We propose two algorithms to combine the structural analysis of text stroke with color assignment and filter out background interferences. Further, we design a robust string fragment classification based on Gabor-based text features. The features are obtained from feature maps of gradient, stroke distribution, and stroke width. The proposed framework of text localization is evaluated on scene images, born-digital images, broadcast video images, and images of handheld objects captured by blind persons. Experimental results on respective datasets demonstrate that the framework outperforms state-of-the-art localization algorithms.
An improved genetic algorithm for designing optimal temporal patterns of neural stimulation
NASA Astrophysics Data System (ADS)
Cassar, Isaac R.; Titus, Nathan D.; Grill, Warren M.
2017-12-01
Objective. Electrical neuromodulation therapies typically apply constant-frequency stimulation, but non-regular temporal patterns of stimulation may be more effective and more efficient. However, the design space for temporal patterns is exceedingly large, and model-based optimization is required for pattern design. We designed and implemented a modified genetic algorithm (GA) intended to design optimal temporal patterns of electrical neuromodulation. Approach. We tested and modified standard GA methods for application to designing temporal patterns of neural stimulation. We evaluated each modification individually and all modifications collectively by comparing performance to the standard GA across three test functions and two biophysically-based models of neural stimulation. Main results. The proposed modifications of the GA significantly improved performance across the test functions and performed best when all were used collectively. The standard GA found patterns that outperformed fixed-frequency, clinically-standard patterns in biophysically-based models of neural stimulation, but the modified GA, in many fewer iterations, consistently converged to higher-scoring, non-regular patterns of stimulation. Significance. The proposed improvements to standard GA methodology reduced the number of iterations required for convergence and identified superior solutions.
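For orientation, here is a minimal unmodified GA over binary pulse patterns; the fitness function is a toy surrogate standing in for the biophysical models of neural stimulation, and the population size, selection, crossover, and mutation settings are illustrative.

```python
# Minimal genetic algorithm over binary temporal stimulation patterns.
import numpy as np

rng = np.random.default_rng(0)
PATTERN_LEN, POP, GENS = 50, 40, 100

def fitness(pattern):
    # Toy surrogate: reward pulses near the pattern centre, penalise total energy.
    weights = np.exp(-0.5 * ((np.arange(PATTERN_LEN) - PATTERN_LEN / 2) / 8.0) ** 2)
    return pattern @ weights - 0.3 * pattern.sum()

pop = rng.integers(0, 2, size=(POP, PATTERN_LEN))
for _ in range(GENS):
    scores = np.array([fitness(p) for p in pop])
    parents = pop[np.argsort(scores)[-POP // 2:]]        # truncation selection
    cuts = rng.integers(1, PATTERN_LEN, size=POP // 2)   # one-point crossover
    kids = np.array([np.r_[parents[i, :c], parents[(i + 1) % len(parents), c:]]
                     for i, c in enumerate(cuts)])
    kids[rng.random(kids.shape) < 0.02] ^= 1             # bit-flip mutation
    pop = np.vstack([parents, kids])

best = pop[np.argmax([fitness(p) for p in pop])]
print("best pattern score:", fitness(best))
```

The paper's modifications target exactly these selection, crossover, and mutation steps to reduce the number of iterations needed for convergence.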
Liu, Xiao; Chen, Hsinchun
2015-12-01
Social media offer insights into patients' medical problems such as drug side effects and treatment failures. Patient reports of adverse drug events from social media have great potential to improve current practice of pharmacovigilance. However, extracting patient adverse drug event reports from social media remains an important challenge for health informatics research. In this study, we develop a research framework with advanced natural language processing techniques for integrated and high-performance extraction of patient-reported adverse drug events. The framework consists of medical entity extraction for recognizing patient discussions of drugs and events, adverse drug event extraction using a shortest-dependency-path kernel based statistical learning method with semantic filtering from medical knowledge bases, and report source classification to tease out noise. To evaluate the proposed framework, a series of experiments were conducted on a test bed of postings from major diabetes and heart disease forums in the United States. The results reveal that each component of the framework significantly contributes to its overall effectiveness. Our framework significantly outperforms prior work. Published by Elsevier Inc.
Self-Paced Prioritized Curriculum Learning With Coverage Penalty in Deep Reinforcement Learning.
Ren, Zhipeng; Dong, Daoyi; Li, Huaxiong; Chen, Chunlin
2018-06-01
In this paper, a new training paradigm is proposed for deep reinforcement learning using self-paced prioritized curriculum learning with coverage penalty. The proposed deep curriculum reinforcement learning (DCRL) takes full advantage of experience replay by adaptively selecting appropriate transitions from replay memory based on the complexity of each transition. The criteria of complexity in DCRL consist of a self-paced priority as well as a coverage penalty. The self-paced priority reflects the relationship between the temporal-difference error and the difficulty of the current curriculum, for sample efficiency. The coverage penalty is taken into account for sample diversity. In comparison with deep Q-network (DQN) and prioritized experience replay (PER) methods, the DCRL algorithm is evaluated on Atari 2600 games, and the experimental results show that DCRL outperforms DQN and PER on most of these games. Further results show that the proposed curriculum training paradigm of DCRL is also applicable and effective for other memory-based deep reinforcement learning approaches, such as double DQN and dueling networks. All the experimental results demonstrate that DCRL can achieve improved training efficiency and robustness for deep reinforcement learning.
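A hedged sketch of curriculum-weighted replay sampling in the spirit of DCRL follows: each transition's priority combines a self-paced term (favouring transitions whose TD error is close to the current curriculum difficulty) with a coverage penalty that discourages replaying the same transition too often. The functional forms and constants below are illustrative assumptions, not the paper's equations.

```python
# Curriculum-weighted replay sampling sketch (assumed functional forms).
import numpy as np

rng = np.random.default_rng(0)
td_error = np.abs(rng.normal(size=1000))   # |TD error| of each stored transition
replay_count = np.zeros(1000)              # how often each transition was sampled
difficulty = 0.5                           # current curriculum difficulty level

def sample_batch(batch=32, lam=0.1):
    self_paced = np.exp(-(td_error - difficulty) ** 2)            # assumed self-paced priority
    priority = np.maximum(self_paced - lam * replay_count, 1e-6)  # coverage penalty
    p = priority / priority.sum()
    idx = rng.choice(len(td_error), size=batch, replace=False, p=p)
    replay_count[idx] += 1                                        # update coverage counts
    return idx

batch_idx = sample_batch()
```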
Wang, Zhuo; Danziger, Samuel A; Heavner, Benjamin D; Ma, Shuyi; Smith, Jennifer J; Li, Song; Herricks, Thurston; Simeonidis, Evangelos; Baliga, Nitin S; Aitchison, John D; Price, Nathan D
2017-05-01
Gene regulatory and metabolic network models have been used successfully in many organisms, but inherent differences between them make the networks difficult to integrate. Probabilistic Regulation Of Metabolism (PROM) provides a partial solution, but it does not incorporate network inference and underperforms in eukaryotes. We present the Integrated Deduced REgulation And Metabolism (IDREAM) method, which combines statistically inferred Environment and Gene Regulatory Influence Network (EGRIN) models with the PROM framework to create enhanced metabolic-regulatory network models. We used IDREAM to predict phenotypes and genetic interactions between transcription factors and genes encoding metabolic activities in the eukaryote Saccharomyces cerevisiae. IDREAM models contain many fewer interactions than PROM and yet produce significantly more accurate growth predictions. IDREAM consistently outperformed PROM using any of three popular yeast metabolic models and across three experimental growth conditions. Importantly, IDREAM's enhanced accuracy makes it possible to identify subtle synthetic growth defects. With experimental validation, these novel genetic interactions involving the pyruvate dehydrogenase complex suggested a new role for the fatty acid-responsive factor Oaf1 in regulating acetyl-CoA production in glucose-grown cells.
Bag-of-features based medical image retrieval via multiple assignment and visual words weighting.
Wang, Jingyan; Li, Yongping; Zhang, Ying; Wang, Chao; Xie, Honglan; Chen, Guoling; Gao, Xin
2011-11-01
Bag-of-features based approaches have become prominent for image retrieval and image classification tasks in the past decade. Such methods represent an image as a collection of local features, such as image patches and key points with scale invariant feature transform (SIFT) descriptors. To improve the bag-of-features methods, we first model the assignments of local descriptors as contribution functions, and then propose a novel multiple assignment strategy. Assuming the local features can be reconstructed by their neighboring visual words in a vocabulary, reconstruction weights can be solved by quadratic programming. The weights are then used to build contribution functions, resulting in a novel assignment method, called quadratic programming (QP) assignment. We further propose a novel visual word weighting method. The discriminative power of each visual word is analyzed by the sub-similarity function in the bin that corresponds to the visual word. Each sub-similarity function is then treated as a weak classifier. A strong classifier is learned by boosting methods that combine those weak classifiers. The weighting factors of the visual words are learned accordingly. We evaluate the proposed methods on medical image retrieval tasks. The methods are tested on three well-known data sets, i.e., the ImageCLEFmed data set, the 304 CT Set, and the basal-cell carcinoma image set. Experimental results demonstrate that the proposed QP assignment outperforms the traditional nearest neighbor assignment, the multiple assignment, and the soft assignment, whereas the proposed boosting based weighting strategy outperforms the state-of-the-art weighting methods, such as the term frequency weights and the term frequency-inverse document frequency weights.
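A minimal sketch of the reconstruction-based ("QP") assignment follows: each local descriptor is approximated by a non-negative combination of its k nearest visual words, and the resulting weights become its soft assignment. The vocabulary and descriptor are synthetic, and non-negative least squares stands in for the general quadratic program.

```python
# Reconstruction-weight ("QP"-style) assignment of a descriptor to visual words.
import numpy as np
from scipy.optimize import nnls
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
vocab = rng.normal(size=(200, 64))       # 200 visual words, 64-D descriptors
desc = rng.normal(size=64)               # one local feature descriptor

k = 5
nn = NearestNeighbors(n_neighbors=k).fit(vocab)
_, idx = nn.kneighbors(desc[None, :])
neighbors = vocab[idx[0]]                # the k nearest visual words

# Solve min ||B w - d||^2 subject to w >= 0, then normalise to get the assignment.
weights, _ = nnls(neighbors.T, desc)
assignment = np.zeros(len(vocab))
assignment[idx[0]] = weights / (weights.sum() + 1e-12)
```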
Ghaderi, Parviz; Marateb, Hamid R
2017-07-01
The aim of this study was to reconstruct low-quality high-density surface EMG (HDsEMG) signals, recorded with 2-D electrode arrays, using image inpainting and surface reconstruction methods. It is common for some fraction of the electrodes to provide low-quality signals. We used a variety of image inpainting methods based on partial differential equations (PDEs), together with surface reconstruction methods, to reconstruct the time-averaged or instantaneous muscle activity maps of those outlier channels. Two novel reconstruction algorithms were also proposed. HDsEMG signals were recorded from the biceps femoris and brachial biceps muscles during low-to-moderate-level isometric contractions, and some of the channels (5-25%) were randomly marked as outliers. The root-mean-square error (RMSE) between the original and reconstructed maps was then calculated. Overall, the proposed Poisson and wave PDEs outperformed the other methods (average RMSE 8.7 ± 6.1 μVrms and 7.5 ± 5.9 μVrms) for time-averaged single-differential and monopolar map reconstruction, respectively. Biharmonic spline, the discrete cosine transform, and the Poisson PDE outperformed the other methods for instantaneous map reconstruction. The running times of the proposed Poisson and wave PDE methods, implemented using a vectorization package, were 4.6 ± 5.7 ms and 0.6 ± 0.5 ms, respectively, for each signal epoch or time sample in each channel. The proposed reconstruction algorithms could be promising new tools for reconstructing muscle activity maps in real-time applications. Proper reconstruction methods could recover the information of low-quality channels in HDsEMG recordings.
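A minimal sketch of the inpainting idea on a 2-D electrode grid follows: outlier channels are filled by iteratively relaxing the Laplace equation (each bad channel becomes the average of its neighbours), a simplified stand-in for the Poisson and wave-equation methods proposed in the paper; the map size and outlier fraction are illustrative.

```python
# Laplace-relaxation inpainting of outlier channels in an HDsEMG activity map.
import numpy as np

rng = np.random.default_rng(0)
amp_map = rng.normal(50, 10, size=(8, 16))       # time-averaged RMS map (uV), synthetic
bad = rng.random(amp_map.shape) < 0.15           # ~15% outlier channels
amp_map[bad] = np.nan

filled = np.where(bad, np.nanmean(amp_map), amp_map)   # rough initial guess
for _ in range(500):                                    # iterative relaxation
    padded = np.pad(filled, 1, mode="edge")
    neighbours = (padded[:-2, 1:-1] + padded[2:, 1:-1] +
                  padded[1:-1, :-2] + padded[1:-1, 2:]) / 4.0
    filled[bad] = neighbours[bad]                       # update only outlier channels
print("reconstructed outlier channels:", np.round(filled[bad], 1))
```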
Kavakiotis, Ioannis; Samaras, Patroklos; Triantafyllidis, Alexandros; Vlahavas, Ioannis
2017-11-01
Single Nucleotide Polymorphisms (SNPs) are nowadays becoming the marker of choice for biological analyses involving a wide range of applications of great medical, biological, economic and environmental interest. Classification tasks, i.e. the assignment of individuals to groups of origin based on their (multi-locus) genotypes, are performed in many fields such as forensic investigations and discrimination between wild and/or farmed populations, among others. These tasks should be performed with a small number of loci, for computational as well as biological reasons. Thus, feature selection should precede classification, especially for SNP datasets, where the number of features can amount to hundreds of thousands or millions. In this paper, we present a novel data mining approach, called FIFS (Frequent Item Feature Selection), based on the use of frequent items for selection of the most informative markers from population genomic data. It is a modular method consisting of two main components. The first identifies the most frequent and unique genotypes for each sampled population. The second selects the most appropriate among them, in order to create the informative SNP subsets to be returned. The proposed method (FIFS) was tested on a real dataset comprising a comprehensive coverage of pig breed types present in Britain. This dataset consisted of 446 individuals divided into 14 sub-populations, genotyped at 59,436 SNPs. Our method outperforms the state-of-the-art and baseline methods in every case. More specifically, our method surpassed the assignment accuracy threshold of 95% needing only half the number of SNPs selected by other methods (FIFS: 28 SNPs; Delta: 70 SNPs; Pairwise FST: 70 SNPs; In: 100 SNPs). In conclusion, our approach successfully deals with the problem of informative marker selection in high-dimensional genomic datasets. It offers better results than existing approaches and can aid biologists in selecting the most informative markers with maximum discrimination power, for optimization of cost-effective panels with applications related to e.g. species identification, wildlife management, and forensics. Copyright © 2017 Elsevier Ltd. All rights reserved.
A clustering package for nucleotide sequences using Laplacian Eigenmaps and Gaussian Mixture Model.
Bruneau, Marine; Mottet, Thierry; Moulin, Serge; Kerbiriou, Maël; Chouly, Franz; Chretien, Stéphane; Guyeux, Christophe
2018-02-01
In this article, a new Python package for nucleotide sequences clustering is proposed. This package, freely available on-line, implements a Laplacian eigenmap embedding and a Gaussian Mixture Model for DNA clustering. It takes nucleotide sequences as input, and produces the optimal number of clusters along with a relevant visualization. Despite the fact that we did not optimise the computational speed, our method still performs reasonably well in practice. Our focus was mainly on data analytics and accuracy and as a result, our approach outperforms the state of the art, even in the case of divergent sequences. Furthermore, an a priori knowledge on the number of clusters is not required here. For the sake of illustration, this method is applied on a set of 100 DNA sequences taken from the mitochondrially encoded NADH dehydrogenase 3 (ND3) gene, extracted from a collection of Platyhelminthes and Nematoda species. The resulting clusters are tightly consistent with the phylogenetic tree computed using a maximum likelihood approach on gene alignment. They are coherent too with the NCBI taxonomy. Further test results based on synthesized data are then provided, showing that the proposed approach is better able to recover the clusters than the most widely used software, namely Cd-hit-est and BLASTClust. Copyright © 2017 Elsevier Ltd. All rights reserved.
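The pipeline described above can be sketched directly with scikit-learn: k-mer count features, a Laplacian-eigenmap (spectral) embedding, and a Gaussian mixture whose number of components is chosen by BIC. The sequences, k-mer length, and candidate cluster counts are toy assumptions, not the package's exact implementation.

```python
# Laplacian-eigenmap embedding + Gaussian mixture clustering of DNA sequences.
import numpy as np
from itertools import product
from sklearn.manifold import SpectralEmbedding
from sklearn.mixture import GaussianMixture

def kmer_counts(seq, k=3):
    index = {"".join(p): i for i, p in enumerate(product("ACGT", repeat=k))}
    v = np.zeros(len(index))
    for i in range(len(seq) - k + 1):
        v[index[seq[i:i + k]]] += 1
    return v / max(len(seq) - k + 1, 1)

rng = np.random.default_rng(0)
seqs = ["".join(rng.choice(list("ACGT"), 120)) for _ in range(60)]   # toy sequences
X = np.array([kmer_counts(s) for s in seqs])

emb = SpectralEmbedding(n_components=3).fit_transform(X)    # Laplacian eigenmap
models = [GaussianMixture(n_components=k, random_state=0).fit(emb) for k in range(1, 6)]
best = min(models, key=lambda m: m.bic(emb))                # pick cluster count by BIC
print("optimal number of clusters:", best.n_components, "labels:", best.predict(emb)[:10])
```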
Peng, Huan-Kai; Marculescu, Radu
2015-01-01
Social media exhibit rich yet distinct temporal dynamics which cover a wide range of different scales. In order to study this complex dynamics, two fundamental questions revolve around (1) the signatures of social dynamics at different time scales, and (2) the way in which these signatures interact and form higher-level meanings. In this paper, we propose the Recursive Convolutional Bayesian Model (RCBM) to address both of these fundamental questions. The key idea behind our approach consists of constructing a deep-learning framework using specialized convolution operators that are designed to exploit the inherent heterogeneity of social dynamics. RCBM's runtime and convergence properties are guaranteed by formal analyses. Experimental results show that the proposed method outperforms the state-of-the-art approaches both in terms of solution quality and computational efficiency. Indeed, by applying the proposed method on two social network datasets, Twitter and Yelp, we are able to identify the compositional structures that can accurately characterize the complex social dynamics from these two social media. We further show that identifying these patterns can enable new applications such as anomaly detection and improved social dynamics forecasting. Finally, our analysis offers new insights on understanding and engineering social media dynamics, with direct applications to opinion spreading and online content promotion.
NASA Astrophysics Data System (ADS)
Cao, Qian; Wan, Xiaoxia; Li, Junfeng; Liu, Qiang; Liang, Jingxing; Li, Chan
2016-10-01
This paper proposed two weight functions based on principal component analysis (PCA) to preserve more colorimetric information in the spectral data compression process. One weight function consisted of the CIE XYZ color-matching functions, representing the characteristics of the human visual system, while the other combined the CIE XYZ color-matching functions with the relative spectral power distribution of the CIE standard illuminant D65. The improvements obtained from the two proposed methods were tested by compressing and reconstructing the reflectance spectra of 1600 glossy Munsell color chips and 1950 Natural Color System color chips, as well as six multispectral images. Performance was evaluated by the mean color difference under the CIE 1931 standard colorimetric observer and the CIE standard illuminants D65 and A. The mean root mean square errors between the original and reconstructed spectra were also calculated. The experimental results show that the two proposed methods significantly outperform the standard PCA and two other weighted PCA methods in colorimetric reconstruction accuracy, with only very slight degradation in spectral reconstruction accuracy. In addition, the weight function incorporating the CIE standard illuminant D65 improves colorimetric reconstruction accuracy compared to the weight function without it.
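A minimal sketch of the weighted-PCA idea follows: reflectance spectra are multiplied by a wavelength-dependent weight derived from the colour-matching functions (optionally times the illuminant) before the PCA basis is computed, and the weighting is undone after reconstruction. The colour-matching functions and illuminant here are crude placeholders, not the CIE tables, and the rank is illustrative.

```python
# Colorimetrically weighted PCA compression/reconstruction of reflectance spectra.
import numpy as np

rng = np.random.default_rng(0)
wl = np.arange(400, 701, 10)                        # 400-700 nm in 10 nm steps
R = rng.uniform(0.05, 0.95, size=(1600, wl.size))   # toy reflectance spectra

cmf = np.exp(-0.5 * ((wl[:, None] - [600, 550, 450]) / 40.0) ** 2)  # placeholder xyz bars
d65 = np.ones_like(wl, dtype=float)                 # placeholder illuminant SPD
w = cmf.sum(axis=1) * d65                           # per-wavelength weight
w /= w.max()

Rw = R * w                                          # apply weights before PCA
mean = Rw.mean(axis=0)
_, _, Vt = np.linalg.svd(Rw - mean, full_matrices=False)
basis = Vt[:3]                                      # leading weighted principal vectors

coeff = (Rw - mean) @ basis.T                       # compressed representation
R_rec = ((coeff @ basis) + mean) / w                # undo the weighting
print("spectral RMSE:", np.sqrt(np.mean((R - R_rec) ** 2)))
```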
Lin, Meihua; Li, Haoli; Zhao, Xiaolei; Qin, Jiheng
2013-01-01
Genome-wide analysis of gene-gene interactions has been recognized as a powerful avenue to identify the missing genetic components that cannot be detected by current single-point association analysis. Recently, several model-free methods (e.g. the commonly used information-based metrics and several logistic regression-based metrics) were developed for detecting non-linear dependence between genetic loci, but they are potentially at risk of inflated false positive error, in particular when the main effects at one or both loci are salient. In this study, we proposed two conditional entropy-based metrics to address this limitation. Extensive simulations demonstrated that the two proposed metrics, provided the disease is rare, maintain a consistently correct false positive rate. In scenarios with a common disease, our proposed metrics achieved better or comparable control of false positive error compared to four previously proposed model-free metrics. In terms of power, our methods outperformed several competing metrics in a range of common disease models. Furthermore, in real data analyses, both metrics succeeded in detecting interactions and were competitive with the originally reported results and the logistic regression approaches. In conclusion, the proposed conditional entropy-based metrics are promising alternatives to current model-based approaches for detecting genuine epistatic effects. PMID:24339984
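A hedged sketch of the conditional-entropy ingredient behind such metrics follows: H(Y | G1, G2) is estimated from genotype/phenotype counts and compared with the single-locus conditional entropies. The combination rule at the end is an illustrative interaction score, not the paper's exact statistic.

```python
# Conditional entropy from genotype/phenotype counts (illustrative interaction score).
import numpy as np

def cond_entropy(y, *factors):
    """H(Y | factors), estimated from empirical joint frequencies."""
    joint = {}
    for yi, key in zip(y, zip(*factors)):
        joint.setdefault(key, [0, 0])[yi] += 1
    n, h = len(y), 0.0
    for counts in joint.values():
        tot = sum(counts)
        for c in counts:
            if c:
                h -= (c / n) * np.log2(c / tot)
    return h

rng = np.random.default_rng(0)
g1, g2 = rng.integers(0, 3, 2000), rng.integers(0, 3, 2000)            # genotypes 0/1/2
y = ((g1 == 2) & (g2 == 2) & (rng.random(2000) < 0.8)).astype(int)     # toy epistatic trait

score = min(cond_entropy(y, g1), cond_entropy(y, g2)) - cond_entropy(y, g1, g2)
print("interaction score (bits):", round(score, 4))
```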
A dynamic appearance descriptor approach to facial actions temporal modeling.
Jiang, Bihan; Valstar, Michel; Martinez, Brais; Pantic, Maja
2014-02-01
Both the configuration and the dynamics of facial expressions are crucial for the interpretation of human facial behavior. Yet to date, the vast majority of reported efforts in the field either do not take the dynamics of facial expressions into account, or focus only on prototypic facial expressions of the six basic emotions. Facial dynamics can be explicitly analyzed by detecting the constituent temporal segments of Facial Action Coding System (FACS) Action Units (AUs): onset, apex, and offset. In this paper, we present a novel approach to explicit analysis of the temporal dynamics of facial actions using the dynamic appearance descriptor Local Phase Quantization from Three Orthogonal Planes (LPQ-TOP). Temporal segments are detected by combining a discriminative classifier that detects the temporal segments on a frame-by-frame basis with Markov models that enforce temporal consistency over the whole episode. The system is evaluated in detail on the MMI facial expression database, the UNBC-McMaster pain database, the SAL database, and the GEMEP-FERA dataset in database-dependent experiments, and in cross-database experiments using the Cohn-Kanade and SEMAINE databases. The comparison with other state-of-the-art methods shows that the proposed LPQ-TOP method outperforms the other approaches for the problem of AU temporal segment detection, and that overall AU activation detection benefits from dynamic appearance information.
NASA Astrophysics Data System (ADS)
Ma, Dan; Liu, Jun; Chen, Kai; Li, Huali; Liu, Ping; Chen, Huijuan; Qian, Jing
2016-04-01
In remote sensing fusion, the spatial details of a panchromatic (PAN) image and the spectral information of multispectral (MS) images are transferred into fused images according to the characteristics of the human visual system. Accordingly, a remote sensing image fusion quality assessment called the feature-based fourth-order correlation coefficient (FFOCC) is proposed. FFOCC is based on the feature-based coefficient concept. Spatial features related to the spatial details of the PAN image and spectral features related to the spectral information of the MS images are first extracted from the fused image. Then, the fourth-order correlation coefficient between the spatial and spectral features is calculated and treated as the assessment result. FFOCC was then compared with existing widely used indices, such as the Erreur Relative Globale Adimensionnelle de Synthèse and the quality-with-no-reference index. Results of the fusion and distortion experiments indicate that FFOCC is consistent with subjective evaluation. FFOCC significantly outperforms the other indices in evaluating fusion images that are produced by different fusion methods and that are distorted in spatial and spectral features by blurring, adding noise, and changing intensity. All the findings indicate that the proposed method is an objective and effective quality assessment for remote sensing image fusion.
A predictive machine learning approach for microstructure optimization and materials design
Liu, Ruoqian; Kumar, Abhishek; Chen, Zhengzhang; ...
2015-06-23
This paper addresses an important materials engineering question: How can one identify the complete space (or as much of it as possible) of microstructures that are theoretically predicted to yield the desired combination of properties demanded by a selected application? We present a problem involving design of magnetoelastic Fe-Ga alloy microstructure for enhanced elastic, plastic and magnetostrictive properties. While theoretical models for computing properties given the microstructure are known for this alloy, inversion of these relationships to obtain microstructures that lead to desired properties is challenging, primarily due to the high dimensionality of microstructure space, the multi-objective design requirement and the non-uniqueness of solutions. These challenges render traditional search-based optimization methods ineffective in terms of both search efficiency and result optimality. In this paper, a route to address these challenges using a machine learning methodology is proposed. A systematic framework consisting of random data generation, feature selection and classification algorithms is developed. In conclusion, experiments with five design problems that involve identification of microstructures satisfying both linear and nonlinear property constraints show that our framework outperforms traditional optimization methods, with the average running time reduced by as much as 80% and with optimality that would not be achieved otherwise.
Adaptive Greedy Dictionary Selection for Web Media Summarization.
Cong, Yang; Liu, Ji; Sun, Gan; You, Quanzeng; Li, Yuncheng; Luo, Jiebo
2017-01-01
Initializing an effective dictionary is an indispensable step for sparse representation. In this paper, we focus on the dictionary selection problem, with the objective of selecting a compact subset of basis atoms from the original training data instead of learning a new dictionary matrix as dictionary learning models do. We first design a new dictionary selection model via the ℓ2,0 norm. For model optimization, we propose two methods: one is the standard forward-backward greedy algorithm, which is not suitable for large-scale problems; the other is based on the gradient cues at each forward iteration and speeds up the process dramatically. In comparison with the state-of-the-art dictionary selection models, our model is not only more effective and efficient, but can also control the sparsity. To evaluate the performance of our new model, we select two practical web media summarization problems: 1) we build a new data set consisting of around 500 users, 3000 albums, and 1 million images, and achieve effective assisted albuming based on our model, and 2) by formulating the video summarization problem as a dictionary selection issue, we employ our model to extract keyframes from a video sequence in a more flexible way. Overall, our model outperforms the state-of-the-art methods in both of these tasks.
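A minimal sketch of the forward greedy variant follows: at each step, the candidate atom whose inclusion most reduces the least-squares reconstruction error of the whole data set is added to the selected subset. This is a simplified, unconstrained stand-in for the ℓ2,0 model and its gradient-accelerated optimizer.

```python
# Forward greedy dictionary (atom) selection by reconstruction error.
import numpy as np

def reconstruction_error(X, D):
    # Least-squares reconstruction of every column of X from the columns of D.
    coeff, *_ = np.linalg.lstsq(D, X, rcond=None)
    return np.linalg.norm(X - D @ coeff) ** 2

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 200))               # 200 candidate atoms, 30-D features
selected, remaining = [], list(range(X.shape[1]))

for _ in range(10):                          # select 10 atoms
    errs = [reconstruction_error(X, X[:, selected + [j]]) for j in remaining]
    best = remaining[int(np.argmin(errs))]
    selected.append(best)
    remaining.remove(best)

print("selected atom indices:", selected)
```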
Evaluation of NMME temperature and precipitation bias and forecast skill for South Asia
NASA Astrophysics Data System (ADS)
Cash, Benjamin A.; Manganello, Julia V.; Kinter, James L.
2017-08-01
Systematic error and forecast skill for temperature and precipitation in two regions of Southern Asia are investigated using hindcasts initialized May 1 from the North American Multi-Model Ensemble. We focus on two contiguous but geographically and dynamically diverse regions: the Extended Indian Monsoon Rainfall (70-100E, 10-30 N) and the nearby mountainous area of Pakistan and Afghanistan (60-75E, 23-39 N). Forecast skill is assessed using the Sign test framework, a rigorous statistical method that can be applied to non-Gaussian variables such as precipitation and to different ensemble sizes without introducing bias. We find that models show significant systematic error in both precipitation and temperature for both regions. The multi-model ensemble mean (MMEM) consistently yields the lowest systematic error and the highest forecast skill for both regions and variables. However, we also find that the MMEM consistently provides a statistically significant increase in skill over climatology only in the first month of the forecast. While the MMEM tends to provide higher overall skill than climatology later in the forecast, the differences are not significant at the 95% level. We also find that MMEMs constructed with a relatively small number of ensemble members per model can equal or outperform MMEMs constructed with more members in skill. This suggests some ensemble members either provide no contribution to overall skill or even detract from it.
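A minimal sketch of the sign-test comparison follows, assuming paired absolute errors from the multi-model ensemble mean and a climatology baseline at matched times and grid points; the error values are synthetic.

```python
# Sign test: does the forecast beat climatology more often than chance?
import numpy as np
from scipy.stats import binomtest

rng = np.random.default_rng(0)
err_model = np.abs(rng.normal(0.9, 0.5, 200))   # |error| of the MMEM forecast
err_clim = np.abs(rng.normal(1.0, 0.5, 200))    # |error| of the climatology baseline

wins = int(np.sum(err_model < err_clim))        # forecasts closer to observations
ties = int(np.sum(err_model == err_clim))
n = len(err_model) - ties
res = binomtest(wins, n, p=0.5, alternative="greater")
print("win fraction:", round(wins / n, 3), "p-value:", round(res.pvalue, 4))
```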
pGenN, a Gene Normalization Tool for Plant Genes and Proteins in Scientific Literature
Ding, Ruoyao; Arighi, Cecilia N.; Lee, Jung-Youn; Wu, Cathy H.; Vijay-Shanker, K.
2015-01-01
Background: Automatically detecting gene/protein names in the literature and connecting them to database records, also known as gene normalization, provides a means to structure the information buried in free-text literature. Gene normalization is critical for improving the coverage of annotation in the databases, and is an essential component of many text mining systems and database curation pipelines. Methods: In this manuscript, we describe a gene normalization system specifically tailored for plant species, called pGenN (pivot-based Gene Normalization). The system consists of three steps: dictionary-based gene mention detection, species assignment, and intra-species normalization. We have developed new heuristics to improve each of these phases. Results: We evaluated the performance of pGenN on an in-house expertly annotated corpus consisting of 104 plant-relevant abstracts. Our system achieved an F-value of 88.9% (precision 90.9% and recall 87.2%) on this corpus, outperforming state-of-the-art systems presented in BioCreative III. We have processed over 440,000 plant-related Medline abstracts using pGenN. The gene normalization results are stored in a local database for direct query from the pGenN web interface (proteininformationresource.org/pgenn/). The annotated literature corpus is also publicly available through the PIR text mining portal (proteininformationresource.org/iprolink/). PMID:26258475
Chan Phooi M'ng, Jacinta; Mehralizadeh, Mohammadali
2016-01-01
The motivation behind this research is to innovatively combine new methods like wavelet, principal component analysis (PCA), and artificial neural network (ANN) approaches to analyze trade in today's increasingly difficult and volatile financial futures markets. The main focus of this study is to facilitate forecasting by using an enhanced denoising process on market data, taken as a multivariate signal, in order to deduct the same noise from the open-high-low-close signal of a market. This research offers evidence on the predictive ability and the profitability of abnormal returns of a new hybrid forecasting model using Wavelet-PCA denoising and ANN (named WPCA-NN) on futures contracts of Hong Kong's Hang Seng futures, Japan's NIKKEI 225 futures, Singapore's MSCI futures, South Korea's KOSPI 200 futures, and Taiwan's TAIEX futures from 2005 to 2014. Using a host of technical analysis indicators consisting of RSI, MACD, MACD Signal, Stochastic Fast %K, Stochastic Slow %K, Stochastic %D, and Ultimate Oscillator, empirical results show that the annual mean returns of WPCA-NN are more than the threshold buy-and-hold for the validation, test, and evaluation periods; this is inconsistent with the traditional random walk hypothesis, which insists that mechanical rules cannot outperform the threshold buy-and-hold. The findings, however, are consistent with literature that advocates technical analysis.
Deformable registration of CT and cone-beam CT with local intensity matching.
Park, Seyoun; Plishker, William; Quon, Harry; Wong, John; Shekhar, Raj; Lee, Junghoon
2017-02-07
Cone-beam CT (CBCT) is a widely used intra-operative imaging modality in image-guided radiotherapy and surgery. A short scan followed by filtered backprojection is typically used for CBCT reconstruction. While data on the mid-plane (the plane of source-detector rotation) are complete, off-mid-planes suffer varying degrees of information deficiency and the computed reconstructions are approximate. This causes different reconstruction artifacts at off-mid-planes depending on slice location, and therefore impedes accurate registration between CT and CBCT. In this paper, we propose a method to accurately register CT and CBCT by iteratively matching local CT and CBCT intensities. We correct CBCT intensities by matching local intensity histograms slice by slice in conjunction with intensity-based deformable registration. The correction and registration steps are repeated in an alternating way until the resulting image converges. We integrate the intensity matching into three different deformable registration methods, B-spline, demons, and optical flow, that are widely used for CT-CBCT registration. All three registration methods were implemented on a graphics processing unit for efficient parallel computation. We tested the proposed methods on twenty-five head and neck cancer cases and compared the performance with state-of-the-art registration methods. Normalized cross correlation (NCC), the structural similarity index (SSIM), and target registration error (TRE) were computed to evaluate registration performance. Our method produced an overall NCC of 0.96, SSIM of 0.94, and TRE of 2.26 → 2.27 mm, outperforming existing methods by 9%, 12%, and 27%, respectively. Experimental results also show that our method performs consistently, is more accurate than existing algorithms, and is also computationally efficient.
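A minimal sketch of the slice-wise intensity-matching step follows, using scikit-image's histogram matching as a stand-in for the paper's local histogram correction; the CT and CBCT volumes are synthetic, and the deformable-registration pass that alternates with this step is only indicated in the comments.

```python
# Slice-by-slice CBCT-to-CT intensity correction via histogram matching.
import numpy as np
from skimage.exposure import match_histograms

rng = np.random.default_rng(0)
ct = rng.normal(0, 200, size=(40, 128, 128))          # CT volume (HU-like values)
cbct = ct * 0.8 + rng.normal(30, 50, size=ct.shape)   # CBCT with shading and offset

corrected = np.empty_like(cbct)
for z in range(cbct.shape[0]):                         # correct each axial slice
    corrected[z] = match_histograms(cbct[z], ct[z])

# `corrected` would now be passed to the chosen deformable registration
# (B-spline, demons, or optical flow), and correction and registration
# would be alternated until the result converges.
```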
Moghbel, Mehrdad; Mashohor, Syamsiah; Mahmud, Rozi; Saripan, M. Iqbal Bin
2016-01-01
Segmentation of liver tumors from computed tomography (CT) and tumor burden analysis play an important role in the choice of therapeutic strategies for liver diseases and in treatment monitoring. In this paper, a new segmentation method for liver tumors from contrast-enhanced CT imaging is proposed. As manual segmentation of tumors for liver treatment planning is both labor intensive and time-consuming, a highly accurate automatic tumor segmentation is desired. The proposed framework is fully automatic, requiring no user interaction. The proposed segmentation, evaluated on real-world clinical data from patients, is based on a hybrid method integrating cuckoo optimization and the fuzzy c-means algorithm with the random walker algorithm. The accuracy of the proposed method was validated using a clinical liver dataset containing one of the highest numbers of tumors utilized for liver tumor segmentation (127 tumors in total), with further validation of the results by a consultant radiologist. The proposed method achieved one of the highest accuracies reported in the literature for liver tumor segmentation, with a mean overlap error of 22.78% and a Dice similarity coefficient of 0.75 on the 3Dircadb dataset, and a mean overlap error of 15.61% and a Dice similarity coefficient of 0.81 on the MIDAS dataset. The proposed method outperformed most other tumor segmentation methods reported in the literature, representing an overlap error improvement of 6% compared to one of the best performing automatic methods in the literature. The proposed framework provided consistently accurate results considering the number of tumors and the variations in tumor contrast enhancement and appearance, while the tumor burden was estimated with a mean error of 0.84% on the 3Dircadb dataset. PMID:27540353
Gu, Jinghua; Xuan, Jianhua; Riggins, Rebecca B.; Chen, Li; Wang, Yue; Clarke, Robert
2012-01-01
Motivation: Identification of transcriptional regulatory networks (TRNs) is of significant importance in computational biology for cancer research, providing a critical building block to unravel disease pathways. However, existing methods for TRN identification suffer from the inclusion of excessive ‘noise’ in microarray data and false-positives in binding data, especially when applied to human tumor-derived cell line studies. More robust methods that can counteract the imperfection of data sources are therefore needed for reliable identification of TRNs in this context. Results: In this article, we propose to establish a link between the quality of one target gene to represent its regulator and the uncertainty of its expression to represent other target genes. Specifically, an outlier sum statistic was used to measure the aggregated evidence for regulation events between target genes and their corresponding transcription factors. A Gibbs sampling method was then developed to estimate the marginal distribution of the outlier sum statistic, hence, to uncover underlying regulatory relationships. To evaluate the effectiveness of our proposed method, we compared its performance with that of an existing sampling-based method using both simulation data and yeast cell cycle data. The experimental results show that our method consistently outperforms the competing method in different settings of signal-to-noise ratio and network topology, indicating its robustness for biological applications. Finally, we applied our method to breast cancer cell line data and demonstrated its ability to extract biologically meaningful regulatory modules related to estrogen signaling and action in breast cancer. Availability and implementation: The Gibbs sampler MATLAB package is freely available at http://www.cbil.ece.vt.edu/software.htm. Contact: xuan@vt.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:22595208
Age-related deficits in free recall: the role of rehearsal.
Ward, Geoff; Maylor, Elizabeth A
2005-01-01
Age-related deficits have been consistently observed in free recall. Recent accounts of episodic memory suggest that these deficits could result from differential patterns of rehearsal. In the present study, 20 young and 20 older adults (mean ages 21 and 72 years, respectively) were presented with lists of 20 words for immediate free recall using the overt rehearsal methodology. The young outperformed the older adults at all serial positions. There were significant age-related differences in the patterns of overt rehearsals: Young adults rehearsed a greater number of different words than did older adults, they rehearsed words to more recent serial positions, and their rehearsals were more widely distributed throughout the list. Consistent with a recency-based account of episodic memory, age deficits in free recall are largely attributable to age differences in the recency, frequency, and distribution of rehearsals.
Label consistent K-SVD: learning a discriminative dictionary for recognition.
Jiang, Zhuolin; Lin, Zhe; Davis, Larry S
2013-11-01
A label consistent K-SVD (LC-KSVD) algorithm to learn a discriminative dictionary for sparse coding is presented. In addition to using class labels of training data, we also associate label information with each dictionary item (columns of the dictionary matrix) to enforce discriminability in sparse codes during the dictionary learning process. More specifically, we introduce a new label consistency constraint called "discriminative sparse-code error" and combine it with the reconstruction error and the classification error to form a unified objective function. The optimal solution is efficiently obtained using the K-SVD algorithm. Our algorithm learns a single overcomplete dictionary and an optimal linear classifier jointly. The incremental dictionary learning algorithm is presented for the situation of limited memory resources. It yields dictionaries so that feature points with the same class labels have similar sparse codes. Experimental results demonstrate that our algorithm outperforms many recently proposed sparse-coding techniques for face, action, scene, and object category recognition under the same learning conditions.
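The unified objective described above is usually written as a sum of three Frobenius-norm terms; a sketch in LaTeX follows, where Y is the training data, D the dictionary, X the sparse codes, Q the discriminative target codes derived from dictionary-item labels, A a linear transform, H the class-label matrix, W the linear classifier, T the sparsity level, and alpha, beta the weights of the label-consistency and classification terms (the symbol names are assumptions).

```latex
\min_{D,\,A,\,W,\,X}\;\;
\underbrace{\|Y - DX\|_F^2}_{\text{reconstruction error}}
\;+\; \alpha\,\underbrace{\|Q - AX\|_F^2}_{\text{discriminative sparse-code error}}
\;+\; \beta\,\underbrace{\|H - WX\|_F^2}_{\text{classification error}}
\qquad \text{s.t.}\;\; \|x_i\|_0 \le T \;\;\forall i .
```

Stacking the three residual blocks into one augmented dictionary/signal pair is what allows a standard K-SVD solver to optimize all three terms jointly.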
Estimating Demand for Industrial and Commercial Land Use Given Economic Forecasts
Batista e Silva, Filipe; Koomen, Eric; Diogo, Vasco; Lavalle, Carlo
2014-01-01
Current developments in the field of land use modelling point towards greater level of spatial and thematic resolution and the possibility to model large geographical extents. Improvements are taking place as computational capabilities increase and socioeconomic and environmental data are produced with sufficient detail. Integrated approaches to land use modelling rely on the development of interfaces with specialized models from fields like economy, hydrology, and agriculture. Impact assessment of scenarios/policies at various geographical scales can particularly benefit from these advances. A comprehensive land use modelling framework includes necessarily both the estimation of the quantity and the spatial allocation of land uses within a given timeframe. In this paper, we seek to establish straightforward methods to estimate demand for industrial and commercial land uses that can be used in the context of land use modelling, in particular for applications at continental scale, where the unavailability of data is often a major constraint. We propose a set of approaches based on ‘land use intensity’ measures indicating the amount of economic output per existing areal unit of land use. A base model was designed to estimate land demand based on regional-specific land use intensities; in addition, variants accounting for sectoral differences in land use intensity were introduced. A validation was carried out for a set of European countries by estimating land use for 2006 and comparing it to observations. The models’ results were compared with estimations generated using the ‘null model’ (no land use change) and simple trend extrapolations. Results indicate that the proposed approaches clearly outperformed the ‘null model’, but did not consistently outperform the linear extrapolation. An uncertainty analysis further revealed that the models’ performances are particularly sensitive to the quality of the input land use data. In addition, unknown future trends of regional land use intensity widen considerably the uncertainty bands of the predictions. PMID:24647587
NASA Astrophysics Data System (ADS)
Rahnamay Naeini, M.; Sadegh, M.; AghaKouchak, A.; Hsu, K. L.; Sorooshian, S.; Yang, T.
2017-12-01
Meta-heuristic optimization algorithms have gained a great deal of attention in a wide variety of fields. Simplicity and flexibility of these algorithms, along with their robustness, make them attractive tools for solving optimization problems. Different optimization methods, however, have algorithm-specific strengths and limitations. The performance of each individual algorithm obeys the "No-Free-Lunch" theorem, which means that a single algorithm cannot consistently outperform all others across a variety of optimization problems. From a user's perspective, it is a tedious process to compare, validate, and select the best-performing algorithm for a specific problem or a set of test cases. In this study, we introduce a new hybrid optimization framework, entitled Shuffled Complex-Self Adaptive Hybrid EvoLution (SC-SAHEL), which combines the strengths of different evolutionary algorithms (EAs) in a parallel computing scheme and allows users to select the most suitable algorithm tailored to the problem at hand. The concept of SC-SAHEL is to execute different EAs as separate parallel search cores and let all participating EAs compete during the course of the search. The newly developed SC-SAHEL algorithm is designed to automatically select the best-performing algorithm for the given optimization problem. The algorithm is effective in finding the global optimum for several strenuous benchmark test functions, and computationally efficient compared to individual EAs. We benchmark the proposed SC-SAHEL algorithm over 29 conceptual test functions and two real-world case studies - one hydropower reservoir model and one hydrological model (SAC-SMA). Results show that the proposed framework outperforms individual EAs in an absolute majority of the test problems, and can provide results competitive with the fittest EA while delivering more comprehensive information during the search. The proposed framework is also flexible for merging additional EAs, boundary-handling techniques, and sampling schemes, and has good potential to be used in optimal operation and management of water-energy systems.
Overlapping communities from dense disjoint and high total degree clusters
NASA Astrophysics Data System (ADS)
Zhang, Hongli; Gao, Yang; Zhang, Yue
2018-04-01
Communities play an important role in sociology, biology, and especially computer science, where systems are often represented as networks, and community detection is therefore of great importance in these domains. A community is a dense subgraph of the whole graph with more links between its members than from its members to outside nodes, and nodes in the same community probably share common properties or play similar roles in the graph. Communities overlap when nodes in a graph belong to multiple communities. A wide variety of overlapping community detection methods have been proposed in the literature, and local expansion is one of the most successful techniques for dealing with large networks. This paper presents a density-based seeding method, in which dense disjoint local clusters are searched for and selected as seeds. The proposed method selects a seed by the total degree and density of local clusters, utilizing merely local structures of the network. Furthermore, this paper proposes a novel community refining phase via minimizing the conductance of each community, through which the quality of identified communities is largely improved in linear time. Experimental results on synthetic networks show that the proposed seeding method outperforms other state-of-the-art seeding methods and that the proposed refining method largely enhances the quality of the identified communities. Experimental results on real graphs with ground-truth communities show that the proposed approach outperforms other state-of-the-art overlapping community detection algorithms; in particular, it is more than two orders of magnitude faster than the existing global algorithms with higher quality, and it obtains much more accurate community structure than the current local algorithms without any a priori information.
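The refinement phase minimizes the conductance of each detected community. A minimal sketch of such a refinement on an undirected graph (adjacency as a dict of neighbor sets) is shown below; the greedy one-node-toggle move set is an assumption, and this naive version does not reproduce the paper's linear-time complexity.

```python
def conductance(adj, community):
    """phi(S) = cut(S, V\\S) / min(vol(S), vol(V\\S)) for an undirected graph."""
    S = set(community)
    cut = sum(1 for u in S for v in adj[u] if v not in S)
    vol_S = sum(len(adj[u]) for u in S)
    vol_rest = sum(len(adj[u]) for u in adj) - vol_S
    denom = min(vol_S, vol_rest)
    return cut / denom if denom > 0 else 1.0

def refine(adj, community, max_passes=10):
    """Greedily toggle boundary nodes while the community's conductance decreases."""
    S = set(community)
    for _ in range(max_passes):
        best_phi, best_move = conductance(adj, S), None
        boundary_out = {v for u in S for v in adj[u]} - S
        for v in boundary_out | set(S):
            trial = S ^ {v}                 # toggle membership of v
            if not trial:
                continue
            phi = conductance(adj, trial)
            if phi < best_phi:
                best_phi, best_move = phi, v
        if best_move is None:
            break
        S ^= {best_move}
    return S

adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
print(sorted(refine(adj, {0, 1})))          # grows toward the dense cluster {0, 1, 2}
```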
Effective Feature Selection for Classification of Promoter Sequences.
K, Kouser; P G, Lavanya; Rangarajan, Lalitha; K, Acharya Kshitish
2016-01-01
Exploring novel computational methods for making sense of biological data has not only been a necessity, but also productive. A part of this trend is the search for more efficient in silico methods/tools for the analysis of promoters, which are parts of DNA sequences that are involved in the regulation of expression of genes into other functional molecules. Promoter regions vary greatly in their function based on the sequence of nucleotides and the arrangement of protein-binding short regions called motifs. In fact, the regulatory nature of the promoters seems to be largely driven by the selective presence and/or the arrangement of these motifs. Here, we explore computational classification of promoter sequences based on the pattern of motif distributions, as such classification can pave a new way for functional analysis of promoters and help discover the functionally crucial motifs. We make use of Position Specific Motif Matrix (PSMM) features for exploring the possibility of accurately classifying promoter sequences using some of the popular classification techniques. The classification results on the complete feature set are low, perhaps due to the huge number of features. We propose two ways of reducing the feature set. Our test results show improvement in the classification output after the reduction of features. The results also show that decision trees outperform SVM (Support Vector Machine), KNN (K Nearest Neighbor) and the ensemble classifier LibD3C, particularly with reduced features. The proposed feature selection methods outperform some of the popular feature transformation methods such as PCA and SVD. Also, the proposed methods are as accurate as MRMR (a feature selection method) but much faster. Such methods could be useful for categorizing new promoters and exploring regulatory mechanisms of gene expression in complex eukaryotic species.
Elfwing, Stefan; Uchibe, Eiji; Doya, Kenji
2016-12-01
Free-energy based reinforcement learning (FERL) was proposed for learning in high-dimensional state and action spaces. However, the FERL method only really works well with binary, or close to binary, state input, where the number of active states is fewer than the number of non-active states. In the FERL method, the value function is approximated by the negative free energy of a restricted Boltzmann machine (RBM). In our earlier study, we demonstrated that the performance and the robustness of the FERL method can be improved by scaling the free energy by a constant that is related to the size of the network. In this study, we propose that RBM function approximation can be further improved by approximating the value function by the negative expected energy (EERL), instead of the negative free energy, as well as being able to handle continuous state input. We validate our proposed method by demonstrating that EERL: (1) outperforms FERL, as well as standard neural network and linear function approximation, for three versions of a gridworld task with high-dimensional image state input; (2) achieves new state-of-the-art results in stochastic SZ-Tetris in both model-free and model-based learning settings; and (3) significantly outperforms FERL and standard neural network function approximation for a robot navigation task with raw and noisy RGB images as state input and a large number of actions. Copyright © 2016 The Author(s). Published by Elsevier Ltd. All rights reserved.
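A minimal sketch of the two RBM-based value approximations is given below, assuming the standard binary RBM energy E(s,h) = -b·s - c·h - sᵀWh, so that FERL uses the negative free energy and EERL the negative expected energy under p(h|s); the weight shapes and toy values are illustrative only.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def value_ferl(s, W, b, c):
    """FERL: V(s) ~ -F(s), the negative free energy of the RBM."""
    pre = c + s @ W                        # hidden pre-activations
    free_energy = -b @ s - np.sum(np.log1p(np.exp(pre)))
    return -free_energy

def value_eerl(s, W, b, c):
    """EERL: V(s) ~ -E[E(s,h)], the negative expected energy under p(h|s)."""
    pre = c + s @ W
    p_h = sigmoid(pre)                     # E[h_j | s]
    expected_energy = -b @ s - p_h @ pre   # E[h_j]*(c_j + s.W_j) = p_j * pre_j
    return -expected_energy

rng = np.random.default_rng(1)
n_vis, n_hid = 8, 4
W = rng.normal(scale=0.1, size=(n_vis, n_hid))
b = rng.normal(scale=0.1, size=n_vis)
c = rng.normal(scale=0.1, size=n_hid)
s = rng.integers(0, 2, size=n_vis).astype(float)   # binary state(-action) vector
print(value_ferl(s, W, b, c), value_eerl(s, W, b, c))
```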
A total variation diminishing finite difference algorithm for sonic boom propagation models
NASA Technical Reports Server (NTRS)
Sparrow, Victor W.
1993-01-01
It is difficult to accurately model the rise phases of sonic boom waveforms with traditional finite difference algorithms because of finite difference phase dispersion. This paper introduces the concept of a total variation diminishing (TVD) finite difference method as a tool for accurately modeling the rise phases of sonic booms. A standard second order finite difference algorithm and its TVD modified counterpart are both applied to the one-way propagation of a square pulse. The TVD method clearly outperforms the non-TVD method, showing great potential as a new computational tool in the analysis of sonic boom propagation.
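The abstract does not spell out the scheme, but the idea can be illustrated on linear one-way advection of a square pulse: a standard second-order (Lax-Wendroff) update develops dispersive oscillations at the jumps, while a minmod flux-limited variant is TVD and keeps the rise monotone. The limiter choice and boundary handling below are assumptions.

```python
import numpy as np

def advect(u0, nu, steps, limiter):
    """One-way advection u_t + a u_x = 0 (a > 0), CFL number nu = a*dt/dx,
    written in flux form: u_i^{n+1} = u_i - nu*(F_{i+1/2} - F_{i-1/2})."""
    u = u0.copy()
    for _ in range(steps):
        du = np.roll(u, -1) - u                      # u_{i+1} - u_i
        du_m = u - np.roll(u, 1)                     # u_i - u_{i-1}
        r = du_m / np.where(np.abs(du) > 1e-12, du, 1e-12)
        if limiter == "lax_wendroff":
            phi = np.ones_like(u)                    # no limiting (dispersive)
        else:                                        # minmod limiter -> TVD
            phi = np.maximum(0.0, np.minimum(1.0, r))
        F = u + 0.5 * (1.0 - nu) * phi * du          # flux at i+1/2 (a absorbed)
        u = u - nu * (F - np.roll(F, 1))
        u[0] = u[-1] = 0.0                           # crude fixed boundaries
    return u

x = np.linspace(0.0, 1.0, 201)
u0 = np.where((x > 0.1) & (x < 0.3), 1.0, 0.0)       # square pulse
lw = advect(u0, nu=0.5, steps=150, limiter="lax_wendroff")
tvd = advect(u0, nu=0.5, steps=150, limiter="minmod")
print("overshoot LW :", lw.max() - 1.0)              # oscillations near the jumps
print("overshoot TVD:", tvd.max() - 1.0)             # ~0: monotone rise phases
```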
An Improved Heuristic Method for Subgraph Isomorphism Problem
NASA Astrophysics Data System (ADS)
Xiang, Yingzhuo; Han, Jiesi; Xu, Haijiang; Guo, Xin
2017-09-01
This paper focuses on the subgraph isomorphism (SI) problem. We present an improved genetic algorithm, a heuristic method to search for the optimal solution. The contribution of this paper is that we design a dedicated crossover algorithm and a new fitness function to measure the evolution process. Experiments show that our improved genetic algorithm performs better than other heuristic methods. For a large graph, such as a subgraph of 40 nodes, our algorithm outperforms the traditional tree search algorithms. We find that the performance of our improved genetic algorithm does not decrease as the number of nodes in the prototype graphs increases.
Improved Adaptive LSB Steganography Based on Chaos and Genetic Algorithm
NASA Astrophysics Data System (ADS)
Yu, Lifang; Zhao, Yao; Ni, Rongrong; Li, Ting
2010-12-01
We propose a novel steganographic method in JPEG images with high performance. Firstly, we propose improved adaptive LSB steganography, which can achieve high capacity while preserving the first-order statistics. Secondly, in order to minimize visual degradation of the stego image, we shuffle the bit order of the message based on a chaotic map whose parameters are selected by a genetic algorithm. Shuffling the message's bit order provides a new way to improve the performance of steganography. Experimental results show that our method outperforms classical steganographic methods in image quality, while preserving characteristics of the histogram and providing high capacity.
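A minimal sketch of the chaos-based bit-order shuffling step follows; the logistic map, its parameters r and x0, and the argsort-derived permutation are assumptions (in the paper the chaos parameters are selected by a genetic algorithm).

```python
import numpy as np

def chaotic_permutation(n, r=3.99, x0=0.7):
    """Iterate the logistic map x <- r*x*(1-x) and use the ranking of the
    resulting sequence as a permutation of the n message-bit positions."""
    x = np.empty(n)
    v = x0
    for i in range(n):
        v = r * v * (1.0 - v)
        x[i] = v
    return np.argsort(x)

def shuffle_bits(bits, perm):
    return bits[perm]

def unshuffle_bits(bits, perm):
    out = np.empty_like(bits)
    out[perm] = bits
    return out

message_bits = np.random.randint(0, 2, size=64)
perm = chaotic_permutation(len(message_bits))   # receiver re-derives perm from (r, x0)
stego_order = shuffle_bits(message_bits, perm)
assert np.array_equal(unshuffle_bits(stego_order, perm), message_bits)
```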
Color Image Classification Using Block Matching and Learning
NASA Astrophysics Data System (ADS)
Kondo, Kazuki; Hotta, Seiji
In this paper, we propose block matching and learning for color image classification. In our method, training images are partitioned into small blocks. Given a test image, it is also partitioned into small blocks, and mean-blocks corresponding to each test block are calculated with neighbor training blocks. Our method classifies a test image into the class that has the shortest total sum of distances between mean blocks and test ones. We also propose a learning method for reducing memory requirement. Experimental results show that our classification outperforms other classifiers such as support vector machine with bag of keypoints.
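A minimal sketch of the block-matching rule described above: for each test block, the mean of its k nearest training blocks within a class serves as the mean block, and the class with the smallest total distance wins. Block size, the Euclidean distance, and k are assumptions.

```python
import numpy as np

def to_blocks(img, bs):
    """Partition an H x W x C image into non-overlapping bs x bs blocks (flattened)."""
    H, W, C = img.shape
    img = img[: H - H % bs, : W - W % bs]
    blocks = img.reshape(H // bs, bs, W // bs, bs, C).swapaxes(1, 2)
    return blocks.reshape(-1, bs * bs * C)

def classify(test_img, train_imgs, train_labels, bs=8, k=3):
    """Assign the class with the smallest total distance between each test
    block and the mean of its k nearest training blocks from that class."""
    test_blocks = to_blocks(test_img, bs)
    scores = {}
    for c in set(train_labels):
        pool = np.vstack([to_blocks(im, bs)
                          for im, y in zip(train_imgs, train_labels) if y == c])
        total = 0.0
        for tb in test_blocks:
            d = np.linalg.norm(pool - tb, axis=1)
            mean_block = pool[np.argsort(d)[:k]].mean(axis=0)
            total += np.linalg.norm(mean_block - tb)
        scores[c] = total
    return min(scores, key=scores.get)

rng = np.random.default_rng(0)
train = [rng.random((32, 32, 3)) for _ in range(4)]
labels = [0, 0, 1, 1]
print(classify(train[0] + 0.01 * rng.random((32, 32, 3)), train, labels))  # expect 0
```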
Deep classification hashing for person re-identification
NASA Astrophysics Data System (ADS)
Wang, Jiabao; Li, Yang; Zhang, Xiancai; Miao, Zhuang; Tao, Gang
2018-04-01
With the development of public surveillance, person re-identification is becoming more and more important. Large-scale databases call for efficient computation and storage, and hashing is one of the most important techniques for this purpose. In this paper, we propose a new deep classification hashing network by introducing a new binary appropriation layer into traditional ImageNet pre-trained CNN models. It outputs binary-appropriate features, which can be easily quantized into binary hash codes for Hamming similarity comparison. Experiments show that our deep hashing method can outperform the state-of-the-art methods on the public CUHK03 and Market1501 datasets.
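A minimal sketch of the retrieval step implied above: thresholding the binary-appropriate outputs into hash codes and ranking gallery images by Hamming distance. The zero threshold and the toy feature vectors are assumptions.

```python
import numpy as np

def to_hash_codes(features, threshold=0.0):
    """Quantize real-valued network outputs into binary hash codes."""
    return (features > threshold).astype(np.uint8)

def hamming_rank(query_code, gallery_codes):
    """Return gallery indices sorted by Hamming distance to the query."""
    dists = np.count_nonzero(gallery_codes != query_code, axis=1)
    return np.argsort(dists), dists

rng = np.random.default_rng(0)
gallery_feats = rng.normal(size=(1000, 128))          # e.g. 128-bit codes
query_feats = gallery_feats[42] + 0.1 * rng.normal(size=128)

gallery_codes = to_hash_codes(gallery_feats)
order, dists = hamming_rank(to_hash_codes(query_feats), gallery_codes)
print(order[:5], dists[order[:5]])                    # index 42 should rank first
```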
Riniker, Sereina; Fechner, Nikolas; Landrum, Gregory A
2013-11-25
The concept of data fusion - the combination of information from different sources describing the same object with the expectation of generating a more accurate representation - has found application in a very broad range of disciplines. In the context of ligand-based virtual screening (VS), data fusion has been applied to combine knowledge from either different active molecules or different fingerprints to improve similarity search performance. Machine-learning (ML) methods based on fusion of multiple homogeneous classifiers, in particular random forests, have also been widely applied in the ML literature. The heterogeneous version of classifier fusion - fusing the predictions from different model types - has been less explored. Here, we investigate heterogeneous classifier fusion for ligand-based VS using three different ML methods, RF, naïve Bayes (NB), and logistic regression (LR), with four 2D fingerprints: atom pairs, topological torsions, the RDKit fingerprint, and a circular fingerprint. The methods are compared using a previously developed benchmarking platform for 2D fingerprints which is extended to ML methods in this article. The original data sets are filtered for difficulty, and a new set of challenging data sets from ChEMBL is added. Data sets were also generated for a second use case: starting from a small set of related actives instead of diverse actives. The final fused model consistently outperforms the other approaches across the broad variety of targets studied, indicating that heterogeneous classifier fusion is a very promising approach for ligand-based VS. The new data sets together with the adapted source code for ML methods are provided in the Supporting Information.
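A minimal sketch of heterogeneous classifier fusion on binary fingerprint-like vectors, using scikit-learn models and rank-averaging of predicted probabilities; the toy data, the rank-average fusion rule, and the hyperparameters are assumptions and need not match the paper's exact protocol.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import BernoulliNB
from scipy.stats import rankdata

rng = np.random.default_rng(0)
# Toy stand-in for 2D fingerprints: 1024-bit vectors, actives vs. decoys.
X = rng.integers(0, 2, size=(600, 1024))
y = (X[:, :20].sum(axis=1) + rng.normal(scale=1.0, size=600) > 10).astype(int)
X_train, y_train, X_test, y_test = X[:400], y[:400], X[400:], y[400:]

models = [
    RandomForestClassifier(n_estimators=200, random_state=0),
    BernoulliNB(),
    LogisticRegression(max_iter=2000),
]
ranks = []
for m in models:
    m.fit(X_train, y_train)
    p = m.predict_proba(X_test)[:, 1]
    ranks.append(rankdata(p))            # ranks put the three models on a common scale

fused_score = np.mean(ranks, axis=0)     # heterogeneous fusion: average rank
top = np.argsort(fused_score)[::-1][:50]
print("actives in fused top-50:", y_test[top].sum())
```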
A COMPARISON OF FLARE FORECASTING METHODS. I. RESULTS FROM THE “ALL-CLEAR” WORKSHOP
DOE Office of Scientific and Technical Information (OSTI.GOV)
Barnes, G.; Leka, K. D.; Dunn, T.
2016-10-01
Solar flares produce radiation that can have an almost immediate effect on the near-Earth environment, making it crucial to forecast flares in order to mitigate their negative effects. The number of published approaches to flare forecasting using photospheric magnetic field observations has proliferated, with varying claims about how well each works. Because of the different analysis techniques and data sets used, it is essentially impossible to compare the results from the literature. This problem is exacerbated by the low event rates of large solar flares. The challenges of forecasting rare events have long been recognized in the meteorology community, but have yet to be fully acknowledged by the space weather community. During the interagency workshop on “all clear” forecasts held in Boulder, CO in 2009, the performance of a number of existing algorithms was compared on common data sets, specifically line-of-sight magnetic field and continuum intensity images from the Michelson Doppler Imager, with consistent definitions of what constitutes an event. We demonstrate the importance of making such systematic comparisons, and of using standard verification statistics to determine what constitutes a good prediction scheme. When a comparison was made in this fashion, no one method clearly outperformed all others, which may in part be due to the strong correlations among the parameters used by different methods to characterize an active region. For M-class flares and above, the set of methods tends toward a weakly positive skill score (as measured with several distinct metrics), with no participating method proving substantially better than climatological forecasts.
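Standard verification statistics of the kind referred to above can be computed from a 2x2 contingency table of forecasts versus observed events; the sketch below shows the true skill statistic and the Heidke skill score as two illustrative choices (not necessarily the workshop's exact metric list).

```python
import numpy as np

def contingency(forecast, observed):
    """2x2 counts for binary flare forecasts: hits, misses, false alarms, correct nulls."""
    f, o = np.asarray(forecast, bool), np.asarray(observed, bool)
    tp = np.sum(f & o); fn = np.sum(~f & o)
    fp = np.sum(f & ~o); tn = np.sum(~f & ~o)
    return tp, fn, fp, tn

def true_skill_statistic(tp, fn, fp, tn):
    """TSS = probability of detection minus probability of false detection."""
    return tp / (tp + fn) - fp / (fp + tn)

def heidke_skill_score(tp, fn, fp, tn):
    """HSS = improvement over the random forecast with the same marginals."""
    num = 2.0 * (tp * tn - fp * fn)
    den = (tp + fn) * (fn + tn) + (tp + fp) * (fp + tn)
    return num / den

rng = np.random.default_rng(0)
observed = rng.random(1000) < 0.05                 # rare events (e.g. M-class flares)
forecast = observed & (rng.random(1000) < 0.6) | (rng.random(1000) < 0.03)
tp, fn, fp, tn = contingency(forecast, observed)
print("TSS:", true_skill_statistic(tp, fn, fp, tn),
      "HSS:", heidke_skill_score(tp, fn, fp, tn))
```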
Skin tumor area extraction using an improved dynamic programming approach.
Abbas, Qaisar; Celebi, M E; Fondón García, Irene
2012-05-01
Border (B) description of melanoma and other pigmented skin lesions is one of the most important tasks for the clinical diagnosis of dermoscopy images using the ABCD rule. For an accurate description of the border, there must be an effective skin tumor area extraction (STAE) method. However, this task is complicated due to uneven illumination, artifacts present in the lesions and smooth areas or fuzzy borders of the desired regions. In this paper, a novel STAE algorithm based on improved dynamic programming (IDP) is presented. The STAE technique consists of the following four steps: color space transform, pre-processing, rough tumor area detection and refinement of the segmented area. The procedure is performed in the CIE L*a*b* color space, which is approximately uniform and is therefore related to dermatologist's perception. After pre-processing the skin lesions to reduce artifacts, the DP algorithm is improved by introducing a local cost function, which is based on color and texture weights. The STAE method is tested on a total of 100 dermoscopic images. In order to compare the performance of STAE with other state-of-the-art algorithms, various statistical measures based on dermatologist-drawn borders are utilized as a ground truth. The proposed method outperforms the others with a sensitivity of 96.64%, a specificity of 98.14% and an error probability of 5.23%. The results demonstrate that this STAE method by IDP is an effective solution when compared with other state-of-the-art segmentation techniques. The proposed method can accurately extract tumor borders in dermoscopy images. © 2011 John Wiley & Sons A/S.
de Heer, Walt A.; Berger, Claire; Ruan, Ming; Sprinkle, Mike; Li, Xuebin; Hu, Yike; Zhang, Baiqian; Hankinson, John; Conrad, Edward
2011-01-01
After the pioneering investigations into graphene-based electronics at Georgia Tech, great strides have been made developing epitaxial graphene on silicon carbide (EG) as a new electronic material. EG has not only demonstrated its potential for large-scale applications; it has also become an important material for fundamental two-dimensional electron gas physics. It was long known that graphene mono- and multilayers grow on SiC crystals at high temperatures in ultrahigh vacuum. At these temperatures, silicon sublimes from the surface and the carbon-rich surface layer transforms to graphene. However, the quality of the graphene produced in ultrahigh vacuum is poor due to the high sublimation rates at relatively low temperatures. The Georgia Tech team developed growth methods involving encapsulating the SiC crystals in graphite enclosures, thereby sequestering the evaporated silicon and bringing the growth process closer to equilibrium. In this confinement controlled sublimation (CCS) process, very high-quality graphene is grown on both polar faces of the SiC crystals. Since 2003, over 50 publications have used CCS-grown graphene, where it is known as “furnace grown” graphene. Graphene multilayers grown on the carbon-terminated face of SiC, using the CCS method, were shown to consist of decoupled high mobility graphene layers. The CCS method is now applied on structured silicon carbide surfaces to produce high mobility nano-patterned graphene structures, thereby demonstrating that EG is a viable contender for next-generation electronics. Here we present for the first time the CCS method that outperforms other epitaxial graphene production methods. PMID:21960446
NASA Astrophysics Data System (ADS)
Pascual Garcia, Juan
In this PhD thesis, a neural-network-based method for the analysis of shielded multilayer circuits has been developed. One of the most successful analysis procedures for this kind of structure is the integral equation (IE) technique solved by the method of moments (MoM). In order to solve the IE, in the version which uses the potentials relevant to the media, it is necessary to have a formulation of the Green's functions associated with those potentials. The main computational burden in the IE resolution lies in the numerical evaluation of the Green's functions. In this work, the circuit analysis has been drastically accelerated by approximating the Green's functions by means of neural networks. Once trained, the neural networks substitute for the Green's functions in the IE. Two different types of neural networks have been used: radial basis function neural networks (RBFNN) and Chebyshev neural networks. Two distinct operations make the correct approximation of the Green's functions possible. On the one hand, a very effective input space division has been developed. On the other hand, the elimination of the singularity makes the approximation of slowly varying functions feasible. Two different singularity elimination strategies have been developed. The first is based on multiplication by the distance (rho) between the source and observation points. The second outperforms the first: it consists of the extraction of two layers of spatial images from the whole summation of images. With regard to the Chebyshev neural networks, the OLS training algorithm has been applied in a novel fashion. This method allows optimum design of this kind of neural network, and in this way their performance greatly outperforms that of the RBFNNs. For both networks, the time gain achieved makes the neural method profitable: the time invested in the input space division and in the neural training becomes negligible after only a few circuit analyses. To show, in a practical way, the ability of the neural-network-based analysis method, two new design procedures have been developed. The first method uses genetic algorithms to optimize an initial filter which does not fulfill the established specifications. A new fitness function, especially well suited to filter design, has been defined in order to ensure the correct convergence of the optimization process. This new function measures the fulfillment of the specifications and also prevents the appearance of the premature convergence problem. The second method is founded on the approximation, by means of neural networks, of the relations between the electrical parameters that define the circuit response and the physical dimensions that synthesize those parameters. The neural networks trained with these data can be used in the design of many circuits in a given structure. Both methods have shown their ability in the design of practical filters.
Colorectal cancer staging: comparison of whole-body PET/CT and PET/MR.
Catalano, Onofrio A; Coutinho, Artur M; Sahani, Dushyant V; Vangel, Mark G; Gee, Michael S; Hahn, Peter F; Witzel, Thomas; Soricelli, Andrea; Salvatore, Marco; Catana, Ciprian; Mahmood, Umar; Rosen, Bruce R; Gervais, Debra
2017-04-01
Correct staging is imperative for colorectal cancer (CRC) since it influences both prognosis and management. Several imaging methods are used for this purpose, with variable performance. Positron emission tomography-magnetic resonance (PET/MR) is an innovative imaging technique recently employed for clinical application. The present study was undertaken to compare the staging accuracy of whole-body positron emission tomography-computed tomography (PET/CT) with whole-body PET/MR in patients with both newly diagnosed and treated colorectal cancer. Twenty-six patients, who underwent same-day whole-body (WB) PET/CT and WB-PET/MR, were evaluated. PET/CT and PET/MR studies were interpreted by consensus by a radiologist and a nuclear medicine physician. Correlations with prior imaging and follow-up studies were used as the reference standard. Correct staging was compared between methods using McNemar's chi-square test. The two methods were in agreement and correct for 18/26 (69%) patients, and in agreement and incorrect for one patient (3.8%). PET/MR and PET/CT stages for the remaining 7/26 patients (27%) were discordant, with PET/MR staging being correct in all seven cases. PET/MR significantly outperformed PET/CT overall for accurate staging (P = 0.02). PET/MR outperformed PET/CT in CRC staging and might allow accurate local and distant staging of CRC patients both at the time of diagnosis and during follow-up.
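A minimal sketch of the McNemar comparison on paired correct/incorrect staging calls; the continuity-corrected statistic is an assumption, and the toy data simply reconstruct the 18 concordant-correct, 1 concordant-incorrect, and 7 discordant (all favoring PET/MR) patients reported above.

```python
from scipy.stats import chi2

def mcnemar(correct_a, correct_b, continuity=True):
    """McNemar's test for paired binary outcomes (e.g. correct staging by
    PET/MR vs PET/CT in the same patients). b, c are the discordant counts."""
    b = sum(1 for x, y in zip(correct_a, correct_b) if x and not y)
    c = sum(1 for x, y in zip(correct_a, correct_b) if not x and y)
    if b + c == 0:
        return 0.0, 1.0
    num = (abs(b - c) - 1) ** 2 if continuity else (b - c) ** 2
    stat = num / (b + c)
    return stat, chi2.sf(stat, df=1)

# Hypothetical paired calls for 26 patients (True = correct stage).
pet_mr = [True] * 25 + [False]                    # 18 + 7 correct, 1 incorrect
pet_ct = [True] * 18 + [False] * 7 + [False]      # 18 correct, 8 incorrect
print(mcnemar(pet_mr, pet_ct))                    # statistic and p-value (~0.02)
```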
enDNA-Prot: identification of DNA-binding proteins by applying ensemble learning.
Xu, Ruifeng; Zhou, Jiyun; Liu, Bin; Yao, Lin; He, Yulan; Zou, Quan; Wang, Xiaolong
2014-01-01
DNA-binding proteins are crucial for various cellular processes, such as recognition of specific nucleotides, regulation of transcription, and regulation of gene expression. Developing an effective model for identifying DNA-binding proteins is an urgent research problem. Up to now, many methods have been proposed, but most of them focus on only one classifier and cannot make full use of the large number of negative samples to improve prediction performance. This study proposed a predictor called enDNA-Prot for DNA-binding protein identification by employing the ensemble learning technique. Experimental results showed that enDNA-Prot was comparable with DNA-Prot and outperformed DNAbinder and iDNA-Prot with performance improvement in the range of 3.97-9.52% in ACC and 0.08-0.19 in MCC. Furthermore, when the benchmark dataset was expanded with negative samples, the performance of enDNA-Prot outperformed the three existing methods by 2.83-16.63% in terms of ACC and 0.02-0.16 in terms of MCC. It indicated that enDNA-Prot is an effective method for DNA-binding protein identification and that expanding the training dataset with negative samples can improve its performance. For the convenience of the vast majority of experimental scientists, we developed a user-friendly web-server for enDNA-Prot which is freely accessible to the public.
TU-AB-303-08: GPU-Based Software Platform for Efficient Image-Guided Adaptive Radiation Therapy
DOE Office of Scientific and Technical Information (OSTI.GOV)
Park, S; Robinson, A; McNutt, T
2015-06-15
Purpose: In this study, we develop an integrated software platform for adaptive radiation therapy (ART) that combines fast and accurate image registration, segmentation, and dose computation/accumulation methods. Methods: The proposed system consists of three key components: 1) deformable image registration (DIR), 2) automatic segmentation, and 3) dose computation/accumulation. The computationally intensive modules including DIR and dose computation have been implemented on a graphics processing unit (GPU). All required patient-specific data including the planning CT (pCT) with contours, daily cone-beam CTs, and treatment plan are automatically queried and retrieved from their own databases. To improve the accuracy of DIR between pCT and CBCTs, we use the double force demons DIR algorithm in combination with iterative CBCT intensity correction by local intensity histogram matching. Segmentation of daily CBCT is then obtained by propagating contours from the pCT. Daily dose delivered to the patient is computed on the registered pCT by a GPU-accelerated superposition/convolution algorithm. Finally, computed daily doses are accumulated to show the total delivered dose to date. Results: Since the accuracy of DIR critically affects the quality of the other processes, we first evaluated our DIR method on eight head-and-neck cancer cases and compared its performance. Normalized mutual information (NMI) and normalized cross-correlation (NCC) were computed as similarity measures; our method produced an overall NMI of 0.663 and NCC of 0.987, outperforming conventional methods by 3.8% and 1.9%, respectively. Experimental results show that our registration method is more consistent and robust than existing algorithms, and also computationally efficient. Computation time at each fraction took around one minute (30–50 seconds for registration and 15–25 seconds for dose computation). Conclusion: We developed an integrated GPU-accelerated software platform that enables accurate and efficient DIR, auto-segmentation, and dose computation, thus supporting an efficient ART workflow. This work was supported by NIH/NCI under grant R42CA137886.
Kim, Ki Hwan; Do, Won-Joon; Park, Sung-Hong
2018-05-04
The routine MRI scan protocol consists of multiple pulse sequences that acquire images of varying contrast. Since high frequency contents such as edges are not significantly affected by image contrast, down-sampled images in one contrast may be improved by high resolution (HR) images acquired in another contrast, reducing the total scan time. In this study, we propose a new deep learning framework that uses HR MR images in one contrast to generate HR MR images from highly down-sampled MR images in another contrast. The proposed convolutional neural network (CNN) framework consists of two CNNs: (a) a reconstruction CNN for generating HR images from the down-sampled images using HR images acquired with a different MRI sequence and (b) a discriminator CNN for improving the perceptual quality of the generated HR images. The proposed method was evaluated using a public brain tumor database and in vivo datasets. The performance of the proposed method was assessed in tumor and no-tumor cases separately, with perceptual image quality being judged by a radiologist. To overcome the challenge of training the network with a small number of available in vivo datasets, the network was pretrained using the public database and then fine-tuned using the small number of in vivo datasets. The performance of the proposed method was also compared to that of several compressed sensing (CS) algorithms. Incorporating HR images of another contrast improved the quantitative assessments of the generated HR image in reference to ground truth. Also, incorporating a discriminator CNN yielded perceptually higher image quality. These results were verified in regions of normal tissue as well as tumors for various MRI sequences from pseudo k-space data generated from the public database. The combination of pretraining with the public database and fine-tuning with the small number of real k-space datasets enhanced the performance of CNNs in in vivo application compared to training CNNs from scratch. The proposed method outperformed the compressed sensing methods. The proposed method can be a good strategy for accelerating routine MRI scanning. © 2018 American Association of Physicists in Medicine.
Adaptive Distributed Video Coding with Correlation Estimation using Expectation Propagation
Cui, Lijuan; Wang, Shuang; Jiang, Xiaoqian; Cheng, Samuel
2013-01-01
Distributed video coding (DVC) is rapidly increasing in popularity by way of shifting the complexity from encoder to decoder while, at least in theory, compression performance does not degrade. In contrast with conventional video codecs, the inter-frame correlation in DVC is explored at the decoder based on the received syndromes of the Wyner-Ziv (WZ) frame and the side information (SI) frame generated from other frames available only at the decoder. However, the ultimate decoding performance of DVC rests on the assumption that perfect knowledge of the correlation statistics between the WZ and SI frames is available at the decoder. Therefore, the ability to obtain a good statistical correlation estimate is becoming increasingly important in practical DVC implementations. Generally, the existing correlation estimation methods in DVC can be classified into two main types: pre-estimation, where estimation starts before decoding, and on-the-fly (OTF) estimation, where the estimate can be refined iteratively during decoding. As potential changes between frames might be unpredictable or dynamic, OTF estimation methods usually outperform pre-estimation techniques at the cost of increased decoding complexity (e.g., sampling methods). In this paper, we propose a low-complexity adaptive DVC scheme using expectation propagation (EP), where correlation estimation is performed OTF as it is carried out jointly with decoding of the factor graph-based DVC code. Among different approximate inference methods, EP generally offers a better tradeoff between accuracy and complexity. Experimental results show that our proposed scheme outperforms the benchmark state-of-the-art DISCOVER codec and other cases without correlation tracking, and achieves comparable decoding performance but with significantly lower complexity compared with the sampling method. PMID:23750314
Adaptive distributed video coding with correlation estimation using expectation propagation
NASA Astrophysics Data System (ADS)
Cui, Lijuan; Wang, Shuang; Jiang, Xiaoqian; Cheng, Samuel
2012-10-01
Distributed video coding (DVC) is rapidly increasing in popularity by way of shifting the complexity from encoder to decoder while, at least in theory, compression performance does not degrade. In contrast with conventional video codecs, the inter-frame correlation in DVC is explored at the decoder based on the received syndromes of the Wyner-Ziv (WZ) frame and the side information (SI) frame generated from other frames available only at the decoder. However, the ultimate decoding performance of DVC rests on the assumption that perfect knowledge of the correlation statistics between the WZ and SI frames is available at the decoder. Therefore, the ability to obtain a good statistical correlation estimate is becoming increasingly important in practical DVC implementations. Generally, the existing correlation estimation methods in DVC can be classified into two main types: pre-estimation, where estimation starts before decoding, and on-the-fly (OTF) estimation, where the estimate can be refined iteratively during decoding. As potential changes between frames might be unpredictable or dynamic, OTF estimation methods usually outperform pre-estimation techniques at the cost of increased decoding complexity (e.g., sampling methods). In this paper, we propose a low-complexity adaptive DVC scheme using expectation propagation (EP), where correlation estimation is performed OTF as it is carried out jointly with decoding of the factor graph-based DVC code. Among different approximate inference methods, EP generally offers a better tradeoff between accuracy and complexity. Experimental results show that our proposed scheme outperforms the benchmark state-of-the-art DISCOVER codec and other cases without correlation tracking, and achieves comparable decoding performance but with significantly lower complexity compared with the sampling method.
A quantitative experimental phantom study on MRI image uniformity.
Felemban, Doaa; Verdonschot, Rinus G; Iwamoto, Yuri; Uchiyama, Yuka; Kakimoto, Naoya; Kreiborg, Sven; Murakami, Shumei
2018-05-23
Our goal was to assess MR image uniformity by investigating aspects influencing said uniformity via a method laid out by the National Electrical Manufacturers Association (NEMA). Six metallic materials embedded in a glass phantom were scanned (i.e. Au, Ag, Al, Au-Ag-Pd alloy, Ti and Co-Cr alloy) as well as a reference image. Sequences included spin echo (SE) and gradient echo (GRE) scanned in three planes (i.e. axial, coronal, and sagittal). Moreover, three surface coil types (i.e. head and neck, brain, and temporomandibular joint coils) and two image correction methods (i.e. surface coil intensity correction or SCIC, phased array uniformity enhancement or PURE) were employed to evaluate their effectiveness on image uniformity. Image uniformity was assessed using the National Electrical Manufacturers Association peak-deviation non-uniformity method. Results showed that temporomandibular joint coils elicited the least uniform image and brain coils outperformed head and neck coils when metallic materials were present. Additionally, when metallic materials were present, spin echo outperformed gradient echo, especially for Co-Cr (particularly in the axial plane). Furthermore, both SCIC and PURE improved image uniformity compared to uncorrected images, and SCIC slightly surpassed PURE when metallic materials were present. Lastly, Co-Cr elicited the least uniform image while the other metallic materials generally showed similar patterns (i.e. no significant deviation from images without metallic materials). Overall, a quantitative understanding of the factors influencing MR image uniformity (e.g. coil type, imaging method, metal susceptibility, and post-hoc correction method) is advantageous for optimizing image quality, assists clinical interpretation, and may result in improved medical and dental care.
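A minimal sketch of a peak-deviation-style non-uniformity measure over a region of interest; the exact NEMA formula is not reproduced here, and the 100*(Smax - Smin)/(Smax + Smin) form, the ROI shape, and the synthetic phantom are assumptions.

```python
import numpy as np

def peak_deviation_nonuniformity(image, roi_mask):
    """Percent non-uniformity over an ROI: 100 * (Smax - Smin) / (Smax + Smin).
    (A percent integral uniformity would then be 100 minus this value.)"""
    roi = image[roi_mask]
    s_max, s_min = roi.max(), roi.min()
    return 100.0 * (s_max - s_min) / (s_max + s_min)

rng = np.random.default_rng(0)
phantom = 1000.0 + 20.0 * rng.standard_normal((256, 256))   # fairly uniform slice
yy, xx = np.mgrid[:256, :256]
roi = (yy - 128) ** 2 + (xx - 128) ** 2 < 100 ** 2           # central circular ROI
print(f"non-uniformity: {peak_deviation_nonuniformity(phantom, roi):.2f}%")
```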
Adaptive Distributed Video Coding with Correlation Estimation using Expectation Propagation.
Cui, Lijuan; Wang, Shuang; Jiang, Xiaoqian; Cheng, Samuel
2012-10-15
Distributed video coding (DVC) is rapidly increasing in popularity by way of shifting the complexity from encoder to decoder while, at least in theory, compression performance does not degrade. In contrast with conventional video codecs, the inter-frame correlation in DVC is explored at the decoder based on the received syndromes of the Wyner-Ziv (WZ) frame and the side information (SI) frame generated from other frames available only at the decoder. However, the ultimate decoding performance of DVC rests on the assumption that perfect knowledge of the correlation statistics between the WZ and SI frames is available at the decoder. Therefore, the ability to obtain a good statistical correlation estimate is becoming increasingly important in practical DVC implementations. Generally, the existing correlation estimation methods in DVC can be classified into two main types: pre-estimation, where estimation starts before decoding, and on-the-fly (OTF) estimation, where the estimate can be refined iteratively during decoding. As potential changes between frames might be unpredictable or dynamic, OTF estimation methods usually outperform pre-estimation techniques at the cost of increased decoding complexity (e.g., sampling methods). In this paper, we propose a low-complexity adaptive DVC scheme using expectation propagation (EP), where correlation estimation is performed OTF as it is carried out jointly with decoding of the factor graph-based DVC code. Among different approximate inference methods, EP generally offers a better tradeoff between accuracy and complexity. Experimental results show that our proposed scheme outperforms the benchmark state-of-the-art DISCOVER codec and other cases without correlation tracking, and achieves comparable decoding performance but with significantly lower complexity compared with the sampling method.
DNA Barcoding of Recently Diverged Species: Relative Performance of Matching Methods
van Velzen, Robin; Weitschek, Emanuel; Felici, Giovanni; Bakker, Freek T.
2012-01-01
Recently diverged species are challenging for identification, yet they are frequently of special interest scientifically as well as from a regulatory perspective. DNA barcoding has proven instrumental in species identification, especially in insects and vertebrates, but for the identification of recently diverged species it has been reported to be problematic in some cases. Problems are mostly due to incomplete lineage sorting or simply lack of a ‘barcode gap’ and probably related to large effective population size and/or low mutation rate. Our objective was to compare six methods in their ability to correctly identify recently diverged species with DNA barcodes: neighbor joining and parsimony (both tree-based), nearest neighbor and BLAST (similarity-based), and the diagnostic methods DNA-BAR, and BLOG. We analyzed simulated data assuming three different effective population sizes as well as three selected empirical data sets from published studies. Results show, as expected, that success rates are significantly lower for recently diverged species (∼75%) than for older species (∼97%) (P<0.00001). Similarity-based and diagnostic methods significantly outperform tree-based methods, when applied to simulated DNA barcode data (P<0.00001). The diagnostic method BLOG had highest correct query identification rate based on simulated (86.2%) as well as empirical data (93.1%), indicating that it is a consistently better method overall. Another advantage of BLOG is that it offers species-level information that can be used outside the realm of DNA barcoding, for instance in species description or molecular detection assays. Even though we can confirm that identification success based on DNA barcoding is generally high in our data, recently diverged species remain difficult to identify. Nevertheless, our results contribute to improved solutions for their accurate identification. PMID:22272356
DNA barcoding of recently diverged species: relative performance of matching methods.
van Velzen, Robin; Weitschek, Emanuel; Felici, Giovanni; Bakker, Freek T
2012-01-01
Recently diverged species are challenging for identification, yet they are frequently of special interest scientifically as well as from a regulatory perspective. DNA barcoding has proven instrumental in species identification, especially in insects and vertebrates, but for the identification of recently diverged species it has been reported to be problematic in some cases. Problems are mostly due to incomplete lineage sorting or simply lack of a 'barcode gap' and probably related to large effective population size and/or low mutation rate. Our objective was to compare six methods in their ability to correctly identify recently diverged species with DNA barcodes: neighbor joining and parsimony (both tree-based), nearest neighbor and BLAST (similarity-based), and the diagnostic methods DNA-BAR, and BLOG. We analyzed simulated data assuming three different effective population sizes as well as three selected empirical data sets from published studies. Results show, as expected, that success rates are significantly lower for recently diverged species (∼75%) than for older species (∼97%) (P<0.00001). Similarity-based and diagnostic methods significantly outperform tree-based methods, when applied to simulated DNA barcode data (P<0.00001). The diagnostic method BLOG had highest correct query identification rate based on simulated (86.2%) as well as empirical data (93.1%), indicating that it is a consistently better method overall. Another advantage of BLOG is that it offers species-level information that can be used outside the realm of DNA barcoding, for instance in species description or molecular detection assays. Even though we can confirm that identification success based on DNA barcoding is generally high in our data, recently diverged species remain difficult to identify. Nevertheless, our results contribute to improved solutions for their accurate identification.
Forecasting USAF JP-8 Fuel Needs
2009-03-01
... with more simple and easy-to-implement methods versus complex ones. When we consider long-term forecasts, 5 years in this case, multiple regression outperforms ANN modeling within the specified ...
Quadrotor Intercept Trajectory Planning and Simulation
2017-06-01
[Figure 41: Summary of experimental data intercept time; results are grouped by geometry type and colored based on trajectory planner.] Quadrotor drones pose a safety hazard when operated in or near controlled airspace. A hazardous ... quadrotor could be intercepted and removed by another quadrotor. In this thesis, we seek to determine if optimal control methods outperform missile ...
Kim, Dongchul; Kang, Mingon; Biswas, Ashis; Liu, Chunyu; Gao, Jean
2016-08-10
Inferring gene regulatory networks is one of the most interesting research areas in systems biology. Many inference methods have been developed using a variety of computational models and approaches. However, there are two issues to solve. First, depending on the structural or computational model of the inference method, the results tend to be inconsistent due to the innately different advantages and limitations of the methods. Therefore, the combination of dissimilar approaches is demanded as an alternative way to overcome the limitations of standalone methods through complementary integration. Second, sparse linear regression penalized by a regularization parameter (lasso) and bootstrapping-based sparse linear regression methods were suggested as state-of-the-art methods for network inference, but they are not effective for small-sample-size data, and a true regulator could be missed if the target gene is strongly affected by an indirect regulator with high correlation or by another true regulator. We present two novel network inference methods based on the integration of three different criteria: (i) a z-score to measure the variation of gene expression from knockout data, (ii) mutual information for the dependency between two genes, and (iii) linear regression-based feature selection. Based on these criteria, we propose a lasso-based random feature selection algorithm (LARF) to achieve better performance, overcoming the limitations of bootstrapping mentioned above. In this work, there are three main contributions. First, our z-score-based method to measure gene expression variations from knockout data is more effective than similar criteria of related works. Second, we confirmed that the true regulator selection can be effectively improved by LARF. Lastly, we verified that an integrative approach can clearly outperform a single method when two different methods are effectively combined. In the experiments, our methods were validated by outperforming the state-of-the-art methods on DREAM challenge data, and then LARF was applied to inference of the gene regulatory network associated with psychiatric disorders.
Zhang, Ai-bing; Feng, Jie; Ward, Robert D; Wan, Ping; Gao, Qiang; Wu, Jun; Zhao, Wei-zhong
2012-01-01
Species identification via DNA barcodes is contributing greatly to current bioinventory efforts. The initial, and widely accepted, proposal was to use the protein-coding cytochrome c oxidase subunit I (COI) region as the standard barcode for animals, but recently non-coding internal transcribed spacer (ITS) genes have been proposed as candidate barcodes for both animals and plants. However, achieving a robust alignment for non-coding regions can be problematic. Here we propose two new methods (DV-RBF and FJ-RBF) to address this issue for species assignment by both coding and non-coding sequences that take advantage of the power of machine learning and bioinformatics. We demonstrate the value of the new methods with four empirical datasets, two representing typical protein-coding COI barcode datasets (neotropical bats and marine fish) and two representing non-coding ITS barcodes (rust fungi and brown algae). Using two random sub-sampling approaches, we demonstrate that the new methods significantly outperformed existing Neighbor-joining (NJ) and Maximum likelihood (ML) methods for both coding and non-coding barcodes when there was complete species coverage in the reference dataset. The new methods also out-performed NJ and ML methods for non-coding sequences in circumstances of potentially incomplete species coverage, although then the NJ and ML methods performed slightly better than the new methods for protein-coding barcodes. A 100% success rate of species identification was achieved with the two new methods for 4,122 bat queries and 5,134 fish queries using COI barcodes, with 95% confidence intervals (CI) of 99.75-100%. The new methods also obtained a 96.29% success rate (95%CI: 91.62-98.40%) for 484 rust fungi queries and a 98.50% success rate (95%CI: 96.60-99.37%) for 1094 brown algae queries, both using ITS barcodes.
Evaluation of copy number variation detection for a SNP array platform
2014-01-01
Background Copy Number Variations (CNVs) are usually inferred from Single Nucleotide Polymorphism (SNP) arrays by use of software packages based on given algorithms. However, there is no clear understanding of the performance of these software packages; it is therefore difficult to select one or several software packages for CNV detection based on the SNP array platform. We selected four publicly available software packages designed for CNV calling from an Affymetrix SNP array, including Birdsuite, dChip, Genotyping Console (GTC) and PennCNV. The publicly available dataset generated by array-based Comparative Genomic Hybridization (CGH), with a resolution of 24 million probes per sample, was considered to be the “gold standard”. The success rate, average stability rate, sensitivity, consistency and reproducibility of these four software packages were assessed against this gold standard. Specifically, we also compared the efficiency of detecting CNVs simultaneously by two, three and all of the software packages with that by a single software package. Results Simply from the quantity of the detected CNVs, Birdsuite detected the most while GTC detected the least. We found that Birdsuite and dChip had obvious detection bias, and GTC seemed to be inferior because of the small number of CNVs it detected. Thereafter we investigated the detection consistency between each software package and the remaining three. We found that the consistency of dChip was the lowest while that of GTC was the highest. Compared with the CNV detection results of CGH, in the matching group, GTC called the most matching CNVs and PennCNV-Affy ranked second. In the non-overlapping group, GTC called the fewest CNVs. With regard to the reproducibility of CNV calling, larger CNVs were usually replicated better. PennCNV-Affy shows the best consistency while Birdsuite shows the poorest. Conclusion We found that PennCNV outperformed the other three packages in the sensitivity and specificity of CNV calling. Obviously, each calling method had its own limitations and advantages for different data analyses. Therefore, optimal calling methods might be identified by using multiple algorithms to evaluate the concordance and discordance of SNP array-based CNV calling. PMID:24555668
A Multi-Channel Method for Detecting Periodic Forced Oscillations in Power Systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Follum, James D.; Tuffner, Francis K.
2016-11-14
Forced oscillations in electric power systems are often symptomatic of equipment malfunction or improper operation. Detecting and addressing the cause of the oscillations can improve overall system operation. In this paper, a multi-channel method of detecting forced oscillations and estimating their frequencies is proposed. The method operates by comparing the sum of scaled periodograms from various channels to a threshold. A method of setting the threshold to specify the detector's probability of false alarm while accounting for the correlation between channels is also presented. Results from simulated and measured power system data indicate that the method outperforms its single-channel counterpart and is suitable for real-world applications.
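A minimal sketch of the detector described above: each channel's periodogram is scaled by a crude noise-floor estimate, the scaled periodograms are summed across channels, and frequency bins exceeding a threshold are flagged. The median-based scaling and the fixed threshold are assumptions; the paper derives the threshold from a target probability of false alarm.

```python
import numpy as np

def scaled_periodogram(x, fs):
    """One-sided periodogram of a detrended channel, scaled by a noise-floor estimate."""
    x = np.asarray(x, float) - np.mean(x)
    n = len(x)
    pxx = np.abs(np.fft.rfft(x)) ** 2 / n
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return freqs, pxx / np.median(pxx)            # crude per-channel noise scaling

def detect_forced_oscillation(channels, fs, threshold):
    """Sum scaled periodograms over channels and return frequencies above threshold."""
    freqs, total = None, None
    for x in channels:
        freqs, pxx = scaled_periodogram(x, fs)
        total = pxx if total is None else total + pxx
    return freqs[total > threshold], total

fs, t = 30.0, np.arange(0, 60, 1 / 30.0)           # 30-Hz PMU data, 60 s window
rng = np.random.default_rng(0)
channels = [0.3 * np.sin(2 * np.pi * 1.3 * t + p) + rng.standard_normal(t.size)
            for p in (0.0, 1.0, 2.0)]               # common 1.3-Hz forced oscillation
hits, _ = detect_forced_oscillation(channels, fs, threshold=40.0)
print("detected frequencies (Hz):", hits)
```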
Doubly stochastic radial basis function methods
NASA Astrophysics Data System (ADS)
Yang, Fenglian; Yan, Liang; Ling, Leevan
2018-06-01
We propose a doubly stochastic radial basis function (DSRBF) method for function recovery. Instead of using a constant, we treat the RBF shape parameters as stochastic variables whose distribution is determined by a stochastic leave-one-out cross validation (LOOCV) estimation. A careful operation count is provided in order to determine the ranges of all the parameters in our methods. The overhead cost for setting up the proposed DSRBF method is O(n²) for function recovery problems with n basis functions. Numerical experiments confirm that the proposed method not only outperforms the constant shape parameter formulation (in terms of accuracy with comparable computational cost) but also the optimal LOOCV formulation (in terms of both accuracy and computational cost).
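A minimal sketch of the ingredients behind the method: Gaussian RBF interpolation with an explicit leave-one-out cross-validation error per candidate shape parameter, which a doubly stochastic variant would then randomize around rather than fixing at a single optimum. The Gaussian kernel, the explicit LOO loop, and the sampling distribution at the end are assumptions.

```python
import numpy as np

def rbf_matrix(x_a, x_b, eps):
    """Gaussian RBF kernel matrix phi(r) = exp(-(eps*r)^2)."""
    r = np.abs(x_a[:, None] - x_b[None, :])
    return np.exp(-(eps * r) ** 2)

def loocv_error(x, y, eps):
    """Explicit leave-one-out RMSE of RBF interpolation with shape parameter eps."""
    errs = []
    for i in range(len(x)):
        keep = np.arange(len(x)) != i
        A = rbf_matrix(x[keep], x[keep], eps)
        coef, *_ = np.linalg.lstsq(A, y[keep], rcond=None)
        pred = rbf_matrix(x[i:i + 1], x[keep], eps) @ coef
        errs.append((pred[0] - y[i]) ** 2)
    return np.sqrt(np.mean(errs))

x = np.linspace(0, 1, 25)
y = np.sin(2 * np.pi * x) + 0.3 * x
candidates = np.linspace(0.5, 20, 40)
scores = np.array([loocv_error(x, y, e) for e in candidates])
best = candidates[np.argmin(scores)]

# "Doubly stochastic" flavor: draw per-center shape parameters around the LOOCV optimum
rng = np.random.default_rng(0)
eps_samples = rng.normal(loc=best, scale=0.1 * best, size=len(x)).clip(0.1)
print("LOOCV-selected eps:", round(best, 2),
      "sampled eps (first 3):", np.round(eps_samples[:3], 2))
```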
A general approach for predicting the behavior of the Supreme Court of the United States
Bommarito, Michael J.; Blackman, Josh
2017-01-01
Building on developments in machine learning and prior work in the science of judicial prediction, we construct a model designed to predict the behavior of the Supreme Court of the United States in a generalized, out-of-sample context. To do so, we develop a time-evolving random forest classifier that leverages unique feature engineering to predict more than 240,000 justice votes and 28,000 case outcomes over nearly two centuries (1816-2015). Using only data available prior to decision, our model outperforms null (baseline) models at both the justice and case level under both parametric and non-parametric tests. Over nearly two centuries, we achieve 70.2% accuracy at the case outcome level and 71.9% at the justice vote level. More recently, over the past century, we outperform an in-sample optimized null model by nearly 5%. Our performance is consistent with, and improves on, the general level of prediction demonstrated by prior work; however, our model is distinctive because it can be applied out-of-sample to the entire past and future of the Court, not a single term. Our results represent an important advance for the science of quantitative legal prediction and portend a range of other potential applications. PMID:28403140
Shi, Lu-Feng; Morozova, Natalia
2012-08-01
Word recognition is a basic component in a comprehensive hearing evaluation, but data are lacking for listeners speaking two languages. This study obtained such data for Russian natives in the US and analysed the data using the perceptual assimilation model (PAM) and speech learning model (SLM). Listeners were randomly presented 200 NU-6 words in quiet. Listeners responded verbally and in writing. Performance was scored on words and phonemes (word-initial consonants, vowels, and word-final consonants). Seven normal-hearing, adult monolingual English natives (NM), 16 English-dominant (ED), and 15 Russian-dominant (RD) Russian natives participated. ED and RD listeners differed significantly in their language background. Consistent with the SLM, NM outperformed ED listeners and ED outperformed RD listeners, whether responses were scored on words or phonemes. NM and ED listeners shared similar phoneme error patterns, whereas RD listeners' errors had unique patterns that could be largely understood via the PAM. RD listeners had particular difficulty differentiating vowel contrasts /i-I/, /æ-ε/, and /ɑ-Λ/, word-initial consonant contrasts /p-h/ and /b-f/, and word-final contrasts /f-v/. Both first-language phonology and second-language learning history affect word and phoneme recognition. Current findings may help clinicians differentiate word recognition errors due to language background from hearing pathologies.
Christensen, Bruce K; Patrick, Regan E; Stuss, Donald T; Gillingham, Susan; Zipursky, Robert B
2013-01-01
Schizophrenia (SCZ)-related verbal memory impairment is hypothesized to be mediated, in part, by frontal lobe (FTL) dysfunction. However, little research has contrasted the performance of SCZ patients with that of patients exhibiting circumscribed frontal lesions. The current study compared verbal episodic memory in patients with SCZ and focal FTL lesions (left frontal, LF; right frontal, RF; and bi-frontal, BF) on a four-trial list-learning task consisting of three lists of varying semantic organizational structure. Each dependent variable was examined at two levels: scores collapsed across all four trials and learning scores (i.e., trial 4 minus trial 1). Performance deficits were observed in each patient group across most dependent measures at both levels. Regarding patient group differences, SCZ patients outperformed LF/BF patients (on either learning scores or scores collapsed across trials) on free recall, primacy, primary memory, secondary memory, and subjective organization, whereas they outperformed RF patients only on recency and primary memory for the semantically blocked list. Collectively, these results indicate that the pattern of memory performance is largely similar between patients with SCZ and those with RF lesions. These data support tentative arguments that verbal episodic memory deficits in SCZ may be mediated by frontal dysfunction in the right hemisphere.
Deep Networks Can Resemble Human Feed-forward Vision in Invariant Object Recognition
Kheradpisheh, Saeed Reza; Ghodrati, Masoud; Ganjtabesh, Mohammad; Masquelier, Timothée
2016-01-01
Deep convolutional neural networks (DCNNs) have attracted much attention recently, and have been shown to recognize thousands of object categories in natural image databases. Their architecture is somewhat similar to that of the human visual system: both use restricted receptive fields and a hierarchy of layers which progressively extract more and more abstracted features. Yet it is unknown whether DCNNs match human performance at the task of view-invariant object recognition, whether they make similar errors and use similar representations for this task, and whether the answers depend on the magnitude of the viewpoint variations. To investigate these issues, we benchmarked eight state-of-the-art DCNNs, the HMAX model, and a baseline shallow model, and compared their results to those of humans with backward masking. Unlike all previous DCNN studies, we carefully controlled the magnitude of the viewpoint variations to demonstrate that shallow nets can outperform deep nets and humans when variations are weak. When facing larger variations, however, more layers were needed to match human performance and error distributions, and to have representations that are consistent with human behavior. A very deep net with 18 layers even outperformed humans at the highest variation level, using the most human-like representations. PMID:27601096
Validating Variational Bayes Linear Regression Method With Multi-Central Datasets.
Murata, Hiroshi; Zangwill, Linda M; Fujino, Yuri; Matsuura, Masato; Miki, Atsuya; Hirasawa, Kazunori; Tanito, Masaki; Mizoue, Shiro; Mori, Kazuhiko; Suzuki, Katsuyoshi; Yamashita, Takehiro; Kashiwagi, Kenji; Shoji, Nobuyuki; Asaoka, Ryo
2018-04-01
To validate the prediction accuracy of variational Bayes linear regression (VBLR) with two datasets external to the training dataset. The training dataset consisted of 7268 eyes of 4278 subjects from the University of Tokyo Hospital. The Japanese Archive of Multicentral Databases in Glaucoma (JAMDIG) dataset consisted of 271 eyes of 177 patients, and the Diagnostic Innovations in Glaucoma Study (DIGS) dataset consisted of 248 eyes of 173 patients; both were used for validation. Prediction accuracy was compared between VBLR and ordinary least squares linear regression (OLSLR). First, OLSLR and VBLR were carried out using total deviation (TD) values at each of the 52 test points from the second to fourth visual fields (VFs) (VF2-4) to the second to tenth VF (VF2-10) of each patient in the JAMDIG and DIGS datasets, and the TD values of the 11th VF test were predicted every time. The predictive accuracy of each method was compared through the root mean squared error (RMSE) statistic. OLSLR RMSEs with the JAMDIG and DIGS datasets were between 31 and 4.3 dB, and between 19.5 and 3.9 dB. On the other hand, VBLR RMSEs with the JAMDIG and DIGS datasets were between 5.0 and 3.7 dB, and between 4.6 and 3.6 dB. There was a statistically significant difference between VBLR and OLSLR for both datasets at every series (VF2-4 to VF2-10) (P < 0.01 for all tests). However, there was no statistically significant difference in VBLR RMSEs between the JAMDIG and DIGS datasets at any series of VFs (P > 0.05). VBLR outperformed OLSLR in predicting future VF progression, and VBLR has the potential to be a helpful tool in clinical settings.
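To make the comparison concrete: at each test point, both methods regress total deviation on time and extrapolate to the held-out visual field, and the methods are then scored by RMSE. The sketch below uses scikit-learn's BayesianRidge as an accessible stand-in for the paper's variational Bayes linear regression (the real VBLR shares information across test points, which this per-point sketch does not); the example numbers are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, BayesianRidge

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)))

def predict_future_td(times, td_series, t_future, model):
    """Fit TD (dB) versus time at one test point and extrapolate to t_future."""
    X = np.asarray(times, dtype=float).reshape(-1, 1)
    model.fit(X, np.asarray(td_series, dtype=float))
    return float(model.predict(np.array([[t_future]]))[0])

# usage (hypothetical series of a single test point, times in years):
# ols_pred = predict_future_td([0.0, 0.5, 1.0], [-2.1, -2.6, -3.0], 1.5, LinearRegression())
# vb_pred  = predict_future_td([0.0, 0.5, 1.0], [-2.1, -2.6, -3.0], 1.5, BayesianRidge())
```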
Comparing three pedagogical approaches to psychomotor skills acquisition.
Willis, Ross E; Richa, Jacqueline; Oppeltz, Richard; Nguyen, Patrick; Wagner, Kelly; Van Sickle, Kent R; Dent, Daniel L
2012-01-01
We compared traditional pedagogical approaches, such as time- and repetition-based methods, with proficiency-based training. Laparoscopic novices were assigned randomly to 1 of 3 training conditions. In experiment 1, participants in the time condition practiced for 60 minutes, participants in the repetition condition performed 5 practice trials, and participants in the proficiency condition trained until reaching a predetermined proficiency goal. In experiment 2, practice time and number of trials were equated across conditions. In experiment 1, participants in the proficiency-based training condition outperformed participants in the other 2 conditions (P < .014); however, these participants trained longer (P < .001) and performed more repetitions (P < .001). In experiment 2, despite training for similar amounts of time and numbers of repetitions, participants in the proficiency condition outperformed their counterparts (P < .038). In both experiments, the standard deviations for the proficiency condition were smaller than those for the other conditions. Proficiency-based training results in trainees who perform uniformly and at a higher level than those trained with traditional methodologies. Copyright © 2012 Elsevier Inc. All rights reserved.
A comparison of 1D and 2D LSTM architectures for the recognition of handwritten Arabic
NASA Astrophysics Data System (ADS)
Yousefi, Mohammad Reza; Soheili, Mohammad Reza; Breuel, Thomas M.; Stricker, Didier
2015-01-01
In this paper, we present an Arabic handwriting recognition method based on recurrent neural networks. We use the Long Short Term Memory (LSTM) architecture, which has proven successful in various printed and handwritten OCR tasks. Applications of LSTM to handwriting recognition typically employ the two-dimensional architecture to deal with variations along both the vertical and horizontal axes. However, we show that, using a simple pre-processing step that normalizes the position and baseline of letters, we can make use of a 1D LSTM, which is faster to train and converge, and yet achieves superior performance. In a series of experiments on the IFN/ENIT database for Arabic handwriting recognition, we demonstrate that our proposed pipeline can outperform 2D LSTM networks. Furthermore, we provide comparisons with 1D LSTM networks trained with manually crafted features to show that the automatically learned features in a globally trained 1D LSTM network with our normalization step can even outperform such systems.
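A minimal PyTorch sketch of the kind of 1D (B)LSTM pipeline described: after position/baseline normalization, each image column becomes one time step, and CTC training removes the need for per-column labels. The layer sizes, alphabet size, and the choice of raw pixel columns as features are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class Line1DLSTM(nn.Module):
    """1-D BLSTM over the column features of a baseline-normalized word image."""
    def __init__(self, img_height=48, hidden=100, n_chars=120):
        super().__init__()
        self.lstm = nn.LSTM(img_height, hidden, bidirectional=True, batch_first=True)
        self.fc = nn.Linear(2 * hidden, n_chars + 1)  # +1 for the CTC blank symbol

    def forward(self, cols):
        # cols: (batch, width, img_height), one normalized pixel column per time step
        h, _ = self.lstm(cols)
        return self.fc(h).log_softmax(dim=-1)          # (batch, width, n_chars + 1)

# training would use nn.CTCLoss(blank=n_chars); note CTCLoss expects
# log-probabilities shaped (T, N, C), so transpose the output before the loss.
```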
Geostatistical modeling of riparian forest microclimate and its implications for sampling
Eskelson, B.N.I.; Anderson, P.D.; Hagar, J.C.; Temesgen, H.
2011-01-01
Predictive models of microclimate under various site conditions in forested headwater stream-riparian areas are poorly developed, and sampling designs for characterizing underlying riparian microclimate gradients are sparse. We used riparian microclimate data collected at eight headwater streams in the Oregon Coast Range to compare ordinary kriging (OK), universal kriging (UK), and kriging with external drift (KED) for point prediction of mean maximum air temperature (Tair). Several topographic and forest structure characteristics were considered as site-specific parameters. Height above stream and distance to stream were the most important covariates in the KED models, which outperformed OK and UK in terms of root mean square error. Sample patterns were optimized based on the kriging variance and the weighted means of shortest distance criterion using a simulated annealing algorithm. The optimized sample patterns outperformed systematic sample patterns in terms of mean kriging variance, mainly for small sample sizes. These findings suggest methods for increasing the efficiency of microclimate monitoring in riparian areas.
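For illustration, kriging with external drift can be approximated by regression kriging: fit a linear trend on the site covariates (e.g. height above stream, distance to stream), then krige the spatial residuals. The sketch below uses scikit-learn's Gaussian process as a stand-in for the kriging step; the kernel, length scale, and covariate names are placeholder assumptions, not the study's fitted variogram models.

```python
from sklearn.linear_model import LinearRegression
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

def regression_kriging_fit(coords, covariates, tair):
    """Linear trend on covariates (KED-style external drift), GP on the residuals."""
    trend = LinearRegression().fit(covariates, tair)
    resid = tair - trend.predict(covariates)
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=50.0) + WhiteKernel(),
                                  normalize_y=True).fit(coords, resid)
    return trend, gp

def regression_kriging_predict(trend, gp, coords_new, covariates_new):
    # prediction = drift term evaluated at the new site + kriged residual
    return trend.predict(covariates_new) + gp.predict(coords_new)
```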
Jones, Paul R.
2012-01-01
Introduction Two studies examined whether stereotype threat impairs women's math performance and whether concurrent threat reduction strategies can be used to offset this effect. Method In Study 1, collegiate men and women (N = 100) watched a video purporting that males and females performed equally well (gender-fair) or that males outperformed females (gender differences) on an imminent math test. In Study 2, women (N = 44) viewed the gender differences video, followed by misattribution (cue present, absent) and self-affirmation (present, absent) manipulations, before taking the aforesaid test. Results In the initial study, women underperformed relative to men on the test after receiving the gender differences video, whereas no gender differences emerged in the gender-fair condition. In Study 2, affirming the self led to better performance than not doing so. Planned contrasts indicated, however, that only women receiving both a misattribution cue and a self-affirmation opportunity outperformed their counterparts not given these reduction strategies. Discussion These findings are discussed relative to Stereotype Threat Theory, and educational implications are provided. PMID:22545058
Conjugate gradient heat bath for ill-conditioned actions.
Ceriotti, Michele; Bussi, Giovanni; Parrinello, Michele
2007-08-01
We present a method for performing sampling from a Boltzmann distribution of an ill-conditioned quadratic action. This method is based on heat-bath thermalization along a set of conjugate directions, generated via a conjugate-gradient procedure. The resulting scheme outperforms local updates for matrices with very high condition number, since it avoids the slowing down of modes with lower eigenvalue, and has some advantages over the global heat-bath approach, compared to which it is more stable and allows for more freedom in devising case-specific optimizations.
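The core idea can be sketched in a few lines: for a quadratic action S(x) = 0.5 xᵀAx, a complete set of A-conjugate directions lets one draw x ~ N(0, A⁻¹) by sampling each direction independently. The toy sketch below builds the directions by Gram-Schmidt in the A inner product rather than by the paper's conjugate-gradient procedure, and ignores the efficiency and stability considerations that motivate the method.

```python
import numpy as np

def conjugate_directions(A):
    """A-orthogonalize the standard basis (Gram-Schmidt in the A inner product)."""
    n = A.shape[0]
    dirs = []
    for k in range(n):
        p = np.zeros(n)
        p[k] = 1.0
        for q in dirs:
            p -= (q @ A @ p) / (q @ A @ q) * q
        dirs.append(p)
    return dirs

def heat_bath_sample(A, rng):
    """Draw x ~ N(0, A^{-1}) for the quadratic action S(x) = 0.5 x^T A x."""
    x = np.zeros(A.shape[0])
    for p in conjugate_directions(A):
        d = p @ A @ p                     # curvature along this conjugate direction
        x += rng.normal() / np.sqrt(d) * p
    return x

# usage on an ill-conditioned diagonal action (condition number 1e6):
# A = np.diag(np.logspace(0, 6, 8)); x = heat_bath_sample(A, np.random.default_rng(0))
```

Since the directions satisfy pᵢᵀApⱼ = 0 for i ≠ j, the covariance of x is Σᵢ pᵢpᵢᵀ/(pᵢᵀApᵢ) = A⁻¹, which is the target Boltzmann covariance.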
A Pressure Control Method for Emulsion Pump Station Based on Elman Neural Network
Tan, Chao; Qi, Nan; Yao, Xingang; Wang, Zhongbin; Si, Lei
2015-01-01
In order to realize pressure control of the emulsion pump station, which is key equipment for safe production in coal mines, the control requirements were analyzed and a pressure control method based on an Elman neural network was proposed. The key techniques, such as the system framework, pressure prediction model, pressure control model, and the flowchart of the proposed approach, were presented. Finally, a simulation example was carried out, and the comparison results indicated that the proposed approach was feasible and efficient and outperformed others. PMID:25861253
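For context, an Elman network is a simple recurrent network whose context layer feeds the previous hidden state back into the hidden layer, which is what makes it suitable for one-step-ahead pressure prediction. A bare-bones forward pass is sketched below (NumPy, forward only, arbitrary sizes); training, the paper's input features, and the control loop are omitted.

```python
import numpy as np

class ElmanNet:
    """Minimal Elman (simple recurrent) network: context layer = previous hidden state."""
    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = np.random.default_rng(seed)
        self.Wxh = rng.normal(0, 0.1, (n_hidden, n_in))      # input  -> hidden
        self.Whh = rng.normal(0, 0.1, (n_hidden, n_hidden))  # context -> hidden
        self.Why = rng.normal(0, 0.1, (n_out, n_hidden))     # hidden -> output
        self.h = np.zeros(n_hidden)

    def step(self, x):
        self.h = np.tanh(self.Wxh @ x + self.Whh @ self.h)
        return self.Why @ self.h

# hypothetical one-step-ahead use: net = ElmanNet(n_in=3, n_hidden=10, n_out=1)
# p_next = net.step(np.array([pressure_t, flow_t, valve_t]))  # inputs are placeholders
```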
Classifying medical relations in clinical text via convolutional neural networks.
He, Bin; Guan, Yi; Dai, Rui
2018-05-16
Deep learning research on relation classification has achieved solid performance in the general domain. This study proposes a convolutional neural network (CNN) architecture with a multi-pooling operation for medical relation classification on clinical records and explores a loss function with a category-level constraint matrix. Experiments using the 2010 i2b2/VA relation corpus demonstrate that these models, which do not depend on any external features, outperform previous single-model methods, and that our best model is competitive with the existing ensemble-based method. Copyright © 2018. Published by Elsevier B.V.
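A hedged PyTorch sketch of the multi-pooling idea: a convolution over the token embeddings is max-pooled separately over the segments delimited by the two candidate concepts, and the concatenated segment vectors feed the classifier. The dimensions, single convolution width, and segment convention are illustrative assumptions, and the category-level constraint loss from the paper is not shown.

```python
import torch
import torch.nn as nn

class MultiPoolCNN(nn.Module):
    """Convolution over token embeddings, max-pooled per segment around the concepts."""
    def __init__(self, vocab_size, emb_dim=100, n_filters=128, n_classes=9):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.conv = nn.Conv1d(emb_dim, n_filters, kernel_size=3, padding=1)
        self.fc = nn.Linear(3 * n_filters, n_classes)

    def forward(self, tokens, i, j):
        """tokens: LongTensor (seq_len,); 0 < i < j < seq_len mark the two concept positions."""
        emb = self.emb(tokens).t().unsqueeze(0)   # (1, emb_dim, seq_len)
        h = torch.relu(self.conv(emb))            # (1, n_filters, seq_len)
        pooled = torch.cat([h[:, :, :i].max(dim=2).values,   # before concept 1
                            h[:, :, i:j].max(dim=2).values,  # between the concepts
                            h[:, :, j:].max(dim=2).values],  # after concept 2
                           dim=1)
        return self.fc(pooled)                    # (1, n_classes) logits
```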
Precision phase estimation based on weak-value amplification
NASA Astrophysics Data System (ADS)
Qiu, Xiaodong; Xie, Linguo; Liu, Xiong; Luo, Lan; Li, Zhaoxue; Zhang, Zhiyou; Du, Jinglei
2017-02-01
In this letter, we propose a precision method for phase estimation based on the weak-value amplification (WVA) technique using a monochromatic light source. The anomalous WVA significantly suppresses the technical noise with respect to the intensity-difference signal induced by the phase delay when the post-selection procedure comes into play. The phase measurement precision of this method is proportional to the weak value of a polarization operator in the experimental range. Our results compare well with wide-spectrum-light weak-measurement phase estimation and outperform the standard homodyne phase detection technique.
MAG4 versus alternative techniques for forecasting active region flare productivity.
Falconer, David A; Moore, Ronald L; Barghouty, Abdulnasser F; Khazanov, Igor
2014-05-01
MAG4 is a technique of forecasting an active region's rate of production of major flares in the coming few days from a free magnetic energy proxy. We present a statistical method of measuring the difference in performance between MAG4 and comparable alternative techniques that forecast an active region's major-flare productivity from alternative observed aspects of the active region. We demonstrate the method by measuring the difference in performance between the "Present MAG4" technique and each of three alternative techniques, called "McIntosh Active-Region Class," "Total Magnetic Flux," and "Next MAG4." We do this by using (1) the MAG4 database of magnetograms and major-flare histories of sunspot active regions, (2) the NOAA table of the major-flare productivity of each of 60 McIntosh active-region classes of sunspot active regions, and (3) five technique performance metrics (Heidke Skill Score, True Skill Score, Percent Correct, Probability of Detection, and False Alarm Rate) evaluated from 2000 random two-by-two contingency tables obtained from the databases. We find that (1) Present MAG4 far outperforms both McIntosh Active-Region Class and Total Magnetic Flux, (2) Next MAG4 significantly outperforms Present MAG4, (3) the performance of Next MAG4 is insensitive to the forward and backward temporal windows used, in the range of one to a few days, and (4) forecasting from the free-energy proxy in combination with either any broad category of McIntosh active-region classes or any Mount Wilson active-region class gives no significant performance improvement over forecasting from the free-energy proxy alone (Present MAG4). Key points: quantitative comparison of the performance of pairs of forecasting techniques; Next MAG4 forecasts major flares more accurately than Present MAG4; the Present MAG4 forecast outperforms McIntosh AR Class and total magnetic flux.
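The five verification metrics named above come straight from a 2x2 contingency table of hits, false alarms, misses, and correct rejections; a small helper is sketched below. One caveat is hedged in the comments: some forecasting papers define False Alarm Rate as the false-alarm ratio b/(a+b), others as b/(b+d), and the abstract does not say which convention MAG4 uses.

```python
def skill_scores(a, b, c, d):
    """a = hits, b = false alarms, c = misses, d = correct rejections."""
    n = a + b + c + d
    pod = a / (a + c)                 # Probability of Detection
    far = b / (a + b)                 # false-alarm ratio convention; some papers use b/(b+d)
    pc = (a + d) / n                  # Percent Correct
    tss = a / (a + c) - b / (b + d)   # True Skill Score (Hanssen-Kuipers discriminant)
    hss = 2 * (a * d - b * c) / ((a + c) * (c + d) + (a + b) * (b + d))  # Heidke Skill Score
    return {"POD": pod, "FAR": far, "PC": pc, "TSS": tss, "HSS": hss}

# usage on a hypothetical table: skill_scores(a=42, b=18, c=9, d=1931)
```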
A new distributed systems scheduling algorithm: a swarm intelligence approach
NASA Astrophysics Data System (ADS)
Haghi Kashani, Mostafa; Sarvizadeh, Raheleh; Jameii, Mahdi
2011-12-01
The scheduling problem in distributed systems is known to be NP-complete, and methods based on heuristic or metaheuristic search have been proposed to obtain optimal and suboptimal solutions. Task scheduling is a key factor for distributed systems to gain better performance. In this paper, an efficient method based on a memetic algorithm is developed to solve the distributed systems scheduling problem. To handle load balancing efficiently, Artificial Bee Colony (ABC) is applied as the local search in the proposed memetic algorithm. The proposed method is compared to an existing memetic-based approach in which the Learning Automata method is used as the local search. The results demonstrate that the proposed method outperforms the above-mentioned method in terms of communication cost.
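A toy sketch of the memetic loop described: genetic recombination over task-to-machine assignments, with a local refinement step applied to each child. The local search here is a simple best-reassignment move standing in for the paper's ABC-based local search, and the makespan objective and all parameters are illustrative assumptions rather than the paper's cost model.

```python
import random

def memetic_schedule(tasks, machines, pop_size=30, gens=100, seed=0):
    """Toy memetic scheduler: GA over assignments plus a greedy local-search step."""
    rng = random.Random(seed)

    def makespan(assign):
        loads = [0.0] * machines
        for t, m in zip(tasks, assign):
            loads[m] += t
        return max(loads)

    pop = [[rng.randrange(machines) for _ in tasks] for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=makespan)
        parents = pop[:pop_size // 2]
        children = []
        while len(children) < pop_size - len(parents):
            p1, p2 = rng.sample(parents, 2)
            cut = rng.randrange(1, len(tasks))
            child = p1[:cut] + p2[cut:]
            child[rng.randrange(len(tasks))] = rng.randrange(machines)   # mutation
            # local search stand-in: move one random task to its best machine
            i = rng.randrange(len(tasks))
            child[i] = min(range(machines),
                           key=lambda m: makespan(child[:i] + [m] + child[i + 1:]))
            children.append(child)
        pop = parents + children
    return min(pop, key=makespan)

# e.g. memetic_schedule(tasks=[3, 7, 2, 8, 5, 1], machines=3)
```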
Text String Detection from Natural Scenes by Structure-based Partition and Grouping
Yi, Chucai; Tian, YingLi
2012-01-01
Text information in natural scene images serves as important clues for many image-based applications such as scene understanding, content-based image retrieval, assistive navigation, and automatic geocoding. However, locating text against a complex background with multiple colors is a challenging task. In this paper, we explore a new framework to detect text strings with arbitrary orientations in complex natural scene images. Our proposed framework of text string detection consists of two steps: 1) image partition to find text character candidates based on local gradient features and color uniformity of character components, and 2) character candidate grouping to detect text strings based on joint structural features of text characters in each text string, such as character size differences, distances between neighboring characters, and character alignment. By assuming that a text string has at least three characters, we propose two algorithms of text string detection: 1) an adjacent character grouping method, and 2) a text line grouping method. The adjacent character grouping method calculates the sibling groups of each character candidate as string segments and then merges the intersecting sibling groups into text strings. The text line grouping method performs a Hough transform to fit text lines among the centroids of text candidates. Each fitted text line describes the orientation of a potential text string. The detected text string is represented by a rectangular region covering all characters whose centroids are cascaded in its text line. To improve efficiency and accuracy, our algorithms are carried out at multiple scales. The proposed methods outperform the state-of-the-art results on the public Robust Reading Dataset, which contains text only in horizontal orientation. Furthermore, the effectiveness of our methods in detecting text strings with arbitrary orientations is evaluated on the Oriented Scene Text Dataset, collected by ourselves, which contains text strings in non-horizontal orientations. PMID:21411405
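To illustrate the adjacent-character-grouping step in isolation, here is a greedy sketch over character candidate boxes that applies the "at least three characters" rule and the structural cues mentioned (similar size, small gap). The left-to-right scan and the threshold values are simplifying assumptions; the paper's sibling-group computation and Hough-based text line grouping are not reproduced.

```python
def group_adjacent_characters(boxes, size_ratio=2.0, gap_ratio=2.0):
    """Greedily merge character candidate boxes (x, y, w, h) into strings when
    neighbors have similar height and a small horizontal gap; keep only groups
    with at least three characters."""
    if not boxes:
        return []
    boxes = sorted(boxes, key=lambda b: b[0])          # scan left to right
    strings, current = [], [boxes[0]]
    for b in boxes[1:]:
        prev = current[-1]
        close = b[0] - (prev[0] + prev[2]) < gap_ratio * prev[3]
        similar = max(b[3], prev[3]) < size_ratio * min(b[3], prev[3])
        if close and similar:
            current.append(b)
        else:
            if len(current) >= 3:
                strings.append(current)
            current = [b]
    if len(current) >= 3:
        strings.append(current)
    return strings
```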
Liu, Li-Zhi; Wu, Fang-Xiang; Zhang, Wen-Jun
2014-01-01
As an abstract mapping of the gene regulations in the cell, a gene regulatory network is important to both biological research and practical applications. The reverse engineering of gene regulatory networks from microarray gene expression data is a challenging research problem in systems biology. With the development of biological technologies, multiple time-course gene expression datasets may be collected for a specific gene network under different circumstances. The inference of a gene regulatory network can be improved by integrating these multiple datasets. It is also known that gene expression data may be contaminated with large errors or outliers, which may affect the inference results. A novel method, Huber group LASSO, is proposed to infer the same underlying network topology from multiple time-course gene expression datasets while taking robustness to large errors or outliers into account. To solve the optimization problem involved in the proposed method, an efficient algorithm which combines the ideas of auxiliary function minimization and block descent is developed. A stability selection method is adapted to our method to find a network topology consisting of edges with scores. The proposed method is applied to both simulated datasets and real experimental datasets. The results show that Huber group LASSO outperforms group LASSO in terms of both the area under the receiver operating characteristic curve and the area under the precision-recall curve. The convergence analysis of the algorithm theoretically shows that the sequence generated by the algorithm converges to the optimal solution of the problem. The simulation and real data examples demonstrate the effectiveness of Huber group LASSO in integrating multiple time-course gene expression datasets and improving the resistance to large errors or outliers.
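To give a flavor of the objective, the sketch below runs a plain proximal-gradient loop on a Huber data-fitting term with a row-wise group penalty that ties each regulator's coefficients across the K datasets. It is a simplified stand-in, not the paper's auxiliary-function / block-descent algorithm: the fixed step size, penalty weight, and Huber threshold are placeholder values.

```python
import numpy as np

def huber_grad(r, delta=1.0):
    """Elementwise gradient of the Huber loss applied to residuals r."""
    return np.where(np.abs(r) <= delta, r, delta * np.sign(r))

def prox_group_lasso(B, step, lam):
    """Group soft-thresholding: each row of B (one regulator across K datasets) is a group."""
    norms = np.linalg.norm(B, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - step * lam / np.maximum(norms, 1e-12))
    return B * scale

def huber_group_lasso(Xs, ys, lam=0.1, delta=1.0, step=1e-3, iters=2000):
    """Fit one target gene's regulators jointly across K datasets.
    Xs: list of K design matrices (n_k x p); ys: list of K response vectors (n_k,)."""
    p, K = Xs[0].shape[1], len(Xs)
    B = np.zeros((p, K))
    for _ in range(iters):
        G = np.zeros_like(B)
        for k in range(K):
            r = Xs[k] @ B[:, k] - ys[k]
            G[:, k] = Xs[k].T @ huber_grad(r, delta)   # robust gradient per dataset
        B = prox_group_lasso(B - step * G, step, lam)  # shrink whole regulator rows
    return B
```

Rows of the returned matrix that survive the group shrinkage correspond to regulators inferred consistently across the datasets, which is the integration effect the abstract describes.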