Sample records for method significantly outperforms

  1. Good match exploration for infrared face recognition

    NASA Astrophysics Data System (ADS)

    Yang, Changcai; Zhou, Huabing; Sun, Sheng; Liu, Renfeng; Zhao, Ji; Ma, Jiayi

    2014-11-01

    Establishing good feature correspondence is a critical prerequisite and a challenging task for infrared (IR) face recognition. Recent studies revealed that the scale invariant feature transform (SIFT) descriptor outperforms other local descriptors for feature matching. However, it only uses local appearance information for matching, and hence inevitably leads to a number of false matches. To address this issue, this paper explores global structure information (GSI) among SIFT correspondences, and proposes a new method SIFT-GSI for good match exploration. This is achieved by fitting a smooth mapping function for the underlying correct matches, which involves softassign and deterministic annealing. Quantitative comparisons with state-of-the-art methods on a publicly available IR human face database demonstrate that SIFT-GSI significantly outperforms other methods for feature matching, and hence it is able to improve the reliability of IR face recognition systems.

  2. G3//BMK and Its Application to Calculation of Bond Dissociation Enthalpies.

    PubMed

    Zheng, Wen-Rui; Fu, Yao; Guo, Qing-Xiang

    2008-08-01

    On the basis of systematic examinations it was found that the BMK functional significantly outperformed the other popular density functional theory methods including B3LYP, B3P86, KMLYP, MPW1P86, O3LYP, and X3LYP for the calculation of bond dissociation enthalpies (BDEs). However, it was also found that even the BMK functional might dramatically fail in predicting the BDEs of some chemical bonds. To solve this problem, a new composite ab initio method named G3//BMK was developed by combining the strengths of both the G3 theory and BMK. G3//BMK was found to outperform the G3 and G3//B3LYP methods. It could accurately predict the BDEs of diverse types of chemical bonds in various organic molecules within a precision of ca. 1.2 kcal/mol.

  3. Community-based Inquiry Improves Critical Thinking in General Education Biology

    PubMed Central

    Faiola, Celia L.; Johnson, James E.; Kurtz, Martha J.

    2008-01-01

    National stakeholders are becoming increasingly concerned about the inability of college graduates to think critically. Research shows that, while both faculty and students deem critical thinking essential, only a small fraction of graduates can demonstrate the thinking skills necessary for academic and professional success. Many faculty are considering nontraditional teaching methods that incorporate undergraduate research because they more closely align with the process of doing investigative science. This study compared a research-focused teaching method called community-based inquiry (CBI) with traditional lecture/laboratory in general education biology to discover which method would elicit greater gains in critical thinking. Results showed significant critical-thinking gains in the CBI group but decreases in a traditional group and a mixed CBI/traditional group. Prior critical-thinking skill, instructor, and ethnicity also significantly influenced critical-thinking gains, with nearly all ethnicities in the CBI group outperforming peers in both the mixed and traditional groups. Females, who showed decreased critical thinking in traditional courses relative to males, outperformed their male counterparts in CBI courses. Through the results of this study, it is hoped that faculty who value both research and critical thinking will consider using the CBI method. PMID:18765755

  4. An Efficient Augmented Lagrangian Method with Applications to Total Variation Minimization

    DTIC Science & Technology

    2012-08-17

    the classic augmented Lagrangian multiplier method, we propose, analyze and test an algorithm for solving a class of equality-constrained non-smooth...method, we propose, analyze and test an algorithm for solving a class of equality-constrained non-smooth optimization problems (chie y but not...significantly outperforming several state-of-the-art solvers on most tested problems. The resulting MATLAB solver, called TVAL3, has been posted online [23]. 2

  5. Searching for transcription factor binding sites in vector spaces

    PubMed Central

    2012-01-01

    Background Computational approaches to transcription factor binding site identification have been actively researched in the past decade. Learning from known binding sites, new binding sites of a transcription factor in unannotated sequences can be identified. A number of search methods have been introduced over the years. However, one can rarely find one single method that performs the best on all the transcription factors. Instead, to identify the best method for a particular transcription factor, one usually has to compare a handful of methods. Hence, it is highly desirable for a method to perform automatic optimization for individual transcription factors. Results We proposed to search for transcription factor binding sites in vector spaces. This framework allows us to identify the best method for each individual transcription factor. We further introduced two novel methods, the negative-to-positive vector (NPV) and optimal discriminating vector (ODV) methods, to construct query vectors to search for binding sites in vector spaces. Extensive cross-validation experiments showed that the proposed methods significantly outperformed the ungapped likelihood under positional background method, a state-of-the-art method, and the widely-used position-specific scoring matrix method. We further demonstrated that motif subtypes of a TF can be readily identified in this framework and two variants called the k NPV and k ODV methods benefited significantly from motif subtype identification. Finally, independent validation on ChIP-seq data showed that the ODV and NPV methods significantly outperformed the other compared methods. Conclusions We conclude that the proposed framework is highly flexible. It enables the two novel methods to automatically identify a TF-specific subspace to search for binding sites. Implementations are available as source code at: http://biogrid.engr.uconn.edu/tfbs_search/. PMID:23244338

  6. Hybrid Power Management for Office Equipment

    NASA Astrophysics Data System (ADS)

    Gingade, Ganesh P.

    Office machines (such as printers, scanners, fax, and copiers) can consume significant amounts of power. Few studies have been devoted to power management of office equipment. Most office machines have sleep modes to save power. Power management of these machines are usually timeout-based: a machine sleeps after being idle long enough. Setting the timeout duration can be difficult: if it is too long, the machine wastes power during idleness. If it is too short, the machine sleeps too soon and too often--the wakeup delay can significantly degrade productivity. Thus, power management is a tradeoff between saving energy and keeping short response time. Many power management policies have been published and one policy may outperform another in some scenarios. There is no definite conclusion which policy is always better. This thesis describes two methods for office equipment power management. The first method adaptively reduces power based on a constraint of the wakeup delay. The second method is a hybrid with multiple candidate policies and it selects the most appropriate power management policy. Using six months of request traces from 18 different offices, we demonstrate that the hybrid policy outperforms individual policies. We also discover that power management based on business hours does not produce consistent energy savings.

  7. RELIC: a novel dye-bias correction method for Illumina Methylation BeadChip.

    PubMed

    Xu, Zongli; Langie, Sabine A S; De Boever, Patrick; Taylor, Jack A; Niu, Liang

    2017-01-03

    The Illumina Infinium HumanMethylation450 BeadChip and its successor, Infinium MethylationEPIC BeadChip, have been extensively utilized in epigenome-wide association studies. Both arrays use two fluorescent dyes (Cy3-green/Cy5-red) to measure methylation level at CpG sites. However, performance difference between dyes can result in biased estimates of methylation levels. Here we describe a novel method, called REgression on Logarithm of Internal Control probes (RELIC) to correct for dye bias on whole array by utilizing the intensity values of paired internal control probes that monitor the two color channels. We evaluate the method in several datasets against other widely used dye-bias correction methods. Results on data quality improvement showed that RELIC correction statistically significantly outperforms alternative dye-bias correction methods. We incorporated the method into the R package ENmix, which is freely available from the Bioconductor website ( https://www.bioconductor.org/packages/release/bioc/html/ENmix.html ). RELIC is an efficient and robust method to correct for dye-bias in Illumina Methylation BeadChip data. It outperforms other alternative methods and conveniently implemented in R package ENmix to facilitate DNA methylation studies.

  8. Model-based sensor-less wavefront aberration correction in optical coherence tomography.

    PubMed

    Verstraete, Hans R G W; Wahls, Sander; Kalkman, Jeroen; Verhaegen, Michel

    2015-12-15

    Several sensor-less wavefront aberration correction methods that correct nonlinear wavefront aberrations by maximizing the optical coherence tomography (OCT) signal are tested on an OCT setup. A conventional coordinate search method is compared to two model-based optimization methods. The first model-based method takes advantage of the well-known optimization algorithm (NEWUOA) and utilizes a quadratic model. The second model-based method (DONE) is new and utilizes a random multidimensional Fourier-basis expansion. The model-based algorithms achieve lower wavefront errors with up to ten times fewer measurements. Furthermore, the newly proposed DONE method outperforms the NEWUOA method significantly. The DONE algorithm is tested on OCT images and shows a significantly improved image quality.

  9. Robust regression on noisy data for fusion scaling laws

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Verdoolaege, Geert, E-mail: geert.verdoolaege@ugent.be; Laboratoire de Physique des Plasmas de l'ERM - Laboratorium voor Plasmafysica van de KMS

    2014-11-15

    We introduce the method of geodesic least squares (GLS) regression for estimating fusion scaling laws. Based on straightforward principles, the method is easily implemented, yet it clearly outperforms established regression techniques, particularly in cases of significant uncertainty on both the response and predictor variables. We apply GLS for estimating the scaling of the L-H power threshold, resulting in estimates for ITER that are somewhat higher than predicted earlier.

  10. Multiratio fusion change detection with adaptive thresholding

    NASA Astrophysics Data System (ADS)

    Hytla, Patrick C.; Balster, Eric J.; Vasquez, Juan R.; Neuroth, Robert M.

    2017-04-01

    A ratio-based change detection method known as multiratio fusion (MRF) is proposed and tested. The MRF framework builds on other change detection components proposed in this work: dual ratio (DR) and multiratio (MR). The DR method involves two ratios coupled with adaptive thresholds to maximize detected changes and minimize false alarms. The use of two ratios is shown to outperform the single ratio case when the means of the image pairs are not equal. MR change detection builds on the DR method by including negative imagery to produce four total ratios with adaptive thresholds. Inclusion of negative imagery is shown to improve detection sensitivity and to boost detection performance in certain target and background cases. MRF further expands this concept by fusing together the ratio outputs using a routine in which detections must be verified by two or more ratios to be classified as a true changed pixel. The proposed method is tested with synthetically generated test imagery and real datasets with results compared to other methods found in the literature. DR is shown to significantly outperform the standard single ratio method. MRF produces excellent change detection results that exhibit up to a 22% performance improvement over other methods from the literature at low false-alarm rates.

  11. Smart light random memory sprays Retinex: a fast Retinex implementation for high-quality brightness adjustment and color correction.

    PubMed

    Banić, Nikola; Lončarić, Sven

    2015-11-01

    Removing the influence of illumination on image colors and adjusting the brightness across the scene are important image enhancement problems. This is achieved by applying adequate color constancy and brightness adjustment methods. One of the earliest models to deal with both of these problems was the Retinex theory. Some of the Retinex implementations tend to give high-quality results by performing local operations, but they are computationally relatively slow. One of the recent Retinex implementations is light random sprays Retinex (LRSR). In this paper, a new method is proposed for brightness adjustment and color correction that overcomes the main disadvantages of LRSR. There are three main contributions of this paper. First, a concept of memory sprays is proposed to reduce the number of LRSR's per-pixel operations to a constant regardless of the parameter values, thereby enabling a fast Retinex-based local image enhancement. Second, an effective remapping of image intensities is proposed that results in significantly higher quality. Third, the problem of LRSR's halo effect is significantly reduced by using an alternative illumination processing method. The proposed method enables a fast Retinex-based image enhancement by processing Retinex paths in a constant number of steps regardless of the path size. Due to the halo effect removal and remapping of the resulting intensities, the method outperforms many of the well-known image enhancement methods in terms of resulting image quality. The results are presented and discussed. It is shown that the proposed method outperforms most of the tested methods in terms of image brightness adjustment, color correction, and computational speed.

  12. Spectral Transfer Learning Using Information Geometry for a User-Independent Brain-Computer Interface

    PubMed Central

    Waytowich, Nicholas R.; Lawhern, Vernon J.; Bohannon, Addison W.; Ball, Kenneth R.; Lance, Brent J.

    2016-01-01

    Recent advances in signal processing and machine learning techniques have enabled the application of Brain-Computer Interface (BCI) technologies to fields such as medicine, industry, and recreation; however, BCIs still suffer from the requirement of frequent calibration sessions due to the intra- and inter-individual variability of brain-signals, which makes calibration suppression through transfer learning an area of increasing interest for the development of practical BCI systems. In this paper, we present an unsupervised transfer method (spectral transfer using information geometry, STIG), which ranks and combines unlabeled predictions from an ensemble of information geometry classifiers built on data from individual training subjects. The STIG method is validated in both off-line and real-time feedback analysis during a rapid serial visual presentation task (RSVP). For detection of single-trial, event-related potentials (ERPs), the proposed method can significantly outperform existing calibration-free techniques as well as outperform traditional within-subject calibration techniques when limited data is available. This method demonstrates that unsupervised transfer learning for single-trial detection in ERP-based BCIs can be achieved without the requirement of costly training data, representing a step-forward in the overall goal of achieving a practical user-independent BCI system. PMID:27713685

  13. Spectral Transfer Learning Using Information Geometry for a User-Independent Brain-Computer Interface.

    PubMed

    Waytowich, Nicholas R; Lawhern, Vernon J; Bohannon, Addison W; Ball, Kenneth R; Lance, Brent J

    2016-01-01

    Recent advances in signal processing and machine learning techniques have enabled the application of Brain-Computer Interface (BCI) technologies to fields such as medicine, industry, and recreation; however, BCIs still suffer from the requirement of frequent calibration sessions due to the intra- and inter-individual variability of brain-signals, which makes calibration suppression through transfer learning an area of increasing interest for the development of practical BCI systems. In this paper, we present an unsupervised transfer method (spectral transfer using information geometry, STIG), which ranks and combines unlabeled predictions from an ensemble of information geometry classifiers built on data from individual training subjects. The STIG method is validated in both off-line and real-time feedback analysis during a rapid serial visual presentation task (RSVP). For detection of single-trial, event-related potentials (ERPs), the proposed method can significantly outperform existing calibration-free techniques as well as outperform traditional within-subject calibration techniques when limited data is available. This method demonstrates that unsupervised transfer learning for single-trial detection in ERP-based BCIs can be achieved without the requirement of costly training data, representing a step-forward in the overall goal of achieving a practical user-independent BCI system.

  14. SSVEP recognition using common feature analysis in brain-computer interface.

    PubMed

    Zhang, Yu; Zhou, Guoxu; Jin, Jing; Wang, Xingyu; Cichocki, Andrzej

    2015-04-15

    Canonical correlation analysis (CCA) has been successfully applied to steady-state visual evoked potential (SSVEP) recognition for brain-computer interface (BCI) application. Although the CCA method outperforms the traditional power spectral density analysis through multi-channel detection, it requires additionally pre-constructed reference signals of sine-cosine waves. It is likely to encounter overfitting in using a short time window since the reference signals include no features from training data. We consider that a group of electroencephalogram (EEG) data trials recorded at a certain stimulus frequency on a same subject should share some common features that may bear the real SSVEP characteristics. This study therefore proposes a common feature analysis (CFA)-based method to exploit the latent common features as natural reference signals in using correlation analysis for SSVEP recognition. Good performance of the CFA method for SSVEP recognition is validated with EEG data recorded from ten healthy subjects, in contrast to CCA and a multiway extension of CCA (MCCA). Experimental results indicate that the CFA method significantly outperformed the CCA and the MCCA methods for SSVEP recognition in using a short time window (i.e., less than 1s). The superiority of the proposed CFA method suggests it is promising for the development of a real-time SSVEP-based BCI. Copyright © 2014 Elsevier B.V. All rights reserved.

  15. Comparative study of the effectiveness and limitations of current methods for detecting sequence coevolution.

    PubMed

    Mao, Wenzhi; Kaya, Cihan; Dutta, Anindita; Horovitz, Amnon; Bahar, Ivet

    2015-06-15

    With rapid accumulation of sequence data on several species, extracting rational and systematic information from multiple sequence alignments (MSAs) is becoming increasingly important. Currently, there is a plethora of computational methods for investigating coupled evolutionary changes in pairs of positions along the amino acid sequence, and making inferences on structure and function. Yet, the significance of coevolution signals remains to be established. Also, a large number of false positives (FPs) arise from insufficient MSA size, phylogenetic background and indirect couplings. Here, a set of 16 pairs of non-interacting proteins is thoroughly examined to assess the effectiveness and limitations of different methods. The analysis shows that recent computationally expensive methods designed to remove biases from indirect couplings outperform others in detecting tertiary structural contacts as well as eliminating intermolecular FPs; whereas traditional methods such as mutual information benefit from refinements such as shuffling, while being highly efficient. Computations repeated with 2,330 pairs of protein families from the Negatome database corroborated these results. Finally, using a training dataset of 162 families of proteins, we propose a combined method that outperforms existing individual methods. Overall, the study provides simple guidelines towards the choice of suitable methods and strategies based on available MSA size and computing resources. Software is freely available through the Evol component of ProDy API. © The Author 2015. Published by Oxford University Press.

  16. Detect2Rank: Combining Object Detectors Using Learning to Rank.

    PubMed

    Karaoglu, Sezer; Yang Liu; Gevers, Theo

    2016-01-01

    Object detection is an important research area in the field of computer vision. Many detection algorithms have been proposed. However, each object detector relies on specific assumptions of the object appearance and imaging conditions. As a consequence, no algorithm can be considered universal. With the large variety of object detectors, the subsequent question is how to select and combine them. In this paper, we propose a framework to learn how to combine object detectors. The proposed method uses (single) detectors like Deformable Part Models, Color Names and Ensemble of Exemplar-SVMs, and exploits their correlation by high-level contextual features to yield a combined detection list. Experiments on the PASCAL VOC07 and VOC10 data sets show that the proposed method significantly outperforms single object detectors, DPM (8.4%), CN (6.8%) and EES (17.0%) on VOC07 and DPM (6.5%), CN (5.5%) and EES (16.2%) on VOC10. We show with an experiment that there are no constraints on the type of the detector. The proposed method outperforms (2.4%) the state-of-the-art object detector (RCNN) on VOC07 when Regions with Convolutional Neural Network is combined with other detectors used in this paper.

  17. AUC-Maximizing Ensembles through Metalearning.

    PubMed

    LeDell, Erin; van der Laan, Mark J; Petersen, Maya

    2016-05-01

    Area Under the ROC Curve (AUC) is often used to measure the performance of an estimator in binary classification problems. An AUC-maximizing classifier can have significant advantages in cases where ranking correctness is valued or if the outcome is rare. In a Super Learner ensemble, maximization of the AUC can be achieved by the use of an AUC-maximining metalearning algorithm. We discuss an implementation of an AUC-maximization technique that is formulated as a nonlinear optimization problem. We also evaluate the effectiveness of a large number of different nonlinear optimization algorithms to maximize the cross-validated AUC of the ensemble fit. The results provide evidence that AUC-maximizing metalearners can, and often do, out-perform non-AUC-maximizing metalearning methods, with respect to ensemble AUC. The results also demonstrate that as the level of imbalance in the training data increases, the Super Learner ensemble outperforms the top base algorithm by a larger degree.

  18. AUC-Maximizing Ensembles through Metalearning

    PubMed Central

    LeDell, Erin; van der Laan, Mark J.; Peterson, Maya

    2016-01-01

    Area Under the ROC Curve (AUC) is often used to measure the performance of an estimator in binary classification problems. An AUC-maximizing classifier can have significant advantages in cases where ranking correctness is valued or if the outcome is rare. In a Super Learner ensemble, maximization of the AUC can be achieved by the use of an AUC-maximining metalearning algorithm. We discuss an implementation of an AUC-maximization technique that is formulated as a nonlinear optimization problem. We also evaluate the effectiveness of a large number of different nonlinear optimization algorithms to maximize the cross-validated AUC of the ensemble fit. The results provide evidence that AUC-maximizing metalearners can, and often do, out-perform non-AUC-maximizing metalearning methods, with respect to ensemble AUC. The results also demonstrate that as the level of imbalance in the training data increases, the Super Learner ensemble outperforms the top base algorithm by a larger degree. PMID:27227721

  19. Precision phase estimation based on weak-value amplification

    NASA Astrophysics Data System (ADS)

    Qiu, Xiaodong; Xie, Linguo; Liu, Xiong; Luo, Lan; Li, Zhaoxue; Zhang, Zhiyou; Du, Jinglei

    2017-02-01

    In this letter, we propose a precision method for phase estimation based on the weak-value amplification (WVA) technique using a monochromatic light source. The anomalous WVA significantly suppresses the technical noise with respect to the intensity difference signal induced by the phase delay when the post-selection procedure comes into play. The phase measured precision of this method is proportional to the weak-value of a polarization operator in the experimental range. Our results compete well with the wide spectrum light phase weak measurements and outperform the standard homodyne phase detection technique.

  20. Multiscale site-response mapping: A case study of Parkfield, California

    USGS Publications Warehouse

    Thompson, E.M.; Baise, L.G.; Kayen, R.E.; Morgan, E.C.; Kaklamanos, J.

    2011-01-01

    The scale of previously proposed methods for mapping site-response ranges from global coverage down to individual urban regions. Typically, spatial coverage and accuracy are inversely related.We use the densely spaced strong-motion stations in Parkfield, California, to estimate the accuracy of different site-response mapping methods and demonstrate a method for integrating multiple site-response estimates from the site to the global scale. This method is simply a weighted mean of a suite of different estimates, where the weights are the inverse of the variance of the individual estimates. Thus, the dominant site-response model varies in space as a function of the accuracy of the different models. For mapping applications, site-response models should be judged in terms of both spatial coverage and the degree of correlation with observed amplifications. Performance varies with period, but in general the Parkfield data show that: (1) where a velocity profile is available, the square-rootof- impedance (SRI) method outperforms the measured VS30 (30 m divided by the S-wave travel time to 30 m depth) and (2) where velocity profiles are unavailable, the topographic slope method outperforms surficial geology for short periods, but geology outperforms slope at longer periods. We develop new equations to estimate site response from topographic slope, derived from the Next Generation Attenuation (NGA) database.

  1. MAG4 versus alternative techniques for forecasting active region flare productivity.

    PubMed

    Falconer, David A; Moore, Ronald L; Barghouty, Abdulnasser F; Khazanov, Igor

    2014-05-01

    MAG4 is a technique of forecasting an active region's rate of production of major flares in the coming few days from a free magnetic energy proxy. We present a statistical method of measuring the difference in performance between MAG4 and comparable alternative techniques that forecast an active region's major-flare productivity from alternative observed aspects of the active region. We demonstrate the method by measuring the difference in performance between the "Present MAG4" technique and each of three alternative techniques, called "McIntosh Active-Region Class," "Total Magnetic Flux," and "Next MAG4." We do this by using (1) the MAG4 database of magnetograms and major flare histories of sunspot active regions, (2) the NOAA table of the major-flare productivity of each of 60 McIntosh active-region classes of sunspot active regions, and (3) five technique performance metrics (Heidke Skill Score, True Skill Score, Percent Correct, Probability of Detection, and False Alarm Rate) evaluated from 2000 random two-by-two contingency tables obtained from the databases. We find that (1) Present MAG4 far outperforms both McIntosh Active-Region Class and Total Magnetic Flux, (2) Next MAG4 significantly outperforms Present MAG4, (3) the performance of Next MAG4 is insensitive to the forward and backward temporal windows used, in the range of one to a few days, and (4) forecasting from the free-energy proxy in combination with either any broad category of McIntosh active-region classes or any Mount Wilson active-region class gives no significant performance improvement over forecasting from the free-energy proxy alone (Present MAG4). Quantitative comparison of performance of pairs of forecasting techniques Next MAG4 forecasts major flares more accurately than Present MAG4 Present MAG4 forecast outperforms McIntosh AR Class and total magnetic flux.

  2. MAG4 versus alternative techniques for forecasting active region flare productivity

    PubMed Central

    Falconer, David A; Moore, Ronald L; Barghouty, Abdulnasser F; Khazanov, Igor

    2014-01-01

    MAG4 is a technique of forecasting an active region's rate of production of major flares in the coming few days from a free magnetic energy proxy. We present a statistical method of measuring the difference in performance between MAG4 and comparable alternative techniques that forecast an active region's major-flare productivity from alternative observed aspects of the active region. We demonstrate the method by measuring the difference in performance between the “Present MAG4” technique and each of three alternative techniques, called “McIntosh Active-Region Class,” “Total Magnetic Flux,” and “Next MAG4.” We do this by using (1) the MAG4 database of magnetograms and major flare histories of sunspot active regions, (2) the NOAA table of the major-flare productivity of each of 60 McIntosh active-region classes of sunspot active regions, and (3) five technique performance metrics (Heidke Skill Score, True Skill Score, Percent Correct, Probability of Detection, and False Alarm Rate) evaluated from 2000 random two-by-two contingency tables obtained from the databases. We find that (1) Present MAG4 far outperforms both McIntosh Active-Region Class and Total Magnetic Flux, (2) Next MAG4 significantly outperforms Present MAG4, (3) the performance of Next MAG4 is insensitive to the forward and backward temporal windows used, in the range of one to a few days, and (4) forecasting from the free-energy proxy in combination with either any broad category of McIntosh active-region classes or any Mount Wilson active-region class gives no significant performance improvement over forecasting from the free-energy proxy alone (Present MAG4). Key Points Quantitative comparison of performance of pairs of forecasting techniques Next MAG4 forecasts major flares more accurately than Present MAG4 Present MAG4 forecast outperforms McIntosh AR Class and total magnetic flux PMID:26213517

  3. A large-scale evaluation of computational protein function prediction

    PubMed Central

    Radivojac, Predrag; Clark, Wyatt T; Ronnen Oron, Tal; Schnoes, Alexandra M; Wittkop, Tobias; Sokolov, Artem; Graim, Kiley; Funk, Christopher; Verspoor, Karin; Ben-Hur, Asa; Pandey, Gaurav; Yunes, Jeffrey M; Talwalkar, Ameet S; Repo, Susanna; Souza, Michael L; Piovesan, Damiano; Casadio, Rita; Wang, Zheng; Cheng, Jianlin; Fang, Hai; Gough, Julian; Koskinen, Patrik; Törönen, Petri; Nokso-Koivisto, Jussi; Holm, Liisa; Cozzetto, Domenico; Buchan, Daniel W A; Bryson, Kevin; Jones, David T; Limaye, Bhakti; Inamdar, Harshal; Datta, Avik; Manjari, Sunitha K; Joshi, Rajendra; Chitale, Meghana; Kihara, Daisuke; Lisewski, Andreas M; Erdin, Serkan; Venner, Eric; Lichtarge, Olivier; Rentzsch, Robert; Yang, Haixuan; Romero, Alfonso E; Bhat, Prajwal; Paccanaro, Alberto; Hamp, Tobias; Kassner, Rebecca; Seemayer, Stefan; Vicedo, Esmeralda; Schaefer, Christian; Achten, Dominik; Auer, Florian; Böhm, Ariane; Braun, Tatjana; Hecht, Maximilian; Heron, Mark; Hönigschmid, Peter; Hopf, Thomas; Kaufmann, Stefanie; Kiening, Michael; Krompass, Denis; Landerer, Cedric; Mahlich, Yannick; Roos, Manfred; Björne, Jari; Salakoski, Tapio; Wong, Andrew; Shatkay, Hagit; Gatzmann, Fanny; Sommer, Ingolf; Wass, Mark N; Sternberg, Michael J E; Škunca, Nives; Supek, Fran; Bošnjak, Matko; Panov, Panče; Džeroski, Sašo; Šmuc, Tomislav; Kourmpetis, Yiannis A I; van Dijk, Aalt D J; ter Braak, Cajo J F; Zhou, Yuanpeng; Gong, Qingtian; Dong, Xinran; Tian, Weidong; Falda, Marco; Fontana, Paolo; Lavezzo, Enrico; Di Camillo, Barbara; Toppo, Stefano; Lan, Liang; Djuric, Nemanja; Guo, Yuhong; Vucetic, Slobodan; Bairoch, Amos; Linial, Michal; Babbitt, Patricia C; Brenner, Steven E; Orengo, Christine; Rost, Burkhard; Mooney, Sean D; Friedberg, Iddo

    2013-01-01

    Automated annotation of protein function is challenging. As the number of sequenced genomes rapidly grows, the overwhelming majority of protein products can only be annotated computationally. If computational predictions are to be relied upon, it is crucial that the accuracy of these methods be high. Here we report the results from the first large-scale community-based Critical Assessment of protein Function Annotation (CAFA) experiment. Fifty-four methods representing the state-of-the-art for protein function prediction were evaluated on a target set of 866 proteins from eleven organisms. Two findings stand out: (i) today’s best protein function prediction algorithms significantly outperformed widely-used first-generation methods, with large gains on all types of targets; and (ii) although the top methods perform well enough to guide experiments, there is significant need for improvement of currently available tools. PMID:23353650

  4. Extracting laboratory test information from biomedical text

    PubMed Central

    Kang, Yanna Shen; Kayaalp, Mehmet

    2013-01-01

    Background: No previous study reported the efficacy of current natural language processing (NLP) methods for extracting laboratory test information from narrative documents. This study investigates the pathology informatics question of how accurately such information can be extracted from text with the current tools and techniques, especially machine learning and symbolic NLP methods. The study data came from a text corpus maintained by the U.S. Food and Drug Administration, containing a rich set of information on laboratory tests and test devices. Methods: The authors developed a symbolic information extraction (SIE) system to extract device and test specific information about four types of laboratory test entities: Specimens, analytes, units of measures and detection limits. They compared the performance of SIE and three prominent machine learning based NLP systems, LingPipe, GATE and BANNER, each implementing a distinct supervised machine learning method, hidden Markov models, support vector machines and conditional random fields, respectively. Results: Machine learning systems recognized laboratory test entities with moderately high recall, but low precision rates. Their recall rates were relatively higher when the number of distinct entity values (e.g., the spectrum of specimens) was very limited or when lexical morphology of the entity was distinctive (as in units of measures), yet SIE outperformed them with statistically significant margins on extracting specimen, analyte and detection limit information in both precision and F-measure. Its high recall performance was statistically significant on analyte information extraction. Conclusions: Despite its shortcomings against machine learning methods, a well-tailored symbolic system may better discern relevancy among a pile of information of the same type and may outperform a machine learning system by tapping into lexically non-local contextual information such as the document structure. PMID:24083058

  5. Robust binarization of degraded document images using heuristics

    NASA Astrophysics Data System (ADS)

    Parker, Jon; Frieder, Ophir; Frieder, Gideon

    2013-12-01

    Historically significant documents are often discovered with defects that make them difficult to read and analyze. This fact is particularly troublesome if the defects prevent software from performing an automated analysis. Image enhancement methods are used to remove or minimize document defects, improve software performance, and generally make images more legible. We describe an automated, image enhancement method that is input page independent and requires no training data. The approach applies to color or greyscale images with hand written script, typewritten text, images, and mixtures thereof. We evaluated the image enhancement method against the test images provided by the 2011 Document Image Binarization Contest (DIBCO). Our method outperforms all 2011 DIBCO entrants in terms of average F1 measure - doing so with a significantly lower variance than top contest entrants. The capability of the proposed method is also illustrated using select images from a collection of historic documents stored at Yad Vashem Holocaust Memorial in Israel.

  6. Diagnosing Autism Spectrum Disorder from Brain Resting-State Functional Connectivity Patterns Using a Deep Neural Network with a Novel Feature Selection Method.

    PubMed

    Guo, Xinyu; Dominick, Kelli C; Minai, Ali A; Li, Hailong; Erickson, Craig A; Lu, Long J

    2017-01-01

    The whole-brain functional connectivity (FC) pattern obtained from resting-state functional magnetic resonance imaging data are commonly applied to study neuropsychiatric conditions such as autism spectrum disorder (ASD) by using different machine learning models. Recent studies indicate that both hyper- and hypo- aberrant ASD-associated FCs were widely distributed throughout the entire brain rather than only in some specific brain regions. Deep neural networks (DNN) with multiple hidden layers have shown the ability to systematically extract lower-to-higher level information from high dimensional data across a series of neural hidden layers, significantly improving classification accuracy for such data. In this study, a DNN with a novel feature selection method (DNN-FS) is developed for the high dimensional whole-brain resting-state FC pattern classification of ASD patients vs. typical development (TD) controls. The feature selection method is able to help the DNN generate low dimensional high-quality representations of the whole-brain FC patterns by selecting features with high discriminating power from multiple trained sparse auto-encoders. For the comparison, a DNN without the feature selection method (DNN-woFS) is developed, and both of them are tested with different architectures (i.e., with different numbers of hidden layers/nodes). Results show that the best classification accuracy of 86.36% is generated by the DNN-FS approach with 3 hidden layers and 150 hidden nodes (3/150). Remarkably, DNN-FS outperforms DNN-woFS for all architectures studied. The most significant accuracy improvement was 9.09% with the 3/150 architecture. The method also outperforms other feature selection methods, e.g., two sample t -test and elastic net. In addition to improving the classification accuracy, a Fisher's score-based biomarker identification method based on the DNN is also developed, and used to identify 32 FCs related to ASD. These FCs come from or cross different pre-defined brain networks including the default-mode, cingulo-opercular, frontal-parietal, and cerebellum. Thirteen of them are statically significant between ASD and TD groups (two sample t -test p < 0.05) while 19 of them are not. The relationship between the statically significant FCs and the corresponding ASD behavior symptoms is discussed based on the literature and clinician's expert knowledge. Meanwhile, the potential reason of obtaining 19 FCs which are not statistically significant is also provided.

  7. Deep Ensemble Learning for Monaural Speech Separation

    DTIC Science & Technology

    2015-02-01

    cse.ohio- state.edu). typically predict the ideal binary mask (IBM) or ideal ratio mask ( IRM ). For the IBM [21], a T-F unit is assigned 1, if the signal...dominance. For the IRM [17], a T- F unit is assigned some ratio of target energy and mixture energy. Kim et al. [15] used Gaussian mixture models (GMM...significantly outperforms earlier separation methods. Subsequently, Wang et al. [23] examined a number of training targets and suggested that the IRM should be

  8. An image retrieval framework for real-time endoscopic image retargeting.

    PubMed

    Ye, Menglong; Johns, Edward; Walter, Benjamin; Meining, Alexander; Yang, Guang-Zhong

    2017-08-01

    Serial endoscopic examinations of a patient are important for early diagnosis of malignancies in the gastrointestinal tract. However, retargeting for optical biopsy is challenging due to extensive tissue variations between examinations, requiring the method to be tolerant to these changes whilst enabling real-time retargeting. This work presents an image retrieval framework for inter-examination retargeting. We propose both a novel image descriptor tolerant of long-term tissue changes and a novel descriptor matching method in real time. The descriptor is based on histograms generated from regional intensity comparisons over multiple scales, offering stability over long-term appearance changes at the higher levels, whilst remaining discriminative at the lower levels. The matching method then learns a hashing function using random forests, to compress the string and allow for fast image comparison by a simple Hamming distance metric. A dataset that contains 13 in vivo gastrointestinal videos was collected from six patients, representing serial examinations of each patient, which includes videos captured with significant time intervals. Precision-recall for retargeting shows that our new descriptor outperforms a number of alternative descriptors, whilst our hashing method outperforms a number of alternative hashing approaches. We have proposed a novel framework for optical biopsy in serial endoscopic examinations. A new descriptor, combined with a novel hashing method, achieves state-of-the-art retargeting, with validation on in vivo videos from six patients. Real-time performance also allows for practical integration without disturbing the existing clinical workflow.

  9. Classification of clinically useful sentences in clinical evidence resources.

    PubMed

    Morid, Mohammad Amin; Fiszman, Marcelo; Raja, Kalpana; Jonnalagadda, Siddhartha R; Del Fiol, Guilherme

    2016-04-01

    Most patient care questions raised by clinicians can be answered by online clinical knowledge resources. However, important barriers still challenge the use of these resources at the point of care. To design and assess a method for extracting clinically useful sentences from synthesized online clinical resources that represent the most clinically useful information for directly answering clinicians' information needs. We developed a Kernel-based Bayesian Network classification model based on different domain-specific feature types extracted from sentences in a gold standard composed of 18 UpToDate documents. These features included UMLS concepts and their semantic groups, semantic predications extracted by SemRep, patient population identified by a pattern-based natural language processing (NLP) algorithm, and cue words extracted by a feature selection technique. Algorithm performance was measured in terms of precision, recall, and F-measure. The feature-rich approach yielded an F-measure of 74% versus 37% for a feature co-occurrence method (p<0.001). Excluding predication, population, semantic concept or text-based features reduced the F-measure to 62%, 66%, 58% and 69% respectively (p<0.01). The classifier applied to Medline sentences reached an F-measure of 73%, which is equivalent to the performance of the classifier on UpToDate sentences (p=0.62). The feature-rich approach significantly outperformed general baseline methods. This approach significantly outperformed classifiers based on a single type of feature. Different types of semantic features provided a unique contribution to overall classification performance. The classifier's model and features used for UpToDate generalized well to Medline abstracts. Copyright © 2016 Elsevier Inc. All rights reserved.

  10. Multivariate decoding of brain images using ordinal regression.

    PubMed

    Doyle, O M; Ashburner, J; Zelaya, F O; Williams, S C R; Mehta, M A; Marquand, A F

    2013-11-01

    Neuroimaging data are increasingly being used to predict potential outcomes or groupings, such as clinical severity, drug dose response, and transitional illness states. In these examples, the variable (target) we want to predict is ordinal in nature. Conventional classification schemes assume that the targets are nominal and hence ignore their ranked nature, whereas parametric and/or non-parametric regression models enforce a metric notion of distance between classes. Here, we propose a novel, alternative multivariate approach that overcomes these limitations - whole brain probabilistic ordinal regression using a Gaussian process framework. We applied this technique to two data sets of pharmacological neuroimaging data from healthy volunteers. The first study was designed to investigate the effect of ketamine on brain activity and its subsequent modulation with two compounds - lamotrigine and risperidone. The second study investigates the effect of scopolamine on cerebral blood flow and its modulation using donepezil. We compared ordinal regression to multi-class classification schemes and metric regression. Considering the modulation of ketamine with lamotrigine, we found that ordinal regression significantly outperformed multi-class classification and metric regression in terms of accuracy and mean absolute error. However, for risperidone ordinal regression significantly outperformed metric regression but performed similarly to multi-class classification both in terms of accuracy and mean absolute error. For the scopolamine data set, ordinal regression was found to outperform both multi-class and metric regression techniques considering the regional cerebral blood flow in the anterior cingulate cortex. Ordinal regression was thus the only method that performed well in all cases. Our results indicate the potential of an ordinal regression approach for neuroimaging data while providing a fully probabilistic framework with elegant approaches for model selection. Copyright © 2013. Published by Elsevier Inc.

  11. Deep visual-semantic for crowded video understanding

    NASA Astrophysics Data System (ADS)

    Deng, Chunhua; Zhang, Junwen

    2018-03-01

    Visual-semantic features play a vital role for crowded video understanding. Convolutional Neural Networks (CNNs) have experienced a significant breakthrough in learning representations from images. However, the learning of visualsemantic features, and how it can be effectively extracted for video analysis, still remains a challenging task. In this study, we propose a novel visual-semantic method to capture both appearance and dynamic representations. In particular, we propose a spatial context method, based on the fractional Fisher vector (FV) encoding on CNN features, which can be regarded as our main contribution. In addition, to capture temporal context information, we also applied fractional encoding method on dynamic images. Experimental results on the WWW crowed video dataset demonstrate that the proposed method outperform the state of the art.

  12. Prediction of fatigue-related driver performance from EEG data by deep Riemannian model.

    PubMed

    Hajinoroozi, Mehdi; Jianqiu Zhang; Yufei Huang

    2017-07-01

    Prediction of the drivers' drowsy and alert states is important for safety purposes. The prediction of drivers' drowsy and alert states from electroencephalography (EEG) using shallow and deep Riemannian methods is presented. For shallow Riemannian methods, the minimum distance to Riemannian mean (mdm) and Log-Euclidian metric are investigated, where it is shown that Log-Euclidian metric outperforms the mdm algorithm. In addition the SPDNet, a deep Riemannian model, that takes the EEG covariance matrix as the input is investigated. It is shown that SPDNet outperforms all tested shallow and deep classification methods. Performance of SPDNet is 6.02% and 2.86% higher than the best performance by the conventional Euclidian classifiers and shallow Riemannian models, respectively.

  13. Theory of mind may be contagious, but you don't catch it from your twin.

    PubMed

    Wright Cassidy, Kimberly; Shaw Fineberg, Deborah; Brown, Kimberly; Perkins, Alexis

    2005-01-01

    The theory-of-mind abilities of twins, children with nontwin siblings, and only children were compared to investigate further the link between number and type of siblings and theory-of-mind abilities. Three- to 5-year-old children with nontwin siblings outperformed both only children and twins with no other siblings, twins who also had other siblings outperformed twins who did not, and children with at least 1 opposite-sex sibling outperformed children with only same-sex siblings. Twins performed significantly better when asked about the false beliefs of their twins than they did when asked about the false beliefs of their friends. Results are discussed in terms of potential mechanisms that may account for the twin and sibling effects.

  14. Using Objective Structured Clinical Examinations to Assess Intern Orthopaedic Physical Examination Skills: A Multimodal Didactic Comparison.

    PubMed

    Phillips, Donna; Pean, Christian A; Allen, Kathleen; Zuckerman, Joseph; Egol, Kenneth

    Patient care is 1 of the 6 core competencies defined by the Accreditation Council for Graduate Medical Education (ACGME). The physical examination (PE) is a fundamental skill to evaluate patients and make an accurate diagnosis. The purpose of this study was to investigate 3 different methods to teach PE skills and to assess the ability to do a complete PE in a simulated patient encounter. Prospective, uncontrolled, observational. Northeastern academic medical center. A total of 32 orthopedic surgery residents participated and were divided into 3 didactic groups: Group 1 (n = 12) live interactive lectures, demonstration on standardized patients, and textbook reading; Group 2 (n = 11) video recordings of the lectures given to Group 1 and textbook reading alone; Group 3 (n = 9): 90-minute modules taught by residents to interns in near-peer format and textbook reading. The overall score for objective structured clinical examinations from the combined groups was 66%. There was a trend toward more complete PEs in Group 1 taught via live lectures and demonstrations compared to Group 2 that relied on video recording. Near-peer taught residents from Group 3 significantly outperformed Group 2 residents overall (p = 0.02), and trended toward significantly outperforming Group 1 residents as well, with significantly higher scores in the ankle (p = 0.02) and shoulder (p = 0.02) PE cases. This study found that orthopedic interns taught musculoskeletal PE skills by near-peers outperformed other groups overall. An overall score of 66% for the combined didactic groups suggests a baseline deficit in first-year resident musculoskeletal PE skills. The PE should continue to be taught and objectively assessed throughout residency to confirm that budding surgeons have mastered these fundamental skills before going into practice. Copyright © 2017 Association of Program Directors in Surgery. Published by Elsevier Inc. All rights reserved.

  15. Analyzing Enron Data: Bitmap Indexing Outperforms MySQL Queries bySeveral Orders of Magnitude

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Stockinger, Kurt; Rotem, Doron; Shoshani, Arie

    2006-01-28

    FastBit is an efficient, compressed bitmap indexing technology that was developed in our group. In this report we evaluate the performance of MySQL and FastBit for analyzing the email traffic of the Enron dataset. The first finding shows that materializing the join results of several tables significantly improves the query performance. The second finding shows that FastBit outperforms MySQL by several orders of magnitude.

  16. Prediction of heterotrimeric protein complexes by two-phase learning using neighboring kernels

    PubMed Central

    2014-01-01

    Background Protein complexes play important roles in biological systems such as gene regulatory networks and metabolic pathways. Most methods for predicting protein complexes try to find protein complexes with size more than three. It, however, is known that protein complexes with smaller sizes occupy a large part of whole complexes for several species. In our previous work, we developed a method with several feature space mappings and the domain composition kernel for prediction of heterodimeric protein complexes, which outperforms existing methods. Results We propose methods for prediction of heterotrimeric protein complexes by extending techniques in the previous work on the basis of the idea that most heterotrimeric protein complexes are not likely to share the same protein with each other. We make use of the discriminant function in support vector machines (SVMs), and design novel feature space mappings for the second phase. As the second classifier, we examine SVMs and relevance vector machines (RVMs). We perform 10-fold cross-validation computational experiments. The results suggest that our proposed two-phase methods and SVM with the extended features outperform the existing method NWE, which was reported to outperform other existing methods such as MCL, MCODE, DPClus, CMC, COACH, RRW, and PPSampler for prediction of heterotrimeric protein complexes. Conclusions We propose two-phase prediction methods with the extended features, the domain composition kernel, SVMs and RVMs. The two-phase method with the extended features and the domain composition kernel using SVM as the second classifier is particularly useful for prediction of heterotrimeric protein complexes. PMID:24564744

  17. Colorectal cancer staging: comparison of whole-body PET/CT and PET/MR.

    PubMed

    Catalano, Onofrio A; Coutinho, Artur M; Sahani, Dushyant V; Vangel, Mark G; Gee, Michael S; Hahn, Peter F; Witzel, Thomas; Soricelli, Andrea; Salvatore, Marco; Catana, Ciprian; Mahmood, Umar; Rosen, Bruce R; Gervais, Debra

    2017-04-01

    Correct staging is imperative for colorectal cancer (CRC) since it influences both prognosis and management. Several imaging methods are used for this purpose, with variable performance. Positron emission tomography-magnetic resonance (PET/MR) is an innovative imaging technique recently employed for clinical application. The present study was undertaken to compare the staging accuracy of whole-body positron emission tomography-computed tomography (PET/CT) with whole-body PET/MR in patients with both newly diagnosed and treated colorectal cancer. Twenty-six patients, who underwent same day whole-body (WB) PET/CT and WB-PET/MR, were evaluated. PET/CT and PET/MR studies were interpreted by consensus by a radiologist and a nuclear medicine physician. Correlations with prior imaging and follow-up studies were used as the reference standard. Correct staging was compared between methods using McNemar's Chi square test. The two methods were in agreement and correct for 18/26 (69%) patients, and in agreement and incorrect for one patient (3.8%). PET/MR and PET/CT stages for the remaining 7/26 patients (27%) were discordant, with PET/MR staging being correct in all seven cases. PET/MR significantly outperformed PET/CT overall for accurate staging (P = 0.02). PET/MR outperformed PET/CT in CRC staging. PET/MR might allow accurate local and distant staging of CRC patients during both at the time of diagnosis and during follow-up.

  18. From free energy to expected energy: Improving energy-based value function approximation in reinforcement learning.

    PubMed

    Elfwing, Stefan; Uchibe, Eiji; Doya, Kenji

    2016-12-01

    Free-energy based reinforcement learning (FERL) was proposed for learning in high-dimensional state and action spaces. However, the FERL method does only really work well with binary, or close to binary, state input, where the number of active states is fewer than the number of non-active states. In the FERL method, the value function is approximated by the negative free energy of a restricted Boltzmann machine (RBM). In our earlier study, we demonstrated that the performance and the robustness of the FERL method can be improved by scaling the free energy by a constant that is related to the size of network. In this study, we propose that RBM function approximation can be further improved by approximating the value function by the negative expected energy (EERL), instead of the negative free energy, as well as being able to handle continuous state input. We validate our proposed method by demonstrating that EERL: (1) outperforms FERL, as well as standard neural network and linear function approximation, for three versions of a gridworld task with high-dimensional image state input; (2) achieves new state-of-the-art results in stochastic SZ-Tetris in both model-free and model-based learning settings; and (3) significantly outperforms FERL and standard neural network function approximation for a robot navigation task with raw and noisy RGB images as state input and a large number of actions. Copyright © 2016 The Author(s). Published by Elsevier Ltd.. All rights reserved.

  19. A Review of Depth and Normal Fusion Algorithms

    PubMed Central

    Štolc, Svorad; Pock, Thomas

    2018-01-01

    Geometric surface information such as depth maps and surface normals can be acquired by various methods such as stereo light fields, shape from shading and photometric stereo techniques. We compare several algorithms which deal with the combination of depth with surface normal information in order to reconstruct a refined depth map. The reasons for performance differences are examined from the perspective of alternative formulations of surface normals for depth reconstruction. We review and analyze methods in a systematic way. Based on our findings, we introduce a new generalized fusion method, which is formulated as a least squares problem and outperforms previous methods in the depth error domain by introducing a novel normal weighting that performs closer to the geodesic distance measure. Furthermore, a novel method is introduced based on Total Generalized Variation (TGV) which further outperforms previous approaches in terms of the geodesic normal distance error and maintains comparable quality in the depth error domain. PMID:29389903

  20. A Penalized Robust Method for Identifying Gene-Environment Interactions

    PubMed Central

    Shi, Xingjie; Liu, Jin; Huang, Jian; Zhou, Yong; Xie, Yang; Ma, Shuangge

    2015-01-01

    In high-throughput studies, an important objective is to identify gene-environment interactions associated with disease outcomes and phenotypes. Many commonly adopted methods assume specific parametric or semiparametric models, which may be subject to model mis-specification. In addition, they usually use significance level as the criterion for selecting important interactions. In this study, we adopt the rank-based estimation, which is much less sensitive to model specification than some of the existing methods and includes several commonly encountered data and models as special cases. Penalization is adopted for the identification of gene-environment interactions. It achieves simultaneous estimation and identification and does not rely on significance level. For computation feasibility, a smoothed rank estimation is further proposed. Simulation shows that under certain scenarios, for example with contaminated or heavy-tailed data, the proposed method can significantly outperform the existing alternatives with more accurate identification. We analyze a lung cancer prognosis study with gene expression measurements under the AFT (accelerated failure time) model. The proposed method identifies interactions different from those using the alternatives. Some of the identified genes have important implications. PMID:24616063

  1. Resource-constrained scheduling with hard due windows and rejection penalties

    NASA Astrophysics Data System (ADS)

    Garcia, Christopher

    2016-09-01

    This work studies a scheduling problem where each job must be either accepted and scheduled to complete within its specified due window, or rejected altogether. Each job has a certain processing time and contributes a certain profit if accepted or penalty cost if rejected. There is a set of renewable resources, and no resource limit can be exceeded at any time. Each job requires a certain amount of each resource when processed, and the objective is to maximize total profit. A mixed-integer programming formulation and three approximation algorithms are presented: a priority rule heuristic, an algorithm based on the metaheuristic for randomized priority search and an evolutionary algorithm. Computational experiments comparing these four solution methods were performed on a set of generated benchmark problems covering a wide range of problem characteristics. The evolutionary algorithm outperformed the other methods in most cases, often significantly, and never significantly underperformed any method.

  2. GOTHiC, a probabilistic model to resolve complex biases and to identify real interactions in Hi-C data.

    PubMed

    Mifsud, Borbala; Martincorena, Inigo; Darbo, Elodie; Sugar, Robert; Schoenfelder, Stefan; Fraser, Peter; Luscombe, Nicholas M

    2017-01-01

    Hi-C is one of the main methods for investigating spatial co-localisation of DNA in the nucleus. However, the raw sequencing data obtained from Hi-C experiments suffer from large biases and spurious contacts, making it difficult to identify true interactions. Existing methods use complex models to account for biases and do not provide a significance threshold for detecting interactions. Here we introduce a simple binomial probabilistic model that resolves complex biases and distinguishes between true and false interactions. The model corrects biases of known and unknown origin and yields a p-value for each interaction, providing a reliable threshold based on significance. We demonstrate this experimentally by testing the method against a random ligation dataset. Our method outperforms previous methods and provides a statistical framework for further data analysis, such as comparisons of Hi-C interactions between different conditions. GOTHiC is available as a BioConductor package (http://www.bioconductor.org/packages/release/bioc/html/GOTHiC.html).

  3. Degree-strength correlation reveals anomalous trading behavior.

    PubMed

    Sun, Xiao-Qian; Shen, Hua-Wei; Cheng, Xue-Qi; Wang, Zhao-Yang

    2012-01-01

    Manipulation is an important issue for both developed and emerging stock markets. Many efforts have been made to detect manipulation in stock markets. However, it is still an open problem to identify the fraudulent traders, especially when they collude with each other. In this paper, we focus on the problem of identifying the anomalous traders using the transaction data of eight manipulated stocks and forty-four non-manipulated stocks during a one-year period. By analyzing the trading networks of stocks, we find that the trading networks of manipulated stocks exhibit significantly higher degree-strength correlation than the trading networks of non-manipulated stocks and the randomized trading networks. We further propose a method to detect anomalous traders of manipulated stocks based on statistical significance analysis of degree-strength correlation. Experimental results demonstrate that our method is effective at distinguishing the manipulated stocks from non-manipulated ones. Our method outperforms the traditional weight-threshold method at identifying the anomalous traders in manipulated stocks. More importantly, our method is difficult to be fooled by colluded traders.

  4. Butch-Femme Identity and Visuospatial Performance Among Lesbian and Bisexual Women in China.

    PubMed

    Zheng, Lijun; Wen, Guangju; Zheng, Yong

    2018-05-01

    Lesbian and bisexual women who self-identify as "butch" show a masculine profile with regard to gender roles, gender nonconformity, and systemizing cognitive style, whereas lesbian and bisexual women who self-identify as "femme" show a corresponding feminine profile and those who self-identify as "androgynes" show an intermediate profile. This study examined the association between butch or femme lesbian or bisexual identity and visuospatial ability among 323 lesbian and bisexual women, compared to heterosexual women (n = 207) and men (n = 125), from multiple cities in China. Visuospatial ability was assessed using a Shepard and Metzler-type mental rotation task and Judgment of Line Angle and Position (JLAP) test on the Internet. Heterosexual men outperformed heterosexual women on both mental rotation and JLAP tasks. Lesbian and bisexual women outperformed heterosexual women on mental rotation, but not on JLAP. There were significant differences in mental rotation performance among women, with butch- and androgyne-identified lesbian/bisexual women outperforming femme-identified and heterosexual women. There were also significant differences in JLAP performance among women, with butch- and androgyne-identified lesbian/bisexual women and heterosexual women outperforming femme-identified lesbian/bisexual women. The butch-femme differences in visuospatial ability indicated an association between cognitive ability and butch-femme identity and suggest that neurobiological underpinnings may contribute to butch-femme identity although alternative explanations exist.

  5. New Method of Calculating a Multiplication by using the Generalized Bernstein-Vazirani Algorithm

    NASA Astrophysics Data System (ADS)

    Nagata, Koji; Nakamura, Tadao; Geurdes, Han; Batle, Josep; Abdalla, Soliman; Farouk, Ahmed

    2018-06-01

    We present a new method of more speedily calculating a multiplication by using the generalized Bernstein-Vazirani algorithm and many parallel quantum systems. Given the set of real values a1,a2,a3,\\ldots ,aN and a function g:bf {R}→ {0,1}, we shall determine the following values g(a1),g(a2),g(a3),\\ldots , g(aN) simultaneously. The speed of determining the values is shown to outperform the classical case by a factor of N. Next, we consider it as a number in binary representation; M 1 = ( g( a 1), g( a 2), g( a 3),…, g( a N )). By using M parallel quantum systems, we have M numbers in binary representation, simultaneously. The speed of obtaining the M numbers is shown to outperform the classical case by a factor of M. Finally, we calculate the product; M1× M2× \\cdots × MM. The speed of obtaining the product is shown to outperform the classical case by a factor of N × M.

  6. Adaptive Distributed Video Coding with Correlation Estimation using Expectation Propagation

    PubMed Central

    Cui, Lijuan; Wang, Shuang; Jiang, Xiaoqian; Cheng, Samuel

    2013-01-01

    Distributed video coding (DVC) is rapidly increasing in popularity by the way of shifting the complexity from encoder to decoder, whereas no compression performance degrades, at least in theory. In contrast with conventional video codecs, the inter-frame correlation in DVC is explored at decoder based on the received syndromes of Wyner-Ziv (WZ) frame and side information (SI) frame generated from other frames available only at decoder. However, the ultimate decoding performances of DVC are based on the assumption that the perfect knowledge of correlation statistic between WZ and SI frames should be available at decoder. Therefore, the ability of obtaining a good statistical correlation estimate is becoming increasingly important in practical DVC implementations. Generally, the existing correlation estimation methods in DVC can be classified into two main types: pre-estimation where estimation starts before decoding and on-the-fly (OTF) estimation where estimation can be refined iteratively during decoding. As potential changes between frames might be unpredictable or dynamical, OTF estimation methods usually outperforms pre-estimation techniques with the cost of increased decoding complexity (e.g., sampling methods). In this paper, we propose a low complexity adaptive DVC scheme using expectation propagation (EP), where correlation estimation is performed OTF as it is carried out jointly with decoding of the factor graph-based DVC code. Among different approximate inference methods, EP generally offers better tradeoff between accuracy and complexity. Experimental results show that our proposed scheme outperforms the benchmark state-of-the-art DISCOVER codec and other cases without correlation tracking, and achieves comparable decoding performance but with significantly low complexity comparing with sampling method. PMID:23750314

  7. Adaptive distributed video coding with correlation estimation using expectation propagation

    NASA Astrophysics Data System (ADS)

    Cui, Lijuan; Wang, Shuang; Jiang, Xiaoqian; Cheng, Samuel

    2012-10-01

    Distributed video coding (DVC) is rapidly increasing in popularity by the way of shifting the complexity from encoder to decoder, whereas no compression performance degrades, at least in theory. In contrast with conventional video codecs, the inter-frame correlation in DVC is explored at decoder based on the received syndromes of Wyner-Ziv (WZ) frame and side information (SI) frame generated from other frames available only at decoder. However, the ultimate decoding performances of DVC are based on the assumption that the perfect knowledge of correlation statistic between WZ and SI frames should be available at decoder. Therefore, the ability of obtaining a good statistical correlation estimate is becoming increasingly important in practical DVC implementations. Generally, the existing correlation estimation methods in DVC can be classified into two main types: pre-estimation where estimation starts before decoding and on-the-fly (OTF) estimation where estimation can be refined iteratively during decoding. As potential changes between frames might be unpredictable or dynamical, OTF estimation methods usually outperforms pre-estimation techniques with the cost of increased decoding complexity (e.g., sampling methods). In this paper, we propose a low complexity adaptive DVC scheme using expectation propagation (EP), where correlation estimation is performed OTF as it is carried out jointly with decoding of the factor graph-based DVC code. Among different approximate inference methods, EP generally offers better tradeoff between accuracy and complexity. Experimental results show that our proposed scheme outperforms the benchmark state-of-the-art DISCOVER codec and other cases without correlation tracking, and achieves comparable decoding performance but with significantly low complexity comparing with sampling method.

  8. A quantitative experimental phantom study on MRI image uniformity.

    PubMed

    Felemban, Doaa; Verdonschot, Rinus G; Iwamoto, Yuri; Uchiyama, Yuka; Kakimoto, Naoya; Kreiborg, Sven; Murakami, Shumei

    2018-05-23

    Our goal was to assess MR image uniformity by investigating aspects influencing said uniformity via a method laid out by the National Electrical Manufacturers Association (NEMA). Six metallic materials embedded in a glass phantom were scanned (i.e. Au, Ag, Al, Au-Ag-Pd alloy, Ti and Co-Cr alloy) as well as a reference image. Sequences included spin echo (SE) and gradient echo (GRE) scanned in three planes (i.e. axial, coronal, and sagittal). Moreover, three surface coil types (i.e. head and neck, Brain, and temporomandibular joint coils) and two image correction methods (i.e. surface coil intensity correction or SCIC, phased array uniformity enhancement or PURE) were employed to evaluate their effectiveness on image uniformity. Image uniformity was assessed using the National Electrical Manufacturers Association peak-deviation non-uniformity method. Results showed that temporomandibular joint coils elicited the least uniform image and brain coils outperformed head and neck coils when metallic materials were present. Additionally, when metallic materials were present, spin echo outperformed gradient echo especially for Co-Cr (particularly in the axial plane). Furthermore, both SCIC and PURE improved image uniformity compared to uncorrected images, and SCIC slightly surpassed PURE when metallic metals were present. Lastly, Co-Cr elicited the least uniform image while other metallic materials generally showed similar patterns (i.e. no significant deviation from images without metallic metals). Overall, a quantitative understanding of the factors influencing MR image uniformity (e.g. coil type, imaging method, metal susceptibility, and post-hoc correction method) is advantageous to optimize image quality, assists clinical interpretation, and may result in improved medical and dental care.

  9. Adaptive Distributed Video Coding with Correlation Estimation using Expectation Propagation.

    PubMed

    Cui, Lijuan; Wang, Shuang; Jiang, Xiaoqian; Cheng, Samuel

    2012-10-15

    Distributed video coding (DVC) is rapidly increasing in popularity by the way of shifting the complexity from encoder to decoder, whereas no compression performance degrades, at least in theory. In contrast with conventional video codecs, the inter-frame correlation in DVC is explored at decoder based on the received syndromes of Wyner-Ziv (WZ) frame and side information (SI) frame generated from other frames available only at decoder. However, the ultimate decoding performances of DVC are based on the assumption that the perfect knowledge of correlation statistic between WZ and SI frames should be available at decoder. Therefore, the ability of obtaining a good statistical correlation estimate is becoming increasingly important in practical DVC implementations. Generally, the existing correlation estimation methods in DVC can be classified into two main types: pre-estimation where estimation starts before decoding and on-the-fly (OTF) estimation where estimation can be refined iteratively during decoding. As potential changes between frames might be unpredictable or dynamical, OTF estimation methods usually outperforms pre-estimation techniques with the cost of increased decoding complexity (e.g., sampling methods). In this paper, we propose a low complexity adaptive DVC scheme using expectation propagation (EP), where correlation estimation is performed OTF as it is carried out jointly with decoding of the factor graph-based DVC code. Among different approximate inference methods, EP generally offers better tradeoff between accuracy and complexity. Experimental results show that our proposed scheme outperforms the benchmark state-of-the-art DISCOVER codec and other cases without correlation tracking, and achieves comparable decoding performance but with significantly low complexity comparing with sampling method.

  10. A ℓ2, 1 norm regularized multi-kernel learning for false positive reduction in Lung nodule CAD.

    PubMed

    Cao, Peng; Liu, Xiaoli; Zhang, Jian; Li, Wei; Zhao, Dazhe; Huang, Min; Zaiane, Osmar

    2017-03-01

    The aim of this paper is to describe a novel algorithm for False Positive Reduction in lung nodule Computer Aided Detection(CAD). In this paper, we describes a new CT lung CAD method which aims to detect solid nodules. Specially, we proposed a multi-kernel classifier with a ℓ 2, 1 norm regularizer for heterogeneous feature fusion and selection from the feature subset level, and designed two efficient strategies to optimize the parameters of kernel weights in non-smooth ℓ 2, 1 regularized multiple kernel learning algorithm. The first optimization algorithm adapts a proximal gradient method for solving the ℓ 2, 1 norm of kernel weights, and use an accelerated method based on FISTA; the second one employs an iterative scheme based on an approximate gradient descent method. The results demonstrates that the FISTA-style accelerated proximal descent method is efficient for the ℓ 2, 1 norm formulation of multiple kernel learning with the theoretical guarantee of the convergence rate. Moreover, the experimental results demonstrate the effectiveness of the proposed methods in terms of Geometric mean (G-mean) and Area under the ROC curve (AUC), and significantly outperforms the competing methods. The proposed approach exhibits some remarkable advantages both in heterogeneous feature subsets fusion and classification phases. Compared with the fusion strategies of feature-level and decision level, the proposed ℓ 2, 1 norm multi-kernel learning algorithm is able to accurately fuse the complementary and heterogeneous feature sets, and automatically prune the irrelevant and redundant feature subsets to form a more discriminative feature set, leading a promising classification performance. Moreover, the proposed algorithm consistently outperforms the comparable classification approaches in the literature. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  11. Optimized extreme learning machine for urban land cover classification using hyperspectral imagery

    NASA Astrophysics Data System (ADS)

    Su, Hongjun; Tian, Shufang; Cai, Yue; Sheng, Yehua; Chen, Chen; Najafian, Maryam

    2017-12-01

    This work presents a new urban land cover classification framework using the firefly algorithm (FA) optimized extreme learning machine (ELM). FA is adopted to optimize the regularization coefficient C and Gaussian kernel σ for kernel ELM. Additionally, effectiveness of spectral features derived from an FA-based band selection algorithm is studied for the proposed classification task. Three sets of hyperspectral databases were recorded using different sensors, namely HYDICE, HyMap, and AVIRIS. Our study shows that the proposed method outperforms traditional classification algorithms such as SVM and reduces computational cost significantly.

  12. Aberration corrections for free-space optical communications in atmosphere turbulence using orbital angular momentum states.

    PubMed

    Zhao, S M; Leach, J; Gong, L Y; Ding, J; Zheng, B Y

    2012-01-02

    The effect of atmosphere turbulence on light's spatial structure compromises the information capacity of photons carrying the Orbital Angular Momentum (OAM) in free-space optical (FSO) communications. In this paper, we study two aberration correction methods to mitigate this effect. The first one is the Shack-Hartmann wavefront correction method, which is based on the Zernike polynomials, and the second is a phase correction method specific to OAM states. Our numerical results show that the phase correction method for OAM states outperforms the Shark-Hartmann wavefront correction method, although both methods improve significantly purity of a single OAM state and the channel capacities of FSO communication link. At the same time, our experimental results show that the values of participation functions go down at the phase correction method for OAM states, i.e., the correction method ameliorates effectively the bad effect of atmosphere turbulence.

  13. Echo state networks with filter neurons and a delay&sum readout.

    PubMed

    Holzmann, Georg; Hauser, Helmut

    2010-03-01

    Echo state networks (ESNs) are a novel approach to recurrent neural network training with the advantage of a very simple and linear learning algorithm. It has been demonstrated that ESNs outperform other methods on a number of benchmark tasks. Although the approach is appealing, there are still some inherent limitations in the original formulation. Here we suggest two enhancements of this network model. First, the previously proposed idea of filters in neurons is extended to arbitrary infinite impulse response (IIR) filter neurons. This enables such networks to learn multiple attractors and signals at different timescales, which is especially important for modeling real-world time series. Second, a delay&sum readout is introduced, which adds trainable delays in the synaptic connections of output neurons and therefore vastly improves the memory capacity of echo state networks. It is shown in commonly used benchmark tasks and real-world examples, that this new structure is able to significantly outperform standard ESNs and other state-of-the-art models for nonlinear dynamical system modeling. Copyright 2009 Elsevier Ltd. All rights reserved.

  14. Joint Concept Correlation and Feature-Concept Relevance Learning for Multilabel Classification.

    PubMed

    Zhao, Xiaowei; Ma, Zhigang; Li, Zhi; Li, Zhihui

    2018-02-01

    In recent years, multilabel classification has attracted significant attention in multimedia annotation. However, most of the multilabel classification methods focus only on the inherent correlations existing among multiple labels and concepts and ignore the relevance between features and the target concepts. To obtain more robust multilabel classification results, we propose a new multilabel classification method aiming to capture the correlations among multiple concepts by leveraging hypergraph that is proved to be beneficial for relational learning. Moreover, we consider mining feature-concept relevance, which is often overlooked by many multilabel learning algorithms. To better show the feature-concept relevance, we impose a sparsity constraint on the proposed method. We compare the proposed method with several other multilabel classification methods and evaluate the classification performance by mean average precision on several data sets. The experimental results show that the proposed method outperforms the state-of-the-art methods.

  15. Diagnosing Autism Spectrum Disorder from Brain Resting-State Functional Connectivity Patterns Using a Deep Neural Network with a Novel Feature Selection Method

    PubMed Central

    Guo, Xinyu; Dominick, Kelli C.; Minai, Ali A.; Li, Hailong; Erickson, Craig A.; Lu, Long J.

    2017-01-01

    The whole-brain functional connectivity (FC) pattern obtained from resting-state functional magnetic resonance imaging data are commonly applied to study neuropsychiatric conditions such as autism spectrum disorder (ASD) by using different machine learning models. Recent studies indicate that both hyper- and hypo- aberrant ASD-associated FCs were widely distributed throughout the entire brain rather than only in some specific brain regions. Deep neural networks (DNN) with multiple hidden layers have shown the ability to systematically extract lower-to-higher level information from high dimensional data across a series of neural hidden layers, significantly improving classification accuracy for such data. In this study, a DNN with a novel feature selection method (DNN-FS) is developed for the high dimensional whole-brain resting-state FC pattern classification of ASD patients vs. typical development (TD) controls. The feature selection method is able to help the DNN generate low dimensional high-quality representations of the whole-brain FC patterns by selecting features with high discriminating power from multiple trained sparse auto-encoders. For the comparison, a DNN without the feature selection method (DNN-woFS) is developed, and both of them are tested with different architectures (i.e., with different numbers of hidden layers/nodes). Results show that the best classification accuracy of 86.36% is generated by the DNN-FS approach with 3 hidden layers and 150 hidden nodes (3/150). Remarkably, DNN-FS outperforms DNN-woFS for all architectures studied. The most significant accuracy improvement was 9.09% with the 3/150 architecture. The method also outperforms other feature selection methods, e.g., two sample t-test and elastic net. In addition to improving the classification accuracy, a Fisher's score-based biomarker identification method based on the DNN is also developed, and used to identify 32 FCs related to ASD. These FCs come from or cross different pre-defined brain networks including the default-mode, cingulo-opercular, frontal-parietal, and cerebellum. Thirteen of them are statically significant between ASD and TD groups (two sample t-test p < 0.05) while 19 of them are not. The relationship between the statically significant FCs and the corresponding ASD behavior symptoms is discussed based on the literature and clinician's expert knowledge. Meanwhile, the potential reason of obtaining 19 FCs which are not statistically significant is also provided. PMID:28871217

  16. Revision by means of computer-mediated peer discussions

    NASA Astrophysics Data System (ADS)

    Soong, Benson; Mercer, Neil; Er, Siew Shin

    2010-05-01

    In this article, we provide a discussion on our revision method (termed prescriptive tutoring) aimed at revealing students' misconceptions and misunderstandings by getting them to solve physics problems with an anonymous partner via the computer. It is currently being implemented and evaluated in a public secondary school in Singapore, and statistical analysis of our initial small-scale study shows that students in the experimental group significantly outperformed students in both the control and alternative intervention groups. In addition, students in the experimental group perceived that they had gained improved understanding of the physics concepts covered during the intervention, and reported that they would like to continue revising physics concepts using the intervention methods.

  17. View-Invariant Gait Recognition Through Genetic Template Segmentation

    NASA Astrophysics Data System (ADS)

    Isaac, Ebenezer R. H. P.; Elias, Susan; Rajagopalan, Srinivasan; Easwarakumar, K. S.

    2017-08-01

    Template-based model-free approach provides by far the most successful solution to the gait recognition problem in literature. Recent work discusses how isolating the head and leg portion of the template increase the performance of a gait recognition system making it robust against covariates like clothing and carrying conditions. However, most involve a manual definition of the boundaries. The method we propose, the genetic template segmentation (GTS), employs the genetic algorithm to automate the boundary selection process. This method was tested on the GEI, GEnI and AEI templates. GEI seems to exhibit the best result when segmented with our approach. Experimental results depict that our approach significantly outperforms the existing implementations of view-invariant gait recognition.

  18. Cross-view gait recognition using joint Bayesian

    NASA Astrophysics Data System (ADS)

    Li, Chao; Sun, Shouqian; Chen, Xiaoyu; Min, Xin

    2017-07-01

    Human gait, as a soft biometric, helps to recognize people by walking. To further improve the recognition performance under cross-view condition, we propose Joint Bayesian to model the view variance. We evaluated our prosed method with the largest population (OULP) dataset which makes our result reliable in a statically way. As a result, we confirmed our proposed method significantly outperformed state-of-the-art approaches for both identification and verification tasks. Finally, sensitivity analysis on the number of training subjects was conducted, we find Joint Bayesian could achieve competitive results even with a small subset of training subjects (100 subjects). For further comparison, experimental results, learning models, and test codes are available.

  19. A new method for species identification via protein-coding and non-coding DNA barcodes by combining machine learning with bioinformatic methods.

    PubMed

    Zhang, Ai-bing; Feng, Jie; Ward, Robert D; Wan, Ping; Gao, Qiang; Wu, Jun; Zhao, Wei-zhong

    2012-01-01

    Species identification via DNA barcodes is contributing greatly to current bioinventory efforts. The initial, and widely accepted, proposal was to use the protein-coding cytochrome c oxidase subunit I (COI) region as the standard barcode for animals, but recently non-coding internal transcribed spacer (ITS) genes have been proposed as candidate barcodes for both animals and plants. However, achieving a robust alignment for non-coding regions can be problematic. Here we propose two new methods (DV-RBF and FJ-RBF) to address this issue for species assignment by both coding and non-coding sequences that take advantage of the power of machine learning and bioinformatics. We demonstrate the value of the new methods with four empirical datasets, two representing typical protein-coding COI barcode datasets (neotropical bats and marine fish) and two representing non-coding ITS barcodes (rust fungi and brown algae). Using two random sub-sampling approaches, we demonstrate that the new methods significantly outperformed existing Neighbor-joining (NJ) and Maximum likelihood (ML) methods for both coding and non-coding barcodes when there was complete species coverage in the reference dataset. The new methods also out-performed NJ and ML methods for non-coding sequences in circumstances of potentially incomplete species coverage, although then the NJ and ML methods performed slightly better than the new methods for protein-coding barcodes. A 100% success rate of species identification was achieved with the two new methods for 4,122 bat queries and 5,134 fish queries using COI barcodes, with 95% confidence intervals (CI) of 99.75-100%. The new methods also obtained a 96.29% success rate (95%CI: 91.62-98.40%) for 484 rust fungi queries and a 98.50% success rate (95%CI: 96.60-99.37%) for 1094 brown algae queries, both using ITS barcodes.

  20. The MIMIC Method with Scale Purification for Detecting Differential Item Functioning

    ERIC Educational Resources Information Center

    Wang, Wen-Chung; Shih, Ching-Lin; Yang, Chih-Chien

    2009-01-01

    This study implements a scale purification procedure onto the standard MIMIC method for differential item functioning (DIF) detection and assesses its performance through a series of simulations. It is found that the MIMIC method with scale purification (denoted as M-SP) outperforms the standard MIMIC method (denoted as M-ST) in controlling…

  1. Support vector machine prediction of enzyme function with conjoint triad feature and hierarchical context.

    PubMed

    Wang, Yong-Cui; Wang, Yong; Yang, Zhi-Xia; Deng, Nai-Yang

    2011-06-20

    Enzymes are known as the largest class of proteins and their functions are usually annotated by the Enzyme Commission (EC), which uses a hierarchy structure, i.e., four numbers separated by periods, to classify the function of enzymes. Automatically categorizing enzyme into the EC hierarchy is crucial to understand its specific molecular mechanism. In this paper, we introduce two key improvements in predicting enzyme function within the machine learning framework. One is to introduce the efficient sequence encoding methods for representing given proteins. The second one is to develop a structure-based prediction method with low computational complexity. In particular, we propose to use the conjoint triad feature (CTF) to represent the given protein sequences by considering not only the composition of amino acids but also the neighbor relationships in the sequence. Then we develop a support vector machine (SVM)-based method, named as SVMHL (SVM for hierarchy labels), to output enzyme function by fully considering the hierarchical structure of EC. The experimental results show that our SVMHL with the CTF outperforms SVMHL with the amino acid composition (AAC) feature both in predictive accuracy and Matthew's correlation coefficient (MCC). In addition, SVMHL with the CTF obtains the accuracy and MCC ranging from 81% to 98% and 0.82 to 0.98 when predicting the first three EC digits on a low-homologous enzyme dataset. We further demonstrate that our method outperforms the methods which do not take account of hierarchical relationship among enzyme categories and alternative methods which incorporate prior knowledge about inter-class relationships. Our structure-based prediction model, SVMHL with the CTF, reduces the computational complexity and outperforms the alternative approaches in enzyme function prediction. Therefore our new method will be a useful tool for enzyme function prediction community.

  2. Demosaicing images from colour cameras for digital image correlation

    NASA Astrophysics Data System (ADS)

    Forsey, A.; Gungor, S.

    2016-11-01

    Digital image correlation is not the intended use for consumer colour cameras, but with care they can be successfully employed in such a role. The main obstacle is the sparsely sampled colour data caused by the use of a colour filter array (CFA) to separate the colour channels. It is shown that the method used to convert consumer camera raw files into a monochrome image suitable for digital image correlation (DIC) can have a significant effect on the DIC output. A number of widely available software packages and two in-house methods are evaluated in terms of their performance when used with DIC. Using an in-plane rotating disc to produce a highly constrained displacement field, it was found that the bicubic spline based in-house demosaicing method outperformed the other methods in terms of accuracy and aliasing suppression.

  3. Simultaneous kinetic spectrometric determination of three flavonoid antioxidants in fruit with the aid of chemometrics

    NASA Astrophysics Data System (ADS)

    Sun, Ruiling; Wang, Yong; Ni, Yongnian; Kokot, Serge

    2014-03-01

    A simple, inexpensive and sensitive kinetic spectrophotometric method was developed for the simultaneous determination of three anti-carcinogenic flavonoids: catechin, quercetin and naringenin, in fruit samples. A yellow chelate product was produced in the presence neocuproine and Cu(I) - a reduction product of the reaction between the flavonoids with Cu(II), and this enabled the quantitative measurements with UV-vis spectrophotometry. The overlapping spectra obtained, were resolved with chemometrics calibration models, and the best performing method was the fast independent component analysis (fast-ICA/PCR (Principal component regression)); the limits of detection were 0.075, 0.057 and 0.063 mg L-1 for catechin, quercetin and naringenin, respectively. The novel method was found to outperform significantly the common HPLC procedure.

  4. PET/CT detectability and classification of simulated pulmonary lesions using an SUV correction scheme

    NASA Astrophysics Data System (ADS)

    Morrow, Andrew N.; Matthews, Kenneth L., II; Bujenovic, Steven

    2008-03-01

    Positron emission tomography (PET) and computed tomography (CT) together are a powerful diagnostic tool, but imperfect image quality allows false positive and false negative diagnoses to be made by any observer despite experience and training. This work investigates PET acquisition mode, reconstruction method and a standard uptake value (SUV) correction scheme on the classification of lesions as benign or malignant in PET/CT images, in an anthropomorphic phantom. The scheme accounts for partial volume effect (PVE) and PET resolution. The observer draws a region of interest (ROI) around the lesion using the CT dataset. A simulated homogenous PET lesion of the same shape as the drawn ROI is blurred with the point spread function (PSF) of the PET scanner to estimate the PVE, providing a scaling factor to produce a corrected SUV. Computer simulations showed that the accuracy of the corrected PET values depends on variations in the CT-drawn boundary and the position of the lesion with respect to the PET image matrix, especially for smaller lesions. Correction accuracy was affected slightly by mismatch of the simulation PSF and the actual scanner PSF. The receiver operating characteristic (ROC) study resulted in several observations. Using observer drawn ROIs, scaled tumor-background ratios (TBRs) more accurately represented actual TBRs than unscaled TBRs. For the PET images, 3D OSEM outperformed 2D OSEM, 3D OSEM outperformed 3D FBP, and 2D OSEM outperformed 2D FBP. The correction scheme significantly increased sensitivity and slightly increased accuracy for all acquisition and reconstruction modes at the cost of a small decrease in specificity.

  5. Smiling on the Inside: The Social Benefits of Suppressing Positive Emotions in Outperformance Situations.

    PubMed

    Schall, Marina; Martiny, Sarah E; Goetz, Thomas; Hall, Nathan C

    2016-05-01

    Although expressing positive emotions is typically socially rewarded, in the present work, we predicted that people suppress positive emotions and thereby experience social benefits when outperformed others are present. We tested our predictions in three experimental studies with high school students. In Studies 1 and 2, we manipulated the type of social situation (outperformance vs. non-outperformance) and assessed suppression of positive emotions. In both studies, individuals reported suppressing positive emotions more in outperformance situations than in non-outperformance situations. In Study 3, we manipulated the social situation (outperformance vs. non-outperformance) as well as the videotaped person's expression of positive emotions (suppression vs. expression). The findings showed that when outperforming others, individuals were indeed evaluated more positively when they suppressed rather than expressed their positive emotions, and demonstrate the importance of the specific social situation with respect to the effects of suppression. © 2016 by the Society for Personality and Social Psychology, Inc.

  6. Task-induced frequency modulation features for brain-computer interfacing.

    PubMed

    Jayaram, Vinay; Hohmann, Matthias; Just, Jennifer; Schölkopf, Bernhard; Grosse-Wentrup, Moritz

    2017-10-01

    Task-induced amplitude modulation of neural oscillations is routinely used in brain-computer interfaces (BCIs) for decoding subjects' intents, and underlies some of the most robust and common methods in the field, such as common spatial patterns and Riemannian geometry. While there has been some interest in phase-related features for classification, both techniques usually presuppose that the frequencies of neural oscillations remain stable across various tasks. We investigate here whether features based on task-induced modulation of the frequency of neural oscillations enable decoding of subjects' intents with an accuracy comparable to task-induced amplitude modulation. We compare cross-validated classification accuracies using the amplitude and frequency modulated features, as well as a joint feature space, across subjects in various paradigms and pre-processing conditions. We show results with a motor imagery task, a cognitive task, and also preliminary results in patients with amyotrophic lateral sclerosis (ALS), as well as using common spatial patterns and Laplacian filtering. The frequency features alone do not significantly out-perform traditional amplitude modulation features, and in some cases perform significantly worse. However, across both tasks and pre-processing in healthy subjects the joint space significantly out-performs either the frequency or amplitude features alone. This result only does not hold for ALS patients, for whom the dataset is of insufficient size to draw any statistically significant conclusions. Task-induced frequency modulation is robust and straight forward to compute, and increases performance when added to standard amplitude modulation features across paradigms. This allows more information to be extracted from the EEG signal cheaply and can be used throughout the field of BCIs.

  7. Joint histogram-based cost aggregation for stereo matching.

    PubMed

    Min, Dongbo; Lu, Jiangbo; Do, Minh N

    2013-10-01

    This paper presents a novel method for performing efficient cost aggregation in stereo matching. The cost aggregation problem is reformulated from the perspective of a histogram, giving us the potential to reduce the complexity of the cost aggregation in stereo matching significantly. Differently from previous methods which have tried to reduce the complexity in terms of the size of an image and a matching window, our approach focuses on reducing the computational redundancy that exists among the search range, caused by a repeated filtering for all the hypotheses. Moreover, we also reduce the complexity of the window-based filtering through an efficient sampling scheme inside the matching window. The tradeoff between accuracy and complexity is extensively investigated by varying the parameters used in the proposed method. Experimental results show that the proposed method provides high-quality disparity maps with low complexity and outperforms existing local methods. This paper also provides new insights into complexity-constrained stereo-matching algorithm design.

  8. Real-time anomaly detection for very short-term load forecasting

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Luo, Jian; Hong, Tao; Yue, Meng

    Although the recent load information is critical to very short-term load forecasting (VSTLF), power companies often have difficulties in collecting the most recent load values accurately and timely for VSTLF applications. This paper tackles the problem of real-time anomaly detection in most recent load information used by VSTLF. This paper proposes a model-based anomaly detection method that consists of two components, a dynamic regression model and an adaptive anomaly threshold. The case study is developed using the data from ISO New England. This paper demonstrates that the proposed method significantly outperforms three other anomaly detection methods including two methods commonlymore » used in the field and one state-of-the-art method used by a winning team of the Global Energy Forecasting Competition 2014. Lastly, a general anomaly detection framework is proposed for the future research.« less

  9. Real-time anomaly detection for very short-term load forecasting

    DOE PAGES

    Luo, Jian; Hong, Tao; Yue, Meng

    2018-01-06

    Although the recent load information is critical to very short-term load forecasting (VSTLF), power companies often have difficulties in collecting the most recent load values accurately and timely for VSTLF applications. This paper tackles the problem of real-time anomaly detection in most recent load information used by VSTLF. This paper proposes a model-based anomaly detection method that consists of two components, a dynamic regression model and an adaptive anomaly threshold. The case study is developed using the data from ISO New England. This paper demonstrates that the proposed method significantly outperforms three other anomaly detection methods including two methods commonlymore » used in the field and one state-of-the-art method used by a winning team of the Global Energy Forecasting Competition 2014. Lastly, a general anomaly detection framework is proposed for the future research.« less

  10. A first attempt at few coils and low-coverage resistive wall mode stabilization of EXTRAP T2R

    NASA Astrophysics Data System (ADS)

    Olofsson, K. Erik J.; Brunsell, Per R.; Drake, James R.; Frassinetti, Lorenzo

    2012-09-01

    The reversed-field pinch features resistive-shell-type instabilities at any (vanishing and finite) plasma pressure. An attempt to stabilize the full spectrum of these modes using both (i) incomplete coverage and (ii) few coils is presented. Two empirically derived model-based control algorithms are compared with a baseline guaranteed suboptimal intelligent-shell-type (IS) feedback. Experimental stabilization could not be achieved for the coil array subset sizes considered by this first study. But the model-based controllers appear to significantly outperform the decentralized IS method.

  11. Statistical physics inspired energy-efficient coded-modulation for optical communications.

    PubMed

    Djordjevic, Ivan B; Xu, Lei; Wang, Ting

    2012-04-15

    Because Shannon's entropy can be obtained by Stirling's approximation of thermodynamics entropy, the statistical physics energy minimization methods are directly applicable to the signal constellation design. We demonstrate that statistical physics inspired energy-efficient (EE) signal constellation designs, in combination with large-girth low-density parity-check (LDPC) codes, significantly outperform conventional LDPC-coded polarization-division multiplexed quadrature amplitude modulation schemes. We also describe an EE signal constellation design algorithm. Finally, we propose the discrete-time implementation of D-dimensional transceiver and corresponding EE polarization-division multiplexed system. © 2012 Optical Society of America

  12. Degree-Strength Correlation Reveals Anomalous Trading Behavior

    PubMed Central

    Sun, Xiao-Qian; Shen, Hua-Wei; Cheng, Xue-Qi; Wang, Zhao-Yang

    2012-01-01

    Manipulation is an important issue for both developed and emerging stock markets. Many efforts have been made to detect manipulation in stock markets. However, it is still an open problem to identify the fraudulent traders, especially when they collude with each other. In this paper, we focus on the problem of identifying the anomalous traders using the transaction data of eight manipulated stocks and forty-four non-manipulated stocks during a one-year period. By analyzing the trading networks of stocks, we find that the trading networks of manipulated stocks exhibit significantly higher degree-strength correlation than the trading networks of non-manipulated stocks and the randomized trading networks. We further propose a method to detect anomalous traders of manipulated stocks based on statistical significance analysis of degree-strength correlation. Experimental results demonstrate that our method is effective at distinguishing the manipulated stocks from non-manipulated ones. Our method outperforms the traditional weight-threshold method at identifying the anomalous traders in manipulated stocks. More importantly, our method is difficult to be fooled by colluded traders. PMID:23082114

  13. Solving large-scale fixed cost integer linear programming models for grid-based location problems with heuristic techniques

    NASA Astrophysics Data System (ADS)

    Noor-E-Alam, Md.; Doucette, John

    2015-08-01

    Grid-based location problems (GBLPs) can be used to solve location problems in business, engineering, resource exploitation, and even in the field of medical sciences. To solve these decision problems, an integer linear programming (ILP) model is designed and developed to provide the optimal solution for GBLPs considering fixed cost criteria. Preliminary results show that the ILP model is efficient in solving small to moderate-sized problems. However, this ILP model becomes intractable in solving large-scale instances. Therefore, a decomposition heuristic is proposed to solve these large-scale GBLPs, which demonstrates significant reduction of solution runtimes. To benchmark the proposed heuristic, results are compared with the exact solution via ILP. The experimental results show that the proposed method significantly outperforms the exact method in runtime with minimal (and in most cases, no) loss of optimality.

  14. Refining Automatically Extracted Knowledge Bases Using Crowdsourcing.

    PubMed

    Li, Chunhua; Zhao, Pengpeng; Sheng, Victor S; Xian, Xuefeng; Wu, Jian; Cui, Zhiming

    2017-01-01

    Machine-constructed knowledge bases often contain noisy and inaccurate facts. There exists significant work in developing automated algorithms for knowledge base refinement. Automated approaches improve the quality of knowledge bases but are far from perfect. In this paper, we leverage crowdsourcing to improve the quality of automatically extracted knowledge bases. As human labelling is costly, an important research challenge is how we can use limited human resources to maximize the quality improvement for a knowledge base. To address this problem, we first introduce a concept of semantic constraints that can be used to detect potential errors and do inference among candidate facts. Then, based on semantic constraints, we propose rank-based and graph-based algorithms for crowdsourced knowledge refining, which judiciously select the most beneficial candidate facts to conduct crowdsourcing and prune unnecessary questions. Our experiments show that our method improves the quality of knowledge bases significantly and outperforms state-of-the-art automatic methods under a reasonable crowdsourcing cost.

  15. Binding SNOMED CT terms to archetype elements. Establishing a baseline of results.

    PubMed

    Berges, I; Bermudez, J; Illarramendi, A

    2015-01-01

    This article is part of the Focus Theme of METHODS of Information in Medicine on "Managing Interoperability and Complexity in Health Systems". The proliferation of archetypes as a means to represent information of Electronic Health Records has raised the need of binding terminological codes - such as SNOMED CT codes - to their elements, in order to identify them univocally. However, the large size of the terminologies makes it difficult to perform this task manually. To establish a baseline of results for the aforementioned problem by using off-the-shelf string comparison-based techniques against which results from more complex techniques could be evaluated. Nine Typed Comparison METHODS were evaluated for binding using a set of 487 archetype elements. Their recall was calculated and Friedman and Nemenyi tests were applied in order to assess whether any of the methods outperformed the others. Using the qGrams method along with the 'Text' information piece of archetype elements outperforms the other methods if a level of confidence of 90% is considered. A recall of 25.26% is obtained if just one SNOMED CT term is retrieved for each archetype element. This recall rises to 50.51% and 75.56% if 10 and 100 elements are retrieved respectively, that being a reduction of more than 99.99% on the SNOMED CT code set. The baseline has been established following the above-mentioned results. Moreover, it has been observed that although string comparison-based methods do not outperform more sophisticated techniques, they still can be an alternative for providing a reduced set of candidate terms for each archetype element from which the ultimate term can be chosen later in the more-than-likely manual supervision task.

  16. Cross-cultural aspect of the Group Embedded Figures Test: norms for Turkish eighth graders.

    PubMed

    Cakan, Mehtap

    2003-10-01

    The Group Embedded Figures Test was administered to 206 Turkish (123 boys versus 83 girls) eighth grade students. Distribution characteristics, item analysis, reliability, and internal consistency are presented. No sex differences on subsections or the full scale were found. Socioeconomic status as indicated by parental education was significantly associated with the cognitive style scores of the students. Subjects whose fathers had a higher education outperformed those whose fathers had less education. No significant differences in students' means were found among groups whose mothers had low, middle, and high education. The Turkish sample showed the same performance as a 5th grade American sample, and Canadian 8th graders outperformed the Turkish participants. The practice effects are also discussed.

  17. Optimizing Dynamical Network Structure for Pinning Control

    NASA Astrophysics Data System (ADS)

    Orouskhani, Yasin; Jalili, Mahdi; Yu, Xinghuo

    2016-04-01

    Controlling dynamics of a network from any initial state to a final desired state has many applications in different disciplines from engineering to biology and social sciences. In this work, we optimize the network structure for pinning control. The problem is formulated as four optimization tasks: i) optimizing the locations of driver nodes, ii) optimizing the feedback gains, iii) optimizing simultaneously the locations of driver nodes and feedback gains, and iv) optimizing the connection weights. A newly developed population-based optimization technique (cat swarm optimization) is used as the optimization method. In order to verify the methods, we use both real-world networks, and model scale-free and small-world networks. Extensive simulation results show that the optimal placement of driver nodes significantly outperforms heuristic methods including placing drivers based on various centrality measures (degree, betweenness, closeness and clustering coefficient). The pinning controllability is further improved by optimizing the feedback gains. We also show that one can significantly improve the controllability by optimizing the connection weights.

  18. Spatial Copula Model for Imputing Traffic Flow Data from Remote Microwave Sensors.

    PubMed

    Ma, Xiaolei; Luan, Sen; Du, Bowen; Yu, Bin

    2017-09-21

    Issues of missing data have become increasingly serious with the rapid increase in usage of traffic sensors. Analyses of the Beijing ring expressway have showed that up to 50% of microwave sensors pose missing values. The imputation of missing traffic data must be urgently solved although a precise solution that cannot be easily achieved due to the significant number of missing portions. In this study, copula-based models are proposed for the spatial interpolation of traffic flow from remote traffic microwave sensors. Most existing interpolation methods only rely on covariance functions to depict spatial correlation and are unsuitable for coping with anomalies due to Gaussian consumption. Copula theory overcomes this issue and provides a connection between the correlation function and the marginal distribution function of traffic flow. To validate copula-based models, a comparison with three kriging methods is conducted. Results indicate that copula-based models outperform kriging methods, especially on roads with irregular traffic patterns. Copula-based models demonstrate significant potential to impute missing data in large-scale transportation networks.

  19. Measurement of thermally ablated lesions in sonoelastographic images using level set methods

    NASA Astrophysics Data System (ADS)

    Castaneda, Benjamin; Tamez-Pena, Jose Gerardo; Zhang, Man; Hoyt, Kenneth; Bylund, Kevin; Christensen, Jared; Saad, Wael; Strang, John; Rubens, Deborah J.; Parker, Kevin J.

    2008-03-01

    The capability of sonoelastography to detect lesions based on elasticity contrast can be applied to monitor the creation of thermally ablated lesion. Currently, segmentation of lesions depicted in sonoelastographic images is performed manually which can be a time consuming process and prone to significant intra- and inter-observer variability. This work presents a semi-automated segmentation algorithm for sonoelastographic data. The user starts by planting a seed in the perceived center of the lesion. Fast marching methods use this information to create an initial estimate of the lesion. Subsequently, level set methods refine its final shape by attaching the segmented contour to edges in the image while maintaining smoothness. The algorithm is applied to in vivo sonoelastographic images from twenty five thermal ablated lesions created in porcine livers. The estimated area is compared to results from manual segmentation and gross pathology images. Results show that the algorithm outperforms manual segmentation in accuracy, inter- and intra-observer variability. The processing time per image is significantly reduced.

  20. Task-induced frequency modulation features for brain-computer interfacing

    NASA Astrophysics Data System (ADS)

    Jayaram, Vinay; Hohmann, Matthias; Just, Jennifer; Schölkopf, Bernhard; Grosse-Wentrup, Moritz

    2017-10-01

    Objective. Task-induced amplitude modulation of neural oscillations is routinely used in brain-computer interfaces (BCIs) for decoding subjects’ intents, and underlies some of the most robust and common methods in the field, such as common spatial patterns and Riemannian geometry. While there has been some interest in phase-related features for classification, both techniques usually presuppose that the frequencies of neural oscillations remain stable across various tasks. We investigate here whether features based on task-induced modulation of the frequency of neural oscillations enable decoding of subjects’ intents with an accuracy comparable to task-induced amplitude modulation. Approach. We compare cross-validated classification accuracies using the amplitude and frequency modulated features, as well as a joint feature space, across subjects in various paradigms and pre-processing conditions. We show results with a motor imagery task, a cognitive task, and also preliminary results in patients with amyotrophic lateral sclerosis (ALS), as well as using common spatial patterns and Laplacian filtering. Main results. The frequency features alone do not significantly out-perform traditional amplitude modulation features, and in some cases perform significantly worse. However, across both tasks and pre-processing in healthy subjects the joint space significantly out-performs either the frequency or amplitude features alone. This result only does not hold for ALS patients, for whom the dataset is of insufficient size to draw any statistically significant conclusions. Significance. Task-induced frequency modulation is robust and straight forward to compute, and increases performance when added to standard amplitude modulation features across paradigms. This allows more information to be extracted from the EEG signal cheaply and can be used throughout the field of BCIs.

  1. FIND: difFerential chromatin INteractions Detection using a spatial Poisson process

    PubMed Central

    Chen, Yang; Zhang, Michael Q.

    2018-01-01

    Polymer-based simulations and experimental studies indicate the existence of a spatial dependency between the adjacent DNA fibers involved in the formation of chromatin loops. However, the existing strategies for detecting differential chromatin interactions assume that the interacting segments are spatially independent from the other segments nearby. To resolve this issue, we developed a new computational method, FIND, which considers the local spatial dependency between interacting loci. FIND uses a spatial Poisson process to detect differential chromatin interactions that show a significant difference in their interaction frequency and the interaction frequency of their neighbors. Simulation and biological data analysis show that FIND outperforms the widely used count-based methods and has a better signal-to-noise ratio. PMID:29440282

  2. A hybrid frame concealment algorithm for H.264/AVC.

    PubMed

    Yan, Bo; Gharavi, Hamid

    2010-01-01

    In packet-based video transmissions, packets loss due to channel errors may result in the loss of the whole video frame. Recently, many error concealment algorithms have been proposed in order to combat channel errors; however, most of the existing algorithms can only deal with the loss of macroblocks and are not able to conceal the whole missing frame. In order to resolve this problem, in this paper, we have proposed a new hybrid motion vector extrapolation (HMVE) algorithm to recover the whole missing frame, and it is able to provide more accurate estimation for the motion vectors of the missing frame than other conventional methods. Simulation results show that it is highly effective and significantly outperforms other existing frame recovery methods.

  3. Profitability of simple technical trading rules of Chinese stock exchange indexes

    NASA Astrophysics Data System (ADS)

    Zhu, Hong; Jiang, Zhi-Qiang; Li, Sai-Ping; Zhou, Wei-Xing

    2015-12-01

    Although technical trading rules have been widely used by practitioners in financial markets, their profitability still remains controversial. We here investigate the profitability of moving average (MA) and trading range break (TRB) rules by using the Shanghai Stock Exchange Composite Index (SHCI) from May 21, 1992 through December 31, 2013 and Shenzhen Stock Exchange Component Index (SZCI) from April 3, 1991 through December 31, 2013. The t-test is adopted to check whether the mean returns which are conditioned on the trading signals are significantly different from unconditioned returns and whether the mean returns conditioned on the buy signals are significantly different from the mean returns conditioned on the sell signals. We find that TRB rules outperform MA rules and short-term variable moving average (VMA) rules outperform long-term VMA rules. By applying White's Reality Check test and accounting for the data snooping effects, we find that the best trading rule outperforms the buy-and-hold strategy when transaction costs are not taken into consideration. Once transaction costs are included, trading profits will be eliminated completely. Our analysis suggests that simple trading rules like MA and TRB cannot beat the standard buy-and-hold strategy for the Chinese stock exchange indexes.

  4. Smell and taste function in the visually impaired.

    PubMed

    Smith, R S; Doty, R L; Burlingame, G K; McKeown, D A

    1993-11-01

    Surprisingly few quantitative studies have addressed the question of whether visually impaired individuals evidence, perhaps in compensation for their loss of vision, increased acuteness in their other senses. In this experiment we sought to determine whether blind subjects outperform sighted subjects on a number of basic tests of chemosensory function. Over 50 blind and 75 sighted subjects were administered the following olfactory and gustatory tests: the University of Pennsylvania Smell Identification Test (UPSIT); a 16-item odor discrimination test; and a suprathreshold taste test in which measures of taste-quality identification and ratings of the perceived intensity and pleasantness of sucrose, citric acid, sodium chloride, and caffeine were obtained. In addition, 39 blind subjects and 77 sighted subjects were administered a single staircase phenyl ethyl alcohol (PEA) odor detection threshold test. Twenty-three of the sighted subjects were employed by the Philadelphia Water Department and trained to serve on its water quality evaluation panel. The primary findings of the study were that (a) the blind subjects did not outperform sighted subjects on any test of chemosensory function and (b) the trained subjects significantly outperformed the other two groups on the odor detection, odor discrimination, and taste identification tests, and nearly outperformed the blind subjects on the UPSIT. The citric acid concentrations received larger pleasantness ratings from the trained panel members than from the blind subjects, whose ratings did not differ significantly from those of the untrained sighted subjects. Overall, the data imply that blindness, per se, has little influence on chemosensory function and add further support to the notion that specialized training enhances performance on a number of chemosensory tasks.

  5. An iterated Laplacian based semi-supervised dimensionality reduction for classification of breast cancer on ultrasound images.

    PubMed

    Liu, Xiao; Shi, Jun; Zhou, Shichong; Lu, Minhua

    2014-01-01

    The dimensionality reduction is an important step in ultrasound image based computer-aided diagnosis (CAD) for breast cancer. A newly proposed l2,1 regularized correntropy algorithm for robust feature selection (CRFS) has achieved good performance for noise corrupted data. Therefore, it has the potential to reduce the dimensions of ultrasound image features. However, in clinical practice, the collection of labeled instances is usually expensive and time costing, while it is relatively easy to acquire the unlabeled or undetermined instances. Therefore, the semi-supervised learning is very suitable for clinical CAD. The iterated Laplacian regularization (Iter-LR) is a new regularization method, which has been proved to outperform the traditional graph Laplacian regularization in semi-supervised classification and ranking. In this study, to augment the classification accuracy of the breast ultrasound CAD based on texture feature, we propose an Iter-LR-based semi-supervised CRFS (Iter-LR-CRFS) algorithm, and then apply it to reduce the feature dimensions of ultrasound images for breast CAD. We compared the Iter-LR-CRFS with LR-CRFS, original supervised CRFS, and principal component analysis. The experimental results indicate that the proposed Iter-LR-CRFS significantly outperforms all other algorithms.

  6. Combining multiple positive training sets to generate confidence scores for protein-protein interactions.

    PubMed

    Yu, Jingkai; Finley, Russell L

    2009-01-01

    High-throughput experimental and computational methods are generating a wealth of protein-protein interaction data for a variety of organisms. However, data produced by current state-of-the-art methods include many false positives, which can hinder the analyses needed to derive biological insights. One way to address this problem is to assign confidence scores that reflect the reliability and biological significance of each interaction. Most previously described scoring methods use a set of likely true positives to train a model to score all interactions in a dataset. A single positive training set, however, may be biased and not representative of true interaction space. We demonstrate a method to score protein interactions by utilizing multiple independent sets of training positives to reduce the potential bias inherent in using a single training set. We used a set of benchmark yeast protein interactions to show that our approach outperforms other scoring methods. Our approach can also score interactions across data types, which makes it more widely applicable than many previously proposed methods. We applied the method to protein interaction data from both Drosophila melanogaster and Homo sapiens. Independent evaluations show that the resulting confidence scores accurately reflect the biological significance of the interactions.

  7. Reduction hybrid artifacts of EMG-EOG in electroencephalography evoked by prefrontal transcranial magnetic stimulation

    NASA Astrophysics Data System (ADS)

    Bai, Yang; Wan, Xiaohong; Zeng, Ke; Ni, Yinmei; Qiu, Lirong; Li, Xiaoli

    2016-12-01

    Objective. When prefrontal-transcranial magnetic stimulation (p-TMS) performed, it may evoke hybrid artifact mixed with muscle activity and blink activity in EEG recordings. Reducing this kind of hybrid artifact challenges the traditional preprocessing methods. We aim to explore method for the p-TMS evoked hybrid artifact removal. Approach. We propose a novel method used as independent component analysis (ICA) post processing to reduce the p-TMS evoked hybrid artifact. Ensemble empirical mode decomposition (EEMD) was used to decompose signal into multi-components, then the components were separated with artifact reduced by blind source separation (BSS) method. Three standard BSS methods, ICA, independent vector analysis, and canonical correlation analysis (CCA) were tested. Main results. Synthetic results showed that EEMD-CCA outperformed others as ICA post processing step in hybrid artifacts reduction. Its superiority was clearer when signal to noise ratio (SNR) was lower. In application to real experiment, SNR can be significantly increased and the p-TMS evoked potential could be recovered from hybrid artifact contaminated signal. Our proposed method can effectively reduce the p-TMS evoked hybrid artifacts. Significance. Our proposed method may facilitate future prefrontal TMS-EEG researches.

  8. Localization of a variational particle smoother

    NASA Astrophysics Data System (ADS)

    Morzfeld, M.; Hodyss, D.; Poterjoy, J.

    2017-12-01

    Given the success of 4D-variational methods (4D-Var) in numerical weather prediction,and recent efforts to merge ensemble Kalman filters with 4D-Var,we consider a method to merge particle methods and 4D-Var.This leads us to revisit variational particle smoothers (varPS).We study the collapse of varPS in high-dimensional problemsand show how it can be prevented by weight-localization.We test varPS on the Lorenz'96 model of dimensionsn=40, n=400, and n=2000.In our numerical experiments, weight localization prevents the collapse of the varPS,and we note that the varPS yields results comparable to ensemble formulations of 4D-variational methods,while it outperforms EnKF with tuned localization and inflation,and the localized standard particle filter.Additional numerical experiments suggest that using localized weights in varPS may not yield significant advantages over unweighted or linearizedsolutions in near-Gaussian problems.

  9. An Intelligent Model for Pairs Trading Using Genetic Algorithms.

    PubMed

    Huang, Chien-Feng; Hsu, Chi-Jen; Chen, Chi-Chung; Chang, Bao Rong; Li, Chen-An

    2015-01-01

    Pairs trading is an important and challenging research area in computational finance, in which pairs of stocks are bought and sold in pair combinations for arbitrage opportunities. Traditional methods that solve this set of problems mostly rely on statistical methods such as regression. In contrast to the statistical approaches, recent advances in computational intelligence (CI) are leading to promising opportunities for solving problems in the financial applications more effectively. In this paper, we present a novel methodology for pairs trading using genetic algorithms (GA). Our results showed that the GA-based models are able to significantly outperform the benchmark and our proposed method is capable of generating robust models to tackle the dynamic characteristics in the financial application studied. Based upon the promising results obtained, we expect this GA-based method to advance the research in computational intelligence for finance and provide an effective solution to pairs trading for investment in practice.

  10. An Intelligent Model for Pairs Trading Using Genetic Algorithms

    PubMed Central

    Hsu, Chi-Jen; Chen, Chi-Chung; Li, Chen-An

    2015-01-01

    Pairs trading is an important and challenging research area in computational finance, in which pairs of stocks are bought and sold in pair combinations for arbitrage opportunities. Traditional methods that solve this set of problems mostly rely on statistical methods such as regression. In contrast to the statistical approaches, recent advances in computational intelligence (CI) are leading to promising opportunities for solving problems in the financial applications more effectively. In this paper, we present a novel methodology for pairs trading using genetic algorithms (GA). Our results showed that the GA-based models are able to significantly outperform the benchmark and our proposed method is capable of generating robust models to tackle the dynamic characteristics in the financial application studied. Based upon the promising results obtained, we expect this GA-based method to advance the research in computational intelligence for finance and provide an effective solution to pairs trading for investment in practice. PMID:26339236

  11. A joint tracking method for NSCC based on WLS algorithm

    NASA Astrophysics Data System (ADS)

    Luo, Ruidan; Xu, Ying; Yuan, Hong

    2017-12-01

    Navigation signal based on compound carrier (NSCC), has the flexible multi-carrier scheme and various scheme parameters configuration, which enables it to possess significant efficiency of navigation augmentation in terms of spectral efficiency, tracking accuracy, multipath mitigation capability and anti-jamming reduction compared with legacy navigation signals. Meanwhile, the typical scheme characteristics can provide auxiliary information for signal synchronism algorithm design. This paper, based on the characteristics of NSCC, proposed a kind of joint tracking method utilizing Weighted Least Square (WLS) algorithm. In this method, the LS algorithm is employed to jointly estimate each sub-carrier frequency shift with the frequency-Doppler linear relationship, by utilizing the known sub-carrier frequency. Besides, the weighting matrix is set adaptively according to the sub-carrier power to ensure the estimation accuracy. Both the theory analysis and simulation results illustrate that the tracking accuracy and sensitivity of this method outperforms the single-carrier algorithm with lower SNR.

  12. Factoring local sequence composition in motif significance analysis.

    PubMed

    Ng, Patrick; Keich, Uri

    2008-01-01

    We recently introduced a biologically realistic and reliable significance analysis of the output of a popular class of motif finders. In this paper we further improve our significance analysis by incorporating local base composition information. Relying on realistic biological data simulation, as well as on FDR analysis applied to real data, we show that our method is significantly better than the increasingly popular practice of using the normal approximation to estimate the significance of a finder's output. Finally we turn to leveraging our reliable significance analysis to improve the actual motif finding task. Specifically, endowing a variant of the Gibbs Sampler with our improved significance analysis we demonstrate that de novo finders can perform better than has been perceived. Significantly, our new variant outperforms all the finders reviewed in a recently published comprehensive analysis of the Harbison genome-wide binding location data. Interestingly, many of these finders incorporate additional information such as nucleosome positioning and the significance of binding data.

  13. MAG4 Versus Alternative Techniques for Forecasting Active-Region Flare Productivity

    NASA Technical Reports Server (NTRS)

    Falconer, David A.; Moore, Ronald L.; Barghouty, Abdulnasser F.; Khazanov, Igor

    2014-01-01

    MAG4 is a technique of forecasting an active region's rate of production of major flares in the coming few days from a free-magnetic-energy proxy. We present a statistical method of measuring the difference in performance between MAG4 and comparable alternative techniques that forecast an active region's major-flare productivity from alternative observed aspects of the active region. We demonstrate the method by measuring the difference in performance between the "Present MAG4" technique and each of three alternative techniques, called "McIntosh Active-Region Class," "Total Magnetic Flux," and "Next MAG4." We do this by using (1) the MAG4 database of magnetograms and major-flare histories of sunspot active regions, (2) the NOAA table of the major-flare productivity of each of 60 McIntosh active-region classes of sunspot active regions, and (3) five technique-performance metrics (Heidke Skill Score, True Skill Score, Percent Correct, Probability of Detection, and False Alarm Rate) evaluated from 2000 random two-by-two contingency tables obtained from the databases. We find that (1) Present MAG4 far outperforms both McIntosh Active-Region Class and Total Magnetic Flux, (2) Next MAG4 significantly outperforms Present MAG4, (3) the performance of Next MAG4 is insensitive to the forward and backward temporal windows used, in the range of one to a few days, and (4) forecasting from the free-energy proxy in combination with either any broad category of McIntosh active-region classes or any Mount Wilson active-region class gives no significant performance improvement over forecasting from the free-energy proxy alone (Present MAG4).

  14. Identification and functional characterization of HIV-associated neurocognitive disorders with large-scale Granger causality analysis on resting-state functional MRI

    NASA Astrophysics Data System (ADS)

    Chockanathan, Udaysankar; DSouza, Adora M.; Abidin, Anas Z.; Schifitto, Giovanni; Wismüller, Axel

    2018-02-01

    Resting-state functional MRI (rs-fMRI), coupled with advanced multivariate time-series analysis methods such as Granger causality, is a promising tool for the development of novel functional connectivity biomarkers of neurologic and psychiatric disease. Recently large-scale Granger causality (lsGC) has been proposed as an alternative to conventional Granger causality (cGC) that extends the scope of robust Granger causal analyses to high-dimensional systems such as the human brain. In this study, lsGC and cGC were comparatively evaluated on their ability to capture neurologic damage associated with HIV-associated neurocognitive disorders (HAND). Functional brain network models were constructed from rs-fMRI data collected from a cohort of HIV+ and HIV- subjects. Graph theoretic properties of the resulting networks were then used to train a support vector machine (SVM) model to predict clinically relevant parameters, such as HIV status and neuropsychometric (NP) scores. For the HIV+/- classification task, lsGC, which yielded a peak area under the receiver operating characteristic curve (AUC) of 0.83, significantly outperformed cGC, which yielded a peak AUC of 0.61, at all parameter settings tested. For the NP score regression task, lsGC, with a minimum mean squared error (MSE) of 0.75, significantly outperformed cGC, with a minimum MSE of 0.84 (p < 0.001, one-tailed paired t-test). These results show that, at optimal parameter settings, lsGC is better able to capture functional brain connectivity correlates of HAND than cGC. However, given the substantial variation in the performance of the two methods at different parameter settings, particularly for the regression task, improved parameter selection criteria are necessary and constitute an area for future research.

  15. Automatic Mrf-Based Registration of High Resolution Satellite Video Data

    NASA Astrophysics Data System (ADS)

    Platias, C.; Vakalopoulou, M.; Karantzalos, K.

    2016-06-01

    In this paper we propose a deformable registration framework for high resolution satellite video data able to automatically and accurately co-register satellite video frames and/or register them to a reference map/image. The proposed approach performs non-rigid registration, formulates a Markov Random Fields (MRF) model, while efficient linear programming is employed for reaching the lowest potential of the cost function. The developed approach has been applied and validated on satellite video sequences from Skybox Imaging and compared with a rigid, descriptor-based registration method. Regarding the computational performance, both the MRF-based and the descriptor-based methods were quite efficient, with the first one converging in some minutes and the second in some seconds. Regarding the registration accuracy the proposed MRF-based method significantly outperformed the descriptor-based one in all the performing experiments.

  16. Identifying online user reputation in terms of user preference

    NASA Astrophysics Data System (ADS)

    Dai, Lu; Guo, Qiang; Liu, Xiao-Lu; Liu, Jian-Guo; Zhang, Yi-Cheng

    2018-03-01

    Identifying online user reputation is significant for online social systems. In this paper, taking into account the preference physics of online user collective behaviors, we present an improved group-based rating method for ranking online user reputation based on the user preference (PGR). All the ratings given by each specific user are mapped to the same rating criteria. By grouping users according to their mapped ratings, the online user reputation is calculated based on the corresponding group sizes. Results for MovieLens and Netflix data sets show that the AUC values of the PGR method can reach 0.9842 (0.9493) and 0.9995 (0.9987) for malicious (random) spammers, respectively, outperforming the results generated by the traditional group-based method, which indicates that the online preference plays an important role for measuring user reputation.

  17. An Extraction Method of an Informative DOM Node from a Web Page by Using Layout Information

    NASA Astrophysics Data System (ADS)

    Tsuruta, Masanobu; Masuyama, Shigeru

    We propose an informative DOM node extraction method from a Web page for preprocessing of Web content mining. Our proposed method LM uses layout data of DOM nodes generated by a generic Web browser, and the learning set consists of hundreds of Web pages and the annotations of informative DOM nodes of those Web pages. Our method does not require large scale crawling of the whole Web site to which the target Web page belongs. We design LM so that it uses the information of the learning set more efficiently in comparison to the existing method that uses the same learning set. By experiments, we evaluate the methods obtained by combining one that consists of the method for extracting the informative DOM node both the proposed method and the existing methods, and the existing noise elimination methods: Heur removes advertisements and link-lists by some heuristics and CE removes the DOM nodes existing in the Web pages in the same Web site to which the target Web page belongs. Experimental results show that 1) LM outperforms other methods for extracting the informative DOM node, 2) the combination method (LM, {CE(10), Heur}) based on LM (precision: 0.755, recall: 0.826, F-measure: 0.746) outperforms other combination methods.

  18. Stochastic gradient ascent outperforms gamers in the Quantum Moves game

    NASA Astrophysics Data System (ADS)

    Sels, Dries

    2018-04-01

    In a recent work on quantum state preparation, Sørensen and co-workers [Nature (London) 532, 210 (2016), 10.1038/nature17620] explore the possibility of using video games to help design quantum control protocols. The authors present a game called "Quantum Moves" (https://www.scienceathome.org/games/quantum-moves/) in which gamers have to move an atom from A to B by means of optical tweezers. They report that, "players succeed where purely numerical optimization fails." Moreover, by harnessing the player strategies, they can "outperform the most prominent established numerical methods." The aim of this Rapid Communication is to analyze the problem in detail and show that those claims are untenable. In fact, without any prior knowledge and starting from a random initial seed, a simple stochastic local optimization method finds near-optimal solutions which outperform all players. Counterdiabatic driving can even be used to generate protocols without resorting to numeric optimization. The analysis results in an accurate analytic estimate of the quantum speed limit which, apart from zero-point motion, is shown to be entirely classical in nature. The latter might explain why gamers are reasonably good at the game. A simple modification of the BringHomeWater challenge is proposed to test this hypothesis.

  19. Self-directed learning can outperform direct instruction in the course of a modern German medical curriculum - results of a mixed methods trial.

    PubMed

    Peine, Arne; Kabino, Klaus; Spreckelsen, Cord

    2016-06-03

    Modernised medical curricula in Germany (so called "reformed study programs") rely increasingly on alternative self-instructed learning forms such as e-learning and curriculum-guided self-study. However, there is a lack of evidence that these methods can outperform conventional teaching methods such as lectures and seminars. This study was conducted in order to compare extant traditional teaching methods with new instruction forms in terms of learning effect and student satisfaction. In a randomised trial, 244 students of medicine in their third academic year were assigned to one of four study branches representing self-instructed learning forms (e-learning and curriculum-based self-study) and instructed learning forms (lectures and seminars). All groups participated in their respective learning module with standardised materials and instructions. Learning effect was measured with pre-test and post-test multiple-choice questionnaires. Student satisfaction and learning style were examined via self-assessment. Of 244 initial participants, 223 completed the respective module and were included in the study. In the pre-test, the groups showed relatively homogenous scores. All students showed notable improvements compared with the pre-test results. Participants in the non-self-instructed learning groups reached scores of 14.71 (seminar) and 14.37 (lecture), while the groups of self-instructed learners reached higher scores with 17.23 (e-learning) and 15.81 (self-study). All groups improved significantly (p < .001) in the post-test regarding their self-assessment, led by the e-learning group, whose self-assessment improved by 2.36. The study shows that students in modern study curricula learn better through modern self-instructed methods than through conventional methods. These methods should be used more, as they also show good levels of student acceptance and higher scores in personal self-assessment of knowledge.

  20. Does Video-Autotutorial Instruction Improve College Student Achievement?

    ERIC Educational Resources Information Center

    Fisher, K. M.; And Others

    1977-01-01

    Compares student achievement in an upper-division college introductory course taught by the video-autotutorial method with that in two comparable courses taught by the lecture-discussion method. Pre-post tests of 623 students reveal that video-autotutorial students outperform lecture/discussion participants at all ability levels and that in…

  1. Characterizing trabecular bone structure for assessing vertebral fracture risk on volumetric quantitative computed tomography

    NASA Astrophysics Data System (ADS)

    Nagarajan, Mahesh B.; Checefsky, Walter A.; Abidin, Anas Z.; Tsai, Halley; Wang, Xixi; Hobbs, Susan K.; Bauer, Jan S.; Baum, Thomas; Wismüller, Axel

    2015-03-01

    While the proximal femur is preferred for measuring bone mineral density (BMD) in fracture risk estimation, the introduction of volumetric quantitative computed tomography has revealed stronger associations between BMD and spinal fracture status. In this study, we propose to capture properties of trabecular bone structure in spinal vertebrae with advanced second-order statistical features for purposes of fracture risk assessment. For this purpose, axial multi-detector CT (MDCT) images were acquired from 28 spinal vertebrae specimens using a whole-body 256-row CT scanner with a dedicated calibration phantom. A semi-automated method was used to annotate the trabecular compartment in the central vertebral slice with a circular region of interest (ROI) to exclude cortical bone; pixels within were converted to values indicative of BMD. Six second-order statistical features derived from gray-level co-occurrence matrices (GLCM) and the mean BMD within the ROI were then extracted and used in conjunction with a generalized radial basis functions (GRBF) neural network to predict the failure load of the specimens; true failure load was measured through biomechanical testing. Prediction performance was evaluated with a root-mean-square error (RMSE) metric. The best prediction performance was observed with GLCM feature `correlation' (RMSE = 1.02 ± 0.18), which significantly outperformed all other GLCM features (p < 0.01). GLCM feature correlation also significantly outperformed MDCTmeasured mean BMD (RMSE = 1.11 ± 0.17) (p< 10-4). These results suggest that biomechanical strength prediction in spinal vertebrae can be significantly improved through characterization of trabecular bone structure with GLCM-derived texture features.

  2. Highly efficient codec based on significance-linked connected-component analysis of wavelet coefficients

    NASA Astrophysics Data System (ADS)

    Chai, Bing-Bing; Vass, Jozsef; Zhuang, Xinhua

    1997-04-01

    Recent success in wavelet coding is mainly attributed to the recognition of importance of data organization. There has been several very competitive wavelet codecs developed, namely, Shapiro's Embedded Zerotree Wavelets (EZW), Servetto et. al.'s Morphological Representation of Wavelet Data (MRWD), and Said and Pearlman's Set Partitioning in Hierarchical Trees (SPIHT). In this paper, we propose a new image compression algorithm called Significant-Linked Connected Component Analysis (SLCCA) of wavelet coefficients. SLCCA exploits both within-subband clustering of significant coefficients and cross-subband dependency in significant fields. A so-called significant link between connected components is designed to reduce the positional overhead of MRWD. In addition, the significant coefficients' magnitude are encoded in bit plane order to match the probability model of the adaptive arithmetic coder. Experiments show that SLCCA outperforms both EZW and MRWD, and is tied with SPIHT. Furthermore, it is observed that SLCCA generally has the best performance on images with large portion of texture. When applied to fingerprint image compression, it outperforms FBI's wavelet scalar quantization by about 1 dB.

  3. Exploiting graph kernels for high performance biomedical relation extraction.

    PubMed

    Panyam, Nagesh C; Verspoor, Karin; Cohn, Trevor; Ramamohanarao, Kotagiri

    2018-01-30

    Relation extraction from biomedical publications is an important task in the area of semantic mining of text. Kernel methods for supervised relation extraction are often preferred over manual feature engineering methods, when classifying highly ordered structures such as trees and graphs obtained from syntactic parsing of a sentence. Tree kernels such as the Subset Tree Kernel and Partial Tree Kernel have been shown to be effective for classifying constituency parse trees and basic dependency parse graphs of a sentence. Graph kernels such as the All Path Graph kernel (APG) and Approximate Subgraph Matching (ASM) kernel have been shown to be suitable for classifying general graphs with cycles, such as the enhanced dependency parse graph of a sentence. In this work, we present a high performance Chemical-Induced Disease (CID) relation extraction system. We present a comparative study of kernel methods for the CID task and also extend our study to the Protein-Protein Interaction (PPI) extraction task, an important biomedical relation extraction task. We discuss novel modifications to the ASM kernel to boost its performance and a method to apply graph kernels for extracting relations expressed in multiple sentences. Our system for CID relation extraction attains an F-score of 60%, without using external knowledge sources or task specific heuristic or rules. In comparison, the state of the art Chemical-Disease Relation Extraction system achieves an F-score of 56% using an ensemble of multiple machine learning methods, which is then boosted to 61% with a rule based system employing task specific post processing rules. For the CID task, graph kernels outperform tree kernels substantially, and the best performance is obtained with APG kernel that attains an F-score of 60%, followed by the ASM kernel at 57%. The performance difference between the ASM and APG kernels for CID sentence level relation extraction is not significant. In our evaluation of ASM for the PPI task, ASM performed better than APG kernel for the BioInfer dataset, in the Area Under Curve (AUC) measure (74% vs 69%). However, for all the other PPI datasets, namely AIMed, HPRD50, IEPA and LLL, ASM is substantially outperformed by the APG kernel in F-score and AUC measures. We demonstrate a high performance Chemical Induced Disease relation extraction, without employing external knowledge sources or task specific heuristics. Our work shows that graph kernels are effective in extracting relations that are expressed in multiple sentences. We also show that the graph kernels, namely the ASM and APG kernels, substantially outperform the tree kernels. Among the graph kernels, we showed the ASM kernel as effective for biomedical relation extraction, with comparable performance to the APG kernel for datasets such as the CID-sentence level relation extraction and BioInfer in PPI. Overall, the APG kernel is shown to be significantly more accurate than the ASM kernel, achieving better performance on most datasets.

  4. n-Gram-Based Text Compression.

    PubMed

    Nguyen, Vu H; Nguyen, Hien T; Duong, Hieu N; Snasel, Vaclav

    2016-01-01

    We propose an efficient method for compressing Vietnamese text using n -gram dictionaries. It has a significant compression ratio in comparison with those of state-of-the-art methods on the same dataset. Given a text, first, the proposed method splits it into n -grams and then encodes them based on n -gram dictionaries. In the encoding phase, we use a sliding window with a size that ranges from bigram to five grams to obtain the best encoding stream. Each n -gram is encoded by two to four bytes accordingly based on its corresponding n -gram dictionary. We collected 2.5 GB text corpus from some Vietnamese news agencies to build n -gram dictionaries from unigram to five grams and achieve dictionaries with a size of 12 GB in total. In order to evaluate our method, we collected a testing set of 10 different text files with different sizes. The experimental results indicate that our method achieves compression ratio around 90% and outperforms state-of-the-art methods.

  5. Lead-lag cross-sectional structure and detection of correlated anticorrelated regime shifts: Application to the volatilities of inflation and economic growth rates

    NASA Astrophysics Data System (ADS)

    Zhou, Wei-Xing; Sornette, Didier

    2007-07-01

    We have recently introduced the “thermal optimal path” (TOP) method to investigate the real-time lead-lag structure between two time series. The TOP method consists in searching for a robust noise-averaged optimal path of the distance matrix along which the two time series have the greatest similarity. Here, we generalize the TOP method by introducing a more general definition of distance which takes into account possible regime shifts between positive and negative correlations. This generalization to track possible changes of correlation signs is able to identify possible transitions from one convention (or consensus) to another. Numerical simulations on synthetic time series verify that the new TOP method performs as expected even in the presence of substantial noise. We then apply it to investigate changes of convention in the dependence structure between the historical volatilities of the USA inflation rate and economic growth rate. Several measures show that the new TOP method significantly outperforms standard cross-correlation methods.

  6. n-Gram-Based Text Compression

    PubMed Central

    Duong, Hieu N.; Snasel, Vaclav

    2016-01-01

    We propose an efficient method for compressing Vietnamese text using n-gram dictionaries. It has a significant compression ratio in comparison with those of state-of-the-art methods on the same dataset. Given a text, first, the proposed method splits it into n-grams and then encodes them based on n-gram dictionaries. In the encoding phase, we use a sliding window with a size that ranges from bigram to five grams to obtain the best encoding stream. Each n-gram is encoded by two to four bytes accordingly based on its corresponding n-gram dictionary. We collected 2.5 GB text corpus from some Vietnamese news agencies to build n-gram dictionaries from unigram to five grams and achieve dictionaries with a size of 12 GB in total. In order to evaluate our method, we collected a testing set of 10 different text files with different sizes. The experimental results indicate that our method achieves compression ratio around 90% and outperforms state-of-the-art methods. PMID:27965708

  7. Gender and sexual orientation differences in cognition across adulthood: age is kinder to women than to men regardless of sexual orientation.

    PubMed

    Maylor, Elizabeth A; Reimers, Stian; Choi, Jean; Collaer, Marcia L; Peters, Michael; Silverman, Irwin

    2007-04-01

    Despite some evidence of greater age-related deterioration of the brain in males than in females, gender differences in rates of cognitive aging have proved inconsistent. The present study employed web-based methodology to collect data from people aged 20-65 years (109,612 men; 88,509 women). As expected, men outperformed women on tests of mental rotation and line angle judgment, whereas women outperformed men on tests of category fluency and object location memory. Performance on all tests declined with age but significantly more so for men than for women. Heterosexuals of each gender generally outperformed bisexuals and homosexuals on tests where that gender was superior; however, there were no clear interactions between age and sexual orientation for either gender. At least for these particular tests from young adulthood to retirement, age is kinder to women than to men, but treats heterosexuals, bisexuals, and homosexuals just the same.

  8. Results of a real-time irradiation of lithium P/N and conventional N/P silicon solar cells.

    NASA Technical Reports Server (NTRS)

    Reynard, D. L.; Peterson, D. G.

    1972-01-01

    Eight types of lithium-diffused P/N and three types of conventional 10 ohm-cm N/P silicon solar cells were irradiated at four different temperatures with a strontium-90 radioisotope at a rate typical of that expected in earth orbit. The six-month irradiation confirmed earlier accelerator results, showed that certain cell types outperform others at the various temperatures, and, in general, verified the recent improvements and potential usefulness of lithium solar cells. The experimental approach and statistical methods and analyses employed yielded increased confidence in the validity of the results. Injection level effects were observed to be significant.

  9. Multigrid contact detection method

    NASA Astrophysics Data System (ADS)

    He, Kejing; Dong, Shoubin; Zhou, Zhaoyao

    2007-03-01

    Contact detection is a general problem of many physical simulations. This work presents a O(N) multigrid method for general contact detection problems (MGCD). The multigrid idea is integrated with contact detection problems. Both the time complexity and memory consumption of the MGCD are O(N) . Unlike other methods, whose efficiencies are influenced strongly by the object size distribution, the performance of MGCD is insensitive to the object size distribution. We compare the MGCD with the no binary search (NBS) method and the multilevel boxing method in three dimensions for both time complexity and memory consumption. For objects with similar size, the MGCD is as good as the NBS method, both of which outperform the multilevel boxing method regarding memory consumption. For objects with diverse size, the MGCD outperform both the NBS method and the multilevel boxing method. We use the MGCD to solve the contact detection problem for a granular simulation system based on the discrete element method. From this granular simulation, we get the density property of monosize packing and binary packing with size ratio equal to 10. The packing density for monosize particles is 0.636. For binary packing with size ratio equal to 10, when the number of small particles is 300 times as the number of big particles, the maximal packing density 0.824 is achieved.

  10. FIND: difFerential chromatin INteractions Detection using a spatial Poisson process.

    PubMed

    Djekidel, Mohamed Nadhir; Chen, Yang; Zhang, Michael Q

    2018-02-12

    Polymer-based simulations and experimental studies indicate the existence of a spatial dependency between the adjacent DNA fibers involved in the formation of chromatin loops. However, the existing strategies for detecting differential chromatin interactions assume that the interacting segments are spatially independent from the other segments nearby. To resolve this issue, we developed a new computational method, FIND, which considers the local spatial dependency between interacting loci. FIND uses a spatial Poisson process to detect differential chromatin interactions that show a significant difference in their interaction frequency and the interaction frequency of their neighbors. Simulation and biological data analysis show that FIND outperforms the widely used count-based methods and has a better signal-to-noise ratio. © 2018 Djekidel et al.; Published by Cold Spring Harbor Laboratory Press.

  11. Improved adaptive splitting and selection: the hybrid training method of a classifier based on a feature space partitioning.

    PubMed

    Jackowski, Konrad; Krawczyk, Bartosz; Woźniak, Michał

    2014-05-01

    Currently, methods of combined classification are the focus of intense research. A properly designed group of combined classifiers exploiting knowledge gathered in a pool of elementary classifiers can successfully outperform a single classifier. There are two essential issues to consider when creating combined classifiers: how to establish the most comprehensive pool and how to design a fusion model that allows for taking full advantage of the collected knowledge. In this work, we address the issues and propose an AdaSS+, training algorithm dedicated for the compound classifier system that effectively exploits local specialization of the elementary classifiers. An effective training procedure consists of two phases. The first phase detects the classifier competencies and adjusts the respective fusion parameters. The second phase boosts classification accuracy by elevating the degree of local specialization. The quality of the proposed algorithms are evaluated on the basis of a wide range of computer experiments that show that AdaSS+ can outperform the original method and several reference classifiers.

  12. An Exact Algorithm to Compute the Double-Cut-and-Join Distance for Genomes with Duplicate Genes.

    PubMed

    Shao, Mingfu; Lin, Yu; Moret, Bernard M E

    2015-05-01

    Computing the edit distance between two genomes is a basic problem in the study of genome evolution. The double-cut-and-join (DCJ) model has formed the basis for most algorithmic research on rearrangements over the last few years. The edit distance under the DCJ model can be computed in linear time for genomes without duplicate genes, while the problem becomes NP-hard in the presence of duplicate genes. In this article, we propose an integer linear programming (ILP) formulation to compute the DCJ distance between two genomes with duplicate genes. We also provide an efficient preprocessing approach to simplify the ILP formulation while preserving optimality. Comparison on simulated genomes demonstrates that our method outperforms MSOAR in computing the edit distance, especially when the genomes contain long duplicated segments. We also apply our method to assign orthologous gene pairs among human, mouse, and rat genomes, where once again our method outperforms MSOAR.

  13. Cleaning lateral morphological features of the root canal: the role of streaming and cavitation.

    PubMed

    Robinson, J P; Macedo, R G; Verhaagen, B; Versluis, M; Cooper, P R; van der Sluis, L W M; Walmsley, A D

    2018-01-01

    To investigate the effects of ultrasonic activation file type, lateral canal location and irrigant on the removal of a biofilm-mimicking hydrogel from a fabricated lateral canal. Additionally, the amount of cavitation and streaming was quantified for these parameters. An intracanal sonochemical dosimetry method was used to quantify the cavitation generated by an IrriSafe 25 mm length, size 25 file inside a root canal model filled with filtered degassed/saturated water or three different concentrations of NaOCl. Removal of a hydrogel, demonstrated previously to be an appropriate biofilm mimic, was recorded to measure the lateral canal cleaning rate from two different instruments (IrriSafe 25 mm length, size 25 and K 21 mm length, size 15) activated with a P5 Suprasson (Satelec) at power P8.5 in degassed/saturated water or NaOCl. Removal rates were compared for significant differences using nonparametric Kruskal-Wallis and/or Mann-Whitney U-tests. Streaming was measured using high-speed particle imaging velocimetry at 250 kfps, analysing both the oscillatory and steady flow inside the lateral canals. There was no significant difference in amount of cavitation between tap water and oversaturated water (P = 0.538), although more cavitation was observed than in degassed water. The highest cavitation signal was generated with NaOCl solutions (1.0%, 4.5%, 9.0%) (P < 0.007) and increased with concentration (P < 0.014). The IrriSafe file outperformed significantly the K-file in removing hydrogel (P < 0.05). Up to 64% of the total hydrogel volume was removed after 20 s. The IrriSafe file typically outperformed the K-file in generating streaming. The oscillatory velocities were higher inside the lateral canal 3 mm compared to 6 mm from WL and were higher for NaOCl than for saturated water, which in turn was higher than for degassed water. Measurements of cavitation and acoustic streaming have provided insight into their contribution to cleaning. Significant differences in cleaning, cavitation and streaming were found depending on the file type and size, lateral canal location and irrigant used. In general, the IrriSafe file outperformed the K-file, and NaOCl performed better than the other irrigants tested. The cavitation and streaming measurements revealed that both contributed to hydrogel removal and both play a significant role in root canal cleaning. © 2017 International Endodontic Journal. Published by John Wiley & Sons Ltd.

  14. A Comparison of Alternating Current and Direct Current Electrospray Ionization for Mass Spectrometry

    PubMed Central

    Sarver, Scott A.; Gartner, Carlos A.; Chetwani, Nishant; Go, David B.; Dovichi, Norman J.

    2014-01-01

    A series of studies comparing the performance of alternating current electrospray ionization (AC ESI) mass spectrometry (MS) and direct current electrospray ionization (DC ESI) MS has been conducted, exploring the absolute signal intensity and signal-to-background ratios produced by both methods using caffeine and a model peptide as targets. Because the high-voltage AC signal was more susceptible to generating gas discharges, the operating voltage range of AC ESI was significantly smaller than that for DC ESI, such that the absolute signal intensities produced by DC ESI at peak voltages were 1 - 2 orders of magnitude greater than those for AC ESI. Using an electronegative nebulizing gas, sulfur hexafluoride (SF6), instead of nitrogen (N2) increased the operating range of AC ESI by ~50%, but did not appreciably improve signal intensities. While DC ESI generated far greater signal intensities, both ionization methods produced comparable signal-to-background noise, with AC ESI spectra appearing qualitatively cleaner. A quantitative calibration analysis was performed for two analytes, caffeine and the peptide MRFA. AC ESI utilizing SF6 outperforms all other techniques for the detection of MRFA, producing chromatographic limits of detection nearly one order of magnitude lower than that of DC ESI utilizing N2, and one half that of DC ESI utilizing SF6. However, DC ESI outperforms AC ESI for the analysis of caffeine, indicating improvements in spectral quality may benefit certain compounds, or classes of compounds, on an individual basis. PMID:24464359

  15. Comparison of DNA preservation methods for environmental bacterial community samples

    USGS Publications Warehouse

    Gray, Michael A.; Pratte, Zoe A.; Kellogg, Christina A.

    2013-01-01

    Field collections of environmental samples, for example corals, for molecular microbial analyses present distinct challenges. The lack of laboratory facilities in remote locations is common, and preservation of microbial community DNA for later study is critical. A particular challenge is keeping samples frozen in transit. Five nucleic acid preservation methods that do not require cold storage were compared for effectiveness over time and ease of use. Mixed microbial communities of known composition were created and preserved by DNAgard™, RNAlater®, DMSO–EDTA–salt (DESS), FTA® cards, and FTA Elute® cards. Automated ribosomal intergenic spacer analysis and clone libraries were used to detect specific changes in the faux communities over weeks and months of storage. A previously known bias in FTA® cards that results in lower recovery of pure cultures of Gram-positive bacteria was also detected in mixed community samples. There appears to be a uniform bias across all five preservation methods against microorganisms with high G + C DNA. Overall, the liquid-based preservatives (DNAgard™, RNAlater®, and DESS) outperformed the card-based methods. No single liquid method clearly outperformed the others, leaving method choice to be based on experimental design, field facilities, shipping constraints, and allowable cost.

  16. A study of the tolerance block approach to special stratification. [winter wheat in Kansas

    NASA Technical Reports Server (NTRS)

    Richardson, W. (Principal Investigator)

    1979-01-01

    The author has identified the following significant results. Twelve winter wheat LACIE segments in Kansas were used to compare the performance of three clustering methods: (1) BCLUST, which uses a spectral distance function to accumulate clusters; (2) blocks-alone, which divides spectral space into equally populated blocks; and (3) block-seeds, which uses spectral means of blocks-alone as seeds for accumulating distance-type clusters. Both BCLUST and block-seeds performed equally well and outperformed blocks-alone significantly. Their average variance ratio of about 0.5 showed imperfect separation of wheat from non-wheat. This result points to the need to explore the achievable crop separability in the spectral/temporal domain, and suggest evaluating derived features rather than data channels as a means to achieve purer spectral strata.

  17. Brain medical image diagnosis based on corners with importance-values.

    PubMed

    Gao, Linlin; Pan, Haiwei; Li, Qing; Xie, Xiaoqin; Zhang, Zhiqiang; Han, Jinming; Zhai, Xiao

    2017-11-21

    Brain disorders are one of the top causes of human death. Generally, neurologists analyze brain medical images for diagnosis. In the image analysis field, corners are one of the most important features, which makes corner detection and matching studies essential. However, existing corner detection studies do not consider the domain information of brain. This leads to many useless corners and the loss of significant information. Regarding corner matching, the uncertainty and structure of brain are not employed in existing methods. Moreover, most corner matching studies are used for 3D image registration. They are inapplicable for 2D brain image diagnosis because of the different mechanisms. To address these problems, we propose a novel corner-based brain medical image classification method. Specifically, we automatically extract multilayer texture images (MTIs) which embody diagnostic information from neurologists. Moreover, we present a corner matching method utilizing the uncertainty and structure of brain medical images and a bipartite graph model. Finally, we propose a similarity calculation method for diagnosis. Brain CT and MRI image sets are utilized to evaluate the proposed method. First, classifiers are trained in N-fold cross-validation analysis to produce the best θ and K. Then independent brain image sets are tested to evaluate the classifiers. Moreover, the classifiers are also compared with advanced brain image classification studies. For the brain CT image set, the proposed classifier outperforms the comparison methods by at least 8% on accuracy and 2.4% on F1-score. Regarding the brain MRI image set, the proposed classifier is superior to the comparison methods by more than 7.3% on accuracy and 4.9% on F1-score. Results also demonstrate that the proposed method is robust to different intensity ranges of brain medical image. In this study, we develop a robust corner-based brain medical image classifier. Specifically, we propose a corner detection method utilizing the diagnostic information from neurologists and a corner matching method based on the uncertainty and structure of brain medical images. Additionally, we present a similarity calculation method for brain image classification. Experimental results on two brain image sets show the proposed corner-based brain medical image classifier outperforms the state-of-the-art studies.

  18. Detection of LSB+/-1 steganography based on co-occurrence matrix and bit plane clipping

    NASA Astrophysics Data System (ADS)

    Abolghasemi, Mojtaba; Aghaeinia, Hassan; Faez, Karim; Mehrabi, Mohammad Ali

    2010-01-01

    Spatial LSB+/-1 steganography changes smooth characteristics between adjoining pixels of the raw image. We present a novel steganalysis method for LSB+/-1 steganography based on feature vectors derived from the co-occurrence matrix in the spatial domain. We investigate how LSB+/-1 steganography affects the bit planes of an image and show that it changes more least significant bit (LSB) planes of it. The co-occurrence matrix is derived from an image in which some of its most significant bit planes are clipped. By this preprocessing, in addition to reducing the dimensions of the feature vector, the effects of embedding were also preserved. We compute the co-occurrence matrix in different directions and with different dependency and use the elements of the resulting co-occurrence matrix as features. This method is sensitive to the data embedding process. We use a Fisher linear discrimination (FLD) classifier and test our algorithm on different databases and embedding rates. We compare our scheme with the current LSB+/-1 steganalysis methods. It is shown that the proposed scheme outperforms the state-of-the-art methods in detecting the LSB+/-1 steganographic method for grayscale images.

  19. Classification of hyperspectral imagery with neural networks: comparison to conventional tools

    NASA Astrophysics Data System (ADS)

    Merényi, Erzsébet; Farrand, William H.; Taranik, James V.; Minor, Timothy B.

    2014-12-01

    Efficient exploitation of hyperspectral imagery is of great importance in remote sensing. Artificial intelligence approaches have been receiving favorable reviews for classification of hyperspectral data because the complexity of such data challenges the limitations of many conventional methods. Artificial neural networks (ANNs) were shown to outperform traditional classifiers in many situations. However, studies that use the full spectral dimensionality of hyperspectral images to classify a large number of surface covers are scarce if non-existent. We advocate the need for methods that can handle the full dimensionality and a large number of classes to retain the discovery potential and the ability to discriminate classes with subtle spectral differences. We demonstrate that such a method exists in the family of ANNs. We compare the maximum likelihood, Mahalonobis distance, minimum distance, spectral angle mapper, and a hybrid ANN classifier for real hyperspectral AVIRIS data, using the full spectral resolution to map 23 cover types and using a small training set. Rigorous evaluation of the classification accuracies shows that the ANN outperforms the other methods and achieves ≈90% accuracy on test data.

  20. Lysine acetylation sites prediction using an ensemble of support vector machine classifiers.

    PubMed

    Xu, Yan; Wang, Xiao-Bo; Ding, Jun; Wu, Ling-Yun; Deng, Nai-Yang

    2010-05-07

    Lysine acetylation is an essentially reversible and high regulated post-translational modification which regulates diverse protein properties. Experimental identification of acetylation sites is laborious and expensive. Hence, there is significant interest in the development of computational methods for reliable prediction of acetylation sites from amino acid sequences. In this paper we use an ensemble of support vector machine classifiers to perform this work. The experimentally determined acetylation lysine sites are extracted from Swiss-Prot database and scientific literatures. Experiment results show that an ensemble of support vector machine classifiers outperforms single support vector machine classifier and other computational methods such as PAIL and LysAcet on the problem of predicting acetylation lysine sites. The resulting method has been implemented in EnsemblePail, a web server for lysine acetylation sites prediction available at http://www.aporc.org/EnsemblePail/. Copyright (c) 2010 Elsevier Ltd. All rights reserved.

  1. Evolving neural networks through augmenting topologies.

    PubMed

    Stanley, Kenneth O; Miikkulainen, Risto

    2002-01-01

    An important question in neuroevolution is how to gain an advantage from evolving neural network topologies along with weights. We present a method, NeuroEvolution of Augmenting Topologies (NEAT), which outperforms the best fixed-topology method on a challenging benchmark reinforcement learning task. We claim that the increased efficiency is due to (1) employing a principled method of crossover of different topologies, (2) protecting structural innovation using speciation, and (3) incrementally growing from minimal structure. We test this claim through a series of ablation studies that demonstrate that each component is necessary to the system as a whole and to each other. What results is significantly faster learning. NEAT is also an important contribution to GAs because it shows how it is possible for evolution to both optimize and complexify solutions simultaneously, offering the possibility of evolving increasingly complex solutions over generations, and strengthening the analogy with biological evolution.

  2. Improving causal inference with a doubly robust estimator that combines propensity score stratification and weighting.

    PubMed

    Linden, Ariel

    2017-08-01

    When a randomized controlled trial is not feasible, health researchers typically use observational data and rely on statistical methods to adjust for confounding when estimating treatment effects. These methods generally fall into 3 categories: (1) estimators based on a model for the outcome using conventional regression adjustment; (2) weighted estimators based on the propensity score (ie, a model for the treatment assignment); and (3) "doubly robust" (DR) estimators that model both the outcome and propensity score within the same framework. In this paper, we introduce a new DR estimator that utilizes marginal mean weighting through stratification (MMWS) as the basis for weighted adjustment. This estimator may prove more accurate than treatment effect estimators because MMWS has been shown to be more accurate than other models when the propensity score is misspecified. We therefore compare the performance of this new estimator to other commonly used treatment effects estimators. Monte Carlo simulation is used to compare the DR-MMWS estimator to regression adjustment, 2 weighted estimators based on the propensity score and 2 other DR methods. To assess performance under varied conditions, we vary the level of misspecification of the propensity score model as well as misspecify the outcome model. Overall, DR estimators generally outperform methods that model one or the other components (eg, propensity score or outcome). The DR-MMWS estimator outperforms all other estimators when both the propensity score and outcome models are misspecified and performs equally as well as other DR estimators when only the propensity score is misspecified. Health researchers should consider using DR-MMWS as the principal evaluation strategy in observational studies, as this estimator appears to outperform other estimators in its class. © 2017 John Wiley & Sons, Ltd.

  3. Global positioning method based on polarized light compass system

    NASA Astrophysics Data System (ADS)

    Liu, Jun; Yang, Jiangtao; Wang, Yubo; Tang, Jun; Shen, Chong

    2018-05-01

    This paper presents a global positioning method based on a polarized light compass system. A main limitation of polarization positioning is the environment such as weak and locally destroyed polarization environments, and the solution to the positioning problem is given in this paper which is polarization image de-noising and segmentation. Therefore, the pulse coupled neural network is employed for enhancing positioning performance. The prominent advantages of the present positioning technique are as follows: (i) compared to the existing position method based on polarized light, better sun tracking accuracy can be achieved and (ii) the robustness and accuracy of positioning under weak and locally destroyed polarization environments, such as cloudy or building shielding, are improved significantly. Finally, some field experiments are given to demonstrate the effectiveness and applicability of the proposed global positioning technique. The experiments have shown that our proposed method outperforms the conventional polarization positioning method, the real time longitude and latitude with accuracy up to 0.0461° and 0.0911°, respectively.

  4. QRS classification and spatial combination for robust heart rate detection in low-quality fetal ECG recordings.

    PubMed

    Warmerdam, G; Vullings, R; Van Pul, C; Andriessen, P; Oei, S G; Wijn, P

    2013-01-01

    Non-invasive fetal electrocardiography (ECG) can be used for prolonged monitoring of the fetal heart rate (FHR). However, the signal-to-noise-ratio (SNR) of non-invasive ECG recordings is often insufficient for reliable detection of the FHR. To overcome this problem, source separation techniques can be used to enhance the fetal ECG. This study uses a physiology-based source separation (PBSS) technique that has already been demonstrated to outperform widely used blind source separation techniques. Despite the relatively good performance of PBSS in enhancing the fetal ECG, PBSS is still susceptible to artifacts. In this study an augmented PBSS technique is developed to reduce the influence of artifacts. The performance of the developed method is compared to PBSS on multi-channel non-invasive fetal ECG recordings. Based on this comparison, the developed method is shown to outperform PBSS for the enhancement of the fetal ECG.

  5. Structural and Sequence Similarity Makes a Significant Impact on Machine-Learning-Based Scoring Functions for Protein-Ligand Interactions.

    PubMed

    Li, Yang; Yang, Jianyi

    2017-04-24

    The prediction of protein-ligand binding affinity has recently been improved remarkably by machine-learning-based scoring functions. For example, using a set of simple descriptors representing the atomic distance counts, the RF-Score improves the Pearson correlation coefficient to about 0.8 on the core set of the PDBbind 2007 database, which is significantly higher than the performance of any conventional scoring function on the same benchmark. A few studies have been made to discuss the performance of machine-learning-based methods, but the reason for this improvement remains unclear. In this study, by systemically controlling the structural and sequence similarity between the training and test proteins of the PDBbind benchmark, we demonstrate that protein structural and sequence similarity makes a significant impact on machine-learning-based methods. After removal of training proteins that are highly similar to the test proteins identified by structure alignment and sequence alignment, machine-learning-based methods trained on the new training sets do not outperform the conventional scoring functions any more. On the contrary, the performance of conventional functions like X-Score is relatively stable no matter what training data are used to fit the weights of its energy terms.

  6. Unstructured Adaptive Grid Computations on an Array of SMPs

    NASA Technical Reports Server (NTRS)

    Biswas, Rupak; Pramanick, Ira; Sohn, Andrew; Simon, Horst D.

    1996-01-01

    Dynamic load balancing is necessary for parallel adaptive methods to solve unsteady CFD problems on unstructured grids. We have presented such a dynamic load balancing framework called JOVE, in this paper. Results on a four-POWERnode POWER CHALLENGEarray demonstrated that load balancing gives significant performance improvements over no load balancing for such adaptive computations. The parallel speedup of JOVE, implemented using MPI on the POWER CHALLENCEarray, was significant, being as high as 31 for 32 processors. An implementation of JOVE that exploits 'an array of SMPS' architecture was also studied; this hybrid JOVE outperformed flat JOVE by up to 28% on the meshes and adaption models tested. With large, realistic meshes and actual flow-solver and adaption phases incorporated into JOVE, hybrid JOVE can be expected to yield significant advantage over flat JOVE, especially as the number of processors is increased, thus demonstrating the scalability of an array of SMPs architecture.

  7. Improving cerebellar segmentation with statistical fusion

    NASA Astrophysics Data System (ADS)

    Plassard, Andrew J.; Yang, Zhen; Prince, Jerry L.; Claassen, Daniel O.; Landman, Bennett A.

    2016-03-01

    The cerebellum is a somatotopically organized central component of the central nervous system well known to be involved with motor coordination and increasingly recognized roles in cognition and planning. Recent work in multiatlas labeling has created methods that offer the potential for fully automated 3-D parcellation of the cerebellar lobules and vermis (which are organizationally equivalent to cortical gray matter areas). This work explores the trade offs of using different statistical fusion techniques and post hoc optimizations in two datasets with distinct imaging protocols. We offer a novel fusion technique by extending the ideas of the Selective and Iterative Method for Performance Level Estimation (SIMPLE) to a patch-based performance model. We demonstrate the effectiveness of our algorithm, Non- Local SIMPLE, for segmentation of a mixed population of healthy subjects and patients with severe cerebellar anatomy. Under the first imaging protocol, we show that Non-Local SIMPLE outperforms previous gold-standard segmentation techniques. In the second imaging protocol, we show that Non-Local SIMPLE outperforms previous gold standard techniques but is outperformed by a non-locally weighted vote with the deeper population of atlases available. This work advances the state of the art in open source cerebellar segmentation algorithms and offers the opportunity for routinely including cerebellar segmentation in magnetic resonance imaging studies that acquire whole brain T1-weighted volumes with approximately 1 mm isotropic resolution.

  8. Inferring imagined speech using EEG signals: a new approach using Riemannian manifold features

    NASA Astrophysics Data System (ADS)

    Nguyen, Chuong H.; Karavas, George K.; Artemiadis, Panagiotis

    2018-02-01

    Objective. In this paper, we investigate the suitability of imagined speech for brain-computer interface (BCI) applications. Approach. A novel method based on covariance matrix descriptors, which lie in Riemannian manifold, and the relevance vector machines classifier is proposed. The method is applied on electroencephalographic (EEG) signals and tested in multiple subjects. Main results. The method is shown to outperform other approaches in the field with respect to accuracy and robustness. The algorithm is validated on various categories of speech, such as imagined pronunciation of vowels, short words and long words. The classification accuracy of our methodology is in all cases significantly above chance level, reaching a maximum of 70% for cases where we classify three words and 95% for cases of two words. Significance. The results reveal certain aspects that may affect the success of speech imagery classification from EEG signals, such as sound, meaning and word complexity. This can potentially extend the capability of utilizing speech imagery in future BCI applications. The dataset of speech imagery collected from total 15 subjects is also published.

  9. An enriched multimedia eBook application to facilitate learning of anatomy.

    PubMed

    Stirling, Allan; Birt, James

    2014-01-01

    This pilot study compared the use of an enriched multimedia eBook with traditional methods for teaching the gross anatomy of the heart and great vessels. Seventy-one first-year students from an Australian medical school participated in the study. Students' abilities were examined by pretest, intervention, and post-test measurements. Perceptions and attitudes toward eBook technology were examined by survey questions. Results indicated a strongly positive user experience coupled with increased marks; however, there were no statistically significant results for the eBook method of delivery alone outperforming the traditional anatomy practical session. Results did show a statistically significant difference in the final marks achieved based on the sequencing of the learning modalities. With initial interaction with the multimedia content followed by active experimentation in the anatomy lab, students' performance was improved in the final test. Obtained data support the role of eBook technology in modern anatomy curriculum being a useful adjunct to traditional methods. Further study is needed to investigate the importance of sequencing of teaching interventions. © 2013 American Association of Anatomists.

  10. Spatial Copula Model for Imputing Traffic Flow Data from Remote Microwave Sensors

    PubMed Central

    Ma, Xiaolei; Du, Bowen; Yu, Bin

    2017-01-01

    Issues of missing data have become increasingly serious with the rapid increase in usage of traffic sensors. Analyses of the Beijing ring expressway have showed that up to 50% of microwave sensors pose missing values. The imputation of missing traffic data must be urgently solved although a precise solution that cannot be easily achieved due to the significant number of missing portions. In this study, copula-based models are proposed for the spatial interpolation of traffic flow from remote traffic microwave sensors. Most existing interpolation methods only rely on covariance functions to depict spatial correlation and are unsuitable for coping with anomalies due to Gaussian consumption. Copula theory overcomes this issue and provides a connection between the correlation function and the marginal distribution function of traffic flow. To validate copula-based models, a comparison with three kriging methods is conducted. Results indicate that copula-based models outperform kriging methods, especially on roads with irregular traffic patterns. Copula-based models demonstrate significant potential to impute missing data in large-scale transportation networks. PMID:28934164

  11. A new family of Polak-Ribiere-Polyak conjugate gradient method with the strong-Wolfe line search

    NASA Astrophysics Data System (ADS)

    Ghani, Nur Hamizah Abdul; Mamat, Mustafa; Rivaie, Mohd

    2017-08-01

    Conjugate gradient (CG) method is an important technique in unconstrained optimization, due to its effectiveness and low memory requirements. The focus of this paper is to introduce a new CG method for solving large scale unconstrained optimization. Theoretical proofs show that the new method fulfills sufficient descent condition if strong Wolfe-Powell inexact line search is used. Besides, computational results show that our proposed method outperforms to other existing CG methods.

  12. Binary Interval Search: a scalable algorithm for counting interval intersections.

    PubMed

    Layer, Ryan M; Skadron, Kevin; Robins, Gabriel; Hall, Ira M; Quinlan, Aaron R

    2013-01-01

    The comparison of diverse genomic datasets is fundamental to understand genome biology. Researchers must explore many large datasets of genome intervals (e.g. genes, sequence alignments) to place their experimental results in a broader context and to make new discoveries. Relationships between genomic datasets are typically measured by identifying intervals that intersect, that is, they overlap and thus share a common genome interval. Given the continued advances in DNA sequencing technologies, efficient methods for measuring statistically significant relationships between many sets of genomic features are crucial for future discovery. We introduce the Binary Interval Search (BITS) algorithm, a novel and scalable approach to interval set intersection. We demonstrate that BITS outperforms existing methods at counting interval intersections. Moreover, we show that BITS is intrinsically suited to parallel computing architectures, such as graphics processing units by illustrating its utility for efficient Monte Carlo simulations measuring the significance of relationships between sets of genomic intervals. https://github.com/arq5x/bits.

  13. Refining Automatically Extracted Knowledge Bases Using Crowdsourcing

    PubMed Central

    Xian, Xuefeng; Cui, Zhiming

    2017-01-01

    Machine-constructed knowledge bases often contain noisy and inaccurate facts. There exists significant work in developing automated algorithms for knowledge base refinement. Automated approaches improve the quality of knowledge bases but are far from perfect. In this paper, we leverage crowdsourcing to improve the quality of automatically extracted knowledge bases. As human labelling is costly, an important research challenge is how we can use limited human resources to maximize the quality improvement for a knowledge base. To address this problem, we first introduce a concept of semantic constraints that can be used to detect potential errors and do inference among candidate facts. Then, based on semantic constraints, we propose rank-based and graph-based algorithms for crowdsourced knowledge refining, which judiciously select the most beneficial candidate facts to conduct crowdsourcing and prune unnecessary questions. Our experiments show that our method improves the quality of knowledge bases significantly and outperforms state-of-the-art automatic methods under a reasonable crowdsourcing cost. PMID:28588611

  14. Gene expression inference with deep learning.

    PubMed

    Chen, Yifei; Li, Yi; Narayan, Rajiv; Subramanian, Aravind; Xie, Xiaohui

    2016-06-15

    Large-scale gene expression profiling has been widely used to characterize cellular states in response to various disease conditions, genetic perturbations, etc. Although the cost of whole-genome expression profiles has been dropping steadily, generating a compendium of expression profiling over thousands of samples is still very expensive. Recognizing that gene expressions are often highly correlated, researchers from the NIH LINCS program have developed a cost-effective strategy of profiling only ∼1000 carefully selected landmark genes and relying on computational methods to infer the expression of remaining target genes. However, the computational approach adopted by the LINCS program is currently based on linear regression (LR), limiting its accuracy since it does not capture complex nonlinear relationship between expressions of genes. We present a deep learning method (abbreviated as D-GEX) to infer the expression of target genes from the expression of landmark genes. We used the microarray-based Gene Expression Omnibus dataset, consisting of 111K expression profiles, to train our model and compare its performance to those from other methods. In terms of mean absolute error averaged across all genes, deep learning significantly outperforms LR with 15.33% relative improvement. A gene-wise comparative analysis shows that deep learning achieves lower error than LR in 99.97% of the target genes. We also tested the performance of our learned model on an independent RNA-Seq-based GTEx dataset, which consists of 2921 expression profiles. Deep learning still outperforms LR with 6.57% relative improvement, and achieves lower error in 81.31% of the target genes. D-GEX is available at https://github.com/uci-cbcl/D-GEX CONTACT: xhx@ics.uci.edu Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  15. Gene expression inference with deep learning

    PubMed Central

    Chen, Yifei; Li, Yi; Narayan, Rajiv; Subramanian, Aravind; Xie, Xiaohui

    2016-01-01

    Motivation: Large-scale gene expression profiling has been widely used to characterize cellular states in response to various disease conditions, genetic perturbations, etc. Although the cost of whole-genome expression profiles has been dropping steadily, generating a compendium of expression profiling over thousands of samples is still very expensive. Recognizing that gene expressions are often highly correlated, researchers from the NIH LINCS program have developed a cost-effective strategy of profiling only ∼1000 carefully selected landmark genes and relying on computational methods to infer the expression of remaining target genes. However, the computational approach adopted by the LINCS program is currently based on linear regression (LR), limiting its accuracy since it does not capture complex nonlinear relationship between expressions of genes. Results: We present a deep learning method (abbreviated as D-GEX) to infer the expression of target genes from the expression of landmark genes. We used the microarray-based Gene Expression Omnibus dataset, consisting of 111K expression profiles, to train our model and compare its performance to those from other methods. In terms of mean absolute error averaged across all genes, deep learning significantly outperforms LR with 15.33% relative improvement. A gene-wise comparative analysis shows that deep learning achieves lower error than LR in 99.97% of the target genes. We also tested the performance of our learned model on an independent RNA-Seq-based GTEx dataset, which consists of 2921 expression profiles. Deep learning still outperforms LR with 6.57% relative improvement, and achieves lower error in 81.31% of the target genes. Availability and implementation: D-GEX is available at https://github.com/uci-cbcl/D-GEX. Contact: xhx@ics.uci.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26873929

  16. Simultaneous Nonrigid Registration, Segmentation, and Tumor Detection in MRI Guided Cervical Cancer Radiation Therapy

    PubMed Central

    Lu, Chao; Chelikani, Sudhakar; Jaffray, David A.; Milosevic, Michael F.; Staib, Lawrence H.; Duncan, James S.

    2013-01-01

    External beam radiation therapy (EBRT) for the treatment of cancer enables accurate placement of radiation dose on the cancerous region. However, the deformation of soft tissue during the course of treatment, such as in cervical cancer, presents significant challenges for the delineation of the target volume and other structures of interest. Furthermore, the presence and regression of pathologies such as tumors may violate registration constraints and cause registration errors. In this paper, automatic segmentation, nonrigid registration and tumor detection in cervical magnetic resonance (MR) data are addressed simultaneously using a unified Bayesian framework. The proposed novel method can generate a tumor probability map while progressively identifying the boundary of an organ of interest based on the achieved nonrigid transformation. The method is able to handle the challenges of significant tumor regression and its effect on surrounding tissues. The new method was compared to various currently existing algorithms on a set of 36 MR data from six patients, each patient has six T2-weighted MR cervical images. The results show that the proposed approach achieves an accuracy comparable to manual segmentation and it significantly outperforms the existing registration algorithms. In addition, the tumor detection result generated by the proposed method has a high agreement with manual delineation by a qualified clinician. PMID:22328178

  17. Optimal lattice-structured materials

    DOE PAGES

    Messner, Mark C.

    2016-07-09

    This paper describes a method for optimizing the mesostructure of lattice-structured materials. These materials are periodic arrays of slender members resembling efficient, lightweight macroscale structures like bridges and frame buildings. Current additive manufacturing technologies can assemble lattice structures with length scales ranging from nanometers to millimeters. Previous work demonstrates that lattice materials have excellent stiffness- and strength-to-weight scaling, outperforming natural materials. However, there are currently no methods for producing optimal mesostructures that consider the full space of possible 3D lattice topologies. The inverse homogenization approach for optimizing the periodic structure of lattice materials requires a parameterized, homogenized material model describingmore » the response of an arbitrary structure. This work develops such a model, starting with a method for describing the long-wavelength, macroscale deformation of an arbitrary lattice. The work combines the homogenized model with a parameterized description of the total design space to generate a parameterized model. Finally, the work describes an optimization method capable of producing optimal mesostructures. Several examples demonstrate the optimization method. One of these examples produces an elastically isotropic, maximally stiff structure, here called the isotruss, that arguably outperforms the anisotropic octet truss topology.« less

  18. Robust gene selection methods using weighting schemes for microarray data analysis.

    PubMed

    Kang, Suyeon; Song, Jongwoo

    2017-09-02

    A common task in microarray data analysis is to identify informative genes that are differentially expressed between two different states. Owing to the high-dimensional nature of microarray data, identification of significant genes has been essential in analyzing the data. However, the performances of many gene selection techniques are highly dependent on the experimental conditions, such as the presence of measurement error or a limited number of sample replicates. We have proposed new filter-based gene selection techniques, by applying a simple modification to significance analysis of microarrays (SAM). To prove the effectiveness of the proposed method, we considered a series of synthetic datasets with different noise levels and sample sizes along with two real datasets. The following findings were made. First, our proposed methods outperform conventional methods for all simulation set-ups. In particular, our methods are much better when the given data are noisy and sample size is small. They showed relatively robust performance regardless of noise level and sample size, whereas the performance of SAM became significantly worse as the noise level became high or sample size decreased. When sufficient sample replicates were available, SAM and our methods showed similar performance. Finally, our proposed methods are competitive with traditional methods in classification tasks for microarrays. The results of simulation study and real data analysis have demonstrated that our proposed methods are effective for detecting significant genes and classification tasks, especially when the given data are noisy or have few sample replicates. By employing weighting schemes, we can obtain robust and reliable results for microarray data analysis.

  19. Deep-learning: investigating deep neural networks hyper-parameters and comparison of performance to shallow methods for modeling bioactivity data.

    PubMed

    Koutsoukas, Alexios; Monaghan, Keith J; Li, Xiaoli; Huan, Jun

    2017-06-28

    In recent years, research in artificial neural networks has resurged, now under the deep-learning umbrella, and grown extremely popular. Recently reported success of DL techniques in crowd-sourced QSAR and predictive toxicology competitions has showcased these methods as powerful tools in drug-discovery and toxicology research. The aim of this work was dual, first large number of hyper-parameter configurations were explored to investigate how they affect the performance of DNNs and could act as starting points when tuning DNNs and second their performance was compared to popular methods widely employed in the field of cheminformatics namely Naïve Bayes, k-nearest neighbor, random forest and support vector machines. Moreover, robustness of machine learning methods to different levels of artificially introduced noise was assessed. The open-source Caffe deep-learning framework and modern NVidia GPU units were utilized to carry out this study, allowing large number of DNN configurations to be explored. We show that feed-forward deep neural networks are capable of achieving strong classification performance and outperform shallow methods across diverse activity classes when optimized. Hyper-parameters that were found to play critical role are the activation function, dropout regularization, number hidden layers and number of neurons. When compared to the rest methods, tuned DNNs were found to statistically outperform, with p value <0.01 based on Wilcoxon statistical test. DNN achieved on average MCC units of 0.149 higher than NB, 0.092 than kNN, 0.052 than SVM with linear kernel, 0.021 than RF and finally 0.009 higher than SVM with radial basis function kernel. When exploring robustness to noise, non-linear methods were found to perform well when dealing with low levels of noise, lower than or equal to 20%, however when dealing with higher levels of noise, higher than 30%, the Naïve Bayes method was found to perform well and even outperform at the highest level of noise 50% more sophisticated methods across several datasets.

  20. Kidney Injury Molecule-1 Outperforms Traditional Biomarkers of Kidney Injury in Multi-site Preclinical Biomarker Qualification Studies

    PubMed Central

    Vaidya, Vishal S.; Ozer, Josef S.; Frank, Dieterle; Collings, Fitz B.; Ramirez, Victoria; Troth, Sean; Muniappa, Nagaraja; Thudium, Douglas; Gerhold, David; Holder, Daniel J.; Bobadilla, Norma A.; Marrer, Estelle; Perentes, Elias; Cordier, André; Vonderscher, Jacky; Maurer, Gérard; Goering, Peter L.; Sistare, Frank D.; Bonventre, Joseph V.

    2010-01-01

    Kidney toxicity accounts for a significant percentage of morbidity and drug candidate failure. Serum creatinine (SCr) and blood urea nitrogen (BUN) have been used to monitor kidney dysfunction for over a century but these markers are insensitive and non-specific. In multi-site preclinical rat toxicology studies the diagnostic performance of urinary kidney injury molecule-1 (Kim-1) was compared to traditional biomarkers as predictors of kidney tubular histopathologic changes, currently considered the “gold standard” of nephrotoxicity. In multiple models of kidney injury, urinary Kim-1 significantly outperformed SCr and BUN. The area under the receiver operating characteristic curve for Kim-1 was between 0.91 and 0.99 as compared to 0.79 to 0.9 for BUN and 0.73 to 0.85 for SCr. Thus urinary Kim-1 is the first injury biomarker of kidney toxicity qualified by the FDA and EMEA and is expected to significantly improve kidney safety monitoring. PMID:20458318

  1. A robust nonparametric framework for reconstruction of stochastic differential equation models

    NASA Astrophysics Data System (ADS)

    Rajabzadeh, Yalda; Rezaie, Amir Hossein; Amindavar, Hamidreza

    2016-05-01

    In this paper, we employ a nonparametric framework to robustly estimate the functional forms of drift and diffusion terms from discrete stationary time series. The proposed method significantly improves the accuracy of the parameter estimation. In this framework, drift and diffusion coefficients are modeled through orthogonal Legendre polynomials. We employ the least squares regression approach along with the Euler-Maruyama approximation method to learn coefficients of stochastic model. Next, a numerical discrete construction of mean squared prediction error (MSPE) is established to calculate the order of Legendre polynomials in drift and diffusion terms. We show numerically that the new method is robust against the variation in sample size and sampling rate. The performance of our method in comparison with the kernel-based regression (KBR) method is demonstrated through simulation and real data. In case of real dataset, we test our method for discriminating healthy electroencephalogram (EEG) signals from epilepsy ones. We also demonstrate the efficiency of the method through prediction in the financial data. In both simulation and real data, our algorithm outperforms the KBR method.

  2. A Fixed-Lag Kalman Smoother to Filter Power Line Interference in Electrocardiogram Recordings.

    PubMed

    Warmerdam, G J J; Vullings, R; Schmitt, L; Van Laar, J O E H; Bergmans, J W M

    2017-08-01

    Filtering power line interference (PLI) from electrocardiogram (ECG) recordings can lead to significant distortions of the ECG and mask clinically relevant features in ECG waveform morphology. The objective of this study is to filter PLI from ECG recordings with minimal distortion of the ECG waveform. In this paper, we propose a fixed-lag Kalman smoother with adaptive noise estimation. The performance of this Kalman smoother in filtering PLI is compared to that of a fixed-bandwidth notch filter and several adaptive PLI filters that have been proposed in the literature. To evaluate the performance, we corrupted clean neonatal ECG recordings with various simulated PLI. Furthermore, examples are shown of filtering real PLI from an adult and a fetal ECG recording. The fixed-lag Kalman smoother outperforms other PLI filters in terms of step response settling time (improvements that range from 0.1 to 1 s) and signal-to-noise ratio (improvements that range from 17 to 23 dB). Our fixed-lag Kalman smoother can be used for semi real-time applications with a limited delay of 0.4 s. The fixed-lag Kalman smoother presented in this study outperforms other methods for filtering PLI and leads to minimal distortion of the ECG waveform.

  3. Student beats the teacher: deep neural networks for lateral ventricles segmentation in brain MR

    NASA Astrophysics Data System (ADS)

    Ghafoorian, Mohsen; Teuwen, Jonas; Manniesing, Rashindra; Leeuw, Frank-Erik d.; van Ginneken, Bram; Karssemeijer, Nico; Platel, Bram

    2018-03-01

    Ventricular volume and its progression are known to be linked to several brain diseases such as dementia and schizophrenia. Therefore accurate measurement of ventricle volume is vital for longitudinal studies on these disorders, making automated ventricle segmentation algorithms desirable. In the past few years, deep neural networks have shown to outperform the classical models in many imaging domains. However, the success of deep networks is dependent on manually labeled data sets, which are expensive to acquire especially for higher dimensional data in the medical domain. In this work, we show that deep neural networks can be trained on muchcheaper-to-acquire pseudo-labels (e.g., generated by other automated less accurate methods) and still produce more accurate segmentations compared to the quality of the labels. To show this, we use noisy segmentation labels generated by a conventional region growing algorithm to train a deep network for lateral ventricle segmentation. Then on a large manually annotated test set, we show that the network significantly outperforms the conventional region growing algorithm which was used to produce the training labels for the network. Our experiments report a Dice Similarity Coefficient (DSC) of 0.874 for the trained network compared to 0.754 for the conventional region growing algorithm (p < 0.001).

  4. Bayesian adaptive phase II screening design for combination trials

    PubMed Central

    Cai, Chunyan; Yuan, Ying; Johnson, Valen E

    2013-01-01

    Background Trials of combination therapies for the treatment of cancer are playing an increasingly important role in the battle against this disease. To more efficiently handle the large number of combination therapies that must be tested, we propose a novel Bayesian phase II adaptive screening design to simultaneously select among possible treatment combinations involving multiple agents. Methods Our design is based on formulating the selection procedure as a Bayesian hypothesis testing problem in which the superiority of each treatment combination is equated to a single hypothesis. During the trial conduct, we use the current values of the posterior probabilities of all hypotheses to adaptively allocate patients to treatment combinations. Results Simulation studies show that the proposed design substantially outperforms the conventional multiarm balanced factorial trial design. The proposed design yields a significantly higher probability for selecting the best treatment while allocating substantially more patients to efficacious treatments. Limitations The proposed design is most appropriate for the trials combining multiple agents and screening out the efficacious combination to be further investigated. Conclusions The proposed Bayesian adaptive phase II screening design substantially outperformed the conventional complete factorial design. Our design allocates more patients to better treatments while providing higher power to identify the best treatment at the end of the trial. PMID:23359875

  5. Application of a New Resampling Method to SEM: A Comparison of S-SMART with the Bootstrap

    ERIC Educational Resources Information Center

    Bai, Haiyan; Sivo, Stephen A.; Pan, Wei; Fan, Xitao

    2016-01-01

    Among the commonly used resampling methods of dealing with small-sample problems, the bootstrap enjoys the widest applications because it often outperforms its counterparts. However, the bootstrap still has limitations when its operations are contemplated. Therefore, the purpose of this study is to examine an alternative, new resampling method…

  6. Resampling and Distribution of the Product Methods for Testing Indirect Effects in Complex Models

    ERIC Educational Resources Information Center

    Williams, Jason; MacKinnon, David P.

    2008-01-01

    Recent advances in testing mediation have found that certain resampling methods and tests based on the mathematical distribution of 2 normal random variables substantially outperform the traditional "z" test. However, these studies have primarily focused only on models with a single mediator and 2 component paths. To address this limitation, a…

  7. Finding Dantzig Selectors with a Proximity Operator based Fixed-point Algorithm

    DTIC Science & Technology

    2014-11-01

    experiments showed that this method usually outperforms the method in [2] in terms of CPU time while producing solutions of comparable quality. The... method proposed in [19]. To alleviate the difficulty caused by the subprob- lem without a closed form solution , a linearized ADM was proposed for the...a closed form solution , but the β-related subproblem does not and is solved approximately by using the nonmonotone gradient method in [18]. The

  8. Retrieval evaluation and distance learning from perceived similarity between endomicroscopy videos.

    PubMed

    André, Barbara; Vercauteren, Tom; Buchner, Anna M; Wallace, Michael B; Ayache, Nicholas

    2011-01-01

    Evaluating content-based retrieval (CBR) is challenging because it requires an adequate ground-truth. When the available groundtruth is limited to textual metadata such as pathological classes, retrieval results can only be evaluated indirectly, for example in terms of classification performance. In this study we first present a tool to generate perceived similarity ground-truth that enables direct evaluation of endomicroscopic video retrieval. This tool uses a four-points Likert scale and collects subjective pairwise similarities perceived by multiple expert observers. We then evaluate against the generated ground-truth a previously developed dense bag-of-visual-words method for endomicroscopic video retrieval. Confirming the results of previous indirect evaluation based on classification, our direct evaluation shows that this method significantly outperforms several other state-of-the-art CBR methods. In a second step, we propose to improve the CBR method by learning an adjusted similarity metric from the perceived similarity ground-truth. By minimizing a margin-based cost function that differentiates similar and dissimilar video pairs, we learn a weight vector applied to the visual word signatures of videos. Using cross-validation, we demonstrate that the learned similarity distance is significantly better correlated with the perceived similarity than the original visual-word-based distance.

  9. Proofreading using an assistive software homophone tool: compensatory and remedial effects on the literacy skills of students with reading difficulties.

    PubMed

    Lange, Alissa A; Mulhern, Gerry; Wylie, Judith

    2009-01-01

    The present study investigated the effects of using an assistive software homophone tool on the assisted proofreading performance and unassisted basic skills of secondary-level students with reading difficulties. Students aged 13 to 15 years proofread passages for homophonic errors under three conditions: with the homophone tool, with homophones highlighted only, or with no help. The group using the homophone tool significantly outperformed the other two groups on assisted proofreading and outperformed the others on unassisted spelling, although not significantly. Remedial (unassisted) improvements in automaticity of word recognition, homophone proofreading, and basic reading were found over all groups. Results elucidate the differential contributions of each function of the homophone tool and suggest that with the proper training, assistive software can help not only students with diagnosed disabilities but also those with generally weak reading skills.

  10. A Novel Adjustment Method for Shearer Traction Speed through Integration of T-S Cloud Inference Network and Improved PSO

    PubMed Central

    Si, Lei; Wang, Zhongbin; Yang, Yinwei

    2014-01-01

    In order to efficiently and accurately adjust the shearer traction speed, a novel approach based on Takagi-Sugeno (T-S) cloud inference network (CIN) and improved particle swarm optimization (IPSO) is proposed. The T-S CIN is built through the combination of cloud model and T-S fuzzy neural network. Moreover, the IPSO algorithm employs parameter automation adjustment strategy and velocity resetting to significantly improve the performance of basic PSO algorithm in global search and fine-tuning of the solutions, and the flowchart of proposed approach is designed. Furthermore, some simulation examples are carried out and comparison results indicate that the proposed method is feasible, efficient, and is outperforming others. Finally, an industrial application example of coal mining face is demonstrated to specify the effect of proposed system. PMID:25506358

  11. Information filtering via preferential diffusion.

    PubMed

    Lü, Linyuan; Liu, Weiping

    2011-06-01

    Recommender systems have shown great potential in addressing the information overload problem, namely helping users in finding interesting and relevant objects within a huge information space. Some physical dynamics, including the heat conduction process and mass or energy diffusion on networks, have recently found applications in personalized recommendation. Most of the previous studies focus overwhelmingly on recommendation accuracy as the only important factor, while overlooking the significance of diversity and novelty that indeed provide the vitality of the system. In this paper, we propose a recommendation algorithm based on the preferential diffusion process on a user-object bipartite network. Numerical analyses on two benchmark data sets, MovieLens and Netflix, indicate that our method outperforms the state-of-the-art methods. Specifically, it can not only provide more accurate recommendations, but also generate more diverse and novel recommendations by accurately recommending unpopular objects.

  12. Information filtering via preferential diffusion

    NASA Astrophysics Data System (ADS)

    Lü, Linyuan; Liu, Weiping

    2011-06-01

    Recommender systems have shown great potential in addressing the information overload problem, namely helping users in finding interesting and relevant objects within a huge information space. Some physical dynamics, including the heat conduction process and mass or energy diffusion on networks, have recently found applications in personalized recommendation. Most of the previous studies focus overwhelmingly on recommendation accuracy as the only important factor, while overlooking the significance of diversity and novelty that indeed provide the vitality of the system. In this paper, we propose a recommendation algorithm based on the preferential diffusion process on a user-object bipartite network. Numerical analyses on two benchmark data sets, MovieLens and Netflix, indicate that our method outperforms the state-of-the-art methods. Specifically, it can not only provide more accurate recommendations, but also generate more diverse and novel recommendations by accurately recommending unpopular objects.

  13. Updated logistic regression equations for the calculation of post-fire debris-flow likelihood in the western United States

    USGS Publications Warehouse

    Staley, Dennis M.; Negri, Jacquelyn A.; Kean, Jason W.; Laber, Jayme L.; Tillery, Anne C.; Youberg, Ann M.

    2016-06-30

    Wildfire can significantly alter the hydrologic response of a watershed to the extent that even modest rainstorms can generate dangerous flash floods and debris flows. To reduce public exposure to hazard, the U.S. Geological Survey produces post-fire debris-flow hazard assessments for select fires in the western United States. We use publicly available geospatial data describing basin morphology, burn severity, soil properties, and rainfall characteristics to estimate the statistical likelihood that debris flows will occur in response to a storm of a given rainfall intensity. Using an empirical database and refined geospatial analysis methods, we defined new equations for the prediction of debris-flow likelihood using logistic regression methods. We showed that the new logistic regression model outperformed previous models used to predict debris-flow likelihood.

  14. Good initialization model with constrained body structure for scene text recognition

    NASA Astrophysics Data System (ADS)

    Zhu, Anna; Wang, Guoyou; Dong, Yangbo

    2016-09-01

    Scene text recognition has gained significant attention in the computer vision community. Character detection and recognition are the promise of text recognition and affect the overall performance to a large extent. We proposed a good initialization model for scene character recognition from cropped text regions. We use constrained character's body structures with deformable part-based models to detect and recognize characters in various backgrounds. The character's body structures are achieved by an unsupervised discriminative clustering approach followed by a statistical model and a self-build minimum spanning tree model. Our method utilizes part appearance and location information, and combines character detection and recognition in cropped text region together. The evaluation results on the benchmark datasets demonstrate that our proposed scheme outperforms the state-of-the-art methods both on scene character recognition and word recognition aspects.

  15. Decentralized stabilization of semi-active vibrating structures

    NASA Astrophysics Data System (ADS)

    Pisarski, Dominik

    2018-02-01

    A novel method of decentralized structural vibration control is presented. The control is assumed to be realized by a semi-active device. The objective is to stabilize a vibrating system with the optimal rates of decrease of the energy. The controller relies on an easily implemented decentralized switched state-feedback control law. It uses a set of communication channels to exchange the state information between the neighboring subcontrollers. The performance of the designed method is validated by means of numerical experiments performed for a double cantilever system equipped with a set of elastomers with controlled viscoelastic properties. In terms of the assumed objectives, the proposed control strategy significantly outperforms the passive damping cases and is competitive with a standard centralized control. The presented methodology can be applied to a class of bilinear control systems concerned with smart structural elements.

  16. Segmentation of Vasculature from Fluorescently Labeled Endothelial Cells in Multi-Photon Microscopy Images.

    PubMed

    Bates, Russell; Irving, Benjamin; Markelc, Bostjan; Kaeppler, Jakob; Brown, Graham; Muschel, Ruth J; Brady, Sir Michael; Grau, Vicente; Schnabel, Julia A

    2017-08-09

    Vasculature is known to be of key biological significance, especially in the study of tumors. As such, considerable effort has been focused on the automated segmentation of vasculature in medical and pre-clinical images. The majority of vascular segmentation methods focus on bloodpool labeling methods, however, particularly in the study of tumors it is of particular interest to be able to visualize both perfused and non-perfused vasculature. Imaging vasculature by highlighting the endothelium provides a way to separate the morphology of vasculature from the potentially confounding factor of perfusion. Here we present a method for the segmentation of tumor vasculature in 3D fluorescence microscopy images using signals from the endothelial and surrounding cells. We show that our method can provide complete and semantically meaningful segmentations of complex vasculature using a supervoxel-Markov Random Field approach. We show that in terms of extracting meaningful segmentations of the vasculature, our method out-performs both a state-ofthe- art method, specific to these data, as well as more classical vasculature segmentation methods.

  17. RNABindRPlus: a predictor that combines machine learning and sequence homology-based methods to improve the reliability of predicted RNA-binding residues in proteins.

    PubMed

    Walia, Rasna R; Xue, Li C; Wilkins, Katherine; El-Manzalawy, Yasser; Dobbs, Drena; Honavar, Vasant

    2014-01-01

    Protein-RNA interactions are central to essential cellular processes such as protein synthesis and regulation of gene expression and play roles in human infectious and genetic diseases. Reliable identification of protein-RNA interfaces is critical for understanding the structural bases and functional implications of such interactions and for developing effective approaches to rational drug design. Sequence-based computational methods offer a viable, cost-effective way to identify putative RNA-binding residues in RNA-binding proteins. Here we report two novel approaches: (i) HomPRIP, a sequence homology-based method for predicting RNA-binding sites in proteins; (ii) RNABindRPlus, a new method that combines predictions from HomPRIP with those from an optimized Support Vector Machine (SVM) classifier trained on a benchmark dataset of 198 RNA-binding proteins. Although highly reliable, HomPRIP cannot make predictions for the unaligned parts of query proteins and its coverage is limited by the availability of close sequence homologs of the query protein with experimentally determined RNA-binding sites. RNABindRPlus overcomes these limitations. We compared the performance of HomPRIP and RNABindRPlus with that of several state-of-the-art predictors on two test sets, RB44 and RB111. On a subset of proteins for which homologs with experimentally determined interfaces could be reliably identified, HomPRIP outperformed all other methods achieving an MCC of 0.63 on RB44 and 0.83 on RB111. RNABindRPlus was able to predict RNA-binding residues of all proteins in both test sets, achieving an MCC of 0.55 and 0.37, respectively, and outperforming all other methods, including those that make use of structure-derived features of proteins. More importantly, RNABindRPlus outperforms all other methods for any choice of tradeoff between precision and recall. An important advantage of both HomPRIP and RNABindRPlus is that they rely on readily available sequence and sequence-derived features of RNA-binding proteins. A webserver implementation of both methods is freely available at http://einstein.cs.iastate.edu/RNABindRPlus/.

  18. A new method for enhancer prediction based on deep belief network.

    PubMed

    Bu, Hongda; Gan, Yanglan; Wang, Yang; Zhou, Shuigeng; Guan, Jihong

    2017-10-16

    Studies have shown that enhancers are significant regulatory elements to play crucial roles in gene expression regulation. Since enhancers are unrelated to the orientation and distance to their target genes, it is a challenging mission for scholars and researchers to accurately predicting distal enhancers. In the past years, with the high-throughout ChiP-seq technologies development, several computational techniques emerge to predict enhancers using epigenetic or genomic features. Nevertheless, the inconsistency of computational models across different cell-lines and the unsatisfactory prediction performance call for further research in this area. Here, we propose a new Deep Belief Network (DBN) based computational method for enhancer prediction, which is called EnhancerDBN. This method combines diverse features, composed of DNA sequence compositional features, DNA methylation and histone modifications. Our computational results indicate that 1) EnhancerDBN outperforms 13 existing methods in prediction, and 2) GC content and DNA methylation can serve as relevant features for enhancer prediction. Deep learning is effective in boosting the performance of enhancer prediction.

  19. FBC: a flat binary code scheme for fast Manhattan hash retrieval

    NASA Astrophysics Data System (ADS)

    Kong, Yan; Wu, Fuzhang; Gao, Lifa; Wu, Yanjun

    2018-04-01

    Hash coding is a widely used technique in approximate nearest neighbor (ANN) search, especially in document search and multimedia (such as image and video) retrieval. Based on the difference of distance measurement, hash methods are generally classified into two categories: Hamming hashing and Manhattan hashing. Benefitting from better neighborhood structure preservation, Manhattan hashing methods outperform earlier methods in search effectiveness. However, due to using decimal arithmetic operations instead of bit operations, Manhattan hashing becomes a more time-consuming process, which significantly decreases the whole search efficiency. To solve this problem, we present an intuitive hash scheme which uses Flat Binary Code (FBC) to encode the data points. As a result, the decimal arithmetic used in previous Manhattan hashing can be replaced by more efficient XOR operator. The final experiments show that with a reasonable memory space growth, our FBC speeds up more than 80% averagely without any search accuracy loss when comparing to the state-of-art Manhattan hashing methods.

  20. A Multifeatures Fusion and Discrete Firefly Optimization Method for Prediction of Protein Tyrosine Sulfation Residues.

    PubMed

    Guo, Song; Liu, Chunhua; Zhou, Peng; Li, Yanling

    2016-01-01

    Tyrosine sulfation is one of the ubiquitous protein posttranslational modifications, where some sulfate groups are added to the tyrosine residues. It plays significant roles in various physiological processes in eukaryotic cells. To explore the molecular mechanism of tyrosine sulfation, one of the prerequisites is to correctly identify possible protein tyrosine sulfation residues. In this paper, a novel method was presented to predict protein tyrosine sulfation residues from primary sequences. By means of informative feature construction and elaborate feature selection and parameter optimization scheme, the proposed predictor achieved promising results and outperformed many other state-of-the-art predictors. Using the optimal features subset, the proposed method achieved mean MCC of 94.41% on the benchmark dataset, and a MCC of 90.09% on the independent dataset. The experimental performance indicated that our new proposed method could be effective in identifying the important protein posttranslational modifications and the feature selection scheme would be powerful in protein functional residues prediction research fields.

  1. A Multifeatures Fusion and Discrete Firefly Optimization Method for Prediction of Protein Tyrosine Sulfation Residues

    PubMed Central

    Liu, Chunhua; Zhou, Peng; Li, Yanling

    2016-01-01

    Tyrosine sulfation is one of the ubiquitous protein posttranslational modifications, where some sulfate groups are added to the tyrosine residues. It plays significant roles in various physiological processes in eukaryotic cells. To explore the molecular mechanism of tyrosine sulfation, one of the prerequisites is to correctly identify possible protein tyrosine sulfation residues. In this paper, a novel method was presented to predict protein tyrosine sulfation residues from primary sequences. By means of informative feature construction and elaborate feature selection and parameter optimization scheme, the proposed predictor achieved promising results and outperformed many other state-of-the-art predictors. Using the optimal features subset, the proposed method achieved mean MCC of 94.41% on the benchmark dataset, and a MCC of 90.09% on the independent dataset. The experimental performance indicated that our new proposed method could be effective in identifying the important protein posttranslational modifications and the feature selection scheme would be powerful in protein functional residues prediction research fields. PMID:27034949

  2. Multidimensional upwind hydrodynamics on unstructured meshes using graphics processing units - I. Two-dimensional uniform meshes

    NASA Astrophysics Data System (ADS)

    Paardekooper, S.-J.

    2017-08-01

    We present a new method for numerical hydrodynamics which uses a multidimensional generalization of the Roe solver and operates on an unstructured triangular mesh. The main advantage over traditional methods based on Riemann solvers, which commonly use one-dimensional flux estimates as building blocks for a multidimensional integration, is its inherently multidimensional nature, and as a consequence its ability to recognize multidimensional stationary states that are not hydrostatic. A second novelty is the focus on graphics processing units (GPUs). By tailoring the algorithms specifically to GPUs, we are able to get speedups of 100-250 compared to a desktop machine. We compare the multidimensional upwind scheme to a traditional, dimensionally split implementation of the Roe solver on several test problems, and we find that the new method significantly outperforms the Roe solver in almost all cases. This comes with increased computational costs per time-step, which makes the new method approximately a factor of 2 slower than a dimensionally split scheme acting on a structured grid.

  3. RaptorX-Angle: real-value prediction of protein backbone dihedral angles through a hybrid method of clustering and deep learning.

    PubMed

    Gao, Yujuan; Wang, Sheng; Deng, Minghua; Xu, Jinbo

    2018-05-08

    Protein dihedral angles provide a detailed description of protein local conformation. Predicted dihedral angles can be used to narrow down the conformational space of the whole polypeptide chain significantly, thus aiding protein tertiary structure prediction. However, direct angle prediction from sequence alone is challenging. In this article, we present a novel method (named RaptorX-Angle) to predict real-valued angles by combining clustering and deep learning. Tested on a subset of PDB25 and the targets in the latest two Critical Assessment of protein Structure Prediction (CASP), our method outperforms the existing state-of-art method SPIDER2 in terms of Pearson Correlation Coefficient (PCC) and Mean Absolute Error (MAE). Our result also shows approximately linear relationship between the real prediction errors and our estimated bounds. That is, the real prediction error can be well approximated by our estimated bounds. Our study provides an alternative and more accurate prediction of dihedral angles, which may facilitate protein structure prediction and functional study.

  4. Distributed Adaptive Binary Quantization for Fast Nearest Neighbor Search.

    PubMed

    Xianglong Liu; Zhujin Li; Cheng Deng; Dacheng Tao

    2017-11-01

    Hashing has been proved an attractive technique for fast nearest neighbor search over big data. Compared with the projection based hashing methods, prototype-based ones own stronger power to generate discriminative binary codes for the data with complex intrinsic structure. However, existing prototype-based methods, such as spherical hashing and K-means hashing, still suffer from the ineffective coding that utilizes the complete binary codes in a hypercube. To address this problem, we propose an adaptive binary quantization (ABQ) method that learns a discriminative hash function with prototypes associated with small unique binary codes. Our alternating optimization adaptively discovers the prototype set and the code set of a varying size in an efficient way, which together robustly approximate the data relations. Our method can be naturally generalized to the product space for long hash codes, and enjoys the fast training linear to the number of the training data. We further devise a distributed framework for the large-scale learning, which can significantly speed up the training of ABQ in the distributed environment that has been widely deployed in many areas nowadays. The extensive experiments on four large-scale (up to 80 million) data sets demonstrate that our method significantly outperforms state-of-the-art hashing methods, with up to 58.84% performance gains relatively.

  5. Multiagent scheduling method with earliness and tardiness objectives in flexible job shops.

    PubMed

    Wu, Zuobao; Weng, Michael X

    2005-04-01

    Flexible job-shop scheduling problems are an important extension of the classical job-shop scheduling problems and present additional complexity. Such problems are mainly due to the existence of a considerable amount of overlapping capacities with modern machines. Classical scheduling methods are generally incapable of addressing such capacity overlapping. We propose a multiagent scheduling method with job earliness and tardiness objectives in a flexible job-shop environment. The earliness and tardiness objectives are consistent with the just-in-time production philosophy which has attracted significant attention in both industry and academic community. A new job-routing and sequencing mechanism is proposed. In this mechanism, two kinds of jobs are defined to distinguish jobs with one operation left from jobs with more than one operation left. Different criteria are proposed to route these two kinds of jobs. Job sequencing enables to hold a job that may be completed too early. Two heuristic algorithms for job sequencing are developed to deal with these two kinds of jobs. The computational experiments show that the proposed multiagent scheduling method significantly outperforms the existing scheduling methods in the literature. In addition, the proposed method is quite fast. In fact, the simulation time to find a complete schedule with over 2000 jobs on ten machines is less than 1.5 min.

  6. Influences of gender and socioeconomic status on the motor proficiency of children in the UK.

    PubMed

    Morley, David; Till, Kevin; Ogilvie, Paul; Turner, Graham

    2015-12-01

    As the development of movement skills are so crucial to a child's involvement in lifelong physical activity and sport, the purpose of this study was to assess the motor proficiency of children aged 4-7 years (range=4.3-7.2 years), whilst considering gender and socioeconomic status. 369 children (176 females, 193 males, aged=5.96 ± 0.57 years) were assessed for fine motor precision, fine motor integration, manual dexterity, bilateral co-ordination, balance, speed and agility, upper-limb co-ordination and strength. The average standard score for all participants was 44.4 ± 8.9, classifying the participants towards the lower end of the average score. Multivariate analysis of covariance identified significant effects for gender (p<0.001) and socioeconomic status (p<0.001). Females outperformed males for fine motor skills and boys outperformed girls for catch and dribble gross motor skills. High socioeconomic status significantly outperformed middle and/or low socioeconomic status for total, fine and gross motor proficiency. Current motor proficiency of primary children aged 4-7 years in the UK is just below average with differences evident between gender and socioeconomic status. Teachers and sport coaches working with primary aged children should concentrate on the development of movement skills, whilst considering differences between genders and socioeconomic status. Crown Copyright © 2015. Published by Elsevier B.V. All rights reserved.

  7. Non-linear eigensolver-based alternative to traditional SCF methods

    NASA Astrophysics Data System (ADS)

    Gavin, B.; Polizzi, E.

    2013-05-01

    The self-consistent procedure in electronic structure calculations is revisited using a highly efficient and robust algorithm for solving the non-linear eigenvector problem, i.e., H({ψ})ψ = Eψ. This new scheme is derived from a generalization of the FEAST eigenvalue algorithm to account for the non-linearity of the Hamiltonian with the occupied eigenvectors. Using a series of numerical examples and the density functional theory-Kohn/Sham model, it will be shown that our approach can outperform the traditional SCF mixing-scheme techniques by providing a higher converge rate, convergence to the correct solution regardless of the choice of the initial guess, and a significant reduction of the eigenvalue solve time in simulations.

  8. Integrative Analysis of Prognosis Data on Multiple Cancer Subtypes

    PubMed Central

    Liu, Jin; Huang, Jian; Zhang, Yawei; Lan, Qing; Rothman, Nathaniel; Zheng, Tongzhang; Ma, Shuangge

    2014-01-01

    Summary In cancer research, profiling studies have been extensively conducted, searching for genes/SNPs associated with prognosis. Cancer is diverse. Examining the similarity and difference in the genetic basis of multiple subtypes of the same cancer can lead to a better understanding of their connections and distinctions. Classic meta-analysis methods analyze each subtype separately and then compare analysis results across subtypes. Integrative analysis methods, in contrast, analyze the raw data on multiple subtypes simultaneously and can outperform meta-analysis methods. In this study, prognosis data on multiple subtypes of the same cancer are analyzed. An AFT (accelerated failure time) model is adopted to describe survival. The genetic basis of multiple subtypes is described using the heterogeneity model, which allows a gene/SNP to be associated with prognosis of some subtypes but not others. A compound penalization method is developed to identify genes that contain important SNPs associated with prognosis. The proposed method has an intuitive formulation and is realized using an iterative algorithm. Asymptotic properties are rigorously established. Simulation shows that the proposed method has satisfactory performance and outperforms a penalization-based meta-analysis method and a regularized thresholding method. An NHL (non-Hodgkin lymphoma) prognosis study with SNP measurements is analyzed. Genes associated with the three major subtypes, namely DLBCL, FL, and CLL/SLL, are identified. The proposed method identifies genes that are different from alternatives and have important implications and satisfactory prediction performance. PMID:24766212

  9. JPEG2000 vs. full frame wavelet packet compression for smart card medical records.

    PubMed

    Leehan, Joaquín Azpirox; Lerallut, Jean-Francois

    2006-01-01

    This paper describes a comparison among different compression methods to be used in the context of electronic health records in the newer version of "smart cards". The JPEG2000 standard is compared to a full-frame wavelet packet compression method at high (33:1 and 50:1) compression rates. Results show that the full-frame method outperforms the JPEG2K standard qualitatively and quantitatively.

  10. Performance of toxicity probability interval based designs in contrast to the continual reassessment method

    PubMed Central

    Horton, Bethany Jablonski; Wages, Nolan A.; Conaway, Mark R.

    2016-01-01

    Toxicity probability interval designs have received increasing attention as a dose-finding method in recent years. In this study, we compared the two-stage, likelihood-based continual reassessment method (CRM), modified toxicity probability interval (mTPI), and the Bayesian optimal interval design (BOIN) in order to evaluate each method's performance in dose selection for Phase I trials. We use several summary measures to compare the performance of these methods, including percentage of correct selection (PCS) of the true maximum tolerable dose (MTD), allocation of patients to doses at and around the true MTD, and an accuracy index. This index is an efficiency measure that describes the entire distribution of MTD selection and patient allocation by taking into account the distance between the true probability of toxicity at each dose level and the target toxicity rate. The simulation study considered a broad range of toxicity curves and various sample sizes. When considering PCS, we found that CRM outperformed the two competing methods in most scenarios, followed by BOIN, then mTPI. We observed a similar trend when considering the accuracy index for dose allocation, where CRM most often outperformed both the mTPI and BOIN. These trends were more pronounced with increasing number of dose levels. PMID:27435150

  11. Comparison of DNA preservation methods for environmental bacterial community samples.

    PubMed

    Gray, Michael A; Pratte, Zoe A; Kellogg, Christina A

    2013-02-01

    Field collections of environmental samples, for example corals, for molecular microbial analyses present distinct challenges. The lack of laboratory facilities in remote locations is common, and preservation of microbial community DNA for later study is critical. A particular challenge is keeping samples frozen in transit. Five nucleic acid preservation methods that do not require cold storage were compared for effectiveness over time and ease of use. Mixed microbial communities of known composition were created and preserved by DNAgard(™), RNAlater(®), DMSO-EDTA-salt (DESS), FTA(®) cards, and FTA Elute(®) cards. Automated ribosomal intergenic spacer analysis and clone libraries were used to detect specific changes in the faux communities over weeks and months of storage. A previously known bias in FTA(®) cards that results in lower recovery of pure cultures of Gram-positive bacteria was also detected in mixed community samples. There appears to be a uniform bias across all five preservation methods against microorganisms with high G + C DNA. Overall, the liquid-based preservatives (DNAgard(™), RNAlater(®), and DESS) outperformed the card-based methods. No single liquid method clearly outperformed the others, leaving method choice to be based on experimental design, field facilities, shipping constraints, and allowable cost. © 2012 Federation of European Microbiological Societies. Published by Blackwell Publishing Ltd. All rights reserved.

  12. A New Online Calibration Method Based on Lord's Bias-Correction.

    PubMed

    He, Yinhong; Chen, Ping; Li, Yong; Zhang, Shumei

    2017-09-01

    Online calibration technique has been widely employed to calibrate new items due to its advantages. Method A is the simplest online calibration method and has attracted many attentions from researchers recently. However, a key assumption of Method A is that it treats person-parameter estimates θ ^ s (obtained by maximum likelihood estimation [MLE]) as their true values θ s , thus the deviation of the estimated θ ^ s from their true values might yield inaccurate item calibration when the deviation is nonignorable. To improve the performance of Method A, a new method, MLE-LBCI-Method A, is proposed. This new method combines a modified Lord's bias-correction method (named as maximum likelihood estimation-Lord's bias-correction with iteration [MLE-LBCI]) with the original Method A in an effort to correct the deviation of θ ^ s which may adversely affect the item calibration precision. Two simulation studies were carried out to explore the performance of both MLE-LBCI and MLE-LBCI-Method A under several scenarios. Simulation results showed that MLE-LBCI could make a significant improvement over the ML ability estimates, and MLE-LBCI-Method A did outperform Method A in almost all experimental conditions.

  13. Particle Swarm Optimization of Low-Thrust, Geocentric-to-Halo-Orbit Transfers

    NASA Astrophysics Data System (ADS)

    Abraham, Andrew J.

    Missions to Lagrange points are becoming increasingly popular amongst spacecraft mission planners. Lagrange points are locations in space where the gravity force from two bodies, and the centrifugal force acting on a third body, cancel. To date, all spacecraft that have visited a Lagrange point have done so using high-thrust, chemical propulsion. Due to the increasing availability of low-thrust (high efficiency) propulsive devices, and their increasing capability in terms of fuel efficiency and instantaneous thrust, it has now become possible for a spacecraft to reach a Lagrange point orbit without the aid of chemical propellant. While at any given time there are many paths for a low-thrust trajectory to take, only one is optimal. The traditional approach to spacecraft trajectory optimization utilizes some form of gradient-based algorithm. While these algorithms offer numerous advantages, they also have a few significant shortcomings. The three most significant shortcomings are: (1) the fact that an initial guess solution is required to initialize the algorithm, (2) the radius of convergence can be quite small and can allow the algorithm to become trapped in local minima, and (3) gradient information is not always assessable nor always trustworthy for a given problem. To avoid these problems, this dissertation is focused on optimizing a low-thrust transfer trajectory from a geocentric orbit to an Earth-Moon, L1, Lagrange point orbit using the method of Particle Swarm Optimization (PSO). The PSO method is an evolutionary heuristic that was originally written to model birds swarming to locate hidden food sources. This PSO method will enable the exploration of the invariant stable manifold of the target Lagrange point orbit in an effort to optimize the spacecraft's low-thrust trajectory. Examples of these optimized trajectories are presented and contrasted with those found using traditional, gradient-based approaches. In summary, the results of this dissertation find that the PSO method does, indeed, successfully optimize the low-thrust trajectory transfer problem without the need for initial guessing. Furthermore, a two-degree-of-freedom PSO problem formulation significantly outperformed a one-degree-of-freedom formulation by at least an order of magnitude, in terms of CPU time. Finally, the PSO method is also used to solve a traditional, two-burn, impulsive transfer to a Lagrange point orbit using a hybrid optimization algorithm that incorporates a gradient-based shooting algorithm as a pre-optimizer. Surprisingly, the results of this study show that "fast" transfers outperform "slow" transfers in terms of both Deltav and time of flight.

  14. SANDPUMA: ensemble predictions of nonribosomal peptide chemistry reveal biosynthetic diversity across Actinobacteria.

    PubMed

    Chevrette, Marc G; Aicheler, Fabian; Kohlbacher, Oliver; Currie, Cameron R; Medema, Marnix H

    2017-10-15

    Nonribosomally synthesized peptides (NRPs) are natural products with widespread applications in medicine and biotechnology. Many algorithms have been developed to predict the substrate specificities of nonribosomal peptide synthetase adenylation (A) domains from DNA sequences, which enables prioritization and dereplication, and integration with other data types in discovery efforts. However, insufficient training data and a lack of clarity regarding prediction quality have impeded optimal use. Here, we introduce prediCAT, a new phylogenetics-inspired algorithm, which quantitatively estimates the degree of predictability of each A-domain. We then systematically benchmarked all algorithms on a newly gathered, independent test set of 434 A-domain sequences, showing that active-site-motif-based algorithms outperform whole-domain-based methods. Subsequently, we developed SANDPUMA, a powerful ensemble algorithm, based on newly trained versions of all high-performing algorithms, which significantly outperforms individual methods. Finally, we deployed SANDPUMA in a systematic investigation of 7635 Actinobacteria genomes, suggesting that NRP chemical diversity is much higher than previously estimated. SANDPUMA has been integrated into the widely used antiSMASH biosynthetic gene cluster analysis pipeline and is also available as an open-source, standalone tool. SANDPUMA is freely available at https://bitbucket.org/chevrm/sandpuma and as a docker image at https://hub.docker.com/r/chevrm/sandpuma/ under the GNU Public License 3 (GPL3). chevrette@wisc.edu or marnix.medema@wur.nl. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com

  15. Invariant 2D object recognition using the wavelet transform and structured neural networks

    NASA Astrophysics Data System (ADS)

    Khalil, Mahmoud I.; Bayoumi, Mohamed M.

    1999-03-01

    This paper applies the dyadic wavelet transform and the structured neural networks approach to recognize 2D objects under translation, rotation, and scale transformation. Experimental results are presented and compared with traditional methods. The experimental results showed that this refined technique successfully classified the objects and outperformed some traditional methods especially in the presence of noise.

  16. Automatic Keyword Identification by Artificial Neural Networks Compared to Manual Identification by Users of Filtering Systems.

    ERIC Educational Resources Information Center

    Boger, Zvi; Kuflik, Tsvi; Shoval, Peretz; Shapira, Bracha

    2001-01-01

    Discussion of information filtering (IF) and information retrieval focuses on the use of an artificial neural network (ANN) as an alternative method for both IF and term selection and compares its effectiveness to that of traditional methods. Results show that the ANN relevance prediction out-performs the prediction of an IF system. (Author/LRW)

  17. Measuring Link-Resolver Success: Comparing 360 Link with a Local Implementation of WebBridge

    ERIC Educational Resources Information Center

    Herrera, Gail

    2011-01-01

    This study reviewed link resolver success comparing 360 Link and a local implementation of WebBridge. Two methods were used: (1) comparing article-level access and (2) examining technical issues for 384 randomly sampled OpenURLs. Google Analytics was used to collect user-generated OpenURLs. For both methods, 360 Link out-performed the local…

  18. Autonomous rock detection on mars through region contrast

    NASA Astrophysics Data System (ADS)

    Xiao, Xueming; Cui, Hutao; Yao, Meibao; Tian, Yang

    2017-08-01

    In this paper, we present a new autonomous rock detection approach through region contrast. Unlike current state-of-art pixel-level rock segmenting methods, new method deals with this issue in region level, which will significantly reduce the computational cost. Image is firstly splitted into homogeneous regions based on intensity information and spatial layout. Considering the high-water memory constraints of onboard flight processor, only low-level features, average intensity and variation of superpixel, are measured. Region contrast is derived as the integration of intensity contrast and smoothness measurement. Rocks are then segmented from the resulting contrast map by an adaptive threshold. Since the merely intensity-based method may cause false detection in background areas with different illuminations from surroundings, a more reliable method is further proposed by introducing spatial factor and background similarity to the region contrast. Spatial factor demonstrates the locality of contrast, while background similarity calculates the probability of each subregion belonging to background. Our method is efficient in dealing with large images and only few parameters are needed. Preliminary experimental results show that our algorithm outperforms edge-based methods in various grayscale rover images.

  19. Electroencephalogram-based decoding cognitive states using convolutional neural network and likelihood ratio based score fusion.

    PubMed

    Zafar, Raheel; Dass, Sarat C; Malik, Aamir Saeed

    2017-01-01

    Electroencephalogram (EEG)-based decoding human brain activity is challenging, owing to the low spatial resolution of EEG. However, EEG is an important technique, especially for brain-computer interface applications. In this study, a novel algorithm is proposed to decode brain activity associated with different types of images. In this hybrid algorithm, convolutional neural network is modified for the extraction of features, a t-test is used for the selection of significant features and likelihood ratio-based score fusion is used for the prediction of brain activity. The proposed algorithm takes input data from multichannel EEG time-series, which is also known as multivariate pattern analysis. Comprehensive analysis was conducted using data from 30 participants. The results from the proposed method are compared with current recognized feature extraction and classification/prediction techniques. The wavelet transform-support vector machine method is the most popular currently used feature extraction and prediction method. This method showed an accuracy of 65.7%. However, the proposed method predicts the novel data with improved accuracy of 79.9%. In conclusion, the proposed algorithm outperformed the current feature extraction and prediction method.

  20. Exact and Metaheuristic Approaches for a Bi-Objective School Bus Scheduling Problem.

    PubMed

    Chen, Xiaopan; Kong, Yunfeng; Dang, Lanxue; Hou, Yane; Ye, Xinyue

    2015-01-01

    As a class of hard combinatorial optimization problems, the school bus routing problem has received considerable attention in the last decades. For a multi-school system, given the bus trips for each school, the school bus scheduling problem aims at optimizing bus schedules to serve all the trips within the school time windows. In this paper, we propose two approaches for solving the bi-objective school bus scheduling problem: an exact method of mixed integer programming (MIP) and a metaheuristic method which combines simulated annealing with local search. We develop MIP formulations for homogenous and heterogeneous fleet problems respectively and solve the models by MIP solver CPLEX. The bus type-based formulation for heterogeneous fleet problem reduces the model complexity in terms of the number of decision variables and constraints. The metaheuristic method is a two-stage framework for minimizing the number of buses to be used as well as the total travel distance of buses. We evaluate the proposed MIP and the metaheuristic method on two benchmark datasets, showing that on both instances, our metaheuristic method significantly outperforms the respective state-of-the-art methods.

  1. A Heterogeneous Network Based Method for Identifying GBM-Related Genes by Integrating Multi-Dimensional Data.

    PubMed

    Chen Peng; Ao Li

    2017-01-01

    The emergence of multi-dimensional data offers opportunities for more comprehensive analysis of the molecular characteristics of human diseases and therefore improving diagnosis, treatment, and prevention. In this study, we proposed a heterogeneous network based method by integrating multi-dimensional data (HNMD) to identify GBM-related genes. The novelty of the method lies in that the multi-dimensional data of GBM from TCGA dataset that provide comprehensive information of genes, are combined with protein-protein interactions to construct a weighted heterogeneous network, which reflects both the general and disease-specific relationships between genes. In addition, a propagation algorithm with resistance is introduced to precisely score and rank GBM-related genes. The results of comprehensive performance evaluation show that the proposed method significantly outperforms the network based methods with single-dimensional data and other existing approaches. Subsequent analysis of the top ranked genes suggests they may be functionally implicated in GBM, which further corroborates the superiority of the proposed method. The source code and the results of HNMD can be downloaded from the following URL: http://bioinformatics.ustc.edu.cn/hnmd/ .

  2. Anderson acceleration of the Jacobi iterative method: An efficient alternative to Krylov methods for large, sparse linear systems

    DOE PAGES

    Pratapa, Phanisri P.; Suryanarayana, Phanish; Pask, John E.

    2015-12-01

    We employ Anderson extrapolation to accelerate the classical Jacobi iterative method for large, sparse linear systems. Specifically, we utilize extrapolation at periodic intervals within the Jacobi iteration to develop the Alternating Anderson–Jacobi (AAJ) method. We verify the accuracy and efficacy of AAJ in a range of test cases, including nonsymmetric systems of equations. We demonstrate that AAJ possesses a favorable scaling with system size that is accompanied by a small prefactor, even in the absence of a preconditioner. In particular, we show that AAJ is able to accelerate the classical Jacobi iteration by over four orders of magnitude, with speed-upsmore » that increase as the system gets larger. Moreover, we find that AAJ significantly outperforms the Generalized Minimal Residual (GMRES) method in the range of problems considered here, with the relative performance again improving with size of the system. As a result, the proposed method represents a simple yet efficient technique that is particularly attractive for large-scale parallel solutions of linear systems of equations.« less

  3. Development of a two-stage gene selection method that incorporates a novel hybrid approach using the cuckoo optimization algorithm and harmony search for cancer classification.

    PubMed

    Elyasigomari, V; Lee, D A; Screen, H R C; Shaheed, M H

    2017-03-01

    For each cancer type, only a few genes are informative. Due to the so-called 'curse of dimensionality' problem, the gene selection task remains a challenge. To overcome this problem, we propose a two-stage gene selection method called MRMR-COA-HS. In the first stage, the minimum redundancy and maximum relevance (MRMR) feature selection is used to select a subset of relevant genes. The selected genes are then fed into a wrapper setup that combines a new algorithm, COA-HS, using the support vector machine as a classifier. The method was applied to four microarray datasets, and the performance was assessed by the leave one out cross-validation method. Comparative performance assessment of the proposed method with other evolutionary algorithms suggested that the proposed algorithm significantly outperforms other methods in selecting a fewer number of genes while maintaining the highest classification accuracy. The functions of the selected genes were further investigated, and it was confirmed that the selected genes are biologically relevant to each cancer type. Copyright © 2017. Published by Elsevier Inc.

  4. Vessel Enhancement and Segmentation of 4D CT Lung Image Using Stick Tensor Voting

    NASA Astrophysics Data System (ADS)

    Cong, Tan; Hao, Yang; Jingli, Shi; Xuan, Yang

    2016-12-01

    Vessel enhancement and segmentation plays a significant role in medical image analysis. This paper proposes a novel vessel enhancement and segmentation method for 4D CT lung image using stick tensor voting algorithm, which focuses on addressing the vessel distortion issue of vessel enhancement diffusion (VED) method. Furthermore, the enhanced results are easily segmented using level-set segmentation. In our method, firstly, vessels are filtered using Frangi's filter to reduce intrapulmonary noises and extract rough blood vessels. Secondly, stick tensor voting algorithm is employed to estimate the correct direction along the vessel. Then the estimated direction along the vessel is used as the anisotropic diffusion direction of vessel in VED algorithm, which makes the intensity diffusion of points locating at the vessel wall be consistent with the directions of vessels and enhance the tubular features of vessels. Finally, vessels can be extracted from the enhanced image by applying level-set segmentation method. A number of experiments results show that our method outperforms traditional VED method in vessel enhancement and results in satisfied segmented vessels.

  5. Anderson acceleration of the Jacobi iterative method: An efficient alternative to Krylov methods for large, sparse linear systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pratapa, Phanisri P.; Suryanarayana, Phanish; Pask, John E.

    We employ Anderson extrapolation to accelerate the classical Jacobi iterative method for large, sparse linear systems. Specifically, we utilize extrapolation at periodic intervals within the Jacobi iteration to develop the Alternating Anderson–Jacobi (AAJ) method. We verify the accuracy and efficacy of AAJ in a range of test cases, including nonsymmetric systems of equations. We demonstrate that AAJ possesses a favorable scaling with system size that is accompanied by a small prefactor, even in the absence of a preconditioner. In particular, we show that AAJ is able to accelerate the classical Jacobi iteration by over four orders of magnitude, with speed-upsmore » that increase as the system gets larger. Moreover, we find that AAJ significantly outperforms the Generalized Minimal Residual (GMRES) method in the range of problems considered here, with the relative performance again improving with size of the system. As a result, the proposed method represents a simple yet efficient technique that is particularly attractive for large-scale parallel solutions of linear systems of equations.« less

  6. The effect of explanations on mathematical reasoning tasks

    NASA Astrophysics Data System (ADS)

    Norqvist, Mathias

    2018-01-01

    Studies in mathematics education often point to the necessity for students to engage in more cognitively demanding activities than just solving tasks by applying given solution methods. Previous studies have shown that students that engage in creative mathematically founded reasoning to construct a solution method, perform significantly better in follow up tests than students that are given a solution method and engage in algorithmic reasoning. However, teachers and textbooks, at least occasionally, provide explanations together with an algorithmic method, and this could possibly be more efficient than creative reasoning. In this study, three matched groups practiced with either creative, algorithmic, or explained algorithmic tasks. The main finding was that students that practiced with creative tasks did, outperform the students that practiced with explained algorithmic tasks in a post-test, despite a much lower practice score. The two groups that got a solution method presented, performed similarly in both practice and post-test, even though one group got an explanation to the given solution method. Additionally, there were some differences between the groups in which variables predicted the post-test score.

  7. Detecting cis-regulatory binding sites for cooperatively binding proteins

    PubMed Central

    van Oeffelen, Liesbeth; Cornelis, Pierre; Van Delm, Wouter; De Ridder, Fedor; De Moor, Bart; Moreau, Yves

    2008-01-01

    Several methods are available to predict cis-regulatory modules in DNA based on position weight matrices. However, the performance of these methods generally depends on a number of additional parameters that cannot be derived from sequences and are difficult to estimate because they have no physical meaning. As the best way to detect cis-regulatory modules is the way in which the proteins recognize them, we developed a new scoring method that utilizes the underlying physical binding model. This method requires no additional parameter to account for multiple binding sites; and the only necessary parameters to model homotypic cooperative interactions are the distances between adjacent protein binding sites in basepairs, and the corresponding cooperative binding constants. The heterotypic cooperative binding model requires one more parameter per cooperatively binding protein, which is the concentration multiplied by the partition function of this protein. In a case study on the bacterial ferric uptake regulator, we show that our scoring method for homotypic cooperatively binding proteins significantly outperforms other PWM-based methods where biophysical cooperativity is not taken into account. PMID:18400778

  8. Quantitative comparison of alternative methods for coarse-graining biological networks

    PubMed Central

    Bowman, Gregory R.; Meng, Luming; Huang, Xuhui

    2013-01-01

    Markov models and master equations are a powerful means of modeling dynamic processes like protein conformational changes. However, these models are often difficult to understand because of the enormous number of components and connections between them. Therefore, a variety of methods have been developed to facilitate understanding by coarse-graining these complex models. Here, we employ Bayesian model comparison to determine which of these coarse-graining methods provides the models that are most faithful to the original set of states. We find that the Bayesian agglomerative clustering engine and the hierarchical Nyström expansion graph (HNEG) typically provide the best performance. Surprisingly, the original Perron cluster cluster analysis (PCCA) method often provides the next best results, outperforming the newer PCCA+ method and the most probable paths algorithm. We also show that the differences between the models are qualitatively significant, rather than being minor shifts in the boundaries between states. The performance of the methods correlates well with the entropy of the resulting coarse-grainings, suggesting that finding states with more similar populations (i.e., avoiding low population states that may just be noise) gives better results. PMID:24089717

  9. Novel non-parametric models to estimate evolutionary rates and divergence times from heterochronous sequence data.

    PubMed

    Fourment, Mathieu; Holmes, Edward C

    2014-07-24

    Early methods for estimating divergence times from gene sequence data relied on the assumption of a molecular clock. More sophisticated methods were created to model rate variation and used auto-correlation of rates, local clocks, or the so called "uncorrelated relaxed clock" where substitution rates are assumed to be drawn from a parametric distribution. In the case of Bayesian inference methods the impact of the prior on branching times is not clearly understood, and if the amount of data is limited the posterior could be strongly influenced by the prior. We develop a maximum likelihood method--Physher--that uses local or discrete clocks to estimate evolutionary rates and divergence times from heterochronous sequence data. Using two empirical data sets we show that our discrete clock estimates are similar to those obtained by other methods, and that Physher outperformed some methods in the estimation of the root age of an influenza virus data set. A simulation analysis suggests that Physher can outperform a Bayesian method when the real topology contains two long branches below the root node, even when evolution is strongly clock-like. These results suggest it is advisable to use a variety of methods to estimate evolutionary rates and divergence times from heterochronous sequence data. Physher and the associated data sets used here are available online at http://code.google.com/p/physher/.

  10. Prediction of nocturnal hypoglycemia by an aggregation of previously known prediction approaches: proof of concept for clinical application.

    PubMed

    Tkachenko, Pavlo; Kriukova, Galyna; Aleksandrova, Marharyta; Chertov, Oleg; Renard, Eric; Pereverzyev, Sergei V

    2016-10-01

    Nocturnal hypoglycemia (NH) is common in patients with insulin-treated diabetes. Despite the risk associated with NH, there are only a few methods aiming at the prediction of such events based on intermittent blood glucose monitoring data and none has been validated for clinical use. Here we propose a method of combining several predictors into a new one that will perform at the level of the best involved one, or even outperform all individual candidates. The idea of the method is to use a recently developed strategy for aggregating ranking algorithms. The method has been calibrated and tested on data extracted from clinical trials, performed in the European FP7-funded project DIAdvisor. Then we have tested the proposed approach on other datasets to show the portability of the method. This feature of the method allows its simple implementation in the form of a diabetic smartphone app. On the considered datasets the proposed approach exhibits good performance in terms of sensitivity, specificity and predictive values. Moreover, the resulting predictor automatically performs at the level of the best involved method or even outperforms it. We propose a strategy for a combination of NH predictors that leads to a method exhibiting a reliable performance and the potential for everyday use by any patient who performs self-monitoring of blood glucose. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  11. Improved image alignment method in application to X-ray images and biological images.

    PubMed

    Wang, Ching-Wei; Chen, Hsiang-Chou

    2013-08-01

    Alignment of medical images is a vital component of a large number of applications throughout the clinical track of events; not only within clinical diagnostic settings, but prominently so in the area of planning, consummation and evaluation of surgical and radiotherapeutical procedures. However, image registration of medical images is challenging because of variations on data appearance, imaging artifacts and complex data deformation problems. Hence, the aim of this study is to develop a robust image alignment method for medical images. An improved image registration method is proposed, and the method is evaluated with two types of medical data, including biological microscopic tissue images and dental X-ray images and compared with five state-of-the-art image registration techniques. The experimental results show that the presented method consistently performs well on both types of medical images, achieving 88.44 and 88.93% averaged registration accuracies for biological tissue images and X-ray images, respectively, and outperforms the benchmark methods. Based on the Tukey's honestly significant difference test and Fisher's least square difference test tests, the presented method performs significantly better than all existing methods (P ≤ 0.001) for tissue image alignment, and for the X-ray image registration, the proposed method performs significantly better than the two benchmark b-spline approaches (P < 0.001). The software implementation of the presented method and the data used in this study are made publicly available for scientific communities to use (http://www-o.ntust.edu.tw/∼cweiwang/ImprovedImageRegistration/). cweiwang@mail.ntust.edu.tw.

  12. Accurate diagnosis of thyroid follicular lesions from nuclear morphology using supervised learning.

    PubMed

    Ozolek, John A; Tosun, Akif Burak; Wang, Wei; Chen, Cheng; Kolouri, Soheil; Basu, Saurav; Huang, Hu; Rohde, Gustavo K

    2014-07-01

    Follicular lesions of the thyroid remain significant diagnostic challenges in surgical pathology and cytology. The diagnosis often requires considerable resources and ancillary tests including immunohistochemistry, molecular studies, and expert consultation. Visual analyses of nuclear morphological features, generally speaking, have not been helpful in distinguishing this group of lesions. Here we describe a method for distinguishing between follicular lesions of the thyroid based on nuclear morphology. The method utilizes an optimal transport-based linear embedding for segmented nuclei, together with an adaptation of existing classification methods. We show the method outputs assignments (classification results) which are near perfectly correlated with the clinical diagnosis of several lesion types' lesions utilizing a database of 94 patients in total. Experimental comparisons also show the new method can significantly outperform standard numerical feature-type methods in terms of agreement with the clinical diagnosis gold standard. In addition, the new method could potentially be used to derive insights into biologically meaningful nuclear morphology differences in these lesions. Our methods could be incorporated into a tool for pathologists to aid in distinguishing between follicular lesions of the thyroid. In addition, these results could potentially provide nuclear morphological correlates of biological behavior and reduce health care costs by decreasing histotechnician and pathologist time and obviating the need for ancillary testing. Copyright © 2014 Elsevier B.V. All rights reserved.

  13. Geodesic denoising for optical coherence tomography images

    NASA Astrophysics Data System (ADS)

    Shahrian Varnousfaderani, Ehsan; Vogl, Wolf-Dieter; Wu, Jing; Gerendas, Bianca S.; Simader, Christian; Langs, Georg; Waldstein, Sebastian M.; Schmidt-Erfurth, Ursula

    2016-03-01

    Optical coherence tomography (OCT) is an optical signal acquisition method capturing micrometer resolution, cross-sectional three-dimensional images. OCT images are used widely in ophthalmology to diagnose and monitor retinal diseases such as age-related macular degeneration (AMD) and Glaucoma. While OCT allows the visualization of retinal structures such as vessels and retinal layers, image quality and contrast is reduced by speckle noise, obfuscating small, low intensity structures and structural boundaries. Existing denoising methods for OCT images may remove clinically significant image features such as texture and boundaries of anomalies. In this paper, we propose a novel patch based denoising method, Geodesic Denoising. The method reduces noise in OCT images while preserving clinically significant, although small, pathological structures, such as fluid-filled cysts in diseased retinas. Our method selects optimal image patch distribution representations based on geodesic patch similarity to noisy samples. Patch distributions are then randomly sampled to build a set of best matching candidates for every noisy sample, and the denoised value is computed based on a geodesic weighted average of the best candidate samples. Our method is evaluated qualitatively on real pathological OCT scans and quantitatively on a proposed set of ground truth, noise free synthetic OCT scans with artificially added noise and pathologies. Experimental results show that performance of our method is comparable with state of the art denoising methods while outperforming them in preserving the critical clinically relevant structures.

  14. Paroxysmal atrial fibrillation prediction method with shorter HRV sequences.

    PubMed

    Boon, K H; Khalil-Hani, M; Malarvili, M B; Sia, C W

    2016-10-01

    This paper proposes a method that predicts the onset of paroxysmal atrial fibrillation (PAF), using heart rate variability (HRV) segments that are shorter than those applied in existing methods, while maintaining good prediction accuracy. PAF is a common cardiac arrhythmia that increases the health risk of a patient, and the development of an accurate predictor of the onset of PAF is clinical important because it increases the possibility to stabilize (electrically) and prevent the onset of atrial arrhythmias with different pacing techniques. We investigate the effect of HRV features extracted from different lengths of HRV segments prior to PAF onset with the proposed PAF prediction method. The pre-processing stage of the predictor includes QRS detection, HRV quantification and ectopic beat correction. Time-domain, frequency-domain, non-linear and bispectrum features are then extracted from the quantified HRV. In the feature selection, the HRV feature set and classifier parameters are optimized simultaneously using an optimization procedure based on genetic algorithm (GA). Both full feature set and statistically significant feature subset are optimized by GA respectively. For the statistically significant feature subset, Mann-Whitney U test is used to filter non-statistical significance features that cannot pass the statistical test at 20% significant level. The final stage of our predictor is the classifier that is based on support vector machine (SVM). A 10-fold cross-validation is applied in performance evaluation, and the proposed method achieves 79.3% prediction accuracy using 15-minutes HRV segment. This accuracy is comparable to that achieved by existing methods that use 30-minutes HRV segments, most of which achieves accuracy of around 80%. More importantly, our method significantly outperforms those that applied segments shorter than 30 minutes. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  15. Automatic bladder segmentation from CT images using deep CNN and 3D fully connected CRF-RNN.

    PubMed

    Xu, Xuanang; Zhou, Fugen; Liu, Bo

    2018-03-19

    Automatic approach for bladder segmentation from computed tomography (CT) images is highly desirable in clinical practice. It is a challenging task since the bladder usually suffers large variations of appearance and low soft-tissue contrast in CT images. In this study, we present a deep learning-based approach which involves a convolutional neural network (CNN) and a 3D fully connected conditional random fields recurrent neural network (CRF-RNN) to perform accurate bladder segmentation. We also propose a novel preprocessing method, called dual-channel preprocessing, to further advance the segmentation performance of our approach. The presented approach works as following: first, we apply our proposed preprocessing method on the input CT image and obtain a dual-channel image which consists of the CT image and an enhanced bladder density map. Second, we exploit a CNN to predict a coarse voxel-wise bladder score map on this dual-channel image. Finally, a 3D fully connected CRF-RNN refines the coarse bladder score map and produce final fine-localized segmentation result. We compare our approach to the state-of-the-art V-net on a clinical dataset. Results show that our approach achieves superior segmentation accuracy, outperforming the V-net by a significant margin. The Dice Similarity Coefficient of our approach (92.24%) is 8.12% higher than that of the V-net. Moreover, the bladder probability maps performed by our approach present sharper boundaries and more accurate localizations compared with that of the V-net. Our approach achieves higher segmentation accuracy than the state-of-the-art method on clinical data. Both the dual-channel processing and the 3D fully connected CRF-RNN contribute to this improvement. The united deep network composed of the CNN and 3D CRF-RNN also outperforms a system where the CRF model acts as a post-processing method disconnected from the CNN.

  16. The review and results of different methods for facial recognition

    NASA Astrophysics Data System (ADS)

    Le, Yifan

    2017-09-01

    In recent years, facial recognition draws much attention due to its wide potential applications. As a unique technology in Biometric Identification, facial recognition represents a significant improvement since it could be operated without cooperation of people under detection. Hence, facial recognition will be taken into defense system, medical detection, human behavior understanding, etc. Several theories and methods have been established to make progress in facial recognition: (1) A novel two-stage facial landmark localization method is proposed which has more accurate facial localization effect under specific database; (2) A statistical face frontalization method is proposed which outperforms state-of-the-art methods for face landmark localization; (3) It proposes a general facial landmark detection algorithm to handle images with severe occlusion and images with large head poses; (4) There are three methods proposed on Face Alignment including shape augmented regression method, pose-indexed based multi-view method and a learning based method via regressing local binary features. The aim of this paper is to analyze previous work of different aspects in facial recognition, focusing on concrete method and performance under various databases. In addition, some improvement measures and suggestions in potential applications will be put forward.

  17. Iterative Nonlocal Total Variation Regularization Method for Image Restoration

    PubMed Central

    Xu, Huanyu; Sun, Quansen; Luo, Nan; Cao, Guo; Xia, Deshen

    2013-01-01

    In this paper, a Bregman iteration based total variation image restoration algorithm is proposed. Based on the Bregman iteration, the algorithm splits the original total variation problem into sub-problems that are easy to solve. Moreover, non-local regularization is introduced into the proposed algorithm, and a method to choose the non-local filter parameter locally and adaptively is proposed. Experiment results show that the proposed algorithms outperform some other regularization methods. PMID:23776560

  18. Boosted Regression Trees Outperforms Support Vector Machines in Predicting (Regional) Yields of Winter Wheat from Single and Cumulated Dekadal Spot-VGT Derived Normalized Difference Vegetation Indices

    NASA Astrophysics Data System (ADS)

    Stas, Michiel; Dong, Qinghan; Heremans, Stien; Zhang, Beier; Van Orshoven, Jos

    2016-08-01

    This paper compares two machine learning techniques to predict regional winter wheat yields. The models, based on Boosted Regression Trees (BRT) and Support Vector Machines (SVM), are constructed of Normalized Difference Vegetation Indices (NDVI) derived from low resolution SPOT VEGETATION satellite imagery. Three types of NDVI-related predictors were used: Single NDVI, Incremental NDVI and Targeted NDVI. BRT and SVM were first used to select features with high relevance for predicting the yield. Although the exact selections differed between the prefectures, certain periods with high influence scores for multiple prefectures could be identified. The same period of high influence stretching from March to June was detected by both machine learning methods. After feature selection, BRT and SVM models were applied to the subset of selected features for actual yield forecasting. Whereas both machine learning methods returned very low prediction errors, BRT seems to slightly but consistently outperform SVM.

  19. Globally maximizing, locally minimizing: unsupervised discriminant projection with applications to face and palm biometrics.

    PubMed

    Yang, Jian; Zhang, David; Yang, Jing-Yu; Niu, Ben

    2007-04-01

    This paper develops an unsupervised discriminant projection (UDP) technique for dimensionality reduction of high-dimensional data in small sample size cases. UDP can be seen as a linear approximation of a multimanifolds-based learning framework which takes into account both the local and nonlocal quantities. UDP characterizes the local scatter as well as the nonlocal scatter, seeking to find a projection that simultaneously maximizes the nonlocal scatter and minimizes the local scatter. This characteristic makes UDP more intuitive and more powerful than the most up-to-date method, Locality Preserving Projection (LPP), which considers only the local scatter for clustering or classification tasks. The proposed method is applied to face and palm biometrics and is examined using the Yale, FERET, and AR face image databases and the PolyU palmprint database. The experimental results show that UDP consistently outperforms LPP and PCA and outperforms LDA when the training sample size per class is small. This demonstrates that UDP is a good choice for real-world biometrics applications.

  20. A case study of alternative site response explanatory variables in Parkfield, California

    USGS Publications Warehouse

    Thompson, E.M.; Baise, L.G.; Kayen, R.E.; Morgan, E.C.; Kaklamanos, J.

    2011-01-01

    The combination of densely-spaced strong-motion stations in Parkfield, California, and spectral analysis of surface waves (SASW) profiles provides an ideal dataset for assessing the accuracy of different site response explanatory variables. We judge accuracy in terms of spatial coverage and correlation with observations. The performance of the alternative models is period-dependent, but generally we observe that: (1) where a profile is available, the square-root-of-impedance method outperforms VS30 (average S-wave velocity to 30 m depth), and (2) where a profile is unavailable, the topographic-slope method outperforms surficial geology. The fundamental site frequency is a valuable site response explanatory variable, though less valuable than VS30. However, given the expense and difficulty of obtaining reliable estimates of VS30 and the relative ease with which the fundamental site frequency can be computed, the fundamental site frequency may prove to be a valuable site response explanatory variable for many applications. ?? 2011 ASCE.

  1. Spreading to localized targets in complex networks

    NASA Astrophysics Data System (ADS)

    Sun, Ye; Ma, Long; Zeng, An; Wang, Wen-Xu

    2016-12-01

    As an important type of dynamics on complex networks, spreading is widely used to model many real processes such as the epidemic contagion and information propagation. One of the most significant research questions in spreading is to rank the spreading ability of nodes in the network. To this end, substantial effort has been made and a variety of effective methods have been proposed. These methods usually define the spreading ability of a node as the number of finally infected nodes given that the spreading is initialized from the node. However, in many real cases such as advertising and news propagation, the spreading only aims to cover a specific group of nodes. Therefore, it is necessary to study the spreading ability of nodes towards localized targets in complex networks. In this paper, we propose a reversed local path algorithm for this problem. Simulation results show that our method outperforms the existing methods in identifying the influential nodes with respect to these localized targets. Moreover, the influential spreaders identified by our method can effectively avoid infecting the non-target nodes in the spreading process.

  2. Classification of Time Series Gene Expression in Clinical Studies via Integration of Biological Network

    PubMed Central

    Qian, Liwei; Zheng, Haoran; Zhou, Hong; Qin, Ruibin; Li, Jinlong

    2013-01-01

    The increasing availability of time series expression datasets, although promising, raises a number of new computational challenges. Accordingly, the development of suitable classification methods to make reliable and sound predictions is becoming a pressing issue. We propose, here, a new method to classify time series gene expression via integration of biological networks. We evaluated our approach on 2 different datasets and showed that the use of a hidden Markov model/Gaussian mixture models hybrid explores the time-dependence of the expression data, thereby leading to better prediction results. We demonstrated that the biclustering procedure identifies function-related genes as a whole, giving rise to high accordance in prognosis prediction across independent time series datasets. In addition, we showed that integration of biological networks into our method significantly improves prediction performance. Moreover, we compared our approach with several state-of–the-art algorithms and found that our method outperformed previous approaches with regard to various criteria. Finally, our approach achieved better prediction results on early-stage data, implying the potential of our method for practical prediction. PMID:23516469

  3. A study of active learning methods for named entity recognition in clinical text.

    PubMed

    Chen, Yukun; Lasko, Thomas A; Mei, Qiaozhu; Denny, Joshua C; Xu, Hua

    2015-12-01

    Named entity recognition (NER), a sequential labeling task, is one of the fundamental tasks for building clinical natural language processing (NLP) systems. Machine learning (ML) based approaches can achieve good performance, but they often require large amounts of annotated samples, which are expensive to build due to the requirement of domain experts in annotation. Active learning (AL), a sample selection approach integrated with supervised ML, aims to minimize the annotation cost while maximizing the performance of ML-based models. In this study, our goal was to develop and evaluate both existing and new AL methods for a clinical NER task to identify concepts of medical problems, treatments, and lab tests from the clinical notes. Using the annotated NER corpus from the 2010 i2b2/VA NLP challenge that contained 349 clinical documents with 20,423 unique sentences, we simulated AL experiments using a number of existing and novel algorithms in three different categories including uncertainty-based, diversity-based, and baseline sampling strategies. They were compared with the passive learning that uses random sampling. Learning curves that plot performance of the NER model against the estimated annotation cost (based on number of sentences or words in the training set) were generated to evaluate different active learning and the passive learning methods and the area under the learning curve (ALC) score was computed. Based on the learning curves of F-measure vs. number of sentences, uncertainty sampling algorithms outperformed all other methods in ALC. Most diversity-based methods also performed better than random sampling in ALC. To achieve an F-measure of 0.80, the best method based on uncertainty sampling could save 66% annotations in sentences, as compared to random sampling. For the learning curves of F-measure vs. number of words, uncertainty sampling methods again outperformed all other methods in ALC. To achieve 0.80 in F-measure, in comparison to random sampling, the best uncertainty based method saved 42% annotations in words. But the best diversity based method reduced only 7% annotation effort. In the simulated setting, AL methods, particularly uncertainty-sampling based approaches, seemed to significantly save annotation cost for the clinical NER task. The actual benefit of active learning in clinical NER should be further evaluated in a real-time setting. Copyright © 2015 Elsevier Inc. All rights reserved.

  4. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tuo, Rui; Wu, C. F. Jeff

    Many computer models contain unknown parameters which need to be estimated using physical observations. Furthermore, the calibration method based on Gaussian process models may lead to unreasonable estimate for imperfect computer models. In this work, we extend their study to calibration problems with stochastic physical data. We propose a novel method, called the L 2 calibration, and show its semiparametric efficiency. The conventional method of the ordinary least squares is also studied. Theoretical analysis shows that it is consistent but not efficient. Here, numerical examples show that the proposed method outperforms the existing ones.

  5. A reconsideration of negative ratings for network-based recommendation

    NASA Astrophysics Data System (ADS)

    Hu, Liang; Ren, Liang; Lin, Wenbin

    2018-01-01

    Recommendation algorithms based on bipartite networks have become increasingly popular, thanks to their accuracy and flexibility. Currently, many of these methods ignore users' negative ratings. In this work, we propose a method to exploit negative ratings for the network-based inference algorithm. We find that negative ratings play a positive role regardless of sparsity of data sets. Furthermore, we improve the efficiency of our method and compare it with the state-of-the-art algorithms. Experimental results show that the present method outperforms the existing algorithms.

  6. Small target detection using objectness and saliency

    NASA Astrophysics Data System (ADS)

    Zhang, Naiwen; Xiao, Yang; Fang, Zhiwen; Yang, Jian; Wang, Li; Li, Tao

    2017-10-01

    We are motived by the need for generic object detection algorithm which achieves high recall for small targets in complex scenes with acceptable computational efficiency. We propose a novel object detection algorithm, which has high localization quality with acceptable computational cost. Firstly, we obtain the objectness map as in BING[1] and use NMS to get the top N points. Then, k-means algorithm is used to cluster them into K classes according to their location. We set the center points of the K classes as seed points. For each seed point, an object potential region is extracted. Finally, a fast salient object detection algorithm[2] is applied to the object potential regions to highlight objectlike pixels, and a series of efficient post-processing operations are proposed to locate the targets. Our method runs at 5 FPS on 1000*1000 images, and significantly outperforms previous methods on small targets in cluttered background.

  7. Directional view interpolation for compensation of sparse angular sampling in cone-beam CT.

    PubMed

    Bertram, Matthias; Wiegert, Jens; Schafer, Dirk; Aach, Til; Rose, Georg

    2009-07-01

    In flat detector cone-beam computed tomography and related applications, sparse angular sampling frequently leads to characteristic streak artifacts. To overcome this problem, it has been suggested to generate additional views by means of interpolation. The practicality of this approach is investigated in combination with a dedicated method for angular interpolation of 3-D sinogram data. For this purpose, a novel dedicated shape-driven directional interpolation algorithm based on a structure tensor approach is developed. Quantitative evaluation shows that this method clearly outperforms conventional scene-based interpolation schemes. Furthermore, the image quality trade-offs associated with the use of interpolated intermediate views are systematically evaluated for simulated and clinical cone-beam computed tomography data sets of the human head. It is found that utilization of directionally interpolated views significantly reduces streak artifacts and noise, at the expense of small introduced image blur.

  8. Cross-language opinion lexicon extraction using mutual-reinforcement label propagation.

    PubMed

    Lin, Zheng; Tan, Songbo; Liu, Yue; Cheng, Xueqi; Xu, Xueke

    2013-01-01

    There is a growing interest in automatically building opinion lexicon from sources such as product reviews. Most of these methods depend on abundant external resources such as WordNet, which limits the applicability of these methods. Unsupervised or semi-supervised learning provides an optional solution to multilingual opinion lexicon extraction. However, the datasets are imbalanced in different languages. For some languages, the high-quality corpora are scarce or hard to obtain, which limits the research progress. To solve the above problems, we explore a mutual-reinforcement label propagation framework. First, for each language, a label propagation algorithm is applied to a word relation graph, and then a bilingual dictionary is used as a bridge to transfer information between two languages. A key advantage of this model is its ability to make two languages learn from each other and boost each other. The experimental results show that the proposed approach outperforms baseline significantly.

  9. An Integrated Method Based on PSO and EDA for the Max-Cut Problem.

    PubMed

    Lin, Geng; Guan, Jian

    2016-01-01

    The max-cut problem is NP-hard combinatorial optimization problem with many real world applications. In this paper, we propose an integrated method based on particle swarm optimization and estimation of distribution algorithm (PSO-EDA) for solving the max-cut problem. The integrated algorithm overcomes the shortcomings of particle swarm optimization and estimation of distribution algorithm. To enhance the performance of the PSO-EDA, a fast local search procedure is applied. In addition, a path relinking procedure is developed to intensify the search. To evaluate the performance of PSO-EDA, extensive experiments were carried out on two sets of benchmark instances with 800 to 20,000 vertices from the literature. Computational results and comparisons show that PSO-EDA significantly outperforms the existing PSO-based and EDA-based algorithms for the max-cut problem. Compared with other best performing algorithms, PSO-EDA is able to find very competitive results in terms of solution quality.

  10. Reflective random indexing for semi-automatic indexing of the biomedical literature.

    PubMed

    Vasuki, Vidya; Cohen, Trevor

    2010-10-01

    The rapid growth of biomedical literature is evident in the increasing size of the MEDLINE research database. Medical Subject Headings (MeSH), a controlled set of keywords, are used to index all the citations contained in the database to facilitate search and retrieval. This volume of citations calls for efficient tools to assist indexers at the US National Library of Medicine (NLM). Currently, the Medical Text Indexer (MTI) system provides assistance by recommending MeSH terms based on the title and abstract of an article using a combination of distributional and vocabulary-based methods. In this paper, we evaluate a novel approach toward indexer assistance by using nearest neighbor classification in combination with Reflective Random Indexing (RRI), a scalable alternative to the established methods of distributional semantics. On a test set provided by the NLM, our approach significantly outperforms the MTI system, suggesting that the RRI approach would make a useful addition to the current methodologies.

  11. Blind Channel Equalization Using Constrained Generalized Pattern Search Optimization and Reinitialization Strategy

    NASA Astrophysics Data System (ADS)

    Zaouche, Abdelouahib; Dayoub, Iyad; Rouvaen, Jean Michel; Tatkeu, Charles

    2008-12-01

    We propose a global convergence baud-spaced blind equalization method in this paper. This method is based on the application of both generalized pattern optimization and channel surfing reinitialization. The potentially used unimodal cost function relies on higher- order statistics, and its optimization is achieved using a pattern search algorithm. Since the convergence to the global minimum is not unconditionally warranted, we make use of channel surfing reinitialization (CSR) strategy to find the right global minimum. The proposed algorithm is analyzed, and simulation results using a severe frequency selective propagation channel are given. Detailed comparisons with constant modulus algorithm (CMA) are highlighted. The proposed algorithm performances are evaluated in terms of intersymbol interference, normalized received signal constellations, and root mean square error vector magnitude. In case of nonconstant modulus input signals, our algorithm outperforms significantly CMA algorithm with full channel surfing reinitialization strategy. However, comparable performances are obtained for constant modulus signals.

  12. Physics-driven Spatiotemporal Regularization for High-dimensional Predictive Modeling: A Novel Approach to Solve the Inverse ECG Problem

    NASA Astrophysics Data System (ADS)

    Yao, Bing; Yang, Hui

    2016-12-01

    This paper presents a novel physics-driven spatiotemporal regularization (STRE) method for high-dimensional predictive modeling in complex healthcare systems. This model not only captures the physics-based interrelationship between time-varying explanatory and response variables that are distributed in the space, but also addresses the spatial and temporal regularizations to improve the prediction performance. The STRE model is implemented to predict the time-varying distribution of electric potentials on the heart surface based on the electrocardiogram (ECG) data from the distributed sensor network placed on the body surface. The model performance is evaluated and validated in both a simulated two-sphere geometry and a realistic torso-heart geometry. Experimental results show that the STRE model significantly outperforms other regularization models that are widely used in current practice such as Tikhonov zero-order, Tikhonov first-order and L1 first-order regularization methods.

  13. Cross-Language Opinion Lexicon Extraction Using Mutual-Reinforcement Label Propagation

    PubMed Central

    Lin, Zheng; Tan, Songbo; Liu, Yue; Cheng, Xueqi; Xu, Xueke

    2013-01-01

    There is a growing interest in automatically building opinion lexicon from sources such as product reviews. Most of these methods depend on abundant external resources such as WordNet, which limits the applicability of these methods. Unsupervised or semi-supervised learning provides an optional solution to multilingual opinion lexicon extraction. However, the datasets are imbalanced in different languages. For some languages, the high-quality corpora are scarce or hard to obtain, which limits the research progress. To solve the above problems, we explore a mutual-reinforcement label propagation framework. First, for each language, a label propagation algorithm is applied to a word relation graph, and then a bilingual dictionary is used as a bridge to transfer information between two languages. A key advantage of this model is its ability to make two languages learn from each other and boost each other. The experimental results show that the proposed approach outperforms baseline significantly. PMID:24260190

  14. Automatic Summarization as a Combinatorial Optimization Problem

    NASA Astrophysics Data System (ADS)

    Hirao, Tsutomu; Suzuki, Jun; Isozaki, Hideki

    We derived the oracle summary with the highest ROUGE score that can be achieved by integrating sentence extraction with sentence compression from the reference abstract. The analysis results of the oracle revealed that summarization systems have to assign an appropriate compression rate for each sentence in the document. In accordance with this observation, this paper proposes a summarization method as a combinatorial optimization: selecting the set of sentences that maximize the sum of the sentence scores from the pool which consists of the sentences with various compression rates, subject to length constrains. The score of the sentence is defined by its compression rate, content words and positional information. The parameters for the compression rates and positional information are optimized by minimizing the loss between score of oracles and that of candidates. The results obtained from TSC-2 corpus showed that our method outperformed the previous systems with statistical significance.

  15. Local Intrinsic Dimension Estimation by Generalized Linear Modeling.

    PubMed

    Hino, Hideitsu; Fujiki, Jun; Akaho, Shotaro; Murata, Noboru

    2017-07-01

    We propose a method for intrinsic dimension estimation. By fitting the power of distance from an inspection point and the number of samples included inside a ball with a radius equal to the distance, to a regression model, we estimate the goodness of fit. Then, by using the maximum likelihood method, we estimate the local intrinsic dimension around the inspection point. The proposed method is shown to be comparable to conventional methods in global intrinsic dimension estimation experiments. Furthermore, we experimentally show that the proposed method outperforms a conventional local dimension estimation method.

  16. Generalized disequilibrium test for association in qualitative traits incorporating imprinting effects based on extended pedigrees.

    PubMed

    Li, Jian-Long; Wang, Peng; Fung, Wing Kam; Zhou, Ji-Yuan

    2017-10-16

    For dichotomous traits, the generalized disequilibrium test with the moment estimate of the variance (GDT-ME) is a powerful family-based association method. Genomic imprinting is an important epigenetic phenomenon and currently, there has been increasing interest of incorporating imprinting to improve the test power of association analysis. However, GDT-ME does not take imprinting effects into account, and it has not been investigated whether it can be used for association analysis when the effects indeed exist. In this article, based on a novel decomposition of the genotype score according to the paternal or maternal source of the allele, we propose the generalized disequilibrium test with imprinting (GDTI) for complete pedigrees without any missing genotypes. Then, we extend GDTI and GDT-ME to accommodate incomplete pedigrees with some pedigrees having missing genotypes, by using a Monte Carlo (MC) sampling and estimation scheme to infer missing genotypes given available genotypes in each pedigree, denoted by MCGDTI and MCGDT-ME, respectively. The proposed GDTI and MCGDTI methods evaluate the differences of the paternal as well as maternal allele scores for all discordant relative pairs in a pedigree, including beyond first-degree relative pairs. Advantages of the proposed GDTI and MCGDTI test statistics over existing methods are demonstrated by simulation studies under various simulation settings and by application to the rheumatoid arthritis dataset. Simulation results show that the proposed tests control the size well under the null hypothesis of no association, and outperform the existing methods under various imprinting effect models. The existing GDT-ME and the proposed MCGDT-ME can be used to test for association even when imprinting effects exist. For the application to the rheumatoid arthritis data, compared to the existing methods, MCGDTI identifies more loci statistically significantly associated with the disease. Under complete and incomplete imprinting effect models, our proposed GDTI and MCGDTI methods, by considering the information on imprinting effects and all discordant relative pairs within each pedigree, outperform all the existing test statistics and MCGDTI can recapture much of the missing information. Therefore, MCGDTI is recommended in practice.

  17. Sparse Bayesian Learning for Identifying Imaging Biomarkers in AD Prediction

    PubMed Central

    Shen, Li; Qi, Yuan; Kim, Sungeun; Nho, Kwangsik; Wan, Jing; Risacher, Shannon L.; Saykin, Andrew J.

    2010-01-01

    We apply sparse Bayesian learning methods, automatic relevance determination (ARD) and predictive ARD (PARD), to Alzheimer’s disease (AD) classification to make accurate prediction and identify critical imaging markers relevant to AD at the same time. ARD is one of the most successful Bayesian feature selection methods. PARD is a powerful Bayesian feature selection method, and provides sparse models that is easy to interpret. PARD selects the model with the best estimate of the predictive performance instead of choosing the one with the largest marginal model likelihood. Comparative study with support vector machine (SVM) shows that ARD/PARD in general outperform SVM in terms of prediction accuracy. Additional comparison with surface-based general linear model (GLM) analysis shows that regions with strongest signals are identified by both GLM and ARD/PARD. While GLM P-map returns significant regions all over the cortex, ARD/PARD provide a small number of relevant and meaningful imaging markers with predictive power, including both cortical and subcortical measures. PMID:20879451

  18. A hybrid method for classifying cognitive states from fMRI data.

    PubMed

    Parida, S; Dehuri, S; Cho, S-B; Cacha, L A; Poznanski, R R

    2015-09-01

    Functional magnetic resonance imaging (fMRI) makes it possible to detect brain activities in order to elucidate cognitive-states. The complex nature of fMRI data requires under-standing of the analyses applied to produce possible avenues for developing models of cognitive state classification and improving brain activity prediction. While many models of classification task of fMRI data analysis have been developed, in this paper, we present a novel hybrid technique through combining the best attributes of genetic algorithms (GAs) and ensemble decision tree technique that consistently outperforms all other methods which are being used for cognitive-state classification. Specifically, this paper illustrates the combined effort of decision-trees ensemble and GAs for feature selection through an extensive simulation study and discusses the classification performance with respect to fMRI data. We have shown that our proposed method exhibits significant reduction of the number of features with clear edge classification accuracy over ensemble of decision-trees.

  19. Road Lane Detection Robust to Shadows Based on a Fuzzy System Using a Visible Light Camera Sensor.

    PubMed

    Hoang, Toan Minh; Baek, Na Rae; Cho, Se Woon; Kim, Ki Wan; Park, Kang Ryoung

    2017-10-28

    Recently, autonomous vehicles, particularly self-driving cars, have received significant attention owing to rapid advancements in sensor and computation technologies. In addition to traffic sign recognition, road lane detection is one of the most important factors used in lane departure warning systems and autonomous vehicles for maintaining the safety of semi-autonomous and fully autonomous systems. Unlike traffic signs, road lanes are easily damaged by both internal and external factors such as road quality, occlusion (traffic on the road), weather conditions, and illumination (shadows from objects such as cars, trees, and buildings). Obtaining clear road lane markings for recognition processing is a difficult challenge. Therefore, we propose a method to overcome various illumination problems, particularly severe shadows, by using fuzzy system and line segment detector algorithms to obtain better results for detecting road lanes by a visible light camera sensor. Experimental results from three open databases, Caltech dataset, Santiago Lanes dataset (SLD), and Road Marking dataset, showed that our method outperformed conventional lane detection methods.

  20. Unsupervised Deep Hashing With Pseudo Labels for Scalable Image Retrieval.

    PubMed

    Zhang, Haofeng; Liu, Li; Long, Yang; Shao, Ling

    2018-04-01

    In order to achieve efficient similarity searching, hash functions are designed to encode images into low-dimensional binary codes with the constraint that similar features will have a short distance in the projected Hamming space. Recently, deep learning-based methods have become more popular, and outperform traditional non-deep methods. However, without label information, most state-of-the-art unsupervised deep hashing (DH) algorithms suffer from severe performance degradation for unsupervised scenarios. One of the main reasons is that the ad-hoc encoding process cannot properly capture the visual feature distribution. In this paper, we propose a novel unsupervised framework that has two main contributions: 1) we convert the unsupervised DH model into supervised by discovering pseudo labels; 2) the framework unifies likelihood maximization, mutual information maximization, and quantization error minimization so that the pseudo labels can maximumly preserve the distribution of visual features. Extensive experiments on three popular data sets demonstrate the advantages of the proposed method, which leads to significant performance improvement over the state-of-the-art unsupervised hashing algorithms.

  1. Extraction of extracellular polymeric substances from aerobic granule with compact interior structure.

    PubMed

    Adav, Sunil S; Lee, Duu-Jong

    2008-06-15

    Extracellular polymeric substances (EPS) were extracted from aerobic granules of compact interior structure using seven extraction methods. Ultrasound followed by the chemical reagents formamide and NaOH outperformed other methods in extracting EPS from aerobic granules of compact interior. The collected EPS revealed no contamination by intracellular substances and consisted mainly of proteins, polysaccharides, humic substances and lipids. The quantity of extracted proteins exhibited a weak correlation with quantity of extracted carbohydrates but no correlation with quantity of extracted humic substances. The total polysaccharides/total proteins (PN/PS) ratios for sludge flocs were approximately 0.9 regardless of extraction method. Protein content was significantly enriched in the granules, producing a PN/PS ratio of 3.4-6.2. This experimental result correlated with observations using excitation-emission matrix (EEM) and confocal laser scanning microscope technique. However, detailed study disproved the use of EEM results as a quantitative index of extracted EPS from sludge flocs or from granules.

  2. Advanced time integration algorithms for dislocation dynamics simulations of work hardening

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sills, Ryan B.; Aghaei, Amin; Cai, Wei

    Efficient time integration is a necessity for dislocation dynamics simulations of work hardening to achieve experimentally relevant strains. In this work, an efficient time integration scheme using a high order explicit method with time step subcycling and a newly-developed collision detection algorithm are evaluated. First, time integrator performance is examined for an annihilating Frank–Read source, showing the effects of dislocation line collision. The integrator with subcycling is found to significantly out-perform other integration schemes. The performance of the time integration and collision detection algorithms is then tested in a work hardening simulation. The new algorithms show a 100-fold speed-up relativemore » to traditional schemes. As a result, subcycling is shown to improve efficiency significantly while maintaining an accurate solution, and the new collision algorithm allows an arbitrarily large time step size without missing collisions.« less

  3. Advanced time integration algorithms for dislocation dynamics simulations of work hardening

    DOE PAGES

    Sills, Ryan B.; Aghaei, Amin; Cai, Wei

    2016-04-25

    Efficient time integration is a necessity for dislocation dynamics simulations of work hardening to achieve experimentally relevant strains. In this work, an efficient time integration scheme using a high order explicit method with time step subcycling and a newly-developed collision detection algorithm are evaluated. First, time integrator performance is examined for an annihilating Frank–Read source, showing the effects of dislocation line collision. The integrator with subcycling is found to significantly out-perform other integration schemes. The performance of the time integration and collision detection algorithms is then tested in a work hardening simulation. The new algorithms show a 100-fold speed-up relativemore » to traditional schemes. As a result, subcycling is shown to improve efficiency significantly while maintaining an accurate solution, and the new collision algorithm allows an arbitrarily large time step size without missing collisions.« less

  4. Testing the significance of a correlation with nonnormal data: comparison of Pearson, Spearman, transformation, and resampling approaches.

    PubMed

    Bishara, Anthony J; Hittner, James B

    2012-09-01

    It is well known that when data are nonnormally distributed, a test of the significance of Pearson's r may inflate Type I error rates and reduce power. Statistics textbooks and the simulation literature provide several alternatives to Pearson's correlation. However, the relative performance of these alternatives has been unclear. Two simulation studies were conducted to compare 12 methods, including Pearson, Spearman's rank-order, transformation, and resampling approaches. With most sample sizes (n ≥ 20), Type I and Type II error rates were minimized by transforming the data to a normal shape prior to assessing the Pearson correlation. Among transformation approaches, a general purpose rank-based inverse normal transformation (i.e., transformation to rankit scores) was most beneficial. However, when samples were both small (n ≤ 10) and extremely nonnormal, the permutation test often outperformed other alternatives, including various bootstrap tests.

  5. Comparison of Survival Models for Analyzing Prognostic Factors in Gastric Cancer Patients

    PubMed

    Habibi, Danial; Rafiei, Mohammad; Chehrei, Ali; Shayan, Zahra; Tafaqodi, Soheil

    2018-03-27

    Objective: There are a number of models for determining risk factors for survival of patients with gastric cancer. This study was conducted to select the model showing the best fit with available data. Methods: Cox regression and parametric models (Exponential, Weibull, Gompertz, Log normal, Log logistic and Generalized Gamma) were utilized in unadjusted and adjusted forms to detect factors influencing mortality of patients. Comparisons were made with Akaike Information Criterion (AIC) by using STATA 13 and R 3.1.3 softwares. Results: The results of this study indicated that all parametric models outperform the Cox regression model. The Log normal, Log logistic and Generalized Gamma provided the best performance in terms of AIC values (179.2, 179.4 and 181.1, respectively). On unadjusted analysis, the results of the Cox regression and parametric models indicated stage, grade, largest diameter of metastatic nest, largest diameter of LM, number of involved lymph nodes and the largest ratio of metastatic nests to lymph nodes, to be variables influencing the survival of patients with gastric cancer. On adjusted analysis, according to the best model (log normal), grade was found as the significant variable. Conclusion: The results suggested that all parametric models outperform the Cox model. The log normal model provides the best fit and is a good substitute for Cox regression. Creative Commons Attribution License

  6. Neighboring block based disparity vector derivation for multiview compatible 3D-AVC

    NASA Astrophysics Data System (ADS)

    Kang, Jewon; Chen, Ying; Zhang, Li; Zhao, Xin; Karczewicz, Marta

    2013-09-01

    3D-AVC being developed under Joint Collaborative Team on 3D Video Coding (JCT-3V) significantly outperforms the Multiview Video Coding plus Depth (MVC+D) which simultaneously encodes texture views and depth views with the multiview extension of H.264/AVC (MVC). However, when the 3D-AVC is configured to support multiview compatibility in which texture views are decoded without depth information, the coding performance becomes significantly degraded. The reason is that advanced coding tools incorporated into the 3D-AVC do not perform well due to the lack of a disparity vector converted from the depth information. In this paper, we propose a disparity vector derivation method utilizing only the information of texture views. Motion information of neighboring blocks is used to determine a disparity vector for a macroblock, so that the derived disparity vector is efficiently used for the coding tools in 3D-AVC. The proposed method significantly improves a coding gain of the 3D-AVC in the multiview compatible mode about 20% BD-rate saving in the coded views and 26% BD-rate saving in the synthesized views on average.

  7. Predicting protein complexes using a supervised learning method combined with local structural information.

    PubMed

    Dong, Yadong; Sun, Yongqi; Qin, Chao

    2018-01-01

    The existing protein complex detection methods can be broadly divided into two categories: unsupervised and supervised learning methods. Most of the unsupervised learning methods assume that protein complexes are in dense regions of protein-protein interaction (PPI) networks even though many true complexes are not dense subgraphs. Supervised learning methods utilize the informative properties of known complexes; they often extract features from existing complexes and then use the features to train a classification model. The trained model is used to guide the search process for new complexes. However, insufficient extracted features, noise in the PPI data and the incompleteness of complex data make the classification model imprecise. Consequently, the classification model is not sufficient for guiding the detection of complexes. Therefore, we propose a new robust score function that combines the classification model with local structural information. Based on the score function, we provide a search method that works both forwards and backwards. The results from experiments on six benchmark PPI datasets and three protein complex datasets show that our approach can achieve better performance compared with the state-of-the-art supervised, semi-supervised and unsupervised methods for protein complex detection, occasionally significantly outperforming such methods.

  8. Synchronization for Optical PPM with Inter-Symbol Guard Times

    NASA Astrophysics Data System (ADS)

    Rogalin, R.; Srinivasan, M.

    2017-05-01

    Deep space optical communications promises orders of magnitude growth in communication capacity, supporting high data rate applications such as video streaming and high-bandwidth science instruments. Pulse position modulation is the modulation format of choice for deep space applications, and by inserting inter-symbol guard times between the symbols, the signal carries the timing information needed by the demodulator. Accurately extracting this timing information is crucial to demodulating and decoding this signal. In this article, we propose a number of timing and frequency estimation schemes for this modulation format, and in particular highlight a low complexity maximum likelihood timing estimator that significantly outperforms the prior art in this domain. This method does not require an explicit synchronization sequence, freeing up channel resources for data transmission.

  9. An extended Kalman-Bucy filter for atmospheric temperature profile retrieval with a passive microwave sounder

    NASA Technical Reports Server (NTRS)

    Ledsham, W. H.; Staelin, D. H.

    1978-01-01

    An extended Kalman-Bucy filter has been implemented for atmospheric temperature profile retrievals from observations made using the Scanned Microwave Spectrometer (SCAMS) instrument carried on the Nimbus 6 satellite. This filter has the advantage that it requires neither stationary statistics in the underlying processes nor linear production of the observed variables from the variables to be estimated. This extended Kalman-Bucy filter has yielded significant performance improvement relative to multiple regression retrieval methods. A multi-spot extended Kalman-Bucy filter has also been developed in which the temperature profiles at a number of scan angles in a scanning instrument are retrieved simultaneously. These multi-spot retrievals are shown to outperform the single-spot Kalman retrievals.

  10. Pulling Mobile Assisted Language Learning (MALL) into the Mainstream: MALL in Broad Practice

    PubMed Central

    Wu, Qun

    2015-01-01

    The researcher designed a smartphone app to help college students to learn English (L2) vocabulary. The app contained 3,402 English words that were compiled into an alphabetic wordlist with each word displayed on three features; namely: spelling, pronunciation and Chinese definitions. To test the effectiveness of the app, an experimental group (with app) was compared with a control group (without app) and knowledge of words was tested before and after the research. The study revealed that the students using the program significantly outperformed those in the control group in vocabulary acquisition. This paper introduced a research design method and set up a pedagogical paradigm which can be followed as a way to practice MALL. PMID:26010606

  11. Analysis and characterization of high-resolution and high-aspect-ratio imaging fiber bundles.

    PubMed

    Motamedi, Nojan; Karbasi, Salman; Ford, Joseph E; Lomakin, Vitaliy

    2015-11-10

    High-contrast imaging fiber bundles (FBs) are characterized and modeled for wide-angle and high-resolution imaging applications. Scanning electron microscope images of FB cross sections are taken to measure physical parameters and verify the variations of irregular fibers due to the fabrication process. Modal analysis tools are developed that include irregularities in the fiber core shapes and provide results in agreement with experimental measurements. The modeling demonstrates that the irregular fibers significantly outperform a perfectly regular "ideal" array. Using this method, FBs are designed that can provide high contrast with core pitches of only a few wavelengths of the guided light. Structural modifications of the commercially available FB can reduce the core pitch by 60% for higher resolution image relay.

  12. Results of a joint NOAA/NASA sounder simulation study

    NASA Technical Reports Server (NTRS)

    Phillips, N.; Susskind, Joel; Mcmillin, L.

    1988-01-01

    This paper presents the results of a joint NOAA and NASA sounder simulation study in which the accuracies of atmospheric temperature profiles and surface skin temperature measuremnents retrieved from two sounders were compared: (1) the currently used IR temperature sounder HIRS2 (High-resolution Infrared Radiation Sounder 2); and (2) the recently proposed high-spectral-resolution IR sounder AMTS (Advanced Moisture and Temperature Sounder). Simulations were conducted for both clear and partial cloud conditions. Data were analyzed at NASA using a physical inversion technique and at NOAA using a statistical technique. Results show significant improvement of AMTS compared to HIRS2 for both clear and cloudy conditions. The improvements are indicated by both methods of data analysis, but the physical retrievals outperform the statistical retrievals.

  13. Joint Dictionary Learning-Based Non-Negative Matrix Factorization for Voice Conversion to Improve Speech Intelligibility After Oral Surgery.

    PubMed

    Fu, Szu-Wei; Li, Pei-Chun; Lai, Ying-Hui; Yang, Cheng-Chien; Hsieh, Li-Chun; Tsao, Yu

    2017-11-01

    Objective: This paper focuses on machine learning based voice conversion (VC) techniques for improving the speech intelligibility of surgical patients who have had parts of their articulators removed. Because of the removal of parts of the articulator, a patient's speech may be distorted and difficult to understand. To overcome this problem, VC methods can be applied to convert the distorted speech such that it is clear and more intelligible. To design an effective VC method, two key points must be considered: 1) the amount of training data may be limited (because speaking for a long time is usually difficult for postoperative patients); 2) rapid conversion is desirable (for better communication). Methods: We propose a novel joint dictionary learning based non-negative matrix factorization (JD-NMF) algorithm. Compared to conventional VC techniques, JD-NMF can perform VC efficiently and effectively with only a small amount of training data. Results: The experimental results demonstrate that the proposed JD-NMF method not only achieves notably higher short-time objective intelligibility (STOI) scores (a standardized objective intelligibility evaluation metric) than those obtained using the original unconverted speech but is also significantly more efficient and effective than a conventional exemplar-based NMF VC method. Conclusion: The proposed JD-NMF method may outperform the state-of-the-art exemplar-based NMF VC method in terms of STOI scores under the desired scenario. Significance: We confirmed the advantages of the proposed joint training criterion for the NMF-based VC. Moreover, we verified that the proposed JD-NMF can effectively improve the speech intelligibility scores of oral surgery patients. Objective: This paper focuses on machine learning based voice conversion (VC) techniques for improving the speech intelligibility of surgical patients who have had parts of their articulators removed. Because of the removal of parts of the articulator, a patient's speech may be distorted and difficult to understand. To overcome this problem, VC methods can be applied to convert the distorted speech such that it is clear and more intelligible. To design an effective VC method, two key points must be considered: 1) the amount of training data may be limited (because speaking for a long time is usually difficult for postoperative patients); 2) rapid conversion is desirable (for better communication). Methods: We propose a novel joint dictionary learning based non-negative matrix factorization (JD-NMF) algorithm. Compared to conventional VC techniques, JD-NMF can perform VC efficiently and effectively with only a small amount of training data. Results: The experimental results demonstrate that the proposed JD-NMF method not only achieves notably higher short-time objective intelligibility (STOI) scores (a standardized objective intelligibility evaluation metric) than those obtained using the original unconverted speech but is also significantly more efficient and effective than a conventional exemplar-based NMF VC method. Conclusion: The proposed JD-NMF method may outperform the state-of-the-art exemplar-based NMF VC method in terms of STOI scores under the desired scenario. Significance: We confirmed the advantages of the proposed joint training criterion for the NMF-based VC. Moreover, we verified that the proposed JD-NMF can effectively improve the speech intelligibility scores of oral surgery patients.

  14. A novel approach to identifying regulatory motifs in distantly related genomes

    PubMed Central

    Van Hellemont, Ruth; Monsieurs, Pieter; Thijs, Gert; De Moor, Bart; Van de Peer, Yves; Marchal, Kathleen

    2005-01-01

    Although proven successful in the identification of regulatory motifs, phylogenetic footprinting methods still show some shortcomings. To assess these difficulties, most apparent when applying phylogenetic footprinting to distantly related organisms, we developed a two-step procedure that combines the advantages of sequence alignment and motif detection approaches. The results on well-studied benchmark datasets indicate that the presented method outperforms other methods when the sequences become either too long or too heterogeneous in size. PMID:16420672

  15. NetMHCpan-3.0; improved prediction of binding to MHC class I molecules integrating information from multiple receptor and peptide length datasets.

    PubMed

    Nielsen, Morten; Andreatta, Massimo

    2016-03-30

    Binding of peptides to MHC class I molecules (MHC-I) is essential for antigen presentation to cytotoxic T-cells. Here, we demonstrate how a simple alignment step allowing insertions and deletions in a pan-specific MHC-I binding machine-learning model enables combining information across both multiple MHC molecules and peptide lengths. This pan-allele/pan-length algorithm significantly outperforms state-of-the-art methods, and captures differences in the length profile of binders to different MHC molecules leading to increased accuracy for ligand identification. Using this model, we demonstrate that percentile ranks in contrast to affinity-based thresholds are optimal for ligand identification due to uniform sampling of the MHC space. We have developed a neural network-based machine-learning algorithm leveraging information across multiple receptor specificities and ligand length scales, and demonstrated how this approach significantly improves the accuracy for prediction of peptide binding and identification of MHC ligands. The method is available at www.cbs.dtu.dk/services/NetMHCpan-3.0 .

  16. Binary Interval Search: a scalable algorithm for counting interval intersections

    PubMed Central

    Layer, Ryan M.; Skadron, Kevin; Robins, Gabriel; Hall, Ira M.; Quinlan, Aaron R.

    2013-01-01

    Motivation: The comparison of diverse genomic datasets is fundamental to understand genome biology. Researchers must explore many large datasets of genome intervals (e.g. genes, sequence alignments) to place their experimental results in a broader context and to make new discoveries. Relationships between genomic datasets are typically measured by identifying intervals that intersect, that is, they overlap and thus share a common genome interval. Given the continued advances in DNA sequencing technologies, efficient methods for measuring statistically significant relationships between many sets of genomic features are crucial for future discovery. Results: We introduce the Binary Interval Search (BITS) algorithm, a novel and scalable approach to interval set intersection. We demonstrate that BITS outperforms existing methods at counting interval intersections. Moreover, we show that BITS is intrinsically suited to parallel computing architectures, such as graphics processing units by illustrating its utility for efficient Monte Carlo simulations measuring the significance of relationships between sets of genomic intervals. Availability: https://github.com/arq5x/bits. Contact: arq5x@virginia.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:23129298

  17. Feature-based classification of amino acid substitutions outside conserved functional protein domains.

    PubMed

    Gemovic, Branislava; Perovic, Vladimir; Glisic, Sanja; Veljkovic, Nevena

    2013-01-01

    There are more than 500 amino acid substitutions in each human genome, and bioinformatics tools irreplaceably contribute to determination of their functional effects. We have developed feature-based algorithm for the detection of mutations outside conserved functional domains (CFDs) and compared its classification efficacy with the most commonly used phylogeny-based tools, PolyPhen-2 and SIFT. The new algorithm is based on the informational spectrum method (ISM), a feature-based technique, and statistical analysis. Our dataset contained neutral polymorphisms and mutations associated with myeloid malignancies from epigenetic regulators ASXL1, DNMT3A, EZH2, and TET2. PolyPhen-2 and SIFT had significantly lower accuracies in predicting the effects of amino acid substitutions outside CFDs than expected, with especially low sensitivity. On the other hand, only ISM algorithm showed statistically significant classification of these sequences. It outperformed PolyPhen-2 and SIFT by 15% and 13%, respectively. These results suggest that feature-based methods, like ISM, are more suitable for the classification of amino acid substitutions outside CFDs than phylogeny-based tools.

  18. Automatic Condensation of Electronic Publications by Sentence Selection.

    ERIC Educational Resources Information Center

    Brandow, Ronald; And Others

    1995-01-01

    Describes a system that performs automatic summaries of news from a large commercial news service encompassing 41 different publications. This system was compared to a system that used only the lead sentences of the texts. Lead-based summaries significantly outperformed the sentence-selection summaries. (AEF)

  19. Interplay between past market correlation structure changes and future volatility outbursts.

    PubMed

    Musmeci, Nicoló; Aste, Tomaso; Di Matteo, T

    2016-11-18

    We report significant relations between past changes in the market correlation structure and future changes in the market volatility. This relation is made evident by using a measure of "correlation structure persistence" on correlation-based information filtering networks that quantifies the rate of change of the market dependence structure. We also measured changes in the correlation structure by means of a "metacorrelation" that measures a lagged correlation between correlation matrices computed over different time windows. Both methods show a deep interplay between past changes in correlation structure and future changes in volatility and we demonstrate they can anticipate market risk variations and this can be used to better forecast portfolio risk. Notably, these methods overcome the curse of dimensionality that limits the applicability of traditional econometric tools to portfolios made of a large number of assets. We report on forecasting performances and statistical significance of both methods for two different equity datasets. We also identify an optimal region of parameters in terms of True Positive and False Positive trade-off, through a ROC curve analysis. We find that this forecasting method is robust and it outperforms logistic regression predictors based on past volatility only. Moreover the temporal analysis indicates that methods based on correlation structural persistence are able to adapt to abrupt changes in the market, such as financial crises, more rapidly than methods based on past volatility.

  20. Interplay between past market correlation structure changes and future volatility outbursts

    NASA Astrophysics Data System (ADS)

    Musmeci, Nicoló; Aste, Tomaso; Di Matteo, T.

    2016-11-01

    We report significant relations between past changes in the market correlation structure and future changes in the market volatility. This relation is made evident by using a measure of “correlation structure persistence” on correlation-based information filtering networks that quantifies the rate of change of the market dependence structure. We also measured changes in the correlation structure by means of a “metacorrelation” that measures a lagged correlation between correlation matrices computed over different time windows. Both methods show a deep interplay between past changes in correlation structure and future changes in volatility and we demonstrate they can anticipate market risk variations and this can be used to better forecast portfolio risk. Notably, these methods overcome the curse of dimensionality that limits the applicability of traditional econometric tools to portfolios made of a large number of assets. We report on forecasting performances and statistical significance of both methods for two different equity datasets. We also identify an optimal region of parameters in terms of True Positive and False Positive trade-off, through a ROC curve analysis. We find that this forecasting method is robust and it outperforms logistic regression predictors based on past volatility only. Moreover the temporal analysis indicates that methods based on correlation structural persistence are able to adapt to abrupt changes in the market, such as financial crises, more rapidly than methods based on past volatility.

  1. Interplay between past market correlation structure changes and future volatility outbursts

    PubMed Central

    Musmeci, Nicoló; Aste, Tomaso; Di Matteo, T.

    2016-01-01

    We report significant relations between past changes in the market correlation structure and future changes in the market volatility. This relation is made evident by using a measure of “correlation structure persistence” on correlation-based information filtering networks that quantifies the rate of change of the market dependence structure. We also measured changes in the correlation structure by means of a “metacorrelation” that measures a lagged correlation between correlation matrices computed over different time windows. Both methods show a deep interplay between past changes in correlation structure and future changes in volatility and we demonstrate they can anticipate market risk variations and this can be used to better forecast portfolio risk. Notably, these methods overcome the curse of dimensionality that limits the applicability of traditional econometric tools to portfolios made of a large number of assets. We report on forecasting performances and statistical significance of both methods for two different equity datasets. We also identify an optimal region of parameters in terms of True Positive and False Positive trade-off, through a ROC curve analysis. We find that this forecasting method is robust and it outperforms logistic regression predictors based on past volatility only. Moreover the temporal analysis indicates that methods based on correlation structural persistence are able to adapt to abrupt changes in the market, such as financial crises, more rapidly than methods based on past volatility. PMID:27857144

  2. Application of Super-Resolution Convolutional Neural Network for Enhancing Image Resolution in Chest CT.

    PubMed

    Umehara, Kensuke; Ota, Junko; Ishida, Takayuki

    2017-10-18

    In this study, the super-resolution convolutional neural network (SRCNN) scheme, which is the emerging deep-learning-based super-resolution method for enhancing image resolution in chest CT images, was applied and evaluated using the post-processing approach. For evaluation, 89 chest CT cases were sampled from The Cancer Imaging Archive. The 89 CT cases were divided randomly into 45 training cases and 44 external test cases. The SRCNN was trained using the training dataset. With the trained SRCNN, a high-resolution image was reconstructed from a low-resolution image, which was down-sampled from an original test image. For quantitative evaluation, two image quality metrics were measured and compared to those of the conventional linear interpolation methods. The image restoration quality of the SRCNN scheme was significantly higher than that of the linear interpolation methods (p < 0.001 or p < 0.05). The high-resolution image reconstructed by the SRCNN scheme was highly restored and comparable to the original reference image, in particular, for a ×2 magnification. These results indicate that the SRCNN scheme significantly outperforms the linear interpolation methods for enhancing image resolution in chest CT images. The results also suggest that SRCNN may become a potential solution for generating high-resolution CT images from standard CT images.

  3. Probe-specific mixed-model approach to detect copy number differences using multiplex ligation-dependent probe amplification (MLPA)

    PubMed Central

    González, Juan R; Carrasco, Josep L; Armengol, Lluís; Villatoro, Sergi; Jover, Lluís; Yasui, Yutaka; Estivill, Xavier

    2008-01-01

    Background MLPA method is a potentially useful semi-quantitative method to detect copy number alterations in targeted regions. In this paper, we propose a method for the normalization procedure based on a non-linear mixed-model, as well as a new approach for determining the statistical significance of altered probes based on linear mixed-model. This method establishes a threshold by using different tolerance intervals that accommodates the specific random error variability observed in each test sample. Results Through simulation studies we have shown that our proposed method outperforms two existing methods that are based on simple threshold rules or iterative regression. We have illustrated the method using a controlled MLPA assay in which targeted regions are variable in copy number in individuals suffering from different disorders such as Prader-Willi, DiGeorge or Autism showing the best performace. Conclusion Using the proposed mixed-model, we are able to determine thresholds to decide whether a region is altered. These threholds are specific for each individual, incorporating experimental variability, resulting in improved sensitivity and specificity as the examples with real data have revealed. PMID:18522760

  4. Intensity and Compactness Enabled Saliency Estimation for Leakage Detection in Diabetic and Malarial Retinopathy.

    PubMed

    Zhao, Yitian; Zheng, Yalin; Liu, Yonghuai; Yang, Jian; Zhao, Yifan; Chen, Duanduan; Wang, Yongtian

    2017-01-01

    Leakage in retinal angiography currently is a key feature for confirming the activities of lesions in the management of a wide range of retinal diseases, such as diabetic maculopathy and paediatric malarial retinopathy. This paper proposes a new saliency-based method for the detection of leakage in fluorescein angiography. A superpixel approach is firstly employed to divide the image into meaningful patches (or superpixels) at different levels. Two saliency cues, intensity and compactness, are then proposed for the estimation of the saliency map of each individual superpixel at each level. The saliency maps at different levels over the same cues are fused using an averaging operator. The two saliency maps over different cues are fused using a pixel-wise multiplication operator. Leaking regions are finally detected by thresholding the saliency map followed by a graph-cut segmentation. The proposed method has been validated using the only two publicly available datasets: one for malarial retinopathy and the other for diabetic retinopathy. The experimental results show that it outperforms one of the latest competitors and performs as well as a human expert for leakage detection and outperforms several state-of-the-art methods for saliency detection.

  5. Maximum-likelihood techniques for joint segmentation-classification of multispectral chromosome images.

    PubMed

    Schwartzkopf, Wade C; Bovik, Alan C; Evans, Brian L

    2005-12-01

    Traditional chromosome imaging has been limited to grayscale images, but recently a 5-fluorophore combinatorial labeling technique (M-FISH) was developed wherein each class of chromosomes binds with a different combination of fluorophores. This results in a multispectral image, where each class of chromosomes has distinct spectral components. In this paper, we develop new methods for automatic chromosome identification by exploiting the multispectral information in M-FISH chromosome images and by jointly performing chromosome segmentation and classification. We (1) develop a maximum-likelihood hypothesis test that uses multispectral information, together with conventional criteria, to select the best segmentation possibility; (2) use this likelihood function to combine chromosome segmentation and classification into a robust chromosome identification system; and (3) show that the proposed likelihood function can also be used as a reliable indicator of errors in segmentation, errors in classification, and chromosome anomalies, which can be indicators of radiation damage, cancer, and a wide variety of inherited diseases. We show that the proposed multispectral joint segmentation-classification method outperforms past grayscale segmentation methods when decomposing touching chromosomes. We also show that it outperforms past M-FISH classification techniques that do not use segmentation information.

  6. Summarizing techniques that combine three non-parametric scores to detect disease-associated 2-way SNP-SNP interactions.

    PubMed

    Sengupta Chattopadhyay, Amrita; Hsiao, Ching-Lin; Chang, Chien Ching; Lian, Ie-Bin; Fann, Cathy S J

    2014-01-01

    Identifying susceptibility genes that influence complex diseases is extremely difficult because loci often influence the disease state through genetic interactions. Numerous approaches to detect disease-associated SNP-SNP interactions have been developed, but none consistently generates high-quality results under different disease scenarios. Using summarizing techniques to combine a number of existing methods may provide a solution to this problem. Here we used three popular non-parametric methods-Gini, absolute probability difference (APD), and entropy-to develop two novel summary scores, namely principle component score (PCS) and Z-sum score (ZSS), with which to predict disease-associated genetic interactions. We used a simulation study to compare performance of the non-parametric scores, the summary scores, the scaled-sum score (SSS; used in polymorphism interaction analysis (PIA)), and the multifactor dimensionality reduction (MDR). The non-parametric methods achieved high power, but no non-parametric method outperformed all others under a variety of epistatic scenarios. PCS and ZSS, however, outperformed MDR. PCS, ZSS and SSS displayed controlled type-I-errors (<0.05) compared to GS, APDS, ES (>0.05). A real data study using the genetic-analysis-workshop 16 (GAW 16) rheumatoid arthritis dataset identified a number of interesting SNP-SNP interactions. © 2013 Elsevier B.V. All rights reserved.

  7. Integrative Analysis of High-throughput Cancer Studies with Contrasted Penalization

    PubMed Central

    Shi, Xingjie; Liu, Jin; Huang, Jian; Zhou, Yong; Shia, BenChang; Ma, Shuangge

    2015-01-01

    In cancer studies with high-throughput genetic and genomic measurements, integrative analysis provides a way to effectively pool and analyze heterogeneous raw data from multiple independent studies and outperforms “classic” meta-analysis and single-dataset analysis. When marker selection is of interest, the genetic basis of multiple datasets can be described using the homogeneity model or the heterogeneity model. In this study, we consider marker selection under the heterogeneity model, which includes the homogeneity model as a special case and can be more flexible. Penalization methods have been developed in the literature for marker selection. This study advances from the published ones by introducing the contrast penalties, which can accommodate the within- and across-dataset structures of covariates/regression coefficients and, by doing so, further improve marker selection performance. Specifically, we develop a penalization method that accommodates the across-dataset structures by smoothing over regression coefficients. An effective iterative algorithm, which calls an inner coordinate descent iteration, is developed. Simulation shows that the proposed method outperforms the benchmark with more accurate marker identification. The analysis of breast cancer and lung cancer prognosis studies with gene expression measurements shows that the proposed method identifies genes different from those using the benchmark and has better prediction performance. PMID:24395534

  8. Human Splice-Site Prediction with Deep Neural Networks.

    PubMed

    Naito, Tatsuhiko

    2018-04-18

    Accurate splice-site prediction is essential to delineate gene structures from sequence data. Several computational techniques have been applied to create a system to predict canonical splice sites. For classification tasks, deep neural networks (DNNs) have achieved record-breaking results and often outperformed other supervised learning techniques. In this study, a new method of splice-site prediction using DNNs was proposed. The proposed system receives an input sequence data and returns an answer as to whether it is splice site. The length of input is 140 nucleotides, with the consensus sequence (i.e., "GT" and "AG" for the donor and acceptor sites, respectively) in the middle. Each input sequence model is applied to the pretrained DNN model that determines the probability that an input is a splice site. The model consists of convolutional layers and bidirectional long short-term memory network layers. The pretraining and validation were conducted using the data set tested in previously reported methods. The performance evaluation results showed that the proposed method can outperform the previous methods. In addition, the pattern learned by the DNNs was visualized as position frequency matrices (PFMs). Some of PFMs were very similar to the consensus sequence. The trained DNN model and the brief source code for the prediction system are uploaded. Further improvement will be achieved following the further development of DNNs.

  9. Filter bank canonical correlation analysis for implementing a high-speed SSVEP-based brain-computer interface

    NASA Astrophysics Data System (ADS)

    Chen, Xiaogang; Wang, Yijun; Gao, Shangkai; Jung, Tzyy-Ping; Gao, Xiaorong

    2015-08-01

    Objective. Recently, canonical correlation analysis (CCA) has been widely used in steady-state visual evoked potential (SSVEP)-based brain-computer interfaces (BCIs) due to its high efficiency, robustness, and simple implementation. However, a method with which to make use of harmonic SSVEP components to enhance the CCA-based frequency detection has not been well established. Approach. This study proposed a filter bank canonical correlation analysis (FBCCA) method to incorporate fundamental and harmonic frequency components to improve the detection of SSVEPs. A 40-target BCI speller based on frequency coding (frequency range: 8-15.8 Hz, frequency interval: 0.2 Hz) was used for performance evaluation. To optimize the filter bank design, three methods (M1: sub-bands with equally spaced bandwidths; M2: sub-bands corresponding to individual harmonic frequency bands; M3: sub-bands covering multiple harmonic frequency bands) were proposed for comparison. Classification accuracy and information transfer rate (ITR) of the three FBCCA methods and the standard CCA method were estimated using an offline dataset from 12 subjects. Furthermore, an online BCI speller adopting the optimal FBCCA method was tested with a group of 10 subjects. Main results. The FBCCA methods significantly outperformed the standard CCA method. The method M3 achieved the highest classification performance. At a spelling rate of ˜33.3 characters/min, the online BCI speller obtained an average ITR of 151.18 ± 20.34 bits min-1. Significance. By incorporating the fundamental and harmonic SSVEP components in target identification, the proposed FBCCA method significantly improves the performance of the SSVEP-based BCI, and thereby facilitates its practical applications such as high-speed spelling.

  10. Single-Trial Classification of Multi-User P300-Based Brain-Computer Interface Using Riemannian Geometry.

    PubMed

    Korczowski, L; Congedo, M; Jutten, C

    2015-08-01

    The classification of electroencephalographic (EEG) data recorded from multiple users simultaneously is an important challenge in the field of Brain-Computer Interface (BCI). In this paper we compare different approaches for classification of single-trials Event-Related Potential (ERP) on two subjects playing a collaborative BCI game. The minimum distance to mean (MDM) classifier in a Riemannian framework is extended to use the diversity of the inter-subjects spatio-temporal statistics (MDM-hyper) or to merge multiple classifiers (MDM-multi). We show that both these classifiers outperform significantly the mean performance of the two users and analogous classifiers based on the step-wise linear discriminant analysis. More importantly, the MDM-multi outperforms the performance of the best player within the pair.

  11. Efficient biprediction decision scheme for fast high efficiency video coding encoding

    NASA Astrophysics Data System (ADS)

    Park, Sang-hyo; Lee, Seung-ho; Jang, Euee S.; Jun, Dongsan; Kang, Jung-Won

    2016-11-01

    An efficient biprediction decision scheme of high efficiency video coding (HEVC) is proposed for fast-encoding applications. For low-delay video applications, bidirectional prediction can be used to increase compression performance efficiently with previous reference frames. However, at the same time, the computational complexity of the HEVC encoder is significantly increased due to the additional biprediction search. Although a some research has attempted to reduce this complexity, whether the prediction is strongly related to both motion complexity and prediction modes in a coding unit has not yet been investigated. A method that avoids most compression-inefficient search points is proposed so that the computational complexity of the motion estimation process can be dramatically decreased. To determine if biprediction is critical, the proposed method exploits the stochastic correlation of the context of prediction units (PUs): the direction of a PU and the accuracy of a motion vector. Through experimental results, the proposed method showed that the time complexity of biprediction can be reduced to 30% on average, outperforming existing methods in view of encoding time, number of function calls, and memory access.

  12. Cyclic Solvent Vapor Annealing for Rapid, Robust Vertical Orientation of Features in BCP Thin Films

    NASA Astrophysics Data System (ADS)

    Paradiso, Sean; Delaney, Kris; Fredrickson, Glenn

    2015-03-01

    Methods for reliably controlling block copolymer self assembly have seen much attention over the past decade as new applications for nanostructured thin films emerge in the fields of nanopatterning and lithography. While solvent assisted annealing techniques are established as flexible and simple methods for achieving long range order, solvent annealing alone exhibits a very weak thermodynamic driving force for vertically orienting domains with respect to the free surface. To address the desire for oriented features, we have investigated a cyclic solvent vapor annealing (CSVA) approach that combines the mobility benefits of solvent annealing with selective stress experienced by structures oriented parallel to the free surface as the film is repeatedly swollen with solvent and dried. Using dynamical self-consistent field theory (DSCFT) calculations, we establish the conditions under which the method significantly outperforms both static and cyclic thermal annealing and implicate the orientation selection as a consequence of the swelling/deswelling process. Our results suggest that CSVA may prove to be a potent method for the rapid formation of highly ordered, vertically oriented features in block copolymer thin films.

  13. A loop-counting method for covariate-corrected low-rank biclustering of gene-expression and genome-wide association study data.

    PubMed

    Rangan, Aaditya V; McGrouther, Caroline C; Kelsoe, John; Schork, Nicholas; Stahl, Eli; Zhu, Qian; Krishnan, Arjun; Yao, Vicky; Troyanskaya, Olga; Bilaloglu, Seda; Raghavan, Preeti; Bergen, Sarah; Jureus, Anders; Landen, Mikael

    2018-05-14

    A common goal in data-analysis is to sift through a large data-matrix and detect any significant submatrices (i.e., biclusters) that have a low numerical rank. We present a simple algorithm for tackling this biclustering problem. Our algorithm accumulates information about 2-by-2 submatrices (i.e., 'loops') within the data-matrix, and focuses on rows and columns of the data-matrix that participate in an abundance of low-rank loops. We demonstrate, through analysis and numerical-experiments, that this loop-counting method performs well in a variety of scenarios, outperforming simple spectral methods in many situations of interest. Another important feature of our method is that it can easily be modified to account for aspects of experimental design which commonly arise in practice. For example, our algorithm can be modified to correct for controls, categorical- and continuous-covariates, as well as sparsity within the data. We demonstrate these practical features with two examples; the first drawn from gene-expression analysis and the second drawn from a much larger genome-wide-association-study (GWAS).

  14. A Nonlinear Framework of Delayed Particle Smoothing Method for Vehicle Localization under Non-Gaussian Environment.

    PubMed

    Xiao, Zhu; Havyarimana, Vincent; Li, Tong; Wang, Dong

    2016-05-13

    In this paper, a novel nonlinear framework of smoothing method, non-Gaussian delayed particle smoother (nGDPS), is proposed, which enables vehicle state estimation (VSE) with high accuracy taking into account the non-Gaussianity of the measurement and process noises. Within the proposed method, the multivariate Student's t-distribution is adopted in order to compute the probability distribution function (PDF) related to the process and measurement noises, which are assumed to be non-Gaussian distributed. A computation approach based on Ensemble Kalman Filter (EnKF) is designed to cope with the mean and the covariance matrix of the proposal non-Gaussian distribution. A delayed Gibbs sampling algorithm, which incorporates smoothing of the sampled trajectories over a fixed-delay, is proposed to deal with the sample degeneracy of particles. The performance is investigated based on the real-world data, which is collected by low-cost on-board vehicle sensors. The comparison study based on the real-world experiments and the statistical analysis demonstrates that the proposed nGDPS has significant improvement on the vehicle state accuracy and outperforms the existing filtering and smoothing methods.

  15. Uncovering the essential links in online commercial networks

    NASA Astrophysics Data System (ADS)

    Zeng, Wei; Fang, Meiling; Shao, Junming; Shang, Mingsheng

    2016-09-01

    Recommender systems are designed to effectively support individuals' decision-making process on various web sites. It can be naturally represented by a user-object bipartite network, where a link indicates that a user has collected an object. Recently, research on the information backbone has attracted researchers' interests, which is a sub-network with fewer nodes and links but carrying most of the relevant information. With the backbone, a system can generate satisfactory recommenda- tions while saving much computing resource. In this paper, we propose an enhanced topology-aware method to extract the information backbone in the bipartite network mainly based on the information of neighboring users and objects. Our backbone extraction method enables the recommender systems achieve more than 90% of the accuracy of the top-L recommendation, however, consuming only 20% links. The experimental results show that our method outperforms the alternative backbone extraction methods. Moreover, the structure of the information backbone is studied in detail. Finally, we highlight that the information backbone is one of the most important properties of the bipartite network, with which one can significantly improve the efficiency of the recommender system.

  16. Protein-RNA interface residue prediction using machine learning: an assessment of the state of the art.

    PubMed

    Walia, Rasna R; Caragea, Cornelia; Lewis, Benjamin A; Towfic, Fadi; Terribilini, Michael; El-Manzalawy, Yasser; Dobbs, Drena; Honavar, Vasant

    2012-05-10

    RNA molecules play diverse functional and structural roles in cells. They function as messengers for transferring genetic information from DNA to proteins, as the primary genetic material in many viruses, as catalysts (ribozymes) important for protein synthesis and RNA processing, and as essential and ubiquitous regulators of gene expression in living organisms. Many of these functions depend on precisely orchestrated interactions between RNA molecules and specific proteins in cells. Understanding the molecular mechanisms by which proteins recognize and bind RNA is essential for comprehending the functional implications of these interactions, but the recognition 'code' that mediates interactions between proteins and RNA is not yet understood. Success in deciphering this code would dramatically impact the development of new therapeutic strategies for intervening in devastating diseases such as AIDS and cancer. Because of the high cost of experimental determination of protein-RNA interfaces, there is an increasing reliance on statistical machine learning methods for training predictors of RNA-binding residues in proteins. However, because of differences in the choice of datasets, performance measures, and data representations used, it has been difficult to obtain an accurate assessment of the current state of the art in protein-RNA interface prediction. We provide a review of published approaches for predicting RNA-binding residues in proteins and a systematic comparison and critical assessment of protein-RNA interface residue predictors trained using these approaches on three carefully curated non-redundant datasets. We directly compare two widely used machine learning algorithms (Naïve Bayes (NB) and Support Vector Machine (SVM)) using three different data representations in which features are encoded using either sequence- or structure-based windows. Our results show that (i) Sequence-based classifiers that use a position-specific scoring matrix (PSSM)-based representation (PSSMSeq) outperform those that use an amino acid identity based representation (IDSeq) or a smoothed PSSM (SmoPSSMSeq); (ii) Structure-based classifiers that use smoothed PSSM representation (SmoPSSMStr) outperform those that use PSSM (PSSMStr) as well as sequence identity based representation (IDStr). PSSMSeq classifiers, when tested on an independent test set of 44 proteins, achieve performance that is comparable to that of three state-of-the-art structure-based predictors (including those that exploit geometric features) in terms of Matthews Correlation Coefficient (MCC), although the structure-based methods achieve substantially higher Specificity (albeit at the expense of Sensitivity) compared to sequence-based methods. We also find that the expected performance of the classifiers on a residue level can be markedly different from that on a protein level. Our experiments show that the classifiers trained on three different non-redundant protein-RNA interface datasets achieve comparable cross-validation performance. However, we find that the results are significantly affected by differences in the distance threshold used to define interface residues. Our results demonstrate that protein-RNA interface residue predictors that use a PSSM-based encoding of sequence windows outperform classifiers that use other encodings of sequence windows. While structure-based methods that exploit geometric features can yield significant increases in the Specificity of protein-RNA interface residue predictions, such increases are offset by decreases in Sensitivity. These results underscore the importance of comparing alternative methods using rigorous statistical procedures, multiple performance measures, and datasets that are constructed based on several alternative definitions of interface residues and redundancy cutoffs as well as including evaluations on independent test sets into the comparisons.

  17. Comparative evaluation of reverse engineering gene regulatory networks with relevance networks, graphical gaussian models and bayesian networks.

    PubMed

    Werhli, Adriano V; Grzegorczyk, Marco; Husmeier, Dirk

    2006-10-15

    An important problem in systems biology is the inference of biochemical pathways and regulatory networks from postgenomic data. Various reverse engineering methods have been proposed in the literature, and it is important to understand their relative merits and shortcomings. In the present paper, we compare the accuracy of reconstructing gene regulatory networks with three different modelling and inference paradigms: (1) Relevance networks (RNs): pairwise association scores independent of the remaining network; (2) graphical Gaussian models (GGMs): undirected graphical models with constraint-based inference, and (3) Bayesian networks (BNs): directed graphical models with score-based inference. The evaluation is carried out on the Raf pathway, a cellular signalling network describing the interaction of 11 phosphorylated proteins and phospholipids in human immune system cells. We use both laboratory data from cytometry experiments as well as data simulated from the gold-standard network. We also compare passive observations with active interventions. On Gaussian observational data, BNs and GGMs were found to outperform RNs. The difference in performance was not significant for the non-linear simulated data and the cytoflow data, though. Also, we did not observe a significant difference between BNs and GGMs on observational data in general. However, for interventional data, BNs outperform GGMs and RNs, especially when taking the edge directions rather than just the skeletons of the graphs into account. This suggests that the higher computational costs of inference with BNs over GGMs and RNs are not justified when using only passive observations, but that active interventions in the form of gene knockouts and over-expressions are required to exploit the full potential of BNs. Data, software and supplementary material are available from http://www.bioss.sari.ac.uk/staff/adriano/research.html

  18. Predictive and External Validity of a Pre-Market Study to Determine the Most Effective Pictorial Health Warning Label Content for Cigarette Packages

    PubMed Central

    Thrasher, James F.; Reid, Jessica L.; Hammond, David

    2016-01-01

    Abstract Introduction: Studies examining cigarette package pictorial health warning label (HWL) content have primarily used designs that do not allow determination of effectiveness after repeated, naturalistic exposure. This research aimed to determine the predictive and external validity of a pre-market evaluation study of pictorial HWLs. Methods: Data were analyzed from: (1) a pre-market convenience sample of 544 adult smokers who participated in field experiments in Mexico City before pictorial HWL implementation (September 2010); and (2) a post-market population-based representative sample of 1765 adult smokers in the Mexican administration of the International Tobacco Control Policy Evaluation Survey after pictorial HWL implementation. Participants in both samples rated six HWLs that appeared on cigarette packs, and also ranked HWLs with four different themes. Mixed effects models were estimated for each sample to assess ratings of relative effectiveness for the six HWLs, and to assess which HWL themes were ranked as the most effective. Results: Pre- and post-market data showed similar relative ratings across the six HWLs, with the least and most effective HWLs consistently differentiated from other HWLs. Models predicting rankings of HWL themes in post-market sample indicated: (1) pictorial HWLs were ranked as more effective than text-only HWLs; (2) HWLs with both graphic and “lived experience” content outperformed symbolic content; and, (3) testimonial content significantly outperformed didactic content. Pre-market data showed a similar pattern of results, but with fewer statistically significant findings. Conclusions: The study suggests well-designed pre-market studies can have predictive and external validity, helping regulators select HWL content. PMID:26377516

  19. Fast and robust multimodal image registration using a local derivative pattern.

    PubMed

    Jiang, Dongsheng; Shi, Yonghong; Chen, Xinrong; Wang, Manning; Song, Zhijian

    2017-02-01

    Deformable multimodal image registration, which can benefit radiotherapy and image guided surgery by providing complementary information, remains a challenging task in the medical image analysis field due to the difficulty of defining a proper similarity measure. This article presents a novel, robust and fast binary descriptor, the discriminative local derivative pattern (dLDP), which is able to encode images of different modalities into similar image representations. dLDP calculates a binary string for each voxel according to the pattern of intensity derivatives in its neighborhood. The descriptor similarity is evaluated using the Hamming distance, which can be efficiently computed, instead of conventional L1 or L2 norms. For the first time, we validated the effectiveness and feasibility of the local derivative pattern for multimodal deformable image registration with several multi-modal registration applications. dLDP was compared with three state-of-the-art methods in artificial image and clinical settings. In the experiments of deformable registration between different magnetic resonance imaging (MRI) modalities from BrainWeb, between computed tomography and MRI images from patient data, and between MRI and ultrasound images from BITE database, we show our method outperforms localized mutual information and entropy images in terms of both accuracy and time efficiency. We have further validated dLDP for the deformable registration of preoperative MRI and three-dimensional intraoperative ultrasound images. Our results indicate that dLDP reduces the average mean target registration error from 4.12 mm to 2.30 mm. This accuracy is statistically equivalent to the accuracy of the state-of-the-art methods in the study; however, in terms of computational complexity, our method significantly outperforms other methods and is even comparable to the sum of the absolute difference. The results reveal that dLDP can achieve superior performance regarding both accuracy and time efficiency in general multimodal image registration. In addition, dLDP also indicates the potential for clinical ultrasound guided intervention. © 2016 The Authors. Medical Physics published by Wiley Periodicals, Inc. on behalf of American Association of Physicists in Medicine.

  20. An automated framework for NMR resonance assignment through simultaneous slice picking and spin system forming.

    PubMed

    Abbas, Ahmed; Guo, Xianrong; Jing, Bing-Yi; Gao, Xin

    2014-06-01

    Despite significant advances in automated nuclear magnetic resonance-based protein structure determination, the high numbers of false positives and false negatives among the peaks selected by fully automated methods remain a problem. These false positives and negatives impair the performance of resonance assignment methods. One of the main reasons for this problem is that the computational research community often considers peak picking and resonance assignment to be two separate problems, whereas spectroscopists use expert knowledge to pick peaks and assign their resonances at the same time. We propose a novel framework that simultaneously conducts slice picking and spin system forming, an essential step in resonance assignment. Our framework then employs a genetic algorithm, directed by both connectivity information and amino acid typing information from the spin systems, to assign the spin systems to residues. The inputs to our framework can be as few as two commonly used spectra, i.e., CBCA(CO)NH and HNCACB. Different from the existing peak picking and resonance assignment methods that treat peaks as the units, our method is based on 'slices', which are one-dimensional vectors in three-dimensional spectra that correspond to certain ([Formula: see text]) values. Experimental results on both benchmark simulated data sets and four real protein data sets demonstrate that our method significantly outperforms the state-of-the-art methods while using a less number of spectra than those methods. Our method is freely available at http://sfb.kaust.edu.sa/Pages/Software.aspx.

  1. Phylogenomics of plant genomes: a methodology for genome-wide searches for orthologs in plants

    PubMed Central

    Conte, Matthieu G; Gaillard, Sylvain; Droc, Gaetan; Perin, Christophe

    2008-01-01

    Background Gene ortholog identification is now a major objective for mining the increasing amount of sequence data generated by complete or partial genome sequencing projects. Comparative and functional genomics urgently need a method for ortholog detection to reduce gene function inference and to aid in the identification of conserved or divergent genetic pathways between several species. As gene functions change during evolution, reconstructing the evolutionary history of genes should be a more accurate way to differentiate orthologs from paralogs. Phylogenomics takes into account phylogenetic information from high-throughput genome annotation and is the most straightforward way to infer orthologs. However, procedures for automatic detection of orthologs are still scarce and suffer from several limitations. Results We developed a procedure for ortholog prediction between Oryza sativa and Arabidopsis thaliana. Firstly, we established an efficient method to cluster A. thaliana and O. sativa full proteomes into gene families. Then, we developed an optimized phylogenomics pipeline for ortholog inference. We validated the full procedure using test sets of orthologs and paralogs to demonstrate that our method outperforms pairwise methods for ortholog predictions. Conclusion Our procedure achieved a high level of accuracy in predicting ortholog and paralog relationships. Phylogenomic predictions for all validated gene families in both species were easily achieved and we can conclude that our methodology outperforms similarly based methods. PMID:18426584

  2. WE-FG-207B-05: Iterative Reconstruction Via Prior Image Constrained Total Generalized Variation for Spectral CT

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Niu, S; Zhang, Y; Ma, J

    Purpose: To investigate iterative reconstruction via prior image constrained total generalized variation (PICTGV) for spectral computed tomography (CT) using fewer projections while achieving greater image quality. Methods: The proposed PICTGV method is formulated as an optimization problem, which balances the data fidelity and prior image constrained total generalized variation of reconstructed images in one framework. The PICTGV method is based on structure correlations among images in the energy domain and high-quality images to guide the reconstruction of energy-specific images. In PICTGV method, the high-quality image is reconstructed from all detector-collected X-ray signals and is referred as the broad-spectrum image. Distinctmore » from the existing reconstruction methods applied on the images with first order derivative, the higher order derivative of the images is incorporated into the PICTGV method. An alternating optimization algorithm is used to minimize the PICTGV objective function. We evaluate the performance of PICTGV on noise and artifacts suppressing using phantom studies and compare the method with the conventional filtered back-projection method as well as TGV based method without prior image. Results: On the digital phantom, the proposed method outperforms the existing TGV method in terms of the noise reduction, artifacts suppression, and edge detail preservation. Compared to that obtained by the TGV based method without prior image, the relative root mean square error in the images reconstructed by the proposed method is reduced by over 20%. Conclusion: The authors propose an iterative reconstruction via prior image constrained total generalize variation for spectral CT. Also, we have developed an alternating optimization algorithm and numerically demonstrated the merits of our approach. Results show that the proposed PICTGV method outperforms the TGV method for spectral CT.« less

  3. Electroencephalogram-based decoding cognitive states using convolutional neural network and likelihood ratio based score fusion

    PubMed Central

    2017-01-01

    Electroencephalogram (EEG)-based decoding human brain activity is challenging, owing to the low spatial resolution of EEG. However, EEG is an important technique, especially for brain–computer interface applications. In this study, a novel algorithm is proposed to decode brain activity associated with different types of images. In this hybrid algorithm, convolutional neural network is modified for the extraction of features, a t-test is used for the selection of significant features and likelihood ratio-based score fusion is used for the prediction of brain activity. The proposed algorithm takes input data from multichannel EEG time-series, which is also known as multivariate pattern analysis. Comprehensive analysis was conducted using data from 30 participants. The results from the proposed method are compared with current recognized feature extraction and classification/prediction techniques. The wavelet transform-support vector machine method is the most popular currently used feature extraction and prediction method. This method showed an accuracy of 65.7%. However, the proposed method predicts the novel data with improved accuracy of 79.9%. In conclusion, the proposed algorithm outperformed the current feature extraction and prediction method. PMID:28558002

  4. Exact and Metaheuristic Approaches for a Bi-Objective School Bus Scheduling Problem

    PubMed Central

    Chen, Xiaopan; Kong, Yunfeng; Dang, Lanxue; Hou, Yane; Ye, Xinyue

    2015-01-01

    As a class of hard combinatorial optimization problems, the school bus routing problem has received considerable attention in the last decades. For a multi-school system, given the bus trips for each school, the school bus scheduling problem aims at optimizing bus schedules to serve all the trips within the school time windows. In this paper, we propose two approaches for solving the bi-objective school bus scheduling problem: an exact method of mixed integer programming (MIP) and a metaheuristic method which combines simulated annealing with local search. We develop MIP formulations for homogenous and heterogeneous fleet problems respectively and solve the models by MIP solver CPLEX. The bus type-based formulation for heterogeneous fleet problem reduces the model complexity in terms of the number of decision variables and constraints. The metaheuristic method is a two-stage framework for minimizing the number of buses to be used as well as the total travel distance of buses. We evaluate the proposed MIP and the metaheuristic method on two benchmark datasets, showing that on both instances, our metaheuristic method significantly outperforms the respective state-of-the-art methods. PMID:26176764

  5. Efficient calibration for imperfect computer models

    DOE PAGES

    Tuo, Rui; Wu, C. F. Jeff

    2015-12-01

    Many computer models contain unknown parameters which need to be estimated using physical observations. Furthermore, the calibration method based on Gaussian process models may lead to unreasonable estimate for imperfect computer models. In this work, we extend their study to calibration problems with stochastic physical data. We propose a novel method, called the L 2 calibration, and show its semiparametric efficiency. The conventional method of the ordinary least squares is also studied. Theoretical analysis shows that it is consistent but not efficient. Here, numerical examples show that the proposed method outperforms the existing ones.

  6. Foliar and woody materials discriminated using terrestrial LiDAR in a mixed natural forest

    NASA Astrophysics Data System (ADS)

    Zhu, Xi; Skidmore, Andrew K.; Darvishzadeh, Roshanak; Niemann, K. Olaf; Liu, Jing; Shi, Yifang; Wang, Tiejun

    2018-02-01

    Separation of foliar and woody materials using remotely sensed data is crucial for the accurate estimation of leaf area index (LAI) and woody biomass across forest stands. In this paper, we present a new method to accurately separate foliar and woody materials using terrestrial LiDAR point clouds obtained from ten test sites in a mixed forest in Bavarian Forest National Park, Germany. Firstly, we applied and compared an adaptive radius near-neighbor search algorithm with a fixed radius near-neighbor search method in order to obtain both radiometric and geometric features derived from terrestrial LiDAR point clouds. Secondly, we used a random forest machine learning algorithm to classify foliar and woody materials and examined the impact of understory and slope on the classification accuracy. An average overall accuracy of 84.4% (Kappa = 0.75) was achieved across all experimental plots. The adaptive radius near-neighbor search method outperformed the fixed radius near-neighbor search method. The classification accuracy was significantly higher when the combination of both radiometric and geometric features was utilized. The analysis showed that increasing slope and understory coverage had a significant negative effect on the overall classification accuracy. Our results suggest that the utilization of the adaptive radius near-neighbor search method coupling both radiometric and geometric features has the potential to accurately discriminate foliar and woody materials from terrestrial LiDAR data in a mixed natural forest.

  7. Post-boosting of classification boundary for imbalanced data using geometric mean.

    PubMed

    Du, Jie; Vong, Chi-Man; Pun, Chi-Man; Wong, Pak-Kin; Ip, Weng-Fai

    2017-12-01

    In this paper, a novel imbalance learning method for binary classes is proposed, named as Post-Boosting of classification boundary for Imbalanced data (PBI), which can significantly improve the performance of any trained neural networks (NN) classification boundary. The procedure of PBI simply consists of two steps: an (imbalanced) NN learning method is first applied to produce a classification boundary, which is then adjusted by PBI under the geometric mean (G-mean). For data imbalance, the geometric mean of the accuracies of both minority and majority classes is considered, that is statistically more suitable than the common metric accuracy. PBI also has the following advantages over traditional imbalance methods: (i) PBI can significantly improve the classification accuracy on minority class while improving or keeping that on majority class as well; (ii) PBI is suitable for large data even with high imbalance ratio (up to 0.001). For evaluation of (i), a new metric called Majority loss/Minority advance ratio (MMR) is proposed that evaluates the loss ratio of majority class to minority class. Experiments have been conducted for PBI and several imbalance learning methods over benchmark datasets of different sizes, different imbalance ratios, and different dimensionalities. By analyzing the experimental results, PBI is shown to outperform other imbalance learning methods on almost all datasets. Copyright © 2017 Elsevier Ltd. All rights reserved.

  8. The Effects of School Culture and Climate on Student Achievement

    ERIC Educational Resources Information Center

    MacNeil, Angus J.; Prater, Doris L.; Busch, Steve

    2009-01-01

    The purpose of the study was to investigate whether Exemplary, Recognized and Acceptable schools differ in their school climates, as measured by the 10 dimensions of the Organizational Health Inventory. Significant differences were found on all 10 dimensions of the Organizational Health Inventory, with Exemplary schools out-performing Acceptable…

  9. A Cross-National Study of Calculus

    ERIC Educational Resources Information Center

    Chai, Jun; Friedler, Louis M.; Wolff, Edward F.; Li, Jun; Rhea, Karen

    2015-01-01

    The results from a cross-national study comparing calculus performance of students at East China Normal University (ECNU) in Shanghai and students at the University of Michigan before and after their first university calculus course are presented. Overall, ECNU significantly outperformed Michigan on both the pre- and post-tests, but the Michigan…

  10. Enhancing Motivation and Acquisition of Coordinate Concepts by Using Concept Trees.

    ERIC Educational Resources Information Center

    Hirumi, Atsusi; Bowers, Dennis R.

    1991-01-01

    Examines the effects of providing undergraduate learners with graphic illustrations of coordinate concept relationships to supplement text-based instruction. Half of those reading a specific passage received a graphic concept tree. That group outperformed those who did not, reporting significantly higher amounts of attenuation, confidence, and…

  11. Effects of Computer Algebra System (CAS) with Metacognitive Training on Mathematical Reasoning.

    ERIC Educational Resources Information Center

    Kramarski, Bracha; Hirsch, Chaya

    2003-01-01

    Describes a study that investigated the differential effects of Computer Algebra Systems (CAS) and metacognitive training (META) on mathematical reasoning. Participants were 83 Israeli eighth-grade students. Results showed that CAS embedded within META significantly outperformed the META and CAS alone conditions, which in turn significantly…

  12. A Comparison of Two Phonological Awareness Techniques between Samples of Preschool Children.

    ERIC Educational Resources Information Center

    Maslanka, Phyllis; Joseph, Laurice M.

    2002-01-01

    Examines the differential effects of sound boxes and sound sort phonological awareness instructional techniques on preschoolers' phonological awareness performance. Finds that children in the sound box group significantly outperformed children in the sound sort group on isolating medial sounds and segmenting phonemes. Reveals that preschool…

  13. Enhancement of Creativity in Computer Environments.

    ERIC Educational Resources Information Center

    Clements, Douglas H.

    1991-01-01

    The effects of the LOGO computer programing environment on creativity were studied for 73 8-year-old third graders (33 males and 40 females) who were tested before and after LOGO instruction. Overall, the LOGO group significantly outperformed a comparison group receiving non-LOGO creativity training and a nontreatment control group. (SLD)

  14. A Comparison of Academic and Athletic Performance in the NCAA

    ERIC Educational Resources Information Center

    Bailey, Sarah; Bhattacharyya, Mouchumi

    2017-01-01

    The Academic Progress Rate (APR) of 34 sports was investigated to determine whether the top athletic teams performed significantly better "academically" compared to their bottom counterparts. A "p" value of 0.0029 revealed that top athletic teams academically outperformed bottom athletic teams. Further analysis showed the…

  15. Development and validation of an objective instrument to measure surgical performance at tonsillectomy.

    PubMed

    Roberson, David W; Kentala, Erna; Forbes, Peter

    2005-12-01

    The goals of this project were 1) to develop and validate an objective instrument to measure surgical performance at tonsillectomy, 2) to assess its interobserver and interobservation reliability and construct validity, and 3) to select those items with best reliability and most independent information to design a simplified form suitable for routine use in otolaryngology surgical evaluation. Prospective, observational data collection for an educational quality improvement project. The evaluation instrument was based on previous instruments developed in general surgery with input from attending otolaryngologic surgeons and experts in medical education. It was pilot tested and subjected to iterative improvements. After the instrument was finalized, a total of 55 tonsillectomies were observed and scored during academic year 2002 to 2003: 45 cases by residents at different points during their rotation, 5 by fellows, and 5 by faculty. Results were assessed for interobserver reliability, interobservation reliability, and construct validity. Factor analysis was used to identify items with independent information. Interobserver and interobservation reliability was high. On technical items, faculty substantially outperformed fellows, who in turn outperformed residents (P < .0001 for both comparisons). On the "global" scale (overall assessment), residents improved an average of 1 full point (on a 5 point scale) during a 3 month rotation (P = .01). In the subscale of "patient care," results were less clear cut: fellows outperformed residents, who in turn outperformed faculty, but only the fellows to faculty comparison was statistically significant (P = .04), and residents did not clearly improve over time (P = .36). Factor analysis demonstrated that technical items and patient care items factor separately and thus represent separate skill domains in surgery. It is possible to objectively measure surgical skill at tonsillectomy with high reliability and good construct validity. Factor analysis demonstrated that patient care is a distinct domain in surgical skill. Although the interobserver reliability for some patient care items reached statistical significance, it was not high enough for "high stakes testing" purposes. Using reliability and factor analysis results, we propose a simplified instrument for use in evaluating trainees in otolaryngologic surgery.

  16. Gene selection for microarray cancer classification using a new evolutionary method employing artificial intelligence concepts.

    PubMed

    Dashtban, M; Balafar, Mohammadali

    2017-03-01

    Gene selection is a demanding task for microarray data analysis. The diverse complexity of different cancers makes this issue still challenging. In this study, a novel evolutionary method based on genetic algorithms and artificial intelligence is proposed to identify predictive genes for cancer classification. A filter method was first applied to reduce the dimensionality of feature space followed by employing an integer-coded genetic algorithm with dynamic-length genotype, intelligent parameter settings, and modified operators. The algorithmic behaviors including convergence trends, mutation and crossover rate changes, and running time were studied, conceptually discussed, and shown to be coherent with literature findings. Two well-known filter methods, Laplacian and Fisher score, were examined considering similarities, the quality of selected genes, and their influences on the evolutionary approach. Several statistical tests concerning choice of classifier, choice of dataset, and choice of filter method were performed, and they revealed some significant differences between the performance of different classifiers and filter methods over datasets. The proposed method was benchmarked upon five popular high-dimensional cancer datasets; for each, top explored genes were reported. Comparing the experimental results with several state-of-the-art methods revealed that the proposed method outperforms previous methods in DLBCL dataset. Copyright © 2017 Elsevier Inc. All rights reserved.

  17. The Recruitment and Retention of Hispanic Undergraduate Students in Public Universities in the United States, 2000-2006

    ERIC Educational Resources Information Center

    Montalvo, Edris J.

    2013-01-01

    In many public U.S. universities, Hispanic undergraduates are underrepresented in terms of enrollment and graduation. This mixed-method geographical study investigated whether some public universities outperform others in recruiting and retaining Hispanic undergraduates. The quantitative findings showed that the effect of financial aid and…

  18. A randomized controlled study to evaluate the role of video-based coaching in training laparoscopic skills.

    PubMed

    Singh, Pritam; Aggarwal, Rajesh; Tahir, Muaaz; Pucher, Philip H; Darzi, Ara

    2015-05-01

    This study evaluates whether video-based coaching can enhance laparoscopic surgical skills performance. Many professions utilize coaching to improve performance. The sports industry employs video analysis to maximize improvement from every performance. Laparoscopic novices were baseline tested and then trained on a validated virtual reality (VR) laparoscopic cholecystectomy (LC) curriculum. After competence, subjects were randomized on a 1:1 ratio and each performed 5 VRLCs. After each LC, intervention group subjects received video-based coaching by a surgeon, utilizing an adaptation of the GROW (Goals, Reality, Options, Wrap-up) coaching model. Control subjects viewed online surgical lectures. All subjects then performed 2 porcine LCs. Performance was assessed by blinded video review using validated global rating scales. Twenty subjects were recruited. No significant differences were observed between groups in baseline performance and in VRLC1. For each subsequent repetition, intervention subjects significantly outperformed controls on all global rating scales. Interventions outperformed controls in porcine LC1 [Global Operative Assessment of Laparoscopic Skills: (20.5 vs 15.5; P = 0.011), Objective Structured Assessment of Technical Skills: (21.5vs 14.5; P = 0.001), and Operative Performance Rating System: (26 vs 19.5; P = 0.001)] and porcine LC2 [Global Operative Assessment of Laparoscopic Skills: (28 vs 17.5; P = 0.005), Objective Structured Assessment of Technical Skills: (30 vs 16.5; P < 0.001), and Operative Performance Rating System: (36 vs 21; P = 0.004)]. Intervention subjects took significantly longer than controls in porcine LC1 (2920 vs 2004 seconds; P = 0.009) and LC2 (2297 vs 1683; P = 0.003). Despite equivalent exposure to practical laparoscopic skills training, video-based coaching enhanced the quality of laparoscopic surgical performance on both VR and porcine LCs, although at the expense of increased time. Video-based coaching is a feasible method of maximizing performance enhancement from every clinical exposure.

  19. New principle for measuring arterial blood oxygenation, enabling motion-robust remote monitoring.

    PubMed

    van Gastel, Mark; Stuijk, Sander; de Haan, Gerard

    2016-12-07

    Finger-oximeters are ubiquitously used for patient monitoring in hospitals worldwide. Recently, remote measurement of arterial blood oxygenation (SpO 2 ) with a camera has been demonstrated. Both contact and remote measurements, however, require the subject to remain static for accurate SpO 2 values. This is due to the use of the common ratio-of-ratios measurement principle that measures the relative pulsatility at different wavelengths. Since the amplitudes are small, they are easily corrupted by motion-induced variations. We introduce a new principle that allows accurate remote measurements even during significant subject motion. We demonstrate the main advantage of the principle, i.e. that the optimal signature remains the same even when the SNR of the PPG signal drops significantly due to motion or limited measurement area. The evaluation uses recordings with breath-holding events, which induce hypoxemia in healthy moving subjects. The events lead to clinically relevant SpO 2 levels in the range 80-100%. The new principle is shown to greatly outperform current remote ratio-of-ratios based methods. The mean-absolute SpO 2 -error (MAE) is about 2 percentage-points during head movements, where the benchmark method shows a MAE of 24 percentage-points. Consequently, we claim ours to be the first method to reliably measure SpO 2 remotely during significant subject motion.

  20. Characterization of electrophysiological propagation by multichannel sensors

    PubMed Central

    Bradshaw, L. Alan; Kim, Juliana H.; Somarajan, Suseela; Richards, William O.; Cheng, Leo K.

    2016-01-01

    Objective The propagation of electrophysiological activity measured by multichannel devices could have significant clinical implications. Gastric slow waves normally propagate along longitudinal paths that are evident in recordings of serosal potentials and transcutaneous magnetic fields. We employed a realistic model of gastric slow wave activity to simulate the transabdominal magnetogastrogram (MGG) recorded in a multichannel biomagnetometer and to determine characteristics of electrophysiological propagation from MGG measurements. Methods Using MGG simulations of slow wave sources in a realistic abdomen (both superficial and deep sources) and in a horizontally-layered volume conductor, we compared two analytic methods (Second Order Blind Identification, SOBI and Surface Current Density, SCD) that allow quantitative characterization of slow wave propagation. We also evaluated the performance of the methods with simulated experimental noise. The methods were also validated in an experimental animal model. Results Mean square errors in position estimates were within 2 cm of the correct position, and average propagation velocities within 2 mm/s of the actual velocities. SOBI propagation analysis outperformed the SCD method for dipoles in the superficial and horizontal layer models with and without additive noise. The SCD method gave better estimates for deep sources, but did not handle additive noise as well as SOBI. Conclusion SOBI-MGG and SCD-MGG were used to quantify slow wave propagation in a realistic abdomen model of gastric electrical activity. Significance These methods could be generalized to any propagating electrophysiological activity detected by multichannel sensor arrays. PMID:26595907

  1. The Effects of Schema-Based Instruction on the Proportional Thinking of Students With Mathematics Difficulties With and Without Reading Difficulties.

    PubMed

    Jitendra, Asha K; Dupuis, Danielle N; Star, Jon R; Rodriguez, Michael C

    2016-07-01

    This study examined the effect of schema-based instruction (SBI) on the proportional problem-solving performance of students with mathematics difficulties only (MD) and students with mathematics and reading difficulties (MDRD). Specifically, we examined the responsiveness of 260 seventh grade students identified as MD or MDRD to a 6-week treatment (SBI) on measures of proportional problem solving. Results indicated that students in the SBI condition significantly outperformed students in the control condition on a measure of proportional problem solving administered at posttest (g = 0.40) and again 6 weeks later (g = 0.42). The interaction between treatment group and students' difficulty status was not significant, which indicates that SBI was equally effective for both students with MD and those with MDRD. Further analyses revealed that SBI was particularly effective at improving students' performance on items related to percents. Finally, students with MD significantly outperformed students with MDRD on all measures of proportional problem solving. These findings suggest that interventions designed to include effective instructional features (e.g., SBI) promote student understanding of mathematical ideas. © Hammill Institute on Disabilities 2014.

  2. Combining Relevance Vector Machines and exponential regression for bearing residual life estimation

    NASA Astrophysics Data System (ADS)

    Di Maio, Francesco; Tsui, Kwok Leung; Zio, Enrico

    2012-08-01

    In this paper we present a new procedure for estimating the bearing Residual Useful Life (RUL) by combining data-driven and model-based techniques. Respectively, we resort to (i) Relevance Vector Machines (RVMs) for selecting a low number of significant basis functions, called Relevant Vectors (RVs), and (ii) exponential regression to compute and continuously update residual life estimations. The combination of these techniques is developed with reference to partially degraded thrust ball bearings and tested on real world vibration-based degradation data. On the case study considered, the proposed procedure outperforms other model-based methods, with the added value of an adequate representation of the uncertainty associated to the estimates of the quantification of the credibility of the results by the Prognostic Horizon (PH) metric.

  3. Sustaining Reform-Based Science Teaching of Preservice and Inservice Elementary School Teachers

    NASA Astrophysics Data System (ADS)

    Sullivan-Watts, Barbara K.; Nowicki, Barbara L.; Shim, Minsuk K.; Young, Betty J.

    2013-08-01

    This study examined the influence of a professional development program based around commercially available inquiry science curricula on the teaching practices of 27 beginning elementary school teachers and their teacher mentors over a 2 year period. A quantitative rubric used to score inquiry elements and use of data in videotaped lessons indicated that education students assigned to inquiry-based classrooms during their methods course or student teaching year outperformed students without this experience. There was also a significant positive effect of multi-year access to the kit-based program on mentor teaching practice. Recent inclusion of a "writing in science" program in both preservice and inservice training has been used to address the lesson element that received lowest scores—evaluation of data and its use in scientific explanation.

  4. Deep Learning for Computer Vision: A Brief Review

    PubMed Central

    Doulamis, Nikolaos; Doulamis, Anastasios; Protopapadakis, Eftychios

    2018-01-01

    Over the last years deep learning methods have been shown to outperform previous state-of-the-art machine learning techniques in several fields, with computer vision being one of the most prominent cases. This review paper provides a brief overview of some of the most significant deep learning schemes used in computer vision problems, that is, Convolutional Neural Networks, Deep Boltzmann Machines and Deep Belief Networks, and Stacked Denoising Autoencoders. A brief account of their history, structure, advantages, and limitations is given, followed by a description of their applications in various computer vision tasks, such as object detection, face recognition, action and activity recognition, and human pose estimation. Finally, a brief overview is given of future directions in designing deep learning schemes for computer vision problems and the challenges involved therein. PMID:29487619

  5. Smoothing of climate time series revisited

    NASA Astrophysics Data System (ADS)

    Mann, Michael E.

    2008-08-01

    We present an easily implemented method for smoothing climate time series, generalizing upon an approach previously described by Mann (2004). The method adaptively weights the three lowest order time series boundary constraints to optimize the fit with the raw time series. We apply the method to the instrumental global mean temperature series from 1850-2007 and to various surrogate global mean temperature series from 1850-2100 derived from the CMIP3 multimodel intercomparison project. These applications demonstrate that the adaptive method systematically out-performs certain widely used default smoothing methods, and is more likely to yield accurate assessments of long-term warming trends.

  6. Monte Carlo sampling in diffusive dynamical systems

    NASA Astrophysics Data System (ADS)

    Tapias, Diego; Sanders, David P.; Altmann, Eduardo G.

    2018-05-01

    We introduce a Monte Carlo algorithm to efficiently compute transport properties of chaotic dynamical systems. Our method exploits the importance sampling technique that favors trajectories in the tail of the distribution of displacements, where deviations from a diffusive process are most prominent. We search for initial conditions using a proposal that correlates states in the Markov chain constructed via a Metropolis-Hastings algorithm. We show that our method outperforms the direct sampling method and also Metropolis-Hastings methods with alternative proposals. We test our general method through numerical simulations in 1D (box-map) and 2D (Lorentz gas) systems.

  7. Probability genotype imputation method and integrated weighted lasso for QTL identification.

    PubMed

    Demetrashvili, Nino; Van den Heuvel, Edwin R; Wit, Ernst C

    2013-12-30

    Many QTL studies have two common features: (1) often there is missing marker information, (2) among many markers involved in the biological process only a few are causal. In statistics, the second issue falls under the headings "sparsity" and "causal inference". The goal of this work is to develop a two-step statistical methodology for QTL mapping for markers with binary genotypes. The first step introduces a novel imputation method for missing genotypes. Outcomes of the proposed imputation method are probabilities which serve as weights to the second step, namely in weighted lasso. The sparse phenotype inference is employed to select a set of predictive markers for the trait of interest. Simulation studies validate the proposed methodology under a wide range of realistic settings. Furthermore, the methodology outperforms alternative imputation and variable selection methods in such studies. The methodology was applied to an Arabidopsis experiment, containing 69 markers for 165 recombinant inbred lines of a F8 generation. The results confirm previously identified regions, however several new markers are also found. On the basis of the inferred ROC behavior these markers show good potential for being real, especially for the germination trait Gmax. Our imputation method shows higher accuracy in terms of sensitivity and specificity compared to alternative imputation method. Also, the proposed weighted lasso outperforms commonly practiced multiple regression as well as the traditional lasso and adaptive lasso with three weighting schemes. This means that under realistic missing data settings this methodology can be used for QTL identification.

  8. multiDE: a dimension reduced model based statistical method for differential expression analysis using RNA-sequencing data with multiple treatment conditions.

    PubMed

    Kang, Guangliang; Du, Li; Zhang, Hong

    2016-06-22

    The growing complexity of biological experiment design based on high-throughput RNA sequencing (RNA-seq) is calling for more accommodative statistical tools. We focus on differential expression (DE) analysis using RNA-seq data in the presence of multiple treatment conditions. We propose a novel method, multiDE, for facilitating DE analysis using RNA-seq read count data with multiple treatment conditions. The read count is assumed to follow a log-linear model incorporating two factors (i.e., condition and gene), where an interaction term is used to quantify the association between gene and condition. The number of the degrees of freedom is reduced to one through the first order decomposition of the interaction, leading to a dramatically power improvement in testing DE genes when the number of conditions is greater than two. In our simulation situations, multiDE outperformed the benchmark methods (i.e. edgeR and DESeq2) even if the underlying model was severely misspecified, and the power gain was increasing in the number of conditions. In the application to two real datasets, multiDE identified more biologically meaningful DE genes than the benchmark methods. An R package implementing multiDE is available publicly at http://homepage.fudan.edu.cn/zhangh/softwares/multiDE . When the number of conditions is two, multiDE performs comparably with the benchmark methods. When the number of conditions is greater than two, multiDE outperforms the benchmark methods.

  9. Sequential Probability Ratio Testing with Power Projective Base Method Improves Decision-Making for BCI

    PubMed Central

    Liu, Rong

    2017-01-01

    Obtaining a fast and reliable decision is an important issue in brain-computer interfaces (BCI), particularly in practical real-time applications such as wheelchair or neuroprosthetic control. In this study, the EEG signals were firstly analyzed with a power projective base method. Then we were applied a decision-making model, the sequential probability ratio testing (SPRT), for single-trial classification of motor imagery movement events. The unique strength of this proposed classification method lies in its accumulative process, which increases the discriminative power as more and more evidence is observed over time. The properties of the method were illustrated on thirteen subjects' recordings from three datasets. Results showed that our proposed power projective method outperformed two benchmark methods for every subject. Moreover, with sequential classifier, the accuracies across subjects were significantly higher than that with nonsequential ones. The average maximum accuracy of the SPRT method was 84.1%, as compared with 82.3% accuracy for the sequential Bayesian (SB) method. The proposed SPRT method provides an explicit relationship between stopping time, thresholds, and error, which is important for balancing the time-accuracy trade-off. These results suggest SPRT would be useful in speeding up decision-making while trading off errors in BCI. PMID:29348781

  10. Knowledge-Based Topic Model for Unsupervised Object Discovery and Localization.

    PubMed

    Niu, Zhenxing; Hua, Gang; Wang, Le; Gao, Xinbo

    Unsupervised object discovery and localization is to discover some dominant object classes and localize all of object instances from a given image collection without any supervision. Previous work has attempted to tackle this problem with vanilla topic models, such as latent Dirichlet allocation (LDA). However, in those methods no prior knowledge for the given image collection is exploited to facilitate object discovery. On the other hand, the topic models used in those methods suffer from the topic coherence issue-some inferred topics do not have clear meaning, which limits the final performance of object discovery. In this paper, prior knowledge in terms of the so-called must-links are exploited from Web images on the Internet. Furthermore, a novel knowledge-based topic model, called LDA with mixture of Dirichlet trees, is proposed to incorporate the must-links into topic modeling for object discovery. In particular, to better deal with the polysemy phenomenon of visual words, the must-link is re-defined as that one must-link only constrains one or some topic(s) instead of all topics, which leads to significantly improved topic coherence. Moreover, the must-links are built and grouped with respect to specific object classes, thus the must-links in our approach are semantic-specific , which allows to more efficiently exploit discriminative prior knowledge from Web images. Extensive experiments validated the efficiency of our proposed approach on several data sets. It is shown that our method significantly improves topic coherence and outperforms the unsupervised methods for object discovery and localization. In addition, compared with discriminative methods, the naturally existing object classes in the given image collection can be subtly discovered, which makes our approach well suited for realistic applications of unsupervised object discovery.Unsupervised object discovery and localization is to discover some dominant object classes and localize all of object instances from a given image collection without any supervision. Previous work has attempted to tackle this problem with vanilla topic models, such as latent Dirichlet allocation (LDA). However, in those methods no prior knowledge for the given image collection is exploited to facilitate object discovery. On the other hand, the topic models used in those methods suffer from the topic coherence issue-some inferred topics do not have clear meaning, which limits the final performance of object discovery. In this paper, prior knowledge in terms of the so-called must-links are exploited from Web images on the Internet. Furthermore, a novel knowledge-based topic model, called LDA with mixture of Dirichlet trees, is proposed to incorporate the must-links into topic modeling for object discovery. In particular, to better deal with the polysemy phenomenon of visual words, the must-link is re-defined as that one must-link only constrains one or some topic(s) instead of all topics, which leads to significantly improved topic coherence. Moreover, the must-links are built and grouped with respect to specific object classes, thus the must-links in our approach are semantic-specific , which allows to more efficiently exploit discriminative prior knowledge from Web images. Extensive experiments validated the efficiency of our proposed approach on several data sets. It is shown that our method significantly improves topic coherence and outperforms the unsupervised methods for object discovery and localization. In addition, compared with discriminative methods, the naturally existing object classes in the given image collection can be subtly discovered, which makes our approach well suited for realistic applications of unsupervised object discovery.

  11. Joint sparse reconstruction of multi-contrast MRI images with graph based redundant wavelet transform.

    PubMed

    Lai, Zongying; Zhang, Xinlin; Guo, Di; Du, Xiaofeng; Yang, Yonggui; Guo, Gang; Chen, Zhong; Qu, Xiaobo

    2018-05-03

    Multi-contrast images in magnetic resonance imaging (MRI) provide abundant contrast information reflecting the characteristics of the internal tissues of human bodies, and thus have been widely utilized in clinical diagnosis. However, long acquisition time limits the application of multi-contrast MRI. One efficient way to accelerate data acquisition is to under-sample the k-space data and then reconstruct images with sparsity constraint. However, images are compromised at high acceleration factor if images are reconstructed individually. We aim to improve the images with a jointly sparse reconstruction and Graph-based redundant wavelet transform (GBRWT). First, a sparsifying transform, GBRWT, is trained to reflect the similarity of tissue structures in multi-contrast images. Second, joint multi-contrast image reconstruction is formulated as a ℓ 2, 1 norm optimization problem under GBRWT representations. Third, the optimization problem is numerically solved using a derived alternating direction method. Experimental results in synthetic and in vivo MRI data demonstrate that the proposed joint reconstruction method can achieve lower reconstruction errors and better preserve image structures than the compared joint reconstruction methods. Besides, the proposed method outperforms single image reconstruction with joint sparsity constraint of multi-contrast images. The proposed method explores the joint sparsity of multi-contrast MRI images under graph-based redundant wavelet transform and realizes joint sparse reconstruction of multi-contrast images. Experiment demonstrate that the proposed method outperforms the compared joint reconstruction methods as well as individual reconstructions. With this high quality image reconstruction method, it is possible to achieve the high acceleration factors by exploring the complementary information provided by multi-contrast MRI.

  12. A comparison of cosegregation analysis methods for the clinical setting.

    PubMed

    Rañola, John Michael O; Liu, Quanhui; Rosenthal, Elisabeth A; Shirts, Brian H

    2018-04-01

    Quantitative cosegregation analysis can help evaluate the pathogenicity of genetic variants. However, genetics professionals without statistical training often use simple methods, reporting only qualitative findings. We evaluate the potential utility of quantitative cosegregation in the clinical setting by comparing three methods. One thousand pedigrees each were simulated for benign and pathogenic variants in BRCA1 and MLH1 using United States historical demographic data to produce pedigrees similar to those seen in the clinic. These pedigrees were analyzed using two robust methods, full likelihood Bayes factors (FLB) and cosegregation likelihood ratios (CSLR), and a simpler method, counting meioses. Both FLB and CSLR outperform counting meioses when dealing with pathogenic variants, though counting meioses is not far behind. For benign variants, FLB and CSLR greatly outperform as counting meioses is unable to generate evidence for benign variants. Comparing FLB and CSLR, we find that the two methods perform similarly, indicating that quantitative results from either of these methods could be combined in multifactorial calculations. Combining quantitative information will be important as isolated use of cosegregation in single families will yield classification for less than 1% of variants. To encourage wider use of robust cosegregation analysis, we present a website ( http://www.analyze.myvariant.org ) which implements the CSLR, FLB, and Counting Meioses methods for ATM, BRCA1, BRCA2, CHEK2, MEN1, MLH1, MSH2, MSH6, and PMS2. We also present an R package, CoSeg, which performs the CSLR analysis on any gene with user supplied parameters. Future variant classification guidelines should allow nuanced inclusion of cosegregation evidence against pathogenicity.

  13. Image processing and machine learning for fully automated probabilistic evaluation of medical images.

    PubMed

    Sajn, Luka; Kukar, Matjaž

    2011-12-01

    The paper presents results of our long-term study on using image processing and data mining methods in a medical imaging. Since evaluation of modern medical images is becoming increasingly complex, advanced analytical and decision support tools are involved in integration of partial diagnostic results. Such partial results, frequently obtained from tests with substantial imperfections, are integrated into ultimate diagnostic conclusion about the probability of disease for a given patient. We study various topics such as improving the predictive power of clinical tests by utilizing pre-test and post-test probabilities, texture representation, multi-resolution feature extraction, feature construction and data mining algorithms that significantly outperform medical practice. Our long-term study reveals three significant milestones. The first improvement was achieved by significantly increasing post-test diagnostic probabilities with respect to expert physicians. The second, even more significant improvement utilizes multi-resolution image parametrization. Machine learning methods in conjunction with the feature subset selection on these parameters significantly improve diagnostic performance. However, further feature construction with the principle component analysis on these features elevates results to an even higher accuracy level that represents the third milestone. With the proposed approach clinical results are significantly improved throughout the study. The most significant result of our study is improvement in the diagnostic power of the whole diagnostic process. Our compound approach aids, but does not replace, the physician's judgment and may assist in decisions on cost effectiveness of tests. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.

  14. A perturbative approach for enhancing the performance of time series forecasting.

    PubMed

    de Mattos Neto, Paulo S G; Ferreira, Tiago A E; Lima, Aranildo R; Vasconcelos, Germano C; Cavalcanti, George D C

    2017-04-01

    This paper proposes a method to perform time series prediction based on perturbation theory. The approach is based on continuously adjusting an initial forecasting model to asymptotically approximate a desired time series model. First, a predictive model generates an initial forecasting for a time series. Second, a residual time series is calculated as the difference between the original time series and the initial forecasting. If that residual series is not white noise, then it can be used to improve the accuracy of the initial model and a new predictive model is adjusted using residual series. The whole process is repeated until convergence or the residual series becomes white noise. The output of the method is then given by summing up the outputs of all trained predictive models in a perturbative sense. To test the method, an experimental investigation was conducted on six real world time series. A comparison was made with six other methods experimented and ten other results found in the literature. Results show that not only the performance of the initial model is significantly improved but also the proposed method outperforms the other results previously published. Copyright © 2017 Elsevier Ltd. All rights reserved.

  15. Optical aberration correction for simple lenses via sparse representation

    NASA Astrophysics Data System (ADS)

    Cui, Jinlin; Huang, Wei

    2018-04-01

    Simple lenses with spherical surfaces are lightweight, inexpensive, highly flexible, and can be easily processed. However, they suffer from optical aberrations that lead to limitations in high-quality photography. In this study, we propose a set of computational photography techniques based on sparse signal representation to remove optical aberrations, thereby allowing the recovery of images captured through a single-lens camera. The primary advantage of the proposed method is that many prior point spread functions calibrated at different depths are successfully used for restoring visual images in a short time, which can be generally applied to nonblind deconvolution methods for solving the problem of the excessive processing time caused by the number of point spread functions. The optical software CODE V is applied for examining the reliability of the proposed method by simulation. The simulation results reveal that the suggested method outperforms the traditional methods. Moreover, the performance of a single-lens camera is significantly enhanced both qualitatively and perceptually. Particularly, the prior information obtained by CODE V can be used for processing the real images of a single-lens camera, which provides an alternative approach to conveniently and accurately obtain point spread functions of single-lens cameras.

  16. Enhancing magnetic nanoparticle-based DNA transfection: Intracellular-active cassette features

    NASA Astrophysics Data System (ADS)

    Vernon, Matthew Martin

    Efficient plasmid DNA transfection of embryonic stem cells, mesenchymal stem cells, neural cell lines and the majority of primary cell lines is a current challenge in gene therapy research. Magnetic nanoparticle-based DNA transfection is a gene vectoring technique that is promising because it is capable of outperforming most other non-viral transfection methods in terms of both transfection efficiency and cell viability. The nature of the DNA vector implemented depends on the target cell phenotype, where the particle surface chemistry and DNA binding/unbinding kinetics of the DNA carrier molecule play a critical role in the many steps required for successful gene transfection. Accordingly, Neuromag, an iron oxide/polymer nanoparticle optimized for transfection of neural phenotypes, outperforms many other nanoparticles and lipidbased DNA carriers. Up to now, improvements to nanomagnetic transfection techniques have focused mostly on particle functionalization and transfection parameter optimization (cell confluence, growth media, serum starvation, magnet oscillation parameters, etc.). None of these parameters are capable of assisting the nuclear translocation of delivered plasmid DNA once the particle-DNA complex is released from the endosome and dissociates in the cell's cytoplasm. In this study, incorporation of a DNA targeting sequence (DTS) feature in the transfecting plasmid DNA confers improved nuclear translocation, demonstrating significant improvement in nanomagnetic transfection efficiency in differentiated SH-SY5Y neuroblastoma cells. Other parameters, such as days in vitro, are also found to play a role and represent potential targets for further optimization.

  17. Hyperspectral interventional imaging for enhanced tissue visualization and discrimination combining band selection methods.

    PubMed

    Nouri, Dorra; Lucas, Yves; Treuillet, Sylvie

    2016-12-01

    Hyperspectral imaging is an emerging technology recently introduced in medical applications inasmuch as it provides a powerful tool for noninvasive tissue characterization. In this context, a new system was designed to be easily integrated in the operating room in order to detect anatomical tissues hardly noticed by the surgeon's naked eye. Our LCTF-based spectral imaging system is operative over visible, near- and middle-infrared spectral ranges (400-1700 nm). It is dedicated to enhance critical biological tissues such as the ureter and the facial nerve. We aim to find the best three relevant bands to create a RGB image to display during the intervention with maximal contrast between the target tissue and its surroundings. A comparative study is carried out between band selection methods and band transformation methods. Combined band selection methods are proposed. All methods are compared using different evaluation criteria. Experimental results show that the proposed combined band selection methods provide the best performance with rich information, high tissue separability and short computational time. These methods yield a significant discrimination between biological tissues. We developed a hyperspectral imaging system in order to enhance some biological tissue visualization. The proposed methods provided an acceptable trade-off between the evaluation criteria especially in SWIR spectral band that outperforms the naked eye's capacities.

  18. Identification of significant features by the Global Mean Rank test.

    PubMed

    Klammer, Martin; Dybowski, J Nikolaj; Hoffmann, Daniel; Schaab, Christoph

    2014-01-01

    With the introduction of omics-technologies such as transcriptomics and proteomics, numerous methods for the reliable identification of significantly regulated features (genes, proteins, etc.) have been developed. Experimental practice requires these tests to successfully deal with conditions such as small numbers of replicates, missing values, non-normally distributed expression levels, and non-identical distributions of features. With the MeanRank test we aimed at developing a test that performs robustly under these conditions, while favorably scaling with the number of replicates. The test proposed here is a global one-sample location test, which is based on the mean ranks across replicates, and internally estimates and controls the false discovery rate. Furthermore, missing data is accounted for without the need of imputation. In extensive simulations comparing MeanRank to other frequently used methods, we found that it performs well with small and large numbers of replicates, feature dependent variance between replicates, and variable regulation across features on simulation data and a recent two-color microarray spike-in dataset. The tests were then used to identify significant changes in the phosphoproteomes of cancer cells induced by the kinase inhibitors erlotinib and 3-MB-PP1 in two independently published mass spectrometry-based studies. MeanRank outperformed the other global rank-based methods applied in this study. Compared to the popular Significance Analysis of Microarrays and Linear Models for Microarray methods, MeanRank performed similar or better. Furthermore, MeanRank exhibits more consistent behavior regarding the degree of regulation and is robust against the choice of preprocessing methods. MeanRank does not require any imputation of missing values, is easy to understand, and yields results that are easy to interpret. The software implementing the algorithm is freely available for academic and commercial use.

  19. Quantifying fracture geometry with X-ray tomography: Technique of Iterative Local Thresholding (TILT) for 3D image segmentation

    DOE PAGES

    Deng, Hang; Fitts, Jeffrey P.; Peters, Catherine A.

    2016-02-01

    This paper presents a new method—the Technique of Iterative Local Thresholding (TILT)—for processing 3D X-ray computed tomography (xCT) images for visualization and quantification of rock fractures. The TILT method includes the following advancements. First, custom masks are generated by a fracture-dilation procedure, which significantly amplifies the fracture signal on the intensity histogram used for local thresholding. Second, TILT is particularly well suited for fracture characterization in granular rocks because the multi-scale Hessian fracture (MHF) filter has been incorporated to distinguish fractures from pores in the rock matrix. Third, TILT wraps the thresholding and fracture isolation steps in an optimized iterativemore » routine for binary segmentation, minimizing human intervention and enabling automated processing of large 3D datasets. As an illustrative example, we applied TILT to 3D xCT images of reacted and unreacted fractured limestone cores. Other segmentation methods were also applied to provide insights regarding variability in image processing. The results show that TILT significantly enhanced separability of grayscale intensities, outperformed the other methods in automation, and was successful in isolating fractures from the porous rock matrix. Because the other methods are more likely to misclassify fracture edges as void and/or have limited capacity in distinguishing fractures from pores, those methods estimated larger fracture volumes (up to 80 %), surface areas (up to 60 %), and roughness (up to a factor of 2). In conclusion, these differences in fracture geometry would lead to significant disparities in hydraulic permeability predictions, as determined by 2D flow simulations.« less

  20. Do No Harm--A Comparison of the Effects of On-Line vs. Traditional Delivery Media on a Science Course.

    ERIC Educational Resources Information Center

    Schoenfeld-Tacher, Regina; McConnell, Sherry; Graham, Michele

    2001-01-01

    Presents the results of a study examining the effects of distance delivery on student performance and classroom interactions in an upper level Histology course. Finds that students in an on-line group significantly out-perform their peers in an on-campus section. (Author/MM)

  1. A Comparison of Patched HOTV Visual Acuity and Photoscreening

    ERIC Educational Resources Information Center

    Leman, Rachel; Clausen, Michelle M.; Bates, Janice; Stark, Lee; Arnold, Koni K.; Arnold, Robert W.

    2006-01-01

    Early detection of significant vision problems in children is a high priority for pediatricians and school nurses. Routine vision screening is a necessary part of that detection and has traditionally involved acuity charts. However, photoscreening in which "red eye" is elicited to show whether each eye is focusing may outperform routine acuity…

  2. Integrating Concept Mapping and the Learning Cycle To Teach Diffusion and Osmosis Concepts to High School Biology Students.

    ERIC Educational Resources Information Center

    Odom, Arthur L.; Kelly, Paul V.

    2001-01-01

    Explores the effectiveness of concept mapping, the learning cycle, expository instruction, and a combination of concept mapping/learning cycle in promoting conceptual understanding of diffusion and osmosis. Concludes that the concept mapping/learning cycle and concept mapping treatment groups significantly outperformed the expository treatment…

  3. The Relative Performance of Female and Male Students in Accounting Principles Classes.

    ERIC Educational Resources Information Center

    Bouillon, Marvin L.; Doran, B. Michael

    1992-01-01

    The performance of female and male students in Accounting Principles (AP) I and II was compared by using multiple regression techniques to assess the incremental explanatory effects of gender. Males significantly outperformed females in AP I, contradicting earlier studies. Similar gender of instructor and student was insignificant. (JOW)

  4. Effects of tutoring in phonological and early reading skills on students at risk for reading disabilities.

    PubMed

    Vadasy, P F; Jenkins, J R; Pool, K

    2000-01-01

    This study examined the effectiveness of nonprofessional tutors in a phonologically based reading treatment similar to those in which successful reading outcomes have been demonstrated. Participants were 23 first graders at risk for learning disability who received intensive one-to-one tutoring from noncertified tutors for 30 minutes, 4 days a week, for one school year. Tutoring included instruction in phonological skills, letter-sound correspondence, explicit decoding, rime analysis, writing, spelling, and reading phonetically controlled text. At year end, tutored students significantly outperformed untutored control students on measures of reading, spelling, and decoding. Effect sizes ranged from .42 to 1.24. Treatment effects diminished at follow-up at the end of second grade, although tutored students continued to significantly outperform untutored students in decoding and spelling. Findings suggest that phonologically based reading instruction for first graders at risk for learning disability can be delivered by nonteacher tutors. Our discussion addresses the character of reading outcomes associated with tutoring, individual differences in response to treatment, and the infrastructure required for nonprofessional tutoring programs.

  5. The Effect of Prior Knowledge and Gender on Physics Achievement

    NASA Astrophysics Data System (ADS)

    Stewart, John; Henderson, Rachel

    2017-01-01

    Gender differences on the Conceptual Survey in Electricity and Magnetism (CSEM) have been extensively studied. Ten semesters (N=1621) of CSEM data is presented showing male students outperform female students on the CSEM posttest by 5 % (p < . 001). Male students also outperform female students on qualitative in-semester test questions by 3 % (p = . 004), but no significant difference between male and female students was found on quantitative test questions. Male students enter the class with superior prior preparation in the subject and score 4 % higher on the CSEM pretest (p < . 001). If the sample is restricted to students with little prior knowledge who answer no more than 8 of the 32 questions correctly (N=822), male and female differences on the CSEM and qualitative test questions cease to be significant. This suggests no intrinsic gender bias exists in the CSEM itself and that gender differences are the result of prior preparation measured by CSEM pretest score. Gender differences between male and female students increase with pretest score. Regression analyses are presented to further explore interactions between preparation, gender, and achievement.

  6. Poor phonemic discrimination does not underlie poor verbal short-term memory in Down syndrome.

    PubMed

    Purser, Harry R M; Jarrold, Christopher

    2013-05-01

    Individuals with Down syndrome tend to have a marked impairment of verbal short-term memory. The chief aim of this study was to investigate whether phonemic discrimination contributes to this deficit. The secondary aim was to investigate whether phonological representations are degraded in verbal short-term memory in people with Down syndrome relative to control participants. To answer these questions, two tasks were used: a discrimination task, in which memory load was as low as possible, and a short-term recognition task that used the same stimulus items. Individuals with Down syndrome were found to perform significantly better than a nonverbal-matched typically developing group on the discrimination task, but they performed significantly more poorly than that group on the recognition task. The Down syndrome group was outperformed by an additional vocabulary-matched control group on the discrimination task but was outperformed to a markedly greater extent on the recognition task. Taken together, the results strongly indicate that phonemic discrimination ability is not central to the verbal short-term memory deficit associated with Down syndrome. Copyright © 2013 Elsevier Inc. All rights reserved.

  7. Numeracy skills of undergraduate entry level nurse, midwife and pharmacy students.

    PubMed

    Arkell, Sharon; Rutter, Paul M

    2012-07-01

    The ability of healthcare professionals to perform basic numeracy and therefore dose calculations competently is without question. Research has primarily focused on nurses, and to a lesser extent doctors, ability to perform this function with findings highlighting poor aptitude. Studies involving pharmacists are few but findings are more positive than other healthcare staff. To determine first year nursing, midwifery and pharmacy students ability to perform basic numeracy calculations. All new undergraduate entrants to nursing, midwifery and pharmacy sat a formative numeracy test within the first two weeks of their first year of study. Test results showed that pharmacy students significantly outperformed midwifery and nursing students on all questions. In turn midwifery students outperformed nurses, although this did not achieve significance. When looking at each cohorts general attitude towards mathematics, pharmacy students were more positive and confident compared to midwifery and nursing students. Pharmacy students expressed greater levels of enjoyment and confidence in performing mathematics and correspondingly showed the greatest proficiency. In contrast nurse, and to a lesser extent midwifery students showed poor performance and low confidence levels. Copyright © 2012 Elsevier Ltd. All rights reserved.

  8. Gender differences in undergraduate medicine in Galway: a tale of two curricula.

    PubMed

    McVeigh, T P; Dunne, F P

    2014-03-01

    Medical teaching in the National University of Ireland Galway (NUIG) has undergone a shift from subject- to system-based learning. Our aims were to examine differences between genders in academic performance in medicine across two different curricula. Results of each student graduating between 2007 and 2012 for each subject undertaken over the medical degree were obtained from the Medical School. Data were collected with respect to gender, nationality and mode of entry, and analysis completed using SPSS. The cohort included 360 females and 249 males. 396 students read from a subject-based curriculum and 213 a system-based course. Females outperformed males in 19/24 (79 %) subjects in the subject-based curriculum, and in 9/38 (24 %) in the system-based course. Males were more likely to fail and less likely to achieve an honours degree. Multivariate analysis showed nationality and gender to be significant predictive factors. Females outperformed males overall. Differences were most pronounced in a subject-based curriculum. Nationality and gender were found to be significant factors in determining overall results.

  9. A performance comparison of two small rocket nozzles

    NASA Technical Reports Server (NTRS)

    Arrington, Lynn A.; Reed, Brian D.; Rivera, Angel, Jr.

    1996-01-01

    An experimental study was conducted on two small rockets (110 N thrust class) to directly compare a standard conical nozzle with a bell nozzle optimized for maximum thrust using the Rao method. In large rockets, with throat Reynolds numbers of greater than 1 x 10(exp 5), bell nozzles outperform conical nozzles. In rockets with throat Reynolds numbers below 1 x 10(exp 5), however, test results have been ambiguous. An experimental program was conducted to test two small nozzles at two different fuel film cooling percentages and three different chamber pressures. Test results showed that for the throat Reynolds number range from 2 x 10(exp 4) to 4 x 10(exp 4), the bell nozzle outperformed the conical nozzle. Thrust coefficients for the bell nozzle were approximately 4 to 12 percent higher than those obtained with the conical nozzle. As expected, testing showed that lowering the fuel film cooling increased performance for both nozzle types.

  10. [Gender differences in cognitive functions and influence of sex hormones].

    PubMed

    Torres, A; Gómez-Gil, E; Vidal, A; Puig, O; Boget, T; Salamero, M

    2006-01-01

    To review scientific evidence on gender differences in cognitive functions and influence of sex hormones on cognitive performance. Systematical search of related studies identified in Medline. Women outperform men on verbal fluency, perceptual speed tasks, fine motor skills, verbal memory and verbal learning. Men outperform women on visuospatial ability, mathematical problem solving and visual memory. No gender differences on attention and working memory are found. Researchers distinguish four methods to investigate hormonal influence on cognitive performance: a) patient with hormonal disorders; b) neuroimaging in individuals during hormone administration; c) in women during different phases of menstrual cycle, and d) in patients receiving hormonal treatment (idiopathic hypogonadotropic hypogonadism, postmenopausal women and transsexuals). The findings mostly suggest an influence of sex hormones on some cognitive functions, but they are not conclusive because of limitations and scarcity of the studies. There are gender differences on cognitive functions. Sex hormones seem to influence cognitive performance.

  11. A family of chaotic pure analog coding schemes based on baker's map function

    NASA Astrophysics Data System (ADS)

    Liu, Yang; Li, Jing; Lu, Xuanxuan; Yuen, Chau; Wu, Jun

    2015-12-01

    This paper considers a family of pure analog coding schemes constructed from dynamic systems which are governed by chaotic functions—baker's map function and its variants. Various decoding methods, including maximum likelihood (ML), minimum mean square error (MMSE), and mixed ML-MMSE decoding algorithms, have been developed for these novel encoding schemes. The proposed mirrored baker's and single-input baker's analog codes perform a balanced protection against the fold error (large distortion) and weak distortion and outperform the classical chaotic analog coding and analog joint source-channel coding schemes in literature. Compared to the conventional digital communication system, where quantization and digital error correction codes are used, the proposed analog coding system has graceful performance evolution, low decoding latency, and no quantization noise. Numerical results show that under the same bandwidth expansion, the proposed analog system outperforms the digital ones over a wide signal-to-noise (SNR) range.

  12. New technology in postfire rehab

    Treesearch

    Joe Sabel

    2007-01-01

    PAM-12™ is a recycled office paper byproduct made into a spreadable mulch with added Water Soluble Polyacrylamide (WSPAM), a previously difficult polymer to apply. PAM-12 is extremely versatile and can be applied through several methods. In a field test, PAM-12 outperformed straw in every targeted performance area: erosion control, improving soil hydrophobicity, and...

  13. Comparison of Classification Methods for Detecting Emotion from Mandarin Speech

    NASA Astrophysics Data System (ADS)

    Pao, Tsang-Long; Chen, Yu-Te; Yeh, Jun-Heng

    It is said that technology comes out from humanity. What is humanity? The very definition of humanity is emotion. Emotion is the basis for all human expression and the underlying theme behind everything that is done, said, thought or imagined. Making computers being able to perceive and respond to human emotion, the human-computer interaction will be more natural. Several classifiers are adopted for automatically assigning an emotion category, such as anger, happiness or sadness, to a speech utterance. These classifiers were designed independently and tested on various emotional speech corpora, making it difficult to compare and evaluate their performance. In this paper, we first compared several popular classification methods and evaluated their performance by applying them to a Mandarin speech corpus consisting of five basic emotions, including anger, happiness, boredom, sadness and neutral. The extracted feature streams contain MFCC, LPCC, and LPC. The experimental results show that the proposed WD-MKNN classifier achieves an accuracy of 81.4% for the 5-class emotion recognition and outperforms other classification techniques, including KNN, MKNN, DW-KNN, LDA, QDA, GMM, HMM, SVM, and BPNN. Then, to verify the advantage of the proposed method, we compared these classifiers by applying them to another Mandarin expressive speech corpus consisting of two emotions. The experimental results still show that the proposed WD-MKNN outperforms others.

  14. RNA-seq mixology: designing realistic control experiments to compare protocols and analysis methods

    PubMed Central

    Holik, Aliaksei Z.; Law, Charity W.; Liu, Ruijie; Wang, Zeya; Wang, Wenyi; Ahn, Jaeil; Asselin-Labat, Marie-Liesse; Smyth, Gordon K.

    2017-01-01

    Abstract Carefully designed control experiments provide a gold standard for benchmarking different genomics research tools. A shortcoming of many gene expression control studies is that replication involves profiling the same reference RNA sample multiple times. This leads to low, pure technical noise that is atypical of regular studies. To achieve a more realistic noise structure, we generated a RNA-sequencing mixture experiment using two cell lines of the same cancer type. Variability was added by extracting RNA from independent cell cultures and degrading particular samples. The systematic gene expression changes induced by this design allowed benchmarking of different library preparation kits (standard poly-A versus total RNA with Ribozero depletion) and analysis pipelines. Data generated using the total RNA kit had more signal for introns and various RNA classes (ncRNA, snRNA, snoRNA) and less variability after degradation. For differential expression analysis, voom with quality weights marginally outperformed other popular methods, while for differential splicing, DEXSeq was simultaneously the most sensitive and the most inconsistent method. For sample deconvolution analysis, DeMix outperformed IsoPure convincingly. Our RNA-sequencing data set provides a valuable resource for benchmarking different protocols and data pre-processing workflows. The extra noise mimics routine lab experiments more closely, ensuring any conclusions are widely applicable. PMID:27899618

  15. AUC-Maximized Deep Convolutional Neural Fields for Protein Sequence Labeling.

    PubMed

    Wang, Sheng; Sun, Siqi; Xu, Jinbo

    2016-09-01

    Deep Convolutional Neural Networks (DCNN) has shown excellent performance in a variety of machine learning tasks. This paper presents Deep Convolutional Neural Fields (DeepCNF), an integration of DCNN with Conditional Random Field (CRF), for sequence labeling with an imbalanced label distribution. The widely-used training methods, such as maximum-likelihood and maximum labelwise accuracy, do not work well on imbalanced data. To handle this, we present a new training algorithm called maximum-AUC for DeepCNF. That is, we train DeepCNF by directly maximizing the empirical Area Under the ROC Curve (AUC), which is an unbiased measurement for imbalanced data. To fulfill this, we formulate AUC in a pairwise ranking framework, approximate it by a polynomial function and then apply a gradient-based procedure to optimize it. Our experimental results confirm that maximum-AUC greatly outperforms the other two training methods on 8-state secondary structure prediction and disorder prediction since their label distributions are highly imbalanced and also has similar performance as the other two training methods on solvent accessibility prediction, which has three equally-distributed labels. Furthermore, our experimental results show that our AUC-trained DeepCNF models greatly outperform existing popular predictors of these three tasks. The data and software related to this paper are available at https://github.com/realbigws/DeepCNF_AUC.

  16. AUC-Maximized Deep Convolutional Neural Fields for Protein Sequence Labeling

    PubMed Central

    Wang, Sheng; Sun, Siqi

    2017-01-01

    Deep Convolutional Neural Networks (DCNN) has shown excellent performance in a variety of machine learning tasks. This paper presents Deep Convolutional Neural Fields (DeepCNF), an integration of DCNN with Conditional Random Field (CRF), for sequence labeling with an imbalanced label distribution. The widely-used training methods, such as maximum-likelihood and maximum labelwise accuracy, do not work well on imbalanced data. To handle this, we present a new training algorithm called maximum-AUC for DeepCNF. That is, we train DeepCNF by directly maximizing the empirical Area Under the ROC Curve (AUC), which is an unbiased measurement for imbalanced data. To fulfill this, we formulate AUC in a pairwise ranking framework, approximate it by a polynomial function and then apply a gradient-based procedure to optimize it. Our experimental results confirm that maximum-AUC greatly outperforms the other two training methods on 8-state secondary structure prediction and disorder prediction since their label distributions are highly imbalanced and also has similar performance as the other two training methods on solvent accessibility prediction, which has three equally-distributed labels. Furthermore, our experimental results show that our AUC-trained DeepCNF models greatly outperform existing popular predictors of these three tasks. The data and software related to this paper are available at https://github.com/realbigws/DeepCNF_AUC. PMID:28884168

  17. An algorithm to track laboratory zebrafish shoals.

    PubMed

    Feijó, Gregory de Oliveira; Sangalli, Vicenzo Abichequer; da Silva, Isaac Newton Lima; Pinho, Márcio Sarroglia

    2018-05-01

    In this paper, a semi-automatic multi-object tracking method to track a group of unmarked zebrafish is proposed. This method can handle partial occlusion cases, maintaining the correct identity of each individual. For every object, we extracted a set of geometric features to be used in the two main stages of the algorithm. The first stage selected the best candidate, based both on the blobs identified in the image and the estimate generated by a Kalman Filter instance. In the second stage, if the same candidate-blob is selected by two or more instances, a blob-partitioning algorithm takes place in order to split this blob and reestablish the instances' identities. If the algorithm cannot determine the identity of a blob, a manual intervention is required. This procedure was compared against a manual labeled ground truth on four video sequences with different numbers of fish and spatial resolution. The performance of the proposed method is then compared against two well-known zebrafish tracking methods found in the literature: one that treats occlusion scenarios and one that only track fish that are not in occlusion. Based on the data set used, the proposed method outperforms the first method in correctly separating fish in occlusion, increasing its efficiency by at least 8.15% of the cases. As for the second, the proposed method's overall performance outperformed the second in some of the tested videos, especially those with lower image quality, because the second method requires high-spatial resolution images, which is not a requirement for the proposed method. Yet, the proposed method was able to separate fish involved in occlusion and correctly assign its identity in up to 87.85% of the cases, without accounting for user intervention. Copyright © 2018 Elsevier Ltd. All rights reserved.

  18. An Accurate Scalable Template-based Alignment Algorithm

    PubMed Central

    Gardner, David P.; Xu, Weijia; Miranker, Daniel P.; Ozer, Stuart; Cannone, Jamie J.; Gutell, Robin R.

    2013-01-01

    The rapid determination of nucleic acid sequences is increasing the number of sequences that are available. Inherent in a template or seed alignment is the culmination of structural and functional constraints that are selecting those mutations that are viable during the evolution of the RNA. While we might not understand these structural and functional, template-based alignment programs utilize the patterns of sequence conservation to encapsulate the characteristics of viable RNA sequences that are aligned properly. We have developed a program that utilizes the different dimensions of information in rCAD, a large RNA informatics resource, to establish a profile for each position in an alignment. The most significant include sequence identity and column composition in different phylogenetic taxa. We have compared our methods with a maximum of eight alternative alignment methods on different sets of 16S and 23S rRNA sequences with sequence percent identities ranging from 50% to 100%. The results showed that CRWAlign outperformed the other alignment methods in both speed and accuracy. A web-based alignment server is available at http://www.rna.ccbb.utexas.edu/SAE/2F/CRWAlign. PMID:24772376

  19. Microblog sentiment analysis using social and topic context.

    PubMed

    Zou, Xiaomei; Yang, Jing; Zhang, Jianpei

    2018-01-01

    Analyzing massive user-generated microblogs is very crucial in many fields, attracting many researchers to study. However, it is very challenging to process such noisy and short microblogs. Most prior works only use texts to identify sentiment polarity and assume that microblogs are independent and identically distributed, which ignore microblogs are networked data. Therefore, their performance is not usually satisfactory. Inspired by two sociological theories (sentimental consistency and emotional contagion), in this paper, we propose a new method combining social context and topic context to analyze microblog sentiment. In particular, different from previous work using direct user relations, we introduce structure similarity context into social contexts and propose a method to measure structure similarity. In addition, we also introduce topic context to model the semantic relations between microblogs. Social context and topic context are combined by the Laplacian matrix of the graph built by these contexts and Laplacian regularization are added into the microblog sentiment analysis model. Experimental results on two real Twitter datasets demonstrate that our proposed model can outperform baseline methods consistently and significantly.

  20. Compressive sensing of high betweenness centrality nodes in networks

    NASA Astrophysics Data System (ADS)

    Mahyar, Hamidreza; Hasheminezhad, Rouzbeh; Ghalebi K., Elahe; Nazemian, Ali; Grosu, Radu; Movaghar, Ali; Rabiee, Hamid R.

    2018-05-01

    Betweenness centrality is a prominent centrality measure expressing importance of a node within a network, in terms of the fraction of shortest paths passing through that node. Nodes with high betweenness centrality have significant impacts on the spread of influence and idea in social networks, the user activity in mobile phone networks, the contagion process in biological networks, and the bottlenecks in communication networks. Thus, identifying k-highest betweenness centrality nodes in networks will be of great interest in many applications. In this paper, we introduce CS-HiBet, a new method to efficiently detect top- k betweenness centrality nodes in networks, using compressive sensing. CS-HiBet can perform as a distributed algorithm by using only the local information at each node. Hence, it is applicable to large real-world and unknown networks in which the global approaches are usually unrealizable. The performance of the proposed method is evaluated by extensive simulations on several synthetic and real-world networks. The experimental results demonstrate that CS-HiBet outperforms the best existing methods with notable improvements.

  1. A Weighted Deep Representation Learning Model for Imbalanced Fault Diagnosis in Cyber-Physical Systems.

    PubMed

    Wu, Zhenyu; Guo, Yang; Lin, Wenfang; Yu, Shuyang; Ji, Yang

    2018-04-05

    Predictive maintenance plays an important role in modern Cyber-Physical Systems (CPSs) and data-driven methods have been a worthwhile direction for Prognostics Health Management (PHM). However, two main challenges have significant influences on the traditional fault diagnostic models: one is that extracting hand-crafted features from multi-dimensional sensors with internal dependencies depends too much on expertise knowledge; the other is that imbalance pervasively exists among faulty and normal samples. As deep learning models have proved to be good methods for automatic feature extraction, the objective of this paper is to study an optimized deep learning model for imbalanced fault diagnosis for CPSs. Thus, this paper proposes a weighted Long Recurrent Convolutional LSTM model with sampling policy (wLRCL-D) to deal with these challenges. The model consists of 2-layer CNNs, 2-layer inner LSTMs and 2-Layer outer LSTMs, with under-sampling policy and weighted cost-sensitive loss function. Experiments are conducted on PHM 2015 challenge datasets, and the results show that wLRCL-D outperforms other baseline methods.

  2. Computational system identification of continuous-time nonlinear systems using approximate Bayesian computation

    NASA Astrophysics Data System (ADS)

    Krishnanathan, Kirubhakaran; Anderson, Sean R.; Billings, Stephen A.; Kadirkamanathan, Visakan

    2016-11-01

    In this paper, we derive a system identification framework for continuous-time nonlinear systems, for the first time using a simulation-focused computational Bayesian approach. Simulation approaches to nonlinear system identification have been shown to outperform regression methods under certain conditions, such as non-persistently exciting inputs and fast-sampling. We use the approximate Bayesian computation (ABC) algorithm to perform simulation-based inference of model parameters. The framework has the following main advantages: (1) parameter distributions are intrinsically generated, giving the user a clear description of uncertainty, (2) the simulation approach avoids the difficult problem of estimating signal derivatives as is common with other continuous-time methods, and (3) as noted above, the simulation approach improves identification under conditions of non-persistently exciting inputs and fast-sampling. Term selection is performed by judging parameter significance using parameter distributions that are intrinsically generated as part of the ABC procedure. The results from a numerical example demonstrate that the method performs well in noisy scenarios, especially in comparison to competing techniques that rely on signal derivative estimation.

  3. Village Building Identification Based on Ensemble Convolutional Neural Networks

    PubMed Central

    Guo, Zhiling; Chen, Qi; Xu, Yongwei; Shibasaki, Ryosuke; Shao, Xiaowei

    2017-01-01

    In this study, we present the Ensemble Convolutional Neural Network (ECNN), an elaborate CNN frame formulated based on ensembling state-of-the-art CNN models, to identify village buildings from open high-resolution remote sensing (HRRS) images. First, to optimize and mine the capability of CNN for village mapping and to ensure compatibility with our classification targets, a few state-of-the-art models were carefully optimized and enhanced based on a series of rigorous analyses and evaluations. Second, rather than directly implementing building identification by using these models, we exploited most of their advantages by ensembling their feature extractor parts into a stronger model called ECNN based on the multiscale feature learning method. Finally, the generated ECNN was applied to a pixel-level classification frame to implement object identification. The proposed method can serve as a viable tool for village building identification with high accuracy and efficiency. The experimental results obtained from the test area in Savannakhet province, Laos, prove that the proposed ECNN model significantly outperforms existing methods, improving overall accuracy from 96.64% to 99.26%, and kappa from 0.57 to 0.86. PMID:29084154

  4. GOClonto: an ontological clustering approach for conceptualizing PubMed abstracts.

    PubMed

    Zheng, Hai-Tao; Borchert, Charles; Kim, Hong-Gee

    2010-02-01

    Concurrent with progress in biomedical sciences, an overwhelming of textual knowledge is accumulating in the biomedical literature. PubMed is the most comprehensive database collecting and managing biomedical literature. To help researchers easily understand collections of PubMed abstracts, numerous clustering methods have been proposed to group similar abstracts based on their shared features. However, most of these methods do not explore the semantic relationships among groupings of documents, which could help better illuminate the groupings of PubMed abstracts. To address this issue, we proposed an ontological clustering method called GOClonto for conceptualizing PubMed abstracts. GOClonto uses latent semantic analysis (LSA) and gene ontology (GO) to identify key gene-related concepts and their relationships as well as allocate PubMed abstracts based on these key gene-related concepts. Based on two PubMed abstract collections, the experimental results show that GOClonto is able to identify key gene-related concepts and outperforms the STC (suffix tree clustering) algorithm, the Lingo algorithm, the Fuzzy Ants algorithm, and the clustering based TRS (tolerance rough set) algorithm. Moreover, the two ontologies generated by GOClonto show significant informative conceptual structures.

  5. Road Lane Detection Robust to Shadows Based on a Fuzzy System Using a Visible Light Camera Sensor

    PubMed Central

    Hoang, Toan Minh; Baek, Na Rae; Cho, Se Woon; Kim, Ki Wan; Park, Kang Ryoung

    2017-01-01

    Recently, autonomous vehicles, particularly self-driving cars, have received significant attention owing to rapid advancements in sensor and computation technologies. In addition to traffic sign recognition, road lane detection is one of the most important factors used in lane departure warning systems and autonomous vehicles for maintaining the safety of semi-autonomous and fully autonomous systems. Unlike traffic signs, road lanes are easily damaged by both internal and external factors such as road quality, occlusion (traffic on the road), weather conditions, and illumination (shadows from objects such as cars, trees, and buildings). Obtaining clear road lane markings for recognition processing is a difficult challenge. Therefore, we propose a method to overcome various illumination problems, particularly severe shadows, by using fuzzy system and line segment detector algorithms to obtain better results for detecting road lanes by a visible light camera sensor. Experimental results from three open databases, Caltech dataset, Santiago Lanes dataset (SLD), and Road Marking dataset, showed that our method outperformed conventional lane detection methods. PMID:29143764

  6. Autocorrelation-based time synchronous averaging for condition monitoring of planetary gearboxes in wind turbines

    NASA Astrophysics Data System (ADS)

    Ha, Jong M.; Youn, Byeng D.; Oh, Hyunseok; Han, Bongtae; Jung, Yoongho; Park, Jungho

    2016-03-01

    We propose autocorrelation-based time synchronous averaging (ATSA) to cope with the challenges associated with the current practice of time synchronous averaging (TSA) for planet gears in planetary gearboxes of wind turbine (WT). An autocorrelation function that represents physical interactions between the ring, sun, and planet gears in the gearbox is utilized to define the optimal shape and range of the window function for TSA using actual kinetic responses. The proposed ATSA offers two distinctive features: (1) data-efficient TSA processing and (2) prevention of signal distortion during the TSA process. It is thus expected that an order analysis with the ATSA signals significantly improves the efficiency and accuracy in fault diagnostics of planet gears in planetary gearboxes. Two case studies are presented to demonstrate the effectiveness of the proposed method: an analytical signal from a simulation and a signal measured from a 2 kW WT testbed. It can be concluded from the results that the proposed method outperforms conventional TSA methods in condition monitoring of the planetary gearbox when the amount of available stationary data is limited.

  7. Semi-Supervised Projective Non-Negative Matrix Factorization for Cancer Classification.

    PubMed

    Zhang, Xiang; Guan, Naiyang; Jia, Zhilong; Qiu, Xiaogang; Luo, Zhigang

    2015-01-01

    Advances in DNA microarray technologies have made gene expression profiles a significant candidate in identifying different types of cancers. Traditional learning-based cancer identification methods utilize labeled samples to train a classifier, but they are inconvenient for practical application because labels are quite expensive in the clinical cancer research community. This paper proposes a semi-supervised projective non-negative matrix factorization method (Semi-PNMF) to learn an effective classifier from both labeled and unlabeled samples, thus boosting subsequent cancer classification performance. In particular, Semi-PNMF jointly learns a non-negative subspace from concatenated labeled and unlabeled samples and indicates classes by the positions of the maximum entries of their coefficients. Because Semi-PNMF incorporates statistical information from the large volume of unlabeled samples in the learned subspace, it can learn more representative subspaces and boost classification performance. We developed a multiplicative update rule (MUR) to optimize Semi-PNMF and proved its convergence. The experimental results of cancer classification for two multiclass cancer gene expression profile datasets show that Semi-PNMF outperforms the representative methods.

  8. A Weighted Deep Representation Learning Model for Imbalanced Fault Diagnosis in Cyber-Physical Systems

    PubMed Central

    Guo, Yang; Lin, Wenfang; Yu, Shuyang; Ji, Yang

    2018-01-01

    Predictive maintenance plays an important role in modern Cyber-Physical Systems (CPSs) and data-driven methods have been a worthwhile direction for Prognostics Health Management (PHM). However, two main challenges have significant influences on the traditional fault diagnostic models: one is that extracting hand-crafted features from multi-dimensional sensors with internal dependencies depends too much on expertise knowledge; the other is that imbalance pervasively exists among faulty and normal samples. As deep learning models have proved to be good methods for automatic feature extraction, the objective of this paper is to study an optimized deep learning model for imbalanced fault diagnosis for CPSs. Thus, this paper proposes a weighted Long Recurrent Convolutional LSTM model with sampling policy (wLRCL-D) to deal with these challenges. The model consists of 2-layer CNNs, 2-layer inner LSTMs and 2-Layer outer LSTMs, with under-sampling policy and weighted cost-sensitive loss function. Experiments are conducted on PHM 2015 challenge datasets, and the results show that wLRCL-D outperforms other baseline methods. PMID:29621131

  9. A propensity score approach to correction for bias due to population stratification using genetic and non-genetic factors.

    PubMed

    Zhao, Huaqing; Rebbeck, Timothy R; Mitra, Nandita

    2009-12-01

    Confounding due to population stratification (PS) arises when differences in both allele and disease frequencies exist in a population of mixed racial/ethnic subpopulations. Genomic control, structured association, principal components analysis (PCA), and multidimensional scaling (MDS) approaches have been proposed to address this bias using genetic markers. However, confounding due to PS can also be due to non-genetic factors. Propensity scores are widely used to address confounding in observational studies but have not been adapted to deal with PS in genetic association studies. We propose a genomic propensity score (GPS) approach to correct for bias due to PS that considers both genetic and non-genetic factors. We compare the GPS method with PCA and MDS using simulation studies. Our results show that GPS can adequately adjust and consistently correct for bias due to PS. Under no/mild, moderate, and severe PS, GPS yielded estimated with bias close to 0 (mean=-0.0044, standard error=0.0087). Under moderate or severe PS, the GPS method consistently outperforms the PCA method in terms of bias, coverage probability (CP), and type I error. Under moderate PS, the GPS method consistently outperforms the MDS method in terms of CP. PCA maintains relatively high power compared to both MDS and GPS methods under the simulated situations. GPS and MDS are comparable in terms of statistical properties such as bias, type I error, and power. The GPS method provides a novel and robust tool for obtaining less-biased estimates of genetic associations that can consider both genetic and non-genetic factors. 2009 Wiley-Liss, Inc.

  10. Weighted community detection and data clustering using message passing

    NASA Astrophysics Data System (ADS)

    Shi, Cheng; Liu, Yanchen; Zhang, Pan

    2018-03-01

    Grouping objects into clusters based on the similarities or weights between them is one of the most important problems in science and engineering. In this work, by extending message-passing algorithms and spectral algorithms proposed for an unweighted community detection problem, we develop a non-parametric method based on statistical physics, by mapping the problem to the Potts model at the critical temperature of spin-glass transition and applying belief propagation to solve the marginals corresponding to the Boltzmann distribution. Our algorithm is robust to over-fitting and gives a principled way to determine whether there are significant clusters in the data and how many clusters there are. We apply our method to different clustering tasks. In the community detection problem in weighted and directed networks, we show that our algorithm significantly outperforms existing algorithms. In the clustering problem, where the data were generated by mixture models in the sparse regime, we show that our method works all the way down to the theoretical limit of detectability and gives accuracy very close to that of the optimal Bayesian inference. In the semi-supervised clustering problem, our method only needs several labels to work perfectly in classic datasets. Finally, we further develop Thouless-Anderson-Palmer equations which heavily reduce the computation complexity in dense networks but give almost the same performance as belief propagation.

  11. Improved interpolation of meteorological forcings for hydrologic applications in a Swiss Alpine region

    NASA Astrophysics Data System (ADS)

    Tobin, Cara; Nicotina, Ludovico; Parlange, Marc B.; Berne, Alexis; Rinaldo, Andrea

    2011-04-01

    SummaryThis paper presents a comparative study on the mapping of temperature and precipitation fields in complex Alpine terrain. Its relevance hinges on the major impact that inadequate interpolations of meteorological forcings bear on the accuracy of hydrologic predictions regardless of the specifics of the models, particularly during flood events. Three flood events measured in the Swiss Alps are analyzed in detail to determine the interpolation methods which best capture the distribution of intense, orographically-induced precipitation. The interpolation techniques comparatively examined include: Inverse Distance Weighting (IDW), Ordinary Kriging (OK), and Kriging with External Drift (KED). Geostatistical methods rely on a robust anisotropic variogram for the definition of the spatial rainfall structure. Results indicate that IDW tends to significantly underestimate rainfall volumes whereas OK and KED methods capture spatial patterns and rainfall volumes induced by storm advection. Using numerical weather forecasts and elevation data as covariates for precipitation, we provide evidence for KED to outperform the other methods. Most significantly, the use of elevation as auxiliary information in KED of temperatures demonstrates minimal errors in estimated instantaneous rainfall volumes and provides instantaneous lapse rates which better capture snow/rainfall partitioning. Incorporation of the temperature and precipitation input fields into a hydrological model used for operational management was found to provide vastly improved outputs with respect to measured discharge volumes and flood peaks, with notable implications for flood modeling.

  12. Predictive modeling of structured electronic health records for adverse drug event detection

    PubMed Central

    2015-01-01

    Background The digitization of healthcare data, resulting from the increasingly widespread adoption of electronic health records, has greatly facilitated its analysis by computational methods and thereby enabled large-scale secondary use thereof. This can be exploited to support public health activities such as pharmacovigilance, wherein the safety of drugs is monitored to inform regulatory decisions about sustained use. To that end, electronic health records have emerged as a potentially valuable data source, providing access to longitudinal observations of patient treatment and drug use. A nascent line of research concerns predictive modeling of healthcare data for the automatic detection of adverse drug events, which presents its own set of challenges: it is not yet clear how to represent the heterogeneous data types in a manner conducive to learning high-performing machine learning models. Methods Datasets from an electronic health record database are used for learning predictive models with the purpose of detecting adverse drug events. The use and representation of two data types, as well as their combination, are studied: clinical codes, describing prescribed drugs and assigned diagnoses, and measurements. Feature selection is conducted on the various types of data to reduce dimensionality and sparsity, while allowing for an in-depth feature analysis of the usefulness of each data type and representation. Results Within each data type, combining multiple representations yields better predictive performance compared to using any single representation. The use of clinical codes for adverse drug event detection significantly outperforms the use of measurements; however, there is no significant difference over datasets between using only clinical codes and their combination with measurements. For certain adverse drug events, the combination does, however, outperform using only clinical codes. Feature selection leads to increased predictive performance for both data types, in isolation and combined. Conclusions We have demonstrated how machine learning can be applied to electronic health records for the purpose of detecting adverse drug events and proposed solutions to some of the challenges this presents, including how to represent the various data types. Overall, clinical codes are more useful than measurements and, in specific cases, it is beneficial to combine the two. PMID:26606038

  13. Prediction of cyclohexane-water distribution coefficient for SAMPL5 drug-like compounds with the QMPFF3 and ARROW polarizable force fields.

    PubMed

    Kamath, Ganesh; Kurnikov, Igor; Fain, Boris; Leontyev, Igor; Illarionov, Alexey; Butin, Oleg; Olevanov, Michael; Pereyaslavets, Leonid

    2016-11-01

    We present the performance of blind predictions of water-cyclohexane distribution coefficients for 53 drug-like compounds in the SAMPL5 challenge by three methods currently in use within our group. Two of them utilize QMPFF3 and ARROW, polarizable force-fields of varying complexity, and the third uses the General Amber Force-Field (GAFF). The polarizable FF's are implemented in an in-house MD package, Arbalest. We find that when we had time to parametrize the functional groups with care (batch 0), the polarizable force-fields outperformed the non-polarizable one. Conversely, on the full set of 53 compounds, GAFF performed better than both QMPFF3 and ARROW. We also describe the torsion-restrain method we used to improve sampling of molecular conformational space and thus the overall accuracy of prediction. The SAMPL5 challenge highlighted several drawbacks of our force-fields, such as our significant systematic over-estimation of hydrophobic interactions, specifically for alkanes and aromatic rings.

  14. Mirroring co-evolving trees in the light of their topologies.

    PubMed

    Hajirasouliha, Iman; Schönhuth, Alexander; de Juan, David; Valencia, Alfonso; Sahinalp, S Cenk

    2012-05-01

    Determining the interaction partners among protein/domain families poses hard computational problems, in particular in the presence of paralogous proteins. Available approaches aim to identify interaction partners among protein/domain families through maximizing the similarity between trimmed versions of their phylogenetic trees. Since maximization of any natural similarity score is computationally difficult, many approaches employ heuristics to evaluate the distance matrices corresponding to the tree topologies in question. In this article, we devise an efficient deterministic algorithm which directly maximizes the similarity between two leaf labeled trees with edge lengths, obtaining a score-optimal alignment of the two trees in question. Our algorithm is significantly faster than those methods based on distance matrix comparison: 1 min on a single processor versus 730 h on a supercomputer. Furthermore, we outperform the current state-of-the-art exhaustive search approach in terms of precision, while incurring acceptable losses in recall. A C implementation of the method demonstrated in this article is available at http://compbio.cs.sfu.ca/mirrort.htm

  15. MIIC online: a web server to reconstruct causal or non-causal networks from non-perturbative data.

    PubMed

    Sella, Nadir; Verny, Louis; Uguzzoni, Guido; Affeldt, Séverine; Isambert, Hervé

    2018-07-01

    We present a web server running the MIIC algorithm, a network learning method combining constraint-based and information-theoretic frameworks to reconstruct causal, non-causal or mixed networks from non-perturbative data, without the need for an a priori choice on the class of reconstructed network. Starting from a fully connected network, the algorithm first removes dispensable edges by iteratively subtracting the most significant information contributions from indirect paths between each pair of variables. The remaining edges are then filtered based on their confidence assessment or oriented based on the signature of causality in observational data. MIIC online server can be used for a broad range of biological data, including possible unobserved (latent) variables, from single-cell gene expression data to protein sequence evolution and outperforms or matches state-of-the-art methods for either causal or non-causal network reconstruction. MIIC online can be freely accessed at https://miic.curie.fr. Supplementary data are available at Bioinformatics online.

  16. Chelating agent-free, vapor-assisted crystallization method to synthesize hierarchical microporous/mesoporous MIL-125 (Ti).

    PubMed

    McNamara, Nicholas D; Hicks, Jason C

    2015-03-11

    Titanium-based microporous heterogeneous catalysts are widely studied but are often limited by the accessibility of reactants to active sites. Metal-organic frameworks (MOFs), such as MIL-125 (Ti), exhibit enhanced surface areas due to their high intrinsic microporosity, but the pore diameters of most microporous MOFs are often too small to allow for the diffusion of larger reactants (>7 Å) relevant to petroleum and biomass upgrading. In this work, hierarchical microporous MIL-125 exhibiting significantly enhanced interparticle mesoporosity was synthesized using a chelating-free, vapor-assisted crystallization method. The resulting hierarchical MOF was examined as an active catalyst for the oxidation of dibenzothiophene (DBT) with tert-butyl hydroperoxide and outperformed the solely microporous analogue. This was attributed to greater access of the substrate to surface active sites, as the pores in the microporous analogues were of inadequate size to accommodate DBT. Moreover, thiophene adsorption studies suggested the mesoporous MOF contained larger amounts of unsaturated metal sites that could enhance the observed catalytic activity.

  17. Predicting Pharmacodynamic Drug-Drug Interactions through Signaling Propagation Interference on Protein-Protein Interaction Networks.

    PubMed

    Park, Kyunghyun; Kim, Docyong; Ha, Suhyun; Lee, Doheon

    2015-01-01

    As pharmacodynamic drug-drug interactions (PD DDIs) could lead to severe adverse effects in patients, it is important to identify potential PD DDIs in drug development. The signaling starting from drug targets is propagated through protein-protein interaction (PPI) networks. PD DDIs could occur by close interference on the same targets or within the same pathways as well as distant interference through cross-talking pathways. However, most of the previous approaches have considered only close interference by measuring distances between drug targets or comparing target neighbors. We have applied a random walk with restart algorithm to simulate signaling propagation from drug targets in order to capture the possibility of their distant interference. Cross validation with DrugBank and Kyoto Encyclopedia of Genes and Genomes DRUG shows that the proposed method outperforms the previous methods significantly. We also provide a web service with which PD DDIs for drug pairs can be analyzed at http://biosoft.kaist.ac.kr/targetrw.

  18. Link prediction based on local weighted paths for complex networks

    NASA Astrophysics Data System (ADS)

    Yao, Yabing; Zhang, Ruisheng; Yang, Fan; Yuan, Yongna; Hu, Rongjing; Zhao, Zhili

    As a significant problem in complex networks, link prediction aims to find the missing and future links between two unconnected nodes by estimating the existence likelihood of potential links. It plays an important role in understanding the evolution mechanism of networks and has broad applications in practice. In order to improve prediction performance, a variety of structural similarity-based methods that rely on different topological features have been put forward. As one topological feature, the path information between node pairs is utilized to calculate the node similarity. However, many path-dependent methods neglect the different contributions of paths for a pair of nodes. In this paper, a local weighted path (LWP) index is proposed to differentiate the contributions between paths. The LWP index considers the effect of the link degrees of intermediate links and the connectivity influence of intermediate nodes on paths to quantify the path weight in the prediction procedure. The experimental results on 12 real-world networks show that the LWP index outperforms other seven prediction baselines.

  19. Particle swarm optimization-based local entropy weighted histogram equalization for infrared image enhancement

    NASA Astrophysics Data System (ADS)

    Wan, Minjie; Gu, Guohua; Qian, Weixian; Ren, Kan; Chen, Qian; Maldague, Xavier

    2018-06-01

    Infrared image enhancement plays a significant role in intelligent urban surveillance systems for smart city applications. Unlike existing methods only exaggerating the global contrast, we propose a particle swam optimization-based local entropy weighted histogram equalization which involves the enhancement of both local details and fore-and background contrast. First of all, a novel local entropy weighted histogram depicting the distribution of detail information is calculated based on a modified hyperbolic tangent function. Then, the histogram is divided into two parts via a threshold maximizing the inter-class variance in order to improve the contrasts of foreground and background, respectively. To avoid over-enhancement and noise amplification, double plateau thresholds of the presented histogram are formulated by means of particle swarm optimization algorithm. Lastly, each sub-image is equalized independently according to the constrained sub-local entropy weighted histogram. Comparative experiments implemented on real infrared images prove that our algorithm outperforms other state-of-the-art methods in terms of both visual and quantized evaluations.

  20. Learning Semantic Tags from Big Data for Clinical Text Representation.

    PubMed

    Li, Yanpeng; Liu, Hongfang

    2015-01-01

    In clinical text mining, it is one of the biggest challenges to represent medical terminologies and n-gram terms in sparse medical reports using either supervised or unsupervised methods. Addressing this issue, we propose a novel method for word and n-gram representation at semantic level. We first represent each word by its distance with a set of reference features calculated by reference distance estimator (RDE) learned from labeled and unlabeled data, and then generate new features using simple techniques of discretization, random sampling and merging. The new features are a set of binary rules that can be interpreted as semantic tags derived from word and n-grams. We show that the new features significantly outperform classical bag-of-words and n-grams in the task of heart disease risk factor extraction in i2b2 2014 challenge. It is promising to see that semantics tags can be used to replace the original text entirely with even better prediction performance as well as derive new rules beyond lexical level.

  1. A framework for combining multiple soil moisture retrievals based on maximizing temporal correlation

    NASA Astrophysics Data System (ADS)

    Kim, Seokhyeon; Parinussa, Robert M.; Liu, Yi. Y.; Johnson, Fiona M.; Sharma, Ashish

    2015-08-01

    A method for combining two microwave satellite soil moisture products by maximizing the temporal correlation with a reference data set has been developed. The method was applied to two global soil moisture data sets, Japan Aerospace Exploration Agency (JAXA) and Land Parameter Retrieval Model (LPRM), retrieved from the Advanced Microwave Scanning Radiometer 2 observations for the period 2012-2014. A global comparison revealed superior results of the combined product compared to the individual products against the reference data set of ERA-Interim volumetric water content. The global mean temporal correlation coefficient of the combined product with this reference was 0.52 which outperforms the individual JAXA (0.35) as well as the LPRM (0.45) product. Additionally, the performance was evaluated against in situ observations from the International Soil Moisture Network. The combined data set showed a significant improvement in temporal correlation coefficients in the validation compared to JAXA and minor improvements for the LPRM product.

  2. An experimental phylogeny to benchmark ancestral sequence reconstruction

    PubMed Central

    Randall, Ryan N.; Radford, Caelan E.; Roof, Kelsey A.; Natarajan, Divya K.; Gaucher, Eric A.

    2016-01-01

    Ancestral sequence reconstruction (ASR) is a still-burgeoning method that has revealed many key mechanisms of molecular evolution. One criticism of the approach is an inability to validate its algorithms within a biological context as opposed to a computer simulation. Here we build an experimental phylogeny using the gene of a single red fluorescent protein to address this criticism. The evolved phylogeny consists of 19 operational taxonomic units (leaves) and 17 ancestral bifurcations (nodes) that display a wide variety of fluorescent phenotypes. The 19 leaves then serve as ‘modern' sequences that we subject to ASR analyses using various algorithms and to benchmark against the known ancestral genotypes and ancestral phenotypes. We confirm computer simulations that show all algorithms infer ancient sequences with high accuracy, yet we also reveal wide variation in the phenotypes encoded by incorrectly inferred sequences. Specifically, Bayesian methods incorporating rate variation significantly outperform the maximum parsimony criterion in phenotypic accuracy. Subsampling of extant sequences had minor effect on the inference of ancestral sequences. PMID:27628687

  3. Providing R-Tree Support for Mongodb

    NASA Astrophysics Data System (ADS)

    Xiang, Longgang; Shao, Xiaotian; Wang, Dehao

    2016-06-01

    Supporting large amounts of spatial data is a significant characteristic of modern databases. However, unlike some mature relational databases, such as Oracle and PostgreSQL, most of current burgeoning NoSQL databases are not well designed for storing geospatial data, which is becoming increasingly important in various fields. In this paper, we propose a novel method to provide R-tree index, as well as corresponding spatial range query and nearest neighbour query functions, for MongoDB, one of the most prevalent NoSQL databases. First, after in-depth analysis of MongoDB's features, we devise an efficient tabular document structure which flattens R-tree index into MongoDB collections. Further, relevant mechanisms of R-tree operations are issued, and then we discuss in detail how to integrate R-tree into MongoDB. Finally, we present the experimental results which show that our proposed method out-performs the built-in spatial index of MongoDB. Our research will greatly facilitate big data management issues with MongoDB in a variety of geospatial information applications.

  4. Discriminative parameter estimation for random walks segmentation.

    PubMed

    Baudin, Pierre-Yves; Goodman, Danny; Kumrnar, Puneet; Azzabou, Noura; Carlier, Pierre G; Paragios, Nikos; Kumar, M Pawan

    2013-01-01

    The Random Walks (RW) algorithm is one of the most efficient and easy-to-use probabilistic segmentation methods. By combining contrast terms with prior terms, it provides accurate segmentations of medical images in a fully automated manner. However, one of the main drawbacks of using the RW algorithm is that its parameters have to be hand-tuned. we propose a novel discriminative learning framework that estimates the parameters using a training dataset. The main challenge we face is that the training samples are not fully supervised. Specifically, they provide a hard segmentation of the images, instead of a probabilistic segmentation. We overcome this challenge by treating the optimal probabilistic segmentation that is compatible with the given hard segmentation as a latent variable. This allows us to employ the latent support vector machine formulation for parameter estimation. We show that our approach significantly outperforms the baseline methods on a challenging dataset consisting of real clinical 3D MRI volumes of skeletal muscles.

  5. SpeCond: a method to detect condition-specific gene expression

    PubMed Central

    2011-01-01

    Transcriptomic studies routinely measure expression levels across numerous conditions. These datasets allow identification of genes that are specifically expressed in a small number of conditions. However, there are currently no statistically robust methods for identifying such genes. Here we present SpeCond, a method to detect condition-specific genes that outperforms alternative approaches. We apply the method to a dataset of 32 human tissues to determine 2,673 specifically expressed genes. An implementation of SpeCond is freely available as a Bioconductor package at http://www.bioconductor.org/packages/release/bioc/html/SpeCond.html. PMID:22008066

  6. Iso-geometric analysis for neutron diffusion problems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hall, S. K.; Eaton, M. D.; Williams, M. M. R.

    Iso-geometric analysis can be viewed as a generalisation of the finite element method. It permits the exact representation of a wider range of geometries including conic sections. This is possible due to the use of concepts employed in computer-aided design. The underlying mathematical representations from computer-aided design are used to capture both the geometry and approximate the solution. In this paper the neutron diffusion equation is solved using iso-geometric analysis. The practical advantages are highlighted by looking at the problem of a circular fuel pin in a square moderator. For this problem the finite element method requires the geometry tomore » be approximated. This leads to errors in the shape and size of the interface between the fuel and the moderator. In contrast to this iso-geometric analysis allows the interface to be represented exactly. It is found that, due to a cancellation of errors, the finite element method converges more quickly than iso-geometric analysis for this problem. A fuel pin in a vacuum was then considered as this problem is highly sensitive to the leakage across the interface. In this case iso-geometric analysis greatly outperforms the finite element method. Due to the improvement in the representation of the geometry iso-geometric analysis can outperform traditional finite element methods. It is proposed that the use of iso-geometric analysis on neutron transport problems will allow deterministic solutions to be obtained for exact geometries. Something that is only currently possible with Monte Carlo techniques. (authors)« less

  7. Learning to rank-based gene summary extraction.

    PubMed

    Shang, Yue; Hao, Huihui; Wu, Jiajin; Lin, Hongfei

    2014-01-01

    In recent years, the biomedical literature has been growing rapidly. These articles provide a large amount of information about proteins, genes and their interactions. Reading such a huge amount of literature is a tedious task for researchers to gain knowledge about a gene. As a result, it is significant for biomedical researchers to have a quick understanding of the query concept by integrating its relevant resources. In the task of gene summary generation, we regard automatic summary as a ranking problem and apply the method of learning to rank to automatically solve this problem. This paper uses three features as a basis for sentence selection: gene ontology relevance, topic relevance and TextRank. From there, we obtain the feature weight vector using the learning to rank algorithm and predict the scores of candidate summary sentences and obtain top sentences to generate the summary. ROUGE (a toolkit for summarization of automatic evaluation) was used to evaluate the summarization result and the experimental results showed that our method outperforms the baseline techniques. According to the experimental result, the combination of three features can improve the performance of summary. The application of learning to rank can facilitate the further expansion of features for measuring the significance of sentences.

  8. An integrative approach to predicting the functional effects of small indels in non-coding regions of the human genome

    PubMed Central

    Ferlaino, Michael; Rogers, Mark F.; Shihab, Hashem A.; Mort, Matthew; Cooper, David N.; Gaunt, Tom R.; Campbell, Colin

    2018-01-01

    Background Small insertions and deletions (indels) have a significant influence in human disease and, in terms of frequency, they are second only to single nucleotide variants as pathogenic mutations. As the majority of mutations associated with complex traits are located outside the exome, it is crucial to investigate the potential pathogenic impact of indels in non-coding regions of the human genome. Results We present FATHMM-indel, an integrative approach to predict the functional effect, pathogenic or neutral, of indels in non-coding regions of the human genome. Our method exploits various genomic annotations in addition to sequence data. When validated on benchmark data, FATHMM-indel significantly outperforms CADD and GAVIN, state of the art models in assessing the pathogenic impact of non-coding variants. FATHMM-indel is available via a web server at indels.biocompute.org.uk. Conclusions FATHMM-indel can accurately predict the functional impact and prioritise small indels throughout the whole non-coding genome. PMID:28985712

  9. An integrative approach to predicting the functional effects of small indels in non-coding regions of the human genome.

    PubMed

    Ferlaino, Michael; Rogers, Mark F; Shihab, Hashem A; Mort, Matthew; Cooper, David N; Gaunt, Tom R; Campbell, Colin

    2017-10-06

    Small insertions and deletions (indels) have a significant influence in human disease and, in terms of frequency, they are second only to single nucleotide variants as pathogenic mutations. As the majority of mutations associated with complex traits are located outside the exome, it is crucial to investigate the potential pathogenic impact of indels in non-coding regions of the human genome. We present FATHMM-indel, an integrative approach to predict the functional effect, pathogenic or neutral, of indels in non-coding regions of the human genome. Our method exploits various genomic annotations in addition to sequence data. When validated on benchmark data, FATHMM-indel significantly outperforms CADD and GAVIN, state of the art models in assessing the pathogenic impact of non-coding variants. FATHMM-indel is available via a web server at indels.biocompute.org.uk. FATHMM-indel can accurately predict the functional impact and prioritise small indels throughout the whole non-coding genome.

  10. Pathway-GPS and SIGORA: identifying relevant pathways based on the over-representation of their gene-pair signatures

    PubMed Central

    Foroushani, Amir B.K.; Brinkman, Fiona S.L.

    2013-01-01

    Motivation. Predominant pathway analysis approaches treat pathways as collections of individual genes and consider all pathway members as equally informative. As a result, at times spurious and misleading pathways are inappropriately identified as statistically significant, solely due to components that they share with the more relevant pathways. Results. We introduce the concept of Pathway Gene-Pair Signatures (Pathway-GPS) as pairs of genes that, as a combination, are specific to a single pathway. We devised and implemented a novel approach to pathway analysis, Signature Over-representation Analysis (SIGORA), which focuses on the statistically significant enrichment of Pathway-GPS in a user-specified gene list of interest. In a comparative evaluation of several published datasets, SIGORA outperformed traditional methods by delivering biologically more plausible and relevant results. Availability. An efficient implementation of SIGORA, as an R package with precompiled GPS data for several human and mouse pathway repositories is available for download from http://sigora.googlecode.com/svn/. PMID:24432194

  11. The application of tailor-made force fields and molecular dynamics for NMR crystallography: a case study of free base cocaine

    PubMed Central

    Neumann, Marcus A.

    2017-01-01

    Motional averaging has been proven to be significant in predicting the chemical shifts in ab initio solid-state NMR calculations, and the applicability of motional averaging with molecular dynamics has been shown to depend on the accuracy of the molecular mechanical force field. The performance of a fully automatically generated tailor-made force field (TMFF) for the dynamic aspects of NMR crystallography is evaluated and compared with existing benchmarks, including static dispersion-corrected density functional theory calculations and the COMPASS force field. The crystal structure of free base cocaine is used as an example. The results reveal that, even though the TMFF outperforms the COMPASS force field for representing the energies and conformations of predicted structures, it does not give significant improvement in the accuracy of NMR calculations. Further studies should direct more attention to anisotropic chemical shifts and development of the method of solid-state NMR calculations. PMID:28250956

  12. Protein Identification Using Top-Down Spectra*

    PubMed Central

    Liu, Xiaowen; Sirotkin, Yakov; Shen, Yufeng; Anderson, Gordon; Tsai, Yihsuan S.; Ting, Ying S.; Goodlett, David R.; Smith, Richard D.; Bafna, Vineet; Pevzner, Pavel A.

    2012-01-01

    In the last two years, because of advances in protein separation and mass spectrometry, top-down mass spectrometry moved from analyzing single proteins to analyzing complex samples and identifying hundreds and even thousands of proteins. However, computational tools for database search of top-down spectra against protein databases are still in their infancy. We describe MS-Align+, a fast algorithm for top-down protein identification based on spectral alignment that enables searches for unexpected post-translational modifications. We also propose a method for evaluating statistical significance of top-down protein identifications and further benchmark various software tools on two top-down data sets from Saccharomyces cerevisiae and Salmonella typhimurium. We demonstrate that MS-Align+ significantly increases the number of identified spectra as compared with MASCOT and OMSSA on both data sets. Although MS-Align+ and ProSightPC have similar performance on the Salmonella typhimurium data set, MS-Align+ outperforms ProSightPC on the (more complex) Saccharomyces cerevisiae data set. PMID:22027200

  13. Rapid control and feedback rates enhance neuroprosthetic control

    PubMed Central

    Shanechi, Maryam M.; Orsborn, Amy L.; Moorman, Helene G.; Gowda, Suraj; Dangi, Siddharth; Carmena, Jose M.

    2017-01-01

    Brain-machine interfaces (BMI) create novel sensorimotor pathways for action. Much as the sensorimotor apparatus shapes natural motor control, the BMI pathway characteristics may also influence neuroprosthetic control. Here, we explore the influence of control and feedback rates, where control rate indicates how often motor commands are sent from the brain to the prosthetic, and feedback rate indicates how often visual feedback of the prosthetic is provided to the subject. We developed a new BMI that allows arbitrarily fast control and feedback rates, and used it to dissociate the effects of each rate in two monkeys. Increasing the control rate significantly improved control even when feedback rate was unchanged. Increasing the feedback rate further facilitated control. We also show that our high-rate BMI significantly outperformed state-of-the-art methods due to higher control and feedback rates, combined with a different point process mathematical encoding model. Our BMI paradigm can dissect the contribution of different elements in the sensorimotor pathway, providing a unique tool for studying neuroprosthetic control mechanisms. PMID:28059065

  14. Rapid control and feedback rates enhance neuroprosthetic control

    NASA Astrophysics Data System (ADS)

    Shanechi, Maryam M.; Orsborn, Amy L.; Moorman, Helene G.; Gowda, Suraj; Dangi, Siddharth; Carmena, Jose M.

    2017-01-01

    Brain-machine interfaces (BMI) create novel sensorimotor pathways for action. Much as the sensorimotor apparatus shapes natural motor control, the BMI pathway characteristics may also influence neuroprosthetic control. Here, we explore the influence of control and feedback rates, where control rate indicates how often motor commands are sent from the brain to the prosthetic, and feedback rate indicates how often visual feedback of the prosthetic is provided to the subject. We developed a new BMI that allows arbitrarily fast control and feedback rates, and used it to dissociate the effects of each rate in two monkeys. Increasing the control rate significantly improved control even when feedback rate was unchanged. Increasing the feedback rate further facilitated control. We also show that our high-rate BMI significantly outperformed state-of-the-art methods due to higher control and feedback rates, combined with a different point process mathematical encoding model. Our BMI paradigm can dissect the contribution of different elements in the sensorimotor pathway, providing a unique tool for studying neuroprosthetic control mechanisms.

  15. Physiological outperformance at the morphologically-transformed edge of the cyanobacteriosponge Terpios hoshinota (Suberitidae: Hadromerida) when confronting opponent corals.

    PubMed

    Wang, Jih-Terng; Hsu, Chia-Min; Kuo, Chao-Yang; Meng, Pei-Jie; Kao, Shuh-Ji; Chen, Chaolun Allen

    2015-01-01

    Terpios hoshinota, an encrusting cyanosponge, is known as a strong substrate competitor of reef-building corals that kills encountered coral by overgrowth. Terpios outbreaks cause significant declines in living coral cover in Indo-Pacific coral reefs, with the damage usually lasting for decades. Recent studies show that there are morphological transformations at a sponge's growth front when confronting corals. Whether these morphological transformations at coral contacts are involved with physiological outperformance (e.g., higher metabolic activity or nutritional status) over other portions of Terpios remains equivocal. In this study, we compared the indicators of photosynthetic capability and nitrogen status of a sponge-cyanobacteria association at proximal, middle, and distal portions of opponent corals. Terpios tissues in contact with corals displayed significant increases in photosynthetic oxygen production (ca. 61%), the δ13C value (ca. 4%), free proteinogenic amino acid content (ca. 85%), and Gln/Glu ratio (ca. 115%) compared to middle and distal parts of the sponge. In contrast, the maximum quantum yield (Fv/Fm), which is the indicator usually used to represent the integrity of photosystem II, of cyanobacteria photosynthesis was low (0.256~0.319) and showed an inverse trend of higher values in the distal portion of the sponge that might be due to high and variable levels of cyanobacterial phycocyanin. The inconsistent results between photosynthetic oxygen production and Fv/Fm values indicated that maximum quantum yields might not be a suitable indicator to represent the photosynthetic function of the Terpios-cyanobacteria association. Our data conclusively suggest that Terpios hoshinota competes with opponent corals not only by the morphological transformation of the sponge-cyanobacteria association but also by physiological outperformance in accumulating resources for the battle.

  16. A New Feature-Enhanced Speckle Reduction Method Based on Multiscale Analysis for Ultrasound B-Mode Imaging.

    PubMed

    Kang, Jinbum; Lee, Jae Young; Yoo, Yangmo

    2016-06-01

    Effective speckle reduction in ultrasound B-mode imaging is important for enhancing the image quality and improving the accuracy in image analysis and interpretation. In this paper, a new feature-enhanced speckle reduction (FESR) method based on multiscale analysis and feature enhancement filtering is proposed for ultrasound B-mode imaging. In FESR, clinical features (e.g., boundaries and borders of lesions) are selectively emphasized by edge, coherence, and contrast enhancement filtering from fine to coarse scales while simultaneously suppressing speckle development via robust diffusion filtering. In the simulation study, the proposed FESR method showed statistically significant improvements in edge preservation, mean structure similarity, speckle signal-to-noise ratio, and contrast-to-noise ratio (CNR) compared with other speckle reduction methods, e.g., oriented speckle reducing anisotropic diffusion (OSRAD), nonlinear multiscale wavelet diffusion (NMWD), the Laplacian pyramid-based nonlinear diffusion and shock filter (LPNDSF), and the Bayesian nonlocal means filter (OBNLM). Similarly, the FESR method outperformed the OSRAD, NMWD, LPNDSF, and OBNLM methods in terms of CNR, i.e., 10.70 ± 0.06 versus 9.00 ± 0.06, 9.78 ± 0.06, 8.67 ± 0.04, and 9.22 ± 0.06 in the phantom study, respectively. Reconstructed B-mode images that were developed using the five speckle reduction methods were reviewed by three radiologists for evaluation based on each radiologist's diagnostic preferences. All three radiologists showed a significant preference for the abdominal liver images obtained using the FESR methods in terms of conspicuity, margin sharpness, artificiality, and contrast, p<0.0001. For the kidney and thyroid images, the FESR method showed similar improvement over other methods. However, the FESR method did not show statistically significant improvement compared with the OBNLM method in margin sharpness for the kidney and thyroid images. These results demonstrate that the proposed FESR method can improve the image quality of ultrasound B-mode imaging by enhancing the visualization of lesion features while effectively suppressing speckle noise.

  17. Evaluating and addressing the effects of regression to the mean phenomenon in estimating collision frequencies on urban high collision concentration locations.

    PubMed

    Lee, Jinwoo; Chung, Koohong; Kang, Seungmo

    2016-12-01

    Two different methods for addressing the regression to the mean phenomenon (RTM) were evaluated using empirical data: Data from 110 miles of freeway located in California were used to evaluate the performance of the EB and CRP methods in addressing RTM. CRP outperformed the EB method in estimating collision frequencies in selected high collision concentration locations (HCCLs). Findings indicate that the performance of the EB method can be markedly affected when SPF is biased, while the performance of CRP remains much less affected. The CRP method was more effective in addressing RTM. Published by Elsevier Ltd.

  18. Exponential integrators in time-dependent density-functional calculations

    NASA Astrophysics Data System (ADS)

    Kidd, Daniel; Covington, Cody; Varga, Kálmán

    2017-12-01

    The integrating factor and exponential time differencing methods are implemented and tested for solving the time-dependent Kohn-Sham equations. Popular time propagation methods used in physics, as well as other robust numerical approaches, are compared to these exponential integrator methods in order to judge the relative merit of the computational schemes. We determine an improvement in accuracy of multiple orders of magnitude when describing dynamics driven primarily by a nonlinear potential. For cases of dynamics driven by a time-dependent external potential, the accuracy of the exponential integrator methods are less enhanced but still match or outperform the best of the conventional methods tested.

  19. Neural networks for link prediction in realistic biomedical graphs: a multi-dimensional evaluation of graph embedding-based approaches.

    PubMed

    Crichton, Gamal; Guo, Yufan; Pyysalo, Sampo; Korhonen, Anna

    2018-05-21

    Link prediction in biomedical graphs has several important applications including predicting Drug-Target Interactions (DTI), Protein-Protein Interaction (PPI) prediction and Literature-Based Discovery (LBD). It can be done using a classifier to output the probability of link formation between nodes. Recently several works have used neural networks to create node representations which allow rich inputs to neural classifiers. Preliminary works were done on this and report promising results. However they did not use realistic settings like time-slicing, evaluate performances with comprehensive metrics or explain when or why neural network methods outperform. We investigated how inputs from four node representation algorithms affect performance of a neural link predictor on random- and time-sliced biomedical graphs of real-world sizes (∼ 6 million edges) containing information relevant to DTI, PPI and LBD. We compared the performance of the neural link predictor to those of established baselines and report performance across five metrics. In random- and time-sliced experiments when the neural network methods were able to learn good node representations and there was a negligible amount of disconnected nodes, those approaches outperformed the baselines. In the smallest graph (∼ 15,000 edges) and in larger graphs with approximately 14% disconnected nodes, baselines such as Common Neighbours proved a justifiable choice for link prediction. At low recall levels (∼ 0.3) the approaches were mostly equal, but at higher recall levels across all nodes and average performance at individual nodes, neural network approaches were superior. Analysis showed that neural network methods performed well on links between nodes with no previous common neighbours; potentially the most interesting links. Additionally, while neural network methods benefit from large amounts of data, they require considerable amounts of computational resources to utilise them. Our results indicate that when there is enough data for the neural network methods to use and there are a negligible amount of disconnected nodes, those approaches outperform the baselines. At low recall levels the approaches are mostly equal but at higher recall levels and average performance at individual nodes, neural network approaches are superior. Performance at nodes without common neighbours which indicate more unexpected and perhaps more useful links account for this.

  20. Wireless sensor networks for heritage object deformation detection and tracking algorithm.

    PubMed

    Xie, Zhijun; Huang, Guangyan; Zarei, Roozbeh; He, Jing; Zhang, Yanchun; Ye, Hongwu

    2014-10-31

    Deformation is the direct cause of heritage object collapse. It is significant to monitor and signal the early warnings of the deformation of heritage objects. However, traditional heritage object monitoring methods only roughly monitor a simple-shaped heritage object as a whole, but cannot monitor complicated heritage objects, which may have a large number of surfaces inside and outside. Wireless sensor networks, comprising many small-sized, low-cost, low-power intelligent sensor nodes, are more useful to detect the deformation of every small part of the heritage objects. Wireless sensor networks need an effective mechanism to reduce both the communication costs and energy consumption in order to monitor the heritage objects in real time. In this paper, we provide an effective heritage object deformation detection and tracking method using wireless sensor networks (EffeHDDT). In EffeHDDT, we discover a connected core set of sensor nodes to reduce the communication cost for transmitting and collecting the data of the sensor networks. Particularly, we propose a heritage object boundary detecting and tracking mechanism. Both theoretical analysis and experimental results demonstrate that our EffeHDDT method outperforms the existing methods in terms of network traffic and the precision of the deformation detection.

  1. Multilabel user classification using the community structure of online networks

    PubMed Central

    Papadopoulos, Symeon; Kompatsiaris, Yiannis

    2017-01-01

    We study the problem of semi-supervised, multi-label user classification of networked data in the online social platform setting. We propose a framework that combines unsupervised community extraction and supervised, community-based feature weighting before training a classifier. We introduce Approximate Regularized Commute-Time Embedding (ARCTE), an algorithm that projects the users of a social graph onto a latent space, but instead of packing the global structure into a matrix of predefined rank, as many spectral and neural representation learning methods do, it extracts local communities for all users in the graph in order to learn a sparse embedding. To this end, we employ an improvement of personalized PageRank algorithms for searching locally in each user’s graph structure. Then, we perform supervised community feature weighting in order to boost the importance of highly predictive communities. We assess our method performance on the problem of user classification by performing an extensive comparative study among various recent methods based on graph embeddings. The comparison shows that ARCTE significantly outperforms the competition in almost all cases, achieving up to 35% relative improvement compared to the second best competing method in terms of F1-score. PMID:28278242

  2. A local immunization strategy for networks with overlapping community structure

    NASA Astrophysics Data System (ADS)

    Taghavian, Fatemeh; Salehi, Mostafa; Teimouri, Mehdi

    2017-02-01

    Since full coverage treatment is not feasible due to limited resources, we need to utilize an immunization strategy to effectively distribute the available vaccines. On the other hand, the structure of contact network among people has a significant impact on epidemics of infectious diseases (such as SARS and influenza) in a population. Therefore, network-based immunization strategies aim to reduce the spreading rate by removing the vaccinated nodes from contact network. Such strategies try to identify more important nodes in epidemics spreading over a network. In this paper, we address the effect of overlapping nodes among communities on epidemics spreading. The proposed strategy is an optimized random-walk based selection of these nodes. The whole process is local, i.e. it requires contact network information in the level of nodes. Thus, it is applicable to large-scale and unknown networks in which the global methods usually are unrealizable. Our simulation results on different synthetic and real networks show that the proposed method outperforms the existing local methods in most cases. In particular, for networks with strong community structures, high overlapping membership of nodes or small size communities, the proposed method shows better performance.

  3. Hidden Markov random field model and Broyden-Fletcher-Goldfarb-Shanno algorithm for brain image segmentation

    NASA Astrophysics Data System (ADS)

    Guerrout, EL-Hachemi; Ait-Aoudia, Samy; Michelucci, Dominique; Mahiou, Ramdane

    2018-05-01

    Many routine medical examinations produce images of patients suffering from various pathologies. With the huge number of medical images, the manual analysis and interpretation became a tedious task. Thus, automatic image segmentation became essential for diagnosis assistance. Segmentation consists in dividing the image into homogeneous and significant regions. We focus on hidden Markov random fields referred to as HMRF to model the problem of segmentation. This modelisation leads to a classical function minimisation problem. Broyden-Fletcher-Goldfarb-Shanno algorithm referred to as BFGS is one of the most powerful methods to solve unconstrained optimisation problem. In this paper, we investigate the combination of HMRF and BFGS algorithm to perform the segmentation operation. The proposed method shows very good segmentation results comparing with well-known approaches. The tests are conducted on brain magnetic resonance image databases (BrainWeb and IBSR) largely used to objectively confront the results obtained. The well-known Dice coefficient (DC) was used as similarity metric. The experimental results show that, in many cases, our proposed method approaches the perfect segmentation with a Dice Coefficient above .9. Moreover, it generally outperforms other methods in the tests conducted.

  4. Large-scale systematic analysis of 2D fingerprint methods and parameters to improve virtual screening enrichments.

    PubMed

    Sastry, Madhavi; Lowrie, Jeffrey F; Dixon, Steven L; Sherman, Woody

    2010-05-24

    A systematic virtual screening study on 11 pharmaceutically relevant targets has been conducted to investigate the interrelation between 8 two-dimensional (2D) fingerprinting methods, 13 atom-typing schemes, 13 bit scaling rules, and 12 similarity metrics using the new cheminformatics package Canvas. In total, 157 872 virtual screens were performed to assess the ability of each combination of parameters to identify actives in a database screen. In general, fingerprint methods, such as MOLPRINT2D, Radial, and Dendritic that encode information about local environment beyond simple linear paths outperformed other fingerprint methods. Atom-typing schemes with more specific information, such as Daylight, Mol2, and Carhart were generally superior to more generic atom-typing schemes. Enrichment factors across all targets were improved considerably with the best settings, although no single set of parameters performed optimally on all targets. The size of the addressable bit space for the fingerprints was also explored, and it was found to have a substantial impact on enrichments. Small bit spaces, such as 1024, resulted in many collisions and in a significant degradation in enrichments compared to larger bit spaces that avoid collisions.

  5. A Nonlinear Framework of Delayed Particle Smoothing Method for Vehicle Localization under Non-Gaussian Environment

    PubMed Central

    Xiao, Zhu; Havyarimana, Vincent; Li, Tong; Wang, Dong

    2016-01-01

    In this paper, a novel nonlinear framework of smoothing method, non-Gaussian delayed particle smoother (nGDPS), is proposed, which enables vehicle state estimation (VSE) with high accuracy taking into account the non-Gaussianity of the measurement and process noises. Within the proposed method, the multivariate Student’s t-distribution is adopted in order to compute the probability distribution function (PDF) related to the process and measurement noises, which are assumed to be non-Gaussian distributed. A computation approach based on Ensemble Kalman Filter (EnKF) is designed to cope with the mean and the covariance matrix of the proposal non-Gaussian distribution. A delayed Gibbs sampling algorithm, which incorporates smoothing of the sampled trajectories over a fixed-delay, is proposed to deal with the sample degeneracy of particles. The performance is investigated based on the real-world data, which is collected by low-cost on-board vehicle sensors. The comparison study based on the real-world experiments and the statistical analysis demonstrates that the proposed nGDPS has significant improvement on the vehicle state accuracy and outperforms the existing filtering and smoothing methods. PMID:27187405

  6. Multilabel user classification using the community structure of online networks.

    PubMed

    Rizos, Georgios; Papadopoulos, Symeon; Kompatsiaris, Yiannis

    2017-01-01

    We study the problem of semi-supervised, multi-label user classification of networked data in the online social platform setting. We propose a framework that combines unsupervised community extraction and supervised, community-based feature weighting before training a classifier. We introduce Approximate Regularized Commute-Time Embedding (ARCTE), an algorithm that projects the users of a social graph onto a latent space, but instead of packing the global structure into a matrix of predefined rank, as many spectral and neural representation learning methods do, it extracts local communities for all users in the graph in order to learn a sparse embedding. To this end, we employ an improvement of personalized PageRank algorithms for searching locally in each user's graph structure. Then, we perform supervised community feature weighting in order to boost the importance of highly predictive communities. We assess our method performance on the problem of user classification by performing an extensive comparative study among various recent methods based on graph embeddings. The comparison shows that ARCTE significantly outperforms the competition in almost all cases, achieving up to 35% relative improvement compared to the second best competing method in terms of F1-score.

  7. Deviation-based spam-filtering method via stochastic approach

    NASA Astrophysics Data System (ADS)

    Lee, Daekyung; Lee, Mi Jin; Kim, Beom Jun

    2018-03-01

    In the presence of a huge number of possible purchase choices, ranks or ratings of items by others often play very important roles for a buyer to make a final purchase decision. Perfectly objective rating is an impossible task to achieve, and we often use an average rating built on how previous buyers estimated the quality of the product. The problem of using a simple average rating is that it can easily be polluted by careless users whose evaluation of products cannot be trusted, and by malicious spammers who try to bias the rating result on purpose. In this letter we suggest how trustworthiness of individual users can be systematically and quantitatively reflected to build a more reliable rating system. We compute the suitably defined reliability of each user based on the user's rating pattern for all products she evaluated. We call our proposed method as the deviation-based ranking, since the statistical significance of each user's rating pattern with respect to the average rating pattern is the key ingredient. We find that our deviation-based ranking method outperforms existing methods in filtering out careless random evaluators as well as malicious spammers.

  8. Wireless Sensor Networks for Heritage Object Deformation Detection and Tracking Algorithm

    PubMed Central

    Xie, Zhijun; Huang, Guangyan; Zarei, Roozbeh; He, Jing; Zhang, Yanchun; Ye, Hongwu

    2014-01-01

    Deformation is the direct cause of heritage object collapse. It is significant to monitor and signal the early warnings of the deformation of heritage objects. However, traditional heritage object monitoring methods only roughly monitor a simple-shaped heritage object as a whole, but cannot monitor complicated heritage objects, which may have a large number of surfaces inside and outside. Wireless sensor networks, comprising many small-sized, low-cost, low-power intelligent sensor nodes, are more useful to detect the deformation of every small part of the heritage objects. Wireless sensor networks need an effective mechanism to reduce both the communication costs and energy consumption in order to monitor the heritage objects in real time. In this paper, we provide an effective heritage object deformation detection and tracking method using wireless sensor networks (EffeHDDT). In EffeHDDT, we discover a connected core set of sensor nodes to reduce the communication cost for transmitting and collecting the data of the sensor networks. Particularly, we propose a heritage object boundary detecting and tracking mechanism. Both theoretical analysis and experimental results demonstrate that our EffeHDDT method outperforms the existing methods in terms of network traffic and the precision of the deformation detection. PMID:25365458

  9. Gene Ranking of RNA-Seq Data via Discriminant Non-Negative Matrix Factorization.

    PubMed

    Jia, Zhilong; Zhang, Xiang; Guan, Naiyang; Bo, Xiaochen; Barnes, Michael R; Luo, Zhigang

    2015-01-01

    RNA-sequencing is rapidly becoming the method of choice for studying the full complexity of transcriptomes, however with increasing dimensionality, accurate gene ranking is becoming increasingly challenging. This paper proposes an accurate and sensitive gene ranking method that implements discriminant non-negative matrix factorization (DNMF) for RNA-seq data. To the best of our knowledge, this is the first work to explore the utility of DNMF for gene ranking. When incorporating Fisher's discriminant criteria and setting the reduced dimension as two, DNMF learns two factors to approximate the original gene expression data, abstracting the up-regulated or down-regulated metagene by using the sample label information. The first factor denotes all the genes' weights of two metagenes as the additive combination of all genes, while the second learned factor represents the expression values of two metagenes. In the gene ranking stage, all the genes are ranked as a descending sequence according to the differential values of the metagene weights. Leveraging the nature of NMF and Fisher's criterion, DNMF can robustly boost the gene ranking performance. The Area Under the Curve analysis of differential expression analysis on two benchmarking tests of four RNA-seq data sets with similar phenotypes showed that our proposed DNMF-based gene ranking method outperforms other widely used methods. Moreover, the Gene Set Enrichment Analysis also showed DNMF outweighs others. DNMF is also computationally efficient, substantially outperforming all other benchmarked methods. Consequently, we suggest DNMF is an effective method for the analysis of differential gene expression and gene ranking for RNA-seq data.

  10. Using inquiry-based instructional strategies in third-grade science

    NASA Astrophysics Data System (ADS)

    Harris, Fanicia D.

    The purpose of the study was to determine if the use of inquiry-based instructional strategies as compared to traditional instructional strategies would increase third-grade students' achievement in science, based on the pretest/posttest of the school system and the Georgia Criterion-Referenced Competency Test (CRCT). Inquiry-based instruction, presented students with a question, an observation, a data set, or a hypothesis for problem solving such as scientists use when working in real-world situations. This descriptive research employed a quantitative strategy using a pretest/posttest control group design. The research compared the science academic achievement levels of one Grade 3 class [N=14] exposed to a teacher's inquiry-based instructional strategies as compared to one Grade 3 class [ N=18] exposed to a teacher's traditional instructional strategies. The study compared the science academic performance levels of third-grade students as measured by pretest/posttest mean scores from the school system-based assessment and the Georgia CRCT. Four research hypotheses were examined. Based on the overall findings from this study, both the experimental group and the control group significantly increased their mean scores from the pretests to the posttests. The amount of gain from the pretest to the posttest was significantly greater for the experimental group than the control group for pretest/posttest 1 [t(12) = 8.79, p < .01] and pretest/posttest 2 [t(12) = 9.40, p < .01]. The experimental group significantly outperformed the control group with regard to their mean number of items answered correctly on the life sciences test [t(27) = -1.95, p = .06]. Finally, the control group did not outperform the experimental group on any of the comparisons made throughout this study. The results of this study provide empirical support for the effectiveness of the use of inquiry-based learning strategies, given that the experimental group outperformed the control group on all four posttests, on the science CRCT and on the individual Science portions on the test including earth, life and physical sciences. In fact, this study was able to detect significant differences between the experimental group and the control group with regard to the degree to which the students improved from the pretests to the posttests.

  11. Gender Gaps and Gendered Action in a First-Year Physics Laboratory

    ERIC Educational Resources Information Center

    Day, James; Stang, Jared B.; Holmes, N. G.; Kumar, Dhaneesh; Bonn, D. A.

    2016-01-01

    It is established that male students outperform female students on almost all commonly used physics concept inventories. However, there is significant variation in the factors that contribute to the gap, as well as the direction in which they influence it. It is presently unknown if such a gender gap exists on the relatively new Concise Data…

  12. Gender Differences in Mathematics Achievement in Jordan: A Differential Item Functioning Analysis of the 2015 TIMSS

    ERIC Educational Resources Information Center

    Innabi, Hanan; Dodeen, Hamzeh

    2018-01-01

    This study is within the framework of the United Nations sustainable development goals related to equitable quality education. The total score on the 2015 Trends in International Mathematics and Science Study that indicated eighth-grade girls in Jordan significantly outperformed boys is hiding many details related to the quality of mathematics…

  13. Searching Information Sources in Networks

    DTIC Science & Technology

    2017-06-14

    SECURITY CLASSIFICATION OF: During the course of this project, we made significant progresses in multiple directions of the information detection...result on information source detection on non-tree networks; (2) The development of information source localization algorithms to detect multiple... information sources. The algorithms have provable performance guarantees and outperform existing algorithms in 1. REPORT DATE (DD-MM-YYYY) 4. TITLE AND

  14. Brief Cognitive-Behavioral Depression Prevention Program for High-Risk Adolescents Outperforms Two Alternative Interventions: A Randomized Efficacy Trial

    ERIC Educational Resources Information Center

    Stice, Eric; Rohde, Paul; Seeley, John R.; Gau, Jeff M.

    2008-01-01

    In this depression prevention trial, 341 high-risk adolescents (mean age = 15.6 years, SD = 1.2) with elevated depressive symptoms were randomized to a brief group cognitive-behavioral (CB) intervention, group supportive-expressive intervention, bibliotherapy, or assessment-only control condition. CB participants showed significantly greater…

  15. The Impact of Learning on Women's Labour Market Transitions

    ERIC Educational Resources Information Center

    Haasler, Simone R.

    2014-01-01

    Women play an increasingly important role in the labour market and as wage earners. Moreover, in many countries, young women have outperformed men in terms of educational attainment and qualification. Still, women's human capital investment does not pay off as it does for men as they are still significantly disadvantaged on the labour market.…

  16. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jordan, Dirk C.; Deceglie, Michael G.; Kurtz, Sarah R.

    What is the best method to determine long-term PV system performance and degradation rates? Ideally, one universally applicable methodology would be desirable so that a single number could be derived. However, data sets vary in their attributes and evidence is presented that defining two methodologies may be preferable. Monte Carlo simulations of artificial performance data allowed investigation of different methodologies and their respective confidence intervals. Tradeoffs between different approaches were delineated, elucidating as to why two separate approaches may need to be included in a standard. Regression approaches tend to be preferable when data sets are less contaminated by seasonality,more » noise and occurrence of outliers although robust regression can significantly improve the accuracy when outliers are present. In the presence of outliers, marked seasonality, or strong soiling events, year-on-year approaches tend to outperform regression approaches.« less

  17. An effective convolutional neural network model for Chinese sentiment analysis

    NASA Astrophysics Data System (ADS)

    Zhang, Yu; Chen, Mengdong; Liu, Lianzhong; Wang, Yadong

    2017-06-01

    Nowadays microblog is getting more and more popular. People are increasingly accustomed to expressing their opinions on Twitter, Facebook and Sina Weibo. Sentiment analysis of microblog has received significant attention, both in academia and in industry. So far, Chinese microblog exploration still needs lots of further work. In recent years CNN has also been used to deal with NLP tasks, and already achieved good results. However, these methods ignore the effective use of a large number of existing sentimental resources. For this purpose, we propose a Lexicon-based Sentiment Convolutional Neural Networks (LSCNN) model focus on Weibo's sentiment analysis, which combines two CNNs, trained individually base on sentiment features and word embedding, at the fully connected hidden layer. The experimental results show that our model outperforms the CNN model only with word embedding features on microblog sentiment analysis task.

  18. Discrete post-processing of total cloud cover ensemble forecasts

    NASA Astrophysics Data System (ADS)

    Hemri, Stephan; Haiden, Thomas; Pappenberger, Florian

    2017-04-01

    This contribution presents an approach to post-process ensemble forecasts for the discrete and bounded weather variable of total cloud cover. Two methods for discrete statistical post-processing of ensemble predictions are tested. The first approach is based on multinomial logistic regression, the second involves a proportional odds logistic regression model. Applying them to total cloud cover raw ensemble forecasts from the European Centre for Medium-Range Weather Forecasts improves forecast skill significantly. Based on station-wise post-processing of raw ensemble total cloud cover forecasts for a global set of 3330 stations over the period from 2007 to early 2014, the more parsimonious proportional odds logistic regression model proved to slightly outperform the multinomial logistic regression model. Reference Hemri, S., Haiden, T., & Pappenberger, F. (2016). Discrete post-processing of total cloud cover ensemble forecasts. Monthly Weather Review 144, 2565-2577.

  19. Real-Space Density Functional Theory on Graphical Processing Units: Computational Approach and Comparison to Gaussian Basis Set Methods.

    PubMed

    Andrade, Xavier; Aspuru-Guzik, Alán

    2013-10-08

    We discuss the application of graphical processing units (GPUs) to accelerate real-space density functional theory (DFT) calculations. To make our implementation efficient, we have developed a scheme to expose the data parallelism available in the DFT approach; this is applied to the different procedures required for a real-space DFT calculation. We present results for current-generation GPUs from AMD and Nvidia, which show that our scheme, implemented in the free code Octopus, can reach a sustained performance of up to 90 GFlops for a single GPU, representing a significant speed-up when compared to the CPU version of the code. Moreover, for some systems, our implementation can outperform a GPU Gaussian basis set code, showing that the real-space approach is a competitive alternative for DFT simulations on GPUs.

  20. An Investigation of Unified Memory Access Performance in CUDA

    PubMed Central

    Landaverde, Raphael; Zhang, Tiansheng; Coskun, Ayse K.; Herbordt, Martin

    2015-01-01

    Managing memory between the CPU and GPU is a major challenge in GPU computing. A programming model, Unified Memory Access (UMA), has been recently introduced by Nvidia to simplify the complexities of memory management while claiming good overall performance. In this paper, we investigate this programming model and evaluate its performance and programming model simplifications based on our experimental results. We find that beyond on-demand data transfers to the CPU, the GPU is also able to request subsets of data it requires on demand. This feature allows UMA to outperform full data transfer methods for certain parallel applications and small data sizes. We also find, however, that for the majority of applications and memory access patterns, the performance overheads associated with UMA are significant, while the simplifications to the programming model restrict flexibility for adding future optimizations. PMID:26594668

  1. Bag-of-features based medical image retrieval via multiple assignment and visual words weighting.

    PubMed

    Wang, Jingyan; Li, Yongping; Zhang, Ying; Wang, Chao; Xie, Honglan; Chen, Guoling; Gao, Xin

    2011-11-01

    Bag-of-features based approaches have become prominent for image retrieval and image classification tasks in the past decade. Such methods represent an image as a collection of local features, such as image patches and key points with scale invariant feature transform (SIFT) descriptors. To improve the bag-of-features methods, we first model the assignments of local descriptors as contribution functions, and then propose a novel multiple assignment strategy. Assuming the local features can be reconstructed by their neighboring visual words in a vocabulary, reconstruction weights can be solved by quadratic programming. The weights are then used to build contribution functions, resulting in a novel assignment method, called quadratic programming (QP) assignment. We further propose a novel visual word weighting method. The discriminative power of each visual word is analyzed by the sub-similarity function in the bin that corresponds to the visual word. Each sub-similarity function is then treated as a weak classifier. A strong classifier is learned by boosting methods that combine those weak classifiers. The weighting factors of the visual words are learned accordingly. We evaluate the proposed methods on medical image retrieval tasks. The methods are tested on three well-known data sets, i.e., the ImageCLEFmed data set, the 304 CT Set, and the basal-cell carcinoma image set. Experimental results demonstrate that the proposed QP assignment outperforms the traditional nearest neighbor assignment, the multiple assignment, and the soft assignment, whereas the proposed boosting based weighting strategy outperforms the state-of-the-art weighting methods, such as the term frequency weights and the term frequency-inverse document frequency weights.

  2. Muscle Activity Map Reconstruction from High Density Surface EMG Signals With Missing Channels Using Image Inpainting and Surface Reconstruction Methods.

    PubMed

    Ghaderi, Parviz; Marateb, Hamid R

    2017-07-01

    The aim of this study was to reconstruct low-quality High-density surface EMG (HDsEMG) signals, recorded with 2-D electrode arrays, using image inpainting and surface reconstruction methods. It is common that some fraction of the electrodes may provide low-quality signals. We used variety of image inpainting methods, based on partial differential equations (PDEs), and surface reconstruction methods to reconstruct the time-averaged or instantaneous muscle activity maps of those outlier channels. Two novel reconstruction algorithms were also proposed. HDsEMG signals were recorded from the biceps femoris and brachial biceps muscles during low-to-moderate-level isometric contractions, and some of the channels (5-25%) were randomly marked as outliers. The root-mean-square error (RMSE) between the original and reconstructed maps was then calculated. Overall, the proposed Poisson and wave PDE outperformed the other methods (average RMSE 8.7 μV rms ± 6.1 μV rms and 7.5 μV rms ± 5.9 μV rms ) for the time-averaged single-differential and monopolar map reconstruction, respectively. Biharmonic Spline, the discrete cosine transform, and the Poisson PDE outperformed the other methods for the instantaneous map reconstruction. The running time of the proposed Poisson and wave PDE methods, implemented using a Vectorization package, was 4.6 ± 5.7 ms and 0.6 ± 0.5 ms, respectively, for each signal epoch or time sample in each channel. The proposed reconstruction algorithms could be promising new tools for reconstructing muscle activity maps in real-time applications. Proper reconstruction methods could recover the information of low-quality recorded channels in HDsEMG signals.

  3. Performance of time-series methods in forecasting the demand for red blood cell transfusion.

    PubMed

    Pereira, Arturo

    2004-05-01

    Planning the future blood collection efforts must be based on adequate forecasts of transfusion demand. In this study, univariate time-series methods were investigated for their performance in forecasting the monthly demand for RBCs at one tertiary-care, university hospital. Three time-series methods were investigated: autoregressive integrated moving average (ARIMA), the Holt-Winters family of exponential smoothing models, and one neural-network-based method. The time series consisted of the monthly demand for RBCs from January 1988 to December 2002 and was divided into two segments: the older one was used to fit or train the models, and the younger to test for the accuracy of predictions. Performance was compared across forecasting methods by calculating goodness-of-fit statistics, the percentage of months in which forecast-based supply would have met the RBC demand (coverage rate), and the outdate rate. The RBC transfusion series was best fitted by a seasonal ARIMA(0,1,1)(0,1,1)(12) model. Over 1-year time horizons, forecasts generated by ARIMA or exponential smoothing laid within the +/- 10 percent interval of the real RBC demand in 79 percent of months (62% in the case of neural networks). The coverage rate for the three methods was 89, 91, and 86 percent, respectively. Over 2-year time horizons, exponential smoothing largely outperformed the other methods. Predictions by exponential smoothing laid within the +/- 10 percent interval of real values in 75 percent of the 24 forecasted months, and the coverage rate was 87 percent. Over 1-year time horizons, predictions of RBC demand generated by ARIMA or exponential smoothing are accurate enough to be of help in the planning of blood collection efforts. For longer time horizons, exponential smoothing outperforms the other forecasting methods.

  4. Evaluation of Direct Vapour Equilibration for Stable Isotope Analysis of Plant Water.

    NASA Astrophysics Data System (ADS)

    Millar, C. B.; McDonnell, J.; Pratt, D.

    2017-12-01

    The stable isotopes of water (2H and 18O), extracted from plants, have been utilized in a variety of ecohydrological, biogeochemical and climatological studies. The array of methods used to extract water from plants are as varied as the studies themselves. Here we perform a comprehensive inter-method comparison of six plant water extraction techniques: direct vapour equilibration, microwave extraction, two unique versions of cryogenic extraction, centrifugation, and high pressure mechanical squeezing. We applied these methods to four isotopically unique plant portions (heads, stems, leaves and root crown) of spring wheat (Triticum aestivum L.). The spring wheat was grown under controlled conditions with irrigation inputs of a known isotopic composition. Our results show that the methods of extraction return significantly different plant water isotopic signals. Centrifugation, microwave extraction, direct vapour equilibration, and squeezing returned more enriched results. Both cryogenic systems and squeezing returned more depleted results, depending upon the plant portion extracted. While cryogenic extraction is currently the most widely used method in the literature, our results suggest that direct vapor equilibration method outperforms it in terms of accuracy, sample throughput and replicability. More research is now needed with other plant species (especially woody plants) to see how far the findings from this study could be extended.

  5. DLocalMotif: a discriminative approach for discovering local motifs in protein sequences.

    PubMed

    Mehdi, Ahmed M; Sehgal, Muhammad Shoaib B; Kobe, Bostjan; Bailey, Timothy L; Bodén, Mikael

    2013-01-01

    Local motifs are patterns of DNA or protein sequences that occur within a sequence interval relative to a biologically defined anchor or landmark. Current protein motif discovery methods do not adequately consider such constraints to identify biologically significant motifs that are only weakly over-represented but spatially confined. Using negatives, i.e. sequences known to not contain a local motif, can further increase the specificity of their discovery. This article introduces the method DLocalMotif that makes use of positional information and negative data for local motif discovery in protein sequences. DLocalMotif combines three scoring functions, measuring degrees of motif over-representation, entropy and spatial confinement, specifically designed to discriminatively exploit the availability of negative data. The method is shown to outperform current methods that use only a subset of these motif characteristics. We apply the method to several biological datasets. The analysis of peroxisomal targeting signals uncovers several novel motifs that occur immediately upstream of the dominant peroxisomal targeting signal-1 signal. The analysis of proline-tyrosine nuclear localization signals uncovers multiple novel motifs that overlap with C2H2 zinc finger domains. We also evaluate the method on classical nuclear localization signals and endoplasmic reticulum retention signals and find that DLocalMotif successfully recovers biologically relevant sequence properties. http://bioinf.scmb.uq.edu.au/dlocalmotif/

  6. A preliminary study of muscular artifact cancellation in single-channel EEG.

    PubMed

    Chen, Xun; Liu, Aiping; Peng, Hu; Ward, Rabab K

    2014-10-01

    Electroencephalogram (EEG) recordings are often contaminated with muscular artifacts that strongly obscure the EEG signals and complicates their analysis. For the conventional case, where the EEG recordings are obtained simultaneously over many EEG channels, there exists a considerable range of methods for removing muscular artifacts. In recent years, there has been an increasing trend to use EEG information in ambulatory healthcare and related physiological signal monitoring systems. For practical reasons, a single EEG channel system must be used in these situations. Unfortunately, there exist few studies for muscular artifact cancellation in single-channel EEG recordings. To address this issue, in this preliminary study, we propose a simple, yet effective, method to achieve the muscular artifact cancellation for the single-channel EEG case. This method is a combination of the ensemble empirical mode decomposition (EEMD) and the joint blind source separation (JBSS) techniques. We also conduct a study that compares and investigates all possible single-channel solutions and demonstrate the performance of these methods using numerical simulations and real-life applications. The proposed method is shown to significantly outperform all other methods. It can successfully remove muscular artifacts without altering the underlying EEG activity. It is thus a promising tool for use in ambulatory healthcare systems.

  7. Robust Principal Component Analysis Regularized by Truncated Nuclear Norm for Identifying Differentially Expressed Genes.

    PubMed

    Wang, Ya-Xuan; Gao, Ying-Lian; Liu, Jin-Xing; Kong, Xiang-Zhen; Li, Hai-Jun

    2017-09-01

    Identifying differentially expressed genes from the thousands of genes is a challenging task. Robust principal component analysis (RPCA) is an efficient method in the identification of differentially expressed genes. RPCA method uses nuclear norm to approximate the rank function. However, theoretical studies showed that the nuclear norm minimizes all singular values, so it may not be the best solution to approximate the rank function. The truncated nuclear norm is defined as the sum of some smaller singular values, which may achieve a better approximation of the rank function than nuclear norm. In this paper, a novel method is proposed by replacing nuclear norm of RPCA with the truncated nuclear norm, which is named robust principal component analysis regularized by truncated nuclear norm (TRPCA). The method decomposes the observation matrix of genomic data into a low-rank matrix and a sparse matrix. Because the significant genes can be considered as sparse signals, the differentially expressed genes are viewed as the sparse perturbation signals. Thus, the differentially expressed genes can be identified according to the sparse matrix. The experimental results on The Cancer Genome Atlas data illustrate that the TRPCA method outperforms other state-of-the-art methods in the identification of differentially expressed genes.

  8. Predicting protein contact map using evolutionary and physical constraints by integer programming.

    PubMed

    Wang, Zhiyong; Xu, Jinbo

    2013-07-01

    Protein contact map describes the pairwise spatial and functional relationship of residues in a protein and contains key information for protein 3D structure prediction. Although studied extensively, it remains challenging to predict contact map using only sequence information. Most existing methods predict the contact map matrix element-by-element, ignoring correlation among contacts and physical feasibility of the whole-contact map. A couple of recent methods predict contact map by using mutual information, taking into consideration contact correlation and enforcing a sparsity restraint, but these methods demand for a very large number of sequence homologs for the protein under consideration and the resultant contact map may be still physically infeasible. This article presents a novel method PhyCMAP for contact map prediction, integrating both evolutionary and physical restraints by machine learning and integer linear programming. The evolutionary restraints are much more informative than mutual information, and the physical restraints specify more concrete relationship among contacts than the sparsity restraint. As such, our method greatly reduces the solution space of the contact map matrix and, thus, significantly improves prediction accuracy. Experimental results confirm that PhyCMAP outperforms currently popular methods no matter how many sequence homologs are available for the protein under consideration. http://raptorx.uchicago.edu.

  9. Accuracy of the prostate health index versus the urinary prostate cancer antigen 3 score to predict overall and significant prostate cancer at initial biopsy.

    PubMed

    Seisen, Thomas; Rouprêt, Morgan; Brault, Didier; Léon, Priscilla; Cancel-Tassin, Géraldine; Compérat, Eva; Renard-Penna, Raphaële; Mozer, Pierre; Guechot, Jérome; Cussenot, Olivier

    2015-01-01

    It remains unclear whether the Prostate Health Index (PHI) or the urinary Prostate-Cancer Antigen 3 (PCA-3) score is more accurate at screening for prostate cancer (PCa). The aim of this study was to prospectively compare the accuracy of PHI and PCA-3 scores to predict overall and significant PCa in men undergoing an initial prostate biopsy. Double-blind assessments of PHI and PCA-3 were conducted by referent physicians in 138 patients who subsequently underwent trans-rectal ultrasound-guided prostate biopsy according to a 12-core scheme. Predictive accuracies of PHI and PCA-3 were assessed using AUC and compared according to the DeLong method. Diagnostic performances with usual cut-off values for positivity (i.e., PHI >40 and PCA-3 >35) were calculated, and odds ratios associated with predicting PCa overall and significant PCa as defined by pathological updated Epstein criteria (i.e., Gleason score ≥7, more than three positive cores, or >50% cancer involvement in any core) were estimated using logistic regression. Prevalences of overall and significant PCa were 44.9% and 28.3%, respectively. PCA-3 (AUC = 0.71) was the most accurate predictor of PCa overall, and significantly outperformed PHI (AUC = 0.65; P = 0.03). However, PHI (AUC = 0.80) remained the most accurate predictor when screening exclusively for significant PCa and significantly outperformed PCA-3 (AUC = 0.55; P = 0.03). Furthermore, PCA-3 >35 had the best accuracy, and positive or negative predictive values when screening for PCa overall whereas these diagnostic performances were greater for PHI >40 when exclusively screening for significant PCa. PHI > 40 combined with PCA-3 > 35 was more specific in both cases. In multivariate analyses, PCA-3 >35 (OR = 5.68; 95%CI = [2.21-14.59]; P < 0.001) was significantly correlated with the presence of PCa overall, but PHI >40 (OR = 9.60; 95%CI = [1.72-91.32]; P = 0.001) was the only independent predictor for detecting significant PCa. Although PCA-3 score is the best predictor for PCa overall at initial biopsy, our findings strongly indicate that PHI should be used for population-based screening to avoid over-diagnosis of indolent tumors that are unlikely to cause death. © 2014 Wiley Periodicals, Inc.

  10. Tightening Quantum Speed Limits for Almost All States.

    PubMed

    Campaioli, Francesco; Pollock, Felix A; Binder, Felix C; Modi, Kavan

    2018-02-09

    Conventional quantum speed limits perform poorly for mixed quantum states: They are generally not tight and often significantly underestimate the fastest possible evolution speed. To remedy this, for unitary driving, we derive two quantum speed limits that outperform the traditional bounds for almost all quantum states. Moreover, our bounds are significantly simpler to compute as well as experimentally more accessible. Our bounds have a clear geometric interpretation; they arise from the evaluation of the angle between generalized Bloch vectors.

  11. An optimized content-aware image retargeting method: toward expanding the perceived visual field of the high-density retinal prosthesis recipients

    NASA Astrophysics Data System (ADS)

    Li, Heng; Zeng, Yajie; Lu, Zhuofan; Cao, Xiaofei; Su, Xiaofan; Sui, Xiaohong; Wang, Jing; Chai, Xinyu

    2018-04-01

    Objective. Retinal prosthesis devices have shown great value in restoring some sight for individuals with profoundly impaired vision, but the visual acuity and visual field provided by prostheses greatly limit recipients’ visual experience. In this paper, we employ computer vision approaches to seek to expand the perceptible visual field in patients implanted potentially with a high-density retinal prosthesis while maintaining visual acuity as much as possible. Approach. We propose an optimized content-aware image retargeting method, by introducing salient object detection based on color and intensity-difference contrast, aiming to remap important information of a scene into a small visual field and preserve their original scale as much as possible. It may improve prosthetic recipients’ perceived visual field and aid in performing some visual tasks (e.g. object detection and object recognition). To verify our method, psychophysical experiments, detecting object number and recognizing objects, are conducted under simulated prosthetic vision. As control, we use three other image retargeting techniques, including Cropping, Scaling, and seam-assisted shrinkability. Main results. Results show that our method outperforms in preserving more key features and has significantly higher recognition accuracy in comparison with other three image retargeting methods under the condition of small visual field and low-resolution. Significance. The proposed method is beneficial to expand the perceived visual field of prosthesis recipients and improve their object detection and recognition performance. It suggests that our method may provide an effective option for image processing module in future high-density retinal implants.

  12. Association of parent-child relationships and executive functioning in South Asian adolescents.

    PubMed

    Fatima, Shameem; Sheikh, Hamid; Ardila, Alfredo

    2016-01-01

    It is known that some environmental variables can significantly affect the development of executive functions (EF). The primary aim of this study was to analyze whether some family conditions, such as the adolescent's perception of the quality of parent-child relationships and the socioeconomic status (SES; assessed according to education, occupational status, and income) are significantly associated with EF test scores. There were 370 Pakistani participants ranging in age 13 to 19 years who were selected and then individually administered the following tests taken from the Delis-Kaplan Executive Function System (D-KEFS): Trail Making Test (TMT), Design Fluency Test (DFT), Color Word Interference Test (CWIT), and Card Sorting Test (CST). In addition, a Parent-Child Relationship Scale (PCRS) also was administered. Results showed that perceived "neglect" in the PCRS was negatively associated with the 4 EF test scores. Parents' education and SES were positively associated with 3 EF measures: DFT, CWIT, and CST. Further correlational analyses revealed that inhibition (as measured with the CWIT) and problem-solving ability (as measured with the CST) were significantly associated with the perceived parent-child relationships. Some gender differences also were observed: males outperformed females on TMT, DFT, and CST, while females outperformed males in the CWIT. It was concluded that perceived parent-child relationships, SES, and parents' education are significantly associated with executive function test performance during adolescents. (c) 2015 APA, all rights reserved).

  13. Statistical inference for time course RNA-Seq data using a negative binomial mixed-effect model.

    PubMed

    Sun, Xiaoxiao; Dalpiaz, David; Wu, Di; S Liu, Jun; Zhong, Wenxuan; Ma, Ping

    2016-08-26

    Accurate identification of differentially expressed (DE) genes in time course RNA-Seq data is crucial for understanding the dynamics of transcriptional regulatory network. However, most of the available methods treat gene expressions at different time points as replicates and test the significance of the mean expression difference between treatments or conditions irrespective of time. They thus fail to identify many DE genes with different profiles across time. In this article, we propose a negative binomial mixed-effect model (NBMM) to identify DE genes in time course RNA-Seq data. In the NBMM, mean gene expression is characterized by a fixed effect, and time dependency is described by random effects. The NBMM is very flexible and can be fitted to both unreplicated and replicated time course RNA-Seq data via a penalized likelihood method. By comparing gene expression profiles over time, we further classify the DE genes into two subtypes to enhance the understanding of expression dynamics. A significance test for detecting DE genes is derived using a Kullback-Leibler distance ratio. Additionally, a significance test for gene sets is developed using a gene set score. Simulation analysis shows that the NBMM outperforms currently available methods for detecting DE genes and gene sets. Moreover, our real data analysis of fruit fly developmental time course RNA-Seq data demonstrates the NBMM identifies biologically relevant genes which are well justified by gene ontology analysis. The proposed method is powerful and efficient to detect biologically relevant DE genes and gene sets in time course RNA-Seq data.

  14. A Particle Batch Smoother Approach to Snow Water Equivalent Estimation

    NASA Technical Reports Server (NTRS)

    Margulis, Steven A.; Girotto, Manuela; Cortes, Gonzalo; Durand, Michael

    2015-01-01

    This paper presents a newly proposed data assimilation method for historical snow water equivalent SWE estimation using remotely sensed fractional snow-covered area fSCA. The newly proposed approach consists of a particle batch smoother (PBS), which is compared to a previously applied Kalman-based ensemble batch smoother (EnBS) approach. The methods were applied over the 27-yr Landsat 5 record at snow pillow and snow course in situ verification sites in the American River basin in the Sierra Nevada (United States). This basin is more densely vegetated and thus more challenging for SWE estimation than the previous applications of the EnBS. Both data assimilation methods provided significant improvement over the prior (modeling only) estimates, with both able to significantly reduce prior SWE biases. The prior RMSE values at the snow pillow and snow course sites were reduced by 68%-82% and 60%-68%, respectively, when applying the data assimilation methods. This result is encouraging for a basin like the American where the moderate to high forest cover will necessarily obscure more of the snow-covered ground surface than in previously examined, less-vegetated basins. The PBS generally outperformed the EnBS: for snow pillows the PBSRMSE was approx.54%of that seen in the EnBS, while for snow courses the PBSRMSE was approx.79%of the EnBS. Sensitivity tests show relative insensitivity for both the PBS and EnBS results to ensemble size and fSCA measurement error, but a higher sensitivity for the EnBS to the mean prior precipitation input, especially in the case where significant prior biases exist.

  15. Probabilistic Air Segmentation and Sparse Regression Estimated Pseudo CT for PET/MR Attenuation Correction

    PubMed Central

    Chen, Yasheng; Juttukonda, Meher; Su, Yi; Benzinger, Tammie; Rubin, Brian G.; Lee, Yueh Z.; Lin, Weili; Shen, Dinggang; Lalush, David

    2015-01-01

    Purpose To develop a positron emission tomography (PET) attenuation correction method for brain PET/magnetic resonance (MR) imaging by estimating pseudo computed tomographic (CT) images from T1-weighted MR and atlas CT images. Materials and Methods In this institutional review board–approved and HIPAA-compliant study, PET/MR/CT images were acquired in 20 subjects after obtaining written consent. A probabilistic air segmentation and sparse regression (PASSR) method was developed for pseudo CT estimation. Air segmentation was performed with assistance from a probabilistic air map. For nonair regions, the pseudo CT numbers were estimated via sparse regression by using atlas MR patches. The mean absolute percentage error (MAPE) on PET images was computed as the normalized mean absolute difference in PET signal intensity between a method and the reference standard continuous CT attenuation correction method. Friedman analysis of variance and Wilcoxon matched-pairs tests were performed for statistical comparison of MAPE between the PASSR method and Dixon segmentation, CT segmentation, and population averaged CT atlas (mean atlas) methods. Results The PASSR method yielded a mean MAPE ± standard deviation of 2.42% ± 1.0, 3.28% ± 0.93, and 2.16% ± 1.75, respectively, in the whole brain, gray matter, and white matter, which were significantly lower than the Dixon, CT segmentation, and mean atlas values (P < .01). Moreover, 68.0% ± 16.5, 85.8% ± 12.9, and 96.0% ± 2.5 of whole-brain volume had within ±2%, ±5%, and ±10% percentage error by using PASSR, respectively, which was significantly higher than other methods (P < .01). Conclusion PASSR outperformed the Dixon, CT segmentation, and mean atlas methods by reducing PET error owing to attenuation correction. © RSNA, 2014 PMID:25521778

  16. Deep Convolutional Neural Networks Outperform Feature-Based But Not Categorical Models in Explaining Object Similarity Judgments

    PubMed Central

    Jozwik, Kamila M.; Kriegeskorte, Nikolaus; Storrs, Katherine R.; Mur, Marieke

    2017-01-01

    Recent advances in Deep convolutional Neural Networks (DNNs) have enabled unprecedentedly accurate computational models of brain representations, and present an exciting opportunity to model diverse cognitive functions. State-of-the-art DNNs achieve human-level performance on object categorisation, but it is unclear how well they capture human behavior on complex cognitive tasks. Recent reports suggest that DNNs can explain significant variance in one such task, judging object similarity. Here, we extend these findings by replicating them for a rich set of object images, comparing performance across layers within two DNNs of different depths, and examining how the DNNs’ performance compares to that of non-computational “conceptual” models. Human observers performed similarity judgments for a set of 92 images of real-world objects. Representations of the same images were obtained in each of the layers of two DNNs of different depths (8-layer AlexNet and 16-layer VGG-16). To create conceptual models, other human observers generated visual-feature labels (e.g., “eye”) and category labels (e.g., “animal”) for the same image set. Feature labels were divided into parts, colors, textures and contours, while category labels were divided into subordinate, basic, and superordinate categories. We fitted models derived from the features, categories, and from each layer of each DNN to the similarity judgments, using representational similarity analysis to evaluate model performance. In both DNNs, similarity within the last layer explains most of the explainable variance in human similarity judgments. The last layer outperforms almost all feature-based models. Late and mid-level layers outperform some but not all feature-based models. Importantly, categorical models predict similarity judgments significantly better than any DNN layer. Our results provide further evidence for commonalities between DNNs and brain representations. Models derived from visual features other than object parts perform relatively poorly, perhaps because DNNs more comprehensively capture the colors, textures and contours which matter to human object perception. However, categorical models outperform DNNs, suggesting that further work may be needed to bring high-level semantic representations in DNNs closer to those extracted by humans. Modern DNNs explain similarity judgments remarkably well considering they were not trained on this task, and are promising models for many aspects of human cognition. PMID:29062291

  17. The Hartung-Knapp-Sidik-Jonkman method for random effects meta-analysis is straightforward and considerably outperforms the standard DerSimonian-Laird method

    PubMed Central

    2014-01-01

    Background The DerSimonian and Laird approach (DL) is widely used for random effects meta-analysis, but this often results in inappropriate type I error rates. The method described by Hartung, Knapp, Sidik and Jonkman (HKSJ) is known to perform better when trials of similar size are combined. However evidence in realistic situations, where one trial might be much larger than the other trials, is lacking. We aimed to evaluate the relative performance of the DL and HKSJ methods when studies of different sizes are combined and to develop a simple method to convert DL results to HKSJ results. Methods We evaluated the performance of the HKSJ versus DL approach in simulated meta-analyses of 2–20 trials with varying sample sizes and between-study heterogeneity, and allowing trials to have various sizes, e.g. 25% of the trials being 10-times larger than the smaller trials. We also compared the number of “positive” (statistically significant at p < 0.05) findings using empirical data of recent meta-analyses with > = 3 studies of interventions from the Cochrane Database of Systematic Reviews. Results The simulations showed that the HKSJ method consistently resulted in more adequate error rates than the DL method. When the significance level was 5%, the HKSJ error rates at most doubled, whereas for DL they could be over 30%. DL, and, far less so, HKSJ had more inflated error rates when the combined studies had unequal sizes and between-study heterogeneity. The empirical data from 689 meta-analyses showed that 25.1% of the significant findings for the DL method were non-significant with the HKSJ method. DL results can be easily converted into HKSJ results. Conclusions Our simulations showed that the HKSJ method consistently results in more adequate error rates than the DL method, especially when the number of studies is small, and can easily be applied routinely in meta-analyses. Even with the HKSJ method, extra caution is needed when there are = <5 studies of very unequal sizes. PMID:24548571

  18. Network-based differential gene expression analysis suggests cell cycle related genes regulated by E2F1 underlie the molecular difference between smoker and non-smoker lung adenocarcinoma

    PubMed Central

    2013-01-01

    Background Differential gene expression (DGE) analysis is commonly used to reveal the deregulated molecular mechanisms of complex diseases. However, traditional DGE analysis (e.g., the t test or the rank sum test) tests each gene independently without considering interactions between them. Top-ranked differentially regulated genes prioritized by the analysis may not directly relate to the coherent molecular changes underlying complex diseases. Joint analyses of co-expression and DGE have been applied to reveal the deregulated molecular modules underlying complex diseases. Most of these methods consist of separate steps: first to identify gene-gene relationships under the studied phenotype then to integrate them with gene expression changes for prioritizing signature genes, or vice versa. It is warrant a method that can simultaneously consider gene-gene co-expression strength and corresponding expression level changes so that both types of information can be leveraged optimally. Results In this paper, we develop a gene module based method for differential gene expression analysis, named network-based differential gene expression (nDGE) analysis, a one-step integrative process for prioritizing deregulated genes and grouping them into gene modules. We demonstrate that nDGE outperforms existing methods in prioritizing deregulated genes and discovering deregulated gene modules using simulated data sets. When tested on a series of smoker and non-smoker lung adenocarcinoma data sets, we show that top differentially regulated genes identified by the rank sum test in different sets are not consistent while top ranked genes defined by nDGE in different data sets significantly overlap. nDGE results suggest that a differentially regulated gene module, which is enriched for cell cycle related genes and E2F1 targeted genes, plays a role in the molecular differences between smoker and non-smoker lung adenocarcinoma. Conclusions In this paper, we develop nDGE to prioritize deregulated genes and group them into gene modules by simultaneously considering gene expression level changes and gene-gene co-regulations. When applied to both simulated and empirical data, nDGE outperforms the traditional DGE method. More specifically, when applied to smoker and non-smoker lung cancer sets, nDGE results illustrate the molecular differences between smoker and non-smoker lung cancer. PMID:24341432

  19. Overlapping communities from dense disjoint and high total degree clusters

    NASA Astrophysics Data System (ADS)

    Zhang, Hongli; Gao, Yang; Zhang, Yue

    2018-04-01

    Community plays an important role in the field of sociology, biology and especially in domains of computer science, where systems are often represented as networks. And community detection is of great importance in the domains. A community is a dense subgraph of the whole graph with more links between its members than between its members to the outside nodes, and nodes in the same community probably share common properties or play similar roles in the graph. Communities overlap when nodes in a graph belong to multiple communities. A vast variety of overlapping community detection methods have been proposed in the literature, and the local expansion method is one of the most successful techniques dealing with large networks. The paper presents a density-based seeding method, in which dense disjoint local clusters are searched and selected as seeds. The proposed method selects a seed by the total degree and density of local clusters utilizing merely local structures of the network. Furthermore, this paper proposes a novel community refining phase via minimizing the conductance of each community, through which the quality of identified communities is largely improved in linear time. Experimental results in synthetic networks show that the proposed seeding method outperforms other seeding methods in the state of the art and the proposed refining method largely enhances the quality of the identified communities. Experimental results in real graphs with ground-truth communities show that the proposed approach outperforms other state of the art overlapping community detection algorithms, in particular, it is more than two orders of magnitude faster than the existing global algorithms with higher quality, and it obtains much more accurate community structure than the current local algorithms without any priori information.

  20. Effective Feature Selection for Classification of Promoter Sequences.

    PubMed

    K, Kouser; P G, Lavanya; Rangarajan, Lalitha; K, Acharya Kshitish

    2016-01-01

    Exploring novel computational methods in making sense of biological data has not only been a necessity, but also productive. A part of this trend is the search for more efficient in silico methods/tools for analysis of promoters, which are parts of DNA sequences that are involved in regulation of expression of genes into other functional molecules. Promoter regions vary greatly in their function based on the sequence of nucleotides and the arrangement of protein-binding short-regions called motifs. In fact, the regulatory nature of the promoters seems to be largely driven by the selective presence and/or the arrangement of these motifs. Here, we explore computational classification of promoter sequences based on the pattern of motif distributions, as such classification can pave a new way of functional analysis of promoters and to discover the functionally crucial motifs. We make use of Position Specific Motif Matrix (PSMM) features for exploring the possibility of accurately classifying promoter sequences using some of the popular classification techniques. The classification results on the complete feature set are low, perhaps due to the huge number of features. We propose two ways of reducing features. Our test results show improvement in the classification output after the reduction of features. The results also show that decision trees outperform SVM (Support Vector Machine), KNN (K Nearest Neighbor) and ensemble classifier LibD3C, particularly with reduced features. The proposed feature selection methods outperform some of the popular feature transformation methods such as PCA and SVD. Also, the methods proposed are as accurate as MRMR (feature selection method) but much faster than MRMR. Such methods could be useful to categorize new promoters and explore regulatory mechanisms of gene expressions in complex eukaryotic species.

  1. A total variation diminishing finite difference algorithm for sonic boom propagation models

    NASA Technical Reports Server (NTRS)

    Sparrow, Victor W.

    1993-01-01

    It is difficult to accurately model the rise phases of sonic boom waveforms with traditional finite difference algorithms because of finite difference phase dispersion. This paper introduces the concept of a total variation diminishing (TVD) finite difference method as a tool for accurately modeling the rise phases of sonic booms. A standard second order finite difference algorithm and its TVD modified counterpart are both applied to the one-way propagation of a square pulse. The TVD method clearly outperforms the non-TVD method, showing great potential as a new computational tool in the analysis of sonic boom propagation.

  2. Sinusoidal Analysis-Synthesis of Audio Using Perceptual Criteria

    NASA Astrophysics Data System (ADS)

    Painter, Ted; Spanias, Andreas

    2003-12-01

    This paper presents a new method for the selection of sinusoidal components for use in compact representations of narrowband audio. The method consists of ranking and selecting the most perceptually relevant sinusoids. The idea behind the method is to maximize the matching between the auditory excitation pattern associated with the original signal and the corresponding auditory excitation pattern associated with the modeled signal that is being represented by a small set of sinusoidal parameters. The proposed component-selection methodology is shown to outperform the maximum signal-to-mask ratio selection strategy in terms of subjective quality.

  3. An Improved Heuristic Method for Subgraph Isomorphism Problem

    NASA Astrophysics Data System (ADS)

    Xiang, Yingzhuo; Han, Jiesi; Xu, Haijiang; Guo, Xin

    2017-09-01

    This paper focus on the subgraph isomorphism (SI) problem. We present an improved genetic algorithm, a heuristic method to search the optimal solution. The contribution of this paper is that we design a dedicated crossover algorithm and a new fitness function to measure the evolution process. Experiments show our improved genetic algorithm performs better than other heuristic methods. For a large graph, such as a subgraph of 40 nodes, our algorithm outperforms the traditional tree search algorithms. We find that the performance of our improved genetic algorithm does not decrease as the number of nodes in prototype graphs.

  4. Improved Adaptive LSB Steganography Based on Chaos and Genetic Algorithm

    NASA Astrophysics Data System (ADS)

    Yu, Lifang; Zhao, Yao; Ni, Rongrong; Li, Ting

    2010-12-01

    We propose a novel steganographic method in JPEG images with high performance. Firstly, we propose improved adaptive LSB steganography, which can achieve high capacity while preserving the first-order statistics. Secondly, in order to minimize visual degradation of the stego image, we shuffle bits-order of the message based on chaos whose parameters are selected by the genetic algorithm. Shuffling message's bits-order provides us with a new way to improve the performance of steganography. Experimental results show that our method outperforms classical steganographic methods in image quality, while preserving characteristics of histogram and providing high capacity.

  5. Color Image Classification Using Block Matching and Learning

    NASA Astrophysics Data System (ADS)

    Kondo, Kazuki; Hotta, Seiji

    In this paper, we propose block matching and learning for color image classification. In our method, training images are partitioned into small blocks. Given a test image, it is also partitioned into small blocks, and mean-blocks corresponding to each test block are calculated with neighbor training blocks. Our method classifies a test image into the class that has the shortest total sum of distances between mean blocks and test ones. We also propose a learning method for reducing memory requirement. Experimental results show that our classification outperforms other classifiers such as support vector machine with bag of keypoints.

  6. Deep classification hashing for person re-identification

    NASA Astrophysics Data System (ADS)

    Wang, Jiabao; Li, Yang; Zhang, Xiancai; Miao, Zhuang; Tao, Gang

    2018-04-01

    As the development of surveillance in public, person re-identification becomes more and more important. The largescale databases call for efficient computation and storage, hashing technique is one of the most important methods. In this paper, we proposed a new deep classification hashing network by introducing a new binary appropriation layer in the traditional ImageNet pre-trained CNN models. It outputs binary appropriate features, which can be easily quantized into binary hash-codes for hamming similarity comparison. Experiments show that our deep hashing method can outperform the state-of-the-art methods on the public CUHK03 and Market1501 datasets.

  7. Efficient experimental design for uncertainty reduction in gene regulatory networks.

    PubMed

    Dehghannasiri, Roozbeh; Yoon, Byung-Jun; Dougherty, Edward R

    2015-01-01

    An accurate understanding of interactions among genes plays a major role in developing therapeutic intervention methods. Gene regulatory networks often contain a significant amount of uncertainty. The process of prioritizing biological experiments to reduce the uncertainty of gene regulatory networks is called experimental design. Under such a strategy, the experiments with high priority are suggested to be conducted first. The authors have already proposed an optimal experimental design method based upon the objective for modeling gene regulatory networks, such as deriving therapeutic interventions. The experimental design method utilizes the concept of mean objective cost of uncertainty (MOCU). MOCU quantifies the expected increase of cost resulting from uncertainty. The optimal experiment to be conducted first is the one which leads to the minimum expected remaining MOCU subsequent to the experiment. In the process, one must find the optimal intervention for every gene regulatory network compatible with the prior knowledge, which can be prohibitively expensive when the size of the network is large. In this paper, we propose a computationally efficient experimental design method. This method incorporates a network reduction scheme by introducing a novel cost function that takes into account the disruption in the ranking of potential experiments. We then estimate the approximate expected remaining MOCU at a lower computational cost using the reduced networks. Simulation results based on synthetic and real gene regulatory networks show that the proposed approximate method has close performance to that of the optimal method but at lower computational cost. The proposed approximate method also outperforms the random selection policy significantly. A MATLAB software implementing the proposed experimental design method is available at http://gsp.tamu.edu/Publications/supplementary/roozbeh15a/.

  8. Efficient experimental design for uncertainty reduction in gene regulatory networks

    PubMed Central

    2015-01-01

    Background An accurate understanding of interactions among genes plays a major role in developing therapeutic intervention methods. Gene regulatory networks often contain a significant amount of uncertainty. The process of prioritizing biological experiments to reduce the uncertainty of gene regulatory networks is called experimental design. Under such a strategy, the experiments with high priority are suggested to be conducted first. Results The authors have already proposed an optimal experimental design method based upon the objective for modeling gene regulatory networks, such as deriving therapeutic interventions. The experimental design method utilizes the concept of mean objective cost of uncertainty (MOCU). MOCU quantifies the expected increase of cost resulting from uncertainty. The optimal experiment to be conducted first is the one which leads to the minimum expected remaining MOCU subsequent to the experiment. In the process, one must find the optimal intervention for every gene regulatory network compatible with the prior knowledge, which can be prohibitively expensive when the size of the network is large. In this paper, we propose a computationally efficient experimental design method. This method incorporates a network reduction scheme by introducing a novel cost function that takes into account the disruption in the ranking of potential experiments. We then estimate the approximate expected remaining MOCU at a lower computational cost using the reduced networks. Conclusions Simulation results based on synthetic and real gene regulatory networks show that the proposed approximate method has close performance to that of the optimal method but at lower computational cost. The proposed approximate method also outperforms the random selection policy significantly. A MATLAB software implementing the proposed experimental design method is available at http://gsp.tamu.edu/Publications/supplementary/roozbeh15a/. PMID:26423515

  9. Computational Prediction of Electron Ionization Mass Spectra to Assist in GC/MS Compound Identification.

    PubMed

    Allen, Felicity; Pon, Allison; Greiner, Russ; Wishart, David

    2016-08-02

    We describe a tool, competitive fragmentation modeling for electron ionization (CFM-EI) that, given a chemical structure (e.g., in SMILES or InChI format), computationally predicts an electron ionization mass spectrum (EI-MS) (i.e., the type of mass spectrum commonly generated by gas chromatography mass spectrometry). The predicted spectra produced by this tool can be used for putative compound identification, complementing measured spectra in reference databases by expanding the range of compounds able to be considered when availability of measured spectra is limited. The tool extends CFM-ESI, a recently developed method for computational prediction of electrospray tandem mass spectra (ESI-MS/MS), but unlike CFM-ESI, CFM-EI can handle odd-electron ions and isotopes and incorporates an artificial neural network. Tests on EI-MS data from the NIST database demonstrate that CFM-EI is able to model fragmentation likelihoods in low-resolution EI-MS data, producing predicted spectra whose dot product scores are significantly better than full enumeration "bar-code" spectra. CFM-EI also outperformed previously reported results for MetFrag, MOLGEN-MS, and Mass Frontier on one compound identification task. It also outperformed MetFrag in a range of other compound identification tasks involving a much larger data set, containing both derivatized and nonderivatized compounds. While replicate EI-MS measurements of chemical standards are still a more accurate point of comparison, CFM-EI's predictions provide a much-needed alternative when no reference standard is available for measurement. CFM-EI is available at https://sourceforge.net/projects/cfm-id/ for download and http://cfmid.wishartlab.com as a web service.

  10. enDNA-Prot: identification of DNA-binding proteins by applying ensemble learning.

    PubMed

    Xu, Ruifeng; Zhou, Jiyun; Liu, Bin; Yao, Lin; He, Yulan; Zou, Quan; Wang, Xiaolong

    2014-01-01

    DNA-binding proteins are crucial for various cellular processes, such as recognition of specific nucleotide, regulation of transcription, and regulation of gene expression. Developing an effective model for identifying DNA-binding proteins is an urgent research problem. Up to now, many methods have been proposed, but most of them focus on only one classifier and cannot make full use of the large number of negative samples to improve predicting performance. This study proposed a predictor called enDNA-Prot for DNA-binding protein identification by employing the ensemble learning technique. Experiential results showed that enDNA-Prot was comparable with DNA-Prot and outperformed DNAbinder and iDNA-Prot with performance improvement in the range of 3.97-9.52% in ACC and 0.08-0.19 in MCC. Furthermore, when the benchmark dataset was expanded with negative samples, the performance of enDNA-Prot outperformed the three existing methods by 2.83-16.63% in terms of ACC and 0.02-0.16 in terms of MCC. It indicated that enDNA-Prot is an effective method for DNA-binding protein identification and expanding training dataset with negative samples can improve its performance. For the convenience of the vast majority of experimental scientists, we developed a user-friendly web-server for enDNA-Prot which is freely accessible to the public.

  11. Integrating linear optimization with structural modeling to increase HIV neutralization breadth.

    PubMed

    Sevy, Alexander M; Panda, Swetasudha; Crowe, James E; Meiler, Jens; Vorobeychik, Yevgeniy

    2018-02-01

    Computational protein design has been successful in modeling fixed backbone proteins in a single conformation. However, when modeling large ensembles of flexible proteins, current methods in protein design have been insufficient. Large barriers in the energy landscape are difficult to traverse while redesigning a protein sequence, and as a result current design methods only sample a fraction of available sequence space. We propose a new computational approach that combines traditional structure-based modeling using the Rosetta software suite with machine learning and integer linear programming to overcome limitations in the Rosetta sampling methods. We demonstrate the effectiveness of this method, which we call BROAD, by benchmarking the performance on increasing predicted breadth of anti-HIV antibodies. We use this novel method to increase predicted breadth of naturally-occurring antibody VRC23 against a panel of 180 divergent HIV viral strains and achieve 100% predicted binding against the panel. In addition, we compare the performance of this method to state-of-the-art multistate design in Rosetta and show that we can outperform the existing method significantly. We further demonstrate that sequences recovered by this method recover known binding motifs of broadly neutralizing anti-HIV antibodies. Finally, our approach is general and can be extended easily to other protein systems. Although our modeled antibodies were not tested in vitro, we predict that these variants would have greatly increased breadth compared to the wild-type antibody.

  12. RCK: accurate and efficient inference of sequence- and structure-based protein-RNA binding models from RNAcompete data.

    PubMed

    Orenstein, Yaron; Wang, Yuhao; Berger, Bonnie

    2016-06-15

    Protein-RNA interactions, which play vital roles in many processes, are mediated through both RNA sequence and structure. CLIP-based methods, which measure protein-RNA binding in vivo, suffer from experimental noise and systematic biases, whereas in vitro experiments capture a clearer signal of protein RNA-binding. Among them, RNAcompete provides binding affinities of a specific protein to more than 240 000 unstructured RNA probes in one experiment. The computational challenge is to infer RNA structure- and sequence-based binding models from these data. The state-of-the-art in sequence models, Deepbind, does not model structural preferences. RNAcontext models both sequence and structure preferences, but is outperformed by GraphProt. Unfortunately, GraphProt cannot detect structural preferences from RNAcompete data due to the unstructured nature of the data, as noted by its developers, nor can it be tractably run on the full RNACompete dataset. We develop RCK, an efficient, scalable algorithm that infers both sequence and structure preferences based on a new k-mer based model. Remarkably, even though RNAcompete data is designed to be unstructured, RCK can still learn structural preferences from it. RCK significantly outperforms both RNAcontext and Deepbind in in vitro binding prediction for 244 RNAcompete experiments. Moreover, RCK is also faster and uses less memory, which enables scalability. While currently on par with existing methods in in vivo binding prediction on a small scale test, we demonstrate that RCK will increasingly benefit from experimentally measured RNA structure profiles as compared to computationally predicted ones. By running RCK on the entire RNAcompete dataset, we generate and provide as a resource a set of protein-RNA structure-based models on an unprecedented scale. Software and models are freely available at http://rck.csail.mit.edu/ bab@mit.edu Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  13. Biomarker Identification for Prostate Cancer and Lymph Node Metastasis from Microarray Data and Protein Interaction Network Using Gene Prioritization Method

    PubMed Central

    Arias, Carlos Roberto; Yeh, Hsiang-Yuan; Soo, Von-Wun

    2012-01-01

    Finding a genetic disease-related gene is not a trivial task. Therefore, computational methods are needed to present clues to the biomedical community to explore genes that are more likely to be related to a specific disease as biomarker. We present biomarker identification problem using gene prioritization method called gene prioritization from microarray data based on shortest paths, extended with structural and biological properties and edge flux using voting scheme (GP-MIDAS-VXEF). The method is based on finding relevant interactions on protein interaction networks, then scoring the genes using shortest paths and topological analysis, integrating the results using a voting scheme and a biological boosting. We applied two experiments, one is prostate primary and normal samples and the other is prostate primary tumor with and without lymph nodes metastasis. We used 137 truly prostate cancer genes as benchmark. In the first experiment, GP-MIDAS-VXEF outperforms all the other state-of-the-art methods in the benchmark by retrieving the truest related genes from the candidate set in the top 50 scores found. We applied the same technique to infer the significant biomarkers in prostate cancer with lymph nodes metastasis which is not established well. PMID:22654636

  14. Spectroscopic Diagnosis of Arsenic Contamination in Agricultural Soils

    PubMed Central

    Shi, Tiezhu; Liu, Huizeng; Chen, Yiyun; Fei, Teng; Wang, Junjie; Wu, Guofeng

    2017-01-01

    This study investigated the abilities of pre-processing, feature selection and machine-learning methods for the spectroscopic diagnosis of soil arsenic contamination. The spectral data were pre-processed by using Savitzky-Golay smoothing, first and second derivatives, multiplicative scatter correction, standard normal variate, and mean centering. Principle component analysis (PCA) and the RELIEF algorithm were used to extract spectral features. Machine-learning methods, including random forests (RF), artificial neural network (ANN), radial basis function- and linear function- based support vector machine (RBF- and LF-SVM) were employed for establishing diagnosis models. The model accuracies were evaluated and compared by using overall accuracies (OAs). The statistical significance of the difference between models was evaluated by using McNemar’s test (Z value). The results showed that the OAs varied with the different combinations of pre-processing, feature selection, and classification methods. Feature selection methods could improve the modeling efficiencies and diagnosis accuracies, and RELIEF often outperformed PCA. The optimal models established by RF (OA = 86%), ANN (OA = 89%), RBF- (OA = 89%) and LF-SVM (OA = 87%) had no statistical difference in diagnosis accuracies (Z < 1.96, p < 0.05). These results indicated that it was feasible to diagnose soil arsenic contamination using reflectance spectroscopy. The appropriate combination of multivariate methods was important to improve diagnosis accuracies. PMID:28471412

  15. Contrast Enhancement Algorithm Based on Gap Adjustment for Histogram Equalization

    PubMed Central

    Chiu, Chung-Cheng; Ting, Chih-Chung

    2016-01-01

    Image enhancement methods have been widely used to improve the visual effects of images. Owing to its simplicity and effectiveness histogram equalization (HE) is one of the methods used for enhancing image contrast. However, HE may result in over-enhancement and feature loss problems that lead to unnatural look and loss of details in the processed images. Researchers have proposed various HE-based methods to solve the over-enhancement problem; however, they have largely ignored the feature loss problem. Therefore, a contrast enhancement algorithm based on gap adjustment for histogram equalization (CegaHE) is proposed. It refers to a visual contrast enhancement algorithm based on histogram equalization (VCEA), which generates visually pleasing enhanced images, and improves the enhancement effects of VCEA. CegaHE adjusts the gaps between two gray values based on the adjustment equation, which takes the properties of human visual perception into consideration, to solve the over-enhancement problem. Besides, it also alleviates the feature loss problem and further enhances the textures in the dark regions of the images to improve the quality of the processed images for human visual perception. Experimental results demonstrate that CegaHE is a reliable method for contrast enhancement and that it significantly outperforms VCEA and other methods. PMID:27338412

  16. Iterative normalization method for improved prostate cancer localization with multispectral magnetic resonance imaging

    NASA Astrophysics Data System (ADS)

    Liu, Xin; Samil Yetik, Imam

    2012-04-01

    Use of multispectral magnetic resonance imaging has received a great interest for prostate cancer localization in research and clinical studies. Manual extraction of prostate tumors from multispectral magnetic resonance imaging is inefficient and subjective, while automated segmentation is objective and reproducible. For supervised, automated segmentation approaches, learning is essential to obtain the information from training dataset. However, in this procedure, all patients are assumed to have similar properties for the tumor and normal tissues, and the segmentation performance suffers since the variations across patients are ignored. To conquer this difficulty, we propose a new iterative normalization method based on relative intensity values of tumor and normal tissues to normalize multispectral magnetic resonance images and improve segmentation performance. The idea of relative intensity mimics the manual segmentation performed by human readers, who compare the contrast between regions without knowing the actual intensity values. We compare the segmentation performance of the proposed method with that of z-score normalization followed by support vector machine, local active contours, and fuzzy Markov random field. Our experimental results demonstrate that our method outperforms the three other state-of-the-art algorithms, and was found to have specificity of 0.73, sensitivity of 0.69, and accuracy of 0.79, significantly better than alternative methods.

  17. EMUDRA: Ensemble of Multiple Drug Repositioning Approaches to Improve Prediction Accuracy.

    PubMed

    Zhou, Xianxiao; Wang, Minghui; Katsyv, Igor; Irie, Hanna; Zhang, Bin

    2018-04-24

    Availability of large-scale genomic, epigenetic and proteomic data in complex diseases makes it possible to objectively and comprehensively identify therapeutic targets that can lead to new therapies. The Connectivity Map has been widely used to explore novel indications of existing drugs. However, the prediction accuracy of the existing methods, such as Kolmogorov-Smirnov statistic remains low. Here we present a novel high-performance drug repositioning approach that improves over the state-of-the-art methods. We first designed an expression weighted cosine method (EWCos) to minimize the influence of the uninformative expression changes and then developed an ensemble approach termed EMUDRA (Ensemble of Multiple Drug Repositioning Approaches) to integrate EWCos and three existing state-of-the-art methods. EMUDRA significantly outperformed individual drug repositioning methods when applied to simulated and independent evaluation datasets. We predicted using EMUDRA and experimentally validated an antibiotic rifabutin as an inhibitor of cell growth in triple negative breast cancer. EMUDRA can identify drugs that more effectively target disease gene signatures and will thus be a useful tool for identifying novel therapies for complex diseases and predicting new indications for existing drugs. The EMUDRA R package is available at doi:10.7303/syn11510888. bin.zhang@mssm.edu or zhangb@hotmail.com. Supplementary data are available at Bioinformatics online.

  18. Pulse Transit Time Based Continuous Cuffless Blood Pressure Estimation: A New Extension and A Comprehensive Evaluation.

    PubMed

    Ding, Xiaorong; Yan, Bryan P; Zhang, Yuan-Ting; Liu, Jing; Zhao, Ni; Tsang, Hon Ki

    2017-09-14

    Cuffless technique enables continuous blood pressure (BP) measurement in an unobtrusive manner, and thus has the potential to revolutionize the conventional cuff-based approaches. This study extends the pulse transit time (PTT) based cuffless BP measurement method by introducing a new indicator - the photoplethysmogram (PPG) intensity ratio (PIR). The performance of the models with PTT and PIR was comprehensively evaluated in comparison with six models that are based on sole PTT. The validation conducted on 33 subjects with and without hypertension, at rest and under various maneuvers with induced BP changes, and over an extended calibration interval, respectively. The results showed that, comparing to the PTT models, the proposed methods achieved better accuracy on each subject group at rest state and over 24 hours calibration interval. Although the BP estimation errors under dynamic maneuvers and over extended calibration interval were significantly increased for all methods, the proposed methods still outperformed the compared methods in the latter situation. These findings suggest that additional BP-related indicator other than PTT has added value for improving the accuracy of cuffless BP measurement. This study also offers insights into future research in cuffless BP measurement for tracking dynamic BP changes and over extended periods of time.

  19. Methods to maximise recovery of environmental DNA from water samples.

    PubMed

    Hinlo, Rheyda; Gleeson, Dianne; Lintermans, Mark; Furlan, Elise

    2017-01-01

    The environmental DNA (eDNA) method is a detection technique that is rapidly gaining credibility as a sensitive tool useful in the surveillance and monitoring of invasive and threatened species. Because eDNA analysis often deals with small quantities of short and degraded DNA fragments, methods that maximize eDNA recovery are required to increase detectability. In this study, we performed experiments at different stages of the eDNA analysis to show which combinations of methods give the best recovery rate for eDNA. Using Oriental weatherloach (Misgurnus anguillicaudatus) as a study species, we show that various combinations of DNA capture, preservation and extraction methods can significantly affect DNA yield. Filtration using cellulose nitrate filter paper preserved in ethanol or stored in a -20°C freezer and extracted with the Qiagen DNeasy kit outperformed other combinations in terms of cost and efficiency of DNA recovery. Our results support the recommendation to filter water samples within 24hours but if this is not possible, our results suggest that refrigeration may be a better option than freezing for short-term storage (i.e., 3-5 days). This information is useful in designing eDNA detection of low-density invasive or threatened species, where small variations in DNA recovery can signify the difference between detection success or failure.

  20. Using molecular functional networks to manifest connections between obesity and obesity-related diseases

    PubMed Central

    Yang, Jialiang; Qiu, Jing; Wang, Kejing; Zhu, Lijuan; Fan, Jingjing; Zheng, Deyin; Meng, Xiaodi; Yang, Jiasheng; Peng, Lihong; Fu, Yu; Zhang, Dahan; Peng, Shouneng; Huang, Haiyun; Zhang, Yi

    2017-01-01

    Obesity is a primary risk factor for many diseases such as certain cancers. In this study, we have developed three algorithms including a random-walk based method OBNet, a shortest-path based method OBsp and a direct-overlap method OBoverlap, to reveal obesity-disease connections at protein-interaction subnetworks corresponding to thousands of biological functions and pathways. Through literature mining, we also curated an obesity-associated disease list, by which we compared the methods. As a result, OBNet outperforms other two methods. OBNet can predict whether a disease is obesity-related based on its associated genes. Meanwhile, OBNet identifies extensive connections between obesity genes and genes associated with a few diseases at various functional modules and pathways. Using breast cancer and Type 2 diabetes as two examples, OBNet identifies meaningful genes that may play key roles in connecting obesity and the two diseases. For example, TGFB1 and VEGFA are inferred to be the top two key genes mediating obesity-breast cancer connection in modules associated with brain development. Finally, the top modules identified by OBNet in breast cancer significantly overlap with modules identified from TCGA breast cancer gene expression study, revealing the power of OBNet in identifying biological processes involved in the disease. PMID:29156709

  1. The Effectiveness of One-to-One Tutoring by Community Tutors for At-Risk Beginning Readers.

    ERIC Educational Resources Information Center

    Vadasy, Patricia F.; Jenkins, Joseph R.; Antil, Lawrence R.; Wayne, Susan K.; O'Connor, Rollanda E.

    1997-01-01

    Twenty at-risk first graders received 30 minutes of individual instruction from community tutors four days a week for up to 23 weeks. Subjects outperformed the control group on all reading, decoding, spelling and segmenting, and writing measures. Tutors who implemented the program with a high degree of fidelity achieved significant effect sizes in…

  2. Normalization of urinary pteridines by urine specific gravity for early cancer detection.

    PubMed

    Burton, Casey; Shi, Honglan; Ma, Yinfa

    2014-08-05

    Urinary biomarkers, such as pteridines, require normalization with respect to an individual's hydration status and time since last urination. Conventional creatinine-based corrections are affected by a multitude of patient factors whereas urine specific gravity (USG) is a bulk specimen property that may better resist those same factors. We examined the performance of traditional creatinine adjustments relative to USG to six urinary pteridines in aggressive and benign breast cancers. 6-Biopterin, neopterin, pterin, 6-hydroxymethylpterin, isoxanthopterin, xanthopterin, and creatinine were analyzed in 50 urine specimens with a previously developed liquid chromatography-tandem mass spectrometry technique. Creatinine and USG performance were evaluated with non-parametric Mann-Whitney hypothesis testing. USG and creatinine were moderately correlated (r=0.857) with deviations occurring in dilute and concentrated specimens. In 48 aggressive and benign breast cancers, normalization by USG significantly outperformed creatinine adjustments which marginally outperformed uncorrected pteridines in predicting pathological status. In addition, isoxanthopterin and xanthopterin were significantly higher in pathological specimens when normalized by USG. USG, as a bulk property, can provide better performance over creatinine-based normalizations for urinary pteridines in cancer detection applications. Copyright © 2014 Elsevier B.V. All rights reserved.

  3. Forecasting USAF JP-8 Fuel Needs

    DTIC Science & Technology

    2009-03-01

    versus complex ones. When we consider long -term forecasts, 5-years in this case, multiple regression outperforms ANN modeling within the specified...with more simple and easy-to-implement methods, versus complex ones. When we consider long -term 5-year forecasts, our multiple regression model...effort. The insight and experience was certainly appreciated. Special thanks to my Turkish peers for their continuous support and help during this long

  4. Quadrotor Intercept Trajectory Planning and Simulation

    DTIC Science & Technology

    2017-06-01

    Figure 41. Results are grouped by geometry type and colored based on trajectory planner. Figure 41. Summary of Experimental Data Intercept Time...DISTRIBUTION CODE 13. ABSTRACT (maximum 200 words) Quadrotor drones pose a safety hazard when operated in or near controlled airspace. A hazardous...quadrotor could be intercepted and removed by another quadrotor. In this thesis, we seek to determine if optimal control methods outperform missile

  5. YamiPred: A Novel Evolutionary Method for Predicting Pre-miRNAs and Selecting Relevant Features.

    PubMed

    Kleftogiannis, Dimitrios; Theofilatos, Konstantinos; Likothanassis, Spiros; Mavroudi, Seferina

    2015-01-01

    MicroRNAs (miRNAs) are small non-coding RNAs, which play a significant role in gene regulation. Predicting miRNA genes is a challenging bioinformatics problem and existing experimental and computational methods fail to deal with it effectively. We developed YamiPred, an embedded classification method that combines the efficiency and robustness of support vector machines (SVM) with genetic algorithms (GA) for feature selection and parameters optimization. YamiPred was tested in a new and realistic human dataset and was compared with state-of-the-art computational intelligence approaches and the prevalent SVM-based tools for miRNA prediction. Experimental results indicate that YamiPred outperforms existing approaches in terms of accuracy and of geometric mean of sensitivity and specificity. The embedded feature selection component selects a compact feature subset that contributes to the performance optimization. Further experimentation with this minimal feature subset has achieved very high classification performance and revealed the minimum number of samples required for developing a robust predictor. YamiPred also confirmed the important role of commonly used features such as entropy and enthalpy, and uncovered the significance of newly introduced features, such as %A-U aggregate nucleotide frequency and positional entropy. The best model trained on human data has successfully predicted pre-miRNAs to other organisms including the category of viruses.

  6. Large-scale online semantic indexing of biomedical articles via an ensemble of multi-label classification models.

    PubMed

    Papanikolaou, Yannis; Tsoumakas, Grigorios; Laliotis, Manos; Markantonatos, Nikos; Vlahavas, Ioannis

    2017-09-22

    In this paper we present the approach that we employed to deal with large scale multi-label semantic indexing of biomedical papers. This work was mainly implemented within the context of the BioASQ challenge (2013-2017), a challenge concerned with biomedical semantic indexing and question answering. Our main contribution is a MUlti-Label Ensemble method (MULE) that incorporates a McNemar statistical significance test in order to validate the combination of the constituent machine learning algorithms. Some secondary contributions include a study on the temporal aspects of the BioASQ corpus (observations apply also to the BioASQ's super-set, the PubMed articles collection) and the proper parametrization of the algorithms used to deal with this challenging classification task. The ensemble method that we developed is compared to other approaches in experimental scenarios with subsets of the BioASQ corpus giving positive results. In our participation in the BioASQ challenge we obtained the first place in 2013 and the second place in the four following years, steadily outperforming MTI, the indexing system of the National Library of Medicine (NLM). The results of our experimental comparisons, suggest that employing a statistical significance test to validate the ensemble method's choices, is the optimal approach for ensembling multi-label classifiers, especially in contexts with many rare labels.

  7. Performance of in silico tools for the evaluation of p16INK4a (CDKN2A) variants in CAGI.

    PubMed

    Carraro, Marco; Minervini, Giovanni; Giollo, Manuel; Bromberg, Yana; Capriotti, Emidio; Casadio, Rita; Dunbrack, Roland; Elefanti, Lisa; Fariselli, Pietro; Ferrari, Carlo; Gough, Julian; Katsonis, Panagiotis; Leonardi, Emanuela; Lichtarge, Olivier; Menin, Chiara; Martelli, Pier Luigi; Niroula, Abhishek; Pal, Lipika R; Repo, Susanna; Scaini, Maria Chiara; Vihinen, Mauno; Wei, Qiong; Xu, Qifang; Yang, Yuedong; Yin, Yizhou; Zaucha, Jan; Zhao, Huiying; Zhou, Yaoqi; Brenner, Steven E; Moult, John; Tosatto, Silvio C E

    2017-09-01

    Correct phenotypic interpretation of variants of unknown significance for cancer-associated genes is a diagnostic challenge as genetic screenings gain in popularity in the next-generation sequencing era. The Critical Assessment of Genome Interpretation (CAGI) experiment aims to test and define the state of the art of genotype-phenotype interpretation. Here, we present the assessment of the CAGI p16INK4a challenge. Participants were asked to predict the effect on cellular proliferation of 10 variants for the p16INK4a tumor suppressor, a cyclin-dependent kinase inhibitor encoded by the CDKN2A gene. Twenty-two pathogenicity predictors were assessed with a variety of accuracy measures for reliability in a medical context. Different assessment measures were combined in an overall ranking to provide more robust results. The R scripts used for assessment are publicly available from a GitHub repository for future use in similar assessment exercises. Despite a limited test-set size, our findings show a variety of results, with some methods performing significantly better. Methods combining different strategies frequently outperform simpler approaches. The best predictor, Yang&Zhou lab, uses a machine learning method combining an empirical energy function measuring protein stability with an evolutionary conservation term. The p16INK4a challenge highlights how subtle structural effects can neutralize otherwise deleterious variants. © 2017 Wiley Periodicals, Inc.

  8. Integrative approach for inference of gene regulatory networks using lasso-based random featuring and application to psychiatric disorders.

    PubMed

    Kim, Dongchul; Kang, Mingon; Biswas, Ashis; Liu, Chunyu; Gao, Jean

    2016-08-10

    Inferring gene regulatory networks is one of the most interesting research areas in the systems biology. Many inference methods have been developed by using a variety of computational models and approaches. However, there are two issues to solve. First, depending on the structural or computational model of inference method, the results tend to be inconsistent due to innately different advantages and limitations of the methods. Therefore the combination of dissimilar approaches is demanded as an alternative way in order to overcome the limitations of standalone methods through complementary integration. Second, sparse linear regression that is penalized by the regularization parameter (lasso) and bootstrapping-based sparse linear regression methods were suggested in state of the art methods for network inference but they are not effective for a small sample size data and also a true regulator could be missed if the target gene is strongly affected by an indirect regulator with high correlation or another true regulator. We present two novel network inference methods based on the integration of three different criteria, (i) z-score to measure the variation of gene expression from knockout data, (ii) mutual information for the dependency between two genes, and (iii) linear regression-based feature selection. Based on these criterion, we propose a lasso-based random feature selection algorithm (LARF) to achieve better performance overcoming the limitations of bootstrapping as mentioned above. In this work, there are three main contributions. First, our z score-based method to measure gene expression variations from knockout data is more effective than similar criteria of related works. Second, we confirmed that the true regulator selection can be effectively improved by LARF. Lastly, we verified that an integrative approach can clearly outperform a single method when two different methods are effectively jointed. In the experiments, our methods were validated by outperforming the state of the art methods on DREAM challenge data, and then LARF was applied to inferences of gene regulatory network associated with psychiatric disorders.

  9. A Multi-Channel Method for Detecting Periodic Forced Oscillations in Power Systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Follum, James D.; Tuffner, Francis K.

    2016-11-14

    Forced oscillations in electric power systems are often symptomatic of equipment malfunction or improper operation. Detecting and addressing the cause of the oscillations can improve overall system operation. In this paper, a multi-channel method of detecting forced oscillations and estimating their frequencies is proposed. The method operates by comparing the sum of scaled periodograms from various channels to a threshold. A method of setting the threshold to specify the detector's probability of false alarm while accounting for the correlation between channels is also presented. Results from simulated and measured power system data indicate that the method outperforms its single-channel counterpartmore » and is suitable for real-world applications.« less

  10. Doubly stochastic radial basis function methods

    NASA Astrophysics Data System (ADS)

    Yang, Fenglian; Yan, Liang; Ling, Leevan

    2018-06-01

    We propose a doubly stochastic radial basis function (DSRBF) method for function recoveries. Instead of a constant, we treat the RBF shape parameters as stochastic variables whose distribution were determined by a stochastic leave-one-out cross validation (LOOCV) estimation. A careful operation count is provided in order to determine the ranges of all the parameters in our methods. The overhead cost for setting up the proposed DSRBF method is O (n2) for function recovery problems with n basis. Numerical experiments confirm that the proposed method not only outperforms constant shape parameter formulation (in terms of accuracy with comparable computational cost) but also the optimal LOOCV formulation (in terms of both accuracy and computational cost).

  11. Drug-target interaction prediction: A Bayesian ranking approach.

    PubMed

    Peska, Ladislav; Buza, Krisztian; Koller, Júlia

    2017-12-01

    In silico prediction of drug-target interactions (DTI) could provide valuable information and speed-up the process of drug repositioning - finding novel usage for existing drugs. In our work, we focus on machine learning algorithms supporting drug-centric repositioning approach, which aims to find novel usage for existing or abandoned drugs. We aim at proposing a per-drug ranking-based method, which reflects the needs of drug-centric repositioning research better than conventional drug-target prediction approaches. We propose Bayesian Ranking Prediction of Drug-Target Interactions (BRDTI). The method is based on Bayesian Personalized Ranking matrix factorization (BPR) which has been shown to be an excellent approach for various preference learning tasks, however, it has not been used for DTI prediction previously. In order to successfully deal with DTI challenges, we extended BPR by proposing: (i) the incorporation of target bias, (ii) a technique to handle new drugs and (iii) content alignment to take structural similarities of drugs and targets into account. Evaluation on five benchmark datasets shows that BRDTI outperforms several state-of-the-art approaches in terms of per-drug nDCG and AUC. BRDTI results w.r.t. nDCG are 0.929, 0.953, 0.948, 0.897 and 0.690 for G-Protein Coupled Receptors (GPCR), Ion Channels (IC), Nuclear Receptors (NR), Enzymes (E) and Kinase (K) datasets respectively. Additionally, BRDTI significantly outperformed other methods (BLM-NII, WNN-GIP, NetLapRLS and CMF) w.r.t. nDCG in 17 out of 20 cases. Furthermore, BRDTI was also shown to be able to predict novel drug-target interactions not contained in the original datasets. The average recall at top-10 predicted targets for each drug was 0.762, 0.560, 1.000 and 0.404 for GPCR, IC, NR, and E datasets respectively. Based on the evaluation, we can conclude that BRDTI is an appropriate choice for researchers looking for an in silico DTI prediction technique to be used in drug-centric repositioning scenarios. BRDTI Software and supplementary materials are available online at www.ksi.mff.cuni.cz/∼peska/BRDTI. Copyright © 2017 Elsevier B.V. All rights reserved.

  12. Validation of the da Vinci Surgical Skill Simulator across three surgical disciplines: A pilot study

    PubMed Central

    Alzahrani, Tarek; Haddad, Richard; Alkhayal, Abdullah; Delisle, Josée; Drudi, Laura; Gotlieb, Walter; Fraser, Shannon; Bergman, Simon; Bladou, Frank; Andonian, Sero; Anidjar, Maurice

    2013-01-01

    Objective: In this paper, we evaluate face, content and construct validity of the da Vinci Surgical Skills Simulator (dVSSS) across 3 surgical disciplines. Methods: In total, 48 participants from urology, gynecology and general surgery participated in the study as novices (0 robotic cases performed), intermediates (1–74) or experts (≥75). Each participant completed 9 tasks (Peg board level 2, match board level 2, needle targeting, ring and rail level 2, dots and needles level 1, suture sponge level 2, energy dissection level 1, ring walk level 3 and tubes). The Mimic Technologies software scored each task from 0 (worst) to 100 (best) using several predetermined metrics. Face and content validity were evaluated by a questionnaire administered after task completion. Wilcoxon test was used to perform pair wise comparisons. Results: The expert group comprised of 6 attending surgeons. The intermediate group included 4 attending surgeons, 3 fellows and 5 residents. The novices included 1 attending surgeon, 1 fellow, 13 residents, 13 medical students and 2 research assistants. The median number of robotic cases performed by experts and intermediates were 250 and 9, respectively. The median overall realistic score (face validity) was 8/10. Experts rated the usefulness of the simulator as a training tool for residents (content validity) as 8.5/10. For construct validity, experts outperformed novices in all 9 tasks (p < 0.05). Intermediates outperformed novices in 7 of 9 tasks (p < 0.05); there were no significant differences in the energy dissection and ring walk tasks. Finally, experts scored significantly better than intermediates in only 3 of 9 tasks (matchboard, dots and needles and energy dissection) (p < 0.05). Conclusions: This study confirms the face, content and construct validities of the dVSSS across urology, gynecology and general surgery. Larger sample size and more complex tasks are needed to further differentiate intermediates from experts. PMID:23914275

  13. Dynamic programming re-ranking for PPI interactor and pair extraction in full-text articles

    PubMed Central

    2011-01-01

    Background Experimentally verified protein-protein interactions (PPIs) cannot be easily retrieved by researchers unless they are stored in PPI databases. The curation of such databases can be facilitated by employing text-mining systems to identify genes which play the interactor role in PPIs and to map these genes to unique database identifiers (interactor normalization task or INT) and then to return a list of interaction pairs for each article (interaction pair task or IPT). These two tasks are evaluated in terms of the area under curve of the interpolated precision/recall (AUC iP/R) score because the order of identifiers in the output list is important for ease of curation. Results Our INT system developed for the BioCreAtIvE II.5 INT challenge achieved a promising AUC iP/R of 43.5% by using a support vector machine (SVM)-based ranking procedure. Using our new re-ranking algorithm, we have been able to improve system performance (AUC iP/R) by 1.84%. Our experimental results also show that with the re-ranked INT results, our unsupervised IPT system can achieve a competitive AUC iP/R of 23.86%, which outperforms the best BC II.5 INT system by 1.64%. Compared to using only SVM ranked INT results, using re-ranked INT results boosts AUC iP/R by 7.84%. Statistical significance t-test results show that our INT/IPT system with re-ranking outperforms that without re-ranking by a statistically significant difference. Conclusions In this paper, we present a new re-ranking algorithm that considers co-occurrence among identifiers in an article to improve INT and IPT ranking results. Combining the re-ranked INT results with an unsupervised approach to find associations among interactors, the proposed method can boost the IPT performance. We also implement score computation using dynamic programming, which is faster and more efficient than traditional approaches. PMID:21342534

  14. Dynamic programming re-ranking for PPI interactor and pair extraction in full-text articles.

    PubMed

    Tsai, Richard Tzong-Han; Lai, Po-Ting

    2011-02-23

    Experimentally verified protein-protein interactions (PPIs) cannot be easily retrieved by researchers unless they are stored in PPI databases. The curation of such databases can be facilitated by employing text-mining systems to identify genes which play the interactor role in PPIs and to map these genes to unique database identifiers (interactor normalization task or INT) and then to return a list of interaction pairs for each article (interaction pair task or IPT). These two tasks are evaluated in terms of the area under curve of the interpolated precision/recall (AUC iP/R) score because the order of identifiers in the output list is important for ease of curation. Our INT system developed for the BioCreAtIvE II.5 INT challenge achieved a promising AUC iP/R of 43.5% by using a support vector machine (SVM)-based ranking procedure. Using our new re-ranking algorithm, we have been able to improve system performance (AUC iP/R) by 1.84%. Our experimental results also show that with the re-ranked INT results, our unsupervised IPT system can achieve a competitive AUC iP/R of 23.86%, which outperforms the best BC II.5 INT system by 1.64%. Compared to using only SVM ranked INT results, using re-ranked INT results boosts AUC iP/R by 7.84%. Statistical significance t-test results show that our INT/IPT system with re-ranking outperforms that without re-ranking by a statistically significant difference. In this paper, we present a new re-ranking algorithm that considers co-occurrence among identifiers in an article to improve INT and IPT ranking results. Combining the re-ranked INT results with an unsupervised approach to find associations among interactors, the proposed method can boost the IPT performance. We also implement score computation using dynamic programming, which is faster and more efficient than traditional approaches.

  15. A discriminative method for protein remote homology detection and fold recognition combining Top-n-grams and latent semantic analysis.

    PubMed

    Liu, Bin; Wang, Xiaolong; Lin, Lei; Dong, Qiwen; Wang, Xuan

    2008-12-01

    Protein remote homology detection and fold recognition are central problems in bioinformatics. Currently, discriminative methods based on support vector machine (SVM) are the most effective and accurate methods for solving these problems. A key step to improve the performance of the SVM-based methods is to find a suitable representation of protein sequences. In this paper, a novel building block of proteins called Top-n-grams is presented, which contains the evolutionary information extracted from the protein sequence frequency profiles. The protein sequence frequency profiles are calculated from the multiple sequence alignments outputted by PSI-BLAST and converted into Top-n-grams. The protein sequences are transformed into fixed-dimension feature vectors by the occurrence times of each Top-n-gram. The training vectors are evaluated by SVM to train classifiers which are then used to classify the test protein sequences. We demonstrate that the prediction performance of remote homology detection and fold recognition can be improved by combining Top-n-grams and latent semantic analysis (LSA), which is an efficient feature extraction technique from natural language processing. When tested on superfamily and fold benchmarks, the method combining Top-n-grams and LSA gives significantly better results compared to related methods. The method based on Top-n-grams significantly outperforms the methods based on many other building blocks including N-grams, patterns, motifs and binary profiles. Therefore, Top-n-gram is a good building block of the protein sequences and can be widely used in many tasks of the computational biology, such as the sequence alignment, the prediction of domain boundary, the designation of knowledge-based potentials and the prediction of protein binding sites.

  16. Comparing three pedagogical approaches to psychomotor skills acquisition.

    PubMed

    Willis, Ross E; Richa, Jacqueline; Oppeltz, Richard; Nguyen, Patrick; Wagner, Kelly; Van Sickle, Kent R; Dent, Daniel L

    2012-01-01

    We compared traditional pedagogical approaches such as time- and repetition-based methods with proficiency-based training. Laparoscopic novices were assigned randomly to 1 of 3 training conditions. In experiment 1, participants in the time condition practiced for 60 minutes, participants in the repetition condition performed 5 practice trials, and participants in the proficiency condition trained until reaching a predetermined proficiency goal. In experiment 2, practice time and number of trials were equated across conditions. In experiment 1, participants in the proficiency-based training conditions outperformed participants in the other 2 conditions (P < .014); however, these participants trained longer (P < .001) and performed more repetitions (P < .001). In experiment 2, despite training for similar amounts of time and number of repetitions, participants in the proficiency condition outperformed their counterparts (P < .038). In both experiments, the standard deviations for the proficiency condition were smaller than the other conditions. Proficiency-based training results in trainees who perform uniformly and at a higher level than traditional training methodologies. Copyright © 2012 Elsevier Inc. All rights reserved.

  17. A comparison of 1D and 2D LSTM architectures for the recognition of handwritten Arabic

    NASA Astrophysics Data System (ADS)

    Yousefi, Mohammad Reza; Soheili, Mohammad Reza; Breuel, Thomas M.; Stricker, Didier

    2015-01-01

    In this paper, we present an Arabic handwriting recognition method based on recurrent neural network. We use the Long Short Term Memory (LSTM) architecture, that have proven successful in different printed and handwritten OCR tasks. Applications of LSTM for handwriting recognition employ the two-dimensional architecture to deal with the variations in both vertical and horizontal axis. However, we show that using a simple pre-processing step that normalizes the position and baseline of letters, we can make use of 1D LSTM, which is faster in learning and convergence, and yet achieve superior performance. In a series of experiments on IFN/ENIT database for Arabic handwriting recognition, we demonstrate that our proposed pipeline can outperform 2D LSTM networks. Furthermore, we provide comparisons with 1D LSTM networks trained with manually crafted features to show that the automatically learned features in a globally trained 1D LSTM network with our normalization step can even outperform such systems.

  18. Geostatistical modeling of riparian forest microclimate and its implications for sampling

    USGS Publications Warehouse

    Eskelson, B.N.I.; Anderson, P.D.; Hagar, J.C.; Temesgen, H.

    2011-01-01

    Predictive models of microclimate under various site conditions in forested headwater stream - riparian areas are poorly developed, and sampling designs for characterizing underlying riparian microclimate gradients are sparse. We used riparian microclimate data collected at eight headwater streams in the Oregon Coast Range to compare ordinary kriging (OK), universal kriging (UK), and kriging with external drift (KED) for point prediction of mean maximum air temperature (Tair). Several topographic and forest structure characteristics were considered as site-specific parameters. Height above stream and distance to stream were the most important covariates in the KED models, which outperformed OK and UK in terms of root mean square error. Sample patterns were optimized based on the kriging variance and the weighted means of shortest distance criterion using the simulated annealing algorithm. The optimized sample patterns outperformed systematic sample patterns in terms of mean kriging variance mainly for small sample sizes. These findings suggest methods for increasing efficiency of microclimate monitoring in riparian areas.

  19. Reducing the Impact of Stereotype Threat on Women's Math Performance: Are Two Strategies Better Than One?

    PubMed Central

    Jones, Paul R.

    2012-01-01

    Introduction Two studies examined whether stereotype threat impairs women's math performance and whether concurrent threat reduction strategies can be used to offset this effect. Method In Study 1, collegiate men and women (N = 100) watched a video purporting that males and females performed equally well (gender-fair) or males outperformed females (gender differences) on an imminent math test. In Study 2, (N = 44) women viewed the gender differences video, followed by misattribution (cue present, absent) and self-affirmation (present, absent) manipulations, before taking the aforesaid test. Results In the initial study, women underperformed men on the test after receiving the gender differences video, whereas no gender differences emerged in the gender-fair condition. In Study 2, affirming the self led to better performance than not doing so. Planned contrasts indicated, however, that only women receiving a misattribution cue and self-affirmation opportunity outperformed their counterparts not given these reduction strategies. Discussion These findings are discussed relative to Stereotype Threat Theory and educational implications are provided. PMID:22545058

  20. Joint state and parameter estimation of the hemodynamic model by particle smoother expectation maximization method

    NASA Astrophysics Data System (ADS)

    Aslan, Serdar; Taylan Cemgil, Ali; Akın, Ata

    2016-08-01

    Objective. In this paper, we aimed for the robust estimation of the parameters and states of the hemodynamic model by using blood oxygen level dependent signal. Approach. In the fMRI literature, there are only a few successful methods that are able to make a joint estimation of the states and parameters of the hemodynamic model. In this paper, we implemented a maximum likelihood based method called the particle smoother expectation maximization (PSEM) algorithm for the joint state and parameter estimation. Main results. Former sequential Monte Carlo methods were only reliable in the hemodynamic state estimates. They were claimed to outperform the local linearization (LL) filter and the extended Kalman filter (EKF). The PSEM algorithm is compared with the most successful method called square-root cubature Kalman smoother (SCKS) for both state and parameter estimation. SCKS was found to be better than the dynamic expectation maximization (DEM) algorithm, which was shown to be a better estimator than EKF, LL and particle filters. Significance. PSEM was more accurate than SCKS for both the state and the parameter estimation. Hence, PSEM seems to be the most accurate method for the system identification and state estimation for the hemodynamic model inversion literature. This paper do not compare its results with Tikhonov-regularized Newton—CKF (TNF-CKF), a recent robust method which works in filtering sense.

  1. Comparison of Basic and Ensemble Data Mining Methods in Predicting 5-Year Survival of Colorectal Cancer Patients.

    PubMed

    Pourhoseingholi, Mohamad Amin; Kheirian, Sedigheh; Zali, Mohammad Reza

    2017-12-01

    Colorectal cancer (CRC) is one of the most common malignancies and cause of cancer mortality worldwide. Given the importance of predicting the survival of CRC patients and the growing use of data mining methods, this study aims to compare the performance of models for predicting 5-year survival of CRC patients using variety of basic and ensemble data mining methods. The CRC dataset from The Shahid Beheshti University of Medical Sciences Research Center for Gastroenterology and Liver Diseases were used for prediction and comparative study of the base and ensemble data mining techniques. Feature selection methods were used to select predictor attributes for classification. The WEKA toolkit and MedCalc software were respectively utilized for creating and comparing the models. The obtained results showed that the predictive performance of developed models was altogether high (all greater than 90%). Overall, the performance of ensemble models was higher than that of basic classifiers and the best result achieved by ensemble voting model in terms of area under the ROC curve (AUC= 0.96). AUC Comparison of models showed that the ensemble voting method significantly outperformed all models except for two methods of Random Forest (RF) and Bayesian Network (BN) considered the overlapping 95% confidence intervals. This result may indicate high predictive power of these two methods along with ensemble voting for predicting 5-year survival of CRC patients.

  2. A Combined Independent Source Separation and Quality Index Optimization Method for Fetal ECG Extraction from Abdominal Maternal Leads

    PubMed Central

    Billeci, Lucia; Varanini, Maurizio

    2017-01-01

    The non-invasive fetal electrocardiogram (fECG) technique has recently received considerable interest in monitoring fetal health. The aim of our paper is to propose a novel fECG algorithm based on the combination of the criteria of independent source separation and of a quality index optimization (ICAQIO-based). The algorithm was compared with two methods applying the two different criteria independently—the ICA-based and the QIO-based methods—which were previously developed by our group. All three methods were tested on the recently implemented Fetal ECG Synthetic Database (FECGSYNDB). Moreover, the performance of the algorithm was tested on real data from the PhysioNet fetal ECG Challenge 2013 Database. The proposed combined method outperformed the other two algorithms on the FECGSYNDB (ICAQIO-based: 98.78%, QIO-based: 97.77%, ICA-based: 97.61%). Significant differences were obtained in particular in the conditions when uterine contractions and maternal and fetal ectopic beats occurred. On the real data, all three methods obtained very high performances, with the QIO-based method proving slightly better than the other two (ICAQIO-based: 99.38%, QIO-based: 99.76%, ICA-based: 99.37%). The findings from this study suggest that the proposed method could potentially be applied as a novel algorithm for accurate extraction of fECG, especially in critical recording conditions. PMID:28509860

  3. Multi-Label Learning via Random Label Selection for Protein Subcellular Multi-Locations Prediction.

    PubMed

    Wang, Xiao; Li, Guo-Zheng

    2013-03-12

    Prediction of protein subcellular localization is an important but challenging problem, particularly when proteins may simultaneously exist at, or move between, two or more different subcellular location sites. Most of the existing protein subcellular localization methods are only used to deal with the single-location proteins. In the past few years, only a few methods have been proposed to tackle proteins with multiple locations. However, they only adopt a simple strategy, that is, transforming the multi-location proteins to multiple proteins with single location, which doesn't take correlations among different subcellular locations into account. In this paper, a novel method named RALS (multi-label learning via RAndom Label Selection), is proposed to learn from multi-location proteins in an effective and efficient way. Through five-fold cross validation test on a benchmark dataset, we demonstrate our proposed method with consideration of label correlations obviously outperforms the baseline BR method without consideration of label correlations, indicating correlations among different subcellular locations really exist and contribute to improvement of prediction performance. Experimental results on two benchmark datasets also show that our proposed methods achieve significantly higher performance than some other state-of-the-art methods in predicting subcellular multi-locations of proteins. The prediction web server is available at http://levis.tongji.edu.cn:8080/bioinfo/MLPred-Euk/ for the public usage.

  4. Conjugate gradient heat bath for ill-conditioned actions.

    PubMed

    Ceriotti, Michele; Bussi, Giovanni; Parrinello, Michele

    2007-08-01

    We present a method for performing sampling from a Boltzmann distribution of an ill-conditioned quadratic action. This method is based on heat-bath thermalization along a set of conjugate directions, generated via a conjugate-gradient procedure. The resulting scheme outperforms local updates for matrices with very high condition number, since it avoids the slowing down of modes with lower eigenvalue, and has some advantages over the global heat-bath approach, compared to which it is more stable and allows for more freedom in devising case-specific optimizations.

  5. A Pressure Control Method for Emulsion Pump Station Based on Elman Neural Network

    PubMed Central

    Tan, Chao; Qi, Nan; Yao, Xingang; Wang, Zhongbin; Si, Lei

    2015-01-01

    In order to realize pressure control of emulsion pump station which is key equipment of coal mine in the safety production, the control requirements were analyzed and a pressure control method based on Elman neural network was proposed. The key techniques such as system framework, pressure prediction model, pressure control model, and the flowchart of proposed approach were presented. Finally, a simulation example was carried out and comparison results indicated that the proposed approach was feasible and efficient and outperformed others. PMID:25861253

  6. Classifying medical relations in clinical text via convolutional neural networks.

    PubMed

    He, Bin; Guan, Yi; Dai, Rui

    2018-05-16

    Deep learning research on relation classification has achieved solid performance in the general domain. This study proposes a convolutional neural network (CNN) architecture with a multi-pooling operation for medical relation classification on clinical records and explores a loss function with a category-level constraint matrix. Experiments using the 2010 i2b2/VA relation corpus demonstrate these models, which do not depend on any external features, outperform previous single-model methods and our best model is competitive with the existing ensemble-based method. Copyright © 2018. Published by Elsevier B.V.

  7. A Topical Anti-inflammatory Healing Regimen Utilizing Conjugated Linolenic Acid for Use Post-ablative Laser Resurfacing of the Face: A Randomized, Controlled Trial

    PubMed Central

    Goldman, Mitchel P.

    2017-01-01

    Background: Fractionated, ablative lasers are usually associated with post-treatment erythema, edema, and crusting, which can last from 5 to 14 days. Conjugated linolenic acid, an omega-5 fatty acid, has significant antioxidant and anti-inflammatory properties, and has been shown to stimulate keratinocyte proliferation and epidermal regeneration. By modulating the early inflammatory milieu and directly affecting skin structure and function, conjugated linolenic acid might therefore shorten downtime following fractionated ablative laser resurfacing of the face. Objective: To evaluate the efficacy and subject satisfaction of a topical regimen containing conjugated linolenic acid derived from pomegranate seed extract in accelerating wound healing and improving skin quality following fractionated ablative laser resurfacing of the face. Materials and Methods: Thirty-four subjects were enrolled and received fractionated CO2 laser resurfacing. Subjects were randomized to use the test healing regimen (n=24) or 1% dimethicone ointment (n=10) post-procedure. The primary endpoint was the degree of erythema, edema, crusting, and exudation evaluated by a blinded clinician at post-treatment Days 1,3,7,10, 14, and 30. Secondary endpoints included a blinded evaluator assessment of the degree of wrinkling and elastosis using the Fitzpatrick-Goldman Wrinkle and Elastosis Scale; subject-assessed degree of pain, itching, tightness, oozing, and crusting; and subject overall satisfaction. Results: Subjects who applied the topical conjugated linolenic acid healing regimen experienced significantly reduced edema on post-procedure Day 3 (p=0.04), and itching on Days 1 and 3 (p=0.03 and p=0.04). Both regimens produced significant improvements in wrinkling and elastosis at Days 14 and 30 post-treatment, with conjugated linolenic acid outperforming placebo in improvements in wrinkling at Day 14. Both regimens were well tolerated with no statistical differences in adverse events or subject satisfaction. Conclusion: The topical conjugated linolenic acid formulation outperformed placebo by decreasing acute pruritus and edema, and enabling a faster positive outcome in wrinkle improvement. Additionally, topical conjugated linolenic acid does not raise any safety or tolerability issues as compared to current standard of care. PMID:29344315

  8. A Topical Anti-inflammatory Healing Regimen Utilizing Conjugated Linolenic Acid for Use Post-ablative Laser Resurfacing of the Face: A Randomized, Controlled Trial.

    PubMed

    Wu, Douglas C; Goldman, Mitchel P

    2017-10-01

    Background: Fractionated, ablative lasers are usually associated with post-treatment erythema, edema, and crusting, which can last from 5 to 14 days. Conjugated linolenic acid, an omega-5 fatty acid, has significant antioxidant and anti-inflammatory properties, and has been shown to stimulate keratinocyte proliferation and epidermal regeneration. By modulating the early inflammatory milieu and directly affecting skin structure and function, conjugated linolenic acid might therefore shorten downtime following fractionated ablative laser resurfacing of the face. Objective: To evaluate the efficacy and subject satisfaction of a topical regimen containing conjugated linolenic acid derived from pomegranate seed extract in accelerating wound healing and improving skin quality following fractionated ablative laser resurfacing of the face. Materials and Methods: Thirty-four subjects were enrolled and received fractionated CO2 laser resurfacing. Subjects were randomized to use the test healing regimen (n=24) or 1% dimethicone ointment (n=10) post-procedure. The primary endpoint was the degree of erythema, edema, crusting, and exudation evaluated by a blinded clinician at post-treatment Days 1,3,7,10, 14, and 30. Secondary endpoints included a blinded evaluator assessment of the degree of wrinkling and elastosis using the Fitzpatrick-Goldman Wrinkle and Elastosis Scale; subject-assessed degree of pain, itching, tightness, oozing, and crusting; and subject overall satisfaction. Results: Subjects who applied the topical conjugated linolenic acid healing regimen experienced significantly reduced edema on post-procedure Day 3 ( p =0.04), and itching on Days 1 and 3 ( p =0.03 and p =0.04). Both regimens produced significant improvements in wrinkling and elastosis at Days 14 and 30 post-treatment, with conjugated linolenic acid outperforming placebo in improvements in wrinkling at Day 14. Both regimens were well tolerated with no statistical differences in adverse events or subject satisfaction. Conclusion: The topical conjugated linolenic acid formulation outperformed placebo by decreasing acute pruritus and edema, and enabling a faster positive outcome in wrinkle improvement. Additionally, topical conjugated linolenic acid does not raise any safety or tolerability issues as compared to current standard of care.

  9. Using the MMPI-2-RF to discriminate psychometrically identified schizotypic college students from a matched comparison sample.

    PubMed

    Hunter, Helen K; Bolinskey, P Kevin; Novi, Jonathan H; Hudak, Daniel V; James, Alison V; Myers, Kevin R; Schuder, Kelly M

    2014-01-01

    This study investigates the extent to which the Minnesota Multiphasic Personality Inventory-2 Restructured Form (MMPI-2-RF) profiles of 52 individuals making up a psychometrically identified schizotypes (SZT) sample could be successfully discriminated from the protocols of 52 individuals in a matched comparison (MC) sample. Replication analyses were performed with an additional 53 pairs of SZT and MC participants. Results showed significant differences in mean T-score values between these 2 groups across a variety of MMPI-2-RF scales. Results from discriminant function analyses indicate that schizotypy can be predicted effectively using 4 MMPI-2-RF scales and that this method of classification held up on replication. Additional results demonstrated that these MMPI-2-RF scales nominally outperformed MMPI-2 scales suggested by previous research as being indicative of schizophrenia liability. Directions for future research with the MMPI-2-RF are suggested.

  10. Entropy-aware projected Landweber reconstruction for quantized block compressive sensing of aerial imagery

    NASA Astrophysics Data System (ADS)

    Liu, Hao; Li, Kangda; Wang, Bing; Tang, Hainie; Gong, Xiaohui

    2017-01-01

    A quantized block compressive sensing (QBCS) framework, which incorporates the universal measurement, quantization/inverse quantization, entropy coder/decoder, and iterative projected Landweber reconstruction, is summarized. Under the QBCS framework, this paper presents an improved reconstruction algorithm for aerial imagery, QBCS, with entropy-aware projected Landweber (QBCS-EPL), which leverages the full-image sparse transform without Wiener filter and an entropy-aware thresholding model for wavelet-domain image denoising. Through analyzing the functional relation between the soft-thresholding factors and entropy-based bitrates for different quantization methods, the proposed model can effectively remove wavelet-domain noise of bivariate shrinkage and achieve better image reconstruction quality. For the overall performance of QBCS reconstruction, experimental results demonstrate that the proposed QBCS-EPL algorithm significantly outperforms several existing algorithms. With the experiment-driven methodology, the QBCS-EPL algorithm can obtain better reconstruction quality at a relatively moderate computational cost, which makes it more desirable for aerial imagery applications.

  11. Removal performance of elemental mercury by low-cost adsorbents prepared through facile methods of carbonisation and activation of coconut husk.

    PubMed

    Johari, Khairiraihanna; Alias, Afidatul Shazwani; Saman, Norasikin; Song, Shiow Tien; Mat, Hanapi

    2015-01-01

    The preparation of chars and activated carbon as low-cost elemental mercury adsorbents was carried out through the carbonisation of coconut husk (pith and fibre) and the activation of chars with potassium hydroxide (KOH), respectively. The synthesised adsorbents were characterised by using scanning electron microscopy, Fourier transform infrared spectroscopy and nitrogen adsorption/desorption analysis. The elemental mercury removal performance was measured using a conventional flow type packed-bed adsorber. The physical and chemical properties of the adsorbents changed as a result of the carbonisation and activation process, hence affecting on the extent of elemental mercury adsorption. The highest elemental mercury (Hg°) adsorption capacity was obtained for the CP-CHAR (3142.57 µg g(-1)), which significantly outperformed the pristine and activated carbon adsorbents, as well as higher than some adsorbents reported in the literature. © The Author(s) 2014.

  12. An Evaluation of Pixel-Based Methods for the Detection of Floating Objects on the Sea Surface

    NASA Astrophysics Data System (ADS)

    Borghgraef, Alexander; Barnich, Olivier; Lapierre, Fabian; Van Droogenbroeck, Marc; Philips, Wilfried; Acheroy, Marc

    2010-12-01

    Ship-based automatic detection of small floating objects on an agitated sea surface remains a hard problem. Our main concern is the detection of floating mines, which proved a real threat to shipping in confined waterways during the first Gulf War, but applications include salvaging, search-and-rescue operation, perimeter, or harbour defense. Detection in infrared (IR) is challenging because a rough sea is seen as a dynamic background of moving objects with size order, shape, and temperature similar to those of the floating mine. In this paper we have applied a selection of background subtraction algorithms to the problem, and we show that the recent algorithms such as ViBe and behaviour subtraction, which take into account spatial and temporal correlations within the dynamic scene, significantly outperform the more conventional parametric techniques, with only little prior assumptions about the physical properties of the scene.

  13. Context-Dependent Piano Music Transcription With Convolutional Sparse Coding

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Cogliati, Andrea; Duan, Zhiyao; Wohlberg, Brendt

    This study presents a novel approach to automatic transcription of piano music in a context-dependent setting. This approach employs convolutional sparse coding to approximate the music waveform as the summation of piano note waveforms (dictionary elements) convolved with their temporal activations (onset transcription). The piano note waveforms are pre-recorded for the specific piano to be transcribed in the specific environment. During transcription, the note waveforms are fixed and their temporal activations are estimated and post-processed to obtain the pitch and onset transcription. This approach works in the time domain, models temporal evolution of piano notes, and estimates pitches and onsetsmore » simultaneously in the same framework. Finally, experiments show that it significantly outperforms a state-of-the-art music transcription method trained in the same context-dependent setting, in both transcription accuracy and time precision, in various scenarios including synthetic, anechoic, noisy, and reverberant environments.« less

  14. Detection of prostate cancer on multiparametric MRI

    NASA Astrophysics Data System (ADS)

    Seah, Jarrel C. Y.; Tang, Jennifer S. N.; Kitchen, Andy

    2017-03-01

    In this manuscript, we describe our approach and methods to the ProstateX challenge, which achieved an overall AUC of 0.84 and the runner-up position. We train a deep convolutional neural network to classify lesions marked on multiparametric MRI of the prostate as clinically significant or not. We implement a novel addition to the standard convolutional architecture described as auto-windowing which is clinically inspired and designed to overcome some of the difficulties faced in MRI interpretation, where high dynamic ranges and low contrast edges may cause difficulty for traditional convolutional neural networks trained on high contrast natural imagery. We demonstrate that this system can be trained end to end and outperforms a similar architecture without such additions. Although a relatively small training set was provided, we use extensive data augmentation to prevent overfitting and transfer learning to improve convergence speed, showing that deep convolutional neural networks can be feasibly trained on small datasets.

  15. A biologically inspired immunization strategy for network epidemiology.

    PubMed

    Liu, Yang; Deng, Yong; Jusup, Marko; Wang, Zhen

    2016-07-07

    Well-known immunization strategies, based on degree centrality, betweenness centrality, or closeness centrality, either neglect the structural significance of a node or require global information about the network. We propose a biologically inspired immunization strategy that circumvents both of these problems by considering the number of links of a focal node and the way the neighbors are connected among themselves. The strategy thus measures the dependence of the neighbors on the focal node, identifying the ability of this node to spread the disease. Nodes with the highest ability in the network are the first to be immunized. To test the performance of our method, we conduct numerical simulations on several computer-generated and empirical networks, using the susceptible-infected-recovered (SIR) model. The results show that the proposed strategy largely outperforms the existing well-known strategies. Copyright © 2016 Elsevier Ltd. All rights reserved.

  16. The Nonsubsampled Contourlet Transform Based Statistical Medical Image Fusion Using Generalized Gaussian Density

    PubMed Central

    Yang, Guocheng; Li, Meiling; Chen, Leiting; Yu, Jie

    2015-01-01

    We propose a novel medical image fusion scheme based on the statistical dependencies between coefficients in the nonsubsampled contourlet transform (NSCT) domain, in which the probability density function of the NSCT coefficients is concisely fitted using generalized Gaussian density (GGD), as well as the similarity measurement of two subbands is accurately computed by Jensen-Shannon divergence of two GGDs. To preserve more useful information from source images, the new fusion rules are developed to combine the subbands with the varied frequencies. That is, the low frequency subbands are fused by utilizing two activity measures based on the regional standard deviation and Shannon entropy and the high frequency subbands are merged together via weight maps which are determined by the saliency values of pixels. The experimental results demonstrate that the proposed method significantly outperforms the conventional NSCT based medical image fusion approaches in both visual perception and evaluation indices. PMID:26557871

  17. A Hybrid Genetic Programming Algorithm for Automated Design of Dispatching Rules.

    PubMed

    Nguyen, Su; Mei, Yi; Xue, Bing; Zhang, Mengjie

    2018-06-04

    Designing effective dispatching rules for production systems is a difficult and timeconsuming task if it is done manually. In the last decade, the growth of computing power, advanced machine learning, and optimisation techniques has made the automated design of dispatching rules possible and automatically discovered rules are competitive or outperform existing rules developed by researchers. Genetic programming is one of the most popular approaches to discovering dispatching rules in the literature, especially for complex production systems. However, the large heuristic search space may restrict genetic programming from finding near optimal dispatching rules. This paper develops a new hybrid genetic programming algorithm for dynamic job shop scheduling based on a new representation, a new local search heuristic, and efficient fitness evaluators. Experiments show that the new method is effective regarding the quality of evolved rules. Moreover, evolved rules are also significantly smaller and contain more relevant attributes.

  18. Prediction of Patient-Controlled Analgesic Consumption: A Multimodel Regression Tree Approach.

    PubMed

    Hu, Yuh-Jyh; Ku, Tien-Hsiung; Yang, Yu-Hung; Shen, Jia-Ying

    2018-01-01

    Several factors contribute to individual variability in postoperative pain, therefore, individuals consume postoperative analgesics at different rates. Although many statistical studies have analyzed postoperative pain and analgesic consumption, most have identified only the correlation and have not subjected the statistical model to further tests in order to evaluate its predictive accuracy. In this study involving 3052 patients, a multistrategy computational approach was developed for analgesic consumption prediction. This approach uses data on patient-controlled analgesia demand behavior over time and combines clustering, classification, and regression to mitigate the limitations of current statistical models. Cross-validation results indicated that the proposed approach significantly outperforms various existing regression methods. Moreover, a comparison between the predictions by anesthesiologists and medical specialists and those of the computational approach for an independent test data set of 60 patients further evidenced the superiority of the computational approach in predicting analgesic consumption because it produced markedly lower root mean squared errors.

  19. Context-Dependent Piano Music Transcription With Convolutional Sparse Coding

    DOE PAGES

    Cogliati, Andrea; Duan, Zhiyao; Wohlberg, Brendt

    2016-08-04

    This study presents a novel approach to automatic transcription of piano music in a context-dependent setting. This approach employs convolutional sparse coding to approximate the music waveform as the summation of piano note waveforms (dictionary elements) convolved with their temporal activations (onset transcription). The piano note waveforms are pre-recorded for the specific piano to be transcribed in the specific environment. During transcription, the note waveforms are fixed and their temporal activations are estimated and post-processed to obtain the pitch and onset transcription. This approach works in the time domain, models temporal evolution of piano notes, and estimates pitches and onsetsmore » simultaneously in the same framework. Finally, experiments show that it significantly outperforms a state-of-the-art music transcription method trained in the same context-dependent setting, in both transcription accuracy and time precision, in various scenarios including synthetic, anechoic, noisy, and reverberant environments.« less

  20. Ensemble Linear Neighborhood Propagation for Predicting Subchloroplast Localization of Multi-Location Proteins.

    PubMed

    Wan, Shibiao; Mak, Man-Wai; Kung, Sun-Yuan

    2016-12-02

    In the postgenomic era, the number of unreviewed protein sequences is remarkably larger and grows tremendously faster than that of reviewed ones. However, existing methods for protein subchloroplast localization often ignore the information from these unlabeled proteins. This paper proposes a multi-label predictor based on ensemble linear neighborhood propagation (LNP), namely, LNP-Chlo, which leverages hybrid sequence-based feature information from both labeled and unlabeled proteins for predicting localization of both single- and multi-label chloroplast proteins. Experimental results on a stringent benchmark dataset and a novel independent dataset suggest that LNP-Chlo performs at least 6% (absolute) better than state-of-the-art predictors. This paper also demonstrates that ensemble LNP significantly outperforms LNP based on individual features. For readers' convenience, the online Web server LNP-Chlo is freely available at http://bioinfo.eie.polyu.edu.hk/LNPChloServer/ .

  1. Operator selection for unmanned aerial systems: comparing video game players and pilots.

    PubMed

    McKinley, R Andy; McIntire, Lindsey K; Funke, Margaret A

    2011-06-01

    Popular unmanned aerial system (UAS) platforms such as the MQ-1 Predator and MQ-9 Reaper have experienced accelerated operations tempos that have outpaced current operator training regimens, leading to a shortage of qualified UAS operators. To find a surrogate to replace pilots of manned aircraft as UAS operators, this study evaluated video game players (VGPs), pilots, and a control group on a set of UAS operation relevant cognitive tasks. There were 30 participants who volunteered for this study and were divided into 3 groups: experienced pilots (P), experienced VGPs, and a control group (C). Each was trained on eight cognitive performance tasks relevant to unmanned flight tasks. The results indicated that pilots significantly outperform the VGP and control groups on multi-attribute cognitive tasks (Tank mean: VGP = 465 +/- 1.046 vs. P = 203 +/- 0.237 vs. C = 351 +/- 0.601). However, the VGPs outperformed pilots on cognitive tests related to visually acquiring, identifying, and tracking targets (final score: VGP = 594.28 +/- 8.708 vs. P = 563.33 +/- 8.787 vs. C = 568.21 +/- 8.224). Likewise, both VGPs and pilots performed similarly on the UAS landing task, but outperformed the control group (glide slope: VGP = 40.982 +/- 3.244 vs. P = 30.461 +/- 2.251 vs. C = 57.060 +/- 4.407). Cognitive skills learned in video game play may transfer to novel environments and improve performance in UAS tasks over individuals with no video game experience.

  2. LFNet: A Novel Bidirectional Recurrent Convolutional Neural Network for Light-Field Image Super-Resolution.

    PubMed

    Wang, Yunlong; Liu, Fei; Zhang, Kunbo; Hou, Guangqi; Sun, Zhenan; Tan, Tieniu

    2018-09-01

    The low spatial resolution of light-field image poses significant difficulties in exploiting its advantage. To mitigate the dependency of accurate depth or disparity information as priors for light-field image super-resolution, we propose an implicitly multi-scale fusion scheme to accumulate contextual information from multiple scales for super-resolution reconstruction. The implicitly multi-scale fusion scheme is then incorporated into bidirectional recurrent convolutional neural network, which aims to iteratively model spatial relations between horizontally or vertically adjacent sub-aperture images of light-field data. Within the network, the recurrent convolutions are modified to be more effective and flexible in modeling the spatial correlations between neighboring views. A horizontal sub-network and a vertical sub-network of the same network structure are ensembled for final outputs via stacked generalization. Experimental results on synthetic and real-world data sets demonstrate that the proposed method outperforms other state-of-the-art methods by a large margin in peak signal-to-noise ratio and gray-scale structural similarity indexes, which also achieves superior quality for human visual systems. Furthermore, the proposed method can enhance the performance of light field applications such as depth estimation.

  3. Compact Termination for Structural Soft-goods

    NASA Technical Reports Server (NTRS)

    Wilkes, Robert, Jr.

    2013-01-01

    Glass fiber is unique in its ability to withstand atomic oxygen and ultraviolet radiation in-space environments. However, glass fiber is also difficult to terminate by traditional methods without decreasing its strength significantly. Glass fiber products are especially sensitive to bend radius, and do not work very well with traditional 'sewn loop on pin' type connections. As with most composites, getting applied loads from a metallic structure into the webbing without stress concentrations is the key to a successful design. A potted end termination has been shown in some preliminary work to out-perform traditional termination methods. It was proposed to conduct a series of tensile tests on structural webbing or cord to determine the optimum potting geometry, and to then be able to estimate a weight and volume savings over traditional sewn-overa- pin connections. During the course of the investigation into potted end terminations for glass fiber webbing, a new and innovative connection was developed that has lower weight, reduced fabrication time, and superior thermal tolerance over the metallic end terminations that were to be optimized in the original proposal. This end termination essentially transitions the flexible glass fiber webbing into a rigid fiberglass termination, which can be bolted/fastened with traditional methods

  4. Discrete False-Discovery Rate Improves Identification of Differentially Abundant Microbes.

    PubMed

    Jiang, Lingjing; Amir, Amnon; Morton, James T; Heller, Ruth; Arias-Castro, Ery; Knight, Rob

    2017-01-01

    Differential abundance testing is a critical task in microbiome studies that is complicated by the sparsity of data matrices. Here we adapt for microbiome studies a solution from the field of gene expression analysis to produce a new method, discrete false-discovery rate (DS-FDR), that greatly improves the power to detect differential taxa by exploiting the discreteness of the data. Additionally, DS-FDR is relatively robust to the number of noninformative features, and thus removes the problem of filtering taxonomy tables by an arbitrary abundance threshold. We show by using a combination of simulations and reanalysis of nine real-world microbiome data sets that this new method outperforms existing methods at the differential abundance testing task, producing a false-discovery rate that is up to threefold more accurate, and halves the number of samples required to find a given difference (thus increasing the efficiency of microbiome experiments considerably). We therefore expect DS-FDR to be widely applied in microbiome studies. IMPORTANCE DS-FDR can achieve higher statistical power to detect significant findings in sparse and noisy microbiome data compared to the commonly used Benjamini-Hochberg procedure and other FDR-controlling procedures.

  5. Optimized SIFTFlow for registration of whole-mount histology to reference optical images

    PubMed Central

    Shojaii, Rushin; Martel, Anne L.

    2016-01-01

    Abstract. The registration of two-dimensional histology images to reference images from other modalities is an important preprocessing step in the reconstruction of three-dimensional histology volumes. This is a challenging problem because of the differences in the appearances of histology images and other modalities, and the presence of large nonrigid deformations which occur during slide preparation. This paper shows the feasibility of using densely sampled scale-invariant feature transform (SIFT) features and a SIFTFlow deformable registration algorithm for coregistering whole-mount histology images with blockface optical images. We present a method for jointly optimizing the regularization parameters used by the SIFTFlow objective function and use it to determine the most appropriate values for the registration of breast lumpectomy specimens. We demonstrate that tuning the regularization parameters results in significant improvements in accuracy and we also show that SIFTFlow outperforms a previously described edge-based registration method. The accuracy of the histology images to blockface images registration using the optimized SIFTFlow method was assessed using an independent test set of images from five different lumpectomy specimens and the mean registration error was 0.32±0.22  mm. PMID:27774494

  6. Intervertebral disc detection in X-ray images using faster R-CNN.

    PubMed

    Ruhan Sa; Owens, William; Wiegand, Raymond; Studin, Mark; Capoferri, Donald; Barooha, Kenneth; Greaux, Alexander; Rattray, Robert; Hutton, Adam; Cintineo, John; Chaudhary, Vipin

    2017-07-01

    Automatic identification of specific osseous landmarks on the spinal radiograph can be used to automate calculations for correcting ligament instability and injury, which affect 75% of patients injured in motor vehicle accidents. In this work, we propose to use deep learning based object detection method as the first step towards identifying landmark points in lateral lumbar X-ray images. The significant breakthrough of deep learning technology has made it a prevailing choice for perception based applications, however, the lack of large annotated training dataset has brought challenges to utilizing the technology in medical image processing field. In this work, we propose to fine tune a deep network, Faster-RCNN, a state-of-the-art deep detection network in natural image domain, using small annotated clinical datasets. In the experiment we show that, by using only 81 lateral lumbar X-Ray training images, one can achieve much better performance compared to traditional sliding window detection method on hand crafted features. Furthermore, we fine-tuned the network using 974 training images and tested on 108 images, which achieved average precision of 0.905 with average computation time of 3 second per image, which greatly outperformed traditional methods in terms of accuracy and efficiency.

  7. Active link selection for efficient semi-supervised community detection

    NASA Astrophysics Data System (ADS)

    Yang, Liang; Jin, Di; Wang, Xiao; Cao, Xiaochun

    2015-03-01

    Several semi-supervised community detection algorithms have been proposed recently to improve the performance of traditional topology-based methods. However, most of them focus on how to integrate supervised information with topology information; few of them pay attention to which information is critical for performance improvement. This leads to large amounts of demand for supervised information, which is expensive or difficult to obtain in most fields. For this problem we propose an active link selection framework, that is we actively select the most uncertain and informative links for human labeling for the efficient utilization of the supervised information. We also disconnect the most likely inter-community edges to further improve the efficiency. Our main idea is that, by connecting uncertain nodes to their community hubs and disconnecting the inter-community edges, one can sharpen the block structure of adjacency matrix more efficiently than randomly labeling links as the existing methods did. Experiments on both synthetic and real networks demonstrate that our new approach significantly outperforms the existing methods in terms of the efficiency of using supervised information. It needs ~13% of the supervised information to achieve a performance similar to that of the original semi-supervised approaches.

  8. Active link selection for efficient semi-supervised community detection

    PubMed Central

    Yang, Liang; Jin, Di; Wang, Xiao; Cao, Xiaochun

    2015-01-01

    Several semi-supervised community detection algorithms have been proposed recently to improve the performance of traditional topology-based methods. However, most of them focus on how to integrate supervised information with topology information; few of them pay attention to which information is critical for performance improvement. This leads to large amounts of demand for supervised information, which is expensive or difficult to obtain in most fields. For this problem we propose an active link selection framework, that is we actively select the most uncertain and informative links for human labeling for the efficient utilization of the supervised information. We also disconnect the most likely inter-community edges to further improve the efficiency. Our main idea is that, by connecting uncertain nodes to their community hubs and disconnecting the inter-community edges, one can sharpen the block structure of adjacency matrix more efficiently than randomly labeling links as the existing methods did. Experiments on both synthetic and real networks demonstrate that our new approach significantly outperforms the existing methods in terms of the efficiency of using supervised information. It needs ~13% of the supervised information to achieve a performance similar to that of the original semi-supervised approaches. PMID:25761385

  9. Super-resolution method for face recognition using nonlinear mappings on coherent features.

    PubMed

    Huang, Hua; He, Huiting

    2011-01-01

    Low-resolution (LR) of face images significantly decreases the performance of face recognition. To address this problem, we present a super-resolution method that uses nonlinear mappings to infer coherent features that favor higher recognition of the nearest neighbor (NN) classifiers for recognition of single LR face image. Canonical correlation analysis is applied to establish the coherent subspaces between the principal component analysis (PCA) based features of high-resolution (HR) and LR face images. Then, a nonlinear mapping between HR/LR features can be built by radial basis functions (RBFs) with lower regression errors in the coherent feature space than in the PCA feature space. Thus, we can compute super-resolved coherent features corresponding to an input LR image according to the trained RBF model efficiently and accurately. And, face identity can be obtained by feeding these super-resolved features to a simple NN classifier. Extensive experiments on the Facial Recognition Technology, University of Manchester Institute of Science and Technology, and Olivetti Research Laboratory databases show that the proposed method outperforms the state-of-the-art face recognition algorithms for single LR image in terms of both recognition rate and robustness to facial variations of pose and expression.

  10. SPONGY (SPam ONtoloGY): Email Classification Using Two-Level Dynamic Ontology

    PubMed Central

    2014-01-01

    Email is one of common communication methods between people on the Internet. However, the increase of email misuse/abuse has resulted in an increasing volume of spam emails over recent years. An experimental system has been designed and implemented with the hypothesis that this method would outperform existing techniques, and the experimental results showed that indeed the proposed ontology-based approach improves spam filtering accuracy significantly. In this paper, two levels of ontology spam filters were implemented: a first level global ontology filter and a second level user-customized ontology filter. The use of the global ontology filter showed about 91% of spam filtered, which is comparable with other methods. The user-customized ontology filter was created based on the specific user's background as well as the filtering mechanism used in the global ontology filter creation. The main contributions of the paper are (1) to introduce an ontology-based multilevel filtering technique that uses both a global ontology and an individual filter for each user to increase spam filtering accuracy and (2) to create a spam filter in the form of ontology, which is user-customized, scalable, and modularized, so that it can be embedded to many other systems for better performance. PMID:25254240

  11. Improved Modeling Tools Development for High Penetration Solar

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Washom, Byron; Meagher, Kevin

    2014-12-11

    One of the significant objectives of the High Penetration solar research is to help the DOE understand, anticipate, and minimize grid operation impacts as more solar resources are added to the electric power system. For Task 2.2, an effective, reliable approach to predicting solar energy availability for energy generation forecasts using the University of California, San Diego (UCSD) Sky Imager technology has been demonstrated. Granular cloud and ramp forecasts for the next 5 to 20 minutes over an area of 10 square miles were developed. Sky images taken every 30 seconds are processed to determine cloud locations and cloud motionmore » vectors yielding future cloud shadow locations respective to distributed generation or utility solar power plants in the area. The performance of the method depends on cloud characteristics. On days with more advective cloud conditions, the developed method outperforms persistence forecasts by up to 30% (based on mean absolute error). On days with dynamic conditions, the method performs worse than persistence. Sky Imagers hold promise for ramp forecasting and ramp mitigation in conjunction with inverter controls and energy storage. The pre-commercial Sky Imager solar forecasting algorithm was documented with licensing information and was a Sunshot website highlight.« less

  12. SPONGY (SPam ONtoloGY): email classification using two-level dynamic ontology.

    PubMed

    Youn, Seongwook

    2014-01-01

    Email is one of common communication methods between people on the Internet. However, the increase of email misuse/abuse has resulted in an increasing volume of spam emails over recent years. An experimental system has been designed and implemented with the hypothesis that this method would outperform existing techniques, and the experimental results showed that indeed the proposed ontology-based approach improves spam filtering accuracy significantly. In this paper, two levels of ontology spam filters were implemented: a first level global ontology filter and a second level user-customized ontology filter. The use of the global ontology filter showed about 91% of spam filtered, which is comparable with other methods. The user-customized ontology filter was created based on the specific user's background as well as the filtering mechanism used in the global ontology filter creation. The main contributions of the paper are (1) to introduce an ontology-based multilevel filtering technique that uses both a global ontology and an individual filter for each user to increase spam filtering accuracy and (2) to create a spam filter in the form of ontology, which is user-customized, scalable, and modularized, so that it can be embedded to many other systems for better performance.

  13. Hyperspectral and multispectral data fusion based on linear-quadratic nonnegative matrix factorization

    NASA Astrophysics Data System (ADS)

    Benhalouche, Fatima Zohra; Karoui, Moussa Sofiane; Deville, Yannick; Ouamri, Abdelaziz

    2017-04-01

    This paper proposes three multisharpening approaches to enhance the spatial resolution of urban hyperspectral remote sensing images. These approaches, related to linear-quadratic spectral unmixing techniques, use a linear-quadratic nonnegative matrix factorization (NMF) multiplicative algorithm. These methods begin by unmixing the observable high-spectral/low-spatial resolution hyperspectral and high-spatial/low-spectral resolution multispectral images. The obtained high-spectral/high-spatial resolution features are then recombined, according to the linear-quadratic mixing model, to obtain an unobservable multisharpened high-spectral/high-spatial resolution hyperspectral image. In the first designed approach, hyperspectral and multispectral variables are independently optimized, once they have been coherently initialized. These variables are alternately updated in the second designed approach. In the third approach, the considered hyperspectral and multispectral variables are jointly updated. Experiments, using synthetic and real data, are conducted to assess the efficiency, in spatial and spectral domains, of the designed approaches and of linear NMF-based approaches from the literature. Experimental results show that the designed methods globally yield very satisfactory spectral and spatial fidelities for the multisharpened hyperspectral data. They also prove that these methods significantly outperform the used literature approaches.

  14. A new distributed systems scheduling algorithm: a swarm intelligence approach

    NASA Astrophysics Data System (ADS)

    Haghi Kashani, Mostafa; Sarvizadeh, Raheleh; Jameii, Mahdi

    2011-12-01

    The scheduling problem in distributed systems is known as an NP-complete problem, and methods based on heuristic or metaheuristic search have been proposed to obtain optimal and suboptimal solutions. The task scheduling is a key factor for distributed systems to gain better performance. In this paper, an efficient method based on memetic algorithm is developed to solve the problem of distributed systems scheduling. With regard to load balancing efficiently, Artificial Bee Colony (ABC) has been applied as local search in the proposed memetic algorithm. The proposed method has been compared to existing memetic-Based approach in which Learning Automata method has been used as local search. The results demonstrated that the proposed method outperform the above mentioned method in terms of communication cost.

  15. Auditory Learning Using a Portable Real-Time Vocoder: Preliminary Findings

    PubMed Central

    Pisoni, David B.

    2015-01-01

    Purpose Although traditional study of auditory training has been in controlled laboratory settings, interest has been increasing in more interactive options. The authors examine whether such interactive training can result in short-term perceptual learning, and the range of perceptual skills it impacts. Method Experiments 1 (N = 37) and 2 (N = 21) used pre- and posttest measures of speech and nonspeech recognition to find evidence of learning (within subject) and to compare the effects of 3 kinds of training (between subject) on the perceptual abilities of adults with normal hearing listening to simulations of cochlear implant processing. Subjects were given interactive, standard lab-based, or control training experience for 1 hr between the pre- and posttest tasks (unique sets across Experiments 1 & 2). Results Subjects receiving interactive training showed significant learning on sentence recognition in quiet task (Experiment 1), outperforming controls but not lab-trained subjects following training. Training groups did not differ significantly on any other task, even those directly involved in the interactive training experience. Conclusions Interactive training has the potential to produce learning in 1 domain (sentence recognition in quiet), but the particulars of the present training method (short duration, high complexity) may have limited benefits to this single criterion task. PMID:25674884

  16. iPHLoc-ES: Identification of bacteriophage protein locations using evolutionary and structural features.

    PubMed

    Shatabda, Swakkhar; Saha, Sanjay; Sharma, Alok; Dehzangi, Abdollah

    2017-12-21

    Bacteriophage proteins are viruses that can significantly impact on the functioning of bacteria and can be used in phage based therapy. The functioning of Bacteriophage in the host bacteria depends on its location in those host cells. It is very important to know the subcellular location of the phage proteins in a host cell in order to understand their working mechanism. In this paper, we propose iPHLoc-ES, a prediction method for subcellular localization of bacteriophage proteins. We aim to solve two problems: discriminating between host located and non-host located phage proteins and discriminating between the locations of host located protein in a host cell (membrane or cytoplasm). To do this, we extract sets of evolutionary and structural features of phage protein and employ Support Vector Machine (SVM) as our classifier. We also use recursive feature elimination (RFE) to reduce the number of features for effective prediction. On standard dataset using standard evaluation criteria, our method significantly outperforms the state-of-the-art predictor. iPHLoc-ES is readily available to use as a standalone tool from: https://github.com/swakkhar/iPHLoc-ES/ and as a web application from: http://brl.uiu.ac.bd/iPHLoc-ES/. Copyright © 2017 Elsevier Ltd. All rights reserved.

  17. Efficient use of unlabeled data for protein sequence classification: a comparative study.

    PubMed

    Kuksa, Pavel; Huang, Pai-Hsi; Pavlovic, Vladimir

    2009-04-29

    Recent studies in computational primary protein sequence analysis have leveraged the power of unlabeled data. For example, predictive models based on string kernels trained on sequences known to belong to particular folds or superfamilies, the so-called labeled data set, can attain significantly improved accuracy if this data is supplemented with protein sequences that lack any class tags-the unlabeled data. In this study, we present a principled and biologically motivated computational framework that more effectively exploits the unlabeled data by only using the sequence regions that are more likely to be biologically relevant for better prediction accuracy. As overly-represented sequences in large uncurated databases may bias the estimation of computational models that rely on unlabeled data, we also propose a method to remove this bias and improve performance of the resulting classifiers. Combined with state-of-the-art string kernels, our proposed computational framework achieves very accurate semi-supervised protein remote fold and homology detection on three large unlabeled databases. It outperforms current state-of-the-art methods and exhibits significant reduction in running time. The unlabeled sequences used under the semi-supervised setting resemble the unpolished gemstones; when used as-is, they may carry unnecessary features and hence compromise the classification accuracy but once cut and polished, they improve the accuracy of the classifiers considerably.

  18. Numerical study on simultaneous emission and transmission tomography in the MRI framework

    NASA Astrophysics Data System (ADS)

    Gjesteby, Lars; Cong, Wenxiang; Wang, Ge

    2017-09-01

    Multi-modality imaging methods are instrumental for advanced diagnosis and therapy. Specifically, a hybrid system that combines computed tomography (CT), nuclear imaging, and magnetic resonance imaging (MRI) will be a Holy Grail of medical imaging, delivering complementary structural/morphological, functional, and molecular information for precision medicine. A novel imaging method was recently demonstrated that takes advantage of radiotracer polarization to combine MRI principles with nuclear imaging. This approach allows the concentration of a polarized Υ-ray emitting radioisotope to be imaged with MRI resolution potentially outperforming the standard nuclear imaging mode at a sensitivity significantly higher than that of MRI. In our work, we propose to acquire MRI-modulated nuclear data for simultaneous image reconstruction of both emission and transmission parameters, suggesting the potential for simultaneous CT-SPECT-MRI. The synchronized diverse datasets allow excellent spatiotemporal registration and unique insight into physiological and pathological features. Here we describe the methodology involving the system design with emphasis on the formulation for tomographic images, even when significant radiotracer signals are limited to a region of interest (ROI). Initial numerical results demonstrate the feasibility of our approach for reconstructing concentration and attenuation images through a head phantom with various radio-labeled ROIs. Additional considerations regarding the radioisotope characteristics are also discussed.

  19. Accurate 3-D Profile Extraction of Skull Bone Using an Ultrasound Matrix Array.

    PubMed

    Hajian, Mehdi; Gaspar, Robert; Maev, Roman Gr

    2017-12-01

    The present study investigates the feasibility, accuracy, and precision of 3-D profile extraction of the human skull bone using a custom-designed ultrasound matrix transducer in Pulse-Echo. Due to the attenuative scattering properties of the skull, the backscattered echoes from the inner surface of the skull are severely degraded, attenuated, and at some points overlapped. Furthermore, the speed of sound (SOS) in the skull varies significantly in different zones and also from case to case; if considered constant, it introduces significant error to the profile measurement. A new method for simultaneous estimation of the skull profiles and the sound speed value is presented. The proposed method is a two-folded procedure: first, the arrival times of the backscattered echoes from the skull bone are estimated using multi-lag phase delay (MLPD) and modified space alternating generalized expectation maximization (SAGE) algorithms. Next, these arrival times are fed into an adaptive sound speed estimation algorithm to compute the optimal SOS value and subsequently, the skull bone thickness. For quantitative evaluation, the estimated bone phantom thicknesses were compared with the mechanical measurements. The accuracies of the bone thickness measurements using MLPD and modified SAGE algorithms combined with the adaptive SOS estimation were 7.93% and 4.21%, respectively. These values were 14.44% and 10.75% for the autocorrelation and cross-correlation methods. Additionally, the Bland-Altman plots showed the modified SAGE outperformed the other methods with -0.35 and 0.44 mm limits of agreement. No systematic error that could be related to the skull bone thickness was observed for this method.

  20. DNA Barcoding of Recently Diverged Species: Relative Performance of Matching Methods

    PubMed Central

    van Velzen, Robin; Weitschek, Emanuel; Felici, Giovanni; Bakker, Freek T.

    2012-01-01

    Recently diverged species are challenging for identification, yet they are frequently of special interest scientifically as well as from a regulatory perspective. DNA barcoding has proven instrumental in species identification, especially in insects and vertebrates, but for the identification of recently diverged species it has been reported to be problematic in some cases. Problems are mostly due to incomplete lineage sorting or simply lack of a ‘barcode gap’ and probably related to large effective population size and/or low mutation rate. Our objective was to compare six methods in their ability to correctly identify recently diverged species with DNA barcodes: neighbor joining and parsimony (both tree-based), nearest neighbor and BLAST (similarity-based), and the diagnostic methods DNA-BAR, and BLOG. We analyzed simulated data assuming three different effective population sizes as well as three selected empirical data sets from published studies. Results show, as expected, that success rates are significantly lower for recently diverged species (∼75%) than for older species (∼97%) (P<0.00001). Similarity-based and diagnostic methods significantly outperform tree-based methods, when applied to simulated DNA barcode data (P<0.00001). The diagnostic method BLOG had highest correct query identification rate based on simulated (86.2%) as well as empirical data (93.1%), indicating that it is a consistently better method overall. Another advantage of BLOG is that it offers species-level information that can be used outside the realm of DNA barcoding, for instance in species description or molecular detection assays. Even though we can confirm that identification success based on DNA barcoding is generally high in our data, recently diverged species remain difficult to identify. Nevertheless, our results contribute to improved solutions for their accurate identification. PMID:22272356

  1. DNA barcoding of recently diverged species: relative performance of matching methods.

    PubMed

    van Velzen, Robin; Weitschek, Emanuel; Felici, Giovanni; Bakker, Freek T

    2012-01-01

    Recently diverged species are challenging for identification, yet they are frequently of special interest scientifically as well as from a regulatory perspective. DNA barcoding has proven instrumental in species identification, especially in insects and vertebrates, but for the identification of recently diverged species it has been reported to be problematic in some cases. Problems are mostly due to incomplete lineage sorting or simply lack of a 'barcode gap' and probably related to large effective population size and/or low mutation rate. Our objective was to compare six methods in their ability to correctly identify recently diverged species with DNA barcodes: neighbor joining and parsimony (both tree-based), nearest neighbor and BLAST (similarity-based), and the diagnostic methods DNA-BAR, and BLOG. We analyzed simulated data assuming three different effective population sizes as well as three selected empirical data sets from published studies. Results show, as expected, that success rates are significantly lower for recently diverged species (∼75%) than for older species (∼97%) (P<0.00001). Similarity-based and diagnostic methods significantly outperform tree-based methods, when applied to simulated DNA barcode data (P<0.00001). The diagnostic method BLOG had highest correct query identification rate based on simulated (86.2%) as well as empirical data (93.1%), indicating that it is a consistently better method overall. Another advantage of BLOG is that it offers species-level information that can be used outside the realm of DNA barcoding, for instance in species description or molecular detection assays. Even though we can confirm that identification success based on DNA barcoding is generally high in our data, recently diverged species remain difficult to identify. Nevertheless, our results contribute to improved solutions for their accurate identification.

  2. Effect of a six month yoga exercise intervention on fitness outcomes for breast cancer survivors

    PubMed Central

    Hughes, Daniel C.; Darby, Nydia; Gonzalez, Krystle; Boggess, Terri; Morris, Ruth M.; Ramirez, Amelie G.

    2016-01-01

    Yoga-based exercise has proven to be beneficial for practitioners, including cancer survivors. This study reports on the improvements in physical fitness for 20 breast cancer survivors who participated in a six-month yoga-based (YE) exercise program. Results are compared to a comprehensive exercise (CE) program group and a comparison (C) exercise group who chose their own exercises. “Pre” and “post” fitness assessments included measures of anthropometrics, cardiorespiratory capacity, strength and flexibility. Descriptive statistics, effect size (d), dependent sample ‘t’ tests for all outcome measures were calculated for the YE group. Significant improvements included: decreased % body fat (−3.00%, d = −0.44, p < 0.001); increased sit to stand leg strength repetitions (2.05, d = 0.48, p = 0.003); forward reach (3.59 cm, d = 0.61, p = 0.01); and right arm sagittal range of motion (6.50°, d = 0.92, p= 0.05). To compare YE outcomes with the other two groups, a one-way analysis of variance (ANOVA) was used. YE participants significantly outperformed C participants on “forward reach” (3.59 cm gained versus −2.44 cm lost), (p = 0.009) and outperformed CE participants (3.59 cm gained versus 1.35 cm gained), but not statistically significant. Our results support yoga-based exercise modified for breast cancer survivors as safe and effective. PMID:26395825

  3. Quantum coupled mutation finder: predicting functionally or structurally important sites in proteins using quantum Jensen-Shannon divergence and CUDA programming.

    PubMed

    Gültas, Mehmet; Düzgün, Güncel; Herzog, Sebastian; Jäger, Sven Joachim; Meckbach, Cornelia; Wingender, Edgar; Waack, Stephan

    2014-04-03

    The identification of functionally or structurally important non-conserved residue sites in protein MSAs is an important challenge for understanding the structural basis and molecular mechanism of protein functions. Despite the rich literature on compensatory mutations as well as sequence conservation analysis for the detection of those important residues, previous methods often rely on classical information-theoretic measures. However, these measures usually do not take into account dis/similarities of amino acids which are likely to be crucial for those residues. In this study, we present a new method, the Quantum Coupled Mutation Finder (QCMF) that incorporates significant dis/similar amino acid pair signals in the prediction of functionally or structurally important sites. The result of this study is twofold. First, using the essential sites of two human proteins, namely epidermal growth factor receptor (EGFR) and glucokinase (GCK), we tested the QCMF-method. The QCMF includes two metrics based on quantum Jensen-Shannon divergence to measure both sequence conservation and compensatory mutations. We found that the QCMF reaches an improved performance in identifying essential sites from MSAs of both proteins with a significantly higher Matthews correlation coefficient (MCC) value in comparison to previous methods. Second, using a data set of 153 proteins, we made a pairwise comparison between QCMF and three conventional methods. This comparison study strongly suggests that QCMF complements the conventional methods for the identification of correlated mutations in MSAs. QCMF utilizes the notion of entanglement, which is a major resource of quantum information, to model significant dissimilar and similar amino acid pair signals in the detection of functionally or structurally important sites. Our results suggest that on the one hand QCMF significantly outperforms the previous method, which mainly focuses on dissimilar amino acid signals, to detect essential sites in proteins. On the other hand, it is complementary to the existing methods for the identification of correlated mutations. The method of QCMF is computationally intensive. To ensure a feasible computation time of the QCMF's algorithm, we leveraged Compute Unified Device Architecture (CUDA).The QCMF server is freely accessible at http://qcmf.informatik.uni-goettingen.de/.

  4. Quantifying edge significance on maintaining global connectivity

    PubMed Central

    Qian, Yuhua; Li, Yebin; Zhang, Min; Ma, Guoshuai; Lu, Furong

    2017-01-01

    Global connectivity is a quite important issue for networks. The failures of some key edges may lead to breakdown of the whole system. How to find them will provide a better understanding on system robustness. Based on topological information, we propose an approach named LE (link entropy) to quantify the edge significance on maintaining global connectivity. Then we compare the LE with the other six acknowledged indices on the edge significance: the edge betweenness centrality, degree product, bridgeness, diffusion importance, topological overlap and k-path edge centrality. Experimental results show that the LE approach outperforms in quantifying edge significance on maintaining global connectivity. PMID:28349923

  5. Examination of Gender Differences on Cognitive and Motivational Factors That Influence 8th Graders' Science Achievement in Turkey

    ERIC Educational Resources Information Center

    Acar, Ömer; Türkmen, Lütfullah; Bilgin, Ahmet

    2015-01-01

    We examined the influence of several students' cognitive and motivational factors on 8th graders' science achievement and also gender differences on factors that significantly contribute to the science achievement model. A total of 99 girls and 83 boys responded all the instruments used in this study. Results showed that girls outperformed boys on…

  6. Gender differences in recognition of toy faces suggest a contribution of experience.

    PubMed

    Ryan, Kaitlin F; Gauthier, Isabel

    2016-12-01

    When there is a gender effect, women perform better then men in face recognition tasks. Prior work has not documented a male advantage on a face recognition task, suggesting that women may outperform men at face recognition generally either due to evolutionary reasons or the influence of social roles. Here, we question the idea that women excel at all face recognition and provide a proof of concept based on a face category for which men outperform women. We developed a test of face learning to measures individual differences with face categories for which men and women may differ in experience, using the faces of Barbie dolls and of Transformers. The results show a crossover interaction between subject gender and category, where men outperform women with Transformers' faces. We demonstrate that men can outperform women with some categories of faces, suggesting that explanations for a general face recognition advantage for women are in fact not needed. Copyright © 2016 Elsevier Ltd. All rights reserved.

  7. A novel Iterative algorithm to text segmentation for web born-digital images

    NASA Astrophysics Data System (ADS)

    Xu, Zhigang; Zhu, Yuesheng; Sun, Ziqiang; Liu, Zhen

    2015-07-01

    Since web born-digital images have low resolution and dense text atoms, text region over-merging and miss detection are still two open issues to be addressed. In this paper a novel iterative algorithm is proposed to locate and segment text regions. In each iteration, the candidate text regions are generated by detecting Maximally Stable Extremal Region (MSER) with diminishing thresholds, and categorized into different groups based on a new similarity graph, and the texted region groups are identified by applying several features and rules. With our proposed overlap checking method the final well-segmented text regions are selected from these groups in all iterations. Experiments have been carried out on the web born-digital image datasets used for robust reading competition in ICDAR 2011 and 2013, and the results demonstrate that our proposed scheme can significantly reduce both the number of over-merge regions and the lost rate of target atoms, and the overall performance outperforms the best compared with the methods shown in the two competitions in term of recall rate and f-score at the cost of slightly higher computational complexity.

  8. An Information Retrieval Approach for Robust Prediction of Road Surface States.

    PubMed

    Park, Jae-Hyung; Kim, Kwanho

    2017-01-28

    Recently, due to the increasing importance of reducing severe vehicle accidents on roads (especially on highways), the automatic identification of road surface conditions, and the provisioning of such information to drivers in advance, have recently been gaining significant momentum as a proactive solution to decrease the number of vehicle accidents. In this paper, we firstly propose an information retrieval approach that aims to identify road surface states by combining conventional machine-learning techniques and moving average methods. Specifically, when signal information is received from a radar system, our approach attempts to estimate the current state of the road surface based on the similar instances observed previously based on utilizing a given similarity function. Next, the estimated state is then calibrated by using the recently estimated states to yield both effective and robust prediction results. To validate the performances of the proposed approach, we established a real-world experimental setting on a section of actual highway in South Korea and conducted a comparison with the conventional approaches in terms of accuracy. The experimental results show that the proposed approach successfully outperforms the previously developed methods.

  9. Estimation of signal-dependent noise level function in transform domain via a sparse recovery model.

    PubMed

    Yang, Jingyu; Gan, Ziqiao; Wu, Zhaoyang; Hou, Chunping

    2015-05-01

    This paper proposes a novel algorithm to estimate the noise level function (NLF) of signal-dependent noise (SDN) from a single image based on the sparse representation of NLFs. Noise level samples are estimated from the high-frequency discrete cosine transform (DCT) coefficients of nonlocal-grouped low-variation image patches. Then, an NLF recovery model based on the sparse representation of NLFs under a trained basis is constructed to recover NLF from the incomplete noise level samples. Confidence levels of the NLF samples are incorporated into the proposed model to promote reliable samples and weaken unreliable ones. We investigate the behavior of the estimation performance with respect to the block size, sampling rate, and confidence weighting. Simulation results on synthetic noisy images show that our method outperforms existing state-of-the-art schemes. The proposed method is evaluated on real noisy images captured by three types of commodity imaging devices, and shows consistently excellent SDN estimation performance. The estimated NLFs are incorporated into two well-known denoising schemes, nonlocal means and BM3D, and show significant improvements in denoising SDN-polluted images.

  10. Remote Marker-Based Tracking for UAV Landing Using Visible-Light Camera Sensor.

    PubMed

    Nguyen, Phong Ha; Kim, Ki Wan; Lee, Young Won; Park, Kang Ryoung

    2017-08-30

    Unmanned aerial vehicles (UAVs), which are commonly known as drones, have proved to be useful not only on the battlefields where manned flight is considered too risky or difficult, but also in everyday life purposes such as surveillance, monitoring, rescue, unmanned cargo, aerial video, and photography. More advanced drones make use of global positioning system (GPS) receivers during the navigation and control loop which allows for smart GPS features of drone navigation. However, there are problems if the drones operate in heterogeneous areas with no GPS signal, so it is important to perform research into the development of UAVs with autonomous navigation and landing guidance using computer vision. In this research, we determined how to safely land a drone in the absence of GPS signals using our remote maker-based tracking algorithm based on the visible light camera sensor. The proposed method uses a unique marker designed as a tracking target during landing procedures. Experimental results show that our method significantly outperforms state-of-the-art object trackers in terms of both accuracy and processing time, and we perform test on an embedded system in various environments.

  11. Significantly improved precision of cell migration analysis in time-lapse video microscopy through use of a fully automated tracking system

    PubMed Central

    2010-01-01

    Background Cell motility is a critical parameter in many physiological as well as pathophysiological processes. In time-lapse video microscopy, manual cell tracking remains the most common method of analyzing migratory behavior of cell populations. In addition to being labor-intensive, this method is susceptible to user-dependent errors regarding the selection of "representative" subsets of cells and manual determination of precise cell positions. Results We have quantitatively analyzed these error sources, demonstrating that manual cell tracking of pancreatic cancer cells lead to mis-calculation of migration rates of up to 410%. In order to provide for objective measurements of cell migration rates, we have employed multi-target tracking technologies commonly used in radar applications to develop fully automated cell identification and tracking system suitable for high throughput screening of video sequences of unstained living cells. Conclusion We demonstrate that our automatic multi target tracking system identifies cell objects, follows individual cells and computes migration rates with high precision, clearly outperforming manual procedures. PMID:20377897

  12. Super-resolution for imagery from integrated microgrid polarimeters.

    PubMed

    Hardie, Russell C; LeMaster, Daniel A; Ratliff, Bradley M

    2011-07-04

    Imagery from microgrid polarimeters is obtained by using a mosaic of pixel-wise micropolarizers on a focal plane array (FPA). Each distinct polarization image is obtained by subsampling the full FPA image. Thus, the effective pixel pitch for each polarization channel is increased and the sampling frequency is decreased. As a result, aliasing artifacts from such undersampling can corrupt the true polarization content of the scene. Here we present the first multi-channel multi-frame super-resolution (SR) algorithms designed specifically for the problem of image restoration in microgrid polarization imagers. These SR algorithms can be used to address aliasing and other degradations, without sacrificing field of view or compromising optical resolution with an anti-aliasing filter. The new SR methods are designed to exploit correlation between the polarimetric channels. One of the new SR algorithms uses a form of regularized least squares and has an iterative solution. The other is based on the faster adaptive Wiener filter SR method. We demonstrate that the new multi-channel SR algorithms are capable of providing significant enhancement of polarimetric imagery and that they outperform their independent channel counterparts.

  13. Human voice quality measurement in noisy environments.

    PubMed

    Ueng, Shyh-Kuang; Luo, Cheng-Ming; Tsai, Tsung-Yu; Yeh, Hsuan-Chen

    2015-01-01

    Computerized acoustic voice measurement is essential for the diagnosis of vocal pathologies. Previous studies showed that ambient noises have significant influences on the accuracy of voice quality assessment. This paper presents a voice quality assessment system that can accurately measure qualities of voice signals, even though the input voice data are contaminated by low-frequency noises. The ambient noises in our living rooms and laboratories are collected and the frequencies of these noises are analyzed. Based on the analysis, a filter is designed to reduce noise level of the input voice signal. Then, improved numerical algorithms are employed to extract voice parameters from the voice signal to reveal the health of the voice signal. Compared with MDVP and Praat, the proposed method outperforms these two widely used programs in measuring fundamental frequency and harmonic-to-noise ratio, and its performance is comparable to these two famous programs in computing jitter and shimmer. The proposed voice quality assessment method is resistant to low-frequency noises and it can measure human voice quality in environments filled with noises from air-conditioners, ceiling fans and cooling fans of computers.

  14. Receptive fields selection for binary feature description.

    PubMed

    Fan, Bin; Kong, Qingqun; Trzcinski, Tomasz; Wang, Zhiheng; Pan, Chunhong; Fua, Pascal

    2014-06-01

    Feature description for local image patch is widely used in computer vision. While the conventional way to design local descriptor is based on expert experience and knowledge, learning-based methods for designing local descriptor become more and more popular because of their good performance and data-driven property. This paper proposes a novel data-driven method for designing binary feature descriptor, which we call receptive fields descriptor (RFD). Technically, RFD is constructed by thresholding responses of a set of receptive fields, which are selected from a large number of candidates according to their distinctiveness and correlations in a greedy way. Using two different kinds of receptive fields (namely rectangular pooling area and Gaussian pooling area) for selection, we obtain two binary descriptors RFDR and RFDG .accordingly. Image matching experiments on the well-known patch data set and Oxford data set demonstrate that RFD significantly outperforms the state-of-the-art binary descriptors, and is comparable with the best float-valued descriptors at a fraction of processing time. Finally, experiments on object recognition tasks confirm that both RFDR and RFDG successfully bridge the performance gap between binary descriptors and their floating-point competitors.

  15. Error mitigation for CCSD compressed imager data

    NASA Astrophysics Data System (ADS)

    Gladkova, Irina; Grossberg, Michael; Gottipati, Srikanth; Shahriar, Fazlul; Bonev, George

    2009-08-01

    To efficiently use the limited bandwidth available on the downlink from satellite to ground station, imager data is usually compressed before transmission. Transmission introduces unavoidable errors, which are only partially removed by forward error correction and packetization. In the case of the commonly used CCSD Rice-based compression, it results in a contiguous sequence of dummy values along scan lines in a band of the imager data. We have developed a method capable of using the image statistics to provide a principled estimate of the missing data. Our method outperforms interpolation yet can be performed fast enough to provide uninterrupted data flow. The estimation of the lost data provides significant value to end users who may use only part of the data, may not have statistical tools, or lack the expertise to mitigate the impact of the lost data. Since the locations of the lost data will be clearly marked as meta-data in the HDF or NetCDF header, experts who prefer to handle error mitigation themselves will be free to use or ignore our estimates as they see fit.

  16. GPURFSCREEN: a GPU based virtual screening tool using random forest classifier.

    PubMed

    Jayaraj, P B; Ajay, Mathias K; Nufail, M; Gopakumar, G; Jaleel, U C A

    2016-01-01

    In-silico methods are an integral part of modern drug discovery paradigm. Virtual screening, an in-silico method, is used to refine data models and reduce the chemical space on which wet lab experiments need to be performed. Virtual screening of a ligand data model requires large scale computations, making it a highly time consuming task. This process can be speeded up by implementing parallelized algorithms on a Graphical Processing Unit (GPU). Random Forest is a robust classification algorithm that can be employed in the virtual screening. A ligand based virtual screening tool (GPURFSCREEN) that uses random forests on GPU systems has been proposed and evaluated in this paper. This tool produces optimized results at a lower execution time for large bioassay data sets. The quality of results produced by our tool on GPU is same as that on a regular serial environment. Considering the magnitude of data to be screened, the parallelized virtual screening has a significantly lower running time at high throughput. The proposed parallel tool outperforms its serial counterpart by successfully screening billions of molecules in training and prediction phases.

  17. A single-layer network unsupervised feature learning method for white matter hyperintensity segmentation

    NASA Astrophysics Data System (ADS)

    Vijverberg, Koen; Ghafoorian, Mohsen; van Uden, Inge W. M.; de Leeuw, Frank-Erik; Platel, Bram; Heskes, Tom

    2016-03-01

    Cerebral small vessel disease (SVD) is a disorder frequently found among the old people and is associated with deterioration in cognitive performance, parkinsonism, motor and mood impairments. White matter hyperintensities (WMH) as well as lacunes, microbleeds and subcortical brain atrophy are part of the spectrum of image findings, related to SVD. Accurate segmentation of WMHs is important for prognosis and diagnosis of multiple neurological disorders such as MS and SVD. Almost all of the published (semi-)automated WMH detection models employ multiple complex hand-crafted features, which require in-depth domain knowledge. In this paper we propose to apply a single-layer network unsupervised feature learning (USFL) method to avoid hand-crafted features, but rather to automatically learn a more efficient set of features. Experimental results show that a computer aided detection system with a USFL system outperforms a hand-crafted approach. Moreover, since the two feature sets have complementary properties, a hybrid system that makes use of both hand-crafted and unsupervised learned features, shows a significant performance boost compared to each system separately, getting close to the performance of an independent human expert.

  18. Support vector machine with a Pearson VII function kernel for discriminating halophilic and non-halophilic proteins.

    PubMed

    Zhang, Guangya; Ge, Huihua

    2013-10-01

    Understanding of proteins adaptive to hypersaline environment and identifying them is a challenging task and would help to design stable proteins. Here, we have systematically analyzed the normalized amino acid compositions of 2121 halophilic and 2400 non-halophilic proteins. The results showed that halophilic protein contained more Asp at the expense of Lys, Ile, Cys and Met, fewer small and hydrophobic residues, and showed a large excess of acidic over basic amino acids. Then, we introduce a support vector machine method to discriminate the halophilic and non-halophilic proteins, by using a novel Pearson VII universal function based kernel. In the three validation check methods, it achieved an overall accuracy of 97.7%, 91.7% and 86.9% and outperformed other machine learning algorithms. We also address the influence of protein size on prediction accuracy and found the worse performance for small size proteins might be some significant residues (Cys and Lys) were missing in the proteins. Copyright © 2013 The Authors. Published by Elsevier Ltd.. All rights reserved.

  19. An Information Retrieval Approach for Robust Prediction of Road Surface States

    PubMed Central

    Park, Jae-Hyung; Kim, Kwanho

    2017-01-01

    Recently, due to the increasing importance of reducing severe vehicle accidents on roads (especially on highways), the automatic identification of road surface conditions, and the provisioning of such information to drivers in advance, have recently been gaining significant momentum as a proactive solution to decrease the number of vehicle accidents. In this paper, we firstly propose an information retrieval approach that aims to identify road surface states by combining conventional machine-learning techniques and moving average methods. Specifically, when signal information is received from a radar system, our approach attempts to estimate the current state of the road surface based on the similar instances observed previously based on utilizing a given similarity function. Next, the estimated state is then calibrated by using the recently estimated states to yield both effective and robust prediction results. To validate the performances of the proposed approach, we established a real-world experimental setting on a section of actual highway in South Korea and conducted a comparison with the conventional approaches in terms of accuracy. The experimental results show that the proposed approach successfully outperforms the previously developed methods. PMID:28134859

  20. Computer-aided classification of rheumatoid arthritis in finger joints using frequency domain optical tomography

    NASA Astrophysics Data System (ADS)

    Klose, C. D.; Kim, H. K.; Netz, U.; Blaschke, S.; Zwaka, P. A.; Mueller, G. A.; Beuthan, J.; Hielscher, A. H.

    2009-02-01

    Novel methods that can help in the diagnosis and monitoring of joint disease are essential for efficient use of novel arthritis therapies that are currently emerging. Building on previous studies that involved continuous wave imaging systems we present here first clinical data obtained with a new frequency-domain imaging system. Three-dimensional tomographic data sets of absorption and scattering coefficients were generated for 107 fingers. The data were analyzed using ANOVA, MANOVA, Discriminant Analysis DA, and a machine-learning algorithm that is based on self-organizing mapping (SOM) for clustering data in 2-dimensional parameter spaces. Overall we found that the SOM algorithm outperforms the more traditional analysis methods in terms of correctly classifying finger joints. Using SOM, healthy and affected joints can now be separated with a sensitivity of 0.97 and specificity of 0.91. Furthermore, preliminary results suggest that if a combination of multiple image properties is used, statistical significant differences can be found between RA-affected finger joints that show different clinical features (e.g. effusion, synovitis or erosion).

Top