Churchill, Nathan W.; Oder, Anita; Abdi, Hervé; Tam, Fred; Lee, Wayne; Thomas, Christopher; Ween, Jon E.; Graham, Simon J.; Strother, Stephen C.
2016-01-01
Subject-specific artifacts caused by head motion and physiological noise are major confounds in BOLD fMRI analyses. However, there is little consensus on the optimal choice of data preprocessing steps to minimize these effects. To evaluate the effects of various preprocessing strategies, we present a framework which comprises a combination of (1) nonparametric testing including reproducibility and prediction metrics of the data-driven NPAIRS framework (Strother et al. [2002]: NeuroImage 15:747–771), and (2) intersubject comparison of SPM effects, using DISTATIS, a three-way version of metric multidimensional scaling (Abdi et al. [2009]: NeuroImage 45:89–95). It is shown that the quality of brain activation maps may be significantly limited by sub-optimal choices of data preprocessing steps (or "pipeline") in a clinical task-design, an fMRI adaptation of the widely used Trail-Making Test. The relative importance of motion correction, physiological noise correction, motion parameter regression, and temporal detrending was examined for fMRI data acquired in young, healthy adults. Analysis performance and the quality of activation maps were evaluated based on Penalized Discriminant Analysis (PDA). The relative importance of different preprocessing steps was assessed by (1) a nonparametric Friedman rank test for fixed sets of preprocessing steps, applied to all subjects; and (2) evaluating pipelines chosen specifically for each subject. Results demonstrate that preprocessing choices have significant but subject-dependent effects, and that individually optimized pipelines may significantly improve the reproducibility of fMRI results over fixed pipelines. This was demonstrated by the detection of a significant interaction between motion parameter regression and physiological noise correction, even though the range of subject head motion was small across the group (≪ 1 voxel). Optimizing pipelines on an individual-subject basis also revealed brain activation patterns that were either weak or absent under fixed pipelines, which has implications for the overall interpretation of fMRI data and the relative importance of preprocessing methods. PMID:21455942
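The NPAIRS metrics referenced here reduce, at their core, to a split-half resampling loop: train an analysis model on each half of the data, score prediction on the held-out half, and score reproducibility as the correlation between the two activation maps. The sketch below is our own minimal illustration of that logic, not the authors' implementation; the `analysis` callable and its return convention are assumptions for the example.

```python
import numpy as np

def npairs_metrics(X, y, analysis, n_splits=50, seed=0):
    """Toy NPAIRS-style resampling: split scans in half, fit an analysis
    model on each half, then average prediction accuracy (P) on held-out
    halves and correlation of the two SPMs as reproducibility (R)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    P, R = [], []
    for _ in range(n_splits):
        idx = rng.permutation(n)
        a, b = idx[: n // 2], idx[n // 2 :]
        spm_a, pred_a = analysis(X[a], y[a])   # assumed to return (spm, predict_fn)
        spm_b, pred_b = analysis(X[b], y[b])
        P.append(0.5 * (np.mean(pred_a(X[b]) == y[b]) +
                        np.mean(pred_b(X[a]) == y[a])))
        R.append(np.corrcoef(spm_a, spm_b)[0, 1])
    return np.mean(P), np.mean(R)

def lda_like(X, y):
    """Toy 'analysis': mean-difference SPM plus a nearest-mean predictor."""
    w = X[y == 1].mean(0) - X[y == 0].mean(0)
    return w, lambda Z: (Z @ w > np.median(X @ w)).astype(int)

X = np.random.default_rng(1).standard_normal((40, 300))
y = np.repeat([0, 1], 20)
X[y == 1, :10] += 0.8                          # weak signal in 10 'voxels'
p_bar, r_bar = npairs_metrics(X, y, lda_like)
```

Pipelines are then compared by their position in this (P, R) plane, which is the quantity the Friedman rank test above is applied to.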
Schulze, H Georg; Turner, Robin F B
2015-06-01
High-throughput information extraction from large numbers of Raman spectra is becoming an increasingly taxing problem due to the proliferation of new applications enabled by advances in instrumentation. Fortunately, in many of these applications, the entire process can be automated, yielding reproducibly good results with significant time and cost savings. Information extraction consists of two stages, preprocessing and analysis. We focus here on the preprocessing stage, which typically involves several steps, such as calibration, background subtraction, baseline flattening, artifact removal, smoothing, and so on, before the resulting spectra can be further analyzed. Because the results of some of these steps can affect the performance of subsequent ones, attention must be given to the sequencing of steps, the compatibility of these sequences, and the propensity of each step to generate spectral distortions. We outline here important considerations to effect full automation of Raman spectral preprocessing: what is considered full automation; putative general principles to effect full automation; the proper sequencing of processing and analysis steps; conflicts and circularities arising from sequencing; and the need for, and approaches to, preprocessing quality control. These considerations are discussed and illustrated with biological and biomedical examples reflecting both successful and faulty preprocessing.
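The sequencing concern can be made concrete with a small pipeline sketch. This is a generic illustration under our own assumptions (median-filter despiking, a low-order polynomial baseline, Savitzky-Golay smoothing), not the authors' recommended recipe:

```python
import numpy as np
from scipy.signal import medfilt, savgol_filter

def remove_spikes(spectrum, kernel=5):
    """Median filter as a crude cosmic-ray despiker."""
    return medfilt(spectrum, kernel_size=kernel)

def subtract_baseline(spectrum, order=3):
    """Fit and remove a low-order polynomial baseline."""
    x = np.arange(len(spectrum))
    return spectrum - np.polyval(np.polyfit(x, spectrum, order), x)

def smooth(spectrum, window=11, polyorder=3):
    return savgol_filter(spectrum, window, polyorder)

# The order of this list encodes the sequencing decision discussed above.
PIPELINE = [remove_spikes, subtract_baseline, smooth]

def preprocess(spectrum, steps=PIPELINE):
    for step in steps:
        spectrum = step(spectrum)
    return spectrum
```

Reordering the list, for example smoothing before despiking, is exactly the kind of sequencing conflict at issue: smoothing spreads a cosmic-ray spike into a peak-like feature that later steps can no longer remove.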
Convolutional neural networks for vibrational spectroscopic data analysis.
Acquarelli, Jacopo; van Laarhoven, Twan; Gerretzen, Jan; Tran, Thanh N; Buydens, Lutgarde M C; Marchiori, Elena
2017-02-15
In this work we show that convolutional neural networks (CNNs) can be efficiently used to classify vibrational spectroscopic data and identify important spectral regions. CNNs are the current state-of-the-art in image classification and speech recognition and can learn interpretable representations of the data. These characteristics make CNNs a good candidate for reducing the need for preprocessing and for highlighting important spectral regions, both of which are crucial steps in the analysis of vibrational spectroscopic data. Chemometric analysis of vibrational spectroscopic data often relies on preprocessing methods involving baseline correction, scatter correction and noise removal, which are applied to the spectra prior to model building. Preprocessing is a critical step because, even in simple problems, using 'reasonable' preprocessing methods may decrease the performance of the final model. We develop a new CNN based method and provide accompanying, publicly available software. It is based on a simple CNN architecture with a single convolutional layer (a so-called shallow CNN). Our method outperforms standard classification algorithms used in chemometrics (e.g. PLS) in terms of accuracy when applied to non-preprocessed test data (86% average accuracy compared to the 62% achieved by PLS), and it achieves better performance even on preprocessed test data (96% average accuracy compared to the 89% achieved by PLS). For interpretability purposes, our method includes a procedure for finding important spectral regions, thereby facilitating qualitative interpretation of results.
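For a rough picture of what such a shallow architecture looks like, here is a minimal single-convolutional-layer network in PyTorch; the filter count, kernel size, and pooling are our illustrative guesses, not the published configuration:

```python
import torch
import torch.nn as nn

class ShallowSpectralCNN(nn.Module):
    """Single-convolutional-layer ('shallow') CNN for 1-D spectra."""
    def __init__(self, n_classes, n_filters=8, kernel=15):
        super().__init__()
        self.conv = nn.Conv1d(1, n_filters, kernel_size=kernel,
                              padding=kernel // 2)
        self.act = nn.ReLU()
        self.pool = nn.AdaptiveAvgPool1d(32)   # length-independent pooling
        self.fc = nn.Linear(n_filters * 32, n_classes)

    def forward(self, x):                      # x: (batch, n_points)
        z = x.unsqueeze(1)                     # -> (batch, 1, n_points)
        z = self.pool(self.act(self.conv(z)))
        return self.fc(z.flatten(1))

# The learned filters (model.conv.weight) can be inspected to see which
# spectral regions drive the classification, as the abstract describes.
model = ShallowSpectralCNN(n_classes=4)
logits = model(torch.randn(5, 700))            # 5 dummy 700-point spectra
```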
Muncy, Nathan M; Hedges-Muncy, Ariana M; Kirwan, C Brock
2017-01-01
Pre-processing MRI scans prior to performing volumetric analyses is common practice in MRI studies. As pre-processing steps adjust the voxel intensities, the space in which the scan exists, and the amount of data in the scan, it is possible that the steps have an effect on the volumetric output. To date, studies have compared between and not within pipelines, and so the impact of each step is unknown. This study aims to quantify the effects of pre-processing steps on volumetric measures in T1-weighted scans within a single pipeline. It was our hypothesis that pre-processing steps would significantly impact ROI volume estimations. One hundred fifteen participants from the OASIS dataset were used, where each participant contributed three scans. All scans were then pre-processed using a step-wise pipeline. Bilateral hippocampus, putamen, and middle temporal gyrus volume estimations were assessed following each successive step, and all data were processed by the same pipeline 5 times. Repeated-measures analyses tested for main effects of pipeline step, scan-rescan (for MRI scanner consistency), and repeated pipeline runs (for algorithmic consistency). A main effect of pipeline step was detected, and, interestingly, an interaction between pipeline step and ROI was found. No effect of either scan-rescan or repeated pipeline run was detected. We then supply a correction for noise in the data resulting from pre-processing. PMID:29023597
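The repeated-measures design described here maps directly onto standard tooling; a hypothetical sketch with statsmodels follows (column names, step labels, and the simulated volumes are ours, not the study's):

```python
import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Hypothetical long-format table: one hippocampal volume estimate per
# participant per pipeline step.
rng = np.random.default_rng(0)
steps = ["raw", "skullstrip", "normalize", "segment"]
df = pd.DataFrame(
    [(subj, step, 3500 + 40 * i + rng.normal(0, 25))
     for subj in range(20) for i, step in enumerate(steps)],
    columns=["subject", "step", "volume"],
)

# Repeated-measures ANOVA testing a main effect of pipeline step.
res = AnovaRM(df, depvar="volume", subject="subject", within=["step"]).fit()
print(res.anova_table)
```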
On-Board, Real-Time Preprocessing System for Optical Remote-Sensing Imagery
Qi, Baogui; Shi, Hao; Zhuang, Yin; Chen, He; Chen, Liang
2018-01-01
With the development of remote-sensing technology, optical remote-sensing imagery processing has played an important role in many application fields, such as geological exploration and natural disaster prevention. However, relative radiometric correction and geometric correction are key steps in preprocessing, because raw image data without preprocessing will perform poorly in applications. Traditionally, remote-sensing data are downlinked to the ground station, preprocessed, and distributed to users. This process generates long delays, which is a major bottleneck in real-time applications for remote-sensing data. Therefore, on-board, real-time image preprocessing is greatly desired. In this paper, a real-time processing architecture for on-board imagery preprocessing is proposed. First, a hierarchical optimization and mapping method is proposed to realize the preprocessing algorithm in a hardware structure, which can effectively reduce the computation burden of on-board processing. Second, a co-processing system using a field-programmable gate array (FPGA) and a digital signal processor (DSP; altogether, FPGA-DSP) based on this optimization is designed to realize real-time preprocessing. The experimental results demonstrate the potential application of our system to an on-board processor, for which resources and power consumption are limited. PMID:29693585
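For a sense of what the relative radiometric step computes, here is a toy per-detector linear correction in NumPy; the per-column gain/offset model is a common textbook form for push-broom sensors, not necessarily the one used on this system:

```python
import numpy as np

def relative_radiometric_correction(image, gains, offsets):
    """Per-detector (per-column) linear correction, DN' = gain * DN + offset,
    the streak-removing step an on-board preprocessor would apply before
    geometric correction. Coefficient values here are illustrative."""
    return image * gains[np.newaxis, :] + offsets[np.newaxis, :]

rng = np.random.default_rng(1)
raw = rng.integers(0, 1024, size=(512, 256)).astype(float)   # 256 detectors
gains = 1.0 + 0.01 * rng.standard_normal(256)                # small mismatches
offsets = np.zeros(256)
corrected = relative_radiometric_correction(raw, gains, offsets)
```

On the FPGA side this is exactly the kind of fixed, per-pixel arithmetic that maps well to a hardware pipeline, which is the point of the architecture described above.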
Sentiment analysis of feature ranking methods for classification accuracy
NASA Astrophysics Data System (ADS)
Joseph, Shashank; Mugauri, Calvin; Sumathy, S.
2017-11-01
Text pre-processing and feature selection are important and critical steps in text mining. Text pre-processing of large volumes of datasets is a difficult task, as unstructured raw data is converted into a structured format. Traditional methods of processing and weighting took much time and were less accurate. To overcome this challenge, feature ranking techniques have been devised. A feature set from text preprocessing is fed as input for feature selection. Feature selection helps improve text classification accuracy. Of the three feature selection categories available, the filter category is the focus. Five feature ranking methods, namely document frequency, standard deviation, information gain, chi-square, and weighted log-likelihood ratio, are analyzed.
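A filter-type chi-square ranking of text features can be sketched in a few lines with scikit-learn; the 20-newsgroups corpus here is a public stand-in, not the dataset used in the paper:

```python
import numpy as np
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import chi2

# Rank text features by the chi-square statistic, one of the filter-type
# methods named above.
data = fetch_20newsgroups(subset="train", categories=["sci.med", "rec.autos"])
vec = CountVectorizer(max_features=5000, stop_words="english")
counts = vec.fit_transform(data.data)          # pre-processing output

scores, _ = chi2(counts, data.target)
top = np.argsort(scores)[::-1][:10]
print(np.array(vec.get_feature_names_out())[top])  # ten highest-ranked terms
```

Document frequency and information gain rankings drop into the same slot, which is what makes the filter category convenient to compare.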
NASA Astrophysics Data System (ADS)
Sawall, Mathias; von Harbou, Erik; Moog, Annekathrin; Behrens, Richard; Schröder, Henning; Simoneau, Joël; Steimers, Ellen; Neymeyr, Klaus
2018-04-01
Spectral data preprocessing is an integral and sometimes inevitable part of chemometric analyses. For Nuclear Magnetic Resonance (NMR) spectra, a possible first preprocessing step is a phase correction, which is applied to the Fourier-transformed free induction decay (FID) signal. This preprocessing step can be followed by a separate baseline correction step. Especially when series of high-resolution spectra are considered, automated and computationally fast preprocessing routines are desirable. A new method is suggested that applies the phase and the baseline corrections simultaneously in an automated form without manual input, which distinguishes this work from other approaches. The underlying multi-objective optimization or Pareto optimization provides improved results compared to consecutively applied correction steps. The optimization process uses an objective function which applies strong penalty constraints and weaker regularization conditions. The new method includes an approach for the detection of zero baseline regions. The baseline correction uses a modified Whittaker smoother. The functionality of the new method is demonstrated for experimental NMR spectra. The results are verified against gravimetric data. The method is compared to alternative preprocessing tools. Additionally, the simultaneous correction method is compared to a consecutive application of the two correction steps.
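To make the ingredients concrete, below is a strongly simplified sketch combining a Whittaker-smoother baseline with a zero/first-order phase search under a negativity penalty. It is a stand-in for the paper's simultaneous Pareto formulation; the objective, penalty, and parameters are our own assumptions:

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve
from scipy.optimize import minimize

def whittaker(y, lam=1e6):
    """Whittaker smoother: argmin_z ||y - z||^2 + lam * ||D2 z||^2."""
    n = len(y)
    D = sparse.diags([1.0, -2.0, 1.0], [0, 1, 2], shape=(n - 2, n))
    A = sparse.csc_matrix(sparse.eye(n) + lam * (D.T @ D))
    return spsolve(A, y)

def correct(spectrum_ft):
    """Jointly search zero/first-order phase; the baseline for each trial
    phase is a Whittaker fit, and the objective penalizes signal dipping
    below the baseline (a strong constraint for absorptive spectra)."""
    n = len(spectrum_ft)
    k = np.arange(n) / n

    def objective(phi):
        real = (spectrum_ft * np.exp(-1j * (phi[0] + phi[1] * k))).real
        resid = real - whittaker(real)
        return np.sum(np.minimum(resid, 0.0) ** 2)

    best = minimize(objective, x0=[0.0, 0.0], method="Nelder-Mead")
    phased = (spectrum_ft * np.exp(-1j * (best.x[0] + best.x[1] * k))).real
    return phased - whittaker(phased)          # phased and baseline-corrected

t = np.arange(1024)
fid = np.exp(2j * np.pi * 0.12 * t - t / 200.0)
spec = np.fft.fft(fid) * np.exp(1j * (0.7 + 0.3 * np.arange(1024) / 1024))
flat = correct(spec)
```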
Devos, Olivier; Downey, Gerard; Duponchel, Ludovic
2014-04-01
Classification is an important task in chemometrics. For several years now, support vector machines (SVMs) have proven to be powerful for infrared spectral data classification. However, such methods require optimisation of parameters in order to control the risk of overfitting and the complexity of the boundary. Furthermore, it is established that the prediction ability of classification models can be improved using pre-processing in order to remove unwanted variance in the spectra. In this paper we propose a new methodology based on a genetic algorithm (GA) for the simultaneous optimisation of SVM parameters and pre-processing (GENOPT-SVM). The method has been tested for the discrimination of the geographical origin of Italian olive oil (Ligurian and non-Ligurian) on the basis of near infrared (NIR) or mid infrared (FTIR) spectra. Different classification models (PLS-DA, SVM with mean-centred data, GENOPT-SVM) have been tested and statistically compared using McNemar's statistical test. For the two datasets, SVM with optimised pre-processing gives models with higher accuracy than those obtained with PLS-DA on pre-processed data. In the case of the NIR dataset, most of this accuracy improvement (86.3% compared with 82.8% for PLS-DA) occurred using only a single pre-processing step. For the FTIR dataset, three optimised pre-processing steps are required to obtain an SVM model with a significant accuracy improvement (82.2%) compared to the one obtained with PLS-DA (78.6%). Furthermore, this study demonstrates that even SVM models have to be developed on the basis of well-corrected spectral data in order to obtain higher classification rates.
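The joint-optimisation idea, tuning pre-processing and SVM parameters together rather than sequentially, can be imitated with an exhaustive grid instead of a GA. The sketch below uses scikit-learn; the SNV transform and parameter ranges are our assumptions, and the grid search is a stand-in for the paper's GENOPT-SVM:

```python
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer, StandardScaler
from sklearn.svm import SVC

def snv(X):
    """Standard normal variate: per-spectrum centering and scaling."""
    return (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)

pipe = Pipeline([("pre", "passthrough"), ("svm", SVC(kernel="rbf"))])

# The grid couples the pre-processing choice with the SVM hyper-parameters,
# so both are optimized together instead of fixing pre-processing first.
grid = GridSearchCV(
    pipe,
    param_grid={
        "pre": ["passthrough", FunctionTransformer(snv), StandardScaler()],
        "svm__C": [1, 10, 100],
        "svm__gamma": ["scale", 0.01, 0.001],
    },
    cv=5,
)
# grid.fit(X_spectra, y_origin)   # e.g. NIR spectra vs. Ligurian / non-Ligurian
```

A GA explores the same search space stochastically, which scales better once several ordered pre-processing steps are optimised at once.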
The Influence of Preprocessing Steps on Graph Theory Measures Derived from Resting State fMRI
Gargouri, Fatma; Kallel, Fathi; Delphine, Sebastien; Ben Hamida, Ahmed; Lehéricy, Stéphane; Valabregue, Romain
2018-01-01
Resting state functional MRI (rs-fMRI) is an imaging technique that allows the spontaneous activity of the brain to be measured. Measures of functional connectivity highly depend on the quality of the BOLD signal data processing. In this study, our aim was to study the influence of preprocessing steps and their order of application on small-world topology and their efficiency in resting state fMRI data analysis using graph theory. We applied the most standard preprocessing steps: slice-timing, realign, smoothing, filtering, and the tCompCor method. In particular, we were interested in how preprocessing can retain the small-world economic properties and how to maximize the local and global efficiency of a network while minimizing the cost. Tests that we conducted in 54 healthy subjects showed that the choice and ordering of preprocessing steps impacted the graph measures. We found that the csr (where we applied realignment, smoothing, and tCompCor as a final step) and the scr (where we applied realignment, tCompCor and smoothing as a final step) strategies had the highest mean values of global efficiency (eg). Furthermore, we found that the fscr strategy (where we applied realignment, tCompCor, smoothing, and filtering as a final step), had the highest mean local efficiency (el) values. These results confirm that the graph theory measures of functional connectivity depend on the ordering of the processing steps, with the best results being obtained using smoothing and tCompCor as the final steps for global efficiency with additional filtering for local efficiency. PMID:29497372
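The efficiency measures compared in this study are available in standard graph libraries; a minimal sketch of computing them from ROI time series follows (the threshold value and dummy data are ours):

```python
import numpy as np
import networkx as nx

def efficiency_from_timeseries(ts, threshold=0.4):
    """Build a binary graph from ROI time series (one column per region)
    by thresholding the correlation matrix, then compute the global and
    local efficiency measures compared across preprocessing orders."""
    corr = np.corrcoef(ts.T)
    np.fill_diagonal(corr, 0.0)
    G = nx.from_numpy_array((np.abs(corr) > threshold).astype(int))
    return nx.global_efficiency(G), nx.local_efficiency(G)

ts = np.random.default_rng(0).standard_normal((200, 90))  # 200 volumes, 90 ROIs
eg, el = efficiency_from_timeseries(ts)
```

Because every preprocessing ordering (csr, scr, fscr, ...) produces a different cleaned `ts`, these two numbers become the dependent variables of the comparison.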
Preprocessing and Analysis of LC-MS-Based Proteomic Data
Tsai, Tsung-Heng; Wang, Minkun; Ressom, Habtom W.
2016-01-01
Liquid chromatography coupled with mass spectrometry (LC-MS) has been widely used for profiling protein expression levels. This chapter focuses on LC-MS data preprocessing, which is a crucial step in the analysis of LC-MS based proteomics. We provide a high-level overview, highlight associated challenges, and present a step-by-step example for the analysis of data from an LC-MS based untargeted proteomic study. Furthermore, key procedures and relevant issues with the subsequent analysis by multiple reaction monitoring (MRM) are discussed. PMID:26519169
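One representative preprocessing sub-step, peak detection on an extracted-ion chromatogram, can be sketched with SciPy; the SNR heuristic and the synthetic chromatogram are illustrative only, not the chapter's workflow:

```python
import numpy as np
from scipy.signal import find_peaks

def detect_xic_peaks(intensity, rt, snr=3.0, min_width=5):
    """Crude peak detection on one extracted-ion chromatogram: estimate the
    noise floor from the median intensity, then keep maxima that clear an
    SNR threshold and a minimum width in samples."""
    noise = np.median(intensity) + 1e-12
    peaks, props = find_peaks(intensity, height=snr * noise, width=min_width)
    return rt[peaks], props["peak_heights"]

rt = np.linspace(0, 30, 3000)                               # minutes
xic = (np.exp(-0.5 * ((rt - 12.4) / 0.05) ** 2) * 1e5
       + np.random.default_rng(1).exponential(50, rt.size))  # peak + noise
apex_rt, heights = detect_xic_peaks(xic, rt)
```

Real pipelines chain this with retention-time alignment and feature matching across runs, which is where most of the difficulty discussed in the chapter lies.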
Data Acquisition and Preprocessing in Studies on Humans: What Is Not Taught in Statistics Classes?
Zhu, Yeyi; Hernandez, Ladia M; Mueller, Peter; Dong, Yongquan; Forman, Michele R
2013-01-01
The aim of this paper is to address issues in research that may be missing from statistics classes and important for (bio-)statistics students. In the context of a case study, we discuss data acquisition and preprocessing steps that fill the gap between research questions posed by subject matter scientists and statistical methodology for formal inference. Issues include participant recruitment, data collection training and standardization, variable coding, data review and verification, data cleaning and editing, and documentation. Despite the critical importance of these details in research, most of these issues are rarely discussed in an applied statistics program. One reason for the lack of more formal training is the difficulty in addressing the many challenges that can possibly arise in the course of a study in a systematic way. This article can help to bridge this gap between research questions and formal statistical inference by using an illustrative case study for a discussion. We hope that reading and discussing this paper and practicing data preprocessing exercises will sensitize statistics students to these important issues and achieve optimal conduct, quality control, analysis, and interpretation of a study.
Preprocessing of A-scan GPR data based on energy features
NASA Astrophysics Data System (ADS)
Dogan, Mesut; Turhan-Sayan, Gonul
2016-05-01
There is an increasing demand for noninvasive real-time detection and classification of buried objects in various civil and military applications. The problem of detection and annihilation of landmines is particularly important due to strong safety concerns. The requirement for a fast real-time decision process is as important as the requirements for high detection rates and low false alarm rates. In this paper, we introduce and demonstrate a computationally simple, time-efficient, energy-based preprocessing approach that can be used in ground penetrating radar (GPR) applications to eliminate reflections from the air-ground boundary and to locate the buried objects, simultaneously, in one easy step. The instantaneous power signals, the total energy values and the cumulative energy curves are extracted from the A-scan GPR data. The cumulative energy curves, in particular, are shown to be useful to detect the presence and location of buried objects in a fast and simple way while preserving the spectral content of the original A-scan data for further steps of physics-based target classification. The proposed method is demonstrated using GPR data collected at the outdoor test lanes of IPA Defense, Ankara. Cylindrically shaped plastic containers were buried in fine-medium sand to simulate buried landmines. These plastic containers were half-filled with ammonium nitrate including metal pins. Results of this pilot study are highly promising and motivate further research into the use of energy-based preprocessing features in the landmine detection problem.
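The energy features described here are a one-liner each; a toy A-scan illustrates how the cumulative energy curve exposes both the ground bounce and a later target echo (the synthetic signal is ours, not IPA Defense data):

```python
import numpy as np

def cumulative_energy(ascan):
    """Instantaneous power, total energy and normalized cumulative energy
    curve of an A-scan; slope changes in the cumulative curve flag the
    air-ground bounce and later reflections from buried targets."""
    power = ascan.astype(float) ** 2          # instantaneous power
    energy = power.sum()                      # total energy
    return power, energy, np.cumsum(power) / energy

# Toy A-scan: a strong early ground bounce plus a weaker, later target echo.
t = np.arange(512)
ascan = (np.exp(-0.5 * ((t - 60) / 6.0) ** 2) * np.sin(0.9 * t)
         + 0.3 * np.exp(-0.5 * ((t - 300) / 8.0) ** 2) * np.sin(0.9 * t)
         + 0.02 * np.random.default_rng(3).standard_normal(t.size))
power, energy, curve = cumulative_energy(ascan)
late = np.argmax(curve > 0.95)   # sample index by which 95% of energy arrived
```

Because only energies are computed, the original A-scan is untouched and its spectral content remains available for the physics-based classification step the abstract mentions.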
DPPP: Default Pre-Processing Pipeline
NASA Astrophysics Data System (ADS)
van Diepen, Ger; Dijkema, Tammo Jan
2018-04-01
DPPP (Default Pre-Processing Pipeline, also referred to as NDPPP) reads and writes radio-interferometric data in the form of Measurement Sets, mainly those that are created by the LOFAR telescope. It goes through visibilities in time order and contains standard operations like averaging, phase-shifting and flagging bad stations. Between the steps in a pipeline, the data is not written to disk, making this tool suitable for operations where I/O dominates. More advanced procedures such as gain calibration are also included. Other computing steps can be provided by loading a shared library; currently supported external steps are the AOFlagger (ascl:1010.017) and a bridge that enables loading python steps.
Barry, Robert L.; Williams, Joy M.; Klassen, L. Martyn; Gallivan, Jason P.; Culham, Jody C.
2009-01-01
Blood-oxygenation-level-dependent (BOLD) functional magnetic resonance imaging (fMRI) is currently the dominant technique for non-invasive investigation of brain functions. One of the challenges with BOLD fMRI, particularly at high fields, is compensation for the effects of spatiotemporally varying magnetic field inhomogeneities (ΔB0) caused by normal subject respiration, and in some studies, movement of the subject during the scan to perform tasks related to the functional paradigm. The presence of ΔB0 during data acquisition distorts reconstructed images and introduces extraneous fluctuations in the fMRI time series that decrease the BOLD contrast-to-noise ratio. Optimization of the fMRI data-processing pipeline to compensate for geometric distortions is of paramount importance to ensure high quality of fMRI data. To investigate ΔB0 caused by subject movement, echo-planar imaging scans were collected with and without concurrent motion of a phantom arm. The phantom arm was constructed and moved by the experimenter to emulate forearm motions while subjects remained still and observed a visual stimulation paradigm. These data were then subjected to eight different combinations of preprocessing steps. The best preprocessing pipeline included navigator correction, a complex phase regressor, and spatial smoothing. The synergy between navigator correction and phase regression reduced geometric distortions better than either step in isolation, and preconditioned the data to make them more amenable to the benefits of spatial smoothing. The combination of these steps provided a 10% increase in t-statistics compared to only navigator correction and spatial smoothing, and reduced the noise and false activations in regions where no legitimate effects would occur. PMID:19695810
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fromm, Catherine
2015-08-20
Ptychography is an advanced diffraction-based imaging technique that can achieve resolution of 5 nm and below. It is done by scanning a sample through a beam of focused x-rays using discrete yet overlapping scan steps. Scattering data is collected on a CCD camera, and the phase of the scattered light is reconstructed with sophisticated iterative algorithms. Because the experimental setup is similar, ptychography setups can be created by retrofitting existing STXM beam lines with new hardware. The other challenge comes in the reconstruction of the collected scattering images. Scattering data must be adjusted and packaged with experimental parameters to calibrate the reconstruction software. The necessary pre-processing of data prior to reconstruction is unique to each beamline setup, and even the optical alignments used on that particular day. Pre-processing software must be developed to be flexible and efficient in order to allow experimenters appropriate control and freedom in the analysis of their hard-won data. This paper will describe the implementation of pre-processing software which successfully connects data collection steps to reconstruction steps, letting the user accomplish accurate and reliable ptychography.
Integrated Sensing Processor, Phase 2
2005-12-01
performance analysis for several baseline classifiers including neural nets, linear classifiers, and kNN classifiers. Use of CCDR as a preprocessing step... below the level of the benchmark non-linear classifier for this problem (kNN). Furthermore, the CCDR-preconditioned kNN achieved a 10% improvement over... the benchmark kNN without CCDR. Finally, we found an important connection between intrinsic dimension estimation via entropic graphs and the optimal...
NASA Astrophysics Data System (ADS)
Zhu, Zhe
2017-08-01
The free and open access to all archived Landsat images in 2008 has completely changed the way of using Landsat data. Many novel change detection algorithms based on Landsat time series have been developed. We present a comprehensive review of four important aspects of change detection studies based on Landsat time series: frequencies, preprocessing, algorithms, and applications. We observed the trend that the more recent the study, the higher the frequency of Landsat time series used. We reviewed a series of image preprocessing steps, including atmospheric correction, cloud and cloud shadow detection, and composite/fusion/metrics techniques. We divided all change detection algorithms into six categories: thresholding, differencing, segmentation, trajectory classification, statistical boundary, and regression. Within each category, six major characteristics of the different algorithms, such as frequency, change index, univariate/multivariate, online/offline, abrupt/gradual change, and sub-pixel/pixel/spatial, were analyzed. Moreover, some of the widely used change detection algorithms were also discussed. Finally, we reviewed different change detection applications by dividing them into two categories: change target and change agent detection.
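As an example of the 'differencing' algorithm category reviewed here, a hedged two-date NDVI differencing sketch (the threshold and simulated reflectances are illustrative):

```python
import numpy as np

def ndvi(red, nir):
    return (nir - red) / (nir + red + 1e-9)

def differencing_change(red_t1, nir_t1, red_t2, nir_t2, thresh=0.2):
    """'Differencing' change detection: a pixel is flagged when the NDVI
    change between two already-preprocessed (atmospherically corrected,
    cloud-masked) acquisitions exceeds a threshold."""
    delta = ndvi(red_t2, nir_t2) - ndvi(red_t1, nir_t1)
    return np.abs(delta) > thresh

rng = np.random.default_rng(0)
red1, nir1 = rng.uniform(0.02, 0.1, (2, 100, 100))
red2, nir2 = red1.copy(), nir1.copy()
nir2[40:60, 40:60] *= 0.4            # simulate vegetation loss in a patch
mask = differencing_change(red1, nir1, red2, nir2)
```

The preprocessing steps reviewed above matter precisely because uncorrected atmospheric or cloud effects show up in `delta` and masquerade as change.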
Prediction of carbonate rock type from NMR responses using data mining techniques
NASA Astrophysics Data System (ADS)
Gonçalves, Eduardo Corrêa; da Silva, Pablo Nascimento; Silveira, Carla Semiramis; Carneiro, Giovanna; Domingues, Ana Beatriz; Moss, Adam; Pritchard, Tim; Plastino, Alexandre; Azeredo, Rodrigo Bagueira de Vasconcellos
2017-05-01
Recent studies have indicated that the accurate identification of carbonate rock types in a reservoir can be employed as a preliminary step to enhance the effectiveness of petrophysical property modeling. Furthermore, rock typing activity has been shown to be of key importance in several steps of formation evaluation, such as the study of sedimentary series, reservoir zonation and well-to-well correlation. In this paper, a methodology based exclusively on the analysis of 1H-NMR (Nuclear Magnetic Resonance) relaxation responses - using data mining algorithms - is evaluated to perform the automatic classification of carbonate samples according to their rock type. We analyze the effectiveness of six different classification algorithms (k-NN, Naïve Bayes, C4.5, Random Forest, SMO and Multilayer Perceptron) and two data preprocessing strategies (discretization and feature selection). The dataset used in this evaluation is formed by 78 1H-NMR T2 distributions of fully brine-saturated rock samples from six different rock type classes. The experiments reveal that the combination of preprocessing strategies with classification algorithms is able to achieve a prediction accuracy of 97.4%.
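The six classifiers evaluated in the paper all have close scikit-learn analogues, so the comparison protocol can be sketched compactly; note that C4.5 and SMO are replaced by CART and SVC here, and the data are dummies rather than the 78 measured T2 distributions:

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier

classifiers = {
    "k-NN": KNeighborsClassifier(3),
    "Naive Bayes": GaussianNB(),
    "Decision tree": DecisionTreeClassifier(),   # CART as a C4.5 stand-in
    "Random Forest": RandomForestClassifier(),
    "SVM": SVC(),                                # SVC as an SMO stand-in
    "MLP": MLPClassifier(max_iter=2000),
}
X = np.random.default_rng(0).random((78, 64))    # dummy 64-bin T2 distributions
y = np.repeat(np.arange(6), 13)                  # six balanced rock-type classes
for name, clf in classifiers.items():
    print(name, cross_val_score(clf, X, y, cv=5).mean())
```

Discretization and feature selection, the two preprocessing strategies evaluated in the paper, would slot in as pipeline steps ahead of each classifier.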
Text block identification in restoration process of Javanese script damage
NASA Astrophysics Data System (ADS)
Himamunanto, A. R.; Setyowati, E.
2018-05-01
Generally, a sheet of a document contains two kinds of information objects, namely text and images. The text block area in a manuscript sheet is the vital object, because the restoration process is performed only on this object; text block (text area) identification is therefore an important preliminary step. This paper describes the steps leading to the restoration of damage in Javanese script manuscripts. The process stages are: pre-processing, text block identification, segmentation, damage identification, and restoration. Test results based on the input manuscript "Hamong Tani" show that the system works with a success rate of 82.07%.
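A classic way to identify text-block rows before segmentation is a projection profile; a minimal sketch follows (the thresholds are illustrative, and the paper's exact method may differ):

```python
import numpy as np

def text_block_rows(gray, ink_thresh=128, density=0.01):
    """Locate candidate text-block rows with a horizontal projection
    profile: binarize, count ink pixels per row, and keep runs whose ink
    density exceeds a threshold."""
    ink = gray < ink_thresh                      # dark pixels = ink
    profile = ink.mean(axis=1)                   # fraction of ink per row
    rows = profile > density
    # collapse boolean run-lengths into (start, stop) row intervals
    edges = np.flatnonzero(np.diff(rows.astype(int)))
    bounds = np.r_[0, edges + 1, len(rows)]
    return [(a, b) for a, b in zip(bounds[:-1], bounds[1:]) if rows[a]]

page = np.full((200, 120), 255, dtype=np.uint8)
page[30:50, 10:110] = 0                          # a fake line of script
print(text_block_rows(page))                     # -> [(30, 50)]
```

The same profile computed column-wise separates the text block from marginal images, giving the rectangular region that the later restoration stages operate on.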
Tsai, Tsung-Heng; Tadesse, Mahlet G.; Di Poto, Cristina; Pannell, Lewis K.; Mechref, Yehia; Wang, Yue; Ressom, Habtom W.
2013-01-01
Motivation: Liquid chromatography-mass spectrometry (LC-MS) has been widely used for profiling expression levels of biomolecules in various '-omic' studies including proteomics, metabolomics and glycomics. Appropriate LC-MS data preprocessing steps are needed to detect true differences between biological groups. Retention time (RT) alignment, which is required to ensure that ion intensity measurements among multiple LC-MS runs are comparable, is one of the most important yet challenging preprocessing steps. Current alignment approaches estimate RT variability using either single chromatograms or detected peaks, but do not simultaneously take into account the complementary information embedded in the entire LC-MS data. Results: We propose a Bayesian alignment model for LC-MS data analysis. The alignment model provides estimates of the RT variability along with uncertainty measures. The model enables integration of multiple sources of information, including internal standards and clustered chromatograms, in a mathematically rigorous framework. We apply the model to LC-MS metabolomic, proteomic and glycomic data. The performance of the model is evaluated based on ground-truth data, by measuring the coefficient of variation, RT difference across runs and peak-matching performance. We demonstrate that the Bayesian alignment model significantly improves RT alignment performance through appropriate integration of relevant information. Availability and implementation: MATLAB code, raw and preprocessed LC-MS data are available at http://omics.georgetown.edu/alignLCMS.html Contact: hwr@georgetown.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:24013927
An Automated, Adaptive Framework for Optimizing Preprocessing Pipelines in Task-Based Functional MRI
Churchill, Nathan W.; Spring, Robyn; Afshin-Pour, Babak; Dong, Fan; Strother, Stephen C.
2015-01-01
BOLD fMRI is sensitive to blood-oxygenation changes correlated with brain function; however, it is limited by relatively weak signal and significant noise confounds. Many preprocessing algorithms have been developed to control noise and improve signal detection in fMRI. Although the chosen set of preprocessing and analysis steps (the "pipeline") significantly affects signal detection, pipelines are rarely quantitatively validated in the neuroimaging literature, due to complex preprocessing interactions. This paper outlines and validates an adaptive resampling framework for evaluating and optimizing preprocessing choices by optimizing data-driven metrics of task prediction and spatial reproducibility. Compared to standard "fixed" preprocessing pipelines, this optimization approach significantly improves independent validation measures of within-subject test-retest reliability, between-subject activation overlap, and behavioural prediction accuracy. We demonstrate that preprocessing choices function as implicit model regularizers, and that improvements due to pipeline optimization generalize across a range of simple to complex experimental tasks and analysis models. Results are shown for brief scanning sessions (<3 minutes each), demonstrating that with pipeline optimization, it is possible to obtain reliable results and brain-behaviour correlations in relatively small datasets. PMID:26161667
Näreoja, Tuomas; Rosenholm, Jessica M; Lamminmäki, Urpo; Hänninen, Pekka E
2017-05-01
Thyrotropin or thyroid-stimulating hormone (TSH) is used as a marker for thyroid function. More precise and more sensitive immunoassays are needed to facilitate continuous monitoring of thyroid dysfunctions and to assess the efficacy of the selected therapy and dosage of medication. Moreover, most thyroid diseases are autoimmune diseases, making TSH assays very prone to immunoassay interferences due to autoantibodies in the sample matrix. We have developed a super-sensitive TSH immunoassay utilizing nanoparticle labels, with a detection limit of 60 nU L−1 in preprocessed serum samples, by reducing nonspecific binding. The developed preprocessing step by affinity purification removed interfering compounds and improved the recovery of spiked TSH from serum. The sensitivity enhancement was achieved by stabilization of the protein corona of the nanoparticle bioconjugates and a spot-coated configuration of the active solid phase that reduced sedimentation of the nanoparticle bioconjugates and their contact time with the antibody-coated solid phase, thus making use of the higher association rate of specific binding due to high-avidity nanoparticle bioconjugates. Graphical Abstract: We were able to decrease the lowest limit of detection and increase the sensitivity of the TSH immunoassay using Eu(III)-nanoparticles. The improvement was achieved by decreasing the binding time of the nanoparticle bioconjugates through a small capture area and fast circular rotation. Also, we applied a step to stabilize the protein corona of the nanoparticles and a serum-preprocessing step with a structurally related antibody.
NASA Astrophysics Data System (ADS)
Qin, Cheng-Zhi; Zhan, Lijun
2012-06-01
As one of the important tasks in digital terrain analysis, the calculation of flow accumulations from gridded digital elevation models (DEMs) usually involves two steps in a real application: (1) using an iterative DEM preprocessing algorithm to remove the depressions and flat areas commonly contained in real DEMs, and (2) using a recursive flow-direction algorithm to calculate the flow accumulation for every cell in the DEM. Because both algorithms are computationally intensive, quick calculation of the flow accumulations from a DEM (especially for a large area) presents a practical challenge to personal computer (PC) users. In recent years, rapid increases in the hardware capacity of the graphics processing units (GPUs) provided in modern PCs have made it possible to meet this challenge in a PC environment. Parallel computing on GPUs using a compute-unified-device-architecture (CUDA) programming model has been explored to speed up the execution of the single-flow-direction algorithm (SFD). However, the parallel implementation on a GPU of the multiple-flow-direction (MFD) algorithm, which generally performs better than the SFD algorithm, has not been reported. Moreover, GPU-based parallelization of the DEM preprocessing step in the flow-accumulation calculations has not been addressed. This paper proposes a parallel approach to calculate flow accumulations (including both iterative DEM preprocessing and a recursive MFD algorithm) on a CUDA-compatible GPU. For the parallelization of an MFD algorithm (MFD-md), two different parallelization strategies using a GPU are explored. The first parallelization strategy, which has been used in the existing parallel SFD algorithm on GPU, suffers from redundant computation. Therefore, we designed a parallelization strategy based on graph theory. The application results show that the proposed parallel approach to calculate flow accumulations on a GPU performs much faster than both sequential algorithms and parallel GPU-based algorithms built on existing parallelization strategies.
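For intuition about the recursive accumulation step being parallelized, here is a sequential single-flow (D8) sketch; processing cells from highest to lowest elevation stands in for the recursion, and it presumes the depression-removal preprocessing has already been applied (an MFD algorithm splits flow among several downslope neighbours instead):

```python
import numpy as np

def d8_flow_accumulation(dem):
    """Sequential D8 sketch: each cell drains to its steepest downslope
    neighbour; accumulation counts upstream cells (itself included).
    Assumes a depression-free, flat-free DEM (hence the preprocessing)."""
    rows, cols = dem.shape
    nbrs = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
            (0, 1), (1, -1), (1, 0), (1, 1)]
    order = np.argsort(dem, axis=None)[::-1]       # visit high to low, so
    acc = np.ones_like(dem, dtype=np.int64)        # upstream cells finish first
    for flat in order:
        r, c = divmod(int(flat), cols)
        best, drop = None, 0.0
        for dr, dc in nbrs:
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols:
                d = dem[r, c] - dem[rr, cc]
                if d > drop:
                    best, drop = (rr, cc), d
        if best is not None:
            acc[best] += acc[r, c]                 # pass flow downslope
    return acc

dem = np.add.outer(np.arange(5, 0, -1), np.arange(5, 0, -1)).astype(float)
print(d8_flow_accumulation(dem))                   # corner cell collects all 25
```

The data dependency visible here (a cell cannot finish before all higher cells draining into it) is exactly what the graph-theory-based GPU strategy in the paper has to schedule around.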
Micro-Analyzer: automatic preprocessing of Affymetrix microarray data.
Guzzi, Pietro Hiram; Cannataro, Mario
2013-08-01
A current trend in genomics is the investigation of the cell mechanism using different technologies, in order to explain the relationship among genes, molecular processes and diseases. For instance, the combined use of gene-expression arrays and genomic arrays has been demonstrated as an effective instrument in clinical practice. Consequently, in a single experiment different kinds of microarrays may be used, resulting in the production of different types of binary data (images and textual raw data). The analysis of microarray data requires an initial preprocessing phase that makes raw data suitable for use on existing analysis platforms, such as the TIGR M4 (TM4) Suite. An additional challenge to be faced by emerging data analysis platforms is the ability to treat in a combined way those different microarray formats coupled with clinical data. In fact, resulting integrated data may include both numerical and symbolic data (e.g. gene expression and SNPs regarding molecular data), as well as temporal data (e.g. the response to a drug, time to progression and survival rate), regarding clinical data. Raw data preprocessing is a crucial step in analysis but is often performed in a manual and error-prone way using different software tools. Thus novel, platform-independent, and possibly open-source tools enabling the semi-automatic preprocessing and annotation of different microarray data are needed. The paper presents Micro-Analyzer (Microarray Analyzer), a cross-platform tool for the automatic normalization, summarization and annotation of Affymetrix gene expression and SNP binary data. It represents the evolution of the μ-CS tool, extending the preprocessing to SNP arrays that were not supported in μ-CS. Micro-Analyzer is provided as a Java standalone tool and enables users to read, preprocess and analyse binary microarray data (gene expression and SNPs) by invoking the TM4 platform. It avoids: (i) the manual invocation of external tools (e.g. the Affymetrix Power Tools), (ii) the manual loading of preprocessing libraries, and (iii) the management of intermediate files, such as results and metadata. Micro-Analyzer users can directly manage Affymetrix binary data without worrying about locating and invoking the proper preprocessing tools and chip-specific libraries. Moreover, users of the Micro-Analyzer tool can load the preprocessed data directly into the well-known TM4 platform, extending in such a way also the TM4 capabilities. Consequently, Micro-Analyzer offers the following advantages: (i) it reduces possible errors in the preprocessing and further analysis phases, e.g. due to the incorrect choice of parameters or due to the use of old libraries, (ii) it enables the combined and centralized pre-processing of different arrays, (iii) it may enhance the quality of further analysis by storing the workflow, i.e. information about the preprocessing steps, and (iv) finally Micro-Analyzer is freely available as a standalone application at the project web site http://sourceforge.net/projects/microanalyzer/.
Applying Knowledge Discovery in Databases in Public Health Data Set: Challenges and Concerns
Volrathongchia, Kanittha
2003-01-01
In attempting to apply Knowledge Discovery in Databases (KDD) to generate a predictive model from a health care dataset that is currently available to the public, the first step is to pre-process the data to overcome the challenges of missing data, redundant observations, and records containing inaccurate data. This study will demonstrate how to use simple pre-processing methods to improve the quality of input data. PMID:14728545
A comparison of PCA/ICA for data preprocessing in remote sensing imagery classification
NASA Astrophysics Data System (ADS)
He, Hui; Yu, Xianchuan
2005-10-01
In this paper a performance comparison of a variety of data preprocessing algorithms in remote sensing image classification is presented. The selected algorithms are principal component analysis (PCA) and three different independent component analyses (ICA): Fast-ICA (Hyvarinen, 1999), Kernel-ICA (KCCA and KGV; Bach & Jordan, 2002), and EFFICA (Chen & Bickel, 2003). These algorithms were applied to remote sensing imagery (1600 × 1197 pixels) obtained from Shunyi, Beijing. For classification, an MLC method is used on the raw and preprocessed data. The results show that classification with preprocessed data gives more reliable results than with raw data, and among the preprocessing algorithms, the ICA algorithms improve on PCA, with EFFICA performing better than the others. The convergence of these ICA algorithms (for more than a million data points) is also studied; the results show that EFFICA converges much faster than the others. Furthermore, because EFFICA is a one-step maximum likelihood estimate (MLE) that reaches asymptotic Fisher efficiency, its computational load is small and its memory demand is greatly reduced, which resolved the "out of memory" problem that occurred with the other algorithms.
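The PCA-versus-ICA comparison can be reproduced in outline with scikit-learn; GaussianNB stands in here for the MLC (maximum-likelihood) classifier, and the data are synthetic:

```python
import numpy as np
from sklearn.decomposition import PCA, FastICA
from sklearn.pipeline import make_pipeline
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.random((1000, 6))                   # 6 spectral bands per pixel
y = (X @ rng.random(6) > 1.6).astype(int)   # dummy two-class land-cover labels

# Compare classification after PCA vs. FastICA preprocessing.
for name, reducer in [("PCA", PCA(n_components=3)),
                      ("FastICA", FastICA(n_components=3, max_iter=1000))]:
    score = cross_val_score(make_pipeline(reducer, GaussianNB()), X, y, cv=5)
    print(name, score.mean())
```

Kernel-ICA and EFFICA have no stock scikit-learn implementations, which is partly why the computational-cost comparison in the paper matters.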
On the structure of Bayesian network for Indonesian text document paraphrase identification
NASA Astrophysics Data System (ADS)
Prayogo, Ario Harry; Syahrul Mubarok, Mohamad; Adiwijaya
2018-03-01
Paraphrase identification is an important process within natural language processing. The idea is to automatically recognize phrases that have different forms but contain the same meaning. For example, if we input the query "causing fire hazard", then the computer has to recognize that this query has the same meaning as "the cause of fire hazard". Paraphrasing is an activity that reveals the meaning of an expression, writing, or speech using different words or forms, especially to achieve greater clarity. In this research we focus on classifying whether two Indonesian sentences are paraphrases of each other or not. There are four steps in this research: first is preprocessing, second is feature extraction, third is classifier building, and the last is performance evaluation. Preprocessing consists of tokenization, non-alphanumeric character removal, and stemming. After preprocessing, we conduct feature extraction in order to build new features from the given dataset. There are two kinds of features in the research, syntactic features and semantic features. Syntactic features consist of a normalized Levenshtein distance feature, a term-frequency-based cosine similarity feature, and an LCS (Longest Common Subsequence) feature. Semantic features consist of a Wu and Palmer feature and a Shortest Path feature. We use Bayesian networks as the method of training the classifier. The parameter estimation that we use is called MAP (Maximum A Posteriori). For structure learning of the Bayesian network DAG (Directed Acyclic Graph), we use the BDeu (Bayesian Dirichlet equivalent uniform) scoring function, and for finding the DAG with the best BDeu score, we use the K2 algorithm. In the evaluation step we perform cross-validation. The average results from testing the classifier are as follows: precision 75.2%, recall 76.5%, F1-measure 75.8%, and accuracy 75.6%.
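Two of the syntactic features named here, normalized Levenshtein distance and LCS, are short dynamic programs; a self-contained sketch follows (the example strings are illustrative, not drawn from the dataset):

```python
def levenshtein(a, b):
    """Edit distance via two-row dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def lcs_len(a, b):
    """Length of the longest common subsequence."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if ca == cb else max(prev[j], cur[-1]))
        prev = cur
    return prev[-1]

s1, s2 = "penyebab bahaya kebakaran", "menyebabkan bahaya kebakaran"
norm_lev = 1 - levenshtein(s1, s2) / max(len(s1), len(s2))
norm_lcs = lcs_len(s1, s2) / max(len(s1), len(s2))
```

These scalar features, together with the semantic ones, become the input variables of the Bayesian network node for each sentence pair.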
Applying Enhancement Filters in the Pre-processing of Images of Lymphoma
NASA Astrophysics Data System (ADS)
Henrique Silva, Sérgio; Zanchetta do Nascimento, Marcelo; Alves Neves, Leandro; Ramos Batista, Valério
2015-01-01
Lymphoma is a type of cancer that affects the immune system, and is classified as Hodgkin or non-Hodgkin. It is one of the ten most common cancers worldwide, accounting for three to four percent of all malignant neoplasms diagnosed. Our work presents a study of some filters devoted to enhancing images of lymphoma at the pre-processing step. Here the enhancement is useful for removing noise from the digital images. We have analysed the noise caused by different sources, such as room vibration, scraps and defocusing, in the following classes of lymphoma: follicular, mantle cell, and B-cell chronic lymphocytic leukemia. The Gaussian, Median and Mean-Shift filters were applied to different colour models (RGB, Lab and HSV). Afterwards, we performed a quantitative analysis of the images by means of the Structural Similarity Index, in order to evaluate the similarity between the images. In all cases we obtained a certainty of at least 75%, which rises to 99% if one considers only HSV. Namely, we have concluded that HSV is an important choice of colour model when pre-processing histological images of lymphoma, because in this case the resulting image gets the best enhancement.
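The quantitative comparison step, SSIM between the original and filtered images in a given colour model, is directly reproducible with scikit-image; a sketch with random stand-in data (not histology images):

```python
import numpy as np
from skimage.color import rgb2hsv
from skimage.metrics import structural_similarity
from scipy.ndimage import median_filter, gaussian_filter

# Compare filtered images against the original with SSIM, channel-wise in
# HSV, the colour model that scored best in the study above.
rng = np.random.default_rng(0)
rgb = rng.random((128, 128, 3))            # stand-in for a lymphoma image
hsv = rgb2hsv(rgb)

for name, filt in [("median", lambda c: median_filter(c, size=3)),
                   ("gaussian", lambda c: gaussian_filter(c, sigma=1))]:
    ssim = np.mean([structural_similarity(hsv[..., k], filt(hsv[..., k]),
                                          data_range=1.0)
                    for k in range(3)])
    print(name, round(ssim, 3))
```

A mean-shift filter would slot into the same loop; averaging SSIM across channels is our simplification, and a per-channel report is equally valid.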
Automatic initial and final segmentation in cleft palate speech of Mandarin speakers
Liu, Yin; Yin, Heng; Zhang, Junpeng; Zhang, Jing; Zhang, Jiang
2017-01-01
The speech unit segmentation is an important pre-processing step in the analysis of cleft palate speech. In Mandarin, one syllable is composed of two parts: initial and final. In cleft palate speech, the resonance disorders occur at the finals and the voiced initials, while the articulation disorders occur at the unvoiced initials. Thus, the initials and finals are the minimum speech units that can reflect the characteristics of cleft palate speech disorders. In this work, an automatic initial/final segmentation method is proposed. It is an important preprocessing step in cleft palate speech signal processing. The tested cleft palate speech utterances were collected from the Cleft Palate Speech Treatment Center in the Hospital of Stomatology, Sichuan University, which treats the largest number of cleft palate patients in China. The cleft palate speech data include 824 speech segments, and the control samples contain 228 speech segments. First, the syllables are extracted from the speech utterances. The proposed syllable extraction method avoids a training stage and achieves good performance for both voiced and unvoiced speech. Then, the syllables are classified into those with "quasi-unvoiced" initials and those with "quasi-voiced" initials, and respective initial/final segmentation methods are proposed for these two types of syllables. Moreover, a two-step segmentation method is proposed: the rough locations of the syllable and initial/final boundaries are refined in the second segmentation step, in order to improve the robustness of the segmentation accuracy. The experiments show that the initial/final segmentation accuracies for syllables with quasi-unvoiced initials are higher than for those with quasi-voiced initials. For the cleft palate speech, the mean time error is 4.4 ms for syllables with quasi-unvoiced initials and 25.7 ms for syllables with quasi-voiced initials, and the correct segmentation accuracy P30 for all syllables is 91.69%. For the control samples, P30 for all syllables is 91.24%. PMID:28926572
Data Treatment for LC-MS Untargeted Analysis.
Riccadonna, Samantha; Franceschi, Pietro
2018-01-01
Liquid chromatography-mass spectrometry (LC-MS) untargeted experiments require complex chemometrics strategies to extract information from the experimental data. Here we discuss "data preprocessing", the set of procedures performed on the raw data to produce a data matrix which will be the starting point for the subsequent statistical analysis. Data preprocessing is a crucial step on the path to knowledge extraction, which should be carefully controlled and optimized in order to maximize the output of any untargeted metabolomics investigation.
A new approach to pre-processing digital image for wavelet-based watermark
NASA Astrophysics Data System (ADS)
Agreste, Santa; Andaloro, Guido
2008-11-01
The growth of the Internet has increased the phenomenon of digital piracy in multimedia objects like software, images, video, audio and text. It is therefore strategic to identify and develop methods and numerical algorithms, stable and with low computational cost, that can address these problems. We describe a digital watermarking algorithm for color image protection and authenticity: robust, non-blind, and wavelet-based. The use of the Discrete Wavelet Transform is motivated by its good time-frequency features and a good match with Human Visual System directives. These two combined elements are important for building an invisible and robust watermark. Moreover, our algorithm can work with any image, thanks to a pre-processing step that includes resize techniques adapting the size of the original image to the wavelet transform. The watermark signal is calculated in correlation with the image features and statistical properties. In the detection step we apply a re-synchronization between the original and watermarked image according to the Neyman-Pearson statistical criterion. Experimentation on a large set of different images has shown the method to be resistant against geometric, filtering, and StirMark attacks, with a low false-alarm rate.
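A minimal non-blind DWT embedding illustrates the scheme's skeleton (PyWavelets, Haar basis, additive embedding in the diagonal detail band); the strength rule and the resize step are our simplifications of the paper's feature-correlated watermark:

```python
import numpy as np
import pywt

def embed_watermark(image, mark, alpha=0.05):
    """Non-blind DWT watermarking sketch: additively embed a watermark into
    the diagonal detail coefficients of a one-level Haar decomposition.
    Detection (not shown) would compare against the original image."""
    cA, (cH, cV, cD) = pywt.dwt2(image.astype(float), "haar")
    mark = np.resize(mark, cD.shape)              # crude resize step
    cD_marked = cD + alpha * mark * np.abs(cD)    # strength follows coefficients
    return pywt.idwt2((cA, (cH, cV, cD_marked)), "haar")

img = np.random.default_rng(0).integers(0, 256, (256, 256)).astype(float)
wm = np.sign(np.random.default_rng(1).standard_normal(128 * 128))
watermarked = embed_watermark(img, wm)
```

Scaling the mark by the local coefficient magnitude keeps the modification below visibility in smooth regions, which is the Human Visual System consideration the abstract refers to.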
Semiautomatic Segmentation of Glioma on Mobile Devices.
Wu, Ya-Ping; Lin, Yu-Song; Wu, Wei-Guo; Yang, Cong; Gu, Jian-Qin; Bai, Yan; Wang, Mei-Yun
2017-01-01
Brain tumor segmentation is the first and the most critical step in clinical applications of radiomics. However, segmenting brain images by radiologists is labor-intensive and prone to inter- and intraobserver variability. Stable and reproducible brain image segmentation algorithms are thus important for successful tumor detection in radiomics. In this paper, we propose a supervised brain image segmentation method, especially for magnetic resonance (MR) brain images with glioma. This paper uses hard edge multiplicative intrinsic component optimization to preprocess glioma medical images on the server side, so that doctors can supervise the segmentation process on mobile devices at their convenience. Since the preprocessed images have the same brightness for the same tissue voxels, they have a small data size (typically 1/10 of the original image size) and a simple structure of 4 types of intensity value. This observation thus allows follow-up steps to be processed on mobile devices with low bandwidth and limited computing performance. Experiments conducted on 1935 brain slices from 129 patients show that more than 30% of the samples can reach 90% similarity, over 60% of the samples can reach 85% similarity, and more than 80% of the samples can reach 75% similarity. The comparisons with other segmentation methods also demonstrate both the efficiency and stability of the proposed approach.
ERIC Educational Resources Information Center
Werner, Linda; McDowell, Charlie; Denner, Jill
2013-01-01
Educational data mining can miss or misidentify key findings about student learning without a transparent process of analyzing the data. This paper describes the first steps in the process of using low-level logging data to understand how middle school students used Alice, an initial programming environment. We describe the steps that were…
Ensemble analyses improve signatures of tumour hypoxia and reveal inter-platform differences
2014-01-01
Background: The reproducibility of transcriptomic biomarkers across datasets remains poor, limiting clinical application. We and others have suggested that this is in part caused by differential error-structure between datasets, and their incomplete removal by pre-processing algorithms. Methods: To test this hypothesis, we systematically assessed the effects of pre-processing on biomarker classification using 24 different pre-processing methods and 15 distinct signatures of tumour hypoxia in 10 datasets (2,143 patients). Results: We confirm strong pre-processing effects for all datasets and signatures, and find that these differ between microarray versions. Importantly, exploiting different pre-processing techniques in an ensemble technique improved classification for a majority of signatures. Conclusions: Assessing biomarkers using an ensemble of pre-processing techniques shows clear value across multiple diseases, datasets and biomarkers. Importantly, ensemble classification improves biomarkers with initially good results but does not result in spuriously improved performance for poor biomarkers. While further research is required, this approach has the potential to become a standard for transcriptomic biomarkers. PMID:24902696
Andronache, Adrian; Rosazza, Cristina; Sattin, Davide; Leonardi, Matilde; D'Incerti, Ludovico; Minati, Ludovico
2013-01-01
An emerging application of resting-state functional MRI (rs-fMRI) is the study of patients with disorders of consciousness (DoC), where the integrity of default-mode network (DMN) activity is associated with the clinical level of preservation of consciousness. Due to the inherent inability to follow verbal instructions, arousal induced by scanning noise, and postural pain, these patients tend to exhibit substantial levels of movement. This results in spurious, non-neural fluctuations of the rs-fMRI signal, which impair the evaluation of residual functional connectivity. Here, the effect of data preprocessing choices on the detectability of the DMN was systematically evaluated in a representative cohort of 30 clinically and etiologically heterogeneous DoC patients and 33 healthy controls. Starting from a standard preprocessing pipeline, additional steps were gradually inserted, namely band-pass filtering (BPF), removal of co-variance with the movement vectors, removal of co-variance with the global brain parenchyma signal, rejection of realignment outlier volumes, and ventricle masking. Both independent-component analysis (ICA) and seed-based analysis (SBA) were performed, and DMN detectability was assessed quantitatively as well as visually. The results strongly indicate that the detection of DMN activity in the sub-optimal fMRI series acquired on DoC patients is contingent on the use of adequate filtering steps. ICA and SBA are differently affected but give convergent findings for high-grade preprocessing. We propose that future studies in this area should adopt the described preprocessing procedures as a minimum standard to reduce the probability of wrongly inferring that DMN activity is absent.
NASA Astrophysics Data System (ADS)
Gao, M.; Li, J.
2018-04-01
Geometric correction is an important preprocessing step in applications of GF4 PMS images. Geometric correction based on manual selection of geometric control points is time-consuming and laborious. The more common method, based on a reference image, is automatic image registration, which involves several steps and parameters. For the multi-spectral sensor GF4 PMS, it is necessary to identify the best combination of parameters and steps. This study mainly focuses on the following issues: the necessity of Rational Polynomial Coefficients (RPC) correction before automatic registration, the choice of base band in the automatic registration, and the configuration of GF4 PMS spatial resolution.
Probabilistic Model for Untargeted Peak Detection in LC-MS Using Bayesian Statistics.
Woldegebriel, Michael; Vivó-Truyols, Gabriel
2015-07-21
We introduce a novel Bayesian probabilistic peak detection algorithm for liquid chromatography-mass spectrometry (LC-MS). The final probabilistic result allows the user to make a final decision about which points in a chromatogram are affected by a chromatographic peak and which ones are only affected by noise. The use of probabilities contrasts with the traditional approach, in which a binary answer is given based on a threshold. With the Bayesian peak detection presented here, the probability values can be further propagated into other preprocessing steps, increasing (or decreasing) the importance of chromatographic regions in the final results. The present work uses the statistical theory of component overlap of Davis and Giddings (Davis, J. M.; Giddings, J. Anal. Chem. 1983, 55, 418-424) as the prior probability in the Bayesian formulation. The algorithm was tested on LC-MS Orbitrap data and was able to successfully distinguish chemical noise from actual peaks without any data preprocessing.
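The core Bayesian update behind such a probabilistic answer can be sketched as follows; the Gaussian peak/noise likelihoods and the flat prior value below are illustrative placeholders, not the paper's statistical-overlap prior:

```python
import numpy as np

def peak_posterior(x, prior_peak=0.1, noise_sigma=1.0, peak_mu=5.0, peak_sigma=2.0):
    """P(peak | intensity x) by Bayes' rule with placeholder Gaussian likelihoods.

    Returns a probability per data point instead of a thresholded yes/no,
    so it can be propagated into later preprocessing steps."""
    l_noise = np.exp(-0.5 * (np.asarray(x) / noise_sigma) ** 2)
    l_peak = np.exp(-0.5 * ((np.asarray(x) - peak_mu) / peak_sigma) ** 2)
    num = prior_peak * l_peak
    return num / (num + (1.0 - prior_peak) * l_noise)
```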
A spectral water index based on visual bands
NASA Astrophysics Data System (ADS)
Basaeed, Essa; Bhaskar, Harish; Al-Mualla, Mohammed
2013-10-01
Land-water segmentation is an important preprocessing step in a number of remote sensing applications such as target detection, environmental monitoring, and map updating. A Normalized Optical Water Index (NOWI) is proposed to accurately discriminate between land and water regions in multi-spectral satellite imagery from DubaiSat-1. NOWI exploits the spectral characteristics of water (using visible bands) and uses a non-linear normalization procedure that places strong emphasis on small changes at lower brightness values while guaranteeing that the segmentation process remains image-independent. The NOWI representation is validated through systematic experiments, evaluated using robust metrics, and compared against various supervised classification algorithms. Analysis indicates that NOWI: a) is a pixel-based method that requires no global knowledge of the scene under investigation, b) can easily be implemented with parallel processing, c) is image-independent and requires no training, d) works under different environmental conditions, e) provides high accuracy and efficiency, and f) works directly on the input image without any form of pre-processing.
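The abstract does not give the NOWI formula, so the sketch below only illustrates the general shape of such an index: a normalized ratio of visible bands followed by a non-linear stretch that emphasizes small changes at low brightness. The band roles and the logarithmic stretch are assumptions:

```python
import numpy as np

def visible_water_index(blue, red, eps=1e-6):
    """Generic normalized visible-band water index (illustrative, not NOWI itself)."""
    idx = (blue - red) / (blue + red + eps)              # water reflects blue, absorbs red
    return np.sign(idx) * np.log1p(10.0 * np.abs(idx))   # non-linear low-brightness emphasis

# A pixel-wise water mask then needs no training or scene knowledge:
# water_mask = visible_water_index(b, r) > threshold
```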
Gabor filter based fingerprint image enhancement
NASA Astrophysics Data System (ADS)
Wang, Jin-Xiang
2013-03-01
Fingerprint recognition has become one of the most reliable biometric technologies owing to the uniqueness and invariance of fingerprints, making it a most convenient and reliable technique for personal authentication. The development of Automated Fingerprint Identification Systems is an urgent need for modern information security, and the fingerprint preprocessing algorithm plays an important part in such systems. This article introduces the general steps in fingerprint recognition, namely image input, preprocessing, feature recognition, and fingerprint image enhancement. As the key step in fingerprint identification, fingerprint image enhancement affects the accuracy of the whole system. The article focuses on the characteristics of the fingerprint image, the Gabor filter algorithm for fingerprint image enhancement, the theoretical basis of Gabor filters, and a demonstration of the filter. The enhancement algorithm is demonstrated on the Windows XP platform with MATLAB 6.5 as the development tool. The results show that the Gabor filter is effective for fingerprint image enhancement.
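A minimal version of the even-symmetric Gabor kernel used for ridge enhancement looks as follows; the kernel size, sigma, and the block-wise application are assumptions, since the abstract does not give parameter values:

```python
import numpy as np

def gabor_kernel(theta, freq, sigma=4.0, size=17):
    """Even-symmetric Gabor kernel tuned to local ridge orientation theta
    (radians) and ridge frequency freq (cycles/pixel)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)      # rotate into ridge coordinates
    yr = -x * np.sin(theta) + y * np.cos(theta)
    return np.exp(-(xr**2 + yr**2) / (2 * sigma**2)) * np.cos(2 * np.pi * freq * xr)

# Enhancement convolves each image block with the kernel matching its
# locally estimated orientation/frequency, e.g. via scipy.ndimage.convolve.
```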
Bayesian Peak Picking for NMR Spectra
Cheng, Yichen; Gao, Xin; Liang, Faming
2013-01-01
Protein structure determination is a very important topic in structural genomics, helping to understand a variety of biological functions such as protein-protein interactions, protein-DNA interactions and so on. Nowadays, nuclear magnetic resonance (NMR) is often used to determine the three-dimensional structures of proteins in vivo. This study aims to automate the peak picking step, the most important and tricky step in NMR structure determination. We propose to model the NMR spectrum by a mixture of bivariate Gaussian densities and use the stochastic approximation Monte Carlo algorithm as the computational tool to solve the problem. Under the Bayesian framework, the peak picking problem is cast as a variable selection problem. The proposed method can automatically distinguish true peaks from false ones without preprocessing the data. To the best of our knowledge, this is the first effort in the literature that tackles the peak picking problem for NMR spectrum data using a Bayesian method. PMID:24184964
Controlling Surface Chemistry to Deconvolute Corrosion Benefits Derived from SMAT Processing
NASA Astrophysics Data System (ADS)
Murdoch, Heather A.; Labukas, Joseph P.; Roberts, Anthony J.; Darling, Kristopher A.
2017-07-01
Grain refinement through surface plastic deformation processes such as surface mechanical attrition treatment has shown measurable benefits for mechanical properties, but the impact on corrosion behavior has been inconsistent. Many factors obfuscate the particular corrosion mechanisms at work, including not only grain size but also texture, processing contamination, and surface roughness. Many studies attempting to link corrosion and grain size have not been able to decouple these effects. Here we introduce a preprocessing step to mitigate the surface contamination effects that have been a concern in previous corrosion studies on plastically deformed surfaces; this allows comparison of corrosion behavior across grain sizes while controlling for texture and surface roughness. Potentiodynamic polarization in aqueous NaCl solution suggests that different corrosion mechanisms are responsible for samples prepared with the preprocessing step.
Vesselness propagation: a fast interactive vessel segmentation method
NASA Astrophysics Data System (ADS)
Cai, Wenli; Dachille, Frank; Harris, Gordon J.; Yoshida, Hiroyuki
2006-03-01
With the rapid development of multi-detector computed tomography (MDCT) and the resulting gains in temporal and spatial resolution of data sets, clinical use of computed tomographic angiography (CTA) is increasing quickly. Analysis of vascular structures is much needed in CTA images; however, the basis of the analysis, vessel segmentation, can still be a challenging problem. In this paper, we present a fast interactive method for CTA vessel segmentation, called vesselness propagation. This method is a two-step procedure, with a pre-processing step and an interactive step. During the pre-processing step, a vesselness volume is computed by application of a CTA transfer function followed by multi-scale Hessian filtering. At the interactive stage, the propagation is controlled interactively in terms of the priority of the vesselness. This method was used successfully in many CTA applications such as the carotid, coronary, and peripheral arteries. It takes less than one minute for a user to segment the entire vascular structure. Thus, the proposed method provides an effective way of obtaining an overview of vascular structures.
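The two-step structure (offline vesselness volume, then interactive priority-driven propagation) could be sketched as below; skimage's Frangi filter stands in for the paper's CTA transfer function plus multi-scale Hessian filtering, and the 6-neighborhood growth is an assumption:

```python
import heapq
import numpy as np
from skimage.filters import frangi

def propagate_vessels(volume, seed, threshold=0.05, max_voxels=500000):
    """Pre-processing: multi-scale Hessian vesselness; interaction: region
    growing that always expands the highest-vesselness voxel first."""
    vness = frangi(volume, black_ridges=False)   # bright vessels on dark background
    segmented = np.zeros(volume.shape, dtype=bool)
    heap = [(-vness[seed], seed)]
    while heap and max_voxels:
        negv, p = heapq.heappop(heap)
        if segmented[p] or -negv < threshold:
            continue
        segmented[p] = True
        max_voxels -= 1
        for d in ((1,0,0), (-1,0,0), (0,1,0), (0,-1,0), (0,0,1), (0,0,-1)):
            q = tuple(int(a + b) for a, b in zip(p, d))
            if all(0 <= q[i] < volume.shape[i] for i in range(3)) and not segmented[q]:
                heapq.heappush(heap, (-vness[q], q))
    return segmented
```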
Automatic Semantic Orientation of Adjectives for Indonesian Language Using PMI-IR and Clustering
NASA Astrophysics Data System (ADS)
Riyanti, Dewi; Arif Bijaksana, M.; Adiwijaya
2018-03-01
We present our work in sentiment analysis for the Indonesian language, focusing on building automatic semantic orientation using available Indonesian resources. We used an Indonesian corpus of 9 million words from kompas.txt and tempo.txt, manually tagged and annotated with a part-of-speech tagset. We then constructed a dataset by taking all the adjectives from the corpus and removing adjectives with no orientation; the set contained 923 adjective words. The system includes several steps such as text pre-processing and clustering. The text pre-processing aims to increase accuracy, and the clustering step classifies each word into the related sentiment, positive or negative. With improvements to the text preprocessing, an accuracy of 72% can be achieved.
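The PMI-IR part reduces to one formula: a Turney-style semantic orientation score from co-occurrence counts. The sketch below is generic; the Indonesian seed words ('baik'/'buruk') and the smoothing constant are assumptions:

```python
import math

def semantic_orientation(near_pos, near_neg, hits_pos, hits_neg, smooth=0.01):
    """PMI-IR orientation score: > 0 leans positive, < 0 leans negative.

    near_pos / near_neg: counts of the adjective co-occurring with positive /
    negative seed words (e.g. 'baik' / 'buruk'); hits_pos / hits_neg: overall
    seed-word counts in the corpus."""
    return math.log2(((near_pos + smooth) * hits_neg) /
                     ((near_neg + smooth) * hits_pos))
```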
Raef, A.
2009-01-01
The recent proliferation of the 3D reflection seismic method into the near-surface area of geophysical applications, especially in response to the need to comprehensively characterize and monitor near-surface carbon dioxide sequestration in shallow saline aquifers around the world, justifies the emphasis on a cost-effective and robust quality control and assurance (QC/QA) workflow for 3D seismic data preprocessing that is suitable for near-surface applications. The main purpose of our seismic data preprocessing QC is to enable the use of appropriate header information and of data that are free of noise-dominated traces and flawed vertical stacking in subsequent processing steps. In this article, I provide an account of utilizing survey design specifications, noise properties, first breaks, and normal moveout for rapid and thorough graphical QC/QA diagnostics, which are easy to apply and efficient in the diagnosis of inconsistencies. A correlated vibroseis time-lapse 3D-seismic data set from a CO2-flood monitoring survey is used for demonstrating the QC diagnostics. An important by-product of the QC workflow is establishing the number of layers for a refraction statics model in a data-driven graphical manner that capitalizes on the spatial coverage of the 3D seismic data. © China University of Geosciences (Wuhan) and Springer-Verlag GmbH 2009.
Bashar, Md Khayrul; Komatsu, Koji; Fujimori, Toshihiko; Kobayashi, Tetsuya J
2012-01-01
Accurate identification of cell nuclei and their tracking using three-dimensional (3D) microscopic images is a demanding task in many biological studies. Manual identification of nuclei centroids from images is an error-prone task, sometimes impossible to accomplish due to low contrast and the presence of noise. Nonetheless, only a few methods are available for 3D bioimaging applications, in sharp contrast with 2D analysis, where many methods already exist. In addition, most methods essentially adopt segmentation, for which a reliable solution is still unknown, especially for 3D bio-images having juxtaposed cells. In this work, we propose a new method that can directly extract nuclei centroids from fluorescence microscopy images. This method involves three steps: (i) pre-processing, (ii) local enhancement, and (iii) centroid extraction. The first step includes two variations: the first (Variant-1) uses the whole 3D pre-processed image, whereas the second (Variant-2) reduces the preprocessed image to candidate regions or a candidate hybrid image for further processing. In the second step, multiscale cube filtering is employed in order to locally enhance the pre-processed image. Centroid extraction in the third step consists of three stages. In Stage-1, we compute a local characteristic ratio at every voxel and extract local maxima regions as candidate centroids using a ratio threshold. Stage-2 processing removes spurious centroids from the Stage-1 results by analyzing the shapes of intensity profiles in the enhanced image. An iterative procedure based on the nearest-neighbor principle is then proposed to merge fragmented nuclei, if present. Both qualitative and quantitative analyses on a set of 100 images of 3D mouse embryos are performed. Investigations reveal promising performance of the presented technique in terms of average sensitivity and precision (i.e., 88.04% and 91.30% for Variant-1; 86.19% and 95.00% for Variant-2), when compared with an existing method (86.06% and 90.11%) originally developed for analyzing C. elegans images.
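A condensed sketch of steps (ii)-(iii) — local enhancement with a cube filter, then a ratio test and local-maxima extraction for candidate centroids — is given below; the single cube scale and the exact ratio definition are simplifying assumptions:

```python
import numpy as np
from scipy import ndimage

def candidate_centroids(volume, cube=5, ratio_thresh=1.2):
    """Cube (mean) filtering for local enhancement, then candidate nuclei
    centroids as local maxima whose enhanced value stands out against a
    larger-scale background estimate."""
    v = volume.astype(float)
    enhanced = ndimage.uniform_filter(v, size=cube)
    background = ndimage.uniform_filter(v, size=4 * cube + 1)
    ratio = enhanced / (background + 1e-9)                    # local characteristic ratio
    local_max = enhanced == ndimage.maximum_filter(enhanced, size=cube)
    return np.argwhere(local_max & (ratio > ratio_thresh))    # (z, y, x) candidates
```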
Automated Detection of Optic Disc in Fundus Images
NASA Astrophysics Data System (ADS)
Burman, R.; Almazroa, A.; Raahemifar, K.; Lakshminarayanan, V.
Optic disc (OD) localization is an important preprocessing step in the automated detection of glaucoma in fundus images. An Interval Type-II fuzzy entropy based thresholding scheme combined with Differential Evolution (DE) is applied to determine the location of the OD in right- or left-eye retinal fundus images. The algorithm, when applied to 460 fundus images from the MESSIDOR dataset, shows a success rate of 99.07% for 217 normal images and 95.47% for 243 pathological images. The mean computational time is 1.709 s for normal images and 1.753 s for pathological images. These results are important for automated detection of glaucoma and for telemedicine purposes.
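The optimization pattern — Differential Evolution searching for the best threshold — can be sketched with SciPy; plain between-class variance stands in for the paper's Interval Type-II fuzzy entropy objective:

```python
import numpy as np
from scipy.optimize import differential_evolution

def de_threshold(gray):
    """Threshold chosen by DE maximizing between-class variance (a stand-in
    objective; the paper maximizes an Interval Type-II fuzzy entropy)."""
    g = gray.ravel().astype(float)

    def neg_between_class_variance(t):
        fg, bg = g[g >= t[0]], g[g < t[0]]
        if fg.size == 0 or bg.size == 0:
            return 0.0
        w1, w2 = fg.size / g.size, bg.size / g.size
        return -w1 * w2 * (fg.mean() - bg.mean()) ** 2

    result = differential_evolution(neg_between_class_variance,
                                    bounds=[(g.min(), g.max())], seed=1)
    return float(result.x[0])
```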
Zeynoddin, Mohammad; Bonakdari, Hossein; Azari, Arash; Ebtehaj, Isa; Gharabaghi, Bahram; Riahi Madavar, Hossein
2018-09-15
A novel hybrid approach is presented that can more accurately predict monthly rainfall in a tropical climate by integrating a linear stochastic model with a powerful non-linear extreme learning machine method. This new hybrid method was evaluated under four general scenarios. In the first scenario, the modeling process is initiated without preprocessing the input data, as a base case. In the other three scenarios, one-step and two-step procedures are utilized to make the model predictions more precise. The scenarios are based on combinations of stationarization techniques (i.e., differencing, seasonal and non-seasonal standardization, and spectral analysis) and normality transforms (i.e., Box-Cox, John and Draper, Yeo and Johnson, Johnson, Box-Cox-Mod, log, log standard, and Manly). In scenario 2, a one-step scenario, the stationarization methods are employed as preprocessing approaches. In scenarios 3 and 4, different combinations of normality transforms and stationarization methods are considered as preprocessing techniques. In total, 61 sub-scenarios are evaluated, resulting in 11,013 models (10,785 linear, 4 nonlinear, and 224 hybrid). The uncertainty of the linear, nonlinear and hybrid models is examined by the Monte Carlo technique. The best preprocessing technique is the use of the Johnson normality transform followed by seasonal standardization (R^2 = 0.99; RMSE = 0.6; MAE = 0.38; RMSRE = 0.1; MARE = 0.06; UI = 0.03; UII = 0.05). The results of the uncertainty analysis indicated the good performance of the proposed technique (d-factor = 0.27; 95PPU = 83.57). Moreover, the results of the proposed methodology were compared with an evolutionary hybrid of an adaptive neuro-fuzzy inference system with the firefly algorithm (ANFIS-FFA), demonstrating that the new hybrid methods outperformed the ANFIS-FFA method. Copyright © 2018 Elsevier Ltd. All rights reserved.
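The best-performing two-step preprocessing (normality transform, then seasonal standardization) can be sketched as follows; SciPy's Box-Cox stands in for the Johnson transform, and whole years of monthly data are assumed:

```python
import numpy as np
from scipy import stats

def preprocess_monthly(series, period=12):
    """Step 1: Box-Cox normality transform (a stand-in for the Johnson
    transform reported as best). Step 2: seasonal standardization with
    one mean/std per calendar month."""
    y, lam = stats.boxcox(np.asarray(series, dtype=float) + 1.0)  # shift: rainfall can be 0
    y = y.reshape(-1, period)               # assumes an integer number of years
    mu, sd = y.mean(axis=0), y.std(axis=0)
    return ((y - mu) / sd).ravel(), (lam, mu, sd)   # stationarized series + inverse info
```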
Schwartz, Matthias; Meyer, Björn; Wirnitzer, Bernhard; Hopf, Carsten
2015-03-01
Conventional mass spectrometry image preprocessing methods used for denoising, such as the Savitzky-Golay smoothing or discrete wavelet transformation, typically do not only remove noise but also weak signals. Recently, memory-efficient principal component analysis (PCA) in conjunction with random projections (RP) has been proposed for reversible compression and analysis of large mass spectrometry imaging datasets. It considers single-pixel spectra in their local context and consequently offers the prospect of using information from the spectra of adjacent pixels for denoising or signal enhancement. However, little systematic analysis of key RP-PCA parameters has been reported so far, and the utility and validity of this method for context-dependent enhancement of known medically or pharmacologically relevant weak analyte signals in linear-mode matrix-assisted laser desorption/ionization (MALDI) mass spectra has not been explored yet. Here, we investigate MALDI imaging datasets from mouse models of Alzheimer's disease and gastric cancer to systematically assess the importance of selecting the right number of random projections k and of principal components (PCs) L for reconstructing reproducibly denoised images after compression. We provide detailed quantitative data for comparison of RP-PCA-denoising with the Savitzky-Golay and wavelet-based denoising in these mouse models as a resource for the mass spectrometry imaging community. Most importantly, we demonstrate that RP-PCA preprocessing can enhance signals of low-intensity amyloid-β peptide isoforms such as Aβ1-26 even in sparsely distributed Alzheimer's β-amyloid plaques and that it enables enhanced imaging of multiply acetylated histone H4 isoforms in response to pharmacological histone deacetylase inhibition in vivo. We conclude that RP-PCA denoising may be a useful preprocessing step in biomarker discovery workflows.
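The RP-PCA compression/denoising loop can be sketched with scikit-learn as below; mapping back through the pseudo-inverse of the projection is one plausible reconstruction, not necessarily the authors' exact implementation:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.random_projection import GaussianRandomProjection

def rp_pca_denoise(spectra, k=200, L=20, seed=0):
    """spectra: (n_pixels, n_mz). Compress with k random projections, keep
    L principal components, and reconstruct denoised spectra."""
    rp = GaussianRandomProjection(n_components=k, random_state=seed)
    reduced = rp.fit_transform(spectra)                  # (n_pixels, k)
    pca = PCA(n_components=L)
    denoised_reduced = pca.inverse_transform(pca.fit_transform(reduced))
    R = rp.components_                                   # (k, n_mz)
    return denoised_reduced @ np.linalg.pinv(R).T        # back to m/z space
```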
A Guide for Designing and Analyzing RNA-Seq Data.
Chatterjee, Aniruddha; Ahn, Antonio; Rodger, Euan J; Stockwell, Peter A; Eccles, Michael R
2018-01-01
The identity of a cell or an organism is at least in part defined by its gene expression, and analyzing gene expression therefore remains one of the most frequently performed experimental techniques in molecular biology. The development of the RNA-Sequencing (RNA-Seq) method provides an unprecedented opportunity to analyze the expression of protein-coding and noncoding RNA, and also to perform de novo transcript assembly for a new species or organism. However, the planning and design of RNA-Seq experiments has important implications for addressing the desired biological question and maximizing the value of the data obtained. In addition, RNA-Seq generates a huge volume of data, and accurate analysis of these data involves several different steps and choices of tools. This can be challenging and overwhelming, especially for bench scientists. In this chapter, we describe an entire workflow for performing RNA-Seq experiments. We describe critical aspects of wet lab experiments such as RNA isolation, library preparation and the initial design of an experiment. Further, we provide a step-by-step description of the bioinformatics workflow for the different steps involved in RNA-Seq data analysis. This includes power calculations, setting up a computational environment, acquisition and processing of publicly available data if desired, quality control measures, preprocessing steps for the raw data, differential expression analysis, and data visualization. We particularly mention important considerations for each step, to provide a guide for designing and analyzing RNA-Seq data.
Despeckle filtering software toolbox for ultrasound imaging of the common carotid artery.
Loizou, Christos P; Theofanous, Charoula; Pantziaris, Marios; Kasparis, Takis
2014-04-01
Ultrasound imaging of the common carotid artery (CCA) is a non-invasive tool used in medicine to assess the severity of atherosclerosis and monitor its progression through time. It is also used in border detection and texture characterization of the atherosclerotic carotid plaque in the CCA, and in the identification and measurement of the intima-media thickness (IMT) and the lumen diameter, all of which are very important in the assessment of cardiovascular disease (CVD). Visual perception, however, is hindered by speckle, a multiplicative noise that degrades the quality of ultrasound B-mode imaging. Noise reduction is therefore essential for improving visual observation quality or as a pre-processing step for further automated analysis, such as image segmentation of the IMT and the atherosclerotic carotid plaque in ultrasound images. In order to facilitate this preprocessing step, we have developed in MATLAB(®) a unified toolbox that integrates image despeckle filtering (IDF), texture analysis and image quality evaluation techniques to automate the pre-processing and complement the disease evaluation in ultrasound CCA images. The proposed software is based on a graphical user interface (GUI) and incorporates image normalization, 10 different despeckle filtering techniques (DsFlsmv, DsFwiener, DsFlsminsc, DsFkuwahara, DsFgf, DsFmedian, DsFhmedian, DsFad, DsFnldif, DsFsrad), image intensity normalization, 65 texture features, 15 quantitative image quality metrics and objective image quality evaluation. The software is publicly available in an executable form, which can be downloaded from http://www.cs.ucy.ac.cy/medinfo/. It was validated on 100 ultrasound images of the CCA by comparing its results with quantitative visual analysis performed by a medical expert. It was observed that the despeckle filters DsFlsmv and DsFhmedian improved image quality perception (based on the expert's assessment and the image texture and quality metrics). It is anticipated that the system could help the physician in the assessment of cardiovascular image analysis. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Peikari, Mohammad; Martel, Anne L.
2016-03-01
Purpose: Automatic cell segmentation plays an important role in reliable diagnosis and prognosis of patients. Most state-of-the-art cell detection and segmentation techniques focus on complicated methods to subtract foreground cells from the background. In this study, we introduce a preprocessing method which leads to better detection and segmentation results compared to a well-known state-of-the-art work. Method: We transform the original red-green-blue (RGB) space into a new space defined by the top eigenvectors of the RGB space. Stretching is done by manipulating the contrast of each pixel value to equalize the color variances. New pixel values are then inverse transformed to the original RGB space. This altered RGB image is then used to segment cells. Result: Validation of our method against a well-known state-of-the-art technique revealed a statistically significant improvement on an identical validation set, with a mean F1-score of 0.901. Conclusion: Preprocessing steps that decorrelate colorspaces may improve cell segmentation performance.
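The described transform is essentially a decorrelation stretch of the RGB colorspace; a compact sketch (the small epsilon for numerical stability is an assumption):

```python
import numpy as np

def decorrelation_stretch(rgb):
    """Project pixels onto the top eigenvectors of the RGB covariance,
    equalize the color variances, and inverse transform back to RGB.

    rgb: float array of shape (h, w, 3)."""
    pixels = rgb.reshape(-1, 3)
    mu = pixels.mean(axis=0)
    centered = pixels - mu
    evals, evecs = np.linalg.eigh(np.cov(centered, rowvar=False))
    scores = centered @ evecs                  # new space from the RGB eigenvectors
    scores /= np.sqrt(evals + 1e-12)           # stretch: equalized color variances
    return (scores @ evecs.T + mu).reshape(rgb.shape)   # back to RGB
```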
NASA Astrophysics Data System (ADS)
Gómez-Gutiérrez, Álvaro; Juan de Sanjosé-Blasco, José; Schnabel, Susanne; de Matías-Bejarano, Javier; Pulido-Fernández, Manuel; Berenguer-Sempere, Fernando
2015-04-01
In this work, the hypothesis that 3D models obtained with Structure from Motion (SfM) approaches can be improved using images pre-processed by High Dynamic Range (HDR) techniques is tested. Photographs of the Veleta Rock Glacier in Spain were captured with different exposure values (EV0, EV+1 and EV-1), two focal lengths (35 and 100 mm) and under different weather conditions in the years 2008, 2009, 2011, 2012 and 2014. HDR images were produced from the different EV steps within the Fusion F.1 software. Point clouds were generated using commercial and freely available SfM software: Agisoft Photoscan and 123D Catch. Models obtained using pre-processed and non-preprocessed images were compared in a 3D environment with a benchmark 3D model obtained by means of a Terrestrial Laser Scanner (TLS). A total of 40 point clouds were produced, georeferenced and compared. Results indicated that for the Agisoft Photoscan software, differences in accuracy between models obtained with pre-processed and non-preprocessed images were not statistically significant. However, in the case of the freely available software 123D Catch, models obtained using images pre-processed by HDR techniques presented a higher point density and were more accurate. This tendency was observed across the 5 studied years and under different capture conditions. More work should be done in the near future to corroborate whether the results of similar software packages can be improved by HDR techniques (e.g. ARC3D, Bundler and PMVS2, CMP SfM, Photosynth and VisualSFM).
NASA Astrophysics Data System (ADS)
Neher, Peter F.; Stieltjes, Bram; Reisert, Marco; Reicht, Ignaz; Meinzer, Hans-Peter; Fritzsche, Klaus H.
2012-02-01
Fiber tracking algorithms yield valuable information for neurosurgery as well as automated diagnostic approaches. However, they have not yet arrived in daily clinical practice. In this paper we present an open source integration of the global tractography algorithm proposed by Reisert et al. [1] into the open source Medical Imaging Interaction Toolkit (MITK) developed and maintained by the Division of Medical and Biological Informatics at the German Cancer Research Center (DKFZ). The integration of this algorithm into a standardized and open development environment like MITK enriches accessibility of tractography algorithms for the science community and is an important step towards bringing neuronal tractography closer to a clinical application. The MITK diffusion imaging application, downloadable from www.mitk.org, combines all the steps necessary for a successful tractography: preprocessing, reconstruction of the images, the actual tracking, live monitoring of intermediate results, postprocessing and visualization of the final tracking results. This paper presents typical tracking results and demonstrates the steps for pre- and post-processing of the images.
Lieb, Florian; Stark, Hans-Georg; Thielemann, Christiane
2017-06-01
Spike detection from extracellular recordings is a crucial preprocessing step when analyzing neuronal activity. The decision whether a specific part of the signal is a spike or not is important for all subsequent preprocessing steps, like spike sorting or burst detection, in order to reduce the number of erroneously classified spikes. Many spike detection algorithms have already been suggested, all working reasonably well whenever the signal-to-noise ratio is large enough. When the noise level is high, however, these algorithms have poor performance. In this paper we present two new spike detection algorithms: the first is based on a stationary wavelet energy operator and the second on the time-frequency representation of spikes. Both algorithms are more reliable than the most commonly used methods. The performance of the algorithms is confirmed using simulated data resembling original data recorded from cortical neurons with multielectrode arrays. In order to demonstrate that the performance is not restricted to only one specific set of data, we also verify it using a simulated publicly available data set. We show that both proposed algorithms have the best performance among all tested methods, regardless of the signal-to-noise ratio, in both data sets. This contribution will benefit electrophysiological investigations of human cells; in particular, the spatial and temporal analysis of neural network communication is improved by the proposed spike detection algorithms.
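As a simplified stand-in for the stationary-wavelet energy operator (whose exact form is not given in the abstract), a smoothed Teager energy operator with a robust noise-derived threshold captures the detection pattern:

```python
import numpy as np

def detect_spikes(signal, fs, k=8.0):
    """Smoothed nonlinear (Teager) energy followed by thresholding; k scales
    a robust median-based noise estimate into the detection threshold."""
    x = np.asarray(signal, dtype=float)
    teo = x[1:-1] ** 2 - x[:-2] * x[2:]                  # instantaneous energy
    win = np.hamming(max(3, int(0.001 * fs) | 1))        # ~1 ms smoothing window
    energy = np.convolve(teo, win / win.sum(), mode='same')
    thr = k * np.median(np.abs(energy)) / 0.6745         # robust noise threshold
    return np.flatnonzero(energy > thr) + 1              # sample indices of detections
```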
NASA Astrophysics Data System (ADS)
Li, Wanjing; Schütze, Rainer; Böhler, Martin; Boochs, Frank; Marzani, Franck S.; Voisin, Yvon
2009-06-01
We present an approach to integrate a preprocessing step of region-of-interest (ROI) localization into 3-D scanners (laser or stereoscopic). The ultimate objective is to make the 3-D scanner intelligent enough to rapidly localize, during the preprocessing phase, the regions of the scene with high surface curvature, so that precise scanning is done only in these regions instead of the whole scene. In this way, the scanning time can be largely reduced, and the results contain only pertinent data. To test its feasibility and efficiency, we simulated the preprocessing process with an active stereoscopic system composed of two cameras and a video projector. The ROI localization is done iteratively. First, the video projector projects a regular point pattern onto the scene; the pattern is then modified iteratively according to the local surface curvature of each reconstructed 3-D point. Finally, the last pattern is used to determine the ROI. Our experiments showed that with this approach the system is capable of localizing all types of objects, including small objects with small depth.
Efficient algorithms for a class of partitioning problems
NASA Technical Reports Server (NTRS)
Iqbal, M. Ashraf; Bokhari, Shahid H.
1990-01-01
The problem of optimally partitioning the modules of chain- or tree-like tasks over chain-structured or host-satellite multiple computer systems is addressed. This important class of problems includes many signal processing and industrial control applications. Prior research has resulted in a succession of faster exact and approximate algorithms for these problems. Polynomial exact and approximate algorithms are described for this class that are better than any of the previously reported algorithms. The approach is based on a preprocessing step that condenses the given chain- or tree-structured task into a monotonic chain or tree. The partitioning of this monotonic task can then be carried out using fast search techniques.
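For the chain-on-chain case, the problem shape is easy to state in code: split a chain of module weights into p contiguous groups minimizing the bottleneck load. The cubic dynamic program below ignores communication costs and is far slower than the paper's algorithms; it only illustrates the problem:

```python
from itertools import accumulate

def chain_partition_bottleneck(weights, p):
    """Minimum possible bottleneck when assigning a chain of modules to p
    chain-connected processors, contiguously. O(n^2 * p) illustration."""
    n = len(weights)
    prefix = [0] + list(accumulate(weights))
    INF = float('inf')
    dp = [[INF] * (p + 1) for _ in range(n + 1)]   # dp[i][k]: first i modules, k processors
    dp[0][0] = 0.0
    for i in range(1, n + 1):
        for k in range(1, p + 1):
            for j in range(k - 1, i):              # last processor gets modules j..i-1
                dp[i][k] = min(dp[i][k], max(dp[j][k - 1], prefix[i] - prefix[j]))
    return dp[n][p]
```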
BeadArray Expression Analysis Using Bioconductor
Ritchie, Matthew E.; Dunning, Mark J.; Smith, Mike L.; Shi, Wei; Lynch, Andy G.
2011-01-01
Illumina whole-genome expression BeadArrays are a popular choice in gene profiling studies. Aside from the vendor-provided software tools for analyzing BeadArray expression data (GenomeStudio/BeadStudio), there exists a comprehensive set of open-source analysis tools in the Bioconductor project, many of which have been tailored to exploit the unique properties of this platform. In this article, we explore a number of these software packages and demonstrate how to perform a complete analysis of BeadArray data in various formats. The key steps of importing data, performing quality assessments, preprocessing, and annotation in the common setting of assessing differential expression in designed experiments will be covered. PMID:22144879
Automatic cloud coverage assessment of Formosat-2 image
NASA Astrophysics Data System (ADS)
Hsu, Kuo-Hsien
2011-11-01
The Formosat-2 satellite is equipped with a high-spatial-resolution (2 m ground sampling distance) remote sensing instrument. It has been operated on a daily-revisiting mission orbit by the National Space Organization (NSPO) of Taiwan since May 21, 2004. NSPO also serves as one of the ground receiving stations, processing the received Formosat-2 images daily. The current cloud coverage assessment of Formosat-2 images in the NSPO Image Processing System generally consists of two major steps. First, an unsupervised K-means method automatically estimates the cloud statistics of a Formosat-2 image. Second, cloud coverage is estimated by manual examination. A more accurate Automatic Cloud Coverage Assessment (ACCA) method with a good prediction of the cloud statistics would clearly increase the efficiency of the second step. In this paper, based mainly on the research results of Chang et al., Irish, and Gotoh, we propose a modified Formosat-2 ACCA method comprising pre-processing and post-processing analysis. For the pre-processing analysis, cloud statistics are determined using unsupervised K-means classification, Sobel's method, Otsu's method, non-cloudy pixel reexamination, and a cross-band filter method. A box-counting fractal method is used as a post-processing tool to double-check the pre-processing results and increase the efficiency of manual examination.
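The first (pre-processing) pass can be sketched as follows: unsupervised K-means on pixel intensities with the brightest cluster taken as cloud. The cluster count and brightest-cluster rule are assumptions; the operational chain adds the Sobel/Otsu refinement and the fractal post-check described above:

```python
import numpy as np
from sklearn.cluster import KMeans

def cloud_fraction(image, n_clusters=4):
    """First-pass cloud statistic: K-means on intensity, brightest cluster
    labeled as cloud, returned as the cloudy fraction of the scene."""
    pixels = image.reshape(-1, 1).astype(float)
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(pixels)
    cloud_label = int(np.argmax(km.cluster_centers_.ravel()))
    return float(np.mean(km.labels_ == cloud_label))
```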
A multistage gene normalization system integrating multiple effective methods.
Li, Lishuang; Liu, Shanshan; Li, Lihua; Fan, Wenting; Huang, Degen; Zhou, Huiwei
2013-01-01
Gene/protein recognition and normalization is an important preliminary step for many biological text mining tasks. In this paper, we present a multistage gene normalization system consisting of four major subtasks: pre-processing, dictionary matching, ambiguity resolution and filtering. For the first subtask, we apply the gene mention tagger developed in our earlier work, which achieves an F-score of 88.42% on the BioCreative II GM testing set. In the dictionary matching stage, exact matching and approximate matching between gene names and the EntrezGene lexicon are combined. For the ambiguity resolution subtask, we propose a semantic similarity disambiguation method based on Munkres' Assignment Algorithm. In the last step, a Wikipedia-based filter removes false positives. Experimental results show that the presented system achieves an F-score of 90.1%, outperforming most state-of-the-art systems.
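The ambiguity-resolution step rests on the Munkres (Hungarian) assignment algorithm; given a precomputed mention-candidate semantic similarity matrix (however the system's semantic model produces it), SciPy solves the assignment in one call:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def disambiguate(similarity):
    """One-to-one matching of gene mentions (rows) to candidate gene IDs
    (columns) maximizing total semantic similarity."""
    rows, cols = linear_sum_assignment(-np.asarray(similarity))  # negate to maximize
    return list(zip(rows.tolist(), cols.tolist()))
```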
Sundareshan, Malur K; Bhattacharjee, Supratik; Inampudi, Radhika; Pang, Ho-Yuen
2002-12-10
Computational complexity is a major impediment to the real-time implementation of image restoration and superresolution algorithms in many applications. Although powerful restoration algorithms have been developed within the past few years utilizing sophisticated mathematical machinery (based on statistical optimization and convex set theory), these algorithms are typically iterative in nature and require a sufficient number of iterations to be executed to achieve the desired resolution improvement that may be needed to meaningfully perform postprocessing image exploitation tasks in practice. Additionally, recent technological breakthroughs have facilitated novel sensor designs (focal plane arrays, for instance) that make it possible to capture megapixel imagery data at video frame rates. A major challenge in the processing of these large-format images is to complete the execution of the image processing steps within the frame capture times and to keep up with the output rate of the sensor so that all data captured by the sensor can be efficiently utilized. Consequently, development of novel methods that facilitate real-time implementation of image restoration and superresolution algorithms is of significant practical interest and is the primary focus of this study. The key to designing computationally efficient processing schemes lies in strategically introducing appropriate preprocessing steps together with the superresolution iterations to tailor optimized overall processing sequences for imagery data of specific formats. For substantiating this assertion, three distinct methods for tailoring a preprocessing filter and integrating it with the superresolution processing steps are outlined. These methods consist of a region-of-interest extraction scheme, a background-detail separation procedure, and a scene-derived information extraction step for implementing a set-theoretic restoration of the image that is less demanding in computation compared with the superresolution iterations. A quantitative evaluation of the performance of these algorithms for restoring and superresolving various imagery data captured by diffraction-limited sensing operations is also presented.
NASA Astrophysics Data System (ADS)
Belabbassi, L.; Garzio, L. M.; Smith, M. J.; Knuth, F.; Vardaro, M.; Kerfoot, J.
2016-02-01
The Ocean Observatories Initiative (OOI), funded by the National Science Foundation, provides users with access to long-term datasets from a variety of deployed oceanographic sensors. The Pioneer Array in the Atlantic Ocean off the coast of New England hosts 10 moorings and 6 gliders. Each mooring is outfitted with 6 to 19 different instruments telemetering more than 1000 data streams. These data are available to science users collaborating on common scientific goals such as water quality monitoring and measurement of the scales of variability of continental shelf processes and coastal-open ocean exchanges. To serve this purpose, the acquired datasets undergo an iterative multi-step quality assurance and quality control procedure automated to work with all types of data. Data processing involves several stages, including a fundamental pre-processing step in which the data are prepared for processing. This takes a considerable amount of processing time and is often not given enough thought in development initiatives. The volume and complexity of OOI data necessitate the development of a systematic diagnostic tool to enable the management of a comprehensive data information system for the OOI arrays. We present two examples to demonstrate the current OOI pre-processing diagnostic tool. First, Data Filtering is used to identify incomplete, incorrect, or irrelevant parts of the data and then replaces, modifies or deletes the coarse data; this provides consistency with similar datasets in the system. Second, Data Normalization organizes the database into fields and tables to minimize redundancy and dependency. At the end of this step, the data are stored in one place, reducing the risk of data inconsistency and promoting easy and efficient mapping to the database.
The Neuro Bureau ADHD-200 Preprocessed repository.
Bellec, Pierre; Chu, Carlton; Chouinard-Decorte, François; Benhajali, Yassine; Margulies, Daniel S; Craddock, R Cameron
2017-01-01
In 2011, the "ADHD-200 Global Competition" was held with the aim of identifying biomarkers of attention-deficit/hyperactivity disorder from resting-state functional magnetic resonance imaging (rs-fMRI) and structural MRI (s-MRI) data collected on 973 individuals. Statisticians and computer scientists were potentially the most qualified for the machine learning aspect of the competition, but generally lacked the specialized skills to implement the necessary steps of data preparation for rs-fMRI. Realizing this barrier to entry, the Neuro Bureau prospectively collaborated with all competitors by preprocessing the data and sharing these results at the Neuroimaging Informatics Tools and Resources Clearinghouse (NITRC) (http://www.nitrc.org/frs/?group_id=383). This "ADHD-200 Preprocessed" release included multiple analytical pipelines to cater to different philosophies of data analysis. The processed derivatives included denoised and registered 4D fMRI volumes, regional time series extracted from brain parcellations, maps of 10 intrinsic connectivity networks, fractional amplitude of low frequency fluctuation, and regional homogeneity, along with grey matter density maps. The data was used by several teams who competed in the ADHD-200 Global Competition, including the winning entry by a group of biostaticians. To the best of our knowledge, the ADHD-200 Preprocessed release was the first large public resource of preprocessed resting-state fMRI and structural MRI data, and remains to this day the only resource featuring a battery of alternative processing paths. Copyright © 2016 Elsevier Inc. All rights reserved.
3CCD image segmentation and edge detection based on MATLAB
NASA Astrophysics Data System (ADS)
He, Yong; Pan, Jiazhi; Zhang, Yun
2006-09-01
This research aimed to distinguish weeds from crops at an early stage of field operations using image-processing technology. 3CCD images offer a greater binary-value difference between weed and crop regions than ordinary digital images taken by common cameras. The camera has three channels (green, red, infrared) that capture snapshots of the same area, and the three images can be composed into one, which facilitates the segmentation of different regions. With the image-processing toolkit in MATLAB, the different regions in the image can be segmented clearly. As edge detection is the first and a very important step in image processing, the results of different processing methods were compared. In particular, using the wavelet packet transform toolkit in MATLAB, an image was preprocessed before edge extraction, yielding a more clearly cut edge image. The segmentation methods include operations such as erosion, dilation and other algorithms to preprocess the images. Segmenting different areas in digital images in the field in real time is of great importance for precision farming, saving energy, herbicide and many other materials. At present, large-scale software such as MATLAB on a PC was used, but the computation can be reduced and integrated into a small embedded system, which means that the application of this technique in agricultural engineering is feasible and of great economic value.
NASA Astrophysics Data System (ADS)
Kang, Yubin; Choi, Jaeyoung; Park, Jinju; Kim, Woo-Byoung; Lee, Kun-Jae
2017-09-01
This study attempts to improve the physical and chemical adhesion between metals and ceramics by using electrolytic oxidation and a titanium organic/inorganic complex ion solution on an SS-304 plate. Surface analysis confirmed the existence of Ti-O-Mx bonds formed by bonding between the metal ions and the Ti oxide at the surface of the pre-processed SS plate, and improved chemical adhesion during ceramic coating was expected from the confirmed presence of the carboxylic group. Adhesion was evaluated using the ceramic coating solution in order to assess the improvement for SS plates under the different processing conditions. The results showed that both adhesion and durability were largely improved in the sample processed with all the pre-processing steps, confirming that the physical and chemical adhesion between metals and ceramics can be improved by enhancing the physical roughness via electrolytic oxidation and pre-processing with a Ti complex ion solution.
Metabolomics Reveal Optimal Grain Preprocessing (Milling) toward Rice Koji Fermentation.
Lee, Sunmin; Lee, Da Eun; Singh, Digar; Lee, Choong Hwan
2018-03-21
A time-correlated mass spectrometry (MS)-based metabolic profiling was performed for rice koji made using substrates with varying degrees of milling (DOM). Overall, 67 primary and secondary metabolites were significantly discriminant among the different samples. Notably, higher abundances of carbohydrate-derived (sugars, sugar alcohols, organic acids, and phenolic acids) and lipid-derived (fatty acids and lysophospholipids) metabolites with enhanced hydrolytic enzyme activities were observed for koji made with DOM 5-7 substrates at 36 h. The antioxidant secondary metabolites (flavonoids and phenolic acids) were relatively higher in koji with DOM 0 substrates, followed by DOM 5 > DOM 7 > DOM 9 and 11, at 96 h. Hence, we conjecture that rice substrate preprocessing between DOM 5 and 7 is potentially optimal for koji fermentation, with the end product being rich in distinctive organoleptic, nutritional, and functional metabolites. The study rationalizes the substrate preprocessing steps vital for commercial koji making.
Community detection enhancement using non-negative matrix factorization with graph regularization
NASA Astrophysics Data System (ADS)
Liu, Xiao; Wei, Yi-Ming; Wang, Jian; Wang, Wen-Jun; He, Dong-Xiao; Song, Zhan-Jie
2016-06-01
Community detection is a meaningful task in the analysis of complex networks and has received great attention in various domains. Numerous studies have proposed methods for community detection. A particularly attractive kind is the two-step method, which first preprocesses the network and then identifies its communities. However, not all types of methods achieve satisfactory results with such a preprocessing strategy, among them the non-negative matrix factorization (NMF) methods. In this paper, rather than using the above two-step method as most works did, we propose a graph-regularized model to improve, specifically, the NMF-based methods for community detection, named NMFGR. In NMFGR, we introduce a similarity metric that contains both the global and the local information of networks, to reflect the relationships between two nodes, so as to improve the accuracy of community detection. Experimental results on both artificial and real-world networks demonstrate the superior performance of NMFGR over some competing methods.
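A generic graph-regularized NMF with multiplicative updates (in the style of Cai et al.'s GNMF) conveys the flavor of such a model; the exact NMFGR objective and its global/local similarity metric are in the paper, so treat this as a stand-in:

```python
import numpy as np

def gnmf_communities(A, S, k, lam=1.0, iters=200, eps=1e-9):
    """A: adjacency matrix (n x n); S: node-similarity graph used for
    regularization; k: number of communities. Factorizes A ~ U @ V.T while
    encouraging V to vary smoothly over S (graph Laplacian regularization)."""
    n = A.shape[0]
    D = np.diag(S.sum(axis=1))                    # degree matrix of S
    rng = np.random.default_rng(0)
    U, V = rng.random((n, k)), rng.random((n, k))
    for _ in range(iters):
        U *= (A @ V) / (U @ (V.T @ V) + eps)
        V *= (A.T @ U + lam * (S @ V)) / (V @ (U.T @ U) + lam * (D @ V) + eps)
    return V.argmax(axis=1)                       # community label per node
```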
Optimization of miRNA-seq data preprocessing.
Tam, Shirley; Tsao, Ming-Sound; McPherson, John D
2015-11-01
The past two decades of microRNA (miRNA) research have solidified the role of these small non-coding RNAs as key regulators of many biological processes and promising biomarkers for disease. The concurrent development in high-throughput profiling technology has further advanced our understanding of the impact of their dysregulation on a global scale. Currently, next-generation sequencing is the platform of choice for the discovery and quantification of miRNAs. Despite this, there is no clear consensus on how the data should be preprocessed before conducting downstream analyses. Often overlooked, data preprocessing is an essential step in data analysis: the presence of unreliable features and noise can affect the conclusions drawn from downstream analyses. Using a spike-in dilution study, we evaluated the effects of several general-purpose aligners (BWA, Bowtie, Bowtie 2 and Novoalign) and normalization methods (counts-per-million, total count scaling, upper quartile scaling, Trimmed Mean of M-values (TMM), DESeq, linear regression, cyclic loess and quantile) with respect to the final miRNA count data distribution, variance, bias and accuracy of differential expression analysis. We make practical recommendations on the optimal preprocessing methods for the extraction and interpretation of miRNA count data from small RNA-sequencing experiments. © The Author 2015. Published by Oxford University Press.
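Two of the evaluated normalization methods are simple enough to state directly; a sketch for a features-by-samples count matrix (the handling of zero counts is a simplifying assumption):

```python
import numpy as np

def counts_per_million(counts):
    """CPM scaling of a (miRNAs x samples) count matrix."""
    return counts / counts.sum(axis=0, keepdims=True) * 1e6

def upper_quartile(counts):
    """Upper-quartile scaling: each sample divided by the 75th percentile of
    its nonzero counts, then rescaled by the mean quartile across samples."""
    uq = np.array([np.percentile(c[c > 0], 75) for c in counts.T])
    return counts / uq * uq.mean()
```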
Vellmer, Sebastian; Tonoyan, Aram S; Suter, Dieter; Pronin, Igor N; Maximov, Ivan I
2018-02-01
Diffusion magnetic resonance imaging (dMRI) is a powerful tool in clinical applications, in particular in oncology screening. dMRI has demonstrated its benefit and efficiency in the localisation and detection of different types of human brain tumours. Clinical dMRI data suffer from multiple artefacts such as motion and eddy-current distortions, contamination by noise, outliers, etc. In order to increase the image quality of the derived diffusion scalar metrics and the accuracy of the subsequent data analysis, various pre-processing approaches are actively developed and used. In the present work we assess the effect of different pre-processing procedures, such as noise correction, different smoothing algorithms and spatial interpolation of raw diffusion data, with respect to the accuracy of brain glioma differentiation. As a set of sensitive biomarkers of the glioma malignancy grades we chose the scalar metrics derived from diffusion and kurtosis tensor imaging as well as the neurite orientation dispersion and density imaging (NODDI) biophysical model. Our results show that the application of noise correction, anisotropic diffusion filtering, and cubic-order spline interpolation resulted in the highest sensitivity and specificity for glioma malignancy grading. Thus, these pre-processing steps are recommended for the statistical analysis in brain tumour studies. Copyright © 2017. Published by Elsevier GmbH.
Myers, Owen D; Sumner, Susan J; Li, Shuzhao; Barnes, Stephen; Du, Xiuxia
2017-09-05
XCMS and MZmine 2 are two widely used software packages for preprocessing untargeted LC/MS metabolomics data. Both construct extracted ion chromatograms (EICs) and detect peaks from the EICs, the first two steps in the data preprocessing workflow. While both packages have performed admirably in peak picking, they also detect a problematic number of false positive EIC peaks and can also fail to detect real EIC peaks. The former and latter translate downstream into spurious and missing compounds and present significant limitations with most existing software packages that preprocess untargeted mass spectrometry metabolomics data. We seek to understand the specific reasons why XCMS and MZmine 2 find the false positive EIC peaks that they do and in what ways they fail to detect real compounds. We investigate differences of EIC construction methods in XCMS and MZmine 2 and find several problems in the XCMS centWave peak detection algorithm which we show are partly responsible for the false positive and false negative compound identifications. In addition, we find a problem with MZmine 2's use of centWave. We hope that a detailed understanding of the XCMS and MZmine 2 algorithms will allow users to work with them more effectively and will also help with future algorithmic development.
Preprocessing and meta-classification for brain-computer interfaces.
Hammon, Paul S; de Sa, Virginia R
2007-03-01
A brain-computer interface (BCI) is a system which allows direct translation of brain states into actions, bypassing the usual muscular pathways. A BCI system works by extracting user brain signals, applying machine learning algorithms to classify the user's brain state, and performing a computer-controlled action. Our goal is to improve brain state classification. Perhaps the most obvious way to improve classification performance is the selection of an advanced learning algorithm. However, it is now well known in the BCI community that careful selection of preprocessing steps is crucial to the success of any classification scheme. Furthermore, recent work indicates that combining the output of multiple classifiers (meta-classification) leads to improved classification rates relative to single classifiers (Dornhege et al., 2004). In this paper, we develop an automated approach which systematically analyzes the relative contributions of different preprocessing and meta-classification approaches. We apply this procedure to three data sets drawn from BCI Competition 2003 (Blankertz et al., 2004) and BCI Competition III (Blankertz et al., 2006), each of which exhibit very different characteristics. Our final classification results compare favorably with those from past BCI competitions. Additionally, we analyze the relative contributions of individual preprocessing and meta-classification choices and discuss which types of BCI data benefit most from specific algorithms.
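The meta-classification idea maps directly onto a stacking ensemble; a minimal scikit-learn sketch, where the base estimators (each of which would sit behind its own preprocessing chain) and the combiner are arbitrary choices, not the paper's exact configuration:

```python
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

# X: trials x features (after some preprocessing), y: brain-state labels.
meta = StackingClassifier(
    estimators=[('lda', LinearDiscriminantAnalysis()),
                ('svm', SVC(kernel='rbf', probability=True))],
    final_estimator=LogisticRegression(),   # combines the base-classifier outputs
    stack_method='predict_proba')
# meta.fit(X_train, y_train); accuracy = meta.score(X_test, y_test)
```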
Fast algorithm for wavefront reconstruction in XAO/SCAO with pyramid wavefront sensor
NASA Astrophysics Data System (ADS)
Shatokhina, Iuliia; Obereder, Andreas; Ramlau, Ronny
2014-08-01
We present a fast wavefront reconstruction algorithm developed for an extreme adaptive optics system equipped with a pyramid wavefront sensor on a 42 m telescope. The method is called the Preprocessed Cumulative Reconstructor with domain decomposition (P-CuReD). The algorithm is based on the theoretical relationship between pyramid and Shack-Hartmann wavefront sensor data, and consists of two consecutive steps: a data preprocessing step, followed by an application of the CuReD algorithm, a fast method for wavefront reconstruction from Shack-Hartmann sensor data. The closed-loop simulation results show that the P-CuReD method provides the same reconstruction quality and is significantly faster than an MVM.
Costa, Marcus V C; Carvalho, Joao L A; Berger, Pedro A; Zaghetto, Alexandre; da Rocha, Adson F; Nascimento, Francisco A O
2009-01-01
We present a new preprocessing technique for two-dimensional compression of surface electromyographic (S-EMG) signals, based on correlation sorting. We show that the JPEG2000 coding system (originally designed for compression of still images) and the H.264/AVC encoder (video compression algorithm operating in intraframe mode) can be used for compression of S-EMG signals. We compare the performance of these two off-the-shelf image compression algorithms for S-EMG compression, with and without the proposed preprocessing step. Compression of both isotonic and isometric contraction S-EMG signals is evaluated. The proposed methods were compared with other S-EMG compression algorithms from the literature.
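The preprocessing idea — arrange signal segments as image rows and reorder them so adjacent rows correlate strongly, which 2-D image codecs then exploit — can be sketched greedily; the segment width and the greedy ordering are assumptions:

```python
import numpy as np

def emg_to_sorted_image(emg, width=512):
    """Segment a 1-D S-EMG record into fixed-width rows, then greedily order
    rows so each row best correlates with its predecessor, before handing
    the 2-D array to JPEG2000 or H.264 intra coding."""
    n = (len(emg) // width) * width
    rows = emg[:n].reshape(-1, width).astype(float)
    order, remaining = [0], set(range(1, len(rows)))
    while remaining:
        last = rows[order[-1]]
        nxt = max(remaining, key=lambda i: np.corrcoef(last, rows[i])[0, 1])
        order.append(nxt)
        remaining.remove(nxt)
    return rows[order]
```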
Some practical aspects of lossless and nearly-lossless compression of AVHRR imagery
NASA Technical Reports Server (NTRS)
Hogan, David B.; Miller, Chris X.; Christensen, Than Lee; Moorti, Raj
1994-01-01
Compression of Advanced Very High Resolution Radiometer (AVHRR) imagery operating in a lossless or nearly-lossless mode is evaluated. Several practical issues are analyzed, including: variability of compression over time and among channels, rate-smoothing buffer size, multi-spectral preprocessing of data, day/night handling, and impact on key operational data applications. This analysis is based on a DPCM algorithm employing the Universal Noiseless Coder, which is a candidate for inclusion in many future remote sensing systems. It is shown that compression rates of about 2:1 (daytime) can be achieved with modest buffer sizes (less than or equal to 2.5 Mbytes) and a relatively simple multi-spectral preprocessing step.
Bistatic SAR: Signal Processing and Image Formation.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wahl, Daniel E.; Yocky, David A.
This report describes the significant processing steps that were used to take the raw recorded digitized signals from the bistatic synthetic aperture RADAR (SAR) hardware built for the NCNS Bistatic SAR project to a final bistatic SAR image. In general, the process steps herein are applicable to bistatic SAR signals that include the direct-path signal and the reflected signal. The steps include preprocessing, data extraction to form a phase history, and finally, image formation. Various plots and values are shown at most steps to illustrate the processing for a bistatic COSMO SkyMed collection gathered on June 10, 2013 on Kirtland Air Force Base, New Mexico.
NASA Astrophysics Data System (ADS)
Lieb, Florian; Stark, Hans-Georg; Thielemann, Christiane
2017-06-01
Objective. Spike detection from extracellular recordings is a crucial preprocessing step when analyzing neuronal activity. The decision whether a specific part of the signal is a spike or not affects all subsequent processing steps, such as spike sorting or burst detection, so erroneously identified spikes must be kept to a minimum. Many spike detection algorithms have already been suggested, all working reasonably well whenever the signal-to-noise ratio is large enough. When the noise level is high, however, these algorithms perform poorly. Approach. In this paper we present two new spike detection algorithms: the first is based on a stationary wavelet energy operator and the second on the time-frequency representation of spikes. Both algorithms are more reliable than the most commonly used methods. Main results. The performance of the algorithms is confirmed using simulated data resembling original data recorded from cortical neurons with multielectrode arrays. To demonstrate that the performance is not restricted to one specific set of data, we also verify it on a publicly available simulated data set. We show that both proposed algorithms have the best performance among all tested methods, regardless of the signal-to-noise ratio, in both data sets. Significance. This contribution will benefit electrophysiological investigations of human cells; in particular, the spatial and temporal analysis of neural network communication is improved by the proposed spike detection algorithms.
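For a concrete point of reference, the sketch below implements a smoothed nonlinear (Teager) energy operator with a robust threshold, the kind of commonly used baseline such papers benchmark against. It is not the authors' stationary wavelet energy operator, and the constants are illustrative.

```python
# Baseline spike detector: smoothed Teager energy operator plus a
# median-based robust threshold (a stand-in, not the paper's method).
import numpy as np

def detect_spikes(x, fs, k=8.0, smooth_ms=1.0):
    x = np.asarray(x, dtype=float)
    neo = np.empty_like(x)
    neo[1:-1] = x[1:-1] ** 2 - x[:-2] * x[2:]  # Teager energy operator
    neo[0], neo[-1] = neo[1], neo[-2]
    win = max(1, int(fs * smooth_ms / 1000.0))
    neo = np.convolve(neo, np.ones(win) / win, mode="same")
    thr = k * np.median(np.abs(neo)) / 0.6745  # robust noise estimate
    return np.flatnonzero(neo > thr)           # sample indices above threshold
```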
Houshyarifar, Vahid; Chehel Amirani, Mehdi
2016-08-12
In this paper we present a method to predict Sudden Cardiac Arrest (SCA) using higher order spectral (HOS) and linear time-domain features extracted from the heart rate variability (HRV) signal. Predicting the occurrence of SCA is important in order to reduce the probability of Sudden Cardiac Death (SCD). This work attempts prediction five minutes before SCA onset. The method consists of four steps: pre-processing, feature extraction, feature reduction, and classification. In the first step, the QRS complexes are detected from the electrocardiogram (ECG) signal and the HRV signal is extracted. In the second step, bispectrum features of the HRV signal and time-domain features are obtained: six features are extracted from the bispectrum and two from the time domain. In the next step, these features are reduced to one feature by the linear discriminant analysis (LDA) technique. Finally, KNN and support vector machine-based classifiers are used to classify the HRV signals. We used two databases: the MIT/BIH Sudden Cardiac Death (SCD) Database and the PhysioBank Normal Sinus Rhythm (NSR) Database. In this work we achieved prediction of SCD occurrence six minutes before the SCA with an accuracy of over 91%.
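A skeletal version of the reported classification chain, LDA reduction to a single feature followed by KNN, might look as follows. The feature-matrix layout and hyperparameters are assumptions, and the HRV feature extraction itself is omitted.

```python
# Sketch of the reported pipeline: features -> LDA reduction to one
# dimension -> KNN classification (feature extraction omitted).
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

# Hypothetical inputs: X is an (n_segments, 8) matrix holding six bispectral
# and two time-domain HRV features; y holds binary labels
# (1 = pre-SCA segment, 0 = normal sinus rhythm).
def build_classifier():
    return make_pipeline(
        LinearDiscriminantAnalysis(n_components=1),  # reduce 8 features to 1
        KNeighborsClassifier(n_neighbors=3),
    )

# clf = build_classifier().fit(X_train, y_train)
# accuracy = clf.score(X_test, y_test)
```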
A special purpose knowledge-based face localization method
NASA Astrophysics Data System (ADS)
Hassanat, Ahmad; Jassim, Sabah
2008-04-01
This paper is concerned with face localization for a visual speech recognition (VSR) system. Face detection and localization have received a great deal of attention in the last few years, because they are an essential pre-processing step in many techniques that deal with faces (e.g. age, face, gender, race and visual speech recognition). We present an efficient method for localizing human faces in video images captured on resource-constrained mobile devices, under a wide variation in lighting conditions. We use a multiphase method that may include all or some of the following steps, starting with image pre-processing, followed by special-purpose edge detection and an image refinement step. The output image is then passed through a discrete wavelet decomposition, and the computed LL sub-band at a certain level is transformed into a binary image that is scanned using a special template to select a number of candidate locations. Finally, we fuse the scores from the wavelet step with scores determined by color information for the candidate locations and employ a form of fuzzy logic to distinguish face from non-face locations. We present results of a large number of experiments demonstrating that the proposed face localization method is efficient and achieves a high level of accuracy, outperforming existing general-purpose face detection methods.
An automatic segmentation method of a parameter-adaptive PCNN for medical images.
Lian, Jing; Shi, Bin; Li, Mingcong; Nan, Ziwei; Ma, Yide
2017-09-01
Since the pre-processing and initial segmentation steps in medical images directly affect the final segmentation results of the regions of interest, an automatic segmentation method based on a parameter-adaptive pulse-coupled neural network is proposed to integrate these two segmentation steps into one. The method has a low computational complexity for different kinds of medical images and a high segmentation precision. It comprises four steps. Firstly, an optimal histogram threshold is used to determine the parameter [Formula: see text] for different kinds of images. Secondly, we acquire the parameter [Formula: see text] according to a simplified pulse-coupled neural network (SPCNN). Thirdly, we redefine the parameter V of the SPCNN model by the sub-intensity distribution range of firing pixels. Fourthly, we add an offset [Formula: see text] to improve the initial segmentation precision. Compared with state-of-the-art algorithms, the new method achieves comparable performance in experiments on ultrasound images of the gallbladder and gallstones, magnetic resonance images of the left ventricle, and mammogram images of the left and right breast, with overall metrics UM of 0.9845, CM of 0.8142, and TM of 0.0726. The algorithm has great potential for achieving the pre-processing and initial segmentation steps in various medical images, which is a premise for assisting physicians to detect and diagnose clinical cases.
Nucleus segmentation in histology images with hierarchical multilevel thresholding
NASA Astrophysics Data System (ADS)
Ahmady Phoulady, Hady; Goldgof, Dmitry B.; Hall, Lawrence O.; Mouton, Peter R.
2016-03-01
Automatic segmentation of histological images is an important step for increasing throughput while maintaining high accuracy, avoiding variation from subjective bias, and reducing the costs for diagnosing human illnesses such as cancer and Alzheimer's disease. In this paper, we present a novel method for unsupervised segmentation of cell nuclei in stained histology tissue. Following an initial preprocessing step involving color deconvolution and image reconstruction, the segmentation step consists of multilevel thresholding and a series of morphological operations. The only parameter required for the method is the minimum region size, which is set according to the resolution of the image. Hence, the proposed method requires no training sets or parameter learning. Because the algorithm requires no assumptions or a priori information with regard to cell morphology, the automatic approach is generalizable across a wide range of tissues. Evaluation across a dataset consisting of diverse tissues, including breast, liver, gastric mucosa and bone marrow, shows superior performance over four other recent methods on the same dataset in terms of F-measure with precision and recall of 0.929 and 0.886, respectively.
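A rough sketch of the thresholding-plus-morphology stage is shown below using scikit-image's multilevel Otsu. The structuring-element sizes and the assumption that nuclei fall in the darkest intensity class are illustrative, and the color deconvolution preprocessing is omitted.

```python
# Sketch of multilevel-thresholding nucleus segmentation: multilevel Otsu,
# morphological clean-up, small-region removal (parameters illustrative).
from skimage.filters import threshold_multiotsu
from skimage.morphology import (binary_opening, binary_closing, disk,
                                remove_small_objects)

def segment_nuclei(gray, min_region_size=50):
    thresholds = threshold_multiotsu(gray, classes=3)
    nuclei = gray < thresholds[0]            # assume darkest class = nuclei
    nuclei = binary_closing(binary_opening(nuclei, disk(2)), disk(2))
    # min_region_size plays the role of the method's single parameter,
    # set according to image resolution
    return remove_small_objects(nuclei, min_size=min_region_size)
```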
Publication Of Oceanographic Data on CD-ROM
NASA Technical Reports Server (NTRS)
Hilland, Jeffrey E.; Smith, Elizabeth A.; Martin, Michael D.
1992-01-01
Large collections of oceanographic data and other large data collections are published on CD-ROMs in formats facilitating access and analysis. The process involves four major steps: preprocessing, premastering, mastering, and verification. The large capacity, small size, commercial availability, long life, and standard format of CD-ROMs offer advantages over computer-compatible magnetic tape.
Interpolation algorithm for asynchronous ADC-data
NASA Astrophysics Data System (ADS)
Bramburger, Stefan; Zinke, Benny; Killat, Dirk
2017-09-01
This paper presents a modified interpolation algorithm for signals with variable data rate from asynchronous ADCs. The Adaptive weights Conjugate gradient Toeplitz matrix (ACT) algorithm is extended to operate with a continuous data stream. Additional preprocessing of data segments with constant and linear sections, together with a weighted overlap of signals transformed step-by-step into the spectral domain, improves the reconstruction of the asynchronous ADC signal. The interpolation method can be used when asynchronous ADC data is fed into synchronous digital signal processing.
NASA Astrophysics Data System (ADS)
Kravchenko, Alexandra; Negassa, Wakene; Guber, Andrey; Schmidt, Sonja
2014-05-01
Particulate soil organic matter (POM) is a biologically and chemically active fraction of soil organic matter. It is a source of many agricultural and ecological benefits, among which is POM's contribution to C sequestration. Most conventional research methods for studying organic matter dynamics involve measurements conducted on pre-processed, i.e., ground and sieved, soil samples. Unfortunately, grinding and sieving completely destroy soil structure, the component crucial for soil functioning and C protection. The importance of a better understanding of the role of soil structure, and of the physical protection it provides to soil C, cannot be overstated; analysis of the quantities, characteristics, and decomposition rates of POM in soil samples with intact structure is among the key elements of gaining such understanding. However, a marked difficulty hindering progress in such analyses is the lack of tools for identification and quantitative analysis of POM in intact soil samples. Recent advances in applications of X-ray computed micro-tomography (μ-CT) to soil science have made such analyses possible. The objective of the current study is to develop a procedure for identification and quantitative characterization of POM within intact soil samples using X-ray μ-CT images, and to test the performance of the proposed procedure on a set of intact soil macro-aggregates. We used sixteen 4-6 mm soil aggregates collected at 0-15 cm depth from a Typic Hapludalf soil at multiple field sites with diverse agricultural management histories. The aggregates were scanned at the SIMBIOS Centre, Dundee, Scotland, at 10 micron resolution. POM was determined from the aggregate images using the developed procedure, which combines image pre-processing steps with discriminant analysis classification. The pre-processing steps are based on the range of gray values (GV) along with the shape and size of POM pieces; this is followed by discriminant analysis conducted using statistical and geostatistical characteristics of the POM pieces. POM identified in the intact individual soil aggregates using the proposed procedure was in good agreement with POM measured in the studied aggregates using a conventional lab method (R2 = 0.75). Of particular importance for accurate identification of POM in the images was the information on the spatial characteristics of POM's GVs. Since this is the first attempt at POM determination, future work will be needed to explore how the proposed procedure performs under a variety of potentially influential factors, such as POM's origin and decomposition stage, X-ray scanning settings, and image filtering and segmentation methods.
Computing Fourier integral operators with caustics
NASA Astrophysics Data System (ADS)
Caday, Peter
2016-12-01
Fourier integral operators (FIOs) have widespread applications in imaging, inverse problems, and PDEs. An implementation of a generic algorithm for computing FIOs associated with canonical graphs is presented, based on a recent paper of de Hoop et al. Given the canonical transformation and principal symbol of the operator, a preprocessing step reduces application of an FIO approximately to multiplications, pushforwards, and forward and inverse discrete Fourier transforms, which can be computed in O(N^{n+(n-1)/2} log N) time for an n-dimensional FIO. The same preprocessed data also allow computation of the inverse and transpose of the FIO, with identical runtime. Examples demonstrate the algorithm's output, and easily extendible MATLAB/C++ source code is available from the author.
The properties of the lunar regolith at Chang'e-3 landing site: A study based on LPR data
NASA Astrophysics Data System (ADS)
Feng, J.; Su, Y.; Xing, S.; Ding, C.; Li, C.
2015-12-01
In situ sampling of the surface is difficult in the exploration of planets, and radar sensing is sometimes a better choice. The properties of the surface material such as permittivity, density and depth can be obtained by a surface penetrating radar. Chang'e-3 (CE-3) landed in the northern Mare Imbrium, and a Lunar Penetrating Radar (LPR) is carried on the Yutu rover to detect the shallow structure of the lunar crust and the properties of the lunar regolith, giving us a close look at the lunar subsurface. We process the radar data in two steps: a regular preprocessing step and a migration step. The preprocessing part includes zero-time correction, de-wow, gain compensation, DC removal, and geometric positioning. We then combine all radar data obtained while the rover was moving, and use an FIR filter with a pass band of 200-600 MHz to reduce the noise in the radar image. A normal radar image is obtained after the preprocessing step. Using a nonlinear least squares fitting method, we fit the hyperbolas in the radar image, which are caused by buried objects or rocks in the regolith, and estimate the EM wave propagation velocity and the permittivity of the regolith. Because there is a fixed mathematical relationship between dielectric constant and density, the density profile of the lunar regolith is also calculated. The permittivity and density at the landing site appear to be larger than previously thought. Finally, with a model of variable velocities, we apply the Kirchhoff migration method widely used in seismology to transform the unfocused space-time LPR image into a focused one showing the objects' (mostly stones) true locations and sizes. From the migrated image, we find that the regolith depth at the landing site is smaller than in previous studies and that the stone content rises rapidly with depth. Our study suggests that the landing site is a young region and that the reworked history of the surface is short, which is consistent with the crater density, showing the gradual formation of regolith by rock fracture during impact events.
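The velocity and permittivity estimation step can be sketched with a standard point-scatterer hyperbola fit. The function below uses the usual GPR two-way travel-time relation and scipy's least-squares fitter, with illustrative initial guesses rather than the authors' settings.

```python
# Minimal sketch of GPR velocity estimation: fit the travel-time hyperbola
# of a buried point scatterer, then convert wave speed to permittivity.
import numpy as np
from scipy.optimize import curve_fit

C = 0.2998  # speed of light in vacuum, m/ns

def hyperbola(x, x0, d, v):
    # two-way travel time for a scatterer at (x0, depth d), wave speed v
    return 2.0 * np.sqrt((x - x0) ** 2 + d ** 2) / v

def fit_regolith(x_positions, travel_times):
    p0 = (np.mean(x_positions), 1.0, 0.1)    # guesses: x0, d (m), v (m/ns)
    (x0, d, v), _ = curve_fit(hyperbola, x_positions, travel_times, p0=p0)
    eps_r = (C / v) ** 2                     # relative permittivity from velocity
    return v, eps_r, d
```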
Resting-state functional magnetic resonance imaging: the impact of regression analysis.
Yeh, Chia-Jung; Tseng, Yu-Sheng; Lin, Yi-Ru; Tsai, Shang-Yueh; Huang, Teng-Yi
2015-01-01
To investigate the impact of regression methods on resting-state functional magnetic resonance imaging (rsfMRI). During rsfMRI preprocessing, regression analysis is considered effective for reducing the interference of physiological noise on the signal time course. However, it is unclear whether the regression method benefits rsfMRI analysis. Twenty volunteers (10 men and 10 women; aged 23.4 ± 1.5 years) participated in the experiments. We used node analysis and functional connectivity mapping to assess the brain default mode network by using five combinations of regression methods. The results show that regressing the global mean plays a major role in the preprocessing steps. When a global regression method is applied, the values of functional connectivity are significantly lower (P ≤ .01) than those calculated without a global regression. This step increases inter-subject variation and produces anticorrelated brain areas. rsfMRI data processed using regression should be interpreted carefully. The significance of the anticorrelated brain areas produced by global signal removal is unclear. Copyright © 2014 by the American Society of Neuroimaging.
3D Scanning of Live Pigs System and its Application in Body Measurements
NASA Astrophysics Data System (ADS)
Guo, H.; Wang, K.; Su, W.; Zhu, D. H.; Liu, W. L.; Xing, Ch.; Chen, Z. R.
2017-09-01
The shape of a live pig is an important indicator of its health and value, whether for breeding or for carcass quality. This paper implements a prototype system for 3D surface scanning of a single live pig based on two consumer depth cameras, utilizing 3D point cloud data. The cameras are calibrated in advance to share a common coordinate system. The live 3D point cloud stream of a moving pig is obtained by two Xtion Pro Live sensors from different viewpoints simultaneously. A novel detection method is proposed and applied to automatically detect the frames containing pigs with the correct posture from the point cloud stream, according to the geometric characteristics of the pig's shape. The proposed method is incorporated in a hybrid scheme that serves as the preprocessing step in a body measurement framework for pigs. Experimental results show the portability of our scanning system and the effectiveness of our detection method. Furthermore, the point cloud preprocessing software for livestock body measurements can be downloaded freely by the livestock industry and research community from https://github.com/LiveStockShapeAnalysis and can be used for monitoring livestock growth status.
MRI Segmentation of the Human Brain: Challenges, Methods, and Applications
Despotović, Ivana
2015-01-01
Image segmentation is one of the most important tasks in medical image analysis and is often the first and the most critical step in many clinical applications. In brain MRI analysis, image segmentation is commonly used for measuring and visualizing the brain's anatomical structures, for analyzing brain changes, for delineating pathological regions, and for surgical planning and image-guided interventions. In the last few decades, various segmentation techniques of different accuracy and degree of complexity have been developed and reported in the literature. In this paper we review the most popular methods commonly used for brain MRI segmentation. We highlight differences between them and discuss their capabilities, advantages, and limitations. To address the complexity and challenges of the brain MRI segmentation problem, we first introduce the basic concepts of image segmentation. Then, we explain different MRI preprocessing steps including image registration, bias field correction, and removal of nonbrain tissue. Finally, after reviewing different brain MRI segmentation methods, we discuss the validation problem in brain MRI segmentation. PMID:25945121
Linear model for fast background subtraction in oligonucleotide microarrays.
Kroll, K Myriam; Barkema, Gerard T; Carlon, Enrico
2009-11-16
One important preprocessing step in the analysis of microarray data is background subtraction. In high-density oligonucleotide arrays this is recognized as a crucial step for the global performance of the data analysis from raw intensities to expression values. We propose here an algorithm for background estimation based on a model in which the cost function is quadratic in a set of fitting parameters such that minimization can be performed through linear algebra. The model incorporates two effects: 1) Correlated intensities between neighboring features in the chip and 2) sequence-dependent affinities for non-specific hybridization fitted by an extended nearest-neighbor model. The algorithm has been tested on 360 GeneChips from publicly available data of recent expression experiments. The algorithm is fast and accurate. Strong correlations between the fitted values for different experiments as well as between the free-energy parameters and their counterparts in aqueous solution indicate that the model captures a significant part of the underlying physical chemistry.
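The computational point, that a cost quadratic in the fitting parameters reduces minimization to linear algebra, can be shown generically. The actual design matrix encoding neighbor correlations and nearest-neighbor sequence affinities is specific to the paper and not reproduced here.

```python
# Generic illustration: a quadratic cost is minimized by solving the
# normal equations, i.e., one linear system, rather than by iteration.
import numpy as np

def fit_quadratic_cost(A, y, ridge=1e-6):
    # minimize ||A @ theta - y||^2 + ridge * ||theta||^2
    n = A.shape[1]
    theta = np.linalg.solve(A.T @ A + ridge * np.eye(n), A.T @ y)
    return theta
```

This is what makes the background estimation fast: for a fixed model, every chip is processed by one matrix factorization instead of a nonlinear optimization.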
Computer-aided diagnostic detection system of venous beading in retinal images
NASA Astrophysics Data System (ADS)
Yang, Ching-Wen; Ma, DyeJyun; Chao, ShuennChing; Wang, ChuinMu; Wen, Chia-Hsien; Lo, ChienShun; Chung, Pau-Choo; Chang, Chein-I.
2000-05-01
The detection of venous beading in retinal images provides an early sign of diabetic retinopathy and plays an important role as a preprocessing step in diagnosing ocular diseases. We present a computer-aided diagnostic system to automatically detect venous beading of blood vessels. It comprises two modules, referred to as the blood vessel extraction module and the venous beading detection module. The former uses a bell-shaped Gaussian kernel with 12 azimuths to extract blood vessels, while the latter applies a neural network-based shape cognitron to detect venous beading among the extracted blood vessels for diagnosis. Both modules are fully computer-automated. To evaluate the proposed system, 61 retinal images (32 beaded and 29 normal images) are used for performance evaluation.
Learning-Based Just-Noticeable-Quantization- Distortion Modeling for Perceptual Video Coding.
Ki, Sehwan; Bae, Sung-Ho; Kim, Munchurl; Ko, Hyunsuk
2018-07-01
Conventional predictive video coding-based approaches are reaching the limit of their potential coding efficiency improvements because of severely increasing computation complexity. As an alternative approach, perceptual video coding (PVC) has attempted to achieve high coding efficiency by eliminating perceptual redundancy, using just-noticeable-distortion (JND) directed PVC. Previous JNDs were modeled by adding white Gaussian noise or specific signal patterns into the original images, which is not appropriate for finding JND thresholds due to distortion with energy reduction. In this paper, we present a novel discrete cosine transform-based energy-reduced JND model, called ERJND, that is more suitable for JND-based PVC schemes. The proposed ERJND model is then extended to two learning-based just-noticeable-quantization-distortion (JNQD) models that can be applied as preprocessing for perceptual video coding. The two JNQD models can automatically adjust JND levels based on given quantization step sizes. The first, called LR-JNQD, is based on linear regression and determines the model parameters for JNQD from extracted handcrafted features. The other is based on a convolutional neural network (CNN) and is called CNN-JNQD. To the best of our knowledge, this is the first approach to automatically adjust JND levels according to quantization step sizes for preprocessing the input to video encoders. In experiments, both the LR-JNQD and CNN-JNQD models were applied to high efficiency video coding (HEVC) and yielded maximum (average) bitrate reductions of 38.51% (10.38%) and 67.88% (24.91%), respectively, with little subjective video quality degradation, compared with the input without preprocessing applied.
Demonstration of angular anisotropy in the output of Thematic Mapper
NASA Technical Reports Server (NTRS)
Duggin, M. J. (Principal Investigator); Lindsay, J.; Piwinski, D. J.; Schoch, L. B.
1984-01-01
There is a dependence of TM output (proportional to scene radiance in a manner which will be discussed) upon season, cover type, and view angle. The existence of a significant systematic variation across uniform scenes in p-type (radiometrically and geometrically pre-processed) data is demonstrated. Present pre-processing does not remove these effects, and the problem must be addressed because the effects are large. While this is in no way attributable to any shortcomings in the Thematic Mapper, the effect is sufficiently important to warrant more study, with a view to developing suitable pre-processing correction algorithms.
Texture Feature Extraction and Classification for Iris Diagnosis
NASA Astrophysics Data System (ADS)
Ma, Lin; Li, Naimin
Applying computer-aided techniques in iris image processing, and combining occidental iridology with traditional Chinese medicine, is a challenging research area in digital image processing and artificial intelligence. This paper proposes an iridology model that consists of iris image pre-processing, texture feature analysis, and disease classification. For the pre-processing, a 2-step iris localization approach is proposed; a 2-D Gabor filter based texture analysis and a texture fractal dimension estimation method are proposed for pathological feature extraction; and finally, support vector machines are constructed to recognize two typical diseases, alimentary canal disease and nervous system disease. Experimental results show that the proposed iridology diagnosis model is quite effective and promising for medical diagnosis and health surveillance for both hospital and public use.
Nondestructive Evaluation of Hardwood Logs Using Automated Interpretation of CT Images
Daniel L. Schmoldt; Dongping Zhu; Richard W. Conners
1993-01-01
Computed tomography (CT) imaging is being used to examine the internal structure of hardwood logs. The following steps are used to automatically interpret CT images: (1) preprocessing to remove unwanted portions of the image, e.g., annual ring structure, (2) image-by-image segmentation to produce relatively homogeneous image areas, (3) volume growing to create volumes...
Wang, Shenghao; Zhang, Yuyan; Cao, Fuyi; Pei, Zhenying; Gao, Xuewei; Zhang, Xu; Zhao, Yong
2018-02-13
This paper presents a novel spectrum analysis tool named synergy adaptive moving window modeling based on the immune clone algorithm (SA-MWM-ICA), addressing the tedious and inconvenient labor involved in selecting pre-processing methods and spectral variables by prior experience. In this work, the immune clone algorithm is introduced into the spectrum analysis field for the first time as a new optimization strategy, covering the shortcomings of the traditional methods. Based on the working principle of the human immune system, the performance of the quantitative model is regarded as the antigen, and a special vector corresponding to this antigen is regarded as an antibody. The antibody contains a pre-processing method optimization region encoded by 11 decimal digits, and a spectrum variable optimization region formed by moving windows with changeable width and position. A set of original antibodies is created by modeling with this algorithm. After calculating the affinity of these antibodies, those with high affinity are selected for cloning; the rule for cloning is that the higher the affinity, the more copies are made. In the next step, another important operation named hyper-mutation is applied to the antibodies after cloning; the rule for hyper-mutation is that the lower the affinity, the higher the mutation probability. Several antibodies with high affinity are created through these steps. A simulated dataset, a gasoline near-infrared spectra dataset, and a soil near-infrared spectra dataset are employed to verify and illustrate the performance of SA-MWM-ICA. Analysis results show that the quantitative models produced by SA-MWM-ICA perform better, especially for relatively complex spectra, than traditional models such as partial least squares (PLS), moving window PLS (MWPLS), genetic algorithm PLS (GAPLS), and pretreatment method classification and adjustable parameter changeable size moving window PLS (CA-CSMWPLS). The selected pre-processing methods and spectrum variables are easily interpreted. The proposed method converges in few generations and can be used not only for near-infrared spectroscopy analysis but also for other similar spectral analyses, such as infrared spectroscopy. Copyright © 2017 Elsevier B.V. All rights reserved.
Robust power spectral estimation for EEG data.
Melman, Tamar; Victor, Jonathan D
2016-08-01
Typical electroencephalogram (EEG) recordings often contain substantial artifact. These artifacts, often large and intermittent, can interfere with quantification of the EEG via its power spectrum. To reduce the impact of artifact, EEG records are typically cleaned by a preprocessing stage that removes individual segments or components of the recording. However, such preprocessing can introduce bias, discard available signal, and be labor-intensive. With this motivation, we present a method that uses robust statistics to reduce dependence on preprocessing by minimizing the effect of large intermittent outliers on the spectral estimates. Using the multitaper method (Thomson, 1982) as a starting point, we replaced the final step of the standard power spectrum calculation with a quantile-based estimator, and the Jackknife approach to confidence intervals with a Bayesian approach. The method is implemented in provided MATLAB modules, which extend the widely used Chronux toolbox. Using both simulated and human data, we show that in the presence of large intermittent outliers, the robust method produces improved estimates of the power spectrum, and that the Bayesian confidence intervals yield close-to-veridical coverage factors. The robust method, as compared to the standard method, is less affected by artifact: inclusion of outliers produces fewer changes in the shape of the power spectrum as well as in the coverage factor. In the presence of large intermittent outliers, the robust method can reduce dependence on data preprocessing as compared to standard methods of spectral estimation. Copyright © 2016 Elsevier B.V. All rights reserved.
Sentiment topic mining based on comment tags
NASA Astrophysics Data System (ADS)
Zhang, Daohai; Liu, Xue; Li, Juan; Fan, Mingyue
2018-03-01
With the development of e-commerce, various tag-based comments are generated; how to extract valuable information from these comment tags has become an important input to business management decisions. This study takes HUAWEI mobile phone tags as an example, using sentiment analysis and the LDA topic mining method. The first step is data preprocessing and topic mining of the comment tags. Sentiment classification is then performed on the comment tags. Finally, the comments are mined again to analyze the topic distribution under each sentiment class. The results show that the HUAWEI mobile phone has a good user experience in terms of fluency, cost performance, appearance, etc. Meanwhile, the company should pay more attention to independent research and development, and to product design and development. In addition, battery and speed performance should be enhanced.
Cherry recognition in natural environment based on the vision of picking robot
NASA Astrophysics Data System (ADS)
Zhang, Qirong; Chen, Shanxiong; Yu, Tingzhong; Wang, Yan
2017-04-01
In order to realize the automatic recognition of cherries in the natural environment, this paper designs a recognition method for a picking-robot vision system. The first step of this method is to pre-process the cherry image by median filtering. The second step is to identify the colour of the cherry through the 0.9R-G colour difference formula, and then use the Otsu algorithm for threshold segmentation. The third step is to remove noise by using an area threshold. The fourth step is to remove the holes in the cherry image by morphological closing and opening operations. The fifth step is to obtain the centroid and contour of the cherry by using the smallest external rectangle and the Hough transform. Through this recognition process, we can successfully identify 96% of the cherries without occlusion or adhesion.
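The five steps translate almost directly into OpenCV calls. In the sketch below, the kernel sizes, area threshold, and normalization are illustrative assumptions rather than the paper's exact parameters, and the Hough-transform refinement is reduced to contour centroids.

```python
# Step-by-step sketch of the described cherry recognition pipeline
# (parameter values are illustrative, not the paper's).
import cv2
import numpy as np

def recognize_cherries(bgr, min_area=200):
    img = cv2.medianBlur(bgr, 5)                         # step 1: median filter
    b, g, r = cv2.split(img.astype(np.float32))
    diff = cv2.normalize(0.9 * r - g, None, 0, 255,      # step 2: 0.9R - G
                         cv2.NORM_MINMAX).astype(np.uint8)
    _, mask = cv2.threshold(diff, 0, 255,
                            cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    n, labels, stats, _ = cv2.connectedComponentsWithStats(mask)
    for i in range(1, n):                                # step 3: area threshold
        if stats[i, cv2.CC_STAT_AREA] < min_area:
            mask[labels == i] = 0
    k = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (7, 7))
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, k)    # step 4: fill holes,
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, k)     # remove specks
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    centroids = [np.mean(c.reshape(-1, 2), axis=0)
                 for c in contours]                      # step 5: centroid/contour
    return mask, contours, centroids
```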
NASA Astrophysics Data System (ADS)
Matsumoto, H.; Haralabus, G.; Zampolli, M.; Özel, N. M.
2016-12-01
Underwater acoustic signal waveforms recorded during the 2015 Chile earthquake (Mw 8.3) by the hydrophones of hydroacoustic station HA03, located at the Juan Fernandez Islands, are analyzed. HA03 is part of the Comprehensive Nuclear-Test-Ban Treaty International Monitoring System. The interest in the particular data set stems from the fact that HA03 is located only approximately 700 km SW from the epicenter of the earthquake. This makes it possible to study aspects of the signal associated with the tsunamigenic earthquake, which would be more difficult to detect had the hydrophones been located far from the source. The analysis shows that the direction of arrival of the T phase can be estimated by means of a three-step preprocessing technique which circumvents spatial aliasing caused by the hydrophone spacing, the latter being large compared to the wavelength. Following this preprocessing step, standard frequency-wave number analysis (F-K analysis) can accurately estimate back azimuth and slowness of T-phase signals. The data analysis also shows that the dispersive tsunami signals can be identified by the water-column hydrophones at the time when the tsunami surface gravity wave reaches the station.
A Computer-Aided Type-II Fuzzy Image Processing for Diagnosis of Meniscus Tear.
Zarandi, M H Fazel; Khadangi, A; Karimi, F; Turksen, I B
2016-12-01
Meniscal tear is one of the prevalent knee disorders among young athletes and the aging population, and requires correct diagnosis and surgical intervention, if necessary. Both the errors that accompany human intervention and the difficulty of manual meniscal tear detection highlight the need for automatic detection techniques. This paper presents a type-2 fuzzy expert system for meniscal tear diagnosis using PD magnetic resonance images (MRI). The proposed type-2 fuzzy image processing model is composed of three distinct modules: pre-processing, segmentation, and classification. A λ-enhancement algorithm is used to perform the pre-processing step. For the segmentation step, Interval Type-2 Fuzzy C-Means (IT2FCM) is first applied to the images, the outputs of which are then employed by Interval Type-2 Possibilistic C-Means (IT2PCM) for post-processing. The second stage concludes with re-estimation of the η value to enhance IT2PCM. Finally, a perceptron neural network with two hidden layers is used for the classification stage. The results of the proposed type-2 expert system have been compared with a well-known segmentation algorithm, confirming the superiority of the proposed system in meniscal tear recognition.
Tissue-aware RNA-Seq processing and normalization for heterogeneous and sparse data.
Paulson, Joseph N; Chen, Cho-Yi; Lopes-Ramos, Camila M; Kuijjer, Marieke L; Platig, John; Sonawane, Abhijeet R; Fagny, Maud; Glass, Kimberly; Quackenbush, John
2017-10-03
Although ultrahigh-throughput RNA-Sequencing has become the dominant technology for genome-wide transcriptional profiling, the vast majority of RNA-Seq studies typically profile only tens of samples, and most analytical pipelines are optimized for these smaller studies. However, projects are generating ever-larger data sets comprising RNA-Seq data from hundreds or thousands of samples, often collected at multiple centers and from diverse tissues. These complex data sets present significant analytical challenges due to batch and tissue effects, but provide the opportunity to revisit the assumptions and methods that we use to preprocess, normalize, and filter RNA-Seq data - critical first steps for any subsequent analysis. We find that analysis of large RNA-Seq data sets requires both careful quality control and the need to account for sparsity due to the heterogeneity intrinsic in multi-group studies. We developed the Yet Another RNA Normalization software pipeline (YARN), which includes quality control and preprocessing, gene filtering, and normalization steps designed to facilitate downstream analysis of large, heterogeneous RNA-Seq data sets, and we demonstrate its use with data from the Genotype-Tissue Expression (GTEx) project. An R package instantiating YARN is available at http://bioconductor.org/packages/yarn.
Salas-Gonzalez, D; Górriz, J M; Ramírez, J; Padilla, P; Illán, I A
2013-01-01
A procedure to improve the convergence rate for affine registration methods of medical brain images when the images differ greatly from the template is presented. The methodology is based on a histogram matching of the source images with respect to the reference brain template before proceeding with the affine registration. The preprocessed source brain images are spatially normalized to a template using a general affine model with 12 parameters. A sum of squared differences between the source images and the template is considered as objective function, and a Gauss-Newton optimization algorithm is used to find the minimum of the cost function. Using histogram equalization as a preprocessing step improves the convergence rate in the affine registration algorithm of brain images as we show in this work using SPECT and PET brain images.
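The preprocessing idea can be sketched in a few lines with scikit-image; here match_histograms stands in for the paper's histogram matching step, applied before the Gauss-Newton affine registration.

```python
# Illustrative preprocessing in the spirit described above: match the
# source image's intensity histogram to the template before registration.
from skimage.exposure import match_histograms

def preprocess_for_registration(source, template):
    # Bringing the source intensity distribution close to the template's
    # makes the SSD cost surface better behaved, which is what improves
    # Gauss-Newton convergence in the affine registration.
    return match_histograms(source, template)
```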
Classification of product inspection items using nonlinear features
NASA Astrophysics Data System (ADS)
Talukder, Ashit; Casasent, David P.; Lee, H.-W.
1998-03-01
Automated processing and classification of real-time x-ray images of randomly oriented touching pistachio nuts is discussed. The ultimate objective is the development of a system for automated non-invasive detection of defective product items on a conveyor belt. This approach involves two main steps: preprocessing and classification. Preprocessing locates individual items and segments ones that touch using a modified watershed algorithm. The second stage involves extraction of features that allow discrimination between damaged and clean items (pistachio nuts). This feature extraction and classification stage is the new aspect of this paper. We use a new nonlinear feature extraction scheme called the maximum representation and discriminating feature (MRDF) extraction method to compute nonlinear features that are used as inputs to a classifier. The MRDF is shown to provide better classification and a better ROC (receiver operating characteristic) curve than other methods.
Engagement Assessment Using EEG Signals
NASA Technical Reports Server (NTRS)
Li, Feng; Li, Jiang; McKenzie, Frederic; Zhang, Guangfan; Wang, Wei; Pepe, Aaron; Xu, Roger; Schnell, Thomas; Anderson, Nick; Heitkamp, Dean
2012-01-01
In this paper, we present methods to analyze and improve an EEG-based engagement assessment approach, consisting of data preprocessing, feature extraction and engagement state classification. During data preprocessing, spikes, baseline drift and saturation caused by recording devices are identified and eliminated from the EEG signals, and a wavelet-based method is utilized to remove ocular and muscular artifacts in the EEG recordings. In feature extraction, power spectral densities with 1 Hz bins are calculated as features, and these features are analyzed using the Fisher score and the one-way ANOVA method. In the classification step, a committee classifier is trained based on the extracted features to assess engagement status. Experimental results showed that significant differences exist in the extracted features among different subjects, and we implemented a feature normalization procedure to mitigate these differences, significantly improving the engagement assessment performance.
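The feature-extraction step described above can be sketched as follows. The Welch window length is chosen so the bins are 1 Hz wide; the frequency range and the particular Fisher-score form are illustrative assumptions.

```python
# Sketch of the feature extraction: Welch PSD in 1 Hz bins per channel,
# plus a simple Fisher score for feature analysis (ranges are assumptions).
import numpy as np
from scipy.signal import welch

def psd_features(eeg, fs, fmax=40):
    # eeg: (n_channels, n_samples); nperseg = fs gives 1 Hz resolution
    freqs, pxx = welch(eeg, fs=fs, nperseg=int(fs), axis=-1)
    keep = (freqs >= 1) & (freqs <= fmax)
    return pxx[:, keep].ravel()

def fisher_score(X, y):
    # X: (n_trials, n_features); y: binary engagement labels
    m1, m0 = X[y == 1].mean(0), X[y == 0].mean(0)
    v1, v0 = X[y == 1].var(0), X[y == 0].var(0)
    return (m1 - m0) ** 2 / (v1 + v0 + 1e-12)
```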
Normalization of T2W-MRI prostate images using Rician a priori
NASA Astrophysics Data System (ADS)
Lemaître, Guillaume; Rastgoo, Mojdeh; Massich, Joan; Vilanova, Joan C.; Walker, Paul M.; Freixenet, Jordi; Meyer-Baese, Anke; Mériaudeau, Fabrice; Martí, Robert
2016-03-01
Prostate cancer is reported to be the second most frequently diagnosed cancer in men in the world. In practice, diagnosis can be affected by multiple factors, which reduces the chance of detecting potential lesions. In the last decades, new imaging techniques, mainly based on MRI, have been developed in conjunction with Computer-Aided Diagnosis (CAD) systems to help radiologists with such diagnoses. CAD systems are usually designed as a sequential process consisting of four stages: pre-processing, segmentation, registration and classification. As a pre-processing stage, image normalization is a critical and important step of the chain in order to design a robust classifier and overcome inter-patient intensity variations. However, little attention has been dedicated to the normalization of T2W-Magnetic Resonance Imaging (MRI) prostate images. In this paper, we propose two methods to normalize T2W-MRI prostate images: (i) based on a Rician a priori and (ii) based on a Square-Root Slope Function (SRSF) representation which does not make any assumption regarding the Probability Density Function (PDF) of the data. A comparison with state-of-the-art methods is also provided. The normalization of the data is assessed by comparing the alignment of the patient PDFs in both qualitative and quantitative manners. In both evaluations, the normalization using the Rician a priori outperforms the other state-of-the-art methods.
Detection of retinal nerve fiber layer defects in retinal fundus images using Gabor filtering
NASA Astrophysics Data System (ADS)
Hayashi, Yoshinori; Nakagawa, Toshiaki; Hatanaka, Yuji; Aoyama, Akira; Kakogawa, Masakatsu; Hara, Takeshi; Fujita, Hiroshi; Yamamoto, Tetsuya
2007-03-01
Retinal nerve fiber layer defect (NFLD) is one of the most important findings for the diagnosis of glaucoma reported by ophthalmologists. However, such changes can be overlooked, especially in mass screenings, because ophthalmologists have limited time to search for a number of different changes in diagnosing various diseases such as diabetes, hypertension and glaucoma. Therefore, the use of a computer-aided detection (CAD) system can improve diagnostic results. In this work, a technique for the detection of NFLDs in retinal fundus images is proposed. In the preprocessing step, blood vessels are "erased" from the original retinal fundus image by morphological filtering. The preprocessed image is then transformed into a rectangular array, in which NFLD regions appear as vertical dark bands. Gabor filtering is applied to enhance the vertical dark bands. False positives (FPs) are reduced by a rule-based method which uses the location and width of each candidate region. The detected regions are back-transformed into the original configuration. In this preliminary study, 71% of NFLD regions were detected with an average of 3.2 FPs per image. In conclusion, we have developed a technique for the detection of NFLDs in retinal fundus images, with promising results obtained in this initial study.
Scan Line Based Road Marking Extraction from Mobile LiDAR Point Clouds.
Yan, Li; Liu, Hua; Tan, Junxiang; Li, Zan; Xie, Hong; Chen, Changjun
2016-06-17
Mobile Mapping Technology (MMT) is one of the most important 3D spatial data acquisition technologies. State-of-the-art mobile mapping systems equipped with laser scanners, named Mobile LiDAR Scanning (MLS) systems, have been widely used in a variety of areas, especially in road mapping and road inventory. With the commercialization of Advanced Driving Assistance Systems (ADASs) and self-driving technology, there will be a great demand for lane-level detailed 3D maps, and MLS is the most promising technology to generate such maps. Road markings and road edges are necessary information for creating them. This paper proposes a scan line based method to extract road markings from mobile LiDAR point clouds in three steps: (1) preprocessing; (2) road points extraction; (3) road markings extraction and refinement. In the preprocessing step, isolated LiDAR points in the air are removed from the point clouds and the point clouds are organized into scan lines. In the road points extraction step, seed road points are first extracted by the Height Difference (HD) between trajectory data and road surface, then full road points are extracted from the point clouds by moving least squares line fitting. In the road markings extraction and refinement step, the intensity values of road points in a scan line are first smoothed by a dynamic window median filter to suppress intensity noise, then road markings are extracted by the Edge Detection and Edge Constraint (EDEC) method, and Fake Road Marking Points (FRMPs) are eliminated from the detected road markings by segment- and dimensionality-feature-based refinement. The performance of the proposed method is evaluated on three data samples, and the experimental results indicate that road points are well extracted from MLS data and road markings are well extracted from the road points. A quantitative study shows that the proposed method achieves an average completeness, correctness, and F-measure of 0.96, 0.93, and 0.94, respectively. A time complexity analysis shows that the scan line based road markings extraction method proposed in this paper provides a promising alternative for offline road markings extraction from MLS data.
Ngan, Shing-Chung; Hu, Xiaoping; Khong, Pek-Lan
2011-03-01
We propose a method for preprocessing event-related functional magnetic resonance imaging (fMRI) data that can lead to enhancement of template-free activation detection. The method is based on using a figure of merit to guide the wavelet shrinkage of a given fMRI data set. Several previous studies have demonstrated that in the root-mean-square error setting, wavelet shrinkage can improve the signal-to-noise ratio of fMRI time courses. However, preprocessing fMRI data in the root-mean-square error setting does not necessarily lead to enhancement of template-free activation detection. Motivated by this observation, in this paper, we move to the detection setting and investigate the possibility of using wavelet shrinkage to enhance template-free activation detection of fMRI data. The main ingredients of our method are (i) forward wavelet transform of the voxel time courses, (ii) shrinking the resulting wavelet coefficients as directed by an appropriate figure of merit, (iii) inverse wavelet transform of the shrunk data, and (iv) submitting these preprocessed time courses to a given activation detection algorithm. Two figures of merit are developed in the paper, and two other figures of merit adapted from the literature are described. Receiver-operating characteristic analyses with simulated fMRI data showed quantitative evidence that data preprocessing as guided by the figures of merit developed in the paper can yield improved detectability of the template-free measures. We also demonstrate the application of our methodology on an experimental fMRI data set. The proposed method is useful for enhancing template-free activation detection in event-related fMRI data. It is of significant interest to extend the present framework to produce comprehensive, adaptive and fully automated preprocessing of fMRI data optimally suited for subsequent data analysis steps. Copyright © 2010 Elsevier B.V. All rights reserved.
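Steps (i)-(iii) can be sketched with PyWavelets as below; plain universal soft thresholding stands in for the figure-of-merit-guided shrinkage developed in the paper.

```python
# Generic wavelet-shrinkage skeleton for one voxel time course:
# (i) forward DWT, (ii) shrink detail coefficients, (iii) inverse DWT.
import numpy as np
import pywt

def shrink_time_course(tc, wavelet="db4", level=4, thr_scale=1.0):
    coeffs = pywt.wavedec(tc, wavelet, level=level)        # (i) forward DWT
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745         # noise estimate
    thr = thr_scale * sigma * np.sqrt(2 * np.log(len(tc)))
    shrunk = [coeffs[0]] + [pywt.threshold(c, thr, mode="soft")
                            for c in coeffs[1:]]           # (ii) shrink details
    return pywt.waverec(shrunk, wavelet)[:len(tc)]         # (iii) inverse DWT
```

In the paper's framework, the threshold choice at step (ii) would instead be driven by a detection-oriented figure of merit before the time courses are passed to the activation detection algorithm (step iv).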
Tactile and bone-conduction auditory brain computer interface for vision and hearing impaired users.
Rutkowski, Tomasz M; Mori, Hiromu
2015-04-15
The paper presents a report on a recently developed BCI alternative for users suffering from impaired vision (lack of focus or eye-movements) or from the so-called "ear-blocking-syndrome" (limited hearing). We report on our recent studies of the extent to which vibrotactile stimuli delivered to the head of a user can serve as a platform for a brain computer interface (BCI) paradigm. In the proposed tactile and bone-conduction auditory BCI, multiple novel head positions are used to evoke combined somatosensory and auditory (via the bone conduction effect) P300 brain responses, in order to define a multimodal tactile and bone-conduction auditory brain computer interface (tbcaBCI). To further remove EEG interferences and to improve P300 response classification, the synchrosqueezing transform (SST) is applied. SST outperforms classical time-frequency analysis methods for non-linear and non-stationary signals such as EEG. The proposed method is also computationally more efficient compared to empirical mode decomposition. The SST filtering allows for online EEG preprocessing, which is essential in the case of BCI. Experimental results with healthy BCI-naive users performing online tbcaBCI validate the paradigm, while the feasibility of the concept is illuminated through information transfer rate case studies. We present a comparison of the proposed SST-based preprocessing method, combined with a logistic regression (LR) classifier, with classical preprocessing and LDA-based classification BCI techniques. The proposed tbcaBCI paradigm, together with data-driven preprocessing methods, is a step forward in robust BCI applications research. Copyright © 2014 Elsevier B.V. All rights reserved.
Puccio, Benjamin; Pooley, James P; Pellman, John S; Taverna, Elise C; Craddock, R Cameron
2016-10-25
Skull-stripping is the procedure of removing non-brain tissue from anatomical MRI data. This procedure can be useful for calculating brain volume and for improving the quality of other image processing steps. Developing new skull-stripping algorithms and evaluating their performance requires gold standard data from a variety of different scanners and acquisition methods. We complement existing repositories with manually corrected brain masks for 125 T1-weighted anatomical scans from the Nathan Kline Institute Enhanced Rockland Sample Neurofeedback Study. Skull-stripped images were obtained using a semi-automated procedure that involved skull-stripping the data using the brain extraction based on nonlocal segmentation technique (BEaST) software, and manually correcting the worst results. Corrected brain masks were added into the BEaST library and the procedure was repeated until acceptable brain masks were available for all images. In total, 85 of the skull-stripped images were hand-edited and 40 were deemed to not need editing. The results are brain masks for the 125 images along with a BEaST library for automatically skull-stripping other data. Skull-stripped anatomical images from the Neurofeedback sample are available for download from the Preprocessed Connectomes Project. The resulting brain masks can be used by researchers to improve preprocessing of the Neurofeedback data, as training and testing data for developing new skull-stripping algorithms, and for evaluating the impact on other aspects of MRI preprocessing. We have illustrated the utility of these data as a reference for comparing various automatic methods and evaluated the performance of the newly created library on independent data.
NASA Astrophysics Data System (ADS)
Geiss, Alexander; Marksteiner, Uwe; Lux, Oliver; Lemmerz, Christian; Reitebuch, Oliver; Kanitz, Thomas; Straume-Lindner, Anne Grete
2018-04-01
By the end of 2017, the European Space Agency (ESA) will launch the Atmospheric LAser Doppler INstrument (ALADIN), a direct detection Doppler wind lidar operating at 355 nm. An important tool for the validation and optimization of ALADIN's hardware and data processors for wind retrievals with real atmospheric signals is the ALADIN airborne demonstrator A2D. In order to validate and test aerosol retrieval algorithms for ALADIN, an algorithm for the retrieval of atmospheric backscatter and extinction profiles from the A2D is necessary. The A2D employs a direct detection scheme, using a dual Fabry-Pérot interferometer to measure molecular Rayleigh signals and a Fizeau interferometer to measure aerosol Mie returns. Signals are captured by accumulation charge-coupled devices (ACCDs). These specifications make several signal preprocessing steps necessary. In this paper, the steps required to retrieve aerosol optical products, i.e. the particle backscatter coefficient βp, the particle extinction coefficient αp and the lidar ratio Sp, from A2D raw signals are described.
A novel pre-processing technique for improving image quality in digital breast tomosynthesis.
Kim, Hyeongseok; Lee, Taewon; Hong, Joonpyo; Sabir, Sohail; Lee, Jung-Ryun; Choi, Young Wook; Kim, Hak Hee; Chae, Eun Young; Cho, Seungryong
2017-02-01
Nonlinear pre-reconstruction processing of the projection data in computed tomography (CT), where accurate recovery of the CT numbers is important for diagnosis, is usually discouraged, for such processing would violate the physics of image formation in CT. However, one can devise a pre-processing step to enhance detectability of lesions in digital breast tomosynthesis (DBT), where accurate recovery of the CT numbers is fundamentally impossible due to the incompleteness of the scanned data. Since the detection of lesions such as micro-calcifications and masses in breasts is the purpose of using DBT, a technique producing higher detectability of lesions is justified as a virtue. A histogram modification technique was developed in the projection data domain. The histogram of the raw projection data was first divided into two parts: one for the breast projection data and the other for the background. Background pixel values were set to a single value that represents the boundary between breast and background. After that, both histogram parts were shifted by an appropriate offset and the histogram-modified projection data were log-transformed. The filtered-backprojection (FBP) algorithm was used for image reconstruction of DBT. To evaluate the performance of the proposed method, we computed the detectability index for images reconstructed from clinically acquired data. Typical breast border enhancement artifacts were greatly suppressed and the detectability of calcifications and masses was increased by use of the proposed method. Compared to a global threshold-based post-reconstruction processing technique, the proposed method produced images of higher contrast without invoking additional image artifacts. In this work, we report a novel pre-processing technique that improves detectability of lesions in DBT and has potential advantages over the global threshold-based post-reconstruction processing technique. The proposed method not only increased the lesion detectability but also reduced typical image artifacts pronounced in conventional FBP-based DBT. © 2016 American Association of Physicists in Medicine.
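A schematic of the described histogram modification follows, assuming raw transmission projections in which background (air) pixels are brighter than breast tissue; the boundary value and offset are quantities the paper determines from the histogram, here left as parameters, and the function name is hypothetical.

```python
# Schematic of the pre-reconstruction histogram modification: flatten the
# background part of the histogram to the breast/background boundary value,
# shift both parts, then log-transform before FBP (parameters are inputs).
import numpy as np

def modify_projection(raw, boundary_value, offset=100.0):
    proj = raw.astype(np.float64).copy()
    background = proj >= boundary_value      # air transmits more signal
    proj[background] = boundary_value        # collapse the background part
    proj += offset                           # shift both histogram parts
    return -np.log(proj / proj.max())        # log-transform for FBP input
```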
de Vos, Stijn; Wardenaar, Klaas J; Bos, Elisabeth H; Wit, Ernst C; Bouwmans, Mara E J; de Jonge, Peter
2017-01-01
Differences in within-person emotion dynamics may be an important source of heterogeneity in depression. To investigate these dynamics, researchers have previously combined multilevel regression analyses with network representations. However, sparse network methods, specifically developed for longitudinal network analyses, have not been applied. Therefore, this study used this approach to investigate population-level and individual-level emotion dynamics in healthy and depressed persons and compared it with the multilevel approach. Time-series data were collected in pair-matched healthy persons and major depressive disorder (MDD) patients (n = 54). Seven positive affect (PA) and seven negative affect (NA) items were administered electronically at 90 time points (30 days; three times per day). The population-level (healthy vs. MDD) and individual-level time series were analyzed using a sparse longitudinal network model based on vector autoregression. The population-level model was also estimated with a multilevel approach. The effects of different preprocessing steps were evaluated as well. The characteristics of the longitudinal networks were investigated to gain insight into the emotion dynamics. In the population-level networks, longitudinal network connectivity was strongest in the healthy group, with nodes showing more and stronger longitudinal associations with each other. Individually estimated networks varied strongly across individuals. Individual variations in network connectivity were unrelated to baseline characteristics (depression status, neuroticism, severity). A multilevel approach applied to the same data showed higher connectivity in the MDD group, which seemed partly related to the preprocessing approach. The sparse network approach can be useful for the estimation of networks with multiple nodes, where overparameterization is an issue, and for individual-level networks. However, its current inability to model random effects makes it less useful as a population-level approach in the case of large heterogeneity. Different preprocessing strategies appeared to strongly influence the results, complicating inferences about network density.
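As a rough illustration of the sparse longitudinal network idea, one could estimate a VAR(1) model with an L1 penalty per node; the penalty weight `alpha` and the estimator below are assumptions, not the paper's exact method:

```python
import numpy as np
from sklearn.linear_model import Lasso

def sparse_var1(X, alpha=0.1):
    """X: T x p matrix of affect items over time. Each node is
    regressed on all nodes at the previous time point with an L1
    penalty, yielding a sparse directed network B[j, i]: the lagged
    effect of item i on item j."""
    Xlag, Xnow = X[:-1], X[1:]
    B = np.zeros((X.shape[1], X.shape[1]))
    for j in range(X.shape[1]):
        B[j] = Lasso(alpha=alpha).fit(Xlag, Xnow[:, j]).coef_
    return B
```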
Liang, Xia; Wang, Jinhui; Yan, Chaogan; Shu, Ni; Xu, Ke; Gong, Gaolang; He, Yong
2012-01-01
Graph theoretical analysis of brain networks based on resting-state functional MRI (R-fMRI) has attracted a great deal of attention in recent years. These analyses often involve the selection of correlation metrics and specific preprocessing steps. However, the influence of these factors on the topological properties of functional brain networks has not been systematically examined. Here, we investigated the influences of correlation metric choice (Pearson's correlation versus partial correlation), global signal presence (regressed or not) and frequency band selection [slow-5 (0.01-0.027 Hz) versus slow-4 (0.027-0.073 Hz)] on the topological properties of both binary and weighted brain networks derived from them, and we employed test-retest (TRT) analyses for further guidance on how to choose the "best" network modeling strategy from the reliability perspective. Our results show significant differences in global network metrics associated with both correlation metrics and global signals. Analysis of nodal degree revealed differing hub distributions for brain networks derived from Pearson's correlation versus partial correlation. TRT analysis revealed that the reliability of both global and local topological properties are modulated by correlation metrics and the global signal, with the highest reliability observed for Pearson's-correlation-based brain networks without global signal removal (WOGR-PEAR). The nodal reliability exhibited a spatially heterogeneous distribution wherein regions in association and limbic/paralimbic cortices showed moderate TRT reliability in Pearson's-correlation-based brain networks. Moreover, we found that there were significant frequency-related differences in topological properties of WOGR-PEAR networks, and brain networks derived in the 0.027-0.073 Hz band exhibited greater reliability than those in the 0.01-0.027 Hz band. Taken together, our results provide direct evidence regarding the influences of correlation metrics and specific preprocessing choices on both the global and nodal topological properties of functional brain networks. This study also has important implications for how to choose reliable analytical schemes in brain network studies.
NASA Technical Reports Server (NTRS)
Kelly, W. L.; Howle, W. M.; Meredith, B. D.
1980-01-01
The Information Adaptive System (IAS) is an element of the NASA End-to-End Data System (NEEDS) Phase II and is focused toward onboard image processing. Since the IAS is a data preprocessing system which is closely coupled to the sensor system, it serves as a first step in providing a 'Smart' imaging sensor. Some of the functions planned for the IAS include sensor response nonuniformity correction, geometric correction, data set selection, data formatting, packetization, and adaptive system control. The inclusion of these sensor data preprocessing functions onboard the spacecraft will significantly improve the extraction of information from the sensor data in a timely and cost-effective manner and provide the opportunity to design sensor systems which can be reconfigured in near real time for optimum performance. The purpose of this paper is to present the preliminary design of the IAS and the plans for its development.
Multisubject Learning for Common Spatial Patterns in Motor-Imagery BCI
Devlaminck, Dieter; Wyns, Bart; Grosse-Wentrup, Moritz; Otte, Georges; Santens, Patrick
2011-01-01
Motor-imagery-based brain-computer interfaces (BCIs) commonly use the common spatial pattern filter (CSP) as preprocessing step before feature extraction and classification. The CSP method is a supervised algorithm and therefore needs subject-specific training data for calibration, which is very time consuming to collect. In order to reduce the amount of calibration data that is needed for a new subject, one can apply multitask (from now on called multisubject) machine learning techniques to the preprocessing phase. Here, the goal of multisubject learning is to learn a spatial filter for a new subject based on its own data and that of other subjects. This paper outlines the details of the multitask CSP algorithm and shows results on two data sets. In certain subjects a clear improvement can be seen, especially when the number of training trials is relatively low. PMID:22007194
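For reference, the classic single-subject CSP computation reduces to a generalized eigenproblem on the class covariance matrices; the multisubject coupling across subjects proposed in the paper is not shown in this sketch:

```python
import numpy as np
from scipy.linalg import eigh

def csp_filters(C1, C2, n_pairs=3):
    """Solve C1 w = lambda (C1 + C2) w and keep eigenvectors from
    both ends of the spectrum, i.e. the filters with the most
    discriminative variance ratios between the two classes."""
    vals, vecs = eigh(C1, C1 + C2)   # eigenvalues in ascending order
    idx = list(range(n_pairs)) + list(range(-n_pairs, 0))
    return vecs[:, idx]              # channels x (2 * n_pairs) filters
```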
On the influence of high-pass filtering on ICA-based artifact reduction in EEG-ERP.
Winkler, Irene; Debener, Stefan; Müller, Klaus-Robert; Tangermann, Michael
2015-01-01
Standard artifact removal methods for electroencephalographic (EEG) signals are either based on Independent Component Analysis (ICA) or they regress out ocular activity measured at electrooculogram (EOG) channels. Successful ICA-based artifact reduction relies on suitable pre-processing. Here we systematically evaluate the effects of high-pass filtering at different cutoff frequencies. Offline analyses were based on event-related potential data from 21 participants performing a standard auditory oddball task and an automatic artifactual component classifier method (MARA). As a pre-processing step for ICA, high-pass filtering between 1 and 2 Hz consistently produced good results in terms of signal-to-noise ratio (SNR), single-trial classification accuracy, and the percentage of 'near-dipolar' ICA components. Relative to no artifact reduction, ICA-based artifact removal significantly improved SNR and classification accuracy. This was not the case for a regression-based approach to removing EOG artifacts.
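A zero-phase high-pass filter in the favourable 1-2 Hz range can be sketched as follows; the Butterworth design and filter order are our assumptions, as the abstract does not specify them:

```python
from scipy.signal import butter, sosfiltfilt

def highpass(eeg, fs, fc=1.0, order=4):
    """Channel-wise zero-phase high-pass before ICA.
    eeg: channels x samples array; fs: sampling rate in Hz;
    fc: cutoff frequency in Hz."""
    sos = butter(order, fc, btype="highpass", fs=fs, output="sos")
    return sosfiltfilt(sos, eeg, axis=-1)
```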
Automated detection of microcalcification clusters in mammograms
NASA Astrophysics Data System (ADS)
Karale, Vikrant A.; Mukhopadhyay, Sudipta; Singh, Tulika; Khandelwal, Niranjan; Sadhu, Anup
2017-03-01
Mammography is the most efficient modality for detection of breast cancer at an early stage. Microcalcifications are tiny bright spots in mammograms that can often be missed by the radiologist during diagnosis. The presence of microcalcification clusters in mammograms can act as an early sign of breast cancer. This paper presents a completely automated computer-aided detection (CAD) system for detection of microcalcification clusters in mammograms. Unsharp masking is used as a preprocessing step to enhance the contrast between microcalcifications and the background. The preprocessed image is thresholded, and various shape- and intensity-based features are extracted. A support vector machine (SVM) classifier is used to reduce the false positives while preserving the true microcalcification clusters. The proposed technique is applied to two different databases, i.e., DDSM and a private database, and shows good sensitivity with moderate false positives (FPs) per image on both.
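Unsharp masking itself is a standard operation; a generic sketch (with illustrative sigma and gain, since the paper's exact kernel is not given here) is:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def unsharp_mask(img, sigma=2.0, amount=1.5):
    """Add back a scaled high-frequency residual to enhance the
    contrast between microcalcifications and the background."""
    blurred = gaussian_filter(img.astype(np.float64), sigma)
    return img + amount * (img - blurred)
```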
NASA Astrophysics Data System (ADS)
Zhou, Y.; Zhang, X.; Xiao, W.
2018-04-01
As geomagnetic sensors are susceptible to interference, a pre-processing total least squares iteration method is proposed for calibration compensation. First, the error model of the geomagnetic sensor is analyzed and a correction model is proposed; the characteristics of the model are then analyzed and reduced to nine parameters. The geomagnetic data are processed with the Hilbert-Huang transform (HHT) to improve the signal-to-noise ratio, and the nine parameters are calculated using a combination of the Newton iteration method and least squares estimation. A sifting algorithm is used to filter the initial value of the iteration so that the initial error is as small as possible. The experimental results show that this method needs no additional equipment or devices, can continuously update the calibration parameters, and compensates the geomagnetic sensor error better than the two-step estimation method.
Modeling urban air pollution with optimized hierarchical fuzzy inference system.
Tashayo, Behnam; Alimohammadi, Abbas
2016-10-01
Environmental exposure assessments (EEA) and epidemiological studies require urban air pollution models with appropriate spatial and temporal resolutions. Uncertain available data and inflexible models can limit air pollution modeling techniques, particularly in developing countries. This paper develops a hierarchical fuzzy inference system (HFIS) to model air pollution under different land use, transportation, and meteorological conditions. To improve performance, the system treats the issue as a large-scale, high-dimensional problem and develops the proposed model using a three-step approach. In the first step, a geospatial information system (GIS) and probabilistic methods are used to preprocess the data. In the second step, a hierarchical structure is generated based on the problem. In the third step, the accuracy and complexity of the model are simultaneously optimized with a multiple-objective particle swarm optimization (MOPSO) algorithm. We examine the capabilities of the proposed model for predicting daily and annual mean PM2.5 and NO2 and compare the accuracy of the results with representative models from the existing literature. The benefits provided by the model features, including probabilistic preprocessing, multi-objective optimization, and hierarchical structure, are precisely evaluated by comparing five different consecutive models in terms of accuracy and complexity criteria. Fivefold cross-validation is used to assess the performance of the generated models. The respective average RMSEs and coefficients of determination (R²) for the test datasets using the proposed model are as follows: daily PM2.5 = (8.13, 0.78), annual mean PM2.5 = (4.96, 0.80), daily NO2 = (5.63, 0.79), and annual mean NO2 = (2.89, 0.83). The obtained results demonstrate that the developed hierarchical fuzzy inference system can be utilized for modeling air pollution in EEA and epidemiological studies.
Assessing semantic similarity of texts - Methods and algorithms
NASA Astrophysics Data System (ADS)
Rozeva, Anna; Zerkova, Silvia
2017-12-01
Assessing the semantic similarity of texts is an important part of different text-related applications like educational systems, information retrieval, text summarization, etc. This task is performed by sophisticated analyses that implement text-mining techniques. Text mining involves several pre-processing steps, which yield a structured, representative model of the documents in a corpus by extracting and selecting the features that characterize their content. Generally the model is vector-based and enables further analysis with knowledge discovery approaches. Algorithms and measures are used for assessing texts at the syntactic and semantic levels. An important text-mining method and similarity measure is latent semantic analysis (LSA). It reduces the dimensionality of the document vector space and better captures the text semantics. The mathematical background of LSA for deriving the meaning of the words in a given text by exploring their co-occurrence is examined. The algorithm for obtaining the vector representation of words and their corresponding latent concepts in a reduced multidimensional space, as well as the similarity calculation, are presented.
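A minimal LSA pipeline in the sense described (a vector-space model, dimensionality reduction by truncated SVD, and cosine similarity in the latent space) might look like this; the toy corpus and the number of components are illustrative:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

docs = ["students solve equations",
        "pupils work on math problems",
        "the cat sat on the mat"]
X = TfidfVectorizer().fit_transform(docs)            # vector-space model
Z = TruncatedSVD(n_components=2).fit_transform(X)    # latent concepts
print(cosine_similarity(Z))   # pairwise semantic similarity of the docs
```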
Iris Matching Based on Personalized Weight Map.
Dong, Wenbo; Sun, Zhenan; Tan, Tieniu
2011-09-01
Iris recognition typically involves three steps, namely, iris image preprocessing, feature extraction, and feature matching. The first two steps have been well studied, but the last step is less addressed. Each human iris has its unique visual pattern, and local image features also vary from region to region, which leads to significant differences in robustness and distinctiveness among the feature codes derived from different iris regions. However, most state-of-the-art iris recognition methods use a uniform matching strategy, where features extracted from different regions of the same person, or from the same region for different individuals, are considered to be equally important. This paper proposes a personalized iris matching strategy using a class-specific weight map learned from the training images of the same iris class. The weight map can be updated online during the iris recognition procedure, with successfully recognized iris images treated as new training data. The weight map reflects the robustness of an encoding algorithm on different iris regions by assigning an appropriate weight to each feature code for iris matching. Such a weight map, trained on sufficient iris templates, is convergent and robust against various types of noise. Extensive and comprehensive experiments demonstrate that the proposed personalized iris matching strategy achieves much better iris recognition performance than uniform strategies, especially for poor-quality iris images.
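The matching step itself can be pictured as a weighted Hamming distance over binary iris codes; the weight-learning and online-update rules from the paper are not reproduced in this sketch:

```python
import numpy as np

def weighted_hamming(code_a, code_b, weight_map):
    """code_a, code_b: binary (0/1) iris codes; weight_map: per-bit
    weights learned from a user's training images, down-weighting
    unreliable regions."""
    disagree = np.bitwise_xor(code_a, code_b).astype(float)
    return np.sum(weight_map * disagree) / np.sum(weight_map)
```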
Using deep learning for detecting gender in adult chest radiographs
NASA Astrophysics Data System (ADS)
Xue, Zhiyun; Antani, Sameer; Long, L. Rodney; Thoma, George R.
2018-03-01
In this paper, we present a method for automatically identifying the gender of an imaged person from frontal chest x-ray images. Our work is motivated by the need to determine missing gender information in some datasets. The proposed method employs convolutional neural network (CNN) based deep learning and transfer learning to overcome the challenge of developing handcrafted features with limited data. Specifically, the method consists of four main steps: pre-processing, CNN feature extraction, feature selection, and classification. The method is tested on a combined dataset obtained from several sources with varying acquisition quality, with different pre-processing steps applied for each. For feature extraction, we tested and compared four CNN architectures, viz., AlexNet, VggNet, GoogLeNet, and ResNet. We applied a feature selection technique, since the feature length is larger than the number of images. Two popular classifiers, SVM and Random Forest, are used and compared. We evaluated the classification performance by cross-validation, using seven performance measures. The best performer is the VggNet-16 feature extractor with the SVM classifier, with an accuracy of 86.6% and an ROC area of 0.932 under 5-fold cross-validation. We also discuss several misclassified cases and describe future work for performance improvement.
Budak, Umit; Şengür, Abdulkadir; Guo, Yanhui; Akbulut, Yaman
2017-12-01
Microaneurysms (MAs) are known as early signs of diabetic retinopathy and appear as red lesions in color fundus images. Detection of MAs in fundus images requires highly skilled physicians or eye angiography, and angiography is an invasive and expensive procedure. Therefore, an automatic detection system to identify MA locations in fundus images is in demand. In this paper, we propose a system to detect MAs in colored fundus images. The proposed method is composed of three stages. In the first stage, a series of pre-processing steps is used to make the input images more suitable for MA detection. To this end, green channel decomposition, Gaussian filtering, median filtering, background determination, and subtraction operations are applied to the input colored fundus images. After pre-processing, a candidate MA extraction procedure is applied to detect potential regions; a five-step procedure is adopted to obtain the potential MA locations. Finally, a deep convolutional neural network (DCNN) with a reinforcement sample learning strategy is used to train the proposed system. The DCNN is trained with color image patches collected from ground-truth MA locations and non-MA locations. We conducted extensive experiments on the ROC dataset to evaluate our proposal. The results are encouraging.
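The first-stage pre-processing chain listed above can be sketched with OpenCV; the kernel sizes are illustrative assumptions:

```python
import cv2

def preprocess_fundus(bgr, bg_kernel=25):
    """Green-channel decomposition, Gaussian and median filtering,
    coarse background estimation and subtraction; dark MAs become
    bright in the output."""
    green = bgr[:, :, 1]
    smooth = cv2.GaussianBlur(green, (5, 5), 0)
    smooth = cv2.medianBlur(smooth, 5)
    background = cv2.medianBlur(smooth, bg_kernel)
    return cv2.subtract(background, smooth)
```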
Carvalho, Luis Felipe C. S.; Nogueira, Marcelo Saito; Neto, Lázaro P. M.; Bhattacharjee, Tanmoy T.; Martin, Airton A.
2017-01-01
Most oral injuries are diagnosed by histopathological analysis of a biopsy, which is an invasive procedure and does not give immediate results. On the other hand, Raman spectroscopy is a real time and minimally invasive analytical tool with potential for the diagnosis of diseases. The potential for diagnostics can be improved by data post-processing. Hence, this study aims to evaluate the performance of preprocessing steps and multivariate analysis methods for the classification of normal tissues and pathological oral lesion spectra. A total of 80 spectra acquired from normal and abnormal tissues using optical fiber Raman-based spectroscopy (OFRS) were subjected to PCA preprocessing in the z-scored data set, and the KNN (K-nearest neighbors), J48 (unpruned C4.5 decision tree), RBF (radial basis function), RF (random forest), and MLP (multilayer perceptron) classifiers at WEKA software (Waikato environment for knowledge analysis), after area normalization or maximum intensity normalization. Our results suggest the best classification was achieved by using maximum intensity normalization followed by MLP. Based on these results, software for automated analysis can be generated and validated using larger data sets. This would aid quick comprehension of spectroscopic data and easy diagnosis by medical practitioners in clinical settings. PMID:29188115
Walkaway-VSP survey using distributed optical fiber in China oilfield
NASA Astrophysics Data System (ADS)
Wu, Junjun; Yu, Gang; Zhang, Qinghong; Li, Yanpeng; Cai, Zhidong; Chen, Yuanzhong; Liu, Congwei; Zhao, Haiying; Li, Fei
2017-10-01
Distributed acoustic sensing (DAS) is a new type of replacement technology for geophysical geophones. A DAS system is similar to a high-density surface seismic geophone array. During acquisition, DAS can obtain data over the full well with one shot, and it can provide enhanced vertical seismic profile (VSP) imaging and monitor fluid and pressure changes in the hydrocarbon production reservoir. Walkaway VSP data acquired over a former producing well in northeastern China provided a rich set of very high quality data. A standard VSP data pre-processing workflow was applied, followed by pre-stack Kirchhoff time migration. In the DAS pre-processing step we faced additional and special challenges: strong coherent noise due to cable slapping and ringing along the borehole casing. The single-well DAS walkaway VSP images provide a good result, with higher vertical and lateral resolution than the surface seismic in the objective area. This paper reports on lessons learned in the handling of the wireline cable and the subsequent special DAS data processing steps developed to remediate some of the practical wireline deployment issues. Optical wireline cable as a conveyance of fiber optic cables for VSP in vertical wells will open the use of the DAS system to much wider applications.
Impulse Noise Cancellation of Medical Images Using Wavelet Networks and Median Filters
Sadri, Amir Reza; Zekri, Maryam; Sadri, Saeid; Gheissari, Niloofar
2012-01-01
This paper presents a new two-stage approach to impulse noise removal for medical images based on a wavelet network (WN). The first stage is noise detection, in which the so-called gray-level difference and average background difference are taken as the inputs of a WN; the wavelet network thus serves as a preprocessing step for the second stage. The second stage removes the impulse noise with a median filter. The wavelet network presented here is a fixed one, without learning. Experimental results show that our method acts on impulse noise effectively and at the same time preserves chromaticity and image details very well. PMID:23493998
Sensor feature fusion for detecting buried objects
DOE Office of Scientific and Technical Information (OSTI.GOV)
Clark, G.A.; Sengupta, S.K.; Sherwood, R.J.
1993-04-01
Given multiple registered images of the earth's surface from dual-band sensors, our system fuses information from the sensors to reduce the effects of clutter and improve the ability to detect buried or surface target sites. The sensor suite currently includes two sensors (5 micron and 10 micron wavelengths) and one ground penetrating radar (GPR) of the wide-band pulsed synthetic aperture type. We use a supervised learning pattern recognition approach to detect metal and plastic land mines buried in soil. The overall process consists of four main parts: preprocessing, feature extraction, feature selection, and classification. These parts are used in a two-step process to classify a subimage. The first step, referred to as feature selection, determines the features of sub-images which result in the greatest separability among the classes. The second step, image labeling, uses the selected features and the decisions from a pattern classifier to label the regions in the image which are likely to correspond to buried mines. We extract features from the images, and use feature selection algorithms to select only the most important features according to their contribution to correct detections. This allows us to save computational complexity and determine which of the sensors add value to the detection system. The most important features from the various sensors are fused using supervised learning pattern classifiers (including neural networks). We present results of experiments to detect buried land mines from real data, and evaluate the usefulness of fusing feature information from multiple sensor types, including dual-band infrared and ground penetrating radar. The novelty of the work lies mostly in the combination of the algorithms and their application to the very important and currently unsolved operational problem of detecting buried land mines from an airborne standoff platform.
Image preprocessing study on KPCA-based face recognition
NASA Astrophysics Data System (ADS)
Li, Xuan; Li, Dehua
2015-12-01
Face recognition is an important biometric identification method and, with its friendly, natural, and convenient advantages, has attracted more and more attention. This paper investigates a face recognition system comprising face detection, feature extraction, and recognition, focusing on the related theory and key technology of various preprocessing methods in the face detection process; using the KPCA method, it examines how recognition results differ under different preprocessing methods. We choose the YCbCr color space for skin segmentation and integral projection for face location. We preprocess face images with erosion and dilation (the opening and closing operations) and an illumination compensation method, and then apply a face recognition method based on kernel principal component analysis; experiments were carried out on a typical face database. The algorithms were implemented on the MATLAB platform. Experimental results show that, under certain conditions, the kernel-based PCA algorithm makes the extracted features represent the original image information better, since it uses nonlinear feature extraction, and can thus obtain a higher recognition rate. In the image preprocessing stage, we found that different operations on the images can yield different results, and hence different recognition rates in the recognition stage. In addition, in the kernel principal component analysis, the power of the polynomial kernel function can affect the recognition result.
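The KPCA feature-extraction stage can be sketched with scikit-learn; the polynomial degree, component count, and the linear SVM used as the downstream classifier are illustrative choices, not the paper's exact setup:

```python
from sklearn.decomposition import KernelPCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Hypothetical arrays: X_train holds flattened, preprocessed face
# images and y_train the identity labels.
model = make_pipeline(
    KernelPCA(n_components=50, kernel="poly", degree=2),
    SVC(kernel="linear"),
)
# model.fit(X_train, y_train); model.score(X_test, y_test)
```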
A robust data scaling algorithm to improve classification accuracies in biomedical data.
Cao, Xi Hang; Stojkovic, Ivan; Obradovic, Zoran
2016-09-09
Machine learning models have been adopted in biomedical research and practice for knowledge discovery and decision support. While mainstream biomedical informatics research focuses on developing more accurate models, the importance of data preprocessing draws less attention. We propose the Generalized Logistic (GL) algorithm, which scales data uniformly to an appropriate interval by learning a generalized logistic function to fit the empirical cumulative distribution function of the data. The GL algorithm is simple yet effective; it is intrinsically robust to outliers, so it is particularly suitable for diagnostic/classification models in clinical/medical applications where the number of samples is usually small; and it scales the data in a nonlinear fashion, which leads to potential improvement in accuracy. To evaluate the effectiveness of the proposed algorithm, we conducted experiments on 16 binary classification tasks with different variable types, covering a wide range of applications. The resulting performance in terms of area under the receiver operating characteristic curve (AUROC) and percentage of correct classification showed that models learned using data scaled by the GL algorithm outperform those using data scaled by the Min-max and Z-score algorithms, which are the most commonly used data scaling algorithms. The proposed GL algorithm is simple and effective. It is robust to outliers, so no additional denoising or outlier detection step is needed in data preprocessing. Empirical results also show that models learned from data scaled by the GL algorithm have higher accuracy than those based on the commonly used data scaling algorithms.
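The core of the GL idea can be sketched as fitting a parametric logistic-type curve to the empirical CDF and using it as the scaling map; the three-parameter form below is our assumption, not necessarily the paper's exact parameterization:

```python
import numpy as np
from scipy.optimize import curve_fit

def gl_scale(x):
    """Fit a generalized-logistic-type function to the empirical CDF
    of x, then map x through it into (0, 1)."""
    def glf(t, a, k, q):
        return 1.0 / (1.0 + q * np.exp(-a * (t - k)))
    xs = np.sort(x)
    ecdf = np.arange(1, len(xs) + 1) / len(xs)
    p0 = [1.0 / (x.std() + 1e-9), x.mean(), 1.0]
    params, _ = curve_fit(glf, xs, ecdf, p0=p0, maxfev=10000)
    return glf(x, *params)
```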
Rough sets and Laplacian score based cost-sensitive feature selection
Yu, Shenglong; Zhao, Hong
2018-01-01
Cost-sensitive feature selection learning is an important preprocessing step in machine learning and data mining. Recently, most existing cost-sensitive feature selection algorithms are heuristic algorithms, which evaluate the importance of each feature individually and select features one by one. Obviously, these algorithms do not consider the relationship among features. In this paper, we propose a new algorithm for minimal cost feature selection called the rough sets and Laplacian score based cost-sensitive feature selection. The importance of each feature is evaluated by both rough sets and Laplacian score. Compared with heuristic algorithms, the proposed algorithm takes into consideration the relationship among features with locality preservation of Laplacian score. We select a feature subset with maximal feature importance and minimal cost when cost is undertaken in parallel, where the cost is given by three different distributions to simulate different applications. Different from existing cost-sensitive feature selection algorithms, our algorithm simultaneously selects out a predetermined number of “good” features. Extensive experimental results show that the approach is efficient and able to effectively obtain the minimum cost subset. In addition, the results of our method are more promising than the results of other cost-sensitive feature selection algorithms. PMID:29912884
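The Laplacian-score half of the importance measure is a standard computation (He et al.); a sketch with a binary kNN graph follows, while the rough-set half of the paper's measure is not shown:

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph

def laplacian_score(X, k=5):
    """Lower scores indicate better locality preservation."""
    S = kneighbors_graph(X, k, mode="connectivity").toarray()
    S = np.maximum(S, S.T)              # symmetrize the kNN graph
    D = np.diag(S.sum(axis=1))
    L = D - S
    d = np.diag(D)
    scores = []
    for f in X.T:
        f_t = f - (f @ d) / d.sum()     # remove degree-weighted mean
        scores.append((f_t @ L @ f_t) / (f_t @ D @ f_t))
    return np.array(scores)
```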
Computer aided detection of brain micro-bleeds in traumatic brain injury
NASA Astrophysics Data System (ADS)
van den Heuvel, T. L. A.; Ghafoorian, M.; van der Eerden, A. W.; Goraj, B. M.; Andriessen, T. M. J. C.; ter Haar Romeny, B. M.; Platel, B.
2015-03-01
Brain micro-bleeds (BMBs) are used as surrogate markers for detecting diffuse axonal injury in traumatic brain injury (TBI) patients. The location and number of BMBs have been shown to influence the long-term outcome of TBI. To further study the importance of BMBs for prognosis, accurate localization and quantification are required. The task of annotating BMBs is laborious, complex and prone to error, resulting in a high inter- and intra-reader variability. In this paper we propose a computer-aided detection (CAD) system to automatically detect BMBs in MRI scans of moderate to severe neuro-trauma patients. Our method consists of four steps. Step one: preprocessing of the data. Both susceptibility (SWI) and T1 weighted MRI scans are used. The images are co-registered, a brain-mask is generated, the bias field is corrected, and the image intensities are normalized. Step two: initial candidates for BMBs are selected as local minima in the processed SWI scans. Step three: feature extraction. BMBs appear as round or ovoid signal hypo-intensities on SWI. Twelve features are computed to capture these properties of a BMB. Step four: Classification. To identify BMBs from the set of local minima using their features, different classifiers are trained on a database of 33 expert annotated scans and 18 healthy subjects with no BMBs. Our system uses a leave-one-out strategy to analyze its performance. With a sensitivity of 90% and 1.3 false positives per BMB, our CAD system shows superior results compared to state-of-the-art BMB detection algorithms (developed for non-trauma patients).
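Step two (candidate selection) is straightforward to sketch: candidates are local intensity minima of the preprocessed SWI volume; the neighbourhood size is an illustrative choice:

```python
from scipy.ndimage import minimum_filter

def bmb_candidates(swi, size=3):
    """Boolean mask of local minima (BMBs appear as signal
    hypo-intensities on SWI)."""
    return swi == minimum_filter(swi, size=size)
```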
On evaluating clustering procedures for use in classification
NASA Technical Reports Server (NTRS)
Pore, M. D.; Moritz, T. E.; Register, D. T.; Yao, S. S.; Eppler, W. G. (Principal Investigator)
1979-01-01
The problem of evaluating clustering algorithms and their respective computer programs for use in a preprocessing step for classification is addressed. In clustering for classification the probability of correct classification is suggested as the ultimate measure of accuracy on training data. A means of implementing this criterion and a measure of cluster purity are discussed. Examples are given. A procedure for cluster labeling that is based on cluster purity and sample size is presented.
Image Understanding and Information Extraction
1977-11-01
[Garbled abstract fragment. Recoverable content: an implementation and generalization of DeCarlo's Nyquist-like stability test [15,16], whose last step checks whether the zero set of B(w,z) meets the stability condition; several general stability theorems relating stability to the zero set of B(w,z) are presented. The remainder is table-of-contents residue (e.g., "A Spatial Stochastic Model for Contextual Pattern Recognition", T. S. Yu and K. S. Fu; "V. PREPROCESSING: 1. Stability of General Two-...").]
ATACseqQC: a Bioconductor package for post-alignment quality assessment of ATAC-seq data.
Ou, Jianhong; Liu, Haibo; Yu, Jun; Kelliher, Michelle A; Castilla, Lucio H; Lawson, Nathan D; Zhu, Lihua Julie
2018-03-01
ATAC-seq (Assays for Transposase-Accessible Chromatin using sequencing) is a recently developed technique for genome-wide analysis of chromatin accessibility. Compared to earlier methods for assaying chromatin accessibility, ATAC-seq is faster and easier to perform, does not require cross-linking, has higher signal to noise ratio, and can be performed on small cell numbers. However, to ensure a successful ATAC-seq experiment, step-by-step quality assurance processes, including both wet lab quality control and in silico quality assessment, are essential. While several tools have been developed or adopted for assessing read quality, identifying nucleosome occupancy and accessible regions from ATAC-seq data, none of the tools provide a comprehensive set of functionalities for preprocessing and quality assessment of aligned ATAC-seq datasets. We have developed a Bioconductor package, ATACseqQC, for easily generating various diagnostic plots to help researchers quickly assess the quality of their ATAC-seq data. In addition, this package contains functions to preprocess aligned ATAC-seq data for subsequent peak calling. Here we demonstrate the utilities of our package using 25 publicly available ATAC-seq datasets from four studies. We also provide guidelines on what the diagnostic plots should look like for an ideal ATAC-seq dataset. This software package has been used successfully for preprocessing and assessing several in-house and public ATAC-seq datasets. Diagnostic plots generated by this package will facilitate the quality assessment of ATAC-seq data, and help researchers to evaluate their own ATAC-seq experiments as well as select high-quality ATAC-seq datasets from public repositories such as GEO to avoid generating hypotheses or drawing conclusions from low-quality ATAC-seq experiments. The software, source code, and documentation are freely available as a Bioconductor package at https://bioconductor.org/packages/release/bioc/html/ATACseqQC.html .
Georeferencing CAMS data: Polynomial rectification and beyond
NASA Astrophysics Data System (ADS)
Yang, Xinghe
The Calibrated Airborne Multispectral Scanner (CAMS) is a sensor used in the commercial remote sensing program at NASA Stennis Space Center. In geographic applications of the CAMS data, accurate geometric rectification is essential for the analysis of the remotely sensed data and for the integration of the data into Geographic Information Systems (GIS). The commonly used rectification techniques, such as polynomial transformation and ortho-rectification, have been very successful in the fields of remote sensing and GIS for most remote sensing data such as Landsat imagery, SPOT imagery, and aerial photos. However, due to the geometric nature of the airborne line scanner, which has high-spatial-frequency distortions, the polynomial model and the ortho-rectification technique in current commercial software packages such as Erdas Imagine are not adequate for obtaining sufficient geometric accuracy. In this research, the geometric nature, especially the major distortions, of the CAMS data is described. An analytical step-by-step geometric preprocessing has been utilized to deal with the potential high-frequency distortions of the CAMS data. A generic sensor-independent photogrammetric model has been developed for the ortho-rectification of the CAMS data. Three generalized kernel classes and directional elliptical bases have been formulated into a rectification model of summation of multisurface functions, which is a significant extension of the traditional radial basis functions. The preprocessing mechanism has been fully incorporated into the polynomial, the triangle-based finite element analysis, and the summation-of-multisurface-functions methods. While the multisurface functions and the finite element analysis have the characteristic of localization, piecewise logic has been applied to the polynomial and photogrammetric methods, which can produce significant accuracy improvement over the global approach. A software module has been implemented with full integration of the data preprocessing and rectification techniques under the Erdas Imagine development environment. The final root mean square (RMS) errors for the test CAMS data are about two pixels, which is comparable with the random RMS errors that exist in the reference map coordinates.
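The baseline polynomial rectification that this work extends amounts to a least-squares fit from ground control points; a degree-2 sketch:

```python
import numpy as np

def fit_poly2(img_xy, map_xy):
    """img_xy: n x 2 image coordinates of control points;
    map_xy: n x 2 map coordinates. Returns the 6 x 2 coefficients of
    a second-order polynomial mapping, one column per map coordinate."""
    x, y = img_xy[:, 0], img_xy[:, 1]
    A = np.column_stack([np.ones_like(x), x, y, x * y, x**2, y**2])
    coeffs, *_ = np.linalg.lstsq(A, map_xy, rcond=None)
    return coeffs
```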
Fingerprint recognition system by use of graph matching
NASA Astrophysics Data System (ADS)
Shen, Wei; Shen, Jun; Zheng, Huicheng
2001-09-01
Fingerprint recognition is an important subject in biometrics for identifying or verifying persons by physiological characteristics, and has found wide application in different domains. In the present paper, we present a fingerprint recognition system that combines singular points and structures. The principal processing steps in our system are: preprocessing and ridge segmentation, singular point extraction and selection, graph representation, and fingerprint recognition by graph matching. Our fingerprint recognition system has been implemented and tested on many fingerprint images, and the experimental results are satisfactory. Different techniques are used in our system, such as fast calculation of the orientation field, local fuzzy dynamical thresholding, algebraic analysis of connections, and fingerprint representation and matching by graphs. We find that, for a fingerprint database that is not very large, the recognition rate is very high even without a prior coarse category classification. The system works well for both one-to-few and one-to-many problems.
NASA Astrophysics Data System (ADS)
Ayu Cyntya Dewi, Dyah; Shaufiah; Asror, Ibnu
2018-03-01
SMS (Short Message Service) remains one of the main communication services, even though phones now offer various applications. Along with the development of other communication media, some countries lowered SMS rates to keep mobile users interested, which resulted in an increase in spam SMS used by several parties, e.g., for advertisement. Given the multilingual nature of documents in SMS messages, on the Web, and elsewhere, effective multilingual or cross-lingual processing techniques are becoming increasingly important. In this research, the messages are first preprocessed and then represented as a graph model, which is classified using the GKNN method. The maximum accuracy obtained is 98.86% with Indonesian-language training and testing data, K = 10, and a threshold of 0.001.
A hybrid skull-stripping algorithm based on adaptive balloon snake models
NASA Astrophysics Data System (ADS)
Liu, Hung-Ting; Sheu, Tony W. H.; Chang, Herng-Hua
2013-02-01
Skull-stripping is one of the most important preprocessing steps in neuroimage analysis. We proposed a hybrid algorithm based on an adaptive balloon snake model to handle this challenging task. The proposed framework consists of two stages: first, the fuzzy possibilistic c-means (FPCM) is used for voxel clustering, which provides a labeled image for the snake contour initialization. In the second stage, the contour is initialized outside the brain surface based on the FPCM result and evolves under the guidance of the balloon snake model, which drives the contour with an adaptive inward normal force to capture the boundary of the brain. The similarity indices indicate that our method outperformed the BSE and BET methods in skull-stripping the MR image volumes in the IBSR data set. Experimental results show the effectiveness of this new scheme and potential applications in a wide variety of skull-stripping applications.
A novel method to detect shadows on multispectral images
NASA Astrophysics Data System (ADS)
Dağlayan Sevim, Hazan; Yardımcı Çetin, Yasemin; Özışık Başkurt, Didem
2016-10-01
Shadowing occurs when the direct light coming from a light source is obstructed by tall man-made structures, mountains, or clouds. Since shadow regions are illuminated only by scattered light, the true spectral properties of the objects are not observed in such regions. Therefore, many object classification and change detection problems utilize shadow detection as a preprocessing step. Besides, shadows are useful for obtaining 3D information about objects, such as estimating the height of buildings. With the pervasiveness of remote sensing images, shadow detection is ever more important. This study aims to develop a shadow detection method for multispectral images based on a transformation of the C1C2C3 color space and the contribution of NIR bands. The proposed method is tested on Worldview-2 images covering Ankara, Turkey, acquired at different times. The new index is applied to these 8-band multispectral images, which include two NIR bands. The method is compared with methods from the literature.
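One way to picture a C1C2C3-plus-NIR shadow test is sketched below; the thresholds and the exact combination are assumptions, since the paper's actual index is not reproduced here:

```python
import numpy as np

def shadow_mask(r, g, b, nir, c3_thresh=0.9, nir_thresh=0.2):
    """c3 = arctan(B / max(R, G)) tends to be high in shadow, while
    NIR reflectance is low there. Inputs: float reflectance bands."""
    c3 = np.arctan2(b, np.maximum(r, g))
    return (c3 > c3_thresh) & (nir < nir_thresh)
```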
Detection of Melanoma Skin Cancer in Dermoscopy Images
NASA Astrophysics Data System (ADS)
Eltayef, Khalid; Li, Yongmin; Liu, Xiaohui
2017-02-01
Malignant melanoma is the most hazardous type of human skin cancer and its incidence has been rapidly increasing. Early detection of malignant melanoma in dermoscopy images is very important and critical, since detection at an early stage can be helpful in curing it. Computer-aided diagnosis systems can be very helpful in facilitating the early detection of cancers for dermatologists. In this paper, we present a novel method for the detection of melanoma skin cancer. To detect hair and other noise in the images, a pre-processing step is carried out by applying a bank of directional filters, and an image inpainting method is then implemented to fill in the removed regions. Fuzzy C-Means and Markov Random Field methods are used to delineate the border of the lesion area in the images. The method was evaluated on a dataset of 200 dermoscopic images and produced superior results compared to alternative methods.
Wilson, Scott; Bowyer, Andrea; Harrap, Stephen B
2015-01-01
The clinical characterization of cardiovascular dynamics during hemodialysis (HD) has important pathophysiological implications from diagnostic, cardiovascular risk assessment, and treatment efficacy perspectives. Currently the diagnosis of significant intradialytic systolic blood pressure (SBP) changes among HD patients is imprecise and opportunistic, reliant upon the presence of hypotensive symptoms in conjunction with coincident but isolated noninvasive brachial cuff blood pressure (NIBP) readings. Considering hemodynamic variables as a time series makes a continuous recording approach more desirable than intermittent measures; however, in the clinical environment, the data signal is susceptible to corruption by both impulsive and Gaussian-type noise. Signal preprocessing is an attractive solution to this problem. Prospectively collected continuous noninvasive SBP data over the short-break intradialytic period in ten patients were preprocessed using a novel median hybrid filter (MHF) algorithm and compared with 50 time-coincident pairs of intradialytic NIBP measures from routine HD practice. The median hybrid preprocessing technique for continuously acquired cardiovascular data yielded a dynamic regression without significant noise and artifact, suitable for high-level profiling of time-dependent SBP behavior. Signal accuracy is highly comparable with standard NIBP measurement, with the added clinical benefit of dynamic real-time hemodynamic information.
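A single-stage FIR median hybrid filter illustrates the principle (the paper's MHF variant may cascade several stages or use different substructures):

```python
import numpy as np

def median_hybrid_filter(x, w=5):
    """Output at each sample is the median of the left-window mean,
    the current sample, and the right-window mean; robust to
    impulsive noise while tracking slow trends."""
    y = x.astype(float).copy()
    for n in range(w, len(x) - w):
        left = x[n - w:n].mean()
        right = x[n + 1:n + w + 1].mean()
        y[n] = np.median([left, x[n], right])
    return y
```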
Machine learning for the automatic localisation of foetal body parts in cine-MRI scans
NASA Astrophysics Data System (ADS)
Bowles, Christopher; Nowlan, Niamh C.; Hayat, Tayyib T. A.; Malamateniou, Christina; Rutherford, Mary; Hajnal, Joseph V.; Rueckert, Daniel; Kainz, Bernhard
2015-03-01
Being able to automate the location of individual foetal body parts has the potential to dramatically reduce the work required to analyse time resolved foetal Magnetic Resonance Imaging (cine-MRI) scans, for example, for use in the automatic evaluation of the foetal development. Currently, manual preprocessing of every scan is required to locate body parts before analysis can be performed, leading to a significant time overhead. With the volume of scans becoming available set to increase as cine-MRI scans become more prevalent in clinical practice, this stage of manual preprocessing is a bottleneck, limiting the data available for further analysis. Any tools which can automate this process will therefore save many hours of research time and increase the rate of new discoveries in what is a key area in understanding early human development. Here we present a series of techniques which can be applied to foetal cine-MRI scans in order to first locate and then differentiate between individual body parts. A novel approach to maternal movement suppression and segmentation using Fourier transforms is put forward as a preprocessing step, allowing for easy extraction of short movements of individual foetal body parts via the clustering of optical flow vector fields. These body part movements are compared to a labelled database and probabilistically classified before being spatially and temporally combined to give a final estimate for the location of each body part.
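The optical-flow stage between consecutive frames (after the Fourier-based movement suppression, which is not shown) could be sketched with OpenCV; the Farneback parameters are illustrative:

```python
import cv2
import numpy as np

def frame_flow(f0, f1):
    """Dense optical flow between two consecutive cine-MRI frames;
    the resulting vector fields would then be clustered into
    candidate body-part movements."""
    return cv2.calcOpticalFlowFarneback(
        f0.astype(np.uint8), f1.astype(np.uint8), None,
        pyr_scale=0.5, levels=3, winsize=15,
        iterations=3, poly_n=5, poly_sigma=1.2, flags=0)
```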
Binarization algorithm for document image with complex background
NASA Astrophysics Data System (ADS)
Miao, Shaojun; Lu, Tongwei; Min, Feng
2015-12-01
The most important step in image preprocessing for Optical Character Recognition (OCR) is binarization. Due to complex backgrounds and varying light in text images, binarization is a very difficult problem. This paper presents an improved binarization algorithm, which can be divided into several steps. First, an approximation of the background is obtained by polynomial fitting, and the text is sharpened using a bilateral filter. Second, image contrast compensation is performed to reduce the impact of light and improve the contrast of the original image. Third, the first derivatives of the pixels in the compensated image are calculated to obtain an average threshold value, from which the edges are detected. Fourth, the stroke width of the text is estimated by measuring the distances between edge pixels; the final stroke width is determined by choosing the most frequent distance in the histogram. Fifth, a window size is calculated from the final stroke width, and a local threshold estimation approach binarizes the image. Finally, small noise is removed using morphological operators. The experimental results show that the proposed method can effectively remove the noise caused by complex backgrounds and varying light.
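The final local-thresholding step can be sketched as follows; the rule "window = three times the stroke width" is an illustrative assumption, as the paper derives its own window size:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def local_binarize(img, stroke_width):
    """Local mean threshold over a window sized from the estimated
    stroke width; pixels darker than the local mean become text (1)."""
    win = max(3, 3 * int(stroke_width)) | 1     # force odd window size
    local_mean = uniform_filter(img.astype(float), size=win)
    return (img < local_mean).astype(np.uint8)
```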
Selecting a plutonium vitrification process
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jouan, A.
1996-05-01
Vitrification of plutonium is one means of mitigating its potential danger. This option is technically feasible, even if it is not the solution advocated in France. Two situations are possible, depending on whether or not the glass matrix also contains fission products; concentrations of up to 15% should be achievable for plutonium alone, whereas the upper limit is 3% in the presence of fission products. The French continuous vitrification process appears to be particularly suitable for plutonium vitrification: its capacity is compatible with the required throughput, and the compact dimensions of the process equipment prevent a criticality hazard. Preprocessing of plutonium metal, to convert it to PuO2 or to a nitric acid solution, may prove advantageous or even necessary depending on whether a dry or wet process is adopted. The process may involve a single step (vitrification of Pu or PuO2 mixed with glass frit) or may include a prior calcination step, notably if the plutonium is to be incorporated into a fission product glass. It is important to weigh the advantages and drawbacks of all the possible options in terms of feasibility, safety and cost-effectiveness.
Poplová, Michaela; Sovka, Pavel; Cifra, Michal
2017-01-01
Photonic signals are broadly exploited in communication and sensing and they typically exhibit Poisson-like statistics. In a common scenario where the intensity of the photonic signals is low and one needs to remove a nonstationary trend of the signals for any further analysis, one faces an obstacle: due to the dependence between the mean and variance typical for a Poisson-like process, information about the trend remains in the variance even after the trend has been subtracted, possibly yielding artifactual results in further analyses. Commonly available detrending or normalizing methods cannot cope with this issue. To alleviate this issue we developed a suitable pre-processing method for the signals that originate from a Poisson-like process. In this paper, a Poisson pre-processing method for nonstationary time series with Poisson distribution is developed and tested on computer-generated model data and experimental data of chemiluminescence from human neutrophils and mung seeds. The presented method transforms a nonstationary Poisson signal into a stationary signal with a Poisson distribution while preserving the type of photocount distribution and phase-space structure of the signal. The importance of the suggested pre-processing method is shown in Fano factor and Hurst exponent analysis of both computer-generated model signals and experimental photonic signals. It is demonstrated that our pre-processing method is superior to standard detrending-based methods whenever further signal analysis is sensitive to variance of the signal. PMID:29216207
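The windowed Fano factor used in the paper's evaluation is simple to compute; for a stationary Poisson signal it stays near 1, so departures after detrending expose leftover trend in the variance:

```python
import numpy as np

def fano_factor(counts, window):
    """Variance-to-mean ratio of photocounts aggregated into
    non-overlapping windows."""
    n = (len(counts) // window) * window
    w = counts[:n].reshape(-1, window).sum(axis=1)
    return w.var() / w.mean()
```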
Geena 2, improved automated analysis of MALDI/TOF mass spectra.
Romano, Paolo; Profumo, Aldo; Rocco, Mattia; Mangerini, Rosa; Ferri, Fabio; Facchiano, Angelo
2016-03-02
Mass spectrometry (MS) is producing high volumes of data supporting oncological sciences, especially for translational research. Most of the related elaborations can be carried out by combining existing tools at different levels, but little is currently available for the automation of the fundamental steps. For the analysis of MALDI/TOF spectra, a number of pre-processing steps are required, including joining of isotopic abundances for a given molecular species, normalization of signals against an internal standard, background noise removal, averaging of multiple spectra from the same sample, and alignment of spectra from different samples. In this paper, we present Geena 2, a public software tool for the automated execution of these pre-processing steps for MALDI/TOF spectra. Geena 2 has been developed in a Linux-Apache-MySQL-PHP web development environment, with scripts in PHP and Perl. Input and output are managed in simple formats that can be consumed by any database system and spreadsheet software. Input data may also be stored in a MySQL database. The processing methods are based on original heuristic algorithms which are introduced in the paper. Three simple and intuitive web interfaces are available: the Standard Search Interface, which allows complete control over all parameters; the Bright Search Interface, which leaves the user the possibility of tuning the parameters for the alignment of spectra; and the Quick Search Interface, which limits the number of parameters to a minimum by using default values for the majority of them. Geena 2 has been utilized, in conjunction with a statistical analysis tool, in three published experimental works: a proteomic study on the effects of long-term cryopreservation on the low molecular weight fraction of the serum proteome, and two retrospective serum proteomic studies, one on the risk of developing breast cancer in patients affected by gross cystic disease of the breast (GCDB) and the other on the identification of a predictor of breast cancer mortality following breast cancer surgery, whose results were validated by ELISA, a completely alternative method. Geena 2 is a public tool for the automated pre-processing of MS data originated by MALDI/TOF instruments, with a simple and intuitive web interface. It is now under active development for the inclusion of further filtering options and for the adoption of standard formats for MS spectra.
Epidermal segmentation in high-definition optical coherence tomography.
Li, Annan; Cheng, Jun; Yow, Ai Ping; Wall, Carolin; Wong, Damon Wing Kee; Tey, Hong Liang; Liu, Jiang
2015-01-01
Epidermis segmentation is a crucial step in many dermatological applications. Recently, high-definition optical coherence tomography (HD-OCT) has been developed and applied to imaging subsurface skin tissues. In this paper, a novel epidermis segmentation method for HD-OCT is proposed in which the epidermis is segmented in three steps: weighted least squares-based pre-processing, graph-based skin surface detection, and local integral projection-based dermal-epidermal junction detection. Using a dataset of five 3D volumes, we found that this method correlates well with the conventional method of manually marking out the epidermis. This method can therefore serve to effectively and rapidly delineate the epidermis for the study and clinical management of skin diseases.
flowVS: channel-specific variance stabilization in flow cytometry
DOE Office of Scientific and Technical Information (OSTI.GOV)
Azad, Ariful; Rajwa, Bartek; Pothen, Alex
2016-07-28
Comparing phenotypes of heterogeneous cell populations from multiple biological conditions is at the heart of scientific discovery based on flow cytometry (FC). When the biological signal is measured by the average expression of a biomarker, standard statistical methods require that variance be approximately stabilized in populations to be compared. Since the mean and variance of a cell population are often correlated in fluorescence-based FC measurements, a preprocessing step is needed to stabilize the within-population variances.
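The core idea can be sketched compactly. flowVS applies a channel-specific asinh transform; the snippet below, an illustration rather than the package's exact objective, picks the asinh cofactor that most nearly equalizes population variances as judged by Bartlett's statistic:

```python
import numpy as np
from scipy.stats import bartlett

def best_asinh_cofactor(populations, cofactors=np.logspace(-1, 4, 60)):
    """Return the cofactor c for which asinh(x / c) gives the most
    homogeneous within-population variances (smallest Bartlett stat)."""
    best_c, best_stat = None, np.inf
    for c in cofactors:
        stat, _ = bartlett(*[np.arcsinh(p / c) for p in populations])
        if stat < best_stat:
            best_c, best_stat = c, stat
    return best_c

rng = np.random.default_rng(1)
# two toy populations whose spread grows with the mean, as in FC data
pops = [rng.normal(mu, 0.2 * mu, 2000) for mu in (300.0, 3000.0)]
print(best_asinh_cofactor(pops))
```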
Comparison of performance of some common Hartmann-Shack centroid estimation methods
NASA Astrophysics Data System (ADS)
Thatiparthi, C.; Ommani, A.; Burman, R.; Thapa, D.; Hutchings, N.; Lakshminarayanan, V.
2016-03-01
The accuracy of the estimation of optical aberrations by measuring the distorted wave front using a Hartmann-Shack wave front sensor (HSWS) mainly depends upon the measurement accuracy of the centroid of the focal spot. The most commonly used methods for centroid estimation, such as the brightest spot centroid, first moment centroid, weighted center of gravity, and intensity weighted center of gravity, are generally applied on the entire individual sub-apertures of the lenslet array. However, these centroid estimates are sensitive to the influence of reflections, scattered light, and noise, especially when the signal spot area is small compared to the whole sub-aperture area. In this paper, we compare the performance of the commonly used centroiding methods for the estimation of optical aberrations, with and without the use of some pre-processing steps (thresholding, Gaussian smoothing, and adaptive windowing). As an example we use the aberrations of the human eye model. This is done using raw data collected from a custom-made ophthalmic aberrometer and a model eye to emulate myopic and hypermetropic defocus values up to 2 Diopters. We show that any simple centroiding algorithm is sufficient for ophthalmic applications for estimating aberrations within the typical clinically acceptable limit of a quarter-Diopter margin, when certain pre-processing steps to reduce the impact of external factors are used.
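A toy version of the comparison makes the role of pre-processing concrete. The sketch below (function name and threshold are assumptions) computes a first-moment centroid with and without background thresholding; on a noisy sub-aperture, the unthresholded estimate is biased toward the window centre:

```python
import numpy as np

def first_moment_centroid(spot, threshold=None):
    """Intensity-weighted centroid of a sub-aperture image; optional
    thresholding suppresses background noise and scattered light."""
    img = spot.astype(float)
    if threshold is not None:
        img = np.where(img > threshold, img - threshold, 0.0)
    ys, xs = np.indices(img.shape)
    return (ys * img).sum() / img.sum(), (xs * img).sum() / img.sum()

rng = np.random.default_rng(2)
frame = rng.uniform(0, 5, (32, 32))      # background noise
frame[10:15, 18:23] += 100.0             # focal spot centred at (12, 20)
print(first_moment_centroid(frame))                  # biased toward (15.5, 15.5)
print(first_moment_centroid(frame, threshold=20.0))  # close to (12, 20)
```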
Software for pre-processing Illumina next-generation sequencing short read sequences
2014-01-01
Background When compared to Sanger sequencing technology, next-generation sequencing (NGS) technologies are hindered by shorter sequence read length, higher base-call error rate, non-uniform coverage, and platform-specific sequencing artifacts. These characteristics lower the quality of their downstream analyses, e.g. de novo and reference-based assembly, by introducing sequencing artifacts and errors that may contribute to incorrect interpretation of data. Although many tools have been developed for quality control and pre-processing of NGS data, none of them provide flexible and comprehensive trimming options in conjunction with parallel processing to expedite pre-processing of large NGS datasets. Methods We developed ngsShoRT (next-generation sequencing Short Reads Trimmer), a flexible and comprehensive open-source software package written in Perl that provides a set of algorithms commonly used for pre-processing NGS short read sequences. We compared the features and performance of ngsShoRT with existing tools: CutAdapt, NGS QC Toolkit and Trimmomatic. We also compared the effects of using pre-processed short read sequences generated by different algorithms on de novo and reference-based assembly for three different genomes: Caenorhabditis elegans, Saccharomyces cerevisiae S288c, and Escherichia coli O157 H7. Results Several combinations of ngsShoRT algorithms were tested on publicly available Illumina GA II, HiSeq 2000, and MiSeq eukaryotic and bacterial genomic short read sequences with the focus on removing sequencing artifacts and low-quality reads and/or bases. Our results show that across three organisms and three sequencing platforms, trimming improved the mean quality scores of trimmed sequences. Using trimmed sequences for de novo and reference-based assembly improved assembly quality as well as assembler performance. In general, ngsShoRT outperformed comparable trimming tools in terms of trimming speed and improvement of de novo and reference-based assembly as measured by assembly contiguity and correctness. Conclusions Trimming of short read sequences can improve the quality of de novo and reference-based assembly and assembler performance. The parallel processing capability of ngsShoRT reduces trimming time and improves the memory efficiency when dealing with large datasets. We recommend combining sequencing artifact removal with quality-score-based read filtering and base trimming as the most consistent method for improving sequence quality and downstream assemblies. ngsShoRT source code, user guide and tutorial are available at http://research.bioinformatics.udel.edu/genomics/ngsShoRT/. ngsShoRT can be incorporated as a pre-processing step in genome and transcriptome assembly projects. PMID:24955109
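A minimal sketch of the kind of quality-based trimming such tools perform (a generic 3'-end trimmer over Phred+33 scores, not ngsShoRT's actual algorithms or defaults):

```python
def trim_3prime(seq, qual, min_q=20, phred_offset=33):
    """Trim bases from the 3' end of a read while their Phred quality
    is below min_q, a common pre-assembly clean-up step."""
    cut = len(seq)
    while cut > 0 and ord(qual[cut - 1]) - phred_offset < min_q:
        cut -= 1
    return seq[:cut], qual[:cut]

seq, qual = "ACGTACGTACGT", "IIIIIIII##,#"   # 'I' = Q40, '#' = Q2, ',' = Q11
print(trim_3prime(seq, qual))                # -> ('ACGTACGT', 'IIIIIIII')
```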
Liang, Xia; Wang, Jinhui; Yan, Chaogan; Shu, Ni; Xu, Ke; Gong, Gaolang; He, Yong
2012-01-01
Graph theoretical analysis of brain networks based on resting-state functional MRI (R-fMRI) has attracted a great deal of attention in recent years. These analyses often involve the selection of correlation metrics and specific preprocessing steps. However, the influence of these factors on the topological properties of functional brain networks has not been systematically examined. Here, we investigated the influences of correlation metric choice (Pearson's correlation versus partial correlation), global signal presence (regressed or not) and frequency band selection [slow-5 (0.01–0.027 Hz) versus slow-4 (0.027–0.073 Hz)] on the topological properties of both binary and weighted brain networks derived from them, and we employed test-retest (TRT) analyses for further guidance on how to choose the “best” network modeling strategy from the reliability perspective. Our results show significant differences in global network metrics associated with both correlation metrics and global signals. Analysis of nodal degree revealed differing hub distributions for brain networks derived from Pearson's correlation versus partial correlation. TRT analysis revealed that the reliability of both global and local topological properties is modulated by correlation metrics and the global signal, with the highest reliability observed for Pearson's-correlation-based brain networks without global signal removal (WOGR-PEAR). The nodal reliability exhibited a spatially heterogeneous distribution wherein regions in association and limbic/paralimbic cortices showed moderate TRT reliability in Pearson's-correlation-based brain networks. Moreover, we found that there were significant frequency-related differences in topological properties of WOGR-PEAR networks, and brain networks derived in the 0.027–0.073 Hz band exhibited greater reliability than those in the 0.01–0.027 Hz band. Taken together, our results provide direct evidence regarding the influences of correlation metrics and specific preprocessing choices on both the global and nodal topological properties of functional brain networks. This study also has important implications for how to choose reliable analytical schemes in brain network studies. PMID:22412922
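As a compact illustration of the modeling choices examined above (correlation metric, thresholding), the sketch below builds a binary network from regional time series using Pearson's correlation at a fixed edge density and returns nodal degree; the density value and helper name are assumptions:

```python
import numpy as np

def binary_network(ts, density=0.1):
    """Pearson-correlation network from a (time x regions) array,
    binarized by keeping the strongest `density` fraction of edges."""
    r = np.corrcoef(ts.T)
    np.fill_diagonal(r, 0.0)
    n = r.shape[0]
    n_edges = int(density * n * (n - 1) / 2)
    thresh = np.sort(r[np.triu_indices(n, k=1)])[-n_edges]
    adj = (r >= thresh).astype(int)
    return adj, adj.sum(axis=0)           # adjacency and nodal degree

rng = np.random.default_rng(3)
ts = rng.standard_normal((200, 30))       # 200 volumes, 30 regions
adj, degree = binary_network(ts)
print(degree)
```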
Pre-processing Tasks in Indonesian Twitter Messages
NASA Astrophysics Data System (ADS)
Hidayatullah, A. F.; Ma'arif, M. R.
2017-01-01
Twitter text messages are very noisy, and tweet data are unstructured and complicated. The focus of this work is to investigate pre-processing techniques for Twitter messages in Bahasa Indonesia. The main goal of this experiment is to clean the tweet data for further analysis; thus, the objective of the pre-processing task is simply to remove all meaningless characters and keep the valuable words. In this research, we divide our proposed pre-processing experiments into two parts. The first part consists of common pre-processing tasks; the second part is a specific pre-processing task for tweet data. The experimental results show that by employing a specific pre-processing task tailored to the characteristics of tweet data we obtain a more valuable result: the output contains noticeably fewer meaningless words than the output of the common pre-processing tasks alone.
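A minimal sketch of what a combined common-plus-tweet-specific cleaning pass can look like (the authors' exact rules are not listed in the abstract, so these regular expressions are assumptions):

```python
import re

def clean_tweet(text):
    """Strip URLs, @mentions, #hashtags, the RT marker, and all
    non-letter characters; lowercase and collapse whitespace."""
    text = re.sub(r"http\S+|www\.\S+", " ", text)   # URLs
    text = re.sub(r"[@#]\w+", " ", text)            # mentions and hashtags
    text = re.sub(r"\bRT\b", " ", text)             # retweet marker
    text = re.sub(r"[^a-zA-Z\s]", " ", text)        # punctuation, digits, emoji
    return re.sub(r"\s+", " ", text).strip().lower()

print(clean_tweet("RT @user: Banjir di Jakarta!! http://t.co/xyz #banjir"))
# -> 'banjir di jakarta'
```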
Assessing Species Diversity Using Metavirome Data: Methods and Challenges.
Herath, Damayanthi; Jayasundara, Duleepa; Ackland, David; Saeed, Isaam; Tang, Sen-Lin; Halgamuge, Saman
2017-01-01
Assessing biodiversity is an important step in the study of microbial ecology associated with a given environment. Multiple indices have been used to quantify species diversity, which is a key biodiversity measure. Measuring species diversity of viruses in different environments remains a challenge relative to measuring the diversity of other microbial communities. Metagenomics has played an important role in elucidating viral diversity by conducting metavirome studies; however, metavirome data are of high complexity requiring robust data preprocessing and analysis methods. In this review, existing bioinformatics methods for measuring species diversity using metavirome data are categorised broadly as either sequence similarity-dependent methods or sequence similarity-independent methods. The former includes a comparison of DNA fragments or assemblies generated in the experiment against reference databases for quantifying species diversity, whereas estimates from the latter are independent of the knowledge of existing sequence data. Current methods and tools are discussed in detail, including their applications and limitations. Drawbacks of the state-of-the-art method are demonstrated through results from a simulation. In addition, alternative approaches are proposed to overcome the challenges in estimating species diversity measures using metavirome data.
Anifah, Lilik; Purnama, I Ketut Eddy; Hariadi, Mochamad; Purnomo, Mauridhi Hery
2013-01-01
Localization is the first step in osteoarthritis (OA) classification. Manual classification, however, is time-consuming, tedious, and expensive. The proposed system is designed as a decision support system for medical doctors to classify the severity of knee OA. A method is proposed here to localize the joint space area and then classify OA into KL-Grade 0, KL-Grade 1, KL-Grade 2, KL-Grade 3, and KL-Grade 4 in four steps: preprocessing, segmentation, feature extraction, and classification. In the proposed system, right and left knee detection was performed by employing Contrast-Limited Adaptive Histogram Equalization (CLAHE) and template matching. The Gabor kernel, row sum graph, and moment methods were used to localize the joint space area of the knee. CLAHE is used in the preprocessing step, i.e., to normalize the varied intensities. The segmentation process was conducted using the Gabor kernel, template matching, row sum graph, and gray level center of mass method. GLCM features (contrast, correlation, energy, and homogeneity) were employed as training data. Overall, 50 samples were evaluated for training and 258 for testing. Experimental results showed the best performance using the Gabor kernel with parameters α=8, θ=0, Ψ=[0 π/2], γ=0.8, N=4, with 5000 iterations, a momentum value of 0.5, and α0=0.6 for the classification process. The run gave a classification accuracy of 93.8% for KL-Grade 0, 70% for KL-Grade 1, 4% for KL-Grade 2, 10% for KL-Grade 3, and 88.9% for KL-Grade 4. PMID:23525188
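As a sketch of the CLAHE pre-processing step alone, using OpenCV's implementation (the clip limit and tile size here are illustrative, not the paper's settings):

```python
import cv2
import numpy as np

# stand-in 8-bit radiograph; in practice this would be the knee X-ray
img = np.random.default_rng(4).integers(0, 256, (256, 256), np.uint8)

clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
equalized = clahe.apply(img)              # intensity-normalized image
print(img.std(), equalized.std())
```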
Skin tumor area extraction using an improved dynamic programming approach.
Abbas, Qaisar; Celebi, M E; Fondón García, Irene
2012-05-01
Border (B) description of melanoma and other pigmented skin lesions is one of the most important tasks for the clinical diagnosis of dermoscopy images using the ABCD rule. For an accurate description of the border, there must be an effective skin tumor area extraction (STAE) method. However, this task is complicated due to uneven illumination, artifacts present in the lesions and smooth areas or fuzzy borders of the desired regions. In this paper, a novel STAE algorithm based on improved dynamic programming (IDP) is presented. The STAE technique consists of the following four steps: color space transform, pre-processing, rough tumor area detection and refinement of the segmented area. The procedure is performed in the CIE L(*) a(*) b(*) color space, which is approximately uniform and is therefore related to dermatologist's perception. After pre-processing the skin lesions to reduce artifacts, the DP algorithm is improved by introducing a local cost function, which is based on color and texture weights. The STAE method is tested on a total of 100 dermoscopic images. In order to compare the performance of STAE with other state-of-the-art algorithms, various statistical measures based on dermatologist-drawn borders are utilized as a ground truth. The proposed method outperforms the others with a sensitivity of 96.64%, a specificity of 98.14% and an error probability of 5.23%. The results demonstrate that this STAE method by IDP is an effective solution when compared with other state-of-the-art segmentation techniques. The proposed method can accurately extract tumor borders in dermoscopy images. © 2011 John Wiley & Sons A/S.
Using Fourier transform IR spectroscopy to analyze biological materials
Baker, Matthew J; Trevisan, Júlio; Bassan, Paul; Bhargava, Rohit; Butler, Holly J; Dorling, Konrad M; Fielden, Peter R; Fogarty, Simon W; Fullwood, Nigel J; Heys, Kelly A; Hughes, Caryn; Lasch, Peter; Martin-Hirsch, Pierre L; Obinaju, Blessing; Sockalingum, Ganesh D; Sulé-Suso, Josep; Strong, Rebecca J; Walsh, Michael J; Wood, Bayden R; Gardner, Peter; Martin, Francis L
2015-01-01
IR spectroscopy is an excellent method for biological analyses. It enables the nonperturbative, label-free extraction of biochemical information and images toward diagnosis and the assessment of cell functionality. Although not strictly microscopy in the conventional sense, it allows the construction of images of tissue or cell architecture by the passing of spectral data through a variety of computational algorithms. Because such images are constructed from fingerprint spectra, the notion is that they can be an objective reflection of the underlying health status of the analyzed sample. One of the major difficulties in the field has been determining a consensus on spectral pre-processing and data analysis. This manuscript brings together as coauthors some of the leaders in this field to allow the standardization of methods and procedures for adapting a multistage approach to a methodology that can be applied to a variety of cell biological questions or used within a clinical setting for disease screening or diagnosis. We describe a protocol for collecting IR spectra and images from biological samples (e.g., fixed cytology and tissue sections, live cells or biofluids) that assesses the instrumental options available, appropriate sample preparation, different sampling modes as well as important advances in spectral data acquisition. After acquisition, data processing consists of a sequence of steps including quality control, spectral pre-processing, feature extraction and classification of the supervised or unsupervised type. A typical experiment can be completed and analyzed within hours. Example results are presented on the use of IR spectra combined with multivariate data processing. PMID:24992094
Assessment of data pre-processing methods for LC-MS/MS-based metabolomics of uterine cervix cancer.
Chen, Yanhua; Xu, Jing; Zhang, Ruiping; Shen, Guoqing; Song, Yongmei; Sun, Jianghao; He, Jiuming; Zhan, Qimin; Abliz, Zeper
2013-05-07
A metabolomics strategy based on rapid resolution liquid chromatography/tandem mass spectrometry (RRLC-MS/MS) and multivariate statistics has been implemented to identify potential biomarkers in uterine cervix cancer. Due to the importance of the data pre-processing method, three popular software packages have been compared. Then they have been used to acquire respective data matrices from the same LC-MS/MS data. Multivariate statistics was subsequently used to identify significantly changed biomarkers for uterine cervix cancer from the resulting data matrices. The reliabilities of the identified discriminated metabolites have been further validated on the basis of manually extracted data and ROC curves. Nine potential biomarkers have been identified as having a close relationship with uterine cervix cancer. Considering these in combination as a biomarker group, the AUC amounted to 0.997, with a sensitivity of 92.9% and a specificity of 95.6%. The prediction accuracy was 96.6%. Among these potential biomarkers, the amounts of four purine derivatives were greatly decreased, which might be related to a P2 receptor that might lead to a decrease in cell number through apoptosis. Moreover, only two of them were identified simultaneously by all of the pre-processing tools. The results have demonstrated that the data pre-processing method could seriously bias the metabolomics results. Therefore, application of two or more data pre-processing methods would reveal a more comprehensive set of potential biomarkers in non-targeted metabolomics, before a further validation with LC-MS/MS based targeted metabolomics in MRM mode could be conducted.
ERDC MSRC Resource. High Performance Computing for the Warfighter. Spring 2006
2006-01-01
named Ruby, and the HP/Compaq SC45, named Emerald, continue to add their unique sparkle to the ERDC MSRC computer infrastructure. ERDC invited the ... configuration on B-52H purchased additional memory for the login nodes so that this part of the solution process could be done as a preprocessing step. On ... application and system services. Of the service nodes, 10 are login nodes and 23 are input/output (I/O) server nodes for the Lustre file system (i.e., the
Image processing methods used to simulate flight over remotely sensed data
NASA Technical Reports Server (NTRS)
Mortensen, H. B.; Hussey, K. J.; Mortensen, R. A.
1988-01-01
It has been demonstrated that image processing techniques can provide an effective means of simulating flight over remotely sensed data (Hussey et al. 1986). This paper explains the methods used to simulate and animate three-dimensional surfaces from two-dimensional imagery. The preprocessing techniques used on the input data, the selection of the animation sequence, the generation of the animation frames, and the recording of the animation are covered. The software used for all steps is discussed.
Preprocessed Consortium for Neuropsychiatric Phenomics dataset.
Gorgolewski, Krzysztof J; Durnez, Joke; Poldrack, Russell A
2017-01-01
Here we present preprocessed MRI data of 265 participants from the Consortium for Neuropsychiatric Phenomics (CNP) dataset. The preprocessed dataset includes minimally preprocessed data in the native, MNI and surface spaces accompanied with potential confound regressors, tissue probability masks, brain masks and transformations. In addition the preprocessed dataset includes unthresholded group level and single subject statistical maps from all tasks included in the original dataset. We hope that availability of this dataset will greatly accelerate research.
Mohebbi, Maryam; Ghassemian, Hassan; Asl, Babak Mohammadzadeh
2011-05-01
This paper aims to propose an effective paroxysmal atrial fibrillation (PAF) predictor which is based on the analysis of the heart rate variability (HRV) signal. Predicting the onset of PAF, based on non-invasive techniques, is clinically important and can be invaluable in order to avoid useless therapeutic interventions and to minimize the risks for the patients. This method consists of four steps: preprocessing, feature extraction, feature reduction, and classification. In the first step, the QRS complexes are detected from the electrocardiogram (ECG) signal and then the HRV signal is extracted. In the next step, the recurrence plot (RP) of the HRV signal is obtained and six features are extracted to characterize the basic patterns of the RP. These features consist of the length of the longest diagonal segments, the average length of the diagonal lines, entropy, trapping time, the length of the longest vertical line, and the recurrence trend. In the third step, these features are reduced to three features by the linear discriminant analysis (LDA) technique. Using LDA not only reduces the number of the input features, but also increases the classification accuracy by selecting the most discriminating features. Finally, a support vector machine-based classifier is used to classify the HRV signals. The performance of the proposed method in prediction of PAF episodes was evaluated using the Atrial Fibrillation Prediction Database, which consists of both 30-minute ECG recordings ending just prior to the onset of PAF and segments at least 45 min distant from any PAF events. The obtained sensitivity, specificity, and positive predictivity were 96.55%, 100%, and 100%, respectively.
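A minimal sketch of the recurrence plot that underlies the feature extraction step, for a one-dimensional HRV series (the threshold and the simple recurrence-rate summary are illustrative; the paper's six RP features are not reproduced here):

```python
import numpy as np

def recurrence_plot(x, eps):
    """Binary recurrence matrix: R[i, j] = 1 when samples i and j of
    the HRV series are closer than eps."""
    d = np.abs(x[:, None] - x[None, :])
    return (d < eps).astype(int)

rng = np.random.default_rng(5)
rr = 0.8 + 0.05 * rng.standard_normal(300)   # toy RR-interval series (s)
rp = recurrence_plot(rr, eps=0.02)
print(rp.sum() / rp.size)                    # recurrence rate
```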
Automatic cortical thickness analysis on rodent brain
NASA Astrophysics Data System (ADS)
Lee, Joohwi; Ehlers, Cindy; Crews, Fulton; Niethammer, Marc; Budin, Francois; Paniagua, Beatriz; Sulik, Kathy; Johns, Josephine; Styner, Martin; Oguz, Ipek
2011-03-01
Localized difference in the cortex is one of the most useful morphometric traits in human and animal brain studies. There are many tools and methods already developed to automatically measure and analyze cortical thickness for the human brain. However, these tools cannot be directly applied to rodent brains due to the different scales; even adult rodent brains are 50 to 100 times smaller than human brains. This paper describes an algorithm for automatically measuring the cortical thickness of mouse and rat brains. The algorithm consists of three steps: segmentation, thickness measurement, and statistical analysis among experimental groups. The segmentation step provides the neocortex separation from other brain structures and thus is a preprocessing step for the thickness measurement. In the thickness measurement step, the thickness is computed by solving a Laplacian PDE and a transport equation. The Laplacian PDE first creates streamlines as an analogy of cortical columns; the transport equation computes the length of the streamlines. The result is stored as a thickness map over the neocortex surface. For the statistical analysis, it is important to sample thickness at corresponding points. This is achieved by the particle correspondence algorithm which minimizes entropy between dynamically moving sample points called particles. Since the computational cost of the correspondence algorithm may limit the number of corresponding points, we use thin-plate spline based interpolation to increase the number of corresponding sample points. As a driving application, we measured the thickness difference to assess the effects of adolescent intermittent ethanol exposure that persist into adulthood and performed a t-test between the control and exposed rat groups. We found significantly differing regions in both hemispheres.
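A minimal sketch of the Laplacian thickness idea on a toy 2D slab, assuming Jacobi iteration for Laplace's equation and Euler integration along the normalized gradient (the paper instead solves a transport equation for the streamline length; the grid size and spacing below are arbitrary):

```python
import numpy as np

h, w, dx = 40, 20, 0.1                     # grid and spacing (mm)
phi = np.zeros((h, w)); phi[-1, :] = 1.0   # potential: 0 inner, 1 outer boundary
for _ in range(5000):                      # Jacobi iterations of Laplace's eq.
    phi[1:-1, 1:-1] = 0.25 * (phi[2:, 1:-1] + phi[:-2, 1:-1]
                              + phi[1:-1, 2:] + phi[1:-1, :-2])
    phi[:, 0], phi[:, -1] = phi[:, 1], phi[:, -2]   # Neumann side walls

gy, gx = np.gradient(phi, dx)
y, x, length = 0.0, (w // 2) * dx, 0.0
while y < (h - 1) * dx - 1e-9:             # trace one streamline, bottom to top
    i, j = int(round(y / dx)), int(round(x / dx))
    g = np.array([gy[i, j], gx[i, j]]); g /= np.linalg.norm(g)
    y, x = y + g[0] * dx, x + g[1] * dx
    length += dx
print(length, (h - 1) * dx)                # streamline length ~ slab thickness
```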
CLASSIFYING MEDICAL IMAGES USING MORPHOLOGICAL APPEARANCE MANIFOLDS.
Varol, Erdem; Gaonkar, Bilwaj; Davatzikos, Christos
2013-12-31
Input features for medical image classification algorithms are extracted from raw images using a series of pre-processing steps. One common pre-processing step in computational neuroanatomy and functional brain mapping is the nonlinear registration of raw images to a common template space. Typically, the registration methods used are parametric and their output varies greatly with changes in parameters. Most results reported previously perform registration using a fixed parameter setting and use the results as input to the subsequent classification step. The variation in registration results due to choice of parameters thus translates to variation of performance of the classifiers that depend on the registration step for input. Analogous issues have been investigated in the computer vision literature, where image appearance varies with pose and illumination, thereby making classification vulnerable to these confounding parameters. The proposed methodology addresses this issue by sampling image appearances as registration parameters vary, and shows that better classification accuracies can be obtained this way, compared to the conventional approach.
ITSG-Grace2016 data preprocessing methodologies revisited: impact of using Level-1A data products
NASA Astrophysics Data System (ADS)
Klinger, Beate; Mayer-Gürr, Torsten
2017-04-01
For the ITSG-Grace2016 release, the gravity field recovery is based on the use of official GRACE (Gravity Recovery and Climate Experiment) Level-1B data products, generated by the Jet Propulsion Laboratory (JPL). Before gravity field recovery, the Level-1B instrument data are preprocessed. This data preprocessing step includes the combination of Level-1B star camera (SCA1B) and angular acceleration (ACC1B) data for an improved attitude determination (sensor fusion), instrument data screening and ACC1B data calibration. Based on a Level-1A test dataset, provided for individual months throughout the GRACE period by the Center for Space Research at the University of Texas at Austin (UTCSR), the impact of using Level-1A instead of Level-1B data products within the ITSG-Grace2016 processing chain is analyzed. We discuss (1) the attitude determination through an optimal combination of SCA1A and ACC1A data using our sensor fusion approach, (2) the impact of the new attitude product on temporal gravity field solutions, and (3) possible benefits of using Level-1A data for instrument data screening and calibration. As the GRACE mission is currently reaching its end-of-life, the presented work aims not only at a better understanding of GRACE science data to reduce the impact of possible error sources on the gravity field recovery, but also at preparing Level-1A data handling capabilities for the GRACE Follow-On mission.
[Advances in automatic detection technology for images of thin blood film of malaria parasite].
Juan-Sheng, Zhang; Di-Qiang, Zhang; Wei, Wang; Xiao-Guang, Wei; Zeng-Guo, Wang
2017-05-05
This paper reviews computer vision and image analysis studies aiming at automated diagnosis or screening of malaria in microscope images of thin blood film smears. After introducing the background and significance of automatic detection technology, the existing detection technologies are summarized and divided into several steps, including image acquisition, pre-processing, morphological analysis, segmentation, counting, and pattern classification components. Then, the principles and implementation methods of each step are given in detail. In addition, extending the automatic detection technology to thick blood film smears is put forward as a question worthy of study, and a perspective on the future work required to realize automated microscopy diagnosis of malaria is provided.
The Effect of Normalization in Violence Video Classification Performance
NASA Astrophysics Data System (ADS)
Ali, Ashikin; Senan, Norhalina
2017-08-01
Data pre-processing is an important part of data mining, and normalization is a pre-processing stage for many types of problem, especially in video classification. Video classification is challenging because of the heterogeneous content, large variations in video quality, and complex semantic meanings of the concepts involved. A thorough pre-processing stage that includes normalization therefore helps the robustness of classification performance. Normalization scales all numeric variables into a certain range to make them more meaningful for the subsequent phases of the available data mining techniques. This paper examines the effect of two normalization techniques, namely Min-Max normalization and Z-score, on violence video classification performance using a Multi-layer Perceptron (MLP) classifier. With Min-Max normalization to the range [0,1] the accuracy is almost 98%, with Min-Max normalization to the range [-1,1] the accuracy is 59%, and with Z-score the accuracy is 50%.
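Both normalization techniques are one-liners over a feature matrix; a minimal sketch (column-wise application is an assumption):

```python
import numpy as np

def min_max(x, lo=0.0, hi=1.0):
    """Rescale each feature column into [lo, hi]."""
    mn, mx = x.min(axis=0), x.max(axis=0)
    return lo + (x - mn) * (hi - lo) / (mx - mn)

def z_score(x):
    """Center each feature column to zero mean and unit variance."""
    return (x - x.mean(axis=0)) / x.std(axis=0)

rng = np.random.default_rng(6)
features = rng.uniform(10, 500, (100, 4))       # toy video feature matrix
print(min_max(features).min(), min_max(features, -1, 1).max())   # 0.0 1.0
print(z_score(features).mean(axis=0).round(6))                   # ~0 per column
```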
Hu, Yi; Loizou, Philipos C
2010-01-01
Pre-processing based noise-reduction algorithms used for cochlear implants (CIs) can sometimes introduce distortions which are carried through the vocoder stages of CI processing. While the background noise may be notably suppressed, the harmonic structure and/or spectral envelope of the signal may be distorted. The present study investigates the potential of preserving the signal's harmonic structure in voiced segments (e.g., vowels) as a means of alleviating the negative effects of pre-processing. The hypothesis tested is that preserving the harmonic structure of the signal is crucial for subsequent vocoder processing. The implications of preserving either the main harmonic components occurring at multiples of F0 or the main harmonics along with adjacent partials are investigated. This is done by first pre-processing noisy speech with a conventional noise-reduction algorithm, regenerating the harmonics, and vocoder processing the stimuli with eight channels of stimulation in steady speech-shaped noise. Results indicated that preserving the main low-frequency harmonics (spanning 1 or 3 kHz) alone was not beneficial. Preserving, however, the harmonic structure of the stimulus, i.e., the main harmonics along with the adjacent partials, was found to be critically important and provided substantial improvements (41 percentage points) in intelligibility.
Larriba, Yolanda; Rueda, Cristina; Fernández, Miguel A.; Peddada, Shyamal D.
2018-01-01
Motivation: Gene-expression data obtained from high throughput technologies are subject to various sources of noise and accordingly the raw data are pre-processed before formally analyzed. Normalization of the data is a key pre-processing step, since it removes systematic variations across arrays. There are numerous normalization methods available in the literature. Based on our experience, in the context of oscillatory systems, such as cell-cycle, circadian clock, etc., the choice of the normalization method may substantially impact the determination of a gene to be rhythmic. Thus rhythmicity of a gene can purely be an artifact of how the data were normalized. Since the determination of rhythmic genes is an important component of modern toxicological and pharmacological studies, it is important to determine truly rhythmic genes that are robust to the choice of a normalization method. Results: In this paper we introduce a rhythmicity measure and a bootstrap methodology to detect rhythmic genes in an oscillatory system. Although the proposed methodology can be used for any high-throughput gene expression data, in this paper we illustrate the proposed methodology using several publicly available circadian clock microarray gene-expression datasets. We demonstrate that the choice of normalization method has very little effect on the proposed methodology. Specifically, for any pair of normalization methods considered in this paper, the resulting values of the rhythmicity measure are highly correlated. Thus it suggests that the proposed measure is robust to the choice of a normalization method. Consequently, the rhythmicity of a gene is potentially not a mere artifact of the normalization method used. Lastly, as demonstrated in the paper, the proposed bootstrap methodology can also be used for simulating data for genes participating in an oscillatory system using a reference dataset. Availability: A user friendly code implemented in R language can be downloaded from http://www.eio.uva.es/~miguel/robustdetectionprocedure.html PMID:29456555
Sun, Xiaofei; Shi, Lin; Luo, Yishan; Yang, Wei; Li, Hongpeng; Liang, Peipeng; Li, Kuncheng; Mok, Vincent C T; Chu, Winnie C W; Wang, Defeng
2015-07-28
Intensity normalization is an important preprocessing step in brain magnetic resonance image (MRI) analysis. During MR image acquisition, different scanners or parameters may be used for scanning different subjects or the same subject at a different time, which may result in large intensity variations. This intensity variation will greatly undermine the performance of subsequent MRI processing and population analysis, such as image registration, segmentation, and tissue volume measurement. In this work, we proposed a new histogram normalization method to reduce the intensity variation between MRIs obtained from different acquisitions. In our experiment, we scanned each subject twice on two different scanners using different imaging parameters. With noise estimation, the image with the lower noise level was determined and treated as the high-quality reference image. Then the histogram of the low-quality image was normalized to the histogram of the high-quality image. The normalization algorithm includes two main steps: (1) intensity scaling (IS), where, for the high-quality reference image, the intensities of the image are first rescaled to a range between the low intensity region (LIR) value and the high intensity region (HIR) value; and (2) histogram normalization (HN), where the histogram of the low-quality input image is stretched to match the histogram of the reference image, so that the intensity range in the normalized image will also lie between LIR and HIR. We performed three sets of experiments to evaluate the proposed method, i.e., image registration, segmentation, and tissue volume measurement, and compared it with an existing intensity normalization method. The results validate that our histogram normalization framework achieves better results in all the experiments. It is also demonstrated that the brain template built with normalization preprocessing is of higher quality than the template built without normalization. We have proposed a histogram-based MRI intensity normalization method that can normalize scans acquired on different MRI units. We have validated that the method can greatly improve image analysis performance. Furthermore, it is demonstrated that with the help of our normalization method, we can create a higher quality Chinese brain template.
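A generic sketch of the histogram-matching core of such normalization, mapping a low-quality scan's intensity distribution onto a reference scan's via their empirical CDFs (this is standard histogram specification, not the authors' exact IS + HN procedure):

```python
import numpy as np

def match_histogram(source, reference):
    """Remap source intensities so their empirical CDF matches the
    reference image's CDF."""
    s_vals, s_counts = np.unique(source.ravel(), return_counts=True)
    r_vals, r_counts = np.unique(reference.ravel(), return_counts=True)
    s_cdf = np.cumsum(s_counts) / source.size
    r_cdf = np.cumsum(r_counts) / reference.size
    mapped = np.interp(s_cdf, r_cdf, r_vals)   # CDF level -> reference intensity
    return mapped[np.searchsorted(s_vals, source.ravel())].reshape(source.shape)

rng = np.random.default_rng(7)
low_q = rng.normal(80, 10, (64, 64))     # stand-in low-quality scan
high_q = rng.normal(120, 25, (64, 64))   # stand-in high-quality reference
norm = match_histogram(low_q, high_q)
print(norm.mean(), norm.std())           # ~120, ~25: matched to the reference
```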
Bandeira Diniz, João Otávio; Bandeira Diniz, Pedro Henrique; Azevedo Valente, Thales Levi; Corrêa Silva, Aristófanes; de Paiva, Anselmo Cardoso; Gattass, Marcelo
2018-03-01
The processing of medical images is an important tool to assist in minimizing the degree of uncertainty of the specialist, while providing specialists with an additional source of detection and diagnosis information. Breast cancer is the most common type of cancer affecting the female population around the world and the most deadly cancer among women; among all cancers, it is the second most common type. The most common examination to diagnose breast cancer early is mammography. In the last decades, computational techniques have been developed with the purpose of automatically detecting structures that may be associated with tumors in mammography examinations. This work presents a computational methodology for automatic detection of mass regions in mammography using a convolutional neural network. The material used in this work is the DDSM database. The proposed method consists of two phases: a training phase and a test phase. The training phase has 2 main steps: (1) create a model to classify breast tissue into dense and non-dense; (2) create a model to classify regions of the breast into mass and non-mass. The test phase has 7 steps: (1) preprocessing; (2) registration; (3) segmentation; (4) first reduction of false positives; (5) preprocessing of the segmented regions; (6) tissue density classification; (7) second reduction of false positives, where regions are classified into mass and non-mass. The proposed method achieved 95.6% accuracy in classifying non-dense breast tissue and 97.72% accuracy in classifying dense breasts. To detect regions of mass in non-dense breasts, the method achieved a sensitivity of 91.5% and a specificity of 90.7%, with 91% accuracy. To detect regions in dense breasts, our method achieved 90.4% sensitivity and 96.4% specificity, with an accuracy of 94.8%. According to the results achieved by the CNN, we demonstrate the feasibility of using convolutional neural networks in medical image processing for breast tissue classification and mass detection. Copyright © 2018 Elsevier B.V. All rights reserved.
Retinex Preprocessing for Improved Multi-Spectral Image Classification
NASA Technical Reports Server (NTRS)
Thompson, B.; Rahman, Z.; Park, S.
2000-01-01
The goal of multi-image classification is to identify and label "similar regions" within a scene. The ability to correctly classify a remotely sensed multi-image of a scene is affected by the ability of the classification process to adequately compensate for the effects of atmospheric variations and sensor anomalies. Better classification may be obtained if the multi-image is preprocessed before classification, so as to reduce the adverse effects of image formation. In this paper, we discuss the overall impact on multi-spectral image classification when the retinex image enhancement algorithm is used to preprocess multi-spectral images. The retinex is a multi-purpose image enhancement algorithm that performs dynamic range compression, reduces the dependence on lighting conditions, and generally enhances apparent spatial resolution. The retinex has been successfully applied to the enhancement of many different types of grayscale and color images. We show in this paper that retinex preprocessing improves the spatial structure of multi-spectral images and thus provides better within-class variations than would otherwise be obtained without the preprocessing. For a series of multi-spectral images obtained with diffuse and direct lighting, we show that without retinex preprocessing the class spectral signatures vary substantially with the lighting conditions. Whereas multi-dimensional clustering without preprocessing produced one-class homogeneous regions, the classification on the preprocessed images produced multi-class non-homogeneous regions. This lack of homogeneity is explained by the interaction between different agronomic treatments applied to the regions: the preprocessed images are closer to ground truth. The principal advantage that the retinex offers is that for different lighting conditions classifications derived from the retinex preprocessed images look remarkably "similar", and thus more consistent, whereas classifications derived from the original images, without preprocessing, are much less similar.
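As a sketch of the underlying operation, a single-scale retinex on one band; the paper uses the multiscale retinex, so the single scale and the sigma here are simplifying assumptions:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def single_scale_retinex(band, sigma=30.0, eps=1e-6):
    """log(image) minus log(large-scale illumination estimate):
    compresses dynamic range and reduces lighting dependence."""
    band = band.astype(float) + eps
    return np.log(band) - np.log(gaussian_filter(band, sigma) + eps)

rng = np.random.default_rng(8)
scene = rng.uniform(50, 200, (128, 128))
illum = np.linspace(0.2, 1.0, 128)[None, :]    # simulated lighting gradient
observed = scene * illum
print(observed.std(axis=0).round(1)[:5])                        # lighting-dependent
print(single_scale_retinex(observed).std(axis=0).round(2)[:5])  # far less so
```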
NASA Astrophysics Data System (ADS)
Shabani, H.; Doblas, A.; Saavedra, G.; Preza, C.
2018-02-01
The restored images in structured illumination microscopy (SIM) can be affected by residual fringes due to a mismatch between the illumination pattern and the sinusoidal model assumed by the restoration method. When a Fresnel biprism is used to generate a structured pattern, this pattern cannot be described by a pure sinusoidal function since it is distorted by an envelope due to the biprism's edge. In this contribution, we have investigated the effect of the envelope on the restored SIM images and propose a computational method in order to address it. The proposed approach to reduce the effect of the envelope consists of two parts. First, the envelope of the structured pattern, determined through calibration data, is removed from the raw SIM data via a preprocessing step. In the second step, a notch filter is applied to the images, which are restored using the well-known generalized Wiener filter, to filter any residual undesired fringes. The performance of our approach has been evaluated numerically by simulating the effect of the envelope on synthetic forward images of a 6-μm spherical bead generated using the real pattern and then restored using the SIM approach that is based on an ideal pure sinusoidal function before and after our proposed correction method. The simulation result shows 74% reduction in the contrast of the residual pattern when the proposed method is applied. Experimental results from a pollen grain sample also validate the proposed approach.
Effects of preprocessing Landsat MSS data on derived features
NASA Technical Reports Server (NTRS)
Parris, T. M.; Cicone, R. C.
1983-01-01
Important to the use of multitemporal Landsat MSS data for earth resources monitoring, such as agricultural inventories, is the ability to minimize the effects of varying atmospheric and satellite viewing conditions, while extracting physically meaningful features from the data. In general, the approaches to the preprocessing problem have been derived from either physical or statistical models. This paper compares three proposed algorithms: XSTAR haze correction, Color Normalization, and Multiple Acquisition Mean Level Adjustment. These techniques represent physical, statistical, and hybrid physical-statistical models, respectively. The comparisons are made in the context of three feature extraction techniques: the Tasseled Cap, the Cate Color Cube, and Normalized Difference.
Data preprocessing for a vehicle-based localization system used in road traffic applications
NASA Astrophysics Data System (ADS)
Patelczyk, Timo; Löffler, Andreas; Biebl, Erwin
2016-09-01
This paper presents a fixed-point implementation of the preprocessing using a field programmable gate array (FPGA), which is required for a multipath joint angle and delay estimation (JADE) used in road traffic applications, and lays the foundation for many model-based parameter estimation methods. Here, a simulation of a vehicle-based localization system application for protecting vulnerable road users, who are equipped with appropriate transponders, is considered. For such safety-critical applications, the robustness and real-time capability of the localization is particularly important. An additional motivation to use a fixed-point implementation for the data preprocessing is the limited computing power of the head unit of a vehicle. This study aims to process the raw data provided by the localization system considered in this paper. The data preprocessing applied includes a wideband calibration of the physical localization system, separation of relevant information from the received sampled signal, and preparation of the incoming data via further processing. Further, a channel matrix estimation was implemented to complete the data preprocessing, which contains information on channel parameters, e.g., the positions of the objects to be located. In the presented case of a vehicle-based localization system application we assume an urban environment, in which multipath propagation occurs. Since most methods for localization are based on uncorrelated signals, this fact must be addressed; hence, a decorrelation of the incoming data stream is required for further localization. This decorrelation was accomplished by considering several snapshots in different time slots. As a final aspect of the use of fixed-point arithmetic, quantization errors are considered. In addition, the resources and runtime of the presented implementation are discussed; these factors are strongly linked to a practical implementation.
quanTLC, an online open-source solution for videodensitometric quantification.
Fichou, Dimitri; Morlock, Gertrud E
2018-07-27
The image is the key feature of planar chromatography. Videodensitometry by digital image conversion is the fastest way of its evaluation. Instead of scanning single sample tracks one after the other, only a few clicks are needed to convert all tracks at one go. A minimalistic software tool, termed quanTLC, was newly developed that allows the quantitative evaluation of samples in a few minutes. quanTLC combines important assets such as being open-source, online, free of charge, intuitive to use and tailored to planar chromatography, as none of the nine existing software packages for image evaluation covers all these aspects together. quanTLC supports common image file formats for chromatogram upload. All necessary steps are included, i.e., videodensitogram extraction, preprocessing, automatic peak integration, calibration, statistical data analysis, reporting and data export. The default options for each step are suitable for most analyses while still being tunable, if needed. A one-minute video was recorded to serve as user manual. The software capabilities are shown on the example of a lipophilic dye mixture separation. The quantitative results were verified by comparison with those obtained by commercial videodensitometry software and opto-mechanical slit-scanning densitometry. The data can be exported at each step to be processed in further software, if required. The code was released open-source to be exploited even further. The software itself is usable online without installation and directly accessible at http://shinyapps.ernaehrung.uni-giessen.de/quanTLC. Copyright © 2018 Elsevier B.V. All rights reserved.
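A minimal sketch of the videodensitogram-extraction step: averaging a sample track across its width turns the plate image into intensity versus migration distance (the track position, width, and 8-bit inversion below are assumptions):

```python
import numpy as np

def videodensitogram(plate, track_x, track_width):
    """Average one track of a grayscale plate image across its width;
    invert so dark analyte bands become positive peaks."""
    half = track_width // 2
    track = plate[:, track_x - half : track_x + half + 1]
    return 255.0 - track.mean(axis=1)

rng = np.random.default_rng(9)
plate = np.full((200, 100), 230.0) + rng.normal(0, 3, (200, 100))
plate[60:66, 30:40] -= 120.0            # one dark band on the track
signal = videodensitogram(plate, track_x=34, track_width=9)
print(signal.argmax())                  # row of the band, ~60-65
```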
Multivariate assessment of event-related potentials with the t-CWT method.
Bostanov, Vladimir
2015-11-05
Event-related brain potentials (ERPs) are usually assessed with univariate statistical tests although they are essentially multivariate objects. Brain-computer interface applications are a notable exception to this practice, because they are based on multivariate classification of single-trial ERPs. Multivariate ERP assessment can be facilitated by feature extraction methods. One such method is t-CWT, a mathematical-statistical algorithm based on the continuous wavelet transform (CWT) and Student's t-test. This article begins with a geometric primer on some basic concepts of multivariate statistics as applied to ERP assessment in general and to the t-CWT method in particular. Further, it presents for the first time a detailed, step-by-step, formal mathematical description of the t-CWT algorithm. A new multivariate outlier rejection procedure based on principal component analysis in the frequency domain is presented as an important pre-processing step. The MATLAB and GNU Octave implementation of t-CWT is also made publicly available for the first time as free and open source code. The method is demonstrated on some example ERP data obtained in a passive oddball paradigm. Finally, some conceptually novel applications of the multivariate approach in general and of the t-CWT method in particular are suggested and discussed. Hopefully, the publication of both the t-CWT source code and its underlying mathematical algorithm along with a didactic geometric introduction to some basic concepts of multivariate statistics would make t-CWT more accessible to both users and developers in the field of neuroscience research.
Urban Density Indices Using Mean Shift-Based Upsampled Elevation Data
NASA Astrophysics Data System (ADS)
Charou, E.; Gyftakis, S.; Bratsolis, E.; Tsenoglou, T.; Papadopoulou, Th. D.; Vassilas, N.
2015-04-01
Urban density is an important factor for several fields, e.g. urban design, planning and land management. Modern remote sensors deliver ample information for the estimation of specific urban land classification classes (2D indicators) and the height of urban land classification objects (3D indicators) within an Area of Interest (AOI). In this research, two of these indicators, the Building Coverage Ratio (BCR) and the Floor Area Ratio (FAR), are numerically and automatically derived from high-resolution airborne RGB orthophotos and LiDAR data. In the pre-processing step the low-resolution elevation data are fused with the high-resolution optical data through a mean-shift based discontinuity-preserving smoothing algorithm. The outcome is an improved normalized digital surface model (nDSM): upsampled elevation data with considerable improvement regarding region filling and the "straightness" of elevation discontinuities. In a following step, a Multilayer Feedforward Neural Network (MFNN) is used to classify all pixels of the AOI into building or non-building categories. For the total surface of the block and the buildings we consider the number of their pixels and the surface of the unit pixel. Comparisons of the automatically derived BCR and FAR indicators with manually derived ones show the applicability and effectiveness of the proposed methodology.
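Given a building mask and an nDSM, both indicators reduce to pixel counting; a minimal sketch (the assumed 3 m storey height is illustrative, not the paper's value):

```python
import numpy as np

def density_indices(building_mask, ndsm, block_area_px, storey_h=3.0):
    """BCR = footprint area / block area; FAR = total floor area /
    block area, with floors estimated as nDSM height / storey height."""
    bcr = building_mask.sum() / block_area_px
    floors = np.ceil(ndsm[building_mask] / storey_h)
    far = floors.sum() / block_area_px
    return bcr, far

mask = np.zeros((100, 100), dtype=bool)
mask[20:40, 20:60] = True                    # one 20 x 40 px building
heights = np.where(mask, 9.0, 0.0)           # 9 m tall -> 3 storeys
print(density_indices(mask, heights, block_area_px=100 * 100))  # (0.08, 0.24)
```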
Curvature correction of retinal OCTs using graph-based geometry detection
NASA Astrophysics Data System (ADS)
Kafieh, Raheleh; Rabbani, Hossein; Abramoff, Michael D.; Sonka, Milan
2013-05-01
In this paper, we present a new algorithm as an enhancement and preprocessing step for acquired optical coherence tomography (OCT) images of the retina. The proposed method is composed of two steps, first of which is a denoising algorithm with wavelet diffusion based on a circular symmetric Laplacian model, and the second part can be described in terms of graph-based geometry detection and curvature correction according to the hyper-reflective complex layer in the retina. The proposed denoising algorithm showed an improvement of contrast-to-noise ratio from 0.89 to 1.49 and an increase of signal-to-noise ratio (OCT image SNR) from 18.27 to 30.43 dB. By applying the proposed method for estimation of the interpolated curve using a full automatic method, the mean ± SD unsigned border positioning error was calculated for normal and abnormal cases. The error values of 2.19 ± 1.25 and 8.53 ± 3.76 µm were detected for 200 randomly selected slices without pathological curvature and 50 randomly selected slices with pathological curvature, respectively. The important aspect of this algorithm is its ability in detection of curvature in strongly pathological images that surpasses previously introduced methods; the method is also fast, compared to the relatively low speed of similar methods.
NASA Astrophysics Data System (ADS)
Lei, Xiaohui; Wang, Yuhui; Liao, Weihong; Jiang, Yunzhong; Tian, Yu; Wang, Hao
2011-09-01
Many regions are still threatened with frequent floods and water resource shortage problems in China. Consequently, the task of reproducing and predicting the hydrological process in watersheds is hard and unavoidable for reducing the risks of damage and loss. Thus, it is necessary to develop an efficient and cost-effective hydrological tool in China as many areas should be modeled. Currently, developed hydrological tools such as Mike SHE and ArcSWAT (soil and water assessment tool based on ArcGIS) show significant power in improving the precision of hydrological modeling in China by considering spatial variability both in land cover and in soil type. However, adopting developed commercial tools in such a large developing country comes at a high cost. Commercial modeling tools usually contain large numbers of formulas, complicated data formats, and many preprocessing or postprocessing steps that may make it difficult for the user to carry out simulation, thus lowering the efficiency of the modeling process. Besides, commercial hydrological models usually cannot be modified or improved to be suitable for some special hydrological conditions in China. Some other hydrological models are open source, but integrated into commercial GIS systems. Therefore, by integrating hydrological simulation code EasyDHM, a hydrological simulation tool named MWEasyDHM was developed based on open-source MapWindow GIS, the purpose of which is to establish the first open-source GIS-based distributed hydrological model tool in China by integrating modules of preprocessing, model computation, parameter estimation, result display, and analysis. MWEasyDHM provides users with a friendly manipulating MapWindow GIS interface, selectable multifunctional hydrological processing modules, and, more importantly, an efficient and cost-effective hydrological simulation tool. The general construction of MWEasyDHM consists of four major parts: (1) a general GIS module for hydrological analysis, (2) a preprocessing module for modeling inputs, (3) a model calibration module, and (4) a postprocessing module. The general GIS module for hydrological analysis is developed on the basis of totally open-source GIS software, MapWindow, which contains basic GIS functions. The preprocessing module is made up of three submodules including a DEM-based submodule for hydrological analysis, a submodule for default parameter calculation, and a submodule for the spatial interpolation of meteorological data. The calibration module contains parallel computation, real-time computation, and visualization. The postprocessing module includes model calibration and model results spatial visualization using tabular form and spatial grids. MWEasyDHM makes it possible for efficient modeling and calibration of EasyDHM, and promises further development of cost-effective applications in various watersheds.
Metabolic profiling of body fluids and multivariate data analysis.
Trezzi, Jean-Pierre; Jäger, Christian; Galozzi, Sara; Barkovits, Katalin; Marcus, Katrin; Mollenhauer, Brit; Hiller, Karsten
2017-01-01
Metabolome analyses of body fluids are challenging due to pre-analytical variations, such as pre-processing delay and temperature, and the constant dynamic changes of biochemical processes within the samples. Therefore, proper sample handling, starting from the time of collection up to the analysis, is crucial to obtain high quality samples and reproducible results. A metabolomics analysis is divided into 4 main steps: 1) sample collection, 2) metabolite extraction, 3) data acquisition, and 4) data analysis. Here, we describe a protocol for gas chromatography coupled to mass spectrometry (GC-MS) based metabolic analysis of biological matrices, especially body fluids. This protocol can be applied to blood serum/plasma, saliva, and cerebrospinal fluid (CSF) samples of humans and other vertebrates. It covers sample collection, sample pre-processing, metabolite extraction, GC-MS measurement, and guidelines for the subsequent data analysis. Advantages of this protocol include: robust and reproducible metabolomics results that take into account pre-analytical variations occurring during the sampling process; the small sample volume required; rapid and cost-effective processing of biological samples; and logistic regression-based determination of biomarker signatures for in-depth data analysis.
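Since the protocol ends with logistic regression-based biomarker determination, a minimal sketch of that final step is given below, assuming a generic metabolite abundance matrix and scikit-learn; the sparse L1 penalty used to expose a compact signature is an illustrative choice, not taken from the protocol.

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    # Toy GC-MS feature matrix: rows = samples, columns = metabolite abundances.
    rng = np.random.default_rng(1)
    X = rng.lognormal(size=(60, 40))   # hypothetical metabolite levels
    y = rng.integers(0, 2, size=60)    # hypothetical case/control labels

    # L1 penalty yields a sparse biomarker signature (nonzero coefficients).
    model = make_pipeline(StandardScaler(),
                          LogisticRegression(penalty="l1", solver="liblinear", C=0.5))
    print(cross_val_score(model, X, y, cv=5).mean())
    model.fit(X, y)
    signature = np.flatnonzero(model.named_steps["logisticregression"].coef_)
    print("candidate biomarker indices:", signature)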
Automated segmentation of three-dimensional MR brain images
NASA Astrophysics Data System (ADS)
Park, Jonggeun; Baek, Byungjun; Ahn, Choong-Il; Ku, Kyo Bum; Jeong, Dong Kyun; Lee, Chulhee
2006-03-01
Brain segmentation is a challenging problem due to the complexity of the brain. In this paper, we propose an automated brain segmentation method for 3D magnetic resonance (MR) brain images, which are represented as sequences of 2D brain images. The proposed method consists of three steps: pre-processing, removal of non-brain regions (e.g., the skull, meninges, and other organs), and spinal cord restoration. In pre-processing, we perform adaptive thresholding, which takes into account the variable intensities of MR brain images acquired under various conditions. In the segmentation process, we iteratively apply 2D morphological operations and masking to the sequences of 2D sagittal, coronal, and axial planes in order to remove non-brain tissues. Next, the final 3D brain regions are obtained by applying an OR operation to the segmentation results of the three planes. Finally, we reconstruct the spinal cord truncated during the previous processes. Experiments were performed with fifteen 8-bit gray-scale 3D MR brain image sets. The results show that the proposed algorithm is fast and provides robust, satisfactory results.
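A hedged sketch of the plane-wise cleanup and OR-combination idea follows, using scipy.ndimage; the global-mean threshold and opening/closing operations are simplified stand-ins for the paper's adaptive thresholding and iterative 2D morphology.

    import numpy as np
    from scipy import ndimage

    def plane_mask(volume, threshold, axis):
        # Threshold, then clean each 2D slice along one anatomical axis with
        # morphological opening/closing (assumed stand-ins for the paper's
        # iterative 2D operations and masking).
        mask = volume > threshold
        cleaned = np.zeros_like(mask)
        for i in range(mask.shape[axis]):
            sl = [slice(None)] * 3
            sl[axis] = i
            plane = mask[tuple(sl)]
            plane = ndimage.binary_opening(plane, iterations=2)
            plane = ndimage.binary_closing(plane, iterations=2)
            cleaned[tuple(sl)] = plane
        return cleaned

    vol = np.random.default_rng(2).normal(100, 30, (64, 64, 64))  # toy MR volume
    t = vol.mean()  # placeholder for adaptive thresholding
    # OR-combine the sagittal, coronal, and axial results as in the paper.
    brain = plane_mask(vol, t, 0) | plane_mask(vol, t, 1) | plane_mask(vol, t, 2)
    brain = ndimage.binary_fill_holes(brain)
    print(brain.sum(), "voxels retained")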
Zhao, Li-Ting; Xiang, Yu-Hong; Dai, Yin-Mei; Zhang, Zhuo-Yong
2010-04-01
Near infrared spectroscopy was applied to tissue slices of endometrial tissue to collect spectra. A total of 154 spectra were obtained from 154 samples; the numbers of normal, hyperplasia, and malignant samples were 36, 60, and 58, respectively. Original near infrared spectra contain many variables, including interference from instrument errors and physical effects such as particle size and light scatter. To reduce these influences, the original spectra should be processed with appropriate spectral preprocessing methods to compress variables and extract useful information; thus, spectral preprocessing and wavelength selection play an important role in the near infrared spectroscopy technique. In the present paper, the raw spectra were processed using various preprocessing methods, including first derivative, multiplicative scatter correction, the Savitzky-Golay first derivative algorithm, standard normal variate, smoothing, and moving-window median. The standard deviation was used to select the optimal spectral region of 4000-6000 cm^(-1). Then principal component analysis was used for classification. The principal component analysis results showed that the three types of samples could be discriminated completely, with an accuracy approaching 100%. This study demonstrated that near infrared spectroscopy technology combined with chemometric methods could be a fast, efficient, and novel means to diagnose cancer. The proposed methods would be a promising and significant diagnostic technique for early stage cancer.
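The preprocessing chain described here can be sketched in a few lines; the sketch below applies standard normal variate correction and a Savitzky-Golay first derivative before PCA, with all parameter values assumed rather than taken from the study.

    import numpy as np
    from scipy.signal import savgol_filter
    from sklearn.decomposition import PCA

    # Toy NIR spectra: rows = samples, columns = wavenumber channels.
    rng = np.random.default_rng(3)
    spectra = rng.normal(1.0, 0.05, (154, 600)) + np.linspace(0, 0.3, 600)

    def snv(X):
        # Standard normal variate: per-spectrum centering and scaling,
        # a common correction for particle-size and scatter effects.
        return (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)

    # Savitzky-Golay first derivative removes additive baseline offsets.
    d1 = savgol_filter(snv(spectra), window_length=11, polyorder=2,
                       deriv=1, axis=1)

    scores = PCA(n_components=3).fit_transform(d1)
    print(scores.shape)  # (154, 3) score matrix for class-separation plots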
Cloud Detection by Fusing Multi-Scale Convolutional Features
NASA Astrophysics Data System (ADS)
Li, Zhiwei; Shen, Huanfeng; Wei, Yancong; Cheng, Qing; Yuan, Qiangqiang
2018-04-01
Cloud detection is an important pre-processing step for the accurate application of optical satellite imagery. Recent studies indicate that deep learning achieves the best performance in image segmentation tasks. Aiming to boost the accuracy of cloud detection for multispectral imagery, especially imagery that contains only visible and near infrared bands, in this paper we propose a deep-learning-based cloud detection method termed MSCN (multi-scale cloud net), which segments clouds by fusing multi-scale convolutional features. MSCN was trained on a global cloud cover validation collection and tested on more than ten types of optical images with different resolutions. Experimental results show that MSCN has obvious advantages in accuracy over the traditional multi-feature combined cloud detection method, especially in snow and other areas covered by bright non-cloud objects. Moreover, MSCN produced more detailed cloud masks than the deep cloud detection convolutional network it was compared against. The effectiveness of MSCN makes it promising for practical application to multiple kinds of optical imagery.
A novel application of artificial neural network for wind speed estimation
NASA Astrophysics Data System (ADS)
Fang, Da; Wang, Jianzhou
2017-05-01
Providing accurate multi-step wind speed estimation models has increasing significance because of the important technical and economic impacts of wind speed on power grid security and environmental benefits. In this study, combined strategies for wind speed forecasting are proposed based on an intelligent data processing system using artificial neural networks (ANNs). A generalized regression neural network and an Elman neural network are employed to form two hybrid models. The approach uses one ANN to model the samples, achieving data denoising and assimilation, and applies the other to predict wind speed using the pre-processed samples. The proposed method is evaluated in terms of the predictive improvements of the hybrid models compared with a single ANN and a typical forecasting method. To provide sufficient cases for the study, four observation sites with monthly average wind speeds for four given years in Western China were used to test the models. Multiple evaluation methods demonstrated that the proposed method provides a promising alternative technique for monthly average wind speed estimation.
Skull removal in MR images using a modified artificial bee colony optimization algorithm.
Taherdangkoo, Mohammad
2014-01-01
Removal of the skull from brain Magnetic Resonance (MR) images is an important preprocessing step required for other image analysis techniques such as brain tissue segmentation. In this paper, we propose a new algorithm based on the Artificial Bee Colony (ABC) optimization algorithm to remove the skull region from brain MR images. We modify the ABC algorithm using a different strategy for initializing the coordinates of scout bees and their direction of search. Moreover, we impose an additional constraint on the ABC algorithm to avoid the creation of discontinuous regions. We found that our algorithm successfully removed the entire bony skull from a sample of de-identified MR brain images acquired from different model scanners. Comparison of the results of the proposed algorithm with those of well-known optimization algorithms such as Particle Swarm Optimization (PSO) and Ant Colony Optimization (ACO) demonstrates the superior accuracy and computational performance of our algorithm, suggesting its potential for clinical applications.
Matrix factorization-based data fusion for gene function prediction in baker's yeast and slime mold.
Zitnik, Marinka; Zupan, Blaž
2014-01-01
The development of effective methods for the characterization of gene functions that are able to combine diverse data sources in a sound and easily extendible way is an important goal in computational biology. We have previously developed a general matrix factorization-based data fusion approach for gene function prediction. In this manuscript, we show that this data fusion approach can be applied to gene function prediction and that it can fuse various heterogeneous data sources, such as gene expression profiles, known protein annotations, and interaction and literature data. The fusion is achieved by simultaneous matrix tri-factorization that shares matrix factors between sources. We demonstrate the effectiveness of the approach by evaluating its performance on predicting ontological annotations in the slime mold D. discoideum and on recognizing proteins of baker's yeast S. cerevisiae that participate in the ribosome or are located in the cell membrane. Our approach achieves predictive performance comparable to that of state-of-the-art kernel-based data fusion, but requires fewer data preprocessing steps.
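A minimal sketch of the tri-factorization machinery, assuming a single nonnegative relation matrix and standard multiplicative updates; the actual approach factorizes many relation matrices simultaneously with shared factors, which this toy single-matrix version does not attempt.

    import numpy as np

    def nmtf(X, k1, k2, iters=200, eps=1e-9):
        # Nonnegative tri-factorization X ~ U @ S @ V.T via multiplicative
        # updates (illustrative; the cited work shares factors across sources).
        rng = np.random.default_rng(4)
        n, m = X.shape
        U, S, V = rng.random((n, k1)), rng.random((k1, k2)), rng.random((m, k2))
        for _ in range(iters):
            U *= (X @ V @ S.T) / (U @ S @ V.T @ V @ S.T + eps)
            S *= (U.T @ X @ V) / (U.T @ U @ S @ V.T @ V + eps)
            V *= (X.T @ U @ S) / (V @ S.T @ U.T @ U @ S + eps)
        return U, S, V

    # Toy gene x annotation relation matrix.
    X = np.random.default_rng(5).random((100, 30))
    U, S, V = nmtf(X, k1=8, k2=5)
    X_hat = U @ S @ V.T  # reconstructed scores used to rank candidate annotations
    print(np.linalg.norm(X - X_hat) / np.linalg.norm(X))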
Anima: Modular Workflow System for Comprehensive Image Data Analysis
Rantanen, Ville; Valori, Miko; Hautaniemi, Sampsa
2014-01-01
Modern microscopes produce vast amounts of image data, and computational methods are needed to analyze and interpret these data. Furthermore, a single image analysis project may require tens or hundreds of analysis steps, starting from data import and pre-processing, through segmentation and statistical analysis, and ending with visualization and reporting. To manage such large-scale image data analysis projects, we present here a modular workflow system called Anima. Anima is designed for comprehensive and efficient image data analysis development, and it contains several features that are crucial in high-throughput image data analysis: programming language independence, batch processing, easily customized data processing, interoperability with other software via application programming interfaces, and advanced multivariate statistical analysis. The utility of Anima is shown with two case studies: testing different algorithms developed on different imaging platforms, and automated prediction of alive/dead C. elegans worms by integrating several analysis environments. Anima is fully open source and available with documentation at www.anduril.org/anima. PMID:25126541
Morphological rational operator for contrast enhancement.
Peregrina-Barreto, Hayde; Herrera-Navarro, Ana M; Morales-Hernández, Luis A; Terol-Villalobos, Iván R
2011-03-01
Contrast enhancement is an important task in image processing that is commonly used as a preprocessing step to improve images for other tasks such as segmentation. However, some methods for contrast improvement that work well in low-contrast regions affect regions of good contrast as well; this occurs because some image elements may vanish. A method focused on images with different luminance conditions is introduced in the present work. The proposed method is based on morphological transformations by reconstruction and rational operations, which together allow a more accurate contrast enhancement, resulting in regions that are in harmony with their environment. Furthermore, due to the properties of these morphological transformations, the creation of new elements in the image is avoided. The processing is carried out on luminance values in the u'v'Y color space, which avoids the creation of new colors. As a result of the previous considerations, the proposed method keeps the natural color appearance of the image.
Super-Resolution Community Detection for Layer-Aggregated Multilayer Networks
Taylor, Dane; Caceres, Rajmonda S.; Mucha, Peter J.
2017-01-01
Applied network science often involves preprocessing network data before applying a network-analysis method, and there is typically a theoretical disconnect between these steps. For example, it is common to aggregate time-varying network data into windows prior to analysis, and the trade-offs of this preprocessing are not well understood. Focusing on the problem of detecting small communities in multilayer networks, we study the effects of layer aggregation by developing random-matrix theory for modularity matrices associated with layer-aggregated networks with N nodes and L layers, which are drawn from an ensemble of Erdős–Rényi networks with communities planted in subsets of layers. We study phase transitions in which eigenvectors localize onto communities (allowing their detection) and which occur for a given community provided its size surpasses a detectability limit K*. When layers are aggregated via a summation, we obtain K* ∝ O(√(NL)/T), where T is the number of layers across which the community persists. Interestingly, if T is allowed to vary with L, then summation-based layer aggregation enhances small-community detection even if the community persists across a vanishing fraction of layers, provided that T/L decays more slowly than O(L^(-1/2)). Moreover, we find that thresholding the summation can, in some cases, cause K* to decay exponentially, decreasing by orders of magnitude in a phenomenon we call super-resolution community detection. In other words, layer aggregation with thresholding is a nonlinear data filter enabling detection of communities that are otherwise too small to detect. Importantly, different thresholds generally enhance the detectability of communities having different properties, illustrating that community detection can be obscured if one analyzes network data using a single threshold. PMID:29445565
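A hedged numerical illustration of summation-based aggregation and thresholding follows; the planted-community construction mirrors the ensemble described above, but the midpoint threshold is an assumption and the sketch only inspects block densities rather than reproducing the paper's spectral detectability analysis.

    import numpy as np

    rng = np.random.default_rng(6)
    N, L, T, K = 200, 50, 10, 12   # nodes, layers, persistence, community size
    p_in, p_out = 0.9, 0.1         # edge probabilities inside/outside community

    layers = []
    for l in range(L):
        A = np.triu(rng.random((N, N)) < p_out, 1)   # Erdős–Rényi background
        if l < T:                                    # community planted in T layers
            A[:K, :K] = np.triu(rng.random((K, K)) < p_in, 1)
        layers.append(A)

    summed = np.sum(layers, axis=0)                  # summation-based aggregation
    # Assumed threshold: midway between expected background and community counts.
    thresh = 0.5 * (L * p_out + (T * p_in + (L - T) * p_out))
    thresholded = summed >= thresh

    iu_k, iu_bg = np.triu_indices(K, 1), np.triu_indices(N - K, 1)
    print("community block density:", thresholded[:K, :K][iu_k].mean())
    print("background density:", thresholded[K:, K:][iu_bg].mean())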
Improved document image segmentation algorithm using multiresolution morphology
NASA Astrophysics Data System (ADS)
Bukhari, Syed Saqib; Shafait, Faisal; Breuel, Thomas M.
2011-01-01
Page segmentation into text and non-text elements is an essential preprocessing step before optical character recognition (OCR). In the case of poor segmentation, an OCR classification engine produces garbage characters due to the presence of non-text elements. This paper describes modifications to the text/non-text segmentation algorithm presented by Bloomberg, which is also available in his open-source Leptonica library. The modifications result in significant improvements and achieve better segmentation accuracy than the original algorithm on the UW-III, UNLV, and ICDAR 2009 page segmentation competition test images and on circuit diagram datasets.
Prognosis for children with acute liver failure due to Amanita phalloides poisoning
NASA Astrophysics Data System (ADS)
Wachulski, Marcin F.; Kamińska-Gocał, Diana; Dądalski, Maciej; Socha, Piotr; Mulawka, Jan J.
2011-10-01
The primary objective of this article is to find new, effective methods for diagnosing the need for urgent liver transplantation after Amanita phalloides intoxication among pediatric patients. The research was carried out using a medical database of pediatric patients who suffered acute liver failure after amatoxin consumption. After the data preprocessing and attribute selection steps, a two-phase experiment was conducted that incorporated a wide variety of data mining algorithms. The results deliver two equivalent classification models with a simple decision structure and reasonable quality of surgery prediction.
NASA/BLM APT, phase 2. Volume 2: Technology demonstration. [Arizona]
NASA Technical Reports Server (NTRS)
1981-01-01
Techniques described include: (1) steps in the preprocessing of LANDSAT data; (2) the training of a classifier; (3) maximum likelihood classification and precision; (4) geometric correction; (5) class description; (6) digitizing; (7) digital terrain data; (8) an overview of sample design; (9) allocation and selection of primary sample units; (10) interpretation of secondary sample units; (11) data collection ground plots; (12) data reductions; (13) analysis for productivity estimation and map verification; (14) cost analysis; and (15) LANDSAT digital products. The evaluation of the pre-inventory planning for P.J. is included.
Saa, Pedro A.; Nielsen, Lars K.
2016-01-01
Motivation: Computation of steady-state flux solutions in large metabolic models is routinely performed using flux balance analysis based on a simple LP (Linear Programming) formulation. A minimal requirement for thermodynamic feasibility of the flux solution is the absence of internal loops, which are enforced using ‘loopless constraints’. The resulting loopless flux problem is a substantially harder MILP (Mixed Integer Linear Programming) problem, which is computationally expensive for large metabolic models. Results: We developed a pre-processing algorithm that significantly reduces the size of the original loopless problem into an easier and equivalent MILP problem. The pre-processing step employs a fast matrix sparsification algorithm, Fast-sparse null-space pursuit (SNP), inspired by recent results on SNP. By finding a reduced feasible ‘loop-law’ matrix subject to known directionalities, Fast-SNP considerably improves the computational efficiency in several metabolic models running different loopless optimization problems. Furthermore, analysis of the topology encoded in the reduced loop matrix enabled identification of key directional constraints for the potential permanent elimination of infeasible loops in the underlying model. Overall, Fast-SNP is an effective and simple algorithm for efficient formulation of loop-law constraints, making loopless flux optimization feasible and numerically tractable at large scale. Availability and Implementation: Source code for MATLAB including examples is freely available for download at http://www.aibn.uq.edu.au/cssb-resources under Software. Optimization uses Gurobi, CPLEX or GLPK (the latter is included with the algorithm). Contact: lars.nielsen@uq.edu.au Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27559155
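To make the loop-law constraint structure concrete, here is a hedged sketch of the standard loopless-FBA MILP for a toy three-reaction internal loop, written with the PuLP library; it shows the kind of problem whose loop-law matrix Fast-SNP reduces, not the Fast-SNP algorithm itself, and the big-M formulation follows the commonly used form.

    import pulp

    M = 1000  # big-M constant
    prob = pulp.LpProblem("loopless_fba", pulp.LpMaximize)

    # Toy network: R_in -> A, R1: A->B, R2: B->C, R3: C->A, R_out: B ->
    # Internal reactions R1, R2, R3 form a loop; its null-space vector is (1,1,1).
    v = {r: pulp.LpVariable(r, -10, 10) for r in ["R1", "R2", "R3"]}
    v_in = pulp.LpVariable("R_in", 0, 1)
    v_out = pulp.LpVariable("R_out", 0, 10)

    # Mass balance for metabolites A, B, C.
    prob += v_in - v["R1"] + v["R3"] == 0
    prob += v["R1"] - v["R2"] - v_out == 0
    prob += v["R2"] - v["R3"] == 0

    # Loop-law constraints on internal reactions: binary direction indicators
    # a_i, pseudo-energies G_i, and orthogonality to the loop vector (1,1,1).
    a = {r: pulp.LpVariable("a_" + r, cat="Binary") for r in v}
    G = {r: pulp.LpVariable("G_" + r, -M, M) for r in v}
    for r in v:
        prob += v[r] <= M * a[r]                   # a=0 forces v <= 0
        prob += v[r] >= -M * (1 - a[r])            # a=1 forces v >= 0
        prob += G[r] <= -a[r] + M * (1 - a[r])     # forward flux needs G <= -1
        prob += G[r] >= (1 - a[r]) - M * a[r]      # backward flux needs G >= 1
    prob += G["R1"] + G["R2"] + G["R3"] == 0       # N^T G = 0 kills the loop

    prob += v["R1"]  # objective: maximize flux through R1
    prob.solve(pulp.PULP_CBC_CMD(msg=0))
    print(pulp.value(v["R1"]))  # 1.0 with the loop law (10.0 without it)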
Comparison of algorithms for automatic border detection of melanoma in dermoscopy images
NASA Astrophysics Data System (ADS)
Srinivasa Raghavan, Sowmya; Kaur, Ravneet; LeAnder, Robert
2016-09-01
Melanoma is one of the most rapidly accelerating cancers in the world [1]. Early diagnosis is critical to an effective cure. We propose a new algorithm for more accurately detecting melanoma borders in dermoscopy images. Proper border detection requires eliminating occlusions like hair and bubbles by processing the original image. The preprocessing step involves transforming the RGB image to the CIE L*u*v* color space in order to decouple brightness from color information, then increasing contrast using contrast-limited adaptive histogram equalization (CLAHE), followed by artifact removal using a Gaussian filter. After preprocessing, the Chan-Vese technique segments the preprocessed images to create a lesion mask, which undergoes a morphological closing operation. Next, the largest central blob in the lesion is detected, after which the blob is dilated to generate an output image mask. Finally, the automatically generated mask is compared to the manual mask by calculating the XOR error [3]. Our border detection algorithm was developed using training and test sets of 30 and 20 images, respectively. This detection method was compared to the SRM method [4] by calculating the average XOR error for each of the two algorithms. The average error for the test images was 0.10 using the new algorithm and 0.99 using the SRM method. In comparing the average error values produced by the two algorithms, it is evident that the average XOR error for our technique is lower than that of the SRM method, implying that the new algorithm detects melanoma borders more accurately than the SRM algorithm.
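A hedged sketch of this preprocessing and segmentation chain using scikit-image appears below; the CLAHE clip limit, Gaussian sigma, structuring element sizes, and the morphological Chan-Vese variant (standing in for the level-set formulation) are all assumptions.

    import numpy as np
    from skimage import color, exposure, filters, measure, morphology, segmentation

    def lesion_mask(rgb):
        # Decouple brightness from color, then enhance contrast (CLAHE).
        luv = color.rgb2luv(rgb)
        lum = exposure.rescale_intensity(luv[..., 0], out_range=(0, 1))
        lum = exposure.equalize_adapthist(lum, clip_limit=0.02)
        lum = filters.gaussian(lum, sigma=2)   # suppress hair/bubble artifacts

        # Morphological Chan-Vese as a stand-in for the level-set formulation.
        mask = segmentation.morphological_chan_vese(lum, num_iter=100)
        mask = morphology.closing(mask.astype(bool), morphology.disk(5))

        # Keep the largest blob, then dilate to form the output mask.
        labels = measure.label(mask)
        if labels.max() == 0:
            return mask
        largest = labels == (np.argmax(np.bincount(labels.ravel())[1:]) + 1)
        return morphology.binary_dilation(largest, morphology.disk(3))

    def xor_error(auto, manual):
        # Border error as the XOR area normalized by the manual lesion area.
        return np.logical_xor(auto, manual).sum() / manual.sum()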
Comparative analysis of nonlinear dimensionality reduction techniques for breast MRI segmentation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Akhbardeh, Alireza; Jacobs, Michael A.; Russell H. Morgan Department of Radiology and Radiological Science, Johns Hopkins University School of Medicine, Baltimore, Maryland 21205
2012-04-15
Purpose: Visualization of anatomical structures using radiological imaging methods is an important tool in medicine to differentiate normal from pathological tissue and can generate large amounts of data for a radiologist to read. Integrating these large data sets is difficult and time-consuming. A new approach uses both supervised and unsupervised advanced machine learning techniques to visualize and segment radiological data. This study describes the application of a novel hybrid scheme, based on combining wavelet transform and nonlinear dimensionality reduction (NLDR) methods, to breast magnetic resonance imaging (MRI) data using three well-established NLDR techniques, namely, ISOMAP, local linear embedding (LLE), and diffusion maps (DfM), to perform a comparative performance analysis. Methods: Twenty-five breast lesion subjects were scanned using a 3T scanner. MRI sequences used were T1-weighted, T2-weighted, diffusion-weighted imaging (DWI), and dynamic contrast-enhanced (DCE) imaging. The hybrid scheme consisted of two steps: preprocessing and postprocessing of the data. The preprocessing step was applied for B1 inhomogeneity correction, image registration, and wavelet-based image compression to match and denoise the data. In the postprocessing step, MRI parameters were considered data dimensions and the NLDR-based hybrid approach was applied to integrate the MRI parameters into a single image, termed the embedded image. This was achieved by mapping all pixel intensities from the higher dimension to a lower dimensional (embedded) space. For validation, the authors compared the hybrid NLDR with the linear methods of principal component analysis (PCA) and multidimensional scaling (MDS) using synthetic data. For the clinical application, the authors used breast MRI data; comparison was performed using the postcontrast DCE MRI image and evaluating the congruence of the segmented lesions. Results: The NLDR-based hybrid approach was able to define and segment both synthetic and clinical data. On the synthetic data, the authors demonstrated the performance of the NLDR method compared with conventional linear DR methods: the NLDR approach enabled successful segmentation of the structures, whereas, in most cases, PCA and MDS failed. The NLDR approach was able to segment different breast tissue types with high accuracy, and the embedded image of the breast MRI data demonstrated fuzzy boundaries between the different types of breast tissue, i.e., fatty, glandular, and tissue with lesions (>86%). Conclusions: The proposed hybrid NLDR methods were able to segment clinical breast data with high accuracy and construct an embedded image that visualized the contribution of different radiological parameters.
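The comparative embedding step can be sketched with scikit-learn as below, treating each voxel's MRI parameters as one high-dimensional point; diffusion maps are not available in scikit-learn, so only ISOMAP, LLE, PCA, and MDS are shown, and the wavelet preprocessing is omitted.

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.manifold import MDS, Isomap, LocallyLinearEmbedding

    # Toy data: 500 voxels, each described by 4 co-registered MRI parameters
    # (stand-ins for T1, T2, DWI, and DCE intensities after preprocessing).
    rng = np.random.default_rng(7)
    voxels = rng.random((500, 4))

    embedders = {
        "PCA": PCA(n_components=1),
        "MDS": MDS(n_components=1),
        "ISOMAP": Isomap(n_components=1, n_neighbors=10),
        "LLE": LocallyLinearEmbedding(n_components=1, n_neighbors=10),
    }
    for name, emb in embedders.items():
        embedded = emb.fit_transform(voxels)  # 1-D "embedded image" per method
        print(name, embedded.shape)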
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhang, L; Fried, D; Fave, X
Purpose: To investigate how different image preprocessing techniques, their parameters, and different boundary handling techniques can augment the information content of features and improve their differentiating capability. Methods: Twenty-seven NSCLC patients with a solid tumor volume and no visually obvious necrotic regions in the simulation CT images were identified. Fourteen of these patients had a necrotic region visible in their pre-treatment PET images (necrosis group), and thirteen had no visible necrotic region in the pre-treatment PET images (non-necrosis group). We investigated how image preprocessing can impact the ability of radiomics image features extracted from the CT to differentiate between the two groups. It was expected that the histogram in the necrosis group would be more negatively skewed and that its uniformity would be lower. Therefore, we analyzed two first-order features, skewness and uniformity, on the image inside the GTV in the intensity range [-20 HU, 180 HU] under combinations of several image preprocessing techniques: (1) applying an isotropic Gaussian or anisotropic diffusion smoothing filter over a range of parameters (Gaussian smoothing: size=11, sigma=0:0.1:2.3; anisotropic smoothing: iteration=4, kappa=0:10:110); (2) applying a boundary-adapted Laplacian filter; and (3) applying an adaptive upper threshold for the intensity range. A 2-tailed t-test was used to evaluate the differentiating capability of the CT features with respect to pre-treatment PET necrosis. Results: Without any preprocessing, no differences in either skewness or uniformity were observed between the two groups. After applying appropriate Gaussian filters (sigma >= 1.3) or anisotropic filters (kappa >= 60) with the adaptive upper threshold, skewness was significantly more negative in the necrosis group (p<0.05). By applying boundary-adapted Laplacian filtering after appropriate Gaussian filters (0.5 <= sigma <= 1.1) or anisotropic filters (20 <= kappa <= 50), the uniformity was significantly lower in the necrosis group (p<0.05). Conclusion: Appropriate selection of image preprocessing techniques allows radiomics features to extract more useful information and thereby improve prediction models based on these features.
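A hedged sketch of the feature side of this comparison: Gaussian smoothing, the stated HU window, and first-order skewness and uniformity, followed by the 2-tailed t-test; the toy volumes and the simplified boundary handling are assumptions.

    import numpy as np
    from scipy.ndimage import gaussian_filter
    from scipy.stats import skew, ttest_ind

    def first_order_features(ct, gtv_mask, sigma=1.3, hu_range=(-20, 180), bins=64):
        # Gaussian pre-smoothing, then restrict to GTV voxels in the HU window.
        smoothed = gaussian_filter(ct.astype(float), sigma=sigma)
        vals = smoothed[gtv_mask]
        vals = vals[(vals >= hu_range[0]) & (vals <= hu_range[1])]
        hist, _ = np.histogram(vals, bins=bins, range=hu_range)
        p = hist / max(hist.sum(), 1)
        return skew(vals), np.sum(p ** 2)  # (skewness, uniformity)

    # Toy cohorts; in practice each entry would be a real (CT volume, GTV mask) pair.
    rng = np.random.default_rng(8)
    necrosis = [first_order_features(rng.normal(80, 40, (32, 32, 32)),
                                     np.ones((32, 32, 32), bool)) for _ in range(14)]
    controls = [first_order_features(rng.normal(90, 30, (32, 32, 32)),
                                     np.ones((32, 32, 32), bool)) for _ in range(13)]
    sk_p = ttest_ind([f[0] for f in necrosis], [f[0] for f in controls]).pvalue
    print("skewness p-value:", sk_p)  # 2-tailed t-test as in the study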
Mohebbi, Maryam; Ghassemian, Hassan; Asl, Babak Mohammadzadeh
2011-01-01
This paper aims to propose an effective paroxysmal atrial fibrillation (PAF) predictor based on the analysis of the heart rate variability (HRV) signal. Predicting the onset of PAF based on non-invasive techniques is clinically important and can be invaluable for avoiding useless therapeutic interventions and minimizing risks for patients. The method consists of four steps: preprocessing, feature extraction, feature reduction, and classification. In the first step, the QRS complexes are detected from the electrocardiogram (ECG) signal and the HRV signal is extracted. In the next step, the recurrence plot (RP) of the HRV signal is obtained and six features are extracted to characterize the basic patterns of the RP. These features consist of the length of the longest diagonal segments, the average length of the diagonal lines, entropy, trapping time, the length of the longest vertical line, and the recurrence trend. In the third step, these features are reduced to three features by the linear discriminant analysis (LDA) technique. Using LDA not only reduces the number of input features, but also increases the classification accuracy by selecting the most discriminating features. Finally, a support vector machine-based classifier is used to classify the HRV signals. The performance of the proposed method in predicting PAF episodes was evaluated using the Atrial Fibrillation Prediction Database, which consists of both 30-minute ECG recordings that end just prior to the onset of PAF and segments at least 45 min distant from any PAF events. The obtained sensitivity, specificity, and positive predictivity were 96.55%, 100%, and 100%, respectively. PMID:22606666
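The recurrence-plot features lend themselves to a compact sketch; the version below builds an RP directly from a scalar series with a fixed threshold (no phase-space embedding, which the method may use) and computes two of the six features.

    import numpy as np

    def recurrence_plot(x, eps):
        # RP[i, j] = 1 when samples i and j are closer than eps.
        d = np.abs(x[:, None] - x[None, :])
        return (d < eps).astype(int)

    def recurrence_rate(rp):
        return rp.mean()

    def longest_diagonal(rp):
        # Length of the longest diagonal line off the main diagonal (L_max).
        n, best = rp.shape[0], 0
        for k in range(1, n):
            run = 0
            for v in np.diagonal(rp, offset=k):
                run = run + 1 if v else 0
                best = max(best, run)
        return best

    # Toy HRV series (RR intervals in seconds); real input comes from QRS detection.
    rng = np.random.default_rng(9)
    hrv = 0.8 + 0.05 * np.sin(np.arange(300) / 10) + rng.normal(0, 0.01, 300)
    rp = recurrence_plot(hrv, eps=0.02)
    print(recurrence_rate(rp), longest_diagonal(rp))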
Mohebbi, Maryam; Ghassemian, Hassan
2011-08-01
Atrial fibrillation (AF) is the most common cardiac arrhythmia and increases the risk of stroke. Predicting the onset of paroxysmal AF (PAF), based on noninvasive techniques, is clinically important and can be invaluable in order to avoid useless therapeutic intervention and to minimize risks for the patients. In this paper, we propose an effective PAF predictor which is based on the analysis of the RR-interval signal. This method consists of three steps: preprocessing, feature extraction and classification. In the first step, the QRS complexes are detected from the electrocardiogram (ECG) signal and then the RR-interval signal is extracted. In the next step, the recurrence plot (RP) of the RR-interval signal is obtained and five statistically significant features are extracted to characterize the basic patterns of the RP. These features consist of the recurrence rate, length of longest diagonal segments (L_max), average length of the diagonal lines (L_mean), entropy, and trapping time. Recurrence quantification analysis can reveal subtle aspects of dynamics not easily appreciated by other methods and exhibits characteristic patterns which are caused by the typical dynamical behavior. In the final step, a support vector machine (SVM)-based classifier is used for PAF prediction. The performance of the proposed method in prediction of PAF episodes was evaluated using the Atrial Fibrillation Prediction Database (AFPDB) which consists of both 30 min ECG recordings that end just prior to the onset of PAF and segments at least 45 min distant from any PAF events. The obtained sensitivity, specificity, positive predictivity and negative predictivity were 97%, 100%, 100%, and 96%, respectively. The proposed methodology presents better results than other existing approaches.
Extraction of skin lesions from non-dermoscopic images for surgical excision of melanoma.
Jafari, M Hossein; Nasr-Esfahani, Ebrahim; Karimi, Nader; Soroushmehr, S M Reza; Samavi, Shadrokh; Najarian, Kayvan
2017-06-01
Computerized prescreening of suspicious moles and lesions for malignancy is of great importance for assessing the need and priority of removal surgery. Detection can be performed on images captured by standard cameras, which are preferable due to their low cost and availability. One important step in computerized evaluation is the accurate detection of the lesion's region, i.e., segmentation of the image into two regions: lesion and normal skin. In this paper, a new method based on deep neural networks is proposed for accurate extraction of the lesion region. The input image is preprocessed, and then its patches are fed to a convolutional neural network. Local texture and global structure of the patches are processed in order to assign pixels to the lesion or normal classes. A method for effective selection of training patches is proposed for more accurate detection of the lesion's border. Our results indicate that the proposed method can reach an accuracy of 98.7% and a sensitivity of 95.2% in the segmentation of lesion regions over a dataset of clinical images. The experimental results of qualitative and quantitative evaluations demonstrate that our method outperforms other state-of-the-art algorithms in the literature.
Chinese character recognition based on Gabor feature extraction and CNN
NASA Astrophysics Data System (ADS)
Xiong, Yudian; Lu, Tongwei; Jiang, Yongyuan
2018-03-01
As an important application in the field of text line recognition and office automation, Chinese character recognition has become an important subject of pattern recognition. However, due to the large number of Chinese characters and the complexity of their structure, Chinese character recognition is very difficult. To solve this problem, this paper proposes a method for printed Chinese character recognition based on Gabor feature extraction and a Convolutional Neural Network (CNN). The main steps are preprocessing, feature extraction, and training/classification. First, the gray-scale Chinese character image is binarized and normalized to reduce the redundancy of the image data. Second, each image is convolved with Gabor filters of different orientations, extracting feature maps for eight orientations of the Chinese characters. Third, the Gabor feature maps and the original image are convolved with learned kernels, and the results of the convolution are the input to the pooling layer. Finally, the feature vector is used for classification and recognition. In addition, the generalization capacity of the network is improved by the Dropout technique. The experimental results show that this method can effectively extract the characteristics of Chinese characters and recognize them.
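The Gabor feature extraction step can be sketched with OpenCV as follows; the kernel size and wavelength are assumed values, and the eight-orientation bank plus the raw image form the multi-channel input that the paper feeds to the CNN.

    import cv2
    import numpy as np

    def gabor_bank(ksize=11, sigma=3.0, lambd=6.0, gamma=0.5, n_orient=8):
        # One Gabor kernel per orientation, evenly spaced over [0, pi).
        thetas = np.arange(n_orient) * np.pi / n_orient
        return [cv2.getGaborKernel((ksize, ksize), sigma, t, lambd, gamma, 0)
                for t in thetas]

    # Toy binarized, size-normalized character image (stand-in for real input).
    img = np.zeros((64, 64), np.float32)
    cv2.putText(img, "X", (12, 52), cv2.FONT_HERSHEY_SIMPLEX, 1.8, 1.0, 3)

    # Eight orientation feature maps; these plus the raw image feed the CNN.
    maps = [cv2.filter2D(img, cv2.CV_32F, k) for k in gabor_bank()]
    cnn_input = np.stack([img] + maps)  # shape (9, 64, 64)
    print(cnn_input.shape)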
Measurement data preprocessing in a radar-based system for monitoring of human movements
NASA Astrophysics Data System (ADS)
Morawski, Roman Z.; Miȩkina, Andrzej; Bajurko, Paweł R.
2015-02-01
The importance of research on new technologies that could be employed in care services for elderly people is highlighted. The need to examine the applicability of various sensor systems for non-invasive monitoring of the movements and vital bodily functions, such as heartbeat or breathing rhythm, of elderly persons in their home environment is justified. An extensive overview of the literature on existing monitoring techniques is provided, and the technological potential of radar sensors is indicated. A new class of algorithms for preprocessing measurement data from impulse radar sensors, as applied to the monitoring of elderly people, is proposed. Preliminary results of numerical experiments performed on those algorithms are presented.
Comparison of pre-processing methods for multiplex bead-based immunoassays.
Rausch, Tanja K; Schillert, Arne; Ziegler, Andreas; Lüking, Angelika; Zucht, Hans-Dieter; Schulz-Knappe, Peter
2016-08-11
High-throughput protein expression studies can be performed using bead-based protein immunoassays, such as the Luminex® xMAP® technology. Technical variability is inherent to these experiments and may lead to systematic bias and reduced power. To reduce technical variability, data pre-processing is performed. However, no recommendations exist for the pre-processing of Luminex® xMAP® data. We compared 37 different data pre-processing combinations of transformation and normalization methods in 42 samples on 384 analytes obtained from a multiplex immunoassay based on the Luminex® xMAP® technology. We evaluated the performance of each pre-processing approach with 6 different performance criteria, three of which were plots; all plots were evaluated by 15 independent and blinded readers. Four different combinations of transformation and normalization methods performed well as pre-processing procedures for this bead-based protein immunoassay. The following combinations of transformation and normalization were suitable for pre-processing Luminex® xMAP® data in this study: weighted Box-Cox transformation followed by quantile or robust spline normalization (rsn), asinh transformation followed by loess normalization, and Box-Cox transformation followed by rsn.
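As a hedged illustration of such transformation-plus-normalization pipelines, the sketch below pairs the asinh transform with quantile normalization; this exact pairing is illustrative (the study's top combinations used loess or robust spline normalization, which need additional libraries).

    import numpy as np

    def quantile_normalize(X):
        # Force every sample (column) to share the same empirical distribution:
        # replace each column's sorted values with the row-wise means.
        ranks = np.argsort(np.argsort(X, axis=0), axis=0)
        means = np.sort(X, axis=0).mean(axis=1)
        return means[ranks]

    # Toy MFI matrix: rows = 384 analytes, columns = 42 samples.
    rng = np.random.default_rng(10)
    mfi = rng.lognormal(mean=5, sigma=1, size=(384, 42))

    transformed = np.arcsinh(mfi)          # variance-stabilizing transform
    normalized = quantile_normalize(transformed)
    print(normalized.std(axis=0)[:3])      # near-identical column distributions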
The Effects of Pre-processing Strategies for Pediatric Cochlear Implant Recipients
Rakszawski, Bernadette; Wright, Rose; Cadieux, Jamie H.; Davidson, Lisa S.; Brenner, Christine
2016-01-01
Background Cochlear implants (CIs) have been shown to improve children's speech recognition over traditional amplification when severe to profound sensorineural hearing loss is present. Despite improvements, understanding speech at low intensity levels or in the presence of background noise remains difficult. In an effort to improve speech understanding in challenging environments, Cochlear Ltd. offers pre-processing strategies that apply various algorithms prior to mapping the signal to the internal array. Two of these strategies are Autosensitivity Control™ (ASC) and Adaptive Dynamic Range Optimization (ADRO®). Based on previous research, the manufacturer's default pre-processing strategy for pediatric everyday programs combines ASC+ADRO®. Purpose The purpose of this study is to compare pediatric speech perception performance across various pre-processing strategies while applying a specific programming protocol that uses increased threshold (T) levels to ensure access to very low-level sounds. Research Design This was a prospective, cross-sectional, observational study. Participants completed speech perception tasks in four pre-processing conditions: no pre-processing, ADRO®, ASC, and ASC+ADRO®. Study Sample Eleven pediatric Cochlear Ltd. cochlear implant users were recruited: six bilateral, one unilateral, and four bimodal. Intervention Four programs, with the participants' everyday map, were loaded into the processor with a different pre-processing strategy applied in each of the four positions: no pre-processing, ADRO®, ASC, and ASC+ADRO®. Data Collection and Analysis Participants repeated CNC words presented at 50 and 70 dB SPL in quiet and HINT sentences presented adaptively with competing R-Space noise at 60 and 70 dB SPL. Each measure was completed as participants listened with each of the four pre-processing strategies listed above. Test order and condition were randomized. A repeated-measures analysis of variance (ANOVA) was used to compare each pre-processing strategy across the group data. Critical differences were used to determine significant score differences between each pre-processing strategy for individual participants. Results For CNC words presented at 50 dB SPL, the group data revealed significantly better scores using ASC+ADRO® compared to all other pre-processing conditions, while ASC resulted in poorer scores compared to ADRO® and ASC+ADRO®. Group data for HINT sentences presented in 70 dB SPL of R-Space noise revealed significantly improved scores using ASC and ASC+ADRO® compared to no pre-processing, with ASC+ADRO® scores being better than ADRO® alone. Group data for CNC words presented at 70 dB SPL and adaptive HINT sentences presented in 60 dB SPL of R-Space noise showed no significant differences among conditions. Individual data showed that the pre-processing strategy yielding the best scores varied across measures and participants. Conclusions Group data reveal an advantage with ASC+ADRO® for speech perception presented at lower levels and in higher levels of background noise. Individual data revealed that the optimal pre-processing strategy varied among participants, indicating that a variety of pre-processing strategies should be explored for each CI user considering his or her performance in challenging listening environments. PMID:26905529
Automated microaneurysm detection in diabetic retinopathy using curvelet transform
NASA Astrophysics Data System (ADS)
Ali Shah, Syed Ayaz; Laude, Augustinus; Faye, Ibrahima; Tang, Tong Boon
2016-10-01
Microaneurysms (MAs) are known to be early signs of diabetic retinopathy (DR). An automated MA detection system based on the curvelet transform is proposed for color fundus image analysis. MA candidates were extracted in two parallel steps. In step one, blood vessels were removed from the preprocessed green band image and preliminary MA candidates were selected by a local thresholding technique. In step two, the image background was estimated based on statistical features. The results from the two steps allowed us to identify preliminary MA candidates that were also present in the image foreground. A collection of features was fed to a rule-based classifier to divide the candidates into MAs and non-MAs. The proposed system was tested on the Retinopathy Online Challenge database. The automated system detected 162 MAs out of 336, thus achieving a sensitivity of 48.21% with 65 false positives per image. Counting MAs is a means to measure the progression of DR. Hence, the proposed system may be deployed to monitor the progression of DR at an early stage in population studies.
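A hedged sketch of the candidate extraction stage: after vessel suppression, the background is estimated with a median filter and dark residuals are flagged by thresholding; the parameters and the median-based background model are assumptions standing in for the paper's statistical background estimation.

    import numpy as np
    from scipy import ndimage

    def ma_candidates(green, vessel_mask, window=15, k=3.0):
        # Remove vessels, estimate the background, and flag dark residuals
        # via thresholding (simplified candidate-selection step).
        g = green.astype(float)
        g[vessel_mask] = np.median(g)           # suppress blood vessels
        background = ndimage.median_filter(g, size=window)
        residual = background - g               # MAs are darker than background
        threshold = residual.mean() + k * residual.std()
        candidates, n = ndimage.label(residual > threshold)
        return candidates, n

    # Toy green-band fundus patch with two dark dots as MA stand-ins.
    rng = np.random.default_rng(11)
    green = rng.normal(120, 3, (128, 128))
    green[40:43, 40:43] -= 30
    green[90:92, 100:102] -= 30
    vessels = np.zeros_like(green, bool)
    labels, n = ma_candidates(green, vessels)
    print(n, "candidate regions")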
NASA Astrophysics Data System (ADS)
von Ruette, Jonas; Lehmann, Peter; Fan, Linfeng; Bickel, Samuel; Or, Dani
2017-04-01
Landslides and subsequent debris flows initiated by rainfall represent a ubiquitous natural hazard in steep mountainous regions. We integrated a landslide hydro-mechanical triggering model and associated debris flow runout pathways with a graphical user interface (GUI) to represent these natural hazards in a wide range of catchments across the globe. The STEP-TRAMM GUI provides process-based locations and sizes of landslide patterns using digital elevation models (DEM) from the SRTM database (30 m resolution) linked with soil maps from the global database SoilGrids (250 m resolution) and satellite-based information on rainfall statistics for the selected region. In a preprocessing step, STEP-TRAMM models the soil depth distribution and complements soil information, which jointly capture key hydrological and mechanical properties relevant to local soil failure representation. In the presentation we will discuss features of this publicly available platform and compare landslide and debris flow patterns for different regions considering representative intense rainfall events. Model outcomes will be compared for different spatial and temporal resolutions to test the applicability of web-based information on elevation and rainfall for hazard assessment.
Dalla-Costa, Libera M; Morello, Luis G; Conte, Danieli; Pereira, Luciane A; Palmeiro, Jussara K; Ambrosio, Altair; Cardozo, Dayane; Krieger, Marco A; Raboni, Sonia M
2017-09-01
Sepsis is the leading cause of death in intensive care units (ICUs) worldwide and its diagnosis remains a challenge. Blood culturing is the gold standard technique for blood stream infection (BSI) identification. Molecular tests to detect pathogens in whole blood enable early use of antimicrobials and affect clinical outcomes. Here, using real-time PCR, we evaluated DNA extraction with seven manual and three automated commercially available systems, using whole blood samples artificially contaminated with Escherichia coli, Staphylococcus aureus, and Candida albicans, microorganisms commonly associated with BSI. Overall, the commercial kits evaluated presented several technical limitations, including long turnaround times and low DNA yield and purity. The performance of the kits was comparable for the detection of high microorganism loads (10^6 CFU/mL). However, the detection of lower concentrations was variable, despite the addition of pre-processing treatment to kits without such steps. Of the evaluated kits, the UMD-Universal CE IVD kit generated a higher quantity of DNA with greater nucleic acid purity and afforded the detection of the lowest microbial load in the samples. The inclusion of pre-processing steps with the kit seems to be critical for the detection of microorganism DNA directly from whole blood. In conclusion, future application of molecular techniques will require overcoming major challenges, such as detecting low levels of microorganism nucleic acids amidst the large quantity of human DNA present in samples, or differences in the cellular structures of etiological agents that can also prevent high-quality DNA yields.
NASA Astrophysics Data System (ADS)
Or, D.; von Ruette, J.; Lehmann, P.
2017-12-01
Landslides and subsequent debris flows initiated by rainfall represent a common natural hazard in mountainous regions. We integrated a landslide hydro-mechanical triggering model with a simple model for debris flow runout pathways and developed a graphical user interface (GUI) to represent these natural hazards at catchment scale at any location. The STEP-TRAMM GUI provides process-based estimates of the initiation locations and sizes of landslide patterns based on digital elevation models (SRTM) linked with high-resolution global soil maps (SoilGrids, 250 m resolution) and satellite-based information on rainfall statistics for the selected region. In the preprocessing phase, the STEP-TRAMM model estimates the soil depth distribution to supplement other soil information for delineating key hydrological and mechanical properties relevant to representing local soil failure. We will illustrate this publicly available GUI and modeling platform by simulating the effects of deforestation on landslide hazards in several regions and comparing model outcomes with satellite-based information.
Video Completion in Digital Stabilization Task Using Pseudo-Panoramic Technique
NASA Astrophysics Data System (ADS)
Favorskaya, M. N.; Buryachenko, V. V.; Zotin, A. G.; Pakhirka, A. I.
2017-05-01
Video completion is a necessary stage after the stabilization of a non-stationary video sequence if the resolution of the stabilized frames is to equal the resolution of the original frames. Usually the cropped stabilized frames lose 10-20% of their area, which worsens the visibility of the reconstructed scenes. The extension of the field of view may be required due to unwanted pan-tilt-zoom camera movement. Our approach prepares a pseudo-panoramic key frame during the stabilization stage as a pre-processing step for the subsequent inpainting. It is based on a multi-layered representation of each frame, including the background and objects that move differently. The proposed algorithm involves four steps: background completion, local motion inpainting, local warping, and seamless blending. Our experiments show that the need for seamless stitching arises more often than the need for the local warping step. Therefore, seamless blending was investigated in detail, including four main categories: feathering-based, pyramid-based, gradient-based, and optimal seam-based blending.
Engel, Erwan; Ratel, Jérémy
2007-06-22
The objective of this work was to assess the relevance for food authentication of a novel chemometric method developed to correct mass spectrometry (MS) data for instrumental drift, namely, comprehensive combinatory standard correction (CCSC). Applied to gas chromatography (GC)-MS data, the method consists of analyzing a liquid sample with a mixture of n internal standards and using the best combination of standards to correct the MS signal provided by each compound. The paper focuses on the authentication of the type of feeding in farm animals based on the composition in volatile constituents of their adipose tissues. The first step of the work ensured, on the one hand, the feasibility of converting the adipose tissue sample into the liquid phase required for the CCSC method and, on the other hand, determined the key parameters of the extraction of the volatile fraction from this liquid phase by dynamic headspace. The second step showed the relevance of CCSC pre-processing of the MS fingerprints generated by dynamic headspace-MS analysis of lamb tissues for discriminating animals fed exclusively with pasture (n=8) or concentrate (n=8). When compared with filtering of raw data, internal normalization, and correction by a single standard, the CCSC method increased by 17.1-, 3.3-, and 1.3-fold, respectively, the number of mass fragments that discriminated the type of feeding. The final step confirmed the advantage of CCSC pre-processing of dynamic headspace-gas chromatography-MS data for revealing molecular tracers of the type of feeding, whose number (n=72) was greater than the numbers obtained with raw data (n=42), internal normalization (n=63), and correction by a single standard (n=57). The relevance of the information gained by using the CCSC method is discussed.
Spatial-spectral preprocessing for endmember extraction on GPU's
NASA Astrophysics Data System (ADS)
Jimenez, Luis I.; Plaza, Javier; Plaza, Antonio; Li, Jun
2016-10-01
Spectral unmixing is focused on the identification of spectrally pure signatures, called endmembers, and their corresponding abundances in each pixel of a hyperspectral image. Mainly focused on the spectral information contained in hyperspectral images, endmember extraction techniques have recently incorporated spatial information to achieve more accurate results. Several algorithms have been developed for automatic or semi-automatic identification of endmembers using spatial and spectral information, including spectral-spatial endmember extraction (SSEE) where, within a preprocessing step, both sources of information are extracted from the hyperspectral image and used equally for this purpose. Previous works have implemented the SSEE technique in four main steps: 1) calculation of local eigenvectors in each sub-region into which the original hyperspectral image is divided; 2) computation of the maxima and minima projections of all eigenvectors over the entire hyperspectral image in order to obtain a set of candidate pixels; 3) expansion and averaging of the signatures of the candidate set; 4) ranking based on the spectral angle distance (SAD). The result of this method is a list of candidate signatures from which the endmembers can be extracted using various spectral-based techniques, such as orthogonal subspace projection (OSP), vertex component analysis (VCA) or N-FINDR. Considering the large volume of data and the complexity of the calculations, there is a need for efficient implementations. Latest-generation hardware accelerators such as commodity graphics processing units (GPUs) offer a good opportunity for improving the computational performance in this context. In this paper, we develop two different GPU implementations of the SSEE algorithm. Both are based on the eigenvector computation within each sub-region in the first step, one using singular value decomposition (SVD) and the other principal component analysis (PCA). Based on our experiments with hyperspectral data sets, high computational performance is observed in both cases.
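Steps 1 and 2 of the SSEE pipeline can be sketched with NumPy as below: a per-tile SVD supplies local eigenvectors, whole-image projections supply extreme pixels as candidates, and SAD is the ranking metric; the tile size and the number of eigenvectors kept are assumptions.

    import numpy as np

    def sad(a, b):
        # Spectral angle distance between two spectra.
        return np.arccos(np.clip(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)),
                                 -1.0, 1.0))

    # Toy hyperspectral cube: 64x64 pixels, 50 bands.
    rng = np.random.default_rng(12)
    cube = rng.random((64, 64, 50))
    pixels = cube.reshape(-1, 50)

    candidates = []
    tile = 16
    for r in range(0, 64, tile):          # step 1: local eigenvectors per tile
        for c in range(0, 64, tile):
            block = cube[r:r+tile, c:c+tile].reshape(-1, 50)
            block = block - block.mean(axis=0)
            _, _, vt = np.linalg.svd(block, full_matrices=False)
            for ev in vt[:2]:             # keep the two leading eigenvectors
                proj = pixels @ ev        # step 2: project the whole image
                candidates += [int(proj.argmax()), int(proj.argmin())]

    cand = np.unique(candidates)          # candidate pixel set for steps 3-4
    print(len(cand), "candidates; SAD example:",
          round(float(sad(pixels[cand[0]], pixels[cand[1]])), 3))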
Design and implementation of a preprocessing system for a sodium lidar
NASA Technical Reports Server (NTRS)
Voelz, D. G.; Sechrist, C. F., Jr.
1983-01-01
A preprocessing system, designed and constructed for use with the University of Illinois sodium lidar system, was developed to increase the altitude resolution and range of the lidar system and also to decrease the processing burden of the main lidar computer. The preprocessing system hardware and the software required to implement the system are described. Some preliminary results of an airborne sodium lidar experiment conducted with the preprocessing system installed in the sodium lidar are presented.
Installé, Arnaud Jf; Van den Bosch, Thierry; De Moor, Bart; Timmerman, Dirk
2014-10-20
Using machine-learning techniques, clinical diagnostic model research extracts diagnostic models from patient data. Traditionally, patient data are often collected using electronic Case Report Form (eCRF) systems, while mathematical software is used for analyzing these data with machine-learning techniques. Due to the lack of integration between eCRF systems and mathematical software, extracting diagnostic models is a complex, error-prone process. Moreover, due to the complexity of this process, it is usually performed only once, after a predetermined number of data points have been collected, without insight into the predictive performance of the resulting models. The objective of the Clinical Data Miner (CDM) software framework study is to offer an eCRF system with integrated data preprocessing and machine-learning libraries, improving the efficiency of the clinical diagnostic model research workflow, and to enable optimization of patient inclusion numbers through study performance monitoring. The CDM software framework was developed using a test-driven development (TDD) approach, to ensure high software quality. Architecturally, CDM's design is split over a number of modules to ensure future extensibility. The TDD approach has enabled us to deliver high software quality. CDM's eCRF Web interface is in active use by the studies of the International Endometrial Tumor Analysis consortium, with over 4000 enrolled patients, and more studies planned. Additionally, a derived user interface has been used in six separate interrater agreement studies. CDM's integrated data preprocessing and machine-learning libraries simplify some otherwise manual and error-prone steps in the clinical diagnostic model research workflow. Furthermore, CDM's libraries provide study coordinators with a method to monitor a study's predictive performance as patient inclusions increase. To our knowledge, CDM is the only eCRF system integrating data preprocessing and machine-learning libraries. This integration improves the efficiency of the clinical diagnostic model research workflow. Moreover, by simplifying the generation of learning curves, CDM enables study coordinators to assess more accurately when data collection can be terminated, resulting in better models or lower patient recruitment costs.
Layered recognition networks that pre-process, classify, and describe
NASA Technical Reports Server (NTRS)
Uhr, L.
1971-01-01
A brief overview is presented of six types of pattern recognition programs that: (1) preprocess, then characterize; (2) preprocess and characterize together; (3) preprocess and characterize into a recognition cone; (4) describe as well as name; (5) compose interrelated descriptions; and (6) converse. A computer program (of types 3 through 6) is presented that transforms and characterizes the input scene through the successive layers of a recognition cone, and then engages in a stylized conversation to describe the scene.
Data Access Services that Make Remote Sensing Data Easier to Use
NASA Technical Reports Server (NTRS)
Lynnes, Christopher
2010-01-01
This slide presentation reviews some of the processes that NASA uses to make remote sensing data easier to use over the World Wide Web. This work involves much research into data formats, geolocation structures, and quality indicators, often followed by coding a preprocessing program; only then are the data usable within the analysis tool of choice. The Goddard Earth Sciences Data and Information Services Center is deploying a variety of data access services that are designed to dramatically shorten the time consumed in the data preparation step. On-the-fly conversion to the standard network Common Data Form (netCDF) format with Climate-Forecast (CF) conventions imposes a standard coordinate system framework that makes data instantly readable through several tools, such as the Integrated Data Viewer, the Gridded Analysis and Display System, Panoply, and Ferret. A similar benefit is achieved by serving data through the Open Source Project for a Network Data Access Protocol (OPeNDAP), which also provides subsetting. The Data Quality Screening Service goes a step further in filtering out data points based on quality control flags, following science team recommendations or user-specified criteria. Further still is the Giovanni online analysis system, which goes beyond handling formatting and quality to provide visualization and basic statistics of the data. This general approach of automating the preparation steps has the important added benefit of enabling use of the data by non-human users (i.e., computer programs), which often make sub-optimal use of the available data due to the need to hard-code data preparation on the client side.
van Mierlo, Pieter; Lie, Octavian; Staljanssens, Willeke; Coito, Ana; Vulliémoz, Serge
2018-04-26
We investigated the influence of processing steps in the estimation of multivariate directed functional connectivity during seizures recorded with intracranial EEG (iEEG) on seizure-onset zone (SOZ) localization. We studied the effect of (i) the number of nodes, (ii) time-series normalization, (iii) the choice of multivariate time-varying connectivity measure: the Adaptive Directed Transfer Function (ADTF) or the Adaptive Partial Directed Coherence (APDC), and (iv) graph theory measure: outdegree or shortest path length. First, simulations were performed to quantify the influence of the various processing steps on the accuracy of SOZ localization. Afterwards, the SOZ was estimated from a 113-electrode iEEG seizure recording and compared with the resection that rendered the patient seizure-free. The simulations revealed that the ADTF is preferred over the APDC for localizing the SOZ from ictal iEEG recordings. Normalizing the time series before analysis increased the proportion of correctly localized SOZs by 25-35%, while adding more nodes to the connectivity analysis led to a moderate decrease of 10% when comparing 128 with 32 input nodes. The real-seizure connectivity estimates localized the SOZ inside the resection area using the ADTF coupled to outdegree or shortest path length. Our study showed that normalizing the time series is an important pre-processing step, while adding nodes to the analysis only marginally affected SOZ localization. The study shows that directed multivariate Granger-based connectivity analysis is feasible with many input nodes (> 100) and that normalization of the time series before connectivity analysis is preferred.
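A minimal sketch of the recommended normalization step, assuming a channels-by-samples NumPy array (generic per-channel z-scoring, not the authors' exact implementation):

```python
import numpy as np

def zscore_channels(ts):
    """Normalize each channel (row) to zero mean and unit variance."""
    ts = np.asarray(ts, dtype=float)
    return (ts - ts.mean(axis=1, keepdims=True)) / ts.std(axis=1, keepdims=True)

# Toy stand-in: 32 iEEG channels with wildly different amplitude scales.
x = np.random.randn(32, 2000) * np.random.uniform(1, 50, size=(32, 1))
xn = zscore_channels(x)  # equal-footing input for connectivity estimation
```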
NASA Technical Reports Server (NTRS)
Austin, W. W.
1983-01-01
The effects on LANDSAT data of a Sun angle correction, an intersatellite LANDSAT-2 and LANDSAT-3 data range adjustment, and the atmospheric correction algorithm were evaluated. Fourteen 1978 crop year LACIE sites were used as the site data set. The preprocessing techniques were applied to multispectral scanner channel data, and the transformed data were plotted and used to analyze the effectiveness of the preprocessing techniques. Ratio transformations effectively reduce the need for preprocessing techniques to be applied directly to the data. Subtractive transformations are more sensitive to Sun angle and atmospheric corrections than ratios. Preprocessing techniques, other than those applied at the Goddard Space Flight Center, should be applied only at the user's option. Although the study was performed on LANDSAT data, the results are also applicable to meteorological satellite data.
Comparison of preprocessing methods and storage times for touch DNA samples
Dong, Hui; Wang, Jing; Zhang, Tao; Ge, Jian-ye; Dong, Ying-qiang; Sun, Qi-fan; Liu, Chao; Li, Cai-xia
2017-01-01
Aim: To select appropriate preprocessing methods for different substrates by comparing the effects of four different preprocessing methods on touch DNA samples, and to determine the effect of various storage times on the results of touch DNA sample analysis. Method: Hand touch DNA samples were used to investigate the detection and inspection results of DNA on different substrates. Four preprocessing methods, including the direct cutting method, the stubbing procedure, the double swab technique, and the vacuum cleaner method, were used in this study. DNA was extracted from mock samples with the four different preprocessing methods. The best preprocessing protocol determined from the study was then used to compare performance after various storage times. DNA extracted from all samples was quantified and amplified using standard procedures. Results: The amounts of DNA and the numbers of alleles detected on the porous substrates were greater than those on the non-porous substrates. The performances of the four preprocessing methods varied with different substrates. The direct cutting method displayed advantages for porous substrates, and the vacuum cleaner method was advantageous for non-porous substrates. No significant degradation trend was observed as the storage times increased. Conclusion: Different substrates require different preprocessing methods in order to obtain the highest DNA amount and allele number from touch DNA samples. This study provides a theoretical basis for explorations of touch DNA samples and may be used as a reference when dealing with touch DNA samples in casework. PMID:28252870
NASA Astrophysics Data System (ADS)
Zhu, Weifang; Zhang, Li; Shi, Fei; Xiang, Dehui; Wang, Lirong; Guo, Jingyun; Yang, Xiaoling; Chen, Haoyu; Chen, Xinjian
2017-07-01
Cystoid macular edema (CME) and macular hole (MH) are the leading causes for visual loss in retinal diseases. The volume of the CMEs can be an accurate predictor for visual prognosis. This paper presents an automatic method to segment the CMEs from the abnormal retina with coexistence of MH in three-dimensional-optical coherence tomography images. The proposed framework consists of preprocessing and CMEs segmentation. The preprocessing part includes denoising, intraretinal layers segmentation and flattening, and MH and vessel silhouettes exclusion. In the CMEs segmentation, a three-step strategy is applied. First, an AdaBoost classifier trained with 57 features is employed to generate the initialization results. Second, an automated shape-constrained graph cut algorithm is applied to obtain the refined results. Finally, cyst area information is used to remove false positives (FPs). The method was evaluated on 19 eyes with coexistence of CMEs and MH from 18 subjects. The true positive volume fraction, FP volume fraction, dice similarity coefficient, and accuracy rate for CMEs segmentation were 81.0%±7.8%, 0.80%±0.63%, 80.9%±5.7%, and 99.7%±0.1%, respectively.
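The first step of the three-step segmentation strategy can be sketched as follows; the feature vectors, labels, and classifier settings below are illustrative stand-ins, not the paper's actual 57-feature design:

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

# Stand-in voxel features: 1000 voxels, 57 features each (matching the paper's count).
rng = np.random.default_rng(0)
X = rng.random((1000, 57))
y = (X[:, 0] + 0.1 * rng.standard_normal(1000) > 0.5).astype(int)  # cyst vs. background

clf = AdaBoostClassifier(n_estimators=100, random_state=0).fit(X, y)
init_labels = clf.predict(X)  # coarse initialization to seed the graph-cut refinement
```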
Retinal image restoration by means of blind deconvolution
NASA Astrophysics Data System (ADS)
Marrugo, Andrés G.; Šorel, Michal; Šroubek, Filip; Millán, María S.
2011-11-01
Retinal imaging plays a key role in the diagnosis and management of ophthalmologic disorders, such as diabetic retinopathy, glaucoma, and age-related macular degeneration. Because of the acquisition process, retinal images often suffer from blurring and uneven illumination. This problem may seriously affect disease diagnosis and progression assessment. Here we present a method for color retinal image restoration by means of multichannel blind deconvolution. The method is applied to a pair of retinal images acquired within a lapse of time, ranging from several minutes to months. It consists of a series of preprocessing steps to adjust the images so they comply with the considered degradation model, followed by the estimation of the point-spread function and, ultimately, image deconvolution. The preprocessing is mainly composed of image registration, uneven illumination compensation, and segmentation of areas with structural changes. In addition, we have developed a procedure for the detection and visualization of structural changes. This enables the identification of subtle developments in the retina not caused by variation in illumination or blur. The method was tested on synthetic and real images. Encouraging experimental results show that the method is capable of significant restoration of degraded retinal images.
Sunspot Pattern Classification using PCA and Neural Networks (Poster)
NASA Technical Reports Server (NTRS)
Rajkumar, T.; Thompson, D. E.; Slater, G. L.
2005-01-01
The sunspot classification scheme presented in this paper is considered as a 2-D classification problem on archived datasets, and is not a real-time system. As a first step, it mirrors the Zuerich/McIntosh historical classification system and reproduces classification of sunspot patterns based on preprocessing and neural net training datasets. Ultimately, the project intends to move from more rudimentary schemes to develop spatial-temporal-spectral classes derived by correlating spatial and temporal variations in various wavelengths to the brightness fluctuation spectrum of the sun in those wavelengths. Once the approach is generalized, the focus will naturally move from a 2-D to an n-D classification, where "n" includes time and frequency. Here, the 2-D perspective refers both to the actual SOHO Michelson Doppler Imager (MDI) images that are processed and to the fact that a 2-D matrix is created from each image during preprocessing. The 2-D matrix is the result of running Principal Component Analysis (PCA) over the selected dataset images, and the resulting matrices and their eigenvalues are the objects that are stored in a database, classified, and compared. These matrices are indexed according to the standard McIntosh classification scheme.
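A hedged sketch of the PCA preprocessing described above, using scikit-learn on stand-in images (the real MDI data, image sizes, and number of retained components are assumptions):

```python
import numpy as np
from sklearn.decomposition import PCA

# Stand-in for a set of MDI image cut-outs: 200 images of 64x64 pixels.
images = np.random.rand(200, 64, 64)
X = images.reshape(len(images), -1)      # flatten each image to a row vector

pca = PCA(n_components=10)
fingerprints = pca.fit_transform(X)      # low-dimensional representation per image
print(pca.explained_variance_ratio_.round(3))
```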
Multi-Temporal Land Cover Classification with Sequential Recurrent Encoders
NASA Astrophysics Data System (ADS)
Rußwurm, Marc; Körner, Marco
2018-03-01
Earth observation (EO) sensors deliver data with daily or weekly temporal resolution. Most land use and land cover (LULC) approaches, however, expect cloud-free and mono-temporal observations. The increasing temporal capabilities of today's sensors enable the use of temporal, along with spectral and spatial, features. Domains such as speech recognition or neural machine translation work with inherently temporal data and, today, achieve impressive results using sequential encoder-decoder structures. Inspired by these sequence-to-sequence models, we adapt an encoder structure with convolutional recurrent layers in order to approximate a phenological model for vegetation classes based on a temporal sequence of Sentinel 2 (S2) images. In our experiments, we visualize internal activations over a sequence of cloudy and non-cloudy images and find several recurrent cells which reduce the input activity for cloudy observations. Hence, we assume that our network has learned cloud-filtering schemes solely from input data, which could alleviate the need for tedious cloud filtering as a preprocessing step in many EO approaches. Moreover, using unfiltered temporal series of top-of-atmosphere (TOA) reflectance data, we achieved state-of-the-art classification accuracies in our experiments on a large number of crop classes with minimal preprocessing, compared to other classification approaches.
NASA Astrophysics Data System (ADS)
Moustafa, Azza A.; Hegazy, Maha A.; Mohamed, Dalia; Ali, Omnia
2016-02-01
A novel approach for the resolution and quantitation of a severely overlapped quaternary mixture of carbinoxamine maleate (CAR), pholcodine (PHL), ephedrine hydrochloride (EPH) and sunset yellow (SUN) in syrup was demonstrated utilizing different spectrophotometry-assisted multivariate calibration methods. The applied methods used different processing and pre-processing algorithms. The proposed methods were partial least squares (PLS), concentration residuals augmented classical least squares (CRACLS), and a novel method, continuous wavelet transform coupled with partial least squares (CWT-PLS). These methods were applied to a training set in the concentration ranges of 40-100 μg/mL, 40-160 μg/mL, 100-500 μg/mL and 8-24 μg/mL for the four components, respectively. The utilized methods did not require any preliminary separation step or chemical pretreatment. The validity of the methods was evaluated with an external validation set. The selectivity of the developed methods was demonstrated by analyzing the drugs in their combined pharmaceutical formulation without any interference from additives. The obtained results were statistically compared with the official and reported methods, and no significant difference was observed regarding either accuracy or precision.
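For readers unfamiliar with multivariate calibration, a minimal PLS sketch on synthetic mixture spectra follows; the concentration ranges mirror the abstract, but the spectra, noise level, and number of latent variables are invented for illustration:

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(0)
# 40 training mixtures of 4 analytes within the stated concentration ranges.
C = rng.uniform([40, 40, 100, 8], [100, 160, 500, 24], size=(40, 4))
S = rng.random((4, 601))                               # pure-component "spectra"
A = C @ S + 0.01 * rng.standard_normal((40, 601))      # additive mixing plus noise

pls = PLSRegression(n_components=6).fit(A, C)
print(pls.predict(A[:2]).round(1))                     # recovered concentrations
```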
Iris Recognition Using Feature Extraction of Box Counting Fractal Dimension
NASA Astrophysics Data System (ADS)
Khotimah, C.; Juniati, D.
2018-01-01
Biometrics is a science that is now growing rapidly. Iris recognition is a biometric modality which captures a photo of the eye pattern. The markings of the iris are so distinctive that they have been proposed as a means of identification, instead of fingerprints. Iris recognition was chosen for identification in this research because the iris pattern is a special feature that differs between individuals, and the iris is protected by the cornea so that its shape remains fixed. This iris recognition consists of three steps: pre-processing of data, feature extraction, and feature matching. The Hough transform is used in the pre-processing step to locate the iris area, and Daugman's rubber sheet model is used to normalize the iris data set into rectangular blocks. To characterize the iris, the box-counting method was used to obtain the fractal dimension value of the iris. Tests were carried out using k-fold cross-validation with k = 5, and each test used 10 different values of K for the K-Nearest Neighbor (KNN) classifier. The best iris recognition accuracy obtained was 92.63%, for K = 3 in the KNN method.
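A compact sketch of the box-counting estimate of fractal dimension on a binary iris block; the stand-in image, box sizes, and fitting details are illustrative assumptions:

```python
import numpy as np

def box_counting_dimension(binary_img, sizes=(2, 4, 8, 16, 32)):
    """Estimate the fractal dimension of a binary image by box counting."""
    counts = []
    for s in sizes:
        h = binary_img.shape[0] // s * s
        w = binary_img.shape[1] // s * s
        blocks = binary_img[:h, :w].reshape(h // s, s, w // s, s)
        counts.append(blocks.any(axis=(1, 3)).sum())   # boxes touching the pattern
    # Dimension is the slope of log(count) against log(1/size).
    slope, _ = np.polyfit(np.log(1.0 / np.asarray(sizes)), np.log(counts), 1)
    return slope

iris_block = np.random.rand(128, 256) > 0.7            # stand-in normalized iris
print(box_counting_dimension(iris_block))
```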
NASA Astrophysics Data System (ADS)
Princz, S.; Wenzel, U.; Miller, R.; Hessling, M.
2014-11-01
One aerobic and four anaerobic batch fermentations of the yeast Saccharomyces cerevisiae were conducted in a stirred bioreactor and monitored inline by NIR spectroscopy and a transflectance dip probe. From the acquired NIR spectra, chemometric partial least squares regression (PLSR) models for predicting biomass, glucose and ethanol were constructed. The spectra were measured directly in the fermentation broth and successfully inspected for adulteration using our novel data pre-processing method. These adulterations manifested as strong fluctuations in the shape and offset of the absorption spectra. They resulted from cells, cell clusters, or gas bubbles intercepting the optical path of the dip probe. In the proposed data pre-processing method, adulterated signals are removed by passing the time-scanned non-averaged spectra through two filter algorithms with a 5% quantile cutoff. The filtered spectra containing meaningful data are then averaged. A second step checks whether the whole time scan is analyzable. If true, the average is calculated and used to prepare the PLSR models. This new method distinctly improved the prediction results. To dissociate possible correlations between analyte concentrations, such as glucose and ethanol, the feed analytes were alternately supplied at different concentrations (spiking) at the end of the four anaerobic fermentations. This procedure yielded low-error (anaerobic) PLSR models, with prediction errors of 0.31 g/l for biomass, 3.41 g/l for glucose, and 2.17 g/l for ethanol. The maximum concentrations were 14 g/l biomass, 167 g/l glucose, and 80 g/l ethanol. Data from the aerobic fermentation, carried out under high agitation and high aeration, were incorporated to realize combined PLSR models, which to our knowledge have not been previously reported.
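A loose sketch of the quantile-based rejection idea (filter adulterated scans, then average); the abstract does not specify the two filter criteria, so the baseline-offset and shape-deviation criteria below are plausible stand-ins:

```python
import numpy as np

def filter_and_average(scans, q=0.05):
    """Drop scans with extreme baseline offset or shape deviation, then average."""
    scans = np.asarray(scans)                      # (n_scans, n_wavelengths)
    offset = scans.mean(axis=1)                    # baseline offset per scan
    shape_dev = np.abs(scans - np.median(scans, axis=0)).mean(axis=1)

    keep = np.ones(len(scans), dtype=bool)
    for crit in (offset, shape_dev):               # 5% quantile cutoff per criterion
        lo, hi = np.quantile(crit, [q, 1 - q])
        keep &= (crit >= lo) & (crit <= hi)
    return scans[keep].mean(axis=0)

scans = np.random.rand(50, 1024)                   # stand-in time-scanned spectra
averaged = filter_and_average(scans)
```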
Chriskos, Panteleimon; Frantzidis, Christos A.; Gkivogkli, Polyxeni T.; Bamidis, Panagiotis D.; Kourtidou-Papadeli, Chrysoula
2018-01-01
Sleep staging, the process of assigning labels to epochs of sleep depending on the stage of sleep to which they belong, is an arduous, time-consuming and error-prone process, as the initial recordings are quite often polluted by noise from different sources. To properly analyze such data and extract clinical knowledge, noise components must be removed or alleviated. In this paper a pre-processing and subsequent sleep staging pipeline for the sleep analysis of electroencephalographic signals is described. Two novel methods of functional connectivity estimation (Synchronization Likelihood/SL and Relative Wavelet Entropy/RWE) are comparatively investigated for automatic sleep staging through manually pre-processed electroencephalographic recordings. A multi-step process that renders signals suitable for further analysis is initially described. Then, two methods that rely on extracting synchronization features from electroencephalographic recordings to achieve computerized sleep staging are proposed, based on bivariate features which provide a functional overview of the brain network, contrary to most proposed methods that rely on extracting univariate time and frequency features. Annotation of sleep epochs is achieved through the presented feature extraction methods by training classifiers, which are in turn able to accurately classify new epochs. Analysis of data from sleep experiments on a randomized, controlled bed-rest study, which was organized by the European Space Agency and conducted in the “ENVIHAB” facility of the Institute of Aerospace Medicine at the German Aerospace Center (DLR) in Cologne, Germany, attains high accuracy rates of over 90%, based on ground truth that resulted from manual sleep staging by two experienced sleep experts. Therefore, it can be concluded that the above feature extraction methods are suitable for semi-automatic sleep staging. PMID:29628883
Sand, Andreas; Kristiansen, Martin; Pedersen, Christian N S; Mailund, Thomas
2013-11-22
Hidden Markov models are widely used for genome analysis as they combine ease of modelling with efficient analysis algorithms. Calculating the likelihood of a model using the forward algorithm has worst-case time complexity linear in the length of the sequence and quadratic in the number of states in the model. For genome analysis, however, the length runs to millions or billions of observations, and when maximising the likelihood hundreds of evaluations are often needed. A time-efficient forward algorithm is therefore a key ingredient in an efficient hidden Markov model library. We have built a software library for efficiently computing the likelihood of a hidden Markov model. The library exploits commonly occurring substrings in the input to reuse computations in the forward algorithm. In a pre-processing step our library identifies common substrings and builds a structure over the computations in the forward algorithm which can be reused. This analysis can be saved between uses of the library and is independent of concrete hidden Markov models, so one preprocessing can be used to run a number of different models. Using this library, we achieve up to 78 times shorter wall-clock time for realistic whole-genome analyses with a real and reasonably complex hidden Markov model. In one particular case the analysis was performed in less than 8 minutes, compared to 9.6 hours for the previously fastest library. We have implemented the preprocessing procedure and forward algorithm as a C++ library, zipHMM, with Python bindings for use in scripts. The library is available at http://birc.au.dk/software/ziphmm/.
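For context, the baseline that zipHMM accelerates is the textbook scaled forward algorithm, sketched below (this is not zipHMM's compressed implementation or its Python API):

```python
import numpy as np

def forward_loglik(pi, A, B, obs):
    """Scaled forward algorithm: O(T * K^2) for T observations and K states."""
    alpha = pi * B[:, obs[0]]
    loglik = np.log(alpha.sum())
    alpha /= alpha.sum()
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]    # transition step, then emission weighting
        loglik += np.log(alpha.sum())
        alpha /= alpha.sum()             # rescale to avoid numerical underflow
    return loglik

pi = np.array([0.5, 0.5])                          # initial state distribution
A = np.array([[0.9, 0.1], [0.2, 0.8]])             # transition matrix
B = np.array([[0.8, 0.2], [0.3, 0.7]])             # emission matrix
obs = np.random.randint(0, 2, size=100000)         # toy observation sequence
print(forward_loglik(pi, A, B, obs))
```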
NASA Astrophysics Data System (ADS)
Bhattacharjee, T.; Kumar, P.; Fillipe, L.
2018-02-01
Vibrational spectroscopy, especially FTIR and Raman, has shown enormous potential in disease diagnosis, especially in cancers. Their potential for detecting varied pathological conditions is regularly reported. However, to prove their applicability in clinics, large multi-center, multi-national studies need to be undertaken, and these will result in enormous amounts of data. A parallel effort to develop analytical methods, including user-friendly software that can quickly pre-process data and subject them to the required multivariate analysis, is warranted in order to obtain results in real time. This study reports a MATLAB-based script that can automatically import data, preprocess spectra (interpolation, derivatives, normalization), and then carry out Principal Component Analysis (PCA) followed by Linear Discriminant Analysis (LDA) of the first 10 PCs, all with a single click. The software has been verified on data obtained from cell lines, animal models, and in vivo patient datasets, and gives results comparable to the Minitab 16 software. The software can import a variety of file formats, including .asc, .txt, .xls, and many others. Options to ignore noisy data, plot all possible graphs with PCA factors 1 to 5, and save loading factors, confusion matrices and other parameters are also present. The software can provide results for a dataset of 300 spectra within 0.01 s. We believe that the software will be vital not only in clinical trials using vibrational spectroscopic data, but also in obtaining rapid results when these tools are translated into clinics.
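A hedged Python analogue of the described pipeline (the original software is MATLAB-based): PCA to 10 components followed by LDA, evaluated by cross-validation on synthetic spectra standing in for real data:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(1)
X = rng.standard_normal((120, 800))       # 120 stand-in spectra, 800 wavenumbers
y = rng.integers(0, 2, size=120)          # two classes (e.g., normal vs. pathological)
X[y == 1, 100:120] += 0.5                 # weak class-specific spectral band

model = make_pipeline(PCA(n_components=10), LinearDiscriminantAnalysis())
print(cross_val_score(model, X, y, cv=5).mean())
```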
Han, Bangxing; Peng, Huasheng; Yan, Hui
2016-01-01
Mugua is a common Chinese herbal medicine. There are three main medicinal origin places in China (Xuancheng City, Anhui Province; Qijiang District, Chongqing City; and Yichang City, Hubei Province) and an origin place suitable for food (Linyi City, Shandong Province). The aim was to construct a qualitative analytical method to identify the origin of medicinal Mugua by near infrared spectroscopy (NIRS). A partial least squares discriminant analysis (PLSDA) model was established after the original spectra of Mugua derived from five different origins were preprocessed. Moreover, hierarchical cluster analysis was performed. The results showed that a PLSDA model was established and that, according to the relationship between the origin-related important scores and wavenumber, together with K-means cluster analysis, the Muguas derived from different origins were effectively identified. NIRS technology can quickly and accurately identify the origin of Mugua, providing a new method and technology for the identification of Chinese medicinal materials. After preprocessing by D1+autoscale, more peaks appeared in the near infrared spectrum of Mugua; five latent variable scores could reflect the information related to the origin place of Mugua; and the origins of Mugua were well distinguished according to K-means cluster analysis. Abbreviations used: TCM: Traditional Chinese Medicine; NIRS: near infrared spectroscopy; SG: Savitzky-Golay smoothing; D1: first derivative; D2: second derivative; SNV: standard normal variate transformation; MSC: multiplicative scatter correction; PLSDA: partial least squares discriminant analysis; LV: latent variable; VIP scores: variable importance scores.
A robust technique based on VLM and Frangi filter for retinal vessel extraction and denoising.
Khan, Khan Bahadar; Khaliq, Amir A; Jalil, Abdul; Shahid, Muhammad
2018-01-01
The exploration of retinal vessel structure is colossally important on account of numerous diseases, including stroke, Diabetic Retinopathy (DR) and coronary heart disease, which can damage the retinal vessel structure. The retinal vascular network is very hard to extract due to its spreading and diminishing geometry and contrast variation within an image. The proposed technique consists of unique parallel processes for denoising and extraction of blood vessels in retinal images. In the preprocessing section, adaptive histogram equalization enhances the dissimilarity between the vessels and the background, and morphological top-hat filters are employed to eliminate the macula and optic disc, etc. To remove local noise, the difference image is computed from the top-hat filtered image and the high-boost filtered image. The Frangi filter is applied at multiple scales for the enhancement of vessels possessing diverse widths. Segmentation is performed by using improved Otsu thresholding on the high-boost filtered image and Frangi's enhanced image, separately. In the postprocessing steps, a Vessel Location Map (VLM) is extracted by using raster-to-vector transformation. Postprocessing steps are employed in a novel way to reject misclassified vessel pixels. The final segmented image is obtained by using a pixel-by-pixel AND operation between the VLM and the Frangi output image. The method has been rigorously analyzed on the STARE, DRIVE and HRF datasets.
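A rough scikit-image sketch of a comparable enhancement chain (adaptive histogram equalization, morphological top-hat, Frangi vesselness, Otsu threshold); the sample image, filter choices, and parameters are illustrative rather than the paper's tuned settings, and the VLM post-processing is omitted:

```python
from skimage import color, data, exposure, filters, morphology

img = color.rgb2gray(data.retina())                 # bundled sample fundus image
eq = exposure.equalize_adapthist(img)               # adaptive histogram equalization
# Vessels are darker than the background, so a black top-hat isolates them.
tophat = morphology.black_tophat(eq, morphology.disk(8))
vesselness = filters.frangi(tophat)                 # multi-scale tubular enhancement
mask = vesselness > filters.threshold_otsu(vesselness)  # binary vessel map
print(mask.mean())                                  # fraction of vessel pixels
```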
Hemorrhage detection in MRI brain images using images features
NASA Astrophysics Data System (ADS)
Moraru, Luminita; Moldovanu, Simona; Bibicu, Dorin; Stratulat (Visan), Mirela
2013-11-01
Abnormalities appear frequently on Magnetic Resonance Images (MRI) of the brain in elderly patients presenting with either stroke or cognitive impairment. Detection of brain hemorrhage lesions in MRI is an important but very time-consuming task. This research aims to develop a method to extract brain tissue features from T2-weighted MR images of the brain using a selection of the most valuable texture features in order to discriminate between normal and affected areas of the brain. Due to the textural similarity between normal and affected areas in brain MR images, these operations are very challenging. A trauma may cause microstructural changes which are not necessarily perceptible by visual inspection but can be detected by using texture analysis. The proposed analysis is developed in five steps: i) in the pre-processing step, the de-noising operation is performed using Daubechies wavelets; ii) the original images are transformed into image features using first-order descriptors; iii) the regions of interest (ROIs) are cropped from the image features following the axial symmetry properties with respect to the mid-sagittal plane; iv) the variation in the measured features is quantified using two descriptors of the co-occurrence matrix, namely energy and homogeneity; v) finally, the meaningfulness of the image features is analyzed using the t-test; p-values are computed for the pairs of features in order to measure their efficacy.
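Step iv can be sketched with scikit-image's gray-level co-occurrence utilities; the ROI below is synthetic, and the distances and angles are assumptions:

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

roi = (np.random.rand(64, 64) * 255).astype(np.uint8)   # stand-in T2-weighted ROI
glcm = graycomatrix(roi, distances=[1], angles=[0, np.pi / 2],
                    levels=256, symmetric=True, normed=True)
print("energy:", graycoprops(glcm, "energy").ravel())
print("homogeneity:", graycoprops(glcm, "homogeneity").ravel())
```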
NASA Astrophysics Data System (ADS)
Prilianti, K. R.; Hariyanto, S.; Natali, F. D. D.; Indriatmoko, Adhiwibawa, M. A. S.; Limantara, L.; Brotosudarmo, T. H. P.
2016-04-01
The development of a rapid and automatic pigment characterization method has become an important issue, given that less than 1% of the plant pigments on Earth have been explored. In this research, a mathematical model based on an artificial intelligence approach was developed to simplify and accelerate the pigment characterization process from the HPLC (high-performance liquid chromatography) procedure. HPLC is a widely used technique to separate and identify pigments in a mixture. The input of the model is chromatographic data from the HPLC device, and the output is a list of pigments whose spectral patterns are discovered in it. This model provides two-dimensional (retention time and wavelength) fingerprints for pigment characterization, which are proven to be more accurate than a one-dimensional fingerprint (fixed wavelength). Moreover, by mimicking the interconnection of neurons in the human brain, the model has a learning ability that could replace expert judgement in evaluating spectral patterns. In the preprocessing step, principal component analysis (PCA) was used to reduce the huge dimension of the chromatographic data. The aim of this step is to simplify the model and accelerate the identification process. Six photosynthetic pigments, i.e. zeaxanthin, pheophytin a, α-carotene, β-carotene, lycopene and lutein, could be well identified by the model, with accuracy up to 85.33% and processing time of less than 1 second.
Bioinformatics in proteomics: application, terminology, and pitfalls.
Wiemer, Jan C; Prokudin, Alexander
2004-01-01
Bioinformatics applies data mining, i.e., modern computer-based statistics, to biomedical data. It leverages machine learning approaches, such as artificial neural networks, decision trees and clustering algorithms, and is ideally suited for handling huge amounts of data. In this article, we review the analysis of mass spectrometry data in proteomics, starting with common pre-processing steps and using single decision trees and decision tree ensembles for classification. Special emphasis is put on the pitfall of overfitting, i.e., of generating overly complex single decision trees. Finally, we discuss the pros and cons of the two different decision tree usages.
Testing for intracycle determinism in pseudoperiodic time series.
Coelho, Mara C S; Mendes, Eduardo M A M; Aguirre, Luis A
2008-06-01
A determinism test is proposed based on the well-known method of the surrogate data. Assuming predictability to be a signature of determinism, the proposed method checks for intracycle (e.g., short-term) determinism in the pseudoperiodic time series for which standard methods of surrogate analysis do not apply. The approach presented is composed of two steps. First, the data are preprocessed to reduce the effects of seasonal and trend components. Second, standard tests of surrogate analysis can then be used. The determinism test is applied to simulated and experimental pseudoperiodic time series and the results show the applicability of the proposed test.
A General Algorithm for Reusing Krylov Subspace Information. I. Unsteady Navier-Stokes
NASA Technical Reports Server (NTRS)
Carpenter, Mark H.; Vuik, C.; Lucas, Peter; vanGijzen, Martin; Bijl, Hester
2010-01-01
A general algorithm is developed that reuses available information to accelerate the iterative convergence of linear systems with multiple right-hand sides A x = b (sup i), which are commonly encountered in steady or unsteady simulations of nonlinear equations. The algorithm is based on the classical GMRES algorithm with eigenvector enrichment but also includes a Galerkin projection preprocessing step and several novel Krylov subspace reuse strategies. The new approach is applied to a set of test problems, including an unsteady turbulent airfoil, and is shown in some cases to provide significant improvement in computational efficiency relative to baseline approaches.
Quasi interpolation with Voronoi splines.
Mirzargar, Mahsa; Entezari, Alireza
2011-12-01
We present a quasi interpolation framework that attains the optimal approximation-order of Voronoi splines for reconstruction of volumetric data sampled on general lattices. The quasi interpolation framework of Voronoi splines provides an unbiased reconstruction method across various lattices. Therefore this framework allows us to analyze and contrast the sampling-theoretic performance of general lattices, using signal reconstruction, in an unbiased manner. Our quasi interpolation methodology is implemented as an efficient FIR filter that can be applied online or as a preprocessing step. We present visual and numerical experiments that demonstrate the improved accuracy of reconstruction across lattices, using the quasi interpolation framework.
Automated Recognition of 3D Features in GPIR Images
NASA Technical Reports Server (NTRS)
Park, Han; Stough, Timothy; Fijany, Amir
2007-01-01
A method of automated recognition of three-dimensional (3D) features in images generated by ground-penetrating imaging radar (GPIR) is undergoing development. GPIR 3D images can be analyzed to detect and identify such subsurface features as pipes and other utility conduits. Until now, much of the analysis of GPIR images has been performed manually by expert operators who must visually identify and track each feature. The present method is intended to satisfy a need for more efficient and accurate analysis by means of algorithms that can automatically identify and track subsurface features, with minimal supervision by human operators. In this method, data from multiple sources (for example, data on different features extracted by different algorithms) are fused together for identifying subsurface objects. The algorithms of this method can be classified in several different ways. In one classification, the algorithms fall into three classes: (1) image-processing algorithms, (2) feature-extraction algorithms, and (3) a multiaxis data-fusion/pattern-recognition algorithm that includes a combination of machine-learning, pattern-recognition, and object-linking algorithms. The image-processing class includes preprocessing algorithms for reducing noise and enhancing target features for pattern recognition. The feature-extraction algorithms operate on preprocessed data to extract such specific features in images as two-dimensional (2D) slices of a pipe. Then the multiaxis data-fusion/pattern-recognition algorithm identifies, classifies, and reconstructs 3D objects from the extracted features. In this process, multiple 2D features extracted by use of different algorithms and representing views along different directions are used to identify and reconstruct 3D objects. In object linking, which is an essential part of this process, features identified in successive 2D slices and located within a threshold radius of identical features in adjacent slices are linked in a directed-graph data structure. Relative to past approaches, this multiaxis approach offers the advantages of more reliable detections, better discrimination of objects, and provision of redundant information, which can be helpful in filling gaps in feature recognition by one of the component algorithms. The image-processing class also includes postprocessing algorithms that enhance identified features to prepare them for further scrutiny by human analysts. Enhancement of images as a postprocessing step is a significant departure from traditional practice, in which enhancement of images is a preprocessing step.
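A toy sketch of the object-linking idea using a directed graph (networkx is an assumed convenience here, not the system's actual implementation): feature centroids in adjacent slices that lie within a threshold radius are linked, and connected components become candidate 3D objects:

```python
import numpy as np
import networkx as nx

slices = [
    [(10.0, 12.0), (40.0, 41.0)],   # slice 0: feature centroids (x, y)
    [(10.5, 12.2), (39.5, 41.3)],   # slice 1
    [(11.0, 12.5)],                 # slice 2: the second feature disappears
]
RADIUS = 2.0                        # linking threshold

g = nx.DiGraph()
for z, feats in enumerate(slices):
    g.add_nodes_from((z, i) for i in range(len(feats)))
for z in range(len(slices) - 1):
    for i, a in enumerate(slices[z]):
        for j, b in enumerate(slices[z + 1]):
            if np.hypot(a[0] - b[0], a[1] - b[1]) <= RADIUS:
                g.add_edge((z, i), (z + 1, j))     # link features across slices

# Each weakly connected component is one candidate 3D object (e.g., a pipe).
print([sorted(c) for c in nx.weakly_connected_components(g)])
```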
PhySIC_IST: cleaning source trees to infer more informative supertrees.
Scornavacca, Celine; Berry, Vincent; Lefort, Vincent; Douzery, Emmanuel J P; Ranwez, Vincent
2008-10-04
Supertree methods combine phylogenies with overlapping sets of taxa into a larger one. Topological conflicts frequently arise among source trees for methodological or biological reasons, such as long branch attraction, lateral gene transfers, gene duplication/loss or deep gene coalescence. When topological conflicts occur among source trees, liberal methods infer supertrees containing the most frequent alternative, while veto methods infer supertrees not contradicting any source tree, i.e. discard all conflicting resolutions. When the source trees host a significant number of topological conflicts or have a small taxon overlap, supertree methods of both kinds can propose poorly resolved, hence uninformative, supertrees. To overcome this problem, we propose to infer non-plenary supertrees, i.e. supertrees that do not necessarily contain all the taxa present in the source trees, discarding those whose position greatly differs among source trees or for which insufficient information is provided. We detail a variant of the PhySIC veto method called PhySIC_IST that can infer non-plenary supertrees. PhySIC_IST aims at inferring supertrees that satisfy the same appealing theoretical properties as with PhySIC, while being as informative as possible under this constraint. The informativeness of a supertree is estimated using a variation of the CIC (Cladistic Information Content) criterion, that takes into account both the presence of multifurcations and the absence of some taxa. Additionally, we propose a statistical preprocessing step called STC (Source Trees Correction) to correct the source trees prior to the supertree inference. STC is a liberal step that removes the parts of each source tree that significantly conflict with other source trees. Combining STC with a veto method allows an explicit trade-off between veto and liberal approaches, tuned by a single parameter.Performing large-scale simulations, we observe that STC+PhySIC_IST infers much more informative supertrees than PhySIC, while preserving low type I error compared to the well-known MRP method. Two biological case studies on animals confirm that the STC preprocess successfully detects anomalies in the source trees while STC+PhySIC_IST provides well-resolved supertrees agreeing with current knowledge in systematics. The paper introduces and tests two new methodologies, PhySIC_IST and STC, that demonstrate the interest in inferring non-plenary supertrees as well as preprocessing the source trees. An implementation of the methods is available at: http://www.atgc-montpellier.fr/physic_ist/.
Pre-Processes for Urban Areas Detection in SAR Images
NASA Astrophysics Data System (ADS)
Altay Açar, S.; Bayır, Ş.
2017-11-01
In this study, pre-processes for urban area detection in synthetic aperture radar (SAR) images are examined. These pre-processes are image smoothing, thresholding, and white-coloured region determination. Image smoothing is carried out to remove noise, then thresholding is applied to obtain a binary image. Finally, candidate urban areas are detected by means of white-coloured region determination. All pre-processes are applied using the developed software. Two different SAR images acquired by TerraSAR-X are used in the experimental study. The obtained results are shown visually.
The effects of pre-processing strategies in sentiment analysis of online movie reviews
NASA Astrophysics Data System (ADS)
Zin, Harnani Mat; Mustapha, Norwati; Murad, Masrah Azrifah Azmi; Sharef, Nurfadhlina Mohd
2017-10-01
With the ever-increasing number of internet applications and social networking sites, people nowadays can easily express their feelings towards any product or service. These online reviews act as an important source for further analysis and improved decision making. The reviews are mostly unstructured by nature and thus need processing, such as sentiment analysis and classification, to provide meaningful information for future use. In text analysis tasks, the appropriate selection of words/features has a huge impact on the effectiveness of the classifier. Thus, this paper explores the effect of pre-processing strategies in the sentiment analysis of online movie reviews. In this paper, a supervised machine learning method was used to classify the reviews. The support vector machine (SVM) with linear and non-linear kernels has been considered as the classifier for the classification of the reviews. The performance of the classifier is critically examined based on the results of precision, recall, f-measure, and accuracy. Two different feature representations were used: term frequency and term frequency-inverse document frequency. Results show that the pre-processing strategies have a significant impact on the classification process.
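A minimal scikit-learn sketch of the studied pipeline, where the pre-processing choices feed term-weighted features into a linear SVM; the toy corpus and parameter choices are illustrative only:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

docs = ["a moving, beautifully acted film", "tedious plot and wooden acting",
        "sharp script and great pacing", "dull, overlong and predictable"]
labels = [1, 0, 1, 0]                      # 1 = positive review, 0 = negative

# Pre-processing decisions (lowercasing, stop-word removal, n-gram range) are
# made here and directly shape the feature space the classifier sees.
model = make_pipeline(
    TfidfVectorizer(lowercase=True, stop_words="english", ngram_range=(1, 2)),
    LinearSVC())
model.fit(docs, labels)
print(model.predict(["wooden and predictable", "beautifully paced film"]))
```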
A new classification scheme of plastic wastes based upon recycling labels.
Özkan, Kemal; Ergin, Semih; Işık, Şahin; Işıklı, Idil
2015-01-01
Since recycling of materials is widely assumed to be environmentally and economically beneficial, reliable sorting and processing of waste packaging materials such as plastics is very important for recycling with high efficiency. An automated system that can quickly categorize these materials is certainly needed for obtaining maximum classification while maintaining high throughput. In this paper, first of all, photographs of the plastic bottles were taken and several preprocessing steps were carried out. The first preprocessing step is to extract the plastic area of a bottle from the background. Then, morphological image operations are implemented. These operations are edge detection, noise removal, hole removal, image enhancement, and image segmentation. These morphological operations can generally be defined in terms of combinations of erosion and dilation. The effects of bottle color as well as label are eliminated using these operations. Secondly, the pixel-wise intensity values of the plastic bottle images have been used together with the most popular subspace and statistical feature extraction methods to construct the feature vectors in this study. Only three types of plastic are considered, due to their higher prevalence compared with other plastic types worldwide. The decision mechanism consists of five different feature extraction methods, including Principal Component Analysis (PCA), Kernel PCA (KPCA), Fisher's Linear Discriminant Analysis (FLDA), Singular Value Decomposition (SVD) and Laplacian Eigenmaps (LEMAP), and uses a simple experimental setup with a camera and homogeneous backlighting. Because it provides a global solution to the classification problem, a Support Vector Machine (SVM) is selected for the classification task, and a majority voting technique is used as the decision mechanism. This technique equally weights each classification result and assigns the given plastic object to the class on which most classification results agree. The proposed classification scheme provides a high accuracy rate and is also able to run in real-time applications. It can automatically classify the plastic bottle types with approximately 90% recognition accuracy. Besides this, the proposed methodology yields approximately 96% classification rate for the separation of PET and non-PET plastic types. It also gives 92% accuracy for the categorization of non-PET plastic types into HDPE or PP.
Chain of evidence generation for contrast enhancement in digital image forensics
NASA Astrophysics Data System (ADS)
Battiato, Sebastiano; Messina, Giuseppe; Strano, Daniela
2010-01-01
The quality of the images obtained by digital cameras has improved greatly since the early days of digital cameras. Unfortunately, it is not unusual in image forensics to find wrongly exposed pictures. This is mainly due to obsolete techniques or old technologies, but also to backlight conditions. To extrapolate some invisible details, a stretching of the image contrast is obviously required. The forensic rules for producing evidence require complete documentation of the processing steps, enabling the replication of the entire process. The automation of enhancement techniques is thus quite difficult and needs to be carefully documented. This work presents an automatic procedure to find contrast enhancement settings, allowing both image correction and automatic script generation. The technique is based on a preprocessing step which extracts the features of the image and selects correction parameters. The parameters are then saved through JavaScript code that is used in the second step of the approach to correct the image. The generated script is Adobe Photoshop compliant (Photoshop being widely used in image forensics analysis), thus permitting the replication of the enhancement steps. Experiments on a dataset of images are also reported, showing the effectiveness of the proposed methodology.
Testing for nonlinearity in non-stationary physiological time series.
Guarín, Diego; Delgado, Edilson; Orozco, Álvaro
2011-01-01
Testing for nonlinearity is one of the most important preprocessing steps in nonlinear time series analysis. Typically, this is done by means of linear surrogate data methods. But it is a known fact that the validity of the results heavily depends on the stationarity of the time series. Since most physiological signals are non-stationary, it is easy to falsely detect nonlinearity using the linear surrogate data methods. In this document, we propose a methodology to extend the procedure for generating constrained surrogate time series in order to assess nonlinearity in non-stationary data. The method is based on band-phase-randomized surrogates, which consist (contrary to the linear surrogate data methods) in randomizing only a portion of the Fourier phases in the high-frequency domain. Analysis of simulated time series showed that, in comparison to the linear surrogate data method, our method is able to discriminate between linear stationary, linear non-stationary, and nonlinear time series. Applying our methodology to heart rate variability (HRV) records of five healthy patients, we found that nonlinear correlations are present in these non-stationary physiological signals.
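A minimal sketch of band-phase randomization for a single real-valued series, assuming a normalized cutoff frequency; the authors' exact banding and constraints may differ:

```python
import numpy as np

def band_phase_surrogate(x, f_cut=0.05, seed=None):
    """Randomize Fourier phases above a normalized cutoff, keep those below."""
    rng = np.random.default_rng(seed)
    X = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x))            # normalized frequencies in [0, 0.5]
    hi = freqs > f_cut                         # high-frequency band to randomize
    X[hi] = np.abs(X[hi]) * np.exp(1j * rng.uniform(0, 2 * np.pi, hi.sum()))
    return np.fft.irfft(X, n=len(x))

x = np.cumsum(np.random.randn(1024))           # toy non-stationary series
surrogate = band_phase_surrogate(x)            # trend kept, fine structure shuffled
```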
Brain tumor segmentation in MR slices using improved GrowCut algorithm
NASA Astrophysics Data System (ADS)
Ji, Chunhong; Yu, Jinhua; Wang, Yuanyuan; Chen, Liang; Shi, Zhifeng; Mao, Ying
2015-12-01
The detection of brain tumors from MR images is very significant for medical diagnosis and treatment. However, the existing methods are mostly based on manual or semiautomatic segmentation, which is awkward when dealing with a large number of MR slices. In this paper, a new fully automatic method for the segmentation of brain tumors in MR slices is presented. Based on the hypothesis of a symmetric brain structure, the method improves the interactive GrowCut algorithm by further using the bounding box algorithm in the pre-processing step. More importantly, local reflectional symmetry is used to make up for the deficiency of the bounding box method. After segmentation, a 3D tumor image is reconstructed. We evaluate the accuracy of the proposed method on MR slices with synthetic tumors and on actual clinical MR images. Results of the proposed method are compared qualitatively and quantitatively with the actual positions of the simulated 3D tumors. In addition, our automatic method produces performance equivalent to manual segmentation and to interactive GrowCut with manual interference, while providing fully automatic segmentation.
Data Pre-Processing for Label-Free Multiple Reaction Monitoring (MRM) Experiments
Chung, Lisa M.; Colangelo, Christopher M.; Zhao, Hongyu
2014-01-01
Multiple Reaction Monitoring (MRM) conducted on a triple quadrupole mass spectrometer allows researchers to quantify the expression levels of a set of target proteins. Each protein is often characterized by several unique peptides that can be detected by monitoring predetermined fragment ions, called transitions, for each peptide. Concatenating large numbers of MRM transitions into a single assay enables simultaneous quantification of hundreds of peptides and proteins. In recognition of the important role that MRM can play in hypothesis-driven research and its increasing impact on clinical proteomics, targeted proteomics such as MRM was recently selected as the Nature Method of the Year. However, there are many challenges in MRM applications, especially data pre‑processing where many steps still rely on manual inspection of each observation in practice. In this paper, we discuss an analysis pipeline to automate MRM data pre‑processing. This pipeline includes data quality assessment across replicated samples, outlier detection, identification of inaccurate transitions, and data normalization. We demonstrate the utility of our pipeline through its applications to several real MRM data sets. PMID:24905083
Residual Deep Convolutional Neural Network Predicts MGMT Methylation Status.
Korfiatis, Panagiotis; Kline, Timothy L; Lachance, Daniel H; Parney, Ian F; Buckner, Jan C; Erickson, Bradley J
2017-10-01
Predicting the methylation status of the O6-methylguanine methyltransferase (MGMT) gene from MRI is of high importance, since it is a predictor of response and prognosis in brain tumors. In this study, we compare three different residual deep neural network (ResNet) architectures to evaluate their ability to predict MGMT methylation status without the need for a distinct tumor segmentation step. We found that the ResNet50 (50 layers) architecture was the best performing model, achieving an accuracy of 94.90% (+/- 3.92%) on the test set (classification of a slice as no tumor, methylated MGMT, or non-methylated). ResNet34 (34 layers) achieved 80.72% (+/- 13.61%), while the ResNet18 (18 layers) accuracy was 76.75% (+/- 20.67%). ResNet50 performance was statistically significantly better than both the ResNet18 and ResNet34 architectures (p < 0.001). We report a method that alleviates the need for extensive preprocessing and acts as a proof of concept that deep neural architectures can be used to predict molecular biomarkers from routine medical images.
Li, Guanghui; Luo, Jiawei; Xiao, Qiu; Liang, Cheng; Ding, Pingjian
2018-05-12
Interactions between microRNAs (miRNAs) and diseases can yield important information for uncovering novel prognostic markers. Since experimental determination of disease-miRNA associations is time-consuming and costly, attention has been given to designing efficient and robust computational techniques for identifying undiscovered interactions. In this study, we present a label propagation model with linear neighborhood similarity, called LPLNS, to predict unobserved miRNA-disease associations. Additionally, a preprocessing step is performed to derive new interaction likelihood profiles that will contribute to the prediction, since new miRNAs and diseases lack known associations. Our results demonstrate that the LPLNS model based on the known disease-miRNA associations could achieve impressive performance with an AUC of 0.9034. Furthermore, we observed that the LPLNS model based on new interaction likelihood profiles could improve the performance to an AUC of 0.9127, which was better than other comparable methods. In addition, case studies also demonstrated our method's outstanding performance for inferring undiscovered interactions between miRNAs and diseases, especially for novel diseases.
Tan, Weng Chun; Mat Isa, Nor Ashidi
2016-01-01
In human sperm motility analysis, sperm segmentation plays an important role in determining the locations of multiple sperms. To ensure an improved segmentation result, the Laplacian of Gaussian filter is implemented as a kernel in a pre-processing step before applying the image segmentation process to automatically segment and detect human spermatozoa. This study proposes an intersecting cortical model (ICM), which was derived from several visual cortex models, to segment the sperm head region. However, the proposed method suffers from sensitivity to parameter selection; thus, the ICM network is optimised using particle swarm optimization, where feature mutual information is introduced as the new fitness function. The final results showed that the proposed method is more accurate and robust than four state-of-the-art segmentation methods. The proposed method achieved rates of 98.14%, 98.82%, 86.46% and 99.81% in accuracy, sensitivity, specificity and precision, respectively, after testing with 1200 sperms. The proposed algorithm is expected to be implemented in analysing sperm motility because of its robustness and capability.
2017-01-01
Retinal blood vessels have a significant role in the diagnosis and treatment of various retinal diseases such as diabetic retinopathy, glaucoma, arteriosclerosis, and hypertension. For this reason, retinal vasculature extraction is important in order to help specialists in the diagnosis and treatment of systemic diseases. In this paper, a novel approach is developed to extract the retinal blood vessel network. Our method comprises four stages: (1) a preprocessing stage to prepare the dataset for segmentation; (2) an enhancement procedure including Gabor, Frangi, and Gauss filters, obtained separately before a top-hat transform; (3) a hard and soft clustering stage, which includes K-means and Fuzzy C-means (FCM), to obtain a binary vessel map; and (4) a postprocessing step which removes falsely segmented isolated regions. The method was tested on color retinal images obtained from the STARE and DRIVE databases, which are available online. As a result, the Gabor filter followed by K-means clustering achieves 95.94% and 95.71% accuracy for the STARE and DRIVE databases, respectively, which is acceptable for diagnosis systems. PMID:29065611
Complexity-reduced implementations of complete and null-space-based linear discriminant analysis.
Lu, Gui-Fu; Zheng, Wenming
2013-10-01
Dimensionality reduction has become an important data preprocessing step in many applications. Linear discriminant analysis (LDA) is one of the most well-known dimensionality reduction methods. However, classical LDA cannot be used directly in the small sample size (SSS) problem, where the within-class scatter matrix is singular. In the past, many generalized LDA methods have been reported to address the SSS problem. Among these methods, complete linear discriminant analysis (CLDA) and null-space-based LDA (NLDA) provide good performance. The existing implementations of CLDA are computationally expensive. In this paper, we propose a new and fast implementation of CLDA, which is theoretically equivalent to the existing implementations but is the most efficient. Since CLDA is an extension of NLDA, our implementation of CLDA also provides a fast implementation of NLDA. Experiments on several real-world data sets demonstrate the effectiveness of our proposed CLDA and NLDA algorithms. Copyright © 2013 Elsevier Ltd. All rights reserved.
MATRIX FACTORIZATION-BASED DATA FUSION FOR GENE FUNCTION PREDICTION IN BAKER’S YEAST AND SLIME MOLD
ŽITNIK, MARINKA; ZUPAN, BLAŽ
2014-01-01
The development of effective methods for the characterization of gene functions that are able to combine diverse data sources in a sound and easily-extendible way is an important goal in computational biology. We have previously developed a general matrix factorization-based data fusion approach for gene function prediction. In this manuscript, we show that this data fusion approach can be applied to gene function prediction and that it can fuse various heterogeneous data sources, such as gene expression profiles, known protein annotations, interaction and literature data. The fusion is achieved by simultaneous matrix tri-factorization that shares matrix factors between sources. We demonstrate the effectiveness of the approach by evaluating its performance on predicting ontological annotations in slime mold D. discoideum and on recognizing proteins of baker’s yeast S. cerevisiae that participate in the ribosome or are located in the cell membrane. Our approach achieves predictive performance comparable to that of the state-of-the-art kernel-based data fusion, but requires fewer data preprocessing steps. PMID:24297565
Pacini, Clare; Ajioka, James W; Micklem, Gos
2017-04-12
Correlation matrices are important in inferring relationships and networks between regulatory or signalling elements in biological systems. With currently available technology, sample sizes for experiments are typically small, meaning that these correlations can be difficult to estimate. At a genome-wide scale, estimation of correlation matrices can also be computationally demanding. We develop an empirical Bayes approach to improve covariance estimates for gene expression, where we assume the covariance matrix takes a block diagonal form. Our method shows lower false discovery rates than existing methods on simulated data. Applied to a real data set from Bacillus subtilis, we demonstrate its ability to detect known regulatory units and interactions between them. We demonstrate that, compared to existing methods, our method is able to find significant covariances and also to control false discovery rates, even when the sample size is small (n=10). The method can be used to find potential regulatory networks, and it may also be used as a pre-processing step for methods that calculate, for example, partial correlations, so enabling the inference of the causal and hierarchical structure of the networks.
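The block-diagonal empirical Bayes estimator itself is not shown here; as a hedged, standard alternative for the same small-sample problem, the sketch below uses Ledoit-Wolf shrinkage, which likewise regularizes the sample covariance when the sample size is much smaller than the number of genes.

```python
import numpy as np
from sklearn.covariance import LedoitWolf

# Simulated small-sample expression data: n=10 samples, p=50 genes,
# standing in for a real expression matrix
rng = np.random.default_rng(0)
X = rng.normal(size=(10, 50))

# The sample covariance is singular for n < p; shrinkage keeps the
# estimate well-conditioned, the same goal as the empirical Bayes method
lw = LedoitWolf().fit(X)
cov = lw.covariance_      # shrunk covariance estimate
print(lw.shrinkage_)      # shrinkage intensity toward the scaled identity
```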
MOtoNMS: A MATLAB toolbox to process motion data for neuromusculoskeletal modeling and simulation.
Mantoan, Alice; Pizzolato, Claudio; Sartori, Massimo; Sawacha, Zimi; Cobelli, Claudio; Reggiani, Monica
2015-01-01
Neuromusculoskeletal modeling and simulation enable investigation of the neuromusculoskeletal system and its role in human movement dynamics. These methods are progressively being introduced into daily clinical practice. However, a major factor limiting this translation is the lack of robust tools for the pre-processing of experimental movement data for use in neuromusculoskeletal modeling software. This paper presents MOtoNMS (matlab MOtion data elaboration TOolbox for NeuroMusculoSkeletal applications), a toolbox freely available to the community that aims to fill this gap. MOtoNMS processes experimental data from different motion analysis devices and generates input data for neuromusculoskeletal modeling and simulation software, such as OpenSim and CEINMS (Calibrated EMG-Informed NMS Modelling Toolbox). MOtoNMS implements commonly required processing steps, and its generic architecture simplifies the integration of new user-defined processing components. MOtoNMS allows users to set up their laboratory configurations and processing procedures through user-friendly graphical interfaces, without requiring advanced computer skills. Finally, configuration choices can be stored, enabling the full reproduction of the processing steps. MOtoNMS is released under the GNU General Public License and is available at the SimTK website and from the GitHub repository. Motion data collected at four institutions demonstrate that, despite differences in laboratory instrumentation and procedures, MOtoNMS succeeds in processing data and producing consistent inputs for OpenSim and CEINMS. MOtoNMS fills the gap between motion analysis and neuromusculoskeletal modeling and simulation. Its support for several devices, complete implementation of the pre-processing procedures, simple extensibility, available user interfaces, and free availability can boost the translation of neuromusculoskeletal methods into daily and clinical practice.
NASA Astrophysics Data System (ADS)
Jepsen, Morten Leth; Harmsen, Charlotte; Godbole, Adwait Anand; Nagaraja, Valakunja; Knudsen, Birgitta R.; Ho, Yi-Ping
2015-12-01
We present a quantum dot based DNA nanosensor specifically targeting the cleavage step in the reaction cycle of the essential DNA-modifying enzyme, mycobacterial topoisomerase I. The design takes advantage of the unique photophysical properties of quantum dots to generate visible fluorescence recovery upon specific cleavage by mycobacterial topoisomerase I. This report, for the first time, demonstrates the possibility to quantify the cleavage activity of the mycobacterial enzyme without pre-processing sample purification or post-processing signal amplification. The cleavage induced signal response has also proven reliable in biological matrices, such as whole cell extracts prepared from Escherichia coli and human Caco-2 cells. It is expected that the assay may contribute to the clinical diagnostics of bacterial diseases, as well as the evaluation of treatment outcomes. Electronic supplementary information (ESI) available: Characterization of the QD-based DNA Nanosensor. See DOI: 10.1039/c5nr06326d
Barreto, Goncalo; Soininen, Antti; Sillat, Tarvo; Konttinen, Yrjö T; Kaivosoja, Emilia
2014-01-01
Time-of-flight secondary ion mass spectrometry (ToF-SIMS) is increasingly being used in the analysis of biological samples. For example, it has been applied to distinguish healthy and osteoarthritic human cartilage. This chapter discusses the ToF-SIMS principle and instrumentation, including the three modes of analysis in ToF-SIMS. ToF-SIMS sets certain requirements for the samples to be analyzed; for example, the samples have to be vacuum compatible. Accordingly, sample processing steps for different biological samples, i.e., proteins, cells, frozen and paraffin-embedded tissues, and extracellular matrix, are presented. Multivariate analysis of ToF-SIMS data and the necessary data preprocessing steps (peak selection, data normalization, mean-centering, and scaling and transformation) are also discussed in this chapter.
Gaia DR2 documentation Chapter 3: Astrometry
NASA Astrophysics Data System (ADS)
Hobbs, D.; Lindegren, L.; Bastian, U.; Klioner, S.; Butkevich, A.; Stephenson, C.; Hernandez, J.; Lammers, U.; Bombrun, A.; Mignard, F.; Altmann, M.; Davidson, M.; de Bruijne, J. H. J.; Fernández-Hernández, J.; Siddiqui, H.; Utrilla Molina, E.
2018-04-01
This chapter of the Gaia DR2 documentation describes the models and processing steps used for the astrometric core solution, namely the Astrometric Global Iterative Solution (AGIS). The inputs to this solution rely heavily on the basic observables (or astrometric elementaries), which have been pre-processed and discussed in Chapter 2, the results of which were published in Fabricius et al. (2016). The models consist of reference systems and time scales; assumed linear stellar motion and relativistic light deflection; and fundamental constants and the transformation of coordinate systems. Higher-level inputs, such as planetary and solar system ephemerides, Gaia tracking and orbit information, and initial quasar catalogues and BAM data, are all needed for the processing described here. The astrometric calibration models are outlined, followed by the detailed processing steps which give AGIS its name. We also present a basic quality assessment and validation of the scientific results (for details, see Lindegren et al. 2018).
Hamilton, Liberty S.; Chang, David L.; Lee, Morgan B.; Chang, Edward F.
2017-01-01
In this article, we introduce img_pipe, our open source python package for preprocessing of imaging data for use in intracranial electrocorticography (ECoG) and intracranial stereo-EEG analyses. The process of electrode localization, labeling, and warping for use in ECoG currently varies widely across laboratories, and it is usually performed with custom, lab-specific code. This python package aims to provide a standardized interface for these procedures, as well as code to plot and display results on 3D cortical surface meshes. It gives the user an easy interface to create anatomically labeled electrodes that can also be warped to an atlas brain, starting with only a preoperative T1 MRI scan and a postoperative CT scan. We describe the full capabilities of our imaging pipeline and present a step-by-step protocol for users. PMID:29163118
The Overgrid Interface for Computational Simulations on Overset Grids
NASA Technical Reports Server (NTRS)
Chan, William M.; Kwak, Dochan (Technical Monitor)
2002-01-01
Computational simulations using overset grids typically involve multiple steps and a variety of software modules. A graphical interface called OVERGRID has been specially designed for such purposes. Data required and created by the different steps include geometry, grids, domain connectivity information and flow solver input parameters. The interface provides a unified environment for the visualization, processing, generation and diagnosis of such data. General modules are available for the manipulation of structured grids and unstructured surface triangulations. Modules more specific for the overset approach include surface curve generators, hyperbolic and algebraic surface grid generators, a hyperbolic volume grid generator, Cartesian box grid generators, and domain connectivity pre-processing tools. An interface provides automatic selection and viewing of flow solver boundary conditions, and various other flow solver inputs. For problems involving multiple components in relative motion, a module is available to build the component/grid relationships and to prescribe and animate the dynamics of the different components.
Mathijssen, N M C; Sturm, P D; Pilot, P; Bloem, R M; Buma, P; Petit, P L; Schreurs, B W
2013-12-01
With bone impaction grafting, cancellous bone chips made from allograft femoral heads are impacted in a bone defect, which introduces an additional source of infection. The potential benefit of the use of pre-processed bone chips was investigated by comparing the bacterial contamination of bone chips prepared intraoperatively with the bacterial contamination of pre-processed bone chips at different stages in the surgical procedure. To investigate baseline contamination of the bone grafts, specimens were collected during 88 procedures before actual use or preparation of the bone chips: in 44 procedures intraoperatively prepared chips were used (Group A) and in the other 44 procedures pre-processed bone chips were used (Group B). In 64 of these procedures (32 using locally prepared bone chips and 32 using pre-processed bone chips) specimens were also collected later in the procedure to investigate contamination after use and preparation of the bone chips. In total, 8 procedures had one or more positive specimen(s) (12.5 %). Contamination rates were not significantly different between bone chips prepared at the operating theatre and pre-processed bone chips. In conclusion, there was no difference in bacterial contamination between bone chips prepared from whole femoral heads in the operating room and pre-processed bone chips, and therefore, both types of bone allografts are comparable with respect to risk of infection.
Gifford, René H; Revit, Lawrence J
2010-01-01
Although cochlear implant patients are achieving increasingly higher levels of performance, speech perception in noise continues to be problematic. The newest generations of implant speech processors are equipped with preprocessing and/or external accessories that are purported to improve listening in noise. Most speech perception measures in the clinical setting, however, do not provide a close approximation to real-world listening environments. To assess speech perception for adult cochlear implant recipients in the presence of a realistic restaurant simulation generated by an eight-loudspeaker (R-SPACE) array in order to determine whether commercially available preprocessing strategies and/or external accessories yield improved sentence recognition in noise. Single-subject, repeated-measures design with two groups of participants: Advanced Bionics and Cochlear Corporation recipients. Thirty-four subjects, ranging in age from 18 to 90 yr (mean 54.5 yr), participated in this prospective study. Fourteen subjects were Advanced Bionics recipients, and 20 subjects were Cochlear Corporation recipients. Speech reception thresholds (SRTs) in semidiffuse restaurant noise originating from an eight-loudspeaker array were assessed with the subjects' preferred listening programs as well as with the addition of either Beam preprocessing (Cochlear Corporation) or the T-Mic accessory option (Advanced Bionics). In Experiment 1, adaptive SRTs with the Hearing in Noise Test sentences were obtained for all 34 subjects. For Cochlear Corporation recipients, SRTs were obtained with their preferred everyday listening program as well as with the addition of Focus preprocessing. For Advanced Bionics recipients, SRTs were obtained with the integrated behind-the-ear (BTE) mic as well as with the T-Mic. Statistical analysis using a repeated-measures analysis of variance (ANOVA) evaluated the effects of the preprocessing strategy or external accessory in reducing the SRT in noise. In addition, a standard t-test was run to evaluate effectiveness across manufacturers for improving the SRT in noise. In Experiment 2, 16 of the 20 Cochlear Corporation subjects were reassessed, obtaining an SRT in noise using the manufacturer-suggested "Everyday," "Noise," and "Focus" preprocessing strategies. A repeated-measures ANOVA was employed to assess the effects of preprocessing. The primary findings were (i) both Noise and Focus preprocessing strategies (Cochlear Corporation) significantly improved the SRT in noise as compared to Everyday preprocessing, (ii) the T-Mic accessory option (Advanced Bionics) significantly improved the SRT as compared to the BTE mic, and (iii) Focus preprocessing and the T-Mic resulted in similar degrees of improvement that were not found to be significantly different from one another. Options available in current cochlear implant sound processors are able to significantly improve speech understanding in a realistic, semidiffuse noise with both Cochlear Corporation and Advanced Bionics systems. For Cochlear Corporation recipients, Focus preprocessing yields the best speech-recognition performance in a complex listening environment; however, it is recommended that Noise preprocessing be used as the new default for everyday listening environments to avoid the need for switching programs throughout the day. For Advanced Bionics recipients, the T-Mic offers significantly improved performance in noise and is recommended for everyday use in all listening environments. American Academy of Audiology.
Díaz, Gloria; González, Fabio A; Romero, Eduardo
2009-04-01
Visual quantification of parasitemia in thin blood films is a very tedious, subjective and time-consuming task. This study presents an original method for quantification and classification of erythrocytes in stained thin blood films infected with Plasmodium falciparum. The proposed approach is composed of three main phases: a preprocessing step, which corrects luminance differences; a segmentation step, which uses the normalized RGB color space to classify pixels as either erythrocyte or background, followed by an Inclusion-Tree representation that structures the pixel information into objects, from which erythrocytes are found; and finally, a two-step classification process that identifies infected erythrocytes and differentiates the infection stage, using a trained bank of classifiers. Additionally, user intervention is allowed when the approach cannot make a proper decision. Four hundred fifty malaria images were used for training and evaluating the method. Automatic identification of infected erythrocytes showed a specificity of 99.7% and a sensitivity of 94%. The infection stage was determined with an average sensitivity of 78.8% and average specificity of 91.2%.
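A minimal sketch of the normalized RGB transform used in the segmentation phase; the erythrocyte/background decision rule is only indicated, since the paper's trained classifiers are not reproduced here and the threshold shown is illustrative.

```python
import numpy as np

def normalized_rgb(image):
    """Map an RGB image (H, W, 3) to normalized rgb chromaticity,
    which suppresses the luminance differences corrected in preprocessing."""
    img = image.astype(float)
    total = img.sum(axis=2, keepdims=True)
    total[total == 0] = 1.0        # guard against division by zero
    return img / total             # r + g + b = 1 at every pixel

# Hypothetical decision rule: pixels whose red chromaticity exceeds a
# threshold are marked as erythrocyte candidates (threshold is illustrative).
# candidates = normalized_rgb(image)[..., 0] > 0.42
```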
Optimization of an incubation step to maximize sulforaphane content in pre-processed broccoli.
Mahn, Andrea; Pérez, Carmen
2016-11-01
Sulforaphane is a powerful anticancer compound, found naturally in food, which comes from the hydrolysis of glucoraphanin, the main glucosinolate of broccoli. The aim of this work was to maximize the sulforaphane content in broccoli by designing an incubation step after subjecting broccoli pieces to an optimized blanching step. Incubation was optimized through a Box-Behnken design using ascorbic acid concentration, incubation temperature and incubation time as factors. The optimal incubation conditions were 38 °C for 3 h with 0.22 mg ascorbic acid per g fresh broccoli. The maximum sulforaphane concentration predicted by the model was 8.0 µmol g(-1), which was confirmed experimentally, yielding a value of 8.1 ± 0.3 µmol g(-1). This represents a 585% increase with respect to fresh broccoli and a 119% increase in relation to blanched broccoli, equivalent to a conversion of 94% of glucoraphanin. The process proposed here allows maximizing sulforaphane content, thus avoiding artificial chemical synthesis. The compound could probably be isolated from broccoli, and may find application as a nutraceutical or functional ingredient.
Comparisons and Selections of Features and Classifiers for Short Text Classification
NASA Astrophysics Data System (ADS)
Wang, Ye; Zhou, Zhi; Jin, Shan; Liu, Debin; Lu, Mi
2017-10-01
Short text is considerably different from traditional long text documents due to its shortness and conciseness, which somewhat hinders the application of conventional machine learning and data mining algorithms to short text classification. Following traditional artificial intelligence methods, we divide short text classification into three steps, namely preprocessing, feature selection and classifier comparison. In this paper, we illustrate step-by-step how we approach our goals. Specifically, in feature selection we compared the performance and robustness of four methods: one-hot encoding, tf-idf weighting, word2vec and paragraph2vec; in the classification part, we chose and compared Naive Bayes, Logistic Regression, Support Vector Machine, K-nearest Neighbor and Decision Tree as our classifiers. We then compared and analysed the classifiers horizontally with each other and vertically against the feature selections. Regarding the datasets, we crawled more than 400,000 short text files from the Shanghai and Shenzhen Stock Exchanges and manually labeled them at two levels of granularity, the big classes and the small classes: there are eight labels in the big classes and 59 labels in the small classes.
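For illustration, a compact scikit-learn sketch of the tf-idf plus classifier comparison described above; the corpus here is a tiny placeholder, not the stock-exchange data set.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Tiny placeholder corpus; in practice these would be the labeled
# stock-exchange announcements described above.
texts = ["profit rises sharply", "dividend increase announced",
         "earnings beat forecast", "plant closure announced",
         "chief executive resigns", "regulator opens probe"]
labels = [1, 1, 1, 0, 0, 0]

# Compare several classifiers on the same tf-idf features
for clf in (MultinomialNB(), LogisticRegression(max_iter=1000), LinearSVC()):
    pipe = make_pipeline(TfidfVectorizer(), clf)
    print(type(clf).__name__, cross_val_score(pipe, texts, labels, cv=3).mean())
```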
Text Classification for Organizational Researchers
Kobayashi, Vladimer B.; Mol, Stefan T.; Berkers, Hannah A.; Kismihók, Gábor; Den Hartog, Deanne N.
2017-01-01
Organizations are increasingly interested in classifying texts or parts thereof into categories, as this enables more effective use of their information. Manual procedures for text classification work well for up to a few hundred documents. However, when the number of documents is larger, manual procedures become laborious, time-consuming, and potentially unreliable. Techniques from text mining facilitate the automatic assignment of text strings to categories, making classification expedient, fast, and reliable, which creates potential for its application in organizational research. The purpose of this article is to familiarize organizational researchers with text mining techniques from machine learning and statistics. We describe the text classification process in several roughly sequential steps, namely training data preparation, preprocessing, transformation, application of classification techniques, and validation, and provide concrete recommendations at each step. To help researchers develop their own text classifiers, the R code associated with each step is presented in a tutorial. The tutorial draws from our own work on job vacancy mining. We end the article by discussing how researchers can validate a text classification model and the associated output. PMID:29881249
A survey of visual preprocessing and shape representation techniques
NASA Technical Reports Server (NTRS)
Olshausen, Bruno A.
1988-01-01
Many recent theories and methods proposed for visual preprocessing and shape representation are summarized. The survey brings together research from the fields of biology, psychology, computer science, electrical engineering, and most recently, neural networks. It was motivated by the need to preprocess images for a sparse distributed memory (SDM), but the techniques presented may also prove useful for applying other associative memories to visual pattern recognition. The material of this survey is divided into three sections: an overview of biological visual processing; methods of preprocessing (extracting parts of shape, texture, motion, and depth); and shape representation and recognition (form invariance, primitives and structural descriptions, and theories of attention).
Chancerel, Perrine; Bolland, Til; Rotter, Vera Susanne
2011-03-01
Waste electrical and electronic equipment (WEEE) contains gold in concentrations that are low but relevant from an environmental and economic point of view. After collection, WEEE is pre-processed in order to generate appropriate material fractions that are sent to the subsequent end-processing stages (recovery, reuse or disposal). The goal of this research is to quantify the overall recovery rates of pre-processing technologies used in Germany for the reference year 2007. To achieve this goal, facilities operating in Germany were listed and classified according to the technology they apply. Information on their processing capacity was gathered by evaluating statistical databases. Based on a literature review of experimental results for gold recovery rates of different pre-processing technologies, the German overall recovery rate of gold at the pre-processing level was quantified depending on the characteristics of the treated WEEE. The results reveal that, depending on the equipment group, pre-processing recovery rates of gold of 29 to 61% are achieved in Germany. Some practical recommendations to reduce the losses during pre-processing could be formulated. Defining mass-based recovery targets in the legislation does not set incentives to recover trace elements. Instead, the priorities for recycling could be defined based on other parameters, such as the environmental impacts of the materials. The implementation of measures to reduce the gold losses would also improve the recovery of several other non-ferrous metals such as tin, nickel, and palladium.
Monitoring the defoliation of hardwood forests in Pennsylvania using LANDSAT. [gypsy moth surveys
NASA Technical Reports Server (NTRS)
Dottavio, C. L.; Nelson, R. F.; Williams, D. L. (Principal Investigator)
1983-01-01
An automated system for conducting annual gypsy moth defoliation surveys using LANDSAT MSS data and digital processing techniques is described. A two-step preprocessing procedure was developed that uses multitemporal data sets representing forest canopy conditions before and after defoliation to create a digital image in which all nonforest cover types are eliminated or masked out of a LANDSAT image that exhibits insect defoliation. A temporal window for defoliation assessment was identified and a statewide data base was established. A data management system to interface image analysis software with the statewide data base was developed and a cost benefit analysis of this operational system was conducted.
Traffic sign classification with dataset augmentation and convolutional neural network
NASA Astrophysics Data System (ADS)
Tang, Qing; Kurnianggoro, Laksono; Jo, Kang-Hyun
2018-04-01
This paper presents a method for traffic sign classification using a convolutional neural network (CNN). In this method, we first convert the color image to grayscale and then normalize it to the range (-1, 1) as the preprocessing step. To increase the robustness of the classification model, we apply a dataset augmentation algorithm and create new images to train the model. To avoid overfitting, we utilize a dropout module before the last fully connected layer. To assess the performance of the proposed method, the German traffic sign recognition benchmark (GTSRB) dataset is utilized. Experimental results show that the method is effective in classifying traffic signs.
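A minimal sketch of the stated preprocessing step (grayscale conversion followed by normalization to (-1, 1)), assuming OpenCV is available; the CNN, augmentation and dropout stages are not reproduced.

```python
import numpy as np
import cv2  # OpenCV, assumed available for the color conversion

def preprocess(image_bgr):
    """Grayscale conversion followed by normalization to (-1, 1),
    mirroring the preprocessing step described above."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY).astype(np.float32)
    return gray / 127.5 - 1.0   # maps pixel values [0, 255] to [-1, 1]
```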
NASA Astrophysics Data System (ADS)
Thiebaut, C.; Perraud, L.; Delvit, J. M.; Latry, C.
2016-07-01
We present an on-board satellite implementation of a gradient-based (optical flow) algorithm for estimating the shifts between images of a Shack-Hartmann wave-front sensor on extended landscapes. The proposed algorithm has low complexity in comparison with classical correlation methods, which is a major advantage for use on-board a satellite at a high instrument data rate and in real time. The electronic board used for this implementation is designed for space applications and is composed of radiation-hardened software and hardware. The processing times of both the shift estimation and the pre-processing steps are compatible with on-board real-time computation.
Network community-detection enhancement by proper weighting
NASA Astrophysics Data System (ADS)
Khadivi, Alireza; Ajdari Rad, Ali; Hasler, Martin
2011-04-01
In this paper, we show how proper assignment of weights to the edges of a complex network can enhance the detection of communities and how it can circumvent the resolution limit and the extreme degeneracy problems associated with modularity. Our general weighting scheme takes advantage of graph-theoretic measures, and it introduces two heuristics for tuning its parameters. We use this weighting as a preprocessing step for the greedy modularity optimization algorithm of Newman to improve its performance. The results of experiments with our approach on computer-generated and real-world networks confirm that the proposed approach not only mitigates the problems of modularity but also improves the modularity optimization.
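The paper's weighting scheme and tuning heuristics are not reproduced here; the sketch below uses a common-neighbor edge weighting as an illustrative stand-in, followed by greedy modularity optimization in NetworkX.

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

def weight_by_common_neighbors(G):
    """Illustrative weighting: intra-community edges tend to have many
    common neighbors, so boosting their weight helps modularity methods."""
    for u, v in G.edges():
        cn = len(list(nx.common_neighbors(G, u, v)))
        G[u][v]["weight"] = 1.0 + cn
    return G

G = weight_by_common_neighbors(nx.karate_club_graph())
communities = greedy_modularity_communities(G, weight="weight")
print([sorted(c) for c in communities])
```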
Natural Resources Inventory and Land Evaluation in Switzerland
NASA Technical Reports Server (NTRS)
Haefner, H. (Principal Investigator)
1975-01-01
The author has identified the following significant results. A system was developed to operationally map and measure the areal extent of various land use categories, for updating existing thematic maps and producing new ones showing the latest state of rural and urban landscapes and their changes. The processing system includes: (1) preprocessing steps for radiometric and geometric corrections; (2) classification of the data by a multivariate procedure, using a stepwise linear discriminant analysis based on carefully selected training cells; and (3) output in the form of color maps, produced by printing black-and-white theme overlays at a selected scale on a Photomation system and then coloring and combining them into a color composite.
Pattern recognition invariant under changes of scale and orientation
NASA Astrophysics Data System (ADS)
Arsenault, Henri H.; Parent, Sebastien; Moisan, Sylvain
1997-08-01
We have used a modified version of the method proposed by Neiberg and Casasent to successfully classify five kinds of military vehicles. The method uses a wedge filter to achieve scale invariance, and lines in a multi-dimensional feature space correspond to each target with out-of-plane orientations over 360 degrees around a vertical axis. The images were not binarized, but were filtered in a preprocessing step to reduce aliasing. The feature vectors were normalized and orthogonalized by means of a neural network. Out-of-plane rotations of 360 degrees and scale changes of a factor of four were considered. Error-free classification was achieved.
Metric learning for automatic sleep stage classification.
Phan, Huy; Do, Quan; Do, The-Luan; Vu, Duc-Lung
2013-01-01
We introduce in this paper a metric learning approach for automatic sleep stage classification based on single-channel EEG data. We show that, by learning a global metric from training data instead of using the default Euclidean metric, the k-nearest neighbor classification rule outperforms state-of-the-art methods on the Sleep-EDF dataset under various classification settings. The overall accuracies for the Awake/Sleep and 4-class classification settings are 98.32% and 94.49%, respectively. Furthermore, this superior accuracy is achieved by performing classification on a low-dimensional feature space derived from the time and frequency domains, and without the need for artifact removal as a preprocessing step.
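The learned global metric is not reproduced here; as a hedged stand-in, the sketch below plugs a Mahalanobis metric (inverse feature covariance) into a k-nearest neighbor classifier, with placeholder features standing in for the EEG-derived ones.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Placeholder features standing in for the low-dimensional EEG features
rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 5))
y_train = rng.integers(0, 4, size=100)   # four sleep stages

# The Mahalanobis distance requires the inverse covariance of the features;
# a learned metric would replace this matrix with one optimized on labels.
VI = np.linalg.inv(np.cov(X_train.T))
knn = KNeighborsClassifier(n_neighbors=5, metric="mahalanobis",
                           metric_params={"VI": VI})
knn.fit(X_train, y_train)
predicted = knn.predict(X_train[:3])
```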
Morpheme Matching Based Text Tokenization for a Scarce Resourced Language
Rehman, Zobia; Anwar, Waqas; Bajwa, Usama Ijaz; Xuan, Wang; Chaoying, Zhou
2013-01-01
Text tokenization is a fundamental pre-processing step for almost all information processing applications. This task is nontrivial for scarce-resourced languages such as Urdu, as there is inconsistent use of space between words. In this paper a morpheme matching based approach is proposed for Urdu text tokenization, along with some other algorithms to solve the additional issues of boundary detection of compound words, affixation, reduplication, names and abbreviations. This study achieved 97.28% precision, 93.71% recall, and 95.46% F1-measure while tokenizing a corpus of 57,000 words using a morpheme list with 6,400 entries. PMID:23990871
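A minimal sketch of greedy longest-morpheme matching, the core idea above; the additional rules for compounds, affixation, reduplication, names and abbreviations are omitted.

```python
def tokenize(text, morphemes, max_len=12):
    """Greedy longest-match tokenization against a morpheme list.
    Falls back to single characters when no morpheme matches; the real
    system adds rules for compounds, affixes, reduplication and names."""
    vocab = set(morphemes)
    text = text.replace(" ", "")     # spaces are unreliable in Urdu text
    tokens, i = [], 0
    while i < len(text):
        # Try the longest candidate substring first, shrinking to length 1
        for j in range(min(len(text), i + max_len), i, -1):
            if text[i:j] in vocab or j == i + 1:
                tokens.append(text[i:j])
                i = j
                break
    return tokens
```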
Testing Saliency Parameters for Automatic Target Recognition
NASA Technical Reports Server (NTRS)
Pandya, Sagar
2012-01-01
A bottom-up visual attention model (the saliency model) is tested to enhance the performance of Automated Target Recognition (ATR). JPL has developed an ATR system that identifies regions of interest (ROI) using a trained OT-MACH filter, and then classifies potential targets as true- or false-positives using machine-learning techniques. In this project, saliency is used as a pre-processing step to reduce the search space for OT-MACH filtering. Saliency parameters, such as output level and orientation weight, are tuned to detect known target features. Preliminary results are promising, and future work entails a rigorous, parameter-based search to gain maximum insight into this method.
Matrix preconditioning: a robust operation for optical linear algebra processors.
Ghosh, A; Paparao, P
1987-07-15
Analog electrooptical processors are best suited for applications demanding high computational throughput with tolerance for inaccuracies. Matrix preconditioning is one such application. Matrix preconditioning is a preprocessing step for reducing the condition number of a matrix and is used extensively with gradient algorithms for increasing the rate of convergence and improving the accuracy of the solution. In this paper, we describe a simple parallel algorithm for matrix preconditioning, which can be implemented efficiently on a pipelined optical linear algebra processor. From the results of our numerical experiments we show that the efficacy of the preconditioning algorithm is affected very little by the errors of the optical system.
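The optical pipelined algorithm itself is not reproduced here; the numerical sketch below shows the effect being exploited, a Jacobi (diagonal) preconditioner speeding up conjugate gradient iterations on a symmetric positive definite system.

```python
import numpy as np

def jacobi_preconditioned_cg(A, b, tol=1e-8, max_iter=1000):
    """Conjugate gradient with a Jacobi (diagonal) preconditioner.
    Preconditioning lowers the condition number, so gradient-type
    solvers converge in fewer iterations."""
    M_inv = 1.0 / np.diag(A)            # diagonal preconditioner
    x = np.zeros_like(b)
    r = b - A @ x
    z = M_inv * r
    p = z.copy()
    for _ in range(max_iter):
        rz = r @ z
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol:
            break
        z = M_inv * r
        p = z + (r @ z / rz) * p        # beta = (r_new.z_new) / (r_old.z_old)
    return x

# Example: a diagonally dominant symmetric positive definite system
A = np.diag(np.arange(1.0, 101.0)) + 0.01
b = np.ones(100)
x = jacobi_preconditioned_cg(A, b)
```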
Promoting Spontaneous Second Harmonic Generation through Organogelation.
Marco, A Belén; Aparicio, Fátima; Faour, Lara; Iliopoulos, Konstantinos; Morille, Yohann; Allain, Magali; Franco, Santiago; Andreu, Raquel; Sahraoui, Bouchta; Gindre, Denis; Canevet, David; Sallé, Marc
2016-07-27
An organogelator based on the Disperse Red nonlinear optical chromophore was synthesized according to a simple and efficient three-step procedure. The supramolecular gel organization leads to xerogels which display a spontaneous second harmonic generation (SHG) response without any need for preprocessing, and this SHG activity appears to be stable over several months. These findings, based on an intrinsic structural approach, are supported by favorable intermolecular supramolecular interactions, which promote a locally non-centrosymmetric NLO-active organization. This is in sharp contrast with most materials designed for SHG purposes, which generally require the use of expensive or heavy-to-handle external techniques for managing the dipoles' alignment.
NASA Astrophysics Data System (ADS)
Tao, Feifei; Mba, Ogan; Liu, Li; Ngadi, Michael
2017-04-01
Polyunsaturated fatty acids (PUFAs) are important nutrients present in salmon. However, current methods for quantifying the fatty acid (FA) contents in foods are generally based on the gas chromatography (GC) technique, which is time-consuming, laborious and destructive to the tested samples. Therefore, the capability of near-infrared (NIR) hyperspectral imaging to predict the PUFA contents of C20:2 n-6, C20:3 n-6, C20:5 n-3, C22:5 n-3 and C22:6 n-3 in salmon fillets in a rapid and non-destructive way was investigated in this work. Mean reflectance spectra were first extracted from the regions of interest (ROIs), and then the spectral pre-processing methods of 2nd derivative and Savitzky-Golay (SG) smoothing were applied to the original spectra. Based on the original and the pre-processed spectra, the PLSR technique was employed to develop quantitative models for predicting each PUFA content in salmon fillets. The results showed that, for all the studied PUFAs, the quantitative models developed using the reflectance spectra pre-processed by "2nd derivative + SG smoothing" improved the modeling results. Good prediction results were achieved, with RP and RMSEP of 0.91 and 0.75 mg/g dry weight, 0.86 and 1.44 mg/g dry weight, and 0.82 and 3.01 mg/g dry weight for C20:3 n-6, C22:5 n-3 and C20:5 n-3, respectively, after pre-processing by "2nd derivative + SG smoothing". The work demonstrated that NIR hyperspectral imaging could be a useful tool for rapid and non-destructive determination of PUFA contents in fish fillets.
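A minimal sketch of the "2nd derivative + SG smoothing" pre-processing followed by PLSR, using SciPy and scikit-learn with placeholder spectra; the window length, polynomial order and component count are illustrative assumptions, not the paper's settings.

```python
import numpy as np
from scipy.signal import savgol_filter
from sklearn.cross_decomposition import PLSRegression

# Placeholder spectra: (samples, wavelengths); y is a PUFA content proxy
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 200))
y = rng.normal(size=60)

# "2nd derivative + SG smoothing": Savitzky-Golay filtering that returns
# the smoothed second derivative of every spectrum
X_pre = savgol_filter(X, window_length=11, polyorder=3, deriv=2, axis=1)

pls = PLSRegression(n_components=8).fit(X_pre, y)
y_hat = pls.predict(X_pre)
```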
Leaf Chlorophyll Content Estimation of Winter Wheat Based on Visible and Near-Infrared Sensors.
Zhang, Jianfeng; Han, Wenting; Huang, Lvwen; Zhang, Zhiyong; Ma, Yimian; Hu, Yamin
2016-03-25
The leaf chlorophyll content is one of the most important factors for the growth of winter wheat. Visible and near-infrared sensing is a quick and non-destructive testing technology for the estimation of crop leaf chlorophyll content. In this paper, a new approach is developed for leaf chlorophyll content estimation of winter wheat based on visible and near-infrared sensors. First, sliding window smoothing (SWS) was integrated with multiplicative scatter correction (MSC) or the standard normal variate transformation (SNV) to preprocess the reflectance spectra of wheat leaves. Then, a model for the relationship between the leaf relative chlorophyll content and the reflectance spectra was developed using partial least squares (PLS) and a back propagation neural network. A total of 300 samples from areas surrounding Yangling, China, were used for the experimental studies. The visible and near-infrared spectra over the 450-900 nm wavelength range were preprocessed using SWS, MSC and SNV. The experimental results indicate that preprocessing using SWS and SNV, followed by modeling with PLS, achieves the most accurate estimation, with a correlation coefficient of 0.8492 and a root mean square error of 1.7216. Thus, the proposed approach can be widely used for winter wheat chlorophyll content analysis.
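For illustration, minimal NumPy sketches of the SWS and SNV preprocessing steps named above; the window width is an assumption.

```python
import numpy as np

def sliding_window_smooth(spectra, window=5):
    """Sliding window (moving average) smoothing along the wavelength axis
    of a (samples, wavelengths) array."""
    kernel = np.ones(window) / window
    return np.apply_along_axis(
        lambda s: np.convolve(s, kernel, mode="same"), 1, spectra)

def snv(spectra):
    """Standard normal variate: center and scale each spectrum individually
    to suppress multiplicative scatter effects."""
    return (spectra - spectra.mean(axis=1, keepdims=True)) / \
           spectra.std(axis=1, keepdims=True)
```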
Pinsard, Basile; Boutin, Arnaud; Doyon, Julien; Benali, Habib
2018-01-01
Functional MRI acquisition is sensitive to subjects' motion, which cannot be fully constrained. Therefore, signal corrections have to be applied a posteriori in order to mitigate the complex interactions between changing tissue localization and magnetic fields, gradients and readouts. To circumvent the limitations of current preprocessing strategies, we developed an integrated method that corrects motion and spatial low-frequency intensity fluctuations at the level of each slice in order to better fit the acquisition processes. The registration of single or multiple simultaneously acquired slices is achieved online by an Iterated Extended Kalman Filter, favoring the robust estimation of continuous motion, while an intensity bias field is non-parametrically fitted. The proposed extraction of gray-matter BOLD activity from the acquisition space to an anatomical group template space, taking into account distortions, better preserves fine-scale patterns of activity. Importantly, the proposed unified framework generalizes to high-resolution multi-slice techniques. When tested on simulated and real data, the method shows a reduction in motion-explained variance and signal variability compared to the conventional preprocessing approach. These improvements provide more stable patterns of activity, facilitating investigation of cerebral information representation in healthy and/or clinical populations where motion is known to impact fine-scale data. PMID:29755312
Spectroscopic Diagnosis of Arsenic Contamination in Agricultural Soils
Shi, Tiezhu; Liu, Huizeng; Chen, Yiyun; Fei, Teng; Wang, Junjie; Wu, Guofeng
2017-01-01
This study investigated the abilities of pre-processing, feature selection and machine-learning methods for the spectroscopic diagnosis of soil arsenic contamination. The spectral data were pre-processed by using Savitzky-Golay smoothing, first and second derivatives, multiplicative scatter correction, standard normal variate, and mean centering. Principal component analysis (PCA) and the RELIEF algorithm were used to extract spectral features. Machine-learning methods, including random forests (RF), artificial neural network (ANN), and radial basis function-based and linear function-based support vector machines (RBF- and LF-SVM), were employed for establishing diagnosis models. The model accuracies were evaluated and compared by using overall accuracies (OAs). The statistical significance of the difference between models was evaluated by using McNemar's test (Z value). The results showed that the OAs varied with the different combinations of pre-processing, feature selection, and classification methods. Feature selection methods could improve the modeling efficiencies and diagnosis accuracies, and RELIEF often outperformed PCA. The optimal models established by RF (OA = 86%), ANN (OA = 89%), RBF- (OA = 89%) and LF-SVM (OA = 87%) had no statistical difference in diagnosis accuracies (Z < 1.96, p < 0.05). These results indicated that it was feasible to diagnose soil arsenic contamination using reflectance spectroscopy. The appropriate combination of multivariate methods was important to improve diagnosis accuracies. PMID:28471412
Improving performances of suboptimal greedy iterative biclustering heuristics via localization.
Erten, Cesim; Sözdinler, Melih
2010-10-15
Biclustering gene expression data is the problem of extracting submatrices of genes and conditions exhibiting significant correlation across both the rows and the columns of a data matrix of expression values. Even the simplest versions of the problem are computationally hard. Most of the proposed solutions therefore employ greedy iterative heuristics that locally optimize a suitably assigned scoring function. We provide a fast and simple pre-processing algorithm called localization that reorders the rows and columns of the input data matrix in such a way as to group correlated entries in small local neighborhoods within the matrix. The proposed localization algorithm takes its roots from effective use of graph-theoretical methods applied to problems exhibiting a similar structure to that of biclustering. In order to evaluate the effectiveness of the localization pre-processing algorithm, we focus on three representative greedy iterative heuristic methods. We show how the localization pre-processing can be incorporated into each representative algorithm to improve biclustering performance. Furthermore, we propose a simple biclustering algorithm, Random Extraction After Localization (REAL), that randomly extracts submatrices from the localization pre-processed data matrix, eliminates those with low similarity scores, and provides the rest as correlated structures representing biclusters. We compare the proposed localization pre-processing with another pre-processing alternative, non-negative matrix factorization. We show that our fast and simple localization procedure provides similar or even better results than the computationally heavy matrix factorization pre-processing with regards to H-value tests. We next demonstrate that the performances of the three representative greedy iterative heuristic methods improve with localization pre-processing when biological correlations in the form of functional enrichment and PPI verification constitute the main performance criteria. The fact that the random extraction method based on localization (REAL) performs better than the representative greedy heuristic methods under the same criteria also confirms the effectiveness of the suggested pre-processing method. Supplementary material, including code implementations in the LEDA C++ library, experimental data, and the results, is available at http://code.google.com/p/biclustering/. Contact: cesim@khas.edu.tr; melihsozdinler@boun.edu.tr. Supplementary data are available at Bioinformatics online.
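The localization algorithm itself is not reproduced here; in its spirit, the sketch below reorders rows and columns by the Fiedler vector of a correlation-based similarity graph, a standard graph-theoretic way to group correlated entries into local neighborhoods.

```python
import numpy as np

def spectral_order(X):
    """Order the rows of X by the Fiedler vector of a correlation-based
    similarity graph, grouping correlated rows together."""
    S = np.abs(np.corrcoef(X))             # row-row similarity
    L = np.diag(S.sum(axis=1)) - S         # graph Laplacian
    _, vecs = np.linalg.eigh(L)
    return np.argsort(vecs[:, 1])          # sort by the Fiedler vector

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 20))              # placeholder expression matrix
X_local = X[spectral_order(X)][:, spectral_order(X.T)]  # rows, then columns
```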
Influence of California-style black ripe olive processing on the formation of acrylamide.
Charoenprasert, Suthawan; Mitchell, Alyson
2014-08-27
Methods used in processing California-style black ripe olives generate acrylamide. California-style black ripe olives contain higher levels of acrylamide (409.67 ± 42.60-511.91 ± 34.08 μg kg(-1)) as compared to California-style green ripe olives (44.02 ± 3.55-105.79 ± 22.01 μg kg(-1)), Greek olives (<1.42 μg kg(-1)), and Spanish olives (not detected), indicating that the higher temperatures used to sterilize the California-style green ripe and black ripe olives are required for acrylamide formation. Preprocessing brine storage influenced the formation of acrylamide in a time-dependent manner. Acrylamide increased during the first 30 days of storage. Longer brine storage times (>30 days) result in lower acrylamide levels in the finished product. The presence of calcium ions in the preprocessing brining solution results in higher levels of acrylamide in finished products. Air oxidation during lye processing and the neutralization of olives prior to sterilization significantly increase the formation of acrylamide in the finished products. Conversely, lye-processing decreases the levels of acrylamide in the final product. These results indicate that specific steps in the California-style black ripe olive processing may be manipulated to mitigate the formation of acrylamide in finished products.
Techno-Economic Assessment for Integrating Biosorption into Rare Earth Recovery Process
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jiao, Yongqin; Sutherland, John; Jin, Hongyue
The current uncertainty in the global supply of rare earth elements (REEs) necessitates the development of novel extraction technologies that utilize a variety of REE source materials. Herein, we examined the techno-economic performance of integrating a biosorption approach into a large-scale process for producing salable total rare earth oxides (TREOs) from various feedstocks. An airlift bioreactor is proposed to carry out a biosorption process mediated by bioengineered rare earth-adsorbing bacteria. Techno-economic assessments were compared for three distinctive categories of REE feedstocks requiring different pre-processing steps. Key parameters identified that affect profitability include REE concentration, composition of the feedstock, and costs of feedstock pretreatment and waste management. Among the 11 specific feedstocks investigated, coal ash from the Appalachian Basin was projected to be the most profitable, largely due to its high-value REE content. Its cost breakdown includes pre-processing (primarily leaching) (77.71-80%), biosorption (16-19.04%), and oxalic acid precipitation and TREO roasting (3.35%). Surprisingly, biosorption from the high-grade Bull Hill REE ore is less profitable due to high material cost and low production revenue. Overall, our results confirmed that the application of biosorption to low-grade feedstocks for REE recovery is economically viable.
Medina, Christopher S; Manifold-Wheeler, Brett; Gonzales, Aaron; Bearer, Elaine L
2017-07-05
Magnetic resonance (MR) imaging provides a method to obtain anatomical information from the brain in vivo that is not typically available by optical imaging because of this organ's opacity. MR is nondestructive and obtains deep tissue contrast with 100-µm3 voxel resolution or better. Manganese-enhanced MRI (MEMRI) may be used to observe axonal transport and localized neural activity in the living rodent and avian brain. Such enhancement enables researchers to investigate differences in functional circuitry or neuronal activity in images of brains of different animals. Moreover, once MR images of a number of animals are aligned into a single matrix, statistical analysis can be done comparing MR intensities between different multi-animal cohorts comprising individuals from different mouse strains or different transgenic animals, or at different time points after an experimental manipulation. Although preprocessing steps for such comparisons (including skull stripping and alignment) are automated for human imaging, no such automated processing has previously been readily available for mouse or other widely used experimental animals, and most investigators use in-house custom processing. This protocol describes a stepwise method to perform such preprocessing for mouse. © 2017 by John Wiley & Sons, Inc.
Kumar, Keshav
2018-03-01
Excitation-emission matrix fluorescence (EEMF) and total synchronous fluorescence spectroscopy (TSFS) are two fluorescence techniques that are commonly used for the analysis of multifluorophoric mixtures. These two fluorescence techniques are conceptually different and provide certain advantages over each other. Manual analysis of such large volumes of highly correlated EEMF and TSFS data towards developing a calibration model is difficult. Partial least squares (PLS) analysis can analyze large volumes of EEMF and TSFS data sets by finding important factors that maximize the correlation between the spectral and concentration information for each fluorophore. However, applying PLS analysis to entire data sets often does not provide a robust calibration model and requires a suitable pre-processing step. The present work evaluates the application of genetic algorithm (GA) analysis prior to PLS analysis on EEMF and TSFS data sets towards improving the precision and accuracy of the calibration model. The GA essentially combines the advantages provided by stochastic methods with those provided by deterministic approaches, and can find the set of EEMF and TSFS variables that correlate well with the concentration of each of the fluorophores present in the multifluorophoric mixtures. The utility of the GA-assisted PLS analysis is successfully validated using (i) EEMF data sets acquired for dilute aqueous mixtures of four biomolecules and (ii) TSFS data sets acquired for dilute aqueous mixtures of four carcinogenic polycyclic aromatic hydrocarbons (PAHs). In the present work, it is shown that by using the GA it is possible to significantly improve the accuracy and precision of the PLS calibration models developed for both EEMF and TSFS data sets. Hence, the GA can be considered a useful pre-processing technique while developing EEMF and TSFS calibration models.
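A toy sketch of GA-based variable selection wrapped around PLS, with binary masks as chromosomes and cross-validated R² as fitness; the population size, mutation rate and all data are illustrative assumptions, not the authors' settings.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import cross_val_score

def ga_select(X, y, pop=20, gens=30, p_mut=0.02, seed=0):
    """Toy GA for fluorescence-variable selection before PLS: chromosomes
    are boolean masks over spectral variables; fitness is the mean
    cross-validated R^2 of a PLS model restricted to the selected variables."""
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    P = rng.random((pop, n)) < 0.3               # initial random population

    def fitness(mask):
        if mask.sum() < 4:
            return -np.inf                       # too few variables to model
        model = PLSRegression(n_components=4)
        return cross_val_score(model, X[:, mask], y, cv=3).mean()

    for _ in range(gens):
        scores = np.array([fitness(m) for m in P])
        top = P[np.argsort(scores)[-(pop // 2):]]        # keep better half
        kids = top[rng.integers(0, len(top), pop - len(top))].copy()
        kids ^= rng.random(kids.shape) < p_mut           # bit-flip mutation
        P = np.vstack([top, kids])
    return P[np.argmax([fitness(m) for m in P])]         # best mask found

# Synthetic usage: y depends on only two of fifty spectral variables
rng = np.random.default_rng(1)
X = rng.normal(size=(30, 50))
y = X[:, 5] - 2 * X[:, 17] + rng.normal(scale=0.1, size=30)
best_mask = ga_select(X, y)
```

Crossover is omitted for brevity; a full GA-PLS implementation would add it along with elitism and convergence checks.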
Recognition of Indian Sign Language in Live Video
NASA Astrophysics Data System (ADS)
Singha, Joyeeta; Das, Karen
2013-05-01
Sign Language Recognition has emerged as one of the important areas of research in Computer Vision. The difficulty faced by researchers is that instances of signs vary with both motion and appearance. Thus, in this paper a novel approach for recognizing various alphabets of Indian Sign Language is proposed, where continuous video sequences of the signs are considered. The proposed system comprises three stages: preprocessing, feature extraction and classification. The preprocessing stage includes skin filtering and histogram matching. Eigenvalues and eigenvectors are used in the feature extraction stage, and finally an eigenvalue-weighted Euclidean distance is used to recognize the sign. The system deals with bare hands, thus allowing the user to interact with it in a natural way. We considered 24 different alphabets in the video sequences and attained a success rate of 96.25%.
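A minimal sketch of classification by an eigenvalue-weighted Euclidean distance; the exact weighting convention in the paper may differ, so the formula here is an assumption.

```python
import numpy as np

def eigen_weighted_distance(a, b, eigenvalues):
    """Euclidean distance between two feature vectors in the eigenvector
    basis, with each component weighted by its eigenvalue (assumed
    convention: larger eigenvalues contribute more)."""
    return np.sqrt(np.sum(eigenvalues * (a - b) ** 2))

def classify(test_vec, train_vecs, labels, eigenvalues):
    dists = [eigen_weighted_distance(test_vec, t, eigenvalues)
             for t in train_vecs]
    return labels[int(np.argmin(dists))]   # nearest training sign wins
```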
NASA Astrophysics Data System (ADS)
Fomin, Fedor V.
Preprocessing (data reduction or kernelization) as a strategy for coping with hard problems is universally used in almost every implementation. The history of preprocessing, such as applying reduction rules to simplify truth functions, can be traced back to the 1950's [6]. A natural question in this regard is how to measure the quality of preprocessing rules proposed for a specific problem. For a long time the mathematical analysis of polynomial time preprocessing algorithms was neglected. The basic reason for this anomaly was that, if we start with an instance I of an NP-hard problem and can show that in polynomial time we can replace it with an equivalent instance I' with |I'| < |I|, then that would imply P=NP in classical complexity.
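A concrete classical example of kernelization with a provable size bound is Buss's rule for k-Vertex Cover, sketched below: vertices of degree greater than k must join the cover, and a yes-instance that survives the reduction has at most k² edges.

```python
def buss_kernel(edges, k):
    """Buss's kernelization for k-Vertex Cover: any vertex of degree > k
    must be in the cover, so take it and decrease k; afterwards a
    yes-instance has at most k^2 edges. Returns (reduced edges, remaining k),
    or None when the instance is provably a no-instance."""
    edges = set(map(frozenset, edges))
    changed = True
    while changed and k >= 0:
        changed = False
        deg = {}
        for e in edges:
            for v in e:
                deg[v] = deg.get(v, 0) + 1
        for v, d in deg.items():
            if d > k:                      # v is forced into the cover
                edges = {e for e in edges if v not in e}
                k -= 1
                changed = True
                break
    if k < 0 or len(edges) > k * k:
        return None                        # reduced instance is too large
    return edges, k
```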
Automatic Nuclei Segmentation in H&E Stained Breast Cancer Histopathology Images
Veta, Mitko; van Diest, Paul J.; Kornegoor, Robert; Huisman, André; Viergever, Max A.; Pluim, Josien P. W.
2013-01-01
The introduction of fast digital slide scanners that provide whole slide images has led to a revival of interest in image analysis applications in pathology. Segmentation of cells and nuclei is an important first step towards automatic analysis of digitized microscopy images. We therefore developed an automated nuclei segmentation method that works with hematoxylin and eosin (H&E) stained breast cancer histopathology images, which represent regions of whole digital slides. The procedure can be divided into four main steps: 1) pre-processing with color unmixing and morphological operators, 2) marker-controlled watershed segmentation at multiple scales and with different markers, 3) post-processing for rejection of false regions and 4) merging of the results from multiple scales. The procedure was developed on a set of 21 breast cancer cases (subset A) and tested on a separate validation set of 18 cases (subset B). The evaluation was done in terms of both detection accuracy (sensitivity and positive predictive value) and segmentation accuracy (Dice coefficient). The mean estimated sensitivity for subset A was 0.875 (±0.092) and for subset B 0.853 (±0.077). The mean estimated positive predictive value was 0.904 (±0.075) and 0.886 (±0.069) for subsets A and B, respectively. For both subsets, the distribution of the Dice coefficients had a high peak around 0.9, with the vast majority of segmentations having values larger than 0.8. PMID:23922958
Multisensor data fusion across time and space
NASA Astrophysics Data System (ADS)
Villeneuve, Pierre V.; Beaven, Scott G.; Reed, Robert A.
2014-06-01
Field measurement campaigns typically deploy numerous sensors having different sampling characteristics in the spatial, temporal, and spectral domains. Data analysis and exploitation are made more difficult and time consuming when the sample data grids of the sensors do not align. This report summarizes our recent effort to demonstrate the feasibility of a processing chain capable of "fusing" image data from multiple independent and asynchronous sensors into a form amenable to analysis and exploitation using commercially available tools. Two important technical issues were addressed in this work: 1) image spatial registration onto a common pixel grid, and 2) image temporal interpolation onto a common time base. The first step leverages existing image matching and registration algorithms. The second step relies upon a new and innovative use of optical flow algorithms to perform accurate temporal upsampling of slower-frame-rate imagery. Optical flow field vectors were first derived from high-frame-rate, high-resolution imagery and then used as a basis for temporal upsampling of the slower-frame-rate sensor's imagery. Optical flow field values are computed using a multi-scale image pyramid, thus allowing for more extreme object motion. This involves preprocessing imagery to varying resolution scales and initializing new flow vector estimates with those from the previous coarser-resolution image. Overall performance of this processing chain is demonstrated using sample data involving complex motion observed by multiple sensors mounted to the same base, ranging from a high-speed visible camera to a coarser-resolution LWIR camera.
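As a rough illustration of the temporal-upsampling idea, the sketch below estimates dense optical flow between two frames and backward-warps one frame halfway along the flow field to synthesize a mid-time sample. It uses OpenCV's Farneback flow; the single-scale setup, parameter values, and synthetic frames are our assumptions, not the report's processing chain.

```python
import cv2
import numpy as np

# A minimal sketch of flow-based temporal upsampling: estimate dense flow
# from frame_a to frame_b, then sample frame_a half a flow-vector back to
# approximate the scene at the midpoint in time (t = 0.5).

def midpoint_frame(frame_a, frame_b):
    flow = cv2.calcOpticalFlowFarneback(frame_a, frame_b, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    h, w = frame_a.shape
    grid_x, grid_y = np.meshgrid(np.arange(w), np.arange(h))
    # Backward mapping: mid(p) ~ frame_a(p - 0.5 * flow(p)).
    map_x = (grid_x - 0.5 * flow[..., 0]).astype(np.float32)
    map_y = (grid_y - 0.5 * flow[..., 1]).astype(np.float32)
    return cv2.remap(frame_a, map_x, map_y, cv2.INTER_LINEAR)

a = np.zeros((64, 64), np.uint8); a[20:30, 20:30] = 255   # square at t=0
b = np.zeros((64, 64), np.uint8); b[20:30, 28:38] = 255   # square at t=1
mid = midpoint_frame(a, b)
# The bright region should land between its two observed positions.
print(mid.nonzero()[1].mean())
```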
Preprocessing of emotional visual information in the human piriform cortex.
Schulze, Patrick; Bestgen, Anne-Kathrin; Lech, Robert K; Kuchinke, Lars; Suchan, Boris
2017-08-23
This study examines the processing of visual information by the olfactory system in humans. Recent data point to the processing of visual stimuli by the piriform cortex, a region mainly known as part of the primary olfactory cortex. Moreover, the piriform cortex generates predictive templates of olfactory stimuli to facilitate olfactory processing. This study fills the gap relating to the question whether this region is also capable of preprocessing emotional visual information. To gain insight into the preprocessing and transfer of emotional visual information into olfactory processing, we recorded hemodynamic responses during affective priming using functional magnetic resonance imaging (fMRI). Odors of different valence (pleasant, neutral and unpleasant) were primed by images of emotional facial expressions (happy, neutral and disgust). Our findings are the first to demonstrate that the piriform cortex preprocesses emotional visual information prior to any olfactory stimulation and that the emotional connotation of this preprocessing is subsequently transferred and integrated into an extended olfactory network for olfactory processing.
NASA Astrophysics Data System (ADS)
Krishnamurthy, Narayanan; Maddali, Siddharth; Romanov, Vyacheslav; Hawk, Jeffrey
We present some structural properties of multi-component steel alloys as predicted by a random forest machine-learning model. These non-parametric models are trained on high-dimensional data sets defined by features such as chemical composition, pre-processing temperatures and environmental influences, the latter based upon standardized testing procedures for tensile, creep and rupture properties as defined by the American Society for Testing and Materials (ASTM). We quantify the goodness of fit of these models as well as the inferred relative importance of each of these features, all with a conveniently defined metric and scale. The models are tested with synthetic data points, generated subject to the appropriate mathematical constraints for the various features. By this we highlight possible trends in the increase or degradation of the structural properties with perturbations in the features of importance. This work is presented as part of the Data Science Initiative at the National Energy Technology Laboratory, directed specifically towards the computational design of steel alloys.
NASA Astrophysics Data System (ADS)
Nguyen, Duy
2012-07-01
Digital Elevation Models (DEMs) are used in many applications in the earth sciences, such as topographic mapping, environmental modeling, rainfall-runoff studies, landslide hazard zonation, and seismic source modeling. During the last years a multitude of scientific applications of Synthetic Aperture Radar Interferometry (InSAR) techniques has evolved. It has been shown that InSAR is an established technique for generating high-quality DEMs from spaceborne and airborne data, and that it has advantages over other methods for the generation of large-area DEMs. However, the processing of InSAR data is still a challenging task. This paper describes the InSAR operational steps and processing chain for DEM generation from Single Look Complex (SLC) SAR data and compares a satellite SAR estimate of surface elevation with a DEM from a topographic map. The operational steps are performed in three major stages: data search, data processing, and product validation. The data processing stage is further divided into five steps: data pre-processing, co-registration, interferogram generation, phase unwrapping, and geocoding. The processing steps have been tested with ERS-1/2 data using the Delft Object-oriented Interferometric (DORIS) InSAR processing software. Results of applying the described processing steps to a real data set are presented.
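The core of the interferogram-generation step can be sketched in a few lines: after co-registration, the interferometric phase is the argument of the complex product of one SLC image with the conjugate of the other. The synthetic arrays below stand in for real ERS-1/2 SLC tiles; unwrapping and geocoding are omitted.

```python
import numpy as np

# A minimal sketch of interferogram generation from two co-registered
# single-look complex (SLC) images; the data here are synthetic.

rng = np.random.default_rng(10)
shape = (128, 128)
terrain_phase = np.cumsum(rng.normal(0, 0.05, shape), axis=1)  # smooth phase ramp

slc1 = np.exp(1j * rng.uniform(0, 2 * np.pi, shape))           # "master" image
slc2 = slc1 * np.exp(-1j * terrain_phase)                      # "slave" image

interferogram = slc1 * np.conj(slc2)
wrapped_phase = np.angle(interferogram)   # in (-pi, pi]; phase unwrapping follows
print(wrapped_phase.shape, round(wrapped_phase.min(), 2), round(wrapped_phase.max(), 2))
```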
Brain computer interfaces, a review.
Nicolas-Alonso, Luis Fernando; Gomez-Gil, Jaime
2012-01-01
A brain-computer interface (BCI) is a hardware and software communications system that permits cerebral activity alone to control computers or external devices. The immediate goal of BCI research is to provide communications capabilities to severely disabled people who are totally paralyzed or 'locked in' by neurological neuromuscular disorders, such as amyotrophic lateral sclerosis, brain stem stroke, or spinal cord injury. Here, we review the state-of-the-art of BCIs, looking at the different steps that form a standard BCI: signal acquisition, preprocessing or signal enhancement, feature extraction, classification, and the control interface. We discuss their advantages, drawbacks, and latest advances, and we survey the numerous technologies reported in the scientific literature to design each step of a BCI. First, the review examines the neuroimaging modalities used in the signal acquisition step, each of which monitors a different functional brain activity such as electrical, magnetic or metabolic activity. Second, the review discusses the different electrophysiological control signals that determine user intentions, which can be detected in brain activity. Third, the review covers some techniques used in the signal enhancement step to deal with artifacts in the control signals and improve performance. Fourth, the review studies some mathematical algorithms used in the feature extraction and classification steps which translate the information in the control signals into commands that operate a computer or other device. Finally, the review provides an overview of various BCI applications that control a range of devices.
A linear shift-invariant image preprocessing technique for multispectral scanner systems
NASA Technical Reports Server (NTRS)
Mcgillem, C. D.; Riemer, T. E.
1973-01-01
A linear shift-invariant image preprocessing technique is examined which requires no specific knowledge of any parameter of the original image and which is sufficiently general to allow the effective radius of the composite imaging system to be arbitrarily shaped and reduced, subject primarily to the noise power constraint. In addition, the size of the point-spread function of the preprocessing filter can be arbitrarily controlled, thus minimizing truncation errors.
A novel automatic segmentation workflow of axial breast DCE-MRI
NASA Astrophysics Data System (ADS)
Besbes, Feten; Gargouri, Norhene; Damak, Alima; Sellami, Dorra
2018-04-01
In this paper we propose a novel, fully automatic breast tissue segmentation process that is independent of expert calibration and contrast. The proposed algorithm is composed of two major steps. The first step consists in the detection of breast boundaries. It is based on image content analysis and the Moore-Neighbor tracing algorithm. As a processing step, Otsu thresholding and a neighbors algorithm are applied. Then, the external area of the breast is removed to get an approximated breast region. The second preprocessing step is the delineation of the chest wall, which is considered as the lowest-cost path linking three key points. These points are located automatically on the breast: the left and right boundary points, and the middle upper point placed at the sternum region using a statistical method. The minimum-cost path search problem is solved with Dijkstra's algorithm. Evaluation results reveal the robustness of our process in the face of different breast densities, complex forms, and challenging cases. In fact, the mean overlap between manual segmentation and automatic segmentation through our method is 96.5%. A comparative study shows that our proposed process is competitive and faster than existing methods; the segmentation of 120 slices with our method is achieved in 20.57+/-5.2 s.
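The minimum-cost path idea can be sketched directly: treat pixels as graph nodes, neighboring pixels as edges weighted by a per-pixel cost, and run Dijkstra's algorithm between two key points. This is a generic illustration, not the authors' implementation; the cost map here is a toy example.

```python
import heapq
import numpy as np

# A minimal sketch of tracing a lowest-cost pixel path (e.g., a chest wall)
# between two points with Dijkstra's algorithm on an 8-connected grid.

def dijkstra_path(cost, start, goal):
    h, w = cost.shape
    dist = np.full((h, w), np.inf)
    prev = {}
    dist[start] = cost[start]
    pq = [(cost[start], start)]
    while pq:
        d, (r, c) = heapq.heappop(pq)
        if (r, c) == goal:
            break
        if d > dist[r, c]:
            continue                       # stale queue entry
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1),
                       (-1, -1), (-1, 1), (1, -1), (1, 1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w and d + cost[nr, nc] < dist[nr, nc]:
                dist[nr, nc] = d + cost[nr, nc]
                prev[(nr, nc)] = (r, c)
                heapq.heappush(pq, (dist[nr, nc], (nr, nc)))
    path, node = [], goal
    while node != start:                   # walk predecessors back to start
        path.append(node)
        node = prev[node]
    return [start] + path[::-1]

# Toy cost map: the path hugs the low-cost valley along row 2.
cost = np.ones((5, 5)); cost[2, :] = 0.1
print(dijkstra_path(cost, (2, 0), (2, 4)))
```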
Boon, K H; Khalil-Hani, M; Malarvili, M B
2018-01-01
This paper presents a method that is able to predict paroxysmal atrial fibrillation (PAF). The method uses shorter heart rate variability (HRV) signals than existing methods and achieves good prediction accuracy. PAF is a common cardiac arrhythmia that increases the health risk of a patient, and the development of an accurate predictor of the onset of PAF is clinically important because it increases the possibility of electrically stabilizing and preventing the onset of atrial arrhythmias with different pacing techniques. We propose a multi-objective optimization algorithm based on the non-dominated sorting genetic algorithm III for optimizing the baseline PAF prediction system, which consists of the stages of pre-processing, HRV feature extraction, and a support vector machine (SVM) model. The pre-processing stage comprises heart rate correction, interpolation, and signal detrending. After that, time-domain, frequency-domain, and non-linear HRV features are extracted from the pre-processed data in the feature extraction stage. Then, these features are used as input to the SVM for predicting the PAF event. The proposed optimization algorithm is used to optimize the parameters and settings of the various HRV feature extraction algorithms, select the best feature subsets, and tune the SVM parameters simultaneously for maximum prediction performance. The proposed method achieves an accuracy rate of 87.7%, which significantly outperforms most previous work. This accuracy rate is achieved even with the HRV signal length reduced from the typical 30 min to just 5 min (a reduction of 83%). Furthermore, another significant result is that the sensitivity rate, which is considered more important than the other performance metrics in this paper, can be improved by trading off some specificity. Copyright © 2017 Elsevier B.V. All rights reserved.
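The pre-processing stage described (correction, interpolation, detrending) can be sketched on a synthetic RR-interval series; the correction rule, sampling rate, and values below are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np
from scipy.interpolate import interp1d
from scipy.signal import detrend

# A minimal sketch of HRV pre-processing: clip an implausible interval,
# resample the irregular series to a uniform 4 Hz grid, remove the trend.

rr = np.array([0.80, 0.82, 0.79, 1.40, 0.81, 0.80, 0.83])   # RR intervals (s)
t = np.cumsum(rr)                                           # beat times

# Heart rate correction (toy rule): replace outliers with the median.
rr_clean = np.where(np.abs(rr - np.median(rr)) > 0.3, np.median(rr), rr)

# Uniform resampling so spectral HRV features are well defined.
fs = 4.0
t_even = np.arange(t[0], t[-1], 1.0 / fs)
rr_even = interp1d(t, rr_clean, kind='cubic')(t_even)

rr_detrended = detrend(rr_even)      # remove the linear trend before analysis
print(rr_detrended.round(3))
```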
Towards a global-scale ambient noise cross-correlation data base
NASA Astrophysics Data System (ADS)
Ermert, Laura; Fichtner, Andreas; Sleeman, Reinoud
2014-05-01
We aim to obtain a global-scale data base of ambient seismic noise correlations. This database - to be made publicly available at ORFEUS - will enable us to study the distribution of microseismic and hum sources, and to perform multi-scale full waveform inversion for crustal and mantle structure. Ambient noise tomography has developed into a standard technique. According to theory, cross-correlations equal inter-station Green's functions only if the wave field is equipartitioned or the sources are isotropically distributed. In an attempt to circumvent these assumptions, we aim to investigate possibilities to directly model noise cross-correlations and invert for their sources using adjoint techniques. A data base containing correlations of 'gently' preprocessed noise, excluding preprocessing steps which are explicitly taken to reduce the influence of a non-isotropic source distribution like spectral whitening, is a key ingredient in this undertaking. Raw data are acquired from IRIS/FDSN and ORFEUS. We preprocess and correlate the time series using a tool based on the Python package Obspy which is run in parallel on a cluster of the Swiss National Supercomputing Centre. Correlation is done in two ways: Besides the classical cross-correlation function, the phase cross-correlation is calculated, which is an amplitude-independent measure of waveform similarity and therefore insensitive to high-energy events. Besides linear stacks of these correlations, instantaneous phase stacks are calculated which can be applied as optional weight, enhancing coherent portions of the traces and facilitating the emergence of a meaningful signal. The _STS1 virtual network by IRIS contains about 250 globally distributed stations, several of which have been operating for more than 20 years. It is the first data collection we will use for correlations in the hum frequency range, as the STS-1 instrument response is flat in the largest part of the period range where hum is observed, up to a period of about 300 seconds. Thus they provide us with the best-suited measurements for hum.
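The two correlation products described, a classical cross-correlation and a phase-stack weighting that enhances coherent portions of the traces, can be sketched with NumPy/SciPy; this is an illustrative stand-in for the Obspy-based production tool, with synthetic windows and an assumed sharpness exponent.

```python
import numpy as np
from scipy.signal import hilbert

# A minimal sketch of a classical cross-correlation plus a phase-weighted
# linear stack that emphasizes arrivals coherent across windows.

def cross_correlate(a, b):
    n = len(a) + len(b) - 1
    return np.fft.irfft(np.fft.rfft(a, n) * np.conj(np.fft.rfft(b, n)), n)

def phase_weighted_stack(corrs, nu=2.0):
    corrs = np.asarray(corrs)
    linear = corrs.mean(axis=0)
    phases = np.exp(1j * np.angle(hilbert(corrs, axis=1)))
    coherence = np.abs(phases.mean(axis=0)) ** nu   # ~1 where windows agree
    return linear * coherence

rng = np.random.default_rng(0)
sig = np.sin(2 * np.pi * 0.05 * np.arange(200))
windows = [cross_correlate(sig + rng.normal(0, 1, 200),
                           sig + rng.normal(0, 1, 200)) for _ in range(20)]
print(phase_weighted_stack(windows).shape)   # one weighted correlation trace
```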
Li, Yuanpeng; Li, Fucui; Yang, Xinhao; Guo, Liu; Huang, Furong; Chen, Zhenqiang; Chen, Xingdan; Zheng, Shifu
2018-08-05
A rapid quantitative analysis model for determining the glycated albumin (GA) content based on attenuated total reflectance (ATR) Fourier transform infrared spectroscopy (FTIR) combined with linear SiPLS and nonlinear SVM has been developed. Firstly, the real GA content in human serum was determined by the GA enzymatic method; meanwhile, ATR-FTIR spectra of serum samples from a health-examination population were obtained. The spectral data of the whole mid-infrared region (4000-600 cm⁻¹) and GA's characteristic region (1800-800 cm⁻¹) were used as the objects of the quantitative analysis. Secondly, several preprocessing steps, including first derivative, second derivative, variable standardization and spectral normalization, were performed. Lastly, quantitative regression models were established using SiPLS and SVM, respectively. The SiPLS modeling results are as follows: root mean square error of cross-validation (RMSECV_T) = 0.523 g/L, calibration coefficient (R_C) = 0.937, root mean square error of prediction (RMSEP_T) = 0.787 g/L, and prediction coefficient (R_P) = 0.938. The SVM modeling results are as follows: RMSECV_T = 0.0048 g/L, R_C = 0.998, RMSEP_T = 0.442 g/L, and R_P = 0.916. The results indicated that model performance improved significantly after preprocessing and optimization of the characteristic regions, and that the modeling performance of the nonlinear SVM was considerably better than that of the linear SiPLS. Hence, the quantitative analysis model for GA in human serum based on ATR-FTIR combined with SiPLS and SVM is effective. It does not need sample pretreatment, is simple to operate, and is time-efficient, providing a rapid and accurate method for GA content determination. Copyright © 2018 Elsevier B.V. All rights reserved.
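The preprocess-then-calibrate pattern can be sketched with scikit-learn and SciPy on synthetic spectra; a Savitzky-Golay second derivative stands in for the derivative preprocessing, an SVR for the SVM regression, and the SiPLS interval selection is omitted. All data and parameters below are illustrative.

```python
import numpy as np
from scipy.signal import savgol_filter
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

# A minimal sketch: second-derivative filtering of each spectrum, restriction
# to the 1800-800 cm^-1 window, standardization, and SVR calibration.

rng = np.random.default_rng(1)
wavenumbers = np.linspace(4000, 600, 1700)
ga = rng.uniform(5, 25, 60)                                # reference GA values
spectra = ga[:, None] * np.exp(-((wavenumbers - 1650) / 80) ** 2)
spectra += rng.normal(0, 0.05, spectra.shape)              # measurement noise

x = savgol_filter(spectra, 15, polyorder=3, deriv=2, axis=1)
band = (wavenumbers <= 1800) & (wavenumbers >= 800)        # characteristic region
x = x[:, band]

model = make_pipeline(StandardScaler(), SVR(C=10.0, epsilon=0.1))
model.fit(x[:40], ga[:40])
rmsep = np.sqrt(np.mean((model.predict(x[40:]) - ga[40:]) ** 2))
print(f"RMSEP on held-out spectra: {rmsep:.3f} g/L")
```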
Biofuel supply chain considering depreciation cost of installed plants
NASA Astrophysics Data System (ADS)
Rabbani, Masoud; Ramezankhani, Farshad; Giahi, Ramin; Farshbaf-Geranmayeh, Amir
2016-06-01
Due to the depletion of fossil fuels and major concerns about the security of future fuel production, the importance of utilizing renewable energies is clear, and there has been growing interest in biofuels. This paper presents a general optimization model which enables the selection of preprocessing centers for the biomass, biofuel plants, and warehouses to store the biofuels. The objective of the model is to maximize total benefits. Costs in the model consist of the setup costs of preprocessing centers, plants and warehouses, transportation costs, production costs, emission costs and the depreciation cost. First, the depreciation cost of the centers is calculated by means of three methods, and the model chooses the best depreciation method in each period by switching between them. A numerical example is presented and solved with the CPLEX solver in GAMS software, and finally sensitivity analyses are carried out.
Piecewise Polynomial Aggregation as Preprocessing for Data Numerical Modeling
NASA Astrophysics Data System (ADS)
Dobronets, B. S.; Popova, O. A.
2018-05-01
Data aggregation issues for numerical modeling are reviewed in the present study. The authors discuss data aggregation procedures as preprocessing for subsequent numerical modeling. To calculate the data aggregation, the authors propose using numerical probabilistic analysis (NPA). An important feature of this study is how the authors represent the aggregated data: the study shows that the aggregated data can be interpreted as the frequency distribution of a variable. To study its properties, the density function is used. For this purpose, the authors propose using piecewise polynomial models; a suitable example of such an approach is the spline. The authors show that their approach to data aggregation allows reducing the level of data uncertainty and significantly increasing the efficiency of numerical calculations. To demonstrate the degree of correspondence of the proposed methods to reality, the authors developed a theoretical framework and considered numerical examples devoted to time series aggregation.
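The idea can be sketched concretely: aggregate raw observations into a histogram (the reduced representation) and model the resulting frequency distribution with a piecewise polynomial. The cubic spline below is a generic stand-in for the paper's piecewise polynomial models; the data are synthetic.

```python
import numpy as np
from scipy.interpolate import CubicSpline

# A minimal sketch: histogram aggregation followed by a piecewise-polynomial
# (cubic spline) model of the density, used in place of the raw data.

rng = np.random.default_rng(2)
data = rng.normal(10.0, 2.0, 10_000)

counts, edges = np.histogram(data, bins=20, density=True)
centers = (edges[:-1] + edges[1:]) / 2
density = CubicSpline(centers, counts)     # compact density model

# Downstream numerical modeling works with the spline, not 10,000 points:
grid = np.linspace(centers[0], centers[-1], 5)
print(density(grid).round(4))
```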
Charles J. Gatchell; R. Edward Thomas; Elizabeth S. Walker
1999-01-01
Using the ROMI-RIP simulator we examined the implications of preprocessing for gang-rip-first rough mills. Rip-first rough mills can improve yield and throughput by preprocessing 1 Common and 2A Common hardwood lumber. This can be achieved by using a chop saw to separate poorer quality board segments from better ones and remove waste areas with little or no yield. This...
Data pre-processing in record linkage to find the same companies from different databases
NASA Astrophysics Data System (ADS)
Gunawan, D.; Lubis, M. S.; Arisandi, D.; Azzahry, B.
2018-03-01
As public agencies, the Badan Pelayanan Perizinan Terpadu (BPPT) and the Badan Lingkungan Hidup (BLH) of Medan city manage the process by which the public obtains business licenses. However, each agency may hold different company records because of separate data entry processes, even though the records may refer to the same company. Therefore, it is necessary to identify and link data that refer to the same company across different data sources. This research focuses on data pre-processing such as data cleaning, text pre-processing, indexing and record comparison. In addition, this research implements data matching using the support vector machine algorithm. The result of this algorithm is used for record linkage, identifying and connecting company data based on the degree of similarity of each record pair. The data are first standardized to the format and structure appropriate for the pre-processing stage. After analyzing the data pre-processing, we found that neither database structure is designed to support data integration. We decided that the data matching can be done with blocking criteria such as the company name and the name of the owner (or applicant). Following data pre-processing, the matching classified 90 pairs of records as having a high level of similarity.
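The cleaning, blocking and comparison steps can be sketched in a few lines; here a simple string-similarity score stands in for the paper's SVM matcher, and the cleaning rules, threshold and records are invented for illustration.

```python
from difflib import SequenceMatcher

# A minimal sketch of record-linkage pre-processing: normalize names, block
# candidate pairs on shared name tokens, then score each pair.

def clean(name):
    for noise in ('pt.', 'cv.', ',', '.'):
        name = name.lower().replace(noise, ' ')
    return ' '.join(name.split())

def similarity(a, b):
    return SequenceMatcher(None, clean(a), clean(b)).ratio()

bppt = [{'company': 'PT. Maju Jaya', 'owner': 'Budi S.'}]     # hypothetical
blh = [{'company': 'Maju Jaya, PT', 'owner': 'Budi Santoso'}] # hypothetical

for r1 in bppt:
    for r2 in blh:
        # Blocking: only compare records whose cleaned names share a token.
        if set(clean(r1['company']).split()) & set(clean(r2['company']).split()):
            score = (similarity(r1['company'], r2['company'])
                     + similarity(r1['owner'], r2['owner'])) / 2
            if score > 0.7:
                print('candidate match:', r1['company'], '<->',
                      r2['company'], round(score, 2))
```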
The Foreign-Language Teacher and Cognitive Psychology or Where Do We Go from Here?
ERIC Educational Resources Information Center
Rivers, Wilga M.
Research into the psychology of perception can uncover important discoveries for more efficient learning. There must be increased understanding of the processing of input and the pre-processing of output for improved language instruction. Educators must at the present time be extremely wary of basing what they do in the foreign-language classroom…
Classifier dependent feature preprocessing methods
NASA Astrophysics Data System (ADS)
Rodriguez, Benjamin M., II; Peterson, Gilbert L.
2008-04-01
In mobile applications, computational complexity is an issue that limits sophisticated algorithms from being implemented on these devices. This paper provides an initial solution to applying pattern recognition systems on mobile devices by combining existing preprocessing algorithms for recognition. In pattern recognition systems, it is essential to properly apply feature preprocessing tools prior to training classification models in an attempt to reduce computational complexity and improve the overall classification accuracy. The feature preprocessing tools extended for the mobile environment are feature ranking, feature extraction, data preparation and outlier removal. Most desktop systems today are capable of running a majority of the available classification algorithms without concern for processing time, whereas the same is not true on mobile platforms. As an application of pattern recognition for mobile devices, the recognition system targets the problem of steganalysis: determining whether an image contains hidden information. The measure of performance shows that feature preprocessing increases the overall steganalysis classification accuracy by an average of 22%. The methods in this paper are tested on a workstation and a Nokia 6620 (Symbian operating system) camera phone with similar results.
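A generic sketch of three of the named preprocessing tools (outlier removal, data preparation, feature ranking) with scikit-learn; the rules, data, and mutual-information ranking are our illustrative choices, not the paper's specific algorithms.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif
from sklearn.preprocessing import StandardScaler

# A minimal sketch: drop outlier samples, scale features, then rank features
# so a resource-limited device only evaluates the most informative ones.

rng = np.random.default_rng(3)
X = rng.normal(0, 1, (200, 10))
y = (X[:, 0] + 0.5 * X[:, 3] + rng.normal(0, 0.5, 200) > 0).astype(int)

# Outlier removal: discard samples more than 3 sigma out on any feature.
keep = (np.abs(X - X.mean(0)) < 3 * X.std(0)).all(axis=1)
X, y = X[keep], y[keep]

X = StandardScaler().fit_transform(X)            # data preparation

scores = mutual_info_classif(X, y, random_state=0)
top = np.argsort(scores)[::-1][:3]               # feature ranking: keep best 3
print('selected feature indices:', top)
```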
An Effective Measured Data Preprocessing Method in Electrical Impedance Tomography
Yu, Chenglong; Yue, Shihong; Wang, Jianpei; Wang, Huaxiang
2014-01-01
As an advanced process detection technology, electrical impedance tomography (EIT) has received wide attention and study in industrial fields. But EIT techniques are greatly limited by low spatial resolution. This problem may result from incorrect preprocessing of the measured data and the lack of a general criterion to evaluate different preprocessing processes. In this paper, an EIT data preprocessing method based on rooting all measured data is proposed and evaluated by two indexes constructed from the rooted measured data. By finding the optimums of the two indexes, the proposed method can be applied to improve EIT imaging spatial resolution. In terms of a theoretical model, the optimal rooting times of the two indexes range in [0.23, 0.33] and [0.22, 0.35], respectively. Moreover, the factors that affect the correctness of the proposed method are analyzed. Measured-data preprocessing is necessary and helpful for any imaging process; thus, the proposed method can be generally and widely used in any imaging process. Experimental results validate the two proposed indexes. PMID:25165735
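Assuming "rooting" means raising each measured value to a fractional power (our reading of the abstract), the effect is easy to see: it compresses the dynamic range of the boundary measurements before reconstruction. The data below are illustrative.

```python
import numpy as np

# A minimal sketch of rooting-based preprocessing: raise measured values to
# a fractional power r, sweeping r over roughly the reported optimal range.

measured = np.array([1.0e-4, 5.0e-3, 2.0e-2, 1.0e-1])   # toy EIT measurements

for r in (0.22, 0.28, 0.35):
    rooted = measured ** r
    # The ratio of largest to smallest value shrinks as r decreases.
    print(f"r={r:.2f}  dynamic range {rooted.max() / rooted.min():.1f}")
```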
PEGASUS 5: An Automated Pre-Processor for Overset-Grid CFD
NASA Technical Reports Server (NTRS)
Rogers, Stuart E.; Suhs, Norman; Dietz, William; Rogers, Stuart; Nash, Steve; Chan, William; Tramel, Robert; Onufer, Jeff
2006-01-01
This viewgraph presentation reviews the use and requirements of PEGASUS 5, a code which performs a pre-processing step for the overset-grid CFD method. The code prepares the overset volume grids for the flow solver by computing the domain connectivity database and blanking out grid points which are contained inside a solid body. PEGASUS 5 successfully automates most of the overset process. It leads to a dramatic reduction in user input over previous generations of overset software, and it can lead to an order-of-magnitude reduction in both turn-around time and user expertise requirements. It is not, however, a "black-box" procedure; care must be taken to examine the resulting grid system.
Protein mass spectra data analysis for clinical biomarker discovery: a global review.
Roy, Pascal; Truntzer, Caroline; Maucort-Boulch, Delphine; Jouve, Thomas; Molinari, Nicolas
2011-03-01
The identification of new diagnostic or prognostic biomarkers is one of the main aims of clinical cancer research. In recent years there has been a growing interest in using high throughput technologies for the detection of such biomarkers. In particular, mass spectrometry appears as an exciting tool with great potential. However, to extract any benefit from the massive potential of clinical proteomic studies, appropriate methods, improvement and validation are required. To better understand the key statistical points involved with such studies, this review presents the main data analysis steps of protein mass spectra data analysis, from the pre-processing of the data to the identification and validation of biomarkers.
NASA Astrophysics Data System (ADS)
Wang, Shaojun; Jiang, Lan; Han, Weina; Hu, Jie; Li, Xiaowei; Wang, Qingsong; Lu, Yongfeng
2018-05-01
We realize hierarchical laser-induced periodic surface structures (LIPSSs) on the surface of a ZnO thin film in a single step by the irradiation of femtosecond laser pulses. The structures are characterized by the high-spatial-frequency LIPSSs (HSFLs) formed on the abnormal bumped low-spatial-frequency LIPSSs (LSFLs). Localized electric-field enhancement based on the initially formed LSFLs is proposed as a potential mechanism for the formation of HSFLs. The simulation results through the finite-difference time-domain method show good agreement with experiments. Furthermore, the crucial role of the LSFLs in the formation of HSFLs is validated by an elaborate experimental design with preprocessed HSFLs.
Biometric recognition using 3D ear shape.
Yan, Ping; Bowyer, Kevin W
2007-08-01
Previous works have shown that the ear is a promising candidate for biometric identification. However, in prior work, the preprocessing of ear images has had manual steps and algorithms have not necessarily handled problems caused by hair and earrings. We present a complete system for ear biometrics, including automated segmentation of the ear in a profile view image and 3D shape matching for recognition. We evaluated this system with the largest experimental study to date in ear biometrics, achieving a rank-one recognition rate of 97.8 percent for an identification scenario and an equal error rate of 1.2 percent for a verification scenario on a database of 415 subjects and 1,386 total probes.
NASA Astrophysics Data System (ADS)
Wei, Ping; Li, Xinyang; Luo, Xi; Li, Jianfeng
2018-02-01
The centroid method is commonly adopted to locate the spot in the sub-apertures of the Shack-Hartmann wavefront sensor (SH-WFS); because the centroid method is extremely sensitive to noise, the image must be preprocessed before the spot location is calculated. In this paper, the SH-WFS image was simulated according to the characteristics of the noise, background and intensity distribution. Optimal parameters of the SH-WFS image preprocessing method were put forward for different signal-to-noise ratio (SNR) conditions, with the wavefront reconstruction error used as the evaluation index. Two image preprocessing methods, thresholding and windowing combined with thresholding, were compared by studying their applicable SNR ranges and analyzing their stability.
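The threshold-then-centroid pipeline for one sub-aperture can be sketched directly; the threshold rule below (mean plus three standard deviations of a dark corner) is an illustrative assumption, not the paper's optimum.

```python
import numpy as np

# A minimal sketch: subtract a noise threshold, clip negatives, and compute
# the intensity-weighted centroid of one Shack-Hartmann sub-aperture image.

def spot_centroid(img, threshold):
    img = np.clip(img.astype(float) - threshold, 0, None)
    total = img.sum()
    if total == 0:
        raise ValueError('no signal above threshold')
    rows, cols = np.indices(img.shape)
    return (rows * img).sum() / total, (cols * img).sum() / total

rng = np.random.default_rng(4)
img = rng.normal(10, 2, (32, 32))          # background plus read noise
img[14:18, 20:24] += 200                   # the spot
thr = img[:8, :8].mean() + 3 * img[:8, :8].std()
print(spot_centroid(img, thr))             # approx (15.5, 21.5)
```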
A new classification scheme of plastic wastes based upon recycling labels
DOE Office of Scientific and Technical Information (OSTI.GOV)
Özkan, Kemal, E-mail: kozkan@ogu.edu.tr; Ergin, Semih, E-mail: sergin@ogu.edu.tr; Işık, Şahin, E-mail: sahini@ogu.edu.tr
Highlights: • PET, HDPE or PP types of plastics are considered. • An automated classification of plastic bottles based on feature extraction and classification methods is performed. • The decision mechanism consists of PCA, Kernel PCA, FLDA, SVD and Laplacian Eigenmaps methods. • SVM is selected to achieve the classification task and a majority voting technique is used. - Abstract: Since recycling of materials is widely assumed to be environmentally and economically beneficial, reliable sorting and processing of waste packaging materials such as plastics is very important for recycling with high efficiency. An automated system that can quickly categorize these materials is certainly needed for obtaining maximum classification while maintaining high throughput. In this paper, first of all, photographs of the plastic bottles were taken and several preprocessing steps were carried out. The first preprocessing step is to extract the plastic area of a bottle from the background. Then, morphological image operations are implemented: edge detection, noise removal, hole removal, image enhancement, and image segmentation. These morphological operations can generally be defined in terms of combinations of erosion and dilation. The effects of bottle color and label are eliminated using these operations. Secondly, the pixel-wise intensity values of the plastic bottle images are used together with the most popular subspace and statistical feature extraction methods to construct the feature vectors. Only three types of plastics are considered because they occur far more often than the other plastic types worldwide. The decision mechanism consists of five different feature extraction methods, namely Principal Component Analysis (PCA), Kernel PCA (KPCA), Fisher's Linear Discriminant Analysis (FLDA), Singular Value Decomposition (SVD) and Laplacian Eigenmaps (LEMAP), and uses a simple experimental setup with a camera and homogeneous backlighting. Because it provides a global solution to the classification problem, a Support Vector Machine (SVM) is selected to achieve the classification task, and a majority voting technique is used as the decision mechanism. This technique weights each classification result equally and assigns the given plastic object to the class on which most classification results agree. The proposed classification scheme provides a high accuracy rate and is able to run in real-time applications. It can automatically classify plastic bottle types with approximately 90% recognition accuracy. Besides this, the proposed methodology yields approximately a 96% classification rate for the separation of PET from non-PET plastic types, and 92% accuracy for the categorization of non-PET plastic types into HDPE or PP.
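A minimal sketch of this kind of decision mechanism, with synthetic feature vectors standing in for the bottle images and scikit-learn transformers standing in for three of the five extractors; the paper's exact features, training protocol and parameters are not reproduced.

```python
import numpy as np
from sklearn.decomposition import PCA, KernelPCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline

# Several feature extractors each feed an SVM; a majority vote fuses the
# per-extractor decisions into one class label per object.

rng = np.random.default_rng(5)
X = np.vstack([rng.normal(m, 1.0, (30, 50)) for m in (0.0, 1.0, 2.0)])
y = np.repeat([0, 1, 2], 30)                # 0=PET, 1=HDPE, 2=PP (toy labels)

extractors = [PCA(n_components=10),
              KernelPCA(n_components=10, kernel='rbf'),
              LinearDiscriminantAnalysis(n_components=2)]
models = [make_pipeline(e, SVC()).fit(X, y) for e in extractors]

votes = np.array([m.predict(X[:5]) for m in models])     # one row per model
majority = np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)
print(majority)                              # fused class labels
```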
DOE Office of Scientific and Technical Information (OSTI.GOV)
Popova, Evdokia; Rodgers, Theron M.; Gong, Xinyi
A novel data science workflow is developed and demonstrated to extract process-structure linkages (i.e., reduced-order models) for microstructure evolution problems in which the final microstructure depends on (simulation or experimental) processing parameters. Our workflow consists of four main steps: data pre-processing, microstructure quantification, dimensionality reduction, and extraction/validation of process-structure linkages. The methods that can be employed within each step vary based on the type and amount of available data. In this paper, this data-driven workflow is applied to a set of synthetic additive manufacturing microstructures obtained using the Potts-kinetic Monte Carlo (kMC) approach. Additive manufacturing techniques inherently produce complex microstructures that can vary significantly with processing conditions. Using the developed workflow, a low-dimensional data-driven model was established to correlate process parameters with the predicted final microstructure. In addition, the modular workflows developed and presented in this work facilitate easy dissemination and curation by the broader community.
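The four-step workflow can be sketched end-to-end on toy "microstructures"; the spatial autocorrelation quantification, PCA reduction, and linear linkage below are generic stand-ins chosen by us, not the paper's specific implementations.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

# A minimal sketch of the workflow: (1) generate/pre-process structures,
# (2) quantify each by its spatial autocorrelation, (3) reduce with PCA,
# (4) fit a process-structure linkage from parameter to PC scores.

rng = np.random.default_rng(6)
params = rng.uniform(0, 1, 40)                        # processing parameter
micros = [(rng.random((32, 32)) < 0.3 + 0.4 * p) for p in params]

def two_point_stats(m):                               # step 2: autocorrelation
    f = np.fft.fft2(m - m.mean())
    return np.real(np.fft.ifft2(f * np.conj(f))).ravel() / m.size

features = np.array([two_point_stats(m) for m in micros])
scores = PCA(n_components=3).fit_transform(features)  # step 3: low-dim scores

linkage = LinearRegression().fit(params[:, None], scores)   # step 4
print('R^2 of process-structure linkage:',
      round(linkage.score(params[:, None], scores), 3))
```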
Recognizing characters of ancient manuscripts
NASA Astrophysics Data System (ADS)
Diem, Markus; Sablatnig, Robert
2010-02-01
For printed Latin text, the main issues of Optical Character Recognition (OCR) systems are solved. However, for degraded handwritten document images, basic preprocessing steps such as binarization yield poor results with state-of-the-art methods. In this paper ancient Slavonic manuscripts from the 11th century are investigated. In order to minimize the consequences of false character segmentation, a binarization-free approach based on local descriptors is proposed. Additionally, local information allows the recognition of partially visible or washed-out characters. The proposed algorithm consists of two steps: character classification and character localization. Initially, Scale Invariant Feature Transform (SIFT) features are extracted and subsequently classified using Support Vector Machines (SVM). Afterwards, the interest points are clustered according to their spatial information. Thereby, characters are localized and finally recognized based on a weighted voting scheme over the pre-classified local descriptors. Preliminary results show that the proposed system can handle highly degraded manuscript images with background clutter (e.g. stains, tears) and faded-out characters.
A comparison of 1D and 2D LSTM architectures for the recognition of handwritten Arabic
NASA Astrophysics Data System (ADS)
Yousefi, Mohammad Reza; Soheili, Mohammad Reza; Breuel, Thomas M.; Stricker, Didier
2015-01-01
In this paper, we present an Arabic handwriting recognition method based on recurrent neural networks. We use the Long Short Term Memory (LSTM) architecture, which has proven successful in various printed and handwritten OCR tasks. Applications of LSTM for handwriting recognition employ the two-dimensional architecture to deal with variations along both the vertical and horizontal axes. However, we show that with a simple pre-processing step that normalizes the position and baseline of letters, we can make use of 1D LSTM, which is faster in learning and convergence, and yet achieve superior performance. In a series of experiments on the IFN/ENIT database for Arabic handwriting recognition, we demonstrate that our proposed pipeline can outperform 2D LSTM networks. Furthermore, we provide comparisons with 1D LSTM networks trained with manually crafted features to show that the automatically learned features in a globally trained 1D LSTM network with our normalization step can even outperform such systems.
Vibration study of a vehicle suspension assembly with the finite element method
NASA Astrophysics Data System (ADS)
Cătălin Marinescu, Gabriel; Castravete, Ştefan-Cristian; Dumitru, Nicolae
2017-10-01
The main steps of the present work represent a methodology for analysing various vibration effects on the suspension mechanical parts of a vehicle. A McPherson-type suspension from an existing vehicle was created using CAD software. Using the CAD model as input, a finite element model of the suspension assembly was developed. Abaqus finite element analysis software was used to pre-process, solve, and post-process the results. Geometric nonlinearities are included in the model. Severe sources of nonlinearity such as friction and contact are also included. The McPherson spring is modelled as a linear spring. The analysis includes several steps: preload, modal analysis, reduction of the model to 200 generalized coordinates, a deterministic external excitation, and a random excitation that comes from different types of roads. The vibration data used as input for the simulation were previously obtained by experimental means. Mathematical expressions used for the simulation are also presented in the paper.
Advanced Extraction of Spatial Information from High Resolution Satellite Data
NASA Astrophysics Data System (ADS)
Pour, T.; Burian, J.; Miřijovský, J.
2016-06-01
In this paper the authors processed five satellite images of five different Middle-European cities taken by five different sensors. The aim of the paper was to find methods and approaches leading to evaluation and spatial data extraction from areas of interest. For this reason, the data were first pre-processed using image fusion, mosaicking and segmentation processes. The results passed to the next step were two polygon layers: the first representing single objects and the second representing city blocks. In the second step, the polygon layers were classified and exported into Esri shapefile format. Classification was partly hierarchical expert-based and partly based on the SEaTH tool, used for separability distinction and thresholding. Final results along with visual previews were attached to the original thesis. Results are evaluated visually and statistically in the last part of the paper. In the discussion the authors describe the difficulties of working with data of large size, taken by different sensors and also thematically different.
Sinha, S K; Karray, F
2002-01-01
Pipeline surface defects such as holes and cracks cause major problems for utility managers, particularly when the pipeline is buried under the ground. Manual inspection for surface defects in the pipeline has a number of drawbacks, including subjectivity, varying standards, and high costs. An automatic inspection system using image processing and artificial intelligence techniques can overcome many of these disadvantages and offer utility managers an opportunity to significantly improve quality and reduce costs. A method for recognition and classification of pipe cracks using image analysis and a neuro-fuzzy algorithm is proposed. In the preprocessing step the scanned images of the pipe are analyzed and crack features are extracted. In the classification step a neuro-fuzzy algorithm is developed that employs a fuzzy membership function and the error backpropagation algorithm. The idea behind the proposed approach is that the fuzzy membership function will absorb variation in feature values while the backpropagation network, with its learning ability, will yield good classification efficiency.
Integrated Low-Rank-Based Discriminative Feature Learning for Recognition.
Zhou, Pan; Lin, Zhouchen; Zhang, Chao
2016-05-01
Feature learning plays a central role in pattern recognition. In recent years, many representation-based feature learning methods have been proposed and have achieved great success in many applications. However, these methods perform feature learning and subsequent classification in two separate steps, which may not be optimal for recognition tasks. In this paper, we present a supervised low-rank-based approach for learning discriminative features. By integrating latent low-rank representation (LatLRR) with a ridge regression-based classifier, our approach combines feature learning with classification, so that the regulated classification error is minimized. In this way, the extracted features are more discriminative for the recognition tasks. Our approach benefits from a recent discovery on the closed-form solutions to noiseless LatLRR. When there is noise, a robust Principal Component Analysis (PCA)-based denoising step can be added as preprocessing. When the scale of a problem is large, we utilize a fast randomized algorithm to speed up the computation of robust PCA. Extensive experimental results demonstrate the effectiveness and robustness of our method.
On the way toward systems biology of Aspergillus fumigatus infection.
Albrecht, Daniela; Kniemeyer, Olaf; Mech, Franziska; Gunzer, Matthias; Brakhage, Axel; Guthke, Reinhard
2011-06-01
Pathogenicity of Aspergillus fumigatus is multifactorial, so global studies are essential for understanding the infection process. Therefore, a data warehouse was established in which genome sequence, transcriptome and proteome data are stored. These data are analyzed for the elucidation of virulence determinants. The data analysis workflow starts with pre-processing, including imputation of missing values and normalization. The last step is the identification of differentially expressed genes/proteins as candidates for further analysis, in particular for functional categorization and correlation studies. Sequence data and other prior knowledge extracted from databases are integrated to support the inference of gene regulatory networks associated with pathogenicity. This knowledge-assisted data analysis aims at establishing mathematical models with predictive strength to assist further experimental work. Recently, first steps were taken to extend the integrative data analysis and computational modeling by evaluating spatio-temporal data (movies) that monitor interactions of A. fumigatus morphotypes (e.g. conidia) with host immune cells. Copyright © 2011 Elsevier GmbH. All rights reserved.
Genova, Giuseppe; Tosetti, Roberta; Tonutti, Pietro
2016-01-30
Grape juice is an important dietary source of health-promoting antioxidant molecules. Different factors may affect juice composition and nutraceutical properties. The effects of some of these factors (harvest time, pre-processing ethylene treatment of grapes and juice thermal pasteurization) were here evaluated, considering in particular the phenolic composition and antioxidant capacity. Grapes (Vitis vinifera L., red-skinned variety Sangiovese) were collected twice in relation to the technological harvest (TH) and 12 days before TH (early harvest, EH) and treated with gaseous ethylene (1000 ppm) or air for 48 h. Fresh and pasteurized (78 °C for 30 min) juices were produced using a water bath. Three-way analysis of variance showed that the harvest date had the strongest impact on total polyphenols, hydroxycinnamates, flavonols, and especially on total flavonoids. Pre-processing ethylene treatment significantly increased the proanthocyanidin, anthocyanin and flavan-3-ol content in the juices. Pasteurization induced a significant increase in anthocyanin concentration. Antioxidant capacity was enhanced by ethylene treatment and pasteurization in juices from both TH and EH grapes. These results suggest that an appropriate management of grape harvesting date, postharvest and processing may lead to an improvement in nutraceutical quality of juices. Further research is needed to study the effect of the investigated factors on juice organoleptic properties. © 2015 Society of Chemical Industry.
Sakwari, Gloria; Mamuya, Simon H D; Bråtveit, Magne; Larsson, Lennart; Pehrson, Christina; Moen, Bente E
2013-03-01
Endotoxin exposure associated with organic dust exposure has been studied in several industries. Coffee cherries that are dried directly after harvest may differ in dust and endotoxin emissions from those that are peeled and washed before drying. The aim of this study was to measure personal total dust and endotoxin levels and to evaluate their determinants of exposure in coffee processing factories. Using Sidekick Casella pumps at a flow rate of 2 l/min, total dust levels were measured in the workers' breathing zone throughout the shift. Endotoxin was analyzed using the kinetic chromogenic Limulus amebocyte lysate assay. Separate linear mixed-effects models were used to evaluate exposure determinants for dust and endotoxin. Total dust and endotoxin exposure were significantly higher in Robusta than in Arabica coffee factories (geometric mean 3.41 mg/m³ and 10 800 EU/m³ versus 2.10 mg/m³ and 1400 EU/m³, respectively). Dry pre-processed coffee and differences in work tasks explained 30% of the total variance for total dust and 71% of the variance for endotoxin exposure. High exposure in Robusta processing is associated with the dry pre-processing method used after harvest. Dust and endotoxin exposure is high, in particular when processing dry pre-processed coffee. Minimization of dust emissions and use of efficient dust exhaust systems are important to prevent the development of respiratory system impairment in workers.
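The linear mixed-effects analysis described (fixed effects for processing method and task, random effect per worker for repeated measurements) can be sketched with statsmodels; the data frame, variable names and effect sizes below are entirely synthetic.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# A minimal sketch of a mixed model for log-transformed exposure with a
# random intercept per worker; all names and values are invented.

rng = np.random.default_rng(9)
n = 120
df = pd.DataFrame({
    'worker': rng.integers(0, 30, n),
    'dry_preprocessed': rng.integers(0, 2, n),
    'task': rng.choice(['milling', 'bagging'], n),
})
df['log_endotoxin'] = (6 + 1.8 * df.dry_preprocessed
                       + (df.task == 'milling') * 0.7
                       + rng.normal(0, 0.5, n))

model = smf.mixedlm('log_endotoxin ~ dry_preprocessed + task', df,
                    groups=df['worker']).fit()
print(model.summary())
```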
visnormsc: A Graphical User Interface to Normalize Single-cell RNA Sequencing Data.
Tang, Lijun; Zhou, Nan
2017-12-26
Single-cell RNA sequencing (RNA-seq) allows the analysis of gene expression with high resolution. The intrinsic defects of this promising technology introduce technical noise into single-cell RNA-seq data, increasing the difficulty of accurate downstream inference. Normalization is a crucial step in single-cell RNA-seq data pre-processing, and SCnorm is an accurate and efficient method that can be used for this purpose. An R implementation of this method is currently available. On one hand, the R package possesses many excellent features from R; on the other hand, R programming ability is required, which prevents biologists who lack those skills from learning to use it quickly. To make this method more user-friendly, we developed a graphical user interface, visnormsc, for normalization of single-cell RNA-seq data. It is implemented in Python and is freely available at https://github.com/solo7773/visnormsc . Although visnormsc is based on an existing method, it contributes to this field by offering a user-friendly alternative. Its out-of-the-box and cross-platform features make visnormsc easy to learn and use. It is expected to serve biologists by simplifying single-cell RNA-seq normalization.
Norton, Kerri-Ann; Iyatomi, Hitoshi; Celebi, M Emre; Ishizaki, Sumiko; Sawada, Mizuki; Suzaki, Reiko; Kobayashi, Ken; Tanaka, Masaru; Ogawa, Koichi
2012-08-01
Computer-aided diagnosis of dermoscopy images has shown great promise in developing a quantitative, objective way of classifying skin lesions. An important step in the classification process is lesion segmentation. Many studies have been successful in segmenting melanocytic skin lesions (MSLs), but few have focused on non-melanocytic skin lesions (NoMSLs), as the wide variety of lesions makes accurate segmentation difficult. We developed an automatic segmentation program for detecting borders of skin lesions in dermoscopy images. The method consists of a pre-processing phase, general lesion segmentation phase, including illumination correction, and bright region segmentation phase. We tested our method on a set of 107 NoMSLs and a set of 319 MSLs. Our method achieved precision/recall scores of 84.5% and 88.5% for NoMSLs, and 93.9% and 93.8% for MSLs, in comparison with manual extractions from four or five dermatologists. The accuracy of our method was competitive or better than five recently published methods. Our new method is the first method for detecting borders of both non-melanocytic and melanocytic skin lesions. © 2011 John Wiley & Sons A/S.
UNC-Utah NA-MIC framework for DTI fiber tract analysis.
Verde, Audrey R; Budin, Francois; Berger, Jean-Baptiste; Gupta, Aditya; Farzinfar, Mahshid; Kaiser, Adrien; Ahn, Mihye; Johnson, Hans; Matsui, Joy; Hazlett, Heather C; Sharma, Anuja; Goodlett, Casey; Shi, Yundi; Gouttard, Sylvain; Vachet, Clement; Piven, Joseph; Zhu, Hongtu; Gerig, Guido; Styner, Martin
2014-01-01
Diffusion tensor imaging has become an important modality in the field of neuroimaging to capture changes in micro-organization and to assess white matter integrity or development. While there exists a number of tractography toolsets, these usually lack tools for preprocessing or to analyze diffusion properties along the fiber tracts. Currently, the field is in critical need of a coherent end-to-end toolset for performing an along-fiber tract analysis, accessible to non-technical neuroimaging researchers. The UNC-Utah NA-MIC DTI framework represents a coherent, open source, end-to-end toolset for atlas fiber tract based DTI analysis encompassing DICOM data conversion, quality control, atlas building, fiber tractography, fiber parameterization, and statistical analysis of diffusion properties. Most steps utilize graphical user interfaces (GUI) to simplify interaction and provide an extensive DTI analysis framework for non-technical researchers/investigators. We illustrate the use of our framework on a small sample, cross sectional neuroimaging study of eight healthy 1-year-old children from the Infant Brain Imaging Study (IBIS) Network. In this limited test study, we illustrate the power of our method by quantifying the diffusion properties at 1 year of age on the genu and splenium fiber tracts.
Optimized SIFTFlow for registration of whole-mount histology to reference optical images
Shojaii, Rushin; Martel, Anne L.
2016-01-01
The registration of two-dimensional histology images to reference images from other modalities is an important preprocessing step in the reconstruction of three-dimensional histology volumes. This is a challenging problem because of the differences in the appearances of histology images and other modalities, and the presence of large nonrigid deformations which occur during slide preparation. This paper shows the feasibility of using densely sampled scale-invariant feature transform (SIFT) features and a SIFTFlow deformable registration algorithm for coregistering whole-mount histology images with blockface optical images. We present a method for jointly optimizing the regularization parameters used by the SIFTFlow objective function and use it to determine the most appropriate values for the registration of breast lumpectomy specimens. We demonstrate that tuning the regularization parameters results in significant improvements in accuracy and we also show that SIFTFlow outperforms a previously described edge-based registration method. The accuracy of the histology-to-blockface registration using the optimized SIFTFlow method was assessed using an independent test set of images from five different lumpectomy specimens; the mean registration error was 0.32±0.22 mm. PMID:27774494
Evaluation of a deep learning architecture for MR imaging prediction of ATRX in glioma patients
NASA Astrophysics Data System (ADS)
Korfiatis, Panagiotis; Kline, Timothy L.; Erickson, Bradley J.
2018-02-01
Predicting mutation/loss of the alpha-thalassemia/mental retardation syndrome X-linked (ATRX) gene from MR imaging is of high importance since it is a predictor of response and prognosis in brain tumors. In this study, we compare a deep neural network approach based on a residual deep neural network (ResNet) architecture with a classical machine learning approach, and evaluate their ability to predict ATRX mutation status without the need for a distinct tumor segmentation step. We found that the ResNet50 (50 layers) architecture, pre-trained on ImageNet data, was the best performing model, achieving an accuracy of 0.91 in terms of f1 score on a test set of 35 cases (classification of a slice as no tumor, ATRX mutated, or ATRX not mutated). The SVM classifier achieved 0.63 for differentiating the FLAIR signal abnormality regions of the test patients based on their mutation status. We report a method that alleviates the need for extensive preprocessing and acts as a proof of concept that deep neural network architectures can be used to predict molecular biomarkers from routine medical images.
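The transfer-learning setup can be sketched with Keras; this is not the authors' exact model, and the input size, frozen backbone, optimizer and head below are our assumptions.

```python
import tensorflow as tf

# A minimal sketch: an ImageNet-pre-trained ResNet50 backbone with a new
# 3-way softmax head for per-slice labels (no tumor / mutated / not mutated).

base = tf.keras.applications.ResNet50(weights='imagenet', include_top=False,
                                      input_shape=(224, 224, 3), pooling='avg')
base.trainable = False                      # train only the new head first

model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(3, activation='softmax'),
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.summary()
# model.fit(slices, labels, ...) would follow with per-slice MR data.
```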
Automatic detection and quantitative analysis of cells in the mouse primary motor cortex
NASA Astrophysics Data System (ADS)
Meng, Yunlong; He, Yong; Wu, Jingpeng; Chen, Shangbin; Li, Anan; Gong, Hui
2014-09-01
Neuronal cells play a very important role in metabolism regulation and mechanism control, so cell number is a fundamental determinant of brain function. By combining suitable cell-labeling approaches with recently proposed three-dimensional optical imaging techniques, whole mouse brain coronal sections can be acquired with 1-μm voxel resolution. We have developed a fully automatic pipeline to detect cell centroids and provide three-dimensional quantitative information on cells in the primary motor cortex of the C57BL/6 mouse. It involves four principal steps: i) preprocessing; ii) image binarization; iii) cell centroid extraction and contour segmentation; iv) laminar density estimation. Investigations of the presented method reveal promising detection accuracy in terms of recall and precision, with an average recall rate of 92.1% and an average precision rate of 86.2%. We also analyze the laminar density distribution of cells from the pial surface to the corpus callosum from the output vectorizations of detected cell centroids in mouse primary motor cortex, and find significant variations in cellular density distribution across layers. This automatic cell centroid detection approach will be beneficial for fast cell counting and accurate density estimation, as time-consuming and error-prone manual identification is avoided.
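Steps ii-iv can be sketched with scikit-image on a synthetic section; Otsu binarization, connected-component centroids and a binned depth profile are generic stand-ins for the pipeline's specific algorithms.

```python
import numpy as np
from skimage.filters import threshold_otsu, gaussian
from skimage.measure import label, regionprops

# A minimal sketch: smooth, binarize with Otsu's threshold, extract cell
# centroids, then estimate a laminar density profile along the depth axis.

rng = np.random.default_rng(7)
img = rng.normal(0.1, 0.02, (256, 256))
for r, c in rng.integers(10, 246, (150, 2)):     # 150 synthetic "cells"
    img[r - 2:r + 3, c - 2:c + 3] += 1.0

smooth = gaussian(img, sigma=1)
binary = smooth > threshold_otsu(smooth)
centroids = np.array([p.centroid for p in regionprops(label(binary))])

# Laminar density: cell counts in 8 depth bins from the "pial surface" (row 0).
counts, _ = np.histogram(centroids[:, 0], bins=8, range=(0, 256))
print('cells detected:', len(centroids), ' per-layer counts:', counts)
```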
Kernel-based discriminant feature extraction using a representative dataset
NASA Astrophysics Data System (ADS)
Li, Honglin; Sancho Gomez, Jose-Luis; Ahalt, Stanley C.
2002-07-01
Discriminant Feature Extraction (DFE) is widely recognized as an important pre-processing step in classification applications. Most DFE algorithms are linear and thus can only explore the linear discriminant information among the different classes. Recently, there have been several promising attempts to develop nonlinear DFE algorithms, among which is Kernel-based Feature Extraction (KFE). The efficacy of KFE has been experimentally verified on both synthetic data and real problems. However, KFE has some known limitations. First, KFE does not work well for strongly overlapped data. Second, KFE employs all of the training set samples during the feature extraction phase, which can result in significant computation when applied to very large datasets. Finally, KFE can result in overfitting. In this paper, we propose a substantial improvement to KFE that overcomes the above limitations by using a representative dataset, which consists of critical points that are generated from data-editing techniques and centroid points that are determined by using the Frequency Sensitive Competitive Learning (FSCL) algorithm. Experiments show that this new KFE algorithm performs well on significantly overlapped datasets, and it also reduces computational complexity. Further, by controlling the number of centroids, the overfitting problem can be effectively alleviated.
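A minimal sketch of the representative-dataset idea: fit the kernel feature extractor on a small set of centroids rather than all training samples, shrinking the kernel matrix from n x n to n_rep x n_rep. Here KMeans stands in for FSCL, and the unsupervised KernelPCA stands in for the paper's supervised KFE; both swaps are simplifications.

```python
from sklearn.cluster import KMeans
from sklearn.decomposition import KernelPCA

def kernel_features_via_representatives(X, n_rep=50, n_components=5, seed=0):
    """Extract kernel features for all of X using only n_rep centroid
    points as the representative dataset."""
    km = KMeans(n_clusters=n_rep, n_init=10, random_state=seed).fit(X)
    kpca = KernelPCA(n_components=n_components, kernel="rbf")
    kpca.fit(km.cluster_centers_)   # kernel matrix built on representatives only
    return kpca.transform(X)        # features for the full dataset
```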
Erny, Guillaume L; Acunha, Tanize; Simó, Carolina; Cifuentes, Alejandro; Alves, Arminda
2017-04-07
Separation techniques hyphenated with high-resolution mass spectrometry have been a true revolution in analytical separation. Such instruments not only provide unmatched resolution, but they also allow measuring the accurate masses of peaks, which permits identifying monoisotopic formulae. However, data files can be large, with a major contribution from background noise and background ions. Such unnecessary contributions to the overall signal can hide important features as well as decrease the accuracy of centroid determination, especially for minor features. Thus, noise and baseline correction can be a valuable pre-processing step. The methodology described here, unlike any other approach, is used to correct the original dataset with the MS scans recorded as profile spectra. Using urine metabolic studies as examples, we demonstrate that this thorough correction reduces the data complexity by more than 90%. Such correction not only permits an improved visualisation of secondary peaks in the chromatographic domain, but it also facilitates the complete assignment of each MS scan, which is invaluable for detecting possible comigrating/coeluting species. Copyright © 2017 Elsevier Elsevier B.V. All rights reserved.
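To make the idea concrete, here is a generic numpy/scipy sketch of baseline and noise correction applied directly to one profile-mode scan; the rolling-minimum baseline, window size, and noise threshold are assumptions, not the authors' algorithm.

```python
import numpy as np
from scipy.ndimage import minimum_filter1d, uniform_filter1d

def correct_profile_scan(intensities, baseline_win=201, noise_factor=3.0):
    """Estimate a slowly varying baseline with a rolling minimum followed
    by smoothing, subtract it, then zero points below a noise threshold."""
    baseline = uniform_filter1d(
        minimum_filter1d(intensities, size=baseline_win), size=baseline_win)
    corrected = np.clip(intensities - baseline, 0.0, None)
    positive = corrected[corrected > 0]
    noise = np.median(positive) if positive.size else 0.0
    corrected[corrected < noise_factor * noise] = 0.0
    return corrected
```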
UNC-Utah NA-MIC framework for DTI fiber tract analysis
Verde, Audrey R.; Budin, Francois; Berger, Jean-Baptiste; Gupta, Aditya; Farzinfar, Mahshid; Kaiser, Adrien; Ahn, Mihye; Johnson, Hans; Matsui, Joy; Hazlett, Heather C.; Sharma, Anuja; Goodlett, Casey; Shi, Yundi; Gouttard, Sylvain; Vachet, Clement; Piven, Joseph; Zhu, Hongtu; Gerig, Guido; Styner, Martin
2014-01-01
Diffusion tensor imaging has become an important modality in the field of neuroimaging for capturing changes in micro-organization and assessing white matter integrity or development. While a number of tractography toolsets exist, they usually lack tools for preprocessing or for analyzing diffusion properties along the fiber tracts. Currently, the field is in critical need of a coherent end-to-end toolset for performing along-fiber tract analysis that is accessible to non-technical neuroimaging researchers. The UNC-Utah NA-MIC DTI framework represents a coherent, open source, end-to-end toolset for atlas-based DTI fiber tract analysis encompassing DICOM data conversion, quality control, atlas building, fiber tractography, fiber parameterization, and statistical analysis of diffusion properties. Most steps utilize graphical user interfaces (GUIs) to simplify interaction and provide an extensive DTI analysis framework for non-technical researchers/investigators. We illustrate the use of our framework on a small-sample, cross-sectional neuroimaging study of eight healthy 1-year-old children from the Infant Brain Imaging Study (IBIS) Network. In this limited test study, we illustrate the power of our method by quantifying the diffusion properties at 1 year of age on the genu and splenium fiber tracts. PMID:24409141
Dhir, Sunny; Walia, Yashika; Zaidi, A A; Hallan, Vipin
2015-03-01
A simple method to amplify infective, complete genomes of single-stranded RNA viruses by long distance PCR (LD PCR) from woody plant tissues is described in detail. The present protocol eliminates partial purification of viral particles, and amplification is achieved in three steps: (i) easy preparation of template RNA by incorporating a preprocessing step before loading onto the column; (ii) reverse transcription by AMV or Superscript reverse transcriptase; and (iii) amplification of cDNA by LD PCR using LA or Protoscript Taq DNA polymerase. Incorporation of a preprocessing step helped to isolate RNA of consistent quality from recalcitrant woody tissues such as apple, which was critical for efficient amplification of the complete genomes of Apple stem pitting virus (ASPV), Apple stem grooving virus (ASGV) and Apple chlorotic leaf spot virus (ACLSV). The complete genome of ASGV was cloned under the T7 RNA polymerase promoter and was confirmed to be infectious through transcript inoculation, producing symptoms similar to the wild type virus. This is the first report of the largest RNA virus genome amplified by PCR from total nucleic acid extracts of woody plant tissues. Copyright © 2014 Elsevier B.V. All rights reserved.
Tsapatsoulis, Nicolas; Loizou, Christos; Pattichis, Constantinos
2007-01-01
Efficient medical video transmission over 3G wireless is of great importance for fast diagnosis and on-site medical staff training purposes. In this paper we present a region-of-interest based ultrasound video compression study which shows that a significant reduction in the bit rate required for transmission can be achieved without altering the design of existing video codecs. Simple preprocessing of the original videos to define visually and clinically important areas is the only requirement.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hardy, David J., E-mail: dhardy@illinois.edu; Schulten, Klaus; Wolff, Matthew A.
2016-03-21
The multilevel summation method for calculating electrostatic interactions in molecular dynamics simulations constructs an approximation to a pairwise interaction kernel and its gradient, which can be evaluated at a cost that scales linearly with the number of atoms. The method smoothly splits the kernel into a sum of partial kernels of increasing range and decreasing variability with the longer-range parts interpolated from grids of increasing coarseness. Multilevel summation is especially appropriate in the context of dynamics and minimization, because it can produce continuous gradients. This article explores the use of B-splines to increase the accuracy of the multilevel summation method (for nonperiodic boundaries) without incurring additional computation other than a preprocessing step (whose cost also scales linearly). To obtain accurate results efficiently involves technical difficulties, which are overcome by a novel preprocessing algorithm. Numerical experiments demonstrate that the resulting method offers substantial improvements in accuracy and that its performance is competitive with an implementation of the fast multipole method in general and markedly better for Hamiltonian formulations of molecular dynamics. The improvement is great enough to establish multilevel summation as a serious contender for calculating pairwise interactions in molecular dynamics simulations. In particular, the method appears to be uniquely capable for molecular dynamics in two situations, nonperiodic boundary conditions and massively parallel computation, where the fast Fourier transform employed in the particle–mesh Ewald method falls short.
Hardy, David J; Wolff, Matthew A; Xia, Jianlin; Schulten, Klaus; Skeel, Robert D
2016-03-21
The multilevel summation method for calculating electrostatic interactions in molecular dynamics simulations constructs an approximation to a pairwise interaction kernel and its gradient, which can be evaluated at a cost that scales linearly with the number of atoms. The method smoothly splits the kernel into a sum of partial kernels of increasing range and decreasing variability with the longer-range parts interpolated from grids of increasing coarseness. Multilevel summation is especially appropriate in the context of dynamics and minimization, because it can produce continuous gradients. This article explores the use of B-splines to increase the accuracy of the multilevel summation method (for nonperiodic boundaries) without incurring additional computation other than a preprocessing step (whose cost also scales linearly). To obtain accurate results efficiently involves technical difficulties, which are overcome by a novel preprocessing algorithm. Numerical experiments demonstrate that the resulting method offers substantial improvements in accuracy and that its performance is competitive with an implementation of the fast multipole method in general and markedly better for Hamiltonian formulations of molecular dynamics. The improvement is great enough to establish multilevel summation as a serious contender for calculating pairwise interactions in molecular dynamics simulations. In particular, the method appears to be uniquely capable for molecular dynamics in two situations, nonperiodic boundary conditions and massively parallel computation, where the fast Fourier transform employed in the particle-mesh Ewald method falls short.
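To make the kernel-splitting idea concrete, here is a one-level numerical sketch in Python using the C^1 Taylor softening that is standard in the multilevel summation literature; the cutoff value and polynomial are illustrative, and this is not the article's B-spline construction.

```python
import numpy as np

def soften(r, a):
    """C^1 Taylor softening of 1/r below the cutoff a; equals 1/r exactly
    for r >= a, and matches 1/r in value and slope at r = a."""
    rho = r / a
    smooth = (15/8 - (5/4) * rho**2 + (3/8) * rho**4) / a
    return np.where(r < a, smooth, 1.0 / r)

a = 4.0                                   # illustrative cutoff
r = np.linspace(0.5, 10.0, 1000)
short_range = 1.0 / r - soften(r, a)      # exactly zero beyond the cutoff
long_range = soften(r, a)                 # slowly varying -> grid-interpolable
assert np.allclose(short_range[r >= a], 0.0)
```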
Autoreject: Automated artifact rejection for MEG and EEG data.
Jas, Mainak; Engemann, Denis A; Bekhti, Yousra; Raimondo, Federico; Gramfort, Alexandre
2017-10-01
We present an automated algorithm for unified rejection and repair of bad trials in magnetoencephalography (MEG) and electroencephalography (EEG) signals. Our method capitalizes on cross-validation in conjunction with a robust evaluation metric to estimate the optimal peak-to-peak threshold - a quantity commonly used for identifying bad trials in M/EEG. This approach is then extended to a more sophisticated algorithm which estimates this threshold for each sensor, yielding trial-wise bad sensors. Depending on the number of bad sensors, the trial is then repaired by interpolation or excluded from subsequent analysis. All steps of the algorithm are fully automated, thus lending the method its name, Autoreject. In order to assess the practical significance of the algorithm, we conducted extensive validation and comparisons with state-of-the-art methods on four public datasets containing MEG and EEG recordings from more than 200 subjects. The comparisons include purely qualitative efforts as well as quantitative benchmarking against human-supervised and semi-automated preprocessing pipelines. The algorithm allowed us to automate the preprocessing of MEG data from the Human Connectome Project (HCP) up to the computation of the evoked responses. The automated nature of our method minimizes the burden of human inspection, hence supporting the scalability and reliability demanded by data analysis in modern neuroscience. Copyright © 2017 Elsevier Inc. All rights reserved.
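A minimal numpy sketch of the cross-validated global-threshold idea described above; the candidate grid, fold scheme, and the use of the median of validation trials as the robust comparison target are assumptions patterned on the abstract, not the published implementation.

```python
import numpy as np

def cv_peak_to_peak_threshold(trials, candidates, n_folds=5, seed=0):
    """trials: (n_trials, n_times). For each candidate threshold, reject
    training trials whose peak-to-peak amplitude exceeds it, and score the
    RMSE between the mean of kept trials and the median of validation
    trials; return the threshold with the lowest mean error."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(trials)), n_folds)
    ptp = np.ptp(trials, axis=1)
    best_thresh, best_err = None, np.inf
    for thresh in candidates:
        errs = []
        for k in range(n_folds):
            train = np.concatenate([f for j, f in enumerate(folds) if j != k])
            kept = train[ptp[train] < thresh]
            if kept.size == 0:
                errs.append(np.inf)           # threshold rejects everything
                continue
            evoked = trials[kept].mean(axis=0)
            target = np.median(trials[folds[k]], axis=0)
            errs.append(np.sqrt(np.mean((evoked - target) ** 2)))
        if np.mean(errs) < best_err:
            best_thresh, best_err = thresh, np.mean(errs)
    return best_thresh
```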
Nunez, Michael D.; Vandekerckhove, Joachim; Srinivasan, Ramesh
2016-01-01
Perceptual decision making can be accounted for by drift-diffusion models, a class of decision-making models that assume a stochastic accumulation of evidence on each trial. Fitting response time and accuracy to a drift-diffusion model produces evidence accumulation rate and non-decision time parameter estimates that reflect cognitive processes. Our goal is to elucidate the effect of attention on visual decision making. In this study, we show that measures of attention obtained from simultaneous EEG recordings can explain per-trial evidence accumulation rates and perceptual preprocessing times during a visual decision making task. Models assuming linear relationships between diffusion model parameters and EEG measures as external inputs were fit in a single step in a hierarchical Bayesian framework. The EEG measures were features of the evoked potential (EP) to the onset of a masking noise and the onset of a task-relevant signal stimulus. Single-trial evoked EEG responses, P200s to the onsets of visual noise and N200s to the onsets of visual signal, explain single-trial evidence accumulation and preprocessing times. Within-trial evidence accumulation variance was not found to be influenced by attention to the signal or noise. Single-trial measures of attention lead to better out-of-sample predictions of accuracy and correct reaction time distributions for individual subjects. PMID:28435173
Nunez, Michael D; Vandekerckhove, Joachim; Srinivasan, Ramesh
2017-02-01
Perceptual decision making can be accounted for by drift-diffusion models, a class of decision-making models that assume a stochastic accumulation of evidence on each trial. Fitting response time and accuracy to a drift-diffusion model produces evidence accumulation rate and non-decision time parameter estimates that reflect cognitive processes. Our goal is to elucidate the effect of attention on visual decision making. In this study, we show that measures of attention obtained from simultaneous EEG recordings can explain per-trial evidence accumulation rates and perceptual preprocessing times during a visual decision making task. Models assuming linear relationships between diffusion model parameters and EEG measures as external inputs were fit in a single step in a hierarchical Bayesian framework. The EEG measures were features of the evoked potential (EP) to the onset of a masking noise and the onset of a task-relevant signal stimulus. Single-trial evoked EEG responses, P200s to the onsets of visual noise and N200s to the onsets of visual signal, explain single-trial evidence accumulation and preprocessing times. Within-trial evidence accumulation variance was not found to be influenced by attention to the signal or noise. Single-trial measures of attention lead to better out-of-sample predictions of accuracy and correct reaction time distributions for individual subjects.
NASA Astrophysics Data System (ADS)
Hardy, David J.; Wolff, Matthew A.; Xia, Jianlin; Schulten, Klaus; Skeel, Robert D.
2016-03-01
The multilevel summation method for calculating electrostatic interactions in molecular dynamics simulations constructs an approximation to a pairwise interaction kernel and its gradient, which can be evaluated at a cost that scales linearly with the number of atoms. The method smoothly splits the kernel into a sum of partial kernels of increasing range and decreasing variability with the longer-range parts interpolated from grids of increasing coarseness. Multilevel summation is especially appropriate in the context of dynamics and minimization, because it can produce continuous gradients. This article explores the use of B-splines to increase the accuracy of the multilevel summation method (for nonperiodic boundaries) without incurring additional computation other than a preprocessing step (whose cost also scales linearly). To obtain accurate results efficiently involves technical difficulties, which are overcome by a novel preprocessing algorithm. Numerical experiments demonstrate that the resulting method offers substantial improvements in accuracy and that its performance is competitive with an implementation of the fast multipole method in general and markedly better for Hamiltonian formulations of molecular dynamics. The improvement is great enough to establish multilevel summation as a serious contender for calculating pairwise interactions in molecular dynamics simulations. In particular, the method appears to be uniquely capable for molecular dynamics in two situations, nonperiodic boundary conditions and massively parallel computation, where the fast Fourier transform employed in the particle-mesh Ewald method falls short.
An effective and efficient compression algorithm for ECG signals with irregular periods.
Chou, Hsiao-Hsuan; Chen, Ying-Jui; Shiau, Yu-Chien; Kuo, Te-Son
2006-06-01
This paper presents an effective and efficient preprocessing algorithm for two-dimensional (2-D) electrocardiogram (ECG) compression to better compress irregular ECG signals by exploiting their inter- and intra-beat correlations. To better reveal the correlation structure, we first convert the ECG signal into a proper 2-D representation, or image. This involves a few steps including QRS detection and alignment, period sorting, and length equalization. The resulting 2-D ECG representation is then ready to be compressed by an appropriate image compression algorithm. We choose the state-of-the-art JPEG2000 for its high efficiency and flexibility. In this way, the proposed algorithm is shown to outperform existing methods in the literature by simultaneously achieving high compression ratio (CR), low percent root mean squared difference (PRD), low maximum error (MaxErr), and low standard deviation of errors (StdErr). In particular, because the proposed period sorting method rearranges the detected heartbeats into a smoother image that is easier to compress, this algorithm is insensitive to irregular ECG periods. Thus either irregular ECG signals or QRS false-detection cases can be better compressed. This is a significant improvement over existing 2-D ECG compression methods. Moreover, this algorithm is not tied exclusively to JPEG2000. It can also be combined with other 2-D preprocessing methods or appropriate codecs to enhance the compression performance in irregular ECG cases.
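A rough Python sketch of the 2-D conversion steps named above (QRS detection, beat segmentation, period sorting, length equalization); the peak-detection parameters and common beat length are illustrative, and the codec stage (JPEG2000) is left out.

```python
import numpy as np
from scipy.signal import find_peaks, resample

def ecg_to_image(ecg, fs, beat_len=256):
    """Cut the ECG at detected R peaks, sort beats by period so adjacent
    rows are similar, and resample each beat to a common length, yielding
    a smooth 2-D array ready for an image codec."""
    peaks, _ = find_peaks(ecg, distance=int(0.4 * fs),
                          height=np.percentile(ecg, 95))
    beats = [ecg[peaks[i]:peaks[i + 1]] for i in range(len(peaks) - 1)]
    beats.sort(key=len)                              # period sorting
    return np.vstack([resample(b, beat_len) for b in beats])
```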
Lung Cancer Survival Prediction using Ensemble Data Mining on Seer Data
Agrawal, Ankit; Misra, Sanchit; Narayanan, Ramanathan; ...
2012-01-01
We analyze the lung cancer data available from the SEER program with the aim of developing accurate survival prediction models for lung cancer. Carefully designed preprocessing steps resulted in removal/modification/splitting of several attributes, and 2 of the 11 derived attributes were found to have significant predictive power. Several supervised classification methods were used on the preprocessed data along with various data mining optimizations and validations. In our experiments, ensemble voting of five decision tree based classifiers and meta-classifiers was found to result in the best prediction performance in terms of accuracy and area under the ROC curve. We have developed an on-line lung cancer outcome calculator for estimating the risk of mortality after 6 months, 9 months, 1 year, 2 years and 5 years of diagnosis, for which a smaller non-redundant subset of 13 attributes was carefully selected using attribute selection techniques, while trying to retain the predictive power of the original set of attributes. Further, ensemble voting models were also created for predicting conditional survival outcome for lung cancer (estimating risk of mortality after 5 years of diagnosis, given that the patient has already survived for a period of time), and included in the calculator. The on-line lung cancer outcome calculator developed as a result of this study is available at http://info.eecs.northwestern.edu:8080/LungCancerOutcomeCalculator/.
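As a hedged illustration of "ensemble voting of five decision tree based classifiers", here is a scikit-learn sketch; the five member models and their hyperparameters are assumptions, since the abstract does not name them.

```python
from sklearn.ensemble import (AdaBoostClassifier, ExtraTreesClassifier,
                              GradientBoostingClassifier,
                              RandomForestClassifier, VotingClassifier)
from sklearn.tree import DecisionTreeClassifier

# Soft-voting ensemble of five tree-based classifiers (illustrative members).
ensemble = VotingClassifier(
    estimators=[
        ("tree", DecisionTreeClassifier(max_depth=8)),
        ("rf", RandomForestClassifier(n_estimators=200)),
        ("et", ExtraTreesClassifier(n_estimators=200)),
        ("gb", GradientBoostingClassifier()),
        ("ada", AdaBoostClassifier()),
    ],
    voting="soft",
)
# Usage: ensemble.fit(X_train, y_train); ensemble.predict_proba(X_test)
```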
NASA Astrophysics Data System (ADS)
Aviles, Angelica I.; Alsaleh, Samar; Sobrevilla, Pilar; Casals, Alicia
2016-03-01
The robotic-assisted surgery approach overcomes the limitations of traditional laparoscopic and open surgeries. However, one of its major limitations is the lack of force feedback. Since there is no direct interaction between the surgeon and the tissue, there is no way of knowing how much force the surgeon is applying, which can result in irreversible injuries. The use of force sensors is not practical since they impose different constraints. Thus, we make use of a neuro-visual approach to estimate the applied forces, in which the 3D shape recovery together with the geometry of motion are used as input to a deep network based on an LSTM-RNN architecture. When deep networks are used in real time, pre-processing of data is a key factor in reducing complexity and improving network performance. A common pre-processing step is dimensionality reduction, which attempts to eliminate redundant and insignificant information by selecting a subset of relevant features to use in model construction. In this work, we show the effects of dimensionality reduction in a real-time application: estimating the applied force in robotic-assisted surgery. According to the results, dimensionality reduction has positive effects on deep networks, including faster training, improved network performance, and overfitting prevention. We also show a significant accuracy improvement, ranging from about 33% to 86%, over existing approaches to force estimation.
PCA-based artifact removal algorithm for stroke detection using UWB radar imaging.
Ricci, Elisa; di Domenico, Simone; Cianca, Ernestina; Rossi, Tommaso; Diomedi, Marina
2017-06-01
Stroke patients should be dispatched to the highest level of care available in the shortest time. In this context, a transportable system in specialized ambulances, able to evaluate the presence of an acute brain lesion in a short time interval (i.e., a few minutes), could shorten the delay of treatment. UWB radar imaging is an emerging diagnostic branch that has great potential for the implementation of a transportable and low-cost device. Transportability, low cost and short response time pose challenges to the signal processing algorithms for the backscattered signals, as they should guarantee good performance with a reasonably low number of antennas and low computational complexity, the latter tightly related to the response time of the device. The paper shows that a PCA-based preprocessing algorithm can: (1) achieve good performance already with a computationally simple beamforming algorithm; (2) outperform state-of-the-art preprocessing algorithms; (3) enable a further improvement in performance (and/or a decrease in the number of antennas) by using a multistatic approach with just a modest increase in computational complexity. This is an important result toward the implementation of such a diagnostic device, which could play an important role in emergency scenarios.
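The following numpy sketch shows one common form of PCA-based preprocessing for multichannel radar data: projecting out the dominant principal components, which tend to carry clutter shared across antennas. The number of removed components is illustrative, and this is not necessarily the paper's exact algorithm.

```python
import numpy as np

def remove_clutter(signals, n_remove=2):
    """signals: (n_antennas, n_samples). Subtract the average trace, then
    zero out the first n_remove principal components, which capture
    clutter common across channels, keeping weaker scattering of interest."""
    centered = signals - signals.mean(axis=0, keepdims=True)
    U, s, Vt = np.linalg.svd(centered, full_matrices=False)
    s[:n_remove] = 0.0                     # drop dominant clutter modes
    return (U * s) @ Vt
```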
Frequency domain analysis of errors in cross-correlations of ambient seismic noise
NASA Astrophysics Data System (ADS)
Liu, Xin; Ben-Zion, Yehuda; Zigone, Dimitri
2016-12-01
We analyse random errors (variances) in cross-correlations of ambient seismic noise in the frequency domain, which differs from previous time domain methods. Extending previous theoretical results on the ensemble-averaged cross-spectrum, we estimate the confidence interval of the stacked cross-spectrum of a finite amount of data at each frequency, using non-overlapping windows of fixed length. The extended theory also connects amplitude and phase variances with the variance of each complex spectrum value. Analysis of synthetic stationary ambient noise is used to estimate the confidence interval of the stacked cross-spectrum obtained with different lengths of noise data corresponding to different numbers of evenly spaced windows of the same duration. This method allows estimating the signal-to-noise ratio (SNR) of the noise cross-correlation in the frequency domain, without specifying the filter bandwidth or signal/noise windows that are needed for time domain SNR estimation. Based on synthetic ambient noise data, we also compare the probability distributions, causal part amplitude, and SNR of the stacked cross-spectrum function using one-bit normalization or pre-whitening with those obtained without these pre-processing steps. Natural continuous noise records contain both ambient noise and small earthquakes that are inseparable from the noise with the existing pre-processing steps. Using probability distributions of random cross-spectrum values based on the theoretical results provides an effective way to exclude such small earthquakes, and additional data segments (outliers) contaminated by signals of different statistics (e.g. rain, cultural noise), from continuous noise waveforms. This technique is applied to constrain values and uncertainties of the amplitude and phase velocity of the stacked noise cross-spectrum at different frequencies, using data from southern California at both regional scale (~35 km) and a dense linear array (~20 m) across the plate-boundary faults. A block bootstrap resampling method is used to account for temporal correlation of the noise cross-spectrum at low frequencies (0.05-0.2 Hz) near the ocean microseismic peaks.
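A small numpy sketch of the windowed cross-spectrum stacking described above: the cross-spectrum is averaged over non-overlapping windows, and its spread across windows gives a frequency-domain SNR estimate. The Hann taper and standard-error form are assumptions, not the authors' exact estimator.

```python
import numpy as np

def stacked_cross_spectrum(u, v, fs, win_len):
    """Average the cross-spectrum of records u and v over non-overlapping
    windows of win_len samples; return frequencies, the stacked
    cross-spectrum, and |mean| over its standard error as an SNR proxy."""
    n_win = len(u) // win_len
    segs_u = u[:n_win * win_len].reshape(n_win, win_len)
    segs_v = v[:n_win * win_len].reshape(n_win, win_len)
    taper = np.hanning(win_len)
    U = np.fft.rfft(segs_u * taper, axis=1)
    V = np.fft.rfft(segs_v * taper, axis=1)
    cross = U * np.conj(V)                     # per-window cross-spectra
    mean = cross.mean(axis=0)
    stderr = cross.std(axis=0) / np.sqrt(n_win)
    freqs = np.fft.rfftfreq(win_len, d=1.0 / fs)
    return freqs, mean, np.abs(mean) / stderr
```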
Facilitating access to pre-processed research evidence in public health
2010-01-01
Background: Evidence-informed decision making is accepted in Canada and worldwide as necessary for the provision of effective health services. This process involves: 1) clearly articulating a practice-based issue; 2) searching for and accessing relevant evidence; 3) appraising methodological rigor and choosing the most synthesized evidence of the highest quality and relevance to the practice issue and setting that is available; and 4) extracting, interpreting, and translating knowledge, in light of the local context and resources, into practice, program and policy decisions. While the public health sector in Canada is working toward evidence-informed decision making, considerable barriers, including efficient access to synthesized resources, exist. Methods: In this paper we map, to a previously developed 6-level pyramid of pre-processed research evidence, relevant resources that include public health-related effectiveness evidence. The resources were identified through extensive searches of both the published and unpublished domains. Results: Many resources with public health-related evidence were identified. While there were very few resources dedicated solely to public health evidence, many clinically focused resources include public health-related evidence, making tools such as the pyramid, which identify these resources, particularly helpful for public health decision makers. A practical example illustrates the application of this model and highlights its potential to reduce the time and effort that would be required by public health decision makers to address their practice-based issues. Conclusions: This paper describes an existing hierarchy of pre-processed evidence and its adaptation to the public health setting. A number of resources with public health-relevant content that are either freely accessible or require a subscription are identified. This will facilitate easier and faster access to pre-processed, public health-relevant evidence, with the intent of promoting evidence-informed decision making. Access to such resources addresses several barriers identified by public health decision makers to evidence-informed decision making, most importantly time, as well as lack of knowledge of resources that house public health-relevant evidence. PMID:20181270
The Minimal Preprocessing Pipelines for the Human Connectome Project
Glasser, Matthew F.; Sotiropoulos, Stamatios N; Wilson, J Anthony; Coalson, Timothy S; Fischl, Bruce; Andersson, Jesper L; Xu, Junqian; Jbabdi, Saad; Webster, Matthew; Polimeni, Jonathan R; Van Essen, David C; Jenkinson, Mark
2013-01-01
The Human Connectome Project (HCP) faces the challenging task of bringing multiple magnetic resonance imaging (MRI) modalities together in a common automated preprocessing framework across a large cohort of subjects. The MRI data acquired by the HCP differ in many ways from data acquired on conventional 3 Tesla scanners and often require newly developed preprocessing methods. We describe the minimal preprocessing pipelines for structural, functional, and diffusion MRI that were developed by the HCP to accomplish many low level tasks, including spatial artifact/distortion removal, surface generation, cross-modal registration, and alignment to standard space. These pipelines are specially designed to capitalize on the high quality data offered by the HCP. The final standard space makes use of a recently introduced CIFTI file format and the associated grayordinates spatial coordinate system. This allows for combined cortical surface and subcortical volume analyses while reducing the storage and processing requirements for high spatial and temporal resolution data. Here, we provide the minimum image acquisition requirements for the HCP minimal preprocessing pipelines and additional advice for investigators interested in replicating the HCP’s acquisition protocols or using these pipelines. Finally, we discuss some potential future improvements for the pipelines. PMID:23668970
Satterthwaite, Theodore D.; Elliott, Mark A.; Gerraty, Raphael T.; Ruparel, Kosha; Loughead, James; Calkins, Monica E.; Eickhoff, Simon B.; Hakonarson, Hakon; Gur, Ruben C.; Gur, Raquel E.; Wolf, Daniel H.
2013-01-01
Several recent reports in large, independent samples have demonstrated the influence of motion artifact on resting-state functional connectivity MRI (rsfc-MRI). Standard rsfc-MRI preprocessing typically includes regression of confounding signals and band-pass filtering. However, substantial heterogeneity exists in how these techniques are implemented across studies, and no prior study has examined the effect of differing approaches for the control of motion-induced artifacts. To better understand how in-scanner head motion affects rsfc-MRI data, we describe the spatial, temporal, and spectral characteristics of motion artifacts in a sample of 348 adolescents. Analyses utilize a novel approach for describing head motion on a voxelwise basis. Next, we systematically evaluate the efficacy of a range of confound regression and filtering techniques for the control of motion-induced artifacts. Results reveal that the effectiveness of preprocessing procedures on the control of motion is heterogeneous, and that improved preprocessing provides a substantial benefit beyond typical procedures. These results demonstrate that the effect of motion on rsfc-MRI can be substantially attenuated through improved preprocessing procedures, but not completely removed. PMID:22926292
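As a toy illustration of confound regression followed by band-pass filtering, two of the rsfc-MRI preprocessing choices discussed above, consider this sketch; the filter order, pass band, and regressor set are illustrative, not a recommended pipeline.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def clean_voxel_timeseries(ts, confounds, tr=2.0, band=(0.01, 0.1)):
    """Regress motion (or other) confounds out of one voxel's time series
    via least squares, then band-pass filter the residuals."""
    X = np.column_stack([np.ones(len(ts)), confounds])   # add an intercept
    beta, *_ = np.linalg.lstsq(X, ts, rcond=None)
    resid = ts - X @ beta
    b, a = butter(2, band, btype="band", fs=1.0 / tr)    # sampling rate 1/TR
    return filtfilt(b, a, resid)
```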
Brain Computer Interfaces, a Review
Nicolas-Alonso, Luis Fernando; Gomez-Gil, Jaime
2012-01-01
A brain-computer interface (BCI) is a hardware and software communications system that permits cerebral activity alone to control computers or external devices. The immediate goal of BCI research is to provide communications capabilities to severely disabled people who are totally paralyzed or ‘locked in’ by neurological neuromuscular disorders, such as amyotrophic lateral sclerosis, brain stem stroke, or spinal cord injury. Here, we review the state-of-the-art of BCIs, looking at the different steps that form a standard BCI: signal acquisition, preprocessing or signal enhancement, feature extraction, classification and the control interface. We discuss their advantages, drawbacks, and latest advances, and we survey the numerous technologies reported in the scientific literature to design each step of a BCI. First, the review examines the neuroimaging modalities used in the signal acquisition step, each of which monitors a different functional brain activity such as electrical, magnetic or metabolic activity. Second, the review discusses different electrophysiological control signals that determine user intentions, which can be detected in brain activity. Third, the review includes some techniques used in the signal enhancement step to deal with the artifacts in the control signals and improve the performance. Fourth, the review studies some mathematic algorithms used in the feature extraction and classification steps which translate the information in the control signals into commands that operate a computer or other device. Finally, the review provides an overview of various BCI applications that control a range of devices. PMID:22438708
NASA Technical Reports Server (NTRS)
Crane, Robert K.; Wang, Xuhe; Westenhaver, David
1996-01-01
The preprocessing software manual describes the Actspp program, originally developed to observe and diagnose Advanced Communications Technology Satellite (ACTS) propagation terminal/receiver problems. However, it has been quite useful for automating the preprocessing functions needed to convert the terminal output to useful attenuation estimates. Prior to having data acceptable for archival functions, the individual receiver system must be calibrated and the power level shifts caused by ranging tone modulation must be removed. Actspp provides three output files: the daylog, the diurnal coefficient file, and the file that contains calibration information.
Automatic segmentation of tumor-laden lung volumes from the LIDC database
NASA Astrophysics Data System (ADS)
O'Dell, Walter G.
2012-03-01
The segmentation of the lung parenchyma is often a critical pre-processing step prior to application of computer-aided detection of lung nodules. Segmentation of the lung volume can dramatically decrease computation time and reduce the number of false positive detections by excluding extra-pulmonary tissue from consideration. However, while many algorithms are capable of adequately segmenting the healthy lung, none have been demonstrated to work reliably well on tumor-laden lungs. Of particular challenge is to preserve tumorous masses attached to the chest wall, mediastinum or major vessels. In this role, lung volume segmentation comprises an important computational step that can adversely affect the performance of the overall CAD algorithm. An automated lung volume segmentation algorithm has been developed with the goals of maximally excluding extra-pulmonary tissue while retaining all true nodules. The algorithm comprises a series of tasks including intensity thresholding, 2-D and 3-D morphological operations, 2-D and 3-D floodfilling, and snake-based clipping of nodules attached to the chest wall. It features the ability to (1) exclude trachea and bowels, (2) snip large attached nodules using snakes, (3) snip small attached nodules using dilation, (4) preserve large masses fully internal to the lung volume, (5) account for basal aspects of the lung where in a 2-D slice the lower sections appear to be disconnected from the main lung, and (6) achieve separation of the right and left hemi-lungs. The algorithm was developed and trained on the first 100 datasets of the LIDC image database.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jeon, Chang Ho; Kim, Bohyoung; Gu, Bon Seung
2013-10-15
Purpose: To modify the previously proposed preprocessing technique for improving the compressibility of computed tomography (CT) images so that it covers the diversity of three-dimensional configurations of different body parts, and to evaluate the robustness of the technique in terms of segmentation correctness and increase in reversible compression ratio (CR) for various CT examinations. Methods: This study had institutional review board approval with waiver of informed patient consent. A preprocessing technique was previously proposed to improve the compressibility of CT images by replacing pixel values outside the body region with a constant value, thereby maximizing data redundancy. Since the technique was developed aiming only at chest CT images, the authors modified the segmentation method to cover the diversity of three-dimensional configurations of different body parts. The modified version was evaluated as follows. In 368 randomly selected CT examinations (352,787 images), each image was preprocessed using the modified preprocessing technique. Radiologists visually confirmed whether the segmented region covered the body region or not. The images with and without the preprocessing were reversibly compressed using Joint Photographic Experts Group (JPEG), JPEG2000 two-dimensional (2D), and JPEG2000 three-dimensional (3D) compression. The percentage increase in CR per examination (CR_I) was measured. Results: The rate of correct segmentation was 100.0% (95% CI: 99.9%, 100.0%) for all the examinations. The median CR_I was 26.1% (95% CI: 24.9%, 27.1%), 40.2% (38.5%, 41.1%), and 34.5% (32.7%, 36.2%) for JPEG, JPEG2000 2D, and JPEG2000 3D, respectively. Conclusions: In various CT examinations, the modified preprocessing technique can increase the CR by 25% or more without concern about degradation of diagnostic information.
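A simplified sketch of the core idea: segment the body region and replace everything outside it with a constant so that lossless codecs see maximal redundancy. The HU threshold, hole filling, and fill value are assumptions, and the paper's modified segmentation is certainly more elaborate.

```python
import numpy as np
from scipy import ndimage

def mask_body(ct_slice, hu_threshold=-500, fill_value=-1000):
    """Keep the largest connected component above a HU threshold as the
    body region (with holes filled) and set all other pixels to a
    constant, maximizing background redundancy for reversible codecs."""
    binary = ct_slice > hu_threshold
    labels, n = ndimage.label(binary)
    if n == 0:
        return np.full_like(ct_slice, fill_value)
    sizes = ndimage.sum(binary, labels, index=range(1, n + 1))
    body = labels == (1 + int(np.argmax(sizes)))
    body = ndimage.binary_fill_holes(body)
    out = ct_slice.copy()
    out[~body] = fill_value
    return out
```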
Real-time topic-aware influence maximization using preprocessing.
Chen, Wei; Lin, Tian; Yang, Cheng
2016-01-01
Influence maximization is the task of finding a set of seed nodes in a social network such that the influence spread of these seed nodes based on a certain influence diffusion model is maximized. Topic-aware influence diffusion models have been recently proposed to address the issue that influence between a pair of users is often topic-dependent, and that the information, ideas, innovations, etc. being propagated in networks are typically mixtures of topics. In this paper, we focus on the topic-aware influence maximization task. In particular, we study preprocessing methods to avoid redoing influence maximization for each topic mixture from scratch. We explore two preprocessing algorithms with theoretical justifications. Our empirical results on data obtained in a couple of existing studies demonstrate that one of our algorithms stands out as a strong candidate, providing microsecond online response time and competitive influence spread with reasonable preprocessing effort.
Comparative performance evaluation of transform coding in image pre-processing
NASA Astrophysics Data System (ADS)
Menon, Vignesh V.; NB, Harikrishnan; Narayanan, Gayathri; CK, Niveditha
2017-07-01
We are in the midst of a communication transformation which drives the development as well as the dissemination of pioneering communication systems with ever-increasing fidelity and resolution. Considerable research has been devoted to image processing techniques, driven by a growing demand for faster and easier encoding, storage and transmission of visual information. In this paper, the researchers intend to throw light on techniques which could be used at the transmitter end to ease the transmission and reconstruction of images. The researchers investigate the performance of different image transform coding schemes used in pre-processing, their comparison and effectiveness, the necessary and sufficient conditions, and their properties and implementation complexity. Motivated by prior advancements in image processing techniques, the researchers compare the performance of various contemporary image pre-processing frameworks: Compressed Sensing, Singular Value Decomposition, and the Integer Wavelet Transform. The paper exposes the potential of the Integer Wavelet Transform to be an efficient pre-processing scheme.
Effect of microaerobic fermentation in preprocessing fibrous lignocellulosic materials.
Alattar, Manar Arica; Green, Terrence R; Henry, Jordan; Gulca, Vitalie; Tizazu, Mikias; Bergstrom, Robby; Popa, Radu
2012-06-01
Amending soil with organic matter is common in agricultural and logging practices. Such amendments have benefits to soil fertility and crop yields. These benefits may be increased if material is preprocessed before introduction into soil. We analyzed the efficiency of microaerobic fermentation (MF), also referred to as Bokashi, in preprocessing fibrous lignocellulosic (FLC) organic materials using varying produce amendments and leachate treatments. Adding produce amendments increased leachate production and fermentation rates and decreased the biological oxygen demand of the leachate. Continuously draining leachate without returning it to the fermentors led to acidification and decreased concentrations of polysaccharides (PS) in leachates. PS fragmentation and the production of soluble metabolites and gases stabilized in fermentors in about 2-4 weeks. About 2% of the carbon content was lost as CO2. PS degradation rates, upon introduction of processed materials into soil, were similar to unfermented FLC. Our results indicate that MF is insufficient for adequate preprocessing of FLC material.
Multi-Scale Distributed Representation for Deep Learning and its Application to b-Jet Tagging
NASA Astrophysics Data System (ADS)
Lee, Jason Sang Hun; Park, Inkyu; Park, Sangnam
2018-06-01
Recently, machine learning algorithms based on deep layered artificial neural networks (DNNs) have been applied to a wide variety of high energy physics problems, such as jet tagging or event classification. We explore a simple but effective preprocessing step which transforms each real-valued observational quantity or input feature into a binary number with a fixed number of digits. Each binary digit represents the quantity or magnitude at a different scale. We have shown that this approach improves the performance of DNNs significantly for some specific tasks without any further complication in feature engineering. We apply this multi-scale distributed binary representation to deep learning on b-jet tagging using the daughter particles' momenta and vertex information.
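A minimal sketch of the multi-scale binary preprocessing described above: each real-valued feature is quantized to a fixed number of binary digits and every digit becomes its own input feature. The quantization range and bit count are illustrative.

```python
import numpy as np

def to_binary_digits(x, n_bits=8, lo=0.0, hi=1.0):
    """Quantize each real-valued feature in x to n_bits levels and expose
    every binary digit as its own feature; each digit encodes the quantity
    at a different scale. Output shape: x.shape + (n_bits,)."""
    q = np.clip((np.asarray(x, dtype=float) - lo) / (hi - lo), 0.0, 1.0)
    levels = np.round(q * (2 ** n_bits - 1)).astype(np.int64)
    shifts = np.arange(n_bits - 1, -1, -1)      # most significant bit first
    return ((levels[..., None] >> shifts) & 1).astype(np.float32)
```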
Shatokhina, Iuliia; Obereder, Andreas; Rosensteiner, Matthias; Ramlau, Ronny
2013-04-20
We present a fast method for wavefront reconstruction from pyramid wavefront sensor (P-WFS) measurements. The method is based on an analytical relation between pyramid and Shack-Hartmann sensor (SH-WFS) data. The algorithm consists of two steps: a transformation of the P-WFS data to SH data, followed by the application of the cumulative reconstructor with domain decomposition, a wavefront reconstructor from SH-WFS measurements. Closed loop simulations confirm that our method provides the same quality as the standard matrix-vector multiplication method. A complexity analysis as well as speed tests confirm that the method is very fast. Thus, the method can be used on extremely large telescopes, e.g., for eXtreme adaptive optics systems.
High precision mass measurements for wine metabolomics
Roullier-Gall, Chloé; Witting, Michael; Gougeon, Régis D.; Schmitt-Kopplin, Philippe
2014-01-01
An overview of the critical steps for the non-targeted Ultra-High Performance Liquid Chromatography coupled with Quadrupole Time-of-Flight Mass Spectrometry (UPLC-Q-ToF-MS) analysis of wine chemistry is given, ranging from the study design, data preprocessing and statistical analyses, to markers identification. UPLC-Q-ToF-MS data was enhanced by the alignment of exact mass data from FTICR-MS, and marker peaks were identified using UPLC-Q-ToF-MS2. In combination with multivariate statistical tools and the annotation of peaks with metabolites from relevant databases, this analytical process provides a fine description of the chemical complexity of wines, as exemplified in the case of red (Pinot noir) and white (Chardonnay) wines from various geographic origins in Burgundy. PMID:25431760
fMRI paradigm designing and post-processing tools
James, Jija S; Rajesh, PG; Chandran, Anuvitha VS; Kesavadas, Chandrasekharan
2014-01-01
In this article, we first review some aspects of functional magnetic resonance imaging (fMRI) paradigm designing for major cognitive functions by using stimulus delivery systems like Cogent, E-Prime, Presentation, etc., along with their technical aspects. We also review the stimulus presentation possibilities (block, event-related) for visual or auditory paradigms and their advantages in both clinical and research settings. The second part mainly focuses on various fMRI data post-processing tools such as Statistical Parametric Mapping (SPM) and Brain Voyager, and discusses the particulars of the various preprocessing steps involved (realignment, co-registration, normalization, smoothing) in this software, as well as the statistical analysis principles of General Linear Modeling for the final interpretation of a functional activation result. PMID:24851001
Automatic Microaneurysms Detection Based on Multifeature Fusion Dictionary Learning
Wang, Zhenzhu; Du, Wenyou
2017-01-01
Recently, microaneurysm (MA) detection has attracted a lot of attention in the medical image processing community. Since MAs can be seen as the earliest lesions in diabetic retinopathy, their detection plays a critical role in diabetic retinopathy diagnosis. In this paper, we propose a novel MA detection approach named multifeature fusion dictionary learning (MFFDL). The proposed method consists of four steps: preprocessing, candidate extraction, multifeature dictionary learning, and classification. The novelty of our proposed approach lies in incorporating the semantic relationships among multifeatures and dictionary learning into a unified framework for automatic detection of MAs. We evaluate the proposed algorithm by comparing it with the state-of-the-art approaches and the experimental results validate the effectiveness of our algorithm. PMID:28421125
Automatic Microaneurysms Detection Based on Multifeature Fusion Dictionary Learning.
Zhou, Wei; Wu, Chengdong; Chen, Dali; Wang, Zhenzhu; Yi, Yugen; Du, Wenyou
2017-01-01
Recently, microaneurysm (MA) detection has attracted a lot of attention in the medical image processing community. Since MAs can be seen as the earliest lesions in diabetic retinopathy, their detection plays a critical role in diabetic retinopathy diagnosis. In this paper, we propose a novel MA detection approach named multifeature fusion dictionary learning (MFFDL). The proposed method consists of four steps: preprocessing, candidate extraction, multifeature dictionary learning, and classification. The novelty of our proposed approach lies in incorporating the semantic relationships among multifeatures and dictionary learning into a unified framework for automatic detection of MAs. We evaluate the proposed algorithm by comparing it with the state-of-the-art approaches and the experimental results validate the effectiveness of our algorithm.
High precision mass measurements for wine metabolomics
NASA Astrophysics Data System (ADS)
Roullier-Gall, Chloé; Witting, Michael; Gougeon, Régis; Schmitt-Kopplin, Philippe
2014-11-01
An overview of the critical steps for the non-targeted Ultra-High Performance Liquid Chromatography coupled with Quadrupole Time-of-Flight Mass Spectrometry (UPLC-Q-ToF-MS) analysis of wine chemistry is given, ranging from the study design, data preprocessing and statistical analyses, to markers identification. UPLC-Q-ToF-MS data was enhanced by the alignment of exact mass data from FTICR-MS, and marker peaks were identified using UPLC-Q-ToF-MS². In combination with multivariate statistical tools and the annotation of peaks with metabolites from relevant databases, this analytical process provides a fine description of the chemical complexity of wines, as exemplified in the case of red (Pinot noir) and white (Chardonnay) wines from various geographic origins in Burgundy.
NASA Astrophysics Data System (ADS)
Kaur, Jaswinder; Jagdev, Gagandeep, Dr.
2018-01-01
Optical character recognition is concerned with the recognition of optically processed characters. The recognition is done offline after the writing or printing has been completed, unlike online recognition where the computer has to recognize the characters instantly as they are drawn. The performance of character recognition depends upon the quality of the scanned documents. The preprocessing steps are used for removing low-frequency background noise and normalizing the intensity of individual scanned documents. Several filters are used for reducing certain image details and enabling an easier or faster evaluation. The primary aim of the research work is to recognize handwritten and machine-written characters and differentiate them. The language chosen for the research work is Punjabi (Gurmukhi script), and the tool utilized is MATLAB.
AbdelRahman, Samir E; Zhang, Mingyuan; Bray, Bruce E; Kawamoto, Kensaku
2014-05-27
The aim of this study was to propose an analytical approach to develop high-performing predictive models for congestive heart failure (CHF) readmission using an operational dataset with incomplete records and changing data over time. Our analytical approach involves three steps: pre-processing, systematic model development, and risk factor analysis. For pre-processing, variables that were absent in >50% of records were removed. Moreover, the dataset was divided into a validation dataset and derivation datasets, which were separated into three temporal subsets based on changes to the data over time. For systematic model development, using the different temporal datasets and the remaining explanatory variables, the models were developed by combining the use of various (i) statistical analyses to explore the relationships between the validation and the derivation datasets; (ii) adjustment methods for handling missing values; (iii) classifiers; (iv) feature selection methods; and (v) discretization methods. We then selected the best derivation dataset and the models with the highest predictive performance. For risk factor analysis, factors in the highest-performing predictive models were analyzed and ranked using (i) statistical analyses of the best derivation dataset, (ii) feature rankers, and (iii) a newly developed algorithm to categorize risk factors as being strong, regular, or weak. The analysis dataset consisted of 2,787 CHF hospitalizations at University of Utah Health Care from January 2003 to June 2013. In this study, we used the complete-case analysis and mean-based imputation adjustment methods; the wrapper subset feature selection method; and four ranking strategies based on information gain, gain ratio, symmetrical uncertainty, and wrapper subset feature evaluators. The best-performing models resulted from the use of a complete-case analysis derivation dataset combined with the Class-Attribute Contingency Coefficient discretization method and a voting classifier which averaged the results of multinomial logistic regression and voting feature intervals classifiers. Of 42 final model risk factors, discharge disposition, discretized age, and indicators of anemia were the most significant. This model achieved a c-statistic of 86.8%. The proposed three-step analytical approach enhanced predictive model performance for CHF readmissions. It could potentially be leveraged to improve predictive model performance in other areas of clinical medicine.
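A small pandas sketch of the pre-processing step described above (dropping variables absent in more than half of records, then deriving complete-case and mean-imputed datasets); column-handling details are assumptions.

```python
import pandas as pd

def preprocess(df, max_missing=0.5):
    """Drop variables missing in more than max_missing of records, then
    build two derivation sets: complete-case and mean-imputed (numeric
    columns only for the imputation)."""
    keep = df.columns[df.isna().mean() <= max_missing]
    df = df[keep]
    complete_case = df.dropna()
    mean_imputed = df.fillna(df.mean(numeric_only=True))
    return complete_case, mean_imputed
```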
Chang'E-3 data pre-processing system based on scientific workflow
NASA Astrophysics Data System (ADS)
Tan, Xu; Liu, Jianjun; Wang, Yuanyuan; Yan, Wei; Zhang, Xiaoxia; Li, Chunlai
2016-04-01
The Chang'E-3 (CE3) mission has obtained a huge amount of lunar scientific data. Data pre-processing is an important segment of the CE3 ground research and application system. With a dramatic increase in the demand for data research and application, a Chang'E-3 data pre-processing system (CEDPS) based on scientific workflow is proposed for the purpose of making scientists more flexible and productive by automating data-driven processing. The system should allow the planning, conduct and control of the data processing procedure with the following possibilities: • describe a data processing task, including: 1) define input data/output data, 2) define the data relationships, 3) define the sequence of tasks, 4) define the communication between tasks, 5) define mathematical formulae, 6) define the relationship between tasks and data; • automatic processing of tasks. Accordingly, describing a task is the key point of whether the system is flexible. We design a workflow designer, which is a visual environment for capturing processes as workflows; the three-level model for the workflow designer is discussed: 1) the data relationships are established through a product tree; 2) the process model is constructed based on a directed acyclic graph (DAG); in particular, a set of process workflow constructs, including Sequence, Loop, Merge, and Fork, are compositional with one another; 3) to reduce the modeling complexity of the mathematical formulae using a DAG, semantic modeling based on MathML is adopted. On top of that, we present how the CE3 data are processed with CEDPS.
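As a toy illustration of DAG-based task sequencing in such a workflow system, the Python standard library's graphlib can order tasks so that every task runs after its inputs; the task names below are hypothetical.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Toy task graph for a pre-processing chain: each task maps to the set of
# tasks whose outputs it consumes (names are illustrative, not CEDPS's).
graph = {
    "radiometric_correction": {"ingest"},
    "geometric_correction": {"radiometric_correction"},
    "mosaic": {"geometric_correction"},
    "product_tree_update": {"mosaic"},
}

for task in TopologicalSorter(graph).static_order():
    print("run", task)   # the sequence respects all data dependencies
```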
Sakwari, Gloria
2013-01-01
Introduction: Endotoxin exposure associated with organic dust exposure has been studied in several industries. Coffee cherries that are dried directly after harvest may differ in dust and endotoxin emissions to those that are peeled and washed before drying. The aim of this study was to measure personal total dust and endotoxin levels and to evaluate their determinants of exposure in coffee processing factories. Methods: Using Sidekick Casella pumps at a flow rate of 2l/min, total dust levels were measured in the workers’ breathing zone throughout the shift. Endotoxin was analyzed using the kinetic chromogenic Limulus amebocyte lysate assay. Separate linear mixed-effects models were used to evaluate exposure determinants for dust and endotoxin. Results: Total dust and endotoxin exposure were significantly higher in Robusta than in Arabica coffee factories (geometric mean 3.41mg/m3 and 10 800 EU/m3 versus 2.10mg/m3 and 1400 EU/m3, respectively). Dry pre-processed coffee and differences in work tasks explained 30% of the total variance for total dust and 71% of the variance for endotoxin exposure. High exposure in Robusta processing is associated with the dry pre-processing method used after harvest. Conclusions: Dust and endotoxin exposure is high, in particular when processing dry pre-processed coffee. Minimization of dust emissions and use of efficient dust exhaust systems are important to prevent the development of respiratory system impairment in workers. PMID:23028014
Interactive Medical Volume Visualization for Surgical Operations
2001-10-25
In the preprocessing and processing stages, the related medical brain tissues, which are skull, white matter, gray matter and pathology (tumor), are segmented ... from 12 or 16 bit data depths. NMR segmentation plays an important role in our work because classifying brain tissues from NMR slices requires an ... performing segmentation of brain structures. Our segmentation process uses Self-Organizing Feature Maps (SOFM) [12]. In SOM, in contrast to feedback ...
Low-cost digital image processing at the University of Oklahoma
NASA Technical Reports Server (NTRS)
Harrington, J. A., Jr.
1981-01-01
Computer assisted instruction in remote sensing at the University of Oklahoma involves two separate approaches and is dependent upon initial preprocessing of a LANDSAT computer compatible tape using software developed for an IBM 370/158 computer. In-house generated preprocessing algorithms permit students or researchers to select a subset of a LANDSAT scene for subsequent analysis using either general purpose statistical packages or color graphic image processing software developed for Apple II microcomputers. Procedures for preprocessing the data and image analysis using either of the two approaches for low-cost LANDSAT data processing are described.
Zhang, Chu; Liu, Fei; He, Yong
2018-02-01
Hyperspectral imaging was used to identify and visualize coffee bean varieties. Spectral preprocessing of pixel-wise spectra was conducted by different methods, including moving average smoothing (MA), wavelet transform (WT) and empirical mode decomposition (EMD). Meanwhile, spatial preprocessing of the gray-scale image at each wavelength was conducted by median filter (MF). Support vector machine (SVM) models using full sample average spectra and pixel-wise spectra, and the optimal wavelengths selected by second derivative spectra, all achieved classification accuracy over 80%. First, the SVM models using pixel-wise spectra were used to predict the sample average spectra, and these models obtained over 80% classification accuracy. Second, the SVM models using sample average spectra were used to predict pixel-wise spectra, but achieved lower than 50% classification accuracy. The results indicated that WT and EMD were suitable for pixel-wise spectra preprocessing. The use of pixel-wise spectra could extend the calibration set, and resulted in good prediction results for pixel-wise spectra and sample average spectra. The overall results indicated the effectiveness of using spectral preprocessing and the adoption of pixel-wise spectra. The results provided an alternative way of data processing for applications of hyperspectral imaging in the food industry.
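A short sketch of two of the preprocessing operations named above: moving average smoothing of each pixel-wise spectrum and a median filter on each band image; window sizes are illustrative.

```python
import numpy as np
from scipy.ndimage import uniform_filter1d, median_filter

def preprocess_cube(cube, ma_window=5, mf_size=3):
    """cube: (rows, cols, bands). Apply moving-average smoothing along the
    spectral axis of every pixel, then a median filter to each band image."""
    cube = cube.astype(float)
    spectral = uniform_filter1d(cube, size=ma_window, axis=2)
    spatial = np.stack([median_filter(spectral[:, :, b], size=mf_size)
                        for b in range(cube.shape[2])], axis=2)
    return spatial
```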
Framework for Parallel Preprocessing of Microarray Data Using Hadoop
2018-01-01
Nowadays, microarray technology has become one of the popular ways to study gene expression and the diagnosis of disease. The National Center for Biotechnology Information (NCBI) hosts public databases containing large volumes of biological data that must be preprocessed, since they carry high levels of noise and bias. Robust Multiarray Average (RMA) is one of the standard and popular methods used to preprocess the data and remove the noise. Most preprocessing algorithms are time-consuming and not able to handle a large number of datasets with thousands of experiments. Parallel processing can be used to address the above-mentioned issues. Hadoop is a well-known and ideal distributed file system framework that provides a parallel environment to run the experiment. In this research, for the first time, the capability of Hadoop and the statistical power of R have been leveraged to parallelize the available preprocessing algorithm called RMA to efficiently process microarray data. The experiment was run on a cluster of 5 nodes, where each node has 16 cores and 16 GB of memory. It compares the efficiency and performance of parallelized RMA using Hadoop with parallelized RMA using the affyPara package, as well as with sequential RMA. The results show that the speed-up of the proposed approach outperforms both the sequential and affyPara approaches. PMID:29796018
Multiwavelet grading of prostate pathological images
NASA Astrophysics Data System (ADS)
Soltanian-Zadeh, Hamid; Jafari-Khouzani, Kourosh
2002-05-01
We have developed image analysis methods to automatically grade pathological images of the prostate. The proposed method assigns Gleason grades to images, where each image is given a grade between 1 and 5. This is done using features extracted from multiwavelet transformations. We extract energy and entropy features from the submatrices obtained in the decomposition. Next, we apply a k-NN classifier to grade the image. To find the optimal multiwavelet basis, preprocessing, and classifier, we use features extracted by different multiwavelets with either critically-sampled preprocessing or repeated-row preprocessing and different k-NN classifiers, and compare their performances as evaluated by the total misclassification rate (TMR). To evaluate sensitivity to noise, we add white Gaussian noise to the images and compare the results (TMRs). We applied the proposed methods to 100 images. We evaluated the first and second levels of decomposition using Geronimo, Hardin, and Massopust (GHM), Chui and Lian (CL), and Shen (SA4) multiwavelets. We also evaluated the k-NN classifier for k=1,2,3,4,5. Experimental results illustrate that the first level of decomposition is quite noisy. They also show that critically-sampled preprocessing outperforms repeated-row preprocessing and is less sensitive to noise. Finally, comparison studies indicate that the SA4 multiwavelet and the k-NN classifier with k=1 generate optimal results (with the smallest TMR of 3%).
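The feature pipeline above (subband energy and entropy feeding a k-NN classifier) can be sketched as follows. PyWavelets does not ship the GHM, CL, or SA4 multiwavelets, so a scalar wavelet (db2) stands in for the multiwavelet decomposition; the image sizes and labels are synthetic.

```python
import numpy as np
import pywt
from sklearn.neighbors import KNeighborsClassifier

def subband_features(image, wavelet="db2", level=2):
    """Energy and Shannon entropy of each 2-D wavelet subband."""
    coeffs = pywt.wavedec2(image, wavelet, level=level)
    feats = []
    for band in [coeffs[0]] + [b for lvl in coeffs[1:] for b in lvl]:
        p = np.abs(band).ravel()
        p = p / (p.sum() + 1e-12)
        feats.append(float((band.astype(float) ** 2).sum()))   # energy
        feats.append(float(-(p * np.log2(p + 1e-12)).sum()))   # entropy
    return np.array(feats)

# Toy "pathology images" and Gleason-grade labels (1..5), for illustration only.
rng = np.random.default_rng(1)
X = np.array([subband_features(rng.normal(size=(64, 64))) for _ in range(40)])
y = rng.integers(1, 6, size=40)
knn = KNeighborsClassifier(n_neighbors=1).fit(X, y)  # k=1, as found optimal above
```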
Advancements and challenges in crosshole GPR full-waveform inversion for hydrological applications
NASA Astrophysics Data System (ADS)
Klotzsche, A.; Van Der Kruk, J.; Vereecken, H.
2016-12-01
Crosshole ground penetrating radar (GPR) full-waveform inversion (FWI) has demonstrated over the last decade a high potential to detect, map, and resolve decimeter-scale structures within aquifers. GPR FWI uses Maxwell's equations to find a model that fits the entire measured waveform. One big advantage is that by applying one method, we can derive two soil properties: dielectric permittivity and electrical conductivity. Both parameters are sensitive to different soil properties such as soil water content and porosity, or clay content. Hence, an improved characterization of the critical zone is possible. The application of the FWI to aquifers in Germany, Switzerland, Denmark, and the USA showed for all sites improved and higher-resolution images than standard ray-based methods and provided new insights into the aquifers' structures. Furthermore, small-scale high-contrast layers caused by changes in porosity were characterized, which enhanced our understanding of the electromagnetic wave propagation related to these features. However, to obtain reliable and accurate inversion results from experimental data, and hence porosity estimates, many detailed steps in acquiring, pre-processing, and inverting the data need to be carefully followed. Here, we provide an overview of recent developments and advancements of 2D crosshole GPR FWI that provide improved inversion results for permittivity and electrical conductivity. In addition, we provide guidelines and point out important challenges and pitfalls that can occur during the inversion of experimental data. We illustrate the necessary steps required to achieve reliable FWI results, which are indicated by, e.g., a good fit of the measured and modelled traces and the absence of a remaining gradient for the final models. Important requirements for a successful application are an accurate time-zero correction, good starting models for the FWI, and a well-estimated source wavelet.
An automated distinction of DICOM images for lung cancer CAD system
NASA Astrophysics Data System (ADS)
Suzuki, H.; Saita, S.; Kubo, M.; Kawata, Y.; Niki, N.; Nishitani, H.; Ohmatsu, H.; Eguchi, K.; Kaneko, M.; Moriyama, N.
2009-02-01
Automated distinction of medical images is an important preprocessing step in Computer-Aided Diagnosis (CAD) systems. CAD systems have been developed using medical image sets with specific scan conditions and body parts. However, varied examinations are performed at medical sites. The specification of the examination is contained in the DICOM textual meta information. Most DICOM textual meta information can be considered reliable; however, the body part information cannot always be. In this paper, we describe an automated distinction of DICOM images as a preprocessing step for a lung cancer CAD system. Our approach uses DICOM textual meta information and low-cost image processing. First, the textual meta information, such as the scan conditions of the DICOM image, is examined. Second, the body parts in the DICOM image are identified by image processing. The identification of body parts is based on anatomical structure, which is represented by features of three regions: body tissue, bone, and air. The method is effective for the practical use of lung cancer CAD systems at medical sites.
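The metadata-checking step might look like the following pydicom sketch; the file name is hypothetical, and the pixel-based body-part verification the paper describes is only indicated by a comment.

```python
import pydicom

def scan_metadata(path):
    """Pull the DICOM meta information used to route a scan to the CAD system."""
    ds = pydicom.dcmread(path, stop_before_pixels=True)
    return {
        "modality": ds.get("Modality", ""),
        "body_part": ds.get("BodyPartExamined", ""),   # not always reliable
        "slice_thickness": ds.get("SliceThickness", None),
    }

info = scan_metadata("example.dcm")  # hypothetical file
if info["modality"] == "CT" and info["body_part"] not in ("CHEST", ""):
    # The paper falls back to image processing (body tissue / bone / air
    # features) to verify the body part when the tag is untrustworthy.
    print("Tag disagrees with expectation; verify body part from pixel data.")
```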
Rapid determination of total protein and wet gluten in commercial wheat flour using siSVR-NIR.
Chen, Jia; Zhu, Shipin; Zhao, Guohua
2017-04-15
The determination of total protein and wet gluten is of critical importance when screening commercial flour for the desired processing suitability. To this end, a near-infrared spectroscopy (NIR) method with support vector regression was developed in the present study. The effects of spectral preprocessing and the synergy interval on model performance were investigated. The results showed that the models from raw spectra were not acceptable, but they were substantially improved by properly applying spectral preprocessing methods. Meanwhile, the synergy interval was validated as a good means of improving the performance of models based on the whole spectrum. The coefficient of determination (R²), the root mean square error of prediction (RMSEP) and the standard deviation ratio (SDR) of the best models for total protein (wet gluten) were 0.906 (0.850), 0.425 (1.024) and 3.065 (2.482), respectively. These two best models have similar and low relative errors (approximately 8.8%), which indicates their feasibility. Copyright © 2016 Elsevier Ltd. All rights reserved.
Data preprocessing methods of FT-NIR spectral data for the classification cooking oil
NASA Astrophysics Data System (ADS)
Ruah, Mas Ezatul Nadia Mohd; Rasaruddin, Nor Fazila; Fong, Sim Siong; Jaafar, Mohd Zuli
2014-12-01
This work describes data pre-processing methods for FT-NIR spectroscopy datasets of cooking oil and its quality parameters using chemometrics. Pre-processing of near-infrared (NIR) spectral data has become an integral part of chemometrics modelling. Hence, this work is dedicated to investigating the utility and effectiveness of pre-processing algorithms, namely row scaling, column scaling, and single scaling with Standard Normal Variate (SNV). The combinations of these scaling methods have an impact on exploratory analysis and classification via Principal Component Analysis (PCA) plots. The samples were divided into palm oil and non-palm cooking oil. The classification model was built using FT-NIR cooking oil spectra in absorbance mode over the range 4000 cm-1 to 14000 cm-1. A Savitzky-Golay derivative was applied before developing the classification model. The data were then separated into a training set and a test set using the Duplex method, with the size of each class kept equal to 2/3 of the class with the minimum number of samples. The t-statistic was then employed as a variable selection method to determine which variables are significant for the classification models. The data pre-processing was evaluated by the modified silhouette width (mSW), PCA, and the percentage correctly classified (%CC). The results show that different pre-processing strategies lead to substantially different model performance. The effects of the pre-processing methods, i.e., row scaling, column standardisation, and single scaling with SNV, are indicated by mSW and %CC. For the two-PC model, all five classifiers gave a high %CC except Quadratic Distance Analysis.
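A minimal sketch of this kind of preprocessing chain, assuming SNV scaling followed by a Savitzky-Golay derivative and PCA; the spectral dimensions and parameters are illustrative, not the paper's settings.

```python
import numpy as np
from scipy.signal import savgol_filter
from sklearn.decomposition import PCA

def snv(spectra):
    """Standard Normal Variate: center and scale each spectrum (row-wise)."""
    mu = spectra.mean(axis=1, keepdims=True)
    sd = spectra.std(axis=1, keepdims=True)
    return (spectra - mu) / sd

# Toy stand-in for FT-NIR spectra: 60 oil samples x 500 wavenumbers.
rng = np.random.default_rng(2)
X = rng.normal(size=(60, 500))

X = snv(X)                                                            # scaling step
X = savgol_filter(X, window_length=11, polyorder=2, deriv=1, axis=1)  # SG derivative
scores = PCA(n_components=2).fit_transform(X)                         # two-PC model
```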
Smartphone based Tomographic PIV using colored shadows
NASA Astrophysics Data System (ADS)
Aguirre-Pablo, Andres A.; Alarfaj, Meshal K.; Li, Er Qiang; Thoroddsen, Sigurdur T.
2016-11-01
We use low-cost smartphones and Tomo-PIV to reconstruct the 3D-3C velocity field of a vortex ring. The experiment is carried out in an octagonal tank of water with a vortex ring generator consisting of a flexible membrane enclosed by a cylindrical chamber. This chamber is pre-seeded with black polyethylene microparticles. The membrane is driven by an adjustable impulsive air pressure to produce the vortex ring. Four synchronized smartphone cameras, of 40 Mpx each, are used to capture the location of particles from different viewing angles. We use red, green, and blue LEDs as backlighting sources to capture particle locations at different times. The exposure time on the smartphone cameras is set to 2 seconds, while each LED color is exposed for about 80 μs with time steps that can go below 300 μs. The timing of these light pulses is controlled with a digital delay generator. The backlight is blocked by the instantaneous location of the particles in motion, leaving a shadow of the corresponding color for each time step. The image is then preprocessed to separate the three color fields before using MART reconstruction and cross-correlation between time steps to obtain the 3D-3C velocity field. This proof-of-concept experiment represents a possible low-cost Tomo-PIV setup.
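The color-separation step can be sketched as below: because each LED color encodes one time instant, splitting an RGB frame into channels recovers three shadow fields from a single exposure. The frame here is synthetic; real data would additionally need white-balance and channel cross-talk corrections.

```python
import numpy as np

# Synthetic long-exposure frame standing in for a smartphone image.
rng = np.random.default_rng(12)
frame = rng.integers(0, 256, size=(480, 640, 3), dtype=np.uint8)

# One LED color per time step: splitting channels separates the exposures.
red, green, blue = (frame[..., c].astype(float) for c in range(3))

# Shadows are dark against the backlight, so invert to get particle maps.
particles_t0 = 255.0 - red
particles_t1 = 255.0 - green
particles_t2 = 255.0 - blue
```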
Dermatas, Evangelos
2015-01-01
A novel method for finger vein pattern extraction from infrared images is presented. The method involves four steps: preprocessing, which performs local normalization of the image intensity; image enhancement; image segmentation; and finally postprocessing for image cleaning. In the image enhancement step, an image that is both smooth and similar to the original is sought. The enhanced image is obtained by minimizing the objective function of a modified separable Mumford-Shah model. Since this minimization procedure is computationally intensive for large images, a local application of the Mumford-Shah model in small window neighborhoods is proposed. The finger veins are located in concave nonsmooth regions, so, to distinguish them from the other tissue parts, all the differences between the smooth neighborhoods obtained by the local application of the model and the corresponding windows of the original image are added. After that, the veins in the enhanced image are sufficiently emphasized, and an accurate segmentation can be obtained readily by a local entropy thresholding method. Finally, the resulting binary image may suffer from some misclassifications, so a postprocessing step is performed to extract a robust finger vein pattern. PMID:26120357
Uncertainty assessment in geodetic network adjustment by combining GUM and Monte-Carlo-simulations
NASA Astrophysics Data System (ADS)
Niemeier, Wolfgang; Tengen, Dieter
2017-06-01
In this article, first ideas are presented to extend the classical concept of geodetic network adjustment by introducing a new method for uncertainty assessment as a two-step analysis. In the first step, the raw data and possible influencing factors are analyzed using uncertainty modeling according to GUM (Guidelines to the Expression of Uncertainty in Measurements). This approach is well established in metrology but rarely adopted within geodesy. The second step consists of Monte-Carlo simulations (MC simulations) for the complete processing chain, from raw input data and pre-processing to adjustment computations and quality assessment. To perform these simulations, possible realizations of the raw data and the influencing factors are generated, using probability distributions for all variables and the established concept of pseudo-random number generators. The final result is a point cloud that represents the uncertainty of the estimated coordinates; a confidence region can be assigned to these point clouds as well. This concept may replace the common concept of variance propagation and the quality assessment of adjustment parameters via their covariance matrix. It allows a new way of uncertainty assessment in accordance with the GUM concept for uncertainty modelling and propagation. As a practical example, the local tie network at the Metsähovi Fundamental Station, Finland is used, where classical geodetic observations are combined with GNSS data.
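The second step, Monte-Carlo propagation through the processing chain, can be sketched as follows for a single observation with a GUM-style uncertainty budget. The budget (a Type A reading, a rectangular Type B offset, a ppm-level scale term) is invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100_000  # number of Monte-Carlo realizations

# Hypothetical uncertainty budget for one distance observation:
reading = rng.normal(100.0000, 0.0005, n)       # m, Type A (normal)
offset = rng.uniform(-0.0002, 0.0002, n)        # m, Type B (rectangular)
scale_ppm = rng.normal(0.0, 1.0, n)             # scale error in ppm, Type B

# Each realization is pushed through the (here trivial) processing chain;
# in a real network the full adjustment would be recomputed per realization.
distance = (reading + offset) * (1.0 + scale_ppm * 1e-6)

print(distance.mean(), distance.std())           # empirical moments
print(np.percentile(distance, [2.5, 97.5]))      # 95% coverage interval
```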
ERPLAB: an open-source toolbox for the analysis of event-related potentials
Lopez-Calderon, Javier; Luck, Steven J.
2014-01-01
ERPLAB toolbox is a freely available, open-source toolbox for processing and analyzing event-related potential (ERP) data in the MATLAB environment. ERPLAB is closely integrated with EEGLAB, a popular open-source toolbox that provides many EEG preprocessing steps and an excellent user interface design. ERPLAB adds to EEGLAB’s EEG processing functions, providing additional tools for filtering, artifact detection, re-referencing, and sorting of events, among others. ERPLAB also provides robust tools for averaging EEG segments together to create averaged ERPs, for creating difference waves and other recombinations of ERP waveforms through algebraic expressions, for filtering and re-referencing the averaged ERPs, for plotting ERP waveforms and scalp maps, and for quantifying several types of amplitudes and latencies. ERPLAB’s tools can be accessed either from an easy-to-learn graphical user interface or from MATLAB scripts, and a command history function makes it easy for users with no programming experience to write scripts. Consequently, ERPLAB provides both ease of use and virtually unlimited power and flexibility, making it appropriate for the analysis of both simple and complex ERP experiments. Several forms of documentation are available, including a detailed user’s guide, a step-by-step tutorial, a scripting guide, and a set of video-based demonstrations. PMID:24782741
Chen, Wenxi; Kitazawa, Masumi; Togawa, Tatsuo
2009-09-01
This paper proposes a method to estimate a woman's menstrual cycle based on the hidden Markov model (HMM). A tiny device was developed that attaches around the abdominal region to measure cutaneous temperature at 10-min intervals during sleep. The measured temperature data were encoded as a two-dimensional image (QR code, i.e., quick response code) and displayed in the LCD window of the device. A mobile phone captured the QR code image, decoded the information, and transmitted the data to a database server. The collected data were analyzed in three steps to estimate the biphasic temperature property of a menstrual cycle. The key step was an HMM-based step between preprocessing and postprocessing. A discrete Markov model with two hidden phases was assumed to represent the higher- and lower-temperature phases during a menstrual cycle. The proposed method was verified on data collected from 30 female participants, aged 14 to 46, over six consecutive months. By comparing the estimated results with individual records from the participants, 71.6% of 190 menstrual cycles were correctly estimated. The sensitivity and positive predictability were 91.8 and 96.6%, respectively. This objective evaluation provides a promising approach for managing premenstrual syndrome and birth control.
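A two-state Gaussian HMM of the kind described can be sketched with hmmlearn; the temperatures below are simulated, and the package and parameters are stand-ins rather than the authors' implementation.

```python
import numpy as np
from hmmlearn import hmm

# Simulated nightly temperatures: a lower phase (~36.3 C) then a higher
# phase (~36.7 C), mimicking the biphasic pattern of a menstrual cycle.
rng = np.random.default_rng(4)
temps = np.concatenate([rng.normal(36.3, 0.08, 140),
                        rng.normal(36.7, 0.08, 100)]).reshape(-1, 1)

# Two hidden states stand in for the lower- and higher-temperature phases.
model = hmm.GaussianHMM(n_components=2, covariance_type="diag", n_iter=50)
model.fit(temps)
phases = model.predict(temps)   # 0/1 state sequence along the recording
```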
Whole head quantitative susceptibility mapping using a least-norm direct dipole inversion method.
Sun, Hongfu; Ma, Yuhan; MacDonald, M Ethan; Pike, G Bruce
2018-06-15
A new dipole field inversion method for whole head quantitative susceptibility mapping (QSM) is proposed. Instead of performing background field removal and local field inversion sequentially, the proposed method performs dipole field inversion directly on the total field map in a single step. To aid this under-determined and ill-posed inversion process and obtain robust QSM images, Tikhonov regularization is implemented to seek the local susceptibility solution with the least norm (LN), using the L-curve criterion. The proposed LN-QSM does not require brain edge erosion, thereby preserving the cerebral cortex in the final images. This should improve its applicability for QSM-based cortical grey matter measurement, functional imaging, and venography of the full brain. Furthermore, LN-QSM also enables susceptibility mapping of the entire head without the need for brain extraction, which makes QSM reconstruction more automated and less dependent on intermediate pre-processing methods and their associated parameters. It is shown that the proposed LN-QSM method reduced errors in a numerical phantom simulation, improved accuracy in a gadolinium phantom experiment, and suppressed artefacts in nine subjects, as compared with two-step and other single-step QSM methods. Measurements of deep grey matter and skull susceptibilities from LN-QSM are consistent with established reconstruction methods. Copyright © 2018 Elsevier Inc. All rights reserved.
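The least-norm idea, stripped of the MRI specifics, is ordinary Tikhonov-regularized least squares. The sketch below solves a toy under-determined inversion and sweeps the regularization weight; the paper selects it with the L-curve criterion, which is only hinted at here.

```python
import numpy as np

def tikhonov_solve(A, b, lam):
    """argmin_x ||A x - b||^2 + lam * ||x||^2 (normal equations form)."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ b)

# Toy ill-posed problem standing in for the dipole-field inversion.
rng = np.random.default_rng(5)
A = rng.normal(size=(80, 100))              # under-determined forward operator
x_true = rng.normal(size=100)
b = A @ x_true + rng.normal(0, 0.01, 80)

# Residual norm vs. solution norm: the L-curve "corner" balances the two.
for lam in (1e-4, 1e-2, 1.0):
    x = tikhonov_solve(A, b, lam)
    print(lam, np.linalg.norm(A @ x - b), np.linalg.norm(x))
```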
Automatic localization of cochlear implant electrodes in CTs with a limited intensity range
NASA Astrophysics Data System (ADS)
Zhao, Yiyuan; Dawant, Benoit M.; Noble, Jack H.
2017-02-01
Cochlear implants (CIs) are neural prosthetics for treating severe-to-profound hearing loss. Our group has developed an image-guided cochlear implant programming (IGCIP) system that uses image analysis techniques to recommend patient-specific CI processor settings to improve hearing outcomes. One crucial step in IGCIP is the localization of CI electrodes in post-implantation CTs. Manual localization of electrodes requires time and expertise. To automate this process, our group has proposed automatic techniques that have been validated on CTs acquired with scanners that produce images with an extended range of intensity values. However, many clinical CTs are acquired with a limited intensity range, which complicates the electrode localization process. In this work, we present a pre-processing step for CTs with a limited intensity range and extend the methods we proposed for full intensity range CTs to localize CI electrodes in CTs with a limited intensity range. We evaluate our method on CTs of 20 subjects implanted with CI arrays produced by different manufacturers. Our method achieves a mean localization error of 0.21 mm. This indicates that our method is robust for automatic localization of CI electrodes in different types of CTs, which represents a crucial step for translating IGCIP from the research laboratory to clinical use.
Compact Circuit Preprocesses Accelerometer Output
NASA Technical Reports Server (NTRS)
Bozeman, Richard J., Jr.
1993-01-01
Compact electronic circuit transfers dc power to, and preprocesses ac output of, accelerometer and associated preamplifier. Incorporated into accelerometer case during initial fabrication or retrofit onto commercial accelerometer. Made of commercial integrated circuits and other conventional components; made smaller by use of micrologic and surface-mount technology.
A new approach to telemetry data processing. Ph.D. Thesis - Maryland Univ.
NASA Technical Reports Server (NTRS)
Broglio, C. J.
1973-01-01
An approach for a preprocessing system for telemetry data processing was developed. The philosophy of the approach is the development of a preprocessing system to interface with the main processor and relieve it of the burden of stripping information from a telemetry data stream. To accomplish this task, a telemetry preprocessing language was developed. Also, a hardware device for implementing the operation of this language was designed using a cellular logic module concept. In the development of the hardware device and the cellular logic module, a distributed form of control was implemented. This is accomplished by a technique of one-to-one intermodule communications and a set of privileged communication operations. By transferring this control state from module to module, the control function is dispersed through the system. A compiler for translating the preprocessing language statements into an operations table for the hardware device was also developed. Finally, to complete the system design and verify it, a simulator for the cellular logic module was written using the APL/360 system.
Source-Modeling Auditory Processes of EEG Data Using EEGLAB and Brainstorm.
Stropahl, Maren; Bauer, Anna-Katharina R; Debener, Stefan; Bleichner, Martin G
2018-01-01
Electroencephalography (EEG) source localization approaches are often used to disentangle the spatial patterns mixed up in scalp EEG recordings. However, approaches differ substantially between experiments, may be strongly parameter-dependent, and results are not necessarily meaningful. In this paper we provide a pipeline for EEG source estimation, from raw EEG data pre-processing using EEGLAB functions up to source-level analysis as implemented in Brainstorm. The pipeline is tested using a data set of 10 individuals performing an auditory attention task. The analysis approach estimates sources of 64-channel EEG data without the prerequisite of individual anatomies or individually digitized sensor positions. First, we show advanced EEG pre-processing using EEGLAB, which includes artifact attenuation using independent component analysis (ICA). ICA is a linear decomposition technique that aims to reveal the underlying statistical sources of mixed signals and is further a powerful tool to attenuate stereotypical artifacts (e.g., eye movements or heartbeat). Data submitted to ICA are pre-processed to facilitate good-quality decompositions. Aiming toward an objective approach on component identification, the semi-automatic CORRMAP algorithm is applied for the identification of components representing prominent and stereotypic artifacts. Second, we present a step-wise approach to estimate active sources of auditory cortex event-related processing, on a single subject level. The presented approach assumes that no individual anatomy is available and therefore the default anatomy ICBM152, as implemented in Brainstorm, is used for all individuals. Individual noise modeling in this dataset is based on the pre-stimulus baseline period. For EEG source modeling we use the OpenMEEG algorithm as the underlying forward model based on the symmetric Boundary Element Method (BEM). We then apply the method of dynamical statistical parametric mapping (dSPM) to obtain physiologically plausible EEG source estimates. Finally, we show how to perform group level analysis in the time domain on anatomically defined regions of interest (auditory scout). The proposed pipeline needs to be tailored to the specific datasets and paradigms. However, the straightforward combination of EEGLAB and Brainstorm analysis tools may be of interest to others performing EEG source localization.
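For readers working in Python rather than MATLAB, the ICA artifact-attenuation step has a close analogue in MNE-Python; the sketch below is that analogue, not the paper's EEGLAB/CORRMAP pipeline, and the file name and excluded components are assumptions.

```python
import mne
from mne.preprocessing import ICA

raw = mne.io.read_raw_fif("subject01_raw.fif", preload=True)  # hypothetical file
raw.filter(l_freq=1.0, h_freq=40.0)   # band-pass helps decomposition quality

ica = ICA(n_components=20, random_state=0)
ica.fit(raw)

# The paper uses the semi-automatic CORRMAP algorithm to flag artifact
# components; here we simply assume components 0 and 3 capture eye
# movements and heartbeat.
ica.exclude = [0, 3]
raw_clean = ica.apply(raw.copy())
```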
Automated Processing Workflow for Ambient Seismic Recordings
NASA Astrophysics Data System (ADS)
Girard, A. J.; Shragge, J.
2017-12-01
Structural imaging using body-wave energy present in ambient seismic data remains a challenging task, largely because these wave modes are commonly much weaker than surface-wave energy. In a number of situations body-wave energy has been extracted successfully; however, nearly all successful body-wave extraction and imaging approaches have focused on cross-correlation processing. While this is useful for interferometric purposes, it can also lead to the inclusion of unwanted noise events that dominate the resulting stack, leaving body-wave energy overpowered by coherent noise. Conversely, wave-equation imaging can be applied directly to non-correlated ambient data that has been preprocessed to mitigate unwanted energy (i.e., surface waves, burst-like and electromechanical noise) and enhance body-wave arrivals. Following this approach, though, requires a significant preprocessing effort on often terabytes of ambient seismic data, which is expensive and requires automation to be feasible. In this work we outline an automated processing workflow designed to optimize body-wave energy from an ambient seismic data set acquired on a large-N array at a mine site near Lalor Lake, Manitoba, Canada. We show that processing ambient seismic data in the recording domain, rather than the cross-correlation domain, allows us to mitigate energy that is inappropriate for body-wave imaging. We first develop a method for window selection that automatically identifies and removes data contaminated by coherent high-energy bursts. We then apply time- and frequency-domain debursting techniques to mitigate the effects of remaining strong-amplitude and/or monochromatic energy without severely degrading the overall waveforms. After each processing step we implement a QC check to investigate improvements in the convergence rates, and the emergence of reflection events, in the cross-correlation-plus-stack waveforms over hour-long windows. Overall, the QC analyses suggest that automated preprocessing of ambient seismic recordings in the recording domain successfully mitigates unwanted coherent noise events in both the time and frequency domains. Accordingly, we assert that this method is beneficial for direct wave-equation imaging with ambient seismic recordings.
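A minimal sketch of the automated window-selection idea: reject recording windows whose RMS energy marks them as burst-contaminated. The threshold rule (k times the median RMS) is an assumption standing in for the workflow's actual criterion.

```python
import numpy as np

def keep_quiet_windows(trace, win, k=3.0):
    """Split a trace into fixed windows and drop burst-contaminated ones.

    A window is rejected when its RMS exceeds k times the median window
    RMS; a simple stand-in for the paper's automated window selection.
    """
    n = len(trace) // win
    windows = trace[:n * win].reshape(n, win)
    rms = np.sqrt((windows ** 2).mean(axis=1))
    return windows[rms < k * np.median(rms)]

rng = np.random.default_rng(6)
trace = rng.normal(size=100_000)                    # synthetic ambient record
trace[40_000:40_500] += 50 * rng.normal(size=500)   # inject a coherent burst
quiet = keep_quiet_windows(trace, win=1000)         # burst window removed
```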
Small white matter lesion detection in cerebral small vessel disease
NASA Astrophysics Data System (ADS)
Ghafoorian, Mohsen; Karssemeijer, Nico; van Uden, Inge; de Leeuw, Frank E.; Heskes, Tom; Marchiori, Elena; Platel, Bram
2015-03-01
Cerebral small vessel disease (SVD) is a common finding on magnetic resonance images of elderly people. White matter lesions (WML) are important markers not only for small vessel disease, but also for neurodegenerative diseases including multiple sclerosis, Alzheimer's disease, and vascular dementia. Volumetric measurements, such as the "total lesion load", have been studied and related to these diseases. With respect to SVD, we conjecture that small lesions are important, as they have been observed to grow over time and they form the majority of lesions in number. To study these small lesions they need to be annotated, which is a complex and time-consuming task. Existing (semi-)automatic methods have been aimed at volumetric measurements and large lesions, and are not suitable for the detection of small lesions. In this research we established a supervised voxel classification CAD system, optimized and trained to exclusively detect small WMLs. To achieve this, several preprocessing steps were taken, including a robust standardization of subject intensities to reduce inter-subject intensity variability as much as possible. A number of features that were found to identify small lesions well were calculated, including multimodal intensities, tissue probabilities, several features for accurate location description, a number of second-order derivative features, and a multi-scale annular filter for blobness detection. Only small lesions were used to learn the target concept via AdaBoost using random forests as its base classifiers. Finally, the results were evaluated using free-response receiver operating characteristic (FROC) analysis.
BFL: a node and edge betweenness based fast layout algorithm for large scale networks
Hashimoto, Tatsunori B; Nagasaki, Masao; Kojima, Kaname; Miyano, Satoru
2009-01-01
Background Network visualization serves as a useful first step for analysis. However, current graph layout algorithms for biological pathways are insensitive to biologically important information, e.g. subcellular localization and biological node and graph attributes, and/or are not available for large-scale networks of more than 10000 elements. Results To overcome these problems, we propose the use of a biologically important graph metric, betweenness, a measure of network flow. This metric is highly correlated with many biological phenomena such as lethality and clusters. We devise a new fast parallel algorithm for calculating betweenness to minimize the preprocessing cost. Using this metric, we also invent a node- and edge-betweenness based fast layout algorithm (BFL). BFL places the high-betweenness nodes at optimal positions and allows the low-betweenness nodes to reach suboptimal positions. Furthermore, BFL reduces the runtime by combining a sequential insertion algorithm with betweenness. For a graph with n nodes, this approach reduces the expected runtime of the algorithm to O(n^2) when considering edge crossings, and to O(n log n) when considering only density and edge lengths. Conclusion Our BFL algorithm is compared against fast graph layout algorithms and approaches requiring intensive optimizations. For gene networks, we show that our algorithm is faster than all layout algorithms tested while providing readability on par with intensive optimization algorithms. We achieve a 1.4-second runtime for a graph with 4000 nodes and 12000 edges on a standard desktop computer. PMID:19146673
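Node and edge betweenness themselves are easy to compute for small graphs; the sketch below uses NetworkX's serial routines (the paper's contribution is a fast parallel algorithm, not reproduced here) to rank nodes for placement.

```python
import networkx as nx

# Toy scale-free graph standing in for a biological pathway network.
G = nx.barabasi_albert_graph(200, 2, seed=7)

node_bc = nx.betweenness_centrality(G)
edge_bc = nx.edge_betweenness_centrality(G)

# BFL-style ordering: high-betweenness nodes are placed first, at the
# most constrained ("optimal") positions.
placement_order = sorted(G.nodes, key=node_bc.get, reverse=True)
```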
Euclidean commute time distance embedding and its application to spectral anomaly detection
NASA Astrophysics Data System (ADS)
Albano, James A.; Messinger, David W.
2012-06-01
Spectral image analysis problems often begin with a preprocessing step that applies a transformation generating an alternative representation of the spectral data. In this paper, a transformation based on a Markov-chain model of a random walk on a graph is introduced. More precisely, we quantify the random walk using a quantity known as the average commute time distance and find a nonlinear transformation that embeds the nodes of a graph in a Euclidean space where the separation between them is equal to the square root of this quantity. This has been referred to as the Commute Time Distance (CTD) transformation, and it has the important characteristic of increasing when the number of paths between two nodes decreases and/or the lengths of those paths increase. Remarkably, a closed-form solution exists for computing the average commute time distance that avoids running an iterative process: it is found by simply performing an eigendecomposition of the graph Laplacian matrix. This paper discusses the particular graph constructed on the spectral data from which the commute time distance is then calculated, introduces some important properties of the graph Laplacian matrix, and describes a subspace projection that approximately preserves the maximal variance of the square-root commute time distance. Finally, the RX anomaly detection and Topological Anomaly Detection (TAD) algorithms are applied to the CTD subspace, followed by a discussion of their results.
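The closed form mentioned above is compact enough to sketch directly: with Laplacian pseudoinverse L+, the average commute time is c(i,j) = Vol(G) (L+_ii + L+_jj - 2 L+_ij). The affinity construction below (a dense Gaussian kernel over toy points) is an assumption for illustration.

```python
import numpy as np

def commute_time_distances(W):
    """Average commute time from a symmetric affinity matrix W (closed form)."""
    d = W.sum(axis=1)
    L = np.diag(d) - W                  # graph Laplacian
    Lp = np.linalg.pinv(L)              # no iterative random walk needed
    vol = d.sum()                       # Vol(G): total degree
    diag = np.diag(Lp)
    return vol * (diag[:, None] + diag[None, :] - 2 * Lp)

rng = np.random.default_rng(8)
pts = rng.normal(size=(50, 3))                          # toy "spectra"
sq = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1)
W = np.exp(-sq)                                         # Gaussian affinities
C = commute_time_distances(W)
# sqrt(C) behaves as a Euclidean distance in the CTD embedding.
```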
Automatic segmentation of relevant structures in DCE MR mammograms
NASA Astrophysics Data System (ADS)
Koenig, Matthias; Laue, Hendrik; Boehler, Tobias; Peitgen, Heinz-Otto
2007-03-01
The automatic segmentation of relevant structures such as the skin edge, chest wall, or nipple in dynamic contrast-enhanced MR imaging (DCE MRI) of the breast provides additional information for computer-aided diagnosis (CAD) systems. Automatic reporting using BI-RADS criteria benefits from information about the location of those structures: lesion positions can be automatically described relative to such reference structures for reporting purposes. Furthermore, this information can assist data reduction for computationally expensive preprocessing such as registration, or for visualization of only the segments of current interest. In this paper, a novel automatic method for determining the air-breast boundary (skin edge), approximating the chest wall, and locating the nipples is presented. The method consists of several steps that build on each other. Automatic threshold computation leads to the air-breast boundary, which is then analyzed to determine the location of the nipple. Finally, the results of both steps are the starting point for the approximation of the chest wall. The proposed process was evaluated on a large data set of DCE MRI recorded with T1 sequences and yielded reasonable results in all cases.
Automatic Processing of Impact Ionization Mass Spectra Obtained by Cassini CDA
NASA Astrophysics Data System (ADS)
Villeneuve, M.
2015-12-01
Since Cassini's arrival at Saturn in 2004, the Cosmic Dust Analyzer (CDA) has recorded nearly 200,000 mass spectra of dust particles. A majority of this data has been collected in Saturn's diffuse E ring, where sodium salts embedded in water ice particles indicate that many particles are in fact frozen droplets from Enceladus' subsurface ocean that have been expelled from cracks in the icy crust. So far only a small fraction of the obtained spectra have been processed, because the steps in processing the spectra require human manipulation. We developed an automatic processing pipeline for CDA mass spectra which will consistently analyze this data. The preprocessing steps are to de-noise the spectra, determine and remove the baseline, calculate the correct stretch parameter, and finally identify elements and compounds in the spectra. With the E ring constantly evolving due to embedded active moons, this data will provide valuable information about the source of the E ring, the subsurface of Saturn's icy moon Enceladus, and the dynamics of the ring itself.
NASA Astrophysics Data System (ADS)
Shahriari Nia, Morteza; Wang, Daisy Zhe; Bohlman, Stephanie Ann; Gader, Paul; Graves, Sarah J.; Petrovic, Milenko
2015-01-01
Hyperspectral images can be used to identify savannah tree species at the landscape scale, which is a key step in measuring biomass and carbon and in tracking changes in species distributions, including invasive species, in these ecosystems. Before automated species mapping can be performed, image processing and atmospheric correction are often applied, which can potentially affect the performance of classification algorithms. We determine how three processing and correction techniques (atmospheric correction, Gaussian filters, and shade/green vegetation filters) affect the prediction accuracy of pixel-level tree species classification from airborne visible/infrared imaging spectrometer imagery of longleaf pine savanna in Central Florida, United States. Species classification using fast line-of-sight atmospheric analysis of spectral hypercubes (FLAASH) atmospheric correction outperformed ATCOR in the majority of cases. Green vegetation (normalized difference vegetation index) and shade (near-infrared) filters did not increase classification accuracy when applied to large and continuous patches of specific species. Finally, applying a Gaussian filter reduces interband noise and increases species classification accuracy. Using the optimal preprocessing steps, our classification accuracy across six species classes is about 75%.
Users Manual for the Geospatial Stream Flow Model (GeoSFM)
Artan, Guleid A.; Asante, Kwabena; Smith, Jodie; Pervez, Md Shahriar; Entenmann, Debbie; Verdin, James P.; Rowland, James
2008-01-01
The monitoring of wide-area hydrologic events requires the manipulation of large amounts of geospatial and time series data into concise information products that characterize the location and magnitude of the event. To perform these manipulations, scientists at the U.S. Geological Survey Center for Earth Resources Observation and Science (EROS), with the cooperation of the U.S. Agency for International Development, Office of Foreign Disaster Assistance (USAID/OFDA), have implemented a hydrologic modeling system. The system includes a data assimilation component to generate data for a Geospatial Stream Flow Model (GeoSFM) that can be run operationally to identify and map wide-area streamflow anomalies. GeoSFM integrates a geographical information system (GIS) for geospatial preprocessing and postprocessing tasks and hydrologic modeling routines implemented as dynamically linked libraries (DLLs) for time series manipulations. Model results include maps depicting the status of streamflow and soil water conditions. This Users Manual provides step-by-step instructions for running the model and for downloading and processing the input data required for initial model parameterization and daily operation.
Impact of artifact removal on ChIP quality metrics in ChIP-seq and ChIP-exo data
Carroll, Thomas S.; Liang, Ziwei; Salama, Rafik; Stark, Rory; de Santiago, Ines
2014-01-01
With the advent of ChIP-seq multiplexing technologies and the subsequent increase in ChIP-seq throughput, the development of working standards for the quality assessment of ChIP-seq studies has received significant attention. The ENCODE consortium's large-scale analysis of transcription factor binding and epigenetic marks, as well as concordant work on ChIP-seq by other laboratories, has established a new generation of ChIP-seq quality control measures. The use of these metrics alongside common processing steps has, however, not been evaluated. In this study, we investigate the effects of blacklisting and removal of duplicated reads on established metrics of ChIP-seq quality, and show that the interpretation of these metrics is highly dependent on the ChIP-seq preprocessing steps applied. Further to this, we perform the first investigation of the use of these metrics for ChIP-exo data and make recommendations for the adaptation of the NSC statistic to allow for the assessment of ChIP-exo efficiency. PMID:24782889
Processing methods for photoacoustic Doppler flowmetry with a clinical ultrasound scanner
NASA Astrophysics Data System (ADS)
Bücking, Thore M.; van den Berg, Pim J.; Balabani, Stavroula; Steenbergen, Wiendelt; Beard, Paul C.; Brunker, Joanna
2018-02-01
Photoacoustic flowmetry (PAF) based on time-domain cross-correlation of photoacoustic signals is a promising technique for deep-tissue measurement of blood flow velocity. Signal processing has previously been developed for single-element transducers. Here, processing methods for acoustic-resolution PAF using a clinical ultrasound transducer array are developed and validated using a 64-element transducer array with a -6 dB detection band of 11 to 17 MHz. Measurements were performed on a flow phantom consisting of a tube (580 μm inner diameter) perfused with human blood flowing at physiological speeds ranging from 3 to 25 mm/s. The processing pipeline comprised image reconstruction, filtering, displacement detection, and masking. High-pass filtering and background subtraction were found to be key preprocessing steps for enabling accurate flow velocity estimates, which were calculated using a cross-correlation based method. In addition, the regions of interest in the calculated velocity maps were defined using a masking approach based on the amplitude of the cross-correlation functions. These developments enabled blood flow measurements using a transducer array, bringing PAF one step closer to clinical applicability.
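The displacement-detection core reduces to finding the lag that maximizes the cross-correlation between successive A-lines. The sketch below recovers a known shift from synthetic signals; the sampling rate, sound speed, and one-way geometry used to convert lag to velocity are assumptions.

```python
import numpy as np

def estimate_lag(sig_a, sig_b):
    """Sample lag of sig_b relative to sig_a at the cross-correlation peak."""
    xc = np.correlate(sig_b, sig_a, mode="full")
    return int(np.argmax(xc)) - (len(sig_a) - 1)

rng = np.random.default_rng(9)
a = rng.normal(size=2048)                        # first A-line
b = np.roll(a, 7) + 0.1 * rng.normal(size=2048)  # second A-line, shifted by 7

fs = 62.5e6          # assumed sampling rate, Hz
c = 1540.0           # assumed speed of sound, m/s
dt = 300e-6          # assumed time between illumination pulses, s

lag = estimate_lag(a, b)
displacement = lag * c / fs        # m, one-way photoacoustic geometry (assumed)
velocity = displacement / dt       # m/s
print(lag, velocity)
```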
Assessing the performance of a motion tracking system based on optical joint transform correlation
NASA Astrophysics Data System (ADS)
Elbouz, M.; Alfalou, A.; Brosseau, C.; Ben Haj Yahia, N.; Alam, M. S.
2015-08-01
We present an optimized system specially designed for the tracking and recognition of moving subjects in a confined environment (such as an elderly person remaining at home). In the first step of our study, we use a VanderLugt correlator (VLC) with an adapted pre-processing treatment of the input plane and a post-processing of the correlation plane via a nonlinear function that allows us to make a robust decision. The second step is based on an optical joint transform correlation (JTC) system (NZ-NL-correlation JTC) for achieving improved detection and tracking of moving persons in a confined space. The proposed system has been found to have significantly superior discrimination and robustness capabilities, allowing it to detect an unknown target in an input scene and to determine the target's trajectory when the target is in motion. The system offers robust tracking performance of a moving target in several scenarios, such as rotational variation of input faces. Test results obtained using various real-life video sequences show that the proposed system is particularly suitable for real-time detection and tracking of moving objects.
Quantum Search in Hilbert Space
NASA Technical Reports Server (NTRS)
Zak, Michail
2003-01-01
A proposed quantum-computing algorithm would perform a search for an item of information in a database stored in a Hilbert-space memory structure. The algorithm is intended to make it possible to search relatively quickly through a large database under conditions in which available computing resources would otherwise be considered inadequate to perform such a task. The algorithm would apply, more specifically, to a relational database in which information would be stored in a set of N complex orthonormal vectors, each of N dimensions (where N can be exponentially large). Each vector would constitute one row of a unitary matrix, from which one would derive the Hamiltonian operator (and hence the evolutionary operator) of a quantum system. In other words, all the stored information would be mapped onto a unitary operator acting on a quantum state that would represent the item of information to be retrieved. Then one could exploit quantum parallelism: one could pose all search queries simultaneously by performing a quantum measurement on the system. In so doing, one would effectively solve the search problem in one computational step. One could exploit the direct- and inner-product decomposability of the unitary matrix to make the dimensionality of the memory space exponentially large by use of only linear resources. However, inasmuch as the necessary preprocessing (the mapping of the stored information into a Hilbert space) could be exponentially expensive, the proposed algorithm would likely be most beneficial in applications in which the resources available for preprocessing were much greater than those available for searching.
Genetic Algorithm for Optimization: Preprocessing with n Dimensional Bisection and Error Estimation
NASA Technical Reports Server (NTRS)
Sen, S. K.; Shaykhian, Gholam Ali
2006-01-01
A knowledge of the appropriate values of the parameters of a genetic algorithm (GA), such as the population size, the shrunk search space containing the solution, and the crossover and mutation probabilities, is not available a priori for a general optimization problem. Recommended here is a polynomial-time preprocessing scheme that includes an n-dimensional bisection and that determines the foregoing parameters before deciding upon an appropriate GA for all problems of similar nature and type. Such preprocessing is not only fast but also enables us to obtain the global optimal solution and reasonably narrow error bounds on it with a high degree of confidence.
Muralidhar, Gautam S; Channappayya, Sumohana S; Slater, John H; Blinka, Ellen M; Bovik, Alan C; Frey, Wolfgang; Markey, Mia K
2008-11-06
Automated analysis of fluorescence microscopy images of endothelial cells labeled for actin is important for quantifying changes in the actin cytoskeleton. The current manual approach is laborious and inefficient. The goal of our work is to develop automated image analysis methods, thereby increasing cell analysis throughput. In this study, we present preliminary results on comparing different algorithms for cell segmentation and image denoising.
Practical Implementation of Semi-Automated As-Built Bim Creation for Complex Indoor Environments
NASA Astrophysics Data System (ADS)
Yoon, S.; Jung, J.; Heo, J.
2015-05-01
In recent years, for efficient management and operation of existing buildings, the importance of as-built BIM has been emphasized in the AEC/FM domain. However, fully automated as-built BIM creation remains a tough issue, since newly-constructed buildings are becoming more complex. To manage this problem, our research group has developed a semi-automated approach focusing on productive 3D as-built BIM creation for complex indoor environments. To test its feasibility for a variety of complex indoor environments, we applied the developed approach to model the 'Charlotte stairs' in Lotte World Mall, Korea. The approach includes 4 main phases: data acquisition, data pre-processing, geometric drawing, and as-built BIM creation. In the data acquisition phase, due to the site's complex structure, we moved the scanner location several times to obtain the complete point clouds of the test site. A data pre-processing phase entailing point-cloud registration, noise removal, and coordinate transformation followed. The 3D geometric drawing was created using RANSAC-based plane detection and boundary tracing methods. Finally, to create a semantically rich BIM, the geometric drawing was imported into commercial BIM software. The final as-built BIM confirmed the feasibility of the proposed approach in a complex indoor environment.
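The RANSAC-based plane detection in the geometric drawing phase can be sketched with Open3D; the file name and thresholds are illustrative, not the study's values.

```python
import open3d as o3d

# Hypothetical registered, denoised scan from the pre-processing phase.
pcd = o3d.io.read_point_cloud("stairs_registered.ply")

# RANSAC plane fit: returns the plane coefficients (a, b, c, d) and the
# indices of inlier points; repeat on the remaining points for more planes.
plane, inliers = pcd.segment_plane(distance_threshold=0.02,
                                   ransac_n=3,
                                   num_iterations=1000)
surface = pcd.select_by_index(inliers)
rest = pcd.select_by_index(inliers, invert=True)
print("plane a,b,c,d:", plane)
```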
NASA Astrophysics Data System (ADS)
Anderson, R. B.; Finch, N.; Clegg, S. M.; Graff, T.; Morris, R. V.; Laura, J.
2018-04-01
The PySAT point spectra tool provides a flexible graphical interface, enabling scientists to apply a wide variety of preprocessing and machine learning methods to point spectral data, with an emphasis on multivariate regression.
SeqTrim: a high-throughput pipeline for pre-processing any type of sequence read
2010-01-01
Background High-throughput automated sequencing has enabled an exponential growth rate of sequencing data. This requires increasing sequence quality and reliability in order to avoid database contamination with artefactual sequences. The arrival of pyrosequencing exacerbates this problem and necessitates customisable pre-processing algorithms. Results SeqTrim has been implemented both as a Web application and as a standalone command line application. Already-published and newly-designed algorithms have been included to identify sequence inserts; to remove low-quality, vector, adaptor, low-complexity, and contaminant sequences; and to detect chimeric reads. The availability of several input and output formats allows its inclusion in sequence processing workflows. Due to its specific algorithms, SeqTrim outperforms other pre-processors implemented as Web services or standalone applications. It performs equally well with sequences from EST libraries, SSH libraries, genomic DNA libraries, and pyrosequencing reads, and does not lead to over-trimming. Conclusions SeqTrim is an efficient pipeline designed for pre-processing any type of sequence read, including next-generation sequencing. It is easily configurable and provides a friendly interface that allows users to know what happened to sequences at every pre-processing stage, and to verify the pre-processing of an individual sequence if desired. The recommended pipeline reveals more information about each sequence than previously described pre-processors and can discard more sequencing or experimental artefacts. PMID:20089148
Zoller, Christian Johannes; Hohmann, Ansgar; Foschum, Florian; Geiger, Simeon; Geiger, Martin; Ertl, Thomas Peter; Kienle, Alwin
2018-06-01
A GPU-based Monte Carlo software package (MCtet) was developed to calculate light propagation in arbitrarily shaped objects, such as a human tooth, represented by a tetrahedral mesh. A unique feature of MCtet is a concept for realizing different kinds of light sources illuminating the complex-shaped surface of an object, for which no preprocessing step is needed. With this concept, it is also possible to account for photons leaving a turbid medium and re-entering it, in the case of a concave object. Correct implementation was shown by comparison with five other Monte Carlo software packages. A hundredfold acceleration compared with CPU-based programs was found. MCtet can simulate anisotropic light propagation, e.g., by accounting for scattering at cylindrical structures. The important influence of anisotropic light propagation, caused, e.g., by the tubules in human dentin, is shown for the transmission spectrum through a tooth. It was found that the sensitivity of transmission spectra to a change in oxygen saturation inside the pulp is much larger when the tubules are considered. Another "light guiding" effect, based on a combination of low scattering and a high refractive index in enamel, is described. (2018) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE).
Task-specific image partitioning.
Kim, Sungwoong; Nowozin, Sebastian; Kohli, Pushmeet; Yoo, Chang D
2013-02-01
Image partitioning is an important preprocessing step for many state-of-the-art algorithms used for performing high-level computer vision tasks. Typically, partitioning is conducted without regard to the task at hand. We propose a task-specific image partitioning framework that produces a region-based image representation leading to higher task performance than that reached using any task-oblivious partitioning framework or the few existing supervised partitioning frameworks. The proposed method partitions the image by means of correlation clustering, maximizing a linear discriminant function defined over a superpixel graph. The parameters of the discriminant function, which define task-specific similarity/dissimilarity among superpixels, are estimated with a structured support vector machine (S-SVM) using task-specific training data. S-SVM learning leads to better generalization ability, while the construction of the superpixel graph used to define the discriminant function allows a rich set of features to be incorporated to improve discriminability and robustness. We evaluate the learned task-aware partitioning algorithms on three benchmark datasets. Results show that task-aware partitioning leads to better labeling performance than the partitioning computed by state-of-the-art general-purpose and supervised partitioning algorithms. We believe that the task-specific image partitioning paradigm is widely applicable to improving performance in high-level image understanding tasks.
Liao, Xiaolei; Zhao, Juanjuan; Jiao, Cheng; Lei, Lei; Qiang, Yan; Cui, Qiang
2016-01-01
Background Lung parenchyma segmentation is often performed as an important pre-processing step in the computer-aided diagnosis of lung nodules based on CT image sequences. However, existing lung parenchyma image segmentation methods cannot fully segment all lung parenchyma images and have a slow processing speed, particularly for images in the top and bottom of the lung and the images that contain lung nodules. Method Our proposed method first uses the position of the lung parenchyma image features to obtain lung parenchyma ROI image sequences. A gradient and sequential linear iterative clustering algorithm (GSLIC) for sequence image segmentation is then proposed to segment the ROI image sequences and obtain superpixel samples. The SGNF, which is optimized by a genetic algorithm (GA), is then utilized for superpixel clustering. Finally, the grey and geometric features of the superpixel samples are used to identify and segment all of the lung parenchyma image sequences. Results Our proposed method achieves higher segmentation precision and greater accuracy in less time. It has an average processing time of 42.21 seconds for each dataset and an average volume pixel overlap ratio of 92.22 ± 4.02% for four types of lung parenchyma image sequences. PMID:27532214
A Review on Biomass Torrefaction Process and Product Properties
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jaya Shankar Tumuluru; Shahab Sokhansanj; Christopher T. Wright
2011-08-01
Biomass torrefaction is gaining attention as an important preprocessing step to improve the quality of biomass in terms of physical properties and chemical composition. Torrefaction is a slow heating of biomass in an inert or reduced environment to a maximum temperature of approximately 300 °C. Torrefaction can also be defined as a group of products resulting from the partially controlled and isothermal pyrolysis of biomass occurring in a temperature range of 200-280 °C. Thus, the process can be called a mild pyrolysis, as it occurs at the lower temperature range of the pyrolysis process. At the end of the torrefaction process, a solid uniform product with lower moisture content and higher energy content than raw biomass is produced. Most of the smoke-producing compounds and other volatiles are removed during torrefaction, which produces a final product with a lower mass but a higher heating value. The present review looks into (a) the torrefaction process and the different products produced during the process, and (b) the properties of the solid torrefied material, which include: (i) physical properties like moisture content, density, grindability, particle size distribution, particle surface area, and pelletability; (ii) chemical properties like proximate and ultimate composition; and (iii) storage properties like off-gassing and spontaneous combustion.
NASA Astrophysics Data System (ADS)
Yu, H.; Barriga, S.; Agurto, C.; Zamora, G.; Bauman, W.; Soliz, P.
2012-03-01
Retinal vasculature is one of the most important anatomical structures in digital retinal photographs. Accurate segmentation of retinal blood vessels is an essential task in the automated analysis of retinopathy. This paper presents a new and effective vessel segmentation algorithm featuring computational simplicity and fast implementation. The method uses morphological pre-processing to decrease the disturbance of bright structures and lesions before vessel extraction. Next, a vessel probability map is generated by computing the eigenvalues of the second derivatives of the Gaussian-filtered image at multiple scales. Then, second-order local entropy thresholding is applied to segment the vessel map. Lastly, a rule-based decision step, which measures the geometric shape difference between vessels and lesions, is applied to reduce false positives. The algorithm is evaluated on the low-resolution DRIVE and STARE databases and on the publicly available high-resolution image database from Friedrich-Alexander University Erlangen-Nuremberg, Germany. The proposed method achieved performance comparable to state-of-the-art unsupervised vessel segmentation methods, at a competitively faster speed, on the DRIVE and STARE databases. For the high-resolution fundus image database, the proposed algorithm outperforms an existing approach in both performance and speed. This efficiency and robustness make the blood vessel segmentation method described here suitable for broad application in the automated analysis of retinal images.
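The multiscale Hessian-eigenvalue step is closely related to the Frangi vesselness filter, which scikit-image provides; the sketch below uses it as a stand-in for the paper's vessel probability map, with a crude percentile threshold in place of the local entropy step.

```python
import numpy as np
from skimage.filters import frangi

rng = np.random.default_rng(10)
image = rng.random((256, 256))     # stand-in for a green-channel fundus image

# Multiscale second-derivative (Hessian) eigenvalue analysis; Frangi's
# filter is a stand-in for the paper's vessel probability map.
vesselness = frangi(image, sigmas=range(1, 6))

# Crude global threshold for illustration; the paper applies second-order
# local entropy thresholding followed by a rule-based false-positive step.
mask = vesselness > np.percentile(vesselness, 90)
```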
NASA Astrophysics Data System (ADS)
van Aardt, J. A.; Wu, J.; Asner, G. P.
2010-12-01
Our understanding of vegetation complexity and biodiversity, from a remote sensing perspective, has evolved from 2D species diversity to also include 3D vegetation structural diversity. Attempts at using image-based approaches for structural assessment have met with reasonable success, but 3D remote sensing technologies, such as radar and light detection and ranging (lidar), are arguably more adept at sensing vegetation structure. While radar-derived structure metrics tend to break down at high biomass levels, novel waveform lidar systems present us with new opportunities for detailed and scalable structural characterization of vegetation. These sensors digitize the entire backscattered energy profile at high spatial and vertical resolutions and often at off-nadir angles. Research teams at Rochester Institute of Technology (RIT) and the Carnegie Institution for Science have been using airborne data from the Carnegie Airborne Observatory (CAO) to assess vegetation structure and variation in savanna ecosystems in and around the Kruger National Park, South Africa. It quickly became evident that (i) pre-processing of small-footprint waveform data is a critical step prior to testing scientific hypotheses, (ii) a number of assumptions about how vegetation structure is expressed in these 3D signals need to be evaluated, and, very importantly, (iii) we need to re-evaluate our linkages between coarse in-field measurements, e.g., volume, biomass, leaf area index (LAI), and metrics derived from waveform lidar. Research has progressed to the stage where we have evaluated various pre-processing steps, e.g., deconvolution via the Wiener filter, Richardson-Lucy, and non-negative least-squares algorithms, and the coupling of waveform voxels to tree structure in a simulation environment. This was done in the MODTRAN-based Digital Imaging and Remote Sensing Image Generation (DIRSIG) simulation environment, developed at RIT. We generated "truth" cross-section datasets of detailed virtual trees in this environment and evaluated inversion approaches to tree structure estimation. Various outgoing pulse widths, tree structures, and a noise component were included as part of the simulation effort. Results, for example, have shown that the Richardson-Lucy algorithm outperforms other approaches in terms of retrieval of known structural information and that our assumption regarding the position of the ground surface needs re-evaluation; they have also shed light on herbaceous biomass-waveform interactions and the impact of outgoing pulse width on assessments. These efforts have gone a long way in providing a solid foundation for analysis and interpretation of actual waveform data from the savanna study area. We expect that newfound knowledge with respect to waveform-target interactions from these simulations will also aid efforts to reconstruct 3D trees from real data and better describe associated structural diversity. Results will be presented at the conference.
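For intuition about the deconvolution step, the sketch below is a bare-bones 1-D Richardson-Lucy iteration in which the digitized outgoing pulse serves as the point spread function. It is an illustrative toy under that assumption, not the processing chain used for the CAO data; the iteration count is arbitrary.

```python
import numpy as np

def richardson_lucy_1d(waveform, psf, n_iter=30):
    """Richardson-Lucy deconvolution of a lidar return waveform,
    given the outgoing pulse shape (psf); both non-negative 1-D arrays."""
    psf = psf / psf.sum()
    psf_flip = psf[::-1]                       # mirrored PSF for the update
    estimate = np.full_like(waveform, waveform.mean(), dtype=float)
    for _ in range(n_iter):
        blurred = np.convolve(estimate, psf, mode='same')
        ratio = waveform / (blurred + 1e-12)   # data / model agreement
        estimate *= np.convolve(ratio, psf_flip, mode='same')
    return estimate
```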
ERIC Educational Resources Information Center
Marill, Thomas; And Others
The aim of the CYCLOPS Project research is the development of techniques for allowing computers to perform visual scene analysis, pre-processing of visual imagery, and perceptual learning. Work on scene analysis and learning has previously been described. The present report deals with research on pre-processing and with further work on scene…
USDA-ARS?s Scientific Manuscript database
In multivariate regression analysis of spectroscopy data, spectral preprocessing is often performed to reduce unwanted background information (offsets, sloped baselines) or accentuate absorption features in intrinsically overlapping bands. These procedures, also known as pretreatments, are commonly ...
Fast automatic delineation of cardiac volume of interest in MSCT images
NASA Astrophysics Data System (ADS)
Lorenz, Cristian; Lessick, Jonathan; Lavi, Guy; Bulow, Thomas; Renisch, Steffen
2004-05-01
Computed Tomography Angiography (CTA) is an emerging modality for assessing cardiac anatomy. The delineation of the cardiac volume of interest (VOI) is a pre-processing step for subsequent visualization or image processing. It serves to suppress anatomical structures that are not the primary focus of the cardiac application, such as the sternum, ribs, spinal column, descending aorta and pulmonary vasculature. These structures obliterate standard visualizations such as direct volume renderings or maximum intensity projections. In addition, the outcome and performance of post-processing steps such as ventricle suppression, coronary artery segmentation or the detection of the short and long axes of the heart can be improved. The structures that form the cardiac VOI (coronary arteries and veins, myocardium, ventricles and atria) differ tremendously in appearance. Moreover, there is no clear image feature associated with the contour (or, more precisely, cut-surface) separating the cardiac VOI from surrounding tissue, which makes automatic delineation of the cardiac VOI a difficult task. The presented approach first locates the chest wall and descending aorta in all image slices, giving a rough estimate of the location of the heart. In a second step, a Fourier-based active contour approach delineates the border of the cardiac VOI slice by slice. The algorithm has been evaluated on 41 multi-slice CT datasets, including cases with coronary stents and venous and arterial bypasses. The typical processing time amounts to 5-10 s on a 1 GHz P3 PC.
Robust detrending, rereferencing, outlier detection, and inpainting for multichannel data.
de Cheveigné, Alain; Arzounian, Dorothée
2018-05-15
Electroencephalography (EEG), magnetoencephalography (MEG) and related techniques are prone to glitches, slow drift, steps, etc., that contaminate the data and interfere with the analysis and interpretation. These artifacts are usually addressed in a preprocessing phase that attempts to remove them or minimize their impact. This paper offers a set of useful techniques for this purpose: robust detrending, robust rereferencing, outlier detection, data interpolation (inpainting), step removal, and filter ringing artifact removal. These techniques provide a less wasteful alternative to discarding corrupted trials or channels, and they are relatively immune to artifacts that disrupt alternative approaches such as filtering. Robust detrending allows slow drifts and common mode signals to be factored out while avoiding the deleterious effects of glitches. Robust rereferencing reduces the impact of artifacts on the reference. Inpainting allows corrupt data to be interpolated from intact parts based on the correlation structure estimated over the intact parts. Outlier detection allows the corrupt parts to be identified. Step removal fixes the high-amplitude flux jump artifacts that are common with some MEG systems. Ringing removal allows the ringing response of the antialiasing filter to glitches (steps, pulses) to be suppressed. The performance of the methods is illustrated and evaluated using synthetic data and data from real EEG and MEG systems. These methods, which are mainly automatic and require little tuning, can greatly improve the quality of the data. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.
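The robust detrending idea (estimate the slow trend while down-weighting glitch samples, then subtract it) can be sketched as an iteratively reweighted polynomial fit. This is a minimal sketch of the concept, not the authors' implementation; the polynomial order, iteration count and outlier threshold are illustrative assumptions.

```python
import numpy as np

def robust_detrend(x, order=3, n_iter=3, thresh=3.0):
    """Fit a slow polynomial trend while masking outliers (glitches),
    then subtract it; a sketch of robust detrending for one channel."""
    t = np.linspace(-1, 1, len(x))
    w = np.ones_like(x)                       # sample weights (1 = trusted)
    for _ in range(n_iter):
        coefs = np.polynomial.polynomial.polyfit(t, x, order, w=w)
        trend = np.polynomial.polynomial.polyval(t, coefs)
        resid = x - trend
        sigma = np.std(resid[w > 0]) + 1e-12
        w = (np.abs(resid) < thresh * sigma).astype(float)  # mask outliers
    return x - trend
```

Because glitch samples get zero weight, they no longer drag the fitted trend, which is the deleterious effect ordinary detrending suffers from.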
Monthly reservoir inflow forecasting using a new hybrid SARIMA genetic programming approach
NASA Astrophysics Data System (ADS)
Moeeni, Hamid; Bonakdari, Hossein; Ebtehaj, Isa
2017-03-01
Forecasting reservoir inflow is one of the most important components of water resources and hydroelectric systems operation management. Seasonal autoregressive integrated moving average (SARIMA) models have been frequently used for predicting river flow. SARIMA models are linear and do not consider the random component of statistical data. To overcome this shortcoming, monthly inflow is predicted in this study based on a combination of seasonal autoregressive integrated moving average (SARIMA) and gene expression programming (GEP) models, which is a new hybrid method (SARIMA-GEP). To this end, a four-step process is employed. First, the monthly inflow datasets are pre-processed. Second, the datasets are modelled linearly with SARIMA, and in the third stage the non-linearity of the residual series caused by the linear modelling is evaluated. After confirming the non-linearity, the residuals are modelled in the fourth step using a gene expression programming (GEP) method. The proposed hybrid model is employed to predict the monthly inflow to the Jamishan Dam in west Iran. Thirty years' worth of site measurements of monthly reservoir dam inflow with extreme seasonal variations are used. The results of this hybrid model (SARIMA-GEP) are compared with SARIMA, GEP, artificial neural network (ANN) and SARIMA-ANN models. The results indicate that the SARIMA-GEP model (R² = 78.8, VAF = 78.8, RMSE = 0.89, MAPE = 43.4, CRM = 0.053) outperforms SARIMA and GEP, and that SARIMA-ANN (R² = 68.3, VAF = 66.4, RMSE = 1.12, MAPE = 56.6, CRM = 0.032) displays better performance than the SARIMA and ANN models. A comparison of the two hybrid models indicates the superiority of SARIMA-GEP over the SARIMA-ANN model.
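The hybrid structure (a linear SARIMA stage plus a nonlinear model on its residuals) can be sketched as follows. Gene expression programming is not available in standard Python libraries, so a gradient-boosting regressor stands in for the GEP stage here; the model orders and the residual lag count are illustrative assumptions.

```python
import numpy as np
from statsmodels.tsa.statespace.sarimax import SARIMAX
from sklearn.ensemble import GradientBoostingRegressor

def fit_hybrid(inflow, order=(1, 0, 1), seasonal=(1, 1, 1, 12), lags=12):
    """Fit SARIMA to monthly inflow, then a nonlinear model on the
    residuals (gradient boosting stands in for the paper's GEP)."""
    sarima = SARIMAX(inflow, order=order, seasonal_order=seasonal).fit(disp=False)
    resid = np.asarray(sarima.resid)
    # lagged-residual design matrix: row j holds resid[j .. j+lags-1]
    X = np.column_stack([resid[i:len(resid) - lags + i] for i in range(lags)])
    y = resid[lags:]
    resid_model = GradientBoostingRegressor().fit(X, y)
    return sarima, resid_model
```

The final forecast is the SARIMA forecast plus the residual model's correction, which is what lets the hybrid capture the nonlinear component the linear stage misses.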
Visualization of volumetric seismic data
NASA Astrophysics Data System (ADS)
Spickermann, Dela; Böttinger, Michael; Ashfaq Ahmed, Khawar; Gajewski, Dirk
2015-04-01
Mostly driven by demands of high-quality subsurface imaging, highly specialized tools and methods have been developed to support the processing, visualization and interpretation of seismic data. 3D seismic data acquisition and 4D time-lapse seismic monitoring are well-established techniques in academia and industry, producing large amounts of data to be processed, visualized and interpreted. In this context, interactive 3D visualization methods have proved to be valuable for the analysis of 3D seismic data cubes - especially for sedimentary environments with continuous horizons. In crystalline and hard rock environments, where hydraulic stimulation techniques may be applied to produce geothermal energy, interpretation of the seismic data is a more challenging problem. Instead of continuous reflection horizons, the imaging targets are often steeply dipping faults, causing a lot of diffractions. Without further preprocessing these geological structures are often hidden behind the noise in the data. In this PICO presentation we will present a workflow consisting of data processing steps, which enhance the signal-to-noise ratio, followed by a visualization step based on the use of the commercially available general-purpose 3D visualization system Avizo. Specifically, we have used Avizo Earth, an extension to Avizo, which supports the import of seismic data in SEG-Y format and offers easy access to state-of-the-art 3D visualization methods at interactive frame rates, even for large seismic data cubes. In seismic interpretation using visualization, interactivity is a key requirement for understanding complex 3D structures. In order to enable easy communication of the insights gained during the interactive visualization process, animations of the visualized data were created which support the spatial understanding of the data.
Peak Detection Method Evaluation for Ion Mobility Spectrometry by Using Machine Learning Approaches
Hauschild, Anne-Christin; Kopczynski, Dominik; D’Addario, Marianna; Baumbach, Jörg Ingo; Rahmann, Sven; Baumbach, Jan
2013-01-01
Ion mobility spectrometry with pre-separation by multi-capillary columns (MCC/IMS) has become an established inexpensive, non-invasive bioanalytics technology for detecting volatile organic compounds (VOCs) with various metabolomics applications in medical research. To pave the way for this technology towards daily usage in medical practice, different steps still have to be taken. With respect to modern biomarker research, one of the most important tasks is the automatic classification of patient-specific data sets into different groups, healthy or not, for instance. Although sophisticated machine learning methods exist, an inevitable preprocessing step is reliable and robust peak detection without manual intervention. In this work we evaluate four state-of-the-art approaches for automated IMS-based peak detection: local maxima search, watershed transformation with IPHEx, region-merging with VisualNow, and peak model estimation (PME). We manually generated a gold standard with the aid of a domain expert (manual) and compare the performance of the four peak calling methods with respect to two distinct criteria. We first utilize established machine learning methods and systematically study their classification performance based on the four peak detectors’ results. Second, we investigate the classification variance and robustness regarding perturbation and overfitting. Our main finding is that the power of the classification accuracy is almost equally good for all methods, the manually created gold standard as well as the four automatic peak finding methods. In addition, we note that all tools, manual and automatic, are similarly robust against perturbations. However, the classification performance is more robust against overfitting when using the PME as peak calling preprocessor. In summary, we conclude that all methods, though small differences exist, are largely reliable and enable a wide spectrum of real-world biomedical applications. PMID:24957992
Peak detection method evaluation for ion mobility spectrometry by using machine learning approaches.
Hauschild, Anne-Christin; Kopczynski, Dominik; D'Addario, Marianna; Baumbach, Jörg Ingo; Rahmann, Sven; Baumbach, Jan
2013-04-16
Ion mobility spectrometry with pre-separation by multi-capillary columns (MCC/IMS) has become an established inexpensive, non-invasive bioanalytics technology for detecting volatile organic compounds (VOCs) with various metabolomics applications in medical research. To pave the way for this technology towards daily usage in medical practice, different steps still have to be taken. With respect to modern biomarker research, one of the most important tasks is the automatic classification of patient-specific data sets into different groups, healthy or not, for instance. Although sophisticated machine learning methods exist, an inevitable preprocessing step is reliable and robust peak detection without manual intervention. In this work we evaluate four state-of-the-art approaches for automated IMS-based peak detection: local maxima search, watershed transformation with IPHEx, region-merging with VisualNow, and peak model estimation (PME). We manually generated a gold standard with the aid of a domain expert (manual) and compare the performance of the four peak calling methods with respect to two distinct criteria. We first utilize established machine learning methods and systematically study their classification performance based on the four peak detectors' results. Second, we investigate the classification variance and robustness regarding perturbation and overfitting. Our main finding is that the power of the classification accuracy is almost equally good for all methods, the manually created gold standard as well as the four automatic peak finding methods. In addition, we note that all tools, manual and automatic, are similarly robust against perturbations. However, the classification performance is more robust against overfitting when using the PME as peak calling preprocessor. In summary, we conclude that all methods, though small differences exist, are largely reliable and enable a wide spectrum of real-world biomedical applications.
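Of the four detectors compared, the local-maxima-search baseline is simple enough to sketch directly. The snippet below applies SciPy's find_peaks to a single 1-D spectrum with a robust (median/MAD) noise estimate; the height, distance and prominence settings are illustrative assumptions, not the parameters used in the study.

```python
import numpy as np
from scipy.signal import find_peaks

def detect_peaks(spectrum):
    """Local maxima search on a 1-D IMS spectrum: keep maxima that
    rise sufficiently above a robust estimate of the noise level."""
    med = np.median(spectrum)
    noise_floor = med + 3 * np.median(np.abs(spectrum - med))  # MAD-based
    idx, props = find_peaks(spectrum, height=noise_floor, distance=5,
                            prominence=noise_floor / 2)
    return idx, props["peak_heights"]
```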
Bridge Structure Deformation Prediction Based on GNSS Data Using Kalman-ARIMA-GARCH Model
Li, Xiaoqing; Wang, Yu
2018-01-01
Bridges are an essential part of the ground transportation system. Health monitoring is fundamentally important for the safety and service life of bridges. A large amount of structural information is obtained from various sensors using sensing technology, and the data processing has become a challenging issue. To improve the prediction accuracy of bridge structure deformation based on data mining and to accurately evaluate the time-varying characteristics of bridge structure performance evolution, this paper proposes a new method for bridge structure deformation prediction, which integrates the Kalman filter, autoregressive integrated moving average model (ARIMA), and generalized autoregressive conditional heteroskedasticity (GARCH). Firstly, the raw deformation data is directly pre-processed using the Kalman filter to reduce the noise. After that, the linear recursive ARIMA model is established to analyze and predict the structure deformation. Finally, the nonlinear recursive GARCH model is introduced to further improve the accuracy of the prediction. Simulation results based on measured sensor data from the Global Navigation Satellite System (GNSS) deformation monitoring system demonstrated that: (1) the Kalman filter is capable of denoising the bridge deformation monitoring data; (2) the prediction accuracy of the proposed Kalman-ARIMA-GARCH model is satisfactory, where the mean absolute error increases only from 3.402 mm to 5.847 mm with the increment of the prediction step; and (3) in comparison to the Kalman-ARIMA model, the Kalman-ARIMA-GARCH model results in superior prediction accuracy as it includes partial nonlinear characteristics (heteroscedasticity); the mean absolute error of five-step prediction using the proposed model is improved by 10.12%. This paper provides a new way for structural behavior prediction based on data processing, which can lay a foundation for the early warning of bridge health monitoring systems based on sensor data using sensing technology. PMID:29351254
Bridge Structure Deformation Prediction Based on GNSS Data Using Kalman-ARIMA-GARCH Model.
Xin, Jingzhou; Zhou, Jianting; Yang, Simon X; Li, Xiaoqing; Wang, Yu
2018-01-19
Bridges are an essential part of the ground transportation system. Health monitoring is fundamentally important for the safety and service life of bridges. A large amount of structural information is obtained from various sensors using sensing technology, and the data processing has become a challenging issue. To improve the prediction accuracy of bridge structure deformation based on data mining and to accurately evaluate the time-varying characteristics of bridge structure performance evolution, this paper proposes a new method for bridge structure deformation prediction, which integrates the Kalman filter, autoregressive integrated moving average model (ARIMA), and generalized autoregressive conditional heteroskedasticity (GARCH). Firstly, the raw deformation data is directly pre-processed using the Kalman filter to reduce the noise. After that, the linear recursive ARIMA model is established to analyze and predict the structure deformation. Finally, the nonlinear recursive GARCH model is introduced to further improve the accuracy of the prediction. Simulation results based on measured sensor data from the Global Navigation Satellite System (GNSS) deformation monitoring system demonstrated that: (1) the Kalman filter is capable of denoising the bridge deformation monitoring data; (2) the prediction accuracy of the proposed Kalman-ARIMA-GARCH model is satisfactory, where the mean absolute error increases only from 3.402 mm to 5.847 mm with the increment of the prediction step; and (3) in comparison to the Kalman-ARIMA model, the Kalman-ARIMA-GARCH model results in superior prediction accuracy as it includes partial nonlinear characteristics (heteroscedasticity); the mean absolute error of five-step prediction using the proposed model is improved by 10.12%. This paper provides a new way for structural behavior prediction based on data processing, which can lay a foundation for the early warning of bridge health monitoring systems based on sensor data using sensing technology.
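The first step, Kalman denoising of the raw GNSS deformation series, can be sketched as a scalar random-walk Kalman filter. This is a minimal sketch under that model assumption; the process and measurement variances q and r are illustrative, not values from the paper.

```python
import numpy as np

def kalman_denoise(z, q=1e-4, r=1e-1):
    """Scalar random-walk Kalman filter for denoising a 1-D GNSS
    deformation series z; q/r are process/measurement variances."""
    x, p = z[0], 1.0
    out = np.empty(len(z), dtype=float)
    for k, zk in enumerate(z):
        p = p + q                      # predict: state uncertainty grows
        k_gain = p / (p + r)           # update: blend prediction and data
        x = x + k_gain * (zk - x)
        p = (1.0 - k_gain) * p
        out[k] = x
    return out
```

The denoised series would then feed the ARIMA stage, with GARCH modeling the remaining heteroscedastic residuals.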
New decision support tool for acute lymphoblastic leukemia classification
NASA Astrophysics Data System (ADS)
Madhukar, Monica; Agaian, Sos; Chronopoulos, Anthony T.
2012-03-01
In this paper, we build up a new decision support tool to improve treatment intensity choice in childhood ALL. The developed system includes different methods to accurately measure cell properties in microscope blood film images. The blood images are subjected to a series of pre-processing steps, which include color correlation and contrast enhancement. By performing K-means clustering on the resultant images, the nuclei of the cells under consideration are obtained. Shape features and texture features are then extracted for classification. The system is further tested on the classification of spectra measured from the cell nuclei in blood samples in order to distinguish normal cells from those affected by Acute Lymphoblastic Leukemia. The results show that the proposed system robustly segments and classifies acute lymphoblastic leukemia based on complete microscopic blood images.
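The K-means nucleus-extraction step can be sketched with scikit-learn, clustering pixels in Lab colour space. The cluster count and the darkest-cluster heuristic for picking the nuclei are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np
from skimage.color import rgb2lab
from sklearn.cluster import KMeans

def segment_nuclei(rgb_image, k=3):
    """Cluster pixels of a blood-smear image in Lab colour space and
    take the darkest cluster as the nuclei (illustrative heuristic)."""
    lab = rgb2lab(rgb_image)
    pixels = lab.reshape(-1, 3)
    labels = KMeans(n_clusters=k, n_init=10).fit_predict(pixels)
    # nuclei stain darkest: pick the cluster with the lowest mean lightness
    mean_L = [pixels[labels == c, 0].mean() for c in range(k)]
    nuclei = (labels == int(np.argmin(mean_L))).reshape(rgb_image.shape[:2])
    return nuclei
```

Shape and texture features would then be computed on the resulting nucleus mask for the classification stage.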
Epipolar Rectification for CARTOSAT-1 Stereo Images Using SIFT and RANSAC
NASA Astrophysics Data System (ADS)
Akilan, A.; Sudheer Reddy, D.; Nagasubramanian, V.; Radhadevi, P. V.; Varadan, G.
2014-11-01
Cartosat-1 provides stereo images at a spatial resolution of 2.5 m with high geometric fidelity. The stereo cameras on the spacecraft have look angles of +26 degrees and -5 degrees, respectively, which yield effective along-track stereo. Any DSM generation algorithm can use the stereo images for accurate 3D reconstruction and measurement of the ground. Dense match points and pixel-wise matching are prerequisites in DSM generation for capturing discontinuities and occlusions in accurate 3D modelling applications. Epipolar image matching reduces the computational effort from two-dimensional area searches to one-dimensional searches. Thus, epipolar rectification is preferred as a pre-processing step for accurate DSM generation. In this paper we explore a method based on SIFT and RANSAC for epipolar rectification of Cartosat-1 stereo images.
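A SIFT-plus-RANSAC rectification pipeline of this shape can be sketched with OpenCV: match SIFT keypoints, estimate the fundamental matrix with RANSAC, and compute uncalibrated rectifying homographies. This is a generic sketch, not the production chain described in the paper; the ratio-test and RANSAC thresholds are illustrative.

```python
import cv2
import numpy as np

def epipolar_rectify(img1, img2):
    """SIFT matching + RANSAC fundamental matrix + uncalibrated
    rectification for a grayscale stereo pair."""
    sift = cv2.SIFT_create()
    k1, d1 = sift.detectAndCompute(img1, None)
    k2, d2 = sift.detectAndCompute(img2, None)
    matches = cv2.BFMatcher(cv2.NORM_L2).knnMatch(d1, d2, k=2)
    good = [m for m, n in matches if m.distance < 0.75 * n.distance]  # ratio test
    p1 = np.float32([k1[m.queryIdx].pt for m in good])
    p2 = np.float32([k2[m.trainIdx].pt for m in good])
    F, mask = cv2.findFundamentalMat(p1, p2, cv2.FM_RANSAC, 1.0, 0.99)
    inl = mask.ravel() == 1
    h, w = img1.shape[:2]
    ok, H1, H2 = cv2.stereoRectifyUncalibrated(p1[inl], p2[inl], F, (w, h))
    return (cv2.warpPerspective(img1, H1, (w, h)),
            cv2.warpPerspective(img2, H2, (w, h)))
```

After rectification, conjugate points lie on the same image row, so dense matching reduces to a 1-D search.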
Bonny, Jean Marie; Boespflug-Tanguly, Odile; Zanca, Michel; Renou, Jean Pierre
2003-03-01
A solution for discrete multi-exponential analysis of T2 relaxation decay curves obtained under current multi-echo imaging protocol conditions is described. We propose a preprocessing step to improve the signal-to-noise ratio and thus lower the signal-to-noise ratio threshold above which a high percentage of true multi-exponential detections is achieved. It consists of a multispectral nonlinear edge-preserving filter that takes into account the signal-dependent Rician distribution of noise affecting magnitude MR images. Discrete multi-exponential decomposition, which requires no a priori knowledge, is performed by a non-linear least-squares procedure initialized with estimates obtained from a total least-squares linear prediction algorithm. This approach was validated and optimized experimentally on simulated data sets of normal human brains.
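The nonlinear least-squares stage can be sketched for the two-component case with SciPy. The paper seeds the fit with total-least-squares linear prediction estimates; the sketch below uses crude initial guesses instead, and the T2 starting values (in ms) are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import curve_fit

def biexp(te, a1, t21, a2, t22):
    """Two-component T2 decay model evaluated at echo times te."""
    return a1 * np.exp(-te / t21) + a2 * np.exp(-te / t22)

def fit_t2(te, signal):
    """Nonlinear least-squares bi-exponential fit of a multi-echo decay;
    crude initial guesses stand in for the paper's TLS initialization."""
    p0 = [signal[0] * 0.5, 20.0, signal[0] * 0.5, 80.0]
    popt, _ = curve_fit(biexp, te, signal, p0=p0, bounds=(0, np.inf))
    return popt  # (a1, T2_1, a2, T2_2)
```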
SpcAudace: Spectroscopic processing and analysis package of Audela software
NASA Astrophysics Data System (ADS)
Mauclaire, Benjamin
2017-11-01
SpcAudace processes long-slit spectra with automated pipelines and performs astrophysical analysis of the resulting data. These powerful pipelines carry out all the required steps in one pass: standard preprocessing, masking of bad pixels, geometric corrections, registration, optimized spectrum extraction, wavelength calibration, and instrumental response computation and correction. Both high- and low-resolution long-slit spectra are managed for stellar and non-stellar targets. Many types of publication-quality figures can be easily produced: PDF and PNG plots or annotated time series plots. Astrophysical quantities can be derived from individual spectra or large collections of spectra with advanced functions: from line profile characteristics to equivalent widths and periodograms. More than 300 documented functions are available and can be used in Tcl scripts for automation. SpcAudace is based on the Audela open source software.
Research on pre-processing of QR Code
NASA Astrophysics Data System (ADS)
Sun, Haixing; Xia, Haojie; Dong, Ning
2013-10-01
QR codes encode many kinds of information thanks to their advantages: large storage capacity, high reliability, omnidirectional ultra-high-speed reading, small printing size, highly efficient representation of Chinese characters, etc. In order to obtain a clearer binarized image from a complex background and improve the recognition rate of QR codes, this paper investigates pre-processing methods for QR (Quick Response) codes and presents algorithms and results of image pre-processing for QR code recognition. The conventional method is improved by modifying Sauvola's adaptive binarization method. Additionally, a QR code extraction step that adapts to different image sizes and a flexible image correction approach are introduced, improving the efficiency and accuracy of QR code image processing.
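Sauvola's adaptive threshold, the starting point the paper modifies, is available in scikit-image. A minimal sketch follows; the window size and k parameter are illustrative assumptions.

```python
from skimage.filters import threshold_sauvola

def binarize_qr(gray, window=25, k=0.2):
    """Sauvola adaptive binarization of a grayscale QR-code image.
    Each pixel is compared against a locally computed threshold, which
    copes with uneven illumination better than a single global value."""
    t = threshold_sauvola(gray, window_size=window, k=k)
    return gray > t   # True = light background, False = dark modules
```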
NASA Astrophysics Data System (ADS)
Borodinov, A. A.; Myasnikov, V. V.
2018-04-01
The present work is devoted to comparing the accuracy of well-known classification algorithms in the task of recognizing local objects in radar images under various image preprocessing methods. Preprocessing involves speckle noise filtering and normalization of the object orientation in the image, by the method of image moments and by a method based on the Hough transform. The following classification algorithms are compared: decision tree, support vector machine, AdaBoost, and random forest. Principal component analysis is used to reduce the dimensionality. The research is carried out on objects from the MSTAR radar image database. The paper presents the results of the conducted studies.
Magnetic Field Satellite (Magsat) data processing system specifications
NASA Technical Reports Server (NTRS)
Berman, D.; Gomez, R.; Miller, A.
1980-01-01
The software specifications for the MAGSAT data processing system (MDPS) are presented. The MDPS is divided functionally into preprocessing of primary input data, data management, chronicle processing, and postprocessing. Data organization and validity, and checks of spacecraft and instrumentation, are discussed. Output products of the MDPS, including various plots and data tapes, are described. Formats for important tapes are presented. Discussions and mathematical formulations for coordinate transformations and field model coefficients are included.
Parafoveal Preprocessing in Reading Revisited: Evidence from a Novel Preview Manipulation
ERIC Educational Resources Information Center
Gagl, Benjamin; Hawelka, Stefan; Richlan, Fabio; Schuster, Sarah; Hutzler, Florian
2014-01-01
The study investigated parafoveal preprocessing by the means of the classical invisible boundary paradigm and a novel manipulation of the parafoveal previews (i.e., visual degradation). Eye movements were investigated on 5-letter target words with constraining (i.e., highly informative) initial letters or similarly constraining final letters.…
Chockalingam, Sriram; Aluru, Maneesha; Aluru, Srinivas
2016-09-19
Pre-processing of microarray data is a well-studied problem. Furthermore, all popular platforms come with their own recommended best practices for differential analysis of genes. However, for genome-scale network inference using microarray data collected from large public repositories, these methods filter out a considerable number of genes. This is primarily due to the effects of aggregating a diverse array of experiments with different technical and biological scenarios. Here we introduce a pre-processing pipeline suitable for inferring genome-scale gene networks from large microarray datasets. We show that partitioning of the available microarray datasets according to biological relevance into tissue- and process-specific categories significantly extends the limits of downstream network construction. We demonstrate the effectiveness of our pre-processing pipeline by inferring genome-scale networks for the model plant Arabidopsis thaliana using two different construction methods and a collection of 11,760 Affymetrix ATH1 microarray chips. Our pre-processing pipeline and the datasets used in this paper are made available at http://alurulab.cc.gatech.edu/microarray-pp.
Pre-processing and post-processing in group-cluster mergers
NASA Astrophysics Data System (ADS)
Vijayaraghavan, R.; Ricker, P. M.
2013-11-01
Galaxies in clusters are more likely to be of early type and to have lower star formation rates than galaxies in the field. Recent observations and simulations suggest that cluster galaxies may be 'pre-processed' by group or filament environments and that galaxies that fall into a cluster as part of a larger group can stay coherent within the cluster for up to one orbital period ('post-processing'). We investigate these ideas by means of a cosmological N-body simulation and idealized N-body plus hydrodynamics simulations of a group-cluster merger. We find that group environments can contribute significantly to galaxy pre-processing by means of enhanced galaxy-galaxy merger rates, removal of galaxies' hot halo gas by ram pressure stripping, and tidal truncation of the galaxies themselves. Tidal distortion of the group during infall does not contribute to pre-processing. Post-processing is also shown to be effective: galaxy-galaxy collisions are enhanced during a group's pericentric passage within a cluster, the merger shock enhances the ram pressure on group and cluster galaxies, and an increase in local density during the merger leads to greater galactic tidal truncation.
Does preprocessing change nonlinear measures of heart rate variability?
Gomes, Murilo E D; Guimarães, Homero N; Ribeiro, Antônio L P; Aguirre, Luis A
2002-11-01
This work investigated whether the methods used to produce a uniformly sampled heart rate variability (HRV) time series significantly change the deterministic signature underlying the dynamics of such signals and some nonlinear measures of HRV. Two methods of preprocessing were used: the convolution of inverse interval function values with a rectangular window, and cubic polynomial interpolation. The HRV time series were obtained from 33 Wistar rats subjected to autonomic blockade protocols and from 17 healthy adults. The analysis of determinism was carried out by the method of surrogate data sets and nonlinear autoregressive moving average modelling and prediction. The scaling exponents alpha, alpha(1) and alpha(2), derived from detrended fluctuation analysis, were calculated from the raw HRV time series and the respective preprocessed signals. It was shown that cubic interpolation of HRV time series did not significantly change any nonlinear characteristic studied in this work, while the convolution method affected only the alpha(1) index. The results suggest that preprocessed time series may be used to study HRV in the field of nonlinear dynamics.
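The cubic-interpolation route to a uniformly sampled HRV series can be sketched with SciPy; the 4 Hz resampling rate is a common convention and is an illustrative assumption here.

```python
import numpy as np
from scipy.interpolate import CubicSpline

def resample_rr(rr_ms, fs=4.0):
    """Cubic-spline interpolation of an R-R interval series (ms) onto
    a uniform time grid at fs Hz, one of the two methods compared."""
    t_beats = np.cumsum(rr_ms) / 1000.0            # beat times in seconds
    spline = CubicSpline(t_beats, rr_ms)
    t_uniform = np.arange(t_beats[0], t_beats[-1], 1.0 / fs)
    return t_uniform, spline(t_uniform)
```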
A Novel Binarization Algorithm for Ballistics Firearm Identification
NASA Astrophysics Data System (ADS)
Li, Dongguang
The identification of ballistics specimens from imaging systems is of paramount importance in criminal investigation. Binarization plays a key role in the preprocessing needed to recognize cartridges in ballistic imaging systems. Unfortunately, it is very difficult to obtain a satisfactory binary image using existing binarization algorithms. In this paper, we utilize global and local thresholds to enhance image binarization. Importantly, we present a novel criterion for effectively detecting edges in the images. Comprehensive experiments have been conducted over sample ballistic images. The empirical results demonstrate that the proposed method provides a better solution than existing binarization algorithms.
Cowley, Benjamin U.; Korpela, Jussi
2018-01-01
Existing tools for the preprocessing of EEG data provide a large choice of methods to suitably prepare and analyse a given dataset. Yet it remains a challenge for the average user to integrate methods for batch processing of the increasingly large datasets of modern research, and compare methods to choose an optimal approach across the many possible parameter configurations. Additionally, many tools still require a high degree of manual decision making for, e.g., the classification of artifacts in channels, epochs or segments. This introduces extra subjectivity, is slow, and is not reproducible. Batching and well-designed automation can help to regularize EEG preprocessing, and thus reduce human effort, subjectivity, and consequent error. The Computational Testing for Automated Preprocessing (CTAP) toolbox facilitates: (i) batch processing that is easy for experts and novices alike; (ii) testing and comparison of preprocessing methods. Here we demonstrate the application of CTAP to high-resolution EEG data in three modes of use. First, a linear processing pipeline with mostly default parameters illustrates ease-of-use for naive users. Second, a branching pipeline illustrates CTAP's support for comparison of competing methods. Third, a pipeline with built-in parameter-sweeping illustrates CTAP's capability to support data-driven method parameterization. CTAP extends the existing functions and data structure from the well-known EEGLAB toolbox, based on Matlab, and produces extensive quality control outputs. CTAP is available under MIT open-source licence from https://github.com/bwrc/ctap. PMID:29692705
Cowley, Benjamin U; Korpela, Jussi
2018-01-01
Existing tools for the preprocessing of EEG data provide a large choice of methods to suitably prepare and analyse a given dataset. Yet it remains a challenge for the average user to integrate methods for batch processing of the increasingly large datasets of modern research, and compare methods to choose an optimal approach across the many possible parameter configurations. Additionally, many tools still require a high degree of manual decision making for, e.g., the classification of artifacts in channels, epochs or segments. This introduces extra subjectivity, is slow, and is not reproducible. Batching and well-designed automation can help to regularize EEG preprocessing, and thus reduce human effort, subjectivity, and consequent error. The Computational Testing for Automated Preprocessing (CTAP) toolbox facilitates: (i) batch processing that is easy for experts and novices alike; (ii) testing and comparison of preprocessing methods. Here we demonstrate the application of CTAP to high-resolution EEG data in three modes of use. First, a linear processing pipeline with mostly default parameters illustrates ease-of-use for naive users. Second, a branching pipeline illustrates CTAP's support for comparison of competing methods. Third, a pipeline with built-in parameter-sweeping illustrates CTAP's capability to support data-driven method parameterization. CTAP extends the existing functions and data structure from the well-known EEGLAB toolbox, based on Matlab, and produces extensive quality control outputs. CTAP is available under MIT open-source licence from https://github.com/bwrc/ctap.
Mishra, Alok; Swati, D
2015-09-01
Variation in the interval between successive R peaks of the electrocardiogram represents the modulation of the cardiac oscillations by the autonomic nervous system. This variation is contaminated by anomalous signals called ectopic beats, artefacts or noise, which mask the true behaviour of heart rate variability. In this paper, we propose a combination filter of a recursive impulse rejection filter and a recursive 20% filter, applied recursively and with a preference for replacement over removal of abnormal beats, to improve the pre-processing of the inter-beat intervals. We tested this novel recursive combination method, with median-value replacement, by estimating the standard deviation of normal-to-normal (SDNN) beat intervals in congestive heart failure (CHF) and normal sinus rhythm subjects. This work discusses in detail the improvement in pre-processing over a single application of the impulse rejection filter and over removal of abnormal beats, for the estimation of SDNN and the Poincaré plot descriptors (SD1, SD2, and SD1/SD2). We found an SDNN value of 22 ms and a Poincaré SD2 value of 36 ms to be clinical indicators for discriminating normal cases from CHF cases. The pre-processing is also useful in the calculation of the Lyapunov exponent, a nonlinear index: after the proposed pre-processing, the computed Lyapunov exponents begin to reflect the expected, less complex behaviour of diseased states.
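The replace-rather-than-remove idea can be sketched as a median-based impulse filter applied recursively until no beat is flagged. This is a sketch of the concept only; the window size is an illustrative assumption and the authors' exact recursive filter pair differs.

```python
import numpy as np

def reject_impulses(rr, window=11, frac=0.2):
    """Replace R-R intervals deviating more than frac (20%) from the
    local median with that median; repeat until nothing is flagged."""
    rr = np.asarray(rr, dtype=float).copy()
    pad = window // 2
    while True:
        padded = np.pad(rr, pad, mode='edge')
        med = np.array([np.median(padded[i:i + window])
                        for i in range(len(rr))])
        bad = np.abs(rr - med) > frac * med
        if not bad.any():
            return rr
        rr[bad] = med[bad]   # replacement preferred over removal
```

Replacing rather than deleting beats preserves the series length and timing, which matters for SDNN, Poincaré descriptors and Lyapunov estimates alike.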
Sam2bam: High-Performance Framework for NGS Data Preprocessing Tools
Cheng, Yinhe; Tzeng, Tzy-Hwa Kathy
2016-01-01
This paper introduces a high-throughput software tool framework called sam2bam that enables users to significantly speed up pre-processing for next-generation sequencing data. sam2bam is especially efficient on single-node multi-core large-memory systems. It can reduce the runtime of data pre-processing in marking duplicate reads on a single-node system by 156–186x compared with de facto standard tools. sam2bam consists of parallel software components that can fully utilize multiple processors, available memory, high-bandwidth storage, and hardware compression accelerators, if available. sam2bam provides file format conversion between well-known genome file formats, from SAM to BAM, as a basic feature. Additional features such as analyzing, filtering, and converting input data are provided by plug-in tools, e.g., duplicate marking, which can be attached to sam2bam at runtime. We demonstrated that sam2bam could significantly reduce the runtime of next-generation sequencing (NGS) data pre-processing from about two hours to about one minute for a whole-exome data set on a 16-core single-node system using up to 130 GB of memory. sam2bam could reduce the runtime of NGS data pre-processing from about 20 hours to about nine minutes for a whole-genome sequencing data set on the same system using up to 711 GB of memory. PMID:27861637
NASA Astrophysics Data System (ADS)
Sharma, Sanjib; Siddique, Ridwan; Reed, Seann; Ahnert, Peter; Mendoza, Pablo; Mejia, Alfonso
2018-03-01
The relative roles of statistical weather preprocessing and streamflow postprocessing in hydrological ensemble forecasting at short- to medium-range forecast lead times (days 1-7) are investigated. For this purpose, a regional hydrologic ensemble prediction system (RHEPS) is developed and implemented. The RHEPS is comprised of the following components: (i) hydrometeorological observations (multisensor precipitation estimates, gridded surface temperature, and gauged streamflow); (ii) weather ensemble forecasts (precipitation and near-surface temperature) from the National Centers for Environmental Prediction 11-member Global Ensemble Forecast System Reforecast version 2 (GEFSRv2); (iii) NOAA's Hydrology Laboratory-Research Distributed Hydrologic Model (HL-RDHM); (iv) heteroscedastic censored logistic regression (HCLR) as the statistical preprocessor; (v) two statistical postprocessors, an autoregressive model with a single exogenous variable (ARX(1,1)) and quantile regression (QR); and (vi) a comprehensive verification strategy. To implement the RHEPS, 1- to 7-day weather forecasts from the GEFSRv2 are used to force HL-RDHM and generate raw ensemble streamflow forecasts. Forecasting experiments are conducted in four nested basins in the US Middle Atlantic region, ranging in size from 381 to 12,362 km2. Results show that the HCLR-preprocessed ensemble precipitation forecasts have greater skill than the raw forecasts. These improvements are more noticeable in the warm season at the longer lead times (> 3 days). Both postprocessors, ARX(1,1) and QR, show gains in skill relative to the raw ensemble streamflow forecasts, particularly in the cool season, but QR outperforms ARX(1,1). The scenarios that implement preprocessing and postprocessing separately tend to perform similarly, although the postprocessing-alone scenario is often more effective. The scenario involving both preprocessing and postprocessing consistently outperforms the other scenarios. In some cases, however, the differences between this scenario and the scenario with postprocessing alone are not significant. We conclude that implementing both preprocessing and postprocessing ensures the most skill improvements, but postprocessing alone can often be a competitive alternative.
Delwiche, Stephen R; Reeves, James B
2010-01-01
In multivariate regression analysis of spectroscopy data, spectral preprocessing is often performed to reduce unwanted background information (offsets, sloped baselines) or accentuate absorption features in intrinsically overlapping bands. These procedures, also known as pretreatments, are commonly smoothing operations or derivatives. While such operations are often useful in reducing the number of latent variables of the actual decomposition and lowering residual error, they also run the risk of misleading the practitioner into accepting calibration equations that are poorly adapted to samples outside of the calibration. The current study developed a graphical method to examine this effect on partial least squares (PLS) regression calibrations of near-infrared (NIR) reflection spectra of ground wheat meal with two analytes, protein content and sodium dodecyl sulfate sedimentation (SDS) volume (an indicator of the quantity of the gluten proteins that contribute to strong doughs). These two properties were chosen because of their differing abilities to be modeled by NIR spectroscopy: excellent for protein content, fair for SDS sedimentation volume. To further demonstrate the potential pitfalls of preprocessing, an artificial component, a randomly generated value, was included in PLS regression trials. Savitzky-Golay (digital filter) smoothing, first-derivative, and second-derivative preprocessing functions (5 to 25 centrally symmetric convolution points, derived from quadratic polynomials) were applied to PLS calibrations of 1 to 15 factors. The results demonstrated the danger of an overreliance on preprocessing when (1) the number of samples used in a multivariate calibration is low (<50), (2) the spectral response of the analyte is weak, and (3) the goodness of the calibration is based on the coefficient of determination (R²) rather than a term based on residual error. The graphical method has application to the evaluation of other preprocessing functions and various types of spectroscopy data.
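The pitfall can be probed directly by judging each pretreatment on cross-validated RMSE rather than R². The sketch below pairs SciPy's Savitzky-Golay filter (quadratic polynomials, matching the study's convention; the 11-point window is an illustrative assumption) with scikit-learn PLS.

```python
from scipy.signal import savgol_filter
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import cross_val_score

def cv_rmse_by_pretreatment(X, y, n_factors=10):
    """Compare PLS cross-validated RMSE for raw spectra vs.
    Savitzky-Golay smoothing and derivative pretreatments."""
    pretreatments = {
        "raw": X,
        "smooth": savgol_filter(X, 11, 2, deriv=0, axis=1),
        "1st deriv": savgol_filter(X, 11, 2, deriv=1, axis=1),
        "2nd deriv": savgol_filter(X, 11, 2, deriv=2, axis=1),
    }
    for name, Xp in pretreatments.items():
        pls = PLSRegression(n_components=n_factors)
        scores = cross_val_score(pls, Xp, y, cv=5,
                                 scoring="neg_root_mean_squared_error")
        print(f"{name:>9s}: RMSECV = {-scores.mean():.3f}")
```

A pretreatment that raises calibration R² but worsens RMSECV is exactly the over-fitted case the study warns about.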
Thomas, Minta; De Brabanter, Kris; De Moor, Bart
2014-05-10
DNA microarrays are a potentially powerful technology for improving diagnostic classification, treatment selection, and prognostic assessment. The use of this technology to predict cancer outcome has a history of almost a decade. Disease class predictors can be designed for known disease cases and provide diagnostic confirmation or clarify abnormal cases. The main inputs to these class predictors are high-dimensional data with many variables and few observations. Dimensionality reduction of these feature sets significantly speeds up the prediction task. Feature selection and feature transformation methods are well-known preprocessing steps in the field of bioinformatics, and several prediction tools based on these techniques are available. Studies show that a well-tuned kernel PCA (KPCA) is an efficient preprocessing step for dimensionality reduction, but the available bandwidth selection method for KPCA was computationally expensive. In this paper, we propose a new data-driven bandwidth selection criterion for KPCA, which is related to least-squares cross-validation for kernel density estimation. We propose a new prediction model with a well-tuned KPCA and a Least Squares Support Vector Machine (LS-SVM). We estimate the accuracy of the newly proposed model on 9 case studies. Then, we compare its performance (in terms of test-set Area Under the ROC Curve (AUC) and computational time) with other well-known techniques such as whole data set + LS-SVM, PCA + LS-SVM, t-test + LS-SVM, Prediction Analysis of Microarrays (PAM) and Least Absolute Shrinkage and Selection Operator (Lasso). Finally, we assess the performance of the proposed strategy against an existing KPCA parameter tuning algorithm by means of two additional case studies. We propose, evaluate, and compare several mathematical/statistical techniques that apply feature transformation/selection for subsequent classification, and consider their application in medical diagnostics. Both feature selection and feature transformation perform well on classification tasks. Due to the dynamic selection property of feature selection, it is hard to define significant features for the classifier, which predicts classes of future samples. Moreover, the proposed strategy enjoys a distinctive advantage with its relatively lower time complexity.
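The shape of the pipeline (KPCA for dimensionality reduction, then a classifier) can be sketched with scikit-learn. LS-SVM is not part of scikit-learn, so an SVC stands in for it here, and the RBF bandwidth is swept by cross-validation as a stand-in for the paper's data-driven criterion; the grid and component count are illustrative.

```python
from sklearn.decomposition import KernelPCA
from sklearn.svm import SVC
from sklearn.pipeline import Pipeline
from sklearn.model_selection import GridSearchCV

def kpca_classifier(X, y):
    """KPCA + SVM pipeline with the RBF bandwidth (gamma) tuned by
    cross-validated AUC; SVC stands in for the paper's LS-SVM."""
    pipe = Pipeline([
        ("kpca", KernelPCA(kernel="rbf", n_components=20)),
        ("clf", SVC(kernel="linear")),
    ])
    grid = {"kpca__gamma": [1e-4, 1e-3, 1e-2, 1e-1]}
    search = GridSearchCV(pipe, grid, cv=5, scoring="roc_auc")
    return search.fit(X, y)
```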
Stanford, Tyman E; Bagley, Christopher J; Solomon, Patty J
2016-01-01
Proteomic matrix-assisted laser desorption/ionisation (MALDI) linear time-of-flight (TOF) mass spectrometry (MS) may be used to produce protein profiles from biological samples with the aim of discovering biomarkers for disease. However, the raw protein profiles suffer from several sources of bias or systematic variation which need to be removed via pre-processing before meaningful downstream analysis of the data can be undertaken. Baseline subtraction, an early pre-processing step that removes the non-peptide signal from the spectra, is complicated by the following: (i) each spectrum has, on average, wider peaks for peptides with higher mass-to-charge ratios (m/z), and (ii) the time-consuming and error-prone trial-and-error process for optimising the baseline subtraction input arguments. With reference to the aforementioned complications, we present an automated pipeline that includes (i) a novel 'continuous' line segment algorithm that efficiently operates over data with a transformed m/z-axis to remove the relationship between peptide mass and peak width, and (ii) an input-free algorithm to estimate peak widths on the transformed m/z scale. The automated baseline subtraction method was deployed on six publicly available proteomic MS datasets using six different m/z-axis transformations. Optimality of the automated baseline subtraction pipeline was assessed quantitatively using the mean absolute scaled error (MASE) when compared to a gold-standard baseline subtracted signal. Several of the transformations investigated were able to reduce, if not entirely remove, the peak width and peak location relationship resulting in near-optimal baseline subtraction using the automated pipeline. The proposed novel 'continuous' line segment algorithm is shown to far outperform naive sliding window algorithms with regard to the computational time required. The improvement in computational time was at least four-fold on real MALDI TOF-MS data and at least an order of magnitude on many simulated datasets. The advantages of the proposed pipeline include informed and data specific input arguments for baseline subtraction methods, the avoidance of time-intensive and subjective piecewise baseline subtraction, and the ability to automate baseline subtraction completely. Moreover, individual steps can be adopted as stand-alone routines.
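For orientation, the naive sliding-window family that the paper's 'continuous' line segment algorithm outperforms can be sketched as a moving-minimum baseline followed by smoothing. This is explicitly the simple alternative, not the authors' method, and the half-window size is an illustrative assumption.

```python
import numpy as np
from scipy.ndimage import minimum_filter1d, uniform_filter1d

def subtract_baseline(intensity, half_window=150):
    """Moving-minimum baseline estimate followed by smoothing, a naive
    sliding-window alternative to the paper's line segment algorithm."""
    size = 2 * half_window + 1
    base = minimum_filter1d(intensity, size=size)   # follow local troughs
    base = uniform_filter1d(base, size=size)        # smooth the envelope
    return np.clip(intensity - base, 0, None)
```

On an m/z-transformed axis where peak widths are roughly constant, a single window size such as this becomes defensible, which is the point of the paper's transformation step.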
Mass detection with digitized screening mammograms by using Gabor features
NASA Astrophysics Data System (ADS)
Zheng, Yufeng; Agyepong, Kwabena
2007-03-01
Breast cancer is the leading cancer among American women. The current lifetime risk of developing breast cancer is 13.4% (one in seven). Mammography is the most effective technology presently available for breast cancer screening. With digital mammograms, computer-aided detection (CAD) has proven to be a useful tool for radiologists. In this paper, we focus on mass detection; masses are a common category of breast cancer findings, alongside calcifications and architectural distortions. We propose a new mass detection algorithm utilizing Gabor filters, termed "Gabor Mass Detection" (GMD). There are three steps in the GMD algorithm: (1) preprocessing, (2) generating alarms and (3) classification (reducing false alarms). Down-sampling, quantization, denoising and enhancement are done in the preprocessing step. Then a total of 30 Gabor-filtered images (along 6 bands by 5 orientations) are produced. Alarm segments are generated by thresholding four Gabor images of full orientations (Stage-I classification) with image-dependent thresholds computed via histogram analysis. Next, a set of edge histogram descriptors (EHD) is extracted from 24 Gabor images (6 by 4) to be used for Stage-II classification. After clustering EHD features with the fuzzy C-means clustering method, a k-nearest neighbor classifier is used to reduce the number of false alarms. We initially analyzed 431 digitized mammograms (159 normal images vs. 272 cancerous images, from the DDSM project, University of South Florida) with the proposed GMD algorithm, and ten-fold cross-validation was used for testing the GMD algorithm on the available data. The GMD performance is as follows: sensitivity (true positive rate) = 0.88 at 1.25 false positives per image (FPI), and area under the ROC curve = 0.83. The overall performance of the GMD algorithm is satisfactory, and the accuracy of locating masses (highlighting the boundaries of suspicious areas) is relatively high. Furthermore, the GMD algorithm can successfully detect early-stage (small Assessment value and low Subtlety) malignant masses. In addition, Gabor-filtered images are used in both stages of classification, which greatly simplifies the GMD algorithm.
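The 6-band by 5-orientation filter bank can be sketched with scikit-image; the frequency range below is an illustrative assumption, not the bank used in the paper.

```python
import numpy as np
from skimage.filters import gabor

def gabor_bank(image, n_bands=6, n_orientations=5):
    """Produce the 30 Gabor-filtered images (6 frequency bands x 5
    orientations) of the GMD preprocessing stage; magnitudes returned."""
    responses = []
    freqs = np.geomspace(0.05, 0.4, n_bands)   # cycles/pixel, illustrative
    for f in freqs:
        for i in range(n_orientations):
            theta = i * np.pi / n_orientations
            real, imag = gabor(image, frequency=f, theta=theta)
            responses.append(np.hypot(real, imag))  # magnitude response
    return responses
```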
Yulia, Meinilwita
2017-01-01
Asian palm civet coffee, or kopi luwak (Indonesian words for coffee and palm civet), is well known as the world's priciest and rarest coffee. To protect the authenticity of luwak coffee and protect consumers from luwak coffee adulteration, it is very important to develop a robust and simple method for determining the adulteration of luwak coffee. In this research, the use of UV-Visible spectra combined with PLSR was evaluated to establish rapid and simple methods for quantification of adulteration in luwak-arabica coffee blends. Several preprocessing methods were tested, and the results show that most of the preprocessing methods were effective in improving the quality of the calibration models, with the best PLS calibration model obtained for Savitzky-Golay smoothed spectra, which had the lowest RMSECV (0.039) and the highest RPDcal value (4.64). Using this PLS model, a prediction for quantification of luwak content was calculated, resulting in satisfactory prediction performance with both high RPDp and RER values. PMID:28913348
Novel algorithm by low complexity filter on retinal vessel segmentation
NASA Astrophysics Data System (ADS)
Rostampour, Samad
2011-10-01
This article presents a new method to detect blood vessels in the retina from digital images. Retinal vessel segmentation is important for detecting side effects of diabetes, because diabetes can form new capillaries which are very brittle. The research has been done in two phases: preprocessing and processing. The preprocessing phase consists of applying a new filter that produces a suitable output: it shows vessels in dark color on a white background and makes a good distinction between vessels and background. Its complexity is very low, and extraneous image content is eliminated. The second phase, processing, uses a Bayesian method, a supervised classification approach. This method uses the mean and variance of pixel intensities to calculate probabilities, and the pixels of the image are finally divided into two classes: vessels and background. The images used are from the DRIVE database. After performing this operation, the calculation gives an average efficiency of 95 percent. The method was also applied to an external sample outside the DRIVE database exhibiting retinopathy, and excellent results were obtained.
Thyroid Nodule Classification in Ultrasound Images by Fine-Tuning Deep Convolutional Neural Network.
Chi, Jianning; Walia, Ekta; Babyn, Paul; Wang, Jimmy; Groot, Gary; Eramian, Mark
2017-08-01
With many thyroid nodules being incidentally detected, it is important to identify as many malignant nodules as possible while excluding those that are highly likely to be benign from fine needle aspiration (FNA) biopsies or surgeries. This paper presents a computer-aided diagnosis (CAD) system for classifying thyroid nodules in ultrasound images. We use a deep learning approach to extract features from thyroid ultrasound images. Ultrasound images are pre-processed to calibrate their scale and remove the artifacts. A pre-trained GoogLeNet model is then fine-tuned using the pre-processed image samples, which leads to superior feature extraction. The extracted features of the thyroid ultrasound images are sent to a cost-sensitive random forest classifier to classify the images into "malignant" and "benign" cases. The experimental results show the proposed fine-tuned GoogLeNet model achieves excellent classification performance, attaining 98.29% classification accuracy, 99.10% sensitivity and 93.90% specificity for the images in an open-access database (Pedraza et al. 16), and 96.34% classification accuracy, 86% sensitivity and 99% specificity for the images in our local health region database.
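The fine-tuning step can be sketched in PyTorch (assuming torchvision >= 0.13 for the weights API). This is a generic transfer-learning sketch, not the paper's training recipe; the layer-freezing policy shown is an illustrative assumption, and the downstream cost-sensitive random forest stage is omitted.

```python
import torch.nn as nn
from torchvision import models

def build_finetune_model(n_classes=2):
    """Load an ImageNet-pretrained GoogLeNet and replace the final
    fully connected layer for benign/malignant classification."""
    model = models.googlenet(weights=models.GoogLeNet_Weights.IMAGENET1K_V1)
    model.fc = nn.Linear(model.fc.in_features, n_classes)
    # freeze early layers so only the last inception block and the new
    # classifier head adapt to the ultrasound domain (illustrative choice)
    for name, param in model.named_parameters():
        if not name.startswith(("inception5", "fc")):
            param.requires_grad = False
    return model
```

In the paper's setup the fine-tuned network serves as a feature extractor whose activations feed the random forest, rather than classifying end to end.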
Pre-processing SAR image stream to facilitate compression for transport on bandwidth-limited-link
Rush, Bobby G.; Riley, Robert
2015-09-29
Pre-processing is applied to a raw VideoSAR (or similar near-video-rate) product to transform the image frame sequence into a product that more closely resembles the type of product for which conventional video codecs are designed, while sufficiently maintaining the utility and visual quality of the product delivered by the codec.
NASA Astrophysics Data System (ADS)
An, Chan-Ho; Yang, Janghoon; Jang, Seunghun; Kim, Dong Ku
In this letter, a pre-processed lattice reduction (PLR) scheme is developed for the lattice-reduction-aided (LRA) detection of multiple-input multiple-output (MIMO) systems in spatially correlated channels. The PLR computes the LLL-reduced matrix of an equivalent matrix, which is the product of the present channel matrix and the unimodular transformation matrix from the lattice reduction of the spatial correlation matrix, rather than of the present channel matrix itself. In conjunction with the recursive lattice reduction (RLR) scheme [7], the resulting pre-processed RLR (PRLR) is shown to efficiently carry out the LR of the channel matrix, especially for burst packet messages in spatially and temporally correlated channels, while matching the performance of conventional LRA detection.
Hiroyasu, Tomoyuki; Hayashinuma, Katsutoshi; Ichikawa, Hiroshi; Yagi, Nobuaki
2015-08-01
A preprocessing method for endoscopy image analysis using texture analysis is proposed. In a previous study, we proposed a feature value that combines a co-occurrence matrix and a run-length matrix to analyze the extent of early gastric cancer from images taken with narrow-band imaging endoscopy. However, the obtained feature value does not identify lesion zones correctly due to the influence of noise and halation. Therefore, we propose a new preprocessing method comprising a non-local means filter for de-noising and contrast limited adaptive histogram equalization. We have confirmed that the pattern of the gastric mucosa in images can be improved by the proposed method. Furthermore, the obtained color map shows the lesion zone more accurately.
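Both steps are available in OpenCV; a minimal sketch follows, with all parameter values as illustrative assumptions rather than the paper's settings.

```python
import cv2

def preprocess_endoscopy(gray):
    """Non-local means denoising followed by contrast limited adaptive
    histogram equalization (CLAHE) on an 8-bit grayscale frame."""
    denoised = cv2.fastNlMeansDenoising(gray, None, h=10,
                                        templateWindowSize=7,
                                        searchWindowSize=21)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    return clahe.apply(denoised)
```

Denoising first keeps CLAHE from amplifying noise along with the mucosal texture, which is the rationale for this ordering.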
Compiler analysis for irregular problems in FORTRAN D
NASA Technical Reports Server (NTRS)
Vonhanxleden, Reinhard; Kennedy, Ken; Koelbel, Charles; Das, Raja; Saltz, Joel
1992-01-01
We developed a dataflow framework which provides a basis for rigorously defining strategies to make use of runtime preprocessing methods for distributed memory multiprocessors. In many programs, several loops access the same off-processor memory locations. Our runtime support gives us a mechanism for tracking and reusing copies of off-processor data. A key aspect of our compiler analysis strategy is to determine when it is safe to reuse copies of off-processor data. Another crucial function of the compiler analysis is to identify situations which allow runtime preprocessing overheads to be amortized. This dataflow analysis will make it possible to effectively use the results of interprocedural analysis in our efforts to reduce interprocessor communication and the need for runtime preprocessing.
Merabet, Youssef El; Meurie, Cyril; Ruichek, Yassine; Sbihi, Abderrahmane; Touahni, Raja
2015-01-01
In this paper, we present a novel strategy for roof segmentation from aerial images (orthophotoplans) based on the cooperation of edge- and region-based segmentation methods. The proposed strategy is composed of three major steps. The first one, called the pre-processing step, consists of simplifying the acquired image with an appropriate pair of invariant and gradient operators, optimized for the application, in order to limit illumination changes (shadows, brightness, etc.) affecting the images. The second step is composed of two main parallel treatments: on the one hand, the simplified image is segmented by watershed regions. Although this first segmentation generally provides good results, the image is often over-segmented. To alleviate this problem, an efficient region merging strategy adapted to the orthophotoplan particularities, together with a 2D roof-ridge modeling technique, is applied. On the other hand, the simplified image is segmented by watershed lines. The third step consists of integrating both watershed segmentation strategies into a single cooperative segmentation scheme in order to achieve satisfactory segmentation results. Tests have been performed on orthophotoplans containing 100 roofs of varying complexity, and the results are evaluated with the VINET criterion using ground-truth image segmentation. A comparison with five popular segmentation techniques from the literature demonstrates the effectiveness and the reliability of the proposed approach. Indeed, we obtain a good segmentation rate of 96% with the proposed method, compared to 87.5% with statistical region merging (SRM), 84% with mean shift, 82% with color structure code (CSC), 80% with the efficient graph-based segmentation algorithm (EGBIS) and 71% with JSEG. PMID:25648706
Maintaining a Local Data Integration System in Support of Weather Forecast Operations
NASA Technical Reports Server (NTRS)
Watson, Leela R.; Blottman, Peter F.; Sharp, David W.; Hoeth, Brian
2010-01-01
Since 2000, both the National Weather Service in Melbourne, FL (NWS MLB) and the Spaceflight Meteorology Group (SMG) have used a local data integration system (LDIS) as part of their forecast and warning operations. Each has benefited from 3-dimensional analyses that are delivered to forecasters every 15 minutes across the peninsula of Florida. The intent is to generate products that enhance short-range weather forecasts issued in support of NWS MLB and SMG operational requirements within East Central Florida. The current LDIS uses the Advanced Regional Prediction System (ARPS) Data Analysis System (ADAS) package as its core, which integrates a wide variety of national, regional, and local observational data sets. It assimilates all available real-time data within its domain and is run at a finer spatial and temporal resolution than current national- or regional-scale analysis packages. As such, it provides local forecasters with a more comprehensive and complete understanding of evolving fine-scale weather features. Recent efforts have been undertaken to update the LDIS through the formal tasking process of NASA's Applied Meteorology Unit. The goals include upgrading LDIS with the latest version of ADAS, incorporating new sources of observational data, and making adjustments to the shell scripts written to govern the system. A series of scripts runs a complete modeling system consisting of the preprocessing step, the main model integration, and the post-processing step. The preprocessing step prepares the terrain and surface characteristics data sets and the objective analysis for model initialization. Data ingested through ADAS include (but are not limited to) Level II Weather Surveillance Radar-1988 Doppler (WSR-88D) data from six Florida radars, Geostationary Operational Environmental Satellite (GOES) visible and infrared satellite imagery, surface and upper-air observations throughout Florida from NOAA's Earth System Research Laboratory/Global Systems Division/Meteorological Assimilation Data Ingest System (MADIS), as well as the Kennedy Space Center/Cape Canaveral Air Force Station wind tower network. The scripts provide NWS MLB and SMG with several options for setting a desirable runtime configuration of the LDIS to account for adjustments in grid spacing, domain location, choice of observational data sources, and selection of background model fields, among others. The utility of an improved LDIS will be demonstrated through post-analysis warm- and cool-season case studies that compare high-resolution model output with and without the ADAS analyses. Operationally, these upgrades will result in more accurate depictions of the current local environment to help with short-range weather forecasting applications, while also offering an improved initialization for local versions of the Weather Research and Forecasting model.
Real-time portable system for fabric defect detection using an ARM processor
NASA Astrophysics Data System (ADS)
Fernandez-Gallego, J. A.; Yañez-Puentes, J. P.; Ortiz-Jaramillo, B.; Alvarez, J.; Orjuela-Vargas, S. A.; Philips, W.
2012-06-01
The modern textile industry seeks to produce textiles with as few defects as possible, since the presence of defects can decrease the final price of products by 45% to 65%. Automated visual inspection (AVI) systems, based on image analysis, have become an important alternative for replacing traditional inspection methods that involve human tasks. An AVI system offers repeatability when implemented within defined constraints, giving more objective and reliable results for particular tasks than human inspection. The cost of developing automated inspection systems can be reduced using modular solutions with embedded systems, an important advantage of which is low energy consumption. Among the possibilities for developing embedded systems, the ARM processor has been explored for acquisition, monitoring and simple signal processing tasks. In a recent approach we explored the use of the ARM processor for defect detection by implementing the wavelet transform. However, the computation speed of the preprocessing was not yet sufficient for real-time applications. In this approach we significantly improve the preprocessing speed of the algorithm by optimizing matrix operations, such that it is adequate for a real-time application. The system was tested for defect detection using different defect types. The paper focuses on a detailed description of the basis of the algorithm implementation, so that other algorithms may make use of the ARM operations for fast implementations.
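A hedged sketch of the wavelet-based detection idea, using PyWavelets on a grayscale fabric image; this is not the authors' ARM-optimized implementation, and the thresholding rule is illustrative:

```python
import numpy as np
import pywt

def defect_map(gray, wavelet="db2", k=3.0):
    # Single-level 2-D DWT; defects disturb the detail sub-bands of an
    # otherwise regular fabric texture.
    _, (cH, cV, cD) = pywt.dwt2(gray.astype(float), wavelet)
    energy = cH**2 + cV**2 + cD**2
    thr = energy.mean() + k * energy.std()   # flag anomalous detail energy
    return energy > thr                      # boolean map at half resolution
```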
Božičević, Alen; Dobrzyński, Maciej; De Bie, Hans; Gafner, Frank; Garo, Eliane; Hamburger, Matthias
2017-12-05
The technological development of LC-MS instrumentation has led to significant improvements in performance and sensitivity, enabling high-throughput analysis of complex samples such as plant extracts. Most software suites allow preprocessing of LC-MS chromatograms to obtain comprehensive information on single constituents. However, more advanced processing needs, such as the systematic and unbiased comparative metabolite profiling of large numbers of complex LC-MS chromatograms, remain a challenge. Currently, users have to rely on different tools to perform such data analyses. We developed a two-step protocol comprising a comparative metabolite profiling tool integrated in the ACD/MS Workbook Suite, and a web platform developed in the R language designed for clustering and visualization of chromatographic data. Initially, all relevant chromatographic and spectroscopic data (retention time, molecular ions with their respective ion abundances, and sample names) are automatically extracted and assembled in an Excel spreadsheet. The file is then loaded into an online web application that includes various statistical algorithms and provides the user with tools to compare and visualize the results in intuitive 2D heatmaps. We applied this workflow to LC-ESIMS profiles obtained from 69 honey samples. Within a few hours of calculation on a standard PC, the honey samples were preprocessed and organized in clusters based on their metabolite profile similarities, thereby highlighting the common metabolite patterns and distributions among samples. Implementation in the ACD/Laboratories software package enables further integration of other analytical data and of in silico prediction tools for modern drug discovery.
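A minimal Python analogue of the clustering-and-heatmap stage (the published workflow uses the ACD/MS Workbook Suite and an R web application); the spreadsheet layout, file name, and linkage method here are assumptions:

```python
import pandas as pd
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, leaves_list

# Samples as rows, metabolite features as columns (assumed layout).
profiles = pd.read_excel("metabolite_table.xlsx", index_col=0)
order = leaves_list(linkage(profiles.values, method="ward"))  # cluster samples

plt.imshow(profiles.values[order], aspect="auto", cmap="viridis")
plt.yticks(range(len(order)), profiles.index[order])
plt.xlabel("metabolite features")
plt.show()
```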
Inci, Fatih; Filippini, Chiara; Baday, Murat; Ozen, Mehmet Ozgun; Calamak, Semih; Durmus, Naside Gozde; Wang, ShuQi; Hanhauser, Emily; Hobbs, Kristen S; Juillard, Franceline; Kuang, Ping Ping; Vetter, Michael L; Carocci, Margot; Yamamoto, Hidemi S; Takagi, Yuko; Yildiz, Umit Hakan; Akin, Demir; Wesemann, Duane R; Singhal, Amit; Yang, Priscilla L; Nibert, Max L; Fichorova, Raina N; Lau, Daryl T-Y; Henrich, Timothy J; Kaye, Kenneth M; Schachter, Steven C; Kuritzkes, Daniel R; Steinmetz, Lars M; Gambhir, Sanjiv S; Davis, Ronald W; Demirci, Utkan
2015-08-11
Recent advances in biosensing technologies present great potential for medical diagnostics, thus improving clinical decisions. However, creating a label-free general sensing platform capable of detecting multiple biotargets in various clinical specimens over a wide dynamic range, without lengthy sample-processing steps, remains a considerable challenge. In practice, these barriers prevent broad applications in clinics and at patients' homes. Here, we demonstrate the nanoplasmonic electrical field-enhanced resonating device (NE(2)RD), which addresses all these impediments on a single platform. The NE(2)RD employs an immunodetection assay to capture biotargets, and precisely measures spectral color changes by their wavelength and extinction intensity shifts in nanoparticles without prior sample labeling or preprocessing. We present, through multiple examples, a label-free, quantitative, portable, multitarget platform, rapidly detecting various protein biomarkers, drugs, protein allergens, bacteria, eukaryotic cells, and distinct viruses. The linear dynamic range of the NE(2)RD is five orders of magnitude broader than that of ELISA, with a sensitivity down to 400 fg/mL. This range and sensitivity are achieved by self-assembling gold nanoparticles to generate hot spots on a 3D-oriented substrate for ultrasensitive measurements. We demonstrate that this precise platform handles multiple clinical samples such as whole blood, serum, and saliva without sample preprocessing under diverse conditions of temperature, pH, and ionic strength. The NE(2)RD's broad dynamic range, detection limit, and portability, integrated with a disposable fluidic chip, have broad applications, potentially enabling the transition toward precision medicine at point-of-care or primary care settings and at patients' homes. PMID:26195743
Awan, Muaaz Gul; Saeed, Fahad
2017-08-01
Modern high-resolution mass spectrometry instruments can generate millions of spectra in a single systems biology experiment. Each spectrum consists of thousands of peaks, but only a small number of peaks actively contribute to the deduction of peptides. Therefore, pre-processing of MS data to detect noisy and non-useful peaks is an active area of research. Most sequential noise-reducing algorithms are impractical to use as a pre-processing step due to their high time-complexity. In this paper, we present a GPU-based dimensionality-reduction algorithm, called G-MSR, for MS2 spectra. Our proposed algorithm uses novel data structures which optimize the memory and computational operations inside the GPU. These novel data structures include Binary Spectra and Quantized Indexed Spectra (QIS). The former helps in communicating essential information between CPU and GPU using a minimum amount of data, while the latter enables us to store and process a complex 3-D data structure in a 1-D array while maintaining the integrity of the MS data. Our proposed algorithm also takes into account the limited memory of GPUs and switches between in-core and out-of-core modes based upon the size of the input data. G-MSR achieves a peak speed-up of 386x over its sequential counterpart and is shown to process over a million spectra in just 32 seconds. The code for this algorithm is available as GPL open-source on GitHub: https://github.com/pcdslab/G-MSR.
An SPM12 extension for multiple sclerosis lesion segmentation
NASA Astrophysics Data System (ADS)
Roura, Eloy; Oliver, Arnau; Cabezas, Mariano; Valverde, Sergi; Pareto, Deborah; Vilanova, Joan C.; Ramió-Torrentà, Lluís.; Rovira, Àlex; Lladó, Xavier
2016-03-01
Purpose: Magnetic resonance imaging is nowadays the hallmark for diagnosing multiple sclerosis (MS), which is characterized by white matter lesions. Several approaches have recently been presented to tackle the lesion segmentation problem, but none of them has been accepted as a standard tool in daily clinical practice. In this work we present a tool able to automatically segment white matter lesions, outperforming the current state-of-the-art approaches. Methods: This work is an extension of Roura et al. [1], where external and platform-dependent pre-processing libraries (brain extraction, noise reduction and intensity normalization) were required to achieve optimal performance. Here we have updated and included all these required pre-processing steps in a single framework (the SPM software), so no external tools are needed to achieve the desired segmentation results. In addition, we have changed the working space from T1w to FLAIR, reducing interpolation errors produced in the registration from FLAIR to T1w space. Finally, a post-processing constraint based on shape and location has been added to reduce false positive detections. Results: The evaluation of the tool was done on 24 MS patients. Qualitative and quantitative results are shown for both approaches in terms of lesion detection and segmentation. Conclusion: We have simplified both installation and implementation of the approach, providing a multiplatform tool integrated into the SPM software which relies only on T1w and FLAIR images. With this new version we have reduced the computation time of the previous approach while maintaining its performance.
Post-Disaster Damage Assessment Through Coherent Change Detection on SAR Imagery
NASA Astrophysics Data System (ADS)
Guida, L.; Boccardo, P.; Donevski, I.; Lo Schiavo, L.; Molinari, M. E.; Monti-Guarnieri, A.; Oxoli, D.; Brovelli, M. A.
2018-04-01
Damage assessment is a fundamental step to support emergency response and recovery activities in a post-earthquake scenario. In recent years, UAV and satellite optical imagery were applied to assess major structural damage before technicians could reach the areas affected by the earthquake. However, bad weather conditions may harm the quality of these optical assessments, thus limiting the practical applicability of these techniques. In this paper, the application of Synthetic Aperture Radar (SAR) imagery is investigated and a novel approach to SAR-based damage assessment is presented. Coherent Change Detection (CCD) algorithms on multiple interferometrically pre-processed SAR images of the area affected by the seismic event are exploited to automatically detect potential damage to buildings and other physical structures. As a case study, the 2016 Central Italy earthquake involving the cities of Amatrice and Accumoli was selected. The main contribution of the research is the integration of a complex process, requiring the coordination of a variety of methods and tools, into a unitary framework, which allows end-to-end application of the approach from SAR data pre-processing to result visualization in a Geographic Information System (GIS). A prototype of this pipeline was implemented, and the outcomes of the methodology were validated through an extended comparison with traditional damage assessment maps created through photo-interpretation of high-resolution aerial imagery. The results indicate that the proposed methodology is able to perform damage detection with a good level of accuracy, as most of the detected points of change are concentrated around highly damaged buildings.
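The coherence estimate at the heart of CCD can be sketched in a few lines of numpy/SciPy: for two co-registered complex SAR acquisitions, coherence is computed over a local window, and low coherence flags potential change. The window size is illustrative:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def coherence(s1, s2, win=5):
    # Local spatial averages of the cross product and the intensities.
    cross = s1 * np.conj(s2)
    num = uniform_filter(cross.real, win) + 1j * uniform_filter(cross.imag, win)
    den = np.sqrt(uniform_filter(np.abs(s1) ** 2, win) *
                  uniform_filter(np.abs(s2) ** 2, win))
    return np.abs(num) / np.maximum(den, 1e-12)  # low values flag change
```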
Comparing Binaural Pre-processing Strategies III
Warzybok, Anna; Ernst, Stephan M. A.
2015-01-01
A comprehensive evaluation of eight signal pre-processing strategies, including directional microphones, coherence filters, single-channel noise reduction, binaural beamformers, and their combinations, was undertaken with normal-hearing (NH) and hearing-impaired (HI) listeners. Speech reception thresholds (SRTs) were measured in three noise scenarios (multitalker babble, cafeteria noise, and a single competing talker). Predictions from three common instrumental measures were compared with the general perceptual benefit provided by the algorithms. The individual SRTs measured without pre-processing and the individual benefits were objectively estimated using the binaural speech intelligibility model. Ten NH listeners and 12 HI listeners participated. The participants varied in age and pure-tone threshold levels. Although HI listeners required a better signal-to-noise ratio than NH listeners to obtain 50% intelligibility, no differences in SRT benefit from the different algorithms were found between the two groups. With the exception of single-channel noise reduction, all algorithms showed an improvement in SRT of between 2.1 dB (in cafeteria noise) and 4.8 dB (in the single competing talker condition). Model predictions with the binaural speech intelligibility model explained 83% of the measured variance of the individual SRTs in the no pre-processing condition. Regarding the benefit from the algorithms, the instrumental measures were not able to predict the perceptual data in all tested noise conditions. The comparable benefit observed for both groups suggests a possible application of noise reduction schemes for listeners with different hearing status. Although the model can predict the individual SRTs without pre-processing, further development is necessary to predict the benefits obtained from the algorithms at an individual level. PMID:26721922
Evaluation of the operational SAR based Baltic sea ice concentration products
NASA Astrophysics Data System (ADS)
Karvonen, Juha
Sea ice concentration is an important ice parameter both for weather and climate modeling and for sea ice navigation. We have developed a fully automated algorithm for sea ice concentration retrieval using dual-polarized ScanSAR wide mode RADARSAT-2 data. RADARSAT-2 is a C-band SAR instrument enabling dual-polarized acquisition in ScanSAR mode. The swath width for the RADARSAT-2 ScanSAR mode is about 500 km, making it very suitable for operational sea ice monitoring. The polarization combination used in our concentration estimation is HH/HV. The SAR data are first preprocessed; the preprocessing consists of geo-rectification to the Mercator projection, incidence angle correction for both polarization channels, and SAR mosaicking. After preprocessing, a segmentation is performed on the SAR mosaics, and single-channel and dual-channel features are computed for each SAR segment. Finally, the SAR concentration is estimated based on these segment-wise features. The algorithm is similar to that introduced in Karvonen (2014). The ice concentration is computed daily using a daily RADARSAT-2 SAR mosaic as input, and it thus gives the concentration estimate at each Baltic Sea location based on the most recent SAR data at that location. The algorithm has been run in an operational test mode since January 2014. We present an evaluation of the SAR-based concentration estimates for the Baltic ice season 2014 by comparing the SAR results with gridded Finnish Ice Service ice charts and with ice concentration estimates from a radiometer algorithm (AMSR-2 Bootstrap algorithm results). References: J. Karvonen, Baltic Sea Ice Concentration Estimation Based on C-Band Dual-Polarized SAR Data, IEEE Transactions on Geoscience and Remote Sensing, in press, DOI: 10.1109/TGRS.2013.2290331, 2014.
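As a hedged illustration of the incidence-angle correction step, a common approach is a linear correction in dB toward a reference angle; the slope and reference value below are placeholders, not the operational settings of this algorithm:

```python
import numpy as np

def normalize_sigma0(sigma0_db, inc_deg, ref_deg=30.0, b=0.25):
    # Project backscatter to a common reference incidence angle so that
    # near- and far-range parts of the wide swath become comparable.
    return sigma0_db + b * (np.asarray(inc_deg) - ref_deg)
```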
NASA Astrophysics Data System (ADS)
Dai, Shengyun; Pan, Xiaoning; Ma, Lijuan; Huang, Xingguo; Du, Chenzhao; Qiao, Yanjiang; Wu, Zhisheng
2018-05-01
Particle size is of great importance for quantitative modeling of NIR diffuse reflectance. In this paper, the effect of sample particle size on the measurement of harpagoside in Radix Scrophulariae powder by near-infrared (NIR) diffuse reflectance spectroscopy was explored. High-performance liquid chromatography (HPLC) was employed as the reference method to construct the quantitative particle size model. Several spectral preprocessing methods were compared, and particle size models obtained with the different preprocessing methods were used to establish partial least-squares (PLS) models of harpagoside. The data showed that the particle size distribution of 125-150 μm for Radix Scrophulariae exhibited the best prediction ability, with R2pre=0.9513, RMSEP=0.1029 mg·g-1, and RPD = 4.78. For the hybrid granularity calibration model, the particle size distribution of 90-180 μm exhibited the best prediction ability, with R2pre=0.8919, RMSEP=0.1632 mg·g-1, and RPD = 3.09. Furthermore, the Kubelka-Munk theory was used to relate the absorption coefficient k (concentration-dependent) and the scattering coefficient s (particle size-dependent). The scattering coefficient s was calculated based on the Kubelka-Munk theory to study the changes in s after mathematical preprocessing. A linear relationship was observed between k/s and absorbance A within a certain range, with k/s values greater than 4. According to this relationship, the model was more accurately constructed with the particle size distribution of 90-180 μm when s was kept constant or within a small linear region. This region provides a good reference for the linear modeling of diffuse reflectance spectroscopy. To establish a diffuse reflectance NIR model, this linear range should be assessed in advance to ensure a precise linear model.
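For reference, the Kubelka-Munk remission function relates diffuse reflectance R to the absorption-to-scattering ratio; a small numpy sketch reproduces the k/s > 4 regime mentioned above:

```python
import numpy as np

def k_over_s(R):
    # Kubelka-Munk remission function: k/s = (1 - R)^2 / (2R).
    R = np.asarray(R, dtype=float)
    return (1.0 - R) ** 2 / (2.0 * R)

print(k_over_s(0.1))   # strong absorber: k/s = 4.05, i.e., greater than 4
```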
NASA Astrophysics Data System (ADS)
Hsu, Kuo-Hsien
2012-11-01
Formosat-2 imagery is a kind of high-spatial-resolution (2-meter GSD) remote sensing satellite data, which includes one panchromatic band and four multispectral bands (blue, green, red, near-infrared). An essential step in the daily processing of received Formosat-2 images is to estimate the cloud statistics of each image using the Automatic Cloud Coverage Assessment (ACCA) algorithm. The cloud statistics are subsequently recorded as important metadata for the image product catalog. In this paper, we propose an ACCA method with two consecutive stages: pre-processing and post-processing analysis. For the pre-processing analysis, unsupervised K-means classification, Sobel's method, thresholding, non-cloudy pixel re-examination, and a cross-band filter method are implemented in sequence to determine the cloud statistics. For the post-processing analysis, the box-counting fractal method is implemented; in other words, the cloud statistics are first determined via the pre-processing analysis, and their correctness across the different spectral bands is then cross-examined qualitatively and quantitatively via the post-processing analysis. The selection of an appropriate thresholding method is critical to the result of the ACCA method. Therefore, in this work we first conducted a series of experiments with clustering-based and spatial thresholding methods, including Otsu's, Local Entropy (LE), Joint Entropy (JE), Global Entropy (GE), and Global Relative Entropy (GRE), for performance comparison. The results show that Otsu's and the GE method both perform better than the others for Formosat-2 images. Moreover, with Otsu's method selected for thresholding, our proposed ACCA method successfully extracts the cloudy pixels of Formosat-2 images for accurate cloud statistics estimation.
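A minimal Otsu-based cloud-masking sketch with scikit-image, for orientation only; the band choice is hypothetical, and the re-examination and cross-band filtering stages of the full ACCA chain are omitted:

```python
from skimage import io, filters

band = io.imread("formosat2_band.tif")      # hypothetical single-band image
t = filters.threshold_otsu(band)            # data-driven global threshold
cloud_mask = band > t                       # bright pixels as cloud candidates
cloud_fraction = cloud_mask.mean()          # the per-image cloud statistic
```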
Integrated data analysis for genome-wide research.
Steinfath, Matthias; Repsilber, Dirk; Scholz, Matthias; Walther, Dirk; Selbig, Joachim
2007-01-01
Integrated data analysis is introduced as the intermediate level of a systems biology approach to analyse different 'omics' datasets, i.e., genome-wide measurements of transcripts, protein levels or protein-protein interactions, and metabolite levels aiming at generating a coherent understanding of biological function. In this chapter we focus on different methods of correlation analyses ranging from simple pairwise correlation to kernel canonical correlation which were recently applied in molecular biology. Several examples are presented to illustrate their application. The input data for this analysis frequently originate from different experimental platforms. Therefore, preprocessing steps such as data normalisation and missing value estimation are inherent to this approach. The corresponding procedures, potential pitfalls and biases, and available software solutions are reviewed. The multiplicity of observations obtained in omics-profiling experiments necessitates the application of multiple testing correction techniques.
PCA method for automated detection of mispronounced words
NASA Astrophysics Data System (ADS)
Ge, Zhenhao; Sharma, Sudhendu R.; Smith, Mark J. T.
2011-06-01
This paper presents a method for detecting mispronunciations with the aim of improving Computer Assisted Language Learning (CALL) tools used by foreign language learners. The algorithm is based on Principal Component Analysis (PCA). It is hierarchical, with each successive step refining the estimate to classify the test word as either mispronounced or correct. Preprocessing before detection, such as normalization and time-scale modification, is implemented to guarantee uniformity of the feature vectors input to the detection system. The performance using various features, including spectrograms and Mel-Frequency Cepstral Coefficients (MFCCs), is compared and evaluated. The best results were obtained using MFCCs, achieving up to 99% accuracy in word verification and 93% in native/non-native classification. Compared with the Hidden Markov Models (HMMs) used pervasively in recognition applications, this approach is computationally efficient and effective when training data is limited.
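One plausible reading of the PCA step, sketched with scikit-learn under stated assumptions: a per-word subspace is fit on fixed-length feature vectors (e.g., flattened MFCC matrices) from correct pronunciations, and a test token is flagged by its reconstruction error. The threshold is hypothetical:

```python
import numpy as np
from sklearn.decomposition import PCA

def fit_word_model(X, n_components=8):
    # X: (n_examples, n_features) of correctly pronounced tokens of one word.
    return PCA(n_components=n_components).fit(X)

def is_mispronounced(pca, x, thresh):
    # Project onto the word subspace and measure the reconstruction residual;
    # a large residual means the token does not fit the correct-pronunciation
    # subspace and is flagged as mispronounced.
    x_hat = pca.inverse_transform(pca.transform(x[None, :]))[0]
    return np.linalg.norm(x - x_hat) > thresh
```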
Segmenting Bone Parts for Bone Age Assessment using Point Distribution Model and Contour Modelling
NASA Astrophysics Data System (ADS)
Kaur, Amandeep; Singh Mann, Kulwinder, Dr.
2018-01-01
Bone age assessment (BAA) is a task performed on radiographs by pediatricians in hospitals to predict final adult height and to diagnose growth disorders by monitoring skeletal development. In building an automatic bone age assessment system, the first routine step is to pre-process the bone X-rays so that a feature row can be constructed. In this research paper, an enhanced point distribution algorithm using contours has been implemented for segmenting bone parts according to the well-established bone age assessment procedure; this supports building the feature row and, later on, the construction of an automatic bone age assessment system. Implementation of the segmentation algorithm shows a high degree of accuracy, in terms of recall and precision, in segmenting bone parts from left-hand X-rays.
Predicting Flood in Perlis Using Ant Colony Optimization
NASA Astrophysics Data System (ADS)
Nadia Sabri, Syaidatul; Saian, Rizauddin
2017-06-01
Flood forecasting is widely studied in order to reduce the effects of floods, such as loss of property, loss of life and contamination of water supplies. Floods usually occur due to continuous heavy rainfall. This study used a variant of the Ant Colony Optimization (ACO) algorithm named Ant-Miner to develop a classification model for flood prediction. However, since Ant-Miner accepts only discrete data while rainfall data form a time series, a pre-processing step is needed to discretize the rainfall data first. This study used a technique called Symbolic Aggregate Approximation (SAX) to convert the rainfall time series into discrete data. In addition, the simple k-means algorithm was used to cluster the data produced by SAX. The findings show that the predictive accuracy of the classification model is more than 80%.
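A compact sketch of SAX discretization as described (z-normalization, piecewise aggregate approximation, Gaussian breakpoints); the alphabet size of four and the segment count are illustrative:

```python
import numpy as np

# Breakpoints that make four symbols equiprobable under N(0, 1).
BREAKPOINTS = [-0.6745, 0.0, 0.6745]

def sax(series, n_segments=8):
    x = (series - series.mean()) / series.std()          # z-normalize
    # PAA: series length must divide evenly into n_segments for this sketch.
    paa = x.reshape(n_segments, -1).mean(axis=1)
    return "".join("abcd"[np.searchsorted(BREAKPOINTS, v)] for v in paa)

print(sax(np.array([1, 2, 4, 8, 9, 8, 4, 2], dtype=float), n_segments=4))
```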
NASA Astrophysics Data System (ADS)
Huang, Liang; Ni, Xuan; Ditto, William L.; Spano, Mark; Carney, Paul R.; Lai, Ying-Cheng
2017-01-01
We develop a framework to uncover and analyse dynamical anomalies from massive, nonlinear and non-stationary time series data. The framework consists of three steps: preprocessing of massive datasets to eliminate erroneous data segments, application of the empirical mode decomposition and Hilbert transform paradigm to obtain the fundamental components embedded in the time series at distinct time scales, and statistical/scaling analysis of the components. As a case study, we apply our framework to detecting and characterizing high-frequency oscillations (HFOs) from a big database of rat electroencephalogram recordings. We find a striking phenomenon: HFOs exhibit on-off intermittency that can be quantified by algebraic scaling laws. Our framework can be generalized to big data-related problems in other fields such as large-scale sensor data and seismic data analysis.
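A sketch of the decomposition stage, assuming the PyEMD package for the empirical mode decomposition and SciPy for the Hilbert transform; the instantaneous amplitudes returned here are what the subsequent statistical/scaling analysis would operate on:

```python
import numpy as np
from scipy.signal import hilbert
from PyEMD import EMD

def imf_envelopes(signal):
    imfs = EMD()(signal)              # intrinsic mode functions, one per scale
    analytic = hilbert(imfs, axis=1)  # per-IMF analytic signal
    return np.abs(analytic)           # instantaneous amplitude envelopes
```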
Recognition of Roasted Coffee Bean Levels using Image Processing and Neural Network
NASA Astrophysics Data System (ADS)
Nasution, T. H.; Andayani, U.
2017-03-01
Roasted coffee beans exhibit characteristic visual features at each roast level; however, many people cannot recognize the roast level reliably. In this research, we propose a method to recognize the roast level of coffee beans from digital images, by processing the images and classifying them with a backpropagation neural network. The steps consist of image acquisition, pre-processing, feature extraction using the Gray Level Co-occurrence Matrix (GLCM) method, and finally normalization of the extracted features using decimal scaling. The normalized feature values are then input to the backpropagation neural network, which classifies the roast level. The results showed that the proposed method is able to identify the coffee bean roast level with an accuracy of 97.5%.
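A hedged sketch of the GLCM feature extraction and decimal-scaling normalization with scikit-image; the distances, angles, and property set are assumptions, since the abstract does not fix them:

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def glcm_features(gray_u8):
    # gray_u8: 8-bit grayscale image of the beans.
    glcm = graycomatrix(gray_u8, distances=[1],
                        angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
                        levels=256, symmetric=True, normed=True)
    props = ["contrast", "homogeneity", "energy", "correlation"]
    feats = np.array([graycoprops(glcm, p).mean() for p in props])
    # Decimal scaling: divide by 10^j so every magnitude falls below 1.
    j = np.ceil(np.log10(np.abs(feats).max() + 1e-12))
    return feats / (10 ** j)
```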
Automatic tissue characterization from ultrasound imagery
NASA Astrophysics Data System (ADS)
Kadah, Yasser M.; Farag, Aly A.; Youssef, Abou-Bakr M.; Badawi, Ahmed M.
1993-08-01
In this work, feature extraction algorithms are proposed to extract tissue characterization parameters from liver images. The resulting parameter set is further processed to obtain the minimum number of parameters representing the most discriminating pattern space for classification. This preprocessing step was applied to over 120 pathology-investigated cases to obtain the learning data for designing the classifier. The extracted features are divided into independent training and test sets and are used to construct both statistical and neural classifiers. The design criteria for these classifiers are minimum error, ease of implementation and learning, and flexibility for future modifications. Algorithms implementing various classification techniques are presented and tested on the data. The best performance was obtained using a single-layer tensor model functional link network. The voting k-nearest neighbor classifier also provided comparably good diagnostic rates.
Unsupervised pattern recognition methods in ciders profiling based on GCE voltammetric signals.
Jakubowska, Małgorzata; Sordoń, Wanda; Ciepiela, Filip
2016-07-15
This work presents a complete methodology for distinguishing between different brands of cider and degrees of ageing, based on voltammetric signals, utilizing dedicated data preprocessing procedures and unsupervised multivariate analysis. It was demonstrated that voltammograms recorded on a glassy carbon electrode in Britton-Robinson buffer at pH 2 are reproducible for each brand. By applying clustering algorithms and principal component analysis, clearly separated homogeneous clusters were obtained. An advanced signal processing strategy, which included automatic baseline correction, interval scaling and a continuous wavelet transform with a dedicated mother wavelet, was the key step in the correct recognition of the objects. The results show that voltammetry combined with optimized univariate and multivariate data processing is a sufficient tool to distinguish between ciders from various brands and to evaluate their freshness.
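A minimal sketch of such a processing chain under stated assumptions: a crude polynomial baseline correction followed by a continuous wavelet transform via PyWavelets, with the Mexican hat wavelet standing in for the paper's dedicated mother wavelet:

```python
import numpy as np
import pywt

def preprocess_voltammogram(current, deg=3, scales=np.arange(1, 31)):
    # Fit-and-subtract polynomial baseline (a simple stand-in for the
    # automatic baseline correction used in the study).
    x = np.arange(current.size)
    baseline = np.polyval(np.polyfit(x, current, deg), x)
    corrected = current - baseline
    # CWT coefficients (scales x points) feed the multivariate analysis.
    coeffs, _ = pywt.cwt(corrected, scales, "mexh")
    return coeffs
```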
Characterization of selected elementary motion detector cells to image primitives.
Benson, Leslie A; Barrett, Steven F; Wright, Cameron H G
2008-01-01
Developing a visual sensing system complete with motion processing hardware and software would have many applications in current technology. It could be mounted on autonomous vehicles to provide information about the navigational environment, as well as obstacle avoidance features. Incorporating the motion processing capabilities into the sensor requires a new approach to the algorithm implementation. This research, like that of many others, has turned to nature for inspiration. Elementary motion detector (EMD) cells are involved in a biological preprocessing network that provides information to the motion processing lobes of the housefly Musca domestica. This paper describes the response of the photoreceptor inputs to the EMDs. The inputs to the EMD components are tested as they are stimulated with varying image primitives. This is the first of many steps in characterizing the EMD response to image primitives.
Automated boundary segmentation and wound analysis for longitudinal corneal OCT images
NASA Astrophysics Data System (ADS)
Wang, Fei; Shi, Fei; Zhu, Weifang; Pan, Lingjiao; Chen, Haoyu; Huang, Haifan; Zheng, Kangkeng; Chen, Xinjian
2017-03-01
Optical coherence tomography (OCT) has been widely applied in the examination and diagnosis of corneal diseases, but the information directly obtainable from OCT images by manual inspection is limited. We propose an automatic processing method to assist ophthalmologists in locating the boundaries in corneal OCT images and in analyzing the recovery of corneal wounds after treatment from longitudinal OCT images. It includes the following steps: preprocessing, epithelium and endothelium boundary segmentation and correction, wound detection, corneal boundary fitting, and wound analysis. The method was tested on a data set of longitudinal corneal OCT images from 20 subjects, each with five images acquired over a period of time after corneal operation. The proposed algorithm achieves high segmentation and classification accuracy and can be used for analyzing wound recovery after corneal surgery.
Gao, Lin; Zhang, Tongsheng; Wang, Jue; Stephen, Julia
2014-01-01
When connectivity analysis is carried out for event-related EEG and MEG, the presence of strong spatial correlations from spontaneous background activity may mask the local neuronal evoked activity and lead to spurious connections. In this paper, we hypothesized that PCA decomposition could be used to diminish the background activity and thereby improve the performance of connectivity analysis in event-related experiments. The idea was tested using simulation, where we found that for the 306-channel Elekta Neuromag system the first 4 PCs represent the dominant background activity, and that the source connectivity pattern after preprocessing is consistent with the true connectivity pattern designed in the simulation. Discarding the first few PCs improved the signal-to-noise ratio of the evoked responses and increased coherences at the major physiological frequency bands. Furthermore, the evoked information was maintained after PCA preprocessing. In conclusion, it is demonstrated that the first few PCs represent background activity, and PCA decomposition can be employed to remove it and expose the evoked activity in the channels under investigation. Therefore, PCA can be applied as a preprocessing approach to improve neuronal connectivity analysis for event-related data. PMID:22918837
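The preprocessing described above reduces to a few lines of numpy: estimate the principal components of the channels-by-time data and reconstruct the signal with the leading PCs removed. A minimal sketch, with the number of removed PCs (4) taken from the simulation result:

```python
import numpy as np

def remove_background_pcs(data, n_remove=4):
    # data: channels x time. Center, decompose via SVD, drop leading PCs.
    mean = data.mean(axis=1, keepdims=True)
    U, S, Vt = np.linalg.svd(data - mean, full_matrices=False)
    S[:n_remove] = 0.0            # first PCs ~ spontaneous background activity
    return U @ np.diag(S) @ Vt + mean
```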
Latifoğlu, Fatma; Polat, Kemal; Kara, Sadik; Güneş, Salih
2008-02-01
In this study, we proposed a new medical diagnosis system based on principal component analysis (PCA), k-NN based weighting pre-processing, and the Artificial Immune Recognition System (AIRS) for the diagnosis of atherosclerosis from carotid artery Doppler signals. The suggested system consists of four stages. First, in the feature extraction stage, we obtained the features related to atherosclerosis using Fast Fourier Transform (FFT) modeling and by calculating the maximum frequency envelope of the sonograms. Second, in the dimensionality reduction stage, the 61 atherosclerosis features were reduced to 4 features using PCA. Third, in the pre-processing stage, we weighted these 4 features using different values of k in a new weighting scheme based on k-NN based weighting pre-processing. Finally, in the classification stage, the AIRS classifier was used to classify subjects as healthy or as having atherosclerosis. A classification accuracy of 100% was obtained by the proposed system using 10-fold cross-validation. This success shows that the proposed system is a robust and effective system for the diagnosis of atherosclerosis.
Semi-autonomous remote sensing time series generation tool
NASA Astrophysics Data System (ADS)
Babu, Dinesh Kumar; Kaufmann, Christof; Schmidt, Marco; Dhams, Thorsten; Conrad, Christopher
2017-10-01
High spatial and temporal resolution data are vital for crop monitoring and phenology change detection. Due to limitations of satellite architecture and frequent cloud cover, the availability of daily data at high spatial resolution is still far from reality. Remote sensing time series generation of high spatial and temporal resolution data by data fusion seems to be a practical alternative. However, it is not an easy process, since it involves multiple steps and also requires multiple tools. In this paper, a Geographic Information System (GIS) based tool for semi-autonomous time series generation is presented. This tool eliminates the difficulties by automating all the steps, enabling users to generate synthetic time series data with ease. First, all the steps required for the time series generation process are identified and grouped into blocks based on their functionalities. Then two main frameworks are created: one to perform all the pre-processing steps on the various satellite data, and the other to perform the data fusion that generates the time series. The two frameworks can be used individually to perform specific tasks or combined to perform both processes in one go. This tool can handle most of the known geo-data formats currently available, which makes it a generic tool for time series generation from various remote sensing satellite data. The tool is developed as a common platform with a good interface and provides many functionalities to enable further development of more remote sensing applications. A detailed description of the capabilities and advantages of the frameworks is given in this paper.
Planning bioinformatics workflows using an expert system.
Chen, Xiaoling; Chang, Jeffrey T
2017-04-15
Bioinformatic analyses are becoming formidably more complex due to the increasing number of steps required to process the data, as well as the proliferation of methods that can be used in each step. To alleviate this difficulty, pipelines are commonly employed. However, pipelines are typically implemented to automate a specific analysis, and thus are difficult to use for exploratory analyses requiring systematic changes to the software or parameters used. To automate the development of pipelines, we have investigated expert systems. We created the Bioinformatics ExperT SYstem (BETSY), which includes a knowledge base where the capabilities of bioinformatics software are explicitly and formally encoded. BETSY is a backwards-chaining rule-based expert system comprising a data model that can capture the richness of biological data, and an inference engine that reasons on the knowledge base to produce workflows. Currently, the knowledge base is populated with rules to analyze microarray and next-generation sequencing data. We evaluated BETSY and found that it could generate workflows that reproduce and go beyond previously published bioinformatics results. Finally, a meta-investigation of the workflows generated from the knowledge base produced a quantitative measure of the technical burden imposed by each step of bioinformatics analyses, revealing the large number of steps devoted to the pre-processing of data. In sum, an expert system approach can facilitate exploratory bioinformatic analysis by automating the development of workflows, a task that requires significant domain expertise. Availability: https://github.com/jefftc/changlab. Contact: jeffrey.t.chang@uth.tmc.edu.
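To make the backwards-chaining idea concrete, here is a toy sketch (not BETSY's actual data model or rule syntax): rules map an output data type to the step that produces it and the input types it requires, and the planner recurses from the requested result back to what is already available:

```python
# Toy backwards-chaining planner; all rule and type names are hypothetical.
RULES = {  # output type: (step name, required input types)
    "aligned_reads": ("align", ["fastq", "reference"]),
    "counts": ("count", ["aligned_reads", "annotation"]),
}

def plan(goal, available, chain=None):
    chain = chain or []
    if goal in available:          # already on disk: nothing to do
        return chain
    step, needs = RULES[goal]      # otherwise, satisfy each prerequisite first
    for need in needs:
        chain = plan(need, available, chain)
    return chain + [step]

print(plan("counts", {"fastq", "reference", "annotation"}))
# -> ['align', 'count']
```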
NASA Astrophysics Data System (ADS)
Rishi, Rahul; Choudhary, Amit; Singh, Ravinder; Dhaka, Vijaypal Singh; Ahlawat, Savita; Rao, Mukta
2010-02-01
In this paper we propose a system for the classification of handwritten text. At a broad level, the system is composed of a preprocessing module, a supervised learning module and a recognition module. The preprocessing module digitizes the documents and extracts features (tangent values) for each character. A radial basis function (RBF) network is used in the learning and recognition modules. The objective is to analyze and improve the performance of the Multi-Layer Perceptron (MLP) by using RBF transfer functions instead of the logarithmic sigmoid function. The results of 35 experiments indicate that the feed-forward MLP performs accurately and consistently with RBF transfer functions. With the changed weight-update mechanism and the feature-extracting preprocessing module, the proposed system achieves good recognition performance.
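A minimal RBF network sketch in numpy (Gaussian hidden units, linear output layer solved by least squares); the architecture and parameters are illustrative rather than the paper's exact configuration:

```python
import numpy as np

def rbf_design(X, centers, sigma=1.0):
    # Gaussian activations of each sample (rows of X) at each center.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def train_rbf(X, Y, centers, sigma=1.0):
    # Solve the linear output layer by least squares on the design matrix.
    H = rbf_design(X, centers, sigma)
    W, *_ = np.linalg.lstsq(H, Y, rcond=None)
    return W

def predict_rbf(X, centers, W, sigma=1.0):
    return rbf_design(X, centers, sigma) @ W
```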
Eggenreich, K; Windhab, S; Schrank, S; Treffer, D; Juster, H; Steinbichler, G; Laske, S; Koscher, G; Roblegg, E; Khinast, J G
2016-05-30
The objective of the present study was to develop a one-step process for the production of tablets directly from primary powder by means of injection molding (IM), to create solid-dispersion-based tablets. Fenofibrate was used as the model API, and a polyvinyl caprolactam-polyvinyl acetate-polyethylene glycol graft copolymer served as the matrix system. Formulations were injection-molded into tablets using state-of-the-art IM equipment. The resulting tablets were physico-chemically characterized, and the drug release kinetics and mechanism were determined. Comparison tablets were produced either directly from powder or from pre-processed pellets prepared via hot melt extrusion (HME). The content of the model drug in the formulations was 10% (w/w), 20% (w/w) and 30% (w/w), respectively. After 120 min, both powder-based and pellet-based injection-molded tablets exhibited a drug release of 60%, independent of the processing route. Content uniformity analysis demonstrated that the model drug was homogeneously distributed. Moreover, analysis of single-dose uniformity revealed geometric drug homogeneity between tablets of one shot.