Does preprocessing change nonlinear measures of heart rate variability?
Gomes, Murilo E D; Guimarães, Homero N; Ribeiro, Antônio L P; Aguirre, Luis A
2002-11-01
This work investigated whether methods used to produce a uniformly sampled heart rate variability (HRV) time series significantly change the deterministic signature underlying the dynamics of such signals and some nonlinear measures of HRV. Two preprocessing methods were used: convolution of inverse interval function values with a rectangular window, and cubic polynomial interpolation. The HRV time series were obtained from 33 Wistar rats subjected to autonomic blockade protocols and from 17 healthy adults. The analysis of determinism was carried out by the method of surrogate data sets and by nonlinear autoregressive moving average modelling and prediction. The scaling exponents alpha, alpha(1) and alpha(2) derived from detrended fluctuation analysis were calculated from the raw HRV time series and the respective preprocessed signals. Cubic interpolation of the HRV time series did not significantly change any nonlinear characteristic studied in this work, while the convolution method affected only the alpha(1) index. The results suggest that preprocessed time series may be used to study HRV in the field of nonlinear dynamics.
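The cubic-interpolation step described above is compact enough to sketch. The snippet below is a minimal illustration assuming NumPy/SciPy, an RR-interval array in seconds, and an illustrative 4 Hz resampling rate; the function and variable names are not from the paper.

```python
import numpy as np
from scipy.interpolate import CubicSpline

def resample_rr_cubic(rr_s, fs=4.0):
    """Resample an RR-interval (tachogram) series to a uniform grid
    by cubic interpolation, yielding an evenly sampled HRV signal."""
    t_beats = np.cumsum(rr_s)                        # beat occurrence times
    t_uniform = np.arange(t_beats[0], t_beats[-1], 1.0 / fs)
    return t_uniform, CubicSpline(t_beats, rr_s)(t_uniform)

rr = 0.8 + 0.05 * np.random.randn(500)   # synthetic RR intervals (seconds)
t, hrv_uniform = resample_rr_cubic(rr)   # input for DFA, surrogate tests, ...
```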
NASA Astrophysics Data System (ADS)
Zhu, Zhe
2017-08-01
The move to free and open access to all archived Landsat images in 2008 completely changed the way Landsat data are used, and many novel change detection algorithms based on Landsat time series have been developed. We present a comprehensive review of four important aspects of change detection studies based on Landsat time series: frequencies, preprocessing, algorithms, and applications. We observed the trend that the more recent the study, the higher the frequency of the Landsat time series used. We reviewed a series of image preprocessing steps, including atmospheric correction, cloud and cloud shadow detection, and composite/fusion/metrics techniques. We divided all change detection algorithms into six categories: thresholding, differencing, segmentation, trajectory classification, statistical boundary, and regression. Within each category, six major characteristics of the different algorithms were analyzed: frequency, change index, univariate/multivariate, online/offline, abrupt/gradual change, and sub-pixel/pixel/spatial scale. Moreover, some of the widely used change detection algorithms are also discussed. Finally, we reviewed different change detection applications, dividing them into two categories: change target and change agent detection.
Neural network versus classical time series forecasting models
NASA Astrophysics Data System (ADS)
Nor, Maria Elena; Safuan, Hamizah Mohd; Shab, Noorzehan Fazahiyah Md; Asrul, Mohd; Abdullah, Affendi; Mohamad, Nurul Asmaa Izzati; Lee, Muhammad Hisyam
2017-05-01
Artificial neural networks (ANN) have an advantage in time series forecasting, as they have the potential to solve complex forecasting problems. This is because an ANN is a data-driven approach that can be trained to map past values of a time series. In this study, the forecast performance of a neural network was compared with that of a classical time series forecasting method, namely the seasonal autoregressive integrated moving average (SARIMA) model, using gold price data. Moreover, the effect of different data preprocessing on the forecast performance of the neural network was examined. Forecast accuracy was evaluated using mean absolute deviation, root mean square error and mean absolute percentage error. It was found that the ANN produced the most accurate forecasts when the Box-Cox transformation was used as data preprocessing.
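As a concrete illustration of the preprocessing that worked best in this study, the sketch below applies a Box-Cox transformation before model fitting and inverts it afterwards. It assumes SciPy and a positive-valued price series; the model-fitting step is deliberately left as a placeholder.

```python
import numpy as np
from scipy.stats import boxcox
from scipy.special import inv_boxcox

prices = 1200 + np.abs(np.cumsum(np.random.randn(200)))  # stand-in for gold prices

transformed, lam = boxcox(prices)        # variance-stabilising preprocessing
# ... train the ANN (or SARIMA) on `transformed` and produce forecasts ...
forecast_bc = transformed[-10:]          # placeholder for the model's output
forecast = inv_boxcox(forecast_bc, lam)  # back-transform to the original scale
```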
NASA Astrophysics Data System (ADS)
Schultz, Michael; Verbesselt, Jan; Herold, Martin; Avitabile, Valerio
2013-10-01
Researchers who use remotely sensed data can spend half of their total effort on preprocessing prior to data analysis. If this data preprocessing does not match the application, the time spent on data analysis can increase considerably and inaccuracies can result. Although a number of methods exist for pre-processing Landsat time series, each has shortcomings, particularly for mapping forest changes under varying illumination, data availability and atmospheric conditions. The requirements for mapping forest changes defined by the United Nations (UN) Reducing Emissions from Forest Degradation and Deforestation (REDD) program demand accurate reporting of the spatio-temporal properties of these changes. We compared the impact of three fundamentally different radiometric preprocessing techniques, Moderate Resolution Atmospheric TRANsmission (MODTRAN), Second Simulation of a Satellite Signal in the Solar Spectrum (6S) and simple Dark Object Subtraction (DOS), on mapping forest changes using Landsat time series data. A modification of Breaks For Additive Season and Trend (BFAST) monitor was used to jointly map the spatial and temporal agreement of forest changes at test sites in Ethiopia and Viet Nam. The suitability of the pre-processing methods for the observed forest change drivers was assessed using recently captured ground-truth and high-resolution data (1000 points). A method for creating robust generic forest maps used for the sampling design is presented. An assessment of error sources identified haze as a major source of commission error in the time series analysis.
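Of the three radiometric corrections compared, Dark Object Subtraction is simple enough to sketch directly. The NumPy-only version below, with a low percentile standing in for the darkest reliable pixel, is a generic illustration rather than the study's implementation.

```python
import numpy as np

def dark_object_subtraction(band, percentile=0.1):
    """Subtract a per-band dark-object value and clip at zero, assuming
    the darkest pixels should have near-zero surface reflectance."""
    dark = np.percentile(band, percentile)
    return np.clip(band - dark, 0.0, None)

scene = np.random.gamma(2.0, 50.0, size=(4, 512, 512))   # 4 synthetic bands
corrected = np.stack([dark_object_subtraction(b) for b in scene])
```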
Wilson, Scott; Bowyer, Andrea; Harrap, Stephen B
2015-01-01
The clinical characterization of cardiovascular dynamics during hemodialysis (HD) has important pathophysiological implications from diagnostic, cardiovascular risk assessment, and treatment efficacy perspectives. Currently the diagnosis of significant intradialytic systolic blood pressure (SBP) changes among HD patients is imprecise and opportunistic, reliant upon the presence of hypotensive symptoms in conjunction with coincident but isolated noninvasive brachial cuff blood pressure (NIBP) readings. Considering hemodynamic variables as a time series makes a continuous recording approach more desirable than intermittent measures; however, in the clinical environment the data signal is susceptible to corruption by both impulsive and Gaussian-type noise. Signal preprocessing is an attractive solution to this problem. Prospectively collected continuous noninvasive SBP data over the short-break intradialytic period in ten patients were preprocessed using a novel median hybrid filter (MHF) algorithm and compared with 50 time-coincident pairs of intradialytic NIBP measures from routine HD practice. The median hybrid preprocessing technique for continuously acquired cardiovascular data yielded a dynamic regression without significant noise and artifact, suitable for high-level profiling of time-dependent SBP behavior. Signal accuracy is highly comparable with standard NIBP measurement, with the added clinical benefit of dynamic real-time hemodynamic information.
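The paper's MHF algorithm is novel and not specified in the abstract, but the general FIR-median-hybrid idea such filters build on can be sketched: take the median of a backward mean, the current sample, and a forward mean, which suppresses impulsive artefacts while tracking slow trends. This is a generic textbook variant assuming NumPy, not the authors' filter.

```python
import numpy as np

def fir_median_hybrid(x, k=5):
    """Generic FIR-median-hybrid filter (illustrative, not the paper's MHF):
    median of a backward mean, the centre sample and a forward mean."""
    y = x.astype(float).copy()
    for i in range(k, len(x) - k):
        back = x[i - k:i].mean()
        fwd = x[i + 1:i + k + 1].mean()
        y[i] = np.median([back, x[i], fwd])
    return y

sbp = 120 + 5 * np.random.randn(1000)   # synthetic continuous SBP trace
sbp[::97] += 60                         # impulsive (artefact-like) spikes
clean = fir_median_hybrid(sbp)
```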
Noise-assisted data processing with empirical mode decomposition in biomedical signals.
Karagiannis, Alexandros; Constantinou, Philip
2011-01-01
In this paper, a methodology is described for investigating the performance of empirical mode decomposition (EMD) on biomedical signals, especially the electrocardiogram (ECG). Synthetic ECG signals corrupted with white Gaussian noise are employed, and time series of various lengths are processed with EMD in order to extract the intrinsic mode functions (IMFs). A statistical significance test is implemented to identify IMFs with high-level noise components and exclude them from denoising procedures. Simulation campaign results reveal that a decrease in processing time is accomplished by introducing a preprocessing stage prior to the application of EMD to biomedical time series. Furthermore, the variation in the number of IMFs according to the type of preprocessing stage is studied as a function of SNR and time-series length. The application of the methodology to MIT-BIH ECG records is also presented in order to verify the findings on real ECG signals.
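A minimal sketch of EMD-based denoising by partial reconstruction is shown below. It assumes the third-party PyEMD package (not used by the paper) and replaces the paper's statistical significance test with a crude rule that drops the first, highest-frequency IMFs.

```python
import numpy as np
from PyEMD import EMD   # pip install EMD-signal; an assumption for this sketch

t = np.linspace(0, 1, 2000)
clean = np.sin(2 * np.pi * 8 * t) + 0.4 * np.sin(2 * np.pi * 1.2 * t)
noisy = clean + 0.5 * np.random.randn(t.size)   # white-Gaussian corruption

imfs = EMD().emd(noisy)        # IMFs ordered from fastest to slowest
n_noise = 2                    # crude stand-in for the significance test
denoised = imfs[n_noise:].sum(axis=0)
```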
HydroClimATe: hydrologic and climatic analysis toolkit
Dickinson, Jesse; Hanson, Randall T.; Predmore, Steven K.
2014-01-01
The potential consequences of climate variability and climate change have been identified as major issues for the sustainability and availability of worldwide water resources. Unlike global climate change, climate variability represents deviations from the long-term state of the climate over periods of a few years to several decades. Currently, rich hydrologic time-series data are available, but the combination of data preparation and statistical methods developed by the U.S. Geological Survey as part of the Groundwater Resources Program is relatively unavailable to hydrologists and engineers who could benefit from estimates of climate variability and its effects on periodic recharge and water-resource availability. This report documents HydroClimATe, a computer program for assessing the relations between variable climatic and hydrologic time-series data. HydroClimATe was developed for a Windows operating system. The software includes statistical tools for (1) time-series preprocessing, (2) spectral analysis, (3) spatial and temporal analysis, (4) correlation analysis, and (5) projections. The time-series preprocessing tools include spline fitting, standardization using a normal or gamma distribution, and transformation by a cumulative departure. The spectral analysis tools include discrete Fourier transform, maximum entropy method, and singular spectrum analysis. The spatial and temporal analysis tool is empirical orthogonal function analysis. The correlation analysis tools are linear regression and lag correlation. The projection tools include autoregressive time-series modeling and generation of many realizations. These tools are demonstrated in four examples that use stream-flow discharge data, groundwater-level records, gridded time series of precipitation data, and the Multivariate ENSO Index.
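Two of the listed preprocessing tools, standardization and the cumulative-departure transformation, are easy to illustrate. The sketch below uses the normal-distribution (z-score) option and NumPy; it is a generic illustration, not HydroClimATe code.

```python
import numpy as np

def standardized_anomalies(x):
    """Z-score standardization (the normal-distribution option)."""
    return (x - x.mean()) / x.std(ddof=1)

def cumulative_departure(x):
    """Cumulative departure from the long-term mean; exposes persistent
    wet/dry regimes in hydrologic records."""
    return np.cumsum(x - x.mean())

precip = np.random.gamma(4.0, 25.0, size=600)   # synthetic monthly totals
z = standardized_anomalies(precip)
cd = cumulative_departure(precip)
```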
Testing for intracycle determinism in pseudoperiodic time series.
Coelho, Mara C S; Mendes, Eduardo M A M; Aguirre, Luis A
2008-06-01
A determinism test is proposed based on the well-known method of surrogate data. Assuming predictability to be a signature of determinism, the proposed method checks for intracycle (i.e., short-term) determinism in pseudoperiodic time series for which standard methods of surrogate analysis do not apply. The approach presented is composed of two steps. First, the data are preprocessed to reduce the effects of seasonal and trend components. Second, standard tests of surrogate analysis can then be used. The determinism test is applied to simulated and experimental pseudoperiodic time series and the results show the applicability of the proposed test.
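The second step relies on standard surrogate analysis, whose core operation can be sketched as follows: generate surrogates that keep the linear correlations (the amplitude spectrum) but destroy any determinism by randomizing the Fourier phases, then compare a prediction statistic on the data against its distribution over the surrogate ensemble. This is a NumPy-only illustration, not the authors' code.

```python
import numpy as np

def ft_surrogate(x, rng=np.random.default_rng()):
    """Phase-randomized (FT) surrogate: same amplitude spectrum, hence
    the same linear autocorrelation, but with randomized phases."""
    spec = np.fft.rfft(x)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=spec.size)
    phases[0] = 0.0                        # keep the mean untouched
    return np.fft.irfft(np.abs(spec) * np.exp(1j * phases), n=len(x))

x = np.sin(np.linspace(0, 40 * np.pi, 2000)) + 0.1 * np.random.randn(2000)
surrogates = [ft_surrogate(x) for _ in range(99)]   # null ensemble
```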
A harmonic linear dynamical system for prominent ECG feature extraction.
Thi, Ngoc Anh Nguyen; Yang, Hyung-Jeong; Kim, SunHee; Do, Luu Ngoc
2014-01-01
Unsupervised mining of electrocardiography (ECG) time series is a crucial task in biomedical applications. To obtain efficient clustering results, the prominent features extracted by preprocessing analysis of multiple ECG time series need to be investigated. In this paper, a Harmonic Linear Dynamical System is applied to discover vital prominent features by mining the evolving hidden dynamics and correlations in ECG time series. The comprehensible and interpretable features discovered by the proposed feature extraction methodology effectively support the accuracy and reliability of the clustering results. In particular, empirical evaluation results demonstrate improved clustering performance compared with previous mainstream feature extraction approaches for ECG time series clustering tasks. Furthermore, experimental results on real-world datasets show scalability, with computation time linear in the duration of the time series.
Wong, Raymond
2013-01-01
Voice biometrics exploits a physiological characteristic: each person's voice is distinct. Owing to this uniqueness, voice classification has found useful applications in classifying speakers' gender, mother tongue or ethnicity (accent), emotional state, identity verification, verbal command control, and so forth. In this paper, we adopt a new preprocessing method named Statistical Feature Extraction (SFX) for extracting important features for training a classification model, based on a piecewise transformation that treats an audio waveform as a time series. Using SFX we can faithfully remodel the statistical characteristics of the time series; together with spectral analysis, a substantial number of features are extracted in combination. An ensemble is utilized to select only the influential features to be used in classification model induction. We focus on comparing the effects of various popular data mining algorithms on multiple datasets. Our experiment consists of classification tests over four typical categories of human voice data, namely Female and Male, Emotional Speech, Speaker Identification, and Language Recognition. The experiments yield encouraging results supporting the fact that heuristically choosing significant features from both time and frequency domains indeed produces better performance in voice classification than traditional signal processing techniques alone, such as wavelets and LPC-to-CC.
NASA Astrophysics Data System (ADS)
Tezuka, Miwa; Kanno, Kazutaka; Bunsen, Masatoshi
2016-08-01
Reservoir computing is a machine-learning paradigm inspired by information processing in the human brain. We numerically demonstrate reservoir computing with a slowly modulated mask signal for preprocessing, using a mutually coupled optoelectronic system. The performance of our system is quantitatively evaluated on a chaotic time series prediction task. Our system can produce performance comparable to reservoir computing with a single feedback system and a fast modulated mask signal. We show that it is possible to slow down the modulation speed of the mask signal by using the mutually coupled system in reservoir computing.
The Timeseries Toolbox - A Web Application to Enable Accessible, Reproducible Time Series Analysis
NASA Astrophysics Data System (ADS)
Veatch, W.; Friedman, D.; Baker, B.; Mueller, C.
2017-12-01
The vast majority of data analyzed by climate researchers are repeated observations of physical processes, i.e., time series data. Such data lend themselves to a common set of statistical techniques and models designed to determine trends and variability (e.g., seasonality) of these repeated observations. Often, these same techniques and models can be applied to a wide variety of different time series data. The Timeseries Toolbox is a web application designed to standardize and streamline these common approaches to time series analysis and modeling, with particular attention to hydrologic time series used in climate preparedness and resilience planning and design by the U.S. Army Corps of Engineers. The application performs much of the pre-processing of time series data necessary for more complex techniques (e.g. interpolation, aggregation). With this tool, users can upload any dataset that conforms to a standard template and immediately begin applying these techniques to analyze their time series data.
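Interpolation and aggregation of the kind the tool performs can be sketched with pandas; the daily stage series, the gap, and the monthly averaging below are all illustrative choices, not part of the Timeseries Toolbox itself.

```python
import numpy as np
import pandas as pd

idx = pd.date_range("2017-01-01", periods=365, freq="D")
stage = pd.Series(np.cumsum(np.random.randn(365)), index=idx)
stage.iloc[40:45] = np.nan                  # a gap in the record

filled = stage.interpolate(method="time")   # pre-processing: gap filling
monthly = filled.resample("MS").mean()      # pre-processing: aggregation
```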
NASA Astrophysics Data System (ADS)
Du, Kongchang; Zhao, Ying; Lei, Jiaqiang
2017-09-01
In hydrological time series prediction, singular spectrum analysis (SSA) and discrete wavelet transform (DWT) are widely used as preprocessing techniques for artificial neural network (ANN) and support vector machine (SVM) predictors. These hybrid or ensemble models seem to reduce the prediction error substantially. In the current literature, researchers apply these techniques to the whole observed time series and then obtain a set of reconstructed or decomposed time series as inputs to the ANN or SVM. However, through two comparative experiments and mathematical deduction we found this usage of SSA and DWT in building hybrid models to be incorrect. Since SSA and DWT use 'future' values in their calculations, the series generated by SSA reconstruction or DWT decomposition contain information about 'future' values. These hybrid models therefore report spuriously 'high' prediction performance and may cause large errors in practice.
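The leakage the authors describe, and a safer alternative, can be sketched with PyWavelets (an assumed stand-in for the DWT implementations used in this literature): decomposing the full series lets training-window coefficients depend on 'future' samples, whereas decomposing only the data observed so far does not.

```python
import numpy as np
import pywt   # PyWavelets; assumed for illustration

x = np.sin(np.linspace(0, 20 * np.pi, 1000)) + 0.3 * np.random.randn(1000)
split = 800   # train on the first 800 points, predict the rest

# Incorrect usage criticised above: these smoothed training inputs were
# computed from the *whole* series, so they contain future information.
leaky_inputs = pywt.waverec(pywt.wavedec(x, "db4", level=3), "db4")[:split]

# Safer: decompose the training window only; at forecast time, recompute
# the decomposition from the samples observed up to that point.
causal_inputs = pywt.waverec(pywt.wavedec(x[:split], "db4", level=3), "db4")[:split]
```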
NASA Astrophysics Data System (ADS)
Cannata, Massimiliano; Neumann, Jakob; Cardoso, Mirko; Rossetto, Rudy; Foglia, Laura; Borsi, Iacopo
2017-04-01
In situ time-series are an important aspect of environmental modelling, especially with the advancement of numerical simulation techniques and increased model complexity. In order to make use of the increasing data available through the requirements of the EU Water Framework Directive, the FREEWAT GIS environment incorporates the newly developed Observation Analysis Tool (OAT) for time-series analysis. The tool is used to import time-series data into QGIS from local CSV files, online sensors using the istSOS service, or MODFLOW model result files, and enables visualisation, pre-processing of data for model development, and post-processing of model results. OAT can be used as a pre-processor for calibration observations, integrating the creation of observations for calibration directly from sensor time-series. The tool consists of an expandable Python library of processing methods and an interface integrated in the QGIS FREEWAT plug-in, which includes a large number of modelling capabilities, data management tools and calibration capacity.
Testing for nonlinearity in non-stationary physiological time series.
Guarín, Diego; Delgado, Edilson; Orozco, Álvaro
2011-01-01
Testing for nonlinearity is one of the most important preprocessing steps in nonlinear time series analysis. Typically, this is done by means of linear surrogate data methods. But it is a known fact that the validity of the results depends heavily on the stationarity of the time series. Since most physiological signals are non-stationary, it is easy to falsely detect nonlinearity using the linear surrogate data methods. In this document, we propose a methodology that extends the procedure for generating constrained surrogate time series in order to assess nonlinearity in non-stationary data. The method is based on band-phase-randomized surrogates, which consist, in contrast to the linear surrogate data methods, in randomizing only a portion of the Fourier phases, namely those in the high-frequency domain. Analysis of simulated time series showed that, in comparison with the linear surrogate data method, our method is able to discriminate between linear stationary, linear non-stationary and nonlinear time series. Applying our methodology to heart rate variability (HRV) records of five healthy subjects, we found that nonlinear correlations are present in these non-stationary physiological signals.
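A minimal NumPy sketch of the band-phase-randomized surrogate idea (with an illustrative cutoff and sampling rate; not the authors' implementation): only phases above a cutoff frequency are randomized, so the slow components that carry the non-stationarity are preserved.

```python
import numpy as np

def band_phase_randomized(x, f_cut, fs, rng=np.random.default_rng()):
    """Randomize only Fourier phases above f_cut (Hz); low-frequency
    phases, and hence the non-stationary trend, are left intact."""
    spec = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    hi = freqs > f_cut
    spec[hi] = np.abs(spec[hi]) * np.exp(1j * rng.uniform(0, 2 * np.pi, hi.sum()))
    return np.fft.irfft(spec, n=len(x))

hrv = np.cumsum(np.random.randn(4096)) * 0.01 + np.random.randn(4096)
surrogate = band_phase_randomized(hrv, f_cut=0.15, fs=4.0)
```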
NASA Astrophysics Data System (ADS)
Merkel, Ronny; Breuhan, Andy; Hildebrandt, Mario; Vielhauer, Claus; Bräutigam, Anja
2012-06-01
In the field of crime scene forensics, current methods of evidence collection, such as the acquisition of shoe-marks, tire impressions, palm-prints or fingerprints, are in most cases still performed in an analogue way. For example, fingerprints are captured by powdering and sticky-tape lifting, ninhydrin bathing or cyanoacrylate fuming and subsequent photographing. Images of the evidence are then further processed by forensic experts. With the upcoming use of new multimedia systems for the digital capturing and processing of crime scene traces in forensics, higher resolutions can be achieved, leading to a much better quality of forensic images. Furthermore, the fast and mostly automated preprocessing of such data using digital signal processing techniques is an emerging field. Also, by the optical and non-destructive lifting of forensic evidence, traces are not destroyed and can therefore be re-captured, e.g. by creating time series of a trace, to extract its aging behavior and perhaps determine the time the trace was left. However, such new methods and tools face different challenges, which need to be addressed before practical application in the field. Based on the example of fingerprint age determination, which has been an unresolved research challenge for forensic experts for decades, we evaluate the influences of different environmental conditions as well as different types of sweat and their implications for the capturing sensor, preprocessing methods and feature extraction. We use a Chromatic White Light (CWL) sensor as an example of such a new optical and contactless measurement device and investigate the influence of 16 different environmental conditions, 8 different sweat types and 11 different preprocessing methods on the aging behavior of 48 fingerprint time series (2592 fingerprint scans in total). We show the challenges that arise for such new multimedia systems capturing and processing forensic evidence.
Learning investment indicators through data extension
NASA Astrophysics Data System (ADS)
Dvořák, Marek
2017-07-01
Stock prices in the form of time series were analysed using univariate and multivariate statistical methods. After simple data preprocessing in the form of logarithmic differences, we augmented this univariate time series to a multivariate representation. The method uses sliding windows to calculate several dozen new variables using simple statistical tools, such as first and second moments, as well as more complicated statistics, such as autoregression coefficients and residual analysis, followed by an optional quadratic transformation used for further data extension. These were used as explanatory variables in a regularized logistic LASSO regression that estimated the Buy-Sell Index (BSI) from real stock market data.
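A compressed sketch of that pipeline, log differences, sliding-window features, then an L1-regularized logistic regression, is shown below, assuming NumPy and scikit-learn; the window length, feature set and binary up/down label are simplifications of the paper's BSI setup.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

prices = 100 * np.cumprod(1 + 0.01 * np.random.randn(1500))
r = np.diff(np.log(prices))                        # logarithmic differences

w = 30                                             # sliding-window length
X = np.array([[r[i - w:i].mean(),                  # first moment
               r[i - w:i].std(ddof=1),             # second moment
               np.corrcoef(r[i - w:i - 1], r[i - w + 1:i])[0, 1]]  # lag-1 autocorr
              for i in range(w, len(r))])
y = (r[w:] > 0).astype(int)                        # toy stand-in for the BSI label

model = LogisticRegression(penalty="l1", solver="liblinear", C=0.5).fit(X, y)
```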
Poplová, Michaela; Sovka, Pavel; Cifra, Michal
2017-01-01
Photonic signals are broadly exploited in communication and sensing and they typically exhibit Poisson-like statistics. In a common scenario where the intensity of the photonic signals is low and one needs to remove a nonstationary trend of the signals for any further analysis, one faces an obstacle: due to the dependence between the mean and variance typical for a Poisson-like process, information about the trend remains in the variance even after the trend has been subtracted, possibly yielding artifactual results in further analyses. Commonly available detrending or normalizing methods cannot cope with this issue. To alleviate this issue we developed a suitable pre-processing method for the signals that originate from a Poisson-like process. In this paper, a Poisson pre-processing method for nonstationary time series with Poisson distribution is developed and tested on computer-generated model data and experimental data of chemiluminescence from human neutrophils and mung seeds. The presented method transforms a nonstationary Poisson signal into a stationary signal with a Poisson distribution while preserving the type of photocount distribution and phase-space structure of the signal. The importance of the suggested pre-processing method is shown in Fano factor and Hurst exponent analysis of both computer-generated model signals and experimental photonic signals. It is demonstrated that our pre-processing method is superior to standard detrending-based methods whenever further signal analysis is sensitive to variance of the signal.
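The Fano factor analysis used to validate the method is straightforward to sketch (NumPy only, synthetic photocounts): for an ideal stationary Poisson process the value stays near 1 at every window size, so residual trend leakage shows up as window-size-dependent inflation.

```python
import numpy as np

def fano_factor(counts, window):
    """Variance-to-mean ratio of window sums of a photocount series;
    equals 1 for an ideal (stationary) Poisson process."""
    n = len(counts) // window
    sums = counts[:n * window].reshape(n, window).sum(axis=1)
    return sums.var(ddof=1) / sums.mean()

photons = np.random.poisson(3.0, size=100_000)     # stationary Poisson counts
print([round(fano_factor(photons, w), 3) for w in (10, 100, 1000)])
```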
The Neuro Bureau ADHD-200 Preprocessed repository.
Bellec, Pierre; Chu, Carlton; Chouinard-Decorte, François; Benhajali, Yassine; Margulies, Daniel S; Craddock, R Cameron
2017-01-01
In 2011, the "ADHD-200 Global Competition" was held with the aim of identifying biomarkers of attention-deficit/hyperactivity disorder from resting-state functional magnetic resonance imaging (rs-fMRI) and structural MRI (s-MRI) data collected on 973 individuals. Statisticians and computer scientists were potentially the most qualified for the machine learning aspect of the competition, but generally lacked the specialized skills to implement the necessary steps of data preparation for rs-fMRI. Realizing this barrier to entry, the Neuro Bureau prospectively collaborated with all competitors by preprocessing the data and sharing these results at the Neuroimaging Informatics Tools and Resources Clearinghouse (NITRC) (http://www.nitrc.org/frs/?group_id=383). This "ADHD-200 Preprocessed" release included multiple analytical pipelines to cater to different philosophies of data analysis. The processed derivatives included denoised and registered 4D fMRI volumes, regional time series extracted from brain parcellations, maps of 10 intrinsic connectivity networks, fractional amplitude of low frequency fluctuation, and regional homogeneity, along with grey matter density maps. The data was used by several teams who competed in the ADHD-200 Global Competition, including the winning entry by a group of biostaticians. To the best of our knowledge, the ADHD-200 Preprocessed release was the first large public resource of preprocessed resting-state fMRI and structural MRI data, and remains to this day the only resource featuring a battery of alternative processing paths. Copyright © 2016 Elsevier Inc. All rights reserved.
Machine learning for the automatic localisation of foetal body parts in cine-MRI scans
NASA Astrophysics Data System (ADS)
Bowles, Christopher; Nowlan, Niamh C.; Hayat, Tayyib T. A.; Malamateniou, Christina; Rutherford, Mary; Hajnal, Joseph V.; Rueckert, Daniel; Kainz, Bernhard
2015-03-01
Being able to automate the location of individual foetal body parts has the potential to dramatically reduce the work required to analyse time resolved foetal Magnetic Resonance Imaging (cine-MRI) scans, for example for use in the automatic evaluation of foetal development. Currently, manual preprocessing of every scan is required to locate body parts before analysis can be performed, leading to a significant time overhead. With the volume of available scans set to increase as cine-MRI scans become more prevalent in clinical practice, this stage of manual preprocessing is a bottleneck, limiting the data available for further analysis. Any tools which can automate this process will therefore save many hours of research time and increase the rate of new discoveries in what is a key area in understanding early human development. Here we present a series of techniques which can be applied to foetal cine-MRI scans in order to first locate and then differentiate between individual body parts. A novel approach to maternal movement suppression and segmentation using Fourier transforms is put forward as a preprocessing step, allowing for easy extraction of short movements of individual foetal body parts via the clustering of optical flow vector fields. These body part movements are compared to a labelled database and probabilistically classified before being spatially and temporally combined to give a final estimate for the location of each body part.
On-Board, Real-Time Preprocessing System for Optical Remote-Sensing Imagery.
Qi, Baogui; Shi, Hao; Zhuang, Yin; Chen, He; Chen, Liang
2018-04-25
With the development of remote-sensing technology, optical remote-sensing imagery processing has played an important role in many application fields, such as geological exploration and natural disaster prevention. However, relative radiation correction and geometric correction are key steps in preprocessing because raw image data without preprocessing will cause poor performance during application. Traditionally, remote-sensing data are downlinked to the ground station, preprocessed, and distributed to users. This process generates long delays, which is a major bottleneck in real-time applications for remote-sensing data. Therefore, on-board, real-time image preprocessing is greatly desired. In this paper, a real-time processing architecture for on-board imagery preprocessing is proposed. First, a hierarchical optimization and mapping method is proposed to realize the preprocessing algorithm in a hardware structure, which can effectively reduce the computation burden of on-board processing. Second, a co-processing system using a field-programmable gate array (FPGA) and a digital signal processor (DSP; altogether, FPGA-DSP) based on optimization is designed to realize real-time preprocessing. The experimental results demonstrate the potential application of our system to an on-board processor, for which resources and power consumption are limited.
Retinex Preprocessing for Improved Multi-Spectral Image Classification
NASA Technical Reports Server (NTRS)
Thompson, B.; Rahman, Z.; Park, S.
2000-01-01
The goal of multi-image classification is to identify and label "similar regions" within a scene. The ability to correctly classify a remotely sensed multi-image of a scene is affected by the ability of the classification process to adequately compensate for the effects of atmospheric variations and sensor anomalies. Better classification may be obtained if the multi-image is preprocessed before classification, so as to reduce the adverse effects of image formation. In this paper, we discuss the overall impact on multi-spectral image classification when the retinex image enhancement algorithm is used to preprocess multi-spectral images. The retinex is a multi-purpose image enhancement algorithm that performs dynamic range compression, reduces the dependence on lighting conditions, and generally enhances apparent spatial resolution. The retinex has been successfully applied to the enhancement of many different types of grayscale and color images. We show in this paper that retinex preprocessing improves the spatial structure of multi-spectral images and thus provides better within-class variations than would otherwise be obtained without the preprocessing. For a series of multi-spectral images obtained with diffuse and direct lighting, we show that without retinex preprocessing the class spectral signatures vary substantially with the lighting conditions. Whereas multi-dimensional clustering without preprocessing produced one-class homogeneous regions, the classification on the preprocessed images produced multi-class non-homogeneous regions. This lack of homogeneity is explained by the interaction between different agronomic treatments applied to the regions: the preprocessed images are closer to ground truth. The principal advantage that the retinex offers is that for different lighting conditions classifications derived from the retinex preprocessed images look remarkably "similar", and thus more consistent, whereas classifications derived from the original images, without preprocessing, are much less similar.
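The core of a retinex pass is compact enough to sketch: the log of the image minus the log of a Gaussian-blurred copy, giving dynamic-range compression and reduced dependence on illumination. The NumPy/SciPy version below is a single-scale simplification applied per band, not the multi-scale implementation the paper refers to.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def single_scale_retinex(band, sigma=30.0, eps=1e-6):
    """log(image) - log(local illumination estimate)."""
    band = band.astype(float) + eps
    return np.log(band) - np.log(gaussian_filter(band, sigma) + eps)

multispectral = np.random.rand(4, 256, 256) * 255   # 4 synthetic bands
enhanced = np.stack([single_scale_retinex(b) for b in multispectral])
```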
JTSA: an open source framework for time series abstractions.
Sacchi, Lucia; Capozzi, Davide; Bellazzi, Riccardo; Larizza, Cristiana
2015-10-01
The evaluation of the clinical status of a patient is frequently based on the temporal evolution of some parameters, making the detection of temporal patterns a priority in data analysis. Temporal abstraction (TA) is a methodology widely used in medical reasoning for summarizing and abstracting longitudinal data. This paper describes JTSA (Java Time Series Abstractor), a framework including a library of algorithms for time series preprocessing and abstraction and an engine to execute a workflow for temporal data processing. The JTSA framework is grounded on a comprehensive ontology that models temporal data processing both from the data storage and the abstraction computation perspective. The JTSA framework is designed to allow users to build their own analysis workflows by combining different algorithms. Thanks to the modular structure of a workflow, simple to highly complex patterns can be detected. The JTSA framework has been developed in Java 1.7 and is distributed under GPL as a jar file. JTSA provides: a collection of algorithms to perform temporal abstraction and preprocessing of time series, a framework for defining and executing data analysis workflows based on these algorithms, and a GUI for workflow prototyping and testing. The whole JTSA project relies on a formal model of the data types and of the algorithms included in the library. This model is the basis for the design and implementation of the software application. Taking into account this formalized structure, the user can easily extend the JTSA framework by adding new algorithms. Results are shown in the context of the EU project MOSAIC, where JTSA is used to extract relevant patterns from data related to the long-term monitoring of diabetic patients. The versatility of JTSA is demonstrated by its possible uses, both as a standalone tool for data summarization and as a module to be embedded into other architectures to select specific phenotypes based on TAs in a large dataset.
de Vos, Stijn; Wardenaar, Klaas J; Bos, Elisabeth H; Wit, Ernst C; Bouwmans, Mara E J; de Jonge, Peter
2017-01-01
Differences in within-person emotion dynamics may be an important source of heterogeneity in depression. To investigate these dynamics, researchers have previously combined multilevel regression analyses with network representations. However, sparse network methods, specifically developed for longitudinal network analyses, have not been applied. Therefore, this study used this approach to investigate population-level and individual-level emotion dynamics in healthy and depressed persons and compared this method with the multilevel approach. Time-series data were collected in pair-matched healthy persons and major depressive disorder (MDD) patients (n = 54). Seven positive affect (PA) and seven negative affect (NA) items were administered electronically at 90 times (30 days; thrice per day). The population-level (healthy vs. MDD) and individual-level time series were analyzed using a sparse longitudinal network model based on vector autoregression. The population-level model was also estimated with a multilevel approach. Effects of different preprocessing steps were evaluated as well. The characteristics of the longitudinal networks were investigated to gain insight into the emotion dynamics. In the population-level networks, longitudinal network connectivity was strongest in the healthy group, with nodes showing more and stronger longitudinal associations with each other. Individually estimated networks varied strongly across individuals. Individual variations in network connectivity were unrelated to baseline characteristics (depression status, neuroticism, severity). A multilevel approach applied to the same data showed higher connectivity in the MDD group, which seemed partly related to the preprocessing approach. The sparse network approach can be useful for the estimation of networks with multiple nodes, where overparameterization is an issue, and for individual-level networks. However, its current inability to model random effects makes it less useful as a population-level approach in case of large heterogeneity. Different preprocessing strategies appeared to strongly influence the results, complicating inferences about network density.
A Systematic Evaluation of Blood Serum and Plasma Pre-Analytics for Metabolomics Cohort Studies
Jobard, Elodie; Trédan, Olivier; Postoly, Déborah; André, Fabrice; Martin, Anne-Laure; Elena-Herrmann, Bénédicte; Boyault, Sandrine
2016-01-01
The recent thriving development of biobanks and associated high-throughput phenotyping studies requires the elaboration of large-scale approaches for monitoring biological sample quality and compliance with standard protocols. We present a metabolomic investigation of human blood samples that delineates pitfalls and guidelines for the collection, storage and handling procedures for serum and plasma. A series of eight pre-processing technical parameters is systematically investigated along variable ranges commonly encountered across clinical studies. While metabolic fingerprints, as assessed by nuclear magnetic resonance, are not significantly affected by altered centrifugation parameters or delays between sample pre-processing (blood centrifugation) and storage, our metabolomic investigation highlights that both the delay and storage temperature between blood draw and centrifugation are the primary parameters impacting serum and plasma metabolic profiles. Storing the blood drawn at 4 °C is shown to be a reliable routine to confine variability associated with idle time prior to sample pre-processing. Based on their fine sensitivity to pre-analytical parameters and protocol variations, metabolic fingerprints could be exploited as valuable ways to determine compliance with standard procedures and quality assessment of blood samples within large multi-omic clinical and translational cohort studies.
Grootswagers, Tijl; Wardle, Susan G; Carlson, Thomas A
2017-04-01
Multivariate pattern analysis (MVPA) or brain decoding methods have become standard practice in analyzing fMRI data. Although decoding methods have been extensively applied in brain-computer interfaces, these methods have only recently been applied to time series neuroimaging data such as MEG and EEG to address experimental questions in cognitive neuroscience. In a tutorial style review, we describe a broad set of options to inform future time series decoding studies from a cognitive neuroscience perspective. Using example MEG data, we illustrate the effects that different options in the decoding analysis pipeline can have on experimental results where the aim is to "decode" different perceptual stimuli or cognitive states over time from dynamic brain activation patterns. We show that decisions made at both preprocessing (e.g., dimensionality reduction, subsampling, trial averaging) and decoding (e.g., classifier selection, cross-validation design) stages of the analysis can significantly affect the results. In addition to standard decoding, we describe extensions to MVPA for time-varying neuroimaging data including representational similarity analysis, temporal generalization, and the interpretation of classifier weight maps. Finally, we outline important caveats in the design and interpretation of time series decoding experiments.
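A skeletal version of the per-timepoint decoding pipeline the review describes, assuming scikit-learn and synthetic MEG-like epochs; the classifier, cross-validation scheme and data dimensions are examples of the options discussed, not recommendations.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

n_trials, n_channels, n_times = 120, 64, 50
X = np.random.randn(n_trials, n_channels, n_times)   # stand-in for MEG epochs
y = np.repeat([0, 1], n_trials // 2)                 # two stimulus classes

# Decode separately at every time point with 5-fold cross-validation.
accuracy = np.array([
    cross_val_score(LinearDiscriminantAnalysis(), X[:, :, t], y, cv=5).mean()
    for t in range(n_times)
])
```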
EARLINET: potential operationality of a research network
NASA Astrophysics Data System (ADS)
Sicard, M.; D'Amico, G.; Comerón, A.; Mona, L.; Alados-Arboledas, L.; Amodeo, A.; Baars, H.; Belegante, L.; Binietoglou, I.; Bravo-Aranda, J. A.; Fernández, A. J.; Fréville, P.; García-Vizcaíno, D.; Giunta, A.; Granados-Muñoz, M. J.; Guerrero-Rascado, J. L.; Hadjimitsis, D.; Haefele, A.; Hervo, M.; Iarlori, M.; Kokkalis, P.; Lange, D.; Mamouri, R. E.; Mattis, I.; Molero, F.; Montoux, N.; Muñoz, A.; Muñoz Porcar, C.; Navas-Guzmán, F.; Nicolae, D.; Nisantzi, A.; Papagiannopoulos, N.; Papayannis, A.; Pereira, S.; Preißler, J.; Pujadas, M.; Rizi, V.; Rocadenbosch, F.; Sellegri, K.; Simeonov, V.; Tsaknakis, G.; Wagner, F.; Pappalardo, G.
2015-07-01
In the framework of the ACTRIS summer 2012 measurement campaign (8 June-17 July 2012), EARLINET organized and performed a controlled feasibility exercise to demonstrate its potential to perform operational, coordinated measurements and deliver products in near-real time. Eleven lidar stations participated in the exercise, which started on 9 July 2012 at 06:00 UT and ended 72 h later on 12 July at 06:00 UT. For the first time, the Single-Calculus Chain (SCC), the common calculus chain developed within EARLINET for the automatic evaluation of lidar data from raw signals up to the final products, was used. All stations sent measurements of 1 h duration to the SCC server in real time, in a predefined netcdf file format. The pre-processing of the data was performed in real time by the SCC, while the optical processing was performed in near-real time after the exercise ended. 98% and 84% of the files sent to the SCC were successfully pre-processed and processed, respectively. These percentages are quite large taking into account that no cloud screening was performed on the lidar data. The paper shows time series of continuously and homogeneously obtained products retrieved at different levels of the SCC: range-square corrected signals (pre-processing) and daytime backscatter and nighttime extinction coefficient profiles (optical processing), as well as combined plots of all direct and derived optical products. The derived products include backscatter- and extinction-related Ångström exponents, lidar ratios and color ratios. The combined plots prove extremely valuable for aerosol classification. The efforts made to define the measurement protocol and to properly configure the SCC pave the way for applying this protocol to specific applications such as the monitoring of special events, atmospheric modelling, climate research and calibration/validation activities of spaceborne observations.
The scheme and research of TV series multidimensional comprehensive evaluation on cross-platform
NASA Astrophysics Data System (ADS)
Chai, Jianping; Bai, Xuesong; Zhou, Hongjun; Yin, Fulian
2016-10-01
To address shortcomings of the traditional comprehensive evaluation system for TV programs, such as reliance on a single data source, neglect of new media, and the high time cost and difficulty of surveys, a new evaluation of TV series is proposed in this paper, taking the perspective of cross-platform multidimensional evaluation after broadcasting. This scheme considers data collected directly from cable television and the Internet as its research objects. It is based on the TOPSIS principle: after preprocessing and calculation, the data become primary indicators that reflect different profiles of the viewing of TV series. Then, through reasonable weighting and summation by six methods (PCA, AHP, etc.), the primary indicators form composite indices for different channels or websites. The scheme avoids the inefficiency and difficulty of surveying and marking; at the same time, it not only reflects different dimensions of viewing, but also combines TV media and new media, completing a multidimensional comprehensive evaluation of TV series across platforms.
Piecewise Polynomial Aggregation as Preprocessing for Data Numerical Modeling
NASA Astrophysics Data System (ADS)
Dobronets, B. S.; Popova, O. A.
2018-05-01
Data aggregation issues for numerical modeling are reviewed in the present study. The authors discuss data aggregation procedures as preprocessing for subsequent numerical modeling. To calculate the data aggregation, the authors propose using numerical probabilistic analysis (NPA). An important feature of this study is how the authors represent the aggregated data. The study shows that the proposed approach to data aggregation can be interpreted as the frequency distribution of a variable. To study its properties, the density function is used. For this purpose, the authors propose using piecewise polynomial models, a suitable example of which is the spline. The authors show that their approach to data aggregation reduces the level of data uncertainty and significantly increases the efficiency of numerical calculations. To demonstrate how well the proposed methods correspond to reality, the authors developed a theoretical framework and considered numerical examples devoted to time series aggregation.
NASA Astrophysics Data System (ADS)
Huang, Liang; Ni, Xuan; Ditto, William L.; Spano, Mark; Carney, Paul R.; Lai, Ying-Cheng
2017-01-01
We develop a framework to uncover and analyse dynamical anomalies from massive, nonlinear and non-stationary time series data. The framework consists of three steps: preprocessing of massive datasets to eliminate erroneous data segments, application of the empirical mode decomposition and Hilbert transform paradigm to obtain the fundamental components embedded in the time series at distinct time scales, and statistical/scaling analysis of the components. As a case study, we apply our framework to detecting and characterizing high-frequency oscillations (HFOs) from a big database of rat electroencephalogram recordings. We find a striking phenomenon: HFOs exhibit on-off intermittency that can be quantified by algebraic scaling laws. Our framework can be generalized to big data-related problems in other fields such as large-scale sensor data and seismic data analysis.
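The detection stage of such a framework can be approximated with a band-limited Hilbert envelope. The sketch below (SciPy, synthetic data, an illustrative 80-500 Hz band and 3-sigma threshold) shows how on/off HFO-like episodes can be flagged, though it omits the EMD step.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

fs = 2000.0
t = np.arange(0.0, 10.0, 1.0 / fs)
eeg = np.random.randn(t.size)                  # stand-in for an EEG recording

sos = butter(4, [80.0, 500.0], btype="bandpass", fs=fs, output="sos")
band = sosfiltfilt(sos, eeg)                   # isolate the HFO band
envelope = np.abs(hilbert(band))               # instantaneous amplitude

on = envelope > envelope.mean() + 3.0 * envelope.std()   # crude on/off episodes
```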
Direction of Coupling from Phases of Interacting Oscillators: A Permutation Information Approach
NASA Astrophysics Data System (ADS)
Bahraminasab, A.; Ghasemi, F.; Stefanovska, A.; McClintock, P. V. E.; Kantz, H.
2008-02-01
We introduce a directionality index for a time series based on a comparison of neighboring values. It can distinguish unidirectional from bidirectional coupling, as well as reveal and quantify asymmetry in bidirectional coupling. It is tested on a numerical model of coupled van der Pol oscillators, and applied to cardiorespiratory data from healthy subjects. There is no need for preprocessing and fine-tuning the parameters, which makes the method very simple, computationally fast and robust.
BiGGEsTS: integrated environment for biclustering analysis of time series gene expression data
Gonçalves, Joana P; Madeira, Sara C; Oliveira, Arlindo L
2009-01-01
Background: The ability to monitor changes in expression patterns over time, and to observe the emergence of coherent temporal responses using expression time series, is critical to advance our understanding of complex biological processes. Biclustering has been recognized as an effective method for discovering local temporal expression patterns and unraveling potential regulatory mechanisms. The general biclustering problem is NP-hard. In the case of time series this problem is tractable, and efficient algorithms can be used. However, there is still a need for specialized applications able to take advantage of the temporal properties inherent to expression time series, both from a computational and a biological perspective. Findings: BiGGEsTS makes available state-of-the-art biclustering algorithms for analyzing expression time series. Gene Ontology (GO) annotations are used to assess the biological relevance of the biclusters. Methods for preprocessing expression time series and post-processing results are also included. The analysis is additionally supported by a visualization module capable of displaying informative representations of the data, including heatmaps, dendrograms, expression charts and graphs of enriched GO terms. Conclusion: BiGGEsTS is a free open source graphical software tool for revealing local coexpression of genes in specific intervals of time, while integrating meaningful information on gene annotations. It is freely available at: . We present a case study on the discovery of transcriptional regulatory modules in the response of Saccharomyces cerevisiae to heat stress.
Comparison of preprocessing methods and storage times for touch DNA samples
Dong, Hui; Wang, Jing; Zhang, Tao; Ge, Jian-ye; Dong, Ying-qiang; Sun, Qi-fan; Liu, Chao; Li, Cai-xia
2017-01-01
Aim: To select appropriate preprocessing methods for different substrates by comparing the effects of four different preprocessing methods on touch DNA samples, and to determine the effect of various storage times on the results of touch DNA sample analysis. Method: Hand touch DNA samples were used to investigate the detection and inspection results of DNA on different substrates. Four preprocessing methods, including the direct cutting method, stubbing procedure, double swab technique, and vacuum cleaner method, were used in this study. DNA was extracted from mock samples with the four different preprocessing methods. The best preprocessing protocol determined from the study was further used to compare performance after various storage times. DNA extracted from all samples was quantified and amplified using standard procedures. Results: The amounts of DNA and the numbers of alleles detected on the porous substrates were greater than those on the non-porous substrates. The performance of the four preprocessing methods varied with different substrates. The direct cutting method displayed advantages for porous substrates, and the vacuum cleaner method was advantageous for non-porous substrates. No significant degradation trend was observed as storage times increased. Conclusion: Different substrates require different preprocessing methods in order to obtain the highest DNA amount and allele number from touch DNA samples. This study provides a theoretical basis for exploration of touch DNA samples and may be used as a reference when dealing with touch DNA samples in casework.
NASA Astrophysics Data System (ADS)
Sawall, Mathias; von Harbou, Erik; Moog, Annekathrin; Behrens, Richard; Schröder, Henning; Simoneau, Joël; Steimers, Ellen; Neymeyr, Klaus
2018-04-01
Spectral data preprocessing is an integral and sometimes inevitable part of chemometric analyses. For Nuclear Magnetic Resonance (NMR) spectra a possible first preprocessing step is a phase correction, which is applied to the Fourier transformed free induction decay (FID) signal. This preprocessing step can be followed by a separate baseline correction step. Especially when series of high-resolution spectra are considered, automated and computationally fast preprocessing routines are desirable. A new method is suggested that applies the phase and the baseline corrections simultaneously in an automated form without manual input, which distinguishes this work from other approaches. The underlying multi-objective optimization or Pareto optimization provides improved results compared to consecutively applied correction steps. The optimization process uses an objective function which applies strong penalty constraints and weaker regularization conditions. The new method includes an approach for the detection of zero baseline regions. The baseline correction uses a modified Whittaker smoother. The functionality of the new method is demonstrated for experimental NMR spectra. The results are verified against gravimetric data. The method is compared to alternative preprocessing tools. Additionally, the simultaneous correction method is compared to a consecutive application of the two correction steps.
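The Whittaker smoother at the heart of the baseline step can be illustrated in its standard penalized-least-squares form (SciPy sparse solve, illustrative penalty weight; the paper's modifications, phase correction and zero-region detection are omitted):

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spsolve

def whittaker_smooth(y, lam=1e7, d=2):
    """Minimize ||y - z||^2 + lam * ||D_d z||^2, where D_d is the d-th
    order difference operator; a large lam yields a smooth baseline."""
    m = len(y)
    D = sp.csc_matrix(np.diff(np.eye(m), n=d, axis=0))
    A = sp.eye(m, format="csc") + lam * (D.T @ D)
    return spsolve(A, np.asarray(y, dtype=float))

x = np.linspace(-3, 3, 1000)
spectrum = np.exp(-x ** 2) + 0.2 * x + 0.8     # a peak on a sloping baseline
corrected = spectrum - whittaker_smooth(spectrum)
```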
An empirical comparison of SPM preprocessing parameters to the analysis of fMRI data.
Della-Maggiore, Valeria; Chau, Wilkin; Peres-Neto, Pedro R; McIntosh, Anthony R
2002-09-01
We present the results from two sets of Monte Carlo simulations aimed at evaluating the robustness of some preprocessing parameters of SPM99 for the analysis of functional magnetic resonance imaging (fMRI). Statistical robustness was estimated by implementing parametric and nonparametric simulation approaches based on the images obtained from an event-related fMRI experiment. Simulated datasets were tested for combinations of the following parameters: basis function, global scaling, low-pass filter, high-pass filter and autoregressive modeling of serial autocorrelation. Based on single-subject SPM analysis, we derived the following conclusions that may serve as a guide for initial analysis of fMRI data using SPM99: (1) The canonical hemodynamic response function is a more reliable basis function for modeling the fMRI time series than the HRF with time derivative. (2) Global scaling should be avoided since it may significantly decrease power, depending on the experimental design. (3) The use of a high-pass filter may be beneficial for event-related designs with fixed interstimulus intervals. (4) When dealing with fMRI time series with short interstimulus intervals (<8 s), the use of a first-order autoregressive model is recommended over a low-pass filter (HRF) because it reduces the risk of inferential bias while providing relatively good power. For datasets with interstimulus intervals longer than 8 s, temporal smoothing is not recommended since it decreases power. While the generalizability of our results may be limited, the methods we employed can be easily implemented by other scientists to determine the best parameter combination for analyzing their data.
Semi-autonomous remote sensing time series generation tool
NASA Astrophysics Data System (ADS)
Babu, Dinesh Kumar; Kaufmann, Christof; Schmidt, Marco; Dhams, Thorsten; Conrad, Christopher
2017-10-01
High spatial and temporal resolution data are vital for crop monitoring and phenology change detection. Due to limitations of satellite architecture and frequent cloud cover, the availability of daily data at high spatial resolution is still far from reality. Remote sensing time series generation of high spatial and temporal resolution data by data fusion seems to be a practical alternative. However, it is not an easy process, since it involves multiple steps and also requires multiple tools. In this paper, a framework for a Geo Information System (GIS) based tool is presented for semi-autonomous time series generation. This tool eliminates the difficulties by automating all the steps and enables users to generate synthetic time series data with ease. Firstly, all the steps required for the time series generation process are identified and grouped into blocks based on their functionalities. Then two main frameworks are created: one to perform all the pre-processing steps on various satellite data and the other to perform data fusion to generate time series. The two frameworks can be used individually to perform specific tasks or combined to perform both processes in one go. This tool can handle most of the known geo data formats currently available, which makes it a generic tool for time series generation from various remote sensing satellite data. The tool is developed as a common platform with a good interface and provides many functionalities to enable further development of more remote sensing applications. A detailed description of the capabilities and advantages of the frameworks is given in this paper.
Dórea, Fernanda C; McEwen, Beverly J; McNab, W Bruce; Revie, Crawford W; Sanchez, Javier
2013-06-06
Diagnostic test orders to an animal laboratory were explored as a data source for monitoring trends in the incidence of clinical syndromes in cattle. Four years of real data and over 200 simulated outbreak signals were used to compare pre-processing methods that could remove temporal effects in the data, as well as temporal aberration detection algorithms that provided high sensitivity and specificity. Weekly differencing demonstrated solid performance in removing day-of-week effects, even in series with low daily counts. For aberration detection, the results indicated that no single algorithm showed performance superior to all others across the range of outbreak scenarios simulated. Exponentially weighted moving average charts and Holt-Winters exponential smoothing demonstrated complementary performance, with the latter offering an automated method to adjust to changes in the time series that will likely occur in the future. Shewhart charts provided lower sensitivity but earlier detection in some scenarios. Cumulative sum charts did not appear to add value to the system; however, the poor performance of this algorithm was attributed to characteristics of the data monitored. These findings indicate that automated monitoring aimed at early detection of temporal aberrations will likely be most effective when a range of algorithms are implemented in parallel.
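The two steps highlighted above, weekly differencing followed by a control-chart scan, are simple to prototype. Below is a minimal Python sketch, not the authors' code: the smoothing weight, the control limit and the simulated Poisson counts are illustrative assumptions.

```python
import numpy as np

def ewma_alerts(counts, lam=0.3, L=3.0):
    """Flag days where an EWMA of 7-day-differenced counts exceeds its limit."""
    diffed = counts[7:] - counts[:-7]              # weekly differencing removes
    mu, sigma = diffed.mean(), diffed.std(ddof=1)  # day-of-week effects
    limit = L * sigma * np.sqrt(lam / (2 - lam))   # steady-state EWMA limit
    z, alerts = mu, []
    for x in diffed:
        z = lam * x + (1 - lam) * z                # exponentially weighted average
        alerts.append(z - mu > limit)              # one-sided: excess counts only
    return np.asarray(alerts)

counts = np.random.poisson(20, 365).astype(float)  # synthetic daily test orders
print(ewma_alerts(counts).sum(), "alarm days")
```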
NASA Astrophysics Data System (ADS)
Fomin, Fedor V.
Preprocessing (data reduction or kernelization) as a strategy for coping with hard problems is universally used in almost every implementation. The history of preprocessing, such as applying reduction rules to simplify truth functions, can be traced back to the 1950s [6]. A natural question in this regard is how to measure the quality of preprocessing rules proposed for a specific problem. For a long time the mathematical analysis of polynomial-time preprocessing algorithms was neglected. The basic reason for this anomaly was that, if we start with an instance I of an NP-hard problem and can show that in polynomial time we can replace it with an equivalent instance I' with |I'| < |I|, then that would imply P=NP in classical complexity.
Predicting Flood in Perlis Using Ant Colony Optimization
NASA Astrophysics Data System (ADS)
Nadia Sabri, Syaidatul; Saian, Rizauddin
2017-06-01
Flood forecasting is widely studied in order to reduce the effects of floods, such as loss of property, loss of life, and contamination of water supplies. Usually, floods occur due to continuous heavy rainfall. This study used a variant of the Ant Colony Optimization (ACO) algorithm named Ant-Miner to develop a classification prediction model for flood. However, since Ant-Miner accepts only discrete data, while rainfall data form a time series, a pre-processing step is needed to discretize the rainfall data first. This study used a technique called Symbolic Aggregate Approximation (SAX) to convert the rainfall time series into discrete data. In addition, the simple k-means algorithm was used to cluster the data produced by SAX. The findings show that the predictive accuracy of the classification prediction model is more than 80%.
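The SAX step can be sketched compactly: z-normalize the series, reduce it with a piecewise aggregate approximation (PAA), and map segment means to letters using breakpoints that cut the standard normal into equiprobable regions. This is a generic illustration; the segment count, alphabet size and synthetic rainfall below are assumptions, not the study's settings.

```python
import numpy as np
from scipy.stats import norm

def sax(series, n_segments=8, alphabet_size=4):
    x = (series - series.mean()) / series.std()       # z-normalize
    x = x[: len(x) // n_segments * n_segments]        # trim to a multiple
    paa = x.reshape(n_segments, -1).mean(axis=1)      # segment means (PAA)
    # breakpoints cutting N(0,1) into equiprobable regions
    breakpoints = norm.ppf(np.linspace(0, 1, alphabet_size + 1)[1:-1])
    symbols = np.searchsorted(breakpoints, paa)       # 0 .. alphabet_size-1
    return "".join(chr(ord("a") + s) for s in symbols)

rain = np.random.gamma(2.0, 5.0, 96)   # synthetic daily rainfall amounts
print(sax(rain))                       # e.g. 'bcadbcda'
```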
NanoStringNormCNV: pre-processing of NanoString CNV data.
Sendorek, Dorota H; Lalonde, Emilie; Yao, Cindy Q; Sabelnykova, Veronica Y; Bristow, Robert G; Boutros, Paul C
2018-03-15
The NanoString System is a well-established technology for measuring RNA and DNA abundance. Although it can estimate copy number variation, relatively few tools support analysis of these data. To address this gap, we created NanoStringNormCNV, an R package for pre-processing and copy number variant calling from NanoString data. This package implements algorithms for pre-processing, quality-control, normalization and copy number variation detection. A series of reporting and data visualization methods support exploratory analyses. To demonstrate its utility, we apply it to a new dataset of 96 genes profiled on 41 prostate tumour and 24 matched normal samples. NanoStringNormCNV is implemented in R and is freely available at http://labs.oicr.on.ca/boutros-lab/software/nanostringnormcnv. paul.boutros@oicr.on.ca. Supplementary data are available at Bioinformatics online.
NASA Astrophysics Data System (ADS)
Zhu, Rui
The economic competitiveness of biofuel production is highly dependent on feedstock cost, which constitutes 35-50% of the total biofuel production cost. Economically viable feedstock pre-processing has a significant influence on all the subsequent downstream processes in the biorefinery supply chain. In this work, hot water extraction (HWE) was exploited as a pre-process to initially fractionate the cell wall structure of softwood Douglas fir, which is considerably more recalcitrant than hardwoods and agricultural feedstocks. A response surface model was developed, and the highest hemicellulose extraction yield (HEY) was obtained at a temperature of 180 °C and a time of 79 min. The HWE process partially removed hemicelluloses, reduced moisture absorption and improved the thermal stability of the wood. To investigate the effects of the HWE pre-process on sulfite pretreatment to overcome recalcitrance of lignocellulose (SPORL), a series of SPORL runs with reduced combined severity factor (CSF) were conducted using HWE-treated Douglas fir. Sugar analysis after enzymatic hydrolysis indicated that SPORL can be conducted at lower temperature (145 °C), shorter time (80 min), and lower acid volume (3%), while still maintaining considerably high enzymatic digestibility (approximately 55-60%). Deriving valuable co-products would increase the overall revenue and improve the economics of the biofuel supply chain. The feasibility of extracting cellulose nanofibrils (CNFs) from HWE-treated Douglas fir by ultrasonication, and the reinforcing potential of CNFs in a nylon 6 matrix, were evaluated. Morphology analysis indicated that finer fibrils can be obtained by increasing ultrasonication time and/or amplitude. CNFs were found to have higher crystallinity and maintained thermal stability compared to untreated fiber. A method of fabricating nylon 6/CNF as-spun nanocomposite filaments using a combination of extrusion, compounding and capillary rheometry to minimize thermal degradation of CNFs was demonstrated. It was found that the nanocomposite filaments have slightly lower thermal stability and crystallinity than neat nylon 6 filaments. However, the incorporation of CNFs increased the tenacity and hydrophilicity of the nanocomposite filaments, indicating their potential as precursor materials for textile yarns.
Investigation on the coloured noise in GPS-derived position with time-varying seasonal signals
NASA Astrophysics Data System (ADS)
Gruszczynska, Marta; Klos, Anna; Bos, Machiel Simon; Bogusz, Janusz
2016-04-01
The seasonal signals in GPS-derived time series arise from real geophysical signals related to tidal (residual) or non-tidal effects (loadings from atmosphere, ocean and continental hydrosphere, thermoelastic strain, etc.) and from numerical artefacts, including aliasing from mismodelling at short periods and the repeatability of the GPS satellite constellation with respect to the Sun (draconitics). Singular Spectrum Analysis (SSA) is a method for the investigation of nonlinear dynamics, suitable for either stationary or non-stationary data series without prior knowledge of their character. The aim of SSA is to mathematically decompose the original time series into a sum of a slowly varying trend, seasonal oscillations and noise. In this presentation we explore the ability of SSA to subtract the time-varying seasonal signals in GPS-derived North-East-Up topocentric components and show the properties of the coloured noise in the residuals. For this purpose we used data from globally distributed IGS (International GNSS Service) permanent stations processed by the JPL (Jet Propulsion Laboratory) in PPP (Precise Point Positioning) mode. After introducing a threshold of 13 years, 264 stations remained, with a maximum record length reaching 23 years. The data were initially pre-processed for outliers, offsets and gaps. SSA was applied to the pre-processed series to estimate the time-varying seasonal signals. We adopted a 3-year window as the optimal dimension, determined with Akaike Information Criterion (AIC) values. A Fisher-Snedecor test corrected for the presence of temporal correlation was used to determine the statistical significance of the reconstructed components. This procedure showed that the first four components, describing the annual and semi-annual signals, are significant at the 99.7% confidence level, which corresponds to the 3-sigma criterion. We compared the non-parametric SSA approach with the commonly chosen parametric Least-Squares Estimation (LSE), which assumes constant amplitudes and phases over time. We noticed a maximum difference in seasonal oscillation of 3.5 mm and a maximum change in velocity of 0.15 mm/year for the Up component (YELL, Yellowknife, Canada) when SSA and LSE are compared. The annual signal has the greatest influence on data variability in the time series, while the semi-annual signal in the Up component contributes much less to the total variance of the data. For some stations, more than 35% of the total variance is explained by the annual signal. According to the Power Spectral Densities (PSD), we showed that SSA is able to properly subtract the time-varying seasonal signals with almost no influence on the power-law character of the stochastic part. Then, the modified Maximum Likelihood Estimation (MLE) in the Hector software was applied to the SSA-filtered time series. We noticed a significant improvement in spectral indices and power-law amplitudes in comparison to those determined classically with LSE, which will be the main subject of this presentation.
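The core SSA computation can be sketched in a few lines: embed the series into a trajectory matrix, take its singular value decomposition, and reconstruct the selected components by diagonal (Hankel) averaging. This is a generic illustration, not the authors' implementation; the window length and the choice of components are assumptions.

```python
import numpy as np

def ssa_reconstruct(x, window, components):
    n = len(x)
    k = n - window + 1
    X = np.column_stack([x[i:i + window] for i in range(k)])  # trajectory matrix
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    # sum of the selected rank-1 elementary matrices
    Xr = sum(s[c] * np.outer(U[:, c], Vt[c]) for c in components)
    rec, counts = np.zeros(n), np.zeros(n)
    for i in range(window):                # diagonal (Hankel) averaging
        rec[i:i + k] += Xr[i]
        counts[i:i + k] += 1
    return rec / counts

t = np.arange(2000)                        # synthetic daily Up component (mm)
x = 2.0 * np.sin(2 * np.pi * t / 365.25) + 0.5 * np.random.randn(len(t))
seasonal = ssa_reconstruct(x, window=365, components=[0, 1])
```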
Revising time series of the Elbe river discharge for flood frequency determination at gauge Dresden
NASA Astrophysics Data System (ADS)
Bartl, S.; Schümberg, S.; Deutsch, M.
2009-11-01
The German research programme RIsk MAnagement of eXtreme flood events has accomplished the improvement of regional hazard assessment for the large rivers in Germany. Here we focused on the Elbe river at its gauge Dresden, which is among the oldest gauges in Europe, with officially available daily discharge time series beginning on 1 January 1890. The project aimed, on the one hand, to extend and revise the existing time series and, on the other hand, to examine the variability of the Elbe river discharge conditions on a longer time scale. Therefore, one major task was the historical search for, and examination of, the retrieved documents and the information they contain. After analysing this information, the development of the river course and the discharge conditions were discussed. Using the knowledge thus provided, a historical hydraulic model was established in another subproject; its results were in turn used here. A further purpose was the determination of flood frequency based on all pre-processed data. The obtained knowledge about historical changes was also used to gain an idea of possible future variations under climate change conditions. In particular, variations in the runoff characteristics of the Elbe river over the course of the year were analysed. We succeeded in obtaining a much longer discharge time series containing fewer errors and uncertainties. Hence, an optimized regional hazard assessment was realised.
Temporal data mining for the quality assessment of hemodialysis services.
Bellazzi, Riccardo; Larizza, Cristiana; Magni, Paolo; Bellazzi, Roberto
2005-05-01
This paper describes the temporal data mining aspects of a research project that deals with the definition of methods and tools for the assessment of the clinical performance of hemodialysis (HD) services, on the basis of the time series automatically collected during hemodialysis sessions. Intelligent data analysis and temporal data mining techniques are applied to gain insight and to discover knowledge about the causes of unsatisfactory clinical results. In particular, two new methods for association rule discovery and temporal rule discovery are applied to the time series. These methods exploit several pre-processing techniques, comprising data reduction, multi-scale filtering and temporal abstractions. We have analyzed the data of more than 5800 dialysis sessions coming from 43 different patients monitored for 19 months. The qualitative rules associating the outcome parameters and the measured variables were examined by the domain experts, who were able to distinguish between rules confirming available background knowledge and unexpected but plausible rules. The new methods proposed in the paper are suitable tools for knowledge discovery in clinical time series. Their use in the context of an auditing system for dialysis management helped clinicians to improve their understanding of the patients' behavior.
van Mierlo, Pieter; Lie, Octavian; Staljanssens, Willeke; Coito, Ana; Vulliémoz, Serge
2018-04-26
We investigated the influence of processing steps in the estimation of multivariate directed functional connectivity during seizures recorded with intracranial EEG (iEEG) on seizure-onset zone (SOZ) localization. We studied the effect of (i) the number of nodes, (ii) time-series normalization, (iii) the choice of multivariate time-varying connectivity measure: the Adaptive Directed Transfer Function (ADTF) or the Adaptive Partial Directed Coherence (APDC), and (iv) the graph theory measure: outdegree or shortest path length. First, simulations were performed to quantify the influence of the various processing steps on the accuracy of SOZ localization. Afterwards, the SOZ was estimated from a 113-electrode iEEG seizure recording and compared with the resection that rendered the patient seizure-free. The simulations revealed that the ADTF is preferred over the APDC for localizing the SOZ from ictal iEEG recordings. Normalizing the time series before analysis resulted in an increase of 25-35% in correctly localized SOZ, while adding more nodes to the connectivity analysis led to a moderate decrease of 10% when comparing 128 with 32 input nodes. The real-seizure connectivity estimates localized the SOZ inside the resection area using the ADTF coupled to outdegree or shortest path length. Our study showed that normalizing the time series is an important pre-processing step, while adding nodes to the analysis only marginally affected SOZ localization. The study shows that directed multivariate Granger-based connectivity analysis is feasible with many input nodes (>100) and that normalization of the time series before connectivity analysis is preferred.
Automated Bayesian model development for frequency detection in biological time series.
Granqvist, Emma; Oldroyd, Giles E D; Morris, Richard J
2011-06-24
A first step in building a mathematical model of a biological system is often the analysis of the temporal behaviour of key quantities. Mathematical relationships between the time and frequency domains, such as Fourier Transforms and wavelets, are commonly used to extract information about the underlying signal from a given time series. This one-to-one mapping from time points to frequencies inherently assumes that both domains contain the complete knowledge of the system. However, for truncated, noisy time series with background trends this unique mapping breaks down and the question reduces to an inference problem of identifying the most probable frequencies. In this paper we build on the method of Bayesian Spectrum Analysis and demonstrate its advantages over conventional methods by applying it to a number of test cases, including two types of biological time series. Firstly, oscillations of calcium in plant root cells in response to microbial symbionts are non-stationary and noisy, posing challenges to data analysis. Secondly, circadian rhythms in gene expression measured over only two cycles highlight the problem of time series with limited length. The results show that the Bayesian frequency detection approach can provide useful results in specific areas where Fourier analysis can be uninformative or misleading. We demonstrate further benefits of the Bayesian approach for time series analysis, such as direct comparison of different hypotheses, inherent estimation of noise levels and parameter precision, and a flexible framework for modelling the data without pre-processing. Modelling in systems biology often builds on the study of time-dependent phenomena. Fourier Transforms are a convenient tool for analysing the frequency domain of time series. However, there are well-known limitations of this method, such as the introduction of spurious frequencies when handling short and noisy time series, and the requirement for uniformly sampled data. Biological time series often deviate significantly from the requirements of optimality for Fourier transformation. In this paper we present an alternative approach based on Bayesian inference. We show the value of placing spectral analysis in the framework of Bayesian inference and demonstrate how model comparison can automate this procedure.
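A heavily simplified sketch of frequency detection in this Bayesian spirit: fit a sinusoid-plus-trend model over a grid of candidate frequencies and turn the residuals into a posterior, with amplitudes and the noise level marginalized under standard reference priors. The grid, the linear trend model and the synthetic signal are assumptions, not the paper's method.

```python
import numpy as np

def frequency_posterior(t, y, freqs):
    n, log_post = len(y), np.empty(len(freqs))
    for i, f in enumerate(freqs):
        # design matrix: sinusoid plus a linear background trend
        G = np.column_stack([np.sin(2 * np.pi * f * t), np.cos(2 * np.pi * f * t),
                             t, np.ones_like(t)])
        coef, *_ = np.linalg.lstsq(G, y, rcond=None)
        rss = np.sum((y - G @ coef) ** 2)
        # marginalizing amplitudes and noise gives log p(f|y) ~ -(n-m)/2 log(RSS)
        log_post[i] = -0.5 * (n - G.shape[1]) * np.log(rss)
    post = np.exp(log_post - log_post.max())
    return post / post.sum()

t = np.linspace(0, 10, 80)                                 # short noisy record
y = np.sin(2 * np.pi * 1.3 * t) + 0.1 * t + 0.4 * np.random.randn(len(t))
freqs = np.linspace(0.2, 3.0, 500)
print(freqs[np.argmax(frequency_posterior(t, y, freqs))])  # close to 1.3
```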
A travel time forecasting model based on change-point detection method
NASA Astrophysics Data System (ADS)
LI, Shupeng; GUANG, Xiaoping; QIAN, Yongsheng; ZENG, Junwei
2017-06-01
Travel time parameters obtained from road traffic sensor data play an important role in traffic management practice. In this paper, a travel time forecasting model is proposed for urban road traffic sensor data based on a change-point detection method. A first-order differencing operation is used to preprocess the actual loop data; a change-point detection algorithm is designed to classify the long sequence of travel time data items into several patterns; then a travel time forecasting model is established based on the autoregressive integrated moving average (ARIMA) model. By computer simulation, different control parameters are chosen for the adaptive change-point search over the travel time series, which is divided into several sections of similar state. Then a linear weight function is used to fit the travel time sequence and to forecast travel time. The results show that the model has high accuracy in travel time forecasting.
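An illustrative sketch of such a pipeline, with the change-point rule, the parameters and the simulated loop data all being assumptions rather than the paper's algorithm: locate the most recent mean shift, then fit an ARIMA model to the final homogeneous segment (the d=1 order applies the first-order differencing internally).

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

def last_change_point(x, window=30, thresh=2.0):
    """Most recent index where the windowed mean shifts by > thresh pooled SDs."""
    cp = 0
    for i in range(window, len(x) - window):
        left, right = x[i - window:i], x[i:i + window]
        pooled_sd = np.sqrt((left.var() + right.var()) / 2) + 1e-9
        if abs(right.mean() - left.mean()) > thresh * pooled_sd:
            cp = i
    return cp

# simulated loop-detector travel times (seconds) with one regime shift
travel_time = np.concatenate([np.random.normal(60, 3, 300),
                              np.random.normal(75, 3, 200)])
cp = last_change_point(travel_time)
model = ARIMA(travel_time[cp:], order=(1, 1, 1)).fit()  # d=1: first differencing
print(cp, model.forecast(steps=5))
```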
Mishra, Alok; Swati, D
2015-09-01
Variation in the interval between the R-R peaks of the electrocardiogram represents the modulation of the cardiac oscillations by the autonomic nervous system. This variation is contaminated by anomalous signals called ectopic beats, artefacts or noise, which mask the true behaviour of heart rate variability. In this paper, we propose a combination filter of a recursive impulse rejection filter and a recursive 20% filter, applied recursively and with a preference for replacement over removal of abnormal beats, to improve the pre-processing of the inter-beat intervals. We tested this novel recursive combinational method, with median-based replacement, to estimate the standard deviation of normal-to-normal (SDNN) beat intervals of congestive heart failure (CHF) and normal sinus rhythm subjects. This work discusses in detail the improvement in pre-processing over a single application of the impulse rejection filter, and over removal of abnormal beats, for the estimation of SDNN and the Poincaré plot descriptors (SD1, SD2, and SD1/SD2). We found the 22 ms value of SDNN and the 36 ms value of the SD2 Poincaré plot descriptor to be clinical indicators for discriminating normal cases from CHF cases. The pre-processing is also useful in the calculation of the Lyapunov exponent, a nonlinear index, as the Lyapunov exponents calculated after the proposed pre-processing are modified in such a way that they begin to follow the expected notion of less complex behaviour in diseased states.
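A hedged sketch of the replace-rather-than-remove idea: an impulse-rejection pass flags beats far from the series median in robust (MAD) units, a 20% rule flags beats deviating more than 20% from a local median, and flagged beats are replaced by that local median, recursively until no beat is flagged. The thresholds and window size are illustrative assumptions, not the authors' exact filter.

```python
import numpy as np

def combination_filter(rr, mad_thresh=4.0, pct=0.20, window=5, max_iter=10):
    rr = rr.astype(float)
    for _ in range(max_iter):
        med = np.median(rr)
        mad = np.median(np.abs(rr - med)) + 1e-9
        local = np.array([np.median(rr[max(0, i - window):i + window + 1])
                          for i in range(len(rr))])
        bad = (np.abs(rr - med) / (1.4826 * mad) > mad_thresh) | \
              (np.abs(rr - local) / local > pct)       # impulse + 20% rules
        if not bad.any():
            break
        rr[bad] = local[bad]                           # replace, do not remove
    return rr

rr = np.random.normal(800, 40, 500)                    # RR intervals (ms)
rr[[50, 200, 201]] = [1500, 300, 1600]                 # injected ectopic-like beats
print(np.std(combination_filter(rr), ddof=1))          # SDNN after pre-processing
```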
Mumbare, Sachin S; Gosavi, Shriram; Almale, Balaji; Patil, Aruna; Dhakane, Supriya; Kadu, Aniruddha
2014-10-01
India's National Family Welfare Programme is dominated by sterilization, particularly tubectomy. Sterilization, being a terminal method of contraception, decides the final number of children for that couple. Many studies have shown a declining trend in the average number of living children at the time of sterilization over a short period of time. This study was therefore planned to perform a time series analysis of the average number of children at the time of terminal contraception, to forecast this quantity until 2020, and to compare the rates of change in various subgroups of the population. Data were preprocessed in MS Access 2007 by creating and running SQL queries. After testing the stationarity of every series with the augmented Dickey-Fuller test, time series analysis and forecasting were done using the best-fit Box-Jenkins ARIMA (p, d, q) nonseasonal model. To compare the rates of change of the average number of children at sterilization in the various subgroups, analysis of covariance (ANCOVA) was applied. Forecasting showed that the replacement level of 2.1 total fertility rate (TFR) will be achieved in 2018 for couples opting for sterilization. The same will be achieved in 2020, 2016, 2018, and 2019 for rural areas, urban areas, Hindu couples, and Buddhist couples, respectively. It will not be achieved by 2020 for Muslim couples. Every stratum of the population showed the declining trend. The decline for male children and in rural areas was significantly faster than the decline for female children and in urban areas, respectively.
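The stated workflow, an augmented Dickey-Fuller stationarity test followed by a Box-Jenkins ARIMA forecast, can be sketched with statsmodels; the synthetic series and the (p, d, q) order below are placeholders, not the study's data or fitted model.

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller
from statsmodels.tsa.arima.model import ARIMA

years = np.arange(1990, 2014)
avg_children = 4.2 - 0.08 * (years - 1990) + np.random.normal(0, 0.05, len(years))

stat, pvalue, *_ = adfuller(avg_children)
d = 0 if pvalue < 0.05 else 1               # difference once if non-stationary
fit = ARIMA(avg_children, order=(1, d, 1)).fit()
forecast = fit.forecast(steps=2020 - years[-1])
print(round(pvalue, 3), forecast[-1])       # projected value for 2020
```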
Real-time topic-aware influence maximization using preprocessing.
Chen, Wei; Lin, Tian; Yang, Cheng
2016-01-01
Influence maximization is the task of finding a set of seed nodes in a social network such that the influence spread of these seed nodes under a certain influence diffusion model is maximized. Topic-aware influence diffusion models have recently been proposed to address the issue that influence between a pair of users is often topic-dependent, and that the information, ideas, innovations, etc. being propagated in networks are typically mixtures of topics. In this paper, we focus on the topic-aware influence maximization task. In particular, we study preprocessing methods that avoid redoing influence maximization for each topic mixture from scratch. We explore two preprocessing algorithms with theoretical justifications. Our empirical results on data obtained in a couple of existing studies demonstrate that one of our algorithms stands out as a strong candidate, providing microsecond online response times and competitive influence spread with reasonable preprocessing effort.
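A toy sketch of the preprocessing idea, not the paper's two algorithms: estimate each candidate seed's influence spread per base topic offline with Monte Carlo simulation under an independent cascade model, then answer a topic-mixture query online by scoring seeds against the mixture-weighted cached spreads. The graph, the probabilities and the additive scoring shortcut (which ignores overlap between seeds) are all simplifying assumptions.

```python
import random

NODES = range(6)
EDGES = [(0, 1), (0, 2), (1, 3), (2, 3), (2, 4), (3, 5), (4, 5)]
P = [{e: 0.6 for e in EDGES},      # per-topic edge activation probabilities
     {e: 0.1 for e in EDGES}]      # (two made-up base topics)

def spread(seed, probs, runs=2000):
    """Monte Carlo single-seed influence spread under independent cascade."""
    total = 0
    for _ in range(runs):
        active, frontier = {seed}, [seed]
        while frontier:
            u = frontier.pop()
            for (a, b) in EDGES:
                if a == u and b not in active and random.random() < probs[(a, b)]:
                    active.add(b)
                    frontier.append(b)
        total += len(active)
    return total / runs

# offline preprocessing: cache per-topic spreads for every candidate seed
pre = [[spread(v, P[z]) for v in NODES] for z in range(len(P))]

# online query for a topic mixture: score by mixture-weighted cached spreads
lam, k = (0.7, 0.3), 2
scores = [sum(lam[z] * pre[z][v] for z in range(len(P))) for v in NODES]
print(sorted(NODES, key=lambda v: -scores[v])[:k])      # top-k seed candidates
```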
Tani, Yuji; Ogasawara, Katsuhiko
2012-01-01
This study aimed to contribute to the management of a healthcare organization by providing management information through time-series analysis of business data accumulated in the hospital information system, which had not been utilized thus far. We examined the performance of a prediction method using the auto-regressive integrated moving-average (ARIMA) model on business data obtained from the Radiology Department. We built the model using the number of radiological examinations over the past 9 years and predicted the number of radiological examinations for the final year; we then compared the actual values with the forecast values. We were able to establish that the prediction method was simple and cost-effective, using free software. In addition, we were able to build a simple model by removing trend components from the data during pre-processing. The difference between predicted and actual values was 10%; however, it was more important to understand the chronological change than the individual time-series values. Furthermore, our method is highly versatile and adaptable to general time-series data. Therefore, different healthcare organizations can use our method for the analysis and forecasting of their business data.
Baksi, Krishanu D; Kuntal, Bhusan K; Mande, Sharmila S
2018-01-01
Realization of the importance of microbiome studies, coupled with decreasing sequencing costs, has led to the exponential growth of microbiome data. A number of these microbiome studies have focused on understanding changes in the microbial community over time. Such longitudinal microbiome studies have the potential to offer unique insights pertaining to microbial social networks as well as their responses to perturbations. In this communication, we introduce a web-based framework called 'TIME' ('Temporal Insights into Microbial Ecology'), developed specifically to obtain meaningful insights from microbiome time series data. The TIME web server is designed to accept a wide range of popular formats as input, with options to preprocess and filter the data. Multiple samples, defined by a series of longitudinal time points along with their metadata, can be compared in order to interactively visualize the temporal variations. In addition to standard microbiome data analytics, the web server implements popular time series analysis methods like dynamic time warping, Granger causality and the Dickey-Fuller test to generate interactive layouts that facilitate easy biological inference. Apart from this, a new metric for comparing metagenomic time series data has been introduced to effectively visualize the similarities and differences in the trends of the resident microbial groups. Augmenting the visualizations with stationarity information pertaining to the microbial groups is used to predict microbial competition as well as community structure. Additionally, the 'causality graph analysis' module incorporated in TIME allows predicting taxa that might have a higher influence on community structure in different conditions. TIME also allows users to easily identify potential taxonomic markers from a longitudinal microbiome analysis. We illustrate the utility of the web server features on a few published time series microbiome datasets and demonstrate the ease with which it can be used to perform complex analyses.
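Dynamic time warping, one of the methods listed above, compares two abundance profiles that may be locally shifted in time. A compact pure-Python dynamic-programming version, with made-up taxon profiles, looks like this:

```python
import numpy as np

def dtw_distance(a, b):
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])             # local mismatch
            D[i, j] = cost + min(D[i - 1, j],           # insertion
                                 D[i, j - 1],           # deletion
                                 D[i - 1, j - 1])       # match
    return D[n, m]

taxon_a = np.array([0.10, 0.12, 0.30, 0.28, 0.15, 0.11])
taxon_b = np.array([0.11, 0.29, 0.31, 0.16, 0.12, 0.10])  # similar but shifted
print(dtw_distance(taxon_a, taxon_b))
```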
Retinal image restoration by means of blind deconvolution
NASA Astrophysics Data System (ADS)
Marrugo, Andrés G.; Šorel, Michal; Šroubek, Filip; Millán, María S.
2011-11-01
Retinal imaging plays a key role in the diagnosis and management of ophthalmologic disorders, such as diabetic retinopathy, glaucoma, and age-related macular degeneration. Because of the acquisition process, retinal images often suffer from blurring and uneven illumination. This problem may seriously affect disease diagnosis and progression assessment. Here we present a method for color retinal image restoration by means of multichannel blind deconvolution. The method is applied to a pair of retinal images acquired within a lapse of time, ranging from several minutes to months. It consists of a series of preprocessing steps to adjust the images so they comply with the considered degradation model, followed by the estimation of the point-spread function and, ultimately, image deconvolution. The preprocessing is mainly composed of image registration, uneven illumination compensation, and segmentation of areas with structural changes. In addition, we have developed a procedure for the detection and visualization of structural changes. This enables the identification of subtle developments in the retina not caused by variation in illumination or blur. The method was tested on synthetic and real images. Encouraging experimental results show that the method is capable of significant restoration of degraded retinal images.
NASA Astrophysics Data System (ADS)
Huvanandana, Jacqueline; Nguyen, Chinh; Thamrin, Cindy; Tracy, Mark; Hinder, Murray; McEwan, Alistair L.
2017-04-01
Despite the decline in mortality rates of extremely preterm infants, intraventricular haemorrhage (IVH) remains common in survivors. The need for resuscitation and for cardiorespiratory management, particularly within the first 24 hours of life, are important factors in the incidence and timing of IVH. Variability analyses of heart rate and blood pressure data have demonstrated potential approaches to predictive monitoring. In this study, we investigated the early identification of infants at high risk of developing IVH, using time series analysis of blood pressure and respiratory data. We also explored approaches to improving model performance, such as the inclusion of multiple variables and signal pre-processing to enhance the results of detrended fluctuation analysis. Of the models we evaluated, the highest area under the receiver-operating characteristic curve (5th, 95th percentile) achieved was 0.921 (0.82, 1.00), by mean diastolic blood pressure and the long-term scaling exponent of pulse interval (PI α2), exhibiting a sensitivity of >90% at a specificity of 75%. Following evaluation in a larger population, our approach may be useful in predictive monitoring to identify infants at high risk of developing IVH, offering caregivers more time to adjust intensive care treatment.
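Detrended fluctuation analysis, from which scaling exponents such as the long-term PI α2 above are derived, can be sketched briefly; the box sizes, fitting range and synthetic pulse-interval series are assumptions.

```python
import numpy as np

def dfa_alpha(x, scales):
    y = np.cumsum(x - np.mean(x))                 # integrated profile
    F = []
    for s in scales:
        rms = []
        for b in range(len(y) // s):
            seg = y[b * s:(b + 1) * s]
            t = np.arange(s)
            trend = np.polyval(np.polyfit(t, seg, 1), t)   # local linear detrend
            rms.append(np.sqrt(np.mean((seg - trend) ** 2)))
        F.append(np.mean(rms))
    # slope of log F(s) versus log s is the scaling exponent alpha
    return np.polyfit(np.log(scales), np.log(F), 1)[0]

pi = np.random.normal(500, 20, 4000)                 # synthetic pulse intervals (ms)
alpha2 = dfa_alpha(pi, scales=np.arange(16, 65, 4))  # long-term box range
print(alpha2)                                        # near 0.5 for white noise
```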
Object recognition of ladar with support vector machine
NASA Astrophysics Data System (ADS)
Sun, Jian-Feng; Li, Qi; Wang, Qi
2005-01-01
Intensity, range and Doppler images can be obtained using laser radar. Laser radar can capture much more object information than other sensing modalities, such as passive infrared imaging and synthetic aperture radar (SAR), so it is well suited as a sensor for object recognition. The traditional approach to laser radar object recognition is to extract target features, which can be influenced by noise. In this paper, a laser radar recognition method based on the Support Vector Machine (SVM) is introduced. The SVM is a focus of recognition research that emerged after neural networks, and it performs well on handwritten digit and face recognition. Two series of SVM experiments, designed for preprocessed and non-preprocessed samples, are performed on real laser radar images, and the experimental results are compared.
Time-frequency analyses of tide-gauge sensor data.
Erol, Serdar
2011-01-01
The real-world phenomena observed by sensors are generally non-stationary in nature. Classical linear techniques for the analysis and modeling of natural time-series observations are inefficient and should be replaced by non-linear techniques, whose theoretical aspects and performance vary. Accordingly, adopting the most appropriate technique and strategy is essential in evaluating sensor data. In this study, two different time-series analysis approaches, namely least squares spectral analysis (LSSA) and wavelet analysis (continuous wavelet transform, cross wavelet transform and wavelet coherence algorithms as extensions of wavelet analysis), are applied to sea-level observations recorded by tide-gauge sensors, and the advantages and drawbacks of these methods are reviewed. The analyses were carried out using sea-level observations recorded at the Antalya-II and Erdek tide-gauge stations of the Turkish National Sea-Level Monitoring System. In the analyses, the useful information hidden in the noisy signals was detected, and the common features between the two sea-level time series were clarified. The tide-gauge records have data gaps in time because of issues such as instrumental shortcomings and power outages. Given the difficulties of time-frequency analysis of data with voids, the sea-level observations were preprocessed, and the missing parts were predicted using a neural network method prior to the analysis. In conclusion, the merits and limitations of the techniques for evaluating non-stationary observations from tide-gauge sensor records are documented, and an analysis strategy for sequential sensor observations is presented.
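Least squares spectral analysis is closely related to the Lomb-Scargle periodogram, which evaluates gapped records such as these without interpolation. The sketch below, using a synthetic semidiurnal-tide-like signal with an artificial gap, relies on SciPy's implementation rather than the authors' software.

```python
import numpy as np
from scipy.signal import lombscargle

t = np.arange(0, 60 * 24 * 30, 15.0)        # 30 days of 15-min samples (minutes)
y = 0.3 * np.sin(2 * np.pi * t / 745.0)     # ~12.42 h lunar semidiurnal tide
y += 0.05 * np.random.randn(len(t))
keep = np.ones(len(t), bool)
keep[1200:1500] = False                     # simulate a gap in the record
t, y = t[keep], y[keep] - y[keep].mean()

periods = np.linspace(600, 900, 2000)       # candidate periods (minutes)
power = lombscargle(t, y, 2 * np.pi / periods)
print(periods[np.argmax(power)])            # close to 745 min
```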
Sereshti, Hassan; Poursorkh, Zahra; Aliakbarzadeh, Ghazaleh; Zarre, Shahin; Ataolahi, Sahar
2018-01-15
Quality of saffron, a valuable food additive, can considerably affect consumers' health. In this work, a novel preprocessing strategy for image analysis of saffron thin layer chromatographic (TLC) patterns was introduced. This involves performing a series of image pre-processing techniques on TLC images, such as compression, inversion, elimination of the general baseline (using asymmetric least squares (AsLS)), removal of spot shift and concavity (by correlation optimization warping (COW)), and finally conversion to RGB chromatograms. Subsequently, unsupervised multivariate data analysis, including principal component analysis (PCA) and k-means clustering, was used to investigate the effect of soil salinity, as a cultivation parameter, on saffron TLC patterns. This method was used as a rapid and simple technique to obtain chemical fingerprints from saffron TLC images. Finally, the separated TLC spots were chemically identified using high-performance liquid chromatography-diode array detection (HPLC-DAD). On this basis, saffron quality from different areas of Iran was evaluated and classified.
A framework for periodic outlier pattern detection in time-series sequences.
Rasheed, Faraz; Alhajj, Reda
2014-05-01
Periodic pattern detection in time-ordered sequences is an important data mining task, which discovers in the time series all patterns that exhibit temporal regularities. Periodic pattern mining has a large number of applications in real life; it helps understanding the regular trend of the data along time, and enables the forecast and prediction of future events. An interesting related and vital problem that has not received enough attention is to discover outlier periodic patterns in a time series. Outlier patterns are defined as those which are different from the rest of the patterns; outliers are not noise. While noise does not belong to the data and it is mostly eliminated by preprocessing, outliers are actual instances in the data but have exceptional characteristics compared with the majority of the other instances. Outliers are unusual patterns that rarely occur, and, thus, have lesser support (frequency of appearance) in the data. Outlier patterns may hint toward discrepancy in the data such as fraudulent transactions, network intrusion, change in customer behavior, recession in the economy, epidemic and disease biomarkers, severe weather conditions like tornados, etc. We argue that detecting the periodicity of outlier patterns might be more important in many sequences than the periodicity of regular, more frequent patterns. In this paper, we present a robust and time efficient suffix tree-based algorithm capable of detecting the periodicity of outlier patterns in a time series by giving more significance to less frequent yet periodic patterns. Several experiments have been conducted using both real and synthetic data; all aspects of the proposed approach are compared with the existing algorithm InfoMiner; the reported results demonstrate the effectiveness and applicability of the proposed approach.
NASA Astrophysics Data System (ADS)
Nourani, Vahid; Andalib, Gholamreza; Dąbrowska, Dominika
2017-05-01
Accurate nitrate load predictions can improve water quality management decisions for watersheds, which affect the environment and drinking water. In this paper, two scenarios were considered for Multi-Station (MS) nitrate load modeling of the Little River watershed. In the first scenario, Markovian characteristics of the streamflow-nitrate time series were used for the MS modeling. For this purpose, the feature extraction criterion of Mutual Information (MI) was employed for input selection of the artificial intelligence models (Feed Forward Neural Network, FFNN, and least square support vector machine). In the second scenario, to consider seasonality-based characteristics of the time series, the wavelet transform was used to extract multi-scale features of the streamflow-nitrate time series of the watershed's sub-basins for MS nitrate load modeling. The Self-Organizing Map (SOM) clustering technique, which finds homogeneous sub-series clusters, was also linked to MI for the proper choice of cluster agents to be imposed into the models for predicting the nitrate loads of the watershed's sub-basins. The proposed MS method covers not only the prediction of the outlet nitrate load but also predictions for the interior sub-basins. The results indicated that the proposed FFNN model coupled with SOM-MI improved the performance of MS nitrate predictions over the Markovian-based models by up to 39%. Overall, accurate selection of dominant inputs that consider the seasonality-based characteristics of the streamflow-nitrate process could enhance the efficiency of nitrate load predictions.
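The MI-based input selection of the first scenario can be sketched with scikit-learn: lagged streamflow candidates are ranked by their estimated mutual information with the nitrate target, and the dominant lags become the FFNN inputs. The synthetic data and the lag set are assumptions.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_regression

rng = np.random.default_rng(0)
flow = rng.gamma(2.0, 10.0, 1000)                          # synthetic streamflow
nitrate = 0.4 * flow + 0.2 * np.roll(flow, 1) + rng.normal(0, 1, 1000)

lags = range(0, 6)
X = np.column_stack([np.roll(flow, k) for k in lags])[6:]  # lagged candidates
y = nitrate[6:]
mi = mutual_info_regression(X, y, random_state=0)
ranked = sorted(zip(lags, mi), key=lambda p: -p[1])
print(ranked[:3])                     # dominant lags become the model inputs
```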
Andronache, Adrian; Rosazza, Cristina; Sattin, Davide; Leonardi, Matilde; D'Incerti, Ludovico; Minati, Ludovico
2013-01-01
An emerging application of resting-state functional MRI (rs-fMRI) is the study of patients with disorders of consciousness (DoC), where integrity of default-mode network (DMN) activity is associated to the clinical level of preservation of consciousness. Due to the inherent inability to follow verbal instructions, arousal induced by scanning noise and postural pain, these patients tend to exhibit substantial levels of movement. This results in spurious, non-neural fluctuations of the rs-fMRI signal, which impair the evaluation of residual functional connectivity. Here, the effect of data preprocessing choices on the detectability of the DMN was systematically evaluated in a representative cohort of 30 clinically and etiologically heterogeneous DoC patients and 33 healthy controls. Starting from a standard preprocessing pipeline, additional steps were gradually inserted, namely band-pass filtering (BPF), removal of co-variance with the movement vectors, removal of co-variance with the global brain parenchyma signal, rejection of realignment outlier volumes and ventricle masking. Both independent-component analysis (ICA) and seed-based analysis (SBA) were performed, and DMN detectability was assessed quantitatively as well as visually. The results of the present study strongly show that the detection of DMN activity in the sub-optimal fMRI series acquired on DoC patients is contingent on the use of adequate filtering steps. ICA and SBA are differently affected but give convergent findings for high-grade preprocessing. We propose that future studies in this area should adopt the described preprocessing procedures as a minimum standard to reduce the probability of wrongly inferring that DMN activity is absent.
Providing web-based tools for time series access and analysis
NASA Astrophysics Data System (ADS)
Eberle, Jonas; Hüttich, Christian; Schmullius, Christiane
2014-05-01
Time series information is widely used in environmental change analyses and is also essential information for stakeholders and governmental agencies. However, a challenging issue is the processing of raw data and the execution of time series analyses. In most cases, data have to be found, downloaded, processed and even converted to the correct data format prior to executing time series analysis tools. Data have to be prepared for use in different existing software packages. Several packages, like TIMESAT (Jönnson & Eklundh, 2004) for phenological studies, BFAST (Verbesselt et al., 2010) for breakpoint detection, and GreenBrown (Forkel et al., 2013) for trend calculations, are provided as open-source software and can be executed from the command line. This is needed if data pre-processing and time series analysis are to be automated. To bring both parts, automated data access and data analysis, together, a web-based system was developed to provide access to satellite-based time series data and to the above-mentioned analysis tools. Users of the web portal are able to specify a point or a polygon and an available dataset (e.g., Vegetation Indices and Land Surface Temperature datasets from NASA MODIS). The data are then processed and provided as a time series CSV file. Afterwards, the user can select an analysis tool that is executed on the server. The final data (CSV, plot images, GeoTIFFs) are visualized in the web portal and can be downloaded for further usage. As a first use case, we built a complementary web-based system with NASA MODIS products for Germany and parts of Siberia, based on the Earth Observation Monitor (www.earth-observation-monitor.net). The aim of this work is to make time series analysis with existing tools as easy as possible, so that users can focus on the interpretation of the results. References: Jönnson, P. and L. Eklundh (2004). TIMESAT - a program for analysing time-series of satellite sensor data. Computers and Geosciences 30, 833-845. Verbesselt, J., R. Hyndman, G. Newnham and D. Culvenor (2010). Detecting trend and seasonal changes in satellite image time series. Remote Sensing of Environment, 114, 106-115. DOI: 10.1016/j.rse.2009.08.014. Forkel, M., N. Carvalhais, J. Verbesselt, M. Mahecha, C. Neigh and M. Reichstein (2013). Trend Change Detection in NDVI Time Series: Effects of Inter-Annual Variability and Methodology. Remote Sensing 5, 2113-2144.
NASA Astrophysics Data System (ADS)
Mehrvand, Masoud; Baghanam, Aida Hosseini; Razzaghzadeh, Zahra; Nourani, Vahid
2017-04-01
Since statistical downscaling methods are the most widely used models for hydrologic impact studies under climate change scenarios, nonlinear regression models known as Artificial Intelligence (AI)-based models, such as the Artificial Neural Network (ANN) and Support Vector Machine (SVM), have been used to spatially downscale the precipitation outputs of Global Climate Models (GCMs). The study was carried out using GCM and station data over GCM grid points located around the Peace-Tampa Bay watershed weather stations. Before downscaling with the AI-based model, correlation coefficients were computed between a few selected large-scale predictor variables and the local-scale predictands to select the most effective predictors. The selected predictors were then assessed with respect to grid location for the site in question. To increase the accuracy of the AI-based downscaling model, pre-processing was applied to the precipitation time series. In this way, the precipitation data derived from various GCMs were analyzed thoroughly to find the highest correlation coefficient between GCM-based historical data and station precipitation data. Both GCM and station precipitation time series were assessed by comparing means and variances over specific intervals. Results indicated a similar trend between GCM and station precipitation data; however, the station data form a non-stationary time series while the GCM data do not. Finally, the AI-based downscaling model was applied to several GCMs with the selected predictors, targeting the local precipitation time series as predictand. The results of this last step were used to produce multiple ensembles of downscaled AI-based models.
Galka, Andreas; Siniatchkin, Michael; Stephani, Ulrich; Groening, Kristina; Wolff, Stephan; Bosch-Bayard, Jorge; Ozaki, Tohru
2010-12-01
The analysis of time series obtained by functional magnetic resonance imaging (fMRI) may be approached by fitting predictive parametric models, such as nearest-neighbor autoregressive models with exogenous input (NNARX). As part of the modeling procedure, it is possible to apply instantaneous linear transformations to the data. Spatial smoothing, a common preprocessing step, may be interpreted as such a transformation. The autoregressive parameters may be constrained such that they provide a response behavior that corresponds to the canonical haemodynamic response function (HRF). We present an algorithm for estimating the parameters of the linear transformations and of the HRF within a rigorous maximum-likelihood framework. Using this approach, an optimal amount of both spatial smoothing and HRF can be estimated simultaneously for a given fMRI data set. An example from a motor-task experiment is discussed. It is found that, for this data set, weak, but non-zero, spatial smoothing is optimal. Furthermore, it is demonstrated that activated regions can be estimated within the maximum-likelihood framework.
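The canonical HRF mentioned above is commonly modelled as a difference of two gamma densities (SPM-style parameters, peaking near 6 s with an undershoot near 16 s); these parameter values are a common convention assumed here, not taken from the paper.

```python
import numpy as np
from scipy.stats import gamma

def canonical_hrf(t, peak=6.0, undershoot=16.0, ratio=6.0):
    """Double-gamma canonical haemodynamic response function."""
    return gamma.pdf(t, peak) - gamma.pdf(t, undershoot) / ratio

t = np.arange(0, 32, 0.5)              # seconds, at a 0.5 s resolution
hrf = canonical_hrf(t)
hrf /= hrf.max()                       # unit peak for convenience

stimulus = np.zeros(200)               # 100 s of stimulus timing
stimulus[::40] = 1.0                   # one event every 20 s
predicted_bold = np.convolve(stimulus, hrf)[:len(stimulus)]
```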
Users Manual for the Geospatial Stream Flow Model (GeoSFM)
Artan, Guleid A.; Asante, Kwabena; Smith, Jodie; Pervez, Md Shahriar; Entenmann, Debbie; Verdin, James P.; Rowland, James
2008-01-01
The monitoring of wide-area hydrologic events requires the manipulation of large amounts of geospatial and time series data into concise information products that characterize the location and magnitude of the event. To perform these manipulations, scientists at the U.S. Geological Survey Center for Earth Resources Observation and Science (EROS), with the cooperation of the U.S. Agency for International Development, Office of Foreign Disaster Assistance (USAID/OFDA), have implemented a hydrologic modeling system. The system includes a data assimilation component to generate data for a Geospatial Stream Flow Model (GeoSFM) that can be run operationally to identify and map wide-area streamflow anomalies. GeoSFM integrates a geographical information system (GIS) for geospatial preprocessing and postprocessing tasks and hydrologic modeling routines implemented as dynamically linked libraries (DLLs) for time series manipulations. Model results include maps depicting the status of streamflow and soil water conditions. This Users Manual provides step-by-step instructions for running the model and for downloading and processing the input data required for initial model parameterization and daily operation.
Framework for Parallel Preprocessing of Microarray Data Using Hadoop
2018-01-01
Nowadays, microarray technology has become one of the popular ways to study gene expression and disease diagnosis. The National Center for Biotechnology Information (NCBI) hosts public databases containing large volumes of biological data that require preprocessing, since they carry high levels of noise and bias. Robust Multiarray Average (RMA) is one of the standard and popular methods utilized to preprocess the data and remove the noise. Most preprocessing algorithms are time-consuming and unable to handle large numbers of datasets with thousands of experiments. Parallel processing can be used to address these issues. Hadoop is a well-known distributed file system framework that provides a parallel environment in which to run such experiments. In this research, for the first time, the capability of Hadoop and the statistical power of R have been leveraged to parallelize the available preprocessing algorithm RMA to efficiently process microarray data. The experiment was run on a cluster containing 5 nodes, each with 16 cores and 16 GB of memory. It compares the efficiency and performance of parallelized RMA using Hadoop against parallelized RMA using the affyPara package as well as sequential RMA. The results show that the speed-up of the proposed approach outperforms both the sequential approach and the affyPara approach.
Technical Manual for the Geospatial Stream Flow Model (GeoSFM)
Asante, Kwabena O.; Artan, Guleid A.; Pervez, Md Shahriar; Bandaragoda, Christina; Verdin, James P.
2008-01-01
The monitoring of wide-area hydrologic events requires the use of geospatial and time series data available in near-real time. These data sets must be manipulated into information products that speak to the location and magnitude of the event. Scientists at the U.S. Geological Survey Earth Resources Observation and Science (USGS EROS) Center have implemented a hydrologic modeling system which consists of an operational data processing system and the Geospatial Stream Flow Model (GeoSFM). The data processing system generates daily forcing evapotranspiration and precipitation data from various remotely sensed and ground-based data sources. To allow for rapid implementation in data scarce environments, widely available terrain, soil, and land cover data sets are used for model setup and initial parameter estimation. GeoSFM performs geospatial preprocessing and postprocessing tasks as well as hydrologic modeling tasks within an ArcView GIS environment. The integration of GIS routines and time series processing routines is achieved seamlessly through the use of dynamically linked libraries (DLLs) embedded within Avenue scripts. GeoSFM is run operationally to identify and map wide-area streamflow anomalies. Daily model results including daily streamflow and soil water maps are disseminated through Internet map servers, flood hazard bulletins and other media.
Barry, Robert L.; Williams, Joy M.; Klassen, L. Martyn; Gallivan, Jason P.; Culham, Jody C.
2009-01-01
Blood-oxygenation-level-dependent (BOLD) functional magnetic resonance imaging (fMRI) is currently the dominant technique for non-invasive investigation of brain functions. One of the challenges with BOLD fMRI, particularly at high fields, is compensation for the effects of spatiotemporally varying magnetic field inhomogeneities (ΔB0) caused by normal subject respiration, and in some studies, movement of the subject during the scan to perform tasks related to the functional paradigm. The presence of ΔB0 during data acquisition distorts reconstructed images and introduces extraneous fluctuations in the fMRI time series that decrease the BOLD contrast-to-noise ratio. Optimization of the fMRI data-processing pipeline to compensate for geometric distortions is of paramount importance to ensure high quality of fMRI data. To investigate ΔB0 caused by subject movement, echo-planar imaging scans were collected with and without concurrent motion of a phantom arm. The phantom arm was constructed and moved by the experimenter to emulate forearm motions while subjects remained still and observed a visual stimulation paradigm. These data were then subjected to eight different combinations of preprocessing steps. The best preprocessing pipeline included navigator correction, a complex phase regressor, and spatial smoothing. The synergy between navigator correction and phase regression reduced geometric distortions better than either step in isolation, and preconditioned the data to make them more amenable to the benefits of spatial smoothing. The combination of these steps provided a 10% increase in t-statistics compared to only navigator correction and spatial smoothing, and reduced the noise and false activations in regions where no legitimate effects would occur.
Groundwater similarity across a watershed derived from time-warped and flow-corrected time series
NASA Astrophysics Data System (ADS)
Rinderer, M.; McGlynn, B. L.; van Meerveld, H. J.
2017-05-01
Information about catchment-scale groundwater dynamics is necessary to understand how catchments store and release water and why water quantity and quality vary in streams. However, groundwater level monitoring is often restricted to a limited number of sites. Knowledge of the factors that determine similarity between monitoring sites can be used to predict catchment-scale groundwater storage and the connectivity of different runoff source areas. We used distance-based and correlation-based similarity measures to quantify the spatial and temporal differences in shallow groundwater similarity for 51 monitoring sites in a Swiss prealpine catchment. The 41-month-long time series were preprocessed using Dynamic Time Warping and a Flow-corrected Time Transformation to account for small timing differences and bias toward low-flow periods. The mean distance-based groundwater similarity was correlated with topographic indices, such as upslope contributing area, topographic wetness index, and local slope. Correlation-based similarity was less related to landscape position but instead revealed differences between seasons. Analysis of variance and partial Mantel tests showed that landscape position, represented by the topographic wetness index, explained 52% of the variability in mean distance-based groundwater similarity, while spatial distance, represented by the Euclidean distance, explained only 5%. The variability in distance-based and correlation-based similarity between groundwater and streamflow time series was significantly larger for midslope locations than for other landscape positions. This suggests that groundwater dynamics at these midslope sites, which are important for understanding runoff source areas and hydrological connectivity at the catchment scale, are the most difficult to predict.
Muncy, Nathan M; Hedges-Muncy, Ariana M; Kirwan, C Brock
2017-01-01
Pre-processing MRI scans prior to performing volumetric analyses is common practice in MRI studies. As pre-processing steps adjust the voxel intensities, the space in which the scan exists, and the amount of data in the scan, it is possible that the steps have an effect on the volumetric output. To date, studies have compared between and not within pipelines, and so the impact of each step is unknown. This study aims to quantify the effects of pre-processing steps on volumetric measures in T1-weighted scans within a single pipeline. It was our hypothesis that pre-processing steps would significantly impact ROI volume estimations. One hundred fifteen participants from the OASIS dataset were used, where each participant contributed three scans. All scans were then pre-processed using a step-wise pipeline. Bilateral hippocampus, putamen, and middle temporal gyrus volume estimations were assessed following each successive step, and all data were processed by the same pipeline 5 times. Repeated-measures analyses tested for main effects of pipeline step, scan-rescan (for MRI scanner consistency) and repeated pipeline runs (for algorithmic consistency). A main effect of pipeline step was detected, and, interestingly, an interaction between pipeline step and ROI was found. No effect of either scan-rescan or repeated pipeline run was detected. We then supply a correction for the noise in the data resulting from pre-processing. PMID:29023597
Airborne Hyperspectral Imaging of Seagrass and Coral Reef
NASA Astrophysics Data System (ADS)
Merrill, J.; Pan, Z.; Mewes, T.; Herwitz, S.
2013-12-01
This talk presents the process of project preparation, airborne data collection, data pre-processing and comparative analysis for a series of airborne hyperspectral projects focused on the mapping of seagrass and coral reef communities in the Florida Keys. As part of a series of large collaborative projects funded by the NASA ROSES program and the Florida Fish and Wildlife Conservation Commission and administered by the NASA UAV Collaborative, a series of airborne hyperspectral datasets were collected over six sites in the Florida Keys in May 2012, October 2012 and May 2013 by Galileo Group, Inc. using a manned Cessna 172 and NASA's SIERRA Unmanned Aerial Vehicle. Precise solar and tidal data were used to calculate airborne collection parameters and develop flight plans designed to optimize data quality. Two independent Visible and Near-Infrared (VNIR) hyperspectral imaging systems covering 400-1000 nm were used to collect imagery over six Areas of Interest (AOIs). Multiple collections were performed over all sites during strict solar windows in the mornings and afternoons. Independently developed pre-processing algorithms were employed to radiometrically correct, synchronize and georectify individual flight lines, which were then combined into color-balanced mosaics for each Area of Interest. The use of two different hyperspectral sensors, as well as the environmental variations between collections, allows for the comparative analysis of data quality and the iterative refinement of flight planning and collection parameters.
NASA Technical Reports Server (NTRS)
Gao, Feng; DeColstoun, Eric Brown; Ma, Ronghua; Weng, Qihao; Masek, Jeffrey G.; Chen, Jin; Pan, Yaozhong; Song, Conghe
2012-01-01
Cities have been expanding rapidly worldwide, especially over the past few decades. Mapping the dynamic expansion of impervious surface in both space and time is essential for an improved understanding of the urbanization process, land-cover and land-use change, and their impacts on the environment. Landsat and other medium-resolution satellites provide the necessary spatial details and temporal frequency for mapping impervious surface expansion over the past four decades. Since the US Geological Survey opened the historical record of the Landsat image archive for free access in 2008, the decades-old bottleneck of data limitation is gone. Remote-sensing scientists are now rich with data, and the challenge is how to make the best use of this precious resource. In this article, we develop an efficient algorithm to map the continuous expansion of impervious surface using a time series of four decades of medium-resolution satellite images. The algorithm is based on a supervised classification of the time-series image stack using a decision tree, in which each impervious class represents urbanization starting in a different image. The algorithm also allows us to remove inconsistent training samples, because impervious expansion is not reversible during the study period. The objective is to extract a time series of complete and consistent impervious surface maps from a corresponding time series of images collected from multiple sensors, with a minimal amount of image preprocessing effort. The approach was tested in the lower Yangtze River Delta region, one of the fastest-growing urban areas in China. Results from nearly four decades of medium-resolution satellite data from the Landsat Multispectral Scanner (MSS), Thematic Mapper (TM), Enhanced Thematic Mapper Plus (ETM+) and China-Brazil Earth Resources Satellite (CBERS) show an urbanization process consistent with economic development plans and policies. The time-series impervious spatial extent maps derived from this study agree well with an existing urban extent polygon data set that was developed independently. The overall mapping accuracy was estimated at about 92.5%, with 3% commission error and 12% omission error for the impervious type from all images regardless of image quality and initial spatial resolution.
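The core of the algorithm, supervised classification of a stacked image time series with a decision tree where each class encodes the conversion date, can be illustrated on synthetic pixel spectra. The sketch below uses scikit-learn; the irreversibility-based screening of training samples is approximated by a simple self-consistency filter, which is an assumption, not the paper's exact rule.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# X: one row per pixel, one column per image date (e.g. an NDVI stack);
# y: class label, where label k means the pixel first appears impervious
# in image k (0 = never impervious).
rng = np.random.default_rng(1)
n_pix, n_dates = 500, 12
X = rng.normal(0.6, 0.05, (n_pix, n_dates))        # vegetated background
y = rng.integers(0, n_dates + 1, n_pix)
for i, k in enumerate(y):
    if k > 0:
        # impervious from image k onward (irreversible conversion)
        X[i, k - 1:] = rng.normal(0.15, 0.05, n_dates - k + 1)

clf = DecisionTreeClassifier(max_depth=8).fit(X, y)

# One simple screening proxy for inconsistent training samples: drop
# samples the fitted tree itself misclassifies, then refit.
mask = clf.predict(X) == y
clf = DecisionTreeClassifier(max_depth=8).fit(X[mask], y[mask])
```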
Schulze, H Georg; Turner, Robin F B
2015-06-01
High-throughput information extraction from large numbers of Raman spectra is becoming an increasingly taxing problem due to the proliferation of new applications enabled using advances in instrumentation. Fortunately, in many of these applications, the entire process can be automated, yielding reproducibly good results with significant time and cost savings. Information extraction consists of two stages, preprocessing and analysis. We focus here on the preprocessing stage, which typically involves several steps, such as calibration, background subtraction, baseline flattening, artifact removal, smoothing, and so on, before the resulting spectra can be further analyzed. Because the results of some of these steps can affect the performance of subsequent ones, attention must be given to the sequencing of steps, the compatibility of these sequences, and the propensity of each step to generate spectral distortions. We outline here important considerations to effect full automation of Raman spectral preprocessing: what is considered full automation; putative general principles to effect full automation; the proper sequencing of processing and analysis steps; conflicts and circularities arising from sequencing; and the need for, and approaches to, preprocessing quality control. These considerations are discussed and illustrated with biological and biomedical examples reflecting both successful and faulty preprocessing.
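As a concrete illustration of step sequencing, the hedged sketch below chains three common preprocessing steps in one fixed, compatible order (spike removal before smoothing, so spikes are not smeared into their neighbours). It uses SciPy building blocks and a crude polynomial baseline; it is a generic example, not the authors' recommended pipeline.

```python
import numpy as np
from scipy.signal import savgol_filter, medfilt

def preprocess_raman(wavenumber, spectrum, baseline_order=3):
    """One compatible sequencing: spike removal -> baseline flattening ->
    smoothing. Smoothing before despiking would smear cosmic-ray spikes."""
    despiked = medfilt(spectrum, kernel_size=5)             # artifact removal
    coeffs = np.polyfit(wavenumber, despiked, baseline_order)
    flattened = despiked - np.polyval(coeffs, wavenumber)   # baseline subtraction
    return savgol_filter(flattened, window_length=11, polyorder=3)

wn = np.linspace(400, 1800, 700)
raw = (np.exp(-(wn - 1000) ** 2 / 2000)          # one broad Raman band
       + 0.001 * wn                              # sloping baseline
       + np.random.default_rng(4).normal(0, 0.02, wn.size))
raw[350] += 5.0                                  # cosmic-ray spike
clean = preprocess_raman(wn, raw)
```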
An Effective Measured Data Preprocessing Method in Electrical Impedance Tomography
Yu, Chenglong; Yue, Shihong; Wang, Jianpei; Wang, Huaxiang
2014-01-01
As an advanced process detection technology, electrical impedance tomography (EIT) has attracted wide attention and study in industrial fields. However, EIT techniques are greatly limited by low spatial resolution. This problem may result from incorrect preprocessing of the measured data and from the lack of a general criterion to evaluate different preprocessing schemes. In this paper, an EIT data preprocessing method based on taking fractional roots of all measured data ('rooting') is proposed and is evaluated by two indexes constructed from the rooted EIT measurements. By finding the optima of the two indexes, the proposed method can be applied to improve EIT imaging spatial resolution. In terms of a theoretical model, the optimal rooting exponents for the two indexes lie in [0.23, 0.33] and [0.22, 0.35], respectively. Moreover, the factors that affect the correctness of the proposed method are analyzed. Because preprocessing of measured data is necessary and helpful for any imaging process, the proposed method can be widely used in other imaging settings as well. Experimental results validate the two proposed indexes. PMID:25165735
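A minimal sketch of the rooting idea as we read it from the abstract: normalize the boundary measurements and raise them to a fractional power chosen inside the reported optimal ranges. The normalization and the exponent value are illustrative assumptions, not the paper's exact index construction.

```python
import numpy as np

def root_preprocess(measurements, p=0.28):
    """'Rooting' of EIT boundary measurements: raise the normalized data to
    a fractional power; p = 0.28 lies inside both optimal ranges reported
    above ([0.23, 0.33] and [0.22, 0.35])."""
    v = np.asarray(measurements, dtype=float)
    v_norm = v / v.max()      # scale to (0, 1] so rooting expands small values
    return v_norm ** p

# Toy boundary-voltage vector spanning several decades of magnitude
raw = np.array([1e-4, 5e-4, 2e-3, 1e-2, 0.1, 1.0])
print(root_preprocess(raw))   # dynamic range is strongly compressed
```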
Genetic Algorithm for Optimization: Preprocessing with n Dimensional Bisection and Error Estimation
NASA Technical Reports Server (NTRS)
Sen, S. K.; Shaykhian, Gholam Ali
2006-01-01
A knowledge of the appropriate values of the parameters of a genetic algorithm (GA), such as the population size, the shrunk search space containing the solution, and the crossover and mutation probabilities, is not available a priori for a general optimization problem. Recommended here is a polynomial-time preprocessing scheme that includes an n-dimensional bisection and that determines the foregoing parameters before deciding upon an appropriate GA for all problems of similar nature and type. Such preprocessing is not only fast but also enables us to obtain the global optimal solution and its reasonably narrow error bounds with a high degree of confidence.
Impact of Autocorrelation on Functional Connectivity
Arbabshirani, Mohammad R.; Damaraju, Eswar; Phlypo, Ronald; Plis, Sergey; Allen, Elena; Ma, Sai; Mathalon, Daniel; Preda, Adrian; Vaidya, Jatin G.; Adali, Tülay; Calhoun, Vince D.
2014-01-01
Although the impact of serial correlation (autocorrelation) in residuals of general linear models for fMRI time-series has been studied extensively, the effect of autocorrelation on functional connectivity studies has been largely neglected until recently. Some recent studies based on results from economics have questioned the conventional estimation of functional connectivity and argue that not correcting for autocorrelation in fMRI time-series results in “spurious” correlation coefficients. In this paper, first we assess the effect of autocorrelation on Pearson correlation coefficient through theoretical approximation and simulation. Then we present this effect on real fMRI data. To our knowledge this is the first work comprehensively investigating the effect of autocorrelation on functional connectivity estimates. Our results show that although FC values are altered, even following correction for autocorrelation, results of hypothesis testing on FC values remain very similar to those before correction. In real data we show this is true for main effects and also for group difference testing between healthy controls and schizophrenia patients. We further discuss model order selection in the context of autoregressive processes, effects of frequency filtering and propose a preprocessing pipeline for connectivity studies. PMID:25072392
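The effect of serial correlation on apparent connectivity is easy to reproduce. The simulation below is a generic sketch, not the paper's analysis: it correlates pairs of independent AR(1) series, so the true correlation is zero, yet the spread of the sample Pearson coefficient grows sharply with the autocorrelation parameter, which is what makes naive significance tests anticonservative.

```python
import numpy as np

rng = np.random.default_rng(42)

def ar1(n, phi, rng):
    """Generate one AR(1) series x[t] = phi * x[t-1] + noise."""
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.normal()
    return x

# Two *independent* series per trial: the true correlation is zero, but the
# sample Pearson correlation becomes far more variable as phi grows.
for phi in (0.0, 0.5, 0.9):
    r = [np.corrcoef(ar1(300, phi, rng), ar1(300, phi, rng))[0, 1]
         for _ in range(1000)]
    print(f"phi={phi}: sd of r = {np.std(r):.3f}")
# The sd grows with phi -> naive hypothesis tests overstate significance.
```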
NASA Astrophysics Data System (ADS)
Li, Yongbo; Xu, Minqiang; Wang, Rixin; Huang, Wenhu
2016-01-01
This paper presents a new rolling bearing fault diagnosis method based on local mean decomposition (LMD), improved multiscale fuzzy entropy (IMFE), the Laplacian score (LS) and an improved support vector machine based binary tree (ISVM-BT). When a fault occurs in rolling bearings, the measured vibration signal is a multi-component amplitude-modulated and frequency-modulated (AM-FM) signal. LMD, a new self-adaptive time-frequency analysis method, can decompose any complicated signal into a series of product functions (PFs), each of which is exactly a mono-component AM-FM signal. Hence, LMD is introduced to preprocess the vibration signal. Furthermore, IMFE, which is designed to avoid the inaccurate estimation of fuzzy entropy, can be utilized to quantify the complexity and self-similarity of a time series over a range of scales based on fuzzy entropy. In addition, the LS approach is introduced to refine the fault features by ranking the scale factors. Subsequently, the obtained features are fed into the multi-fault classifier ISVM-BT to automatically perform fault pattern identification. The experimental results validate the effectiveness of the methodology and demonstrate that the proposed algorithm can be applied to recognize different categories and severities of rolling bearing faults.
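To make the entropy step concrete, here is a simplified sketch of plain multiscale fuzzy entropy (coarse-graining plus exponential-membership similarity counts). It is not the paper's improved IMFE variant, and the tolerance handling is deliberately simplified.

```python
import numpy as np

def fuzzy_entropy(x, m=2, r=0.15, n=2):
    """Fuzzy entropy of a 1-D series: like sample entropy, but hard
    similarity counts are replaced by exp(-(d/r)^n) memberships."""
    x = np.asarray(x, float)
    tol = r * x.std()
    def phi(m):
        # Embed, then remove each template's own mean (the FuzzyEn hallmark)
        emb = np.array([x[i:i + m] for i in range(len(x) - m + 1)])
        emb = emb - emb.mean(axis=1, keepdims=True)
        d = np.abs(emb[:, None, :] - emb[None, :, :]).max(axis=2)  # Chebyshev
        sim = np.exp(-(d / tol) ** n)
        np.fill_diagonal(sim, 0.0)            # exclude self-matches
        return sim.sum() / (len(emb) * (len(emb) - 1))
    return np.log(phi(m)) - np.log(phi(m + 1))

def multiscale_fuzzy_entropy(x, scales=range(1, 11), **kw):
    """Coarse-grain the series at each scale, then compute fuzzy entropy."""
    x = np.asarray(x, float)
    out = []
    for s in scales:
        cg = x[:len(x) // s * s].reshape(-1, s).mean(axis=1)
        out.append(fuzzy_entropy(cg, **kw))
    return np.array(out)

vib = np.random.default_rng(6).normal(size=600)   # stand-in vibration signal
print(multiscale_fuzzy_entropy(vib, scales=range(1, 6)))
```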
Näreoja, Tuomas; Rosenholm, Jessica M; Lamminmäki, Urpo; Hänninen, Pekka E
2017-05-01
Thyrotropin, or thyroid-stimulating hormone (TSH), is used as a marker for thyroid function. More precise and more sensitive immunoassays are needed to facilitate continuous monitoring of thyroid dysfunctions and to assess the efficacy of the selected therapy and dosage of medication. Moreover, most thyroid diseases are autoimmune diseases, making TSH assays very prone to immunoassay interference due to autoantibodies in the sample matrix. We have developed a super-sensitive TSH immunoassay utilizing nanoparticle labels, with a detection limit of 60 nU L-1 in preprocessed serum samples, achieved by reducing nonspecific binding. The developed preprocessing step, based on affinity purification, removed interfering compounds and improved the recovery of spiked TSH from serum. The sensitivity enhancement was achieved by stabilization of the protein corona of the nanoparticle bioconjugates and a spot-coated configuration of the active solid phase, which reduced sedimentation of the nanoparticle bioconjugates and their contact time with the antibody-coated solid phase, thus making use of the higher association rate of specific binding due to high-avidity nanoparticle bioconjugates. Graphical Abstract: We were able to decrease the lowest limit of detection and increase the sensitivity of the TSH immunoassay using Eu(III) nanoparticles. The improvement was achieved by decreasing the binding time of nanoparticle bioconjugates via a small capture area and fast circular rotation. We also applied a step to stabilize the protein corona of the nanoparticles and a serum-preprocessing step with a structurally related antibody.
Sentiment analysis of feature ranking methods for classification accuracy
NASA Astrophysics Data System (ADS)
Joseph, Shashank; Mugauri, Calvin; Sumathy, S.
2017-11-01
Text pre-processing and feature selection are important and critical steps in text mining. Text pre-processing of large volumes of data is a difficult task, as unstructured raw data must be converted into a structured format. Traditional methods of processing and weighting took much time and were less accurate. To overcome this challenge, feature ranking techniques have been devised. A feature set from text preprocessing is fed as input to feature selection. Feature selection helps improve text classification accuracy. Of the three feature selection categories available, the filter category is the focus here. Five feature ranking methods, namely document frequency, standard deviation, information gain, chi-square, and weighted log-likelihood ratio, are analyzed.
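Of the five ranking methods, chi-square is the easiest to demonstrate end to end. The toy example below uses scikit-learn's chi2 on a bag-of-words matrix; the corpus and labels are invented for illustration.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import chi2

docs = ["great phone, love the battery",
        "terrible battery, broken screen",
        "love this screen, great value",
        "awful value, terrible phone"]
labels = np.array([1, 0, 1, 0])          # 1 = positive sentiment

vec = CountVectorizer()                  # text preprocessing -> feature set
X = vec.fit_transform(docs)
scores, _ = chi2(X, labels)              # filter-type feature ranking
ranked = sorted(zip(vec.get_feature_names_out(), scores),
                key=lambda t: -t[1])
print(ranked[:5])   # the highest-scoring terms discriminate classes best
```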
Epileptic Seizures Prediction Using Machine Learning Methods
Usman, Syed Muhammad
2017-01-01
Epileptic seizures occur due to a disorder in brain functionality that can affect the patient's health. Prediction of epileptic seizures before onset is quite useful for preventing the seizure by medication. Machine learning techniques and computational methods are used for predicting epileptic seizures from electroencephalogram (EEG) signals. However, preprocessing of EEG signals for noise removal and feature extraction are two major issues that have an adverse effect on both anticipation time and true positive prediction rate. Therefore, we propose a model that provides reliable methods for both preprocessing and feature extraction. Our model predicts epileptic seizures sufficiently ahead of onset and provides a better true positive rate. We have applied empirical mode decomposition (EMD) for preprocessing and have extracted time- and frequency-domain features for training a prediction model. The proposed model detects the start of the preictal state, the state that begins a few minutes before seizure onset, with a higher true positive rate (92.23%) than traditional methods, a maximum anticipation time of 33 minutes, and an average prediction time of 23.6 minutes on the scalp EEG CHB-MIT dataset of 22 subjects. PMID:29410700
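A sketch of the preprocessing/feature-extraction split described above, assuming the third-party PyEMD package for empirical mode decomposition; the feature choices here are generic time- and frequency-domain statistics, not necessarily those used by the authors.

```python
import numpy as np
from PyEMD import EMD   # assumption: the third-party PyEMD (EMD-signal) package

def emd_features(window, fs=256):
    """Decompose one EEG window into IMFs, then take simple time- and
    frequency-domain statistics of the leading IMFs as features."""
    imfs = EMD().emd(window)
    feats = []
    for imf in imfs[:4]:                       # leading (highest-frequency) IMFs
        spec = np.abs(np.fft.rfft(imf)) ** 2
        freqs = np.fft.rfftfreq(imf.size, 1 / fs)
        feats += [imf.var(),                   # time domain: power
                  np.sqrt(np.mean(imf ** 2)),  # time domain: RMS
                  freqs[np.argmax(spec)]]      # frequency domain: dominant freq
    return np.array(feats)

# Toy 2-second scalp-EEG-like window at 256 Hz
t = np.arange(0, 2, 1 / 256)
eeg = np.sin(2 * np.pi * 10 * t) + 0.5 * np.random.default_rng(12).normal(size=t.size)
print(emd_features(eeg).shape)
```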
Short-term PV/T module temperature prediction based on PCA-RBF neural network
NASA Astrophysics Data System (ADS)
Li, Jiyong; Zhao, Zhendong; Li, Yisheng; Xiao, Jing; Tang, Yunfeng
2018-02-01
Aiming at the non-linearity and large inertia of temperature control in PV/T systems, short-term temperature prediction of the PV/T module is proposed, so that the PV/T system controller can act ahead of time according to the short-term forecast and thereby optimize the control effect. Based on analysis of the correlation between PV/T module temperature, meteorological factors, and the temperatures at adjacent points of the time series, the principal component analysis (PCA) method is used to preprocess the original input sample data. Combined with RBF neural network theory, the simulation results show that the PCA step gives the network model higher prediction accuracy and stronger generalization performance than an RBF neural network without principal component extraction.
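A hedged sketch of the PCA-then-nonlinear-regression idea. Since scikit-learn has no RBF network, an RBF-kernel kernel ridge regressor stands in for it, and the input columns are invented for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.kernel_ridge import KernelRidge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical inputs: irradiance, ambient temperature, wind speed, humidity,
# plus the module temperature at the two previous time steps (the adjacent
# time-series terms mentioned above).
rng = np.random.default_rng(3)
X = rng.normal(size=(500, 6))
y = 25 + 2.0 * X[:, 0] + 1.2 * X[:, 1] + 0.8 * X[:, 4] + rng.normal(0, 0.5, 500)

model = make_pipeline(StandardScaler(),
                      PCA(n_components=3),    # strip correlated inputs first
                      KernelRidge(kernel="rbf", alpha=0.1, gamma=0.5))
model.fit(X[:400], y[:400])
print(model.score(X[400:], y[400:]))          # held-out R^2
```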
SpcAudace: Spectroscopic processing and analysis package of Audela software
NASA Astrophysics Data System (ADS)
Mauclaire, Benjamin
2017-11-01
SpcAudace processes long-slit spectra with automated pipelines and performs astrophysical analysis of the resulting data. These powerful pipelines carry out all the required steps in one pass: standard preprocessing, masking of bad pixels, geometric corrections, registration, optimized spectrum extraction, wavelength calibration, and instrumental response computation and correction. Both high- and low-resolution long-slit spectra are managed, for stellar and non-stellar targets. Many types of publication-quality figures can be easily produced: pdf and png plots or annotated time series plots. Astrophysical quantities can be derived from individual spectra or from large numbers of spectra with advanced functions: from line profile characteristics to equivalent width and periodogram. More than 300 documented functions are available and can be used in TCL scripts for automation. SpcAudace is based on the Audela open source software.
NASA Astrophysics Data System (ADS)
Yakunin, A. G.; Hussein, H. M.
2017-08-01
An example is given of an information-measuring system for climate monitoring and operational control of energy resource consumption on a university campus, which has been operating at the Altai State Technical University since 2009. The advantages of using such systems for studying various physical processes are discussed, and general principles of construction of similar systems, together with their software, hardware and algorithmic support, are considered. It is shown that their fundamental difference from traditional SCADA systems is the use of databases with a specialized data structure for storing the observation results, and preprocessing of the input signal for compression. Another difference is the absence of clear criteria for detecting anomalies in the time series of the observed process. Examples of algorithms that solve this problem are given.
ERIC Educational Resources Information Center
Cechinel, Cristian
2014-01-01
This work presents a quantitative study of the use of a Learning Management System (LMS) by the professors of a distance learning course, focused on the guidance given for the students' Final Undergraduate Project. Data taken from the logs of 34 professors in two distinct virtual rooms were collected. After pre-processing the data, a series of…
Implementation of cryptographic hash function SHA256 in C++
NASA Astrophysics Data System (ADS)
Shrivastava, Akash
2012-02-01
This abstract explains the implementation of SHA-256 (Secure Hash Algorithm 256) using C++. SHA-2 is a strong hashing algorithm family used in almost all kinds of security applications. The algorithm consists of two phases: preprocessing and hash computation. Preprocessing involves padding a message, parsing the padded message into m-bit blocks, and setting initialization values to be used in the hash computation. The computation generates a message schedule from the padded message and uses that schedule, along with functions, constants, and word operations, to iteratively generate a series of hash values. The final hash value generated by the computation is used to determine the message digest. SHA-2 includes a significant number of changes from its predecessor, SHA-1, and consists of a set of four hash functions with digests that are 224, 256, 384 or 512 bits. SHA-256 outputs a 256-bit digest, with an internal state of 256 bits and a block size of 512 bits. The maximum message length is 2^64 - 1 bits, and the digest is computed over a series of 64 rounds consisting of several operations such as AND, OR, XOR, SHR and ROT. The code provides a clear understanding of the hash algorithm and generates hash values to retrieve the message digest.
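The preprocessing phase described above (padding and parsing into 512-bit blocks) can be shown in a few lines. The sketch below implements only the padding rule and checks the final digest against the standard library; it is Python rather than the C++ of the paper.

```python
import hashlib
import struct

def sha256_pad(message: bytes) -> bytes:
    """SHA-256 preprocessing: append 0x80, pad with zeros until the length
    is 56 mod 64, then append the original bit length as a 64-bit big-endian
    integer; the result parses cleanly into 512-bit (64-byte) blocks."""
    bit_len = 8 * len(message)            # must be < 2**64, per the standard
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded)) % 64)
    padded += struct.pack(">Q", bit_len)
    assert len(padded) % 64 == 0
    return padded

msg = b"abc"
print(len(sha256_pad(msg)))               # 64: exactly one 512-bit block
print(hashlib.sha256(msg).hexdigest())    # digest from the standard library
```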
NASA Astrophysics Data System (ADS)
Khai Tiu, Ervin Shan; Huang, Yuk Feng; Ling, Lloyd
2018-03-01
An accurate streamflow forecasting model is important for the development of a flood mitigation plan, to ensure sustainable development of a river basin. This study adopted the Variational Mode Decomposition (VMD) data-preprocessing technique to process and denoise the rainfall data before feeding it into the Support Vector Machine (SVM) streamflow forecasting model, in order to improve the model's performance. Rainfall data and river water level data for the period 1996-2016 were used for this purpose. Homogeneity tests (Standard Normal Homogeneity Test, the Buishand Range Test, the Pettitt Test and the Von Neumann Ratio Test) and normality tests (Shapiro-Wilk Test, Anderson-Darling Test, Lilliefors Test and Jarque-Bera Test) were carried out on the rainfall series. All stations were found to have homogeneous but non-normally distributed data. From the recorded rainfall data, it was observed that the Dungun River Basin receives higher monthly rainfall from November to February, during the Northeast Monsoon. Thus, the monthly and seasonal rainfall series of this monsoon are the main focus of this research, as floods usually happen during the Northeast Monsoon period. The water levels predicted by the SVM model were assessed against the observed water levels using non-parametric statistical tests (the Biased Method, Kendall's Tau B Test and Spearman's Rho Test).
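The normality-test battery mentioned above maps directly onto library calls. A sketch using SciPy and statsmodels (the Lilliefors test living in statsmodels is an assumption about the toolchain; the rainfall series here is synthetic):

```python
import numpy as np
from scipy import stats
from statsmodels.stats.diagnostic import lilliefors  # assumption: statsmodels

# Synthetic stand-in for 20 years of monthly rainfall totals (right-skewed)
rain = np.random.default_rng(7).gamma(shape=2.0, scale=30.0, size=240)

print(stats.shapiro(rain))               # Shapiro-Wilk: statistic, p-value
print(stats.anderson(rain, dist="norm")) # Anderson-Darling vs critical values
print(lilliefors(rain, dist="norm"))     # Lilliefors (KS, estimated params)
print(stats.jarque_bera(rain))           # Jarque-Bera on skewness/kurtosis
# All four should reject normality for this skewed series.
```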
Run-time parallelization and scheduling of loops
NASA Technical Reports Server (NTRS)
Saltz, Joel H.; Mirchandaney, Ravi; Crowley, Kay
1991-01-01
Run-time methods are studied to automatically parallelize and schedule iterations of a do loop in certain cases where compile-time information is inadequate. The methods presented involve execution time preprocessing of the loop. At compile-time, these methods set up the framework for performing a loop dependency analysis. At run-time, wavefronts of concurrently executable loop iterations are identified. Using this wavefront information, loop iterations are reordered for increased parallelism. Symbolic transformation rules are used to produce: inspector procedures that perform execution time preprocessing, and executors or transformed versions of source code loop structures. These transformed loop structures carry out the calculations planned in the inspector procedures. Performance results are presented from experiments conducted on the Encore Multimax. These results illustrate that run-time reordering of loop indexes can have a significant impact on performance.
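The inspector/executor split can be sketched for a loop a[i] = f(a[idx[i]]) whose index array idx is only known at run time. This Python sketch shows the logic only; a real executor would dispatch each wavefront to parallel workers.

```python
def inspector(idx):
    """Wavefront assignment: iteration i must wait for iteration idx[i]
    when idx[i] < i (it reads a value written earlier in the loop)."""
    wave = [0] * len(idx)
    for i, j in enumerate(idx):
        if j < i:
            wave[i] = wave[j] + 1
    fronts = {}
    for i, w in enumerate(wave):
        fronts.setdefault(w, []).append(i)
    return [fronts[w] for w in sorted(fronts)]

def executor(a, idx, fronts):
    """Execute wavefronts in order. Reads of not-yet-written slots go to a
    saved copy, so iterations inside one front are mutually independent
    and could be handed to parallel workers."""
    a_old = list(a)
    for front in fronts:
        for i in front:                      # this inner loop is parallelizable
            src = a[idx[i]] if idx[i] < i else a_old[idx[i]]
            a[i] = src + 1                   # stand-in for the loop body f(...)
    return a

a = [0] * 8
idx = [0, 0, 1, 1, 3, 2, 5, 4]               # run-time dependency structure
print(executor(a, idx, inspector(idx)))      # matches the sequential loop
```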
NASA Astrophysics Data System (ADS)
Sharma, Sanjib; Siddique, Ridwan; Reed, Seann; Ahnert, Peter; Mendoza, Pablo; Mejia, Alfonso
2018-03-01
The relative roles of statistical weather preprocessing and streamflow postprocessing in hydrological ensemble forecasting at short- to medium-range forecast lead times (days 1-7) are investigated. For this purpose, a regional hydrologic ensemble prediction system (RHEPS) is developed and implemented. The RHEPS is comprised of the following components: (i) hydrometeorological observations (multisensor precipitation estimates, gridded surface temperature, and gauged streamflow); (ii) weather ensemble forecasts (precipitation and near-surface temperature) from the National Centers for Environmental Prediction 11-member Global Ensemble Forecast System Reforecast version 2 (GEFSRv2); (iii) NOAA's Hydrology Laboratory-Research Distributed Hydrologic Model (HL-RDHM); (iv) heteroscedastic censored logistic regression (HCLR) as the statistical preprocessor; (v) two statistical postprocessors, an autoregressive model with a single exogenous variable (ARX(1,1)) and quantile regression (QR); and (vi) a comprehensive verification strategy. To implement the RHEPS, 1- to 7-day weather forecasts from the GEFSRv2 are used to force HL-RDHM and generate raw ensemble streamflow forecasts. Forecasting experiments are conducted in four nested basins in the US Middle Atlantic region, ranging in size from 381 to 12 362 km2. Results show that the HCLR-preprocessed ensemble precipitation forecasts have greater skill than the raw forecasts. These improvements are more noticeable in the warm season at the longer lead times (> 3 days). Both postprocessors, ARX(1,1) and QR, show gains in skill relative to the raw ensemble streamflow forecasts, particularly in the cool season, but QR outperforms ARX(1,1). The scenarios that implement preprocessing and postprocessing separately tend to perform similarly, although the postprocessing-alone scenario is often more effective. The scenario involving both preprocessing and postprocessing consistently outperforms the other scenarios. In some cases, however, the differences between this scenario and the scenario with postprocessing alone are not significant. We conclude that implementing both preprocessing and postprocessing ensures the most skill improvements, but postprocessing alone can often be a competitive alternative.
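Of the two postprocessors, quantile regression is the simplest to sketch. The example below is a deliberately reduced stand-in for the paper's QR postprocessor: it calibrates three quantiles of observed flow against a raw forecast using statsmodels, with invented data.

```python
import numpy as np
import statsmodels.api as sm

# Postprocess a raw streamflow forecast: regress observed flow on the raw
# forecast at several quantiles to obtain a calibrated predictive spread.
rng = np.random.default_rng(11)
raw = rng.gamma(3.0, 50.0, 400)                    # raw model forecast
obs = 0.8 * raw + rng.normal(0, 10 + 0.1 * raw)    # heteroscedastic errors

X = sm.add_constant(raw)
for q in (0.1, 0.5, 0.9):
    fit = sm.QuantReg(obs, X).fit(q=q)
    print(q, fit.params)    # intercept and slope per quantile: the spread
                            # between the 0.1 and 0.9 fits widens with flow
```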
Towards a global-scale ambient noise cross-correlation data base
NASA Astrophysics Data System (ADS)
Ermert, Laura; Fichtner, Andreas; Sleeman, Reinoud
2014-05-01
We aim to obtain a global-scale data base of ambient seismic noise correlations. This database - to be made publicly available at ORFEUS - will enable us to study the distribution of microseismic and hum sources, and to perform multi-scale full waveform inversion for crustal and mantle structure. Ambient noise tomography has developed into a standard technique. According to theory, cross-correlations equal inter-station Green's functions only if the wave field is equipartitioned or the sources are isotropically distributed. In an attempt to circumvent these assumptions, we aim to investigate possibilities to directly model noise cross-correlations and invert for their sources using adjoint techniques. A database containing correlations of 'gently' preprocessed noise - excluding preprocessing steps, such as spectral whitening, that are explicitly taken to reduce the influence of a non-isotropic source distribution - is a key ingredient in this undertaking. Raw data are acquired from IRIS/FDSN and ORFEUS. We preprocess and correlate the time series using a tool based on the Python package Obspy, which is run in parallel on a cluster of the Swiss National Supercomputing Centre. Correlation is done in two ways: besides the classical cross-correlation function, the phase cross-correlation is calculated, which is an amplitude-independent measure of waveform similarity and therefore insensitive to high-energy events. Besides linear stacks of these correlations, instantaneous phase stacks are calculated, which can be applied as optional weights, enhancing coherent portions of the traces and facilitating the emergence of a meaningful signal. The _STS1 virtual network by IRIS contains about 250 globally distributed stations, several of which have been operating for more than 20 years. It is the first data collection we will use for correlations in the hum frequency range, as the STS-1 instrument response is flat over the largest part of the period range where hum is observed, up to a period of about 300 seconds. These stations thus provide us with the best-suited measurements for hum.
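The instantaneous phase stack mentioned above can be sketched with the analytic signal. The function below is a generic Schimmel-and-Paulssen-style phase-weighted stack, not the project's production code; the exponent nu and the synthetic input are illustrative.

```python
import numpy as np
from scipy.signal import hilbert

def phase_weighted_stack(correlations, nu=2):
    """Weight the linear stack of single-day noise correlations by the
    coherence of their instantaneous phases: coherent lags are kept,
    incoherent ones are suppressed."""
    analytic = hilbert(correlations, axis=1)
    phasors = analytic / np.abs(analytic)            # unit instantaneous phase
    coherence = np.abs(phasors.mean(axis=0)) ** nu   # 1 where all traces agree
    return correlations.mean(axis=0) * coherence

# daily: (n_days, n_lags) array of single-day cross-correlation functions
daily = np.random.default_rng(5).normal(size=(30, 2001))
daily += np.exp(-(np.arange(2001) - 1200) ** 2 / 50.0)   # coherent arrival
stacked = phase_weighted_stack(daily)
```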
NASA Astrophysics Data System (ADS)
Di Piazza, A.; Cordano, E.; Eccel, E.
2012-04-01
The issue of climate change detection is considered a major challenge. In particular, high temporal resolution climate change scenarios are required in the evaluation of the effects of climate change on agricultural management (crop suitability, yields, risk assessment, etc.), energy production and water management. In this work, a "Weather Generator" technique was used for downscaling climate change scenarios for temperature. An R package (RMAWGEN, Cordano and Eccel, 2011 - available on http://cran.r-project.org) was developed aiming to generate synthetic daily weather conditions by using the theory of vectorial auto-regressive models (VAR). The VAR model was chosen for its ability to maintain the temporal and spatial correlations among variables. In particular, observed time series of daily maximum and minimum temperature are transformed into "new" normally-distributed variable time series, which are used to calibrate the parameters of a VAR model by ordinary least squares methods. The implemented algorithm, applied to monthly mean climatic values downscaled from Global Climate Model predictions, can therefore generate several stochastic daily scenarios in which the statistical consistency among series is preserved. Further details are given in the RMAWGEN documentation. An application is presented here using a dataset of daily temperature time series recorded at 41 different sites of the Trentino region for the period 1958-2010. Temperature time series were pre-processed to fill missing values (by a site-specific calibrated Inverse Distance Weighting algorithm, corrected with elevation) and to remove inhomogeneities. Several climatic indices, useful for impact assessment applications, were taken into account, and their time trends within the series were analyzed. The indices range from the more classical ones, such as annual mean temperatures, seasonal mean temperatures and their anomalies (from the reference period 1961-1990), to the climate change indices selected from the list recommended by the World Meteorological Organization Commission for Climatology (WMO-CCL) and the Research Programme on Climate Variability and Predictability (CLIVAR) project's Expert Team on Climate Change Detection, Monitoring and Indices (ETCCDMI). Each index was applied to both the observed (and processed) data and the synthetic time series produced by the Weather Generator over the thirty-year reference period 1981-2010, in order to validate the procedure. Climate projections were statistically downscaled for a selection of sites for the two periods 2021-2050 and 2071-2099 of the European project "Ensembles" multi-model output (scenario A1B). The use of several climatic indices strengthens the trend analysis of both the generated synthetic series and future climate projections.
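The VAR-based generation step can be sketched in Python (the package itself is in R). Below, a VAR is fitted with statsmodels to an already Gaussianized pair of series and then iterated forward with fresh Gaussian innovations; the intercept is omitted because the toy anomalies are zero-mean, which is a simplifying assumption of this sketch.

```python
import numpy as np
from statsmodels.tsa.api import VAR

# Fit a VAR to a pair of (already Gaussianized) daily temperature anomaly
# series, then simulate a synthetic pair preserving lagged cross-correlations.
rng = np.random.default_rng(13)
n = 2000
e = rng.multivariate_normal([0, 0], [[1.0, 0.8], [0.8, 1.0]], n)
data = np.zeros((n, 2))
for t in range(1, n):
    data[t] = 0.7 * data[t - 1] + e[t]          # correlated AR(1) pair

fit = VAR(data).fit(maxlags=3, ic="aic")
A = fit.coefs                                   # (p, 2, 2) lag matrices
chol = np.linalg.cholesky(fit.sigma_u)          # innovation covariance factor
p = A.shape[0]

synth = np.zeros((n, 2))
synth[:p] = data[:p]
for t in range(p, n):
    synth[t] = (sum(A[k] @ synth[t - 1 - k] for k in range(p))
                + chol @ rng.normal(size=2))    # fresh innovations
```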
Satellite-motion Compensation for Monitoring Travelling Ionospheric Disturbances (TIDs) Using GPS
NASA Astrophysics Data System (ADS)
Jackson-Booth, N.; Penney, R.
2016-12-01
The ionosphere exerts a strong influence over a wide range of modern communications and navigation systems, but is subject to complex influences from both terrestrial and solar sources. Ionospheric disturbances can be triggered by lower-atmosphere phenomena such as hurricanes and by geophysical events such as earthquakes, and are also strongly influenced by cyclical and unpredictable solar behaviour. Dual-band GPS receivers provide a popular and convenient means of obtaining information about the ionosphere and ionospheric disturbances. While GPS measurements can provide clues about the state of the ionosphere, there are many challenges in obtaining reliable information from them. For example, drop-outs and carrier-phase cycle slips may have little influence on using GPS for (medium-precision) navigation, but can lead to signal-processing artefacts that would cause false alarms in detecting ionospheric disturbances. If one is interested in measuring the motion of travelling ionospheric disturbances (TIDs), one must also be able to disentangle the effects of satellite motion from the TID motion. We discuss a novel approach to robustly separating TID waveforms from background trends within GPS time series of total electron content (TEC), as well as innovative techniques for estimating TID velocities using ideas from Synthetic Aperture Radar (SAR). Underpinning these, we consider how to robustly pre-process GPS time series to reduce the influence of drop-outs while also reducing data volumes. We present comparisons of our TID velocity estimates with more standard "cross-correlation" techniques, including cases where these standard techniques produce pathological results. We also show results from simulated GPS time series derived from modelled ionospheric disturbances.
NASA Astrophysics Data System (ADS)
Trauth, N.; Schmidt, C.; Munz, M.
2016-12-01
Heat as a natural tracer to quantify water fluxes between groundwater and surface water has evolved into a standard hydrological method. Typically, time series of temperatures in the surface water and in the sediment are observed and are subsequently evaluated with a vertical 1-D representation of heat transport by advection and dispersion. Several analytical solutions, as well as their implementations in user-friendly software, exist for estimating water fluxes from the observed temperatures. Analytical solutions can be easily implemented, but assumptions about the boundary conditions have to be made a priori, e.g. a sinusoidal upper temperature boundary. Numerical models offer more flexibility and can handle temperature data characterized by irregular variations, such as storm-event-induced temperature changes, which cannot readily be incorporated in analytical solutions. This also reduces the effort of data preprocessing, such as the extraction of the diurnal temperature variation. We developed software to estimate water FLUXes Based On Temperatures - FLUX-BOT. FLUX-BOT is a numerical code written in MATLAB that calculates vertical water fluxes in saturated sediments, based on the inversion of measured temperature time series observed at multiple depths. It applies a cell-centered Crank-Nicolson implicit finite difference scheme to solve the one-dimensional heat advection-conduction equation. Besides its core inverse numerical routines, FLUX-BOT includes functions for visualizing the results and for performing uncertainty analysis. We provide applications of FLUX-BOT to generic as well as to measured temperature data to demonstrate its performance.
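The numerical core, a Crank-Nicolson step for the 1-D advection-conduction equation with measured temperatures as fixed boundary nodes, can be sketched in dense form. FLUX-BOT itself is MATLAB; this NumPy version with illustrative parameter values is only a sketch.

```python
import numpy as np

def crank_nicolson_step(T, v, D, dz, dt):
    """One Crank-Nicolson step of dT/dt = D*T_zz - v*T_z on a 1-D grid,
    with fixed (measured) temperatures at the top and bottom nodes."""
    n = len(T)
    a = D / dz**2 + v / (2 * dz)      # sub-diagonal weight
    b = -2 * D / dz**2                # diagonal weight
    c = D / dz**2 - v / (2 * dz)      # super-diagonal weight
    L = np.zeros((n, n))
    for i in range(1, n - 1):
        L[i, i - 1], L[i, i], L[i, i + 1] = a, b, c
    I = np.eye(n)
    A = I - 0.5 * dt * L              # implicit half step
    B = I + 0.5 * dt * L              # explicit half step
    # Boundary rows of L are zero, so T[0] and T[-1] stay fixed.
    return np.linalg.solve(A, B @ T)

z = np.linspace(0, 0.5, 51)                  # 50 cm sediment column
T = 12 + 3 * np.exp(-z / 0.1)                # initial temperature profile (C)
for _ in range(100):                         # 100 minutes at dt = 60 s
    T = crank_nicolson_step(T, v=1e-6, D=1e-7, dz=z[1] - z[0], dt=60.0)
```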
NASA Astrophysics Data System (ADS)
Wijesingha, J. S. J.; Deshapriya, N. L.; Samarakoon, L.
2015-04-01
Billions of people in the world depend on rice as a staple food and as an income-generating crop. Asia is the leader in rice cultivation, and it is necessary to maintain an up-to-date rice-related database to ensure food security as well as economic development. This study investigates the general applicability of the high temporal resolution Moderate Resolution Imaging Spectroradiometer (MODIS) 250 m gridded vegetation product for monitoring rice crop growth, mapping rice crop acreage and analyzing crop yield at the province level. The MODIS 250 m Normalized Difference Vegetation Index (NDVI) and Enhanced Vegetation Index (EVI) time series data, field data and crop calendar information were utilized in this research in Sa Kaeo Province, Thailand. The following methodology was used: (1) data pre-processing and rice plant growth analysis using Vegetation Indices (VI); (2) extraction of rice acreage and start-of-season dates from VI time series data; (3) accuracy assessment; and (4) yield analysis with MODIS VI. The results show a direct relationship between rice plant height and MODIS VI. The crop calendar information and the NDVI time series smoothed with the Whittaker Smoother gave accurate rice acreage estimates (86% area accuracy and 75% classification accuracy). Point-level yield analysis showed that MODIS EVI is highly correlated with rice yield, and yield prediction using the maximum EVI in the rice cycle predicted yield with an average prediction error of 4.2%. This study shows the immense potential of the MODIS gridded vegetation product for keeping an up-to-date Geographic Information System of rice cultivation.
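The Whittaker smoother used for the NDVI series has a compact sparse-matrix form (Eilers 2003): minimize ||y - z||^2 + lam*||Dz||^2, which gives the linear system (I + lam*D'D)z = y. A sketch, with an invented noisy NDVI-like series:

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve

def whittaker_smooth(y, lam=500.0, d=2):
    """Whittaker smoother: solve (I + lam * D'D) z = y, where D is the
    d-th order difference operator; lam controls the smoothness."""
    n = len(y)
    D = sparse.eye(n, format="csr")
    for _ in range(d):
        D = D[1:] - D[:-1]                 # build the d-th difference matrix
    A = sparse.eye(n, format="csc") + lam * (D.T @ D)
    return spsolve(A.tocsc(), y)

# Noisy annual NDVI-like trajectory sampled every 4 days (92 points)
rng = np.random.default_rng(2)
ndvi = np.clip(np.sin(np.linspace(0, 3 * np.pi, 92)) + rng.normal(0, 0.15, 92),
               -1, 1)
smooth = whittaker_smooth(ndvi, lam=200.0)
```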
Ngan, Shing-Chung; Hu, Xiaoping; Khong, Pek-Lan
2011-03-01
We propose a method for preprocessing event-related functional magnetic resonance imaging (fMRI) data that can lead to enhancement of template-free activation detection. The method is based on using a figure of merit to guide the wavelet shrinkage of a given fMRI data set. Several previous studies have demonstrated that in the root-mean-square error setting, wavelet shrinkage can improve the signal-to-noise ratio of fMRI time courses. However, preprocessing fMRI data in the root-mean-square error setting does not necessarily lead to enhancement of template-free activation detection. Motivated by this observation, in this paper, we move to the detection setting and investigate the possibility of using wavelet shrinkage to enhance template-free activation detection of fMRI data. The main ingredients of our method are (i) forward wavelet transform of the voxel time courses, (ii) shrinking the resulting wavelet coefficients as directed by an appropriate figure of merit, (iii) inverse wavelet transform of the shrunk data, and (iv) submitting these preprocessed time courses to a given activation detection algorithm. Two figures of merit are developed in the paper, and two other figures of merit adapted from the literature are described. Receiver-operating characteristic analyses with simulated fMRI data showed quantitative evidence that data preprocessing as guided by the figures of merit developed in the paper can yield improved detectability of the template-free measures. We also demonstrate the application of our methodology on an experimental fMRI data set. The proposed method is useful for enhancing template-free activation detection in event-related fMRI data. It is of significant interest to extend the present framework to produce comprehensive, adaptive and fully automated preprocessing of fMRI data optimally suited for subsequent data analysis steps.
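The three transform steps (forward wavelet transform, coefficient shrinkage, inverse transform) map onto a few PyWavelets calls. In this sketch a standard universal threshold plays the role of the figure of merit; the paper's detection-oriented figures of merit would replace that rule.

```python
import numpy as np
import pywt  # assumption: the PyWavelets package

def wavelet_shrink(ts, wavelet="db4", level=4, k=1.0):
    """Soft-threshold the detail coefficients of a voxel time course.
    The universal threshold below is a generic stand-in for the paper's
    figure-of-merit-guided shrinkage rule."""
    coeffs = pywt.wavedec(ts, wavelet, level=level)          # forward transform
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745           # noise estimate
    thr = k * sigma * np.sqrt(2 * np.log(len(ts)))
    shrunk = [coeffs[0]] + [pywt.threshold(c, thr, mode="soft")
                            for c in coeffs[1:]]             # shrinkage
    return pywt.waverec(shrunk, wavelet)[:len(ts)]           # inverse transform

ts = np.sin(np.arange(256) / 8.0) + np.random.default_rng(10).normal(0, 0.5, 256)
denoised = wavelet_shrink(ts)
```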
Trajectory data analyses for pedestrian space-time activity study.
Qi, Feng; Du, Fei
2013-02-25
It is well recognized that human movement in the spatial and temporal dimensions has direct influence on disease transmission(1-3). An infectious disease typically spreads via contact between infected and susceptible individuals in their overlapped activity spaces. Therefore, daily mobility-activity information can be used as an indicator to measure exposures to risk factors of infection. However, a major difficulty and thus the reason for paucity of studies of infectious disease transmission at the micro scale arise from the lack of detailed individual mobility data. Previously in transportation and tourism research detailed space-time activity data often relied on the time-space diary technique, which requires subjects to actively record their activities in time and space. This is highly demanding for the participants and collaboration from the participants greatly affects the quality of data(4). Modern technologies such as GPS and mobile communications have made possible the automatic collection of trajectory data. The data collected, however, is not ideal for modeling human space-time activities, limited by the accuracies of existing devices. There is also no readily available tool for efficient processing of the data for human behavior study. We present here a suite of methods and an integrated ArcGIS desktop-based visual interface for the pre-processing and spatiotemporal analyses of trajectory data. We provide examples of how such processing may be used to model human space-time activities, especially with error-rich pedestrian trajectory data, that could be useful in public health studies such as infectious disease transmission modeling. The procedure presented includes pre-processing, trajectory segmentation, activity space characterization, density estimation and visualization, and a few other exploratory analysis methods. Pre-processing is the cleaning of noisy raw trajectory data. We introduce an interactive visual pre-processing interface as well as an automatic module. Trajectory segmentation(5) involves the identification of indoor and outdoor parts from pre-processed space-time tracks. Again, both interactive visual segmentation and automatic segmentation are supported. Segmented space-time tracks are then analyzed to derive characteristics of one's activity space such as activity radius etc. Density estimation and visualization are used to examine large amount of trajectory data to model hot spots and interactions. We demonstrate both density surface mapping(6) and density volume rendering(7). We also include a couple of other exploratory data analyses (EDA) and visualizations tools, such as Google Earth animation support and connection analysis. The suite of analytical as well as visual methods presented in this paper may be applied to any trajectory data for space-time activity studies.
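Density estimation over cleaned trajectory points is the most self-contained of the listed steps. A sketch with SciPy's Gaussian kernel density estimator on invented (x, y) fixes, producing a grid suitable for surface mapping:

```python
import numpy as np
from scipy.stats import gaussian_kde

# Stack all pre-processed (x, y) trajectory fixes; two invented hot spots
# stand in for, e.g., home and work activity clusters.
rng = np.random.default_rng(8)
pts = np.vstack([rng.normal(0, 1.0, (300, 2)),
                 rng.normal(5, 0.5, (150, 2))]).T   # shape (2, n) for the KDE

kde = gaussian_kde(pts)
gx, gy = np.mgrid[-3:7:100j, -3:7:100j]             # evaluation grid
density = kde(np.vstack([gx.ravel(), gy.ravel()])).reshape(gx.shape)
# 'density' can now be rendered as a hot-spot surface or volume.
```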
Micro-Analyzer: automatic preprocessing of Affymetrix microarray data.
Guzzi, Pietro Hiram; Cannataro, Mario
2013-08-01
A current trend in genomics is the investigation of the cell mechanism using different technologies, in order to explain the relationship among genes, molecular processes and diseases. For instance, the combined use of gene-expression arrays and genomic arrays has been demonstrated as an effective instrument in clinical practice. Consequently, in a single experiment different kinds of microarrays may be used, resulting in the production of different types of binary data (images and textual raw data). The analysis of microarray data requires an initial preprocessing phase that makes raw data suitable for use on existing analysis platforms, such as the TIGR M4 (TM4) Suite. An additional challenge to be faced by emerging data analysis platforms is the ability to treat in a combined way those different microarray formats coupled with clinical data. In fact, the resulting integrated data may include both numerical and symbolic data (e.g. gene expression and SNPs regarding molecular data), as well as temporal data (e.g. the response to a drug, time to progression and survival rate) regarding clinical data. Raw data preprocessing is a crucial step in analysis but is often performed in a manual and error-prone way using different software tools. Thus novel, platform-independent, and possibly open source tools enabling the semi-automatic preprocessing and annotation of different microarray data are needed. The paper presents Micro-Analyzer (Microarray Analyzer), a cross-platform tool for the automatic normalization, summarization and annotation of Affymetrix gene expression and SNP binary data. It represents the evolution of the μ-CS tool, extending the preprocessing to SNP arrays, which were not supported in μ-CS. Micro-Analyzer is provided as a Java standalone tool and enables users to read, preprocess and analyse binary microarray data (gene expression and SNPs) by invoking the TM4 platform. It avoids: (i) the manual invocation of external tools (e.g. the Affymetrix Power Tools), (ii) the manual loading of preprocessing libraries, and (iii) the management of intermediate files, such as results and metadata. Micro-Analyzer users can directly manage Affymetrix binary data without worrying about locating and invoking the proper preprocessing tools and chip-specific libraries. Moreover, users of the Micro-Analyzer tool can load the preprocessed data directly into the well-known TM4 platform, extending in such a way the TM4 capabilities as well. Consequently, Micro-Analyzer offers the following advantages: (i) it reduces possible errors in the preprocessing and further analysis phases, e.g. due to the incorrect choice of parameters or the use of old libraries, (ii) it enables the combined and centralized pre-processing of different arrays, (iii) it may enhance the quality of further analysis by storing the workflow, i.e. information about the preprocessing steps, and (iv) it is freely available as a standalone application at the project web site http://sourceforge.net/projects/microanalyzer/.
Data preprocessing for a vehicle-based localization system used in road traffic applications
NASA Astrophysics Data System (ADS)
Patelczyk, Timo; Löffler, Andreas; Biebl, Erwin
2016-09-01
This paper presents a fixed-point implementation, on a field programmable gate array (FPGA), of the preprocessing required for multipath joint angle and delay estimation (JADE) in road traffic applications, and thereby lays the foundation for many model-based parameter estimation methods. Here, a simulation of a vehicle-based localization system for protecting vulnerable road users, who are equipped with appropriate transponders, is considered. For such safety-critical applications, the robustness and real-time capability of the localization are particularly important. An additional motivation for using a fixed-point implementation of the data preprocessing is the limited computing power of a vehicle's head unit. This study processes the raw data provided by the localization system considered here. The data preprocessing applied includes a wideband calibration of the physical localization system, separation of relevant information from the received sampled signal, and preparation of the incoming data for further processing. Furthermore, a channel matrix estimation was implemented to complete the data preprocessing; the channel matrix contains information on the channel parameters, e.g., the positions of the objects to be located. In the presented case of a vehicle-based localization system we assume an urban environment, in which multipath propagation occurs. Since most methods for localization are based on uncorrelated signals, this fact must be addressed; hence, decorrelation of the incoming data stream is required before further localization. This decorrelation was accomplished by considering several snapshots in different time slots. As a final aspect of the use of fixed-point arithmetic, quantization errors are considered. In addition, the resources and runtime of the presented implementation are discussed; these factors are strongly linked to a practical implementation.
Oh, Sunghee; Song, Seongho
2017-01-01
In gene expression profiling, the data analysis pipeline is categorized into four major downstream tasks, i.e., (1) identification of differential expression; (2) clustering of co-expression patterns; (3) classification of subtypes of samples; and (4) detection of genetic regulatory networks, which are performed after preprocessing procedures such as normalization. More specifically, temporal dynamic gene expression data has an inherent feature, namely that two neighboring time points (previous and current state) are highly correlated with each other, in contrast to static expression data, in which samples are assumed to be independent individuals. In this chapter, we demonstrate how HMMs and hierarchical Bayesian modeling methods capture the horizontal time-dependency structures in time series expression profiles, focusing on the identification of differential expression. In addition, the differentially expressed genes and transcript variant isoforms over time detected in these core prerequisite steps can be further applied to the detection of genetic regulatory networks, to comprehensively uncover dynamic repertoires from a systems biology perspective.
Hauk, O; Keil, A; Elbert, T; Müller, M M
2002-01-30
We describe a methodology to apply current source density (CSD) and minimum norm (MN) estimation as pre-processing tools for time-series analysis of single trial EEG data. The performance of these methods is compared for the case of wavelet time-frequency analysis of simulated gamma-band activity. A reasonable comparison of CSD and MN on the single trial level requires regularization such that the corresponding transformed data sets have similar signal-to-noise ratios (SNRs). For region-of-interest approaches, it should be possible to optimize the SNR for single estimates rather than for the whole distributed solution. An effective implementation of the MN method is described. Simulated data sets were created by modulating the strengths of a radial and a tangential test dipole with wavelets in the frequency range of the gamma band, superimposed with simulated spatially uncorrelated noise. The MN and CSD transformed data sets as well as the average reference (AR) representation were subjected to wavelet frequency-domain analysis, and power spectra were mapped for relevant frequency bands. For both CSD and MN, the influence of noise can be sufficiently suppressed by regularization to yield meaningful information, but only MN represents both radial and tangential dipole sources appropriately as single peaks. Therefore, when relating wavelet power spectrum topographies to their neuronal generators, MN should be preferred.
Run-time parallelization and scheduling of loops
NASA Technical Reports Server (NTRS)
Saltz, Joel H.; Mirchandaney, Ravi; Crowley, Kay
1990-01-01
Run time methods are studied to automatically parallelize and schedule iterations of a do loop in certain cases, where compile-time information is inadequate. The methods presented involve execution time preprocessing of the loop. At compile-time, these methods set up the framework for performing a loop dependency analysis. At run time, wave fronts of concurrently executable loop iterations are identified. Using this wavefront information, loop iterations are reordered for increased parallelism. Symbolic transformation rules are used to produce: inspector procedures that perform execution time preprocessing and executors or transformed versions of source code loop structures. These transformed loop structures carry out the calculations planned in the inspector procedures. Performance results are presented from experiments conducted on the Encore Multimax. These results illustrate that run time reordering of loop indices can have a significant impact on performance. Furthermore, the overheads associated with this type of reordering are amortized when the loop is executed several times with the same dependency structure.
A wavelet method for modeling and despiking motion artifacts from resting-state fMRI time series.
Patel, Ameera X; Kundu, Prantik; Rubinov, Mikail; Jones, P Simon; Vértes, Petra E; Ersche, Karen D; Suckling, John; Bullmore, Edward T
2014-07-15
The impact of in-scanner head movement on functional magnetic resonance imaging (fMRI) signals has long been established as undesirable. These effects have been traditionally corrected by methods such as linear regression of head movement parameters. However, a number of recent independent studies have demonstrated that these techniques are insufficient to remove motion confounds, and that even small movements can spuriously bias estimates of functional connectivity. Here we propose a new data-driven, spatially-adaptive, wavelet-based method for identifying, modeling, and removing non-stationary events in fMRI time series, caused by head movement, without the need for data scrubbing. This method involves the addition of just one extra step, the Wavelet Despike, in standard pre-processing pipelines. With this method, we demonstrate robust removal of a range of different motion artifacts and motion-related biases including distance-dependent connectivity artifacts, at a group and single-subject level, using a range of previously published and new diagnostic measures. The Wavelet Despike is able to accommodate the substantial spatial and temporal heterogeneity of motion artifacts and can consequently remove a range of high and low frequency artifacts from fMRI time series, that may be linearly or non-linearly related to physical movements. Our methods are demonstrated by the analysis of three cohorts of resting-state fMRI data, including two high-motion datasets: a previously published dataset on children (N=22) and a new dataset on adults with stimulant drug dependence (N=40). We conclude that there is a real risk of motion-related bias in connectivity analysis of fMRI data, but that this risk is generally manageable, by effective time series denoising strategies designed to attenuate synchronized signal transients induced by abrupt head movements. The Wavelet Despiking software described in this article is freely available for download at www.brainwavelet.org. PMID:24657353
NASA Astrophysics Data System (ADS)
Eberle, J.; Schmullius, C.
2017-12-01
Growing archives of global satellite data present a new challenge: handling multi-source satellite data in a user-friendly way. Any user is confronted with different data formats and data access services. In addition, handling time-series data is complex, as automated execution of data processing steps is needed to supply the user with the desired product for a specific area of interest. In order to simplify access to the data archives of various satellite missions and to facilitate subsequent processing, a regional data and processing middleware has been developed. The aim of this system is to provide standardized, web-based interfaces to multi-source time-series data for individual regions on Earth. For further use and analysis, uniform data formats and data access services are provided. Interfaces to the data archives of the MODIS sensor (NASA) as well as the Landsat (USGS) and Sentinel (ESA) satellites have been integrated into the middleware. Various scientific algorithms, such as the calculation of trends and breakpoints in time-series data, can be carried out on the preprocessed data on the basis of uniform data management. Jupyter Notebooks are linked to the data, and further processing can be conducted directly on the server using Python and the statistical language R. In addition to accessing EO data, the middleware is also used as an intermediary between the user and external databases (e.g., Flickr, YouTube). Standardized web services as specified by OGC are provided for all tools of the middleware. Currently, the use of cloud services is being researched to bring algorithms to the data. As a thematic example, operational monitoring of vegetation phenology is being implemented on the basis of various optical satellite data and validation data from the German Weather Service. Other examples demonstrate the monitoring of wetlands, focusing on automated discovery and access of Landsat and Sentinel data for local areas.
Tracks detection from high-orbit space objects
NASA Astrophysics Data System (ADS)
Shumilov, Yu. P.; Vygon, V. G.; Grishin, E. A.; Konoplev, A. O.; Semichev, O. P.; Shargorodskii, V. D.
2017-05-01
The paper presents the results of studies of a composite algorithm for the detection of high-orbit space objects. Before the algorithm is applied, a series of frames containing weak tracks of space objects, which may be discrete, is recorded. The algorithm includes pre-processing that is classical for astronomy: matched filtering of each frame and its threshold processing, a shear transformation, median filtering of the transformed series of frames, repeated threshold processing, and the detection decision. Weak tracks of space objects were simulated on real frames of the night starry sky obtained in the stationary-telescope regime. It is shown that the limiting magnitude of the optoelectronic device improved by almost 2 magnitudes.
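A toy version of the frame-stack stages named above can be sketched as follows: per-frame filtering and thresholding, alignment of the stack along a trial track motion, then median filtering across frames and a second threshold. The shear is reduced here to an integer column shift per frame for brevity; real tracks require sub-pixel shears over many trial velocities, and all parameters below are illustrative assumptions.

```python
# Toy frame-stack track detection: filter + threshold each frame, align the
# stack along an assumed motion, median-filter temporally, threshold again.
import numpy as np
from scipy.ndimage import gaussian_filter, median_filter

def detect_track(frames, shift_per_frame, sigma=1.0, k=3.0):
    aligned = []
    for i, f in enumerate(frames):
        g = gaussian_filter(f, sigma)                   # per-frame filtering
        g = np.where(g > g.mean() + k * g.std(), g, 0)  # first threshold
        # crude "shear": shift each frame back along the trial motion (wraps)
        aligned.append(np.roll(g, -i * shift_per_frame, axis=1))
    stack = median_filter(np.stack(aligned), size=(len(frames), 1, 1))
    combined = stack.sum(axis=0)
    return combined > combined.mean() + k * combined.std()  # second threshold

frames = [np.random.normal(0, 1, (64, 64)) for _ in range(8)]
for i, f in enumerate(frames):                          # inject a moving object
    f[32, 8 + 2 * i] += 8.0
mask = detect_track(frames, shift_per_frame=2)
print(mask.sum(), "pixels flagged")
```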
Prony Ringdown GUI (CERTS Prony Ringdown, part of the DSI Tool Box)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tuffner, Francis; Marinovici, PNNL Laurentiu; Hauer, PNNL John
2014-02-21
The PNNL Prony Ringdown graphical user interface is one analysis tool included in the Dynamic System Identification toolbox (DSI Toolbox). The DSI Toolbox is a MATLAB-based collection of tools for parsing and analyzing phasor measurement unit data, especially in regard to small signal stability; it includes tools to read the data, preprocess it, and perform small signal analysis, and is designed to provide a research environment for examining phasor measurement unit data and performing small-signal-stability analysis. The software uses a series of text-driven menus to help guide users and organize the toolbox features. Methods for reading in and populating phasor measurement unit data are provided, with appropriate preprocessing options for small-signal-stability analysis. The toolbox includes the Prony Ringdown GUI and basic algorithms to estimate information on oscillatory modes of the system, such as modal frequency and damping ratio.
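The modal estimation step can be illustrated with a bare-bones textbook Prony fit: estimate linear-prediction coefficients by least squares, take polynomial roots as discrete-time poles, and convert them to modal frequency and damping ratio. This is a generic sketch, not the PNNL GUI's implementation; the 0.3 Hz test mode is invented.

```python
# Generic Prony analysis of ringdown data: linear prediction -> poles ->
# modal frequency (Hz) and damping ratio.
import numpy as np

def prony_modes(x, order, dt):
    # Fit x[n] = -a1*x[n-1] - ... - ap*x[n-p] by least squares.
    N = len(x)
    A = np.column_stack([x[order - k - 1 : N - k - 1] for k in range(order)])
    b = x[order:N]
    a, *_ = np.linalg.lstsq(A, -b, rcond=None)
    poles = np.roots(np.concatenate(([1.0], a)))   # discrete-time poles
    s = np.log(poles) / dt                         # continuous-time poles
    freq_hz = s.imag / (2 * np.pi)
    damping_ratio = -s.real / np.abs(s)
    return freq_hz, damping_ratio

# Example: a single 0.3 Hz mode with 5% damping, as after a system ringdown.
dt = 0.1
t = np.arange(0, 30, dt)
wn = 2 * np.pi * 0.3
x = np.exp(-0.05 * wn * t) * np.cos(wn * np.sqrt(1 - 0.05**2) * t)
f, z = prony_modes(x, order=2, dt=dt)
print(f[f > 0], z[f > 0])   # ~0.3 Hz, ~0.05
```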
Time-Of-Flight Camera, Optical Tracker and Computed Tomography in Pairwise Data Registration.
Pycinski, Bartlomiej; Czajkowska, Joanna; Badura, Pawel; Juszczyk, Jan; Pietka, Ewa
2016-01-01
A growing number of medical applications, including minimally invasive surgery, depend on multi-modal or multi-sensor data processing. Fast and accurate 3D scene analysis, comprising data registration, is crucial for the development of computer aided diagnosis and therapy. Surface tracking systems based on optical trackers already play an important role in surgical procedure planning. However, new modalities, like time-of-flight (ToF) sensors, widely explored in non-medical fields, are powerful and have the potential to become part of computer aided surgery set-ups. Combining different acquisition systems promises to provide valuable support for operating room procedures. Therefore, a detailed analysis of the accuracy of such multi-sensor positioning systems is needed. We present a system combining pre-operative CT series with intra-operative ToF-sensor and optical tracker point clouds. The methodology comprises: optical sensor set-up and ToF-camera calibration procedures, data pre-processing algorithms, and a registration technique. The data pre-processing yields a surface in the case of CT, and point clouds for the ToF-sensor and the marker-driven optical tracker representation of an object of interest. The applied registration technique is based on the Iterative Closest Point algorithm. The experiments validate the registration of each pair of modalities/sensors on phantoms of four human organs in terms of Hausdorff distance and mean absolute distance metrics. The best surface alignment was obtained for the CT and optical tracker combination, whereas the worst was for experiments involving the ToF-camera. The obtained accuracies encourage further development of multi-sensor systems. The discussion of the system's limitations and possible improvements, mainly related to the depth information produced by the ToF-sensor, is useful for computer aided surgery developers.
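A compact point-to-point ICP iteration (nearest-neighbour correspondence plus SVD/Kabsch rigid alignment) sketches the registration step this system builds on; the authors' pipeline adds calibration and multi-sensor surface handling on top. The synthetic point clouds below are invented.

```python
# Point-to-point ICP: closest-point matching + SVD rigid-transform update.
import numpy as np
from scipy.spatial import cKDTree

def icp(src, dst, iters=30):
    R, t = np.eye(3), np.zeros(3)
    tree = cKDTree(dst)
    cur = src.copy()
    for _ in range(iters):
        _, idx = tree.query(cur)                # closest-point correspondences
        matched = dst[idx]
        mu_s, mu_d = cur.mean(0), matched.mean(0)
        H = (cur - mu_s).T @ (matched - mu_d)   # cross-covariance
        U, _, Vt = np.linalg.svd(H)
        d = np.sign(np.linalg.det(Vt.T @ U.T))  # reflection guard
        Ri = Vt.T @ np.diag([1, 1, d]) @ U.T
        ti = mu_d - Ri @ mu_s
        cur = cur @ Ri.T + ti
        R, t = Ri @ R, Ri @ t + ti              # compose total transform
    return R, t, cur

# Example: recover a known rigid motion between two point clouds.
rng = np.random.default_rng(0)
dst = rng.normal(size=(500, 3))
ang = np.deg2rad(10)
Rz = np.array([[np.cos(ang), -np.sin(ang), 0],
               [np.sin(ang),  np.cos(ang), 0],
               [0, 0, 1]])
src = (dst - 0.2) @ Rz.T                        # rotated + shifted copy
R, t, aligned = icp(src, dst)
print(np.abs(aligned - dst).mean())             # near zero
```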
Image preprocessing study on KPCA-based face recognition
NASA Astrophysics Data System (ADS)
Li, Xuan; Li, Dehua
2015-12-01
Face recognition, as an important biometric identification method with friendly, natural, and convenient advantages, has attracted more and more attention. This paper studies a face recognition system comprising face detection, feature extraction, and recognition, examining the related theory and the key technology of various preprocessing methods in the face detection process; using the KPCA method, it focuses on how recognition results differ across preprocessing methods. We choose the YCbCr color space for skin segmentation and integral projection for face location. We preprocess face images using erosion and dilation (the opening and closing operations) and an illumination compensation method, and then apply a face recognition method based on kernel principal component analysis; experiments were carried out on a typical face database. The algorithms were implemented on the MATLAB platform. Experimental results show that, under certain conditions, integrating the kernel method with the PCA algorithm makes the extracted features represent the original image information better, because it is a nonlinear feature extraction method, and can thus achieve a higher recognition rate. In the image preprocessing stage, we found that different operations on the images can yield different results, and hence different recognition rates in the recognition stage. At the same time, in kernel principal component analysis, the power of the polynomial kernel function can affect the recognition result.
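A hedged sketch of the recognition stage follows: kernel PCA features with a polynomial kernel feeding a nearest-neighbour classifier, sweeping the polynomial degree (the "power" whose effect the paper discusses). The Olivetti faces are a stand-in dataset (downloaded on first use); the skin-colour and projection preprocessing are out of scope here.

```python
# Kernel PCA (polynomial kernel) + 1-NN recognition, varying kernel degree.
from sklearn.datasets import fetch_olivetti_faces
from sklearn.decomposition import KernelPCA
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

faces = fetch_olivetti_faces()                 # stand-in face database
Xtr, Xte, ytr, yte = train_test_split(
    faces.data, faces.target, test_size=0.25, stratify=faces.target,
    random_state=0)

for degree in (2, 3, 4):                       # effect of the kernel power
    model = make_pipeline(
        KernelPCA(n_components=60, kernel="poly", degree=degree),
        KNeighborsClassifier(n_neighbors=1))
    model.fit(Xtr, ytr)
    print(degree, model.score(Xte, yte))
```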
NASA Astrophysics Data System (ADS)
Luce, R.; Hildebrandt, P.; Kuhlmann, U.; Liesen, J.
2016-09-01
The key challenge of time-resolved Raman spectroscopy is the identification of the constituent species and the analysis of the kinetics of the underlying reaction network. In this work we present an integral approach that allows for determining both the component spectra and the rate constants simultaneously from a series of vibrational spectra. It is based on an algorithm for non-negative matrix factorization which is applied to the experimental data set following a few pre-processing steps. As a prerequisite for physically unambiguous solutions, each component spectrum must include one vibrational band that does not significantly interfere with vibrational bands of other species. The approach is applied to synthetic "experimental" spectra derived from model systems comprising a set of species with component spectra differing with respect to their degree of spectral interferences and signal-to-noise ratios. In each case, the species involved are connected via monomolecular reaction pathways. The potential and limitations of the approach for recovering the respective rate constants and component spectra are discussed.
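The core factorization can be shown in miniature: a matrix of time-resolved spectra D (times by wavenumbers) is factored as D ≈ C S, with C ≥ 0 the concentration profiles and S ≥ 0 the component spectra. The paper's method adds kinetic-model fitting and pre-processing on top of this step; the synthetic A → B system below is invented, and each species is given an isolated band in line with the stated prerequisite.

```python
# Non-negative matrix factorization of synthetic time-resolved spectra.
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(1)
wn = np.linspace(0, 1, 300)                        # "wavenumber" axis
band = lambda c, w: np.exp(-0.5 * ((wn - c) / w) ** 2)
S_true = np.vstack([band(0.3, 0.02), band(0.7, 0.02)])   # two species,
                                                          # non-interfering bands
t = np.linspace(0, 5, 40)
C_true = np.vstack([np.exp(-t), 1 - np.exp(-t)]).T        # A -> B kinetics
D = C_true @ S_true + rng.normal(0, 0.01, (40, 300)).clip(0)

model = NMF(n_components=2, init="nndsvd", max_iter=500)
C = model.fit_transform(D)                          # concentration profiles
S = model.components_                               # component spectra
print(C.shape, S.shape)                             # (40, 2), (2, 300)
```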
NASA Astrophysics Data System (ADS)
Ferreira, Maria Teodora; Follmann, Rosangela; Domingues, Margarete O.; Macau, Elbert E. N.; Kiss, István Z.
2017-08-01
Phase synchronization may emerge from mutually interacting non-linear oscillators, even under weak coupling, when phase differences are bounded, while amplitudes remain uncorrelated. However, the detection of this phenomenon can be a challenging problem to tackle. In this work, we apply the Discrete Complex Wavelet Approach (DCWA) for phase assignment, considering signals from coupled chaotic systems and experimental data. The DCWA is based on the Dual-Tree Complex Wavelet Transform (DT-CWT), which is a discrete transformation. Due to its multi-scale properties in the context of phase characterization, it is possible to obtain very good results from scalar time series, even with non-phase-coherent chaotic systems without state space reconstruction or pre-processing. The method correctly predicts the phase synchronization for a chemical experiment with three locally coupled, non-phase-coherent chaotic processes. The impact of different time-scales is demonstrated on the synchronization process that outlines the advantages of DCWA for analysis of experimental data.
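DCWA itself requires a dual-tree complex wavelet implementation; as a plainly labelled stand-in, the sketch below assigns phases with the analytic signal (Hilbert transform) and applies the same detection criterion: phase synchronization holds when the phase difference stays bounded while amplitudes may wander. All signals are invented.

```python
# Phase-synchronization check via analytic-signal phases (Hilbert stand-in).
import numpy as np
from scipy.signal import hilbert

def phase_locked(x, y, bound=np.pi):
    phx = np.unwrap(np.angle(hilbert(x)))
    phy = np.unwrap(np.angle(hilbert(y)))
    dphi = phx - phy
    return np.ptp(dphi) < 2 * bound, dphi      # bounded phase difference?

fs = 500.0
t = np.arange(0, 10, 1 / fs)
x = np.sin(2 * np.pi * 5 * t)
y = (1 + 0.5 * np.sin(2 * np.pi * 0.3 * t)) * np.sin(2 * np.pi * 5 * t + 0.8)
locked, dphi = phase_locked(x, y)
print(locked)            # True: phases locked although amplitudes differ
```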
Real-time acquisition and preprocessing system of transient electromagnetic data based on LabVIEW
NASA Astrophysics Data System (ADS)
Zhao, Huinan; Zhang, Shuang; Gu, Lingjia; Sun, Jian
2014-09-01
The transient electromagnetic method (TEM) is a long-standing tool for geological exploration. It is widely used in many research fields, such as mineral exploration, hydrogeology surveys, engineering exploration, and unexploded ordnance detection. Traditional measurement systems are often based on ARM, DSP, or FPGA hardware, which lacks real-time display, data preprocessing, and data playback functions. To overcome these shortcomings, a real-time data acquisition and preprocessing system based on the LabVIEW virtual instrument development platform is proposed in this paper; moreover, a calibration model is established for the TEM system based on a conductivity loop. The test results demonstrate that the system can perform real-time data acquisition and system calibration. For the Transmit-Loop-Receive (TLR) response, the correlation coefficient between the measured and calculated results is 0.987, so the measured results are essentially consistent with the calculated ones. Through a subsequent inversion of the TLR response, the signal of an underground conductor was obtained. In complex test environments, abnormal values often appear in the measured data. To solve this problem, a judgment and revision algorithm for abnormal values is proposed. The test results show that the proposed algorithm can effectively eliminate serious disturbance signals from the measured transient electromagnetic data.
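The abstract does not spell out the judgment-and-revision rule, so the following is a generic Hampel-style sketch of the same job: flag samples that deviate from a rolling median by more than a few scaled MADs and replace them with the local median. The decay curve and disturbance positions are invented.

```python
# Hampel-style judgment and revision of abnormal values in a decay curve.
import numpy as np

def hampel(x, half_window=5, n_sigma=3.0):
    x = x.astype(float).copy()
    for i in range(len(x)):
        lo, hi = max(0, i - half_window), min(len(x), i + half_window + 1)
        med = np.median(x[lo:hi])
        mad = 1.4826 * np.median(np.abs(x[lo:hi] - med))
        if np.abs(x[i] - med) > n_sigma * mad:   # abnormal value detected
            x[i] = med                           # revise with local median
    return x

decay = np.exp(-np.linspace(0, 5, 200))          # TEM-like decay curve
noisy = decay + np.random.normal(0, 0.002, 200)
noisy[[40, 90]] += 0.5                           # spurious disturbances
print(np.max(np.abs(hampel(noisy) - decay)))
```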
Barlow, Paul M.; Cunningham, William L.; Zhai, Tong; Gray, Mark
2015-01-01
This report is a user guide for the streamflow-hydrograph analysis methods provided with version 1.0 of the U.S. Geological Survey (USGS) Groundwater Toolbox computer program. These include six hydrograph-separation methods to determine the groundwater-discharge (base-flow) and surface-runoff components of streamflow—the Base-Flow Index (BFI; Standard and Modified), HYSEP (Fixed Interval, Sliding Interval, and Local Minimum), and PART methods—and the RORA recession-curve displacement method and associated RECESS program to estimate groundwater recharge from streamflow data. The Groundwater Toolbox is a customized interface built on the nonproprietary, open source MapWindow geographic information system software. The program provides graphing, mapping, and analysis capabilities in a Microsoft Windows computing environment. In addition to these hydrograph-analysis methods, the Groundwater Toolbox allows for the retrieval of hydrologic time-series data (streamflow, groundwater levels, and precipitation) from the USGS National Water Information System, downloading of a suite of preprocessed geographic information system coverages and meteorological data from the National Oceanic and Atmospheric Administration National Climatic Data Center, and analysis of data with several preprocessing and postprocessing utilities. With its data retrieval and analysis tools, the Groundwater Toolbox provides methods to estimate many of the components of the water budget for a hydrologic basin, including precipitation; streamflow; base flow; runoff; groundwater recharge; and total, groundwater, and near-surface evapotranspiration.
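A hedged sketch of sliding-interval hydrograph separation in the spirit of HYSEP is given below: base flow on each day is taken as the minimum streamflow within a centred window whose width derives from drainage area. The Toolbox's methods include additional rules (and BFI, PART, and RORA differ entirely), so this is only an analog; the streamflow numbers are invented.

```python
# Simplified sliding-interval base-flow separation (HYSEP-like analog).
import numpy as np

def sliding_interval_baseflow(q, drainage_area_mi2):
    # HYSEP relates surface-runoff duration to drainage area as N = A^0.2
    # (days); the window half-width here is a simplification of that rule.
    half = int(round(drainage_area_mi2 ** 0.2))
    bf = np.array([q[max(0, i - half): i + half + 1].min()
                   for i in range(len(q))])
    return np.minimum(bf, q)            # base flow cannot exceed total flow

q = np.array([10, 12, 30, 80, 55, 33, 22, 18, 15, 13, 12, 11], float)
bf = sliding_interval_baseflow(q, drainage_area_mi2=100)
runoff = q - bf
print(bf)
```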
Atanassova, Vassia; Sotirova, Evdokia; Doukovska, Lyubka; Bureva, Veselina; Mavrov, Deyan; Tomov, Jivko
2017-01-01
The InterCriteria Analysis (ICA) approach was applied with the aim of reducing the set of variables at the input of a neural network, taking into account the fact that a large number of inputs increases the number of neurons in the network, making it unusable for hardware implementation. Here, for the first time, correlations between triples of the input parameters used for training the neural networks were obtained with the help of the ICA method. In this case, we use the ICA approach for data preprocessing, which may reduce the total time for training the neural networks and, hence, the time for the network's processing of data and images. PMID:28874908
Pre-processing Tasks in Indonesian Twitter Messages
NASA Astrophysics Data System (ADS)
Hidayatullah, A. F.; Ma'arif, M. R.
2017-01-01
Twitter text messages are very noisy, and tweet data are unstructured and rather complicated. The focus of this work is to investigate pre-processing techniques for Twitter messages in Bahasa Indonesia. The main goal of the experiment is to clean tweet data for further analysis; thus, the objective of the pre-processing task is simply to remove all meaningless characters and retain the valuable words. We divide our proposed pre-processing experiments into two parts. The first part consists of common pre-processing tasks; the second part is a specific pre-processing task for tweet data. From the experimental results we conclude that employing a specific pre-processing task tailored to the characteristics of tweet data yields a more valuable result: compared with running only the common pre-processing tasks, far fewer meaningless words remain in the output.
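A minimal sketch of such a two-part pipeline follows: common text clean-up plus tweet-specific removal of retweet markers, mentions, URLs, and hashtag symbols. Indonesian-specific steps (slang normalization, stopword removal) would slot in after this; the example tweet is invented.

```python
# Tweet cleaning: tweet-specific stripping followed by common normalization.
import re

def clean_tweet(text):
    text = text.lower()
    text = re.sub(r"\brt\b", " ", text)            # retweet marker
    text = re.sub(r"@\w+", " ", text)              # mentions
    text = re.sub(r"https?://\S+", " ", text)      # URLs
    text = re.sub(r"#", " ", text)                 # keep hashtag word, drop symbol
    text = re.sub(r"[^a-z\s]", " ", text)          # punctuation, digits, emoji
    return re.sub(r"\s+", " ", text).strip()

print(clean_tweet("RT @budi: Macet parah di #Jakarta!! http://t.co/xyz"))
# -> "macet parah di jakarta"
```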
NASA Astrophysics Data System (ADS)
Wismüller, Axel; DSouza, Adora M.; Abidin, Anas Z.; Wang, Xixi; Hobbs, Susan K.; Nagarajan, Mahesh B.
2015-03-01
Echo state networks (ESN) are recurrent neural networks where the hidden layer is replaced with a fixed reservoir of neurons. Unlike feed-forward networks, neuron training in ESN is restricted to the output neurons alone thereby providing a computational advantage. We demonstrate the use of such ESNs in our mutual connectivity analysis (MCA) framework for recovering the primary motor cortex network associated with hand movement from resting state functional MRI (fMRI) data. Such a framework consists of two steps - (1) defining a pair-wise affinity matrix between different pixel time series within the brain to characterize network activity and (2) recovering network components from the affinity matrix with non-metric clustering. Here, ESNs are used to evaluate pair-wise cross-estimation performance between pixel time series to create the affinity matrix, which is subsequently subject to non-metric clustering with the Louvain method. For comparison, the ground truth of the motor cortex network structure is established with a task-based fMRI sequence. Overlap between the primary motor cortex network recovered with our model free MCA approach and the ground truth was measured with the Dice coefficient. Our results show that network recovery with our proposed MCA approach is in close agreement with the ground truth. Such network recovery is achieved without requiring low-pass filtering of the time series ensembles prior to analysis, an fMRI preprocessing step that has courted controversy in recent years. Thus, we conclude our MCA framework can allow recovery and visualization of the underlying functionally connected networks in the brain on resting state fMRI.
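Step (1) can be compressed into a sketch: score how well series x predicts series y with a small echo state network (fixed random reservoir, ridge-trained readout) and use the fit quality as the pairwise affinity. Reservoir size, spectral radius, and ridge strength below are arbitrary illustrative choices, not the authors' settings.

```python
# ESN cross-estimation affinity between two time series.
import numpy as np

def esn_affinity(x, y, n_res=100, rho=0.9, ridge=1e-6, seed=0):
    rng = np.random.default_rng(seed)
    W_in = rng.uniform(-0.5, 0.5, n_res)
    W = rng.normal(0, 1, (n_res, n_res))
    W *= rho / np.max(np.abs(np.linalg.eigvals(W)))   # set spectral radius
    states = np.zeros((len(x), n_res))
    s = np.zeros(n_res)
    for t in range(len(x)):
        s = np.tanh(W @ s + W_in * x[t])              # reservoir update
        states[t] = s
    # Ridge-regression readout from reservoir states to the target series.
    A = states.T @ states + ridge * np.eye(n_res)
    w_out = np.linalg.solve(A, states.T @ y)
    err = y - states @ w_out
    return 1 - err.var() / y.var()          # affinity: fraction of y explained

t = np.linspace(0, 50, 2000)
x = np.sin(t)
y = np.sin(t + 0.4) + 0.05 * np.random.randn(2000)
print(esn_affinity(x, y))                   # high: x cross-estimates y well
```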
Preprocessed Consortium for Neuropsychiatric Phenomics dataset.
Gorgolewski, Krzysztof J; Durnez, Joke; Poldrack, Russell A
2017-01-01
Here we present preprocessed MRI data of 265 participants from the Consortium for Neuropsychiatric Phenomics (CNP) dataset. The preprocessed dataset includes minimally preprocessed data in the native, MNI, and surface spaces, accompanied by potential confound regressors, tissue probability masks, brain masks, and transformations. In addition, the preprocessed dataset includes unthresholded group-level and single-subject statistical maps from all tasks included in the original dataset. We hope that the availability of this dataset will greatly accelerate research.
EARLINET Single Calculus Chain - technical - Part 1: Pre-processing of raw lidar data
NASA Astrophysics Data System (ADS)
D'Amico, Giuseppe; Amodeo, Aldo; Mattis, Ina; Freudenthaler, Volker; Pappalardo, Gelsomina
2016-02-01
In this paper we describe an automatic tool for the pre-processing of aerosol lidar data called ELPP (EARLINET Lidar Pre-Processor). It is one of two calculus modules of the EARLINET Single Calculus Chain (SCC), the automatic tool for the analysis of EARLINET data. ELPP is an open source module that executes instrumental corrections and data handling of the raw lidar signals, making the lidar data ready to be processed by the optical retrieval algorithms. According to the specific lidar configuration, ELPP automatically performs dead-time correction, atmospheric and electronic background subtraction, gluing of lidar signals, and trigger-delay correction. Moreover, the signal-to-noise ratio of the pre-processed signals can be improved by means of configurable time integration of the raw signals and/or spatial smoothing. ELPP delivers the statistical uncertainties of the final products by means of error propagation or Monte Carlo simulations. During the development of ELPP, particular attention has been paid to make the tool flexible enough to handle all lidar configurations currently used within the EARLINET community. Moreover, it has been designed in a modular way to allow an easy extension to lidar configurations not yet implemented. The primary goal of ELPP is to enable the application of quality-assured procedures in the lidar data analysis starting from the raw lidar data. This provides the added value of full traceability of each delivered lidar product. Several tests have been performed to check the proper functioning of ELPP. The whole SCC has been tested with the same synthetic data sets, which were used for the EARLINET algorithm inter-comparison exercise. ELPP has been successfully employed for the automatic near-real-time pre-processing of the raw lidar data measured during several EARLINET inter-comparison campaigns as well as during intense field campaigns.
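Two of the corrections ELPP performs can be shown in miniature, under textbook assumptions: a non-paralyzable dead-time correction for photon-counting profiles and atmospheric background subtraction estimated from far-range bins. Trigger-delay correction, gluing, and error propagation are omitted; the profile, bin width, and dead time are invented but physically plausible.

```python
# Dead-time correction + background subtraction for a photon-counting profile.
import numpy as np

def preprocess_lidar(counts, bin_time_s, dead_time_s, n_bg_bins=500):
    rate = counts / bin_time_s                       # observed count rate (Hz)
    true_rate = rate / (1.0 - rate * dead_time_s)    # non-paralyzable model
    corrected = true_rate * bin_time_s
    background = corrected[-n_bg_bins:].mean()       # far-range background
    return corrected - background

bins = np.arange(4000)
profile = 2.0 * np.exp(-bins / 500) + 0.2            # signal + constant background
counts = np.random.poisson(profile).astype(float)    # synthetic raw profile
signal = preprocess_lidar(counts, bin_time_s=5e-8, dead_time_s=4e-9)
print(signal[-500:].mean())                          # ~0 after subtraction
```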
NASA Astrophysics Data System (ADS)
Li, Wanjing; Schütze, Rainer; Böhler, Martin; Boochs, Frank; Marzani, Franck S.; Voisin, Yvon
2009-06-01
We present an approach that integrates a preprocessing step of region of interest (ROI) localization into 3-D scanners (laser or stereoscopic). The ultimate objective is to make the 3-D scanner intelligent enough to rapidly localize, during the preprocessing phase, the regions of the scene with high surface curvature, so that precise scanning is done only in these regions instead of in the whole scene. In this way, the scanning time can be greatly reduced, and the results contain only pertinent data. To test its feasibility and efficiency, we simulated the preprocessing process with an active stereoscopic system composed of two cameras and a video projector. The ROI localization is done iteratively. First, the video projector projects a regular point pattern onto the scene; the pattern is then modified iteratively according to the local surface curvature at each reconstructed 3-D point. Finally, the last pattern is used to determine the ROI. Our experiments showed that with this approach the system is capable of localizing all types of objects, including small objects with small depth.
Aydin, Ilhan; Karakose, Mehmet; Akin, Erhan
2014-03-01
Although reconstructed phase space is one of the most powerful methods for analyzing a time series, it can fail in fault diagnosis of an induction motor when appropriate pre-processing is not performed. Therefore, a new boundary-analysis-based feature extraction method in phase space is proposed for the diagnosis of induction motor faults. The proposed approach requires the measurement of only one phase current signal to construct the phase space representation. Each phase space is converted into an image, and the boundary of each image is extracted by a boundary detection algorithm. A fuzzy decision tree has been designed to detect broken rotor bars and broken connector faults. The results indicate that the proposed approach has a higher recognition rate than other methods on the same dataset. © 2013 ISA. Published by ISA. All rights reserved.
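The first stage can be sketched in isolation: reconstructing the phase space of one phase-current signal by time-delay embedding and rasterizing the 2-D trajectory into an image whose boundary could then be traced. The delay, image size, and fault-sideband signal below are illustrative choices, not the paper's tuned values.

```python
# Time-delay embedding of a current signal, rasterized to a binary image.
import numpy as np

def phase_space_image(x, delay, size=64):
    a, b = x[:-delay], x[delay:]                      # (x(t), x(t+tau)) pairs
    ai = ((a - a.min()) / np.ptp(a) * (size - 1)).astype(int)
    bi = ((b - b.min()) / np.ptp(b) * (size - 1)).astype(int)
    img = np.zeros((size, size), dtype=np.uint8)
    img[bi, ai] = 1                                   # rasterized trajectory
    return img

t = np.linspace(0, 1, 5000)
current = np.sin(2 * np.pi * 50 * t) + 0.1 * np.sin(2 * np.pi * 44 * t)
img = phase_space_image(current, delay=25)
print(img.sum(), "trajectory pixels")
```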
Multiple imputation of rainfall missing data in the Iberian Mediterranean context
NASA Astrophysics Data System (ADS)
Miró, Juan Javier; Caselles, Vicente; Estrela, María José
2017-11-01
Given the increasing need for complete rainfall data networks, diverse methods have been proposed in recent years for filling gaps in observed precipitation series, progressively more advanced than traditional approaches. The present study validates 10 methods (6 linear, 2 non-linear, and 2 hybrid) that allow multiple imputation, i.e., filling missing data of multiple incomplete series at the same time within a dense network of neighboring stations. These were applied to daily and monthly rainfall in two sectors of the Júcar River Basin Authority (eastern Iberian Peninsula), an area characterized by high spatial irregularity and difficult rainfall estimation. A classification of precipitation according to its genetic origin was applied as pre-processing, and quantile-mapping adjustment as a post-processing technique. The results showed, in general, better performance for the non-linear and hybrid methods, with the non-linear PCA (NLPCA) method considerably outperforming the Self Organizing Maps (SOM) method among the non-linear approaches. Among the linear methods, the Regularized Expectation Maximization method (RegEM) was the best, but far behind NLPCA. Applying EOF filtering as post-processing of NLPCA (the hybrid approach) yielded the best results.
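The quantile-mapping post-processing step can be sketched directly: adjust imputed values so their distribution matches the empirical distribution observed at the target station, via interpolation between matched quantiles. The gamma-distributed rainfall samples below are invented.

```python
# Quantile mapping: align the distribution of imputed values to observations.
import numpy as np

def quantile_map(imputed, observed, n_q=100):
    q = np.linspace(0, 100, n_q)
    src_q = np.percentile(imputed, q)        # quantiles of the imputed series
    obs_q = np.percentile(observed, q)       # quantiles of the observed record
    return np.interp(imputed, src_q, obs_q)  # map one distribution onto the other

rng = np.random.default_rng(2)
observed = rng.gamma(0.6, 8.0, 3000)         # skewed daily rainfall
imputed = rng.gamma(0.9, 5.0, 500)           # distributionally biased imputations
adjusted = quantile_map(imputed, observed)
print(observed.mean(), imputed.mean(), adjusted.mean())
```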
Near Real-Time Processing of Proteomics Data Using Hadoop.
Hillman, Chris; Ahmad, Yasmeen; Whitehorn, Mark; Cobley, Andy
2014-03-01
This article presents a near real-time processing solution using MapReduce and Hadoop. The solution is aimed at some of the data management and processing challenges facing the life sciences community. Research into genes and their product proteins generates huge volumes of data that must be extensively preprocessed before any biological insight can be gained. In order to carry out this processing in a timely manner, we have investigated the use of techniques from the big data field. These are applied specifically to process data resulting from mass spectrometers in the course of proteomic experiments. Here we present methods of handling the raw data in Hadoop, and then we investigate a process for preprocessing the data using Java code and the MapReduce framework to identify 2D and 3D peaks.
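The actual pipeline is written in Java for Hadoop; the Python fragment below merely mimics the MapReduce shape of the peak-picking step: mappers emit (m/z bin, intensity) pairs per scan, the shuffle groups them by key, and reducers keep bins whose summed intensity clears a threshold. All numbers and the binning rule are invented for illustration.

```python
# A MapReduce-shaped mimic of intensity binning and peak selection.
from collections import defaultdict

scans = [
    [(400.01, 120.0), (400.02, 90.0), (532.77, 15.0)],   # scan 1: (m/z, intensity)
    [(400.02, 140.0), (611.30, 12.0)],                   # scan 2
]

def mapper(scan):
    for mz, inten in scan:
        yield round(mz, 1), inten            # key = coarse m/z bin

shuffled = defaultdict(list)                 # the "shuffle" phase: group by key
for scan in scans:
    for key, value in mapper(scan):
        shuffled[key].append(value)

def reducer(key, values, threshold=100.0):
    total = sum(values)
    return (key, total) if total >= threshold else None

peaks = []
for key, values in shuffled.items():
    result = reducer(key, values)
    if result:
        peaks.append(result)
print(peaks)                                 # [(400.0, 350.0)]
```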
Parallel processing of genomics data
NASA Astrophysics Data System (ADS)
Agapito, Giuseppe; Guzzi, Pietro Hiram; Cannataro, Mario
2016-10-01
The availability of high-throughput experimental platforms for the analysis of biological samples, such as mass spectrometry, microarrays, and Next Generation Sequencing, has made it possible to analyze a whole genome in a single experiment. Such platforms produce an enormous volume of data per experiment, so the analysis of this flow of data poses several challenges in terms of data storage, preprocessing, and analysis. To face these issues, efficient, possibly parallel, bioinformatics software is needed to preprocess and analyze the data, for instance to highlight genetic variation associated with complex diseases. In this paper we present an algorithm for the parallel preprocessing and statistical analysis of genomics data, able to handle high-dimensional data with good response times. The proposed system is able to find statistically significant biological markers that discriminate classes of patients who respond to drugs in different ways. Experiments performed on real and synthetic genomic datasets show good speed-up and scalability.
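The parallel-preprocessing pattern described above can be sketched generically: split a genomic matrix (samples by markers) into chunks, preprocess the chunks in worker processes, then run a simple per-marker association test. The z-scoring and t-test are placeholders, not the paper's algorithm; all data are synthetic.

```python
# Chunked parallel preprocessing + per-marker association testing.
import numpy as np
from multiprocessing import Pool
from scipy import stats

def preprocess_chunk(chunk):
    # Per-marker z-scoring as a stand-in preprocessing step.
    return (chunk - chunk.mean(0)) / (chunk.std(0) + 1e-12)

def marker_pvalues(args):
    X, y = args
    # Two-sample t-test per marker between responders (y=1) and the rest.
    return stats.ttest_ind(X[y == 1], X[y == 0], axis=0).pvalue

if __name__ == "__main__":
    rng = np.random.default_rng(3)
    X = rng.normal(size=(200, 10_000))           # 200 patients, 10k markers
    y = rng.integers(0, 2, 200)
    chunks = np.array_split(X, 8, axis=1)        # split by marker blocks
    with Pool(4) as pool:
        normed = pool.map(preprocess_chunk, chunks)
        pvals = pool.map(marker_pvalues, [(c, y) for c in normed])
    print(np.concatenate(pvals).min())
```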
The Role of GRAIL Orbit Determination in Preprocessing of Gravity Science Measurements
NASA Technical Reports Server (NTRS)
Kruizinga, Gerhard; Asmar, Sami; Fahnestock, Eugene; Harvey, Nate; Kahan, Daniel; Konopliv, Alex; Oudrhiri, Kamal; Paik, Meegyeong; Park, Ryan; Strekalov, Dmitry;
2013-01-01
The Gravity Recovery And Interior Laboratory (GRAIL) mission has constructed a lunar gravity field with unprecedented uniform accuracy on the farside and nearside of the Moon. GRAIL lunar gravity field determination begins with preprocessing of the gravity science measurements by applying corrections for time tag error, general relativity, measurement noise and biases. Gravity field determination requires the generation of spacecraft ephemerides of an accuracy not attainable with the pre-GRAIL lunar gravity fields. Therefore, a bootstrapping strategy was developed, iterating between science data preprocessing and lunar gravity field estimation in order to construct sufficiently accurate orbit ephemerides. This paper describes the GRAIL measurements, their dependence on the spacecraft ephemerides, and the role of orbit determination in the bootstrapping strategy. Simulation results are presented that validate the bootstrapping strategy, followed by bootstrapping results for flight data, which have led to the latest GRAIL lunar gravity fields.
KONFIG and REKONFIG: Two interactive preprocessing to the Navy/NASA Engine Program (NNEP)
NASA Technical Reports Server (NTRS)
Fishbach, L. H.
1981-01-01
The NNEP is a computer program that is currently being used to simulate the thermodynamic cycle performance of almost all types of turbine engines by many government, industry, and university personnel. The NNEP uses arrays of input data to set up the engine simulation and component matching method as well as to describe the characteristics of the components. A preprocessing program (KONFIG) is described in which the user at a terminal on a time-shared computer can interactively prepare the arrays of data required. It is intended to make it easier for the occasional or new user to operate NNEP. Another preprocessing program (REKONFIG), in which the user can modify the component specifications of a previously configured NNEP dataset, is also described. It is intended to aid in preparing data for parametric studies and/or studies of similar engines such as mixed flow turbofans, turboshafts, etc.
Assessing the severity of sleep apnea syndrome based on ballistocardiogram
Zhou, Xingshe; Zhao, Weichao; Liu, Fan; Ni, Hongbo; Yu, Zhiwen
2017-01-01
Background Sleep Apnea Syndrome (SAS) is a common sleep-related breathing disorder, which affects about 4-7% males and 2-4% females all around the world. Different approaches have been adopted to diagnose SAS and measure its severity, including the gold standard Polysomnography (PSG) in sleep study field as well as several alternative techniques such as single-channel ECG, pulse oximeter and so on. However, many shortcomings still limit their generalization in home environment. In this study, we aim to propose an efficient approach to automatically assess the severity of sleep apnea syndrome based on the ballistocardiogram (BCG) signal, which is non-intrusive and suitable for in home environment. Methods We develop an unobtrusive sleep monitoring system to capture the BCG signals, based on which we put forward a three-stage sleep apnea syndrome severity assessment framework, i.e., data preprocessing, sleep-related breathing events (SBEs) detection, and sleep apnea syndrome severity evaluation. First, in the data preprocessing stage, to overcome the limits of BCG signals (e.g., low precision and reliability), we utilize wavelet decomposition to obtain the outline information of heartbeats, and apply a RR correction algorithm to handle missing or spurious RR intervals. Afterwards, in the event detection stage, we propose an automatic sleep-related breathing event detection algorithm named Physio_ICSS based on the iterative cumulative sums of squares (i.e., the ICSS algorithm), which is originally used to detect structural breakpoints in a time series. In particular, to efficiently detect sleep-related breathing events in the obtained time series of RR intervals, the proposed algorithm not only explores the practical factors of sleep-related breathing events (e.g., the limit of lasting duration and possible occurrence sleep stages) but also overcomes the event segmentation issue (e.g., equal-length segmentation method might divide one sleep-related breathing event into different fragments and lead to incorrect results) of existing approaches. Finally, by fusing features extracted from multiple domains, we can identify sleep-related breathing events and assess the severity level of sleep apnea syndrome effectively. Conclusions Experimental results on 136 individuals of different sleep apnea syndrome severities validate the effectiveness of the proposed framework, with the accuracy of 94.12% (128/136). PMID:28445548
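The core of the ICSS idea used in Physio_ICSS can be shown in isolation: the Inclán-Tiao centred cumulative sum of squares, whose extreme value marks a candidate variance breakpoint in the RR-interval series. The full algorithm applies this recursively with the sleep-specific constraints described above; the RR series below is synthetic.

```python
# Inclan-Tiao cumulative-sum-of-squares breakpoint statistic (ICSS core).
import numpy as np

def icss_breakpoint(x, crit=1.358):          # 1.358: standard 95% critical value
    a = x - x.mean()
    C = np.cumsum(a ** 2)                    # cumulative sum of squares
    T = len(a)
    k = np.arange(1, T + 1)
    D = C / C[-1] - k / T                    # centred statistic D_k
    k_star = np.argmax(np.abs(D))
    significant = np.sqrt(T / 2) * np.abs(D[k_star]) > crit
    return (k_star if significant else None), D

rng = np.random.default_rng(4)
rr = np.concatenate([rng.normal(0.8, 0.01, 300),    # quiet breathing
                     rng.normal(0.8, 0.05, 100),    # apnea-like variance burst
                     rng.normal(0.8, 0.01, 300)])
k, _ = icss_breakpoint(rr)
print(k)                                            # near sample 300
```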
Interactive Web-based Visualization of Atomic Position-time Series Data
NASA Astrophysics Data System (ADS)
Thapa, S.; Karki, B. B.
2017-12-01
Extracting and interpreting the information contained in large sets of time-varying three-dimensional positional data for the constituent atoms of a simulated material is a challenging task. We have recently implemented a web-based visualization system to analyze position-time series data retrieved from local or remote hosts. It includes a pre-processing step for data reduction, which skips uninteresting parts of the data either uniformly (at the full-configuration level) or non-uniformly (at the atomic-species level or the individual-atom level). An atomic configuration snapshot is rendered using the ball-stick representation and can be animated by rendering successive configurations. The entire atomic dynamics can be captured as trajectories by rendering the atomic positions at all time steps together as points. The trajectories can be manipulated at both the species and atomic levels, so that we can focus on one or more trajectories of interest, and they can also be superimposed on the instantaneous atomic structure. The implementation uses WebGL and Three.js for graphical rendering, HTML5 and JavaScript for the GUI, and Elasticsearch and JSON for data storage and retrieval, within the Grails Framework. We have applied our visualization system to simulation datasets for proton-bearing forsterite (Mg2SiO4), an abundant mineral of Earth's upper mantle. Visualization reveals that protons (hydrogen ions) incorporated as interstitials are much more mobile than protons substituting at the host Mg and Si cation sites. The proton diffusion appears to be anisotropic, with high mobility along the x-direction and only limited discrete jumps in the other two directions.
Prediction of carbonate rock type from NMR responses using data mining techniques
NASA Astrophysics Data System (ADS)
Gonçalves, Eduardo Corrêa; da Silva, Pablo Nascimento; Silveira, Carla Semiramis; Carneiro, Giovanna; Domingues, Ana Beatriz; Moss, Adam; Pritchard, Tim; Plastino, Alexandre; Azeredo, Rodrigo Bagueira de Vasconcellos
2017-05-01
Recent studies have indicated that the accurate identification of carbonate rock types in a reservoir can be employed as a preliminary step to enhance the effectiveness of petrophysical property modeling. Furthermore, rock typing activity has been shown to be of key importance in several steps of formation evaluation, such as the study of sedimentary series, reservoir zonation and well-to-well correlation. In this paper, a methodology based exclusively on the analysis of 1H-NMR (Nuclear Magnetic Resonance) relaxation responses - using data mining algorithms - is evaluated to perform the automatic classification of carbonate samples according to their rock type. We analyze the effectiveness of six different classification algorithms (k-NN, Naïve Bayes, C4.5, Random Forest, SMO and Multilayer Perceptron) and two data preprocessing strategies (discretization and feature selection). The dataset used in this evaluation is formed by 78 1H-NMR T2 distributions of fully brine-saturated rock samples from six different rock type classes. The experiments reveal that the combination of preprocessing strategies with classification algorithms is able to achieve a prediction accuracy of 97.4%.
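One of the evaluated configurations can be sketched with standard tools: feature selection over the T2-distribution amplitudes followed by a k-NN classifier, scored with cross-validation. The synthetic stand-in data mimic the 78 samples of 6 rock types described by T2 amplitude vectors; the class-dependent shift is invented so the example has signal.

```python
# Feature selection + k-NN over synthetic T2-distribution amplitude vectors.
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(5)
y = np.repeat(np.arange(6), 13)                  # 6 rock types, 78 samples
X = rng.normal(size=(78, 128))                   # T2 distribution amplitudes
X += y[:, None] * np.linspace(0, 1, 128)         # class-dependent shift

model = make_pipeline(SelectKBest(f_classif, k=30),
                      KNeighborsClassifier(n_neighbors=3))
print(cross_val_score(model, X, y, cv=6).mean())
```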
Statistical baseline assessment in cardiotocography.
Agostinelli, Angela; Braccili, Eleonora; Marchegiani, Enrico; Rosati, Riccardo; Sbrollini, Agnese; Burattini, Luca; Morettini, Micaela; Di Nardo, Francesco; Fioretti, Sandro; Burattini, Laura
2017-07-01
Cardiotocography (CTG) is the most common non-invasive diagnostic technique to evaluate fetal well-being. It consists of the recording of fetal heart rate (FHR; bpm) and maternal uterine contractions. Among the main parameters characterizing FHR, the baseline (BL) is fundamental to determine fetal hypoxia and distress. In computerized applications, BL is typically computed as mean FHR±ΔFHR, with ΔFHR=8 bpm or ΔFHR=10 bpm, both values being experimentally fixed. In this context, the present work aims: to propose a statistical procedure for ΔFHR assessment; to quantitatively determine the ΔFHR value by applying such a procedure to clinical data; and to compare the statistically-determined ΔFHR value against the experimentally-determined ΔFHR values. To these aims, the 552 recordings of the "CTU-UHB intrapartum CTG database" from Physionet were submitted to an automatic procedure, which consisted of an FHR preprocessing phase and a statistical BL assessment. During preprocessing, FHR time series were divided into 20-min sliding windows, in which missing data were removed by linear interpolation. Only windows with a correction rate lower than 10% were further processed for BL assessment, according to which ΔFHR was computed as the FHR standard deviation. The total number of accepted windows was 1192 (38.5%) over 383 recordings (69.4%) with at least one accepted window. The statistically-determined ΔFHR value was 9.7 bpm. This value was statistically different from 8 bpm (P < 10^-19) but not from 10 bpm (P=0.16). Thus, ΔFHR=10 bpm is preferable to 8 bpm because it is both experimentally and statistically validated.
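The statistical BL assessment can be sketched in miniature: interpolate missing beats in a 20-min window, reject windows needing more than 10% correction, then take BL as mean FHR and ΔFHR as the standard deviation within the window. Sampling rate and noise parameters below are illustrative assumptions.

```python
# Per-window baseline assessment: interpolate gaps, reject noisy windows,
# return (BL, Delta-FHR) as (mean, standard deviation).
import numpy as np

def window_baseline(fhr, max_correction=0.10):
    missing = np.isnan(fhr)
    if missing.mean() > max_correction:
        return None                                  # window rejected
    idx = np.arange(len(fhr))
    fhr = fhr.copy()
    fhr[missing] = np.interp(idx[missing], idx[~missing], fhr[~missing])
    return fhr.mean(), fhr.std()                     # BL and Delta-FHR (bpm)

rng = np.random.default_rng(6)
fhr = rng.normal(140, 9.7, 4800)                     # 20 min at 4 Hz (assumed)
fhr[rng.random(4800) < 0.05] = np.nan                # 5% signal loss
print(window_baseline(fhr))                          # approx (140, 9.7)
```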
The Influence of Preprocessing Steps on Graph Theory Measures Derived from Resting State fMRI
Gargouri, Fatma; Kallel, Fathi; Delphine, Sebastien; Ben Hamida, Ahmed; Lehéricy, Stéphane; Valabregue, Romain
2018-01-01
Resting state functional MRI (rs-fMRI) is an imaging technique that allows the spontaneous activity of the brain to be measured. Measures of functional connectivity highly depend on the quality of the BOLD signal data processing. In this study, our aim was to study the influence of preprocessing steps and their order of application on small-world topology and their efficiency in resting state fMRI data analysis using graph theory. We applied the most standard preprocessing steps: slice-timing, realign, smoothing, filtering, and the tCompCor method. In particular, we were interested in how preprocessing can retain the small-world economic properties and how to maximize the local and global efficiency of a network while minimizing the cost. Tests that we conducted in 54 healthy subjects showed that the choice and ordering of preprocessing steps impacted the graph measures. We found that the csr (where we applied realignment, smoothing, and tCompCor as a final step) and the scr (where we applied realignment, tCompCor and smoothing as a final step) strategies had the highest mean values of global efficiency (eg). Furthermore, we found that the fscr strategy (where we applied realignment, tCompCor, smoothing, and filtering as a final step), had the highest mean local efficiency (el) values. These results confirm that the graph theory measures of functional connectivity depend on the ordering of the processing steps, with the best results being obtained using smoothing and tCompCor as the final steps for global efficiency with additional filtering for local efficiency. PMID:29497372
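The outcome measures in this comparison can be computed with NetworkX on a graph thresholded from a functional-connectivity matrix; every preprocessing pipeline being ranked ultimately feeds a matrix like `fc` below. The random time series and the top-10%-cost threshold are illustrative choices, not the study's data or parameters.

```python
# Global and local efficiency of a thresholded functional-connectivity graph.
import numpy as np
import networkx as nx

rng = np.random.default_rng(7)
ts = rng.normal(size=(150, 90))              # 150 volumes, 90 regions
fc = np.corrcoef(ts.T)                       # functional connectivity matrix
np.fill_diagonal(fc, 0)

adj = (np.abs(fc) > np.percentile(np.abs(fc), 90)).astype(int)  # ~10% cost
G = nx.from_numpy_array(adj)
print(nx.global_efficiency(G), nx.local_efficiency(G))
```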
Cortical Signal Analysis and Advances in Functional Near-Infrared Spectroscopy Signal: A Review.
Kamran, Muhammad A; Mannan, Malik M Naeem; Jeong, Myung Yung
2016-01-01
Functional near-infrared spectroscopy (fNIRS) is a non-invasive neuroimaging modality that simultaneously measures the concentration changes of oxy-hemoglobin (HbO) and deoxy-hemoglobin (HbR). It is an emerging cortical imaging modality with a temporal resolution good enough for brain-computer interface applications. Researchers have developed several methods over the last two decades to extract the neuronal-activation-related waveform from the observed fNIRS time series, but there is still no standard method for the analysis of fNIRS data. This article presents a brief review of existing methodologies to model and analyze the activation signal. The purpose of this review is to give a general overview of the variety of existing methodologies for extracting useful information from measured fNIRS data, including pre-processing steps, effects of the differential path length factor (DPF), variations and attributes of the hemodynamic response function (HRF), extraction of the evoked response, removal of physiological, instrumentation, and environmental noise, and resting/activation state functional connectivity. Finally, the challenges in the analysis of the fNIRS signal are summarized.
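The conversion that underlies fNIRS pre-processing can be sketched under the standard modified Beer-Lambert law: optical-density changes at two wavelengths are mapped to HbO/HbR concentration changes through extinction coefficients, the source-detector distance d, and the DPF mentioned above. The coefficient values, distance, and intensities below are rough literature-style numbers used only for illustration, not calibrated constants.

```python
# Modified Beer-Lambert law: two-wavelength dOD -> (dHbO, dHbR).
import numpy as np

# Assumed extinction coefficients [HbO, HbR] at 760 nm and 850 nm
# (approximate illustrative values, units 1/(mM*cm)).
E = np.array([[1.49, 3.84],
              [2.53, 1.80]])

def mbll(I, I0, d_cm=3.0, dpf=6.0):
    dOD = -np.log10(I / I0)                      # optical density change
    return np.linalg.solve(E * d_cm * dpf, dOD)  # [dHbO, dHbR] in mM

I0 = np.array([1.00, 1.00])                      # baseline intensities
I = np.array([0.97, 0.95])                       # during activation
dHbO, dHbR = mbll(I, I0)
print(dHbO, dHbR)
```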
Comparison of pre-processing methods for multiplex bead-based immunoassays.
Rausch, Tanja K; Schillert, Arne; Ziegler, Andreas; Lüking, Angelika; Zucht, Hans-Dieter; Schulz-Knappe, Peter
2016-08-11
High throughput protein expression studies can be performed using bead-based protein immunoassays, such as the Luminex® xMAP® technology. Technical variability is inherent to these experiments and may lead to systematic bias and reduced power. To reduce technical variability, data pre-processing is performed. However, no recommendations exist for the pre-processing of Luminex® xMAP® data. We compared 37 different data pre-processing combinations of transformation and normalization methods in 42 samples on 384 analytes obtained from a multiplex immunoassay based on the Luminex® xMAP® technology. We evaluated the performance of each pre-processing approach with 6 different performance criteria. Three of the performance criteria were plots, which were evaluated by 15 independent, blinded readers. Four different combinations of transformation and normalization methods performed well as pre-processing procedures for this bead-based protein immunoassay. The following combinations of transformation and normalization were suitable for pre-processing Luminex® xMAP® data in this study: weighted Box-Cox followed by quantile or robust spline normalization (rsn), asinh transformation followed by loess normalization, and Box-Cox followed by rsn.
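The best-performing style of combination can be sketched with standard tools: a Box-Cox transform per analyte followed by quantile normalization across samples. These are plain, unweighted variants; the study's weighted Box-Cox and robust spline normalization differ in detail, and the lognormal data below are synthetic.

```python
# Box-Cox per analyte, then quantile normalization across samples.
import numpy as np
from scipy import stats

def quantile_normalize(X):
    # X: samples x analytes. Force every sample (row) to share the same
    # value distribution: within-sample ranks index a common sorted profile.
    ranks = X.argsort(1).argsort(1)
    mean_profile = np.sort(X, 1).mean(0)     # average sorted profile
    return mean_profile[ranks]

rng = np.random.default_rng(8)
mfi = rng.lognormal(5, 1, (42, 384))         # median fluorescence intensities
boxcoxed = np.column_stack([stats.boxcox(mfi[:, j])[0] for j in range(384)])
normalized = quantile_normalize(boxcoxed)
print(normalized.shape)                      # (42, 384)
```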
Three dimensional unstructured multigrid for the Euler equations
NASA Technical Reports Server (NTRS)
Mavriplis, D. J.
1991-01-01
The three dimensional Euler equations are solved on unstructured tetrahedral meshes using a multigrid strategy. The driving algorithm consists of an explicit vertex-based finite element scheme, which employs an edge-based data structure to assemble the residuals. The multigrid approach employs a sequence of independently generated coarse and fine meshes to accelerate the convergence to steady-state of the fine grid solution. Variables, residuals and corrections are passed back and forth between the various grids of the sequence using linear interpolation. The addresses and weights for interpolation are determined in a preprocessing stage using an efficient graph traversal algorithm. The preprocessing operation is shown to require a negligible fraction of the CPU time required by the overall solution procedure, while gains in overall solution efficiencies greater than an order of magnitude are demonstrated on meshes containing up to 350,000 vertices. Solutions using globally regenerated fine meshes as well as adaptively refined meshes are given.
Metabolomics Reveal Optimal Grain Preprocessing (Milling) toward Rice Koji Fermentation.
Lee, Sunmin; Lee, Da Eun; Singh, Digar; Lee, Choong Hwan
2018-03-21
A time-correlated mass spectrometry (MS)-based metabolic profiling was performed for rice koji made using the substrates with varying degrees of milling (DOM). Overall, 67 primary and secondary metabolites were observed as significantly discriminant among different samples. Notably, a higher abundance of carbohydrate (sugars, sugar alcohols, organic acids, and phenolic acids) and lipid (fatty acids and lysophospholipids) derived metabolites with enhanced hydrolytic enzyme activities were observed for koji made with DOM of 5-7 substrates at 36 h. The antioxidant secondary metabolites (flavonoids and phenolic acid) were relatively higher in koji with DOM of 0 substrates, followed by DOM of 5 > DOM of 7 > DOM of 9 and 11 at 96 h. Hence, we conjecture that the rice substrate preprocessing between DOM of 5 and 7 was potentially optimal toward koji fermentation, with the end product being rich in distinctive organoleptic, nutritional, and functional metabolites. The study rationalizes the substrate preprocessing steps vital for commercial koji making.
A real time mobile-based face recognition with fisherface methods
NASA Astrophysics Data System (ADS)
Arisandi, D.; Syahputra, M. F.; Putri, I. L.; Purnamawati, S.; Rahmat, R. F.; Sari, P. P.
2018-03-01
Face recognition is a research field in computer vision that studies how to learn faces and determine a person's identity from a picture sent to the system. By utilizing face recognition technology, learning the identities of fellow students at a university becomes simpler: a student no longer needs to browse the student directory on the university's server site looking for a person with certain facial traits. To achieve this goal, the face recognition application uses image processing methods consisting of two phases, a pre-processing phase and a recognition phase. In the pre-processing phase, the system converts the input image into the best possible image for the recognition phase; the purpose is to reduce noise and enhance the signal in the image. For the recognition phase, we use the Fisherface method, chosen because it performs well even with limited data. Experiments show that the accuracy of face recognition using Fisherface is 90%.
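The Fisherface recipe can be sketched as PCA to reduce dimensionality (avoiding a singular within-class scatter matrix), then Fisher's linear discriminant, then nearest-neighbour matching. The Olivetti faces are a stand-in dataset (downloaded on first use), not the authors' data, and the component counts are illustrative.

```python
# Fisherface-style pipeline: PCA -> LDA -> 1-NN matching.
from sklearn.datasets import fetch_olivetti_faces
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

faces = fetch_olivetti_faces()
Xtr, Xte, ytr, yte = train_test_split(
    faces.data, faces.target, test_size=0.2, stratify=faces.target,
    random_state=0)

fisherface = make_pipeline(
    PCA(n_components=100, whiten=True),      # avoid singular scatter matrix
    LinearDiscriminantAnalysis(),            # project to C-1 discriminants
    KNeighborsClassifier(n_neighbors=1))
fisherface.fit(Xtr, ytr)
print(fisherface.score(Xte, yte))            # recognition accuracy
```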
The Effects of Pre-processing Strategies for Pediatric Cochlear Implant Recipients
Rakszawski, Bernadette; Wright, Rose; Cadieux, Jamie H.; Davidson, Lisa S.; Brenner, Christine
2016-01-01
Background Cochlear implants (CIs) have been shown to improve children’s speech recognition over traditional amplification when severe to profound sensorineural hearing loss is present. Despite improvements, understanding speech at low-level intensities or in the presence of background noise remains difficult. In an effort to improve speech understanding in challenging environments, Cochlear Ltd. offers pre-processing strategies that apply various algorithms prior to mapping the signal to the internal array. Two of these strategies include Autosensitivity Control™ (ASC) and Adaptive Dynamic Range Optimization (ADRO®). Based on previous research, the manufacturer’s default pre-processing strategy for pediatrics’ everyday programs combines ASC+ADRO®. Purpose The purpose of this study is to compare pediatric speech perception performance across various pre-processing strategies while applying a specific programming protocol utilizing increased threshold (T) levels to ensure access to very low-level sounds. Research Design This was a prospective, cross-sectional, observational study. Participants completed speech perception tasks in four pre-processing conditions: no pre-processing, ADRO®, ASC, ASC+ADRO®. Study Sample Eleven pediatric Cochlear Ltd. cochlear implant users were recruited: six bilateral, one unilateral, and four bimodal. Intervention Four programs, with the participants’ everyday map, were loaded into the processor with different pre-processing strategies applied in each of the four positions: no pre-processing, ADRO®, ASC, and ASC+ADRO®. Data Collection and Analysis Participants repeated CNC words presented at 50 and 70 dB SPL in quiet and HINT sentences presented adaptively with competing R-Space noise at 60 and 70 dB SPL. Each measure was completed as participants listened with each of the four pre-processing strategies listed above. Test order and condition were randomized. A repeated-measures analysis of variance (ANOVA) was used to compare each pre-processing strategy across group data. Critical differences were utilized to determine significant score differences between each pre-processing strategy for individual participants. Results For CNC words presented at 50 dB SPL, the group data revealed significantly better scores using ASC+ADRO® compared to all other pre-processing conditions while ASC resulted in poorer scores compared to ADRO® and ASC+ADRO®. Group data for HINT sentences presented in 70 dB SPL of R-Space noise revealed significantly improved scores using ASC and ASC+ADRO® compared to no pre-processing, with ASC+ADRO® scores being better than ADRO® alone scores. Group data for CNC words presented at 70 dB SPL and adaptive HINT sentences presented in 60 dB SPL of R-Space noise showed no significant difference among conditions. Individual data showed that the pre-processing strategy yielding the best scores varied across measures and participants. Conclusions Group data reveals an advantage with ASC+ADRO® for speech perception presented at lower levels and in higher levels of background noise. Individual data revealed that the optimal pre-processing strategy varied among participants; indicating that a variety of pre-processing strategies should be explored for each CI user considering his or her performance in challenging listening environments. PMID:26905529
EARLINET Single Calculus Chain - technical - Part 1: Pre-processing of raw lidar data
NASA Astrophysics Data System (ADS)
D'Amico, G.; Amodeo, A.; Mattis, I.; Freudenthaler, V.; Pappalardo, G.
2015-10-01
In this paper we describe an automatic tool for the pre-processing of lidar data called ELPP (EARLINET Lidar Pre-Processor). It is one of two calculus modules of the EARLINET Single Calculus Chain (SCC), the automatic tool for the analysis of EARLINET data. The ELPP is an open source module that executes instrumental corrections and data handling of the raw lidar signals, making the lidar data ready to be processed by the optical retrieval algorithms. According to the specific lidar configuration, the ELPP automatically performs dead-time correction, atmospheric and electronic background subtraction, gluing of lidar signals, and trigger-delay correction. Moreover, the signal-to-noise ratio of the pre-processed signals can be improved by means of configurable time integration of the raw signals and/or spatial smoothing. The ELPP delivers the statistical uncertainties of the final products by means of error propagation or Monte Carlo simulations. During the development of the ELPP module, particular attention has been paid to make the tool flexible enough to handle all lidar configurations currently used within the EARLINET community. Moreover, it has been designed in a modular way to allow an easy extension to lidar configurations not yet implemented. The primary goal of the ELPP module is to enable the application of quality-assured procedures in the lidar data analysis starting from the raw lidar data. This provides the added value of full traceability of each delivered lidar product. Several tests have been performed to check the proper functioning of the ELPP module. The whole SCC has been tested with the same synthetic data sets, which were used for the EARLINET algorithm inter-comparison exercise. The ELPP module has been successfully employed for the automatic near-real-time pre-processing of the raw lidar data measured during several EARLINET inter-comparison campaigns as well as during intense field campaigns.
NEEDS - Information Adaptive System
NASA Technical Reports Server (NTRS)
Kelly, W. L.; Benz, H. F.; Meredith, B. D.
1980-01-01
The Information Adaptive System (IAS) is an element of the NASA End-to-End Data System (NEEDS) Phase II and is focused toward onboard image processing. The IAS is a data preprocessing system which is closely coupled to the sensor system. Some of the functions planned for the IAS include sensor response nonuniformity correction, geometric correction, data set selection, data formatting, packetization, and adaptive system control. The inclusion of these sensor data preprocessing functions onboard the spacecraft will significantly improve the extraction of information from the sensor data in a timely and cost effective manner, and provide the opportunity to design sensor systems which can be reconfigured in near real-time for optimum performance. The purpose of this paper is to present the preliminary design of the IAS and the plans for its development.
Cuadros-Inostroza, Alvaro; Caldana, Camila; Redestig, Henning; Kusano, Miyako; Lisec, Jan; Peña-Cortés, Hugo; Willmitzer, Lothar; Hannah, Matthew A
2009-12-16
Metabolite profiling, the simultaneous quantification of multiple metabolites in an experiment, is becoming increasingly popular, particularly with the rise of systems-level biology. The workhorse in this field is gas-chromatography hyphenated with mass spectrometry (GC-MS). The high-throughput of this technology coupled with a demand for large experiments has led to data pre-processing, i.e. the quantification of metabolites across samples, becoming a major bottleneck. Existing software has several limitations, including restricted maximum sample size, systematic errors and low flexibility. However, the biggest limitation is that the resulting data usually require extensive hand-curation, which is subjective and can typically take several days to weeks. We introduce the TargetSearch package, an open source tool which is a flexible and accurate method for pre-processing even very large numbers of GC-MS samples within hours. We developed a novel strategy to iteratively correct and update retention time indices for searching and identifying metabolites. The package is written in the R programming language with computationally intensive functions written in C for speed and performance. The package includes a graphical user interface to allow easy use by those unfamiliar with R. TargetSearch allows fast and accurate data pre-processing for GC-MS experiments and overcomes the sample number limitations and manual curation requirements of existing software. We validate our method by carrying out an analysis against both a set of known chemical standard mixtures and of a biological experiment. In addition we demonstrate its capabilities and speed by comparing it with other GC-MS pre-processing tools. We believe this package will greatly ease current bottlenecks and facilitate the analysis of metabolic profiling data.
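The retention-index idea at the heart of the matching step can be shown in one move (a generic sketch, not TargetSearch's iterative updater): observed retention times of known marker compounds define a piecewise-linear map from time to index, which is then applied to every detected peak so spectra can be matched across samples. All retention times and indices below are invented.

```python
# Piecewise-linear retention-time -> retention-index mapping via markers.
import numpy as np

marker_rt = np.array([312.0, 487.5, 805.2, 1210.8])    # observed marker RTs (s)
marker_ri = np.array([1000.0, 1200.0, 1600.0, 2200.0]) # their nominal indices

peak_rt = np.array([350.4, 620.9, 1100.0])             # detected peaks
peak_ri = np.interp(peak_rt, marker_rt, marker_ri)     # assign indices
print(peak_ri)
```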
Human Response to Simulated Low-Intensity Sonic Booms
NASA Technical Reports Server (NTRS)
Sullivan, Brenda M.
2004-01-01
NASA's High Speed Research (HSR) program in the 1990s was intended to develop a technology base for a future High-Speed Civil Transport (HSCT). As part of this program, the NASA Langley Research Center sonic boom simulator (SBS) was built and used for a series of tests on subjective response to sonic booms. At the end of the HSR program, an HSCT was deemed impractical, but since then interest in supersonic flight has reawakened, this time focusing on a smaller aircraft suitable for a business jet. To respond to this interest, the Langley sonic boom simulator has been refurbished. The upgraded computer-controlled playback system is based on an SGI O2 computer, in place of the previous DEC MicroVAX. As the frequency response of the booth is not flat, an equalization filter is required. Because of the changes made during the renovation (new loudspeakers), the previous equalization filter no longer performed as well as before, so a new equalization filter has been designed. Booms to be presented in the booth are preprocessed using the filter. When the preprocessed signals are presented in the booth and measured with a microphone, the results are very similar to the intended shapes. Signals with short rise times and sharp "corners" are observed to have a small amount of "ringing" in the response. During the HSR program a considerable number of subjective tests were completed in the SBS. A summary of that research is given in Leatherwood et al. (Individual reports are available at http://techreports.larc.nasa.gov/ltrs/ltrs.html.) Topics of study included shaped sonic booms, asymmetrical booms, realistic (recorded) boom waveforms, and indoor and outdoor boom shapes, among other factors. One conclusion of that research was that a loudness metric, like the Stevens Perceived Level (PL), predicted human reaction much more accurately than overpressure or unweighted sound pressure level. Structural vibration and rattle were not included in these studies.
Time-Of-Flight Camera, Optical Tracker and Computed Tomography in Pairwise Data Registration
Badura, Pawel; Juszczyk, Jan; Pietka, Ewa
2016-01-01
Purpose A growing number of medical applications, including minimally invasive surgery, depend on multi-modal or multi-sensor data processing. Fast and accurate 3D scene analysis, comprising data registration, seems to be crucial for the development of computer aided diagnosis and therapy. The advancement of surface tracking systems based on optical trackers already plays an important role in surgical procedure planning. However, new modalities, like the time-of-flight (ToF) sensors widely explored in non-medical fields, are powerful and have the potential to become a part of the computer aided surgery set-up. Connecting different acquisition systems promises to provide valuable support for operating room procedures. Therefore, a detailed analysis of the accuracy of such multi-sensor positioning systems is needed. Methods We present a system combining pre-operative CT series with intra-operative ToF-sensor and optical tracker point clouds. The methodology contains: optical sensor set-up and ToF-camera calibration procedures, data pre-processing algorithms, and a registration technique. The data pre-processing yields a surface, in the case of CT, and point clouds for the ToF-sensor and marker-driven optical tracker representations of an object of interest. The applied registration technique is based on the Iterative Closest Point algorithm. Results The experiments validate the registration of each pair of modalities/sensors involving phantoms of four different human organs in terms of Hausdorff distance and mean absolute distance metrics. The best surface alignment was obtained for the CT and optical tracker combination, whereas the worst was obtained for experiments involving the ToF-camera. Conclusion The obtained accuracies encourage further development of multi-sensor systems. The presented substantive discussion concerning the system limitations and possible improvements, mainly related to the depth information produced by the ToF-sensor, is useful for computer aided surgery developers. PMID:27434396
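For readers unfamiliar with the registration technique named above, here is a minimal single-iteration sketch of Iterative Closest Point (nearest-neighbour matching followed by a least-squares rigid transform); it assumes two NumPy point clouds of shape (n, 3) and is not the authors' implementation.

```python
import numpy as np
from scipy.spatial import cKDTree

def icp_step(source, target):
    """One ICP iteration: match each source point to its nearest target
    point, then solve for the best rigid transform (Kabsch algorithm)."""
    idx = cKDTree(target).query(source)[1]
    matched = target[idx]
    mu_s, mu_t = source.mean(axis=0), matched.mean(axis=0)
    H = (source - mu_s).T @ (matched - mu_t)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:  # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = mu_t - R @ mu_s
    return source @ R.T + t, R, t  # iterate until convergence
```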
On the estimation of phase synchronization, spurious synchronization and filtering
NASA Astrophysics Data System (ADS)
Rios Herrera, Wady A.; Escalona, Joaquín; Rivera López, Daniel; Müller, Markus F.
2016-12-01
Phase synchronization, viz., the adjustment of the instantaneous frequencies of two interacting self-sustained nonlinear oscillators, is frequently used for the detection of a possible interrelationship between empirical data recordings. In this context, the proper estimation of the instantaneous phase from a time series is a crucial aspect. The probability that numerical estimates provide a physically relevant meaning depends sensitively on the shape of the power spectral density. For this purpose, the power spectrum should be narrow-banded, possessing only one prominent peak [M. Chavez et al., J. Neurosci. Methods 154, 149 (2006)]. If this condition is not fulfilled, band-pass filtering seems to be the adequate technique for pre-processing data for a subsequent synchronization analysis. However, it was reported that band-pass filtering might induce spurious synchronization [L. Xu et al., Phys. Rev. E 73, 065201(R) (2006); J. Sun et al., Phys. Rev. E 77, 046213 (2008); and J. Wang and Z. Liu, EPL 102, 10003 (2013)], a statement that, without further specification, casts uncertainty over all measures that aim to quantify phase synchronization of broadband field data. We show by using signals derived from different test frameworks that appropriate filtering does not induce spurious synchronization. Instead, filtering in the time domain tends to wash out existent phase interrelations between signals. Furthermore, we show that measures derived for the estimation of phase synchronization, like the mean phase coherence, are also useful for the detection of interrelations between time series which are not necessarily derived from coupled self-sustained nonlinear oscillators.
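A compact sketch of the mean phase coherence mentioned above, assuming narrow-band filtering with a Butterworth band-pass and phase extraction via the analytic signal; the band edges and filter order are illustrative, not those of the study.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def mean_phase_coherence(x, y, fs, band=(8.0, 12.0), order=4):
    """R = |<exp(i*(phi_x - phi_y))>|, in [0, 1]; R = 1 means perfect
    phase locking between the two band-limited signals."""
    b, a = butter(order, [f / (fs / 2.0) for f in band], btype="band")
    phi_x = np.angle(hilbert(filtfilt(b, a, x)))
    phi_y = np.angle(hilbert(filtfilt(b, a, y)))
    return np.abs(np.mean(np.exp(1j * (phi_x - phi_y))))
```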
Design and implementation of a preprocessing system for a sodium lidar
NASA Technical Reports Server (NTRS)
Voelz, D. G.; Sechrist, C. F., Jr.
1983-01-01
A preprocessing system, designed and constructed for use with the University of Illinois sodium lidar system, was developed to increase the altitude resolution and range of the lidar system and also to decrease the processing burden of the main lidar computer. The preprocessing system hardware and the software required to implement the system are described. Some preliminary results of an airborne sodium lidar experiment conducted with the preprocessing system installed in the sodium lidar are presented.
Data preprocessing method for liquid chromatography-mass spectrometry based metabolomics.
Wei, Xiaoli; Shi, Xue; Kim, Seongho; Zhang, Li; Patrick, Jeffrey S; Binkley, Joe; McClain, Craig; Zhang, Xiang
2012-09-18
A set of data preprocessing algorithms for peak detection and peak list alignment are reported for analysis of liquid chromatography-mass spectrometry (LC-MS)-based metabolomics data. For spectrum deconvolution, peak picking is achieved at the selected ion chromatogram (XIC) level. To estimate and remove the noise in XICs, each XIC is first segmented into several peak groups based on the continuity of scan number, and the noise level is estimated by all the XIC signals, except the regions potentially with presence of metabolite ion peaks. After removing noise, the peaks of molecular ions are detected using both the first and the second derivatives, followed by an efficient exponentially modified Gaussian-based peak deconvolution method for peak fitting. A two-stage alignment algorithm is also developed, where the retention times of all peaks are first transferred into the z-score domain and the peaks are aligned based on the measure of their mixture scores after retention time correction using a partial linear regression. Analysis of a set of spike-in LC-MS data from three groups of samples containing 16 metabolite standards mixed with metabolite extract from mouse livers demonstrates that the developed data preprocessing method performs better than two of the existing popular data analysis packages, MZmine2.6 and XCMS2, for peak picking, peak list alignment, and quantification.
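To make the peak-fitting step concrete, here is a hedged sketch of fitting a single XIC peak with an exponentially modified Gaussian; the synthetic data and starting parameters are illustrative, and the authors' deconvolution is more elaborate.

```python
import numpy as np
from scipy.special import erfc
from scipy.optimize import curve_fit

def emg(t, area, mu, sigma, lam):
    """Exponentially modified Gaussian peak shape."""
    arg = (lam / 2.0) * (2.0 * mu + lam * sigma**2 - 2.0 * t)
    z = (mu + lam * sigma**2 - t) / (np.sqrt(2.0) * sigma)
    return area * (lam / 2.0) * np.exp(arg) * erfc(z)

# Fit one synthetic chromatographic peak in an XIC segment.
t = np.linspace(0.0, 60.0, 300)
y = emg(t, 1e5, 30.0, 2.0, 0.3) + np.random.normal(0.0, 50.0, t.size)
popt, _ = curve_fit(emg, t, y, p0=[1e5, 30.0, 2.0, 0.3])
```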
NASA Technical Reports Server (NTRS)
Cangahuala, L.; Drain, T. R.
1999-01-01
At present, ground navigation support for interplanetary spacecraft requires human intervention for data pre-processing, filtering, and post-processing activities; these actions must be repeated each time a new batch of data is collected by the ground data system.
Practical Algorithms for the Longest Common Extension Problem
NASA Astrophysics Data System (ADS)
Ilie, Lucian; Tinta, Liviu
The Longest Common Extension problem considers a string s and computes, for each of a number of pairs (i,j), the longest substring of s that starts at both i and j. It appears as a subproblem in many fundamental string problems and can be solved by linear-time preprocessing of the string that allows (worst-case) constant-time computation for each pair. The two known approaches use powerful algorithms: either constant-time computation of the Lowest Common Ancestor in trees or constant-time computation of Range Minimum Queries (RMQ) in arrays. We show here that, from a practical point of view, such complicated approaches are not needed. We give two very simple algorithms for this problem that require no preprocessing. The first needs only the string and is significantly faster than all previous algorithms on average. The second combines the first with a direct RMQ computation on the Longest Common Prefix array. It takes advantage of the superior speed of the cache memory and is the fastest on virtually all inputs.
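The first, preprocessing-free algorithm is essentially a direct character-by-character scan; a minimal sketch (not the authors' optimized code):

```python
def lce(s: str, i: int, j: int) -> int:
    """Longest common extension of s at positions i and j by direct
    comparison -- no preprocessing of s is required."""
    k, n = 0, len(s)
    while i + k < n and j + k < n and s[i + k] == s[j + k]:
        k += 1
    return k

assert lce("abracadabra", 0, 7) == 4  # both positions start "abra"
```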
An Automated, Adaptive Framework for Optimizing Preprocessing Pipelines in Task-Based Functional MRI
Churchill, Nathan W.; Spring, Robyn; Afshin-Pour, Babak; Dong, Fan; Strother, Stephen C.
2015-01-01
BOLD fMRI is sensitive to blood-oxygenation changes correlated with brain function; however, it is limited by relatively weak signal and significant noise confounds. Many preprocessing algorithms have been developed to control noise and improve signal detection in fMRI. Although the chosen set of preprocessing and analysis steps (the "pipeline") significantly affects signal detection, pipelines are rarely quantitatively validated in the neuroimaging literature, due to complex preprocessing interactions. This paper outlines and validates an adaptive resampling framework for evaluating and optimizing preprocessing choices by optimizing data-driven metrics of task prediction and spatial reproducibility. Compared to standard "fixed" preprocessing pipelines, this optimization approach significantly improves independent validation measures of within-subject test-retest reliability, between-subject activation overlap, and behavioural prediction accuracy. We demonstrate that preprocessing choices function as implicit model regularizers, and that improvements due to pipeline optimization generalize across a range of simple to complex experimental tasks and analysis models. Results are shown for brief scanning sessions (<3 minutes each), demonstrating that with pipeline optimization, it is possible to obtain reliable results and brain-behaviour correlations in relatively small datasets. PMID:26161667
DPPP: Default Pre-Processing Pipeline
NASA Astrophysics Data System (ADS)
van Diepen, Ger; Dijkema, Tammo Jan
2018-04-01
DPPP (Default Pre-Processing Pipeline, also referred to as NDPPP) reads and writes radio-interferometric data in the form of Measurement Sets, mainly those that are created by the LOFAR telescope. It goes through visibilities in time order and contains standard operations like averaging, phase-shifting and flagging bad stations. Between the steps in a pipeline, the data is not written to disk, making this tool suitable for operations where I/O dominates. More advanced procedures such as gain calibration are also included. Other computing steps can be provided by loading a shared library; currently supported external steps are the AOFlagger (ascl:1010.017) and a bridge that enables loading python steps.
Layered recognition networks that pre-process, classify, and describe
NASA Technical Reports Server (NTRS)
Uhr, L.
1971-01-01
A brief overview is presented of six types of pattern recognition programs that: (1) preprocess, then characterize; (2) preprocess and characterize together; (3) preprocess and characterize into a recognition cone; (4) describe as well as name; (5) compose interrelated descriptions; and (6) converse. A computer program (of types 3 through 6) is presented that transforms and characterizes the input scene through the successive layers of a recognition cone, and then engages in a stylized conversation to describe the scene.
NASA Technical Reports Server (NTRS)
Austin, W. W.
1983-01-01
The effects on LANDSAT data of a Sun angle correction, an intersatellite LANDSAT-2 and LANDSAT-3 data range adjustment, and the atmospheric correction algorithm were evaluated. Fourteen 1978 crop year LACIE sites were used as the site data set. The preprocessing techniques were applied to multispectral scanner channel data, and the transformed data were plotted and used to analyze the effectiveness of the preprocessing techniques. Ratio transformations effectively reduce the need for preprocessing techniques to be applied directly to the data. Subtractive transformations are more sensitive to Sun angle and atmospheric corrections than ratios. Preprocessing techniques, other than those applied at the Goddard Space Flight Center, should only be applied at the option of the user. While the study was performed on LANDSAT data, the results are also applicable to meteorological satellite data.
A Feature Fusion Based Forecasting Model for Financial Time Series
Guo, Zhiqiang; Wang, Huaiqing; Liu, Quan; Yang, Jie
2014-01-01
Predicting the stock market has become an increasingly interesting research area for both researchers and investors, and many prediction models have been proposed. In these models, feature selection techniques are used to pre-process the raw data and remove noise. In this paper, a prediction model is constructed to forecast stock market behavior with the aid of independent component analysis, canonical correlation analysis, and a support vector machine. First, two types of features are extracted from the historical closing prices and 39 technical variables obtained by independent component analysis. Second, a canonical correlation analysis method is utilized to combine the two types of features and extract intrinsic features to improve the performance of the prediction model. Finally, a support vector machine is applied to forecast the next day's closing price. The proposed model is applied to the Shanghai stock market index and the Dow Jones index, and experimental results show that the proposed model performs better in prediction than two other similar models. PMID:24971455
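A rough sketch of the described pipeline (ICA feature extraction on each feature group, CCA fusion, then SVR for the next-day price) using scikit-learn on toy data; the component counts and stand-in features are illustrative assumptions, not the paper's settings.

```python
import numpy as np
from sklearn.decomposition import FastICA
from sklearn.cross_decomposition import CCA
from sklearn.svm import SVR

rng = np.random.default_rng(0)
price_feats = rng.normal(size=(500, 10))  # stand-in: lagged closing prices
tech_feats = rng.normal(size=(500, 39))   # stand-in: 39 technical variables
y = rng.normal(size=500)                  # stand-in: next-day closing price

ica_p = FastICA(n_components=5, random_state=0).fit_transform(price_feats)
ica_t = FastICA(n_components=5, random_state=0).fit_transform(tech_feats)
u, v = CCA(n_components=3).fit_transform(ica_p, ica_t)  # fused features
X = np.hstack([u, v])

model = SVR(kernel="rbf").fit(X[:-1], y[:-1])
print(model.predict(X[-1:]))  # forecast for the held-out last day
```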
Contactless physiological signals extraction based on skin color magnification
NASA Astrophysics Data System (ADS)
Suh, Kun Ha; Lee, Eui Chul
2017-11-01
Although the human visual system is not sufficiently sensitive to perceive blood circulation, blood flow caused by cardiac activity makes slight changes on human skin surfaces. With advances in imaging technology, it has become possible to capture these changes through digital cameras. However, it is difficult to obtain clear physiological signals from such changes due to their subtlety and to noise factors, such as motion artifacts and camera sensing disturbances. We propose a method for extracting physiological signals with improved quality from skin-colored videos recorded with a remote RGB camera. The results showed that our skin color magnification method reveals the hidden physiological components remarkably well in the time-series signal. A Korea Food and Drug Administration-approved heart rate monitor was used to verify that the resulting signal is synchronized with the actual cardiac pulse, and comparisons of signal peaks showed correlation coefficients of almost 1.0. In particular, our method can be an effective preprocessing step before applying additional postfiltering techniques to improve accuracy in image-based physiological signal extraction.
The innovative concept of three-dimensional hybrid receptor modeling
NASA Astrophysics Data System (ADS)
Stojić, A.; Stanišić Stojić, S.
2017-09-01
The aim of this study was to improve the current understanding of air pollution transport processes at regional and long-range scales. For this purpose, three-dimensional (3D) potential source contribution function and concentration weighted trajectory models, as well as a new hybrid receptor model, the concentration weighted boundary layer (CWBL), which uses a two-dimensional grid and the planetary boundary layer height as a frame of reference, are presented. The refined approach to hybrid receptor modeling has two advantages. First, it considers whether each trajectory endpoint meets an inclusion criterion based on planetary boundary layer height, which is expected to provide a more realistic representation of the spatial distribution of emission sources and pollutant transport pathways. Second, it includes pollutant time series preprocessing to make hybrid receptor models more applicable to suburban and urban locations. The 3D hybrid receptor models presented herein are designed to identify the altitude distribution of potential sources, whereas the CWBL can be used for analyzing the vertical distribution of pollutant concentrations along the transport pathway.
NASA Technical Reports Server (NTRS)
Duggin, M. J. (Principal Investigator); Piwinski, D.
1982-01-01
The use of NOAA AVHRR data to map and monitor vegetation types and conditions in near real-time can be enhanced by using a portion of each GAC image that is larger than the central 25% now considered. Enlargement of the cloud free image data set can permit development of a series of algorithms for correcting imagery for ground reflectance and for atmospheric scattering anisotropy within certain accuracy limits. Empirical correction algorithms used to normalize digital radiance or VIN data must contain factors for growth stage and for instrument spectral response. While it is not possible to correct for random fluctuations in target radiance, it is possible to estimate the necessary radiance difference between targets in order to provide target discrimination and quantification within predetermined limits of accuracy. A major difficulty lies in the lack of documentation of preprocessing algorithms used on AVHRR digital data.
Artificial intelligence based models for stream-flow forecasting: 2000-2015
NASA Astrophysics Data System (ADS)
Yaseen, Zaher Mundher; El-shafie, Ahmed; Jaafar, Othman; Afan, Haitham Abdulmohsin; Sayl, Khamis Naba
2015-11-01
The use of Artificial Intelligence (AI) has increased since the middle of the 20th century, as seen in its application to a wide range of engineering and science problems. The last two decades, for example, have seen a dramatic increase in the development and application of various types of AI approaches for stream-flow forecasting. Generally speaking, AI has exhibited significant progress in forecasting and modeling non-linear hydrological applications and in capturing the noise complexity in the dataset. This paper explores the state-of-the-art application of AI in stream-flow forecasting, focusing on defining data-driven AI approaches, the advantages of complementary models, as well as the literature and possible future applications in modeling and forecasting stream-flow. The review also identifies the major challenges and opportunities for prospective research, including a new scheme for modeling the inflow, a novel method for preprocessing time series frequency based on Fast Orthogonal Search (FOS) techniques, and Swarm Intelligence (SI) as an optimization approach.
Combined Landsat-8 and Sentinel-2 Burned Area Mapping
NASA Astrophysics Data System (ADS)
Huang, H.; Roy, D. P.; Zhang, H.; Boschetti, L.; Yan, L.; Li, Z.
2017-12-01
Fire products derived from coarse spatial resolution satellite data have become an important source of information for the multiple user communities involved in fire science and applications. The advent of MODIS on NASA's Terra and Aqua satellites enabled systematic production of 500 m global burned area maps. There is, however, an unequivocal demand for systematically generated higher spatial resolution burned area products, in particular to examine the role of small fires for various applications. Moderate spatial resolution contemporaneous satellite data from Landsat-8 and the Sentinel-2A and -2B sensors provide the opportunity for detailed spatial mapping of burned areas. Combined, these polar-orbiting systems provide 10 m to 30 m multi-spectral global coverage more than once every three days. This NASA-funded research presents results from prototyping a combined Landsat-8 Sentinel-2 burned area product. The Landsat-8 and Sentinel-2 pre-processing, the time-series burned area mapping algorithm, and preliminary results and validation using high spatial resolution commercial satellite data over Africa are presented.
Detection and Modeling of High-Dimensional Thresholds for Fault Detection and Diagnosis
NASA Technical Reports Server (NTRS)
He, Yuning
2015-01-01
Many Fault Detection and Diagnosis (FDD) systems use discrete models for detection and reasoning. To obtain categorical values like 'oil pressure too high', analog sensor values need to be discretized using a suitable threshold. Time series of analog and discrete sensor readings are processed and discretized as they come in. This task is usually performed by the 'wrapper code' of the FDD system, together with signal preprocessing and filtering. In practice, selecting the right threshold is very difficult, because it heavily influences the quality of diagnosis. If a threshold causes the alarm to trigger even in nominal situations, false alarms will be the consequence. On the other hand, if the threshold setting does not trigger in case of an off-nominal condition, important alarms might be missed, potentially causing hazardous situations. In this paper, we describe in detail the underlying statistical modeling techniques and algorithm, as well as the Bayesian method for selecting the most likely shape and its parameters. Our approach is illustrated by several examples from the aerospace domain.
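The threshold trade-off described above can be made concrete with a small sketch: discretizing an analog channel and scoring a candidate threshold by its false-alarm and missed-detection rates (variable names are illustrative, not from the paper's system).

```python
import numpy as np

def discretize(series, threshold):
    """Analog sensor series -> categorical signal (True = 'too high')."""
    return series > threshold

def alarm_rates(series, off_nominal, threshold):
    """False-alarm rate on nominal samples and missed-detection rate on
    off-nominal samples, for a candidate threshold.

    off_nominal: boolean array marking ground-truth off-nominal samples."""
    alarms = discretize(series, threshold)
    false_alarm = np.mean(alarms[~off_nominal])
    missed = np.mean(~alarms[off_nominal])
    return false_alarm, missed
```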
Ensemble analyses improve signatures of tumour hypoxia and reveal inter-platform differences
2014-01-01
Background The reproducibility of transcriptomic biomarkers across datasets remains poor, limiting clinical application. We and others have suggested that this is in part caused by differential error structure between datasets, and its incomplete removal by pre-processing algorithms. Methods To test this hypothesis, we systematically assessed the effects of pre-processing on biomarker classification using 24 different pre-processing methods and 15 distinct signatures of tumour hypoxia in 10 datasets (2,143 patients). Results We confirm strong pre-processing effects for all datasets and signatures, and find that these differ between microarray versions. Importantly, exploiting different pre-processing techniques in an ensemble technique improved classification for a majority of signatures. Conclusions Assessing biomarkers using an ensemble of pre-processing techniques shows clear value across multiple diseases, datasets and biomarkers. Importantly, ensemble classification improves biomarkers with initially good results but does not result in spuriously improved performance for poor biomarkers. While further research is required, this approach has the potential to become a standard for transcriptomic biomarkers. PMID:24902696
IceTrendr: a linear time-series approach to monitoring glacier environments using Landsat
NASA Astrophysics Data System (ADS)
Nelson, P.; Kennedy, R. E.; Nolin, A. W.; Hughes, J. M.; Braaten, J.
2017-12-01
Arctic glaciers in Alaska and Canada have experienced some of the greatest ice mass loss of any region in recent decades. A challenge to understanding these changing ecosystems, however, is developing globally-consistent, multi-decadal monitoring of glacier ice. We present a toolset and approach that captures, labels, and maps glacier change for use in climate science, hydrology, and Earth science education using Landsat Time Series (LTS). The core step is "temporal segmentation," wherein a yearly LTS is cleaned using pre-processing steps, converted to a snow/ice index, and then simplified into the salient shape of the change trajectory ("temporal signature") using linear segmentation. Such signatures range from the simple ('stable', 'transition of glacier ice to rock') to more complex multi-year changes like 'transition of glacier ice to debris-covered glacier ice to open water to bare rock to vegetation'. This pilot study demonstrates the potential for interactively mapping, visualizing, and labeling glacier changes. What is truly innovative is that IceTrendr not only maps the changes but also uses expert knowledge to label them, and such labels can be applied to other glaciers exhibiting statistically similar temporal signatures. Our key findings are that the IceTrendr concept and software can provide important functionality for glaciologists and educators interested in studying glacier changes during the Landsat TM timeframe (1984-present). Issues of concern with using dense Landsat time-series approaches for glacier monitoring include the many missing images during the period 1984-1995 and the fact that automated cloud masks are challenged, requiring the user to manually identify cloud-free images. IceTrendr is much more than just a simple "then and now" approach to glacier mapping. This process is a means of integrating the power of computing, remote sensing, and expert knowledge to "tell the story" of glacier changes.
Multitaper Spectral Analysis and Wavelet Denoising Applied to Helioseismic Data
NASA Technical Reports Server (NTRS)
Komm, R. W.; Gu, Y.; Hill, F.; Stark, P. B.; Fodor, I. K.
1999-01-01
Estimates of solar normal mode frequencies from helioseismic observations can be improved by using Multitaper Spectral Analysis (MTSA) to estimate spectra from the time series, then using wavelet denoising of the log spectra. MTSA leads to a power spectrum estimate with reduced variance and better leakage properties than the conventional periodogram. Under the assumption of stationarity and mild regularity conditions, the log multitaper spectrum has a statistical distribution that is approximately Gaussian, so wavelet denoising is asymptotically an optimal method to reduce the noise in the estimated spectra. We find that a single m-ν spectrum benefits greatly from MTSA followed by wavelet denoising, and that wavelet denoising by itself can be used to improve m-averaged spectra. We compare estimates using two different 5-taper estimates (Slepian and sine tapers) and the periodogram estimate, for GONG time series at selected angular degrees l. We compare those three spectra with and without wavelet denoising, both visually and in terms of the mode parameters estimated from the pre-processed spectra using the GONG peak-fitting algorithm. The two multitaper estimates give equivalent results. The number of modes fitted well by the GONG algorithm is 20% to 60% larger (depending on l and the temporal frequency) when applied to the multitaper estimates than when applied to the periodogram. The estimated mode parameters (frequency, amplitude and width) are comparable for the three power spectrum estimates, except for modes with very small mode widths (a few frequency bins), where the multitaper spectra broadened the modes compared with the periodogram. We tested the influence of the number of tapers used and found that narrow modes at low n values are broadened to the extent that they can no longer be fit if the number of tapers is too large. For helioseismic time series of this length and temporal resolution, the optimal number of tapers is less than 10.
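A hedged sketch of the two estimation stages (Slepian multitapering, then soft-threshold wavelet denoising of the log spectrum), using SciPy's DPSS windows and the PyWavelets package; the taper count, bandwidth and wavelet choice are illustrative, not the values used for the GONG analysis.

```python
import numpy as np
from scipy.signal.windows import dpss
import pywt

def multitaper_log_spectrum(x, n_tapers=5, nw=3.0):
    """Average of K Slepian-tapered periodograms, on a log scale."""
    tapers = dpss(len(x), NW=nw, Kmax=n_tapers)
    spectra = [np.abs(np.fft.rfft(t * x)) ** 2 for t in tapers]
    return np.log(np.mean(spectra, axis=0))

def wavelet_denoise(log_spec, wavelet="db4", level=5):
    """Universal soft threshold on the wavelet coefficients."""
    coeffs = pywt.wavedec(log_spec, wavelet, level=level)
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745  # noise scale estimate
    thr = sigma * np.sqrt(2.0 * np.log(len(log_spec)))
    coeffs = [coeffs[0]] + [pywt.threshold(c, thr, "soft") for c in coeffs[1:]]
    return pywt.waverec(coeffs, wavelet)[: len(log_spec)]
```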
Multistep-Ahead Air Passengers Traffic Prediction with Hybrid ARIMA-SVMs Models
Ming, Wei; Xiong, Tao
2014-01-01
The hybrid ARIMA-SVMs prediction models have been established recently, taking advantage of the unique strengths of ARIMA and SVMs models in linear and nonlinear modeling, respectively. Built upon such hybrid ARIMA-SVMs models, this study goes further to extend them to the case of multistep-ahead prediction for air passenger traffic with the two most commonly used multistep-ahead prediction strategies, that is, the iterated strategy and the direct strategy. Additionally, the effectiveness of data preprocessing approaches, such as deseasonalization and detrending, is investigated and demonstrated for both strategies. Real data sets comprising four selected airlines' monthly series were collected to justify the effectiveness of the proposed approach. Empirical results demonstrate that the direct strategy performs better than the iterated one for long-term prediction, while the iterated one performs better for short-term prediction. Furthermore, both deseasonalization and detrending can significantly improve the prediction accuracy for both strategies, indicating the necessity of data preprocessing. As such, this study serves as a reference for planners from the air transportation industry on how to tackle multistep-ahead prediction tasks in the implementation of either prediction strategy. PMID:24723814
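The two strategies compared above differ only in how the multistep forecasts are produced; a minimal sketch with a generic one-step learner (a plain SVR stands in for the hybrid ARIMA-SVMs model) is given below.

```python
import numpy as np
from sklearn.svm import SVR

def make_lags(y, p):
    """Lag design matrix: row t is [y_t, ..., y_{t+p-1}] with target y_{t+p}."""
    X = np.column_stack([y[i:len(y) - p + i] for i in range(p)])
    return X, y[p:]

def direct_forecast(y, p, horizon):
    """Direct strategy: fit a separate model for each horizon h."""
    X, _ = make_lags(y, p)
    preds = []
    for h in range(1, horizon + 1):
        model = SVR().fit(X[: len(y) - p - h + 1], y[p + h - 1:])
        preds.append(model.predict(y[-p:].reshape(1, -1))[0])
    return np.array(preds)

def iterated_forecast(y, p, horizon):
    """Iterated strategy: one one-step model, applied recursively."""
    X, targets = make_lags(y, p)
    model = SVR().fit(X, targets)
    window = list(y[-p:])
    for _ in range(horizon):
        window.append(model.predict(np.array(window[-p:]).reshape(1, -1))[0])
    return np.array(window[p:])
```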
Multi-Temporal Land Cover Classification with Sequential Recurrent Encoders
NASA Astrophysics Data System (ADS)
Rußwurm, Marc; Körner, Marco
2018-03-01
Earth observation (EO) sensors deliver data with daily or weekly temporal resolution. Most land use and land cover (LULC) approaches, however, expect cloud-free and mono-temporal observations. The increasing temporal capabilities of today's sensors enable the use of temporal, along with spectral and spatial, features. Domains such as speech recognition or neural machine translation work with inherently temporal data and, today, achieve impressive results using sequential encoder-decoder structures. Inspired by these sequence-to-sequence models, we adapt an encoder structure with convolutional recurrent layers in order to approximate a phenological model for vegetation classes based on a temporal sequence of Sentinel 2 (S2) images. In our experiments, we visualize internal activations over a sequence of cloudy and non-cloudy images and find several recurrent cells which reduce the input activity for cloudy observations. Hence, we assume that our network has learned cloud-filtering schemes solely from input data, which could alleviate the need for tedious cloud filtering as a preprocessing step for many EO approaches. Moreover, using unfiltered temporal series of top-of-atmosphere (TOA) reflectance data, we achieved in our experiments state-of-the-art classification accuracies on a large number of crop classes with minimal preprocessing, compared to other classification approaches.
NASA Astrophysics Data System (ADS)
Hsu, Kuo-Hsien
2012-11-01
Formosat-2 imagery is a kind of high-spatial-resolution (2 m GSD) remote sensing satellite data, which includes one panchromatic band and four multispectral bands (blue, green, red, near-infrared). An essential step in the daily processing of received Formosat-2 images is to estimate the cloud statistic of each image using the Automatic Cloud Coverage Assessment (ACCA) algorithm. The cloud statistic is subsequently recorded as an important metadata item for the image product catalog. In this paper, we propose an ACCA method with two consecutive stages: pre-processing and post-processing analysis. For the pre-processing analysis, unsupervised K-means classification, Sobel's method, a thresholding method, non-cloudy pixel reexamination, and a cross-band filter method are applied in sequence for cloud statistic determination. For the post-processing analysis, the box-counting fractal method is applied. In other words, the cloud statistic is first determined via the pre-processing analysis, and the correctness of the cloud statistic for the different spectral bands is then cross-examined qualitatively and quantitatively via the post-processing analysis. The selection of an appropriate thresholding method is critical to the result of the ACCA method. Therefore, in this work, we first conduct a series of experiments with clustering-based and spatial thresholding methods, including Otsu's, Local Entropy (LE), Joint Entropy (JE), Global Entropy (GE), and Global Relative Entropy (GRE) methods, for performance comparison. The results show that Otsu's and the GE methods both perform better than the others for Formosat-2 imagery. Additionally, our proposed ACCA method, with Otsu's method selected as the thresholding method, successfully extracts the cloudy pixels of Formosat-2 images for accurate cloud statistic estimation.
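Since Otsu's method came out on top, here is a self-contained sketch of it (choosing the gray level that maximizes the between-class variance of the histogram); the bin count is an illustrative choice.

```python
import numpy as np

def otsu_threshold(image, nbins=256):
    """Return the gray level that maximizes between-class variance."""
    hist, edges = np.histogram(image.ravel(), bins=nbins)
    p = hist.astype(float) / hist.sum()
    centers = 0.5 * (edges[:-1] + edges[1:])
    w0 = np.cumsum(p)            # class-0 probability up to each level
    mu = np.cumsum(p * centers)  # cumulative mean
    mu_t = mu[-1]
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_t * w0 - mu) ** 2 / (w0 * (1.0 - w0))
    return centers[np.nanargmax(sigma_b)]

# e.g. candidate_cloud_mask = band > otsu_threshold(band)
```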
Pre-Processes for Urban Areas Detection in SAR Images
NASA Astrophysics Data System (ADS)
Altay Açar, S.; Bayır, Ş.
2017-11-01
In this study, pre-processes for urban area detection in synthetic aperture radar (SAR) images are examined. These pre-processes are image smoothing, thresholding, and white-coloured region determination. Image smoothing is carried out to remove noise, then thresholding is applied to obtain a binary image. Finally, candidate urban areas are detected by means of white-coloured region determination. All pre-processes are applied using the developed software. Two different SAR images acquired by TerraSAR-X are used in the experimental study. The obtained results are shown visually.
McCarthy, Davis J; Campbell, Kieran R; Lun, Aaron T L; Wills, Quin F
2017-04-15
Single-cell RNA sequencing (scRNA-seq) is increasingly used to study gene expression at the level of individual cells. However, preparing raw sequence data for further analysis is not a straightforward process. Biases, artifacts and other sources of unwanted variation are present in the data, requiring substantial time and effort to be spent on pre-processing, quality control (QC) and normalization. We have developed the R/Bioconductor package scater to facilitate rigorous pre-processing, quality control, normalization and visualization of scRNA-seq data. The package provides a convenient, flexible workflow to process raw sequencing reads into a high-quality expression dataset ready for downstream analysis. scater provides a rich suite of plotting tools for single-cell data and a flexible data structure that is compatible with existing tools and can be used as infrastructure for future software development. The open-source code, along with installation instructions, vignettes and case studies, is available through Bioconductor at http://bioconductor.org/packages/scater. Contact: davis@ebi.ac.uk. Supplementary data are available at Bioinformatics online.
Altimeter waveform software design
NASA Technical Reports Server (NTRS)
Hayne, G. S.; Miller, L. S.; Brown, G. S.
1977-01-01
Techniques are described for preprocessing raw return waveform data from the GEOS-3 radar altimeter. Topics discussed include: (1) general altimeter data preprocessing to be done at the GEOS-3 Data Processing Center to correct altimeter waveform data for temperature calibrations, to convert between engineering and final data units, and to convert telemetered parameter quantities to more appropriate final data distribution values; (2) time "tagging" of altimeter return waveform data quantities to compensate for various delays, misalignments and calculational intervals; (3) data processing procedures for use in estimating spacecraft attitude from altimeter waveform sampling gates; and (4) feasibility of use of a ground-based reflector or transponder to obtain in-flight calibration information on GEOS-3 altimeter performance.
Some practical aspects of lossless and nearly-lossless compression of AVHRR imagery
NASA Technical Reports Server (NTRS)
Hogan, David B.; Miller, Chris X.; Christensen, Than Lee; Moorti, Raj
1994-01-01
Lossless and nearly-lossless compression of Advanced Very High Resolution Radiometer (AVHRR) imagery is evaluated. Several practical issues are analyzed, including: variability of compression over time and among channels, rate-smoothing buffer size, multi-spectral preprocessing of data, day/night handling, and impact on key operational data applications. This analysis is based on a DPCM algorithm employing the Universal Noiseless Coder, which is a candidate for inclusion in many future remote sensing systems. It is shown that compression rates of about 2:1 (daytime) can be achieved with modest buffer sizes (less than or equal to 2.5 Mbytes) and a relatively simple multi-spectral preprocessing step.
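A toy sketch of the DPCM front end (previous-pixel prediction) together with a first-order entropy estimate of the residuals, which bounds what a noiseless coder can achieve on them; this is illustrative only, not the flight algorithm.

```python
import numpy as np

def dpcm_residuals(scanline):
    """Previous-pixel DPCM: first sample plus prediction residuals,
    which would then be handed to the entropy coder."""
    x = scanline.astype(np.int32)
    return np.concatenate(([x[0]], np.diff(x)))

def entropy_bits_per_sample(residuals):
    """First-order entropy: a lower bound on achievable bits/sample."""
    _, counts = np.unique(residuals, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())
```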
NASA Technical Reports Server (NTRS)
Kelly, W. L.; Howle, W. M.; Meredith, B. D.
1980-01-01
The Information Adaptive System (IAS) is an element of the NASA End-to-End Data System (NEEDS) Phase II and is focused toward onboard image processing. Since the IAS is a data preprocessing system which is closely coupled to the sensor system, it serves as a first step in providing a 'Smart' imaging sensor. Some of the functions planned for the IAS include sensor response nonuniformity correction, geometric correction, data set selection, data formatting, packetization, and adaptive system control. The inclusion of these sensor data preprocessing functions onboard the spacecraft will significantly improve the extraction of information from the sensor data in a timely and cost-effective manner and provide the opportunity to design sensor systems which can be reconfigured in near real time for optimum performance. The purpose of this paper is to present the preliminary design of the IAS and the plans for its development.
Ioannidis, Vassilios; van Nimwegen, Erik; Stockinger, Heinz
2016-01-01
ISMARA (ismara.unibas.ch) automatically infers the key regulators and regulatory interactions from high-throughput gene expression or chromatin state data. However, given the large sizes of current next generation sequencing (NGS) datasets, data uploading times are a major bottleneck. Additionally, for proprietary data, users may be uncomfortable with uploading entire raw datasets to an external server. Both these problems could be alleviated by providing a means by which users could pre-process their raw data locally, transferring only a small summary file to the ISMARA server. We developed a stand-alone client application that pre-processes large input files (RNA-seq or ChIP-seq data) on the user's computer for performing ISMARA analysis in a completely automated manner, including uploading of small processed summary files to the ISMARA server. This reduces file sizes by up to a factor of 1000, and upload times from many hours to mere seconds. The client application is available from ismara.unibas.ch/ISMARA/client. PMID:28232860
User data dissemination concepts for earth resources
NASA Technical Reports Server (NTRS)
Davies, R.; Scott, M.; Mitchell, C.; Torbett, A.
1976-01-01
Domestic data dissemination networks for earth-resources data in the 1985-1995 time frame were evaluated. The following topics were addressed: (1) earth-resources data sources and expected data volumes, (2) future user demand in terms of data volume and timeliness, (3) space-to-space and earth point-to-point transmission link requirements and implementation, (4) preprocessing requirements and implementation, (5) network costs, and (6) technological development to support this implementation. This study was parametric in that the data input (supply) was varied by a factor of about fifteen while the user request (demand) was varied by a factor of about nineteen. Correspondingly, the time from observation to delivery to the user was varied. This parametric evaluation was performed by a computer simulation that was based on network alternatives and resulted in preliminary transmission and preprocessing requirements. The earth-resource data sources considered were: shuttle sorties, synchronous satellites (e.g., SEOS), aircraft, and satellites in polar orbits.
Mathijssen, N M C; Sturm, P D; Pilot, P; Bloem, R M; Buma, P; Petit, P L; Schreurs, B W
2013-12-01
With bone impaction grafting, cancellous bone chips made from allograft femoral heads are impacted in a bone defect, which introduces an additional source of infection. The potential benefit of the use of pre-processed bone chips was investigated by comparing the bacterial contamination of bone chips prepared intraoperatively with the bacterial contamination of pre-processed bone chips at different stages in the surgical procedure. To investigate baseline contamination of the bone grafts, specimens were collected during 88 procedures before actual use or preparation of the bone chips: in 44 procedures intraoperatively prepared chips were used (Group A) and in the other 44 procedures pre-processed bone chips were used (Group B). In 64 of these procedures (32 using locally prepared bone chips and 32 using pre-processed bone chips) specimens were also collected later in the procedure to investigate contamination after use and preparation of the bone chips. In total, 8 procedures had one or more positive specimen(s) (12.5 %). Contamination rates were not significantly different between bone chips prepared at the operating theatre and pre-processed bone chips. In conclusion, there was no difference in bacterial contamination between bone chips prepared from whole femoral heads in the operating room and pre-processed bone chips, and therefore, both types of bone allografts are comparable with respect to risk of infection.
A hybrid prognostic model for multistep ahead prediction of machine condition
NASA Astrophysics Data System (ADS)
Roulias, D.; Loutas, T. H.; Kostopoulos, V.
2012-05-01
Prognostics are the future trend in condition based maintenance. In the current framework a data-driven prognostic model is developed. The typical procedure for developing such a model comprises (a) the selection of features which correlate well with the gradual degradation of the machine and (b) the training of a mathematical tool. In this work the data are taken from a laboratory-scale single-stage gearbox under multi-sensor monitoring. Tests monitoring the condition of the gear pair from the healthy state until total breakdown after several days of continuous operation were conducted. After basic pre-processing of the derived data, an indicator that correlated well with the gearbox condition was obtained. Subsequently, the time series is split into a few distinguishable time regions via an intelligent data clustering scheme. Each operating region is modelled with a feed-forward artificial neural network (FFANN) scheme. The performance of the proposed model is tested by applying the system to predict the machine degradation level on unseen data. The results show the plausibility and effectiveness of the model in following the trend of the time series even when a sudden change occurs. Moreover, the model shows the ability to generalise for application to similar mechanical assets.
Sand, Andreas; Kristiansen, Martin; Pedersen, Christian N S; Mailund, Thomas
2013-11-22
Hidden Markov models are widely used for genome analysis as they combine ease of modelling with efficient analysis algorithms. Calculating the likelihood of a model using the forward algorithm has worst-case time complexity linear in the length of the sequence and quadratic in the number of states in the model. For genome analysis, however, the length runs to millions or billions of observations, and when maximising the likelihood hundreds of evaluations are often needed. A time efficient forward algorithm is therefore a key ingredient in an efficient hidden Markov model library. We have built a software library for efficiently computing the likelihood of a hidden Markov model. The library exploits commonly occurring substrings in the input to reuse computations in the forward algorithm. In a pre-processing step our library identifies common substrings and builds a structure over the computations in the forward algorithm which can be reused. This analysis can be saved between uses of the library and is independent of concrete hidden Markov models, so one preprocessing can be used to run a number of different models. Using this library, we achieve up to 78 times shorter wall-clock time for realistic whole-genome analyses with a real and reasonably complex hidden Markov model. In one particular case the analysis was performed in less than 8 minutes, compared to 9.6 hours for the previously fastest library. We have implemented the preprocessing procedure and forward algorithm as a C++ library, zipHMM, with Python bindings for use in scripts. The library is available at http://birc.au.dk/software/ziphmm/.
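For reference, the baseline computation that zipHMM accelerates is the standard (scaled) forward algorithm; a minimal NumPy sketch, independent of the zipHMM API:

```python
import numpy as np

def forward_log_likelihood(pi, A, B, obs):
    """Scaled forward algorithm, O(T * n^2) time.
    pi: initial state distribution (n,); A: transition matrix (n, n);
    B: emission matrix (n, n_symbols); obs: integer observation sequence."""
    alpha = pi * B[:, obs[0]]
    log_lik = np.log(alpha.sum())
    alpha /= alpha.sum()
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
        s = alpha.sum()
        log_lik += np.log(s)
        alpha /= s  # rescale to avoid underflow on long sequences
    return log_lik
```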
NASA Astrophysics Data System (ADS)
Wunderle, S.; Lieberherr, G.; Riffler, M.
2016-12-01
Data analyses of recent years have shown an increase in lake surface water temperature for many lakes around the world. However, because in-situ measurements are sparse and often poorly documented, only satellite data can provide the needed information for the last decades. The importance of lakes for climate research was also highlighted by the Global Climate Observing System (GCOS), which defines lakes as Essential Climate Variables (ECVs). Within the frame of a research project funded by the Swiss National Science Foundation, a procedure was developed to retrieve lake surface water temperature (LSWT) with high accuracy based on our archived AVHRR data at the University of Bern, Switzerland. The data archive starts in 1985 and is continuously filled with NOAA-/MetOp-AVHRR data received by our antenna, resulting in a time series of more than 30 years (the WMO definition of a climate period). The data set covering Europe is also used by other teams for climate-related studies, resulting in improved pre-processing to guarantee precise calibration and geocoding. The first part of our presentation is dedicated to the quality of the LSWT retrieval, comparing various in-situ measurements from lakes in Switzerland of varying sizes (150 km2 to 9 km2). The quality of the split-window approach used is sensitive to the derived split-window coefficients. The influence of water vapor, view angle, temporal and spatial validity, and day vs. night data is shown. In addition, some information is presented about the influence of topography and climatic regions (e.g., Scandinavia vs. Greece) on the quality of the LSWT product. Based on these findings, compiling time series for different lakes in Europe is the focus of the second part of our presentation, with details of the applied quality assessment to avoid erroneous signals. Hence, some information is given about the hierarchical quality checks which are needed to guarantee a dataset without artefacts. Finally, some results of time series are presented to show the response of different lakes (size, depth) to climate forcing. The lakes are selected to be representative of different climatic regions in Europe (northern vs. southern Europe, etc.). At the end of the project the data set will be accessible to the public.
NASA Astrophysics Data System (ADS)
Akhbardeh, Alireza; Junnila, Sakari; Koivuluoma, Mikko; Koivistoinen, Teemu; Värri, Alpo
2006-12-01
As we know, singular value decomposition (SVD) is designed for computing the singular values (SVs) of a matrix. Thus, if it is used to find the SVs of an N-by-1 or 1-by-N array whose elements represent samples of a signal, it will return only one singular value, which is not enough to express the whole signal. To overcome this problem, we designed a new kind of feature extraction method which we call 'time-frequency moments singular value decomposition (TFM-SVD)'. In this new method, we use statistical features of the time series as well as of the frequency series (the Fourier transform of the signal). This information is extracted into a matrix with a fixed structure and the SVs of that matrix are sought. This transform can be used as a preprocessing stage in pattern clustering methods. The results of using it indicate that the performance of a combined system including this transform and classifiers is comparable with the performance of using other feature extraction methods such as wavelet transforms. To evaluate TFM-SVD, we applied this new method and artificial neural networks (ANNs) to ballistocardiogram (BCG) data clustering to look for probable heart disease in six test subjects. BCG from the test subjects was recorded using a chair-like ballistocardiograph developed in our project. This kind of device, combined with automated recording and analysis, would be suitable for use in many places, such as the home or office. The results show that the method has high performance and is almost insensitive to BCG waveform latency or nonlinear disturbance.
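A loose sketch of the TFM-SVD idea (the exact matrix layout in the paper may differ): stack a few statistical moments of the time series and of its magnitude spectrum into a small fixed-size matrix and keep its singular values as features.

```python
import numpy as np
from scipy.stats import skew, kurtosis

def tfm_svd_features(x):
    """Singular values of a 2x4 matrix of time- and frequency-domain
    moments -- a fixed-length feature vector for any signal length.
    The choice of moments here is an illustrative assumption."""
    f = np.abs(np.fft.rfft(x))  # magnitude spectrum
    M = np.array([
        [x.mean(), x.std(), skew(x), kurtosis(x)],
        [f.mean(), f.std(), skew(f), kurtosis(f)],
    ])
    return np.linalg.svd(M, compute_uv=False)
```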
An Interactive Tool For Semi-automated Statistical Prediction Using Earth Observations and Models
NASA Astrophysics Data System (ADS)
Zaitchik, B. F.; Berhane, F.; Tadesse, T.
2015-12-01
We developed a semi-automated statistical prediction tool applicable to concurrent analysis or seasonal prediction of any time series variable in any geographic location. The tool was developed using Shiny, JavaScript, HTML and CSS. A user can extract a predictand by drawing a polygon over a region of interest on the provided user interface (global map). The user can select the Climatic Research Unit (CRU) precipitation or Climate Hazards Group InfraRed Precipitation with Station data (CHIRPS) as predictand. They can also upload their own predictand time series. Predictors can be extracted from sea surface temperature, sea level pressure, winds at different pressure levels, air temperature at various pressure levels, and geopotential height at different pressure levels. By default, reanalysis fields are applied as predictors, but the user can also upload their own predictors, including a wide range of compatible satellite-derived datasets. The package generates correlations of the variables selected with the predictand. The user also has the option to generate composites of the variables based on the predictand. Next, the user can extract predictors by drawing polygons over the regions that show strong correlations (composites). Then, the user can select some or all of the statistical prediction models provided. Provided models include Linear Regression models (GLM, SGLM), Tree-based models (bagging, random forest, boosting), Artificial Neural Network, and other non-linear models such as Generalized Additive Model (GAM) and Multivariate Adaptive Regression Splines (MARS). Finally, the user can download the analysis steps they used, such as the region they selected, the time period they specified, the predictand and predictors they chose and preprocessing options they used, and the model results in PDF or HTML format. Key words: Semi-automated prediction, Shiny, R, GLM, ANN, RF, GAM, MARS
2013-01-01
Background Matching pursuit algorithm (MP), especially with recent multivariate extensions, offers unique advantages in analysis of EEG and MEG. Methods We propose a novel construction of an optimal Gabor dictionary, based upon the metrics introduced in this paper. We implement this construction in a freely available software for MP decomposition of multivariate time series, with a user friendly interface via the Svarog package (Signal Viewer, Analyzer and Recorder On GPL, http://braintech.pl/svarog), and provide a hands-on introduction to its application to EEG. Finally, we describe numerical and mathematical optimizations used in this implementation. Results Optimal Gabor dictionaries, based on the metric introduced in this paper, for the first time allowed for a priori assessment of maximum one-step error of the MP algorithm. Variants of multivariate MP, implemented in the accompanying software, are organized according to the mathematical properties of the algorithms, relevant in the light of EEG/MEG analysis. Some of these variants have been successfully applied to both multichannel and multitrial EEG and MEG in previous studies, improving preprocessing for EEG/MEG inverse solutions and parameterization of evoked potentials in single trials; we mention also ongoing work and possible novel applications. Conclusions Mathematical results presented in this paper improve our understanding of the basics of the MP algorithm. Simple introduction of its properties and advantages, together with the accompanying stable and user-friendly Open Source software package, pave the way for a widespread and reproducible analysis of multivariate EEG and MEG time series and novel applications, while retaining a high degree of compatibility with the traditional, visual analysis of EEG. PMID:24059247
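For orientation, a single-channel greedy matching pursuit over a precomputed dictionary of unit-norm atoms looks as follows; the multivariate variants and the optimal Gabor dictionary construction of the paper are beyond this sketch.

```python
import numpy as np

def matching_pursuit(x, dictionary, n_iter=10):
    """Greedy MP: repeatedly pick the atom (row of `dictionary`, each
    assumed unit-norm) with the largest inner product with the residual
    and subtract its contribution."""
    residual = x.astype(float).copy()
    atoms, amps = [], []
    for _ in range(n_iter):
        products = dictionary @ residual
        k = int(np.argmax(np.abs(products)))
        atoms.append(k)
        amps.append(products[k])
        residual -= products[k] * dictionary[k]
    return atoms, amps, residual
```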
Evaluation of a Stereo Music Preprocessing Scheme for Cochlear Implant Users.
Buyens, Wim; van Dijk, Bas; Moonen, Marc; Wouters, Jan
2018-01-01
Although for most cochlear implant (CI) users good speech understanding is reached (at least in quiet environments), the perception and the appraisal of music are generally unsatisfactory. The improvement in music appraisal was evaluated in CI participants by using a stereo music preprocessing scheme implemented on a take-home device, in a comfortable listening environment. The preprocessing allowed adjusting the balance among vocals/bass/drums and other instruments, and was evaluated for different genres of music. The correlation between the preferred settings and the participants' speech and pitch detection performance was investigated. During the initial visit preceding the take-home test, the participants' speech-in-noise perception and pitch detection performance were measured, and a questionnaire about their music involvement was completed. The take-home device was provided, including the stereo music preprocessing scheme and seven playlists with six songs each. The participants were asked to adjust the balance by means of a turning wheel to make the music sound most enjoyable, and to repeat this three times for all songs. Twelve postlingually deafened CI users participated in the study. The data were collected by means of a take-home device, which preserved all the preferred settings for the different songs. Statistical analysis was done with a Friedman test (with post hoc Wilcoxon signed-rank test) to check the effect of "Genre." The correlations were investigated with Pearson's and Spearman's correlation coefficients. All participants preferred a balance significantly different from the original balance. Differences across participants were observed which could not be explained by perceptual abilities. An effect of "Genre" was found, showing significantly smaller preferred deviation from the original balance for Golden Oldies compared to the other genres. The stereo music preprocessing scheme showed an improvement in music appraisal with complex music and hence might be a good tool for music listening, training, or rehabilitation for CI users. American Academy of Audiology
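The scheme itself is specific to the study, but a common way to realize a vocals/bass/drums-versus-other-instruments balance control is a mid/side decomposition, since those parts are typically center-panned. The sketch below is a hedged stand-in under that assumption, not the authors' actual algorithm.

```python
# Mid/side balance sketch: w trades the center image (assumed to carry
# vocals/bass/drums) against the sides (assumed to carry other instruments).
import numpy as np

def adjust_balance(left, right, w):
    """w in [0, 1]: 0 keeps only the center, 1 keeps only the sides."""
    mid = 0.5 * (left + right)
    side = 0.5 * (left - right)
    return (1 - w) * mid + w * side, (1 - w) * mid - w * side

# Usage on a dummy stereo signal (2 s at 44.1 kHz).
fs = 44100
t = np.linspace(0, 2, 2 * fs, endpoint=False)
left = np.sin(2 * np.pi * 440 * t)
right = 0.8 * np.sin(2 * np.pi * 440 * t)
out_l, out_r = adjust_balance(left, right, w=0.3)
```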
Weiskopf, Nikolaus; Veit, Ralf; Erb, Michael; Mathiak, Klaus; Grodd, Wolfgang; Goebel, Rainer; Birbaumer, Niels
2003-07-01
A brain-computer interface (BCI) based on real-time functional magnetic resonance imaging (fMRI) is presented which allows human subjects to observe and control changes of their own blood oxygen level-dependent (BOLD) response. This BCI performs data preprocessing (including linear trend removal, 3D motion correction) and statistical analysis on-line. Local BOLD signals are continuously fed back to the subject in the magnetic resonance scanner with a delay of less than 2 s from image acquisition. The mean signal of a region of interest is plotted as a time-series superimposed on color-coded stripes which indicate the task, i.e., to increase or decrease the BOLD signal. We exemplify the presented BCI with one volunteer intending to control the signal of the rostral-ventral and dorsal part of the anterior cingulate cortex (ACC). The subject achieved significant changes of local BOLD responses as revealed by region of interest analysis and statistical parametric maps. The percent signal change increased across fMRI-feedback sessions suggesting a learning effect with training. This methodology of fMRI-feedback can assess voluntary control of circumscribed brain areas. As a further extension, behavioral effects of local self-regulation become accessible as a new field of research.
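The core online computation (ROI averaging, linear trend removal, feedback value) can be sketched in a few lines; the data and ROI below are toy placeholders, and real use would follow online 3D motion correction.

```python
# Sketch: detrend the ROI mean time series of motion-corrected volumes and
# express the newest volume as percent signal change, as fed back to the subject.
import numpy as np
from scipy.signal import detrend

rng = np.random.default_rng(0)
volumes = rng.normal(100, 1, size=(60, 4, 4, 4))    # 60 scans, toy volume grid
roi_mask = np.zeros((4, 4, 4), bool)
roi_mask[1:3, 1:3, 1:3] = True                      # hypothetical ACC-like ROI

roi_series = volumes[:, roi_mask].mean(axis=1)      # one mean value per scan
trendless = detrend(roi_series)                     # linear trend removal
pct_change = 100 * trendless[-1] / roi_series.mean()
print(f"current BOLD feedback: {pct_change:+.2f} %")
```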
A Data Analytical Framework for Improving Real-Time, Decision Support Systems in Healthcare
ERIC Educational Resources Information Center
Yahav, Inbal
2010-01-01
In this dissertation we develop a framework that combines data mining, statistics and operations research methods for improving real-time decision support systems in healthcare. Our approach consists of three main concepts: data gathering and preprocessing, modeling, and deployment. We introduce the notion of offline and semi-offline modeling to…
NASA Astrophysics Data System (ADS)
Samanta, B.; Al-Balushi, K. R.
2003-03-01
A procedure is presented for fault diagnosis of rolling element bearings through an artificial neural network (ANN). The characteristic features of time-domain vibration signals of the rotating machinery with normal and defective bearings have been used as inputs to the ANN consisting of input, hidden and output layers. The features are obtained from direct processing of the signal segments using very simple preprocessing. The input layer consists of five nodes, one each for root mean square, variance, skewness, kurtosis and normalised sixth central moment of the time-domain vibration signals. The inputs are normalised to the range 0.0 to 1.0, except for the skewness, which is normalised between -1.0 and 1.0. The output layer consists of two binary nodes indicating the status of the machine: normal or defective bearings. Two hidden layers with different numbers of neurons have been used. The ANN is trained using the backpropagation algorithm with a subset of the experimental data for known machine conditions. The ANN is tested using the remaining set of data. The effects of some preprocessing techniques, such as high-pass and band-pass filtration, envelope detection (demodulation) and wavelet transform of the vibration signals prior to feature extraction, are also studied. The results show the effectiveness of the ANN in diagnosis of the machine condition. The proposed procedure requires only a few features extracted from the measured vibration data either directly or with simple preprocessing. The reduced number of inputs leads to faster training requiring far fewer iterations, making the procedure suitable for on-line condition monitoring and diagnostics of machines.
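The five time-domain features feeding the input layer are straightforward to compute; a sketch follows, with scaling conventions that may differ in detail from the paper's.

```python
# Per-segment feature vector: RMS, variance, skewness, kurtosis and the
# normalised sixth central moment of a vibration signal segment.
import numpy as np
from scipy.stats import skew, kurtosis

def bearing_features(x):
    x = np.asarray(x, float)
    rms = np.sqrt(np.mean(x ** 2))
    var = np.var(x)
    skw = skew(x)
    kur = kurtosis(x, fisher=False)                      # plain fourth-moment kurtosis
    m6 = np.mean((x - x.mean()) ** 6) / np.std(x) ** 6   # normalised 6th central moment
    return np.array([rms, var, skw, kur, m6])

segment = np.random.default_rng(2).normal(size=2048)    # stand-in vibration segment
print(bearing_features(segment))
```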
Facilitating access to pre-processed research evidence in public health
2010-01-01
Background Evidence-informed decision making is accepted in Canada and worldwide as necessary for the provision of effective health services. This process involves: 1) clearly articulating a practice-based issue; 2) searching for and accessing relevant evidence; 3) appraising methodological rigor and choosing the most synthesized evidence of the highest quality and relevance to the practice issue and setting that is available; and 4) extracting, interpreting, and translating knowledge, in light of the local context and resources, into practice, program and policy decisions. While the public health sector in Canada is working toward evidence-informed decision making, considerable barriers, including efficient access to synthesized resources, exist. Methods In this paper we map relevant resources that include public health-related effectiveness evidence to a previously developed six-level pyramid of pre-processed research evidence. The resources were identified through extensive searches of both the published and unpublished domains. Results Many resources with public health-related evidence were identified. While there were very few resources dedicated solely to public health evidence, many clinically focused resources include public health-related evidence, making tools such as the pyramid, which identify these resources, particularly helpful for public health decision makers. A practical example illustrates the application of this model and highlights its potential to reduce the time and effort that would be required by public health decision makers to address their practice-based issues. Conclusions This paper describes an existing hierarchy of pre-processed evidence and its adaptation to the public health setting. A number of resources with public health-relevant content that are either freely accessible or require a subscription are identified. This will facilitate easier and faster access to pre-processed, public health-relevant evidence, with the intent of promoting evidence-informed decision making. Access to such resources addresses several barriers to evidence-informed decision making identified by public health decision makers, most importantly time, as well as lack of knowledge of resources that house public health-relevant evidence. PMID:20181270
Gifford, René H; Revit, Lawrence J
2010-01-01
Although cochlear implant patients are achieving increasingly higher levels of performance, speech perception in noise continues to be problematic. The newest generations of implant speech processors are equipped with preprocessing and/or external accessories that are purported to improve listening in noise. Most speech perception measures in the clinical setting, however, do not provide a close approximation to real-world listening environments. To assess speech perception for adult cochlear implant recipients in the presence of a realistic restaurant simulation generated by an eight-loudspeaker (R-SPACE) array in order to determine whether commercially available preprocessing strategies and/or external accessories yield improved sentence recognition in noise. Single-subject, repeated-measures design with two groups of participants: Advanced Bionics and Cochlear Corporation recipients. Thirty-four subjects, ranging in age from 18 to 90 yr (mean 54.5 yr), participated in this prospective study. Fourteen subjects were Advanced Bionics recipients, and 20 subjects were Cochlear Corporation recipients. Speech reception thresholds (SRTs) in semidiffuse restaurant noise originating from an eight-loudspeaker array were assessed with the subjects' preferred listening programs as well as with the addition of either Beam preprocessing (Cochlear Corporation) or the T-Mic accessory option (Advanced Bionics). In Experiment 1, adaptive SRTs with the Hearing in Noise Test sentences were obtained for all 34 subjects. For Cochlear Corporation recipients, SRTs were obtained with their preferred everyday listening program as well as with the addition of Focus preprocessing. For Advanced Bionics recipients, SRTs were obtained with the integrated behind-the-ear (BTE) mic as well as with the T-Mic. Statistical analysis using a repeated-measures analysis of variance (ANOVA) evaluated the effects of the preprocessing strategy or external accessory in reducing the SRT in noise. In addition, a standard t-test was run to evaluate effectiveness across manufacturer for improving the SRT in noise. In Experiment 2, 16 of the 20 Cochlear Corporation subjects were reassessed obtaining an SRT in noise using the manufacturer-suggested "Everyday," "Noise," and "Focus" preprocessing strategies. A repeated-measures ANOVA was employed to assess the effects of preprocessing. The primary findings were (i) both Noise and Focus preprocessing strategies (Cochlear Corporation) significantly improved the SRT in noise as compared to Everyday preprocessing, (ii) the T-Mic accessory option (Advanced Bionics) significantly improved the SRT as compared to the BTE mic, and (iii) Focus preprocessing and the T-Mic resulted in similar degrees of improvement that were not found to be significantly different from one another. Options available in current cochlear implant sound processors are able to significantly improve speech understanding in a realistic, semidiffuse noise with both Cochlear Corporation and Advanced Bionics systems. For Cochlear Corporation recipients, Focus preprocessing yields the best speech-recognition performance in a complex listening environment; however, it is recommended that Noise preprocessing be used as the new default for everyday listening environments to avoid the need for switching programs throughout the day. For Advanced Bionics recipients, the T-Mic offers significantly improved performance in noise and is recommended for everyday use in all listening environments. American Academy of Audiology.
VSP Monitoring of CO2 Injection at the Aneth Oil Field in Utah
NASA Astrophysics Data System (ADS)
Huang, L.; Rutledge, J.; Zhou, R.; Denli, H.; Cheng, A.; Zhao, M.; Peron, J.
2008-12-01
Remotely tracking the movement of injected CO2 within a geological formation is critically important for ensuring safe and long-term geologic carbon sequestration. To study the capability of vertical seismic profiling (VSP) for remote monitoring of CO2 injection, a geophone string with 60 levels and 96 channels was cemented into a monitoring well at the Aneth oil field in Utah operated by Resolute Natural Resources and Navajo National Oil and Gas Company. The oil field is located in the Paradox Basin of southeastern Utah, and was selected by the Southwest Regional Partnership on Carbon Sequestration, supported by the U.S. Department of Energy, to demonstrate combined enhanced oil recovery (EOR) and CO2 sequestration. The geophones are placed at depths from 805 m to 1704 m, and the oil reservoir is located approximately from 1731 m to 1786 m in depth. A baseline VSP dataset with one zero-offset and seven offset source locations was acquired in October 2007, before CO2 injection. The offset source locations are approximately 1 km away from the monitoring well with the buried geophone string. A time-lapse VSP dataset with the same source locations was collected in July 2008, after five months of CO2/water injection into a horizontal well adjacent to the monitoring well. The total amount of CO2 injected during the time interval between the two VSP surveys was 181,000 MCF (thousand cubic feet), or 10,500 tons. The time-lapse VSP data are pre-processed to balance the phase and amplitude of seismic events above the oil reservoir. We conduct wave-equation migration imaging and interferometry analysis using the pre-processed time-lapse VSP data. The results demonstrate that time-lapse VSP surveys with high-resolution migration imaging and scattering analysis can provide reliable information about CO2 migration. Both the repeatability of VSP surveys and sophisticated time-lapse data pre-processing are essential to make VSP an effective tool for monitoring CO2 injection.
A survey of visual preprocessing and shape representation techniques
NASA Technical Reports Server (NTRS)
Olshausen, Bruno A.
1988-01-01
Many recent theories and methods proposed for visual preprocessing and shape representation are summarized. The survey brings together research from the fields of biology, psychology, computer science, electrical engineering, and most recently, neural networks. It was motivated by the need to preprocess images for a sparse distributed memory (SDM), but the techniques presented may also prove useful for applying other associative memories to visual pattern recognition. The material of this survey is divided into three sections: an overview of biological visual processing; methods of preprocessing (extracting parts of shape, texture, motion, and depth); and shape representation and recognition (form invariance, primitives and structural descriptions, and theories of attention).
Zuo, Xi-Nian; Xu, Ting; Jiang, Lili; Yang, Zhi; Cao, Xiao-Yan; He, Yong; Zang, Yu-Feng; Castellanos, F. Xavier; Milham, Michael P.
2013-01-01
While researchers have extensively characterized functional connectivity between brain regions, the characterization of functional homogeneity within a region of the brain connectome is in the early stages of development. Several functional homogeneity measures were proposed previously, among which regional homogeneity (ReHo) was most widely used as a measure to characterize the functional homogeneity of resting state fMRI (R-fMRI) signals within a small region (Zang et al., 2004). Despite a burgeoning literature on ReHo in the field of neuroimaging brain disorders, its test–retest (TRT) reliability remains unestablished. Using two sets of public R-fMRI TRT data, we systematically evaluated ReHo's TRT reliability, investigated the various factors influencing its reliability, and found: 1) nuisance (head motion, white matter, and cerebrospinal fluid) correction of R-fMRI time series can significantly improve the TRT reliability of ReHo, while additional removal of the global brain signal reduces its reliability; 2) spatial smoothing of R-fMRI time series artificially enhances ReHo intensity and influences its reliability; 3) surface-based R-fMRI computation largely improves the TRT reliability of ReHo; 4) a scan duration of 5 min can achieve reliable estimates of ReHo; and 5) fast sampling rates of R-fMRI dramatically increase the reliability of ReHo. Inspired by these findings and seeking a highly reliable approach to exploratory analysis of the human functional connectome, we established an R-fMRI pipeline to conduct ReHo computations in both 3 dimensions (volume) and 2 dimensions (surface). PMID:23085497
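ReHo as defined by Zang et al. (2004) is Kendall's coefficient of concordance (KCC) over the time series of a voxel and its neighbours; below is a minimal volume-space sketch (toy data, 27-voxel neighbourhood, no ties correction).

```python
# Kendall's W across k time series of length n: W = 12*S / (k^2 * (n^3 - n)),
# where S is the squared deviation of the column rank sums from their mean.
import numpy as np
from scipy.stats import rankdata

def kendalls_w(series):
    """series: (k, n) array of k time series of length n."""
    k, n = series.shape
    ranks = np.vstack([rankdata(s) for s in series])  # rank each series over time
    r_sum = ranks.sum(axis=0)                         # rank sum per time point
    s = np.sum((r_sum - r_sum.mean()) ** 2)
    return 12 * s / (k ** 2 * (n ** 3 - n))

rng = np.random.default_rng(3)
shared = rng.normal(size=100)                         # common local fluctuation
neigh = shared + 0.5 * rng.normal(size=(27, 100))     # 27-voxel neighbourhood
print("ReHo (KCC):", kendalls_w(neigh))               # near 1 = locally homogeneous
```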
On demand processing of climate station sensor data
NASA Astrophysics Data System (ADS)
Wöllauer, Stephan; Forteva, Spaska; Nauss, Thomas
2015-04-01
Large sets of climate stations with several sensors produce large amounts of fine-grained time series data. To gain value from these data, further processing and aggregation are needed. We present a flexible system to process the raw data on demand. Several aspects need to be considered to process the raw data in a way that scientists can use the processed data conveniently for their specific research interests. First of all, it is not feasible to pre-process the data in advance because of the great variety of ways it can be processed. Therefore, in this approach only the raw measurement data is archived in a database. When a scientist requires some time series, the system processes the required raw data according to the user-defined request. Based on the type of measurement sensor, some data validation is needed, because the climate station sensors may produce erroneous data. Currently, three validation methods are integrated in the on-demand processing system and are optionally selectable. The most basic validation method checks whether measurement values are within a predefined range of possible values. For example, it may be assumed that an air temperature sensor measures values within a range of -40 °C to +60 °C. Values outside of this range are considered a measurement error by this validation method and consequently rejected. Another validation method checks for outliers in the stream of measurement values by defining a maximum change rate between subsequent measurement values. The third validation method compares measurement data to the average values of neighboring stations and rejects measurement values with a high variance. These quality checks are optional, because especially extreme climatic values may be valid but rejected by some quality check method. Another important task is the preparation of measurement data in terms of time. The observed stations measure values in intervals of minutes to hours. Often scientists need a coarser temporal resolution (days, months, years). Therefore, the interval of time aggregation is selectable for the processing. For some use cases it is desirable that the resulting time series are as continuous as possible. To meet these requirements, the processing system includes techniques to fill gaps of missing values by interpolating measurement values with data from adjacent stations, using available contemporaneous measurements from the respective stations as training datasets. Alongside the processing of sensor values, we created interactive visualization techniques to get a quick overview of a large amount of archived time series data.
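The selectable validation and aggregation steps can be sketched with pandas; thresholds, frequencies and the gap-filling shortcut below are illustrative only, and the neighbour-comparison check is omitted because it needs data from adjacent stations.

```python
# Sketch: range check, change-rate check, daily aggregation and gap filling
# on a made-up minute-resolution air temperature series.
import numpy as np
import pandas as pd

idx = pd.date_range("2015-01-01", periods=7 * 24 * 60, freq="min")
temp = pd.Series(10 + 5 * np.sin(np.arange(len(idx)) / 720), index=idx)
temp.iloc[500] = 95.0                                 # inject an implausible value
temp.iloc[800] = temp.iloc[799] + 20.0                # inject a sudden spike

valid = temp.where(temp.between(-40, 60))             # 1) physical range check
valid = valid.where(valid.diff().abs().fillna(0) <= 5)  # 2) max change-rate check
# 3) the neighbour-comparison check would need adjacent stations, omitted here

daily = valid.resample("D").mean()                    # coarser temporal resolution
daily = daily.interpolate(limit=2)                    # simple gap-filling stand-in
print(daily.head())
```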
Chancerel, Perrine; Bolland, Til; Rotter, Vera Susanne
2011-03-01
Waste electrical and electronic equipment (WEEE) contains gold in concentrations that are low but relevant from an environmental and economic point of view. After collection, WEEE is pre-processed in order to generate appropriate material fractions that are sent to the subsequent end-processing stages (recovery, reuse or disposal). The goal of this research is to quantify the overall recovery rates of pre-processing technologies used in Germany for the reference year 2007. To achieve this goal, facilities operating in Germany were listed and classified according to the technology they apply. Information on their processing capacity was gathered by evaluating statistical databases. Based on a literature review of experimental results for gold recovery rates of different pre-processing technologies, the German overall recovery rate of gold at the pre-processing level was quantified depending on the characteristics of the treated WEEE. The results reveal that, depending on the equipment group, pre-processing recovery rates of gold of 29 to 61% are achieved in Germany. Some practical recommendations to reduce the losses during pre-processing could be formulated. Defining mass-based recovery targets in the legislation does not set incentives to recover trace elements. Instead, the priorities for recycling could be defined based on other parameters, such as the environmental impacts of the materials. The implementation of measures to reduce the gold losses would also improve the recovery of several other non-ferrous metals, such as tin, nickel, and palladium.
Computing Fourier integral operators with caustics
NASA Astrophysics Data System (ADS)
Caday, Peter
2016-12-01
Fourier integral operators (FIOs) have widespread applications in imaging, inverse problems, and PDEs. An implementation of a generic algorithm for computing FIOs associated with canonical graphs is presented, based on a recent paper of de Hoop et al. Given the canonical transformation and principal symbol of the operator, a preprocessing step reduces application of an FIO approximately to multiplications, pushforwards and forward and inverse discrete Fourier transforms, which can be computed in O(N^(n+(n-1)/2) log N) time for an n-dimensional FIO. The same preprocessed data also allows computation of the inverse and transpose of the FIO, with identical runtime. Examples demonstrate the algorithm's output, and easily extendible MATLAB/C++ source code is available from the author.
Improving performances of suboptimal greedy iterative biclustering heuristics via localization.
Erten, Cesim; Sözdinler, Melih
2010-10-15
Biclustering gene expression data is the problem of extracting submatrices of genes and conditions exhibiting significant correlation across both the rows and the columns of a data matrix of expression values. Even the simplest versions of the problem are computationally hard. Most of the proposed solutions therefore employ greedy iterative heuristics that locally optimize a suitably assigned scoring function. We provide a fast and simple pre-processing algorithm called localization that reorders the rows and columns of the input data matrix in such a way as to group correlated entries in small local neighborhoods within the matrix. The proposed localization algorithm takes its roots from effective use of graph-theoretical methods applied to problems exhibiting a similar structure to that of biclustering. In order to evaluate the effectiveness of the localization pre-processing algorithm, we focus on three representative greedy iterative heuristic methods. We show how the localization pre-processing can be incorporated into each representative algorithm to improve biclustering performance. Furthermore, we propose a simple biclustering algorithm, Random Extraction After Localization (REAL), that randomly extracts submatrices from the localization pre-processed data matrix, eliminates those with low similarity scores, and provides the rest as correlated structures representing biclusters. We compare the proposed localization pre-processing with another pre-processing alternative, non-negative matrix factorization. We show that our fast and simple localization procedure provides similar or even better results than the computationally heavy matrix factorization pre-processing with regard to H-value tests. We next demonstrate that the performances of the three representative greedy iterative heuristic methods improve with localization pre-processing when biological correlations in the form of functional enrichment and PPI verification constitute the main performance criteria. The fact that the localization-based random extraction method REAL performs better than the representative greedy heuristic methods under the same criteria also confirms the effectiveness of the suggested pre-processing method. Supplementary material, including code implementations in the LEDA C++ library, experimental data, and the results, is available at http://code.google.com/p/biclustering/. Contact: cesim@khas.edu.tr; melihsozdinler@boun.edu.tr. Supplementary data are available at Bioinformatics online.
New technique for real-time distortion-invariant multiobject recognition and classification
NASA Astrophysics Data System (ADS)
Hong, Rutong; Li, Xiaoshun; Hong, En; Wang, Zuyi; Wei, Hongan
2001-04-01
A real-time hybrid distortion-invariant OPR system was established to perform 3D multiobject distortion-invariant automatic pattern recognition. A wavelet transform technique was used for digital preprocessing of the input scene, to suppress the noisy background and enhance the recognized object. A three-layer backpropagation artificial neural network was used in correlation signal post-processing to perform multiobject distortion-invariant recognition and classification. The C-80 and NOA real-time processing ability and multithread programming technology were used to perform high-speed parallel multitask processing and to speed up the post-processing rate for ROIs. The reference filter library was constructed for distorted versions of the 3D object model images based on distortion parameter tolerance measures such as rotation, azimuth and scale. Real-time optical correlation recognition testing of this OPR system demonstrates that, using the preprocessing, post-processing, the nonlinear algorithm of optimum filtering, the RFL construction technique and the multithread programming technology, a high probability of recognition and a high recognition rate were obtained for the real-time multiobject distortion-invariant OPR system. The recognition reliability and rate were improved greatly. These techniques are very useful for automatic target recognition.
Budak, Umit; Şengür, Abdulkadir; Guo, Yanhui; Akbulut, Yaman
2017-12-01
Microaneurysms (MAs) are known as early signs of diabetic retinopathy and appear as red lesions in color fundus images. Detection of MAs in fundus images requires highly skilled physicians or eye angiography. Eye angiography is an invasive and expensive procedure. Therefore, an automatic detection system to identify the MA locations in fundus images is in demand. In this paper, we propose a system to detect the MAs in colored fundus images. The proposed method is composed of three stages. In the first stage, a series of pre-processing steps are used to make the input images more convenient for MA detection. To this end, green channel decomposition, Gaussian filtering, median filtering, background determination, and subtraction operations are applied to the input colored fundus images. After pre-processing, a candidate MA extraction procedure is applied to detect potential regions. A five-step procedure is adopted to get the potential MA locations. Finally, a deep convolutional neural network (DCNN) with a reinforcement sample learning strategy is used to train the proposed system. The DCNN is trained with color image patches which are collected from ground-truth MA locations and non-MA locations. We conducted extensive experiments on the ROC dataset to evaluate our proposal. The results are encouraging.
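A sketch of the stage-one pre-processing chain using scipy.ndimage on a synthetic image; filter sizes and the candidate threshold are guesses, not the paper's tuned values.

```python
# Green-channel extraction, smoothing, background estimation and subtraction
# to expose dark (red-lesion-like) candidate pixels.
import numpy as np
from scipy.ndimage import gaussian_filter, median_filter

rgb = np.random.default_rng(4).random((256, 256, 3))   # stand-in colour fundus image
green = rgb[:, :, 1]                                   # green channel decomposition
smooth = gaussian_filter(green, sigma=1.0)             # Gaussian filtering
smooth = median_filter(smooth, size=3)                 # median filtering
background = median_filter(smooth, size=25)            # coarse background estimate
shade_corrected = smooth - background                  # subtraction step
candidates = shade_corrected < -0.05                   # dark candidate pixels
print("candidate pixels:", int(candidates.sum()))
```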
Robust Flood Monitoring Using Sentinel-1 SAR Time Series
NASA Astrophysics Data System (ADS)
DeVries, B.; Huang, C.; Armston, J.; Huang, W.
2017-12-01
The 2017 hurricane season in North and Central America has resulted in unprecedented levels of flooding that have affected millions of people and continue to impact communities across the region. The extent of casualties and damage to property incurred by these floods underscores the need for reliable systems to track flood location, timing and duration to aid response and recovery efforts. While a diverse range of data sources provide vital information on flood status in near real-time, only spaceborne Synthetic Aperture Radar (SAR) sensors can ensure wall-to-wall coverage over large areas, mostly independently of weather conditions or site accessibility. The European Space Agency's Sentinel-1 constellation represents the only SAR mission currently providing open access and systematic global coverage, allowing for a consistent stream of observations over flood-prone regions. Importantly, both the data and pre-processing software are freely available, enabling the development of improved methods, tools and data products to monitor floods in near real-time. We tracked flood onset and progression in Southeastern Texas, Southern Florida, and Puerto Rico using a novel approach based on temporal backscatter anomalies derived from times series of Sentinel-1 observations and historic baselines defined for each of the three sites. This approach was shown to provide a more objective measure of flood occurrence than the simple backscatter thresholds often employed in operational flood monitoring systems. Additionally, the use of temporal anomaly measures allowed us to partially overcome biases introduced by varying sensor view angles and image acquisition modes, allowing increased temporal resolution in areas where additional targeted observations are available. Our results demonstrate the distinct advantages offered by data from operational SAR missions such as Sentinel-1 and NASA's planned NISAR mission, and call attention to the continuing need for SAR Earth Observation missions that provide systematic repeat observations to facilitate continuous monitoring of flood-affected regions.
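The backscatter-anomaly idea reduces to a per-pixel z-score against the historic baseline; a toy sketch follows (the threshold and the open-water-as-low-backscatter assumption are illustrative, not the study's calibrated procedure).

```python
# Flag a new Sentinel-1 scene's pixels as flood-like when backscatter drops
# far below the pixel's historic mean (open water is a strong backscatter low).
import numpy as np

rng = np.random.default_rng(5)
history = rng.normal(-8.0, 1.0, size=(36, 100, 100))   # 36 past scenes, dB
new_scene = rng.normal(-8.0, 1.0, size=(100, 100))
new_scene[40:60, 40:60] = -17.0                        # synthetic flooded patch

mu, sigma = history.mean(axis=0), history.std(axis=0)
z = (new_scene - mu) / sigma                           # per-pixel temporal anomaly
flood = z < -3                                         # strong negative anomaly
print("flagged pixels:", int(flood.sum()))
```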
Tactile and bone-conduction auditory brain computer interface for vision and hearing impaired users.
Rutkowski, Tomasz M; Mori, Hiromu
2015-04-15
The paper presents a report on the recently developed BCI alternative for users suffering from impaired vision (lack of focus or eye-movements) or from the so-called "ear-blocking-syndrome" (limited hearing). We report on our recent studies of the extent to which vibrotactile stimuli delivered to the head of a user can serve as a platform for a brain computer interface (BCI) paradigm. In the proposed tactile and bone-conduction auditory BCI, multiple novel head positions are used to evoke combined somatosensory and auditory (via the bone conduction effect) P300 brain responses, in order to define a multimodal tactile and bone-conduction auditory brain computer interface (tbcaBCI). In order to further remove EEG interferences and to improve P300 response classification, the synchrosqueezing transform (SST) is applied. SST outperforms classical time-frequency analysis methods for non-linear and non-stationary signals such as EEG. The proposed method is also computationally more efficient compared to empirical mode decomposition. The SST filtering allows for online EEG preprocessing, which is essential in the case of BCI. Experimental results with healthy BCI-naive users performing online tbcaBCI validate the paradigm, while the feasibility of the concept is illuminated through information transfer rate case studies. We present a comparison of the proposed SST-based preprocessing method, combined with a logistic regression (LR) classifier, with classical preprocessing and LDA-based classification BCI techniques. The proposed tbcaBCI paradigm, together with data-driven preprocessing methods, is a step forward in robust BCI applications research. Copyright © 2014 Elsevier B.V. All rights reserved.
Low-damage direct patterning of silicon oxide mask by mechanical processing
2014-01-01
To realize the nanofabrication of silicon surfaces using atomic force microscopy (AFM), we investigated the etching of mechanically processed oxide masks using potassium hydroxide (KOH) solution. The dependence of the KOH solution etching rate on the load and scanning density of the mechanical pre-processing was evaluated. Particular load ranges were found to increase the etching rate, and the silicon etching rate also increased with removal of the natural oxide layer by diamond tip sliding. In contrast, the local oxide pattern formed (due to mechanochemical reaction of the silicon) by tip sliding at higher load was found to have higher etching resistance than that of unprocessed areas. The profile changes caused by the etching of the mechanically pre-processed areas with the KOH solution were also investigated. First, protuberances were processed by diamond tip sliding at lower and higher stresses than that of the shearing strength. Mechanical processing at low load and scanning density to remove the natural oxide layer was then performed. The KOH solution selectively etched the low load and scanning density processed area first and then etched the unprocessed silicon area. In contrast, the protuberances pre-processed at higher load were hardly etched. The etching resistance of plastic deformed layers was decreased, and their etching rate was increased because of surface damage induced by the pre-processing. These results show that etching depth can be controlled by controlling the etching time through natural oxide layer removal and mechanochemical oxide layer formation. These oxide layer removal and formation processes can be exploited to realize low-damage mask patterns. PMID:24948891
NASA Astrophysics Data System (ADS)
Derkachov, G.; Jakubczyk, T.; Jakubczyk, D.; Archer, J.; Woźniak, M.
2017-07-01
Utilising the Compute Unified Device Architecture (CUDA) platform for Graphics Processing Units (GPUs) enables a significant reduction of computation time at a moderate cost, by means of parallel computing. In the paper [Jakubczyk et al., Opto-Electron. Rev., 2016] we reported using a GPU for Mie scattering inverse problem solving (up to 800-fold speed-up). Here we report the development of two subroutines utilising the GPU at the data preprocessing stages of the inversion procedure: (i) a subroutine, based on ray tracing, for finding the spherical aberration correction function; (ii) a subroutine performing the conversion of an image to a 1D distribution of light intensity versus azimuth angle (i.e. a scattering diagram), fed from a movie-reading CPU subroutine running in parallel. All subroutines are incorporated in the PikeReader application, which we make available on a GitHub repository. PikeReader returns a sequence of intensity distributions versus a common azimuth angle vector, corresponding to the recorded movie. We obtained an overall ~400-fold speed-up of calculations at the data preprocessing stages using CUDA code running on the GPU in comparison to single-thread MATLAB-only code running on the CPU.
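The second subroutine's core operation, collapsing a frame into intensity versus azimuth, can be sketched in plain NumPy (the CUDA parallelisation and the aberration correction are not reproduced; the pattern centre is assumed known).

```python
# Bin pixels by azimuth angle around an assumed pattern centre and average
# their intensities to obtain a 1-degree-resolution scattering diagram.
import numpy as np

img = np.random.default_rng(6).random((480, 640))      # stand-in movie frame
cy, cx = 240, 320                                      # assumed pattern centre
y, x = np.indices(img.shape)
azimuth = np.arctan2(y - cy, x - cx)                   # angle in [-pi, pi] per pixel

bins = np.linspace(-np.pi, np.pi, 361)                 # 360 one-degree bins
which = np.clip(np.digitize(azimuth.ravel(), bins) - 1, 0, 359)
sums = np.bincount(which, weights=img.ravel(), minlength=360)
counts = np.bincount(which, minlength=360)
profile = sums / np.maximum(counts, 1)                 # mean intensity per degree
print(profile.shape)                                   # (360,) scattering diagram
```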
NASA Astrophysics Data System (ADS)
Silva, Ricardo Petri; Naozuka, Gustavo Taiji; Mastelini, Saulo Martiello; Felinto, Alan Salvany
2018-01-01
The incidence of luminous reflections (LR) in captured images can interfere with the color of the affected regions. These regions tend to oversaturate, becoming whitish and, consequently, losing the original color information of the scene. Decision processes that employ images acquired from digital cameras can be impaired by LR incidence. Such applications include real-time video surgeries and facial and ocular recognition. This work proposes an algorithm called contrast enhancement of potential LR regions, a preprocessing step that increases the contrast of potential LR regions in order to improve the performance of automatic LR detectors. In addition, three automatic detectors were compared with and without the employment of our preprocessing method. The first one is a technique already consolidated in the literature called the Chang-Tseng threshold. We propose two automatic detectors called adapted histogram peak and global threshold. We employed four performance metrics to evaluate the detectors, namely accuracy, precision, exactitude, and root mean square error. The exactitude metric was developed in this work; for this purpose, a manually defined reference model was created. The global threshold detector combined with our preprocessing method presented the best results, with an average exactitude rate of 82.47%.
Iorgulescu, E; Voicu, V A; Sârbu, C; Tache, F; Albu, F; Medvedovici, A
2016-08-01
The influence of experimental variability (instrumental repeatability, instrumental intermediate precision and sample preparation variability) and data pre-processing (normalization, peak alignment, background subtraction) on the discrimination power of multivariate data analysis methods (Principal Component Analysis, PCA, and Cluster Analysis, CA), as well as a new algorithm based on linear regression, was studied. Data used in the study were obtained through positive or negative ion monitoring electrospray mass spectrometry (+/-ESI/MS) and reversed phase liquid chromatography/UV spectrometric detection (RPLC/UV) applied to green tea extracts. Extraction in ethanol and heated water infusion were used as sample preparation procedures. The multivariate methods were directly applied to mass spectra and chromatograms, involving strictly a holistic comparison of shapes, without assignment of any structural identity to compounds. An alternative data interpretation based on linear regression analysis mutually applied to data series is also discussed. Slopes, intercepts and correlation coefficients produced by the linear regression analysis applied to pairs of very large experimental data series successfully retain information resulting from high-frequency instrumental acquisition rates, obviously better defining the profiles being compared. Consequently, each type of sample or comparison between samples produces in the Cartesian space an ellipsoidal volume defined by the normal variation intervals of the slope, intercept and correlation coefficient. Distances between volumes graphically illustrate (dis)similarities between compared data. The instrumental intermediate precision had the major effect on the discrimination power of the multivariate data analysis methods. Mass spectra produced through ionization from the liquid state under atmospheric pressure conditions of bulk complex mixtures resulting from extracted materials of natural origin provided an excellent data basis for multivariate analysis methods, equivalent to data resulting from chromatographic separations. The alternative evaluation of very large data series based on linear regression analysis produced information equivalent to results obtained through application of PCA and CA. Copyright © 2016 Elsevier B.V. All rights reserved.
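The pairwise-regression comparison is simple to reproduce in outline: regress one full-resolution profile against another and keep slope, intercept and correlation coefficient as the (dis)similarity coordinates. A sketch with synthetic profiles:

```python
# Regress one stand-in ESI/MS profile against another; repeating this over
# replicate pairs yields the ellipsoidal volumes in (slope, intercept, r)
# space whose distances express sample (dis)similarity.
import numpy as np
from scipy.stats import linregress

rng = np.random.default_rng(7)
profile_a = np.abs(rng.normal(size=20000))                            # reference profile
profile_b = 1.05 * profile_a + 0.02 + 0.05 * rng.normal(size=20000)   # similar sample

fit = linregress(profile_a, profile_b)
print(f"slope={fit.slope:.3f} intercept={fit.intercept:.3f} r={fit.rvalue:.3f}")
```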
Preprocessing of emotional visual information in the human piriform cortex.
Schulze, Patrick; Bestgen, Anne-Kathrin; Lech, Robert K; Kuchinke, Lars; Suchan, Boris
2017-08-23
This study examines the processing of visual information by the olfactory system in humans. Recent data point to the processing of visual stimuli by the piriform cortex, a region mainly known as part of the primary olfactory cortex. Moreover, the piriform cortex generates predictive templates of olfactory stimuli to facilitate olfactory processing. This study addresses the question of whether this region is also capable of preprocessing emotional visual information. To gain insight into the preprocessing and transfer of emotional visual information into olfactory processing, we recorded hemodynamic responses during affective priming using functional magnetic resonance imaging (fMRI). Odors of different valence (pleasant, neutral and unpleasant) were primed by images of emotional facial expressions (happy, neutral and disgust). Our findings are the first to demonstrate that the piriform cortex preprocesses emotional visual information prior to any olfactory stimulation and that the emotional connotation of this preprocessing is subsequently transferred and integrated into an extended olfactory network for olfactory processing.
Preprocessing of A-scan GPR data based on energy features
NASA Astrophysics Data System (ADS)
Dogan, Mesut; Turhan-Sayan, Gonul
2016-05-01
There is an increasing demand for noninvasive real-time detection and classification of buried objects in various civil and military applications. The problem of detection and annihilation of landmines is particularly important due to strong safety concerns. The requirement for a fast real-time decision process is as important as the requirements for high detection rates and low false alarm rates. In this paper, we introduce and demonstrate a computationally simple, time-efficient, energy-based preprocessing approach that can be used in ground penetrating radar (GPR) applications to eliminate reflections from the air-ground boundary and to locate the buried objects simultaneously, in one simple step. The instantaneous power signals, the total energy values and the cumulative energy curves are extracted from the A-scan GPR data. The cumulative energy curves, in particular, are shown to be useful for detecting the presence and location of buried objects in a fast and simple way while preserving the spectral content of the original A-scan data for further steps of physics-based target classification. The proposed method is demonstrated using GPR data collected at the outdoor test lanes of IPA Defense, Ankara. Cylindrically shaped plastic containers were buried in fine-medium sand to simulate buried landmines. These plastic containers were half-filled with ammonium nitrate including metal pins. Results of this pilot study are highly promising and motivate further research into the use of energy-based preprocessing features in the landmine detection problem.
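The energy features themselves are a few lines of NumPy; the synthetic A-scan below stands in for real GPR data.

```python
# Instantaneous power, total energy and the normalised cumulative energy
# curve of one A-scan; steps in the curve mark reflections, and the late
# step locates the buried target.
import numpy as np

rng = np.random.default_rng(8)
t = np.arange(1024)
ascan = rng.normal(scale=0.05, size=1024)
ascan += np.exp(-((t - 120) / 10.0) ** 2)          # air-ground reflection
ascan += 0.4 * np.exp(-((t - 430) / 12.0) ** 2)    # buried-target reflection

power = ascan ** 2                                 # instantaneous power signal
total_energy = power.sum()                         # total energy value
cum_energy = np.cumsum(power) / total_energy       # cumulative energy curve
print("sample where 90% energy is reached:", int(np.searchsorted(cum_energy, 0.9)))
```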
Churchill, Nathan W.; Oder, Anita; Abdi, Hervé; Tam, Fred; Lee, Wayne; Thomas, Christopher; Ween, Jon E.; Graham, Simon J.; Strother, Stephen C.
2016-01-01
Subject-specific artifacts caused by head motion and physiological noise are major confounds in BOLD fMRI analyses. However, there is little consensus on the optimal choice of data preprocessing steps to minimize these effects. To evaluate the effects of various preprocessing strategies, we present a framework which comprises a combination of (1) nonparametric testing, including reproducibility and prediction metrics of the data-driven NPAIRS framework (Strother et al. [2002]: NeuroImage 15:747–771), and (2) intersubject comparison of SPM effects, using DISTATIS, a three-way version of metric multidimensional scaling (Abdi et al. [2009]: NeuroImage 45:89–95). It is shown that the quality of brain activation maps may be significantly limited by sub-optimal choices of data preprocessing steps (or "pipeline") in a clinical task-design, an fMRI adaptation of the widely used Trail-Making Test. The relative importance of motion correction, physiological noise correction, motion parameter regression, and temporal detrending was examined for fMRI data acquired in young, healthy adults. Analysis performance and the quality of activation maps were evaluated based on Penalized Discriminant Analysis (PDA). The relative importance of different preprocessing steps was assessed by (1) a nonparametric Friedman rank test for fixed sets of preprocessing steps, applied to all subjects; and (2) evaluating pipelines chosen specifically for each subject. Results demonstrate that preprocessing choices have significant but subject-dependent effects, and that individually optimized pipelines may significantly improve the reproducibility of fMRI results over fixed pipelines. This was demonstrated by the detection of a significant interaction with motion parameter regression and physiological noise correction, even though the range of subject head motion was small across the group (≪ 1 voxel). Optimizing pipelines on an individual-subject basis also revealed brain activation patterns either weak or absent under fixed pipelines, which has implications for the overall interpretation of fMRI data and the relative importance of preprocessing methods. PMID:21455942
Statistical Methods in Ai: Rare Event Learning Using Associative Rules and Higher-Order Statistics
NASA Astrophysics Data System (ADS)
Iyer, V.; Shetty, S.; Iyengar, S. S.
2015-07-01
Rare event learning has until recently not been actively researched, owing to the unavailability of algorithms which deal with big samples. This research addresses spatio-temporal streams from multi-resolution sensors to find actionable items from the perspective of real-time algorithms. The computing framework is independent of the number of input samples, the application domain, and whether the streams are labelled or label-less. A sampling overlap algorithm such as Brooks-Iyengar is used for dealing with noisy sensor streams. We extend the existing noise pre-processing algorithms using data-cleaning trees. Pre-processing using an ensemble of trees with bagging and multi-target regression showed robustness to random noise and missing data. As spatio-temporal streams are highly statistically correlated, we prove, using Hoeffding bounds, that temporal-window-based sampling from sensor data streams converges after n samples, which can be used for fast prediction of new samples in real time. The data-cleaning tree model uses a nonparametric node splitting technique, which can be learned in an iterative way and scales linearly in memory consumption for any size of input stream. The improved task-based ensemble extraction is compared with non-linear computation models using various SVM kernels for speed and accuracy. We show, using empirical datasets, that the explicit rule learning computation is linear in time and depends only on the number of leafs present in the tree ensemble. The use of unpruned trees (t) in our proposed ensemble always yields a minimum number (m) of leafs, keeping the pre-processing computation to n × t log m, compared to N² for the Gram matrix. We also show that the task-based feature induction yields higher Quality of Data (QoD) in the feature space compared to kernel methods using the Gram matrix.
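The Hoeffding-bound convergence argument yields a closed-form sample size; a one-function sketch, assuming observations bounded in a range R:

```python
# From Hoeffding's inequality P(|mean - mu| >= eps) <= 2*exp(-2*n*eps^2 / R^2),
# the windowed mean is within eps of its expectation with probability
# 1 - delta once n >= R^2 * ln(2/delta) / (2 * eps^2).
import math

def hoeffding_n(eps, delta, value_range=1.0):
    return math.ceil((value_range ** 2) * math.log(2.0 / delta) / (2.0 * eps ** 2))

print(hoeffding_n(eps=0.05, delta=0.01))   # about 1060 samples suffice
```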
Detection of seizures from small samples using nonlinear dynamic system theory.
Yaylali, I; Koçak, H; Jayakar, P
1996-07-01
The electroencephalogram (EEG), like many other biological phenomena, is quite likely governed by nonlinear dynamics. Certain characteristics of the underlying dynamics have recently been quantified by computing the correlation dimensions (D2) of EEG time series data. In this paper, D2 of the unbiased autocovariance function of the scalp EEG data was used to detect electrographic seizure activity. Digital EEG data were acquired at a sampling rate of 200 Hz per channel and organized in continuous frames (duration 2.56 s, 512 data points). To increase the reliability of D2 computations with short duration data, raw EEG data were initially simplified using unbiased autocovariance analysis to highlight the periodic activity that is present during seizures. The D2 computation was then performed from the unbiased autocovariance function of each channel using the Grassberger-Procaccia method with Theiler's box-assisted correlation algorithm. Even with short duration data, this preprocessing proved to be computationally robust and displayed no significant sensitivity to implementation details such as the choices of embedding dimension and box size. The system successfully identified various types of seizures in clinical studies.
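The unbiased autocovariance preprocessing is easy to state in code; a sketch on one synthetic 2.56 s frame at 200 Hz (the D2 / Grassberger-Procaccia step is not reproduced here).

```python
# Unbiased autocovariance c(k) = sum((x_t - mu)(x_{t+k} - mu)) / (N - k):
# periodic (seizure-like) activity survives, uncorrelated noise flattens out.
import numpy as np

def unbiased_autocovariance(x, max_lag):
    x = np.asarray(x, float) - np.mean(x)
    n = len(x)
    return np.array([np.dot(x[:n - k], x[k:]) / (n - k) for k in range(max_lag + 1)])

fs = 200.0
t = np.arange(512) / fs                       # one 2.56 s frame of 512 points
frame = np.sin(2 * np.pi * 3 * t) + 0.5 * np.random.default_rng(9).normal(size=512)
acov = unbiased_autocovariance(frame, max_lag=256)
print(acov[:5])
```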
Monthly streamflow forecasting with auto-regressive integrated moving average
NASA Astrophysics Data System (ADS)
Nasir, Najah; Samsudin, Ruhaidah; Shabri, Ani
2017-09-01
Forecasting of streamflow is one of the many ways that can contribute to better decision making for water resource management. The auto-regressive integrated moving average (ARIMA) model was selected in this research for monthly streamflow forecasting with enhancement made by pre-processing the data using singular spectrum analysis (SSA). This study also proposed an extension of the SSA technique to include a step where clustering was performed on the eigenvector pairs before reconstruction of the time series. The monthly streamflow data of Sungai Muda at Jeniang, Sungai Muda at Jambatan Syed Omar and Sungai Ketil at Kuala Pegang was gathered from the Department of Irrigation and Drainage Malaysia. A ratio of 9:1 was used to divide the data into training and testing sets. The ARIMA, SSA-ARIMA and Clustered SSA-ARIMA models were all developed in R software. Results from the proposed model are then compared to a conventional auto-regressive integrated moving average model using the root-mean-square error and mean absolute error values. It was found that the proposed model can outperform the conventional model.
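A pared-down version of the forecasting comparison using statsmodels' ARIMA on a synthetic monthly series with the study's 9:1 split; the SSA pre-processing and clustering extension are omitted, and the order (2, 0, 1) is an arbitrary placeholder.

```python
# Fit ARIMA on the training portion, forecast the held-out tail, and score
# with the study's RMSE and MAE metrics.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(10)
months = np.arange(240)
flow = 50 + 10 * np.sin(2 * np.pi * months / 12) + rng.normal(scale=3, size=240)

split = int(0.9 * len(flow))                   # the 9:1 train/test ratio
train, test = flow[:split], flow[split:]
fit = ARIMA(train, order=(2, 0, 1)).fit()
pred = fit.forecast(steps=len(test))
rmse = float(np.sqrt(np.mean((pred - test) ** 2)))
mae = float(np.mean(np.abs(pred - test)))
print(f"RMSE={rmse:.2f} MAE={mae:.2f}")
```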
BoolNet--an R package for generation, reconstruction and analysis of Boolean networks.
Müssel, Christoph; Hopfensitz, Martin; Kestler, Hans A
2010-05-15
As the study of information processing in living cells moves from individual pathways to complex regulatory networks, mathematical models and simulation become indispensable tools for analyzing the complex behavior of such networks and can provide deep insights into the functioning of cells. The dynamics of gene expression, for example, can be modeled with Boolean networks (BNs). These are mathematical models of low complexity, but have the advantage of being able to capture essential properties of gene-regulatory networks. However, current implementations of BNs only focus on different sub-aspects of this model and do not allow for a seamless integration into existing preprocessing pipelines. BoolNet efficiently integrates methods for synchronous, asynchronous and probabilistic BNs. This includes reconstructing networks from time series, generating random networks, robustness analysis via perturbation, Markov chain simulations, and identification and visualization of attractors. The package BoolNet is freely available from the R project at http://cran.r-project.org/ or http://www.informatik.uni-ulm.de/ni/mitarbeiter/HKestler/boolnet/ under Artistic License 2.0. hans.kestler@uni-ulm.de Supplementary data are available at Bioinformatics online.
NASA Astrophysics Data System (ADS)
Fischer, Peter; Schuegraf, Philipp; Merkle, Nina; Storch, Tobias
2018-04-01
This paper presents a hybrid evolutionary algorithm for fast intensity-based matching between satellite imagery from SAR and very high-resolution (VHR) optical sensor systems. The precise and accurate co-registration of image time series and images of different sensors is a key task in multi-sensor image processing scenarios. The necessary preprocessing step of image matching and tie-point detection is divided into a search problem and a similarity measurement. Within this paper we evaluate the use of an evolutionary search strategy for establishing the spatial correspondence between satellite imagery of optical and radar sensors. The aim of the proposed algorithm is to decrease the computational costs during the search process by formulating the search as an optimization problem. Based upon the canonical evolutionary algorithm, the proposed algorithm is adapted for SAR/optical imagery intensity-based matching. Extensions are introduced using techniques like hybridization (e.g. local search) and others to lower the number of objective function calls and refine the result. The algorithm significantly decreases the computational costs whilst finding the optimal solution in a reliable way.
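A toy rendition of the search idea: evolve a translation offset maximising an intensity-based similarity rather than scanning all offsets exhaustively. The hybrid/local-search refinements are reduced to mutation-only evolution here, and plain correlation stands in for the paper's similarity measure.

```python
# Evolve a 2D translation offset (dy, dx) between a reference image and a
# shifted copy; fitness is the intensity correlation after undoing the offset.
import numpy as np

rng = np.random.default_rng(11)
ref = rng.random((200, 200))
mov = np.roll(np.roll(ref, 7, axis=0), -12, axis=1)     # true offset is (7, -12)

def fitness(off):
    shifted = np.roll(np.roll(mov, -off[0], axis=0), -off[1], axis=1)
    return float(np.corrcoef(ref.ravel(), shifted.ravel())[0, 1])

pop = [rng.integers(-20, 21, size=2) for _ in range(20)]
for _ in range(30):                                     # generations
    pop.sort(key=fitness, reverse=True)
    parents = pop[:5]                                   # elitist selection
    pop = parents + [p + rng.integers(-3, 4, size=2)    # mutated offspring
                     for p in parents for _ in range(3)]
pop.sort(key=fitness, reverse=True)
print("best offset:", pop[0], "fitness:", round(fitness(pop[0]), 3))
```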
A linear shift-invariant image preprocessing technique for multispectral scanner systems
NASA Technical Reports Server (NTRS)
Mcgillem, C. D.; Riemer, T. E.
1973-01-01
A linear shift-invariant image preprocessing technique is examined which requires no specific knowledge of any parameter of the original image and which is sufficiently general to allow the effective radius of the composite imaging system to be arbitrarily shaped and reduced, subject primarily to the noise power constraint. In addition, the size of the point-spread function of the preprocessing filter can be arbitrarily controlled, thus minimizing truncation errors.
Variable threshold method for ECG R-peak detection.
Kew, Hsein-Ping; Jeong, Do-Un
2011-10-01
In this paper, a wearable belt-type ECG electrode, worn around the chest to measure the ECG in real time, is produced in order to minimize the inconvenience of wearing. The ECG signal is detected using a potential instrument system. The measured ECG signal is transmitted via an ultra-low-power wireless data communications unit to a personal computer using a Zigbee-compatible wireless sensor node. ECG signals carry a lot of clinical information for a cardiologist, especially the R-peaks in the ECG. R-peak detection generally uses a fixed threshold value. There will be errors in peak detection when the baseline changes due to motion artifacts and when the signal size changes. A preprocessing stage, which includes differentiation and a Hilbert transform, is used as the signal preprocessing algorithm. Thereafter, a variable threshold method is used to detect the R-peaks, which is more accurate and efficient than the fixed threshold method. R-peak detection using the MIT-BIH databases and long-term real-time ECG recordings is performed in this research in order to evaluate the performance.
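A sketch of the detection chain (differentiation, Hilbert envelope, sliding variable threshold) on a synthetic ECG; parameters are illustrative and real signals would need band-limiting first.

```python
# Differentiate, take the Hilbert envelope, then apply a threshold that
# tracks a fraction of the recent envelope maximum instead of a fixed value.
import numpy as np
from scipy.signal import hilbert, find_peaks

rng = np.random.default_rng(12)
fs = 360
t = np.arange(10 * fs) / fs
ecg = 0.05 * rng.normal(size=t.size)
for beat in np.arange(0.5, 10, 0.8):                   # synthetic R waves
    ecg += 1.2 * np.exp(-((t - beat) * 40) ** 2)

diff = np.diff(ecg, prepend=ecg[0])                    # differentiation step
env = np.abs(hilbert(diff))                            # Hilbert envelope
win = 2 * fs                                           # 2 s look-back window
thresh = np.array([0.6 * env[max(0, i - win):i + 1].max() for i in range(env.size)])
peaks, _ = find_peaks(env, height=thresh, distance=int(0.3 * fs))
print("detected beats:", len(peaks))
```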
User data dissemination concepts for earth resources: Executive summary
NASA Technical Reports Server (NTRS)
Davies, R.; Scott, M.; Mitchell, C.; Torbett, A.
1976-01-01
The impact of the future capabilities of earth-resources data sensors (both satellite and airborne) and their requirements on the data dissemination network were investigated and optimum ways of configuring this network were determined. The scope of this study was limited to the continental U.S.A. (including Alaska) and to the 1985-1995 time period. Some of the conclusions and recommendations reached were: (1) Data from satellites in sun-synchronous polar orbits (700-920 km) will generate most of the earth-resources data in the specified time period. (2) Data from aircraft and shuttle sorties cannot be readily integrated in a data-dissemination network unless already preprocessed in a digitized form to a standard geometric coordinate system. (3) Data transmission between readout stations and central preprocessing facilities, and between processing facilities and user facilities, is most economically performed by domestic communication satellites. (4) The effect of the following factors should be studied: cloud cover, expanded coverage, pricing strategies, multidiscipline missions.
PCA-based artifact removal algorithm for stroke detection using UWB radar imaging.
Ricci, Elisa; di Domenico, Simone; Cianca, Ernestina; Rossi, Tommaso; Diomedi, Marina
2017-06-01
Stroke patients should be dispatched to the highest level of care available in the shortest time. In this context, a transportable system in specialized ambulances, able to evaluate the presence of an acute brain lesion in a short time interval (i.e., a few minutes), could shorten the delay of treatment. UWB radar imaging is an emerging diagnostic branch that has great potential for the implementation of a transportable and low-cost device. Transportability, low cost and short response time pose challenges to the signal processing algorithms for the backscattered signals, as they should guarantee good performance with a reasonably low number of antennas and low computational complexity, tightly related to the response time of the device. The paper shows that a PCA-based preprocessing algorithm can: (1) achieve good performance already with a computationally simple beamforming algorithm; (2) outperform state-of-the-art preprocessing algorithms; (3) enable a further improvement in the performance (and/or a decrease in the number of antennas) by using a multistatic approach with just a modest increase in computational complexity. This is an important result toward the implementation of such a diagnostic device, which could play an important role in emergency scenarios.
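A minimal sketch of one common formulation of PCA-based artifact removal, assuming the artifact (e.g., direct antenna coupling) dominates the leading principal components of the channel-by-time matrix; the number of components to remove is an assumption.

```python
import numpy as np

def pca_artifact_removal(signals, n_remove=1):
    """Remove the strongest principal components from a (channels x time) matrix.

    The leading components capture the response common to all antenna
    channels (e.g., direct coupling), leaving the target backscatter.
    """
    X = signals - signals.mean(axis=1, keepdims=True)   # zero-mean per channel
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    keep = np.ones(len(s), dtype=bool)
    keep[:n_remove] = False                             # drop dominant components
    return (U[:, keep] * s[keep]) @ Vt[keep, :]
```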
Study of Huizhou architecture component point cloud in surface reconstruction
NASA Astrophysics Data System (ADS)
Zhang, Runmei; Wang, Guangyin; Ma, Jixiang; Wu, Yulu; Zhang, Guangbin
2017-06-01
Surface reconstruction software has many problems, such as complicated operations on point cloud data, too many interaction definitions, and overly stringent requirements on input data; thus, it has not been widely adopted so far. This paper selects the distinctive chuandou wooden beam framework of Huizhou architecture as the research object and presents a complete implementation pipeline spanning point cloud data acquisition, preprocessing, and surface reconstruction. First, the acquired point cloud data are preprocessed, including segmentation and filtering. Second, the surface normals are deduced directly from the point cloud dataset. Finally, surface reconstruction is studied using the Greedy Projection Triangulation algorithm. Comparing the reconstructed model with those produced by three-dimensional surface reconstruction software packages, the results show that the proposed scheme is smoother, more time efficient and more portable.
Devos, Olivier; Downey, Gerard; Duponchel, Ludovic
2014-04-01
Classification is an important task in chemometrics. For several years now, support vector machines (SVMs) have proven to be powerful for infrared spectral data classification. However, such methods require optimisation of parameters in order to control the risk of overfitting and the complexity of the boundary. Furthermore, it is established that the prediction ability of classification models can be improved using pre-processing in order to remove unwanted variance in the spectra. In this paper we propose a new methodology based on a genetic algorithm (GA) for the simultaneous optimisation of SVM parameters and pre-processing (GENOPT-SVM). The method has been tested for the discrimination of the geographical origin of Italian olive oil (Ligurian and non-Ligurian) on the basis of near infrared (NIR) or mid infrared (FTIR) spectra. Different classification models (PLS-DA, SVM with mean-centred data, GENOPT-SVM) have been tested and statistically compared using McNemar's statistical test. For the two datasets, SVM with optimised pre-processing gives models with higher accuracy than those obtained with PLS-DA on pre-processed data. In the case of the NIR dataset, most of this accuracy improvement (86.3% compared with 82.8% for PLS-DA) occurred using only a single pre-processing step. For the FTIR dataset, three optimised pre-processing steps are required to obtain an SVM model with a significant accuracy improvement (82.2%) compared to the one obtained with PLS-DA (78.6%). Furthermore, this study demonstrates that even SVM models have to be developed on the basis of well-corrected spectral data in order to obtain higher classification rates. Copyright © 2013 Elsevier Ltd. All rights reserved.
Mapping Snow Grain Size over Greenland from MODIS
NASA Technical Reports Server (NTRS)
Lyapustin, Alexei; Tedesco, Marco; Wang, Yujie; Kokhanovsky, Alexander
2008-01-01
This paper presents a new automatic algorithm to derive optical snow grain size (SGS) at 1 km resolution using Moderate Resolution Imaging Spectroradiometer (MODIS) measurements. Unlike previous approaches, snow grains are not assumed to be spherical; instead, a fractal approach is used to account for their irregular shape. The retrieval is conceptually based on an analytical asymptotic radiative transfer model which predicts spectral bidirectional snow reflectance as a function of the grain size and ice absorption. The analytical form of the solution leads to an explicit and fast retrieval algorithm. The time series analysis of derived SGS shows good sensitivity to snow metamorphism, including melting and snow precipitation events. Preprocessing is performed by the Multi-Angle Implementation of Atmospheric Correction (MAIAC) algorithm, which includes gridding MODIS data to 1 km resolution, water vapor retrieval, cloud masking and an atmospheric correction. The MAIAC cloud mask (CM) is a new algorithm based on a time series of gridded MODIS measurements and image-based rather than pixel-based processing. Extensive processing of MODIS TERRA data over Greenland shows a robust performance of the CM algorithm in discriminating clouds over bright snow and ice. As part of the validation analysis, SGS derived from MODIS over selected sites in 2004 was compared to the microwave brightness temperature measurements of the SSM/I radiometer, which is sensitive to the amount of liquid water in the snowpack. The comparison showed a good qualitative agreement, with both datasets detecting two main periods of snowmelt. Additionally, MODIS SGS was compared with predictions of the snow model CROCUS driven by measurements of the automatic weather stations of the Greenland Climate Network. We found that CROCUS grain size is on average a factor of two larger than MODIS-derived SGS. Overall, the agreement between CROCUS and MODIS results was satisfactory, in particular before and during the first melting period in mid-June. Following detailed time series analysis of SGS for four permanent sites, the paper presents SGS maps over the Greenland ice sheet for the March-September period of 2004.
Charles J. Gatchell; R. Edward Thomas; Elizabeth S. Walker
1999-01-01
Using the ROMI-RIP simulator, we examined the implications of preprocessing for gang-rip-first rough mills. Rip-first rough mills can improve yield and throughput by preprocessing 1 Common and 2A Common hardwood lumber. This can be achieved by using a chop saw to separate poorer-quality board segments from better ones and to remove waste areas with little or no yield. This...
Data pre-processing in record linkage to find the same companies from different databases
NASA Astrophysics Data System (ADS)
Gunawan, D.; Lubis, M. S.; Arisandi, D.; Azzahry, B.
2018-03-01
As public agencies, the Badan Pelayanan Perizinan Terpadu (BPPT) and the Badan Lingkungan Hidup (BLH) of Medan city manage the process by which the public obtains business licenses. However, each agency may hold different corporate data because of separate data-input processes, even though the data may refer to the same company. Therefore, it is necessary to identify and correlate data that refer to the same company across different data sources. This research focuses on data pre-processing, including data cleaning, text pre-processing, indexing and record comparison. In addition, this research implements data matching using a support vector machine algorithm, whose results are used for record linkage, identifying and connecting company data based on the degree of similarity between records. The data are first standardized into a format and structure appropriate to each pre-processing stage. After analyzing the data pre-processing, we found that neither database structure was designed to support data integration. We determined that the data matching can be done with blocking criteria such as the company name and the name of the owner (or applicant). The data classification yielded 90 pairs of records with a high level of similarity.
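A minimal Python sketch of the blocking-plus-comparison stage, with hypothetical field names (`company`, `owner`) and an assumed stop-token list of Indonesian legal-entity prefixes; the stdlib SequenceMatcher stands in for the paper's SVM-based matcher.

```python
import re
from difflib import SequenceMatcher

def normalize(name):
    """Lowercase, strip punctuation and legal-form tokens (assumed list)."""
    name = re.sub(r"[^a-z0-9 ]", " ", name.lower())
    tokens = [t for t in name.split() if t not in {"pt", "cv", "ud"}]
    return " ".join(tokens)

def block_key(record):
    """Blocking criterion: first token of company name and of owner name."""
    first = lambda s: (normalize(s).split() or [""])[0]
    return first(record["company"]), first(record["owner"])

def match(db_a, db_b, threshold=0.85):
    """Compare only records sharing a block key; return likely same-company pairs."""
    blocks = {}
    for r in db_b:
        blocks.setdefault(block_key(r), []).append(r)
    pairs = []
    for a in db_a:
        for b in blocks.get(block_key(a), []):
            score = SequenceMatcher(None, normalize(a["company"]),
                                    normalize(b["company"])).ratio()
            if score >= threshold:
                pairs.append((a, b, round(score, 3)))
    return pairs
```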
Classifier dependent feature preprocessing methods
NASA Astrophysics Data System (ADS)
Rodriguez, Benjamin M., II; Peterson, Gilbert L.
2008-04-01
In mobile applications, computational complexity is an issue that limits sophisticated algorithms from being implemented on these devices. This paper provides an initial solution to applying pattern recognition systems on mobile devices by combining existing preprocessing algorithms for recognition. In pattern recognition systems, it is essential to properly apply feature preprocessing tools prior to training classification models in an attempt to reduce computational complexity and improve the overall classification accuracy. The feature preprocessing tools extended for the mobile environment are feature ranking, feature extraction, data preparation and outlier removal. Most desktop systems today are capable of running the majority of available classification algorithms without concern for processing time, while the same is not true on mobile platforms. As an application of pattern recognition for mobile devices, the recognition system targets the problem of steganalysis, determining if an image contains hidden information. The measure of performance shows that feature preprocessing increases the overall steganalysis classification accuracy by an average of 22%. The methods in this paper are tested on a workstation and a Nokia 6620 (Symbian operating system) camera phone with similar results.
A comparison of PCA/ICA for data preprocessing in remote sensing imagery classification
NASA Astrophysics Data System (ADS)
He, Hui; Yu, Xianchuan
2005-10-01
In this paper a performance comparison of several data preprocessing algorithms for remote sensing image classification is presented. The selected algorithms are principal component analysis (PCA) and three independent component analysis (ICA) variants: Fast-ICA (Hyvarinen, 1999), Kernel-ICA (KCCA and KGV; Bach & Jordan, 2002) and EFFICA (Chen & Bickel, 2003). These algorithms were applied to remote sensing imagery (1600×1197) obtained from Shunyi, Beijing. For classification, a maximum likelihood classification (MLC) method is used on the raw and preprocessed data. The results show that classification with preprocessed data yields more reliable results than classification with raw data; among the preprocessing algorithms, the ICA algorithms improve on PCA, and EFFICA performs better than the others. The convergence of these ICA algorithms (for more than a million data points) is also studied; the results show that EFFICA converges much faster than the others. Furthermore, because EFFICA is a one-step maximum likelihood estimate (MLE) that reaches asymptotic Fisher efficiency, its computational load is small and its memory demand drops greatly, which resolves the "out of memory" problem that occurred with the other algorithms.
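A hedged scikit-learn sketch of this kind of comparison: PCA and Fast-ICA are available off the shelf, while the Kernel-ICA and EFFICA variants studied in the paper are not; QDA stands in for Gaussian maximum likelihood classification (the two coincide under equal class priors), and the component count is an assumption.

```python
from sklearn.decomposition import PCA, FastICA
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis

def evaluate(X_train, y_train, X_test, y_test, n_components=4):
    """Compare raw vs PCA vs Fast-ICA inputs to a Gaussian ML classifier."""
    results = {}
    transforms = {
        "raw": None,
        "pca": PCA(n_components=n_components),
        "fastica": FastICA(n_components=n_components, random_state=0),
    }
    for name, tf in transforms.items():
        if tf is None:
            Xtr, Xte = X_train, X_test
        else:
            Xtr = tf.fit_transform(X_train)   # fit preprocessing on training data
            Xte = tf.transform(X_test)        # apply the same transform to test data
        clf = QuadraticDiscriminantAnalysis().fit(Xtr, y_train)
        results[name] = clf.score(Xte, y_test)
    return results
```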
Ren, Zhou-Xin; Yu, Hai-Bin; Shen, Jun-Ling; Li, Ya; Li, Jian-Sheng
2015-06-01
To establish a preprocessing method for cell morphometry in microscopic images of A549 cells in epithelial-mesenchymal transition (EMT). Adobe Photoshop CS2 (Adobe Systems, Inc.) was used for preprocessing the images. First, all images were processed for size uniformity and high distinguishability between the cell and background areas. Then, a blank image of the same size was overlaid with a grid, and the grid's cross points were marked in a distinct color. The blank image was merged into a processed image. In the merged images, the cells with 1 or more cross points were chosen, and the cell areas were then outlined and filled with a distinct color. All areas except the chosen cellular areas were changed to a uniform hue. Three observers quantified the roundness of cells in images with the image preprocessing (IPP) method or without it (Controls), respectively. Furthermore, one observer measured the roundness three times with each of the two methods. The results of IPPs and Controls were compared for repeatability and reproducibility. Compared with the Control method, among the 3 observers, use of the IPP method resulted in a higher number and a higher percentage of identically chosen cells in an image. The relative average deviation values of roundness, whether for 3 observers or 1 observer, were significantly higher in Controls than in IPPs (p < 0.01 or 0.001). The values of the intraclass correlation coefficient, both for Single Type and Average, were higher in IPPs than in Controls, both for 3 observers and 1 observer. Processed with Adobe Photoshop, a chosen cell from an image was more objective, regular, and accurate, increasing the reproducibility and repeatability of morphometry of A549 cells in epithelial-mesenchymal transition.
NASA Astrophysics Data System (ADS)
Tao, Feifei; Mba, Ogan; Liu, Li; Ngadi, Michael
2017-04-01
Polyunsaturated fatty acids (PUFAs) are important nutrients present in Salmon. However, current methods for quantifying the fatty acid (FA) contents in foods are generally based on gas chromatography (GC) techniques, which are time-consuming, laborious and destructive to the tested samples. Therefore, the capability of near-infrared (NIR) hyperspectral imaging to predict the PUFA contents of C20:2 n-6, C20:3 n-6, C20:5 n-3, C22:5 n-3 and C22:6 n-3 in Salmon fillets in a rapid and non-destructive way was investigated in this work. Mean reflectance spectra were first extracted from the regions of interest (ROIs), and then the spectral pre-processing methods of 2nd derivative and Savitzky-Golay (SG) smoothing were applied to the original spectra. Based on the original and the pre-processed spectra, the PLSR technique was employed to develop quantitative models for predicting each PUFA content in Salmon fillets. The results showed that, for all the studied PUFAs, the quantitative models developed using reflectance spectra pre-processed by "2nd derivative + SG smoothing" gave improved modeling results. Good prediction results were achieved, with RP and RMSEP of 0.91 and 0.75 mg/g dry weight, 0.86 and 1.44 mg/g dry weight, and 0.82 and 3.01 mg/g dry weight for C20:3 n-6, C22:5 n-3 and C20:5 n-3, respectively, after pre-processing by "2nd derivative + SG smoothing". The work demonstrated that NIR hyperspectral imaging could be a useful tool for rapid and non-destructive determination of the PUFA contents in fish fillets.
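The "2nd derivative + SG smoothing" step and the PLSR model can both be expressed in a few lines; a sketch with scipy and scikit-learn, with the window length, polynomial order, and component count as assumptions:

```python
from scipy.signal import savgol_filter
from sklearn.cross_decomposition import PLSRegression

def preprocess(spectra, window=11, polyorder=3):
    """Savitzky-Golay smoothing combined with a 2nd derivative, per spectrum.

    `spectra` is an (n_samples, n_wavelengths) array; deriv=2 applies the
    derivative within the same smoothing pass.
    """
    return savgol_filter(spectra, window_length=window,
                         polyorder=polyorder, deriv=2, axis=1)

def fit_pufa_model(spectra, pufa_content, n_components=8):
    """PLSR model relating pre-processed spectra to one fatty-acid content."""
    return PLSRegression(n_components=n_components).fit(preprocess(spectra),
                                                        pufa_content)

# usage: predictions = fit_pufa_model(train_X, train_y).predict(preprocess(test_X))
```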
Wang, Dong; Yang, Zhuang-qun; Hu, Xiao-yi
2007-08-01
To analyze the stress and displacement distributions of 3D finite element (3D-FE) models of three conjunctive methods of vascularized iliac bone grafting for mandibular body defects. Using computer image processing techniques, a series of spiral CT images was imported into the Ansys preprocessing program to establish three 3D-FE models of the different conjunctions. The three 3D-FE models of mandibular body defects reconstructed by vascularized iliac bone graft were built, and the distributions of Von Mises stress and displacement around the mandibular segment, grafted ilium, plates and screws were obtained. The optimal conjunctive shape was determined to be the on-lay conjunction.
Software for pre-processing Illumina next-generation sequencing short read sequences
2014-01-01
Background When compared to Sanger sequencing technology, next-generation sequencing (NGS) technologies are hindered by shorter sequence read length, higher base-call error rate, non-uniform coverage, and platform-specific sequencing artifacts. These characteristics lower the quality of their downstream analyses, e.g. de novo and reference-based assembly, by introducing sequencing artifacts and errors that may contribute to incorrect interpretation of data. Although many tools have been developed for quality control and pre-processing of NGS data, none of them provide flexible and comprehensive trimming options in conjunction with parallel processing to expedite pre-processing of large NGS datasets. Methods We developed ngsShoRT (next-generation sequencing Short Reads Trimmer), a flexible and comprehensive open-source software package written in Perl that provides a set of algorithms commonly used for pre-processing NGS short read sequences. We compared the features and performance of ngsShoRT with existing tools: CutAdapt, NGS QC Toolkit and Trimmomatic. We also compared the effects of using pre-processed short read sequences generated by different algorithms on de novo and reference-based assembly for three different genomes: Caenorhabditis elegans, Saccharomyces cerevisiae S288c, and Escherichia coli O157 H7. Results Several combinations of ngsShoRT algorithms were tested on publicly available Illumina GA II, HiSeq 2000, and MiSeq eukaryotic and bacteria genomic short read sequences with the focus on removing sequencing artifacts and low-quality reads and/or bases. Our results show that across three organisms and three sequencing platforms, trimming improved the mean quality scores of trimmed sequences. Using trimmed sequences for de novo and reference-based assembly improved assembly quality as well as assembler performance. In general, ngsShoRT outperformed comparable trimming tools in terms of trimming speed and improvement of de novo and reference-based assembly as measured by assembly contiguity and correctness. Conclusions Trimming of short read sequences can improve the quality of de novo and reference-based assembly and assembler performance. The parallel processing capability of ngsShoRT reduces trimming time and improves the memory efficiency when dealing with large datasets. We recommend combining sequencing artifacts removal, and quality score based read filtering and base trimming as the most consistent method for improving sequence quality and downstream assemblies. ngsShoRT source code, user guide and tutorial are available at http://research.bioinformatics.udel.edu/genomics/ngsShoRT/. ngsShoRT can be incorporated as a pre-processing step in genome and transcriptome assembly projects. PMID:24955109
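As a minimal illustration of quality-score-based base trimming (not ngsShoRT's actual Perl implementation), here is a Python sketch that clips low-quality 3' ends of Phred+33 FASTQ records and drops reads that become too short; the thresholds are assumptions.

```python
def trim_record(seq, qual, min_q=20, min_len=30):
    """Trim low-quality bases from the 3' end (Phred+33 encoding)."""
    cut = len(seq)
    while cut > 0 and ord(qual[cut - 1]) - 33 < min_q:
        cut -= 1
    return (seq[:cut], qual[:cut]) if cut >= min_len else None

def trim_fastq(path_in, path_out):
    """Stream a FASTQ file (4 lines per record), writing trimmed records."""
    with open(path_in) as fin, open(path_out, "w") as fout:
        while True:
            header = fin.readline().rstrip()
            if not header:
                break
            seq = fin.readline().rstrip()
            plus = fin.readline().rstrip()
            qual = fin.readline().rstrip()
            kept = trim_record(seq, qual)
            if kept:
                fout.write(f"{header}\n{kept[0]}\n{plus}\n{kept[1]}\n")
```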
NASA Astrophysics Data System (ADS)
Wei, Ping; Li, Xinyang; Luo, Xi; Li, Jianfeng
2018-02-01
The centroid method is commonly adopted to locate the spot in the sub-apertures of a Shack-Hartmann wavefront sensor (SH-WFS); because the centroid method is extremely sensitive to noise, the image must be preprocessed before the spot location is calculated. In this paper, the SH-WFS image was simulated according to the characteristics of the noise, background and intensity distribution. Optimal parameters for the SH-WFS image preprocessing method were put forward for different signal-to-noise ratio (SNR) conditions, with the wavefront reconstruction error used as the evaluation index. Two image preprocessing methods, thresholding and windowing combined with thresholding, were compared by studying their applicable SNR ranges and analyzing their stability.
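A minimal numpy sketch of the thresholding preprocessing followed by centre-of-gravity centroiding; the background/noise estimates and the factor k are assumptions.

```python
import numpy as np

def spot_centroid(subap, k=3.0):
    """Centroid of a sub-aperture spot after threshold preprocessing.

    Pixels below (background + k * noise sigma) are zeroed so that the
    centre of gravity is not dragged toward the noise floor.
    """
    bg = np.median(subap)                    # background level estimate
    sigma = np.std(subap[subap <= bg])       # crude noise estimate
    img = np.clip(subap - (bg + k * sigma), 0, None)
    total = img.sum()
    if total == 0:
        return None                          # no detectable spot
    y, x = np.indices(img.shape)
    return (x * img).sum() / total, (y * img).sum() / total
```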
Real-time visualization of cross-sectional data in three dimensions
NASA Technical Reports Server (NTRS)
Mayes, Terrence J.; Foley, Theodore T.; Hamilton, Joseph A.; Duncavage, Tom C.
2005-01-01
This paper describes a technique for viewing and interacting with 2-D medical data in three dimensions. The approach requires little pre-processing, runs on personal computers, and has a wide range of application. Implementation details are discussed, examples are presented, and results are summarized.
NASA Astrophysics Data System (ADS)
Babb, Grace
2017-11-01
This work aims to produce a higher-fidelity model of the propeller blades for NASA's X-57 all-electric propeller-driven experimental aircraft. This model will, in turn, allow for more accurate calculations of the thrust each propeller can generate. This work uses computational fluid dynamics (CFD) to first analyze the propeller blades as a series of 11 differently shaped airfoils and calculate, among other things, the lift and drag coefficients associated with each airfoil at different angles of attack. OpenFOAM, a C++ library that can be used to create a series of applications for pre-processing, solving, and post-processing, is one of the primary tools utilized in these calculations. By comparing the data OpenFOAM generates for the NACA 23012 airfoil with existing experimental data for the same airfoil, the reliability of our model is measured and verified. A trustworthy model can then be used to generate more data to send to NASA to aid in the design of the actual aircraft.
Sentinel-2 ArcGIS Tool for Environmental Monitoring
NASA Astrophysics Data System (ADS)
Plesoianu, Alin; Cosmin Sandric, Ionut; Anca, Paula; Vasile, Alexandru; Calugaru, Andreea; Vasile, Cristian; Zavate, Lucian
2017-04-01
This paper addresses one of the biggest challenges regarding Sentinel-2 data: the need for an efficient tool to access and process the large collection of images that are available. Consequently, developing a tool for the automation of Sentinel-2 data analysis is the most immediate need. We developed a series of tools for the automation of Sentinel-2 data download and processing for vegetation health monitoring. The tools automatically perform the following operations: downloading image tiles from ESA's Scientific Hub or other vendors (e.g., Amazon), pre-processing of the images to extract the 10-m bands, creating image composites, applying a series of vegetation indices (NDVI, OSAVI, etc.) and performing change detection analyses on different temporal data sets. All of these tools run dynamically in the ArcGIS Platform, without the need to create intermediate datasets (rasters, layers), as the images are processed on-the-fly in order to avoid data duplication. Finally, they allow complete integration with the ArcGIS environment and workflows.
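The vegetation indices themselves are one-liners; a sketch using the standard NDVI and OSAVI formulas on Sentinel-2 10-m band arrays (band loading, e.g. via rasterio, omitted).

```python
import numpy as np

def ndvi(nir, red):
    """NDVI from Sentinel-2 10-m bands (B8 = NIR, B4 = red), as float arrays."""
    nir, red = nir.astype("float32"), red.astype("float32")
    return (nir - red) / np.maximum(nir + red, 1e-6)   # guard against 0/0

def osavi(nir, red, soil=0.16):
    """Optimized Soil-Adjusted Vegetation Index with the standard 0.16 term."""
    nir, red = nir.astype("float32"), red.astype("float32")
    return (nir - red) / (nir + red + soil)
```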
Generating a Long-Term Land Data Record from the AVHRR and MODIS Instruments
NASA Technical Reports Server (NTRS)
Pedelty, Jeffrey; Devadiga, Sadashiva; Masuoka, Edward; Brown, Molly; Pinzon, Jorge; Tucker, Compton; Vermote, Eric; Prince, Stephen; Nagol, Jyotheshwar; Justice, Christopher;
2007-01-01
The goal of NASA's Land Long Term Data Record (LTDR) project is to produce a consistent long-term data set from the AVHRR and MODIS instruments for land climate studies. The project will create daily surface reflectance and normalized difference vegetation index (NDVI) products at a resolution of 0.05 deg., which is identical to the Climate Modeling Grid (CMG) used for MODIS products from EOS Terra and Aqua. Higher order products such as burned area, land surface temperature, albedo, bidirectional reflectance distribution function (BRDF) correction, leaf area index (LAI), and fraction of photosynthetically active radiation absorbed by vegetation (fPAR) will be created. The LTDR project will reprocess Global Area Coverage (GAC) data from AVHRR sensors onboard NOAA satellites by applying the preprocessing improvements identified in the AVHRR Pathfinder II project and the atmospheric and BRDF corrections used in MODIS processing. The preprocessing improvements include radiometric in-flight vicarious calibration for the visible and near infrared channels and inverse navigation to relate an Earth location to each sensor instantaneous field of view (IFOV). Atmospheric corrections for Rayleigh scattering, ozone, and water vapor are undertaken, with aerosol correction being implemented. The LTDR also produces a surface reflectance product for channel 3 (3.75 micrometers). Quality assessment (QA) is an integral part of the LTDR production system, which is monitoring temporal trends in the AVHRR products using time-series approaches developed for MODIS land product quality assessment. The land surface reflectance products have been evaluated at AERONET sites. The AVHRR data record from LTDR is also being compared to products from the PAL (Pathfinder AVHRR Land) and GIMMS (Global Inventory Modeling and Mapping Studies) systems to assess the relative merits of this reprocessing vis-a-vis these existing data products. The LTDR products and associated information can be found at http://ltdr.nascom.nasa.gov/ltdr/ltdr.html.
Structural health monitoring feature design by genetic programming
NASA Astrophysics Data System (ADS)
Harvey, Dustin Y.; Todd, Michael D.
2014-09-01
Structural health monitoring (SHM) systems provide real-time damage and performance information for civil, aerospace, and other high-capital or life-safety critical structures. Conventional data processing involves pre-processing and extraction of low-dimensional features from in situ time series measurements. The features are then input to a statistical pattern recognition algorithm to perform the relevant classification or regression task necessary to facilitate decisions by the SHM system. Traditional design of signal processing and feature extraction algorithms can be an expensive and time-consuming process requiring extensive system knowledge and domain expertise. Genetic programming, a heuristic program search method from evolutionary computation, was recently adapted by the authors to perform automated, data-driven design of signal processing and feature extraction algorithms for statistical pattern recognition applications. The proposed method, called Autofead, is particularly suitable to handle the challenges inherent in algorithm design for SHM problems where the manifestation of damage in structural response measurements is often unclear or unknown. Autofead mines a training database of response measurements to discover information-rich features specific to the problem at hand. This study provides experimental validation on three SHM applications including ultrasonic damage detection, bearing damage classification for rotating machinery, and vibration-based structural health monitoring. Performance comparisons with common feature choices for each problem area are provided demonstrating the versatility of Autofead to produce significant algorithm improvements on a wide range of problems.
Klingner, Carsten M; Brodoehl, Stefan; Huonker, Ralph; Witte, Otto W
2016-01-01
The question regarding whether somatosensory inputs are processed in parallel or in series has not been clearly answered. Several studies that have applied dynamic causal modeling (DCM) to fMRI data have arrived at seemingly divergent conclusions. However, these divergent results could be explained by the hypothesis that the processing route of somatosensory information changes with time. Specifically, we suggest that somatosensory stimuli are processed in parallel only during the early stage, whereas the processing is later dominated by serial processing. This hypothesis was revisited in the present study based on fMRI analyses of tactile stimuli and the application of DCM to magnetoencephalographic (MEG) data collected during sustained (260 ms) tactile stimulation. Bayesian model comparisons were used to infer the processing stream. We demonstrated that the favored processing stream changes over time. We found that the neural activity elicited in the first 100 ms following somatosensory stimuli is best explained by models that support a parallel processing route, whereas a serial processing route is subsequently favored. These results suggest that the secondary somatosensory area (SII) receives information regarding a new stimulus in parallel with the primary somatosensory area (SI), whereas later processing in the SII is dominated by the preprocessed input from the SI.
pySPACE—a signal processing and classification environment in Python
Krell, Mario M.; Straube, Sirko; Seeland, Anett; Wöhrle, Hendrik; Teiwes, Johannes; Metzen, Jan H.; Kirchner, Elsa A.; Kirchner, Frank
2013-01-01
In neuroscience large amounts of data are recorded to provide insights into cerebral information processing and function. The successful extraction of the relevant signals becomes more and more challenging due to increasing complexities in acquisition techniques and questions addressed. Here, automated signal processing and machine learning tools can help to process the data, e.g., to separate signal and noise. With the presented software pySPACE (http://pyspace.github.io/pyspace), signal processing algorithms can be compared and applied automatically on time series data, either with the aim of finding a suitable preprocessing, or of training supervised algorithms to classify the data. pySPACE originally has been built to process multi-sensor windowed time series data, like event-related potentials from the electroencephalogram (EEG). The software provides automated data handling, distributed processing, modular build-up of signal processing chains and tools for visualization and performance evaluation. Included in the software are various algorithms like temporal and spatial filters, feature generation and selection, classification algorithms, and evaluation schemes. Further, interfaces to other signal processing tools are provided and, since pySPACE is a modular framework, it can be extended with new algorithms according to individual needs. In the presented work, the structural hierarchies are described. It is illustrated how users and developers can interface the software and execute offline and online modes. Configuration of pySPACE is realized with the YAML format, so that programming skills are not mandatory for usage. The concept of pySPACE is to have one comprehensive tool that can be used to perform complete signal processing and classification tasks. It further allows to define own algorithms, or to integrate and use already existing libraries. PMID:24399965
Zuo, Xi-Nian; Xu, Ting; Jiang, Lili; Yang, Zhi; Cao, Xiao-Yan; He, Yong; Zang, Yu-Feng; Castellanos, F Xavier; Milham, Michael P
2013-01-15
While researchers have extensively characterized functional connectivity between brain regions, the characterization of functional homogeneity within a region of the brain connectome is in the early stages of development. Several functional homogeneity measures were proposed previously, among which regional homogeneity (ReHo) was most widely used as a measure to characterize the functional homogeneity of resting state fMRI (R-fMRI) signals within a small region (Zang et al., 2004). Despite a burgeoning literature on ReHo in the field of neuroimaging brain disorders, its test-retest (TRT) reliability remains unestablished. Using two sets of public R-fMRI TRT data, we systematically evaluated ReHo's TRT reliability, investigated the various factors influencing its reliability, and found: 1) nuisance (head motion, white matter, and cerebrospinal fluid) correction of R-fMRI time series can significantly improve the TRT reliability of ReHo, while additional removal of the global brain signal reduces its reliability; 2) spatial smoothing of R-fMRI time series artificially enhances ReHo intensity and influences its reliability; 3) surface-based R-fMRI computation largely improves the TRT reliability of ReHo; 4) a scan duration of 5 min can achieve reliable estimates of ReHo; and 5) fast sampling rates of R-fMRI dramatically increase the reliability of ReHo. Inspired by these findings and seeking a highly reliable approach to exploratory analysis of the human functional connectome, we established an R-fMRI pipeline to conduct ReHo computations in both 3 dimensions (volume) and 2 dimensions (surface). Copyright © 2012 Elsevier Inc. All rights reserved.
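ReHo is Kendall's coefficient of concordance computed over a voxel's neighbourhood time series (per the cited Zang et al., 2004). A minimal numpy/scipy sketch, ignoring the tie correction:

```python
import numpy as np
from scipy.stats import rankdata

def kendalls_w(timeseries):
    """Kendall's coefficient of concordance (ReHo) for a (K x n) array of
    neighbouring voxel time series: K voxels, n time points."""
    K, n = timeseries.shape
    ranks = np.apply_along_axis(rankdata, 1, timeseries)  # rank each voxel's series
    R = ranks.sum(axis=0)            # rank sums over voxels, per time point
    S = ((R - R.mean()) ** 2).sum()  # dispersion of the rank sums
    return 12.0 * S / (K ** 2 * (n ** 3 - n))

# W is 1 when all K series rise and fall in perfect lockstep, near 0 when
# the neighbourhood is temporally incoherent.
```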
chipPCR: an R package to pre-process raw data of amplification curves.
Rödiger, Stefan; Burdukiewicz, Michał; Schierack, Peter
2015-09-01
Both the quantitative real-time polymerase chain reaction (qPCR) and quantitative isothermal amplification (qIA) are standard methods for nucleic acid quantification. Numerous real-time read-out technologies have been developed. Despite the continuous interest in amplification-based techniques, there are only a few tools for pre-processing of amplification data. However, a transparent tool for precise control of raw data is indispensable in several scenarios, for example, during the development of new instruments. chipPCR is an R package for the pre-processing and quality analysis of raw data of amplification curves. The package takes advantage of R's S4 object model and offers an extensible environment. chipPCR contains tools for raw data exploration: normalization, baselining, imputation of missing values, a powerful wrapper for amplification curve smoothing and a function to detect the start and end of an amplification curve. The capabilities of the software are enhanced by the implementation of algorithms unavailable in R, such as a 5-point stencil for derivative interpolation. Simulation tools, statistical tests, plots for data quality management, amplification efficiency/quantification cycle calculation, and datasets from qPCR and qIA experiments are part of the package. Core functionalities are integrated in GUIs (web-based and standalone shiny applications), thus streamlining analysis and report generation. http://cran.r-project.org/web/packages/chipPCR. Source code: https://github.com/michbur/chipPCR. stefan.roediger@b-tu.de Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
NASA Astrophysics Data System (ADS)
Homem-de-Mello, Luiz S.
1992-04-01
While in NASA's earlier space missions such as Voyager the number of sensors was in the hundreds, future platforms such as the Space Station Freedom will have tens of thousands of sensors. For these planned missions it will be impossible to use the comprehensive monitoring strategy that was used in the past, in which human operators monitored all sensors all the time. A selective monitoring strategy must be substituted for the current comprehensive strategy. This selective monitoring strategy uses computer tools to preprocess the incoming data and direct the operators' attention to the most critical parts of the physical system at any given time. There are several techniques that can be used to preprocess the incoming information. This paper presents an approach to using diagnostic reasoning techniques to preprocess the sensor data and detect which parts of the physical system require more attention because components have failed or are most likely to have failed. Given the sensor readings and a model of the physical system, a number of assertions are generated and expressed as Boolean equations. The resulting system of Boolean equations is solved symbolically. Using a priori probabilities of component failure and Bayes' rule, revised probabilities of failure can be computed. These will indicate which components have failed or are the most likely to have failed. This approach is suitable for systems that are well understood and for which the correctness of the assertions can be guaranteed. Also, the system must be such that assertions can be made from instantaneous measurements, and changes must be slow enough to allow the computation.
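The Bayesian step is standard; a minimal sketch of updating one component's failure probability from a single Boolean assertion, with illustrative likelihood values:

```python
def failure_posterior(prior, p_assert_given_fail, p_assert_given_ok, asserted):
    """Update a component's failure probability from one Boolean assertion.

    prior                : P(fail) before the observation
    p_assert_given_fail  : P(assertion true | component failed)
    p_assert_given_ok    : P(assertion true | component healthy)
    asserted             : whether the assertion evaluated to true
    """
    if asserted:
        num = p_assert_given_fail * prior
        den = num + p_assert_given_ok * (1 - prior)
    else:
        num = (1 - p_assert_given_fail) * prior
        den = num + (1 - p_assert_given_ok) * (1 - prior)
    return num / den

# Illustrative numbers: a consistency check that almost always fires when the
# component is broken (0.95) but rarely otherwise (0.05), on a 1% prior:
# failure_posterior(0.01, 0.95, 0.05, True) -> ~0.16
```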
Diagnostics of Dielectric Materials with Several Relaxation Times
NASA Astrophysics Data System (ADS)
Karpov, A. G.; Klemeshev, V. A.
2018-04-01
A set of tools for the detection and preprocessing of dielectrometric information is suggested for studying the polarization/depolarization of dielectrics. Special attention is paid to the processing of dielectrometric data for inhomogeneous materials using dielectric diagrams. A rapid analysis has been developed, the results of which can be used as initial approximations in more accurate (more complicated and time-consuming) iterative algorithms for model fitting.
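A sketch of the underlying model: complex permittivity as a sum of Debye relaxation terms, one per relaxation time, from which the dielectric (Cole-Cole) diagram is obtained by plotting Im(ε) versus Re(ε); all parameter values below are illustrative.

```python
import numpy as np

def debye_permittivity(omega, eps_inf, terms):
    """Complex permittivity for a material with several relaxation times.

    terms: list of (delta_eps, tau) pairs, one per relaxation process:
        eps*(w) = eps_inf + sum_k delta_eps_k / (1 + j*w*tau_k)
    """
    eps = np.full_like(omega, eps_inf, dtype=complex)
    for delta_eps, tau in terms:
        eps += delta_eps / (1 + 1j * omega * tau)
    return eps

# Two well-separated relaxation times produce two distinguishable arcs
# in the Cole-Cole plane (illustrative parameters):
omega = np.logspace(2, 9, 500)
eps = debye_permittivity(omega, 3.0, [(12.0, 1e-4), (5.0, 1e-7)])
```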
Convolutional neural networks for vibrational spectroscopic data analysis.
Acquarelli, Jacopo; van Laarhoven, Twan; Gerretzen, Jan; Tran, Thanh N; Buydens, Lutgarde M C; Marchiori, Elena
2017-02-15
In this work we show that convolutional neural networks (CNNs) can be efficiently used to classify vibrational spectroscopic data and identify important spectral regions. CNNs are the current state-of-the-art in image classification and speech recognition and can learn interpretable representations of the data. These characteristics make CNNs a good candidate for reducing the need for preprocessing and for highlighting important spectral regions, both of which are crucial steps in the analysis of vibrational spectroscopic data. Chemometric analysis of vibrational spectroscopic data often relies on preprocessing methods involving baseline correction, scatter correction and noise removal, which are applied to the spectra prior to model building. Preprocessing is a critical step because even in simple problems using 'reasonable' preprocessing methods may decrease the performance of the final model. We develop a new CNN based method and provide an accompanying publicly available software. It is based on a simple CNN architecture with a single convolutional layer (a so-called shallow CNN). Our method outperforms standard classification algorithms used in chemometrics (e.g. PLS) in terms of accuracy when applied to non-preprocessed test data (86% average accuracy compared to the 62% achieved by PLS), and it achieves better performance even on preprocessed test data (96% average accuracy compared to the 89% achieved by PLS). For interpretability purposes, our method includes a procedure for finding important spectral regions, thereby facilitating qualitative interpretation of results. Copyright © 2016 Elsevier B.V. All rights reserved.
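A minimal PyTorch sketch of a shallow CNN of the kind described, with one convolutional layer over the wavenumber axis; the filter count, kernel size, and pooling are assumptions, not the authors' published architecture.

```python
import torch
import torch.nn as nn

class ShallowSpectralCNN(nn.Module):
    """One convolutional layer over the spectral axis, then a linear classifier."""
    def __init__(self, n_points, n_classes, n_filters=16, kernel=15):
        super().__init__()
        self.conv = nn.Conv1d(1, n_filters, kernel_size=kernel, padding=kernel // 2)
        self.act = nn.ReLU()
        self.pool = nn.AdaptiveAvgPool1d(32)          # coarse positional summary
        self.fc = nn.Linear(n_filters * 32, n_classes)

    def forward(self, x):                             # x: (batch, n_points)
        x = x.unsqueeze(1)                            # -> (batch, 1, n_points)
        x = self.pool(self.act(self.conv(x)))
        return self.fc(x.flatten(1))

# The learned convolution filters respond to informative spectral regions,
# which is what supports the qualitative interpretation described above.
model = ShallowSpectralCNN(n_points=1024, n_classes=3)
logits = model(torch.randn(8, 1024))
```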
Kwon, Yea-Hoon; Shin, Sae-Byuk; Kim, Shin-Dug
2018-04-30
The purpose of this study is to improve human emotional classification accuracy using a convolution neural networks (CNN) model and to suggest an overall method to classify emotion based on multimodal data. We improved classification performance by combining electroencephalogram (EEG) and galvanic skin response (GSR) signals. GSR signals are preprocessed using by the zero-crossing rate. Sufficient EEG feature extraction can be obtained through CNN. Therefore, we propose a suitable CNN model for feature extraction by tuning hyper parameters in convolution filters. The EEG signal is preprocessed prior to convolution by a wavelet transform while considering time and frequency simultaneously. We use a database for emotion analysis using the physiological signals open dataset to verify the proposed process, achieving 73.4% accuracy, showing significant performance improvement over the current best practice models.
Classification of product inspection items using nonlinear features
NASA Astrophysics Data System (ADS)
Talukder, Ashit; Casasent, David P.; Lee, H.-W.
1998-03-01
Automated processing and classification of real-time x-ray images of randomly oriented touching pistachio nuts is discussed. The ultimate objective is the development of a system for automated non-invasive detection of defective product items on a conveyor belt. This approach involves two main steps: preprocessing and classification. Preprocessing locates individual items and segments ones that touch using a modified watershed algorithm. The second stage involves extraction of features that allow discrimination between damaged and clean items (pistachio nuts). This feature extraction and classification stage is the new aspect of this paper. We use a new nonlinear feature extraction scheme called the maximum representation and discriminating feature (MRDF) extraction method to compute nonlinear features that are used as inputs to a classifier. The MRDF is shown to provide better classification and a better ROC (receiver operating characteristic) curve than other methods.
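For the segmentation stage, a common marker-based formulation (a stand-in for the paper's modified watershed) seeds the watershed with local maxima of the distance transform, roughly one marker per nut; a scikit-image sketch with assumed parameters:

```python
import numpy as np
from scipy import ndimage
from skimage.segmentation import watershed
from skimage.feature import peak_local_max

def split_touching(binary):
    """Separate touching items in a binary mask via marker-based watershed."""
    distance = ndimage.distance_transform_edt(binary)
    # one marker per local maximum of the distance map (roughly one per item)
    coords = peak_local_max(distance, min_distance=10, labels=binary.astype(int))
    markers = np.zeros(binary.shape, dtype=int)
    markers[tuple(coords.T)] = np.arange(1, len(coords) + 1)
    # flood the inverted distance map from the markers, confined to the mask
    return watershed(-distance, markers, mask=binary.astype(bool))
```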
Real-time system for imaging and object detection with a multistatic GPR array
Paglieroni, David W; Beer, N Reginald; Bond, Steven W; Top, Philip L; Chambers, David H; Mast, Jeffrey E; Donetti, John G; Mason, Blake C; Jones, Steven M
2014-10-07
A method and system for detecting the presence of subsurface objects within a medium is provided. In some embodiments, the imaging and detection system operates in a multistatic mode to collect radar return signals generated by an array of transceiver antenna pairs that is positioned across the surface and that travels down the surface. The imaging and detection system pre-processes the return signal to suppress certain undesirable effects. The imaging and detection system then generates synthetic aperture radar images from real aperture radar images generated from the pre-processed return signal. The imaging and detection system then post-processes the synthetic aperture radar images to improve detection of subsurface objects. The imaging and detection system identifies peaks in the energy levels of the post-processed image frame, which indicates the presence of a subsurface object.
The Minimal Preprocessing Pipelines for the Human Connectome Project
Glasser, Matthew F.; Sotiropoulos, Stamatios N; Wilson, J Anthony; Coalson, Timothy S; Fischl, Bruce; Andersson, Jesper L; Xu, Junqian; Jbabdi, Saad; Webster, Matthew; Polimeni, Jonathan R; Van Essen, David C; Jenkinson, Mark
2013-01-01
The Human Connectome Project (HCP) faces the challenging task of bringing multiple magnetic resonance imaging (MRI) modalities together in a common automated preprocessing framework across a large cohort of subjects. The MRI data acquired by the HCP differ in many ways from data acquired on conventional 3 Tesla scanners and often require newly developed preprocessing methods. We describe the minimal preprocessing pipelines for structural, functional, and diffusion MRI that were developed by the HCP to accomplish many low level tasks, including spatial artifact/distortion removal, surface generation, cross-modal registration, and alignment to standard space. These pipelines are specially designed to capitalize on the high quality data offered by the HCP. The final standard space makes use of a recently introduced CIFTI file format and the associated grayordinates spatial coordinate system. This allows for combined cortical surface and subcortical volume analyses while reducing the storage and processing requirements for high spatial and temporal resolution data. Here, we provide the minimum image acquisition requirements for the HCP minimal preprocessing pipelines and additional advice for investigators interested in replicating the HCP’s acquisition protocols or using these pipelines. Finally, we discuss some potential future improvements for the pipelines. PMID:23668970
Satterthwaite, Theodore D.; Elliott, Mark A.; Gerraty, Raphael T.; Ruparel, Kosha; Loughead, James; Calkins, Monica E.; Eickhoff, Simon B.; Hakonarson, Hakon; Gur, Ruben C.; Gur, Raquel E.; Wolf, Daniel H.
2013-01-01
Several recent reports in large, independent samples have demonstrated the influence of motion artifact on resting-state functional connectivity MRI (rsfc-MRI). Standard rsfc-MRI preprocessing typically includes regression of confounding signals and band-pass filtering. However, substantial heterogeneity exists in how these techniques are implemented across studies, and no prior study has examined the effect of differing approaches for the control of motion-induced artifacts. To better understand how in-scanner head motion affects rsfc-MRI data, we describe the spatial, temporal, and spectral characteristics of motion artifacts in a sample of 348 adolescents. Analyses utilize a novel approach for describing head motion on a voxelwise basis. Next, we systematically evaluate the efficacy of a range of confound regression and filtering techniques for the control of motion-induced artifacts. Results reveal that the effectiveness of preprocessing procedures on the control of motion is heterogeneous, and that improved preprocessing provides a substantial benefit beyond typical procedures. These results demonstrate that the effect of motion on rsfc-MRI can be substantially attenuated through improved preprocessing procedures, but not completely removed. PMID:22926292
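A minimal sketch of the two standard steps whose implementations vary across studies, confound regression followed by band-pass filtering; the filter order and pass band are assumptions.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def clean_timeseries(Y, confounds, tr, band=(0.01, 0.08)):
    """Regress confounds out of voxel time series, then band-pass filter.

    Y         : (timepoints, voxels) data matrix
    confounds : (timepoints, k) regressors (motion parameters, WM/CSF signals)
    tr        : repetition time in seconds
    """
    X = np.column_stack([np.ones(len(Y)), confounds])   # add intercept
    beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
    residuals = Y - X @ beta                             # remove fitted confounds
    nyq = 0.5 / tr
    b, a = butter(2, [band[0] / nyq, band[1] / nyq], btype="band")
    return filtfilt(b, a, residuals, axis=0)             # zero-phase band-pass
```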
Active and Passive Remote Sensing Data Time Series for Flood Detection and Surface Water Mapping
NASA Astrophysics Data System (ADS)
Bioresita, Filsa; Puissant, Anne; Stumpf, André; Malet, Jean-Philippe
2017-04-01
As a consequence of environmental changes, surface waters are undergoing changes in time and space. A better knowledge of the spatial and temporal distribution of surface water resources becomes essential to support sustainable policies and development activities, especially because surface waters are not only a vital fresh water resource but can also pose hazards to human settlements and infrastructure through flooding. Floods are a highly frequent disaster worldwide and can cause huge material losses. Detecting and mapping their spatial distribution is fundamental to ascertain damages and for relief efforts. Spaceborne Synthetic Aperture Radar (SAR) is an effective way to monitor surface water bodies over large areas since it provides excellent temporal coverage and all-weather, day-and-night imaging capabilities. However, emergent vegetation, trees, wind or flow turbulence can increase radar backscatter returns and pose problems for the delineation of inundated areas. In such areas, passive remote sensing data can be used to identify vegetated areas and support the interpretation of SAR data. The availability of new Earth Observation products, for example Sentinel-1 (active) and Sentinel-2 (passive) imagery, with both high spatial and temporal resolution, has the potential to facilitate flood detection and the monitoring of surface water changes, which are very dynamic in space and time. In this context, the research consists of two parts. In the first part, the objective is to propose a generic and reproducible methodology for the analysis of Sentinel-1 time series data for flood detection and surface water mapping. The processing chain comprises a series of pre-processing steps and the statistical modeling of the pixel value distribution to produce probabilistic maps of the presence of surface water. Pre-processing of all Sentinel-1 images comprises the reduction of SAR effects such as orbit errors, speckle noise, and geometric effects. A modified Split Based Approach (MSBA) is used to focus automatically on surface water areas and to facilitate the estimation of class models for water and non-water areas. A finite mixture model is employed as the underlying statistical model to produce probabilistic maps. Subsequently, bilateral filtering is applied to take spatial neighborhood relationships into account in the generation of the final map. The elimination of shadow effects is performed in a post-processing step. The processing chain is tested on three case studies: the first case is a flood event in central Ireland, the second case is located in Yorkshire, Great Britain, and the third test case covers a recent flood event in northern Italy. The tests showed that the modified SBA step and the finite mixture models can be applied for automatic surface water detection in a variety of test cases. An evaluation against Copernicus products derived from very-high-resolution imagery was performed and showed a high overall accuracy and F-measure for the obtained maps. This evaluation also showed that the use of probability maps and bilateral filtering improved the accuracy of the classification results significantly. Based on this quantitative evaluation, it is concluded that the processing chain can be applied for flood mapping from Sentinel-1 data. To estimate robust statistical distributions, the method requires sufficient surface water areas in the observed zone and sufficient contrast between surface waters and other land use classes.
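A two-component Gaussian mixture on backscatter in dB is a simple stand-in for the finite mixture model described above; a hedged scikit-learn sketch producing the per-pixel water probability map:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def water_probability(sigma0_db):
    """Per-pixel probability of open water from a SAR backscatter image (dB).

    Fits a two-component mixture to the dB histogram; open water is smooth,
    so it forms the low-backscatter component.
    """
    x = sigma0_db.reshape(-1, 1)
    gmm = GaussianMixture(n_components=2, random_state=0).fit(x)
    water = int(np.argmin(gmm.means_.ravel()))     # darker component = water
    probs = gmm.predict_proba(x)[:, water]
    return probs.reshape(sigma0_db.shape)
```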
Ongoing research addresses the fusion of Sentinel-1 and passive remote sensing data (e.g. Sentinel-2) in order to reduce the current shortcomings of the developed processing chain. In this work, fusion is performed at the feature level to better account for the different image properties of SAR and optical sensors. Further, the processing chain is currently being optimized in terms of calculation time for further integration as a flood mapping service on the A2S (Alsace Aval Sentinel) high-performance computing infrastructure of the University of Strasbourg.
NASA Technical Reports Server (NTRS)
Crane, Robert K.; Wang, Xuhe; Westenhaver, David
1996-01-01
The preprocessing software manual describes the Actspp program, originally developed to observe and diagnose Advanced Communications Technology Satellite (ACTS) propagation terminal/receiver problems. However, it has been quite useful for automating the preprocessing functions needed to convert the terminal output to useful attenuation estimates. Before the data are acceptable for archival functions, the individual receiver system must be calibrated and the power level shifts caused by ranging tone modulation must be removed. Actspp provides three output files: the daylog, the diurnal coefficient file, and the file that contains calibration information.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jeon, Chang Ho; Kim, Bohyoung; Gu, Bon Seung
2013-10-15
Purpose: To modify the previously proposed preprocessing technique for improving the compressibility of computed tomography (CT) images so that it covers the diversity of three-dimensional configurations of different body parts, and to evaluate the robustness of the technique in terms of segmentation correctness and increase in reversible compression ratio (CR) for various CT examinations. Methods: This study had institutional review board approval with waiver of informed patient consent. A preprocessing technique was previously proposed to improve the compressibility of CT images by replacing pixel values outside the body region with a constant value, resulting in maximized data redundancy. Since the technique was developed aiming only at chest CT images, the authors modified the segmentation method to cover the diversity of three-dimensional configurations of different body parts. The modified version was evaluated as follows. In randomly selected 368 CT examinations (352,787 images), each image was preprocessed using the modified preprocessing technique. Radiologists visually confirmed whether the segmented region covered the body region or not. The images with and without the preprocessing were reversibly compressed using Joint Photographic Experts Group (JPEG), JPEG2000 two-dimensional (2D), and JPEG2000 three-dimensional (3D) compressions. The percentage increase in CR per examination (CR_I) was measured. Results: The rate of correct segmentation was 100.0% (95% CI: 99.9%, 100.0%) for all the examinations. The medians of CR_I were 26.1% (95% CI: 24.9%, 27.1%), 40.2% (38.5%, 41.1%), and 34.5% (32.7%, 36.2%) in JPEG, JPEG2000 2D, and JPEG2000 3D, respectively. Conclusions: In various CT examinations, the modified preprocessing technique can increase the CR by 25% or more without concern about degradation of diagnostic information.
Simulating Nonequilibrium Radiation via Orthogonal Polynomial Refinement
2015-01-07
Comparative performance evaluation of transform coding in image pre-processing
NASA Astrophysics Data System (ADS)
Menon, Vignesh V.; NB, Harikrishnan; Narayanan, Gayathri; CK, Niveditha
2017-07-01
We are in the midst of a communication revolution which drives the development, as well as the dissemination, of pioneering communication systems with ever-increasing fidelity and resolution. Considerable research has been devoted to image processing techniques, driven by a growing demand for faster and easier encoding, storage and transmission of visual information. In this paper, the researchers intend to throw light on techniques which could be used at the transmitter end in order to ease the transmission and reconstruction of images. The researchers investigate the performance of different image transform coding schemes used in pre-processing, their comparison and effectiveness, the necessary and sufficient conditions, and their properties and complexity of implementation. Motivated by prior advancements in image processing techniques, the researchers compare the performance of several contemporary image pre-processing frameworks: Compressed Sensing, Singular Value Decomposition, and the Integer Wavelet Transform. The paper exposes the potential of the Integer Wavelet Transform to be an efficient pre-processing scheme.
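As a concrete illustration of one of the compared schemes, the sketch below performs the standard truncated-SVD approximation of a grayscale image with NumPy. It is the generic textbook construction rather than the exact pipeline evaluated in the paper, and the rank k is a free parameter.

```python
import numpy as np

def svd_compress(img, k):
    """Low-rank approximation: keep the k largest singular values."""
    U, s, Vt = np.linalg.svd(img.astype(float), full_matrices=False)
    approx = (U[:, :k] * s[:k]) @ Vt[:k, :]
    # Rough storage cost: k * (m + n + 1) numbers vs. m * n originally.
    return approx
```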
Effect of microaerobic fermentation in preprocessing fibrous lignocellulosic materials.
Alattar, Manar Arica; Green, Terrence R; Henry, Jordan; Gulca, Vitalie; Tizazu, Mikias; Bergstrom, Robby; Popa, Radu
2012-06-01
Amending soil with organic matter is common in agricultural and logging practices. Such amendments have benefits to soil fertility and crop yields. These benefits may be increased if material is preprocessed before introduction into soil. We analyzed the efficiency of microaerobic fermentation (MF), also referred to as Bokashi, in preprocessing fibrous lignocellulosic (FLC) organic materials using varying produce amendments and leachate treatments. Adding produce amendments increased leachate production and fermentation rates and decreased the biological oxygen demand of the leachate. Continuously draining leachate without returning it to the fermentors led to acidification and decreased concentrations of polysaccharides (PS) in leachates. PS fragmentation and the production of soluble metabolites and gases stabilized in fermentors in about 2-4 weeks. About 2% of the carbon content was lost as CO2. PS degradation rates, upon introduction of processed materials into soil, were similar to unfermented FLC. Our results indicate that MF is insufficient for adequate preprocessing of FLC material.
Analyzing large scale genomic data on the cloud with Sparkhit
Huang, Liren; Krüger, Jan
2018-01-01
Motivation: The increasing amount of next-generation sequencing data poses a fundamental challenge on large scale genomic analytics. Existing tools use different distributed computational platforms to scale out bioinformatics workloads. However, these tools do not scale efficiently, and they have heavy runtime overheads when pre-processing large amounts of data. To address these limitations, we have developed Sparkhit: a distributed bioinformatics framework built on top of the Apache Spark platform. Results: Sparkhit integrates a variety of analytical methods. It is implemented in the Spark extended MapReduce model. It runs 92–157 times faster than MetaSpark on metagenomic fragment recruitment and 18–32 times faster than Crossbow on data pre-processing. We analyzed 100 terabytes of data across four genomic projects in the cloud in 21 h, which includes the run times of cluster deployment and data downloading. Furthermore, our application on the entire Human Microbiome Project shotgun sequencing data was completed in 2 h, presenting an approach to easily associate large amounts of public datasets with reference data. Availability and implementation: Sparkhit is freely available at https://rhinempi.github.io/sparkhit/. Contact: asczyrba@cebitec.uni-bielefeld.de. Supplementary information: Supplementary data are available at Bioinformatics online. PMID:29253074
NASA Astrophysics Data System (ADS)
Gómez-Gutiérrez, Álvaro; Juan de Sanjosé-Blasco, José; Schnabel, Susanne; de Matías-Bejarano, Javier; Pulido-Fernández, Manuel; Berenguer-Sempere, Fernando
2015-04-01
In this work, the hypothesis that 3D models obtained with Structure from Motion (SfM) approaches can be improved by using images pre-processed with High Dynamic Range (HDR) techniques is tested. Photographs of the Veleta Rock Glacier in Spain were captured with different exposure values (EV0, EV+1 and EV-1), two focal lengths (35 and 100 mm) and under different weather conditions in the years 2008, 2009, 2011, 2012 and 2014. HDR images were produced using the different EV steps within the Fusion F.1 software. Point clouds were generated using commercial and freely available SfM software: Agisoft Photoscan and 123D Catch. Models obtained using pre-processed and non-preprocessed images were compared in a 3D environment with a benchmark 3D model obtained by means of a Terrestrial Laser Scanner (TLS). A total of 40 point clouds were produced, georeferenced and compared. Results indicated that for the Agisoft Photoscan software, differences in accuracy between models obtained with pre-processed and non-preprocessed images were not significant from a statistical viewpoint. However, in the case of the freely available software 123D Catch, models obtained using images pre-processed by HDR techniques presented a higher point density and were more accurate. This tendency was observed across the five studied years and under different capture conditions. More work should be done in the near future to corroborate whether the results of similar software packages can be improved by HDR techniques (e.g. ARC3D, Bundler and PMVS2, CMP SfM, Photosynth and VisualSFM).
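A common way to score such comparisons is the cloud-to-cloud nearest-neighbour distance against the TLS benchmark. The sketch below is a generic version of that measurement, not the paper's exact comparison workflow, and assumes both clouds are already georeferenced in the same frame.

```python
import numpy as np
from scipy.spatial import cKDTree

def cloud_to_cloud_distances(sfm_points, tls_points):
    """Nearest-neighbour distance from each SfM point to the TLS cloud.

    Both inputs are (N, 3) arrays in the same coordinate frame.
    """
    tree = cKDTree(tls_points)
    d, _ = tree.query(sfm_points, k=1)
    return d   # summarize with e.g. np.mean(d) or np.percentile(d, 95)
```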
Big Data Challenges in Global Seismic 'Adjoint Tomography' (Invited)
NASA Astrophysics Data System (ADS)
Tromp, J.; Bozdag, E.; Krischer, L.; Lefebvre, M.; Lei, W.; Smith, J.
2013-12-01
The challenge of imaging Earth's interior on a global scale is closely linked to the challenge of handling large data sets. The related iterative workflow involves five distinct phases, namely, 1) data gathering and culling, 2) synthetic seismogram calculations, 3) pre-processing (time-series analysis and time-window selection), 4) data assimilation and adjoint calculations, 5) post-processing (pre-conditioning, regularization, model update). In order to implement this workflow on modern high-performance computing systems, a new seismic data format is being developed. The Adaptable Seismic Data Format (ASDF) is designed to replace currently used data formats with a more flexible format that allows for fast parallel I/O. The metadata is divided into abstract categories, such as "source" and "receiver", along with provenance information for complete reproducibility. The structure of ASDF is designed keeping in mind three distinct applications: earthquake seismology, seismic interferometry, and exploration seismology. Existing time-series analysis tool kits, such as SAC and ObsPy, can be easily interfaced with ASDF so that seismologists can use robust, previously developed software packages. ASDF accommodates an automated, efficient workflow for global adjoint tomography. Manually managing the large number of simulations associated with the workflow can rapidly become a burden, especially with increasing numbers of earthquakes and stations. Therefore, it is of importance to investigate the possibility of automating the entire workflow. Scientific Workflow Management Software (SWfMS) allows users to execute workflows almost routinely. SWfMS provides additional advantages. In particular, it is possible to group independent simulations in a single job to fit the available computational resources. They also give a basic level of fault resilience as the workflow can be resumed at the correct state preceding a failure. Some of the best candidates for our particular workflow are Kepler and Swift, and the latter appears to be the most serious candidate for a large-scale workflow on a single supercomputer, remaining sufficiently simple to accommodate further modifications and improvements.
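The abstract names ObsPy as one of the time-series tool kits that can be interfaced with ASDF. As a hedged illustration of the kind of phase-3 pre-processing it refers to, the sketch below applies a standard detrend/taper/bandpass sequence to ObsPy's built-in example stream; the corner frequencies are illustrative, not project values.

```python
from obspy import read

st = read()                            # ObsPy's built-in example stream
st.detrend("linear")                   # remove linear trend
st.taper(max_percentage=0.05)          # cosine taper against edge artifacts
st.filter("bandpass", freqmin=0.05, freqmax=1.0)  # illustrative band
print(st)
```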
Yield and Production Properties of Wood chips and Particles Torrefied in a Crucible Furnace Retort
Thomas L. Eberhardt; Chi-Leung So; Karen G. Reed
2016-01-01
Biomass preprocessing by torrefaction improves feedstock consistency and thereby improves the efficiency of biofuels operations, including pyrolysis, gasification, and combustion. A crucible furnace retort was fabricated of sufficient size to handle a commercially available wood chip feedstock. Varying the torrefaction times and temperatures provided an array of...
Fast Automatic Segmentation of White Matter Streamlines Based on a Multi-Subject Bundle Atlas.
Labra, Nicole; Guevara, Pamela; Duclap, Delphine; Houenou, Josselin; Poupon, Cyril; Mangin, Jean-François; Figueroa, Miguel
2017-01-01
This paper presents an algorithm for fast segmentation of white matter bundles from massive dMRI tractography datasets using a multisubject atlas. We use a distance metric to compare streamlines in a subject dataset to labeled centroids in the atlas, and label them using a per-bundle configurable threshold. In order to reduce segmentation time, the algorithm first preprocesses the data using a simplified distance metric to rapidly discard candidate streamlines in multiple stages, while guaranteeing that no false negatives are produced. The smaller set of remaining streamlines is then segmented using the original metric, thus eliminating any false positives from the preprocessing stage. As a result, a single-thread implementation of the algorithm can segment a dataset of almost 9 million streamlines in less than 6 minutes. Moreover, parallel versions of our algorithm for multicore processors and graphics processing units further reduce the segmentation time to less than 22 seconds and to 5 seconds, respectively. This performance enables the use of the algorithm in truly interactive applications for visualization, analysis, and segmentation of large white matter tractography datasets.
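The no-false-negatives guarantee of the first stage requires the cheap metric to be a true lower bound of the exact one. Below is a minimal sketch of that structure, assuming a mean-closest-distance-style exact metric (not necessarily the paper's): the distance between streamline centres of mass lower-bounds the mean point-wise distance by the triangle inequality and is invariant to flipping, so discarding on it is safe.

```python
import numpy as np

def segment_bundle(streamlines, centroid, threshold):
    """Two-stage segmentation: cheap lower bound first, exact metric last.

    streamlines: list of (n_points, 3) arrays, resampled to the same
    number of points as the atlas centroid.
    """
    c_mean = centroid.mean(axis=0)

    def cheap(s):
        # ||mean(s) - mean(c)|| <= mean_i ||s_i - c_i|| for either
        # orientation of s, so this never discards a true positive.
        return np.linalg.norm(s.mean(axis=0) - c_mean)

    def exact(s):
        direct = np.linalg.norm(s - centroid, axis=1).mean()
        flipped = np.linalg.norm(s[::-1] - centroid, axis=1).mean()
        return min(direct, flipped)    # orientation-invariant distance

    survivors = (s for s in streamlines if cheap(s) <= threshold)
    return [s for s in survivors if exact(s) <= threshold]
```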
Large-Scale Point-Cloud Visualization through Localized Textured Surface Reconstruction.
Arikan, Murat; Preiner, Reinhold; Scheiblauer, Claus; Jeschke, Stefan; Wimmer, Michael
2014-09-01
In this paper, we introduce a novel scene representation for the visualization of large-scale point clouds accompanied by a set of high-resolution photographs. Many real-world applications deal with very densely sampled point-cloud data, which are augmented with photographs that often reveal lighting variations and inaccuracies in registration. Consequently, the high-quality representation of the captured data, i.e., both point clouds and photographs together, is a challenging and time-consuming task. We propose a two-phase approach, in which the first (preprocessing) phase generates multiple overlapping surface patches and handles the problem of seamless texture generation locally for each patch. The second phase stitches these patches at render-time to produce a high-quality visualization of the data. As a result of the proposed localization of the global texturing problem, our algorithm is more than an order of magnitude faster than equivalent mesh-based texturing techniques. Furthermore, since our preprocessing phase requires only a minor fraction of the whole data set at once, we provide maximum flexibility when dealing with growing data sets.
NASA Astrophysics Data System (ADS)
Princz, S.; Wenzel, U.; Miller, R.; Hessling, M.
2014-11-01
One aerobic and four anaerobic batch fermentations of the yeast Saccharomyces cerevisiae were conducted in a stirred bioreactor and monitored inline by NIR spectroscopy and a transflectance dip probe. From the acquired NIR spectra, chemometric partial least squares regression (PLSR) models for predicting biomass, glucose and ethanol were constructed. The spectra were directly measured in the fermentation broth and successfully inspected for adulteration using our novel data pre-processing method. These adulterations manifested as strong fluctuations in the shape and offset of the absorption spectra. They resulted from cells, cell clusters, or gas bubbles intercepting the optical path of the dip probe. In the proposed data pre-processing method, adulterated signals are removed by passing the time-scanned non-averaged spectra through two filter algorithms with a 5% quantile cutoff. The filtered spectra containing meaningful data are then averaged. A second step checks whether the whole time scan is analyzable. If true, the average is calculated and used to prepare the PLSR models. This new method distinctly improved the prediction results. To dissociate possible correlations between analyte concentrations, such as glucose and ethanol, the feeding analytes were alternately supplied at different concentrations (spiking) at the end of the four anaerobic fermentations. This procedure yielded (anaerobic) PLSR models with low prediction errors: 0.31 g/l for biomass, 3.41 g/l for glucose, and 2.17 g/l for ethanol. The maximum concentrations were 14 g/l biomass, 167 g/l glucose, and 80 g/l ethanol. Data from the aerobic fermentation, carried out under high agitation and high aeration, were incorporated to realize combined PLSR models, which have not been previously reported to our knowledge.
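The filtering step lends itself to a compact sketch. The version below is an assumption-laden stand-in for the authors' two filter algorithms: it screens single scans on spectral offset and on shape deviation from the median spectrum, drops the worst 5% on each criterion, and averages the survivors.

```python
import numpy as np

def filter_and_average(scans, cutoff=0.05):
    """Screen single scans before averaging (illustrative stand-in for
    the study's two filter algorithms with a 5% quantile cutoff).

    scans: (n_scans, n_wavelengths) array from one time scan.
    """
    offset = scans.mean(axis=1)                   # baseline level per scan
    ref = np.median(scans, axis=0)                # robust reference shape
    shape_err = np.abs(scans - ref).mean(axis=1)  # deviation from reference

    keep = np.ones(len(scans), dtype=bool)
    for score in (np.abs(offset - np.median(offset)), shape_err):
        keep &= score <= np.quantile(score, 1 - cutoff)

    if keep.mean() < 0.5:    # illustrative "is the scan analyzable" check
        return None
    return scans[keep].mean(axis=0)
```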
Chriskos, Panteleimon; Frantzidis, Christos A; Gkivogkli, Polyxeni T; Bamidis, Panagiotis D; Kourtidou-Papadeli, Chrysoula
2018-01-01
Sleep staging, the process of assigning labels to epochs of sleep depending on the stage to which they belong, is an arduous, time consuming and error prone process, as the initial recordings are quite often polluted by noise from different sources. To properly analyze such data and extract clinical knowledge, noise components must be removed or alleviated. In this paper a pre-processing and subsequent sleep staging pipeline for the sleep analysis of electroencephalographic signals is described. Two novel methods of functional connectivity estimation (Synchronization Likelihood/SL and Relative Wavelet Entropy/RWE) are comparatively investigated for automatic sleep staging through manually pre-processed electroencephalographic recordings. A multi-step process that renders signals suitable for further analysis is initially described. Then, two methods that rely on extracting synchronization features from electroencephalographic recordings to achieve computerized sleep staging are proposed, based on bivariate features which provide a functional overview of the brain network, contrary to most proposed methods that rely on extracting univariate time and frequency features. Annotation of sleep epochs is achieved through the presented feature extraction methods by training classifiers, which are in turn able to accurately classify new epochs. Analysis of data from sleep experiments in a randomized, controlled bed-rest study, organized by the European Space Agency and conducted in the "ENVIHAB" facility of the Institute of Aerospace Medicine at the German Aerospace Center (DLR) in Cologne, Germany, attains accuracy rates of over 90% against ground truth produced by manual sleep staging from two experienced sleep experts. Therefore, it can be concluded that the above feature extraction methods are suitable for semi-automatic sleep staging.
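Of the two connectivity features, RWE is the easier to sketch. The snippet below computes a relative wavelet entropy between two EEG channels as a KL-type divergence of their normalized wavelet-energy distributions, using PyWavelets; the wavelet family and decomposition depth are illustrative assumptions, not the paper's settings.

```python
import numpy as np
import pywt

def wavelet_energy_dist(x, wavelet="db4", level=5):
    """Normalized energy per decomposition level of one EEG epoch."""
    coeffs = pywt.wavedec(x, wavelet, level=level)
    e = np.array([np.sum(c ** 2) for c in coeffs])
    return e / e.sum()

def relative_wavelet_entropy(x, y, eps=1e-12):
    """KL-type divergence between the wavelet energy distributions of
    two channels; small values mean similar spectral content."""
    p = wavelet_energy_dist(x) + eps
    q = wavelet_energy_dist(y) + eps
    return float(np.sum(p * np.log(p / q)))
```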
Resource-Aware Mobile-Based Health Monitoring.
Masud, Mohammad M; Adel Serhani, Mohamed; Navaz, Alramzana Nujum
2017-03-01
Monitoring heart diseases often requires frequent measurements of electrocardiogram (ECG) signals at different periods of the day and in different situations (e.g., traveling and exercising). This can only be implemented using mobile devices in order to cope with the mobility of patients under monitoring, thus supporting continuous monitoring practices. However, these devices are energy-constrained, have limited computing resources (e.g., CPU speed and memory), and might lose network connectivity, which makes it very challenging to maintain the continuity of a monitoring episode. In this paper, we propose a mobile monitoring solution that copes with these challenges by taking into account on-the-fly resource availability, battery level, and network intermittence. In order to solve this problem, first we divide the whole process into several subtasks such that each subtask can be executed sequentially either on the server or on the mobile device, or in parallel on both. Then, we developed a mathematical model that considers all the constraints and finds a dynamic programming solution to obtain the best execution path (i.e., which substep should be done where). The solution guarantees an optimum execution time, while considering device battery availability, execution and transmission time, and network availability. We conducted a series of experiments to evaluate our proposed approach using some key monitoring tasks, from preprocessing to classification and prediction. The results we obtained proved that our approach gives the best (lowest) running time for any combination of factors, including processing speed, input size, and network bandwidth. Compared to several greedy but nonoptimal solutions, the execution time of our approach was at least 10 times faster and consumed 90% less energy.
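For a chain of subtasks and two execution sites, the placement problem reduces to a short dynamic-programming recurrence. The sketch below captures only the execution/transmission-time part of such a model (battery and bandwidth terms would enter the same recurrence as additional cost components); the cost numbers in the example are made up.

```python
def best_execution_path(exec_cost, trans_cost):
    """DP sketch for placing a chain of subtasks on two sites.

    exec_cost[i][d]: time of subtask i on device d (0=mobile, 1=server).
    trans_cost: time to ship intermediate data between devices.
    Returns the minimal total time over all placements.
    """
    best = list(exec_cost[0])          # cost of ending task 0 on device d
    for costs in exec_cost[1:]:
        best = [
            costs[d] + min(best[d], best[1 - d] + trans_cost)
            for d in (0, 1)
        ]
    return min(best)

# Example: 3 subtasks, server faster but each hand-over costs 2 units.
print(best_execution_path([[4, 1], [5, 1], [3, 2]], trans_cost=2.0))  # 4.0
```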
Sundareshan, Malur K; Bhattacharjee, Supratik; Inampudi, Radhika; Pang, Ho-Yuen
2002-12-10
Computational complexity is a major impediment to the real-time implementation of image restoration and superresolution algorithms in many applications. Although powerful restoration algorithms have been developed within the past few years utilizing sophisticated mathematical machinery (based on statistical optimization and convex set theory), these algorithms are typically iterative in nature and require a sufficient number of iterations to be executed to achieve the desired resolution improvement that may be needed to meaningfully perform postprocessing image exploitation tasks in practice. Additionally, recent technological breakthroughs have facilitated novel sensor designs (focal plane arrays, for instance) that make it possible to capture megapixel imagery data at video frame rates. A major challenge in the processing of these large-format images is to complete the execution of the image processing steps within the frame capture times and to keep up with the output rate of the sensor so that all data captured by the sensor can be efficiently utilized. Consequently, development of novel methods that facilitate real-time implementation of image restoration and superresolution algorithms is of significant practical interest and is the primary focus of this study. The key to designing computationally efficient processing schemes lies in strategically introducing appropriate preprocessing steps together with the superresolution iterations to tailor optimized overall processing sequences for imagery data of specific formats. To substantiate this assertion, three distinct methods for tailoring a preprocessing filter and integrating it with the superresolution processing steps are outlined. These methods consist of a region-of-interest extraction scheme, a background-detail separation procedure, and a scene-derived information extraction step for implementing a set-theoretic restoration of the image that is less demanding in computation compared with the superresolution iterations. A quantitative evaluation of the performance of these algorithms for restoring and superresolving various imagery data captured by diffraction-limited sensing operations is also presented.
Nunez, Michael D.; Vandekerckhove, Joachim; Srinivasan, Ramesh
2016-01-01
Perceptual decision making can be accounted for by drift-diffusion models, a class of decision-making models that assume a stochastic accumulation of evidence on each trial. Fitting response time and accuracy to a drift-diffusion model produces evidence accumulation rate and non-decision time parameter estimates that reflect cognitive processes. Our goal is to elucidate the effect of attention on visual decision making. In this study, we show that measures of attention obtained from simultaneous EEG recordings can explain per-trial evidence accumulation rates and perceptual preprocessing times during a visual decision making task. Models assuming linear relationships between diffusion model parameters and EEG measures as external inputs were fit in a single step in a hierarchical Bayesian framework. The EEG measures were features of the evoked potential (EP) to the onset of a masking noise and the onset of a task-relevant signal stimulus. Single-trial evoked EEG responses, P200s to the onsets of visual noise and N200s to the onsets of visual signal, explain single-trial evidence accumulation and preprocessing times. Within-trial evidence accumulation variance was not found to be influenced by attention to the signal or noise. Single-trial measures of attention lead to better out-of-sample predictions of accuracy and correct reaction time distributions for individual subjects. PMID:28435173
Low-cost digital image processing at the University of Oklahoma
NASA Technical Reports Server (NTRS)
Harrington, J. A., Jr.
1981-01-01
Computer assisted instruction in remote sensing at the University of Oklahoma involves two separate approaches and is dependent upon initial preprocessing of a LANDSAT computer compatible tape using software developed for an IBM 370/158 computer. In-house generated preprocessing algorithms permit students or researchers to select a subset of a LANDSAT scene for subsequent analysis using either general purpose statistical packages or color graphic image processing software developed for Apple II microcomputers. Procedures for preprocessing the data and for image analysis using either of the two approaches to low-cost LANDSAT data processing are described.
Zhang, Chu; Liu, Fei; He, Yong
2018-02-01
Hyperspectral imaging was used to identify and visualize coffee bean varieties. Spectral preprocessing of the pixel-wise spectra was conducted by different methods, including moving average smoothing (MA), wavelet transform (WT) and empirical mode decomposition (EMD). Meanwhile, spatial preprocessing of the gray-scale image at each wavelength was conducted by a median filter (MF). Support vector machine (SVM) models using full sample average spectra and pixel-wise spectra, as well as the optimal wavelengths selected by second derivative spectra, all achieved classification accuracy over 80%. First, the SVM models built on pixel-wise spectra were used to predict the sample average spectra, and these models obtained over 80% classification accuracy. Second, the SVM models built on sample average spectra were used to predict pixel-wise spectra, but achieved lower than 50% classification accuracy. The results indicated that WT and EMD were suitable for pixel-wise spectra preprocessing. The use of pixel-wise spectra could extend the calibration set and resulted in good prediction results for both pixel-wise spectra and sample average spectra. The overall results indicated the effectiveness of spectral preprocessing and of adopting pixel-wise spectra, and provide an alternative way of data processing for applications of hyperspectral imaging in the food industry.
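The two preprocessing axes (spectral and spatial) can be sketched directly on a hyperspectral cube. The snippet below applies moving-average smoothing along the spectral axis and a per-band median filter, as a generic rendering of the MA and MF steps named in the abstract; window sizes are illustrative.

```python
import numpy as np
from scipy.ndimage import median_filter

def preprocess_cube(cube, window=5, mf_size=3):
    """Spectral MA smoothing, then a spatial median filter per band.

    cube: (rows, cols, bands) hyperspectral array.
    """
    kernel = np.ones(window) / window
    smoothed = np.apply_along_axis(
        lambda s: np.convolve(s, kernel, mode="same"), 2, cube)
    for b in range(smoothed.shape[2]):        # spatial MF on each band image
        smoothed[:, :, b] = median_filter(smoothed[:, :, b], size=mf_size)
    return smoothed
```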
Multiwavelet grading of prostate pathological images
NASA Astrophysics Data System (ADS)
Soltanian-Zadeh, Hamid; Jafari-Khouzani, Kourosh
2002-05-01
We have developed image analysis methods to automatically grade pathological images of the prostate. The proposed method assigns Gleason grades to images, where each image is given a grade between 1 and 5. This is done using features extracted from multiwavelet transformations. We extract energy and entropy features from the submatrices obtained in the decomposition. Next, we apply a k-NN classifier to grade the image. To find the optimal multiwavelet basis, preprocessing, and classifier, we use features extracted by different multiwavelets with either critically sampled preprocessing or repeated row preprocessing and different k-NN classifiers, and compare their performances as evaluated by the total misclassification rate (TMR). To evaluate sensitivity to noise, we add white Gaussian noise to the images and compare the results (TMRs). We applied the proposed methods to 100 images. We evaluated the first and second levels of decomposition using the Geronimo, Hardin, and Massopust (GHM), Chui and Lian (CL), and Shen (SA4) multiwavelets. We also evaluated the k-NN classifier for k=1,2,3,4,5. Experimental results illustrate that the first level of decomposition is quite noisy. They also show that critically sampled preprocessing outperforms repeated row preprocessing and is less sensitive to noise. Finally, comparison studies indicate that the SA4 multiwavelet and the k-NN classifier with k=1 generate the best results (with the smallest TMR of 3%).
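The feature-plus-classifier pipeline is easy to prototype with a scalar wavelet standing in for the multiwavelets of the study. The sketch below extracts energy and entropy from each detail submatrix of a two-level decomposition and grades with 1-NN; the wavelet choice and all parameters are illustrative assumptions.

```python
import numpy as np
import pywt
from sklearn.neighbors import KNeighborsClassifier

def texture_features(img, wavelet="db2", level=2):
    """Energy and entropy of each detail submatrix (scalar-wavelet
    stand-in for the paper's multiwavelet decomposition)."""
    feats = []
    coeffs = pywt.wavedec2(img, wavelet, level=level)
    for detail in coeffs[1:]:
        for sub in detail:                        # LH, HL, HH submatrices
            p = np.abs(sub).ravel()
            p = p / (p.sum() + 1e-12)
            feats.append(np.sum(sub ** 2))                 # energy
            feats.append(-np.sum(p * np.log(p + 1e-12)))   # entropy
    return np.array(feats)

def grade_images(train_imgs, train_grades, test_imgs, k=1):
    """k-NN grading on wavelet texture features (k=1 was optimal there)."""
    X = np.array([texture_features(im) for im in train_imgs])
    Xt = np.array([texture_features(im) for im in test_imgs])
    clf = KNeighborsClassifier(n_neighbors=k).fit(X, train_grades)
    return clf.predict(Xt)
```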
Multisubject Learning for Common Spatial Patterns in Motor-Imagery BCI
Devlaminck, Dieter; Wyns, Bart; Grosse-Wentrup, Moritz; Otte, Georges; Santens, Patrick
2011-01-01
Motor-imagery-based brain-computer interfaces (BCIs) commonly use the common spatial pattern filter (CSP) as preprocessing step before feature extraction and classification. The CSP method is a supervised algorithm and therefore needs subject-specific training data for calibration, which is very time consuming to collect. In order to reduce the amount of calibration data that is needed for a new subject, one can apply multitask (from now on called multisubject) machine learning techniques to the preprocessing phase. Here, the goal of multisubject learning is to learn a spatial filter for a new subject based on its own data and that of other subjects. This paper outlines the details of the multitask CSP algorithm and shows results on two data sets. In certain subjects a clear improvement can be seen, especially when the number of training trials is relatively low. PMID:22007194
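For reference, the single-subject CSP preprocessing that the multisubject method generalizes can be written as a generalized eigendecomposition of class-wise covariance matrices. The sketch below is that textbook construction, not the multitask extension of the paper; it assumes at least n_filters channels.

```python
import numpy as np
from scipy.linalg import eigh

def csp_filters(X1, X2, n_filters=6):
    """Classic CSP: X1, X2 are (trials, channels, samples) arrays for
    the two motor-imagery classes; returns (n_filters, channels)."""
    def mean_cov(X):
        return np.mean([np.cov(trial) for trial in X], axis=0)

    C1, C2 = mean_cov(X1), mean_cov(X2)
    # Generalized eigenproblem C1 w = lambda (C1 + C2) w; the extreme
    # eigenvalues give filters that maximize variance for one class
    # while minimizing it for the other.
    vals, vecs = eigh(C1, C1 + C2)
    order = np.argsort(vals)
    pick = np.r_[order[: n_filters // 2], order[-(n_filters // 2):]]
    return vecs[:, pick].T
```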
MZmine 2 Data-Preprocessing To Enhance Molecular Networking Reliability.
Olivon, Florent; Grelier, Gwendal; Roussi, Fanny; Litaudon, Marc; Touboul, David
2017-08-01
Molecular networking is becoming more and more popular in the metabolomics community as a way to organize tandem mass spectrometry (MS2) data. Even though this approach allows the treatment and comparison of large data sets, several drawbacks related to the MS-Cluster tool routinely used on the Global Natural Product Social Molecular Networking platform (GNPS) limit its potential. MS-Cluster cannot distinguish between chromatographically well-resolved isomers, as retention times are not taken into account. Annotation with predicted chemical formulas is also not implemented, and semiquantification is only based on the number of MS2 scans. We propose to introduce a data-preprocessing workflow including preliminary data treatment by MZmine 2 followed by a homemade Python script, freely available to the community, that resolves the major GNPS drawbacks mentioned above. The efficiency of this workflow is exemplified with the analysis of six fractions of increasing polarities obtained from a sequential supercritical CO2 extraction of Stillingia lineata leaves.
Genova, Giuseppe; Tosetti, Roberta; Tonutti, Pietro
2016-01-30
Grape juice is an important dietary source of health-promoting antioxidant molecules. Different factors may affect juice composition and nutraceutical properties. The effects of some of these factors (harvest time, pre-processing ethylene treatment of grapes and juice thermal pasteurization) were here evaluated, considering in particular the phenolic composition and antioxidant capacity. Grapes (Vitis vinifera L., red-skinned variety Sangiovese) were collected twice in relation to the technological harvest (TH) and 12 days before TH (early harvest, EH) and treated with gaseous ethylene (1000 ppm) or air for 48 h. Fresh and pasteurized (78 °C for 30 min) juices were produced using a water bath. Three-way analysis of variance showed that the harvest date had the strongest impact on total polyphenols, hydroxycinnamates, flavonols, and especially on total flavonoids. Pre-processing ethylene treatment significantly increased the proanthocyanidin, anthocyanin and flavan-3-ol content in the juices. Pasteurization induced a significant increase in anthocyanin concentration. Antioxidant capacity was enhanced by ethylene treatment and pasteurization in juices from both TH and EH grapes. These results suggest that an appropriate management of grape harvesting date, postharvest and processing may lead to an improvement in nutraceutical quality of juices. Further research is needed to study the effect of the investigated factors on juice organoleptic properties. © 2015 Society of Chemical Industry.
Raef, A.
2009-01-01
The recent proliferation of the 3D reflection seismic method into the near-surface area of geophysical applications, especially in response to the emerging need to comprehensively characterize and monitor near-surface carbon dioxide sequestration in shallow saline aquifers around the world, justifies the emphasis on a cost-effective and robust quality control and assurance (QC/QA) workflow for 3D seismic data preprocessing that is suitable for near-surface applications. The main purpose of our seismic data preprocessing QC is to enable the use of appropriate header information, data that are free of noise-dominated traces, and/or flawed vertical stacking in subsequent processing steps. In this article, I provide an account of utilizing survey design specifications, noise properties, first breaks, and normal moveout for rapid and thorough graphical QC/QA diagnostics, which are easy to apply and efficient in the diagnosis of inconsistencies. A correlated vibroseis time-lapse 3D-seismic data set from a CO2-flood monitoring survey is used for demonstrating QC diagnostics. An important by-product of the QC workflow is establishing the number of layers for a refraction statics model in a data-driven graphical manner that capitalizes on the spatial coverage of the 3D seismic data. © China University of Geosciences (Wuhan) and Springer-Verlag GmbH 2009.
Digital image processing and analysis for activated sludge wastewater treatment.
Khan, Muhammad Burhan; Lee, Xue Yong; Nisar, Humaira; Ng, Choon Aun; Yeap, Kim Ho; Malik, Aamir Saeed
2015-01-01
Activated sludge systems are generally used in wastewater treatment plants for processing domestic influent. Conventionally, activated sludge wastewater treatment is monitored by measuring physico-chemical parameters like total suspended solids (TSSol), sludge volume index (SVI) and chemical oxygen demand (COD). These tests are conducted in the laboratory and take many hours to yield the final measurement. Digital image processing and analysis offers a better alternative, not only to monitor and characterize the current state of activated sludge but also to predict its future state. The characterization by image processing and analysis is done by correlating the time evolution of parameters extracted by image analysis of flocs and filaments with the physico-chemical parameters. This chapter briefly reviews activated sludge wastewater treatment and the procedures of image acquisition, preprocessing, segmentation and analysis in the specific context of activated sludge wastewater treatment. In the latter part, additional procedures like z-stacking and image stitching are introduced for wastewater image preprocessing, which have not previously been used in the context of activated sludge. Different preprocessing and segmentation techniques are proposed, along with a survey of imaging procedures reported in the literature. Finally, the image analysis based morphological parameters and the correlation of these parameters with regard to monitoring and prediction of activated sludge are discussed. Hence it is observed that image analysis can play a very useful role in the monitoring of activated sludge wastewater treatment plants.
Pre-Processing and Cross-Correlation Techniques for Time-Distance Helioseismology
NASA Astrophysics Data System (ADS)
Wang, N.; de Ridder, S.; Zhao, J.
2014-12-01
In chaotic wave fields excited by a random distribution of noise sources, a cross-correlation of the recordings made at two stations yields the interstation wave-field response. After early successes in helioseismology, laboratory studies and earth seismology, this technique found broad application in global and regional seismology. This development came with an increasing understanding of pre-processing and cross-correlation workflows to yield an optimal signal-to-noise ratio (SNR). Helioseismologists, by contrast, have not studied different spectral-whitening and cross-correlation workflows and rely heavily on stacking to increase the SNR. The recordings vary considerably between sunspots and regular portions of the sun. Within a sunspot, the periodic effects of the observation satellite orbit are difficult to remove. We remove a running alpha-mean from the data and apply a soft clip to deal with data glitches. The recordings contain energy of both flows and waves. A frequency-domain filter selects the wave energy. The data is then input to several pre-processing and cross-correlation techniques common in earth seismology. We anticipate that spectral whitening will flatten the energy spectrum of the cross-correlations. We also expect that the cross-correlations converge faster to their expected value when the data is processed over overlapping windows. The results of this study are expected to aid in decreasing the stacking while maintaining a good SNR.
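The earth-seismology recipe referred to here (whiten each window, cross-correlate, stack over overlapping windows) can be sketched in a few lines. The snippet below is a generic version under those assumptions, not the authors' processing chain; window and hop lengths are illustrative.

```python
import numpy as np

def whiten(x, eps=1e-8):
    """Spectral whitening: keep the phase, flatten the amplitude spectrum."""
    X = np.fft.rfft(x)
    return np.fft.irfft(X / (np.abs(X) + eps), n=len(x))

def noise_correlation(a, b, win=4096, hop=2048):
    """Stack cross-correlations of whitened, demeaned, overlapping windows."""
    acc = np.zeros(2 * win - 1)
    for i in range(0, min(len(a), len(b)) - win + 1, hop):
        wa = whiten(a[i:i + win] - a[i:i + win].mean())
        wb = whiten(b[i:i + win] - b[i:i + win].mean())
        acc += np.correlate(wa, wb, mode="full")
    return acc
```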
NASA Astrophysics Data System (ADS)
Erdogan, Eren; Schmidt, Michael; Seitz, Florian; Durmaz, Murat
2017-02-01
Although the number of terrestrial global navigation satellite system (GNSS) receivers supported by the International GNSS Service (IGS) is rapidly growing, the worldwide rather inhomogeneously distributed observation sites do not allow the generation of high-resolution global ionosphere products. Conversely, with the regionally enormous increase in highly precise GNSS data, the demands on (near) real-time ionosphere products, necessary in many applications such as navigation, are growing very fast. Consequently, many analysis centers have accepted the responsibility of generating such products. In this regard, the primary objective of our work is to develop a near real-time processing framework for the estimation of the vertical total electron content (VTEC) of the ionosphere using proper models that are capable of a global representation adapted to the real data distribution. The global VTEC representation developed in this work is based on a series expansion in terms of compactly supported B-spline functions, which allow for an appropriate handling of the heterogeneous data distribution, including data gaps. The corresponding series coefficients and additional parameters such as differential code biases of the GNSS satellites and receivers constitute the set of unknown parameters. The Kalman filter (KF), as a popular recursive estimator, allows processing of the data immediately after acquisition and paves the way for sequential (near) real-time estimation of the unknown parameters. To exploit the advantages of the chosen data representation and the estimation procedure, the B-spline model is incorporated into the KF under consideration of necessary constraints. Based on a preprocessing strategy, the developed approach utilizes hourly batches of GPS and GLONASS observations provided by the IGS data centers with a latency of 1 h in its current realization. Two methods are used for validation of the results, namely a self-consistency analysis and a comparison with Jason-2 altimetry data. The highly promising validation results allow the conclusion that, under the investigated conditions, our derived near real-time product is of the same accuracy level as the so-called final post-processed products provided by the IGS with a latency of several days or even weeks.
Generic and Automated Data Evaluation in Analytical Measurement.
Adam, Martin; Fleischer, Heidi; Thurow, Kerstin
2017-04-01
In the past years, automation has become more and more important in the field of elemental and structural chemical analysis, to reduce the high degree of manual operation and processing time as well as human error. A high number of data points is thus generated, which requires fast and automated data evaluation. For handling the preprocessed export data from different analytical devices with software from various vendors, a standardized solution that requires no programming knowledge is preferable. In modern laboratories, multiple users will use this software on multiple personal computers with different operating systems (e.g., Windows, Macintosh, Linux). Also, mobile devices such as smartphones and tablets have gained growing importance. The developed software, Project Analytical Data Evaluation (ADE), is implemented as a web application. To transmit the pre-evaluated data from the device software to the Project ADE, the exported XML report files are detected and the included data are imported into the entities database using the Data Upload software. Different calculation types of a sample within one measurement series (e.g., method validation) are identified using information tags inside the sample name. The results are presented in tables and diagrams at different information levels (general, detailed for one analyte or sample).
A Multivariate Granger Causality Concept towards Full Brain Functional Connectivity.
Schmidt, Christoph; Pester, Britta; Schmid-Hertel, Nicole; Witte, Herbert; Wismüller, Axel; Leistritz, Lutz
2016-01-01
Detecting changes of spatially high-resolution functional connectivity patterns in the brain is crucial for improving the fundamental understanding of brain function in both health and disease, yet still poses one of the biggest challenges in computational neuroscience. Currently, classical multivariate Granger Causality analyses of directed interactions between single process components in coupled systems are commonly restricted to spatially low-dimensional data, which requires a pre-selection or aggregation of time series as a preprocessing step. In this paper we propose a new fully multivariate Granger Causality approach with embedded dimension reduction that makes it possible to obtain a representation of functional connectivity for spatially high-dimensional data. The resulting functional connectivity networks may consist of several thousand vertices and thus contain more detailed information compared to connectivity networks obtained from approaches based on particular regions of interest. Our large scale Granger Causality approach is applied to synthetic and resting state fMRI data with a focus on how well network community structure, which represents a functional segmentation of the network, is preserved. It is demonstrated that a number of different community detection algorithms, which utilize a variety of algorithmic strategies and exploit topological features differently, reveal meaningful information on the underlying network module structure.
Gebauer, André; Jahr, Thomas; Jentzsch, Gerhard
2007-05-01
In June 2003, a large scale injection experiment started at the Continental Deep Drilling site (KTB) in Germany. A tiltmeter array was installed which consisted of five high resolution borehole tiltmeters of the ASKANIA type, also equipped with three-dimensional seismometers. Over the following 11 months, 86 000 m3 were injected into the 4000 m deep KTB pilot borehole. The average injection rate was approximately 200 l/min. The research objective was to observe and analyze the deformation caused by the injection into the upper crust at the kilometer range. A new data acquisition system was developed by the Geo-Research Center Potsdam (GFZ) to master the expected huge amount of seismic and tilt data. Furthermore, it was necessary to develop new preprocessing software called PREANALYSE for long-period time series. This software includes several useful functions, such as step and spike correction, interpolation, filtering, and spectral analysis. This worldwide unique installation offers an excellent opportunity to separate signals due to the injection from those due to the environment by correlating the data of the five stations with the ground water table and meteorological data.
Human Movement Recognition Based on the Stochastic Characterisation of Acceleration Data
Munoz-Organero, Mario; Lotfi, Ahmad
2016-01-01
Human activity recognition algorithms based on information obtained from wearable sensors are successfully applied in detecting many basic activities. Identified activities with time-stationary features are characterised inside a predefined temporal window by using different machine learning algorithms on extracted features from the measured data. Better accuracy, precision and recall levels could be achieved by combining the information from different sensors. However, detecting short and sporadic human movements, gestures and actions is still a challenging task. In this paper, a novel algorithm to detect human basic movements from wearable measured data is proposed and evaluated. The proposed algorithm is designed to minimise computational requirements while achieving acceptable accuracy levels based on characterising some particular points in the temporal series obtained from a single sensor. The underlying idea is that this algorithm would be implemented in the sensor device in order to pre-process the sensed data stream before sending the information to a central point combining the information from different sensors to improve accuracy levels. Intra- and inter-person validation is used for two particular cases: single step detection and fall detection and classification using a single tri-axial accelerometer. Relevant results for the above cases and pertinent conclusions are also presented. PMID:27618063
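A minimal version of the single step detection case can be written as peak-picking on the smoothed acceleration magnitude from one tri-axial sensor. The sketch below uses SciPy's find_peaks; the sampling rate, smoothing window, and thresholds are illustrative assumptions rather than the paper's trained parameters.

```python
import numpy as np
from scipy.signal import find_peaks

def count_steps(acc_xyz, fs=50.0):
    """Count steps as peaks in the smoothed, mean-removed magnitude.

    acc_xyz: (n_samples, 3) tri-axial accelerometer array.
    """
    mag = np.linalg.norm(acc_xyz, axis=1)
    mag = np.convolve(mag - mag.mean(), np.ones(5) / 5, mode="same")
    # Height and minimum peak spacing (0.3 s) are illustrative values.
    peaks, _ = find_peaks(mag, height=0.8, distance=int(0.3 * fs))
    return len(peaks)
```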
NASA Astrophysics Data System (ADS)
Zhang, Luozhi; Zhou, Yuanyuan; Huo, Dongming; Li, Jinxi; Zhou, Xin
2018-09-01
A method is presented for multiple-image encryption using the combination of orthogonal encoding and compressive sensing based on double random phase encoding. As an original idea in optical encryption, it is demonstrated theoretically and carried out by using orthogonal-basis matrices to build a modified measurement array, which is projected onto the images. In this method, all the images can be compressed in parallel into a stochastic signal and diffused into stationary white noise. Meanwhile, each single image can be separately re-established by adopting a proper decryption key combination through block reconstruction rather than entire-image reconstruction, so its data cost and decryption time are greatly decreased, which may be promising both in multi-user multiplexing and in huge-image encryption/decryption. Besides, the security of this method is characterized by the bit length of the key, and the parallelism is investigated as well. Simulations and discussions are also made on the decryption quality as well as the correlation coefficient for a series of sampling rates, occlusion attacks, keys with various error rates, etc.
Data preprocessing methods of FT-NIR spectral data for the classification cooking oil
NASA Astrophysics Data System (ADS)
Ruah, Mas Ezatul Nadia Mohd; Rasaruddin, Nor Fazila; Fong, Sim Siong; Jaafar, Mohd Zuli
2014-12-01
This recent work describes data pre-processing methods for FT-NIR spectroscopy datasets of cooking oil and its quality parameters with chemometric methods. Pre-processing of near-infrared (NIR) spectral data has become an integral part of chemometrics modelling. Hence, this work is dedicated to investigating the utility and effectiveness of pre-processing algorithms, namely row scaling, column scaling and the single scaling process with Standard Normal Variate (SNV). The combinations of these scaling methods have an impact on exploratory analysis and classification via the Principal Component Analysis (PCA) plot. The samples were divided into palm oil and non-palm cooking oil. The classification model was built using FT-NIR cooking oil spectra datasets in absorbance mode over the range 4000-14000 cm-1. The Savitzky-Golay derivative was applied before developing the classification model. Then, the data were separated into a training set and a test set by using the Duplex method. The number of samples in each class was kept equal to 2/3 of the class with the minimum number of samples. The t-statistic was then employed as a variable selection method in order to select which variables are significant for the classification models. The data pre-processing was evaluated by looking at the modified silhouette width (mSW), the PCA plots and the percentage correctly classified (%CC). The results show that different data processing strategies lead to substantial differences in model performance. The effects of the several pre-processing methods, i.e. row scaling, column standardisation and single scaling with SNV, are indicated by mSW and %CC. With a two-PC model, all five classifiers gave high %CC except quadratic distance analysis.
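Two of the named operations are one-liners in practice. The sketch below implements SNV (row-wise centring and scaling) and column standardisation for a spectra matrix; it is a generic rendering of these standard transforms, not the authors' exact code.

```python
import numpy as np

def snv(spectra):
    """Standard Normal Variate: centre and scale each spectrum (row)."""
    mu = spectra.mean(axis=1, keepdims=True)
    sd = spectra.std(axis=1, keepdims=True)
    return (spectra - mu) / sd

def column_standardise(spectra):
    """Column scaling: centre and scale each wavelength (column)."""
    return (spectra - spectra.mean(axis=0)) / spectra.std(axis=0)
```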
Compact Circuit Preprocesses Accelerometer Output
NASA Technical Reports Server (NTRS)
Bozeman, Richard J., Jr.
1993-01-01
Compact electronic circuit transfers dc power to, and preprocesses ac output of, accelerometer and associated preamplifier. Incorporated into accelerometer case during initial fabrication or retrofit onto commercial accelerometer. Made of commercial integrated circuits and other conventional components; made smaller by use of micrologic and surface-mount technology.
The Foreign-Language Teacher and Cognitive Psychology or Where Do We Go from Here?
ERIC Educational Resources Information Center
Rivers, Wilga M.
Research into the psychology of perception can uncover important discoveries for more efficient learning. There must be increased understanding of the processing of input and the pre-processing of output for improved language instruction. Educators must at the present time be extremely wary of basing what they do in the foreign-language classroom…
A new approach to telemetry data processing. Ph.D. Thesis - Maryland Univ.
NASA Technical Reports Server (NTRS)
Broglio, C. J.
1973-01-01
An approach for a preprocessing system for telemetry data processing was developed. The philosophy of the approach is the development of a preprocessing system to interface with the main processor and relieve it of the burden of stripping information from a telemetry data stream. To accomplish this task, a telemetry preprocessing language was developed. Also, a hardware device for implementing the operation of this language was designed using a cellular logic module concept. In the development of the hardware device and the cellular logic module, a distributed form of control was implemented. This is accomplished by a technique of one-to-one intermodule communications and a set of privileged communication operations. By transferring this control state from module to module, the control function is dispersed through the system. A compiler for translating the preprocessing language statements into an operations table for the hardware device was also developed. Finally, to complete the system design and verify it, a simulator for the cellular logic module was written using the APL/360 system.
Biometric and Emotion Identification: An ECG Compression Based Method.
Brás, Susana; Ferreira, Jacqueline H T; Soares, Sandra C; Pinho, Armando J
2018-01-01
We present an innovative and robust solution to both biometric and emotion identification using the electrocardiogram (ECG). The ECG represents the electrical signal that comes from the contraction of the heart muscles, indirectly representing the flow of blood inside the heart, and it is known to convey a key that allows biometric identification. Moreover, due to its relationship with the nervous system, it also varies as a function of the emotional state. The use of information-theoretic data models, associated with data compression algorithms, allowed us to effectively compare ECG records and infer the person's identity, as well as the emotional state at the time of data collection. The proposed method does not require ECG wave delineation or alignment, which reduces preprocessing error. The method is divided into three steps: (1) conversion of the real-valued ECG record into a symbolic time-series, using a quantization process; (2) conditional compression of the symbolic representation of the ECG, using the symbolic ECG records stored in the database as reference; (3) identification of the ECG record class, using a 1-NN (nearest neighbor) classifier. We obtained over 98% accuracy in biometric identification, whereas in emotion recognition we attained over 90%. The method therefore adequately identifies the person and his/her emotion. Also, the proposed method is flexible and may be adapted to different problems by altering the templates used for training the model.
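The three steps map naturally onto a compression-based dissimilarity. The sketch below quantizes a record into symbols and scores it against a reference with an NCD-style measure built on zlib, standing in for the authors' conditional compressor; the quantization depth and the use of zlib are assumptions for illustration.

```python
import zlib
import numpy as np

def quantize(ecg, levels=64):
    """Step 1: map a real-valued ECG record to a byte string of symbols."""
    lo, hi = ecg.min(), ecg.max()
    idx = np.clip(((ecg - lo) / (hi - lo) * levels).astype(int),
                  0, levels - 1)
    return bytes(idx.tolist())

def compression_distance(x, y):
    """Step 2 (stand-in): how much does knowing y help to compress x?"""
    cx = len(zlib.compress(x))
    cy = len(zlib.compress(y))
    cxy = len(zlib.compress(y + x))
    return (cxy - min(cx, cy)) / max(cx, cy)

# Step 3: 1-NN classification assigns the query record the class of the
# reference record with the smallest compression distance.
```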
CMOS image sensor with contour enhancement
NASA Astrophysics Data System (ADS)
Meng, Liya; Lai, Xiaofeng; Chen, Kun; Yuan, Xianghui
2010-10-01
Imitating the signal acquisition and processing of the vertebrate retina, a CMOS image sensor with a bionic pre-processing circuit is designed. Integrating the signal-processing circuit on-chip reduces the bandwidth and precision requirements of the subsequent interface circuit and simplifies the design of the computer-vision system. This signal pre-processing circuit consists of an adaptive photoreceptor, a spatial filtering resistive network and an Op-Amp calculation circuit. The adaptive photoreceptor unit, with a dynamic range of approximately 100 dB, adapts well to transient changes in light intensity rather than to the intensity level itself. The spatial low-pass filtering resistive network, used to mimic the function of horizontal cells, is composed of the horizontal resistor (HRES) circuit and an OTA (Operational Transconductance Amplifier) circuit. The HRES circuit, imitating the dendrite of the neuron cell, comprises two series MOS transistors operated in the weak inversion region. Appending two diode-connected n-channel transistors to a simple transconductance amplifier forms the OTA Op-Amp circuit, which provides a stable bias voltage for the gates of the MOS transistors in the HRES circuit, while serving as an OTA voltage follower to provide input voltage for the network nodes. The Op-Amp calculation circuit, with a simple two-stage Op-Amp, achieves the image contour enhancement. By adjusting the bias voltage of the resistive network, the smoothing effect can be tuned to change the degree of contour enhancement. Simulations of the cell circuit and of a 16×16 2D circuit array are implemented using the CSMC 0.5μm DPTM CMOS process.
Radial artery pulse waveform analysis based on curve fitting using discrete Fourier series.
Jiang, Zhixing; Zhang, David; Lu, Guangming
2018-04-19
Radial artery pulse diagnosis plays an important role in traditional Chinese medicine (TCM). Being non-invasive and convenient, pulse diagnosis has great significance in disease analysis in modern medicine. Practitioners sense the pulse waveforms at the patient's wrist to make diagnoses based on non-objective personal experience. With research on pulse acquisition platforms and computerized analysis methods, the objective study of pulse diagnosis can help TCM keep up with the development of modern medicine. In this paper, we propose a new method to extract features from pulse waveforms based on the discrete Fourier series (DFS). It regards the waveform as a signal that consists of a series of sub-components represented by sine and cosine (SC) signals with different frequencies and amplitudes. After the pulse signals are collected and preprocessed, we fit the average waveform for each sample with a discrete Fourier series by least squares. The feature vector comprises the coefficients of the discrete Fourier series function. Compared with a fitting method using a Gaussian mixture function, the fitting errors of the proposed method are smaller, which indicates that our method can represent the original signal better. The classification performance of the proposed feature is superior to other features extracted from the waveform, such as the auto-regression model and the Gaussian mixture model. The coefficients of the optimized DFS function used to fit the arterial pressure waveforms obtain better performance in modeling the waveforms and hold more potential information for distinguishing different psychological states. Copyright © 2018 Elsevier B.V. All rights reserved.
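The least-squares fit of a truncated Fourier series is a linear problem, so it can be sketched in a few lines of NumPy. The number of harmonics below is an illustrative assumption, not the paper's setting.

```python
import numpy as np

def fit_fourier_series(t, pulse, n_harmonics=8, period=1.0):
    """Least-squares fit of a truncated Fourier series to one averaged
    pulse waveform; the coefficients form the feature vector."""
    cols = [np.ones_like(t)]
    for k in range(1, n_harmonics + 1):
        w = 2 * np.pi * k / period
        cols += [np.cos(w * t), np.sin(w * t)]
    A = np.column_stack(cols)                       # design matrix
    coef, *_ = np.linalg.lstsq(A, pulse, rcond=None)
    return coef, A @ coef            # features and the fitted curve
```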
Demonstration of angular anisotropy in the output of Thematic Mapper
NASA Technical Reports Server (NTRS)
Duggin, M. J. (Principal Investigator); Lindsay, J.; Piwinski, D. J.; Schoch, L. B.
1984-01-01
There is a dependence of TM output (which is proportional to scene radiance in a manner discussed here) upon season, cover type and view angle. The existence of a significant systematic variation across uniform scenes in p-type (radiometrically and geometrically pre-processed) data is demonstrated. Present pre-processing does not remove these effects, and the problem must be addressed because the effects are large. While this is in no way attributable to any shortcoming of the Thematic Mapper itself, the effect is sufficiently important to warrant further study, with a view to developing suitable pre-processing correction algorithms.
NASA Astrophysics Data System (ADS)
Anderson, R. B.; Finch, N.; Clegg, S. M.; Graff, T.; Morris, R. V.; Laura, J.
2018-04-01
The PySAT point spectra tool provides a flexible graphical interface, enabling scientists to apply a wide variety of preprocessing and machine learning methods to point spectral data, with an emphasis on multivariate regression.
2D DOST based local phase pattern for face recognition
NASA Astrophysics Data System (ADS)
Moniruzzaman, Md.; Alam, Mohammad S.
2017-05-01
A new two-dimensional (2-D) Discrete Orthogonal Stockwell Transform (DOST) based Local Phase Pattern (LPP) technique is proposed for efficient face recognition. The proposed technique uses the 2-D DOST as a preliminary preprocessing step and the local phase pattern to form a robust feature signature that can effectively accommodate various 3-D facial distortions and illumination variations. The S-transform, an extension of the continuous wavelet transform (CWT), is known for its local spectral phase properties in time-frequency representation (TFR). It provides a frequency-dependent resolution of the time-frequency space and absolutely referenced local phase information while maintaining a direct relationship with the Fourier spectrum, which is unique among TFRs. Using the 2-D S-transform as preprocessing and building the local phase pattern from the extracted phase information yields a fast and efficient face recognition technique. The proposed technique shows better correlation discrimination than alternative pattern recognition techniques such as wavelet- or Gabor-based face recognition. Its performance has been tested on the Yale and extended Yale facial databases under different conditions, including illumination variation and 3-D changes in facial expression. Test results show that the proposed technique yields better performance than alternative TFR-based face recognition techniques.
SeqTrim: a high-throughput pipeline for pre-processing any type of sequence read
2010-01-01
Background: High-throughput automated sequencing has enabled an exponential growth rate of sequencing data. This requires increasing sequence quality and reliability in order to avoid database contamination with artefactual sequences. The arrival of pyrosequencing aggravates this problem and necessitates customisable pre-processing algorithms. Results: SeqTrim has been implemented both as a Web application and as a standalone command-line application. Already-published and newly-designed algorithms have been included to identify sequence inserts, to remove low quality, vector, adaptor, low complexity and contaminant sequences, and to detect chimeric reads. The availability of several input and output formats allows its inclusion in sequence processing workflows. Due to its specific algorithms, SeqTrim outperforms other pre-processors implemented as Web services or standalone applications. It performs equally well with sequences from EST libraries, SSH libraries, genomic DNA libraries and pyrosequencing reads, and does not lead to over-trimming. Conclusions: SeqTrim is an efficient pipeline designed for pre-processing of any type of sequence read, including next-generation sequencing. It is easily configurable and provides a friendly interface that allows users to know what happened with sequences at every pre-processing stage, and to verify pre-processing of an individual sequence if desired. The recommended pipeline reveals more information about each sequence than previously described pre-processors and can discard more sequencing or experimental artefacts. PMID:20089148
Spatio-Temporal Metabolite Profiling of the Barley Germination Process by MALDI MS Imaging
Gorzolka, Karin; Kölling, Jan; Nattkemper, Tim W.; Niehaus, Karsten
2016-01-01
MALDI mass spectrometry imaging was performed to localize metabolites during the first seven days of barley germination. Up to 100 mass signals were detected, of which 85 were identified as 48 different metabolites with highly tissue-specific localizations. Oligosaccharides were observed in the endosperm and in parts of the developed embryo. Lipids in the endosperm co-localized according to their fatty acid compositions, with changes in the distributions of diacyl phosphatidylcholines during germination. Twenty-six potentially antifungal hordatines were detected in the embryo, with tissue-specific localizations of their glycosylated, hydroxylated, and O-methylated derivatives. In order to reveal spatio-temporal patterns in local metabolite compositions, multiple MSI data sets from a time series were analyzed in one batch. This requires a new preprocessing strategy to achieve comparability between data sets, as well as a new strategy for unsupervised clustering. The resulting spatial segmentation for each time-point sample is visualized in an interactive cluster map that enables simultaneous interactive exploration of all time points. Using this new analysis approach and visualization tool, germination-dependent developments of metabolite patterns with single-MS-position accuracy were discovered. This is the first study to present metabolite profiling of a cereal's germination process over time by MALDI MSI, with the identification of a large number of peaks of agronomically and industrially important compounds such as oligosaccharides, lipids and antifungal agents. Their detailed localization, as well as the MS cluster analyses for on-tissue metabolite profile mapping, revealed important information for understanding the germination process, which is of high scientific interest. PMID:26938880
ERIC Educational Resources Information Center
Marill, Thomas; And Others
The aim of the CYCLOPS Project research is the development of techniques for allowing computers to perform visual scene analysis, pre-processing of visual imagery, and perceptual learning. Work on scene analysis and learning has previously been described. The present report deals with research on pre-processing and with further work on scene…
USDA-ARS?s Scientific Manuscript database
In multivariate regression analysis of spectroscopy data, spectral preprocessing is often performed to reduce unwanted background information (offsets, sloped baselines) or accentuate absorption features in intrinsically overlapping bands. These procedures, also known as pretreatments, are commonly ...
New decision support tool for acute lymphoblastic leukemia classification
NASA Astrophysics Data System (ADS)
Madhukar, Monica; Agaian, Sos; Chronopoulos, Anthony T.
2012-03-01
In this paper, we build a new decision support tool to improve treatment intensity choice in childhood ALL. The developed system includes different methods to accurately measure cell properties in microscope blood-film images. The blood images are subjected to a series of pre-processing steps, which include color correlation and contrast enhancement. By performing K-means clustering on the resultant images, the nuclei of the cells under consideration are obtained. Shape features and texture features are then extracted for classification. The system is further tested on the classification of spectra measured from the cell nuclei in blood samples in order to distinguish normal cells from those affected by acute lymphoblastic leukemia. The results show that the proposed system robustly segments and classifies acute lymphoblastic leukemia based on complete microscopic blood images.
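The K-means segmentation step can be sketched with scikit-learn; the three-cluster setting and the darkest-cluster rule for nuclei are plausible assumptions for stained smears, not the paper's exact configuration:

```python
import numpy as np
from sklearn.cluster import KMeans

def segment_nuclei(rgb_image, n_clusters=3):
    """Cluster pixel colours with K-means and return a nucleus mask; in
    stained blood smears the darkest cluster typically corresponds to nuclei."""
    h, w, _ = rgb_image.shape
    pixels = rgb_image.reshape(-1, 3).astype(float)
    labels = KMeans(n_clusters=n_clusters, n_init=10,
                    random_state=0).fit_predict(pixels)
    # Pick the cluster with the lowest mean intensity as the nucleus cluster.
    means = [pixels[labels == k].mean() for k in range(n_clusters)]
    return labels.reshape(h, w) == int(np.argmin(means))

mask = segment_nuclei(np.random.rand(32, 32, 3))  # stand-in blood-film patch
```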
NASA Astrophysics Data System (ADS)
1989-01-01
A "NASA Tech Briefs" article describing an inspection tool and technique known as Optically Stimulated Electron Emission (OSEE) led to the formation of Photo Acoustic Technology, Inc. (PAT). PAT produces sensors and scanning systems which assure surface cleanliness prior to bonding, coating, painting, etc. The company's OP1000 series realtime pre-processing detection capability assures 100 percent surface quality testing. The technique involves brief exposure of the inspection surface to ultraviolet radiation. The energy interacts with the surface layer, causing free electrons to be emitted from the surface to be picked up by the detector. When contamination is present, it interferes with the electron flow in proportion to the thickness of the contaminant layer enabling measurement by system signal output. OP1000 systems operate in conventional atmospheres on all types of material and detect both organic and inorganic contamination.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Upadhyaya, Belle; Hines, J. Wesley; Damiano, Brian
The research and development under this project was focused on the following three major objectives:
Objective 1: Identification of critical in-vessel SMR components for remote monitoring and development of their low-order dynamic models, along with a simulation model of an integral pressurized water reactor (iPWR).
Objective 2: Development of an experimental flow control loop with motor-driven valves and pumps, incorporating data acquisition and on-line monitoring interface.
Objective 3: Development of stationary and transient signal processing methods for electrical signatures, machinery vibration, and for characterizing process variables for equipment monitoring. This objective includes the development of a data analysis toolbox.
The following is a summary of the technical accomplishments under this project:
- A detailed literature review of various SMR types and electrical signature analysis of motor-driven systems was completed. A bibliography of literature is provided at the end of this report. Assistance was provided by ORNL in identifying some key references.
- A review of literature on pump-motor modeling and digital signal processing methods was performed.
- An existing flow control loop was upgraded with new instrumentation, data acquisition hardware and software. The upgrading of the experimental loop included the installation of a new submersible pump driven by a three-phase induction motor. All the sensors were calibrated before full-scale experimental runs were performed.
- MATLAB-Simulink model of a three-phase induction motor and pump system was completed. The model was used to simulate normal operation and fault conditions in the motor-pump system, and to identify changes in the electrical signatures.
- A simulation model of an integral PWR (iPWR) was updated and the MATLAB-Simulink model was validated for known transients. The pump-motor model was interfaced with the iPWR model for testing the impact of primary flow perturbations (upsets) on plant parameters and the pump electrical signatures. Additionally, the reactor simulation is being used to generate normal operation data and data with instrumentation faults and process anomalies. A frequency controller was interfaced with the motor power supply in order to vary the electrical supply frequency. The experimental flow control loop was used to generate operational data under varying motor performance characteristics. Coolant leakage events were simulated by varying the bypass loop flow rate. The accuracy of motor power calculation was improved by incorporating the power factor, computed from motor current and voltage in each phase of the induction motor.
- A variety of experimental runs were made for steady-state and transient pump operating conditions. Process, vibration, and electrical signatures were measured using a submersible pump with variable supply frequency. High correlation was seen between motor current and pump discharge pressure signal; similar high correlation was exhibited between pump motor power and flow rate. Wide-band analysis indicated high coherence (in the frequency domain) between motor current and vibration signals.
- Wide-band operational data from a PWR were acquired from AMS Corporation and used to develop time-series models, and to estimate signal spectrum and sensor time constant. All the data were from different pressure transmitters in the system, including primary and secondary loops. These signals were pre-processed using the wavelet transform for filtering both low-frequency and high-frequency bands. This technique of signal pre-processing provides minimum distortion of the data, and results in a more optimal estimation of time constants of plant sensors using time-series modeling techniques.
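A rough Python analogue of the wavelet band filtering described in the last item, assuming the PyWavelets package; the wavelet family and decomposition level are illustrative choices:

```python
import numpy as np
import pywt  # PyWavelets

def wavelet_bandpass(signal, wavelet="db4", level=6):
    """Remove both the low-frequency trend (approximation coefficients) and
    the highest-frequency detail band, then reconstruct, leaving the mid-band
    content used for time-series modelling of the sensor signal."""
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    coeffs[0] = np.zeros_like(coeffs[0])    # drop slow drift
    coeffs[-1] = np.zeros_like(coeffs[-1])  # drop finest-scale noise
    return pywt.waverec(coeffs, wavelet)[: len(signal)]

x = np.cumsum(np.random.randn(4096))  # drifting, noisy test signal
filtered = wavelet_bandpass(x)
```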
NASA Astrophysics Data System (ADS)
Notti, Davide; Calò, Fabiana; Cigna, Francesca; Manunta, Michele; Herrera, Gerardo; Berti, Matteo; Meisina, Claudia; Tapete, Deodato; Zucca, Francesco
2015-11-01
Recent advances in multi-temporal Differential Synthetic Aperture Radar (SAR) Interferometry (DInSAR) have greatly improved our capability to monitor geological processes. Ground motion studies using DInSAR require both the availability of good quality input data and rigorous approaches to exploit the retrieved Time Series (TS) at their full potential. In this work we present a methodology for DInSAR TS analysis, with particular focus on landslides and subsidence phenomena. The proposed methodology consists of three main steps: (1) pre-processing, i.e., assessment of a SAR Dataset Quality Index (SDQI); (2) post-processing, i.e., application of empirical/stochastic methods to improve the TS quality; and (3) trend analysis, i.e., comparative implementation of methodologies for automatic TS analysis. Tests were carried out on TS datasets retrieved from processing of SAR imagery acquired by different radar sensors (i.e., ERS-1/2 SAR, RADARSAT-1, ENVISAT ASAR, ALOS PALSAR, TerraSAR-X, COSMO-SkyMed) using advanced DInSAR techniques (i.e., SqueeSAR™, PSInSAR™, SPN and SBAS). The obtained values of SDQI are discussed against the technical parameters of each data stack (e.g., radar band, number of SAR scenes, temporal coverage, revisiting time), the retrieved coverage of the DInSAR results, and the constraints related to the characterization of the investigated geological processes. Empirical and stochastic approaches were used to demonstrate how the quality of the TS can be improved after the SAR processing, and examples are discussed of mitigating phase unwrapping errors and removing regional trends, noise and anomalies. Performance assessment of recently developed methods of trend analysis (i.e., PS-Time, Deviation Index and velocity TS) was conducted on two selected study areas in Northern Italy affected by land subsidence and landslides. Results show that the automatic detection of motion trends enhances the interpretation of DInSAR data, since it provides an objective picture of the deformation behaviour recorded through TS and therefore contributes to the understanding of the on-going geological processes.
NASA Astrophysics Data System (ADS)
Handhika, T.; Bustamam, A.; Ernastuti, Kerami, D.
2017-07-01
Multi-thread programming using OpenMP on a shared-memory architecture with hyper-threading technology allows a resource to be accessed by multiple processors simultaneously, with each processor executing more than one thread in a given period of time. However, the achievable speedup depends on the processor's ability to execute only a limited number of threads, which is problematic for sequential algorithms containing nested loops in which the number of outer-loop iterations exceeds the maximum number of threads the processor can execute. The thread distribution technique found previously can only be applied by high-level programmers. This paper develops a parallelization procedure for low-level programmers dealing with 2-level nested loop problems in which the maximum number of threads that a processor can execute is smaller than the number of outer-loop iterations. Data preprocessing based on the numbers of outer- and inner-loop iterations, the computational time required to execute each iteration, and the maximum number of threads that a processor can execute is used as a strategy to determine which parallel region will produce the optimal speedup.
NASA Astrophysics Data System (ADS)
Stromer, D.; Christlein, V.; Schön, T.; Holub, W.; Maier, A.
2017-09-01
It is often the case that a document can no longer be opened, page-turned or touched due to damage caused by aging, moisture or fire. To counter this, special imaging systems can be used. Our earlier work revealed that a common 3-D X-ray micro-CT scanner is well suited for imaging and reconstructing historical documents written with iron gall ink, an ink containing metallic particles. We acquired a volume of a self-made book with a single 3-D scan, without opening or page-turning. However, when investigating the reconstructed volume, we faced the problem of properly and automatically extracting single pages from the volume in acceptable time without losing information from the writings. In this work, we evaluate appropriate pre-processing methods with respect to computation time and accuracy, both of which are decisive for proper extraction of book pages from the reconstructed X-ray volume and the subsequent ink identification. The different methods were tested on an extreme case with low resolution, noisy input data and wavy pages. Finally, we present results of the page extraction after applying the evaluated methods.
Sunspot Pattern Classification using PCA and Neural Networks (Poster)
NASA Technical Reports Server (NTRS)
Rajkumar, T.; Thompson, D. E.; Slater, G. L.
2005-01-01
The sunspot classification scheme presented in this paper is considered as a 2-D classification problem on archived datasets and is not a real-time system. As a first step, it mirrors the Zuerich/McIntosh historical classification system and reproduces classification of sunspot patterns based on preprocessing and neural-net training datasets. Ultimately, the project intends to move beyond these more rudimentary schemes to develop spatial-temporal-spectral classes derived by correlating spatial and temporal variations in various wavelengths with the brightness fluctuation spectrum of the sun in those wavelengths. Once the approach is generalized, the focus will naturally move from a 2-D to an n-D classification, where "n" includes time and frequency. Here, the 2-D perspective refers both to the actual SOHO Michelson Doppler Imager (MDI) images that are processed and to the fact that a 2-D matrix is created from each image during preprocessing. The 2-D matrix is the result of running Principal Component Analysis (PCA) over the selected dataset images, and the resulting matrices and their eigenvalues are the objects that are stored in a database, classified, and compared. These matrices are indexed according to the standard McIntosh classification scheme.
Fan, Wei; Tsui, Kwok-Leung; Lin, Jianhui
2018-01-01
Railway axle bearings are among the most important components of rail vehicles, and their failures can result in unexpected accidents and economic losses. To realize a condition monitoring and fault diagnosis scheme for railway axle bearings, three dimensionless steadiness indexes, in a time domain, a frequency domain, and a shape domain, are proposed to measure the steady states of bearing vibration signals. Firstly, vibration data collected from designed experiments are pre-processed using ensemble empirical mode decomposition (EEMD). Then, the coefficient of variation is introduced to construct two steady-state indexes from the pre-processed vibration data, one in the time domain and one in the frequency domain, and a shape function is used to construct a steady-state index in the shape domain. Finally, guideline thresholds are proposed to distinguish normal and abnormal bearing health states, and the boundaries of all steadiness indexes are experimentally established to identify axle bearings with outer race defects, a pin roller defect, a cage defect, and coupling defects. Experimental results showed that the proposed condition monitoring and fault diagnosis scheme is effective in identifying different bearing health conditions. PMID:29495446
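The coefficient-of-variation idea behind the steadiness indexes can be sketched in a few lines of numpy; this toy index over per-segment RMS values is a simplification of the paper's three domain-specific indexes and omits the EEMD stage:

```python
import numpy as np

def steadiness_index(signal, n_segments=20):
    """Coefficient of variation of per-segment RMS values: a steady (healthy)
    vibration signal yields a small index; impacts from a defect inflate it."""
    segments = np.array_split(np.asarray(signal, dtype=float), n_segments)
    rms = np.array([np.sqrt(np.mean(s ** 2)) for s in segments])
    return rms.std() / rms.mean()

healthy = np.random.randn(10000)
faulty = healthy.copy()
faulty[::500] += 15.0  # periodic impacts from a simulated defect
print(steadiness_index(healthy), steadiness_index(faulty))
```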
NASA Astrophysics Data System (ADS)
Bhattacharjee, T.; Kumar, P.; Fillipe, L.
2018-02-01
Vibrational spectroscopy, especially FTIR and Raman, has shown enormous potential in disease diagnosis, particularly in cancers, and its potential for detecting varied pathological conditions is regularly reported. However, to prove its applicability in clinics, large multi-center, multi-national studies need to be undertaken, and these will produce enormous amounts of data. A parallel effort to develop analytical methods, including user-friendly software that can quickly pre-process data and subject them to the required multivariate analysis, is warranted in order to obtain results in real time. This study reports a MATLAB-based script that can automatically import data, pre-process spectra (interpolation, derivatives, normalization) and then carry out Principal Component Analysis (PCA) followed by Linear Discriminant Analysis (LDA) of the first 10 PCs, all with a single click. The software has been verified on data obtained from cell lines, animal models, and in vivo patient datasets, and gives results comparable to the Minitab 16 software. It can import a variety of file extensions (.asc, .txt, .xls, and many others), and options to ignore noisy data, plot all possible graphs with PCA factors 1 to 5, and save loading factors, confusion matrices and other parameters are also present. The software can provide results for a dataset of 300 spectra within 0.01 s. We believe that the software will be vital not only in clinical trials using vibrational spectroscopic data, but also for obtaining rapid results when these tools are translated into clinics.
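The script itself is MATLAB-based; a scikit-learn analogue of its PCA-then-LDA chain (synthetic spectra and illustrative sizes, not the original code) looks like this:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Spectra as rows of X, class labels in y (synthetic stand-ins here).
X = np.random.rand(60, 900)   # 60 spectra, 900 wavenumber channels
y = np.repeat([0, 1, 2], 20)  # three hypothetical pathology classes

model = make_pipeline(StandardScaler(),       # normalization stage
                      PCA(n_components=10),   # first 10 PCs, as in the script
                      LinearDiscriminantAnalysis())
model.fit(X, y)
print(model.score(X, y))  # training accuracy; use cross-validation in practice
```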
Safe and sensible preprocessing and baseline correction of pupil-size data.
Mathôt, Sebastiaan; Fabius, Jasper; Van Heusden, Elle; Van der Stigchel, Stefan
2018-02-01
Measurement of pupil size (pupillometry) has recently gained renewed interest from psychologists, but there is little agreement on how pupil-size data is best analyzed. Here we focus on one aspect of pupillometric analyses: baseline correction, i.e., analyzing changes in pupil size relative to a baseline period. Baseline correction is useful in experiments that investigate the effect of some experimental manipulation on pupil size. In such experiments, baseline correction improves statistical power by taking into account random fluctuations in pupil size over time. However, we show that baseline correction can also distort data if unrealistically small pupil sizes are recorded during the baseline period, which can easily occur due to eye blinks, data loss, or other distortions. Divisive baseline correction (corrected pupil size = pupil size/baseline) is affected more strongly by such distortions than subtractive baseline correction (corrected pupil size = pupil size - baseline). We discuss the role of baseline correction as a part of preprocessing of pupillometric data, and make five recommendations: (1) before baseline correction, perform data preprocessing to mark missing and invalid data, but assume that some distortions will remain in the data; (2) use subtractive baseline correction; (3) visually compare your corrected and uncorrected data; (4) be wary of pupil-size effects that emerge faster than the latency of the pupillary response allows (within ±220 ms after the manipulation that induces the effect); and (5) remove trials on which baseline pupil size is unrealistically small (indicative of blinks and other distortions).
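A minimal numpy sketch of the two correction variants; using the median of the baseline window is our choice for robustness to blinks, not necessarily the authors':

```python
import numpy as np

def baseline_correct(pupil_trace, baseline_window, subtractive=True):
    """Correct a pupil-size trace against its pre-stimulus baseline.
    Subtractive correction (recommended above) is less sensitive to
    spuriously small baselines than divisive correction."""
    baseline = np.median(pupil_trace[baseline_window])  # median resists blinks
    if subtractive:
        return pupil_trace - baseline   # corrected = size - baseline
    return pupil_trace / baseline       # corrected = size / baseline

trace = 4.0 + 0.3 * np.random.rand(1000)  # arbitrary-unit pupil samples
corrected = baseline_correct(trace, slice(0, 100), subtractive=True)
```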
Tsai, Tsung-Heng; Tadesse, Mahlet G.; Di Poto, Cristina; Pannell, Lewis K.; Mechref, Yehia; Wang, Yue; Ressom, Habtom W.
2013-01-01
Motivation: Liquid chromatography-mass spectrometry (LC-MS) has been widely used for profiling expression levels of biomolecules in various '-omic' studies, including proteomics, metabolomics and glycomics. Appropriate LC-MS data preprocessing steps are needed to detect true differences between biological groups. Retention time (RT) alignment, which is required to ensure that ion intensity measurements among multiple LC-MS runs are comparable, is one of the most important yet challenging preprocessing steps. Current alignment approaches estimate RT variability using either single chromatograms or detected peaks, but do not simultaneously take into account the complementary information embedded in the entire LC-MS data. Results: We propose a Bayesian alignment model for LC-MS data analysis. The alignment model provides estimates of the RT variability along with uncertainty measures. The model enables integration of multiple sources of information, including internal standards and clustered chromatograms, in a mathematically rigorous framework. We apply the model to LC-MS metabolomic, proteomic and glycomic data. The performance of the model is evaluated on ground-truth data by measuring the coefficient of variation, the RT difference across runs and the peak-matching performance. We demonstrate that the Bayesian alignment model significantly improves RT alignment performance through appropriate integration of relevant information. Availability and implementation: MATLAB code, raw and preprocessed LC-MS data are available at http://omics.georgetown.edu/alignLCMS.html Contact: hwr@georgetown.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:24013927
Research on pre-processing of QR Code
NASA Astrophysics Data System (ADS)
Sun, Haixing; Xia, Haojie; Dong, Ning
2013-10-01
QR codes encode many kinds of information and offer several advantages: large storage capacity, high reliability, ultra-high-speed reading from any direction, small printing size and highly efficient representation of Chinese characters. In order to obtain a clearer binarized image from a complex background and improve the recognition rate of QR codes, this paper investigates pre-processing methods for QR code (Quick Response Code) images and presents algorithms and results of image pre-processing for QR code recognition. The conventional method is improved by modifying Sauvola's adaptive text-binarization method. Additionally, a QR code extraction step that adapts to different image sizes and a flexible image correction approach are introduced, improving the efficiency and accuracy of QR code image processing.
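Sauvola's adaptive thresholding, the method the paper starts from, is available in scikit-image; the window size and k below are the usual defaults, not the paper's tuned values:

```python
import numpy as np
from skimage.filters import threshold_sauvola

def binarize_qr(gray, window_size=25, k=0.2):
    """Adaptive (Sauvola) binarization: the local threshold follows the local
    mean and standard deviation, so QR modules survive uneven illumination
    and complex backgrounds better than with a single global threshold."""
    thresh = threshold_sauvola(gray, window_size=window_size, k=k)
    return gray > thresh  # True = light background, False = dark modules

img = np.random.rand(128, 128)  # stand-in grayscale QR image
binary = binarize_qr(img)
```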
NASA Astrophysics Data System (ADS)
Borodinov, A. A.; Myasnikov, V. V.
2018-04-01
The present work is devoted to comparing the accuracy of the known qualification algorithms in the task of recognizing local objects on radar images for various image preprocessing methods. Preprocessing involves speckle noise filtering and normalization of the object orientation in the image by the method of image moments and by a method based on the Hough transform. In comparison, the following classification algorithms are used: Decision tree; Support vector machine, AdaBoost, Random forest. The principal component analysis is used to reduce the dimension. The research is carried out on the objects from the base of radar images MSTAR. The paper presents the results of the conducted studies.
NASA Astrophysics Data System (ADS)
Di Mauro, Biagio; Fava, Francesco; Busetto, Lorenzo; Crosta, Giovanni Franco; Colombo, Roberto
2013-04-01
In this study a method based on the analysis of MODerate-resolution Imaging Spectroradiometer (MODIS) time series is proposed to estimate the post-fire resilience of mountain vegetation (broadleaf forest and prairies) in the Italian Alps. Resilience is defined here as the ability of a dynamical system to counteract disturbances; it can be quantified by the amount of time the disturbed system takes to resume, in statistical terms, an ecological functionality comparable with its undisturbed behavior. Satellite images of the Normalized Difference Vegetation Index (NDVI) and of the Enhanced Vegetation Index (EVI), with a spatial resolution of 250 m and a temporal resolution of 16 days over the 2000-2012 period, were used. Wildfire-affected areas in the Lombardy region between 2000 and 2010 were analysed, selecting only large fires (affected area > 40 ha). For each burned area, an undisturbed adjacent control site was located. Data pre-processing consisted of smoothing the MODIS time series for noise removal, after which a double logistic function was fitted. Land surface phenology descriptors (proxies for growing-season start, end and length, and for green biomass) were extracted in order to characterize the time evolution of the vegetation. Descriptors from a burned area were compared to those extracted from the respective control site by means of one-way analysis of variance. According to the number of subsequent years that exhibit a statistically meaningful difference between burned and control sites, five classes of resilience were identified and a set of thematic maps was created for each descriptor. The same method was applied to all 84 aggregated events and to events aggregated by main land cover. The EVI proved more sensitive to fire impact than the NDVI. The analysis shows that fire causes both a reduction of biomass and a variation in the phenology of Alpine vegetation, and the results suggest an average ecosystem resilience of 6-7 years. Moreover, broadleaf forest and prairies show different post-fire behavior in terms of land surface phenology descriptors. In addition to the above analysis, another method is proposed, derived from the qualitative theory of dynamical systems. The (time-dependent) spectral index of a burned area over the period of one year is plotted against its counterpart from the control site, yielding yearly plots (or scattergrams) before and after the fire. Each plot is a sequence of points on the plane, the vertices of a generally self-intersecting polygonal chain. Geometrical descriptors were obtained from the yearly chains of each fire, and Principal Component Analysis (PCA) of these descriptors, applied to a set of case studies, provides a system-dynamics interpretation of the natural process.
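The double logistic fit at the core of the pre-processing can be sketched with scipy; the parameterization below is one common form, and the start/end-of-season values are illustrative:

```python
import numpy as np
from scipy.optimize import curve_fit

def double_logistic(t, base, amp, t_sos, r_sos, t_eos, r_eos):
    """Classic double-logistic growing-season model: a rising sigmoid at the
    start of season (sos) and a falling one at the end of season (eos)."""
    return base + amp * (1.0 / (1.0 + np.exp(-r_sos * (t - t_sos)))
                         - 1.0 / (1.0 + np.exp(-r_eos * (t - t_eos))))

doy = np.arange(1, 366, 16, dtype=float)   # 16-day MODIS composites
ndvi = double_logistic(doy, 0.2, 0.5, 120, 0.08, 280, 0.06) \
       + 0.02 * np.random.randn(doy.size)  # synthetic noisy index series
p0 = [0.2, 0.5, 120, 0.1, 280, 0.1]        # rough initial guess
params, _ = curve_fit(double_logistic, doy, ndvi, p0=p0, maxfev=10000)
# Season start/end descriptors can be read from the fitted t_sos and t_eos.
```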
Parafoveal Preprocessing in Reading Revisited: Evidence from a Novel Preview Manipulation
ERIC Educational Resources Information Center
Gagl, Benjamin; Hawelka, Stefan; Richlan, Fabio; Schuster, Sarah; Hutzler, Florian
2014-01-01
The study investigated parafoveal preprocessing by means of the classical invisible boundary paradigm and a novel manipulation of the parafoveal previews (i.e., visual degradation). Eye movements were investigated on 5-letter target words with constraining (i.e., highly informative) initial letters or similarly constraining final letters.…
Chockalingam, Sriram; Aluru, Maneesha; Aluru, Srinivas
2016-09-19
Pre-processing of microarray data is a well-studied problem. Furthermore, all popular platforms come with their own recommended best practices for differential analysis of genes. However, for genome-scale network inference using microarray data collected from large public repositories, these methods filter out a considerable number of genes. This is primarily due to the effects of aggregating a diverse array of experiments with different technical and biological scenarios. Here we introduce a pre-processing pipeline suitable for inferring genome-scale gene networks from large microarray datasets. We show that partitioning of the available microarray datasets according to biological relevance into tissue- and process-specific categories significantly extends the limits of downstream network construction. We demonstrate the effectiveness of our pre-processing pipeline by inferring genome-scale networks for the model plant Arabidopsis thaliana using two different construction methods and a collection of 11,760 Affymetrix ATH1 microarray chips. Our pre-processing pipeline and the datasets used in this paper are made available at http://alurulab.cc.gatech.edu/microarray-pp.
Pre-processing and post-processing in group-cluster mergers
NASA Astrophysics Data System (ADS)
Vijayaraghavan, R.; Ricker, P. M.
2013-11-01
Galaxies in clusters are more likely to be of early type and to have lower star formation rates than galaxies in the field. Recent observations and simulations suggest that cluster galaxies may be `pre-processed' by group or filament environments and that galaxies that fall into a cluster as part of a larger group can stay coherent within the cluster for up to one orbital period (`post-processing'). We investigate these ideas by means of a cosmological N-body simulation and idealized N-body plus hydrodynamics simulations of a group-cluster merger. We find that group environments can contribute significantly to galaxy pre-processing by means of enhanced galaxy-galaxy merger rates, removal of galaxies' hot halo gas by ram pressure stripping and tidal truncation of their galaxies. Tidal distortion of the group during infall does not contribute to pre-processing. Post-processing is also shown to be effective: galaxy-galaxy collisions are enhanced during a group's pericentric passage within a cluster, the merger shock enhances the ram pressure on group and cluster galaxies and an increase in local density during the merger leads to greater galactic tidal truncation.
Cowley, Benjamin U.; Korpela, Jussi
2018-01-01
Existing tools for the preprocessing of EEG data provide a large choice of methods to suitably prepare and analyse a given dataset. Yet it remains a challenge for the average user to integrate methods for batch processing of the increasingly large datasets of modern research, and compare methods to choose an optimal approach across the many possible parameter configurations. Additionally, many tools still require a high degree of manual decision making for, e.g., the classification of artifacts in channels, epochs or segments. This introduces extra subjectivity, is slow, and is not reproducible. Batching and well-designed automation can help to regularize EEG preprocessing, and thus reduce human effort, subjectivity, and consequent error. The Computational Testing for Automated Preprocessing (CTAP) toolbox facilitates: (i) batch processing that is easy for experts and novices alike; (ii) testing and comparison of preprocessing methods. Here we demonstrate the application of CTAP to high-resolution EEG data in three modes of use. First, a linear processing pipeline with mostly default parameters illustrates ease-of-use for naive users. Second, a branching pipeline illustrates CTAP's support for comparison of competing methods. Third, a pipeline with built-in parameter-sweeping illustrates CTAP's capability to support data-driven method parameterization. CTAP extends the existing functions and data structure from the well-known EEGLAB toolbox, based on Matlab, and produces extensive quality control outputs. CTAP is available under MIT open-source licence from https://github.com/bwrc/ctap. PMID:29692705
Sam2bam: High-Performance Framework for NGS Data Preprocessing Tools
Cheng, Yinhe; Tzeng, Tzy-Hwa Kathy
2016-01-01
This paper introduces a high-throughput software tool framework called sam2bam that enables users to significantly speed up pre-processing for next-generation sequencing data. sam2bam is especially efficient on single-node multi-core large-memory systems: it can reduce the runtime of data pre-processing in marking duplicate reads on a single-node system by 156–186x compared with de facto standard tools. sam2bam consists of parallel software components that can fully utilize multiple processors, available memory, high-bandwidth storage, and hardware compression accelerators, if available. It provides file-format conversion between well-known genome file formats, from SAM to BAM, as a basic feature. Additional features such as analyzing, filtering, and converting input data are provided by plug-in tools, e.g., duplicate marking, which can be attached to sam2bam at runtime. We demonstrated that sam2bam could significantly reduce the runtime of next-generation sequencing (NGS) data pre-processing from about two hours to about one minute for a whole-exome data set on a 16-core single-node system using up to 130 GB of memory, and from about 20 hours to about nine minutes for a whole-genome sequencing data set on the same system using up to 711 GB of memory. PMID:27861637
Delwiche, Stephen R; Reeves, James B
2010-01-01
In multivariate regression analysis of spectroscopy data, spectral preprocessing is often performed to reduce unwanted background information (offsets, sloped baselines) or accentuate absorption features in intrinsically overlapping bands. These procedures, also known as pretreatments, are commonly smoothing operations or derivatives. While such operations are often useful in reducing the number of latent variables of the actual decomposition and lowering residual error, they also run the risk of misleading the practitioner into accepting calibration equations that are poorly adapted to samples outside of the calibration. The current study developed a graphical method to examine this effect on partial least squares (PLS) regression calibrations of near-infrared (NIR) reflection spectra of ground wheat meal with two analytes, protein content and sodium dodecyl sulfate sedimentation (SDS) volume (an indicator of the quantity of the gluten proteins that contribute to strong doughs). These two properties were chosen because of their differing abilities to be modeled by NIR spectroscopy: excellent for protein content, fair for SDS sedimentation volume. To further demonstrate the potential pitfalls of preprocessing, an artificial component, a randomly generated value, was included in PLS regression trials. Savitzky-Golay (digital filter) smoothing, first-derivative, and second-derivative preprocess functions (5 to 25 centrally symmetric convolution points, derived from quadratic polynomials) were applied to PLS calibrations of 1 to 15 factors. The results demonstrated the danger of an over-reliance on preprocessing when (1) the number of samples used in a multivariate calibration is low (<50), (2) the spectral response of the analyte is weak, and (3) the goodness of the calibration is based on the coefficient of determination (R²) rather than a term based on residual error. The graphical method has application to the evaluation of other preprocess functions and various types of spectroscopy data.
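The compared pretreatments map directly onto scipy's Savitzky-Golay filter; a sketch with an 11-point quadratic window, one of the 5-to-25-point settings examined:

```python
import numpy as np
from scipy.signal import savgol_filter

# Smoothing, first and second derivatives from a quadratic (polyorder=2)
# Savitzky-Golay filter; window lengths of 5-25 points match the range
# examined in the study. The spectra here are synthetic stand-ins.
spectra = np.random.rand(40, 700)  # rows = NIR reflection spectra
smoothed = savgol_filter(spectra, window_length=11, polyorder=2, axis=1)
d1 = savgol_filter(spectra, window_length=11, polyorder=2, deriv=1, axis=1)
d2 = savgol_filter(spectra, window_length=11, polyorder=2, deriv=2, axis=1)
# Each pretreated matrix would then feed a separate PLS calibration.
```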
Classification of java tea (Orthosiphon aristatus) quality using FTIR spectroscopy and chemometrics
NASA Astrophysics Data System (ADS)
Heryanto, R.; Pradono, D. I.; Marlina, E.; Darusman, L. K.
2017-05-01
Java tea (Orthosiphon aristatus) is a plant widely used as a medicinal herb in Indonesia. Its quality varies depending on several factors, such as cultivation area, climate and harvesting time. This study aimed to investigate the effectiveness of FTIR spectroscopy coupled with chemometrics for discriminating the quality of java tea from different cultivation areas. FTIR spectra of ethanolic extracts were collected from five different regions of origin of java tea. Prior to chemometric evaluation, the spectra were pre-processed by baselining, normalization and derivatization. Principal Component Analysis (PCA) was used to reduce the spectra to two PCs, which explained 73% of the total variance, and the score plot of the two PCs showed groupings of the samples according to their regions of origin. Furthermore, Partial Least Squares-Discriminant Analysis (PLS-DA) applied to the pre-processed data produced an external validation success rate of 100%. This study shows that FTIR analysis and chemometrics have the discriminatory power to classify java tea by quality related to its region of origin.
Initial Results from SQUID Sensor: Analysis and Modeling for the ELF/VLF Atmospheric Noise.
Hao, Huan; Wang, Huali; Chen, Liang; Wu, Jun; Qiu, Longqing; Rong, Liangliang
2017-02-14
In this paper, the amplitude probability density (APD) of wideband extremely low frequency (ELF) and very low frequency (VLF) atmospheric noise is studied. The electromagnetic signals from the atmosphere, referred to herein as atmospheric noise, were recorded by a mobile low-temperature superconducting quantum interference device (SQUID) receiver under magnetically unshielded conditions. In order to eliminate the adverse effects of geomagnetic activity and the powerline, the measured field data were first preprocessed to suppress baseline wander and harmonics using symmetric wavelet transform and least-squares methods. Statistical analysis was then performed on the atmospheric noise at different time and frequency scales. Finally, the wideband ELF/VLF atmospheric noise was analyzed and modeled separately. Experimental results show that a Gaussian model is appropriate to describe ELF atmospheric noise preprocessed by a hole-puncher operator, while for VLF atmospheric noise the symmetric α-stable (SαS) distribution more accurately fits the heavy tail of the envelope probability density function (pdf).
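A sketch of the two noise models using scipy.stats on synthetic data; note that levy_stable.fit can be slow, and practical work often uses quantile-based estimators instead:

```python
import numpy as np
from scipy.stats import levy_stable, norm

rng = np.random.default_rng(0)
elf = rng.normal(0.0, 1.0, 5000)  # stand-in preprocessed ELF noise
vlf = levy_stable.rvs(1.6, 0.0, size=500, random_state=0)  # heavy-tailed VLF

mu, sigma = norm.fit(elf)                        # Gaussian model for ELF
alpha, beta, loc, scale = levy_stable.fit(vlf)   # SaS model for VLF (slow)
print(mu, sigma, alpha)  # alpha < 2 signals heavier-than-Gaussian tails
```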
Automatic Computer Mapping of Terrain
NASA Technical Reports Server (NTRS)
Smedes, H. W.
1971-01-01
Computer processing of 17 wavelength bands of visible, reflective infrared, and thermal infrared scanner spectrometer data, and of three wavelength bands derived from color aerial film, has resulted in successful automatic computer mapping of eight or more terrain classes in a Yellowstone National Park test site. The tests involved: (1) supervised and non-supervised computer programs; (2) special preprocessing of the scanner data to reduce computer processing time and cost and improve accuracy; and (3) studies of the effectiveness of the proposed Earth Resources Technology Satellite (ERTS) data channels in the automatic mapping of the same terrain, based on simulations using the same set of scanner data. The following terrain classes have been mapped with greater than 80 percent accuracy in a 12-square-mile area with 1,800 feet of relief: (1) bedrock exposures, (2) vegetated rock rubble, (3) talus, (4) glacial kame meadow, (5) glacial till meadow, (6) forest, (7) bog, and (8) water. In addition, shadows of clouds and cliffs are depicted, but were greatly reduced by using preprocessing techniques.
Classification of ion mobility spectra by functional groups using neural networks
NASA Technical Reports Server (NTRS)
Bell, S.; Nazarov, E.; Wang, Y. F.; Eiceman, G. A.
1999-01-01
Neural networks were trained using whole ion mobility spectra from a standardized database of 3137 spectra for 204 chemicals at various concentrations. Performance of the network was measured by the success of classification into ten chemical classes. Eleven stages for evaluation of spectra and of spectral pre-processing were employed and minimums established for response thresholds and spectral purity. After optimization of the database, network, and pre-processing routines, the fraction of successful classifications by functional group was 0.91 throughout a range of concentrations. Network classification relied on a combination of features, including drift times, number of peaks, relative intensities, and other factors apparently including peak shape. The network was opportunistic, exploiting different features within different chemical classes. Application of neural networks in a two-tier design where chemicals were first identified by class and then individually eliminated all but one false positive out of 161 test spectra. These findings establish that ion mobility spectra, even with low resolution instrumentation, contain sufficient detail to permit the development of automated identification systems.
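A rough scikit-learn stand-in for the class-level network (synthetic spectra and an illustrative layer size; not the original architecture or database):

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.random((500, 300))    # 500 whole ion mobility spectra (stand-ins)
y = rng.integers(0, 10, 500)  # ten chemical classes

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
net = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0)
net.fit(X_tr, y_tr)
print(net.score(X_te, y_te))  # fraction of successful classifications
```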
NASA Astrophysics Data System (ADS)
Keramitsoglou, I.; Katsouyanni, K.; Analitis, A.; Sismanidis, P.; Kiranoudis, C. T.
2016-12-01
High temperatures and heatwaves are associated with large increases in mortality, especially among susceptible individuals living in urban areas. The within-city variability of these effects, associated with specific area characteristics including the Urban Heat Island effect, has to be taken into account to estimate the level of heatwave risk at a specific city location. Real-time appraisal and quantification of spatially distributed heatwave risk is therefore required to develop innovative applications that safeguard citizens' health. The TREASURE app (http://treasure.eu-project-sites.com/) integrates the expertise of epidemiologists, Earth Observation scientists and IT developers into an intelligent, operational, real-time heatwave risk assessment for citizens, providing the user with a personalized, location-specific assessment of heatwave risk. For the development of the app, an epidemiological analysis of a long series of mortality data against measured temperature series was carried out to identify the temperature level associated with minimum mortality (threshold) and the change in risk of death for increases in temperature above this level in the warm period; published results were also taken into account. For the estimation of heatwave hazard, thermal infrared Earth Observation data were exploited to provide spatially and temporally detailed air and land surface temperatures. An advanced workflow has been developed that uses 4 km/5' geostationary TIR data from the EUMETSAT MSG2-SEVIRI satellite, output from the Global Forecast System weather model, and the SAFNWC software. This workflow consists of pre-processing of the EO data and retrieval of LST and TA at an enhanced spatial resolution of 1 km. The mobile app was developed for, evaluated in and endorsed by two Mediterranean cities with different characteristics, namely Athens (GR) and Palma (ES), and has set the ground for application to any other European city.
Kepler AutoRegressive Planet Search: Motivation & Methodology
NASA Astrophysics Data System (ADS)
Caceres, Gabriel; Feigelson, Eric; Jogesh Babu, G.; Bahamonde, Natalia; Bertin, Karine; Christen, Alejandra; Curé, Michel; Meza, Cristian
2015-08-01
The Kepler AutoRegressive Planet Search (KARPS) project uses statistical methodology associated with autoregressive (AR) processes to model Kepler lightcurves in order to improve exoplanet transit detection in systems with high stellar variability. We also introduce a planet-search algorithm to detect transits in the time-series residuals after application of the AR models. One of the main obstacles in detecting faint planetary transits is the intrinsic stellar variability of the host star. The variability displayed by many stars may have autoregressive properties, wherein later flux values are correlated with previous ones. Autoregressive moving-average (ARMA) models, generalized autoregressive conditional heteroskedasticity (GARCH) models, and related models are flexible, phenomenological methods used with great success to model stochastic temporal behaviors in many fields of study, particularly econometrics. Powerful statistical methods are implemented in the public statistical software environment R and its many packages; modeling involves maximum likelihood fitting, model selection, and residual analysis. These techniques provide a useful framework to model stellar variability and are used in KARPS with the objective of reducing stellar noise to enhance opportunities to find as-yet-undiscovered planets. Our analysis procedure consists of three steps: pre-processing of the data to remove discontinuities, gaps and outliers; ARMA-type model selection and fitting; and a transit signal search of the residuals using a new Transit Comb Filter (TCF) that replaces traditional box-finding algorithms. We apply the procedure to simulated Kepler-like time series with known stellar and planetary signals to evaluate the effectiveness of the KARPS procedures. The ARMA-type modeling is effective at reducing stellar noise, but it also reduces and transforms the transit signal into ingress/egress spikes. A periodogram based on the TCF is constructed to concentrate the signal of these periodic spikes. When a periodic transit is found, the model is displayed on a standard period-folded averaged light curve. We also illustrate the efficient coding in R.
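KARPS works in R; the ARMA fitting and residual step can be sketched in Python with statsmodels, where the model order is an illustrative choice rather than KARPS's model-selection result and the Transit Comb Filter itself is omitted:

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(1)
noise = rng.normal(0.0, 1.0, 2000)
flux = np.empty(2000)
flux[0] = 0.0
for i in range(1, 2000):
    flux[i] = 0.7 * flux[i - 1] + noise[i]  # autocorrelated stellar variability
flux[500:520] -= 3.0                        # box-shaped transit dip

model = ARIMA(flux, order=(2, 0, 1)).fit()  # ARMA(2,1); choose order by AIC
resid = model.resid
# The transit survives in the residuals as ingress/egress spikes, which a
# Transit Comb Filter periodogram is then designed to pick up.
```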
Flood mapping with multitemporal MODIS data
NASA Astrophysics Data System (ADS)
Son, Nguyen-Thanh; Chen, Chi-Farn; Chen, Cheng-Ru
2014-05-01
Flooding is one of the most devastating and frequent disasters, resulting in loss of human life and severe damage to infrastructure and agricultural production. Flooding is a recurrent phenomenon in the Mekong River Delta (MRD), Vietnam, lasting annually from July to November. Information on spatiotemporal flood dynamics is thus important for planners to devise successful strategies for flood monitoring and for mitigating negative effects. The main objective of this study is to develop an approach for weekly mapping of flood dynamics in the MRD with Moderate Resolution Imaging Spectroradiometer (MODIS) data using the water fraction model (WFM). The data processing for 2009 comprises three main steps: (1) data pre-processing to construct smooth time series of the difference in values (DVLE) between the land surface water index (LSWI) and the enhanced vegetation index (EVI) using empirical mode decomposition (EMD), (2) flood derivation using the WFM, and (3) accuracy assessment. The mapping results were compared with ground reference data constructed from Envisat Advanced Synthetic Aperture Radar (ASAR) data. Although several error sources, including mixed-pixel problems and low-resolution bias between the mapping results and the ground reference data, could lower the classification accuracy, the comparisons indicated satisfactory results, with an overall accuracy of 80.5% and a Kappa coefficient of 0.61. These results were reaffirmed by a close correlation between the MODIS-derived flood area and that of the ground reference map at the provincial level, with a correlation coefficient (R²) of 0.93. Considering the importance of remote sensing for monitoring floods and mitigating the damage they cause to crops and infrastructure, this study demonstrates the value of using time-series MODIS DVLE data for weekly flood monitoring in the MRD with the aid of EMD and the WFM. The approach provides quantitative information on spatiotemporal flood dynamics for monitoring purposes and is fully transferable to other regions of the world.
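The EMD smoothing step can be sketched with the third-party PyEMD package (published on PyPI as EMD-signal); dropping the highest-frequency intrinsic mode functions is one common reconstruction rule and may differ from the paper's exact procedure:

```python
import numpy as np
from PyEMD import EMD  # pip install EMD-signal

def emd_smooth(series, drop_first=2):
    """Decompose a noisy index series into intrinsic mode functions (IMFs)
    and rebuild it without the first (highest-frequency, noisiest) modes."""
    imfs = EMD().emd(np.asarray(series, dtype=float))
    return imfs[drop_first:].sum(axis=0)

t = np.linspace(0, 2 * np.pi, 368)
dvle = np.sin(t) - 0.5 * np.cos(3 * t) \
       + 0.2 * np.random.randn(t.size)  # noisy synthetic DVLE series
smooth = emd_smooth(dvle)
```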
Pre-processing SAR image stream to facilitate compression for transport on bandwidth-limited-link
Rush, Bobby G.; Riley, Robert
2015-09-29
Pre-processing is applied to a raw VideoSAR (or similar near-video rate) product to transform the image frame sequence into a product that resembles more closely the type of product for which conventional video codecs are designed, while sufficiently maintaining utility and visual quality of the product delivered by the codec.
NASA Astrophysics Data System (ADS)
An, Chan-Ho; Yang, Janghoon; Jang, Seunghun; Kim, Dong Ku
In this letter, a pre-processed lattice reduction (PLR) scheme is developed for lattice-reduction-aided (LRA) detection in multiple-input multiple-output (MIMO) systems over spatially correlated channels. The PLR computes the LLL-reduced matrix of an equivalent matrix, the product of the current channel matrix and the unimodular transformation matrix obtained from lattice reduction of the spatial correlation matrix, rather than of the current channel matrix itself. In conjunction with the recursive lattice reduction (RLR) scheme [7], the pre-processed RLR (PRLR) is shown to carry out the LR of the channel matrix efficiently, especially for burst packet messages in spatially and temporally correlated channels, while matching the performance of conventional LRA detection.
Hiroyasu, Tomoyuki; Hayashinuma, Katsutoshi; Ichikawa, Hiroshi; Yagi, Nobuaki
2015-08-01
A preprocessing method for endoscopy image analysis using texture analysis is proposed. In a previous study, we proposed a feature value that combines a co-occurrence matrix and a run-length matrix to analyze the extent of early gastric cancer from images taken with narrow-band imaging endoscopy. However, the obtained feature value does not identify lesion zones correctly due to the influence of noise and halation. Therefore, we propose a new preprocessing method with a non-local means filter for de-noising and contrast limited adaptive histogram equalization. We have confirmed that the pattern of gastric mucosa in images can be improved by the proposed method. Furthermore, the lesion zone is shown more correctly by the obtained color map.
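Both preprocessing stages exist in OpenCV; a sketch with commonly used parameter values (the paper's exact settings are not given here):

```python
import cv2
import numpy as np

def preprocess_endoscopy(bgr):
    """De-noise with non-local means, then apply contrast limited adaptive
    histogram equalization (CLAHE) to the lightness channel only, so the
    colour hues of the mucosa pattern are preserved."""
    denoised = cv2.fastNlMeansDenoisingColored(bgr, None, 10, 10, 7, 21)
    lab = cv2.cvtColor(denoised, cv2.COLOR_BGR2LAB)
    l, a, b = cv2.split(lab)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    lab = cv2.merge((clahe.apply(l), a, b))
    return cv2.cvtColor(lab, cv2.COLOR_LAB2BGR)

frame = (np.random.rand(240, 320, 3) * 255).astype(np.uint8)  # stand-in frame
enhanced = preprocess_endoscopy(frame)
```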
Compiler analysis for irregular problems in FORTRAN D
NASA Technical Reports Server (NTRS)
Vonhanxleden, Reinhard; Kennedy, Ken; Koelbel, Charles; Das, Raja; Saltz, Joel
1992-01-01
We developed a dataflow framework which provides a basis for rigorously defining strategies to make use of runtime preprocessing methods for distributed memory multiprocessors. In many programs, several loops access the same off-processor memory locations. Our runtime support gives us a mechanism for tracking and reusing copies of off-processor data. A key aspect of our compiler analysis strategy is to determine when it is safe to reuse copies of off-processor data. Another crucial function of the compiler analysis is to identify situations which allow runtime preprocessing overheads to be amortized. This dataflow analysis will make it possible to effectively use the results of interprocedural analysis in our efforts to reduce interprocessor communication and the need for runtime preprocessing.
Toward a Mobility-Driven Architecture for Multimodal Underwater Networking
2017-02-01
applications. By equipping AUVs with short-range, high-bandwidth underwater wireless communications, which feature lower energy-per-bit cost than acoustic... protocols. They suffer from significant transmission path losses at high frequencies, long propagation delays, low and distance-dependent bandwidth, time... of data preprocessing, data compression, and either tethering to a surface buoy able to use radio frequency (RF) communications or using undersea
The properties of the lunar regolith at Chang'e-3 landing site: A study based on LPR data
NASA Astrophysics Data System (ADS)
Feng, J.; Su, Y.; Xing, S.; Ding, C.; Li, C.
2015-12-01
In situ sampling of the surface is difficult in planetary exploration, and radar sensing is sometimes a better choice. Properties of the surface material such as permittivity, density, and depth can be obtained with a surface-penetrating radar. Chang'e-3 (CE-3) landed in the northern Mare Imbrium, and the Lunar Penetrating Radar (LPR) carried on the Yutu rover detects the shallow structure of the lunar crust and the properties of the lunar regolith, giving us a close look at the lunar subsurface. We process the radar data in two steps: a regular preprocessing step and a migration step. The preprocessing includes zero-time correction, de-wow, gain compensation, DC removal, and geometric positioning. We then combine all radar data obtained while the rover was moving and use an FIR filter with a pass band of 200-600 MHz to reduce noise in the radar image. A normal radar image is obtained after the preprocessing step. Using a nonlinear least-squares fitting method, we fit most of the hyperbolas in the radar image, which are caused by objects or rocks buried in the regolith, and estimate the EM wave propagation velocity and the permittivity of the regolith. Since there is a fixed mathematical relationship between dielectric constant and density, the density profile of the lunar regolith is also calculated. The permittivity and density at the landing site appear to be larger than previously thought. Finally, with a model of variable velocities, we apply the Kirchhoff migration method widely used in seismology to transform the unfocused space-time LPR image into a focused one showing the true locations and sizes of the objects (mostly stones). From the migrated image, we find that the regolith depth at the landing site is smaller than in previous studies and that the stone content rises rapidly with depth. Our study suggests that the landing site is a young region with a short reworking history of the surface, which is consistent with the crater density, illustrating the gradual formation of regolith by rock fracturing during impact events.
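The abstract does not give the fitting details; a hedged sketch of one standard formulation is shown below, assuming the two-way traveltime hyperbola t(x) = (2/v) * sqrt((x - x0)^2 + d^2) for a point scatterer at along-track position x0 and depth d, with the relative permittivity recovered from the fitted velocity via eps_r = (c/v)^2.

```python
# Sketch: velocity and permittivity estimation from one diffraction hyperbola.
# Variable names and the initial guess are illustrative assumptions.
import numpy as np
from scipy.optimize import curve_fit

C = 0.299792458  # speed of light in vacuum, m/ns

def hyperbola(x, v, x0, d):
    # Two-way traveltime of a point scatterer at (x0, d) for wave speed v.
    return (2.0 / v) * np.sqrt((x - x0) ** 2 + d ** 2)

def fit_hyperbola(x_m, t_ns):
    """x_m: along-track antenna positions (m); t_ns: picked traveltimes (ns)."""
    p0 = (0.1, x_m.mean(), 1.0)            # initial guess: v ~ 0.1 m/ns
    (v, x0, d), _ = curve_fit(hyperbola, x_m, t_ns, p0=p0)
    eps_r = (C / v) ** 2                   # relative permittivity from velocity
    return v, d, eps_r
```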
Zhang, Mingjing; Wen, Ming; Zhang, Zhi-Min; Lu, Hongmei; Liang, Yizeng; Zhan, Dejian
2015-03-01
Retention time shift is one of the most challenging problems in the preprocessing of massive chromatographic datasets. Here, an improved version of the moving-window fast Fourier transform cross-correlation algorithm is presented that performs nonlinear and robust alignment of chromatograms by analyzing the shift matrix generated by the moving-window procedure. The shift matrix in retention time is estimated by fast Fourier transform cross-correlation with a moving-window procedure, and the refined shift of each scan point is obtained by calculating the mode of the corresponding column of the shift matrix. This version is simple, yet more effective and robust than the previously published moving-window fast Fourier transform cross-correlation method. It handles nonlinear retention time shifts robustly provided a proper window size is selected; the window size is the only parameter that needs to be adjusted and optimized. The properties of the proposed method are investigated by comparison with the previous moving-window fast Fourier transform cross-correlation method and recursive alignment by fast Fourier transform on chromatographic datasets. The pattern recognition results for a gas chromatography-mass spectrometry dataset of metabolic syndrome can be improved significantly after preprocessing by this method. Furthermore, the proposed method is available as an open-source package at https://github.com/zmzhang/MWFFT2. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
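A hedged sketch of the shift-matrix idea follows: for each moving window, estimate the lag that maximizes an FFT-based cross-correlation between a sample chromatogram and a reference, let every window vote for the scans it covers, and take the per-scan mode. Window size and function names are illustrative, not from the published package.

```python
# Sketch of moving-window FFT cross-correlation alignment (shift estimation).
import numpy as np
from scipy import stats

def window_shift(ref, sam):
    """Lag of `sam` relative to `ref` via zero-padded FFT cross-correlation."""
    n = len(ref)
    xc = np.fft.ifft(np.fft.fft(ref, 2 * n) * np.conj(np.fft.fft(sam, 2 * n))).real
    return np.argmax(np.fft.fftshift(xc)) - n

def shift_profile(ref, sam, win=301, step=50):
    n_windows = len(ref) // step + 1
    shifts = np.full((len(ref), n_windows), np.nan)
    for k, start in enumerate(range(0, len(ref) - win, step)):
        s = window_shift(ref[start:start + win], sam[start:start + win])
        shifts[start:start + win, k] = s       # each window votes for its scans
    # Refined per-scan shift: mode over all windows covering that scan
    # (scans covered by no window remain NaN).
    return stats.mode(shifts, axis=1, nan_policy='omit').mode.ravel()
```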
Real-time portable system for fabric defect detection using an ARM processor
NASA Astrophysics Data System (ADS)
Fernandez-Gallego, J. A.; Yañez-Puentes, J. P.; Ortiz-Jaramillo, B.; Alvarez, J.; Orjuela-Vargas, S. A.; Philips, W.
2012-06-01
The modern textile industry seeks to produce textiles with as few defects as possible, since the presence of defects can decrease the final price of products by 45% to 65%. Automated visual inspection (AVI) systems, based on image analysis, have become an important alternative for replacing traditional inspection methods that involve human tasks. An AVI system offers the advantage of repeatability when implemented within defined constraints, providing more objective and reliable results for particular tasks than human inspection. The development cost of automated inspection systems can be reduced by using modular solutions with embedded systems, an important advantage of which is low energy consumption. Among the possibilities for developing embedded systems, the ARM processor has been explored for acquisition, monitoring, and simple signal processing tasks. In a recent approach we explored the use of the ARM processor for defect detection by implementing the wavelet transform. However, the computation speed of the preprocessing was not yet sufficient for real-time applications. In this approach we significantly improve the preprocessing speed of the algorithm by optimizing matrix operations, such that it is adequate for real-time application. The system was tested for defect detection using different defect types. The paper focuses on giving a detailed description of the basis of the algorithm implementation, so that other algorithms may make use of the ARM operations for fast implementations.
Evaluation of Quality Assessment Protocols for High Throughput Genome Resequencing Data
Chiara, Matteo; Pavesi, Giulio
2017-01-01
Large-scale initiatives aiming to recover the complete sequence of thousands of human genomes are currently being undertaken worldwide, concurring to the generation of a comprehensive catalog of human genetic variation. The ultimate and most ambitious goal of human population-scale genomics is the characterization of the so-called human "variome," through the identification of causal mutations or haplotypes. Several research institutions worldwide currently use genotyping assays based on Next-Generation Sequencing (NGS) for diagnostics and clinical screenings, and the widespread application of such technologies promises major revolutions in medical science. Bioinformatic analysis of human resequencing data is one of the main factors limiting the effectiveness and general applicability of NGS for clinical studies. The requirement for multiple tools, which must be combined in dedicated protocols to accommodate different types of data (gene panels, exomes, or whole genomes), together with the high variability of the data, makes it difficult to establish a definitive strategy of general use. While several studies already compare the sensitivity and accuracy of bioinformatic pipelines for the identification of single nucleotide variants from resequencing data, little is known about the impact of quality assessment and read pre-processing strategies. In this work we discuss the major strengths and limitations of the various genome resequencing protocols currently used in molecular diagnostics and for the discovery of novel disease-causing mutations. Taking advantage of publicly available data, we devise and suggest a series of best practices for the pre-processing of the data that consistently improve the outcome of genotyping with minimal impact on computational costs. PMID:28736571
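For concreteness, here is a hedged sketch of one read pre-processing step of the kind such protocols evaluate: a simple 3' sliding-window quality trim of a FASTQ file using Biopython. Thresholds and function names are arbitrary assumptions; production pipelines would use dedicated, validated trimmers.

```python
# Illustrative FASTQ quality trimming (not the paper's recommended tool).
from Bio import SeqIO

def trim_record(rec, window=5, min_q=20):
    """Cut the read where the mean Phred quality of a trailing window drops."""
    quals = rec.letter_annotations["phred_quality"]
    for end in range(len(quals), window - 1, -1):
        if sum(quals[end - window:end]) / window >= min_q:
            return rec[:end]               # keep prefix ending in a good window
    return rec[:0]                         # nothing passes: empty read

def trim_fastq(path_in, path_out, min_len=36):
    trimmed = (trim_record(r) for r in SeqIO.parse(path_in, "fastq"))
    SeqIO.write((r for r in trimmed if len(r) >= min_len), path_out, "fastq")
```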
V-FOR-WaTer - a new virtual research environment for environmental research
NASA Astrophysics Data System (ADS)
Strobl, Marcus; Azmi, Elnaz; Hassler, Sibylle; Mälicke, Mirko; Meyer, Jörg; Zehe, Erwin
2017-04-01
The preparation of heterogeneous datasets for scientific analysis is still a demanding task. Data preprocessing for hydrological models typically involves gathering datasets from different sources, extensive work within geoinformation systems, data transformation, the generation of computational grids, and the definition of initial and boundary conditions. V-FOR-WaTer, a standardized and scalable data hub with compatible analysis tools, will ease comprehensive studies and significantly reduce data preparation time. The idea behind V-FOR-WaTer is to bring together various datasets (e.g. point measurements, 2D/3D data, time series data) from different sources (e.g. gathered in research projects, or as part of regular monitoring by state offices) and to provide common as well as innovative scaling tools in space and time to generate a coherent data grid. Each dataset holds detailed standardized metadata to ensure usability of the data, to offer a comprehensive search function, and to provide reference information for appropriate citation of the dataset creators. V-FOR-WaTer includes a base set of data and tools, but its purpose is to grow as users extend the virtual research environment with their own tools and research data. Researchers who upload new data or tools can receive a digital object identifier, or protect their data and tools from others until publication. Access to the data and tools provided by V-FOR-WaTer is via an easy-to-use web portal. Thanks to its modular architecture, the portal can readily be extended with new tools and features and also offers interfaces to Matlab, Python and R.
Comparing Binaural Pre-processing Strategies III
Warzybok, Anna; Ernst, Stephan M. A.
2015-01-01
A comprehensive evaluation of eight signal pre-processing strategies, including directional microphones, coherence filters, single-channel noise reduction, binaural beamformers, and their combinations, was undertaken with normal-hearing (NH) and hearing-impaired (HI) listeners. Speech reception thresholds (SRTs) were measured in three noise scenarios (multitalker babble, cafeteria noise, and a single competing talker). Predictions of three common instrumental measures were compared with the general perceptual benefit provided by the algorithms. The individual SRTs measured without pre-processing and the individual benefits were objectively estimated using the binaural speech intelligibility model. Ten NH listeners and 12 HI listeners participated; the participants varied in age and pure-tone threshold levels. Although HI listeners required a better signal-to-noise ratio than NH listeners to obtain 50% intelligibility, no differences between the two groups were found in the SRT benefit from the different algorithms. With the exception of single-channel noise reduction, all algorithms showed an SRT improvement of between 2.1 dB (in cafeteria noise) and 4.8 dB (in the single-competing-talker condition). Model predictions with the binaural speech intelligibility model explained 83% of the measured variance of the individual SRTs in the no-pre-processing condition. Regarding the benefit from the algorithms, the instrumental measures were not able to predict the perceptual data in all tested noise conditions. The comparable benefit observed for both groups suggests a possible application of noise reduction schemes for listeners with different hearing status. Although the model can predict the individual SRTs without pre-processing, further development is necessary to predict the benefits obtained from the algorithms at an individual level. PMID:26721922
Assessment of data pre-processing methods for LC-MS/MS-based metabolomics of uterine cervix cancer.
Chen, Yanhua; Xu, Jing; Zhang, Ruiping; Shen, Guoqing; Song, Yongmei; Sun, Jianghao; He, Jiuming; Zhan, Qimin; Abliz, Zeper
2013-05-07
A metabolomics strategy based on rapid resolution liquid chromatography/tandem mass spectrometry (RRLC-MS/MS) and multivariate statistics has been implemented to identify potential biomarkers of uterine cervix cancer. Given the importance of the data pre-processing method, three popular software packages have been compared; each was used to acquire a data matrix from the same LC-MS/MS data. Multivariate statistics was subsequently used to identify significantly changed biomarkers for uterine cervix cancer from the resulting data matrices. The reliability of the identified discriminating metabolites was further validated on the basis of manually extracted data and ROC curves. Nine potential biomarkers were identified as having a close relationship with uterine cervix cancer. Considering these in combination as a biomarker group, the AUC amounted to 0.997, with a sensitivity of 92.9% and a specificity of 95.6%; the prediction accuracy was 96.6%. Among these potential biomarkers, the amounts of four purine derivatives were greatly decreased, which might be related to a P2 receptor that might lead to a decrease in cell number through apoptosis. Moreover, only two of them were identified simultaneously by all of the pre-processing tools. The results demonstrate that the data pre-processing method can seriously bias metabolomics results. Therefore, the application of two or more data pre-processing methods would reveal a more comprehensive set of potential biomarkers in non-targeted metabolomics, before further validation with LC-MS/MS-based targeted metabolomics in MRM mode is conducted.
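A hedged sketch of the biomarker-panel evaluation (AUC, sensitivity, specificity from a ROC curve), assuming scikit-learn, is shown below. The logistic-regression combiner and in-sample scoring are illustrative simplifications; the study validated with manually extracted data and MRM follow-up.

```python
# Sketch: AUC and operating point for a combined biomarker panel.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, auc

def panel_auc(X, y):
    """X: samples x metabolite-intensities matrix; y: 1 = cancer, 0 = control."""
    scores = LogisticRegression(max_iter=1000).fit(X, y).predict_proba(X)[:, 1]
    fpr, tpr, _ = roc_curve(y, scores)
    j = np.argmax(tpr - fpr)                   # Youden's J picks the cut-off
    return auc(fpr, tpr), tpr[j], 1 - fpr[j]   # AUC, sensitivity, specificity
```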
Research on detection method of UAV obstruction based on binocular vision
NASA Astrophysics Data System (ADS)
Zhu, Xiongwei; Lei, Xusheng; Sui, Zhehao
2018-04-01
To achieve autonomous obstacle positioning and ranging during UAV (unmanned aerial vehicle) flight, a system based on binocular vision is constructed. A three-stage image preprocessing method is proposed to address noise and brightness differences in the captured images. The distance to the nearest obstacle is calculated from the disparity map generated by binocular vision. The contour of the obstacle is then extracted by post-processing the disparity map, and a color-based adaptive parameter adjustment algorithm is designed to extract obstacle contours automatically. Finally, safety distance measurement and obstacle positioning during UAV flight are achieved. Based on a series of tests, the distance measurement error remains within 2.24% over the measuring range from 5 m to 20 m.
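A minimal sketch of the ranging step for a rectified stereo pair, using OpenCV, follows. Matcher settings, focal length f (pixels), and baseline B (m) are placeholder assumptions, not the paper's calibration.

```python
# Sketch: nearest-obstacle distance from a disparity map, Z = f * B / d.
import cv2
import numpy as np

def nearest_obstacle_distance(left_gray, right_gray, f_px=700.0, baseline_m=0.12):
    sgbm = cv2.StereoSGBM_create(minDisparity=0, numDisparities=96, blockSize=7)
    disp = sgbm.compute(left_gray, right_gray).astype(np.float32) / 16.0
    valid = disp > 1.0                       # ignore unmatched / far pixels
    depth = f_px * baseline_m / disp[valid]  # triangulation per valid pixel
    return depth.min()                       # distance to the closest obstacle
```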
Evaluation of an Area-Based matching algorithm with advanced shape models
NASA Astrophysics Data System (ADS)
Re, C.; Roncella, R.; Forlani, G.; Cremonese, G.; Naletto, G.
2014-04-01
Nowadays, the scientific institutions involved in planetary mapping are working on new strategies to produce accurate high-resolution DTMs from space images at planetary scale, usually dealing with extremely large data volumes. From a methodological point of view, despite the introduction of a series of new image matching algorithms (e.g., Semi-Global Matching) that yield superior results (especially because they usually produce smooth and continuous surfaces) with lower processing times, the preference in this field still goes to well-established area-based matching techniques. Many efforts are consequently directed at improving each phase of the photogrammetric process, from image pre-processing to DTM interpolation. In this context, the Dense Matcher software (DM) developed at the University of Parma has recently been optimized to cope with the very high resolution images provided by the most recent missions (LROC NAC and HiRISE), focusing mainly on the improvement of the correlation phase and on process automation. Important changes have been made to the correlation algorithm, while maintaining its high performance in terms of precision and accuracy, by implementing an advanced version of the Least Squares Matching (LSM) algorithm. In particular, an iterative algorithm has been developed to adapt the geometric transformation in image resampling using different shape functions, as originally proposed by other authors in different applications.
Detection of long duration cloud contamination in hyper-temporal NDVI imagery
NASA Astrophysics Data System (ADS)
Ali, A.; de Bie, C. A. J. M.; Skidmore, A. K.; Scarrott, R. G.
2012-04-01
NDVI time series imagery is commonly used as a reliable source for land use and land cover mapping and monitoring. However, long-duration cloud cover can significantly reduce its precision in areas where persistent clouds prevail. Quantifying errors related to cloud contamination is therefore essential for accurate land cover mapping and monitoring. This study aims to detect long-duration cloud contamination in hyper-temporal NDVI imagery used for land cover mapping and monitoring. MODIS-Terra NDVI imagery (250 m; 16-day; Feb'03-Dec'09) was used after the necessary pre-processing with quality flags and an upper-envelope filter (ASAVOGOL). Subsequently, the stacked MODIS-Terra NDVI image (161 layers) was classified into 10 to 100 clusters using ISODATA. After classification, the 97-cluster image was selected as the best classification with the help of divergence statistics. To detect long-duration cloud contamination, the mean NDVI class profiles of the 97-cluster image were analyzed for temporal artifacts. The results showed that long-duration clouds affect the normal temporal progression of NDVI and cause anomalies. Of the 97 clusters, 32 were found to be cloud-contaminated, and contamination was more prominent in areas of high rainfall. This study can help stop the propagation of errors caused by long-duration cloud contamination into regional land cover mapping and monitoring.
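The exact ASAVOGOL filter is not reproduced here; below is a hedged sketch of an iterative upper-envelope Savitzky-Golay smoother in the same spirit: smooth the series, pull cloud-depressed points up toward the fitted envelope, and re-smooth. Window length, order, and iteration count are illustrative.

```python
# Sketch: iterative upper-envelope smoothing of an NDVI time series.
import numpy as np
from scipy.signal import savgol_filter

def upper_envelope(ndvi, window=7, order=2, n_iter=3):
    fitted = savgol_filter(ndvi, window, order)
    for _ in range(n_iter):
        # Cloud contamination biases NDVI low, so keep the larger value.
        ndvi = np.maximum(ndvi, fitted)
        fitted = savgol_filter(ndvi, window, order)
    return fitted
```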
Yun Chen; Hui Yang
2014-01-01
The rapid advancement of biomedical instrumentation and healthcare technology has resulted in data-rich environments in hospitals. However, the meaningful information extracted from these rich datasets is limited. There is a dire need to go beyond current medical practices and develop data-driven methods and tools that enable (i) the handling of big data, (ii) the extraction of data-driven knowledge, and (iii) the exploitation of acquired knowledge for optimizing clinical decisions. The present study focuses on the prediction of mortality rates in Intensive Care Units (ICU) using patient-specific healthcare recordings. Notably, postsurgical monitoring in the ICU produces massive datasets with unique properties, e.g., variable heterogeneity, patient heterogeneity, and time asynchronization. To cope with the challenges in ICU datasets, we developed a postsurgical decision support system with a series of analytical tools, including data categorization, data pre-processing, feature extraction, feature selection, and predictive modeling. Experimental results show that the proposed data-driven methodology outperforms traditional approaches, yielding better results in the evaluation of real-world ICU data from 4000 subjects in the database. This research shows great potential for the use of data-driven analytics to improve the quality of healthcare services.
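A hedged sketch of such an analytical chain (pre-process, select features, predict), assuming scikit-learn, is shown below; the authors' actual models and features are not specified here, so every component choice is an illustrative assumption.

```python
# Sketch: imputation -> scaling -> feature selection -> mortality classifier.
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

pipeline = Pipeline([
    ("impute", SimpleImputer(strategy="median")),   # irregular, missing vitals
    ("scale", StandardScaler()),
    ("select", SelectKBest(f_classif, k=20)),       # keep the strongest features
    ("model", RandomForestClassifier(n_estimators=300, random_state=0)),
])
# Example usage (X: patients x features, y: in-hospital mortality labels):
# auc = cross_val_score(pipeline, X, y, cv=5, scoring="roc_auc").mean()
```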
Analysis of EEG signals regularity in adults during video game play in 2D and 3D.
Khairuddin, Hamizah R; Malik, Aamir S; Mumtaz, Wajid; Kamel, Nidal; Xia, Likun
2013-01-01
Video games have long been part of the entertainment industry. Nonetheless, it is not well known how video games can affect us with the advancement of 3D technology. The purpose of this study is to investigate the regularity of EEG signals when playing video games in 2D and 3D modes. A total of 29 healthy subjects (24 male, 5 female) with a mean age of 21.79 (1.63) years participated. Subjects were asked to play a car racing video game in three different modes (2D, 3D passive, and 3D active). In the 3D passive mode, subjects wore passive polarized (cinema-type) glasses, while for 3D active, active shutter glasses were used. Scalp EEG data were recorded during game play using a 19-channel EEG machine with linked ears as reference. After the data were pre-processed, the signal irregularity for all conditions was computed. Two parameters were used to measure signal complexity for the time series data: (i) Hjorth complexity and (ii) the Composite Permutation Entropy Index (CPEI). Based on these two parameters, our results showed that the complexity level increased from the eyes-closed to the eyes-open condition, and increased further for 3D compared with 2D game play.
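Of the two measures, Hjorth complexity has a simple closed form and is easy to reproduce; CPEI is more involved and is omitted here. A minimal sketch for one EEG channel:

```python
# Hjorth parameters: mobility = sqrt(var(x') / var(x));
# complexity = mobility of the first derivative / mobility of the signal.
import numpy as np

def hjorth_mobility(x):
    return np.sqrt(np.var(np.diff(x)) / np.var(x))

def hjorth_complexity(x):
    return hjorth_mobility(np.diff(x)) / hjorth_mobility(x)
```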
SOMFlow: Guided Exploratory Cluster Analysis with Self-Organizing Maps and Analytic Provenance.
Sacha, Dominik; Kraus, Matthias; Bernard, Jurgen; Behrisch, Michael; Schreck, Tobias; Asano, Yuki; Keim, Daniel A
2018-01-01
Clustering is a core building block for data analysis, aiming to extract otherwise hidden structures and relations from raw datasets, such as particular groups that can be effectively related, compared, and interpreted. A plethora of visual-interactive cluster analysis techniques has been proposed to date; however, arriving at useful clusterings often requires several rounds of user interaction to fine-tune the data preprocessing and algorithms. We present a multi-stage Visual Analytics (VA) approach for iterative cluster refinement together with an implementation (SOMFlow) that uses Self-Organizing Maps (SOM) to analyze time series data. It supports exploration by offering the analyst a visual platform to analyze intermediate results, adapt the underlying computations, iteratively partition the data, and reflect on previous analytical activities. The history of previous decisions is explicitly visualized within a flow graph, allowing the analyst to compare earlier cluster refinements and to explore relations. We further leverage quality and interestingness measures to guide the analyst in the discovery of useful patterns, relations, and data partitions. We conducted two pair analytics experiments together with a subject matter expert in speech intonation research to demonstrate that the approach is effective for interactive data analysis, supporting enhanced understanding of clustering results as well as of the interactive process itself.
Machine Learning Techniques for Stellar Light Curve Classification
NASA Astrophysics Data System (ADS)
Hinners, Trisha A.; Tat, Kevin; Thorp, Rachel
2018-07-01
We apply machine learning techniques in an attempt to predict and classify stellar properties from noisy and sparse time-series data. We preprocessed over 94 GB of Kepler light curves from the Mikulski Archive for Space Telescopes (MAST) to classify according to 10 distinct physical properties using both representation learning and feature engineering approaches. Studies using machine learning in the field have been primarily done on simulated data, making our study one of the first to use real light-curve data for machine learning approaches. We tuned our data using previous work with simulated data as a template and achieved mixed results between the two approaches. Representation learning using a long short-term memory recurrent neural network produced no successful predictions, but our work with feature engineering was successful for both classification and regression. In particular, we were able to achieve values for stellar density, stellar radius, and effective temperature with low error (∼2%–4%) and good accuracy (∼75%) for classifying the number of transits for a given star. The results show promise for improvement for both approaches upon using larger data sets with a larger minority class. This work has the potential to provide a foundation for future tools and techniques to aid in the analysis of astrophysical data.
Gao, Lin; Zhang, Tongsheng; Wang, Jue; Stephen, Julia
2014-01-01
When connectivity analysis is carried out for event-related EEG and MEG, the presence of strong spatial correlations from spontaneous background activity may mask the local neuronal evoked activity and lead to spurious connections. In this paper, we hypothesized that PCA decomposition could be used to diminish the background activity and thereby improve the performance of connectivity analysis in event-related experiments. The idea was tested using simulation, where we found that for the 306-channel Elekta Neuromag system the first 4 PCs represent the dominant background activity, and the source connectivity pattern after preprocessing is consistent with the true connectivity pattern designed in the simulation. Improving the signal-to-noise ratio of the evoked responses by discarding the first few PCs yields increased coherences at the major physiological frequency bands. Furthermore, the evoked information was maintained after PCA preprocessing. In conclusion, it is demonstrated that the first few PCs represent background activity, and PCA decomposition can be employed to remove it and expose the evoked activity in the channels under investigation. Therefore, PCA can be applied as a preprocessing approach to improve neuronal connectivity analysis for event-related data. PMID:22918837
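A minimal sketch of the proposed preprocessing follows: remove the first few principal components (the dominant background) from a channels x time evoked-response array via SVD. The choice of 4 PCs follows the simulation result quoted above; the function name is illustrative.

```python
# Sketch: discard the leading PCs that carry spontaneous background activity.
import numpy as np

def remove_background(data, n_pcs=4):
    """data: channels x time array of an evoked response."""
    data = data - data.mean(axis=1, keepdims=True)
    U, s, Vt = np.linalg.svd(data, full_matrices=False)
    s[:n_pcs] = 0.0                        # zero out the background components
    return U @ np.diag(s) @ Vt             # reconstruction without them
```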
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fromm, Catherine
2015-08-20
Ptychography is an advanced diffraction-based imaging technique that can achieve a resolution of 5 nm and below. It is done by scanning a sample through a beam of focused x-rays using discrete yet overlapping scan steps. Scattering data are collected on a CCD camera, and the phase of the scattered light is reconstructed with sophisticated iterative algorithms. Because the experimental setup is similar, ptychography setups can be created by retrofitting existing STXM beamlines with new hardware. The other challenge lies in the reconstruction of the collected scattering images: scattering data must be adjusted and packaged with experimental parameters to calibrate the reconstruction software. The necessary pre-processing of data prior to reconstruction is unique to each beamline setup, and even to the optical alignment used on a particular day. Pre-processing software must be developed to be flexible and efficient in order to allow experimenters appropriate control and freedom in the analysis of their hard-won data. This paper describes the implementation of pre-processing software which successfully connects the data collection steps to the reconstruction steps, letting the user accomplish accurate and reliable ptychography.
Latifoğlu, Fatma; Polat, Kemal; Kara, Sadik; Güneş, Salih
2008-02-01
In this study, we propose a new medical diagnosis system based on principal component analysis (PCA), k-NN-based weighting pre-processing, and the Artificial Immune Recognition System (AIRS) for the diagnosis of atherosclerosis from carotid artery Doppler signals. The suggested system consists of four stages. First, in the feature extraction stage, we obtained features related to atherosclerosis using fast Fourier transform (FFT) modeling and by calculating the maximum frequency envelope of the sonograms. Second, in the dimensionality reduction stage, the 61 atherosclerosis features were reduced to 4 using PCA. Third, in the pre-processing stage, we weighted these 4 features using different values of k in a new scheme based on k-NN weighting. Finally, in the classification stage, the AIRS classifier was used to classify subjects as healthy or as having atherosclerosis. A classification accuracy of 100% was obtained by the proposed system using 10-fold cross-validation. This success shows that the proposed system is a robust and effective system for the diagnosis of atherosclerosis.
Biomass Feedstock and Conversion Supply System Design and Analysis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jacobson, Jacob J.; Roni, Mohammad S.; Lamers, Patrick
Idaho National Laboratory (INL) supports the U.S. Department of Energy's bioenergy research program. As part of the research program, INL investigates the feedstock logistics economics and sustainability of these fuels. A series of reports demonstrating feedstock logistics costs was published between 2000 and 2013, each tailored to a specific feedstock and conversion process. Although those reports differ in terms of conversion, some of the processes in the feedstock logistics are the same for each conversion process, so the reports contain similar information. A single report can therefore be designed that captures the commonality in the feedstock logistics process while discussing the feedstock logistics costs for different conversion processes. This report is designed in such a way that it can capture different feedstock logistics costs while eliminating the need to write a conversion-specific design report. Previous work established the current costs based on conventional equipment and processes. The 2012 programmatic target was to demonstrate a delivered biomass logistics cost of $55/dry ton for woody biomass delivered to a fast pyrolysis conversion facility. The goal was achieved by applying field and process demonstration unit-scale data from harvest, collection, storage, preprocessing, handling, and transportation operations in INL's biomass logistics model. The goal of the 2017 Design Case is to enable expansion of biofuels production beyond highly productive resource areas by breaking the reliance of cost-competitive biofuel production on a single, low-cost feedstock. The 2017 programmatic target is to supply feedstock meeting the in-feed conversion process quality specifications to the conversion facility at a total logistics cost of $80/dry ton. The $80/dry ton target encompasses the total delivered feedstock cost, including both grower payment and logistics costs, while meeting all conversion in-feed quality targets; the 2012 $55/dry ton target included only logistics costs, with a limited focus on biomass quantity and quality, and did not include a grower payment. The 2017 Design Case explores two approaches to addressing the logistics challenge: an agronomic solution based on blending and integrated landscape management, and a logistics solution based on distributed biomass preprocessing depots. The concept behind blended feedstocks and integrated landscape management is to gain access to more regional feedstock at lower access fees (i.e., grower payment) and to reduce preprocessing costs by blending high-quality feedstocks with marginal-quality feedstocks. Blending has long been used in the grain industry; however, blended feedstocks are a relatively new concept in the biofuel industry. The blended feedstock strategy relies on the availability of multiple feedstock sources that are blended using a least-cost formulation within an economical supply radius, which, in turn, decreases the grower payment by reducing the amount of any single biomass. This report introduces the concepts of blending and integrated landscape management and justifies their importance in meeting the 2017 programmatic goals.
Pre-processing ambient noise cross-correlations with equalizing the covariance matrix eigenspectrum
NASA Astrophysics Data System (ADS)
Seydoux, Léonard; de Rosny, Julien; Shapiro, Nikolai M.
2017-09-01
Passive imaging techniques based on ambient seismic noise require a nearly isotropic distribution of the noise sources in order to ensure reliable traveltime measurements between seismic stations. However, real ambient seismic noise often only partially fulfils this condition. It is generated in preferential areas (in the deep ocean or near continental shores), and some highly coherent pulse-like signals may be present in the data, such as those generated by earthquakes. Several pre-processing techniques have been developed to attenuate the directional and deterministic behaviour of real ambient noise. Most of them are applied to individual seismograms before cross-correlation computation; the most widely used are spectral whitening and temporal smoothing of the individual seismic traces. We here propose an additional pre-processing step, to be used together with the classical ones, based on spatial analysis of the seismic wavefield. We compute the cross-spectra between all available station pairs in the spectral domain, leading to the data covariance matrix, and apply a one-bit normalization to the covariance matrix eigenspectrum before extracting the cross-correlations in the time domain. The efficiency of the method is shown with several numerical tests. We apply the method to the data collected by the USArray when the M8.8 Maule earthquake occurred on 2010 February 27. Compared with classical equalization, the method clearly better attenuates the highly energetic and coherent waves incoming from the earthquake, and allows reliable traveltime measurements even in the presence of the earthquake.
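A hedged numerical sketch of the eigenspectrum equalization is given below, under the assumption that each correlation window yields a covariance matrix estimated from a few Fourier-transformed subwindows (hence rank-deficient): the one-bit normalization sets the non-zero eigenvalues to one, and the equalized matrices are averaged over windows before the cross-correlations are transformed back to the time domain. This reading of the procedure, and both function names, are assumptions.

```python
# Sketch: one-bit normalization of the covariance matrix eigenspectrum.
import numpy as np

def equalize_eigenspectrum(cov, rel_tol=1e-8):
    """cov: stations x stations Hermitian covariance matrix at one frequency."""
    w, v = np.linalg.eigh(cov)
    keep = w > rel_tol * w.max()   # few subwindows -> rank-deficient estimate
    vk = v[:, keep]
    return vk @ vk.conj().T        # all retained eigenvalues set to one

def equalized_cross_spectra(windows):
    """windows: iterable of stations x subwindows Fourier-coefficient arrays."""
    mats = [equalize_eigenspectrum(x @ x.conj().T / x.shape[1]) for x in windows]
    return np.mean(mats, axis=0)   # averaged, equalized cross-spectra
```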
Origin and structures of solar eruptions II: Magnetic modeling
NASA Astrophysics Data System (ADS)
Guo, Yang; Cheng, Xin; Ding, MingDe
2017-07-01
The topology and dynamics of the three-dimensional magnetic field in the solar atmosphere govern various solar eruptive phenomena and activities, such as flares, coronal mass ejections, and filaments/prominences. We have to observe and model the vector magnetic field to understand the structures and physical mechanisms of these solar activities. Vector magnetic fields on the photosphere are routinely observed via the polarized light and inferred by the inversion of Stokes profiles. To analyze these vector magnetic fields, we first need to remove the 180° ambiguity of the transverse components and correct the projection effect. Then, after proper preprocessing, the vector magnetic field can serve as the boundary condition for force-free field modeling. The photospheric velocity field can also be derived from a time sequence of vector magnetic fields. The three-dimensional magnetic field can be derived and studied with theoretical force-free field models, numerical nonlinear force-free field models, magnetohydrostatic models, and magnetohydrodynamic models. Magnetic energy can be computed with three-dimensional magnetic field models or from a time series of vector magnetic fields. The magnetic topology is analyzed by pinpointing the positions of magnetic null points, bald patches, and quasi-separatrix layers. As a well-conserved physical quantity, magnetic helicity can be computed with various methods, such as the finite volume method, the discrete flux tube method, and the helicity flux integration method. This quantity serves as a promising parameter for characterizing the activity level of solar active regions.
NASA Astrophysics Data System (ADS)
Tanaka, S.; Hasegawa, K.; Okamoto, N.; Umegaki, R.; Wang, S.; Uemura, M.; Okamoto, A.; Koyamada, K.
2016-06-01
We propose a method for the precise 3D see-through imaging, or transparent visualization, of the large-scale and complex point clouds acquired via the laser scanning of 3D cultural heritage objects. Our method is based on a stochastic algorithm and directly uses the 3D points, which are acquired using a laser scanner, as the rendering primitives. This method achieves the correct depth feel without requiring depth sorting of the rendering primitives along the line of sight. Eliminating this need allows us to avoid long computation times when creating natural and precise 3D see-through views of laser-scanned cultural heritage objects. The opacity of each laser-scanned object is also flexibly controllable. For a laser-scanned point cloud consisting of more than 10^7 or 10^8 3D points, the pre-processing requires only a few minutes, and the rendering can be executed at interactive frame rates. Our method enables the creation of cumulative 3D see-through images of time-series laser-scanned data. It also offers the possibility of fused visualization for observing a laser-scanned object behind a transparent high-quality photographic image placed in the 3D scene. We demonstrate the effectiveness of our method by applying it to festival floats of high cultural value. These festival floats have complex outer and inner 3D structures and are suitable for see-through imaging.
NASA Astrophysics Data System (ADS)
Illing, Sebastian; Schuster, Mareike; Kadow, Christopher; Kröner, Igor; Richling, Andy; Grieger, Jens; Kruschke, Tim; Lang, Benjamin; Redl, Robert; Schartner, Thomas; Cubasch, Ulrich
2016-04-01
MiKlip is a project for medium-term climate prediction funded by the German Federal Ministry of Education and Research (BMBF); it aims to create a model system able to provide reliable decadal climate forecasts. During the first project phase of MiKlip, the sub-project INTEGRATION, located at Freie Universität Berlin, developed a framework for scientific infrastructures (FREVA). More information about FREVA can be found in EGU2016-13060. An instance of this framework is used as the Central Evaluation System (CES) of the MiKlip project. Throughout the first project phase, the various sub-projects developed over 25 analysis tools, so-called plugins, for the CES. The main focus of these plugins is the evaluation and verification of decadal climate prediction data, but most are not limited to this scope and target a wide range of scientific questions, ranging from preprocessing tools like the "LeadtimeSelector", which creates lead-time-dependent time series from decadal hindcast sets, through tracking tools like the "Zykpak" plugin, which can objectively locate and track mid-latitude cyclones, to plugins like "MurCSS" or "SPECS", which calculate deterministic and probabilistic skill metrics. We also integrated some analyses from the Model Evaluation Tools (MET) developed at NCAR. We will show the theoretical background, technical implementation strategies, and some interesting results of the evaluation of the MiKlip Prototype decadal prediction system for a selected set of these tools.
Fuzzy/Neural Software Estimates Costs of Rocket-Engine Tests
NASA Technical Reports Server (NTRS)
Douglas, Freddie; Bourgeois, Edit Kaminsky
2005-01-01
The Highly Accurate Cost Estimating Model (HACEM) is a software system for estimating the costs of testing rocket engines and components at Stennis Space Center. HACEM is built on a foundation of adaptive-network-based fuzzy inference systems (ANFIS), a hybrid software concept that combines the adaptive capabilities of neural networks with the ease of development and additional benefits of fuzzy-logic-based systems. In ANFIS, fuzzy inference systems are trained by use of neural networks. HACEM includes selectable subsystems that utilize various numbers and types of inputs, various numbers of fuzzy membership functions, and various input-preprocessing techniques. The inputs to HACEM are parameters of specific tests or series of tests. These parameters include the test type (component or engine test), the number and duration of tests, and the thrust level(s) (in the case of engine tests). The ANFIS in HACEM are trained by use of sets of these parameters, along with the costs of past tests. Thereafter, the user feeds HACEM a simple input text file that contains the parameters of a planned test or series of tests, selects the desired HACEM subsystem, and the subsystem processes the parameters into an estimate of cost(s).
NASA Astrophysics Data System (ADS)
Rishi, Rahul; Choudhary, Amit; Singh, Ravinder; Dhaka, Vijaypal Singh; Ahlawat, Savita; Rao, Mukta
2010-02-01
In this paper we propose a system for the classification of handwritten text. At a broad level, the system is composed of a preprocessing module, a supervised learning module, and a recognition module. The preprocessing module digitizes the documents and extracts features (tangent values) for each character. A radial basis function (RBF) network is used in the learning and recognition modules. The objective is to analyze and improve the performance of the Multi-Layer Perceptron (MLP) using RBF transfer functions instead of the logarithmic sigmoid function. The results of 35 experiments indicate that the feed-forward MLP performs accurately and consistently with RBF. With the changed weight-update mechanism and the feature-driven preprocessing module, the proposed system achieves good recognition performance.
Automatic Semantic Orientation of Adjectives for Indonesian Language Using PMI-IR and Clustering
NASA Astrophysics Data System (ADS)
Riyanti, Dewi; Arif Bijaksana, M.; Adiwijaya
2018-03-01
We present our work in the area of sentiment analysis for the Indonesian language, focusing on building automatic semantic orientation using available Indonesian resources. In this research we used an Indonesian corpus containing 9 million words from kompas.txt and tempo.txt, manually tagged and annotated with a part-of-speech tagset. We then constructed a dataset by taking all adjectives from the corpus and removing adjectives with no orientation; the resulting set contained 923 adjective words. The system includes several steps, such as text pre-processing and clustering. The text pre-processing aims to increase accuracy, and the clustering method classifies each word into the related sentiment, positive or negative. With improvements to the text preprocessing, an accuracy of 72% can be achieved.
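A minimal sketch of a PMI-IR-style orientation score follows: the orientation of an adjective is its pointwise mutual information with a positive seed minus its PMI with a negative seed, estimated from corpus co-occurrence counts. The seed words ("baik"/"buruk") and function names are illustrative placeholders, not the paper's exact choices.

```python
# Sketch: PMI-based semantic orientation from co-occurrence counts.
import math

def pmi(count_xy, count_x, count_y, n):
    # PMI(x, y) = log2( p(x, y) / (p(x) * p(y)) ), -inf if never co-occurring.
    return math.log2((count_xy * n) / (count_x * count_y)) if count_xy else float("-inf")

def orientation(word, cooc, counts, n, pos_seed="baik", neg_seed="buruk"):
    """cooc[(a, b)]: co-occurrence count; counts[a]: word count; n: corpus size."""
    return (pmi(cooc.get((word, pos_seed), 0), counts[word], counts[pos_seed], n)
            - pmi(cooc.get((word, neg_seed), 0), counts[word], counts[neg_seed], n))
```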
NASA Astrophysics Data System (ADS)
Zhang, Jianfeng; Zhu, Yan; Zhang, Xiaoping; Ye, Ming; Yang, Jinzhong
2018-06-01
Predicting water table depth over the long term in agricultural areas presents great challenges because these areas have complex and heterogeneous hydrogeological characteristics, boundary conditions, and human activities, and nonlinear interactions occur among these factors. Therefore, a new time series model based on Long Short-Term Memory (LSTM) was developed in this study as an alternative to computationally expensive physical models. The proposed model is composed of an LSTM layer with a fully connected layer on top of it, with a dropout method applied in the LSTM layer. In this study, the proposed model was applied and evaluated in five sub-areas of the Hetao Irrigation District in arid northwestern China using 14 years of data (2000-2013). The proposed model uses monthly water diversion, evaporation, precipitation, temperature, and time as input data to predict water table depth. A simple but effective standardization method was employed to pre-process the data and ensure they were on the same scale. The 14 years of data were separated into a training set (2000-2011) and a validation set (2012-2013). As expected, the proposed model achieves higher R2 scores (0.789-0.952) in water table depth prediction when compared with a traditional feed-forward neural network (FFNN), which reaches only relatively low R2 scores (0.004-0.495), showing that the proposed model can preserve and learn from previous information well. Furthermore, the validity of the dropout method and of the proposed model's architecture are discussed. The experiments show that the dropout method can prevent overfitting significantly. In addition, comparison between the R2 scores of the proposed model and a Double-LSTM model (R2 scores ranging from 0.170 to 0.864) further proves that the proposed model's architecture is reasonable and can contribute to a strong learning ability on time series data. Thus, the proposed model can serve as an alternative approach for predicting water table depth, especially in areas where hydrogeological data are difficult to obtain.
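A hedged Keras sketch of the described architecture is shown below: one LSTM layer with dropout, topped by a fully connected output layer. Layer sizes, the 12-month input window, and the five input drivers are illustrative assumptions; inputs are the standardized monthly series and the target is water table depth.

```python
# Sketch: LSTM-with-dropout regressor for water table depth.
import tensorflow as tf

def build_model(n_steps, n_features):
    return tf.keras.Sequential([
        tf.keras.Input(shape=(n_steps, n_features)),
        tf.keras.layers.LSTM(64, dropout=0.2),  # dropout applied in the LSTM layer
        tf.keras.layers.Dense(1),               # fully connected layer on top
    ])

model = build_model(n_steps=12, n_features=5)   # e.g., 12 months x 5 drivers
model.compile(optimizer="adam", loss="mse")
# model.fit(X_train, y_train, validation_data=(X_val, y_val), epochs=100)
```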
NASA Astrophysics Data System (ADS)
Kontoes, Charalampos; Papoutsis, Ioannis; Herekakis, Themistoklis; Michail, Dimitrios; Ieronymidi, Emmanuela
2013-04-01
Remote sensing tools for the accurate, robust and timely assessment of the damage inflicted by forest wildfires provide information of paramount importance to public environmental agencies and related stakeholders before, during and after the crisis. The Institute for Astronomy, Astrophysics, Space Applications and Remote Sensing of the National Observatory of Athens (IAASARS/NOA) has developed a fully automatic single- and/or multi-date processing chain that takes archived Landsat 4, 5 or 7 raw images as input and produces precise diachronic burnt area polygons and damage assessments over the Greek territory. The methodology consists of three fully automatic stages: (1) the pre-processing stage, where the metadata of the raw images are extracted, followed by the application of the LEDAPS software platform for calibration and mask production and of the Automated Precise Orthorectification Package, developed by NASA, for image geo-registration and orthorectification; (2) the core BSM (Burn Scar Mapping) processing stage, which incorporates a published classification algorithm based on a series of physical indexes, the application of two graph-based filters for noise removal, and the grouping of pixels classified as burnt into appropriate pixel clusters before conversion from raster to vector; and (3) the post-processing stage, where the products are thematically refined and enriched using auxiliary GIS layers (underlying land cover/use, administrative boundaries, etc.) and human logic/evidence to suppress false alarms and omission errors. The established processing chain has been successfully applied to the entire archive of Landsat imagery over Greece spanning 1984 to 2012, which has been collected and managed at IAASARS/NOA; a total of 415 full Landsat frames were processed in the study. These burn scar mapping products are generated for the first time at such a temporal and spatial extent and are ideal for use in further environmental time-series analyses, the production of statistical indexes (frequency, geographical distribution, and number of fires per prefecture), and applications including change detection and climate change models, urban planning, and correlation with man-made activities.
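The paper's exact index set is not given here; as a hedged illustration of index-based burn mapping from Landsat, the sketch below thresholds the differenced Normalized Burn Ratio (dNBR) between pre- and post-fire scenes. The threshold value is a commonly quoted moderate-severity cut-off, not the authors' parameter.

```python
# Sketch: burnt-area mask from pre/post-fire NIR and SWIR2 reflectance bands.
import numpy as np

def nbr(nir, swir2):
    # Normalized Burn Ratio; small epsilon avoids division by zero.
    return (nir - swir2) / (nir + swir2 + 1e-9)

def burn_mask(pre_nir, pre_swir2, post_nir, post_swir2, thresh=0.27):
    dnbr = nbr(pre_nir, pre_swir2) - nbr(post_nir, post_swir2)
    return dnbr > thresh                   # True where burn severity is indicated
```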
Influence of California-style black ripe olive processing on the formation of acrylamide.
Charoenprasert, Suthawan; Mitchell, Alyson
2014-08-27
Methods used in processing California-style black ripe olives generate acrylamide. California-style black ripe olives contain higher levels of acrylamide (409.67 ± 42.60-511.91 ± 34.08 μg kg(-1)) as compared to California-style green ripe olives (44.02 ± 3.55-105.79 ± 22.01 μg kg(-1)), Greek olives (<1.42 μg kg(-1)), and Spanish olives (not detected), indicating that the higher temperatures used to sterilize the California-style green ripe and black ripe olives are required for acrylamide formation. Preprocessing brine storage influenced the formation of acrylamide in a time-dependent manner. Acrylamide increased during the first 30 days of storage. Longer brine storage times (>30 days) result in lower acrylamide levels in the finished product. The presence of calcium ions in the preprocessing brining solution results in higher levels of acrylamide in finished products. Air oxidation during lye processing and the neutralization of olives prior to sterilization significantly increase the formation of acrylamide in the finished products. Conversely, lye-processing decreases the levels of acrylamide in the final product. These results indicate that specific steps in the California-style black ripe olive processing may be manipulated to mitigate the formation of acrylamide in finished products.
Research on strategy marine noise map based on i4ocean platform: Constructing flow and key approach
NASA Astrophysics Data System (ADS)
Huang, Baoxiang; Chen, Ge; Han, Yong
2016-02-01
The noise level in the marine environment has raised extensive concern in the scientific community. The research is carried out on the i4Ocean platform, following a workflow of ocean noise model integration; noise data extraction, processing, visualization, and interpretation; and ocean noise map construction and publishing. For the convenience of numerical computation, and based on the characteristics of the ocean noise field, a hybrid propagation model related to spatial location is suggested: the normal-mode K/I model is used for the far field and the ray-method CANARY model for the near field. Visualizing marine ambient noise data is critical to understanding and predicting marine noise for relevant decision making. The marine noise map can be constructed on a virtual ocean scene; the systematic marine noise visualization framework includes preprocessing, coordinate transformation and interpolation, and rendering. The simulation of ocean noise depends on a realistic sea surface, so the dynamic water simulation grid was improved with GPU fusion to achieve seamless combination with the visualization result of the ocean noise. Profile and spherical visualizations, covering both space and time dimensions, are also provided for the vertical field characteristics of ocean ambient noise. Finally, the marine noise map can be published with grid pre-processing and multistage cache technology to better serve the public.
Penny, Christian; Grothendick, Beau; Zhang, Lin; Borror, Connie M.; Barbano, Duane; Cornelius, Angela J.; Gilpin, Brent J.; Fagerquist, Clifton K.; Zaragoza, William J.; Jay-Russell, Michele T.; Lastovica, Albert J.; Ragimbeau, Catherine; Cauchie, Henry-Michel; Sandrin, Todd R.
2016-01-01
MALDI-TOF MS has been utilized as a reliable and rapid tool for microbial fingerprinting at the genus and species levels. Recently, there has been keen interest in using MALDI-TOF MS beyond the genus and species levels to rapidly identify antibiotic-resistant strains of bacteria. The purpose of this study was to enhance strain-level resolution for Campylobacter jejuni through the optimization of spectrum processing parameters using a series of designed experiments. A collection of 172 strains of C. jejuni was assembled from Luxembourg, New Zealand, North America, and South Africa, consisting of four groups of antibiotic-resistant isolates: (1) 65 strains resistant to cefoperazone, (2) 26 resistant to cefoperazone and beta-lactams, (3) 5 resistant to cefoperazone, beta-lactams, and tetracycline, and (4) 76 resistant to cefoperazone, teicoplanin, amphotericin B, and cephalothin. Initially, a model set of 16 strains of C. jejuni (three biological replicates and three technical replicates per isolate, yielding a total of 144 spectra) was subjected to each designed experiment to enhance detection of antibiotic resistance. The most optimal parameters were then applied to the larger collection of 172 isolates (two biological replicates and three technical replicates per isolate, yielding a total of 1,031 spectra). We observed an increase in antibiotic resistance detection whenever a curve-based similarity coefficient (Pearson or ranked Pearson) was applied rather than a peak-based one (Dice) and/or the optimized preprocessing parameters were applied. Increases in antimicrobial resistance detection were scored using the jackknife maximum similarity technique following cluster analysis. For the four groups of antibiotic-resistant isolates, the optimized preprocessing parameters increased detection by (1) 5%, (2) 9%, (3) 10%, and (4) 2%, respectively. A second categorization was created from the collection, consisting of 31 strains resistant to beta-lactams and 141 strains sensitive to beta-lactams. Applying the optimal preprocessing parameters, beta-lactam resistance detection was increased by 34%. These results suggest that spectrum processing parameters, which are rarely optimized or adjusted, affect the performance of MALDI-TOF MS-based detection of antibiotic resistance and can be fine-tuned to enhance screening performance. PMID:27303397
High-throughput imaging of heterogeneous cell organelles with an X-ray laser (CXIDB ID 25)
Hantke, Max, F.
2014-11-17
Preprocessed detector images that were used for the paper "High-throughput imaging of heterogeneous cell organelles with an X-ray laser". The CXI file contains the entire recorded data - including both hits and blanks. It also includes down-sampled images and LCLS machine parameters. Additionally, the Cheetah configuration file is attached that was used to create the pre-processed data.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhang, L; Fried, D; Fave, X
Purpose: To investigate how different image preprocessing techniques, their parameters, and different boundary handling techniques can augment the information content of features and improve their differentiating capability. Methods: Twenty-seven NSCLC patients with a solid tumor volume and no visually obvious necrotic regions in the simulation CT images were identified. Fourteen of these patients had a necrotic region visible in their pre-treatment PET images (necrosis group), and thirteen had no visible necrotic region in the pre-treatment PET images (non-necrosis group). We investigated how image preprocessing can impact the ability of radiomics image features extracted from the CT to differentiate between the two groups. The histogram in the necrosis group is expected to be more negatively skewed, and its uniformity lower. We therefore analyzed two first-order features, skewness and uniformity, on the image inside the GTV in the intensity range [-20 HU, 180 HU] under combinations of several image preprocessing techniques: (1) applying an isotropic Gaussian or anisotropic diffusion smoothing filter over a range of parameters (Gaussian smoothing: size=11, sigma=0:0.1:2.3; anisotropic smoothing: iteration=4, kappa=0:10:110); (2) applying a boundary-adapted Laplacian filter; and (3) applying an adaptive upper threshold for the intensity range. A two-tailed t-test was used to evaluate the capability of the CT features to differentiate pre-treatment PET necrosis. Results: Without any preprocessing, no differences in either skewness or uniformity were observed between the two groups. After applying appropriate Gaussian filters (sigma >= 1.3) or anisotropic filters (kappa >= 60) with the adaptive upper threshold, skewness was significantly more negative in the necrosis group (p<0.05). By applying boundary-adapted Laplacian filtering after appropriate Gaussian filters (0.5 <= sigma <= 1.1) or anisotropic filters (20 <= kappa <= 50), the uniformity was significantly lower in the necrosis group (p<0.05). Conclusion: Appropriate selection of image preprocessing techniques allows radiomics features to extract more useful information and thereby improve prediction models based on these features.
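A hedged sketch of the two first-order features on the smoothed, thresholded GTV voxels follows; the boundary-adapted Laplacian step and the adaptive upper threshold are omitted, and the bin count is an assumption.

```python
# Sketch: skewness and uniformity inside a GTV after Gaussian smoothing.
import numpy as np
from scipy import ndimage, stats

def features(ct, gtv_mask, sigma=1.3, lo=-20, hi=180, bins=64):
    smoothed = ndimage.gaussian_filter(ct.astype(float), sigma)
    vox = smoothed[gtv_mask]
    vox = vox[(vox >= lo) & (vox <= hi)]   # fixed intensity range [-20, 180] HU
    p, _ = np.histogram(vox, bins=bins)
    p = p / p.sum()
    uniformity = np.sum(p ** 2)            # energy of the grey-level histogram
    return stats.skew(vox), uniformity
```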
Real-time skin feature identification in a time-sequential video stream
NASA Astrophysics Data System (ADS)
Kramberger, Iztok
2005-04-01
Skin color can be an important feature when tracking skin-colored objects, particularly for computer-vision-based human-computer interfaces (HCI). Humans have a highly developed feeling for space; therefore, it is reasonable to support this within intelligent HCI, where the importance of augmented reality can be foreseen. Human-like interaction techniques within multimodal HCI could, or will, become a feature of modern mobile telecommunication devices. On the other hand, real-time processing plays an important role in achieving more natural and physically intuitive ways of human-machine interaction. The main scope of this work is the development of a stereoscopic computer-vision hardware-accelerated framework for real-time skin feature identification in the sense of a single-pass image segmentation process. The hardware-accelerated preprocessing stage performs color and spatial filtering, where the skin color model within the hue-saturation-value (HSV) color space is given as a polyhedron of threshold values representing the basis of the filter model. An adaptive filter management unit is suggested to achieve better segmentation results; it enables the adaptation of the filter parameters to the current scene conditions. The suggested hardware structure is implemented at the level of field programmable system-level integrated circuit (FPSLIC) devices, using an embedded microcontroller as their main feature. A stereoscopic cue is obtained using a time-sequential video stream, which makes no difference to the real-time processing requirements in terms of hardware complexity. Experimental results for the hardware-accelerated preprocessing stage are given by estimating the efficiency of the presented hardware structure using a simple motion-detection algorithm based on a binary function.
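A software analogue of the hardware HSV threshold filter is sketched below: in the axis-aligned special case, the polyhedron of thresholds reduces to an inRange test in HSV space, followed by a morphological opening as the spatial filtering step. The bounds are illustrative placeholders, not the paper's values.

```python
# Sketch: HSV skin-color segmentation with a simple spatial filter.
import cv2
import numpy as np

def skin_mask(bgr):
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
    lower = np.array([0, 40, 60], dtype=np.uint8)     # illustrative skin bounds
    upper = np.array([25, 180, 255], dtype=np.uint8)
    mask = cv2.inRange(hsv, lower, upper)
    # Spatial filtering step: remove small speckles with a morphological open.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    return cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
```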
Radar signal pre-processing to suppress surface bounce and multipath
Paglieroni, David W; Mast, Jeffrey E; Beer, N. Reginald
2013-12-31
A method and system for detecting the presence of subsurface objects within a medium is provided. In some embodiments, the imaging and detection system operates in a multistatic mode to collect radar return signals generated by an array of transceiver antenna pairs that is positioned across the surface and travels down the surface. The imaging and detection system pre-processes the return signals to suppress certain undesirable effects. The imaging and detection system then generates synthetic aperture radar images from real aperture radar images generated from the pre-processed return signals. The imaging and detection system then post-processes the synthetic aperture radar images to improve detection of subsurface objects. The imaging and detection system identifies peaks in the energy levels of the post-processed image frame, which indicate the presence of a subsurface object.
Sepehrband, Farshid; Choupan, Jeiran; Caruyer, Emmanuel; Kurniawan, Nyoman D; Gal, Yaniv; Tieng, Quang M; McMahon, Katie L; Vegh, Viktor; Reutens, David C; Yang, Zhengyi
2014-01-01
We describe and evaluate a pre-processing method based on a periodic spiral sampling of diffusion-gradient directions for high angular resolution diffusion magnetic resonance imaging. Our pre-processing method incorporates prior knowledge about the acquired diffusion-weighted signal, facilitating noise reduction. Periodic spiral sampling of gradient direction encodings results in an acquired signal in each voxel that is pseudo-periodic, with characteristics that allow separation of the low-frequency signal from high-frequency noise. Consequently, it enhances local reconstruction of the orientation distribution function used to define fiber tracks in the brain. Denoising with periodic spiral sampling was tested using synthetic data and in vivo human brain images. Both the signal-to-noise ratio and the accuracy of local fiber track reconstruction were significantly improved using our method.
Conductivity map from scanning tunneling potentiometry.
Zhang, Hao; Li, Xianqi; Chen, Yunmei; Durand, Corentin; Li, An-Ping; Zhang, X-G
2016-08-01
We present a novel method for extracting two-dimensional (2D) conductivity profiles from large electrochemical potential datasets acquired by scanning tunneling potentiometry of a 2D conductor. The method consists of a data preprocessing procedure to reduce/eliminate noise and a numerical conductivity reconstruction. The preprocessing procedure employs an inverse consistent image registration method to align the forward and backward scans of each image line, followed by a total variation (TV) based image restoration method to obtain a (nearly) noise-free potential from the aligned scans. The preprocessed potential is then used for numerical conductivity reconstruction, based on a TV model solved by an accelerated alternating direction method of multipliers. The method is demonstrated on a measurement of the grain boundary of monolayer graphene, yielding a nearly 10:1 ratio of grain boundary resistivity to bulk resistivity.
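As a rough stand-in for the TV-based restoration step (the paper solves a TV model with an accelerated ADMM solver; the readily available Chambolle solver below is a substitute, and the weight and synthetic data are illustrative):

```python
import numpy as np
from skimage.restoration import denoise_tv_chambolle

rng = np.random.default_rng(0)
potential = np.linspace(0.0, 1.0, 256)[None, :].repeat(256, axis=0)  # synthetic ramp
noisy = potential + 0.05 * rng.standard_normal(potential.shape)      # added noise
clean = denoise_tv_chambolle(noisy, weight=0.1)  # TV-regularized restoration
```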
Linguistic Preprocessing and Tagging for Problem Report Trend Analysis
NASA Technical Reports Server (NTRS)
Beil, Robert J.; Malin, Jane T.
2012-01-01
Mr. Robert Beil, Systems Engineer at Kennedy Space Center (KSC), requested the NASA Engineering and Safety Center (NESC) develop a prototype tool suite that combines complementary software technology used at Johnson Space Center (JSC) and KSC for problem report preprocessing and semantic tag extraction, to improve input to data mining and trend analysis. This document contains the outcome of the assessment and the Findings, Observations and NESC Recommendations.
Nanobioinformatics: Emerging Computational Tools to Understand Nano-Bio Interaction
2012-11-16
…followed for using animals for toxicity studies. The Organization for Economic Co-operation and Development (OECD) has set guidelines for toxicity studies in guideline number 420, which says that only dosages of 50-2000 mg/kg body weight… GSH, SOD, GSSH, MDA, ALK, ALT, LDH), cell lines. Preprocessing: after collection of data from the published articles, preprocessing of the data is…
Ford, Patrick; Santos, Eduardo; Ferrão, Paulo; Margarido, Fernanda; Van Vliet, Krystyn J; Olivetti, Elsa
2016-05-03
The challenges brought on by the increasing complexity of electronic products, and the criticality of the materials these devices contain, present an opportunity for maximizing the economic and societal benefits derived from recovery and recycling. Small appliances and computer devices (SACD), including mobile phones, contain significant amounts of precious metals including gold and platinum, the present value of which should serve as a key economic driver for many recycling decisions. However, a detailed analysis is required to estimate the economic value that is unrealized by incomplete recovery of these and other materials, and to ascertain how such value could be reinvested to improve recovery processes. We present a dynamic product flow analysis for SACD throughout Portugal, a European Union member, including annual data detailing product sales and industrial-scale preprocessing data for recovery of specific materials from devices. We employ preprocessing facility and metals pricing data to identify losses, and develop an economic framework around the value of recycling including uncertainty. We show that significant economic losses occur during preprocessing (over $70 M USD unrecovered in computers and mobile phones, 2006-2014) due to operations that fail to target high value materials, and characterize preprocessing operations according to material recovery and total costs.
Optimization of miRNA-seq data preprocessing.
Tam, Shirley; Tsao, Ming-Sound; McPherson, John D
2015-11-01
The past two decades of microRNA (miRNA) research has solidified the role of these small non-coding RNAs as key regulators of many biological processes and promising biomarkers for disease. The concurrent development in high-throughput profiling technology has further advanced our understanding of the impact of their dysregulation on a global scale. Currently, next-generation sequencing is the platform of choice for the discovery and quantification of miRNAs. Despite this, there is no clear consensus on how the data should be preprocessed before conducting downstream analyses. Often overlooked, data preprocessing is an essential step in data analysis: the presence of unreliable features and noise can affect the conclusions drawn from downstream analyses. Using a spike-in dilution study, we evaluated the effects of several general-purpose aligners (BWA, Bowtie, Bowtie 2 and Novoalign), and normalization methods (counts-per-million, total count scaling, upper quartile scaling, Trimmed Mean of M-values (TMM), DESeq, linear regression, cyclic loess and quantile) with respect to the final miRNA count data distribution, variance, bias and accuracy of differential expression analysis. We make practical recommendations on the optimal preprocessing methods for the extraction and interpretation of miRNA count data from small RNA-sequencing experiments. © The Author 2015. Published by Oxford University Press.
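Two of the simpler normalization methods compared above, sketched for a genes-by-samples count matrix; this is a generic illustration, not the authors' code:

```python
import numpy as np

def cpm(counts):
    """Counts-per-million: scale each sample (column) by its library size."""
    return counts / counts.sum(axis=0) * 1e6

def upper_quartile(counts):
    """Upper-quartile scaling: divide each sample by the 75th percentile
    of its nonzero counts."""
    uq = np.array([np.percentile(col[col > 0], 75) for col in counts.T])
    return counts / uq
```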
Vellmer, Sebastian; Tonoyan, Aram S; Suter, Dieter; Pronin, Igor N; Maximov, Ivan I
2018-02-01
Diffusion magnetic resonance imaging (dMRI) is a powerful tool in clinical applications, in particular, in oncology screening. dMRI demonstrated its benefit and efficiency in the localisation and detection of different types of human brain tumours. Clinical dMRI data suffer from multiple artefacts such as motion and eddy-current distortions, contamination by noise, outliers etc. In order to increase the image quality of the derived diffusion scalar metrics and the accuracy of the subsequent data analysis, various pre-processing approaches are actively developed and used. In the present work we assess the effect of different pre-processing procedures such as a noise correction, different smoothing algorithms and spatial interpolation of raw diffusion data, with respect to the accuracy of brain glioma differentiation. As a set of sensitive biomarkers of the glioma malignancy grades we chose the derived scalar metrics from diffusion and kurtosis tensor imaging as well as the neurite orientation dispersion and density imaging (NODDI) biophysical model. Our results show that the application of noise correction, anisotropic diffusion filtering, and cubic-order spline interpolation resulted in the highest sensitivity and specificity for glioma malignancy grading. Thus, these pre-processing steps are recommended for the statistical analysis in brain tumour studies. Copyright © 2017. Published by Elsevier GmbH.
Application of filtering techniques in preprocessing magnetic data
NASA Astrophysics Data System (ADS)
Liu, Haijun; Yi, Yongping; Yang, Hongxia; Hu, Guochuang; Liu, Guoming
2010-08-01
High precision magnetic exploration is a popular geophysical technique owing to its simplicity and effectiveness. Interpretation in high precision magnetic exploration is always difficult because of noise and disturbance factors, so it is necessary to find an effective preprocessing method to remove the effects of interference factors before further processing. The common way to do this is by filtering, and there are many kinds of filtering methods. In this paper we describe in detail three popular filtering techniques: the regularized filtering technique, the sliding-averages filtering technique, and the compensation smoothing filtering technique. We then designed the workflow of a filtering program based on these techniques and implemented it in DELPHI. To check it, we applied it to preprocess magnetic data from a site in China. Comparing the initial contour map with the filtered contour map, the effect of our program is clearly visible: the processed contour map is very smooth, and the high-frequency parts of the data have been removed. After filtering, we separated useful signals from noisy signals, minor anomalies from major anomalies, and local anomalies from regional anomalies, making it easy to focus on the useful information. Our program can be used to preprocess magnetic data, and the results showed its effectiveness.
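A minimal sketch of the sliding-averages technique mentioned above, assuming gridded magnetic data in a 2-D NumPy array; the window size is an illustrative choice. Subtracting the smoothed (regional) field from the raw grid leaves the local, high-frequency anomalies:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def sliding_average_separation(grid, size=5):
    """Return (regional, local) components of a gridded magnetic field."""
    regional = uniform_filter(grid.astype(float), size=size)  # moving average
    local = grid - regional                                   # residual anomaly
    return regional, local
```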
Myers, Owen D; Sumner, Susan J; Li, Shuzhao; Barnes, Stephen; Du, Xiuxia
2017-09-05
XCMS and MZmine 2 are two widely used software packages for preprocessing untargeted LC/MS metabolomics data. Both construct extracted ion chromatograms (EICs) and detect peaks from the EICs, the first two steps in the data preprocessing workflow. While both packages have performed admirably in peak picking, they also detect a problematic number of false positive EIC peaks and can also fail to detect real EIC peaks. The former and latter translate downstream into spurious and missing compounds and present significant limitations with most existing software packages that preprocess untargeted mass spectrometry metabolomics data. We seek to understand the specific reasons why XCMS and MZmine 2 find the false positive EIC peaks that they do and in what ways they fail to detect real compounds. We investigate differences of EIC construction methods in XCMS and MZmine 2 and find several problems in the XCMS centWave peak detection algorithm which we show are partly responsible for the false positive and false negative compound identifications. In addition, we find a problem with MZmine 2's use of centWave. We hope that a detailed understanding of the XCMS and MZmine 2 algorithms will allow users to work with them more effectively and will also help with future algorithmic development.
Chen, Ru-huang; Jin, Gang
2015-08-01
This paper presents an application of mid-infrared (MIR), near-infrared (NIR) and Raman spectroscopies for collecting the spectra of 31 low density polyethylene/polypropylene (LDPE/PP) samples with different proportions. Different pre-processing methods (multiplicative scatter correction, mean centering and Savitzky-Golay first derivative) and spectral regions were explored to develop partial least-squares (PLS) models for LDPE, and their influence on the accuracy of the PLS model is also discussed. The three spectroscopies were compared in terms of the accuracy of quantitative measurement. The pre-processing methods and spectral region have a great impact on the accuracy of the PLS model, especially for spectra with subtle differences, random noise and baseline variation. After pre-processing and spectral region selection, the calibration models of MIR, NIR and Raman exhibited R2/RMSEC values of 0.9906/2.941, 0.9973/1.561 and 0.9972/1.598 respectively, compared with 0.8876/10.15, 0.8493/11.75 and 0.8757/10.67 before any treatment. The results also suggest that MIR, NIR and Raman are three strong tools for predicting the content of LDPE in LDPE/PP blends. However, NIR and Raman showed higher accuracy after pre-processing and are more suitable for fast quantitative characterization due to their high measuring speed.
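A hedged sketch of the preprocessing-plus-PLS workflow this abstract describes, using generic implementations of multiplicative scatter correction and a Savitzky-Golay first derivative on synthetic spectra; the component count, window settings, and data are illustrative, not the paper's:

```python
import numpy as np
from scipy.signal import savgol_filter
from sklearn.cross_decomposition import PLSRegression

def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum."""
    ref = spectra.mean(axis=0)
    corrected = np.empty_like(spectra)
    for i, s in enumerate(spectra):
        slope, intercept = np.polyfit(ref, s, 1)   # fit each spectrum to the reference
        corrected[i] = (s - intercept) / slope
    return corrected

rng = np.random.default_rng(1)
X_raw = rng.random((31, 400))   # 31 synthetic "spectra" with 400 wavelength points
y = rng.random(31)              # synthetic LDPE contents
X = savgol_filter(msc(X_raw), window_length=11, polyorder=2, deriv=1, axis=1)
pls = PLSRegression(n_components=5).fit(X, y)
```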
Preprocessing and meta-classification for brain-computer interfaces.
Hammon, Paul S; de Sa, Virginia R
2007-03-01
A brain-computer interface (BCI) is a system which allows direct translation of brain states into actions, bypassing the usual muscular pathways. A BCI system works by extracting user brain signals, applying machine learning algorithms to classify the user's brain state, and performing a computer-controlled action. Our goal is to improve brain state classification. Perhaps the most obvious way to improve classification performance is the selection of an advanced learning algorithm. However, it is now well known in the BCI community that careful selection of preprocessing steps is crucial to the success of any classification scheme. Furthermore, recent work indicates that combining the output of multiple classifiers (meta-classification) leads to improved classification rates relative to single classifiers (Dornhege et al., 2004). In this paper, we develop an automated approach which systematically analyzes the relative contributions of different preprocessing and meta-classification approaches. We apply this procedure to three data sets drawn from BCI Competition 2003 (Blankertz et al., 2004) and BCI Competition III (Blankertz et al., 2006), each of which exhibit very different characteristics. Our final classification results compare favorably with those from past BCI competitions. Additionally, we analyze the relative contributions of individual preprocessing and meta-classification choices and discuss which types of BCI data benefit most from specific algorithms.
NASA Astrophysics Data System (ADS)
Ravnik, Domen; Jerman, Tim; Pernuš, Franjo; Likar, Boštjan; Špiclin, Žiga
2018-03-01
Performance of a convolutional neural network (CNN) based white-matter lesion segmentation in magnetic resonance (MR) brain images was evaluated under various conditions involving different levels of image preprocessing and augmentation and different compositions of the training dataset. On images of sixty multiple sclerosis patients, half acquired on one scanner and half on another scanner from a different vendor, we first created highly accurate multi-rater consensus-based lesion segmentations, which were used in several experiments to evaluate the CNN segmentation result. First, the CNN was trained and tested without preprocessing the images and by using various combinations of preprocessing techniques, namely histogram-based intensity standardization, normalization by whitening, and training dataset augmentation by flipping the images across the midsagittal plane. Then, the CNN was trained and tested on images of the same, different or interleaved scanner datasets using a cross-validation approach. The results indicate that image preprocessing has little impact on performance in a same-scanner situation, while between-scanner performance benefits most from intensity standardization and normalization, and further from incorporating heterogeneous multi-scanner datasets in the training phase. Under such conditions the between-scanner performance of the CNN approaches that of the ideal situation, when the CNN is trained and tested on the same scanner dataset.
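A minimal sketch of the two intensity preprocessing steps named above, for a single MR volume; whitening is standard z-scoring, while the histogram standardization shown here is a simple percentile remapping assumed for illustration (published landmark methods are more elaborate):

```python
import numpy as np

def whiten(volume, mask=None):
    """Zero-mean, unit-variance intensity normalization."""
    vox = volume[mask] if mask is not None else volume.ravel()
    return (volume - vox.mean()) / vox.std()

def standardize_histogram(volume, ref_range, percentiles=(1, 99)):
    """Linearly map the volume's percentile landmarks onto reference values."""
    lo, hi = np.percentile(volume, percentiles)
    ref_lo, ref_hi = ref_range
    return (volume - lo) / (hi - lo) * (ref_hi - ref_lo) + ref_lo
```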
User-friendly solutions for microarray quality control and pre-processing on ArrayAnalysis.org
Eijssen, Lars M. T.; Jaillard, Magali; Adriaens, Michiel E.; Gaj, Stan; de Groot, Philip J.; Müller, Michael; Evelo, Chris T.
2013-01-01
Quality control (QC) is crucial for any scientific method producing data. Applying adequate QC introduces new challenges in the genomics field where large amounts of data are produced with complex technologies. For DNA microarrays, specific algorithms for QC and pre-processing including normalization have been developed by the scientific community, especially for expression chips of the Affymetrix platform. Many of these have been implemented in the statistical scripting language R and are available from the Bioconductor repository. However, application is hampered by lack of integrative tools that can be used by users of any experience level. To fill this gap, we developed a freely available tool for QC and pre-processing of Affymetrix gene expression results, extending, integrating and harmonizing functionality of Bioconductor packages. The tool can be easily accessed through a wizard-like web portal at http://www.arrayanalysis.org or downloaded for local use in R. The portal provides extensive documentation, including user guides, interpretation help with real output illustrations and detailed technical documentation. It assists newcomers to the field in performing state-of-the-art QC and pre-processing while offering data analysts an integral open-source package. Providing the scientific community with this easily accessible tool will allow improving data quality and reuse and adoption of standards. PMID:23620278
Improved mechanical properties of retorted carrots by ultrasonic pre-treatments.
Day, Li; Xu, Mi; Øiseth, Sofia K; Mawson, Raymond
2012-05-01
The use of ultrasound pre-processing treatment, compared to blanching, to enhance the mechanical properties of non-starchy cell wall materials was investigated using carrot as an example. The mechanical properties of carrot tissues were measured by compression and tensile testing after the pre-processing treatment, prior to and after retorting. Carrot samples ultrasound treated for 10 min at 60 °C provided a higher mechanical strength (P<0.05) to the cell wall structure than blanching for the same time period. With the addition of 0.5% CaCl(2) to the pre-treatment solution, both blanching and ultrasound treatment showed a synergistic effect in enhancing the mechanical properties of retorted carrot pieces. At a relatively short treatment time (10 min at 60 °C) with the use of 0.5% CaCl(2), ultrasound treatment achieved an enhancement of the mechanical strength of retorted carrots similar to that of blanching for a much longer time period (i.e. 40 min). The mechanism involved appears to be related to the stress responses present in all living plant matter. However, there is a need to clarify the relative importance of the potential stress mechanisms in order to get a better understanding of the processing conditions likely to be most effective. The amount of ultrasound treatment required is likely to involve low treatment intensities, and there are indications from the structural characterisation and mechanical property analyses that the plant cell wall tissues were more elastic than those produced using low temperature long time blanching. Crown Copyright © 2011. Published by Elsevier B.V. All rights reserved.
Software for Preprocessing Data from Rocket-Engine Tests
NASA Technical Reports Server (NTRS)
Cheng, Chiu-Fu
2004-01-01
Three computer programs have been written to preprocess digitized outputs of sensors during rocket-engine tests at Stennis Space Center (SSC). The programs apply exclusively to the SSC E test-stand complex and utilize the SSC file format. The programs are the following: Engineering Units Generator (EUGEN) converts sensor-output-measurement data to engineering units. The inputs to EUGEN are raw binary test-data files, which include the voltage data, a list identifying the data channels, and time codes. EUGEN effects conversion by use of a file that contains calibration coefficients for each channel. QUICKLOOK enables immediate viewing of a few selected channels of data, in contradistinction to viewing only after post-test processing (which can take 30 minutes to several hours depending on the number of channels and other test parameters) of data from all channels. QUICKLOOK converts the selected data into a form in which they can be plotted in engineering units by use of Winplot (a free graphing program written by Rick Paris). EUPLOT provides a quick means for looking at data files generated by EUGEN without the necessity of relying on the PV-WAVE based plotting software.
Pre-processing of data coming from a laser-EMAT system for non-destructive testing of steel slabs.
Sgarbi, Mirko; Colla, Valentina; Cateni, Sivia; Higson, Stuart
2012-01-01
Non-destructive test systems are increasingly applied in the industrial context for their strong potential to improve and standardize quality control. Especially in the intermediate manufacturing stages, early detection of defects on semi-finished products allows them to be directed towards later production processes according to their quality, with consequent considerable savings in time, energy, materials and work. However, the raw data coming from non-destructive test systems are not always immediately suitable for sophisticated defect detection algorithms, due to noise and disturbances which are unavoidable, especially in harsh operating conditions such as those typical of the steelmaking cycle. The paper describes some pre-processing operations which are required in order to exploit the data coming from a non-destructive test system. Such a system is based on the joint exploitation of Laser and Electro-Magnetic Acoustic Transducer technologies and is applied to the detection of surface and sub-surface cracks in cold and hot steel slabs. Copyright © 2011 ISA. Published by Elsevier Ltd. All rights reserved.
Effect of shaping sensor data on pilot response
NASA Technical Reports Server (NTRS)
Bailey, Roger M.
1990-01-01
The pilot of a modern jet aircraft is subjected to varying workloads while being responsible for multiple, ongoing tasks. The ability to associate the pilot's responses with the task/situation, by modifying the way information is presented relative to the task, could provide a means of reducing workload. To examine the feasibility of this concept, a real-time simulation study was undertaken to determine whether preprocessing of sensor data would affect pilot response. Results indicated that preprocessing could be an effective way to tailor the pilot's response to displayed data. The effects of three transformations, or shaping functions, were evaluated with respect to the pilot's ability to predict and detect out-of-tolerance conditions while monitoring an electronic engine display. Two nonlinear transformations, one being the inverse of the other, were compared to a linear transformation. Results indicate that a nonlinear transformation that increases the rate-of-change of output relative to input tends to advance the prediction response and improve the detection response, while a nonlinear transformation that decreases the rate-of-change of output relative to input tends to lengthen the prediction response and make detection more difficult.
Biomass Supply Logistics and Infrastructure
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sokhansanj, Shahabaddine
2009-04-01
Feedstock supply system encompasses numerous unit operations necessary to move lignocellulosic feedstock from the place where it is produced (in the field or on the stump) to the start of the conversion process (reactor throat) of the Biorefinery. These unit operations, which include collection, storage, preprocessing, handling, and transportation, represent one of the largest technical and logistics challenges to the emerging lignocellulosic biorefining industry. This chapter briefly reviews methods of estimating the quantities of biomass followed by harvesting and collection processes based on current practices on handling wet and dry forage materials. Storage and queuing are used to deal with seasonal harvest times, variable yields, and delivery schedules. Preprocessing can be as simple as grinding and formatting the biomass for increased bulk density or improved conversion efficiency, or it can be as complex as improving feedstock quality through fractionation, tissue separation, drying, blending, and densification. Handling and Transportation consists of using a variety of transport equipment (truck, train, ship) for moving the biomass from one point to another. The chapter also provides typical cost figures for harvest and processing of biomass.
Biomass supply logistics and infrastructure.
Sokhansanj, Shahabaddine; Hess, J Richard
2009-01-01
Feedstock supply system encompasses numerous unit operations necessary to move lignocellulosic feedstock from the place where it is produced (in the field or on the stump) to the start of the conversion process (reactor throat) of the biorefinery. These unit operations, which include collection, storage, preprocessing, handling, and transportation, represent one of the largest technical and logistics challenges to the emerging lignocellulosic biorefining industry. This chapter briefly reviews the methods of estimating the quantities of biomass, followed by harvesting and collection processes based on current practices on handling wet and dry forage materials. Storage and queuing are used to deal with seasonal harvest times, variable yields, and delivery schedules. Preprocessing can be as simple as grinding and formatting the biomass for increased bulk density or improved conversion efficiency, or it can be as complex as improving feedstock quality through fractionation, tissue separation, drying, blending, and densification. Handling and transportation consists of using a variety of transport equipment (truck, train, ship) for moving the biomass from one point to another. The chapter also provides typical cost figures for harvest and processing of biomass.
Software for Preprocessing Data From Rocket-Engine Tests
NASA Technical Reports Server (NTRS)
Cheng, Chiu-Fu
2003-01-01
Three computer programs have been written to preprocess digitized outputs of sensors during rocket-engine tests at Stennis Space Center (SSC). The programs apply exclusively to the SSC E test-stand complex and utilize the SSC file format. The programs are the following: (1) Engineering Units Generator (EUGEN) converts sensor-output-measurement data to engineering units. The inputs to EUGEN are raw binary test-data files, which include the voltage data, a list identifying the data channels, and time codes. EUGEN effects conversion by use of a file that contains calibration coefficients for each channel. (2) QUICKLOOK enables immediate viewing of a few selected channels of data, in contradistinction to viewing only after post-test processing (which can take 30 minutes to several hours depending on the number of channels and other test parameters) of data from all channels. QUICKLOOK converts the selected data into a form in which they can be plotted in engineering units by use of Winplot. (3) EUPLOT provides a quick means for looking at data files generated by EUGEN without the necessity of relying on the PVWAVE based plotting software.
Resting-state functional magnetic resonance imaging: the impact of regression analysis.
Yeh, Chia-Jung; Tseng, Yu-Sheng; Lin, Yi-Ru; Tsai, Shang-Yueh; Huang, Teng-Yi
2015-01-01
To investigate the impact of regression methods on resting-state functional magnetic resonance imaging (rsfMRI). During rsfMRI preprocessing, regression analysis is considered effective for reducing the interference of physiological noise on the signal time course. However, it is unclear whether the regression method benefits rsfMRI analysis. Twenty volunteers (10 men and 10 women; aged 23.4 ± 1.5 years) participated in the experiments. We used node analysis and functional connectivity mapping to assess the brain default mode network by using five combinations of regression methods. The results show that regressing the global mean plays a major role in the preprocessing steps. When a global regression method is applied, the values of functional connectivity are significantly lower (P ≤ .01) than those calculated without a global regression. This step increases inter-subject variation and produces anticorrelated brain areas. rsfMRI data processed using regression should be interpreted carefully. The significance of the anticorrelated brain areas produced by global signal removal is unclear. Copyright © 2014 by the American Society of Neuroimaging.
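A minimal sketch of global signal regression as discussed above, assuming a time-by-voxel matrix; this illustrates the step whose interpretive pitfalls (anticorrelated brain areas) the study highlights:

```python
import numpy as np

def regress_global_signal(ts):
    """Remove the global mean time course from every voxel.
    ts: array of shape (n_timepoints, n_voxels)."""
    g = ts.mean(axis=1, keepdims=True)          # global mean signal
    X = np.hstack([np.ones_like(g), g])         # intercept + global regressor
    beta, *_ = np.linalg.lstsq(X, ts, rcond=None)
    return ts - X @ beta                        # residual time courses
```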
Metabolic profiling of body fluids and multivariate data analysis.
Trezzi, Jean-Pierre; Jäger, Christian; Galozzi, Sara; Barkovits, Katalin; Marcus, Katrin; Mollenhauer, Brit; Hiller, Karsten
2017-01-01
Metabolome analyses of body fluids are challenging due to pre-analytical variations, such as pre-processing delay and temperature, and the constant dynamic changes of biochemical processes within the samples. Therefore, proper sample handling, from the time of collection up to the analysis, is crucial to obtain high quality samples and reproducible results. A metabolomics analysis is divided into 4 main steps: 1) Sample collection, 2) Metabolite extraction, 3) Data acquisition and 4) Data analysis. Here, we describe a protocol for gas chromatography coupled to mass spectrometry (GC-MS) based metabolic analysis of biological matrices, especially body fluids. This protocol can be applied to blood serum/plasma, saliva and cerebrospinal fluid (CSF) samples of humans and other vertebrates. It covers sample collection, sample pre-processing, metabolite extraction, GC-MS measurement and guidelines for the subsequent data analysis. Advantages of this protocol include: • Robust and reproducible metabolomics results, taking into account pre-analytical variations that may occur during the sampling process • Small sample volume required • Rapid and cost-effective processing of biological samples • Logistic regression based determination of biomarker signatures for in-depth data analysis.
Thermal-to-visible face recognition using partial least squares.
Hu, Shuowen; Choi, Jonghyun; Chan, Alex L; Schwartz, William Robson
2015-03-01
Although visible face recognition has been an active area of research for several decades, cross-modal face recognition has only been explored by the biometrics community relatively recently. Thermal-to-visible face recognition is one of the most difficult cross-modal face recognition challenges, because of the difference in phenomenology between the thermal and visible imaging modalities. We address the cross-modal recognition problem using a partial least squares (PLS) regression-based approach consisting of preprocessing, feature extraction, and PLS model building. The preprocessing and feature extraction stages are designed to reduce the modality gap between the thermal and visible facial signatures, and facilitate the subsequent one-vs-all PLS-based model building. We incorporate multi-modal information into the PLS model building stage to enhance cross-modal recognition. The performance of the proposed recognition algorithm is evaluated on three challenging datasets containing visible and thermal imagery acquired under different experimental scenarios: time-lapse, physical tasks, mental tasks, and subject-to-camera range. These scenarios represent difficult challenges relevant to real-world applications. We demonstrate that the proposed method performs robustly for the examined scenarios.
Rapid Fuel Quality Surveillance Through Chemometric Modeling of Near-Infrared Spectra
2009-01-01
…measurements also have a first order advantage and are not time-dependent as is the case for chromatography. Thus, the data preprocessing requirements, while… due in part to the nature of hydrocarbon fuels, which imposes significant technical challenges that must be overcome, and in many cases, traditional… properties. The statistical significance of some other fuel properties is given in Table 2. Note also that in those cases where the property models…
Method for measuring anterior chamber volume by image analysis
NASA Astrophysics Data System (ADS)
Zhai, Gaoshou; Zhang, Junhong; Wang, Ruichang; Wang, Bingsong; Wang, Ningli
2007-12-01
Anterior chamber volume (ACV) is very important for an oculist making a rational pathological diagnosis for patients with eye diseases such as glaucoma, yet it is difficult to measure accurately. In this paper, a method is devised to measure anterior chamber volumes based on JPEG-formatted image files that have been transformed from medical images using the anterior-chamber optical coherence tomographer (AC-OCT) and corresponding image-processing software. The corresponding algorithms for image analysis and ACV calculation are implemented in VC++, and a series of anterior chamber images of typical patients are analyzed; the calculated anterior chamber volumes are verified to be in accord with clinical observation. This shows that the measurement method is effective and feasible and has the potential to improve the accuracy of ACV calculation. Meanwhile, some measures should be taken to simplify the manual preprocessing of the images.
Building Structured Personal Health Records from Photographs of Printed Medical Records.
Li, Xiang; Hu, Gang; Teng, Xiaofei; Xie, Guotong
2015-01-01
Personal health records (PHRs) provide patient-centric healthcare by making health records accessible to patients. In China, it is very difficult for individuals to access electronic health records. Instead, individuals can easily obtain the printed copies of their own medical records, such as prescriptions and lab test reports, from hospitals. In this paper, we propose a practical approach to extract structured data from printed medical records photographed by mobile phones. An optical character recognition (OCR) pipeline is performed to recognize text in a document photo, which addresses the problems of low image quality and content complexity by image pre-processing and multiple OCR engine synthesis. A series of annotation algorithms that support flexible layouts are then used to identify the document type, entities of interest, and entity correlations, from which a structured PHR document is built. The proposed approach was applied to real world medical records to demonstrate the effectiveness and applicability.
Building Structured Personal Health Records from Photographs of Printed Medical Records
Li, Xiang; Hu, Gang; Teng, Xiaofei; Xie, Guotong
2015-01-01
Personal health records (PHRs) provide patient-centric healthcare by making health records accessible to patients. In China, it is very difficult for individuals to access electronic health records. Instead, individuals can easily obtain the printed copies of their own medical records, such as prescriptions and lab test reports, from hospitals. In this paper, we propose a practical approach to extract structured data from printed medical records photographed by mobile phones. An optical character recognition (OCR) pipeline is performed to recognize text in a document photo, which addresses the problems of low image quality and content complexity by image pre-processing and multiple OCR engine synthesis. A series of annotation algorithms that support flexible layouts are then used to identify the document type, entities of interest, and entity correlations, from which a structured PHR document is built. The proposed approach was applied to real world medical records to demonstrate the effectiveness and applicability. PMID:26958219
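An illustrative pre-processing plus OCR step in the spirit of the pipeline described above, using OpenCV and pytesseract rather than the authors' engines; the file name and threshold parameters are placeholders, and the paper synthesizes multiple OCR engines where a single one is shown here:

```python
import cv2
import pytesseract

img = cv2.imread("record_photo.jpg", cv2.IMREAD_GRAYSCALE)
img = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                            cv2.THRESH_BINARY, 31, 10)  # binarize uneven lighting
text = pytesseract.image_to_string(img)  # recognized text for later annotation
```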
Segmentation of kidney using C-V model and anatomy priors
NASA Astrophysics Data System (ADS)
Lu, Jinghua; Chen, Jie; Zhang, Juan; Yang, Wenjia
2007-12-01
This paper presents an approach for kidney segmentation on abdominal CT images as the first step of a virtual reality surgery system. Segmentation of medical images is often challenging because of the objects' complicated anatomical structures, varying gray levels, and unclear edges. A coarse-to-fine approach has been applied to kidney segmentation using the Chan-Vese model (C-V model) and anatomical prior knowledge. In the pre-processing stage, candidate kidney regions are located. Then the C-V model formulated by the level set method is applied to these smaller ROIs, which reduces the computational complexity to a certain extent. Finally, after some mathematical morphology procedures, the specified kidney structures are extracted interactively with prior knowledge. The satisfying results on abdominal CT series show that the proposed approach keeps all the advantages of the C-V model and overcomes its disadvantages.
Teglia, Carla M; Azcarate, Silvana M; Alcaráz, Mirta R; Goicoechea, Héctor C; Culzoni, María J
2018-08-15
A low-level data fusion strategy was developed and implemented for data processing of second-order liquid chromatographic data with dual detection, i.e. absorbance and fluorescence monitoring. The synergistic effect of coupling individual information provided by two different detectors was evaluated by analyzing the results gathered after the application of a series of data preprocessing steps and chemometric resolution. The chemometric modeling involved data analysis by MCR-ALS, PARAFAC and N-PLS. Their ability to handle the new data block was assessed through the estimation of the analytical figures of merits achieved in the prediction of a validation set containing fifteen fluorescent and non-fluorescent veterinary active ingredients that can be found in poultry litter. Eventually, the feasibility of the application of the fusion strategy to real poultry litter samples containing the studied compounds was verified. Copyright © 2018 Elsevier B.V. All rights reserved.
Mainardi, L T; Pattini, L; Cerutti, S
2007-01-01
A novel method is presented for the investigation of protein properties of sequences using the Ramanujan Fourier Transform (RFT). The new methodology involves preprocessing the protein sequence data by numerically encoding it and then applying the RFT. The RFT is based on projecting the obtained numerical series onto a set of basis functions constituted by Ramanujan sums (RS). In RS components, periodicities of finite integer length, rather than frequencies (as in classical harmonic analysis), are considered. The potential of the new approach is documented by a few examples in the analysis of hydrophobic profiles of proteins in two classes characterized by an abundance of alpha-helices (group A) or beta-strands (group B). Different patterns are provided as evidence. RFT can be used to characterize the structural properties of proteins and integrate complementary information provided by other signal processing transforms.
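A direct, unoptimized sketch of Ramanujan sums and the projection step; the 1/(N·φ(q)) normalization follows one common RFT convention and may differ from the authors':

```python
from math import gcd, cos, pi
import numpy as np

def ramanujan_sum(q, n):
    """c_q(n) = sum of cos(2*pi*a*n/q) over a in [1, q] coprime to q."""
    return sum(cos(2 * pi * a * n / q) for a in range(1, q + 1) if gcd(a, q) == 1)

def rft(signal, q_max):
    """Project a numerically encoded sequence onto Ramanujan sums c_q."""
    N = len(signal)
    coeffs = []
    for q in range(1, q_max + 1):
        phi = sum(1 for a in range(1, q + 1) if gcd(a, q) == 1)  # Euler totient
        inner = sum(signal[n - 1] * ramanujan_sum(q, n) for n in range(1, N + 1))
        coeffs.append(inner / (N * phi))
    return np.array(coeffs)
```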
NASA Astrophysics Data System (ADS)
Thiebaut, C.; Perraud, L.; Delvit, J. M.; Latry, C.
2016-07-01
We present an on-board satellite implementation of a gradient-based (optical flow) algorithm for estimating the shifts between images of a Shack-Hartmann wave-front sensor on extended landscapes. The proposed algorithm has low complexity in comparison with classical correlation methods, which is a major advantage for use on-board a satellite at high instrument data rates and in real time. The electronic board used for this implementation is designed for space applications and is composed of radiation-hardened software and hardware. The processing times of both the shift estimation and the pre-processing steps are compatible with on-board real-time computation.
Method for discovering relationships in data by dynamic quantum clustering
Weinstein, Marvin; Horn, David
2017-05-09
Data clustering is provided according to a dynamical framework based on quantum mechanical time evolution of states corresponding to data points. To expedite computations, we can approximate the time-dependent Hamiltonian formalism by a truncated calculation within a set of Gaussian wave-functions (coherent states) centered around the original points. This allows for analytic evaluation of the time evolution of all such states, opening up the possibility of exploration of relationships among data-points through observation of varying dynamical-distances among points and convergence of points into clusters. This formalism may be further supplemented by preprocessing, such as dimensional reduction through singular value decomposition and/or feature filtering.
Method for discovering relationships in data by dynamic quantum clustering
Weinstein, Marvin; Horn, David
2014-10-28
Data clustering is provided according to a dynamical framework based on quantum mechanical time evolution of states corresponding to data points. To expedite computations, we can approximate the time-dependent Hamiltonian formalism by a truncated calculation within a set of Gaussian wave-functions (coherent states) centered around the original points. This allows for analytic evaluation of the time evolution of all such states, opening up the possibility of exploration of relationships among data-points through observation of varying dynamical-distances among points and convergence of points into clusters. This formalism may be further supplemented by preprocessing, such as dimensional reduction through singular value decomposition and/or feature filtering.
NASA Astrophysics Data System (ADS)
Belabbassi, L.; Garzio, L. M.; Smith, M. J.; Knuth, F.; Vardaro, M.; Kerfoot, J.
2016-02-01
The Ocean Observatories Initiative (OOI), funded by the National Science Foundation, provides users with access to long-term datasets from a variety of deployed oceanographic sensors. The Pioneer Array in the Atlantic Ocean off the Coast of New England hosts 10 moorings and 6 gliders. Each mooring is outfitted with 6 to 19 different instruments telemetering more than 1000 data streams. These data are available to science users to collaborate on common scientific goals such as water quality monitoring and scale variability measures of continental shelf processes and coastal open ocean exchanges. To serve this purpose, the acquired datasets undergo an iterative multi-step quality assurance and quality control procedure automated to work with all types of data. Data processing involves several stages, including a fundamental pre-processing step when the data are prepared for processing. This takes a considerable amount of processing time and is often not given enough thought in development initiatives. The volume and complexity of OOI data necessitates the development of a systematic diagnostic tool to enable the management of a comprehensive data information system for the OOI arrays. We present two examples to demonstrate the current OOI pre-processing diagnostic tool. First, Data Filtering is used to identify incomplete, incorrect, or irrelevant parts of the data and then replaces, modifies or deletes the coarse data. This provides data consistency with similar datasets in the system. Second, Data Normalization occurs when the database is organized in fields and tables to minimize redundancy and dependency. At the end of this step, the data are stored in one place to reduce the risk of data inconsistency and promote easy and efficient mapping to the database.
Boon, K H; Khalil-Hani, M; Malarvili, M B
2018-01-01
This paper presents a method that is able to predict paroxysmal atrial fibrillation (PAF). The method uses shorter heart rate variability (HRV) signals than existing methods and achieves good prediction accuracy. PAF is a common cardiac arrhythmia that increases the health risk of a patient, and the development of an accurate predictor of the onset of PAF is clinically important because it increases the possibility of electrically stabilizing and preventing the onset of atrial arrhythmias with different pacing techniques. We propose a multi-objective optimization algorithm based on the non-dominated sorting genetic algorithm III for optimizing the baseline PAF prediction system, which consists of the stages of pre-processing, HRV feature extraction, and a support vector machine (SVM) model. The pre-processing stage comprises heart rate correction, interpolation, and signal detrending. Time-domain, frequency-domain and non-linear HRV features are then extracted from the pre-processed data in the feature extraction stage. These features are used as input to the SVM for predicting the PAF event. The proposed optimization algorithm is used to optimize the parameters and settings of the various HRV feature extraction algorithms, select the best feature subsets, and tune the SVM parameters simultaneously for maximum prediction performance. The proposed method achieves an accuracy rate of 87.7%, which significantly outperforms most previous works. This accuracy is achieved even with the HRV signal length reduced from the typical 30 min to just 5 min (a reduction of 83%). Furthermore, another significant result is that the sensitivity rate, which is considered more important than the other performance metrics in this paper, can be improved with a trade-off of lower specificity. Copyright © 2017 Elsevier B.V. All rights reserved.
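A minimal sketch of the pre-processing stage described above (interpolation of an RR-interval series to a uniform grid plus detrending); the sampling rate and spline choice are illustrative:

```python
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.signal import detrend

def preprocess_rr(rr_ms, fs=4.0):
    """Resample an RR-interval series (in ms) to a uniform time grid by cubic
    spline interpolation, then remove the linear trend."""
    t = np.cumsum(rr_ms) / 1000.0              # beat occurrence times (s)
    grid = np.arange(t[0], t[-1], 1.0 / fs)    # uniform sampling instants
    hrv = CubicSpline(t, rr_ms)(grid)
    return grid, detrend(hrv)
```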
NASA Astrophysics Data System (ADS)
Rose, Jake; Martin, Michael; Bourlai, Thirimachos
2014-06-01
In law enforcement and security applications, the acquisition of face images is critical in producing key trace evidence for the successful identification of potential threats. The goal of this study is to demonstrate that steroid usage significantly affects human facial appearance and, hence, the performance of commercial and academic face recognition (FR) algorithms. In this work, we evaluate the performance of state-of-the-art FR algorithms on two unique face image datasets of subjects before (gallery set) and after (probe set) steroid (or human growth hormone) usage. For the purpose of this study, datasets of 73 subjects were created from multiple sources found on the Internet, containing images of men and women before and after steroid usage. Next, we geometrically pre-processed all images of both face datasets. Then, we applied image restoration techniques to the same face datasets, and finally, we applied FR algorithms to match the pre-processed face images of our probe datasets against the face images of the gallery set. Experimental results demonstrate that only a specific set of FR algorithms obtains the most accurate results (in terms of the rank-1 identification rate). This is because several factors influence the efficiency of face matchers, including (i) the time lapse between the before and after face photos, even after image pre-processing and restoration, (ii) the usage of different drugs (e.g. Dianabol, Winstrol, and Decabolan), (iii) the usage of different cameras to capture face images, and finally, (iv) the variability of standoff distance, illumination and other noise factors (e.g. motion noise). All of these complicated scenarios make clear that cross-scenario matching is a very challenging problem and, thus, further investigation is required.
NASA Astrophysics Data System (ADS)
Gong, W.; Meyer, F. J.
2013-12-01
It is well known that spatio-temporal tropospheric phase signatures complicate the interpretation and detection of smaller-magnitude deformation signals or unstudied motion fields. Several advanced time-series InSAR techniques were developed in the last decade that make assumptions about the stochastic properties of the signal components in interferometric phases to reduce atmospheric delay effects on surface deformation estimates. However, their need for large datasets to successfully separate the different phase contributions limits their performance if data is scarce and irregularly sampled. Limited SAR data coverage is true for many areas affected by geophysical deformation, due either to their low priority in mission programming, unfavorable ground coverage conditions, or turbulent seasonal weather effects. In this paper, we present new adaptive atmospheric phase filtering algorithms that are specifically designed to reconstruct surface deformation signals from atmosphere-affected and irregularly sampled InSAR time series. The filters take advantage of auxiliary atmospheric delay information that is extracted from various sources, e.g. atmospheric weather models. They are embedded into a model-free Persistent Scatterer Interferometry (PSI) approach that was selected to accommodate non-linear deformation patterns that are often observed near volcanoes and earthquake zones. Two types of adaptive phase filters were developed that operate in the time dimension and separate atmosphere from deformation based on their different temporal correlation properties. Both filter types use the fact that atmospheric models can reliably predict the spatial statistics and signal power of atmospheric phase delay fields in order to automatically optimize the filters' shape parameters. In essence, both filter types attempt to maximize the linear correlation between a-priori and extracted atmospheric phase information. Topography-related phase components, orbit errors and the master atmospheric delays are first removed in a pre-processing step before the atmospheric filters are applied. The first adaptive filter type uses a filter kernel of Gaussian shape and adaptively adjusts the width (defined in days) of this filter until the correlation of extracted and modeled atmospheric signal power is maximized. If atmospheric properties vary along the time series, this approach leads to filter settings that are adapted to best reproduce the atmospheric conditions at a given observation epoch. Despite the superior performance of this first filter design, its Gaussian shape imposes non-physical relative weights onto acquisitions and ignores the known atmospheric noise in the data. Hence, in our second approach we use atmospheric a-priori information to adaptively define the full shape of the atmospheric filter. For this process, we use a so-called normalized convolution (NC) approach that is often used in image reconstruction. Several NC designs are presented in this paper and studied for relative performance. A cross-validation of all developed algorithms was done using both synthetic and real data. This validation showed that the designed filters outperform conventional filter methods and are particularly useful for regions with limited data coverage or lacking a prior deformation field.
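A toy version of the first (Gaussian) filter's width selection, assuming a regularly resampled phase time series and a modeled atmospheric series; in the paper the data are irregularly sampled and the correlation is computed against modeled signal power, so this is only a schematic:

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

def select_gaussian_width(phase_ts, atmo_model_ts, widths):
    """Choose the temporal Gaussian width whose high-pass residual best
    correlates with the modeled atmospheric phase."""
    best_w, best_r = None, -np.inf
    for w in widths:
        deform = gaussian_filter1d(phase_ts, w)      # low-pass ~ deformation
        atmo = phase_ts - deform                     # residual ~ atmosphere
        r = np.corrcoef(atmo, atmo_model_ts)[0, 1]
        if r > best_r:
            best_w, best_r = w, r
    return best_w, best_r
```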
Applying Knowledge Discovery in Databases in Public Health Data Set: Challenges and Concerns
Volrathongchia, Kanittha
2003-01-01
In attempting to apply Knowledge Discovery in Databases (KDD) to generate a predictive model from a health care dataset that is currently available to the public, the first step is to pre-process the data to overcome the challenges of missing data, redundant observations, and records containing inaccurate data. This study will demonstrate how to use simple pre-processing methods to improve the quality of input data. PMID:14728545
Cloud screening Coastal Zone Color Scanner images using channel 5
NASA Technical Reports Server (NTRS)
Eckstein, B. A.; Simpson, J. J.
1991-01-01
Clouds are removed from Coastal Zone Color Scanner (CZCS) data using channel 5. Instrumentation problems require pre-processing of channel 5 before an intelligent cloud-screening algorithm can be used. For example, at intervals of about 16 lines, the sensor records anomalously low radiances. Moreover, the calibration equation yields negative radiances when the sensor records zero counts, and pixels corrupted by electronic overshoot must also be excluded. The remaining pixels may then be used in conjunction with the procedure of Simpson and Humphrey to determine the CZCS cloud mask. These results plus in situ observations of phytoplankton pigment concentration show that pre-processing and proper cloud-screening of CZCS data are necessary for accurate satellite-derived pigment concentrations. This is especially true in the coastal margins, where pigment content is high and image distortion associated with electronic overshoot is also present. The pre-processing algorithm is critical to obtaining accurate global estimates of pigment from spacecraft data.
Research on Finite Element Model Generating Method of General Gear Based on Parametric Modelling
NASA Astrophysics Data System (ADS)
Lei, Yulong; Yan, Bo; Fu, Yao; Chen, Wei; Hou, Liguo
2017-06-01
To address the low efficiency and poor mesh quality of gear meshing in current mainstream finite element software, a universal three-dimensional gear model was established and the rules of element and node arrangement were explored. In this paper, a parameterization-based finite element model generation method for universal gears is proposed. A Visual Basic program is used to perform the finite element meshing, assign the material properties, and set the boundary/load conditions and other pre-processing work. The dynamic meshing analysis of the gears is carried out with the method proposed in this paper and compared with calculated values to verify the correctness of the method. The method greatly reduces the workload of gear finite element pre-processing, improves the quality of the gear mesh, and provides a new idea for FEM pre-processing.
Application of preprocessing filtering on Decision Tree C4.5 and rough set theory
NASA Astrophysics Data System (ADS)
Chan, Joseph C. C.; Lin, Tsau Y.
2001-03-01
This paper compares two artificial intelligence methods, the Decision Tree C4.5 and Rough Set Theory, on stock market data. The Decision Tree C4.5 is reviewed alongside Rough Set Theory. An enhanced window application is developed to facilitate pre-processing filtering by introducing feature (attribute) transformations, which allow users to input formulas and create new attributes. The application also produces three varieties of data sets with delaying, averaging, and summation. The results demonstrate the improvement gained by applying feature (attribute) transformations as pre-processing for Decision Tree C4.5. Moreover, the comparison between Decision Tree C4.5 and Rough Set Theory is based on clarity, automation, accuracy, dimensionality, raw data, and speed, supported by the rule sets generated by both algorithms on three different sets of data.
Acoustic Biometric System Based on Preprocessing Techniques and Linear Support Vector Machines
del Val, Lara; Izquierdo-Fuente, Alberto; Villacorta, Juan J.; Raboso, Mariano
2015-01-01
Drawing on the results of an acoustic biometric system based on a MSE classifier, a new biometric system has been implemented. This new system preprocesses acoustic images, extracts several parameters and finally classifies them, based on Support Vector Machine (SVM). The preprocessing techniques used are spatial filtering, segmentation—based on a Gaussian Mixture Model (GMM) to separate the person from the background, masking—to reduce the dimensions of images—and binarization—to reduce the size of each image. An analysis of classification error and a study of the sensitivity of the error versus the computational burden of each implemented algorithm are presented. This allows the selection of the most relevant algorithms, according to the benefits required by the system. A significant improvement of the biometric system has been achieved by reducing the classification error, the computational burden and the storage requirements. PMID:26091392
Acoustic Biometric System Based on Preprocessing Techniques and Linear Support Vector Machines.
del Val, Lara; Izquierdo-Fuente, Alberto; Villacorta, Juan J; Raboso, Mariano
2015-06-17
Drawing on the results of an acoustic biometric system based on a MSE classifier, a new biometric system has been implemented. This new system preprocesses acoustic images, extracts several parameters and finally classifies them, based on Support Vector Machine (SVM). The preprocessing techniques used are spatial filtering, segmentation-based on a Gaussian Mixture Model (GMM) to separate the person from the background, masking-to reduce the dimensions of images-and binarization-to reduce the size of each image. An analysis of classification error and a study of the sensitivity of the error versus the computational burden of each implemented algorithm are presented. This allows the selection of the most relevant algorithms, according to the benefits required by the system. A significant improvement of the biometric system has been achieved by reducing the classification error, the computational burden and the storage requirements.
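A compact sketch of the GMM-based segmentation step named in this abstract, using scikit-learn on pixel intensities; the two-component model and the brighter-component-is-foreground rule are simplifying assumptions:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def gmm_foreground_mask(image):
    """Two-component GMM on intensities; label the brighter component
    as the person (foreground)."""
    X = image.reshape(-1, 1).astype(float)
    gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
    labels = gmm.predict(X).reshape(image.shape)
    return labels == int(np.argmax(gmm.means_.ravel()))
```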
Multicutter machining of compound parametric surfaces
NASA Astrophysics Data System (ADS)
Hatna, Abdelmadjid; Grieve, R. J.; Broomhead, P.
2000-10-01
Parametric free forms are used in industries as disparate as footwear, toys, sporting goods, ceramics, digital content creation, and conceptual design. Optimizing tool path patterns and minimizing the total machining time is a primary concern in numerically controlled (NC) machining of free-form surfaces. We demonstrate in the present work that multi-cutter machining can achieve as much as a 60% reduction in total machining time for compound sculptured surfaces. The given approach is based upon the pre-processing, as opposed to the usual post-processing, of surfaces for the detection and removal of interference, followed by precise tracking of unmachined areas.
Optical solver of combinatorial problems: nanotechnological approach.
Cohen, Eyal; Dolev, Shlomi; Frenkel, Sergey; Kryzhanovsky, Boris; Palagushkin, Alexandr; Rosenblit, Michael; Zakharov, Victor
2013-09-01
We present an optical computing system to solve NP-hard problems. As nano-optical computing is a promising avenue for the next generation of computers performing parallel computations, we investigate the application of submicron, or even subwavelength, computing device designs. The system utilizes a setup of exponentially sized masks with exponential space complexity, produced in polynomial-time preprocessing. The masks are later used to solve the problem in polynomial time. The size of the masks is reduced to nanoscale density. Simulations were done to choose a proper design, and actual implementations show the feasibility of such a system.
Real-time motion artifacts compensation of ToF sensors data on GPU
NASA Astrophysics Data System (ADS)
Lefloch, Damien; Hoegg, Thomas; Kolb, Andreas
2013-05-01
Over the last decade, ToF sensors have attracted many computer vision and graphics researchers. Nevertheless, ToF devices suffer from severe motion artifacts in dynamic scenes, as well as low-resolution depth data, which strongly motivates a valid correction. To counterbalance these effects, a pre-processing approach is introduced that greatly improves range image data for dynamic scenes. We first demonstrate the robustness of our approach on simulated data and then validate the method on real sensor range data. Our GPU-based processing pipeline enhances range data reliability in real time.
Real-time stereo generation for surgical vision during minimal invasive robotic surgery
NASA Astrophysics Data System (ADS)
Laddi, Amit; Bhardwaj, Vijay; Mahapatra, Prasant; Pankaj, Dinesh; Kumar, Amod
2016-03-01
This paper proposes a framework for 3D surgical vision in minimally invasive robotic surgery. It presents an approach for generating a three-dimensional view of in-vivo live surgical procedures from two images captured by a very small, full-resolution camera sensor rig. A pre-processing scheme is employed to enhance image quality and equalize the color profiles of the two images. Polarized projection with interlacing of the two images gives a smooth and strain-free three-dimensional view. The algorithm runs in real time at full HD resolution.
3CCD image segmentation and edge detection based on MATLAB
NASA Astrophysics Data System (ADS)
He, Yong; Pan, Jiazhi; Zhang, Yun
2006-09-01
This research aimed to identify weeds among crops at an early stage of field operation using image-processing technology. 3CCD images offer a greater binary-value difference between weed and crop regions than ordinary digital images taken by common cameras: the camera has three channels (green, red, infrared) that capture the same area simultaneously, and the three images can be composed into one, which facilitates the segmentation of different areas. With the image-processing toolkit in MATLAB, the different areas in the image can be segmented clearly. As edge detection is the first and a very important step in image processing, the results of different processing methods were compared. In particular, an image was preprocessed with the wavelet packet transform toolkit in MATLAB before edge extraction, yielding a more clearly cut edge image. The segmentation methods include erosion, dilation, and other morphological operations to preprocess the images. Segmenting different areas in digital images in the field in real time is of great importance for precision farming, saving energy, herbicide, and many other materials. At present, large-scale software such as MATLAB on a PC was used, but the computation can be reduced and integrated into a small embedded system, which means that application of this technique in agricultural engineering is feasible and of great economic value.
NASA Astrophysics Data System (ADS)
Aviles, Angelica I.; Alsaleh, Samar; Sobrevilla, Pilar; Casals, Alicia
2016-03-01
The Robotic-Assisted Surgery approach overcomes the limitations of traditional laparoscopic and open surgery. However, one of its major limitations is the lack of force feedback. Since there is no direct interaction between the surgeon and the tissue, there is no way of knowing how much force the surgeon is applying, which can result in irreversible injuries. The use of force sensors is not practical since they impose various constraints. Thus, we use a neuro-visual approach to estimate the applied forces, in which 3D shape recovery together with the geometry of motion is used as input to a deep network based on an LSTM-RNN architecture. When deep networks are used in real time, pre-processing of the data is a key factor in reducing complexity and improving network performance. A common pre-processing step is dimensionality reduction, which attempts to eliminate redundant and insignificant information by selecting a subset of relevant features for model construction. In this work, we show the effects of dimensionality reduction in a real-time application: estimating the applied force in Robotic-Assisted Surgery. According to the results, we demonstrate positive effects of dimensionality reduction on deep networks, including faster training, improved network performance, and overfitting prevention. We also show a significant accuracy improvement, ranging from about 33% to 86%, over existing approaches to force estimation.
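A dimensionality-reduction step of the kind discussed can be sketched with PCA; the feature-matrix shape and the 95% variance target are illustrative assumptions, not the study's settings.

    import numpy as np
    from sklearn.decomposition import PCA

    frames = np.random.rand(5000, 120)        # time steps x visual/motion features
    pca = PCA(n_components=0.95).fit(frames)  # keep 95% of the variance
    reduced = pca.transform(frames)           # input to the LSTM-RNN stage
    print(frames.shape, "->", reduced.shape)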
FPGA implementation of image dehazing algorithm for real time applications
NASA Astrophysics Data System (ADS)
Kumar, Rahul; Kaushik, Brajesh Kumar; Balasubramanian, R.
2017-09-01
Weather degradation such as haze, fog, and mist severely reduces the effective range of visual surveillance. This degradation is a spatially varying phenomenon, which makes the problem non-trivial. Dehazing is an essential preprocessing stage in applications such as long-range imaging, border security, and intelligent transportation systems; however, these applications require low latency from the preprocessing block. In this work, the single-image dark channel prior algorithm is modified and implemented for fast processing with comparable visual quality of the restored image/video. Although the conventional single-image dark channel prior algorithm is computationally expensive, it yields impressive results. Moreover, a two-stage image dehazing architecture is introduced, in which the dark channel and airlight are estimated in the first stage, while the transmission map and intensity restoration are computed in the subsequent stages. The algorithm is implemented using Xilinx Vivado software and validated on a Xilinx zc702 development board, which contains an Artix-7 equivalent Field Programmable Gate Array (FPGA) and an ARM Cortex-A9 dual-core processor. Additionally, a high-definition multimedia interface (HDMI) has been incorporated for video feed and display purposes. The results show that the dehazing algorithm attains 29 frames per second at a resolution of 1920x1080, which is suitable for real-time applications. The design utilizes 9 18K_BRAMs, 97 DSP_48s, 6508 FFs and 8159 LUTs.
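For reference, the dark channel and airlight estimates of the first stage can be sketched as below, assuming an HxWx3 float image in [0, 1]; the 15x15 patch and the 0.1% airlight quantile are common choices from the dark-channel literature, not necessarily the values used in this FPGA design.

    import numpy as np
    from scipy.ndimage import minimum_filter

    def dark_channel(img: np.ndarray, patch: int = 15) -> np.ndarray:
        per_pixel_min = img.min(axis=2)                   # minimum over RGB
        return minimum_filter(per_pixel_min, size=patch)  # minimum over patch

    def estimate_airlight(img: np.ndarray, dark: np.ndarray) -> np.ndarray:
        n = max(1, int(dark.size * 0.001))       # brightest 0.1% of dark channel
        idx = np.argsort(dark.ravel())[-n:]
        return img.reshape(-1, 3)[idx].max(axis=0)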
Change detection in satellite images
NASA Astrophysics Data System (ADS)
Thonnessen, U.; Hofele, G.; Middelmann, W.
2005-05-01
Change detection plays an important role in different military areas such as strategic reconnaissance, verification of armament and disarmament control, and damage assessment. It is the process of identifying differences in the state of an object or phenomenon by observing it at different times. The availability of spaceborne reconnaissance systems with high spatial resolution, multispectral capabilities, and short revisit times offers new perspectives for change detection. Before performing any kind of change detection it is necessary to separate changes of interest from changes caused by differences in data acquisition parameters; in these cases pre-processing must be performed to correct or normalize the data. Image registration and, as part of this task, ortho-rectification of the image data are further prerequisites for change detection. If feasible, a 1-to-1 geometric correspondence should be sought. Change detection at the iconic level with subsequent interpretation of the changes by the observer is often proposed; nevertheless, an automatic knowledge-based analysis delivering the interpretation of the changes at a semantic level should be the aim for the future. We present first results of change detection at a structural level for urban areas. After pre-processing, the images are segmented into areas of interest and structural analysis is applied to these regions to extract descriptions of urban infrastructure such as buildings, roads, and refinery tanks. These descriptions are matched to detect changes and similarities.
Data Treatment for LC-MS Untargeted Analysis.
Riccadonna, Samantha; Franceschi, Pietro
2018-01-01
Liquid chromatography-mass spectrometry (LC-MS) untargeted experiments require complex chemometrics strategies to extract information from the experimental data. Here we discuss "data preprocessing", the set of procedures performed on the raw data to produce a data matrix which will be the starting point for the subsequent statistical analysis. Data preprocessing is a crucial step on the path to knowledge extraction, which should be carefully controlled and optimized in order to maximize the output of any untargeted metabolomics investigation.
Automated Processing Workflow for Ambient Seismic Recordings
NASA Astrophysics Data System (ADS)
Girard, A. J.; Shragge, J.
2017-12-01
Structural imaging using body-wave energy present in ambient seismic data remains a challenging task, largely because these wave modes are commonly much weaker than surface-wave energy. In a number of situations body-wave energy has been extracted successfully; however, nearly all successful body-wave extraction and imaging approaches have focused on cross-correlation processing. While this is useful for interferometric purposes, it can also lead to the inclusion of unwanted noise events that dominate the resulting stack, leaving body-wave energy overpowered by the coherent noise. Conversely, wave-equation imaging can be applied directly to non-correlated ambient data that has been preprocessed to mitigate unwanted energy (i.e., surface waves, burst-like and electromechanical noise) and enhance body-wave arrivals. Following this approach, though, requires a significant preprocessing effort on data volumes that often reach terabytes, which is expensive and requires automation to be feasible. In this work we outline an automated processing workflow designed to optimize body-wave energy from an ambient seismic data set acquired on a large-N array at a mine site near Lalor Lake, Manitoba, Canada. We show that processing ambient seismic data in the recording domain, rather than the cross-correlation domain, allows us to mitigate energy that is inappropriate for body-wave imaging. We first develop a method for window selection that automatically identifies and removes data contaminated by coherent high-energy bursts. We then apply time- and frequency-domain debursting techniques to mitigate the effects of remaining strong-amplitude and/or monochromatic energy without severely degrading the overall waveforms. After each processing step we implement a QC check to investigate improvements in the convergence rates, and the emergence of reflection events, in the cross-correlation-plus-stack waveforms over hour-long windows. Overall, the QC analyses suggest that automated preprocessing of ambient seismic recordings in the recording domain successfully mitigates unwanted coherent noise events in both the time and frequency domains. Accordingly, we assert that this method is beneficial for direct wave-equation imaging with ambient seismic recordings.
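The automatic window-selection idea can be illustrated with a robust amplitude test; the 60 s window and the 5*MAD threshold are assumptions, not the values used in the workflow.

    import numpy as np

    def reject_bursts(trace: np.ndarray, fs: float, win_s: float = 60.0):
        n = int(win_s * fs)
        nwin = len(trace) // n
        windows = trace[: nwin * n].reshape(nwin, n)
        rms = np.sqrt((windows ** 2).mean(axis=1))
        med = np.median(rms)
        mad = np.median(np.abs(rms - med))
        keep = rms < med + 5.0 * mad   # drop windows with burst-like energy
        return windows[keep].ravel(), keep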
A Stereo Music Preprocessing Scheme for Cochlear Implant Users.
Buyens, Wim; van Dijk, Bas; Wouters, Jan; Moonen, Marc
2015-10-01
Listening to music is still one of the more challenging aspects of using a cochlear implant (CI) for most users. Simple musical structures, a clear rhythm/beat, and lyrics that are easy to follow are among the top factors contributing to music appreciation for CI users. Modifying the audio mix of complex music potentially improves music enjoyment in CI users. A stereo music preprocessing scheme is described in which vocals, drums, and bass are emphasized based on the representation of the harmonic and the percussive components in the input spectrogram, combined with the spatial allocation of instruments in typical stereo recordings. The scheme is assessed with postlingually deafened CI subjects (N = 7) using pop/rock music excerpts with different complexity levels. The scheme is capable of modifying relative instrument level settings, with the aim of improving music appreciation in CI users, and allows individual preference adjustments. The assessment with CI subjects confirms the preference for more emphasis on vocals, drums, and bass as offered by the preprocessing scheme, especially for songs with higher complexity. The stereo music preprocessing scheme has the potential to improve music enjoyment in CI users by modifying the audio mix in widespread (stereo) music recordings. Since music enjoyment in CI users is generally poor, this scheme can assist the music listening experience of CI users as a training or rehabilitation tool.
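A rough mono-channel analogue of the harmonic/percussive emphasis described above can be written with librosa's HPSS; the input file name, the gains, and the simple remix are illustrative assumptions and do not reproduce the published stereo scheme.

    import numpy as np
    import librosa

    y, sr = librosa.load("song.wav", mono=True)  # hypothetical input file
    harmonic, percussive = librosa.effects.hpss(y)
    # Emphasize percussive content (drums/bass) relative to the rest of the mix.
    remix = 1.0 * harmonic + 1.8 * percussive
    remix = remix / np.max(np.abs(remix))        # normalize to avoid clipping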
Masking as an effective quality control method for next-generation sequencing data analysis.
Yun, Sajung; Yun, Sijung
2014-12-13
Next-generation sequencing produces base calls with low quality scores that can affect the accuracy of identifying simple nucleotide variation calls, including single nucleotide polymorphisms and small insertions and deletions. Here we compare the effectiveness of two data preprocessing methods, masking and trimming, and the accuracy of simple nucleotide variation calls on whole-genome sequence data from Caenorhabditis elegans. Masking substitutes low quality base calls with 'N's (undetermined bases), whereas trimming removes low quality bases, resulting in shorter read lengths. We demonstrate that masking is more effective than trimming in reducing the false-positive rate in single nucleotide polymorphism (SNP) calling. However, neither preprocessing method affected the false-negative rate in SNP calling with statistical significance compared to the analysis without preprocessing. False-positive and false-negative rates for small insertions and deletions did not differ between masking and trimming. We recommend masking over trimming as the more effective preprocessing method for next-generation sequencing data analysis, even though trimming is currently more commonly used in the field, since masking reduces the false-positive rate in SNP calling without sacrificing the false-negative rate. The perl script for masking is available at http://code.google.com/p/subn/. The sequencing data used in the study were deposited in the Sequence Read Archive (SRX450968 and SRX451773).
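The masking operation itself is simple; a minimal sketch (assuming Phred+33 qualities and a Q20 cutoff, both assumptions here rather than the paper's settings) is:

    def mask_read(seq: str, qual: str, cutoff: int = 20) -> str:
        # Substitute 'N' for bases whose Phred quality is below the cutoff.
        return "".join(
            base if ord(q) - 33 >= cutoff else "N"
            for base, q in zip(seq, qual)
        )

    print(mask_read("ACGTACGT", "IIII####"))  # -> ACGTNNNN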
Pre-processing by data augmentation for improved ellipse fitting.
Kumar, Pankaj; Belchamber, Erika R; Miklavcic, Stanley J
2018-01-01
Ellipse fitting is a highly researched and mature topic. Surprisingly, however, no existing method has thus far considered the eccentricity of a data point in its ellipse fitting procedure. Here, we introduce the concept of the eccentricity of a data point, in analogy with the idea of ellipse eccentricity. We then show empirically that, irrespective of the ellipse fitting method used, the root mean square error (RMSE) of a fit increases with the eccentricity of the data point set. The main contribution of the paper is based on the hypothesis that if the data point set were pre-processed to strategically add additional data points in regions of high eccentricity, then the quality of a fit could be improved. Conditional validity of this hypothesis is demonstrated mathematically using a model scenario. Based on this confirmation we propose an algorithm that pre-processes the data so that data points with high eccentricity are replicated. The improvement in ellipse fitting is then demonstrated empirically in a real-world application: the 3D reconstruction of a plant root system for phenotypic analysis. The degree of improvement for different underlying ellipse fitting methods, as a function of data noise level, is also analysed. We show that almost every method tested, irrespective of whether it minimizes algebraic or geometric error, shows improvement in the fit following data augmentation using the proposed pre-processing algorithm.
Zhao, Li-Ting; Xiang, Yu-Hong; Dai, Yin-Mei; Zhang, Zhuo-Yong
2010-04-01
Near infrared spectroscopy was applied to tissue slices of endometrial tissues to collect spectra. A total of 154 spectra were obtained from 154 samples; the numbers of normal, hyperplasia, and malignant samples were 36, 60, and 58, respectively. Original near infrared spectra comprise many variables, including interference such as instrument errors and physical effects like particle size and light scatter. To reduce these influences, the original spectra should be processed with different spectral preprocessing methods to compress variables and extract useful information, so spectral preprocessing and wavelength selection play an important role in the near infrared spectroscopy technique. In the present paper the raw spectra were processed using various preprocessing methods including first derivative, multiplicative scatter correction, the Savitzky-Golay first derivative algorithm, standard normal variate, smoothing, and moving-window median. Standard deviation was used to select the optimal spectral region of 4 000-6 000 cm(-1). Principal component analysis was then used for classification; the results showed that the three types of samples could be discriminated completely, with accuracy of almost 100%. This study demonstrated that near infrared spectroscopy combined with chemometrics could be a fast, efficient, and novel means of diagnosing cancer. The proposed methods would be a promising and significant diagnostic technique for early stage cancer.
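Two of the preprocessing methods named above (standard normal variate and a Savitzky-Golay first derivative) followed by PCA can be sketched as below; the synthetic data shape and filter settings are assumptions for illustration.

    import numpy as np
    from scipy.signal import savgol_filter
    from sklearn.decomposition import PCA

    def snv(spectra: np.ndarray) -> np.ndarray:
        # Standard normal variate: center and scale each spectrum individually.
        mu = spectra.mean(axis=1, keepdims=True)
        sd = spectra.std(axis=1, keepdims=True)
        return (spectra - mu) / sd

    spectra = np.random.rand(154, 600)  # 154 samples x 600 spectral variables
    pre = savgol_filter(snv(spectra), window_length=15, polyorder=2,
                        deriv=1, axis=1)
    scores = PCA(n_components=3).fit_transform(pre)  # inputs to classification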
A hybrid lung and vessel segmentation algorithm for computer aided detection of pulmonary embolism
NASA Astrophysics Data System (ADS)
Raghupathi, Laks; Lakare, Sarang
2009-02-01
Advances in multi-detector technology have made CT pulmonary angiography (CTPA) a popular radiological tool for pulmonary emboli (PE) detection. CTPA provides rich detail of lung anatomy and is a useful diagnostic aid in highlighting even very small PE. However, analyzing hundreds of slices is laborious and time-consuming for the practicing radiologist and may also lead to misdiagnosis due to the presence of various PE look-alikes. Computer-aided diagnosis (CAD) can be a potential second reader in providing key diagnostic information. Since PE occurs only in vessel arteries, it is important to mark this region of interest (ROI) during CAD preprocessing. In this paper, we present a new lung and vessel segmentation algorithm for extracting contrast-enhanced vessel ROIs in CTPA. Existing approaches to segmentation either provide only the larger lung area without highlighting the vessels or are computationally prohibitive. We propose a hybrid lung and vessel segmentation that uses an initial lung ROI and determines the vessels through a series of refinement steps. We first identify a coarse vessel ROI by finding the "holes" in the lung ROI. We then use the initial ROI as seed points for a region-growing process while carefully excluding regions that are not relevant. The vessel segmentation mask covers 99% of the 259 PE from a real-world set of 107 CTPA. Further, our algorithm increases the net sensitivity of a prototype CAD system by 5-9% across all PE categories in the training and validation data sets. The average run-time of the algorithm was only 100 seconds on a standard workstation.
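The seeded region-growing refinement can be illustrated on a 2D slice; the 4-connectivity and the intensity tolerance are assumptions, and the real pipeline operates on 3D CTPA volumes with additional exclusion rules.

    import numpy as np
    from collections import deque

    def region_grow(img: np.ndarray, seeds, tol: float = 100.0) -> np.ndarray:
        grown = np.zeros(img.shape, dtype=bool)
        q = deque(seeds)
        for s in seeds:
            grown[s] = True
        while q:
            y, x = q.popleft()
            for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ny, nx = y + dy, x + dx
                if (0 <= ny < img.shape[0] and 0 <= nx < img.shape[1]
                        and not grown[ny, nx]
                        and abs(float(img[ny, nx]) - float(img[y, x])) < tol):
                    grown[ny, nx] = True
                    q.append((ny, nx))
        return grown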
LSD-induced entropic brain activity predicts subsequent personality change.
Lebedev, A V; Kaelen, M; Lövdén, M; Nilsson, J; Feilding, A; Nutt, D J; Carhart-Harris, R L
2016-09-01
Personality is known to be relatively stable throughout adulthood. Nevertheless, it has been shown that major life events with high personal significance, including experiences engendered by psychedelic drugs, can have an enduring impact on some core facets of personality. In the present, balanced-order, placebo-controlled study, we investigated biological predictors of post-lysergic acid diethylamide (LSD) changes in personality. Nineteen healthy adults underwent resting state functional MRI scans under LSD (75µg, I.V.) and placebo (saline I.V.). The Revised NEO Personality Inventory (NEO-PI-R) was completed at screening and 2 weeks after LSD/placebo. Scanning sessions consisted of three 7.5-min eyes-closed resting-state scans, one of which involved music listening. A standardized preprocessing pipeline was used to extract measures of sample entropy, which characterizes the predictability of an fMRI time-series. Mixed-effects models were used to evaluate drug-induced shifts in brain entropy and their relationship with the observed increases in the personality trait openness at the 2-week follow-up. Overall, LSD had a pronounced global effect on brain entropy, increasing it in both sensory and hierarchically higher networks across multiple time scales. These shifts predicted enduring increases in trait openness. Moreover, the predictive power of the entropy increases was greatest for the music-listening scans and when "ego-dissolution" was reported during the acute experience. These results shed new light on how LSD-induced shifts in brain dynamics and concomitant subjective experience can be predictive of lasting changes in personality. Hum Brain Mapp 37:3203-3213, 2016. © 2016 Wiley Periodicals, Inc.
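A compact sketch of sample entropy for a single time series is shown below; m = 2 and r = 0.2*std are conventional defaults, not necessarily the study's settings.

    import numpy as np

    def sample_entropy(x, m=2, r_frac=0.2):
        x = np.asarray(x, dtype=float)
        r = r_frac * x.std()

        def count_matches(length):
            # Build all templates of the given length (N - m of them).
            templates = np.array([x[i:i + length] for i in range(len(x) - m)])
            count = 0
            for i in range(len(templates) - 1):
                # Chebyshev distance between template i and all later templates.
                dist = np.max(np.abs(templates[i + 1:] - templates[i]), axis=1)
                count += int((dist < r).sum())
            return count

        b = count_matches(m)      # template matches of length m
        a = count_matches(m + 1)  # template matches of length m + 1
        return -np.log(a / b) if a > 0 and b > 0 else float("inf")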
Oscillations during observations: Dynamic oscillatory networks serving visuospatial attention.
Wiesman, Alex I; Heinrichs-Graham, Elizabeth; Proskovec, Amy L; McDermott, Timothy J; Wilson, Tony W
2017-10-01
The dynamic allocation of neural resources to discrete features within a visual scene enables us to react quickly and accurately to salient environmental circumstances. A network of bilateral cortical regions is known to subserve such visuospatial attention functions; however the oscillatory and functional connectivity dynamics of information coding within this network are not fully understood. Particularly, the coding of information within prototypical attention-network hubs and the subsecond functional connections formed between these hubs have not been adequately characterized. Herein, we use the precise temporal resolution of magnetoencephalography (MEG) to define spectrally specific functional nodes and connections that underlie the deployment of attention in visual space. Twenty-three healthy young adults completed a visuospatial discrimination task designed to elicit multispectral activity in visual cortex during MEG, and the resulting data were preprocessed and reconstructed in the time-frequency domain. Oscillatory responses were projected to the cortical surface using a beamformer, and time series were extracted from peak voxels to examine their temporal evolution. Dynamic functional connectivity was then computed between nodes within each frequency band of interest. We find that visual attention network nodes are defined functionally by oscillatory frequency, that the allocation of attention to the visual space dynamically modulates functional connectivity between these regions on a millisecond timescale, and that these modulations significantly correlate with performance on a spatial discrimination task. We conclude that functional hubs underlying visuospatial attention are segregated not only anatomically but also by oscillatory frequency, and importantly that these oscillatory signatures promote dynamic communication between these hubs. Hum Brain Mapp 38:5128-5140, 2017. © 2017 Wiley Periodicals, Inc.
NASA Astrophysics Data System (ADS)
Traganos, D.; Cerra, D.; Reinartz, P.
2017-05-01
Seagrasses are among the most productive and widespread yet threatened coastal ecosystems on Earth. Despite their importance, they are declining due to various, mainly anthropogenic, threats. Lack of data on their distribution hinders any effort to rectify this decline through effective detection, mapping and monitoring. Remote sensing can mitigate this data gap by allowing retrospective quantitative assessment of seagrass beds over large and remote areas. In this paper, we evaluate the quantitative application of Planet high resolution imagery for the detection of seagrasses in the Thermaikos Gulf, NW Aegean Sea, Greece. The low signal-to-noise ratio (SNR) that characterizes spectral bands at shorter wavelengths prompts the application of Unmixing-based denoising (UBD) as a pre-processing step for seagrass detection. A total of 15 spectral-temporal patterns are extracted from a Planet image time series to restore the corrupted blue and green bands in the processed Planet image. Subsequently, we implement Lyzenga's empirical water column correction and Support Vector Machines (SVM) to evaluate the quantitative benefits of denoising. Denoising aids detection of the Posidonia oceanica seagrass species by increasing its producer and user accuracy by 31.7% and 10.4%, respectively, with a corresponding increase in its Kappa value from 0.3 to 0.48. In the near future, our objective is to improve accuracies in seagrass detection by applying more sophisticated, analytical water column correction algorithms to Planet imagery, developing time- and cost-effective monitoring of seagrass distribution that will in turn enable the effective management and conservation of these highly valuable and productive ecosystems.
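Lyzenga's empirical water column correction for a blue/green band pair can be sketched as below; the deep-water signal values and the uniform-bottom (sand) pixel mask are assumptions that would have to be derived from the imagery itself.

    import numpy as np

    def depth_invariant_index(blue, green, deep_blue, deep_green, sand_mask):
        # Log-transform after subtracting the deep-water signal of each band.
        xb = np.log(np.clip(blue - deep_blue, 1e-6, None))
        xg = np.log(np.clip(green - deep_green, 1e-6, None))
        # Attenuation-coefficient ratio from pixels of uniform bottom at
        # variable depth (Lyzenga 1981).
        a = (np.var(xb[sand_mask]) - np.var(xg[sand_mask])) / (
            2.0 * np.cov(xb[sand_mask], xg[sand_mask])[0, 1])
        k_ratio = a + np.sqrt(a * a + 1.0)
        return xb - k_ratio * xg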
CEBS: a comprehensive annotated database of toxicological data
Lea, Isabel A.; Gong, Hui; Paleja, Anand; Rashid, Asif; Fostel, Jennifer
2017-01-01
The Chemical Effects in Biological Systems database (CEBS) is a comprehensive and unique toxicology resource that compiles individual and summary animal data from the National Toxicology Program (NTP) testing program and other depositors into a single electronic repository. CEBS has undergone significant updates in recent years and currently contains over 11 000 test articles (exposure agents) and over 8000 studies including all available NTP carcinogenicity, short-term toxicity and genetic toxicity studies. Study data provided to CEBS are manually curated, accessioned and subject to quality assurance review prior to release to ensure high quality. The CEBS database has two main components: data collection and data delivery. To accommodate the breadth of data produced by NTP, the CEBS data collection component is an integrated relational design that allows the flexibility to capture any type of electronic data (to date). The data delivery component of the database comprises a series of dedicated user interface tables containing pre-processed data that support each component of the user interface. The user interface has been updated to include a series of nine Guided Search tools that allow access to NTP summary and conclusion data and larger non-NTP datasets. The CEBS database can be accessed online at http://www.niehs.nih.gov/research/resources/databases/cebs/. PMID:27899660
MTpy: A Python toolbox for magnetotellurics
NASA Astrophysics Data System (ADS)
Krieger, Lars; Peacock, Jared R.
2014-11-01
We present the software package MTpy, which allows handling, processing, and imaging of magnetotelluric (MT) data sets. Written in Python, the code is open source, containing sub-packages and modules for various tasks within the standard MT data processing and handling scheme. Besides the independent definition of classes and functions, MTpy provides wrappers and convenience scripts to call standard external data processing and modelling software. In its current state, modules and functions of MTpy work on raw and pre-processed MT data. However, rather than providing a static compilation of software, we prefer to introduce MTpy as a flexible software toolbox, whose contents can be combined and utilised according to the respective needs of the user. Just as the overall functionality of a mechanical toolbox can be extended by adding new tools, MTpy is a flexible framework, which will be dynamically extended in the future. Furthermore, it can help to unify and extend existing codes and algorithms within the (academic) MT community. In this paper, we introduce the structure and concept of MTpy. Additionally, we show some examples from an everyday workflow of MT data processing: the generation of standard EDI data files from raw electric (E-) and magnetic flux density (B-) field time series as input, the conversion into MiniSEED data format, as well as the generation of a graphical data representation in the form of a Phase Tensor pseudosection.
Exploring and Analyzing Climate Variations Online by Using MERRA-2 data at GES DISC
NASA Astrophysics Data System (ADS)
Shen, S.; Ostrenga, D.; Vollmer, B.; Kempler, S.
2016-12-01
NASA Giovanni (Geospatial Interactive Online Visualization ANd aNalysis Infrastructure) (http://giovanni.sci.gsfc.nasa.gov/giovanni/) is a web-based data visualization and analysis system developed by the Goddard Earth Sciences Data and Information Services Center (GES DISC). Current data analysis functions include Lat-Lon map, time series, scatter plot, correlation map, difference, cross-section, vertical profile, and animation. The system enables basic statistical analysis and comparisons of multiple variables. This web-based tool facilitates data discovery, exploration, and analysis of large amounts of global and regional remote sensing and model data sets from a number of NASA data centers. Recently, long-term global assimilated atmospheric, land, and ocean data have been integrated into the system, enabling quick exploration and analysis of climate data without downloading and preprocessing the data. Example data include climate reanalysis from the NASA Modern-Era Retrospective analysis for Research and Applications, Version 2 (MERRA-2), which provides data from 1980 to the present; land data from the NASA Global Land Data Assimilation System (GLDAS), which assimilates data from 1948 to 2012; as well as ocean biological data from the NASA Ocean Biogeochemical Model (NOBM), which assimilates data from 1998 to 2012. This presentation, using surface air temperature, precipitation, ozone, and aerosol from MERRA-2, demonstrates climate variation analysis with Giovanni at selected regions.
Integrative missing value estimation for microarray data.
Hu, Jianjun; Li, Haifeng; Waterman, Michael S; Zhou, Xianghong Jasmine
2006-10-12
Missing value estimation is an important preprocessing step in microarray analysis. Although several methods have been developed to solve this problem, their performance is unsatisfactory for datasets with high rates of missing data, high measurement noise, or limited numbers of samples. In fact, more than 80% of the time-series datasets in the Stanford Microarray Database contain fewer than eight samples. We present the integrative Missing Value Estimation method (iMISS), which incorporates information from multiple reference microarray datasets to improve missing value estimation. For each gene with missing data, we derive a consistent neighbor-gene list by taking reference data sets into consideration. To determine whether the given reference data sets are sufficiently informative for integration, we use a submatrix imputation approach. Our experiments showed that iMISS can significantly and consistently improve the accuracy of the state-of-the-art Local Least Squares (LLS) imputation algorithm by up to 15% in our benchmark tests. We demonstrated that the order-statistics-based integrative imputation algorithms can achieve significant improvements over state-of-the-art missing value estimation approaches such as LLS and are especially good for imputing microarray datasets with a limited number of samples, high rates of missing data, or very noisy measurements. With the rapid accumulation of microarray datasets, the performance of our approach can be further improved by incorporating larger and more appropriate reference datasets.
Rounds, Stewart A.; Buccola, Norman L.
2015-01-01
Water-quality models allow water resource professionals to examine conditions under an almost unlimited variety of potential future scenarios. The two-dimensional (longitudinal, vertical) water-quality model CE-QUAL-W2, version 3.7, was enhanced and augmented with new features to help dam operators and managers explore and optimize potential solutions for temperature management downstream of thermally stratified reservoirs. Such temperature management often is accomplished by blending releases from multiple dam outlets that access water of different temperatures at different depths. The modified blending algorithm in version 3.7 of CE-QUAL-W2 allows the user to specify a time-series of target release temperatures, designate from 2 to 10 floating or fixed-elevation outlets for blending, impose minimum and maximum head and flow constraints for any blended outlet, and set priority designations for each outlet that allow the model to choose which outlets to use and how to balance releases among them. The modified model was tested with a variety of examples and against a previously calibrated model of Detroit Lake on the North Santiam River in northwestern Oregon, and the results compared well. These updates to the blending algorithms will allow more complicated dam-operation scenarios to be evaluated somewhat automatically with the model, with decreased need for multiple model runs or preprocessing of model inputs to fully characterize the operational constraints.
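The core of the blending idea can be illustrated for two fixed outlets: choose the flow fraction from the upper outlet so the mixed release matches a target temperature. This toy sketch ignores the head, flow, and priority constraints handled by the actual CE-QUAL-W2 algorithm, and all temperatures are hypothetical.

    import numpy as np

    def blend_fraction(t_upper, t_lower, t_target):
        # Mixed temperature: f * t_upper + (1 - f) * t_lower = t_target
        if t_upper == t_lower:
            return 0.5
        f = (t_target - t_lower) / (t_upper - t_lower)
        return float(np.clip(f, 0.0, 1.0))  # respect physical bounds

    f = blend_fraction(t_upper=18.0, t_lower=7.5, t_target=12.0)
    print(f"upper outlet share: {f:.2f}")  # ~0.43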
Exploring and Analyzing Climate Variations Online by Using NASA MERRA-2 Data at GES DISC
NASA Technical Reports Server (NTRS)
Shen, Suhung; Ostrenga, Dana M.; Vollmer, Bruce E.; Kempler, Steven J.
2016-01-01
NASA Giovanni (Goddard Interactive Online Visualization ANd aNalysis Infrastructure) (http://giovanni.sci.gsfc.nasa.gov/giovanni/) is a web-based data visualization and analysis system developed by the Goddard Earth Sciences Data and Information Services Center (GES DISC). Current data analysis functions include Lat-Lon map, time series, scatter plot, correlation map, difference, cross-section, vertical profile, and animation. The system enables basic statistical analysis and comparisons of multiple variables. This web-based tool facilitates data discovery, exploration, and analysis of large amounts of global and regional remote sensing and model data sets from a number of NASA data centers. Long-term global assimilated atmospheric, land, and ocean data have been integrated into the system, enabling quick exploration and analysis of climate data without downloading, preprocessing, and learning data formats. Example data include climate reanalysis data from the NASA Modern-Era Retrospective analysis for Research and Applications, Version 2 (MERRA-2), which provides data from 1980 to the present; land data from the NASA Global Land Data Assimilation System (GLDAS), which assimilates data from 1948 to 2012; as well as ocean biological data from the NASA Ocean Biogeochemical Model (NOBM), which provides data from 1998 to 2012. This presentation, using surface air temperature, precipitation, ozone, and aerosol from MERRA-2, demonstrates climate variation analysis with Giovanni at selected regions.
Solar radio proxies for improved satellite orbit prediction
NASA Astrophysics Data System (ADS)
Yaya, Philippe; Hecker, Louis; Dudok de Wit, Thierry; Fèvre, Clémence Le; Bruinsma, Sean
2017-12-01
Specification and forecasting of solar drivers to thermosphere density models is critical for satellite orbit prediction and debris avoidance. Satellite operators routinely forecast orbits up to 30 days into the future. This requires forecasts of the drivers to these orbit prediction models, such as the solar Extreme-UV (EUV) flux and geomagnetic activity. Most density models use the 10.7 cm radio flux (F10.7 index) as a proxy for solar EUV. However, daily measurements at other centimetric wavelengths have also been performed by the Nobeyama Radio Observatory (Japan) since the 1950s, thereby offering prospects for improving orbit modeling. Here we present a pre-operational service at the Collecte Localisation Satellites company that collects these different observations into a single homogeneous dataset and provides a 30-day forecast on a daily basis. Interpolation and preprocessing algorithms were developed to fill in missing data and remove anomalous values. We compared various empirical time series prediction techniques and selected a multi-wavelength non-recursive analogue neural network. The prediction of the 30 cm flux, and to a lesser extent that of the 10.7 cm flux, performs better than NOAA's present prediction of the 10.7 cm flux, especially during periods of high solar activity. In addition, we find that the DTM-2013 density model (Drag Temperature Model) performs better with (past and predicted) values of the 30 cm radio flux than with the 10.7 flux.
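The kind of preprocessing described (flagging anomalous daily flux values and filling gaps) can be sketched with a rolling-median test; the 27-day window (roughly one solar rotation) and the threshold are assumptions, not the service's actual algorithm.

    import pandas as pd

    def clean_flux(series: pd.Series) -> pd.Series:
        # Rolling-median test over ~one solar rotation (27 days, assumed).
        med = series.rolling(27, center=True, min_periods=5).median()
        resid = (series - med).abs()
        cleaned = series.mask(resid > 5 * resid.median())   # drop anomalies
        return cleaned.interpolate(limit_direction="both")  # fill missing days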
Warrick, P A; Precup, D; Hamilton, E F; Kearney, R E
2007-01-01
To develop a singular-spectrum analysis (SSA) based change-point detection algorithm applicable to fetal heart rate (FHR) monitoring to improve the detection of deceleration events. We present a method for decomposing a signal into near-orthogonal components via the discrete cosine transform (DCT) and apply this in a novel online manner to change-point detection based on SSA. The SSA technique forms models of the underlying signal that can be compared over time; models that are sufficiently different indicate signal change points. To adapt the algorithm to deceleration detection where many successive similar change events can occur, we modify the standard SSA algorithm to hold the reference model constant under such conditions, an approach that we term "base-hold SSA". The algorithm is applied to a database of 15 FHR tracings that have been preprocessed to locate candidate decelerations and is compared to the markings of an expert obstetrician. Of the 528 true and 1285 false decelerations presented to the algorithm, the base-hold approach improved on standard SSA, reducing the number of missed decelerations from 64 to 49 (21.9%) while maintaining the same reduction in false-positives (278). The standard SSA assumption that changes are infrequent does not apply to FHR analysis where decelerations can occur successively and in close proximity; our base-hold SSA modification improves detection of these types of event series.
Automated Spatio-Temporal Analysis of Remotely Sensed Imagery for Water Resources Management
NASA Astrophysics Data System (ADS)
Bahr, Thomas
2016-04-01
Since 2012, the state of California has faced an extreme drought, which impacts water supply in many ways. Advanced remote sensing is an important technology to better assess water resources, monitor drought conditions and water supplies, plan for drought response and mitigation, and measure drought impacts. In the present case study, latest time series analysis capabilities are used to examine surface water in reservoirs located along the western flank of the Sierra Nevada region of California. The case study was performed using the COTS software package ENVI 5.3. Integration of custom processes and automation is supported by IDL (Interactive Data Language); thus, ENVI analytics runs via the object-oriented and IDL-based ENVITask API. A time series of Landsat images (L-5 TM, L-7 ETM+, L-8 OLI) of the AOI was obtained for 1999 to 2015 (October acquisitions). Downloaded from the USGS EarthExplorer web site, they were already georeferenced to a UTM Zone 10N (WGS-84) coordinate system. ENVITasks were used to pre-process the Landsat images as follows:
• Triangulation based gap-filling for the SLC-off Landsat-7 ETM+ images.
• Spatial subsetting to the same geographic extent.
• Radiometric correction to top-of-atmosphere (TOA) reflectance.
• Atmospheric correction using QUAC®, which determines atmospheric correction parameters directly from the observed pixel spectra in a scene, without ancillary information.
Spatio-temporal analysis was executed with the following tasks:
• Creation of Modified Normalized Difference Water Index images (MNDWI, Xu 2006) to enhance open water features while suppressing noise from built-up land, vegetation, and soil.
• Threshold based classification of the water index images to extract the water features.
• Classification aggregation as a post-classification cleanup process.
• Export of the respective water classes to vector layers for further evaluation in a GIS.
• Animation of the classification series and export to a common video format.
• Plotting the time series of water surface area in square kilometers.
The automated spatio-temporal analysis introduced here can be embedded in virtually any existing geospatial workflow for operational applications. Three integration options were implemented in this case study:
• Integration within any ArcGIS environment, whether deployed on the desktop, in the cloud, or online. Execution uses a customized ArcGIS script tool. A Python script file retrieves the parameters from the user interface and runs the precompiled IDL code. That IDL code is used to interface between the Python script and the relevant ENVITasks.
• Publishing the spatio-temporal analysis tasks as services via the ENVI Services Engine (ESE). ESE is a cloud-based image analysis solution to publish and deploy advanced ENVI image and data analytics to existing enterprise infrastructures. For this purpose the entire IDL code can be capsuled in a single ENVITask.
• Integration in an existing geospatial workflow using the Python-to-IDL Bridge. This mechanism allows calling IDL code within Python on a user-defined platform.
The results of this case study verify the drastic decrease of the amount of surface water in the AOI, indicative of the major drought that is pervasive throughout California. Accordingly, the time series analysis was correlated successfully with the daily reservoir elevations of the Don Pedro reservoir (station DNP, operated by CDEC).
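The MNDWI-and-threshold step in the workflow above can be sketched in a few lines; `green` and `swir` stand for the corresponding Landsat reflectance bands, and the zero threshold is a common starting point rather than the study's tuned value.

    import numpy as np

    def water_mask(green: np.ndarray, swir: np.ndarray, thr: float = 0.0):
        # Modified Normalized Difference Water Index (Xu 2006).
        mndwi = (green - swir) / (green + swir + 1e-10)
        return mndwi > thr  # boolean open-water mask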
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cafferty, Kara G.; Searcy, Erin M.; Nguyen, Long
2014-11-04
To meet Energy Independence and Security Act (EISA) cellulosic biofuel mandates, the United States will require an annual domestic supply of about 242 million Mg of biomass by 2022. To improve the feedstock logistics of lignocellulosic biofuels and access available biomass resources from areas with varying yields, commodity systems have been proposed and designed to deliver on-spec biomass feedstocks at preprocessing "depots", which densify and stabilize the biomass prior to long-distance transport and delivery to centralized biorefineries. The harvesting, preprocessing, and logistics (HPL) of biomass commodity supply chains thus could introduce spatially variable environmental impacts into the biofuel life cycle due to the need to harvest, move, and preprocess biomass from multiple distances with variable spatial density. This study examines the uncertainty in greenhouse gas (GHG) emissions of corn stover HPL within a bio-ethanol supply chain in the state of Kansas, where sustainable biomass supply varies spatially. Two scenarios were evaluated, each having a different number of depots of varying capacity and location within Kansas relative to a central commodity-receiving biorefinery, to test GHG emissions uncertainty. Monte Carlo simulation was used to estimate the spatial uncertainty in the HPL gate-to-gate sequence. The results show that the transport of densified biomass introduces the highest variability and contribution to the carbon footprint of the HPL supply chain (0.2-13 g CO2e/MJ). Moreover, depending upon the biomass availability, its spatial density, and the surrounding transportation infrastructure (road and rail), HPL processes can increase the variability in life cycle environmental impacts for lignocellulosic biofuels. Within Kansas, life cycle GHG emissions could range from 24 to 41 g CO2e/MJ depending upon the location, size, and number of preprocessing depots constructed. However, this range can be minimized by optimizing the siting of preprocessing depots where ample rail infrastructure exists to supply biomass commodity to a regional biorefinery supply system.
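The Monte Carlo treatment of spatially variable transport emissions can be illustrated as below; all distributions, the truck emission factor, and the energy yield are hypothetical placeholders, not the study's inputs.

    import numpy as np

    rng = np.random.default_rng(42)
    n = 10_000
    distance_km = rng.triangular(20, 80, 250, size=n)  # depot -> biorefinery
    g_co2e_per_tkm = rng.normal(62, 8, size=n)         # assumed truck factor
    mj_per_mg_biomass = 8_000                          # assumed energy yield
    # Transport term of the footprint, in g CO2e per MJ of fuel energy.
    transport = distance_km * g_co2e_per_tkm / mj_per_mg_biomass
    print(np.percentile(transport, [5, 50, 95]))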
2009-01-01
Background: Large discrepancies in signature composition and outcome concordance have been observed between different microarray breast cancer expression profiling studies. This is often ascribed to differences in array platform as well as biological variability. We conjecture that other reasons for the observed discrepancies are the measurement error associated with each feature and the choice of preprocessing method. Microarray data are known to be subject to technical variation, and the confidence intervals around individual point estimates of expression levels can be wide. Furthermore, the estimated expression values also vary depending on the selected preprocessing scheme. In microarray breast cancer classification studies, however, these two forms of feature variability are almost always ignored and hence their exact role is unclear. Results: We have performed a comprehensive sensitivity analysis of microarray breast cancer classification under the two types of feature variability mentioned above. We used data from six state-of-the-art preprocessing methods, using a compendium consisting of eight different datasets, involving 1131 hybridizations, containing data from both one- and two-color array technology. For a wide range of classifiers, we performed a joint study on performance, concordance and stability. In the stability analysis we explicitly tested classifiers for their noise tolerance by using perturbed expression profiles that are based on uncertainty information directly related to the preprocessing methods. Our results indicate that signature composition is strongly influenced by feature variability, even if the array platform and the stratification of patient samples are identical. In addition, we show that there is often a high level of discordance between individual class assignments for signatures constructed on data coming from different preprocessing schemes, even if the actual signature composition is identical. Conclusion: Feature variability can have a strong impact on breast cancer signature composition, as well as the classification of individual patient samples. We therefore strongly recommend that feature variability is considered in analyzing data from microarray breast cancer expression profiling experiments. PMID:19941644
Wan, Jian; Chen, Yi-Chieh; Morris, A Julian; Thennadil, Suresh N
2017-07-01
Near-infrared (NIR) spectroscopy is widely used in fields ranging from pharmaceutics to the food industry for analyzing chemical and physical properties of the substances concerned. Its advantages over other analytical techniques include the available physical interpretation of spectral data, its nondestructive nature, high measurement speed, and little or no need for sample preparation. The successful application of NIR spectroscopy relies on three main aspects: pre-processing of spectral data to eliminate nonlinear variations due to temperature, light scattering effects and many others; selection of the wavelengths that contribute useful information; and identification of suitable calibration models using linear/nonlinear regression. Several methods have been developed for each of these three aspects, and many comparative studies of different methods exist for an individual aspect or some combinations. However, there is still a lack of comparative studies of the interactions among these three aspects, which could shed light on what role each aspect plays in the calibration and how to combine the various methods of each aspect to obtain the best calibration model. This paper provides such a comparative study based on four benchmark data sets using three typical pre-processing methods, namely orthogonal signal correction (OSC), extended multiplicative signal correction (EMSC) and optical path-length estimation and correction (OPLEC); two existing wavelength selection methods, namely stepwise forward selection (SFS) and genetic algorithm optimization combined with partial least squares regression for spectral data (GAPLSSP); and four popular regression methods, namely partial least squares (PLS), least absolute shrinkage and selection operator (LASSO), least squares support vector machine (LS-SVM), and Gaussian process regression (GPR). The comparative study indicates that, in general, pre-processing of spectral data can play a significant role in the calibration while wavelength selection plays a marginal role, and that the combination of certain pre-processing, wavelength selection, and nonlinear regression methods can achieve superior performance over traditional linear regression-based calibration.
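One pre-processing-plus-regression combination of the kind compared above can be sketched with PLS; the row-wise standardization merely stands in for a scatter-correction step, and the synthetic data, component count, and scoring are assumptions.

    import numpy as np
    from sklearn.cross_decomposition import PLSRegression
    from sklearn.model_selection import cross_val_score

    X = np.random.rand(120, 400)                     # spectra (samples x bands)
    y = 3.0 * X[:, 50] + 0.1 * np.random.randn(120)  # synthetic property
    # Row-wise standardization as a simple stand-in for scatter correction.
    Xs = (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)
    pls = PLSRegression(n_components=8)
    print(cross_val_score(pls, Xs, y, cv=5, scoring="r2").mean())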
An automated multi-scale network-based scheme for detection and location of seismic sources
NASA Astrophysics Data System (ADS)
Poiata, N.; Aden-Antoniow, F.; Satriano, C.; Bernard, P.; Vilotte, J. P.; Obara, K.
2017-12-01
We present a recently developed method, BackTrackBB (Poiata et al. 2016), that images energy radiation from different seismic sources (e.g., earthquakes, LFEs, tremors) in different tectonic environments using continuous seismic records. The method exploits multi-scale frequency-selective coherence in the wave field recorded by regional seismic networks or local arrays. The detection and location scheme is based on space-time reconstruction of the seismic sources through an imaging function built from the sum of station-pair time-delay likelihood functions, projected onto theoretical 3D time-delay grids. This imaging function is interpreted as the location likelihood of the seismic source. A signal pre-processing step constructs a multi-band statistical representation of the non-stationary time series by means of higher-order statistics or energy-envelope characteristic functions. Such signal processing is designed to detect signal transients in time, of different scales and a priori unknown predominant frequency, potentially associated with a variety of sources (e.g., earthquakes, LFEs, tremors), and to improve the performance and robustness of the detection-and-location step. The initial detection-location, based on a single-phase analysis with the P- or S-phase only, can then be improved recursively in a station selection scheme. This scheme, exploiting the 3-component records, makes use of P- and S-phase characteristic functions extracted after a polarization analysis of the event waveforms, and combines the single-phase imaging functions with the S-P differential imaging functions. The performance of the method is demonstrated in different tectonic environments: (1) analysis of the one-year-long precursory phase of the 2014 Iquique earthquake in Chile; (2) detection and location of tectonic tremor sources and low-frequency earthquakes during multiple episodes of tectonic tremor activity in southwestern Japan.
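One of the characteristic functions mentioned above, an energy envelope, can be sketched via the analytic signal; the band limits and filter order here are assumptions, not BackTrackBB's settings.

    import numpy as np
    from scipy.signal import hilbert, butter, sosfiltfilt

    def envelope_cf(trace, fs, fmin=2.0, fmax=8.0):
        # Band-pass, then take the magnitude of the analytic signal.
        sos = butter(4, [fmin, fmax], btype="bandpass", fs=fs, output="sos")
        band = sosfiltfilt(sos, trace)
        env = np.abs(hilbert(band))   # energy envelope
        return env / env.max()        # normalized characteristic function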
Recent developments with the ORSER system
NASA Technical Reports Server (NTRS)
Baumer, G. M.; Turner, B. J.; Myers, W. L.
1981-01-01
Additions to the ORSER remote sensing data processing package are described. The ORSER package consists of about 35 individual programs that are grouped into preprocessing, data analysis, and display subsystems. Support for additional data formats, along with data management, data transformation, and geometric correlation programs, was added to the preprocessing subsystem. Enhancements to the data analysis techniques include a maximum likelihood classifier (MAXCLASS) and a new version of the STATS program, which makes delineation of training areas easier and allows for detection of outlier points. Ongoing developments are also described.
Hyperspectral imaging in medicine: image pre-processing problems and solutions in Matlab.
Koprowski, Robert
2015-11-01
The paper presents problems and solutions related to hyperspectral image pre-processing. New methods of preliminary image analysis are proposed. The paper shows problems that occur in Matlab when trying to analyse this type of image. Moreover, new methods are discussed that provide source code in Matlab which can be used in practice without any licensing restrictions. A proposed application and sample results of hyperspectral image analysis are also presented. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Processing method of images obtained during the TESIS/CORONAS-PHOTON experiment
NASA Astrophysics Data System (ADS)
Kuzin, S. V.; Shestov, S. V.; Bogachev, S. A.; Pertsov, A. A.; Ulyanov, A. S.; Reva, A. A.
2011-04-01
In January 2009, the CORONAS-PHOTON spacecraft was successfully launched. It includes a set of telescopes and spectroheliometers, TESIS, designed to image the solar corona in the soft X-ray and EUV spectral ranges. Due to features of the readout system, obtaining physical information from these images requires preprocessing: removing the background, correcting the white field, levelling, and cleaning. The paper discusses the algorithms and software developed and used for the preprocessing of the images.
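The background-removal and white-field (flat-field) correction steps can be sketched as below, assuming co-registered detector frames `raw`, `dark`, and `flat`; the actual TESIS pipeline includes further levelling and cleaning stages.

    import numpy as np

    def correct_frame(raw, dark, flat):
        signal = raw.astype(float) - dark  # remove background/bias
        flat_norm = flat / flat.mean()     # normalize the white field
        return signal / np.clip(flat_norm, 1e-6, None)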
Preprocessing and Analysis of LC-MS-Based Proteomic Data
Tsai, Tsung-Heng; Wang, Minkun; Ressom, Habtom W.
2016-01-01
Liquid chromatography coupled with mass spectrometry (LC-MS) has been widely used for profiling protein expression levels. This chapter focuses on LC-MS data preprocessing, which is a crucial step in the analysis of LC-MS-based proteomic data. We provide a high-level overview, highlight associated challenges, and present a step-by-step example for analysis of data from an LC-MS-based untargeted proteomic study. Furthermore, key procedures and relevant issues with the subsequent analysis by multiple reaction monitoring (MRM) are discussed. PMID:26519169
Analyzed Boise Data for Oscillatory Hydraulic Tomography
Lim, David
2015-07-01
The data here have been "pre-processed" and "analyzed" from the raw data submitted previously to the GDR (raw data files available at http://gdr.openei.org/submissions/479, doi:10.15121/1176944 after 30 September 2017). First, we submit .mat files containing the "pre-processed" data (MATLAB software is required to use them). Second, the CSV files contain the submitted data in their final analyzed form, before being used for inversion. Specifically, these are Fourier coefficients obtained from fast Fourier transform (FFT) algorithms.
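For illustration, extracting a Fourier coefficient at a known forcing frequency with an FFT takes only a few lines; the sampling rate, record length, and 0.05 Hz pumping frequency below are hypothetical stand-ins, not values from the submitted data:

```python
import numpy as np

fs = 10.0                             # sampling rate, Hz (assumed)
t = np.arange(0, 600, 1 / fs)         # 10-minute record (assumed)
head = np.sin(2 * np.pi * 0.05 * t)   # stand-in for measured head data

coeffs = np.fft.rfft(head)
freqs = np.fft.rfftfreq(len(head), d=1 / fs)
k = np.argmin(np.abs(freqs - 0.05))   # bin nearest the forcing frequency
amp = 2 * np.abs(coeffs[k]) / len(head)   # amplitude at the forcing bin
phase = np.angle(coeffs[k])               # phase at the forcing bin
```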
[Application of a fistula plug with fibrin adhesive in the treatment of rectal fistulas].
Aydinova, P R; Aliyev, E A
2015-05-01
Results of surgical treatment of 21 patients suffering from high transsphincteric and extrasphincteric rectal fistulas were studied. In patients of Group I, the fistula passage was closed using a fistula plug obturator; in patients of Group II, the same procedure was performed, but with the plug pretreated with fibrin adhesive. Use of a fistula plug obturator pretreated with fibrin adhesive guaranteed hermetic sealing of the fistula aperture, prevented the development of rough cicatrices in the operative wound zone, and promoted better fixation of the bioplastic material.
NASA Astrophysics Data System (ADS)
Wu, Q.; Song, J.; Wang, J.; Chen, S.; Yu, B.; Liao, L.
2016-12-01
Monitoring the dynamics of leaf area index (LAI) throughout the life cycle of forests (from seeding to maturity) is vital for simulating forest growth and quantifying carbon sequestration. However, all current global LAI products show extremely low accuracy in forests, and their coarse spatial resolution (nearly 1 km) mismatches the spatial scale of forest inventory plots (nearly 26 m × 26 m). To date, several studies have explored the potential of satellite data to classify forest succession or predict stand age, and a few studies have explored the potential of long-term Landsat data to monitor the growing trend of forests, but no studies have quantified the inter-annual and intra-annual LAI dynamics along with forest succession. Vegetation indices are not perfect variables for quantifying forest foliage dynamics. Hallet (1995) suggested that remote sensing of biophysical characteristics should shift away from direct inference from vegetation indices toward more physically based algorithms. This work is intended as a pioneering example of improving the accuracy of forest LAI and providing temporally and spatially matching LAI datasets for monitoring forest processes. We integrate the Geometric-Optical and Radiative Transfer (GORT) model with the Physiological Principles Predicting Growth (3-PG) model to improve the estimation of forest canopy LAI dynamics. Reflectance time-series data from 1987 to 2015 were collected and preprocessed for forests in southern China, using all available Landsat data (with <80% cloud). Effective LAI and true LAI were field-measured to validate our results using various instruments, including digital hemispheric photographs (DHP), the LAI-2000 Plant Canopy Analyzer (LI-COR), and Tracing Radiation and Architecture of Canopies (TRAC). Results show that the relationship between spectral metrics of satellite images and forest LAI is clear in the early stages before maturity. 3-PG provides an accurate inter-annual trend of forest LAI, while satellite images provide clear intra-annual LAI dynamics. We conclude that the GORT-3PG model significantly improved the LAI estimation of forest stands. Improving forest LAI estimates will help inform forest management policy, and such methods may be applied in other similar forests.
Roushangar, Kiyoumars; Alizadeh, Farhad; Adamowski, Jan
2018-08-01
Understanding precipitation on a regional basis is an important component of water resources planning and management. The present study outlines a methodology based on continuous wavelet transform (CWT) and multiscale entropy (CWME), combined with self-organizing map (SOM) and k-means clustering techniques, to measure and analyze the complexity of precipitation. Historical monthly precipitation data from 1960 to 2010 at 31 rain gauges across Iran were preprocessed by CWT. The multi-resolution CWT approach segregated the major features of the original precipitation series by unfolding the structure of the time series, which was often ambiguous. The entropy concept was then applied to components obtained from CWT to measure the dispersion, uncertainty, disorder, and diversification of subcomponents. Based on different validity indices, k-means clustering captured homogeneous areas more accurately, and additional analysis was performed based on the outcome of this approach. The 31 rain gauges in this study were clustered into 6 groups, each one having a unique CWME pattern across different time scales. The results of clustering showed that hydrologic similarity (multiscale variation of precipitation) was not based on geographic contiguity. According to the pattern of entropy across the scales, each cluster was assigned an entropy signature that provided an estimation of the entropy pattern of precipitation data in each cluster. Based on the pattern of mean CWME for each cluster, a characteristic signature was assigned, which provided an estimation of the CWME of a cluster across scales of 1-2, 3-8, and 9-13 months relative to other stations. The validity of the homogeneous clusters demonstrated the usefulness of the proposed approach to regionalize precipitation. Further analysis based on wavelet coherence (WTC) was performed by selecting central rain gauges in each cluster and analyzing them against temperature, wind, the Multivariate ENSO Index (MEI), and the East Atlantic (EA) and North Atlantic Oscillation (NAO) indices. The results revealed that all climatic features except the NAO influenced precipitation in Iran during the 1960-2010 period. Copyright © 2018 Elsevier Inc. All rights reserved.
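As a rough illustration of wavelet-based multiscale entropy (not the authors' exact CWME formulation), one can compute a CWT with PyWavelets and take the Shannon entropy of the energy distribution at each scale:

```python
import numpy as np
import pywt

def cwt_entropy(series, scales, wavelet="morl"):
    """Shannon entropy of the wavelet energy distribution, one value
    per scale band (a simplified stand-in for the paper's CWME)."""
    coef, _ = pywt.cwt(series, scales, wavelet)
    entropies = []
    for row in coef ** 2:                    # one energy row per scale
        p = row / row.sum()                  # energy distribution in time
        entropies.append(-np.sum(p * np.log2(p + 1e-12)))
    return np.array(entropies)

# e.g. synthetic monthly precipitation, scales spanning ~1-13 months
rng = np.random.default_rng(0)
precip = rng.gamma(2.0, 30.0, size=600)      # stand-in monthly totals
signature = cwt_entropy(precip, scales=np.arange(1, 14))
```

Stations could then be clustered (e.g. with k-means) on these per-scale entropy signatures, mirroring the regionalization step described above.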
Dalla-Costa, Libera M; Morello, Luis G; Conte, Danieli; Pereira, Luciane A; Palmeiro, Jussara K; Ambrosio, Altair; Cardozo, Dayane; Krieger, Marco A; Raboni, Sonia M
2017-09-01
Sepsis is the leading cause of death in intensive care units (ICUs) worldwide and its diagnosis remains a challenge. Blood culturing is the gold standard technique for bloodstream infection (BSI) identification. Molecular tests to detect pathogens in whole blood enable early use of antimicrobials and affect clinical outcomes. Here, using real-time PCR, we evaluated DNA extraction using seven manual and three automated commercially available systems with whole blood samples artificially contaminated with Escherichia coli, Staphylococcus aureus, and Candida albicans, microorganisms commonly associated with BSI. Overall, the commercial kits evaluated presented several technical limitations, including long turnaround times and low DNA yield and purity. The performance of the kits was comparable for detection of high microorganism loads (10^6 CFU/mL). However, the detection of lower concentrations was variable, despite the addition of pre-processing treatment to kits without such steps. Of the evaluated kits, the UMD-Universal CE IVD kit generated a higher quantity of DNA with greater nucleic acid purity and afforded the detection of the lowest microbial load in the samples. The inclusion of pre-processing steps with the kit seems to be critical for the detection of microorganism DNA directly from whole blood. In conclusion, future application of molecular techniques will require overcoming major challenges such as the detection of low levels of microorganism nucleic acids amidst the large quantity of human DNA present in samples, or differences in the cellular structures of etiological agents that can also prevent high-quality DNA yields. Copyright © 2017 Elsevier B.V. All rights reserved.
Widely applicable MATLAB routines for automated analysis of saccadic reaction times.
Leppänen, Jukka M; Forssman, Linda; Kaatiala, Jussi; Yrttiaho, Santeri; Wass, Sam
2015-06-01
Saccadic reaction time (SRT) is a widely used dependent variable in eye-tracking studies of human cognition and its disorders. SRTs are also frequently measured in studies with special populations, such as infants and young children, who are limited in their ability to follow verbal instructions and remain in a stable position over time. In this article, we describe a library of MATLAB routines (Mathworks, Natick, MA) that are designed to (1) enable completely automated implementation of SRT analysis for multiple data sets and (2) cope with the unique challenges of analyzing SRTs from eye-tracking data collected from poorly cooperating participants. The library includes preprocessing and SRT analysis routines. The preprocessing routines (i.e., moving median filter and interpolation) are designed to remove technical artifacts and missing samples from raw eye-tracking data. The SRTs are detected by a simple algorithm that identifies the last point of gaze in the area of interest, but, critically, the extracted SRTs are further subjected to a number of postanalysis verification checks to exclude values contaminated by artifacts. Example analyses of data from 5- to 11-month-old infants demonstrated that SRTs extracted with the proposed routines were in high agreement with SRTs obtained manually from video records, robust against potential sources of artifact, and exhibited moderate to high test-retest stability. We propose that the present library has wide utility in standardizing and automating SRT-based cognitive testing in various populations. The MATLAB routines are open source and can be downloaded from http://www.uta.fi/med/icl/methods.html.
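The preprocessing the routines describe (moving median filter plus interpolation) can be sketched outside MATLAB as well; the Python version below is a simplified stand-in that interpolates across all missing samples, whereas the actual routines restrict interpolation to short gaps and apply further verification checks:

```python
import numpy as np
from scipy.signal import medfilt

def preprocess_gaze(x, kernel=5):
    """Linear interpolation of missing samples followed by a moving
    median filter (NaN marks missing samples; kernel must be odd)."""
    x = np.asarray(x, dtype=float)
    valid = ~np.isnan(x)
    xi = np.interp(np.arange(len(x)), np.flatnonzero(valid), x[valid])
    return medfilt(xi, kernel_size=kernel)   # suppress technical spikes
```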
NASA Astrophysics Data System (ADS)
Thomas, L.; Tremblais, B.; David, L.
2014-03-01
Optimization of the multiplicative algebraic reconstruction technique (MART), simultaneous MART, and block-iterative MART reconstruction techniques was carried out on synthetic and experimental data. Different criteria were defined to improve the preprocessing of the initial images. Knowledge of how each reconstruction parameter influences the quality of particle volume reconstruction and the computing time is key in Tomo-PIV. These criteria were applied to a real case, a jet in cross-flow, and were validated.
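For reference, the MART update at the heart of these reconstructions multiplies each voxel by the ray's measured-to-projected intensity ratio raised to a relaxed weight. A minimal dense-matrix sketch (real Tomo-PIV codes use sparse weights and many optimizations) might look like:

```python
import numpy as np

def mart(A, b, n_iter=10, mu=1.0):
    """Multiplicative ART: x_j <- x_j * (b_i / (a_i . x))**(mu * a_ij),
    looping over rays i. A is the (rays x voxels) weight matrix and
    b the measured pixel intensities."""
    x = np.ones(A.shape[1])                  # positive initial volume
    for _ in range(n_iter):
        for i in range(A.shape[0]):
            proj = A[i] @ x                  # projected intensity, ray i
            if proj > 0 and b[i] > 0:
                x *= (b[i] / proj) ** (mu * A[i])
    return x
```

Simultaneous and block-iterative variants differ mainly in applying the correction over all rays (or a block of rays) at once rather than ray by ray.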
A Hadoop-based Molecular Docking System
NASA Astrophysics Data System (ADS)
Dong, Yueli; Guo, Quan; Sun, Bin
2017-10-01
Molecular docking always faces the challenge of managing datasets of tens of terabytes, so it is necessary to improve the efficiency of storage and docking. We propose a Hadoop-based molecular docking platform for virtual screening; it provides preprocessing of ligand datasets and an analysis function for the docking results. A molecular cloud database that supports mass data management is constructed. Through this platform, the docking time is reduced, data storage is efficient, and management of the ligand datasets is convenient.
NASA Astrophysics Data System (ADS)
Baret, F.; Weiss, M.; Lacaze, R.; Camacho, F.; Smets, B.; Pacholczyk, P.; Makhmara, H.
2010-12-01
LAI and fAPAR are recognized as Essential Climate Variables providing key information for the understanding and modeling of canopy functioning. Global remote sensing observations at medium resolution have been routinely acquired since the 1980s, mainly with the AVHRR, SEAWIFS, VEGETATION, MODIS and MERIS sensors. Several operational products have been derived and provide global maps of LAI and fAPAR at daily to monthly time steps. Inter-comparison between the MODIS, CYCLOPES, GLOBCARBON and JRC-FAPAR products showed generally consistent seasonality, while large differences in magnitude and smoothness may be observed. One of the objectives of the GEOLAND2 European project is to develop such core products to be used in a range of application services, including carbon monitoring. Rather than generating an additional product from scratch, version 1 of the GEOLAND2 products capitalized on the existing products by combining them, to retain their advantages and limit their drawbacks. For these reasons, the MODIS and CYCLOPES products were selected, since they both include LAI and fAPAR while having relatively close temporal sampling intervals (8 to 10 days). GLOBCARBON products were not used because their monthly time step is too long, inducing large uncertainties in the description of seasonality. JRC-FAPAR was likewise not selected, to preserve better consistency between the LAI and fAPAR products. The MODIS and CYCLOPES products were then linearly combined to take advantage of the good performance of the CYCLOPES products for low to medium values of LAI and fAPAR, while benefiting from the better MODIS performance for the highest LAI values. A training database representative of the global variability of vegetation types and conditions was thus built. A back-propagation neural network was then calibrated to estimate the new LAI and fAPAR products from preprocessed VEGETATION observations. Similarly, the vegetation cover fraction (fCover) was derived by scaling the original CYCLOPES fCover products. Validation results achieved following the principles proposed by CEOS-LPV show that the new product, called GEOV1, behaves as expected, with good performance over the whole range of LAI and fAPAR in a temporally smooth and spatially consistent manner. These products will be processed and delivered by VITO in near real time at 1 km spatial resolution and 10-day frequency using a pre-operational production quality tracking system. The entire VEGETATION archive, from 1999 onward, will be processed to provide a consistent time series over both VEGETATION sensors at the same spatial and temporal sampling. A climatology of products computed over the VEGETATION period will also be delivered at the same spatial and temporal sampling, showing average values, between-year variability and possible trends over the decade. Finally, the VEGETATION-derived time series starting in 1999 will be completed with consistent products at 4 km spatial resolution derived from the NOAA/AVHRR series to cover the 1981-2010 period.
NASA Astrophysics Data System (ADS)
Kang, Yubin; Choi, Jaeyoung; Park, Jinju; Kim, Woo-Byoung; Lee, Kun-Jae
2017-09-01
This study attempts to improve the physical and chemical adhesion between metals and ceramics by using electrolytic oxidation and a titanium organic/inorganic complex ion solution on an SS-304 plate. Surface analysis confirmed the existence of Ti-O-Mx bonds formed by bonding between the metal ions and the Ti oxide at the surface of the pre-processed SS plate, and improved chemical adhesion during ceramic coating was expected given the confirmed presence of the carboxylic group. Adhesion was evaluated using the ceramic coating solution in order to assess the improvement achieved on the SS plate. The results showed that both adhesion and durability were greatly improved in the sample processed with all the pre-processing steps, confirming that the physical and chemical adhesion between metals and ceramics can be improved by enhancing the physical roughness via electrolytic oxidation and pre-processing with a Ti complex ion solution.
The Effect of Normalization in Violence Video Classification Performance
NASA Astrophysics Data System (ADS)
Ali, Ashikin; Senan, Norhalina
2017-08-01
Data pre-processing is an important part of data mining, and normalization is a pre-processing stage for many types of problems, especially video classification. Video classification is challenging because of heterogeneous content, large variations in video quality, and the complex semantic meanings of the concepts involved. A thorough pre-processing stage that includes normalization therefore aids the robustness of classification performance. Normalization scales all numeric variables into a certain range to make them more meaningful for the subsequent phases of the data mining techniques applied. This paper examines the effect of two normalization techniques, min-max normalization and z-score, on the classification rate of violence video classification using a multi-layer perceptron (MLP) classifier. With min-max normalization to the range [0,1], the accuracy is almost 98%; with min-max normalization to the range [-1,1], the accuracy is 59%; and with z-score, the accuracy is 50%.
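Both normalization schemes are one-liners; the sketch below shows the two variants compared in the paper on stand-in feature vectors (the actual video features and MLP training are omitted):

```python
import numpy as np

def min_max(x, lo=0.0, hi=1.0):
    """Scale each feature column into [lo, hi]."""
    mn, mx = x.min(axis=0), x.max(axis=0)
    return lo + (x - mn) * (hi - lo) / (mx - mn)

def z_score(x):
    """Zero-mean, unit-variance scaling per feature column."""
    return (x - x.mean(axis=0)) / x.std(axis=0)

features = np.random.rand(100, 8) * 50   # stand-in video features
x01 = min_max(features)                  # min-max, range [0, 1]
x11 = min_max(features, -1.0, 1.0)       # min-max, range [-1, 1]
xz = z_score(features)                   # z-score
```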
Seidel, Kathrin; Kahl, Johannes; Paoletti, Flavio; Birlouez, Ines; Busscher, Nicolaas; Kretzschmar, Ursula; Särkkä-Tirkkonen, Marjo; Seljåsen, Randi; Sinesio, Fiorella; Torp, Torfinn; Baiamonte, Irene
2015-02-01
The market for processed food is rapidly growing. The industry needs methods for "processing with care" that lead to high-quality products in order to meet consumers' expectations. Processing influences the quality of the finished product through various factors. In carrot baby food, these are the raw material, the pre-processing and storage treatments, and the processing conditions. In this study, a quality assessment was performed on baby food made from different pre-processed raw materials. The experiments were carried out under industrial conditions using fresh, frozen and stored organic carrots as raw material. Statistically significant differences were found for sensory attributes among the three autoclaved puree samples (e.g. overall odour F = 90.72, p < 0.001). Samples processed from frozen carrots showed increased moisture content and decreases in several chemical constituents. Biocrystallization identified changes between replications of the cooking process. Pre-treatment of the raw material has a significant influence on the final quality of the baby food.
NASA Astrophysics Data System (ADS)
Matsumoto, H.; Haralabus, G.; Zampolli, M.; Özel, N. M.
2016-12-01
Underwater acoustic signal waveforms recorded during the 2015 Chile earthquake (Mw 8.3) by the hydrophones of hydroacoustic station HA03, located at the Juan Fernandez Islands, are analyzed. HA03 is part of the Comprehensive Nuclear-Test-Ban Treaty International Monitoring System. The interest in the particular data set stems from the fact that HA03 is located only approximately 700 km SW from the epicenter of the earthquake. This makes it possible to study aspects of the signal associated with the tsunamigenic earthquake, which would be more difficult to detect had the hydrophones been located far from the source. The analysis shows that the direction of arrival of the T phase can be estimated by means of a three-step preprocessing technique which circumvents spatial aliasing caused by the hydrophone spacing, the latter being large compared to the wavelength. Following this preprocessing step, standard frequency-wave number analysis (F-K analysis) can accurately estimate back azimuth and slowness of T-phase signals. The data analysis also shows that the dispersive tsunami signals can be identified by the water-column hydrophones at the time when the tsunami surface gravity wave reaches the station.
Thermo-Chemical Conversion of Microwave Activated Biomass Mixtures
NASA Astrophysics Data System (ADS)
Barmina, I.; Kolmickovs, A.; Valdmanis, R.; Vostrikovs, S.; Zake, M.
2018-05-01
Thermo-chemical conversion of microwave-activated wheat straw mixtures with wood or peat pellets is studied experimentally with the aim of providing more effective application of wheat straw for heat energy production. Microwave pre-processing of straw pellets is used to provide a partial decomposition of the main constituents of straw and to activate the thermo-chemical conversion of wheat straw mixtures with wood or peat pellets. The experimental study includes complex measurements of the elemental composition of the biomass pellets (wheat straw, wood, peat), DTG analysis of their thermal degradation, FTIR analysis of the composition of combustible volatiles entering the combustor, the flame temperature, the heat output of the device and the composition of the products, comparing these characteristics for mixtures with unprocessed and microwave pre-treated straw pellets. The results of the experimental study confirm that microwave pre-processing of straw activates the thermal decomposition of the mixtures, providing enhanced formation of combustible volatiles. This leads to improved combustion conditions in the flame reaction zone, thus completing the combustion of volatiles, increasing the flame temperature, the heat output of the device and the heat energy produced per mass of burned mixture, while decreasing the mass fraction of unburned volatiles in the products.
Software for Preprocessing Data From Rocket-Engine Tests
NASA Technical Reports Server (NTRS)
Cheng, Chiu-Fu
2002-01-01
Three computer programs have been written to preprocess digitized outputs of sensors during rocket-engine tests at Stennis Space Center (SSC). The programs apply exclusively to the SSC "E" test-stand complex and utilize the SSC file format. The programs are the following: 1) Engineering Units Generator (EUGEN) converts sensor-output-measurement data to engineering units. The inputs to EUGEN are raw binary test-data files, which include the voltage data, a list identifying the data channels, and time codes. EUGEN effects conversion by use of a file that contains calibration coefficients for each channel; 2) QUICKLOOK enables immediate viewing of a few selected channels of data, in contradistinction to viewing only after post-test processing (which can take 30 minutes to several hours, depending on the number of channels and other test parameters) of data from all channels. QUICKLOOK converts the selected data into a form in which they can be plotted in engineering units by use of Winplot (a free graphing program written by Rick Paris); and 3) EUPLOT provides a quick means of looking at data files generated by EUGEN without the necessity of relying on the PV-WAVE-based plotting software.
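The EUGEN conversion step amounts to evaluating per-channel calibration coefficients on the raw voltages. The sketch below assumes a simple polynomial calibration; the real SSC file format and coefficient layout are not documented here:

```python
import numpy as np

def to_engineering_units(volts, coeffs):
    """Convert one channel's voltage samples to engineering units using
    polynomial calibration coefficients (highest order first, as
    numpy.polyval expects). Purely illustrative of the EUGEN step."""
    return np.polyval(coeffs, volts)

# e.g. a hypothetical linear calibration: eu = 250.0 * v + 3.2
pressure = to_engineering_units(np.array([0.1, 0.5, 1.2]), [250.0, 3.2])
```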
Altimeter error sources at the 10-cm performance level
NASA Technical Reports Server (NTRS)
Martin, C. F.
1977-01-01
Error sources affecting the calibration and operational use of a 10 cm altimeter are examined to determine the magnitudes of current errors and the investigations necessary to reduce them to acceptable bounds. Errors considered include those affecting operational data pre-processing and those affecting altitude bias determination, with error budgets developed for both. The most significant error sources affecting pre-processing are bias calibration, propagation corrections for the ionosphere, and measurement noise. No ionospheric models are currently validated at the required 10-25% accuracy level. The optimum smoothing to reduce the effects of measurement noise is investigated and found to be on the order of one second, based on the TASC model of geoid undulations. The 10 cm calibrations are found to be feasible only through the use of altimeter passes at very high elevation relative to a tracking station that tracks very close to the time of the altimeter pass, such as a high-elevation pass across the island of Bermuda. By far the largest error source, based on the current state of the art, is the location of the island tracking station relative to mean sea level in the surrounding ocean areas.
Road sign recognition with fuzzy adaptive pre-processing models.
Lin, Chien-Chuan; Wang, Ming-Shi
2012-01-01
A road sign recognition system based on adaptive image pre-processing models using two fuzzy inference schemes has been proposed. The first fuzzy inference scheme checks the changes in light illumination and rich red color of a frame image using checking areas. The other checks the variance of the vehicle's speed and the angle of the steering wheel to select an adaptive size and position for the detection area. The AdaBoost classifier was employed to detect road sign candidates in an image, and the support vector machine technique was employed to recognize the content of the road sign candidates. Prohibitory and warning road traffic signs are the processing targets in this research. The detection rate in the detection phase is 97.42%. In the recognition phase, the recognition rate is 93.04%. The total accuracy rate of the system is 92.47%. For video sequences, the best accuracy rate is 90.54%, and the average accuracy rate is 80.17%. The average computing time is 51.86 milliseconds per frame. The proposed system can not only overcome the problems of low illumination and rich red color around road signs but also offer high detection rates and high computing performance.
Fernández, Roemi; Salinas, Carlota; Montes, Héctor; Sarria, Javier
2014-01-01
The motivation of this research was to explore the feasibility of detecting and locating fruits from different kinds of crops in natural scenarios. To this end, a unique, modular and easily adaptable multisensory system and a set of associated pre-processing algorithms are proposed. The proposed multisensory rig combines a high-resolution colour camera and a multispectral system for the detection of fruits and for the discrimination of the different elements of the plants, and a Time-Of-Flight (TOF) camera that provides fast acquisition of distances, enabling the localisation of the targets in the coordinate space. A controlled lighting system completes the set-up, increasing its flexibility for use in different working conditions. The pre-processing algorithms designed for the proposed multisensory system include a pixel-based classification algorithm that labels areas of interest belonging to fruits, and a registration algorithm that combines the results of the aforementioned classification algorithm with the data provided by the TOF camera for 3D reconstruction of the desired regions. Several experimental tests have been carried out under outdoor conditions in order to validate the capabilities of the proposed system. PMID:25615730
Learning-based image preprocessing for robust computer-aided detection
NASA Astrophysics Data System (ADS)
Raghupathi, Laks; Devarakota, Pandu R.; Wolf, Matthias
2013-03-01
Recent studies have shown that low-dose computed tomography (LDCT) can be an effective screening tool to reduce lung cancer mortality. Computer-aided detection (CAD) would be a beneficial second reader for radiologists in such cases. Studies demonstrate that while iterative reconstructions (IR) improve LDCT diagnostic quality, they nevertheless degrade CAD performance significantly (increased false positives) when applied directly. For improving CAD performance, solutions such as retraining with newer data or applying a standard preprocessing technique may not suffice due to the high prevalence of CT scanners and non-uniform acquisition protocols. Here, we present a learning-based framework that can adaptively transform a wide variety of input data to boost the performance of an existing CAD system. This not only enhances its robustness but also its applicability in clinical workflows. Our solution consists of automatically applying a suitable pre-processing filter to a given image based on its characteristics. This requires the preparation of ground truth (GT) for choosing an appropriate filter that results in improved CAD performance. Accordingly, we propose an efficient consolidation process with a novel metric. Using key anatomical landmarks, we then derive consistent feature descriptors for the classification scheme, which uses a priority mechanism to automatically choose an optimal preprocessing filter. We demonstrate CAD prototype performance improvement using hospital-scale datasets acquired from North America, Europe and Asia. Though we demonstrate our results for a lung nodule CAD system, this scheme is straightforward to extend to other post-processing tools dedicated to other organs and modalities.
Robust power spectral estimation for EEG data.
Melman, Tamar; Victor, Jonathan D
2016-08-01
Typical electroencephalogram (EEG) recordings often contain substantial artifact. These artifacts, often large and intermittent, can interfere with quantification of the EEG via its power spectrum. To reduce the impact of artifact, EEG records are typically cleaned by a preprocessing stage that removes individual segments or components of the recording. However, such preprocessing can introduce bias, discard available signal, and be labor-intensive. With this motivation, we present a method that uses robust statistics to reduce dependence on preprocessing by minimizing the effect of large intermittent outliers on the spectral estimates. Using the multitaper method (Thomson, 1982) as a starting point, we replaced the final step of the standard power spectrum calculation with a quantile-based estimator, and the Jackknife approach to confidence intervals with a Bayesian approach. The method is implemented in provided MATLAB modules, which extend the widely used Chronux toolbox. Using both simulated and human data, we show that in the presence of large intermittent outliers, the robust method produces improved estimates of the power spectrum, and that the Bayesian confidence intervals yield close-to-veridical coverage factors. The robust method, as compared to the standard method, is less affected by artifact: inclusion of outliers produces fewer changes in the shape of the power spectrum as well as in the coverage factor. In the presence of large intermittent outliers, the robust method can reduce dependence on data preprocessing as compared to standard methods of spectral estimation. Copyright © 2016 Elsevier B.V. All rights reserved.
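The essence of the robust step - replacing the mean across tapers with a quantile - can be sketched as follows. This simplified version omits the bias correction a quantile estimator needs (the median of chi-square-distributed eigenspectra underestimates the mean) and the Bayesian confidence intervals described in the paper:

```python
import numpy as np
from scipy.signal.windows import dpss

def robust_multitaper_psd(x, fs, nw=4.0):
    """Multitaper spectrum with the mean across tapers replaced by a
    median - a simple quantile-based, outlier-resistant variant."""
    n = len(x)
    tapers = dpss(n, nw, Kmax=int(2 * nw) - 1)        # (K, n) Slepians
    eig = np.abs(np.fft.rfft(tapers * x, axis=1)) ** 2  # K eigenspectra
    psd = np.median(eig, axis=0) / fs                 # robust combine
    freqs = np.fft.rfftfreq(n, d=1 / fs)
    return freqs, psd
```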
NASA Astrophysics Data System (ADS)
Ren, Ruizhi; Gu, Lingjia; Fu, Haoyang; Sun, Chenglin
2017-04-01
An effective super-resolution (SR) algorithm is proposed for actual spectral remote sensing images based on sparse representation and wavelet preprocessing. The proposed SR algorithm mainly consists of dictionary training and image reconstruction. Wavelet preprocessing is used to establish four subbands, i.e., low frequency, horizontal, vertical, and diagonal high frequency, for an input image. As compared to the traditional approaches involving the direct training of image patches, the proposed approach focuses on the training of features derived from these four subbands. The proposed algorithm is verified using different spectral remote sensing images, e.g., moderate-resolution imaging spectroradiometer (MODIS) images with different bands, and the latest Chinese Jilin-1 satellite images with high spatial resolution. According to the visual experimental results obtained from the MODIS remote sensing data, the SR images using the proposed SR algorithm are superior to those using a conventional bicubic interpolation algorithm or traditional SR algorithms without preprocessing. Fusion algorithms, e.g., standard intensity-hue-saturation, principal component analysis, wavelet transform, and the proposed SR algorithms are utilized to merge the multispectral and panchromatic images acquired by the Jilin-1 satellite. The effectiveness of the proposed SR algorithm is assessed by parameters such as peak signal-to-noise ratio, structural similarity index, correlation coefficient, root-mean-square error, relative dimensionless global error in synthesis, relative average spectral error, spectral angle mapper, and the quality index Q4, and its performance is better than that of the standard image fusion algorithms.
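The wavelet preprocessing step that produces the four subbands is standard; with PyWavelets, a single-level 2D DWT returns exactly the low-frequency, horizontal, vertical, and diagonal components (the dictionary training itself is not shown):

```python
import numpy as np
import pywt

img = np.random.rand(256, 256)             # stand-in remote sensing band
LL, (LH, HL, HH) = pywt.dwt2(img, "db2")   # low-freq + 3 detail subbands
# Training features would be drawn from these subbands rather than
# from raw image patches, as the proposed approach describes.
```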
Deuterated silicon nitride photonic devices for broadband optical frequency comb generation
NASA Astrophysics Data System (ADS)
Chiles, Jeff; Nader, Nima; Hickstein, Daniel D.; Yu, Su Peng; Briles, Travis Crain; Carlson, David; Jung, Hojoong; Shainline, Jeffrey M.; Diddams, Scott; Papp, Scott B.; Nam, Sae Woo; Mirin, Richard P.
2018-04-01
We report and characterize low-temperature, plasma-deposited deuterated silicon nitride thin films for nonlinear integrated photonics. With a peak processing temperature of less than 300 °C, the process is back-end compatible with pre-processed CMOS substrates. We achieve microresonators with a quality factor of up to 1.6×10^6 at 1552 nm, and >1.2×10^6 throughout λ = 1510-1600 nm, without annealing or stress management. We then demonstrate the immediate utility of this platform in nonlinear photonics by generating a 1 THz free-spectral-range, 900-nm-bandwidth modulation-instability microresonator Kerr comb and octave-spanning, supercontinuum-broadened spectra.
Plasmonic enhanced terahertz time-domain spectroscopy system for identification of common explosives
NASA Astrophysics Data System (ADS)
Demiraǧ, Yiǧit; Bütün, Bayram; Özbay, Ekmel
2017-05-01
In this study, we present a classification algorithm for terahertz time-domain spectroscopy (THz-TDS) systems that can be trained to identify the most commonly used explosives (C4, HMX, RDX, PETN, TNT, Composition B and black powder) and some non-explosive samples (lactose, sucrose, PABA). Our procedure can be used with any THz-TDS system that acquires either transmission or reflection spectra under room conditions. After preprocessing the signal in the low-THz regime (0.1-3 THz), our algorithm takes advantage of a latent-space transformation based on principal component analysis in order to classify explosives with a low false alarm rate.
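A generic latent-space classification pipeline of this kind is easy to assemble; the sketch below uses PCA followed by a linear SVM purely as an illustrative choice, since the paper's exact classifier and component count are not specified here:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# X: preprocessed 0.1-3 THz spectra (one row per sample), y: labels
X = np.random.rand(80, 300)                # stand-in spectra
y = np.random.randint(0, 10, size=80)      # 10 material classes

clf = make_pipeline(StandardScaler(),
                    PCA(n_components=10),  # latent-space transformation
                    SVC(kernel="linear"))  # assumed classifier choice
clf.fit(X, y)
labels = clf.predict(X)
```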
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hasanbeigi, Ali; Lu, Hongyou; Williams, Christopher
The purpose of this report is to describe international best practices for pre-processing and co-processing of MSW and sewage sludge in cement plants, for the benefit of countries that wish to develop co-processing capacity. The report is divided into three main sections. Section 2 describes the fundamentals of co-processing, Section 3 describes exemplary international regulatory and institutional frameworks for co-processing, and Section 4 describes international best practices related to the technological aspects of co-processing.
Performance Improvement of Power Analysis Attacks on AES with Encryption-Related Signals
NASA Astrophysics Data System (ADS)
Lee, You-Seok; Lee, Young-Jun; Han, Dong-Guk; Kim, Ho-Won; Kim, Hyoung-Nam
A power analysis attack is a well-known side-channel attack, but its efficiency is frequently degraded by the presence of power components, unrelated to the encryption, in the signals used for the attack. To enhance the performance of the power analysis attack, we propose a preprocessing method based on extracting the encryption-related parts from the measured power signals. Experimental results show that attacks with the preprocessed signals detect correct keys with far fewer signals, compared to conventional power analysis attacks.
Fast and Accurate Cell Tracking by a Novel Optical-Digital Hybrid Method
NASA Astrophysics Data System (ADS)
Torres-Cisneros, M.; Aviña-Cervantes, J. G.; Pérez-Careta, E.; Ambriz-Colín, F.; Tinoco, Verónica; Ibarra-Manzano, O. G.; Plascencia-Mora, H.; Aguilera-Gómez, E.; Ibarra-Manzano, M. A.; Guzman-Cabrera, R.; Debeir, Olivier; Sánchez-Mondragón, J. J.
2013-09-01
An innovative methodology to detect and track cells in microscope images enhanced by optical cross-correlation techniques is proposed in this paper. In order to increase the tracking sensitivity, image pre-processing has been implemented as a morphological operator on the microscope image. Results show that the pre-processing step allows for additional frames of cell tracking, thereby increasing its robustness. The proposed methodology can be used in analyzing different problems such as mitosis, cell collisions, and cell overlapping, and is ultimately designed to help identify and treat illnesses and malignancies.
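As a simplified stand-in for the morphological pre-processing plus cross-correlation (performed optically in the paper, digitally here), one might write:

```python
import numpy as np
from scipy.ndimage import grey_opening
from scipy.signal import fftconvolve

def preprocess(frame, size=5):
    """Grey-scale morphological opening to suppress small bright noise
    before correlation-based template matching."""
    return grey_opening(frame, size=(size, size))

def correlate(frame, template):
    """Cross-correlation surface; its peak locates the tracked cell."""
    t = template - template.mean()
    return fftconvolve(frame, t[::-1, ::-1], mode="same")
```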
NASA Astrophysics Data System (ADS)
Cox, Malcolm E.; James, Allan; Hawke, Amy; Raiber, Matthias
2013-05-01
Management of groundwater systems requires realistic conceptual hydrogeological models as a framework for numerical simulation modelling, but also for system understanding and for communicating this understanding to stakeholders and the broader community. To help overcome these challenges we developed GVS (Groundwater Visualisation System), a stand-alone desktop software package that uses interactive 3D visualisation and animation techniques. The goal was a user-friendly groundwater management tool that could support a range of existing real-world and pre-processed data, both surface and subsurface, including geology and various types of temporal hydrological information. GVS allows these data to be integrated into a single conceptual hydrogeological model. In addition, 3D geological models produced externally using other software packages can readily be imported into GVS models, as can outputs of simulations (e.g. piezometric surfaces) produced by software such as MODFLOW or FEFLOW. Boreholes can be integrated, showing any down-hole data and properties, including screen information, intersected geology, water level data and water chemistry. Animation is used to display spatial and temporal changes, with time-series data such as rainfall, standing water levels and electrical conductivity displaying dynamic processes. Time and space variations can be presented using a range of contouring and colour mapping techniques, in addition to interactive plots of time-series parameters. Other types of data, for example demographics and cultural information, can also be readily incorporated. The GVS software can execute on a standard Windows or Linux-based PC with a minimum of 2 GB RAM, and the model output is easy and inexpensive to distribute, by download or via USB/DVD/CD. Example models are described here for three groundwater systems in Queensland, northeastern Australia: two unconfined alluvial groundwater systems with intensive irrigation, the Lockyer Valley and the upper Condamine Valley, and the Surat Basin, a large sedimentary basin of confined artesian aquifers. This latter example required more detail in the hydrostratigraphy, correlation of formations with drillholes and visualisation of simulated piezometric surfaces. Both alluvial system GVS models were developed during drought conditions to support government strategies to implement groundwater management. The Surat Basin model was industry-sponsored research for coal seam gas groundwater management and community information and consultation. The "virtual" groundwater systems in these 3D GVS models can be interactively interrogated by standard functions, plus production of 2D cross-sections, data selection from the 3D scene, back-end database access and plot displays. A unique feature is that GVS allows investigation of time-series data across different display modes, both 2D and 3D. GVS has been used successfully as a tool to enhance community/stakeholder understanding and knowledge of groundwater systems and is of value for training and educational purposes. Projects completed confirm that GVS provides powerful support for management and decision making, and serves as a tool for interpretation of groundwater system hydrological processes. A highly effective visualisation output is the production of short videos (e.g. 2-5 min) based on sequences of camera 'fly-throughs' and screen images. Further work involves developing support for multi-screen displays and touch-screen technologies, distributed rendering, and gestural interaction systems.
To highlight the visualisation and animation capability of the GVS software, links to related multimedia hosted online sites are included in the references.
The combination of satellite observation techniques for sequential ionosphere VTEC modeling
NASA Astrophysics Data System (ADS)
Erdogan, Eren; Limberger, Marco; Schmidt, Michael; Seitz, Florian; Dettmering, Denise; Börger, Klaus; Brandert, Sylvia; Görres, Barbara; Kersten, Wilhelm F.; Bothmer, Volker; Hinrichs, Johannes; Venzmer, Malte; Mrotzek, Niclas
2016-04-01
The project OPTIMAP is a joint initiative by the Bundeswehr GeoInformation Centre (BGIC), the German Space Situational Awareness Centre (GSSAC), the German Geodetic Research Institute of the Technical University of Munich (DGFI-TUM) and the Institute for Astrophysics at the University of Göttingen (IAG). The main goal is to develop an operational tool for ionospheric mapping and prediction (OPTIMAP). A key feature of the project is the combination of different satellite observation techniques to improve the spatio-temporal data coverage and the sensitivity for selected target parameters. In the current status, information about the vertical total electron content (VTEC) is derived from the dual frequency signal processing of four techniques: (1) Terrestrial observations of GPS and GLONASS ensure the high-resolution coverage of continental regions, (2) the satellite altimetry mission Jason-2 is taken into account to provide VTEC in nadir direction along the satellite tracks over the oceans, (3) GPS radio occultations to Formosat-3/COSMIC are exploited for the retrieval of electron density profiles that are integrated to obtain VTEC and (4) Jason-2 carrier-phase observations tracked by the on-board DORIS receiver are processed to determine the relative VTEC. All measurements are sequentially pre-processed in hourly batches serving as input data of a Kalman filter (KF) for modeling the global VTEC distribution. The KF runs in a predictor-corrector mode allowing for the sequential processing of the measurements where update steps are performed with one-minute sampling in the current configuration. The spatial VTEC distribution is represented by B-spline series expansions, i.e., the corresponding B-spline series coefficients together with additional technique-dependent unknowns such as Differential Code Biases and Intersystem Biases are estimated by the KF. As a preliminary solution, the prediction model to propagate the filter state through time is defined by a random walk.
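A minimal sketch of such a predictor-corrector Kalman filter with a random-walk state model is shown below; the state vector stands for the B-spline VTEC coefficients, and the measurement matrices H mapping coefficients to each epoch's pre-processed VTEC observations are assumed given (this is an illustration of the filtering structure, not the OPTIMAP implementation):

```python
import numpy as np

def kf_random_walk(z_batches, H_batches, R, Q):
    """Predictor-corrector Kalman filter with a random-walk state model:
    x_k = x_{k-1} + w_k, w_k ~ N(0, Q). Yields the state per epoch."""
    n = Q.shape[0]
    x, P = np.zeros(n), np.eye(n) * 1e3      # diffuse initial state
    for z, H in zip(z_batches, H_batches):   # one update per epoch
        P = P + Q                            # predict (transition F = I)
        S = H @ P @ H.T + R                  # innovation covariance
        K = P @ H.T @ np.linalg.inv(S)       # Kalman gain
        x = x + K @ (z - H @ x)              # correct with measurements
        P = (np.eye(n) - K @ H) @ P
        yield x.copy()
```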
Chen, Szi-Wen; Chen, Yuan-Ho
2015-01-01
In this paper, a discrete wavelet transform (DWT) based de-noising method and its application to noise reduction for medical signal preprocessing are introduced. This work focuses on the hardware realization of a real-time wavelet de-noising procedure. The proposed de-noising circuit mainly consists of three modules: a DWT, a thresholding, and an inverse DWT (IDWT) modular circuit. We also propose a novel adaptive thresholding scheme and incorporate it into our wavelet de-noising procedure. Performance was then evaluated on both the software and hardware architectural designs. In addition, the de-noising circuit was also implemented by downloading the Verilog codes to a field programmable gate array (FPGA) based platform so that its ability in noise reduction could be further validated in actual practice. Simulation experiment results produced by applying a set of simulated noise-contaminated electrocardiogram (ECG) signals to the de-noising circuit showed that the circuit could not only desirably meet the requirement of real-time processing, but also achieve satisfactory performance for noise reduction, while the sharp features of the ECG signals are well preserved. The proposed de-noising circuit was further synthesized using the Synopsys Design Compiler with an Artisan Taiwan Semiconductor Manufacturing Company (TSMC, Hsinchu, Taiwan) 40 nm standard cell library. The integrated circuit (IC) synthesis simulation results showed that the proposed design can achieve a clock frequency of 200 MHz with a power consumption of only 17.4 mW when operated at 200 MHz. PMID:26501290
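The software side of such a DWT de-noising chain is compact; the sketch below substitutes the well-known universal threshold for the paper's hardware-oriented adaptive thresholding scheme, which is not reproduced here:

```python
import numpy as np
import pywt

def dwt_denoise(sig, wavelet="db4", level=4):
    """Soft-threshold DWT de-noising (decompose, threshold details,
    reconstruct). The universal threshold stands in for the paper's
    adaptive scheme."""
    coeffs = pywt.wavedec(sig, wavelet, level=level)
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745    # noise estimate
    thr = sigma * np.sqrt(2 * np.log(len(sig)))       # universal threshold
    coeffs[1:] = [pywt.threshold(c, thr, mode="soft") for c in coeffs[1:]]
    return pywt.waverec(coeffs, wavelet)
```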
Lieb, Florian; Stark, Hans-Georg; Thielemann, Christiane
2017-06-01
Spike detection from extracellular recordings is a crucial preprocessing step when analyzing neuronal activity. The decision of whether a specific part of the signal is a spike or not is important for all subsequent processing steps, such as spike sorting or burst detection, in order to reduce the number of erroneously identified spikes. Many spike detection algorithms have already been suggested, all working reasonably well whenever the signal-to-noise ratio is large enough. When the noise level is high, however, these algorithms perform poorly. In this paper we present two new spike detection algorithms. The first is based on a stationary wavelet energy operator and the second is based on the time-frequency representation of spikes. Both algorithms are more reliable than the most commonly used methods. The performance of the algorithms is confirmed using simulated data resembling original data recorded from cortical neurons with multielectrode arrays. In order to demonstrate that the performance of the algorithms is not restricted to one specific set of data, we also verify the performance using a publicly available simulated data set. We show that both proposed algorithms have the best performance of all tested methods, regardless of the signal-to-noise ratio, in both data sets. This contribution will benefit electrophysiological investigations of human cells; in particular, the spatial and temporal analysis of neural network communication is improved by using the proposed spike detection algorithms.
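As a rough sketch of the first idea - thresholding a stationary-wavelet energy signal, not the authors' exact operator - one could write:

```python
import numpy as np
import pywt

def swt_energy_spikes(sig, wavelet="sym5", level=3, k=5.0):
    """Flag candidate spike samples by thresholding the squared detail
    coefficients of one stationary wavelet transform (SWT) band."""
    n = len(sig) - len(sig) % 2 ** level       # SWT needs this length
    coeffs = pywt.swt(sig[:n], wavelet, level=level)
    detail = coeffs[0][1]                      # coarsest-level details
    energy = detail ** 2
    thr = k * np.median(np.abs(energy)) / 0.6745   # MAD-based threshold
    return np.flatnonzero(energy > thr)        # candidate spike indices
```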
Evaluation of segmentation algorithms for optical coherence tomography images of ovarian tissue
NASA Astrophysics Data System (ADS)
Sawyer, Travis W.; Rice, Photini F. S.; Sawyer, David M.; Koevary, Jennifer W.; Barton, Jennifer K.
2018-02-01
Ovarian cancer has the lowest survival rate among all gynecologic cancers due to predominantly late diagnosis. Early detection of ovarian cancer can increase 5-year survival rates from 40% up to 92%, yet no reliable early detection techniques exist. Optical coherence tomography (OCT) is an emerging technique that provides depth-resolved, high-resolution images of biological tissue in real time and demonstrates great potential for imaging of ovarian tissue. Mouse models are crucial to quantitatively assess the diagnostic potential of OCT for ovarian cancer imaging; however, due to small organ size, the ovaries must first be separated from the image background using the process of segmentation. Manual segmentation is time-intensive, as OCT yields three-dimensional data. Furthermore, speckle noise complicates OCT images, frustrating many processing techniques. While much work has investigated noise reduction and automated segmentation for retinal OCT imaging, little has considered the application to the ovaries, which exhibit higher variance and inhomogeneity than the retina. To address these challenges, we evaluated a set of algorithms to segment OCT images of mouse ovaries. We examined five preprocessing techniques and six segmentation algorithms. While all pre-processing methods improve segmentation, Gaussian filtering is most effective, showing an improvement of 32% +/- 1.2%. Of the segmentation algorithms, active contours performs best, segmenting with an accuracy of 0.948 +/- 0.012 compared with manual segmentation (1.0 being identical). Nonetheless, further optimization could lead to maximizing the performance for segmenting OCT images of the ovaries.
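A minimal version of the winning combination (Gaussian pre-filtering followed by an active contour) can be assembled from standard components; the Chan-Vese implementation below is a generic stand-in for the active-contour method evaluated in the paper:

```python
import numpy as np
from scipy.ndimage import gaussian_filter
from skimage.segmentation import chan_vese

def segment_oct(bscan, sigma=2.0):
    """Gaussian pre-filtering (the best performer in the study) followed
    by a Chan-Vese active contour; returns a boolean tissue mask."""
    smoothed = gaussian_filter(bscan.astype(float), sigma=sigma)
    smoothed = (smoothed - smoothed.min()) / np.ptp(smoothed)
    return chan_vese(smoothed)
```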
Fully automated MR liver volumetry using watershed segmentation coupled with active contouring.
Huynh, Hieu Trung; Le-Trong, Ngoc; Bao, Pham The; Oto, Aytek; Suzuki, Kenji
2017-02-01
Our purpose is to develop a fully automated scheme for liver volume measurement in abdominal MR images, without requiring any user input or interaction. The proposed scheme is fully automatic for liver volumetry from 3D abdominal MR images, and it consists of three main stages: preprocessing, rough liver shape generation, and liver extraction. The preprocessing stage reduced noise and enhanced the liver boundaries in 3D abdominal MR images. The rough liver shape was revealed fully automatically by using the watershed segmentation, thresholding transform, morphological operations, and statistical properties of the liver. An active contour model was applied to refine the rough liver shape to precisely obtain the liver boundaries. The liver volumes calculated by the proposed scheme were compared to the "gold standard" references which were estimated by an expert abdominal radiologist. The liver volumes computed by using our developed scheme excellently agreed (Intra-class correlation coefficient was 0.94) with the "gold standard" manual volumes by the radiologist in the evaluation with 27 cases from multiple medical centers. The running time was 8.4 min per case on average. We developed a fully automated liver volumetry scheme in MR, which does not require any interaction by users. It was evaluated with cases from multiple medical centers. The liver volumetry performance of our developed system was comparable to that of the gold standard manual volumetry, and it saved radiologists' time for manual liver volumetry of 24.7 min per case.
Real-time, haptics-enabled simulator for probing ex vivo liver tissue.
Lister, Kevin; Gao, Zhan; Desai, Jaydev P
2009-01-01
The advent of complex surgical procedures has driven the need for realistic surgical training simulators. Comprehensive simulators that provide realistic visual and haptic feedback during surgical tasks are required to familiarize surgeons with the procedures they are to perform. Complex organ geometry inherent to biological tissues and intricate material properties drive the need for finite element methods to assure accurate tissue displacement and force calculations. Advances in real-time finite element methods have not reached the state where they are applicable to soft tissue surgical simulation. Therefore a real-time, haptics-enabled simulator for probing of soft tissue has been developed which utilizes preprocessed finite element data (derived from accurate constitutive model of the soft-tissue obtained from carefully collected experimental data) to accurately replicate the probing task in real-time.
A novel time-domain signal processing algorithm for real time ventricular fibrillation detection
NASA Astrophysics Data System (ADS)
Monte, G. E.; Scarone, N. C.; Liscovsky, P. O.; Rotter S/N, P.
2011-12-01
This paper presents an application of a novel algorithm for real-time detection of ECG pathologies, especially ventricular fibrillation. It is based on a segmentation and labeling process applied to an oversampled signal. After this treatment, global signal behaviours are obtained by analyzing the sequence of segments, in much the same way as a human observer would. The entire process can be seen as morphological filtering after smart data sampling. The algorithm does not require any pre-processing of the digital ECG signal, and its computational cost is low, so it can be embedded into sensors for wearable and permanent applications. The proposed algorithm's output could serve as the input signal description for expert systems or artificial intelligence software in order to detect other pathologies.
Preprocessing of gene expression data by optimally robust estimators
2010-01-01
Background The preprocessing of gene expression data obtained from several platforms routinely includes the aggregation of multiple raw signal intensities to one expression value. Examples are the computation of a single expression measure based on the perfect match (PM) and mismatch (MM) probes for the Affymetrix technology, the summarization of bead level values to bead summary values for the Illumina technology or the aggregation of replicated measurements in the case of other technologies including real-time quantitative polymerase chain reaction (RT-qPCR) platforms. The summarization of technical replicates is also performed in other "-omics" disciplines like proteomics or metabolomics. Preprocessing methods like MAS 5.0, Illumina's default summarization method, RMA, or VSN show that the use of robust estimators is widely accepted in gene expression analysis. However, the selection of robust methods seems to be mainly driven by their high breakdown point and not by efficiency. Results We describe how optimally robust radius-minimax (rmx) estimators, i.e. estimators that minimize an asymptotic maximum risk on shrinking neighborhoods about an ideal model, can be used for the aggregation of multiple raw signal intensities to one expression value for Affymetrix and Illumina data. With regard to the Affymetrix data, we have implemented an algorithm which is a variant of MAS 5.0. Using datasets from the literature and Monte-Carlo simulations we provide some reasoning for assuming approximate log-normal distributions of the raw signal intensities by means of the Kolmogorov distance, at least for the discussed datasets, and compare the results of our preprocessing algorithms with the results of Affymetrix's MAS 5.0 and Illumina's default method. The numerical results indicate that when using rmx estimators an accuracy improvement of about 10-20% is obtained compared to Affymetrix's MAS 5.0 and about 1-5% compared to Illumina's default method. The improvement is also visible in the analysis of technical replicates where the reproducibility of the values (in terms of Pearson and Spearman correlation) is increased for all Affymetrix and almost all Illumina examples considered. Our algorithms are implemented in the R package named RobLoxBioC which is publicly available via CRAN, The Comprehensive R Archive Network (http://cran.r-project.org/web/packages/RobLoxBioC/). Conclusions Optimally robust rmx estimators have a high breakdown point and are computationally feasible. They can lead to a considerable gain in efficiency for well-established bioinformatics procedures and thus, can increase the reproducibility and power of subsequent statistical analysis. PMID:21118506
Historical Feature Pattern Extraction Based Network Attack Situation Sensing Algorithm
Zeng, Yong; Liu, Dacheng; Lei, Zhou
2014-01-01
A network-security situation sequence contains complicated, multivariate random trends that are sudden and uncertain, and whose underlying principles are difficult for traditional algorithms to recognize and describe. Estimating the parameters of such very long situation sequences is essential but difficult, so this paper proposes a situation prediction method based on historical feature pattern extraction (HFPE). First, the HFPE algorithm searches the recorded situation sequence for indications similar to the current one and weighs the link intensity between each past indication and its subsequent effect. It then calculates the probability that a given effect will reappear given the current indication and makes a weighted prediction. In addition, HFPE includes an evolution algorithm that derives the prediction deviation in terms of both pattern and accuracy; this algorithm continuously improves the adaptability of HFPE through gradual fine-tuning. The method largely preserves the rules present in the sequence, requires no data preprocessing, and can continuously track and adapt to variations in the situation sequence. PMID:24892054
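A toy version of the HFPE prediction step, under the assumption that an "indication" is simply the last k values of the sequence and that link intensity can be approximated by inverse window distance; the published method's weighting and evolution algorithm are more elaborate.

```python
import numpy as np

def hfpe_predict(history, k=5, top_m=10):
    """Find the windows of the recorded sequence most similar to the
    current indication (the last k values), then weight the value that
    followed each matching window by match quality. Distances and
    weights here are illustrative simplifications."""
    history = np.asarray(history, dtype=float)
    probe = history[-k:]
    candidates = []
    for i in range(len(history) - k - 1):
        window = history[i:i + k]
        d = np.linalg.norm(window - probe)
        candidates.append((d, history[i + k]))  # effect after indication
    candidates.sort(key=lambda t: t[0])
    best = candidates[:top_m]
    weights = np.array([1.0 / (1.0 + d) for d, _ in best])
    effects = np.array([e for _, e in best])
    return float(np.sum(weights * effects) / np.sum(weights))
```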
A comparison of 1D and 2D LSTM architectures for the recognition of handwritten Arabic
NASA Astrophysics Data System (ADS)
Yousefi, Mohammad Reza; Soheili, Mohammad Reza; Breuel, Thomas M.; Stricker, Didier
2015-01-01
In this paper, we present an Arabic handwriting recognition method based on recurrent neural networks. We use the Long Short-Term Memory (LSTM) architecture, which has proven successful in various printed and handwritten OCR tasks. Applications of LSTM to handwriting recognition typically employ the two-dimensional architecture to deal with variations along both the vertical and horizontal axes. However, we show that with a simple pre-processing step that normalizes the position and baseline of letters, we can use a 1D LSTM, which is faster to train and converge, and yet achieve superior performance. In a series of experiments on the IFN/ENIT database for Arabic handwriting recognition, we demonstrate that our proposed pipeline can outperform 2D LSTM networks. Furthermore, we provide comparisons with 1D LSTM networks trained on manually crafted features to show that the features learned automatically by a globally trained 1D LSTM network with our normalization step can even outperform such systems.
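A minimal sketch of the kind of position/baseline normalization the paper relies on, assuming a text-line image with ink as positive values: each column is shifted so its vertical center of mass sits on a common middle row, after which each column of the normalized image can be fed to a 1D LSTM as one timestep. The fixed output height and center-of-mass rule are illustrative choices.

```python
import numpy as np

def normalize_line(img, out_h=48):
    """Position/baseline normalization for a text-line image (ink as
    positive values): shift each column so the ink's vertical center
    of mass sits on the middle row, then crop/pad to a fixed height."""
    h, w = img.shape
    out = np.zeros((out_h, w), dtype=img.dtype)
    mid = out_h // 2
    rows = np.arange(h)
    for x in range(w):
        col = img[:, x]
        mass = col.sum()
        if mass == 0:
            continue  # empty column: leave as background
        center = int(round((rows * col).sum() / mass))
        shift = mid - center
        src_lo, src_hi = max(0, -shift), min(h, out_h - shift)
        dst_lo = src_lo + shift
        out[dst_lo:dst_lo + (src_hi - src_lo), x] = col[src_lo:src_hi]
    return out
```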
Classifying Medical Images Using Morphological Appearance Manifolds.
Varol, Erdem; Gaonkar, Bilwaj; Davatzikos, Christos
2013-12-31
Input features for medical image classification algorithms are extracted from raw images using a series of pre-processing steps. One common preprocessing step in computational neuroanatomy and functional brain mapping is the nonlinear registration of raw images to a common template space. Typically, the registration methods used are parametric, and their output varies greatly with changes in parameters. Most previously reported results perform registration with a fixed parameter setting and use the output as input to the subsequent classification step. The variation in registration results due to the choice of parameters thus translates into variation in the performance of classifiers that depend on the registration step for input. Analogous issues have been investigated in the computer vision literature, where image appearance varies with pose and illumination, making classification vulnerable to these confounding parameters. The proposed methodology addresses this issue by sampling image appearances as the registration parameters vary, and shows that better classification accuracies can be obtained this way compared to the conventional approach.
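The idea can be caricatured in a few lines: rather than committing to one registration parameter setting, sample several settings per image and pool the resulting feature vectors before training. Here `feature_fn` (register with setting p, then extract features) is a placeholder for the pipeline's registration and feature extraction steps, and the classifier choice is arbitrary.

```python
import numpy as np
from sklearn.svm import LinearSVC

def train_on_appearance_samples(feature_fn, images, labels, params):
    """Train a classifier on image appearances sampled across several
    registration parameter settings, so it sees the span of appearances
    rather than one arbitrary point on the appearance manifold."""
    X, y = [], []
    for img, lab in zip(images, labels):
        for p in params:
            X.append(feature_fn(img, p))  # register under setting p, extract features
            y.append(lab)
    clf = LinearSVC(C=0.1)
    clf.fit(np.array(X), np.array(y))
    return clf
```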
NASA Astrophysics Data System (ADS)
Fleishman, G. D.; Anfinogentov, S.; Loukitcheva, M.; Mysh'yakov, I.; Stupishin, A.
2017-12-01
Measuring and modeling the coronal magnetic field, especially above active regions (ARs), remains one of the central problems of solar physics, given that solar coronal magnetism is the key driver of all solar activity. Nowadays the coronal magnetic field is often modelled using methods of nonlinear force-free field reconstruction, whose accuracy has not yet been comprehensively assessed. Given that coronal magnetic probing is routinely unavailable, only morphological tests, plus a few direct tests using an available semi-analytical force-free field solution, have been applied to evaluate the performance of the reconstruction methods. Here we report a detailed assessment of the various tools used for nonlinear force-free field reconstruction, such as disambiguation methods, photospheric field preprocessing methods, and volume reconstruction methods, in a 3D domain using a 3D snapshot of a publicly available full-fledged radiative MHD model. We take advantage of the fact that the realistic MHD model gives us the magnetic field vector distribution in the entire 3D domain, which enables us to perform a "voxel-by-voxel" comparison of the restored magnetic field and the true magnetic field in the 3D model volume. Our tests show that the available disambiguation methods often fail in quiet-Sun areas, where the magnetic structure is dominated by small-scale magnetic elements, while they work well at the AR photosphere and (even better) chromosphere. The preprocessing of the photospheric magnetic field, although it does produce a more force-free boundary condition, also results in an effective 'elevation' of the magnetic field components. This effective 'elevation' height turns out to be different for the longitudinal and transverse components of the magnetic field, which results in a systematic error in the absolute heights of the reconstructed magnetic data cube. Extrapolations performed starting from the actual AR photospheric magnetogram (i.e., without preprocessing) are free from this systematic error, while their other metrics are either comparable to or only marginally worse than those estimated for extrapolations from the preprocessed magnetograms. This finding favors the use of extrapolations from the original photospheric magnetogram without preprocessing.
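The paper's exact metrics are not listed here, but voxel-by-voxel comparisons of a reconstructed field against a true field are conventionally summarized with quantities such as the vector correlation C_vec and the normalized vector error E_n; a sketch, assuming fields stored as (nx, ny, nz, 3) arrays:

```python
import numpy as np

def field_metrics(B_true, B_rec):
    """Voxel-by-voxel agreement metrics between a true and a
    reconstructed magnetic field, each of shape (nx, ny, nz, 3).
    C_vec and E_n are standard extrapolation benchmarks; the exact
    metrics used in the study may differ."""
    Bt = B_true.reshape(-1, 3)
    Br = B_rec.reshape(-1, 3)
    dot = np.sum(Bt * Br, axis=1)
    nt = np.linalg.norm(Bt, axis=1)
    nr = np.linalg.norm(Br, axis=1)
    c_vec = dot.sum() / np.sqrt((nt ** 2).sum() * (nr ** 2).sum())
    e_n = np.linalg.norm(Br - Bt, axis=1).sum() / nt.sum()
    return c_vec, e_n  # perfect reconstruction: C_vec = 1, E_n = 0
```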
OCR enhancement through neighbor embedding and fast approximate nearest neighbors
NASA Astrophysics Data System (ADS)
Smith, D. C.
2012-10-01
Generic optical character recognition (OCR) engines often perform very poorly in transcribing scanned low-resolution (LR) text documents. To improve OCR performance, we apply the Neighbor Embedding (NE) single-image super-resolution (SISR) technique to LR scanned text documents to obtain high-resolution (HR) versions, which we subsequently process with OCR. For comparison, we repeat this procedure using bicubic interpolation (BI). We demonstrate that mean-square errors (MSE) in NE HR estimates do not increase substantially when NE is trained in one Latin font style and tested in another, provided both styles belong to the same font category (serif or sans serif). This is very important in practice, since for each font size, the number of training sets required for each category may be reduced from dozens to just one. We also incorporate randomized k-d trees into our NE implementation to perform approximate nearest neighbor search, and obtain a 1000x speed-up over our original NE implementation with negligible MSE degradation. This acceleration also made it practical to combine all of our size-specific NE Latin models into a single Universal Latin Model (ULM). The ULM eliminates the need to determine the unknown font category and size of an input LR text document and match it to an appropriate model, a very challenging task, since the dpi (pixels per inch) of the input LR image is generally unknown. Our experiments show that OCR character error rates (CER) were over 90% when we applied the Tesseract OCR engine to LR text documents (scanned at 75 dpi and 100 dpi) in the 6-10 pt range. By contrast, using k-d trees and the ULM, CER after NE preprocessing averaged less than 7% at 3x (100 dpi LR scanning) and 4x (75 dpi LR scanning) magnification, over an order of magnitude improvement. Moreover, CER after NE preprocessing was more than 6 times lower on average than after BI preprocessing.
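A sketch of the core NE step under standard locally-linear-embedding assumptions: for each LR patch, find its k nearest LR training patches, solve for sum-to-one reconstruction weights, and apply those weights to the paired HR patches. scipy's cKDTree stands in for the randomized k-d trees used in the paper, and k and the regularizer are illustrative.

```python
import numpy as np
from scipy.spatial import cKDTree

def ne_superresolve_patch(lr_patch, lr_train, hr_train, k=5, reg=1e-6):
    """Neighbor-embedding SR for one LR patch: express it as a convex-like
    combination of its k nearest LR training patches and apply the same
    weights to the paired HR patches. lr_train: (N, d_lr); hr_train: (N, d_hr)."""
    tree = cKDTree(lr_train)           # in practice, build once and reuse
    _, idx = tree.query(lr_patch, k=k)
    Z = lr_train[idx] - lr_patch       # neighbors centered on the query patch
    G = Z @ Z.T + reg * np.eye(k)      # local Gram matrix, regularized
    w = np.linalg.solve(G, np.ones(k))
    w /= w.sum()                       # sum-to-one constraint (as in LLE)
    return w @ hr_train[idx]           # HR estimate for this patch
```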
Huang, Ming-Xiong; Anderson, Bill; Huang, Charles W.; Kunde, Gerd J.; Vreeland, Erika C.; Huang, Jeffrey W.; Matlashov, Andrei N.; Karaulanov, Todor; Nettles, Christopher P.; Gomez, Andrew; Minser, Kayla; Weldon, Caroline; Paciotti, Giulio; Harsh, Michael; Lee, Roland R.; Flynn, Edward R.
2017-01-01
Superparamagnetic relaxometry (SPMR) is a highly sensitive technique for the in vivo detection of tumor cells and may improve early-stage detection of cancers. SPMR employs superparamagnetic iron oxide nanoparticles (SPION). After a brief magnetizing pulse is used to align the SPION, SPMR measures the time decay of the SPION signal using superconducting quantum interference device (SQUID) sensors. Substantial research has been carried out in developing the SQUID hardware and in improving the properties of the SPION. However, little research has been done on the pre-processing of sensor signals and post-processing source modeling in SPMR. In the present study, we illustrate new pre-processing tools that were developed to: 1) remove trials contaminated with artifacts, 2) evaluate and ensure that a single decay process associated with bound SPION exists in the data, 3) automatically detect and correct flux jumps, and 4) accurately fit the sensor signals with different decay models. Furthermore, we developed an automated approach based on a multi-start dipole imaging technique to obtain the locations and magnitudes of multiple magnetic sources without initial guesses from the users. A regularization process was implemented to resolve the ambiguity in the SPMR source variables. A procedure based on a reduced chi-square cost function was introduced to objectively determine the number of dipoles adequate to describe the data. The new pre-processing tools and the multi-start source imaging approach were successfully evaluated using phantom data. In conclusion, these tools and the multi-start source modeling approach substantially enhance the accuracy and sensitivity of detecting and localizing sources from SPMR signals. Furthermore, the multi-start approach with regularization provided robust and accurate solutions under poor SNR conditions, comparable to an SPMR detection sensitivity on the order of 1000 cells. We believe such algorithms will help establish industrial standards for SPMR as the technique is applied in pre-clinical and clinical settings. PMID:28072579
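The specific decay models are not reproduced here; as an illustration, the sketch below fits one commonly assumed logarithmic decay form to a sensor time series and reports the reduced chi-square, the kind of cost function the paper uses to judge how well a model (or a given number of dipoles) describes the data. The model form and initial guesses are assumptions.

```python
import numpy as np
from scipy.optimize import curve_fit

def log_decay(t, a, b, t0):
    """Illustrative logarithmic decay model for bound-SPION relaxation."""
    return a - b * np.log(1.0 + t / t0)

def fit_decay(t, signal, sigma):
    """Fit a decay model to one sensor's time series and report the
    reduced chi-square, which can be compared across candidate models
    to pick the most parsimonious fit."""
    t = np.asarray(t, dtype=float)
    signal = np.asarray(signal, dtype=float)
    popt, _ = curve_fit(log_decay, t, signal,
                        p0=(signal[0], 1.0, 1e-3),
                        sigma=np.full(len(signal), float(sigma)),
                        maxfev=10000)
    resid = (signal - log_decay(t, *popt)) / sigma
    dof = len(t) - len(popt)              # degrees of freedom
    return popt, float(np.sum(resid ** 2) / dof)
```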
Li, Yuanpeng; Li, Fucui; Yang, Xinhao; Guo, Liu; Huang, Furong; Chen, Zhenqiang; Chen, Xingdan; Zheng, Shifu
2018-08-05
A rapid quantitative analysis model for determining glycated albumin (GA) content, based on attenuated total reflectance Fourier transform infrared (ATR-FTIR) spectroscopy combined with linear SiPLS and nonlinear SVM, has been developed. First, the true GA content in human serum was determined by the GA enzymatic method, and ATR-FTIR spectra of serum samples from a health-examination population were obtained. The spectral data of the whole mid-infrared region (4000-600 cm-1) and of GA's characteristic region (1800-800 cm-1) were used for quantitative analysis. Second, several preprocessing steps, including first derivative, second derivative, variable standardization and spectral normalization, were performed. Finally, quantitative regression models were established using SiPLS and SVM, respectively. The SiPLS modeling results are as follows: root mean square error of cross validation (RMSECV_T) = 0.523 g/L, calibration coefficient (R_C) = 0.937, root mean square error of prediction (RMSEP_T) = 0.787 g/L, and prediction coefficient (R_P) = 0.938. The SVM modeling results are as follows: RMSECV_T = 0.0048 g/L, R_C = 0.998, RMSEP_T = 0.442 g/L, and R_P = 0.916. The results indicate that model performance improved significantly after preprocessing and optimization of the characteristic regions, and that the modeling performance of the nonlinear SVM was considerably better than that of the linear SiPLS. Hence, the quantitative analysis model for GA in human serum based on ATR-FTIR combined with SiPLS and SVM is effective. It requires no sample pretreatment, is simple to operate and time-efficient, and provides a rapid and accurate method for GA content determination. Copyright © 2018 Elsevier B.V. All rights reserved.
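A minimal sketch of the SVM branch of such a model, assuming spectra rows paired with reference GA values: restrict to the characteristic region, apply a per-spectrum standardization as a stand-in for the paper's preprocessing menu, and cross-validate an RBF SVM. Hyperparameters are illustrative, not the tuned values behind the reported figures.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import mean_squared_error

def snv(spectra):
    """Per-spectrum standardization (standard normal variate style)."""
    mu = spectra.mean(axis=1, keepdims=True)
    sd = spectra.std(axis=1, keepdims=True)
    return (spectra - mu) / sd

def fit_ga_model(spectra, ga, wavenumbers, lo=800, hi=1800):
    """Restrict to the characteristic region, preprocess, and fit a
    nonlinear SVM regression; reports a cross-validated RMSE."""
    region = (wavenumbers >= lo) & (wavenumbers <= hi)
    X = snv(spectra[:, region])
    model = SVR(kernel='rbf', C=10.0, epsilon=0.1)
    pred = cross_val_predict(model, X, ga, cv=5)
    rmsecv = np.sqrt(mean_squared_error(ga, pred))
    model.fit(X, ga)
    return model, rmsecv
```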
Despeckle filtering software toolbox for ultrasound imaging of the common carotid artery.
Loizou, Christos P; Theofanous, Charoula; Pantziaris, Marios; Kasparis, Takis
2014-04-01
Ultrasound imaging of the common carotid artery (CCA) is a non-invasive tool used in medicine to assess the severity of atherosclerosis and monitor its progression through time. It is also used in border detection and texture characterization of the atherosclerotic carotid plaque in the CCA, and in the identification and measurement of the intima-media thickness (IMT) and the lumen diameter, all of which are very important in the assessment of cardiovascular disease (CVD). Visual perception, however, is hindered by speckle, a multiplicative noise that degrades the quality of ultrasound B-mode imaging. Noise reduction is therefore essential for improving the visual observation quality, or as a pre-processing step for further automated analysis such as image segmentation of the IMT and the atherosclerotic carotid plaque in ultrasound images. In order to facilitate this preprocessing step, we have developed in MATLAB® a unified toolbox that integrates image despeckle filtering (IDF), texture analysis and image quality evaluation techniques to automate the pre-processing and complement the disease evaluation in ultrasound CCA images. The proposed software is based on a graphical user interface (GUI) and incorporates image normalization, 10 different despeckle filtering techniques (DsFlsmv, DsFwiener, DsFlsminsc, DsFkuwahara, DsFgf, DsFmedian, DsFhmedian, DsFad, DsFnldif, DsFsrad), image intensity normalization, 65 texture features, 15 quantitative image quality metrics and objective image quality evaluation. The software is publicly available in an executable form, which can be downloaded from http://www.cs.ucy.ac.cy/medinfo/. It was validated on 100 ultrasound images of the CCA by comparing its results with quantitative visual analysis performed by a medical expert. It was observed that the despeckle filters DsFlsmv and DsFhmedian improved image quality perception (based on the expert's assessment and the image texture and quality metrics). It is anticipated that the system could help the physician in the assessment of cardiovascular image analysis. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
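The filter names above are toolbox-specific, but the first-order local-statistics family (e.g., DsFlsmv) follows a well-known Lee-type scheme that the sketch below approximates: pull each pixel toward its local mean in proportion to how much the local variance exceeds an assumed noise variance. The window size and the noise-variance default are illustrative.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def despeckle_lsmv(img, win=5, noise_var=None):
    """First-order local-statistics despeckle filter (Lee-type), in the
    spirit of the toolbox's DsFlsmv: homogeneous regions are smoothed
    toward the local mean, while high-variance (edge) regions are
    largely preserved. noise_var defaults to the median local variance."""
    img = img.astype(float)
    mean = uniform_filter(img, win)
    sqr_mean = uniform_filter(img * img, win)
    var = np.maximum(sqr_mean - mean * mean, 0.0)
    if noise_var is None:
        noise_var = np.median(var)   # crude speckle-variance estimate
    k = np.maximum(var - noise_var, 0.0) / np.maximum(var, 1e-12)
    return mean + k * (img - mean)
```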
EMERALD: A Flexible Framework for Managing Seismic Data
NASA Astrophysics Data System (ADS)
West, J. D.; Fouch, M. J.; Arrowsmith, R.
2010-12-01
The seismological community is challenged by the vast quantity of new broadband seismic data provided by large-scale seismic arrays such as EarthScope’s USArray. While this bonanza of new data enables transformative scientific studies of the Earth’s interior, it also illuminates limitations in the methods used to prepare and preprocess those data. At a recent seismic data processing focus group workshop, many participants expressed the need for better systems to minimize the time and tedium spent on data preparation in order to increase the efficiency of scientific research. Another challenge related to data from all large-scale transportable seismic experiments is that there currently exists no system for discovering and tracking changes in station metadata. This critical information, such as station location, sensor orientation, instrument response, and clock timing data, may change over the life of an experiment and/or be subject to post-experiment correction. Yet nearly all researchers utilize metadata acquired with the downloaded data, even though subsequent metadata updates might alter or invalidate results produced with older metadata. A third long-standing issue for the seismic community is the lack of easily exchangeable seismic processing codes. This problem stems directly from the storage of seismic data as individual time series files, and the history of each researcher developing his or her preferred data file naming convention and directory organization. Because most processing codes rely on the underlying data organization structure, such codes are not easily exchanged between investigators. To address these issues, we are developing EMERALD (Explore, Manage, Edit, Reduce, & Analyze Large Datasets). The goal of the EMERALD project is to provide seismic researchers with a unified, user-friendly, extensible system for managing seismic event data, thereby increasing the efficiency of scientific enquiry. EMERALD stores seismic data and metadata in a state-of-the-art open source relational database (PostgreSQL), and can, on a timed basis or on demand, download the most recent metadata, compare it with previously acquired values, and alert the user to changes. The backend relational database is capable of easily storing and managing many millions of records. The extensible, plug-in architecture of the EMERALD system allows any researcher to contribute new visualization and processing methods written in any of 12 programming languages, and a central Internet-enabled repository for such methods provides users with the opportunity to download, use, and modify new processing methods on demand. EMERALD includes data acquisition tools allowing direct importation of seismic data, and also imports data from a number of existing seismic file formats. Pre-processed clean sets of data can be exported as standard sac files with user-defined file naming and directory organization, for use with existing processing codes. The EMERALD system incorporates existing acquisition and processing tools, including SOD, TauP, GMT, and FISSURES/DHI, making much of the functionality of those tools available in a unified system with a user-friendly web browser interface. EMERALD is now in beta test. See emerald.asu.edu or contact john.d.west@asu.edu for more details.
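The metadata change tracking can be pictured with a small sketch; EMERALD itself uses PostgreSQL, so the SQLite connection and the `station_metadata` table and column names here are purely hypothetical stand-ins.

```python
import sqlite3

def detect_metadata_changes(db_path, fresh):
    """Compare freshly downloaded station metadata against stored values
    and report differences, in the spirit of EMERALD's metadata tracking.
    `fresh` maps station code -> (lat, lon, sensor_azimuth); the schema
    is hypothetical."""
    con = sqlite3.connect(db_path)
    cur = con.execute("SELECT code, lat, lon, azimuth FROM station_metadata")
    changes = []
    for code, lat, lon, az in cur:
        if code in fresh and fresh[code] != (lat, lon, az):
            changes.append((code, (lat, lon, az), fresh[code]))
    con.close()
    return changes  # each entry: (station, stored values, new values)
```

Results produced with the stored values can then be flagged for re-processing whenever this comparison reports a change.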
Peleato, Nicolas M; Legge, Raymond L; Andrews, Robert C
2018-06-01
The use of fluorescence data coupled with neural networks for improved prediction of drinking water disinfection by-products (DBPs) was investigated. A novel application of autoencoders to process high-dimensional fluorescence data was compared to the common dimensionality reduction techniques of parallel factor analysis (PARAFAC) and principal component analysis (PCA). The proposed method was assessed based on component interpretability as well as on prediction of organic matter reactivity to the formation of DBPs. Optimal prediction accuracies on a validation dataset were observed with an autoencoder-neural network approach or by utilizing the full spectrum without pre-processing. Latent representation by an autoencoder appeared to mitigate overfitting when compared to the other methods. Although DBP prediction error was minimized by other pre-processing techniques, PARAFAC yielded interpretable components which resemble the fluorescence expected from individual organic fluorophores. Through analysis of the network weights, fluorescence regions associated with DBP formation can be identified, representing a potential method to distinguish the reactivity of different fluorophore groupings. However, the different dimensionality reduction approaches produced distinct results, dictating a need to consider the role of data pre-processing in the interpretability of the results. In comparison to common organic measures currently used for DBP formation prediction, fluorescence was shown to improve prediction accuracies, with the improvements best realized when appropriate pre-processing and regression techniques were applied. The results of this study show promise for the potential application of neural networks to best utilize fluorescence EEM data for prediction of organic matter reactivity. Copyright © 2018 Elsevier Ltd. All rights reserved.
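A minimal autoencoder of the kind described, assuming excitation-emission matrices flattened to fixed-length vectors; the latent code `z` would then feed a downstream regressor for DBP formation. Layer widths, latent size, and training settings are illustrative, not the paper's architecture.

```python
import torch
import torch.nn as nn

class EEMAutoencoder(nn.Module):
    """Compress flattened excitation-emission matrices to a small
    latent code and reconstruct them; the code serves as the
    dimensionality-reduced input for DBP regression."""
    def __init__(self, n_inputs, n_latent=10):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_inputs, 128), nn.ReLU(),
                                     nn.Linear(128, n_latent))
        self.decoder = nn.Sequential(nn.Linear(n_latent, 128), nn.ReLU(),
                                     nn.Linear(128, n_inputs))

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), z

def train_autoencoder(X, n_latent=10, epochs=200, lr=1e-3):
    """Full-batch reconstruction training; X is (n_samples, n_inputs)."""
    model = EEMAutoencoder(X.shape[1], n_latent)
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    X = torch.as_tensor(X, dtype=torch.float32)
    for _ in range(epochs):
        opt.zero_grad()
        recon, _ = model(X)
        loss = loss_fn(recon, X)
        loss.backward()
        opt.step()
    return model
```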
An enhanced TIMESAT algorithm for estimating vegetation phenology metrics from MODIS data
Tan, B.; Morisette, J.T.; Wolfe, R.E.; Gao, F.; Ederer, G.A.; Nightingale, J.; Pedelty, J.A.
2011-01-01
An enhanced TIMESAT algorithm was developed for retrieving vegetation phenology metrics from 250 m and 500 m spatial resolution Moderate Resolution Imaging Spectroradiometer (MODIS) vegetation indexes (VI) over North America. MODIS VI data were pre-processed using snow-cover and land surface temperature data, and temporally smoothed with the enhanced TIMESAT algorithm. An objective third derivative test was applied to define key phenology dates and retrieve a set of phenology metrics. This algorithm has been applied to two MODIS VIs: Normalized Difference Vegetation Index (NDVI) and Enhanced Vegetation Index (EVI). In this paper, we describe the algorithm and use EVI as an example to compare three sets of TIMESAT algorithm/MODIS VI combinations: a) original TIMESAT algorithm with original MODIS VI, b) original TIMESAT algorithm with pre-processed MODIS VI, and c) enhanced TIMESAT and pre-processed MODIS VI. All retrievals were compared with ground phenology observations, some made available through the National Phenology Network. Our results show that for MODIS data in middle to high latitude regions, snow and land surface temperature information is critical in retrieving phenology metrics from satellite observations. The results also show that the enhanced TIMESAT algorithm can better accommodate growing season start and end dates that vary significantly from year to year. The TIMESAT algorithm improvements contribute to more spatial coverage and more accurate retrievals of the phenology metrics. Among three sets of TIMESAT/MODIS VI combinations, the start of the growing season metric predicted by the enhanced TIMESAT algorithm using pre-processed MODIS VIs has the best associations with ground observed vegetation greenup dates. © 2010 IEEE.
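One way to picture the objective third-derivative test, assuming an already smoothed VI series: extrema of the third derivative flag where the curvature of the seasonal curve changes fastest, giving candidate transition dates that further screening (e.g., proximity to the seasonal rise) would narrow down. This is an illustrative reading, not the exact TIMESAT implementation.

```python
import numpy as np

def phenology_candidates(vi, dates):
    """Locate candidate phenology transition dates in a smoothed VI
    time series via a third-derivative test: local maxima of |d3 VI/dt3|
    mark where the seasonal curve's curvature changes fastest."""
    d3 = np.gradient(np.gradient(np.gradient(np.asarray(vi, dtype=float))))
    mag = np.abs(d3)
    cand = [i for i in range(1, len(mag) - 1)
            if mag[i] >= mag[i - 1] and mag[i] >= mag[i + 1]]
    return [dates[i] for i in cand]
```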
Puccio, Benjamin; Pooley, James P; Pellman, John S; Taverna, Elise C; Craddock, R Cameron
2016-10-25
Skull-stripping is the procedure of removing non-brain tissue from anatomical MRI data. This procedure can be useful for calculating brain volume and for improving the quality of other image processing steps. Developing new skull-stripping algorithms and evaluating their performance requires gold standard data from a variety of different scanners and acquisition methods. We complement existing repositories with manually corrected brain masks for 125 T1-weighted anatomical scans from the Nathan Kline Institute Enhanced Rockland Sample Neurofeedback Study. Skull-stripped images were obtained using a semi-automated procedure that involved skull-stripping the data using the brain extraction based on nonlocal segmentation technique (BEaST) software, and manually correcting the worst results. Corrected brain masks were added into the BEaST library and the procedure was repeated until acceptable brain masks were available for all images. In total, 85 of the skull-stripped images were hand-edited and 40 were deemed to not need editing. The results are brain masks for the 125 images along with a BEaST library for automatically skull-stripping other data. Skull-stripped anatomical images from the Neurofeedback sample are available for download from the Preprocessed Connectomes Project. The resulting brain masks can be used by researchers to improve preprocessing of the Neurofeedback data, as training and testing data for developing new skull-stripping algorithms, and for evaluating the impact on other aspects of MRI preprocessing. We have illustrated the utility of these data as a reference for comparing various automatic methods and evaluated the performance of the newly created library on independent data.
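Evaluating a skull-stripping algorithm against gold-standard masks like these typically reduces to a voxelwise overlap score; a sketch using the Dice coefficient (nibabel assumed for I/O):

```python
import numpy as np
import nibabel as nib

def dice(mask_a_path, mask_b_path):
    """Dice overlap between two binary brain masks, a common way to
    score an automatic skull-stripping result against a manually
    corrected gold-standard mask (1.0 = identical, 0.0 = disjoint)."""
    a = nib.load(mask_a_path).get_fdata() > 0
    b = nib.load(mask_b_path).get_fdata() > 0
    inter = np.logical_and(a, b).sum()
    return 2.0 * inter / (a.sum() + b.sum())
```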
DOE Office of Scientific and Technical Information (OSTI.GOV)
Johnson, Kevin J.; Wright, Bob W.; Jarman, Kristin H.
2003-05-09
A rapid retention time alignment algorithm was developed as a preprocessing utility to be used prior to chemometric analysis of large datasets of diesel fuel gas chromatographic profiles. Retention time variation from chromatogram to chromatogram has been a significant impediment to the use of chemometric techniques in the analysis of chromatographic data, due to the inability of current multivariate techniques to correctly model information that shifts from variable to variable within a dataset. The algorithm developed is shown to increase the efficacy of pattern recognition methods applied to a set of diesel fuel chromatograms by retaining chemical selectivity while reducing chromatogram-to-chromatogram retention time variations, and to do so on a time scale that makes analysis of large sets of chromatographic data practical.
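The published algorithm is more sophisticated, but the essence of shift-based retention time alignment can be sketched as follows: slide each chromatogram over a reference within a bounded window and keep the shift that maximizes correlation (a piecewise variant of the same idea handles locally varying drift). The wrap-around at the array edges introduced by np.roll is ignored for simplicity.

```python
import numpy as np

def align_to_reference(chrom, ref, max_shift=50):
    """Shift-based retention-time alignment: test integer shifts within
    +/- max_shift points and keep the one with the highest correlation
    against the reference chromatogram."""
    best_shift, best_corr = 0, -np.inf
    for s in range(-max_shift, max_shift + 1):
        shifted = np.roll(chrom, s)
        c = np.corrcoef(shifted, ref)[0, 1]
        if c > best_corr:
            best_shift, best_corr = s, c
    return np.roll(chrom, best_shift), best_shift
```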