Science.gov

Sample records for microarray preprocessing algorithms

  1. Micro-Analyzer: automatic preprocessing of Affymetrix microarray data.

    PubMed

    Guzzi, Pietro Hiram; Cannataro, Mario

    2013-08-01

    A current trend in genomics is the investigation of cell mechanisms using different technologies, in order to explain the relationships among genes, molecular processes and diseases. For instance, the combined use of gene-expression arrays and genomic arrays has been demonstrated to be an effective instrument in clinical practice. Consequently, in a single experiment different kinds of microarrays may be used, resulting in the production of different types of binary data (images and textual raw data). The analysis of microarray data requires an initial preprocessing phase that makes raw data suitable for use on existing analysis platforms, such as the TIGR M4 (TM4) Suite. An additional challenge to be faced by emerging data analysis platforms is the ability to treat these different microarray formats, coupled with clinical data, in a combined way. In fact, the resulting integrated data may include both numerical and symbolic data (e.g. gene expression and SNPs for molecular data) as well as temporal data (e.g. the response to a drug, time to progression and survival rate) for clinical data. Raw data preprocessing is a crucial step in analysis but is often performed in a manual and error-prone way using different software tools. Thus novel, platform-independent, and possibly open-source tools enabling the semi-automatic preprocessing and annotation of different microarray data are needed. The paper presents Micro-Analyzer (Microarray Analyzer), a cross-platform tool for the automatic normalization, summarization and annotation of Affymetrix gene expression and SNP binary data. It represents the evolution of the μ-CS tool, extending the preprocessing to SNP arrays, which were not supported in μ-CS. Micro-Analyzer is provided as a standalone Java tool and enables users to read, preprocess and analyse binary microarray data (gene expression and SNPs) by invoking the TM4 platform. It avoids: (i) the manual invocation of external tools (e.g. the Affymetrix Power
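
    Normalization is the heart of the preprocessing step that tools such as Micro-Analyzer automate. As a minimal sketch (illustrative only, not the tool's actual code), quantile normalization, a standard step in RMA-style preprocessing of Affymetrix expression arrays, fits in a few lines of Python:

```python
import numpy as np

def quantile_normalize(X):
    """Quantile-normalize a probes-by-arrays intensity matrix so that
    every array (column) shares the same empirical distribution."""
    ranks = np.argsort(np.argsort(X, axis=0), axis=0)  # rank of each probe within its array
    reference = np.sort(X, axis=0).mean(axis=1)        # mean of each quantile across arrays
    return reference[ranks]

rng = np.random.default_rng(0)
raw = rng.lognormal(mean=6.0, sigma=1.0, size=(1000, 4))  # toy probe intensities, 4 arrays
norm = quantile_normalize(raw)
# after normalization every array has identical sorted values
print(np.allclose(np.sort(norm[:, 0]), np.sort(norm[:, 1])))  # True
```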

  2. Optimisation algorithms for microarray biclustering.

    PubMed

    Perrin, Dimitri; Duhamel, Christophe

    2013-01-01

    In providing simultaneous information on expression profiles for thousands of genes, microarray technologies have, in recent years, been largely used to investigate mechanisms of gene expression. Clustering and classification of such data can, indeed, highlight patterns and provide insight into biological processes. A common approach is to consider the genes and samples of microarray datasets as nodes in a bipartite graph, where edges are weighted, e.g., based on the expression levels. In this paper, using a previously evaluated weighting scheme, we focus on search algorithms and evaluate, in the context of biclustering, several variations of Genetic Algorithms. We also introduce a new heuristic, "Propagate", which consists in recursively evaluating neighbour solutions with one more or one fewer active condition. The results obtained on three well-known datasets show that, for a given weighting scheme, optimal or near-optimal solutions can be identified. PMID:24109756
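
    A minimal sketch of the neighbourhood move underlying "Propagate", read here as a hill climb over condition subsets; the set representation and the toy score function are assumptions for illustration, not the authors' implementation:

```python
def propagate(score, conditions, active):
    """Hill-climb over condition subsets: repeatedly move to an improving
    neighbour obtained by toggling a single condition on or off."""
    best, best_score = set(active), score(active)
    improved = True
    while improved:
        improved = False
        for c in conditions:
            cand = best ^ {c}                    # one more or one fewer active condition
            if cand and score(cand) > best_score:
                best, best_score, improved = cand, score(cand), True
    return best, best_score

# toy score: negative distance to a hidden "ideal" condition subset
target = {0, 2, 3}
print(propagate(lambda s: -len(set(s) ^ target), range(5), {0}))  # ({0, 2, 3}, 0)
```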

  3. User-friendly solutions for microarray quality control and pre-processing on ArrayAnalysis.org.

    PubMed

    Eijssen, Lars M T; Jaillard, Magali; Adriaens, Michiel E; Gaj, Stan; de Groot, Philip J; Müller, Michael; Evelo, Chris T

    2013-07-01

    Quality control (QC) is crucial for any scientific method producing data. Applying adequate QC introduces new challenges in the genomics field, where large amounts of data are produced with complex technologies. For DNA microarrays, specific algorithms for QC and pre-processing, including normalization, have been developed by the scientific community, especially for expression chips of the Affymetrix platform. Many of these have been implemented in the statistical scripting language R and are available from the Bioconductor repository. However, their application is hampered by the lack of integrative tools that can be used by users of any experience level. To fill this gap, we developed a freely available tool for QC and pre-processing of Affymetrix gene expression results, extending, integrating and harmonizing functionality of Bioconductor packages. The tool can be easily accessed through a wizard-like web portal at http://www.arrayanalysis.org or downloaded for local use in R. The portal provides extensive documentation, including user guides, interpretation help with real output illustrations and detailed technical documentation. It assists newcomers to the field in performing state-of-the-art QC and pre-processing while offering data analysts an integral open-source package. Providing the scientific community with this easily accessible tool will improve data quality and promote the reuse and adoption of standards. PMID:23620278

  4. An Efficient and Configurable Preprocessing Algorithm to Improve Stability Analysis.

    PubMed

    Sesia, Ilaria; Cantoni, Elena; Cernigliaro, Alice; Signorile, Giovanna; Fantino, Gianluca; Tavella, Patrizia

    2016-04-01

    The Allan variance (AVAR) is widely used to measure the stability of experimental time series. Specifically, AVAR is commonly used in space applications such as monitoring the clocks of the global navigation satellite systems (GNSSs). In these applications, the experimental data present some peculiar aspects which are not generally encountered when the measurements are carried out in a laboratory. Space clock data can in fact present outliers, jumps, and missing values, which corrupt the clock characterization. Therefore, efficient preprocessing is fundamental to ensure a proper data analysis and improve the stability estimation performed with the AVAR or other similar variances. In this work, we propose a preprocessing algorithm and its implementation in a robust software code (in the MATLAB language) able to deal with time series of experimental data affected by nonstationarities and missing data; our method properly detects and removes anomalous behaviors, hence making the subsequent stability analysis more reliable. PMID:26540679
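
    As a sketch of the two pieces involved, outlier screening followed by stability estimation, the snippet below computes an overlapping Allan deviation after a simple MAD-based outlier repair; this is a generic Python illustration, not the authors' MATLAB code:

```python
import numpy as np

def overlapping_avar(phase, m, tau0=1.0):
    """Overlapping Allan variance at tau = m*tau0 from phase data."""
    d = phase[2 * m:] - 2 * phase[m:-m] + phase[:-2 * m]  # second differences
    return np.mean(d ** 2) / (2.0 * (m * tau0) ** 2)

def mad_outliers(y, k=5.0):
    """Flag points further than k robust sigmas from the median."""
    med = np.median(y)
    return np.abs(y - med) > k * 1.4826 * np.median(np.abs(y - med))

rng = np.random.default_rng(1)
freq = rng.normal(0.0, 1e-12, 100_000)       # white-frequency-noise clock
freq[[500, 60_000]] += 1e-9                  # two gross outliers
clean = np.where(mad_outliers(freq), np.median(freq), freq)
for f, tag in [(freq, "raw    "), (clean, "cleaned")]:
    phase = np.concatenate(([0.0], np.cumsum(f)))        # tau0 = 1 s
    print(tag, np.sqrt(overlapping_avar(phase, m=10)))   # outliers inflate the raw ADEV
```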

  5. Fully Automated Complementary DNA Microarray Segmentation using a Novel Fuzzy-based Algorithm

    PubMed Central

    Saberkari, Hamidreza; Bahrami, Sheyda; Shamsi, Mousa; Amoshahy, Mohammad Javad; Ghavifekr, Habib Badri; Sedaaghi, Mohammad Hossein

    2015-01-01

    DNA microarrays are a powerful approach for studying simultaneously the expression of thousands of genes in a single experiment. The average value of the fluorescent intensity can be calculated in a microarray experiment, and the calculated intensity values closely track the expression levels of particular genes. However, determining the appropriate position of every spot in microarray images is a main challenge, one which leads to the accurate classification of normal and abnormal (cancer) cells. In this paper, first a preprocessing approach is performed to eliminate the noise and artifacts present in microarray cells using the nonlinear anisotropic diffusion filtering method. Then, the coordinate center of each spot is positioned utilizing mathematical morphology operations. Finally, the position of each spot is exactly determined through applying a novel hybrid model based on principal component analysis and the spatial fuzzy c-means clustering (SFCM) algorithm. Using a Gaussian kernel in the SFCM algorithm improves the quality of complementary DNA microarray segmentation. The performance of the proposed algorithm has been evaluated on real microarray images, which are available in the Stanford Microarray Database. Results illustrate that the accuracy of microarray cell segmentation with the proposed algorithm reaches 100% and 98% for noiseless and noisy cells, respectively. PMID:26284175
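
    A compact sketch of plain fuzzy c-means on pixel intensities, the core of the SFCM step; the spatial term and Gaussian kernel of the paper are omitted for brevity, and the data are invented:

```python
import numpy as np

def fuzzy_cmeans(X, c=2, m=2.0, iters=100, seed=0):
    """Plain fuzzy c-means on X (n_samples, n_features)."""
    rng = np.random.default_rng(seed)
    U = rng.dirichlet(np.ones(c), size=len(X))           # fuzzy memberships, rows sum to 1
    for _ in range(iters):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]     # membership-weighted means
        d = np.linalg.norm(X[:, None, :] - centers[None], axis=2) + 1e-12
        inv = d ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)         # standard FCM membership update
    return U, centers

# toy spot: bright foreground pixels against a dim background
rng = np.random.default_rng(2)
pixels = np.concatenate([rng.normal(200, 10, 300), rng.normal(40, 8, 700)])[:, None]
U, centers = fuzzy_cmeans(pixels)
print(np.sort(centers.ravel()))   # approx. [40, 200], the two intensity modes
```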

  6. An improved preprocessing algorithm for haplotype inference by pure parsimony.

    PubMed

    Choi, Mun-Ho; Kang, Seung-Ho; Lim, Hyeong-Seok

    2014-08-01

    The identification of haplotypes, which encode SNPs in a single chromosome, makes it possible to perform a haplotype-based association test with disease. Given a set of genotypes from a population, the process of recovering the haplotypes that explain the genotypes is called haplotype inference (HI). We propose an improved preprocessing method for solving haplotype inference by pure parsimony (HIPP), which excludes a large number of redundant haplotypes by detecting groups of haplotypes that are dispensable for optimal solutions. The method uses only inclusion relations between groups of haplotypes but dramatically reduces the number of candidate haplotypes; it therefore reduces the computational time and memory usage of real HIPP solvers. The proposed method can be easily coupled with a wide range of optimization methods that consider a set of candidate haplotypes explicitly. For simulated and well-known benchmark datasets, the experimental results show that our method coupled with a classical exact HIPP solver runs much faster than the state-of-the-art solver and can solve a large number of instances that were so far unaffordable in a reasonable time. PMID:25152045
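
    The combinatorics this preprocessing fights can be seen from the definition: a genotype with k heterozygous sites is explained by 2^(k-1) haplotype pairs. A small sketch, using the common 0/1/2 site encoding (an assumption here, not the paper's notation), enumerates them:

```python
from itertools import product

def compatible_haplotype_pairs(genotype):
    """Enumerate (h1, h2) pairs explaining a genotype: 0/1 = homozygous
    alleles, 2 = heterozygous (the pair must differ at that site)."""
    het = [i for i, g in enumerate(genotype) if g == 2]
    base = [g if g != 2 else None for g in genotype]   # het sites filled in below
    pairs = set()
    for bits in product((0, 1), repeat=len(het)):
        h1, h2 = base[:], base[:]
        for i, b in zip(het, bits):
            h1[i], h2[i] = b, 1 - b                    # complementary alleles at het sites
        pairs.add(tuple(sorted((tuple(h1), tuple(h2)))))
    return sorted(pairs)

print(compatible_haplotype_pairs([0, 2, 2]))
# [((0, 0, 0), (0, 1, 1)), ((0, 0, 1), (0, 1, 0))] -- 2^(k-1) pairs for k het sites
```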

  7. Genetic Algorithm for Optimization: Preprocessing with n Dimensional Bisection and Error Estimation

    NASA Technical Reports Server (NTRS)

    Sen, S. K.; Shaykhian, Gholam Ali

    2006-01-01

    Knowledge of the appropriate values of the parameters of a genetic algorithm (GA), such as the population size, the shrunk search space containing the solution, and the crossover and mutation probabilities, is not available a priori for a general optimization problem. Recommended here is a polynomial-time preprocessing scheme that includes an n-dimensional bisection and determines the foregoing parameters before deciding upon an appropriate GA for all problems of a similar nature and type. Such preprocessing is not only fast but also enables us to obtain the global optimal solution and reasonably narrow error bounds on it with a high degree of confidence.

  8. A biomimetic algorithm for the improved detection of microarray features

    NASA Astrophysics Data System (ADS)

    Nicolau, Dan V., Jr.; Nicolau, Dan V.; Maini, Philip K.

    2007-02-01

    One of the major difficulties of microarray technology relates to the processing of large and, importantly, error-loaded images of the dots on the chip surface. Whatever the source of these errors, those introduced in the first stage of data acquisition, segmentation, are passed down to the subsequent processes, with deleterious results. As it has been demonstrated recently that biological systems have evolved algorithms that are mathematically efficient, this contribution attempts to test an algorithm that mimics the bacterial "patented" strategy for searching available space and nutrients in order to find, zero in on, and eventually delimit the features present on the microarray surface.

  9. Image preprocessing for improving computational efficiency in implementation of restoration and superresolution algorithms.

    PubMed

    Sundareshan, Malur K; Bhattacharjee, Supratik; Inampudi, Radhika; Pang, Ho-Yuen

    2002-12-10

    Computational complexity is a major impediment to the real-time implementation of image restoration and superresolution algorithms in many applications. Although powerful restoration algorithms have been developed within the past few years utilizing sophisticated mathematical machinery (based on statistical optimization and convex set theory), these algorithms are typically iterative in nature and require a sufficient number of iterations to be executed to achieve the desired resolution improvement that may be needed to meaningfully perform postprocessing image exploitation tasks in practice. Additionally, recent technological breakthroughs have facilitated novel sensor designs (focal plane arrays, for instance) that make it possible to capture megapixel imagery data at video frame rates. A major challenge in the processing of these large-format images is to complete the execution of the image processing steps within the frame capture times and to keep up with the output rate of the sensor so that all data captured by the sensor can be efficiently utilized. Consequently, development of novel methods that facilitate real-time implementation of image restoration and superresolution algorithms is of significant practical interest and is the primary focus of this study. The key to designing computationally efficient processing schemes lies in strategically introducing appropriate preprocessing steps together with the superresolution iterations to tailor optimized overall processing sequences for imagery data of specific formats. For substantiating this assertion, three distinct methods for tailoring a preprocessing filter and integrating it with the superresolution processing steps are outlined. These methods consist of a region-of-interest extraction scheme, a background-detail separation procedure, and a scene-derived information extraction step for implementing a set-theoretic restoration of the image that is less demanding in computation compared with the

  10. Image preprocessing for improving computational efficiency in implementation of restoration and superresolution algorithms

    NASA Astrophysics Data System (ADS)

    Sundareshan, Malur K.; Bhattacharjee, Supratik; Inampudi, Radhika; Pang, Ho-Yuen

    2002-12-01

    Computational complexity is a major impediment to the real-time implementation of image restoration and superresolution algorithms in many applications. Although powerful restoration algorithms have been developed within the past few years utilizing sophisticated mathematical machinery (based on statistical optimization and convex set theory), these algorithms are typically iterative in nature and require a sufficient number of iterations to be executed to achieve the desired resolution improvement that may be needed to meaningfully perform postprocessing image exploitation tasks in practice. Additionally, recent technological breakthroughs have facilitated novel sensor designs (focal plane arrays, for instance) that make it possible to capture megapixel imagery data at video frame rates. A major challenge in the processing of these large-format images is to complete the execution of the image processing steps within the frame capture times and to keep up with the output rate of the sensor so that all data captured by the sensor can be efficiently utilized. Consequently, development of novel methods that facilitate real-time implementation of image restoration and superresolution algorithms is of significant practical interest and is the primary focus of this study. The key to designing computationally efficient processing schemes lies in strategically introducing appropriate preprocessing steps together with the superresolution iterations to tailor optimized overall processing sequences for imagery data of specific formats. For substantiating this assertion, three distinct methods for tailoring a preprocessing filter and integrating it with the superresolution processing steps are outlined. These methods consist of a region-of-interest extraction scheme, a background-detail separation procedure, and a scene-derived information extraction step for implementing a set-theoretic restoration of the image that is less demanding in computation compared with the

  11. Cancer Classification in Microarray Data using a Hybrid Selective Independent Component Analysis and υ-Support Vector Machine Algorithm.

    PubMed

    Saberkari, Hamidreza; Shamsi, Mousa; Joroughi, Mahsa; Golabi, Faegheh; Sedaaghi, Mohammad Hossein

    2014-10-01

    Microarray data have an important role in the identification and classification of cancer tissues. The scarcity of microarray samples in cancer research is a persistent concern, and it complicates the design of classifiers. For this reason, preprocessing gene selection techniques should be utilized before classification to remove the noninformative genes from microarray data. An appropriate gene selection method can significantly improve the performance of cancer classification. In this paper, we use selective independent component analysis (SICA) to decrease the dimension of microarray data. Using this selective algorithm, we can solve the instability problem that occurs when conventional independent component analysis (ICA) methods are employed. First, the reconstruction error is analyzed and a selective set of independent components is chosen, namely those that contribute little error when reconstructing new samples. Then, several modified support vector machine (υ-SVM) sub-classifiers are trained simultaneously. Eventually, the sub-classifier with the highest recognition rate is selected. The proposed algorithm is applied to three cancer datasets (leukemia, breast cancer and lung cancer), and its results are compared with other existing methods. The results illustrate that the proposed algorithm (SICA + υ-SVM) has higher accuracy and validity, increasing the classification accuracy. In particular, our proposed algorithm exhibits a relative improvement of 3.3% in correctness rate over the ICA + SVM and SVM algorithms on the lung cancer dataset. PMID:25426433
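
    The pairing of ICA-based dimension reduction with a ν-SVM can be sketched with scikit-learn's FastICA and NuSVC; this uses plain (non-selective) ICA on synthetic data, so it is a simplified stand-in for the paper's SICA + υ-SVM scheme, not a reimplementation:

```python
import numpy as np
from sklearn.decomposition import FastICA
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import NuSVC

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 2000))              # 60 samples x 2000 genes (synthetic)
y = np.repeat([0, 1], 30)
X[y == 1, :25] += 1.0                        # small block of informative genes

clf = make_pipeline(FastICA(n_components=10, random_state=0, max_iter=1000),
                    NuSVC(nu=0.3, kernel="rbf"))
print(cross_val_score(clf, X, y, cv=5).mean())   # well above the 0.5 chance level
```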

  12. Syndromic surveillance using veterinary laboratory data: data pre-processing and algorithm performance evaluation

    PubMed Central

    Dórea, Fernanda C.; McEwen, Beverly J.; McNab, W. Bruce; Revie, Crawford W.; Sanchez, Javier

    2013-01-01

    Diagnostic test orders to an animal laboratory were explored as a data source for monitoring trends in the incidence of clinical syndromes in cattle. Four years of real data and over 200 simulated outbreak signals were used to compare pre-processing methods that could remove temporal effects in the data, as well as temporal aberration detection algorithms that provided high sensitivity and specificity. Weekly differencing demonstrated solid performance in removing day-of-week effects, even in series with low daily counts. For aberration detection, the results indicated that no single algorithm showed performance superior to all others across the range of outbreak scenarios simulated. Exponentially weighted moving average charts and Holt–Winters exponential smoothing demonstrated complementary performance, with the latter offering an automated method to adjust to changes in the time series that will likely occur in the future. Shewhart charts provided lower sensitivity but earlier detection in some scenarios. Cumulative sum charts did not appear to add value to the system; however, the poor performance of this algorithm was attributed to characteristics of the data monitored. These findings indicate that automated monitoring aimed at early detection of temporal aberrations will likely be most effective when a range of algorithms are implemented in parallel. PMID:23576782
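
    Two of the components evaluated, weekly differencing as pre-processing and an EWMA control chart for aberration detection, can be sketched as follows; the synthetic daily counts and the control-limit tuning are illustrative, not the study's data or settings:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
days = pd.date_range("2010-01-01", periods=4 * 365, freq="D")
baseline_rate = 3 + 2 * np.sin(2 * np.pi * days.dayofweek / 7)  # day-of-week effect
counts = rng.poisson(baseline_rate).astype(float)
counts[-10:] += 6                                # injected outbreak in the last 10 days

series = pd.Series(counts, index=days)
resid = series.diff(7).dropna()                  # weekly differencing removes DOW effects
lam = 0.3
ewma = resid.ewm(alpha=lam).mean()
hist = resid.iloc[:-10]                          # outbreak-free history for the limits
limit = hist.mean() + 3 * hist.std() * np.sqrt(lam / (2 - lam))
print(ewma[ewma > limit].index.max())            # latest alarm falls in the outbreak window
```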

  13. DNA Microarray Data Analysis: A Novel Biclustering Algorithm Approach

    NASA Astrophysics Data System (ADS)

    Tchagang, Alain B.; Tewfik, Ahmed H.

    2006-12-01

    Biclustering algorithms refer to a distinct class of clustering algorithms that perform simultaneous row-column clustering. Biclustering problems arise in DNA microarray data analysis, collaborative filtering, market research, information retrieval, text mining, electoral trends, exchange analysis, and so forth. When dealing with DNA microarray experimental data for example, the goal of biclustering algorithms is to find submatrices, that is, subgroups of genes and subgroups of conditions, where the genes exhibit highly correlated activities for every condition. In this study, we develop novel biclustering algorithms using basic linear algebra and arithmetic tools. The proposed biclustering algorithms can be used to search for all biclusters with constant values, biclusters with constant values on rows, biclusters with constant values on columns, and biclusters with coherent values from a set of data in a timely manner and without solving any optimization problem. We also show how one of the proposed biclustering algorithms can be adapted to identify biclusters with coherent evolution. The algorithms developed in this study discover all valid biclusters of each type, while almost all previous biclustering approaches will miss some.
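
    One simple reading of the constant-value case: for a fixed column subset, rows that are constant across those columns can be grouped by value using only elementary array operations (the full enumeration over column subsets performed by the authors' algorithms is omitted; names and data here are illustrative):

```python
import numpy as np
from collections import defaultdict

def constant_row_biclusters(M, cols, decimals=6):
    """Group rows that are constant across the chosen columns: every row whose
    entries over `cols` are all (nearly) equal joins a bicluster keyed by that value."""
    groups = defaultdict(list)
    for i, row in enumerate(M[:, cols]):
        if np.ptp(row) < 10 ** -decimals:        # constant across the column subset
            groups[round(float(row[0]), decimals)].append(i)
    return {v: rows for v, rows in groups.items() if len(rows) > 1}

M = np.array([[2., 2., 5.],
              [2., 2., 7.],
              [3., 1., 9.]])
print(constant_row_biclusters(M, cols=[0, 1]))   # {2.0: [0, 1]}
```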

  14. Clustering Short Time-Series Microarray

    NASA Astrophysics Data System (ADS)

    Ping, Loh Wei; Hasan, Yahya Abu

    2008-01-01

    Most microarray analyses are carried out on static gene expressions. However, the dynamical study of microarrays has lately gained more attention. Most research on time-series microarrays emphasizes the bioscience and medical aspects, and little addresses the numerical aspect. This study attempts to analyze short time-series microarrays mathematically using the STEM clustering tool, which first preprocesses the data and then clusters them. We next introduce the Circular Mould Distance (CMD) algorithm, which combines preprocessing and clustering analysis. Both methods are subsequently compared in terms of efficiency.

  15. Rank-based algorithms for analysis of microarrays

    NASA Astrophysics Data System (ADS)

    Liu, Wei-min; Mei, Rui; Bartell, Daniel M.; Di, Xiaojun; Webster, Teresa A.; Ryder, Tom

    2001-06-01

    Analysis of microarray data often involves extracting information from the raw intensities of spot cells and making certain calls. Rank-based algorithms are powerful tools for providing probability values for hypothesis tests, especially when the distribution of the intensities is unknown. For our current gene expression arrays, a gene is detected by a set of probe pairs consisting of perfect match and mismatch cells. The one-sided upper-tail Wilcoxon signed-rank test is used in our algorithms for absolute calls (whether a gene is detected or not), as well as comparative calls (whether a gene's expression increases, decreases, or shows no significant change in one sample compared with another). We also test the possibility of using only perfect match cells to make calls. This paper focuses on absolute calls. We have developed error analysis methods and software tools that allow us to compare the accuracy of the calls in the presence or absence of mismatch cells at different target concentrations. The usage of nonparametric rank-based tests is not limited to absolute and comparative calls for gene expression chips. They can also be applied to other oligonucleotide microarrays for genotyping and mutation detection, as well as spotted arrays.
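
    The absolute ("detection") call can be sketched with SciPy's one-sided Wilcoxon signed-rank test on PM-MM probe-pair differences; the significance cutoff and the use of raw differences (rather than production discrimination scores) are simplifying assumptions:

```python
import numpy as np
from scipy.stats import wilcoxon

def detection_call(pm, mm, alpha=0.04):
    """'P' (present) if PM systematically exceeds MM across probe pairs,
    per a one-sided upper-tail Wilcoxon signed-rank test; else 'A' (absent)."""
    stat, p = wilcoxon(pm - mm, alternative="greater")
    return ("P" if p < alpha else "A"), round(p, 4)

rng = np.random.default_rng(4)
mm = rng.normal(100, 15, 16)                                # 16 probe pairs for one gene
print(detection_call(mm + 40 + rng.normal(0, 10, 16), mm))  # expressed gene -> ('P', ...)
print(detection_call(mm + rng.normal(0, 10, 16), mm))       # background only -> usually 'A'
```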

  16. Benchmarking a memetic algorithm for ordering microarray data.

    PubMed

    Moscato, P; Mendes, A; Berretta, R

    2007-03-01

    This work introduces a new algorithm for "gene ordering". Given a matrix of gene expression data values, the task is to find a permutation of the gene names list such that genes with similar expression patterns should be relatively close in the permutation. The algorithm is based on a combined approach that integrates a constructive heuristic with evolutionary and Tabu Search techniques in a single methodology. To evaluate the benefits of this method, we compared our results with the current outputs provided by several widely used algorithms in functional genomics. We also compared the results with our own hierarchical clustering method when used in isolation. We show that the use of images, corrupted with known levels of noise, helps to illustrate some aspects of the performance of the algorithms and provide a complementary benchmark for the analysis. The use of these images, with known high-quality solutions, facilitates in some cases the assessment of the methods and helps the software development, validation and reproducibility of results. We also propose two quantitative measures of performance for gene ordering. Using these measures, we make a comparison with probably the most used algorithm (due to Eisen and collaborators, PNAS 1998) using a microarray dataset available on the public domain (the complete yeast cell cycle dataset). PMID:16870322

  17. Effective preprocessing in #SAT

    NASA Astrophysics Data System (ADS)

    Guo, Qin; Sang, Juan; He, Yong-mei

    2011-12-01

    Preprocessing #SAT instances can reduce their size considerably and decrease the solving time. In this paper we investigate the use of hyper-binary resolution and equality reduction to preprocess #SAT instances. A preprocessing algorithm, Preprocess MC, is presented, which combines unit propagation, hyper-binary resolution, and equality reduction. Experiments show that these techniques not only reduce the size of the #SAT formula, but also improve the ability of model counters to solve #SAT problems.
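
    Of the three techniques combined, unit propagation is the simplest to show; a minimal sketch on DIMACS-style clause lists follows. (For #SAT, satisfied clauses and forced variables eliminated here must of course be folded back into the model count; that bookkeeping is omitted.)

```python
def unit_propagate(clauses):
    """Simplify a CNF formula (list of lists of non-zero ints, DIMACS-style)
    by unit propagation. Returns (clauses, assignment), or (None, assignment)
    if a conflict is derived."""
    assign = {}
    changed = True
    while changed:
        changed = False
        for c in clauses:
            if len(c) == 1:
                var, val = abs(c[0]), c[0] > 0
                if assign.get(var, val) != val:
                    return None, assign          # both polarities forced: conflict
                if var not in assign:
                    assign[var] = val
                    changed = True
        if changed:
            reduced = []
            for c in clauses:
                if any(assign.get(abs(l)) == (l > 0) for l in c):
                    continue                     # clause satisfied: drop it
                c = [l for l in c if abs(l) not in assign]  # falsified literals drop out
                if not c:
                    return None, assign          # empty clause: conflict
                reduced.append(c)
            clauses = reduced
    return clauses, assign

# (x1) & (~x1 | x2) & (~x2 | x3 | x4): x1 and x2 are forced, one clause remains
print(unit_propagate([[1], [-1, 2], [-2, 3, 4]]))  # ([[3, 4]], {1: True, 2: True})
```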

  18. Microarrays

    ERIC Educational Resources Information Center

    Plomin, Robert; Schalkwyk, Leonard C.

    2007-01-01

    Microarrays are revolutionizing genetics by making it possible to genotype hundreds of thousands of DNA markers and to assess the expression (RNA transcripts) of all of the genes in the genome. Microarrays are slides the size of a postage stamp that contain millions of DNA sequences to which single-stranded DNA or RNA can hybridize. This…

  19. Artifact Removal from Biosignal using Fixed Point ICA Algorithm for Pre-processing in Biometric Recognition

    NASA Astrophysics Data System (ADS)

    Mishra, Puneet; Singla, Sunil Kumar

    2013-01-01

    In the modern world of automation, biological signals, especially the Electroencephalogram (EEG) and Electrocardiogram (ECG), are gaining wide attention as a source of biometric information. Earlier studies have shown that EEG and ECG vary across individuals, and every individual has a distinct EEG and ECG spectrum. EEG (which can be recorded from the scalp due to the effect of millions of neurons) may contain noise signals such as eye blinks, eye movements, muscular movement, line noise, etc. Similarly, ECG may contain artifacts such as line noise, tremor artifacts, baseline wandering, etc. These noise signals must be separated from the EEG and ECG signals to obtain accurate results. This paper proposes a technique for the removal of eye blink artifacts from EEG and ECG signals using the fixed-point or FastICA algorithm of Independent Component Analysis (ICA). For validation, the FastICA algorithm has been applied to a synthetic signal prepared by adding random noise to an Electrocardiogram (ECG) signal. The FastICA algorithm separates the signal into two independent components, i.e. the pure ECG and the artifact signal. Similarly, the same algorithm has been applied to remove artifacts (Electrooculogram or eye blink) from the EEG signal.
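
    The validation experiment described, mixing a clean physiological signal with an artifact and letting FastICA separate them, can be reproduced in outline with scikit-learn; the waveforms below are synthetic stand-ins, not the paper's recordings:

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(5)
t = np.linspace(0, 10, 5000)
ecg = np.sin(2 * np.pi * 1.2 * t) ** 63                  # spiky, heartbeat-like waveform
blink = np.sign(np.sin(2 * np.pi * 0.3 * t)) + 0.1 * rng.normal(size=t.size)
X = np.c_[ecg + 0.8 * blink, 0.5 * ecg + blink]          # two observed mixtures

ica = FastICA(n_components=2, random_state=0, max_iter=2000)
S = ica.fit_transform(X)                                 # estimated independent components
artifact_idx = np.argmax([abs(np.corrcoef(S[:, i], blink)[0, 1]) for i in range(2)])
S[:, artifact_idx] = 0.0                                 # discard the artifact component
cleaned = ica.inverse_transform(S)                       # reconstruct artifact-free channels
print(abs(np.corrcoef(cleaned[:, 0], ecg)[0, 1]))        # close to 1
```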

  20. LANDSAT data preprocessing

    NASA Technical Reports Server (NTRS)

    Austin, W. W.

    1983-01-01

    The effects on LANDSAT data of a Sun angle correction, an intersatellite LANDSAT-2 and LANDSAT-3 data range adjustment, and the atmospheric correction algorithm were evaluated. Fourteen 1978 crop year LACIE sites were used as the site data set. The preprocessing techniques were applied to multispectral scanner channel data, and the transformed data were plotted and used to analyze the effectiveness of the preprocessing techniques. Ratio transformations effectively reduce the need for preprocessing techniques to be applied directly to the data. Subtractive transformations are more sensitive to Sun angle and atmospheric corrections than ratios. Preprocessing techniques, other than those applied at the Goddard Space Flight Center, should only be applied as an option of the user. Although performed on LANDSAT data, the study results are also applicable to meteorological satellite data.

  1. Genetic Bee Colony (GBC) algorithm: A new gene selection method for microarray cancer classification.

    PubMed

    Alshamlan, Hala M; Badr, Ghada H; Alohali, Yousef A

    2015-06-01

    Naturally inspired evolutionary algorithms prove effective when used for solving feature selection and classification problems. Artificial Bee Colony (ABC) is a relatively new swarm intelligence method. In this paper, we propose a new hybrid gene selection method, namely the Genetic Bee Colony (GBC) algorithm. The proposed algorithm combines the use of a Genetic Algorithm (GA) with the Artificial Bee Colony (ABC) algorithm, the goal being to integrate the advantages of both. The proposed algorithm is applied to microarray gene expression profiles in order to select the most predictive and informative genes for cancer classification. In order to test the accuracy performance of the proposed algorithm, extensive experiments were conducted. Three binary microarray datasets are used: colon, leukemia, and lung. In addition, three multi-class microarray datasets are used: SRBCT, lymphoma, and leukemia. Results of the GBC algorithm are compared with our recently proposed technique, mRMR combined with the Artificial Bee Colony algorithm (mRMR-ABC). We also compared the combinations of mRMR with a GA (mRMR-GA) and with Particle Swarm Optimization (mRMR-PSO). In addition, we compared the GBC algorithm with other related algorithms recently published in the literature, using all benchmark datasets. The GBC algorithm shows superior performance, as it achieved the highest classification accuracy along with the lowest average number of selected genes. This proves that the GBC algorithm is a promising approach for solving the gene selection problem in both binary and multi-class cancer classification. PMID:25880524

  2. Crossword: A Fully Automated Algorithm for the Segmentation and Quality Control of Protein Microarray Images

    PubMed Central

    2015-01-01

    Biological assays formatted as microarrays have become a critical tool for the generation of the comprehensive data sets required for systems-level understanding of biological processes. Manual annotation of data extracted from images of microarrays, however, remains a significant bottleneck, particularly for protein microarrays due to the sensitivity of this technology to weak artifact signal. In order to automate the extraction and curation of data from protein microarrays, we describe an algorithm called Crossword that logically combines information from multiple approaches to fully automate microarray segmentation. Automated artifact removal is also accomplished by segregating structured pixels from the background noise using iterative clustering and pixel connectivity. Correlation of the location of structured pixels across image channels is used to identify and remove artifact pixels from the image prior to data extraction. This component improves the accuracy of data sets while reducing the requirement for time-consuming visual inspection of the data. Crossword enables a fully automated protocol that is robust to significant spatial and intensity aberrations. Overall, the average amount of user intervention is reduced by an order of magnitude and the data quality is increased through artifact removal and reduced user variability. The increase in throughput should aid the further implementation of microarray technologies in clinical studies. PMID:24417579

  3. An efficient algorithm for the stochastic simulation of the hybridization of DNA to microarrays

    PubMed Central

    2009-01-01

    Background: Although oligonucleotide microarray technology is ubiquitous in genomic research, reproducibility and standardization of expression measurements still concern many researchers. Cross-hybridization between microarray probes and non-target ssDNA has been implicated as a primary factor in sensitivity and selectivity loss. Since hybridization is a chemical process, it may be modeled at a population level using a combination of material balance equations and thermodynamics. However, the hybridization reaction network may be exceptionally large for commercial arrays, which often possess at least one reporter per transcript. Quantification of the kinetics and equilibrium of exceptionally large chemical systems of this type is numerically infeasible with customary approaches. Results: In this paper, we present a robust and computationally efficient algorithm for the simulation of hybridization processes underlying microarray assays. Our method may be utilized to identify the extent to which nucleic acid targets (e.g. cDNA) will cross-hybridize with probes, and by extension, characterize probe robustness using the information specified by MAGE-TAB. Using this algorithm, we characterize cross-hybridization in a modified commercial microarray assay. Conclusions: By integrating stochastic simulation with thermodynamic prediction tools for DNA hybridization, one may robustly and rapidly characterize the selectivity of a proposed microarray design at the probe and "system" levels. Our code is available at http://www.laurenzi.net. PMID:20003312
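
    For a single probe-target pair the stochastic simulation reduces to Gillespie's direct method on the reversible reaction T + P <-> TP; the sketch below is this one-reaction special case with arbitrary rate constants and counts, whereas the paper's algorithm scales the idea to very large reaction networks:

```python
import numpy as np

def gillespie_hybridization(n_target, n_probe, kf, kr, t_end, seed=0):
    """Gillespie direct-method simulation of reversible binding T + P <-> TP.
    Returns event times and the number of bound probe-target duplexes."""
    rng = np.random.default_rng(seed)
    t, bound = 0.0, 0
    times, counts = [0.0], [0]
    while t < t_end:
        a_f = kf * (n_target - bound) * (n_probe - bound)  # forward propensity
        a_r = kr * bound                                   # reverse propensity
        a_total = a_f + a_r
        if a_total == 0.0:
            break
        t += rng.exponential(1.0 / a_total)                # waiting time to next event
        bound += 1 if rng.random() < a_f / a_total else -1
        times.append(t)
        counts.append(bound)
    return np.array(times), np.array(counts)

t, n = gillespie_hybridization(n_target=500, n_probe=1000, kf=1e-5, kr=1e-2, t_end=200.0)
print(n[-1])   # fluctuates around the thermodynamic equilibrium occupancy (~220)
```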

  4. MIClique: An algorithm to identify differentially coexpressed disease gene subset from microarray data.

    PubMed

    Zhang, Huanping; Song, Xiaofeng; Wang, Huinan; Zhang, Xiaobai

    2009-01-01

    Computational analysis of microarray data has provided an effective way to identify disease-related genes. Traditional disease gene selection methods for microarray data, such as statistical tests, always focus on differentially expressed genes in different samples through individual gene prioritization. These traditional methods can miss differentially coexpressed (DCE) gene subsets because they ignore the interactions between genes. In this paper, the MIClique algorithm is proposed to identify DCE gene subsets based on mutual information and clique analysis. Mutual information is used to measure the coexpression relationship between each pair of genes in two different kinds of samples. Clique analysis is a commonly used method in biological networks, where a clique generally represents a biological module of similar function. By applying the MIClique algorithm to real gene expression data, some DCE gene subsets which are correlated under one experimental condition but uncorrelated under another are detected from the graphs of the colon and leukemia datasets. PMID:20169000

  5. Novel algorithm for coexpression detection in time-varying microarray data sets.

    PubMed

    Yin, Zong-Xian; Chiang, Jung-Hsien

    2008-01-01

    When analyzing the results of microarray experiments, biologists generally use unsupervised categorization tools. However, such tools regard each time point as an independent dimension and utilize the Euclidean distance to compute the similarities between expressions. Furthermore, some of these methods require the number of clusters to be determined in advance, which is clearly impossible in the case of a new dataset. Therefore, this study proposes a novel scheme, designated as the Variation-based Coexpression Detection (VCD) algorithm, to analyze the trends of expressions based on their variation over time. The proposed algorithm has two advantages. First, it is unnecessary to determine the number of clusters in advance since the algorithm automatically detects those genes whose profiles are grouped together and creates patterns for these groups. Second, the algorithm features a new measurement criterion for calculating the degree of change of the expressions between adjacent time points and evaluating their trend similarities. Three real-world microarray datasets are employed to evaluate the performance of the proposed algorithm. PMID:18245881

  6. SPACE: an algorithm to predict and quantify alternatively spliced isoforms using microarrays.

    PubMed

    Anton, Miguel A; Gorostiaga, Dorleta; Guruceaga, Elizabeth; Segura, Victor; Carmona-Saez, Pedro; Pascual-Montano, Alberto; Pio, Ruben; Montuenga, Luis M; Rubio, Angel

    2008-01-01

    Exon and exon+junction microarrays are promising tools for studying alternative splicing. Current analytical tools applied to these arrays lack two relevant features: the ability to predict unknown spliced forms and the ability to quantify the concentration of known and unknown isoforms. SPACE is an algorithm that has been developed to (1) estimate the number of different transcripts expressed under several conditions, (2) predict the precursor mRNA splicing structure and (3) quantify the transcript concentrations including unknown forms. The results presented here show its robustness and accuracy for real and simulated data. PMID:18312629

  7. Krylov subspace algorithms for computing GeneRank for the analysis of microarray data mining.

    PubMed

    Wu, Gang; Zhang, Ying; Wei, Yimin

    2010-04-01

    GeneRank is a new engine technology for the analysis of microarray experiments. It combines gene expression information with a network structure derived from gene annotations or expression profile correlations. Using matrix decomposition techniques, we first give a matrix analysis of the GeneRank model. We reformulate the GeneRank vector as a linear combination of three parts in the general case when the matrix in question is non-diagonalizable. We then propose two Krylov subspace methods for computing GeneRank. Numerical experiments show that, when the GeneRank problem is very large, the new algorithms are appropriate choices. PMID:20426695
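
    GeneRank reduces to one sparse linear system, commonly written (I - d·W·D⁻¹) r = (1 - d)·ex, with W the gene network, D its degree matrix, ex the normalized expression evidence, and d a damping factor; a Krylov method such as GMRES then applies directly. A SciPy sketch on random data (the paper's own Krylov variants are not reproduced):

```python
import numpy as np
from scipy.sparse import diags, identity, random as sprandom
from scipy.sparse.linalg import gmres

rng = np.random.default_rng(6)
n, d = 2000, 0.85
W = sprandom(n, n, density=0.002, random_state=6, format="csr")
W = ((W + W.T) > 0).astype(float)                 # symmetric 0/1 gene-gene network
deg = np.asarray(W.sum(axis=1)).ravel()
Dinv = diags(1.0 / np.maximum(deg, 1.0))          # guard against isolated genes
ex = rng.random(n)
ex /= ex.sum()                                    # normalized expression evidence

A = identity(n) - d * (W @ Dinv)                  # GeneRank system matrix
r, info = gmres(A, (1.0 - d) * ex)                # Krylov solve; info == 0 on success
print(info, float(r.sum()))
```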

  8. mRMR-ABC: A Hybrid Gene Selection Algorithm for Cancer Classification Using Microarray Gene Expression Profiling.

    PubMed

    Alshamlan, Hala; Badr, Ghada; Alohali, Yousef

    2015-01-01

    An artificial bee colony (ABC) is a relatively recent swarm intelligence optimization approach. In this paper, we propose the first attempt at applying the ABC algorithm to the analysis of microarray gene expression profiles. In addition, we propose an innovative feature selection algorithm, minimum redundancy maximum relevance (mRMR), and combine it with an ABC algorithm, mRMR-ABC, to select informative genes from microarray profiles. The new approach is based on a support vector machine (SVM) algorithm to measure the classification accuracy for the selected genes. We evaluate the performance of the proposed mRMR-ABC algorithm by conducting extensive experiments on six binary and multiclass gene expression microarray datasets. Furthermore, we compare our proposed mRMR-ABC algorithm with previously known techniques, reimplementing two of them for the sake of a fair comparison using the same parameters: mRMR combined with a genetic algorithm (mRMR-GA) and mRMR combined with a particle swarm optimization algorithm (mRMR-PSO). The experimental results show that the proposed mRMR-ABC algorithm achieves accurate classification performance using a small number of predictive genes when tested on the benchmark datasets and compared to previously suggested methods. This shows that mRMR-ABC is a promising approach for solving gene selection and cancer classification problems. PMID:25961028

  9. mRMR-ABC: A Hybrid Gene Selection Algorithm for Cancer Classification Using Microarray Gene Expression Profiling

    PubMed Central

    Alshamlan, Hala; Badr, Ghada; Alohali, Yousef

    2015-01-01

    An artificial bee colony (ABC) is a relatively recent swarm intelligence optimization approach. In this paper, we propose the first attempt at applying the ABC algorithm to the analysis of microarray gene expression profiles. In addition, we propose an innovative feature selection algorithm, minimum redundancy maximum relevance (mRMR), and combine it with an ABC algorithm, mRMR-ABC, to select informative genes from microarray profiles. The new approach is based on a support vector machine (SVM) algorithm to measure the classification accuracy for the selected genes. We evaluate the performance of the proposed mRMR-ABC algorithm by conducting extensive experiments on six binary and multiclass gene expression microarray datasets. Furthermore, we compare our proposed mRMR-ABC algorithm with previously known techniques, reimplementing two of them for the sake of a fair comparison using the same parameters: mRMR combined with a genetic algorithm (mRMR-GA) and mRMR combined with a particle swarm optimization algorithm (mRMR-PSO). The experimental results show that the proposed mRMR-ABC algorithm achieves accurate classification performance using a small number of predictive genes when tested on the benchmark datasets and compared to previously suggested methods. This shows that mRMR-ABC is a promising approach for solving gene selection and cancer classification problems. PMID:25961028

  10. Forward-Masked Frequency Selectivity Improvements in Simulated and Actual Cochlear Implant Users Using a Preprocessing Algorithm.

    PubMed

    Langner, Florian; Jürgens, Tim

    2016-01-01

    Frequency selectivity can be quantified using masking paradigms, such as psychophysical tuning curves (PTCs). Normal-hearing (NH) listeners show sharp PTCs that are level- and frequency-dependent, whereas frequency selectivity is strongly reduced in cochlear implant (CI) users. This study aims at (a) assessing individual shapes of PTCs in CI users, (b) comparing these shapes to those of simulated CI listeners (NH listeners hearing through a CI simulation), and (c) increasing the sharpness of PTCs using a biologically inspired dynamic compression algorithm, BioAid, which has been shown to sharpen the PTC shape in hearing-impaired listeners. A three-alternative-forced-choice forward-masking technique was used to assess PTCs in 8 CI users (with their own speech processor) and 11 NH listeners (with and without listening through a vocoder to simulate electric hearing). CI users showed flat PTCs with large interindividual variability in shape, whereas simulated CI listeners had PTCs of the same average flatness, but more homogeneous shapes across listeners. The algorithm BioAid was used to process the stimuli before entering the CI users' speech processor or the vocoder simulation. This algorithm was able to partially restore frequency selectivity in both groups, particularly in seven out of eight CI users, meaning significantly sharper PTCs than in the unprocessed condition. The results indicate that algorithms can improve the large-scale sharpness of frequency selectivity in some CI users. This finding may be useful for the design of sound coding strategies particularly for situations in which high frequency selectivity is desired, such as for music perception. PMID:27604785

  11. Classifier dependent feature preprocessing methods

    NASA Astrophysics Data System (ADS)

    Rodriguez, Benjamin M., II; Peterson, Gilbert L.

    2008-04-01

    In mobile applications, computational complexity is an issue that limits sophisticated algorithms from being implemented on these devices. This paper provides an initial solution for applying pattern recognition systems on mobile devices by combining existing preprocessing algorithms for recognition. In pattern recognition systems, it is essential to properly apply feature preprocessing tools prior to training classification models in an attempt to reduce computational complexity and improve the overall classification accuracy. The feature preprocessing tools extended for the mobile environment are feature ranking, feature extraction, data preparation and outlier removal. Most desktop systems today are capable of running a majority of the available classification algorithms without concern for processing time, while the same is not true on mobile platforms. As an application of pattern recognition for mobile devices, the recognition system targets the problem of steganalysis: determining whether an image contains hidden information. The measure of performance shows that feature preprocessing increases the overall steganalysis classification accuracy by an average of 22%. The methods in this paper are tested on a workstation and a Nokia 6620 (Symbian operating system) camera phone with similar results.
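
    The four preprocessing stages named above can be strung together with scikit-learn; this desktop-Python sketch (synthetic data, arbitrary parameter choices) shows the pattern only, not the paper's Symbian implementation or its steganalysis features:

```python
import numpy as np
from sklearn.covariance import EllipticEnvelope
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(7)
X = rng.normal(size=(400, 60))
y = (X[:, :5].sum(axis=1) > 0).astype(int)         # 5 informative features
X[:8] += 12.0                                      # a few gross outliers

inliers = EllipticEnvelope(contamination=0.05, random_state=0).fit_predict(X) == 1
clf = make_pipeline(StandardScaler(),              # data preparation
                    SelectKBest(f_classif, k=20),  # feature ranking
                    PCA(n_components=10),          # feature extraction
                    SVC())
print(cross_val_score(clf, X[inliers], y[inliers], cv=5).mean())
```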

  12. Evaluation of multivariate calibration models with different pre-processing and processing algorithms for a novel resolution and quantitation of spectrally overlapped quaternary mixture in syrup

    NASA Astrophysics Data System (ADS)

    Moustafa, Azza A.; Hegazy, Maha A.; Mohamed, Dalia; Ali, Omnia

    2016-02-01

    A novel approach for the resolution and quantitation of a severely overlapped quaternary mixture of carbinoxamine maleate (CAR), pholcodine (PHL), ephedrine hydrochloride (EPH) and sunset yellow (SUN) in syrup was demonstrated utilizing different spectrophotometry-assisted multivariate calibration methods. The applied methods used different processing and pre-processing algorithms. The proposed methods were partial least squares (PLS), concentration residuals augmented classical least squares (CRACLS), and a novel method: continuous wavelet transforms coupled with partial least squares (CWT-PLS). These methods were applied to a training set in the concentration ranges of 40-100 μg/mL, 40-160 μg/mL, 100-500 μg/mL and 8-24 μg/mL for the four components, respectively. The methods did not require any preliminary separation step or chemical pretreatment. Their validity was evaluated with an external validation set, and their selectivity was demonstrated by analyzing the drugs in their combined pharmaceutical formulation without any interference from additives. The obtained results were statistically compared with the official and reported methods, and no significant difference was observed regarding either accuracy or precision.

  13. Evaluation of multivariate calibration models with different pre-processing and processing algorithms for a novel resolution and quantitation of spectrally overlapped quaternary mixture in syrup.

    PubMed

    Moustafa, Azza A; Hegazy, Maha A; Mohamed, Dalia; Ali, Omnia

    2016-02-01

    A novel approach for the resolution and quantitation of a severely overlapped quaternary mixture of carbinoxamine maleate (CAR), pholcodine (PHL), ephedrine hydrochloride (EPH) and sunset yellow (SUN) in syrup was demonstrated utilizing different spectrophotometry-assisted multivariate calibration methods. The applied methods used different processing and pre-processing algorithms. The proposed methods were partial least squares (PLS), concentration residuals augmented classical least squares (CRACLS), and a novel method: continuous wavelet transforms coupled with partial least squares (CWT-PLS). These methods were applied to a training set in the concentration ranges of 40-100 μg/mL, 40-160 μg/mL, 100-500 μg/mL and 8-24 μg/mL for the four components, respectively. The methods did not require any preliminary separation step or chemical pretreatment. Their validity was evaluated with an external validation set, and their selectivity was demonstrated by analyzing the drugs in their combined pharmaceutical formulation without any interference from additives. The obtained results were statistically compared with the official and reported methods, and no significant difference was observed regarding either accuracy or precision. PMID:26519913
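
    Of the three chemometric models compared, PLS is the most standard. The sketch below simulates four overlapping Gaussian absorption bands obeying Beer's law and fits a multicomponent PLS calibration with scikit-learn; all spectra, concentration ranges and noise levels are invented for illustration, and the CWT pre-processing and CRACLS variants are omitted:

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(8)
wl = np.linspace(200, 400, 256)                          # wavelength grid, nm
bands = np.stack([np.exp(-((wl - c) / w) ** 2)           # pure-component spectra
                  for c, w in [(250, 18), (265, 20), (285, 22), (310, 15)]])

def spectra(C):                                          # Beer's law + measurement noise
    return C @ bands + rng.normal(0, 1e-3, (len(C), wl.size))

C_train = rng.uniform(0.2, 1.0, size=(40, 4))            # training concentrations
pls = PLSRegression(n_components=6).fit(spectra(C_train), C_train)

C_val = rng.uniform(0.2, 1.0, size=(10, 4))              # external validation set
print(np.abs(pls.predict(spectra(C_val)) - C_val).mean())  # small mean recovery error
```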

  14. A novel biclustering algorithm of binary microarray data: BiBinCons and BiBinAlter.

    PubMed

    Saber, Haifa Ben; Elloumi, Mourad

    2015-01-01

    The biclustering of microarray data has been the subject of extensive research, yet none of the existing biclustering algorithms is perfect, and the construction of biologically significant groups of biclusters for large microarray data is still a problem that requires continuous work. Biological validation of biclusters of microarray data is one of the most important open issues; so far, there are no general guidelines in the literature on how to biologically validate extracted biclusters. In this paper, we develop two biclustering algorithms for binary microarray data, adopting the Iterative Row and Column Clustering Combination (IRCCC) approach, called BiBinCons and BiBinAlter. The BiBinAlter algorithm is an improvement of BiBinCons; it differs from BiBinCons in its use of the EvalStab and IndHomog evaluation functions in addition to the CroBin one (Bioinformatics 20:1993-2003, 2004). BiBinAlter can extract biclusters of good quality with better p-values. PMID:26628919

  15. Application of genetic algorithms and constructive neural networks for the analysis of microarray cancer data

    PubMed Central

    2014-01-01

    Background: Extracting relevant information from microarray data is a very complex task due to the characteristics of the data sets, as they comprise a large number of features while few samples are generally available. In this sense, feature selection is a very important aspect of the analysis, helping in the tasks of identifying relevant genes and also maximizing predictive information. Methods: Due to its simplicity and speed, Stepwise Forward Selection (SFS) is a widely used feature selection technique. In this work, we carry out a comparative study of SFS and Genetic Algorithms (GA) as general frameworks for the analysis of microarray data, with the aim of identifying groups of genes with high predictive capability and biological relevance. Six standard and machine learning-based techniques (Linear Discriminant Analysis (LDA), Support Vector Machines (SVM), Naive Bayes (NB), the C-MANTEC Constructive Neural Network, K-Nearest Neighbors (kNN) and the Multilayer Perceptron (MLP)) are used within both frameworks on six freely available datasets for the task of predicting cancer outcome. Results: Better cancer outcome prediction results were obtained using the GA framework, noting that this approach, in comparison to the SFS one, leads to a larger selection set and uses a larger number of comparisons between genetic profiles, and thus is computationally more intensive. The GA framework also yielded a set of genes that can be considered more biologically relevant. Regarding the different classifiers used, standard feedforward neural networks (MLP), LDA and SVM led to similar and best results, while C-MANTEC and kNN followed closely but with lower accuracy. Further, C-MANTEC, MLP and LDA yielded a more limited set of genes than SVM, NB and kNN, and in particular C-MANTEC was the most robust classifier with respect to changes in the parameter settings. Conclusions: This study shows that if prediction accuracy is the objective, the GA
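
    The SFS half of the comparison is available off the shelf in scikit-learn; a minimal sketch with one of the classifiers named above (LDA) on synthetic data, not the study's datasets or tuning:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(11)
X = rng.normal(size=(100, 200))                   # samples x genes (synthetic)
y = (X[:, 0] + X[:, 1] - X[:, 2] > 0).astype(int)

lda = LinearDiscriminantAnalysis()
sfs = SequentialFeatureSelector(lda, n_features_to_select=5, direction="forward", cv=5)
genes = np.flatnonzero(sfs.fit(X, y).get_support())       # indices of selected genes
print(genes, cross_val_score(lda, X[:, genes], y, cv=5).mean())
```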

  16. K-Boost: a scalable algorithm for high-quality clustering of microarray gene expression data.

    PubMed

    Geraci, Filippo; Leoncini, Mauro; Montangero, Manuela; Pellegrini, Marco; Renda, M Elena

    2009-06-01

    Microarray technology for profiling gene expression levels is a popular tool in modern biological research. Applications range from tissue classification to the detection of metabolic networks, from drug discovery to time-critical personalized medicine. Given the increase in size and complexity of the data sets produced, their analysis is becoming problematic in terms of time/quality trade-offs. Clustering genes with similar expression profiles is a key initial step for subsequent manipulations, and the increasing volumes of data to be analyzed require methods that are at the same time efficient (completing an analysis in minutes rather than hours) and effective (identifying significant clusters with high biological correlations). In this paper, we propose K-Boost, a clustering algorithm based on a combination of the furthest-point-first (FPF) heuristic for solving the metric k-center problem, a stability-based method for determining the number of clusters, and a k-means-like cluster refinement. K-Boost runs in O(|N| × k) time, where |N| is the size of the input matrix and k is the number of proposed clusters. Experiments show that this low complexity is usually coupled with a very good quality of the computed clusterings, which we measure using both internal and external criteria. Supporting data can be found as online Supplementary Material at www.liebertonline.com. PMID:19522668
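
    The FPF heuristic at the heart of K-Boost is Gonzalez's 2-approximation for metric k-center: grow the center set by always taking the point furthest from the centers chosen so far. A short sketch on toy data (K-Boost's stability criterion and k-means-like refinement are omitted):

```python
import numpy as np

def fpf_kcenter(X, k, seed=0):
    """Furthest-point-first heuristic for the metric k-center problem
    (Gonzalez's 2-approximation). Returns indices of the chosen centers."""
    rng = np.random.default_rng(seed)
    centers = [int(rng.integers(len(X)))]
    d = np.linalg.norm(X - X[centers[0]], axis=1)   # distance to nearest center
    for _ in range(k - 1):
        nxt = int(np.argmax(d))                     # furthest point becomes a center
        centers.append(nxt)
        d = np.minimum(d, np.linalg.norm(X - X[nxt], axis=1))
    return centers

# toy expression profiles: three well-separated groups of 50 profiles each
rng = np.random.default_rng(9)
X = np.vstack([rng.normal(m, 0.3, size=(50, 20)) for m in (0, 3, 6)])
print(sorted(c // 50 for c in fpf_kcenter(X, 3)))   # one center per group: [0, 1, 2]
```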

  17. GTI: A Novel Algorithm for Identifying Outlier Gene Expression Profiles from Integrated Microarray Datasets

    PubMed Central

    Mpindi, John Patrick; Sara, Henri; Haapa-Paananen, Saija; Kilpinen, Sami; Pisto, Tommi; Bucher, Elmar; Ojala, Kalle; Iljin, Kristiina; Vainio, Paula; Björkman, Mari; Gupta, Santosh; Kohonen, Pekka; Nees, Matthias; Kallioniemi, Olli

    2011-01-01

    Background: Meta-analysis of gene expression microarray datasets presents significant challenges for statistical analysis. We developed and validated a new bioinformatic method for the identification of genes upregulated in subsets of samples of a given tumour type (‘outlier genes’), a hallmark of potential oncogenes. Methodology: A new statistical method (the gene tissue index, GTI) was developed by modifying and adapting algorithms originally developed for statistical problems in economics. We compared the potential of the GTI to detect outlier genes in meta-datasets with four previously defined statistical methods, COPA, the OS statistic, the t-test and ORT, using simulated data. We demonstrated that the GTI performed as well as the existing methods in a single-study simulation. Next, we evaluated the performance of the GTI in the analysis of combined Affymetrix gene expression data from several published studies covering 392 normal samples of tissue from the central nervous system, 74 astrocytomas, and 353 glioblastomas. According to the results, the GTI was better able than most of the previous methods to identify known oncogenic outlier genes. In addition, the GTI identified 29 novel outlier genes in glioblastomas, including TYMS and CDKN2A. The over-expression of these genes was validated in vivo by immunohistochemical staining data from clinical glioblastoma samples. Immunohistochemical data were available for 65% (19 of 29) of these genes, and 17 of these 19 genes (90%) showed a typical outlier staining pattern. Furthermore, raltitrexed, a specific inhibitor of TYMS used in the therapy of tumour types other than glioblastoma, also effectively blocked cell proliferation in glioblastoma cell lines, thus highlighting this outlier gene candidate as a potential therapeutic target. Conclusions/Significance: Taken together, these results support the GTI as a novel approach to identify potential oncogene outliers and drug targets. The algorithm is implemented in

  18. The LeFE algorithm: embracing the complexity of gene expression in the interpretation of microarray data.

    PubMed

    Eichler, Gabriel S; Reimers, Mark; Kane, David; Weinstein, John N

    2007-01-01

    Interpretation of microarray data remains a challenge, and most methods fail to consider the complex, nonlinear regulation of gene expression. To address that limitation, we introduce Learner of Functional Enrichment (LeFE), a statistical/machine learning algorithm based on Random Forest, and demonstrate it on several diverse datasets: smoker/never smoker, breast cancer classification, and cancer drug sensitivity. We also compare it with previously published algorithms, including Gene Set Enrichment Analysis. LeFE regularly identifies statistically significant functional themes consistent with known biology. PMID:17845722

  19. Neighborhood inverse consistency preprocessing

    SciTech Connect

    Freuder, E.C.; Elfe, C.D.

    1996-12-31

    Constraint satisfaction consistency preprocessing methods are used to reduce search effort. Time and especially space costs limit the amount of preprocessing that will be cost-effective. A new form of consistency preprocessing, neighborhood inverse consistency, can achieve more problem pruning than the usual arc consistency preprocessing in a cost-effective manner. There are two basic ideas: (1) Common forms of consistency enforcement basically operate by identifying and remembering solutions to subproblems for which a consistent value cannot be found for some additional problem variable. The space required for this memory can quickly become prohibitive. Inverse consistency basically operates by removing values for variables that are not consistent with any solution to some subproblem involving additional variables. The space requirement is at worst linear. (2) Typically consistency preprocessing achieves some level of consistency uniformly throughout the problem. A subproblem solution will be tested against each additional variable that constrains any subproblem variable. Neighborhood consistency focuses attention on the subproblem formed by the variables that are all constrained by the value in question. By targeting highly relevant subproblems we hope to "skim the cream", obtaining a high payoff for a limited cost.

  20. Exploring the feasibility of next-generation sequencing and microarray data meta-analysis

    PubMed Central

    Wu, Po-Yen; Phan, John H.; Wang, May D.

    2016-01-01

    Emerging next-generation sequencing (NGS) technology potentially resolves many issues that prevent widespread clinical use of gene expression microarrays. However, the number of publicly available NGS datasets is still smaller than that of microarrays. This paper explores the possibilities for combining information from both microarray and NGS gene expression datasets for the discovery of differentially expressed genes (DEGs). We evaluate several existing methods in detecting DEGs using individual datasets as well as combined NGS and microarray datasets. Results indicate that analysis of combined NGS and microarray data is feasible, but successful detection of DEGs may depend on careful selection of algorithms as well as on data normalization and pre-processing. PMID:22256102

  1. An MCMC Algorithm for Target Estimation in Real-Time DNA Microarrays

    NASA Astrophysics Data System (ADS)

    Vikalo, Haris; Gokdemir, Mahsuni

    2010-12-01

    DNA microarrays detect the presence and quantify the amounts of nucleic acid molecules of interest. They rely on a chemical attraction between the target molecules and their Watson-Crick complements, which serve as biological sensing elements (probes). The attraction between these biomolecules leads to binding, in which probes capture target analytes. Recently developed real-time DNA microarrays are capable of observing kinetics of the binding process. They collect noisy measurements of the amount of captured molecules at discrete points in time. Molecular binding is a random process which, in this paper, is modeled by a stochastic differential equation. The target analyte quantification is posed as a parameter estimation problem, and solved using a Markov Chain Monte Carlo technique. In simulation studies where we test the robustness with respect to the measurement noise, the proposed technique significantly outperforms previously proposed methods. Moreover, the proposed approach is tested and verified on experimental data.
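
    As a concrete illustration of the estimation setup, here is a minimal random-walk Metropolis sketch in Python for a single binding-rate parameter of a simplified Langmuir-type capture model with Gaussian measurement noise; the paper's stochastic differential equation model and its full parameter set are richer than this.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # Simulate noisy kinetic measurements of captured targets (toy model:
    # first-order approach to saturation; the paper models binding as an SDE).
    dt, T, x_max, k_true, noise = 0.1, 100, 1.0, 0.05, 0.02
    t = np.arange(T) * dt
    y = x_max * (1 - np.exp(-k_true * t)) + rng.normal(0, noise, T)

    def log_lik(k):
        if k <= 0:
            return -np.inf
        mu = x_max * (1 - np.exp(-k * t))
        return -0.5 * np.sum((y - mu) ** 2) / noise ** 2

    # Random-walk Metropolis over the binding rate k.
    k, ll, chain = 0.01, log_lik(0.01), []
    for _ in range(5000):
        prop = k + rng.normal(0, 0.005)
        ll_prop = log_lik(prop)
        if np.log(rng.random()) < ll_prop - ll:   # accept/reject step
            k, ll = prop, ll_prop
        chain.append(k)
    print("posterior mean of k:", np.mean(chain[1000:]))  # close to 0.05
    ```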

  2. Biomarker Discovery Based on Hybrid Optimization Algorithm and Artificial Neural Networks on Microarray Data for Cancer Classification.

    PubMed

    Moteghaed, Niloofar Yousefi; Maghooli, Keivan; Pirhadi, Shiva; Garshasbi, Masoud

    2015-01-01

    The improvement of high-throughput gene-profiling microarray technology has made it possible to monitor the expression values of thousands of genes simultaneously. Detailed examination of changes in gene expression levels can help physicians diagnose efficiently, classify tumors and cancer types, and choose effective treatments. Finding genes that can correctly classify groups of cancers based on hybrid optimization algorithms is the main purpose of this paper. Here, a hybrid particle swarm optimization and genetic algorithm method is used for gene selection, and an artificial neural network (ANN) is adopted as the classifier. In this work, we have improved the ability of the algorithm for the classification problem by finding a small group of biomarkers and also the best parameters of the classifier. The proposed approach is tested on three benchmark gene expression datasets: blood (acute myeloid leukemia, acute lymphoblastic leukemia), colon, and breast datasets. We used 10-fold cross-validation to measure accuracy, and a decision tree algorithm to find the relations between the biomarkers from a biological point of view. To test the ability of the trained ANN models to categorize the cancers, we analyzed additional blinded samples that were not previously used for the training procedure. Experimental results show that the proposed method can reduce the dimension of the dataset, confirm the most informative gene subset, and improve classification accuracy with the best parameters for each dataset. PMID:26120567
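
    The wrapper idea in the abstract — evolve gene subsets and score each subset by a cross-validated ANN — can be sketched compactly. The following Python sketch keeps only a plain GA with an MLP fitness function (the paper hybridises GA with PSO and also tunes classifier parameters); all names are illustrative, and this is practical only for small matrices.

    ```python
    import numpy as np
    from sklearn.model_selection import cross_val_score
    from sklearn.neural_network import MLPClassifier

    def ga_select(X, y, n_pop=20, n_gen=10, p_mut=0.01, seed=0):
        """Genetic algorithm over binary gene masks, scored by an ANN."""
        rng = np.random.default_rng(seed)
        n = X.shape[1]
        pop = rng.random((n_pop, n)) < 0.05            # sparse random subsets

        def fitness(mask):
            if not mask.any():
                return 0.0
            clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500,
                                random_state=0)
            return cross_val_score(clf, X[:, mask], y, cv=3).mean()

        for _ in range(n_gen):
            fit = np.array([fitness(m) for m in pop])
            parents = pop[np.argsort(fit)[::-1][:n_pop // 2]]  # truncation selection
            children = []
            for _ in range(n_pop - len(parents)):
                a, b = parents[rng.integers(len(parents), size=2)]
                cut = rng.integers(1, n)
                child = np.concatenate([a[:cut], b[cut:]])     # one-point crossover
                child ^= rng.random(n) < p_mut                 # bit-flip mutation
                children.append(child)
            pop = np.vstack([parents, children])
        return np.flatnonzero(pop[np.argmax([fitness(m) for m in pop])])
    ```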

  4. The preprocessed doacross loop

    NASA Technical Reports Server (NTRS)

    Saltz, Joel H.; Mirchandaney, Ravi

    1990-01-01

    Dependencies between loop iterations cannot always be characterized during program compilation. Doacross loops typically make use of a priori knowledge of inter-iteration dependencies to carry out required synchronizations. A type of doacross loop is proposed that allows the scheduling of loop iterations among processors without advance knowledge of inter-iteration dependencies. The proposed method requires that parallelizable preprocessing and postprocessing steps be carried out during program execution.

  5. Classification of Non-Small Cell Lung Cancer Using Significance Analysis of Microarray-Gene Set Reduction Algorithm

    PubMed Central

    Zhang, Lei; Wang, Linlin; Du, Bochuan; Wang, Tianjiao; Tian, Pu

    2016-01-01

    Among non-small cell lung cancers (NSCLC), adenocarcinoma (AC) and squamous cell carcinoma (SCC) are the two major histology subtypes, accounting for roughly 40% and 30% of all lung cancer cases, respectively. Since AC and SCC differ in their cell of origin, location within the lung, and growth pattern, they are considered distinct diseases. Gene expression signatures have been demonstrated to be an effective tool for distinguishing AC and SCC. Gene set analysis is usually regarded as irrelevant to the identification of gene expression signatures. Nevertheless, we found that one specific gene set analysis method, significance analysis of microarray-gene set reduction (SAMGSR), can be adopted directly to select relevant features and construct gene expression signatures. In this study, we applied SAMGSR to an NSCLC gene expression dataset. When compared with several novel feature selection algorithms, for example LASSO, SAMGSR has equivalent or better performance in terms of predictive ability and model parsimony; SAMGSR can therefore indeed serve as a feature selection algorithm. Additionally, we applied SAMGSR to the AC and SCC subtypes separately to discriminate their respective stages, that is, stage II versus stage I. The small overlap between the two resulting gene signatures illustrates that AC and SCC are distinct diseases. Therefore, stratified analyses on subtypes are recommended when diagnostic or prognostic signatures of these two NSCLC subtypes are constructed. PMID:27446945

  6. Preoperative overnight parenteral nutrition (TPN) improves skeletal muscle protein metabolism indicated by microarray algorithm analyses in a randomized trial.

    PubMed

    Iresjö, Britt-Marie; Engström, Cecilia; Lundholm, Kent

    2016-06-01

    Loss of muscle mass is associated with increased risk of morbidity and mortality in hospitalized patients. Uncertainties about the efficiency of short-term artificial nutrition remain, specifically regarding improvement of protein balance in skeletal muscles. In this study, algorithmic microarray analysis was applied to map cellular changes related to muscle protein metabolism in human skeletal muscle tissue during provision of overnight preoperative total parenteral nutrition (TPN). Twenty-two patients (11/group) scheduled for upper GI surgery due to malignant or benign disease received a continuous peripheral all-in-one TPN infusion (30 kcal/kg/day, 0.16 gN/kg/day) or saline infusion for 12 h prior to operation. Biopsies from the rectus abdominis muscle were taken at the start of operation for isolation of muscle RNA. RNA expression microarray analyses were performed with Agilent Sureprint G3, 8 × 60K arrays using one-color labeling. 447 mRNAs were differentially expressed between study and control patients (P < 0.1). mRNAs related to ribosomal biogenesis, mRNA processing, and translation were upregulated during overnight nutrition, particularly anabolic signaling via S6K1 (P < 0.01-0.1). Transcripts of genes associated with lysosomal degradation showed consistently lower expression during TPN, while mRNAs for ubiquitin-mediated degradation of proteins, as well as transcripts related to intracellular signaling pathways (PI3 kinase/MAP kinase), were either increased or decreased. In conclusion, muscle mRNA alterations during overnight standard TPN infusions at a constant rate altered mRNAs associated with mTOR signaling, increased initiation of protein translation, and suppressed autophagy/lysosomal degradation of proteins. This indicates that overnight preoperative parenteral nutrition is effective in promoting muscle protein metabolism. PMID:27273879

  7. Comparing Binaural Pre-processing Strategies III

    PubMed Central

    Warzybok, Anna; Ernst, Stephan M. A.

    2015-01-01

    A comprehensive evaluation of eight signal pre-processing strategies, including directional microphones, coherence filters, single-channel noise reduction, binaural beamformers, and their combinations, was undertaken with normal-hearing (NH) and hearing-impaired (HI) listeners. Speech reception thresholds (SRTs) were measured in three noise scenarios (multitalker babble, cafeteria noise, and single competing talker). Predictions of three common instrumental measures were compared with the general perceptual benefit caused by the algorithms. The individual SRTs measured without pre-processing and individual benefits were objectively estimated using the binaural speech intelligibility model. Ten listeners with NH and 12 HI listeners participated. The participants varied in age and pure-tone threshold levels. Although HI listeners required a better signal-to-noise ratio to obtain 50% intelligibility than listeners with NH, no differences in SRT benefit from the different algorithms were found between the two groups. With the exception of single-channel noise reduction, all algorithms showed an improvement in SRT of between 2.1 dB (in cafeteria noise) and 4.8 dB (in the single competing talker condition). Model predictions with the binaural speech intelligibility model explained 83% of the measured variance of the individual SRTs in the no pre-processing condition. Regarding the benefit from the algorithms, the instrumental measures were not able to predict the perceptual data in all tested noise conditions. The comparable benefit observed for both groups suggests a possible application of noise reduction schemes for listeners with different hearing status. Although the model can predict the individual SRTs without pre-processing, further development is necessary to predict the benefits obtained from the algorithms at an individual level. PMID:26721922

  8. Retinex Preprocessing for Improved Multi-Spectral Image Classification

    NASA Technical Reports Server (NTRS)

    Thompson, B.; Rahman, Z.; Park, S.

    2000-01-01

    The goal of multi-image classification is to identify and label "similar regions" within a scene. The ability to correctly classify a remotely sensed multi-image of a scene is affected by the ability of the classification process to adequately compensate for the effects of atmospheric variations and sensor anomalies. Better classification may be obtained if the multi-image is preprocessed before classification, so as to reduce the adverse effects of image formation. In this paper, we discuss the overall impact on multi-spectral image classification when the retinex image enhancement algorithm is used to preprocess multi-spectral images. The retinex is a multi-purpose image enhancement algorithm that performs dynamic range compression, reduces the dependence on lighting conditions, and generally enhances apparent spatial resolution. The retinex has been successfully applied to the enhancement of many different types of grayscale and color images. We show in this paper that retinex preprocessing improves the spatial structure of multi-spectral images and thus provides better within-class variations than would otherwise be obtained without the preprocessing. For a series of multi-spectral images obtained with diffuse and direct lighting, we show that without retinex preprocessing the class spectral signatures vary substantially with the lighting conditions. Whereas multi-dimensional clustering without preprocessing produced one-class homogeneous regions, the classification on the preprocessed images produced multi-class non-homogeneous regions. This lack of homogeneity is explained by the interaction between different agronomic treatments applied to the regions: the preprocessed images are closer to ground truth. The principle advantage that the retinex offers is that for different lighting conditions classifications derived from the retinex preprocessed images look remarkably "similar", and thus more consistent, whereas classifications derived from the original
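
    The retinex operation itself is compact. A minimal single-scale retinex sketch in Python for one spectral band (the retinex used in the paper is a multi-scale variant, and sigma here is an illustrative surround size):

    ```python
    import numpy as np
    from scipy.ndimage import gaussian_filter

    def single_scale_retinex(band, sigma=30.0):
        """log(image) minus log(Gaussian surround): compresses dynamic range
        and discounts slowly varying illumination, per the retinex idea."""
        img = band.astype(np.float64) + 1.0            # avoid log(0)
        return np.log(img) - np.log(gaussian_filter(img, sigma))

    # Preprocess every band of a multi-spectral cube independently.
    cube = np.random.rand(4, 256, 256) * 255           # stand-in for real bands
    enhanced = np.stack([single_scale_retinex(b) for b in cube])
    ```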

  9. Microarrays: an overview.

    PubMed

    Lee, Norman H; Saeed, Alexander I

    2007-01-01

    Gene expression microarrays are being used widely to address a myriad of complex biological questions. To gather meaningful expression data, it is crucial to have a firm understanding of the steps involved in the application of microarrays. The available microarray platforms are discussed along with their advantages and disadvantages. Additional considerations include study design, quality control and systematic assessment of microarray performance, RNA-labeling strategies, sample allocation, signal amplification schemes, defining the number of appropriate biological replicates, data normalization, statistical approaches to identify differentially regulated genes, and clustering algorithms for data visualization. In this chapter, the underlying principles regarding microarrays are reviewed, to serve as a guide when navigating through this powerful technology. PMID:17332646

  10. Context-based preprocessing of molecular docking data

    PubMed Central

    2013-01-01

    Background Data preprocessing is a major step in data mining. In data preprocessing, several known techniques can be applied, or new ones developed, to improve data quality such that the mining results become more accurate and intelligible. Bioinformatics is one area with a high demand for generation of comprehensive models from large datasets. In this article, we propose a context-based data preprocessing approach to mine data from molecular docking simulation results. The test cases used a fully-flexible receptor (FFR) model of Mycobacterium tuberculosis InhA enzyme (FFR_InhA) and four different ligands. Results We generated an initial set of attributes as well as their respective instances. To improve this initial set, we applied two selection strategies. The first was based on our context-based approach while the second used the CFS (Correlation-based Feature Selection) machine learning algorithm. Additionally, we produced an extra dataset containing features selected by combining our context strategy and the CFS algorithm. To demonstrate the effectiveness of the proposed method, we evaluated its performance based on various predictive (RMSE, MAE, Correlation, and Nodes) and context (Precision, Recall and FScore) measures. Conclusions Statistical analysis of the results shows that the proposed context-based data preprocessing approach significantly improves predictive and context measures and outperforms the CFS algorithm. Context-based data preprocessing improves mining results by producing superior interpretable models, which makes it well-suited for practical applications in molecular docking simulations using FFR models. PMID:24564276

  11. Compact Circuit Preprocesses Accelerometer Output

    NASA Technical Reports Server (NTRS)

    Bozeman, Richard J., Jr.

    1993-01-01

    Compact electronic circuit transfers dc power to, and preprocesses ac output of, accelerometer and associated preamplifier. Incorporated into accelerometer case during initial fabrication or retrofit onto commercial accelerometer. Made of commercial integrated circuits and other conventional components; made smaller by use of micrologic and surface-mount technology.

  12. Arabic handwritten: pre-processing and segmentation

    NASA Astrophysics Data System (ADS)

    Maliki, Makki; Jassim, Sabah; Al-Jawad, Naseer; Sellahewa, Harin

    2012-06-01

    This paper is concerned with pre-processing and segmentation tasks that influence the performance of Optical Character Recognition (OCR) systems and handwritten/printed text recognition. In Arabic, these tasks are adversely affected by the fact that many words are made up of sub-words; many sub-words have one or more associated diacritics that are not connected to the sub-word's body, and multiple instances of sub-words may overlap. To overcome these problems we investigate and develop segmentation techniques that first segment a document into sub-words, link the diacritics with their sub-words, and remove possible overlap between words and sub-words. We also investigate two approaches for pre-processing tasks to estimate sub-word baselines and to determine parameters that yield appropriate slope correction and slant removal. We investigate the use of linear regression on sub-word pixels to determine their central x and y coordinates, as well as their high-density part. We also develop a new incremental rotation procedure, performed on sub-words, that determines the best rotation angle needed to realign baselines. We demonstrate the benefits of these proposals by conducting extensive experiments on publicly available databases and in-house created databases. These algorithms help improve character segmentation accuracy by transforming handwritten Arabic text into a form that can benefit from analysis techniques developed for printed text.
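
    The baseline-by-regression step is easy to make concrete. A minimal Python sketch, assuming a binary sub-word image with ink pixels set to 1 (the paper additionally exploits the high-density part of the sub-word, which this sketch omits):

    ```python
    import numpy as np

    def baseline_angle(sub_word):
        """Least-squares fit y = a*x + b over ink-pixel coordinates; the
        arctangent of the slope is the rotation needed to realign the
        baseline."""
        ys, xs = np.nonzero(sub_word)
        slope, _ = np.polyfit(xs, ys, 1)
        return np.degrees(np.arctan(slope))

    # Toy: a synthetic stroke along y = 0.2x + 20 reports its own slant.
    img = np.zeros((100, 100), dtype=np.uint8)
    xs = np.arange(10, 90)
    img[(0.2 * xs + 20).astype(int), xs] = 1
    print(baseline_angle(img))                         # about 11.3 degrees
    ```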

  13. Image preprocessing study on KPCA-based face recognition

    NASA Astrophysics Data System (ADS)

    Li, Xuan; Li, Dehua

    2015-12-01

    Face recognition, as an important biometric identification method with friendly, natural, and convenient advantages, has received more and more attention. This paper studies a face recognition system comprising face detection, feature extraction, and recognition, mainly by researching the related theory and key technology of various preprocessing methods in the face detection process; using the KPCA method, it focuses on the recognition results obtained under different preprocessing methods. We choose the YCbCr color space for skin segmentation and integral projection for face location. We use erosion and dilation (the opening and closing operations) and an illumination compensation method to preprocess face images, and then apply a face recognition method based on kernel principal component analysis (KPCA); the experiments were carried out on a typical face database, with the algorithms implemented on the MATLAB platform. Experimental results show that the kernel extension of the PCA algorithm, being a nonlinear feature extraction method, makes the extracted features represent the original image information better under certain conditions and can thus achieve a higher recognition rate. In the image preprocessing stage, we found that different operations on the images can produce different results, and hence different recognition rates in the recognition stage. At the same time, in kernel principal component analysis, the power of the polynomial kernel function can affect the recognition result.
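
    A minimal sketch of the recognition stage in Python with scikit-learn, assuming rows of preprocessed, flattened face images; the polynomial degree is the kernel power whose effect the paper discusses (data and parameters here are illustrative stand-ins).

    ```python
    import numpy as np
    from sklearn.decomposition import KernelPCA
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.pipeline import make_pipeline

    # KPCA feature extraction (polynomial kernel) + nearest-neighbour matching.
    model = make_pipeline(
        KernelPCA(n_components=20, kernel="poly", degree=3),
        KNeighborsClassifier(n_neighbors=1),
    )

    rng = np.random.default_rng(0)
    X = rng.random((60, 32 * 32))        # stand-in for preprocessed face images
    y = np.repeat(np.arange(10), 6)      # ten subjects, six images each
    model.fit(X[::2], y[::2])            # train on half of the images
    print("recognition rate:", model.score(X[1::2], y[1::2]))
    ```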

  14. Preprocessing and analysis of the ECG signals

    NASA Astrophysics Data System (ADS)

    Zhu, Jianmin; Zhang, Xiaolan; Wang, Zhongyu; Wang, Xiaoling

    2008-10-01

    To meet the requirements of automatic analysis and suppression of high-frequency interference in ECG signals, this paper applies a low-pass filter to preprocess the ECG signals and proposes a QRS complex detection method based on the wavelet transform. The method uses the Marr wavelet to decompose and filter the ECG signals with the Mallat algorithm, exploits the relationship between the wavelet transform and signal singularity to detect the QRS complex with an amplitude-threshold method at scale 3, and detects the P wave and R wave at scale 4. Meanwhile, a composite detection method is used for re-detection, improving the detection accuracy. Finally, records from the widely accepted MIT/BIH ECG database are used to test the algorithm, and the results show that the correct detection ratio of this algorithm exceeds 99.8 percent. The detection method presented in this paper is simple, fast, and easy to implement in real-time detection systems for clinical diagnosis.
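
    A sketch of the detection chain in Python (SciPy + PyWavelets), using the Mexican-hat ("Marr") wavelet via a continuous wavelet transform; the scale and threshold values are illustrative stand-ins for the paper's scale-3 setting.

    ```python
    import numpy as np
    import pywt
    from scipy.signal import butter, filtfilt

    def detect_qrs(ecg, fs, scale=3, thresh_ratio=0.5):
        """Low-pass prefilter, Mexican-hat wavelet transform at one scale,
        then amplitude thresholding of local maxima to locate QRS complexes."""
        b, a = butter(4, 40 / (fs / 2), btype="low")   # suppress HF interference
        clean = filtfilt(b, a, ecg)
        coef, _ = pywt.cwt(clean, [scale], "mexh")
        w = np.abs(coef[0])
        peaks = np.flatnonzero((w > thresh_ratio * w.max()) &
                               (w >= np.roll(w, 1)) & (w >= np.roll(w, -1)))
        # enforce a 200 ms refractory period between successive detections
        qrs = []
        for p in peaks:
            if not qrs or p - qrs[-1] > 0.2 * fs:
                qrs.append(p)
        return qrs
    ```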

  15. Biclustering of time series microarray data.

    PubMed

    Meng, Jia; Huang, Yufei

    2012-01-01

    Clustering is a popular data exploration technique widely used in microarray data analysis. In this chapter, we review the ideas and algorithms of biclustering and its applications in time series microarray analysis. We first introduce the concept and importance of biclustering and its different variations. We then focus our discussion on the popular iterative signature algorithm (ISA) for searching biclusters in microarray datasets. Next, we discuss in detail the enrichment constraint time-dependent ISA (ECTDISA) for identifying biologically meaningful temporal transcription modules from time series microarray datasets. In the end, we provide an example of ECTDISA applied to time series microarray data of Kaposi's Sarcoma-associated Herpesvirus (KSHV) infection. PMID:22130875
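
    The core ISA iteration is short enough to sketch. A minimal NumPy version for a genes-by-conditions matrix, with illustrative thresholds; ECTDISA adds the enrichment constraint and time dependence on top of this loop.

    ```python
    import numpy as np

    def isa(E, t_g=2.0, t_c=2.0, n_iter=50, seed=0):
        """Alternate between scoring conditions against the current gene set
        and genes against the current condition set, keeping scores that lie
        t_c / t_g standard deviations above the mean, until stable."""
        rng = np.random.default_rng(seed)
        Zg = (E - E.mean(1, keepdims=True)) / (E.std(1, keepdims=True) + 1e-12)
        Zc = (E - E.mean(0, keepdims=True)) / (E.std(0, keepdims=True) + 1e-12)
        genes = rng.random(E.shape[0]) < 0.1          # random seed gene set
        conds = np.zeros(E.shape[1], dtype=bool)
        for _ in range(n_iter):
            s_c = genes @ Zc / max(genes.sum(), 1)    # condition scores
            conds = s_c > s_c.mean() + t_c * s_c.std()
            s_g = Zg @ conds / max(conds.sum(), 1)    # gene scores
            new_genes = s_g > s_g.mean() + t_g * s_g.std()
            if np.array_equal(new_genes, genes):
                break                                  # bicluster converged
            genes = new_genes
        return np.flatnonzero(genes), np.flatnonzero(conds)
    ```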

  16. Analysis of High-Throughput ELISA Microarray Data

    SciTech Connect

    White, Amanda M.; Daly, Don S.; Zangar, Richard C.

    2011-02-23

    Our research group develops analytical methods and software for the high-throughput analysis of quantitative enzyme-linked immunosorbent assay (ELISA) microarrays. ELISA microarrays differ from DNA microarrays in several fundamental aspects and most algorithms for analysis of DNA microarray data are not applicable to ELISA microarrays. In this review, we provide an overview of the steps involved in ELISA microarray data analysis and how the statistically sound algorithms we have developed provide an integrated software suite to address the needs of each data-processing step. The algorithms discussed are available in a set of open-source software tools (http://www.pnl.gov/statistics/ProMAT).

  17. Protein Microarrays

    NASA Astrophysics Data System (ADS)

    Ricard-Blum, S.

    Proteins are key actors in the life of the cell, involved in many physiological and pathological processes. Since variations in the expression of messenger RNA are not systematically correlated with variations in the protein levels, the latter better reflect the way a cell functions. Protein microarrays thus supply complementary information to DNA chips. They are used in particular to analyse protein expression profiles, to detect proteins within complex biological media, and to study protein-protein interactions, which give information about the functions of those proteins [3-9]. They have the same advantages as DNA microarrays for high-throughput analysis, miniaturisation, and the possibility of automation. Section 18.1 gives a brief overview of proteins. Following this, Sect. 18.2 describes how protein microarrays can be made on flat supports, explaining how proteins can be produced and immobilised on a solid support, and discussing the different kinds of substrate and detection method. Section 18.3 discusses the particular format of protein microarrays in suspension. The diversity of protein microarrays and their applications are then reported in Sect. 18.4, with applications to therapeutics (protein-drug interactions) and diagnostics. The prospects for future developments of protein microarrays are then outlined in the conclusion. The bibliography provides an extensive list of reviews and detailed references for those readers who wish to go further in this area. Indeed, the aim of the present chapter is not to give an exhaustive or detailed analysis of the state of the art, but rather to provide the reader with the basic elements needed to understand how proteins are designed and used.

  18. EMAAS: An extensible grid-based Rich Internet Application for microarray data analysis and management

    PubMed Central

    Barton, G; Abbott, J; Chiba, N; Huang, DW; Huang, Y; Krznaric, M; Mack-Smith, J; Saleem, A; Sherman, BT; Tiwari, B; Tomlinson, C; Aitman, T; Darlington, J; Game, L; Sternberg, MJE; Butcher, SA

    2008-01-01

    Background Microarray experimentation requires the application of complex analysis methods as well as the use of non-trivial computer technologies to manage the resultant large data sets. This, together with the proliferation of tools and techniques for microarray data analysis, makes it very challenging for a laboratory scientist to keep up-to-date with the latest developments in this field. Our aim was to develop a distributed e-support system for microarray data analysis and management. Results EMAAS (Extensible MicroArray Analysis System) is a multi-user rich internet application (RIA) providing simple, robust access to up-to-date resources for microarray data storage and analysis, combined with integrated tools to optimise real time user support and training. The system leverages the power of distributed computing to perform microarray analyses, and provides seamless access to resources located at various remote facilities. The EMAAS framework allows users to import microarray data from several sources to an underlying database, to pre-process, quality assess and analyse the data, to perform functional analyses, and to track data analysis steps, all through a single easy to use web portal. This interface offers distance support to users both in the form of video tutorials and via live screen feeds using the web conferencing tool EVO. A number of analysis packages, including R-Bioconductor and Affymetrix Power Tools have been integrated on the server side and are available programmatically through the Postgres-PLR library or on grid compute clusters. Integrated distributed resources include the functional annotation tool DAVID, GeneCards and the microarray data repositories GEO, CELSIUS and MiMiR. EMAAS currently supports analysis of Affymetrix 3' and Exon expression arrays, and the system is extensible to cater for other microarray and transcriptomic platforms. Conclusion EMAAS enables users to track and perform microarray data management and analysis tasks

  19. Research on pre-processing of QR Code

    NASA Astrophysics Data System (ADS)

    Sun, Haixing; Xia, Haojie; Dong, Ning

    2013-10-01

    QR codes encode many kinds of information thanks to their advantages: large storage capacity, high reliability, omnidirectional high-speed reading, small printing size, and highly efficient representation of Chinese characters. In order to obtain a clearer binarized image from a complex background and improve the recognition rate of QR codes, this paper investigates pre-processing methods for QR codes (Quick Response Codes) and presents algorithms and results of image pre-processing for QR code recognition. The conventional approach is improved by modifying Sauvola's adaptive binarization method. Additionally, we introduce QR code extraction that adapts to different image sizes and a flexible image correction approach, improving the efficiency and accuracy of QR code image processing.
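
    For reference, Sauvola's threshold, which the paper modifies, computes a local threshold T = m · (1 + k·(s/R − 1)) from the local mean m and standard deviation s. A minimal NumPy sketch with typical parameter values:

    ```python
    import numpy as np
    from scipy.ndimage import uniform_filter

    def sauvola_binarize(img, w=15, k=0.2, R=128.0):
        """Sauvola adaptive thresholding over a w x w window; pixels darker
        than the local threshold become foreground (QR modules)."""
        img = img.astype(np.float64)
        m = uniform_filter(img, w)                                 # local mean
        s = np.sqrt(np.clip(uniform_filter(img ** 2, w) - m ** 2, 0, None))
        T = m * (1 + k * (s / R - 1))
        return img < T
    ```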

  20. Preprocessing cotton to prevent byssinosis

    PubMed Central

    Merchant, James A.; Lumsden, John C.; Kilburn, Kaye H.; Germino, Victor H.; Hamilton, John D.; Lynn, William S.; Byrd, H.; Baucom, D.

    1973-01-01

    Merchant, J. A., Lumsden, J. C., Kilburn, K. H., Germino, V. H., Hamilton, J. D., Lynn, W. S., Byrd, H., and Baucom, D. (1973). British Journal of Industrial Medicine, 30, 237-247. Preprocessing cotton to prevent byssinosis. A fundamental approach of cleaning or deactivating cotton prior to manufacturing has long been advocated to prevent byssinosis, but no trial had been conducted to test the feasibility of such an approach. In the study described, it was possible to be directed by both biological observations and the results of manufacturing trials. An exposure chamber was built in a cotton textile mill which had been previously studied as part of a large cross-sectional survey. The chamber was provided with an independent air conditioning system and a carding machine which served as a dust generator. Sixteen subjects, who had shown reductions in expiratory flow rate with exposure to cotton dust, were chosen to form a panel for exposure to raw cottons and cottons which had been preprocessed by heating, washing, and steaming. Indicators of effects were symptoms of chest tightness and/or dyspnoea, change in FEV1·0, and fine dust levels over 6 hours of exposure. Exposure of the panel to no cotton dust resulted in no change in FEV1·0 and served as the control for subsequent trials. Exposure to strict middling cotton resulted in a byssinosis symptom prevalence of 22%, a significant decrement in FEV1·0 of 2·9%, and a fine dust level of 0·26 mg/m3. Exposure to strict low middling cotton resulted in a byssinosis symptom prevalence of 79%, a decrement in FEV1·0 of 8·5%, and a fine dust level of 0·89 mg/m3. Oven heating strict low middling cotton resulted in a byssinosis symptom prevalence of 56% and a relatively greater drop in FEV1·0 of 8·3% for 0·48 mg/m3 of fine dust. Washing the strict low grade cotton eliminated detectable biological effects with a symptom prevalence of 8%, an increase of 1·4% in FEV1·0, and a dust level of 0·16 mg/m3, but the cotton

  1. Evaluation of the efficiency of continuous wavelet transform as processing and preprocessing algorithm for resolution of overlapped signals in univariate and multivariate regression analyses; an application to ternary and quaternary mixtures

    NASA Astrophysics Data System (ADS)

    Hegazy, Maha A.; Lotfy, Hayam M.; Mowaka, Shereen; Mohamed, Ekram Hany

    2016-07-01

    Wavelets have been adapted for a vast number of signal-processing applications due to the amount of information that can be extracted from a signal. In this work, a comparative study was conducted on the efficiency of the continuous wavelet transform (CWT) as a signal-processing tool in univariate regression and as a pre-processing tool in multivariate analysis using partial least squares (CWT-PLS). These were applied to complex spectral signals of ternary and quaternary mixtures. The CWT-PLS method succeeded in the simultaneous determination of a quaternary mixture of drotaverine (DRO), caffeine (CAF), paracetamol (PAR) and p-aminophenol (PAP, the major impurity of paracetamol). In contrast, the univariate CWT failed to determine the quaternary mixture simultaneously and was able to determine only PAR and PAP, as well as the ternary mixtures of DRO, CAF, and PAR and of CAF, PAR, and PAP. During the CWT calculations, different wavelet families were tested. The univariate CWT method was validated according to the ICH guidelines. For the development of the CWT-PLS model, a calibration set was prepared by means of an orthogonal experimental design, and the absorption spectra were recorded and processed by CWT. The CWT-PLS model was constructed by regression between the wavelet coefficients and the concentration matrices, and validation was performed by both cross-validation and external validation sets. Both methods were successfully applied to the determination of the studied drugs in pharmaceutical formulations.
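
    The CWT-PLS pipeline maps naturally onto PyWavelets and scikit-learn. A hedged sketch with synthetic stand-ins for the spectra and concentration matrices (the paper screens several wavelet families and validates by cross-validation and an external set):

    ```python
    import numpy as np
    import pywt
    from sklearn.cross_decomposition import PLSRegression

    rng = np.random.default_rng(0)
    spectra = rng.random((25, 200))     # one absorption spectrum per mixture
    conc = rng.random((25, 4))          # DRO, CAF, PAR, PAP concentrations

    # Pre-process: replace each spectrum by its CWT coefficients at one scale.
    coeffs = np.vstack([pywt.cwt(s, [8], "mexh")[0][0] for s in spectra])

    pls = PLSRegression(n_components=4).fit(coeffs, conc)
    predicted = pls.predict(coeffs)     # in practice, cross-validated
    ```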

  3. Chromosome Microarray.

    PubMed

    Anderson, Sharon

    2016-01-01

    Over the last half century, knowledge about genetics, genetic testing, and its complexity has flourished. Completion of the Human Genome Project provided a foundation upon which the accuracy of genetics, genomics, and integration of bioinformatics knowledge and testing has grown exponentially. What is lagging, however, are efforts to reach and engage nurses about this rapidly changing field. The purpose of this article is to familiarize nurses with several frequently ordered genetic tests including chromosomes and fluorescence in situ hybridization followed by a comprehensive review of chromosome microarray. It shares the complexity of microarray including how testing is performed and results analyzed. A case report demonstrates how this technology is applied in clinical practice and reveals benefits and limitations of this scientific and bioinformatics genetic technology. Clinical implications for maternal-child nurses across practice levels are discussed. PMID:27276104

  4. PREPROCESSING MAGNETIC FIELDS WITH CHROMOSPHERIC LONGITUDINAL FIELDS

    SciTech Connect

    Yamamoto, Tetsuya T.; Kusano, K.

    2012-06-20

    Nonlinear force-free field (NLFFF) extrapolation is a powerful tool for modeling the magnetic field in the solar corona. However, since the photospheric magnetic field does not in general satisfy the force-free condition, some kind of processing is required to assimilate the data into the model. In this paper, we report the results of a new preprocessing method for NLFFF extrapolation. Through this preprocessing, we expect to obtain magnetic field data similar to those in the chromosphere. In our preprocessing, we add a new term concerning chromospheric longitudinal fields to the optimization function proposed by Wiegelmann et al. We perform a parameter survey of six free parameters to find minimum force- and torque-freeness with the simulated-annealing method. The analyzed data are a photospheric vector magnetogram of AR 10953 observed with the Hinode spectropolarimeter and a chromospheric longitudinal magnetogram observed with the SOLIS spectropolarimeter. It is found that some preprocessed fields show the smallest force- and torque-freeness and are very similar to the chromospheric longitudinal fields. On the other hand, other preprocessed fields show noisy maps, although their force- and torque-freeness are of the same order. By analyzing the preprocessed noisy maps in wave-number space, we found that small and large wave-number components balance out in the force-free index. We also discuss the iteration limit of our simulated-annealing method and magnetic structure broadening in the chromosphere.
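
    In Wiegelmann-style preprocessing, the magnetogram is driven toward force- and torque-freeness by minimizing a weighted sum of penalty terms. A hedged sketch of the extended functional, where the added term (whose exact form is defined in the paper) penalizes deviation of the longitudinal field from the chromospheric observation:

    ```latex
    L \;=\; \mu_1 L_{\mathrm{force}} \;+\; \mu_2 L_{\mathrm{torque}}
       \;+\; \mu_3 L_{\mathrm{data}} \;+\; \mu_4 L_{\mathrm{smooth}}
       \;+\; \mu_5 \sum_{p}\bigl(B_{\parallel,p} - B^{\mathrm{chrom}}_{\parallel,p}\bigr)^{2}
    ```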

  5. ArraySolver: An Algorithm for Colour-Coded Graphical Display and Wilcoxon Signed-Rank Statistics for Comparing Microarray Gene Expression Data

    PubMed Central

    2004-01-01

    The massive surge in the production of microarray data poses a great challenge for proper analysis and interpretation. In recent years numerous computational tools have been developed to extract meaningful interpretations of microarray gene expression data. However, a convenient tool for two-group comparison of microarray data is still lacking, and users have to rely on commercial statistical packages that can be costly and require special skills, in addition to extra time and effort for transferring data from one platform to another. Various statistical methods, including the t-test, analysis of variance, Pearson test and Mann–Whitney U test, have been reported for comparing microarray data, whereas the utilization of the Wilcoxon signed-rank test, which is an appropriate test for two-group comparison of gene expression data, has largely been neglected in microarray studies. The aim of this investigation was to build an integrated tool, ArraySolver, for colour-coded graphical display and comparison of gene expression data using the Wilcoxon signed-rank test. The results of software validation showed similar outputs from ArraySolver and SPSS for large datasets, whereas the former program appeared to be more accurate for 25 or fewer pairs (n ≤ 25), suggesting its potential application in analysing molecular signatures that usually contain small numbers of genes. The main advantages of ArraySolver are easy data selection, a convenient report format, accurate statistics and the familiar Excel platform. PMID:18629036
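
    The underlying test is available directly in SciPy; a minimal example of a two-group paired comparison for one gene (the numbers here are illustrative):

    ```python
    import numpy as np
    from scipy.stats import wilcoxon

    # Paired expression values for one gene, e.g. treated vs matched control.
    control = np.array([7.1, 6.8, 8.0, 7.4, 6.9, 7.7, 8.2, 7.0])
    treated = np.array([8.3, 7.9, 8.1, 8.8, 7.5, 8.9, 8.4, 8.1])

    stat, p = wilcoxon(control, treated)   # two-sided signed-rank test
    print(f"W = {stat}, p = {p:.4f}")
    ```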

  6. DNA Microarrays

    NASA Astrophysics Data System (ADS)

    Nguyen, C.; Gidrol, X.

    Genomics has revolutionised biological and biomedical research. This revolution was predictable on the basis of its two driving forces: the ever increasing availability of genome sequences and the development of new technology able to exploit them. Up until now, technical limitations meant that molecular biology could only analyse one or two parameters per experiment, providing relatively little information compared with the great complexity of the systems under investigation. This gene by gene approach is inadequate to understand biological systems containing several thousand genes. It is essential to have an overall view of the DNA, RNA, and relevant proteins. A simple inventory of the genome is not sufficient to understand the functions of the genes, or indeed the way that cells and organisms work. For this purpose, functional studies based on whole genomes are needed. Among these new large-scale methods of molecular analysis, DNA microarrays provide a way of studying the genome and the transcriptome. The idea of integrating a large amount of data derived from a support with very small area has led biologists to call these chips, borrowing the term from the microelectronics industry. At the beginning of the 1990s, the development of DNA chips on nylon membranes [1, 2], then on glass [3] and silicon [4] supports, made it possible for the first time to carry out simultaneous measurements of the equilibrium concentration of all the messenger RNA (mRNA) or transcribed RNA in a cell. These microarrays offer a wide range of applications, in both fundamental and clinical research, providing a method for genome-wide characterisation of changes occurring within a cell or tissue, as for example in polymorphism studies, detection of mutations, and quantitative assays of gene copies. With regard to the transcriptome, it provides a way of characterising differentially expressed genes, profiling given biological states, and identifying regulatory channels.

  7. An Automated, Adaptive Framework for Optimizing Preprocessing Pipelines in Task-Based Functional MRI

    PubMed Central

    Churchill, Nathan W.; Spring, Robyn; Afshin-Pour, Babak; Dong, Fan; Strother, Stephen C.

    2015-01-01

    BOLD fMRI is sensitive to blood-oxygenation changes correlated with brain function; however, it is limited by relatively weak signal and significant noise confounds. Many preprocessing algorithms have been developed to control noise and improve signal detection in fMRI. Although the chosen set of preprocessing and analysis steps (the “pipeline”) significantly affects signal detection, pipelines are rarely quantitatively validated in the neuroimaging literature, due to complex preprocessing interactions. This paper outlines and validates an adaptive resampling framework for evaluating and optimizing preprocessing choices by optimizing data-driven metrics of task prediction and spatial reproducibility. Compared to standard “fixed” preprocessing pipelines, this optimization approach significantly improves independent validation measures of within-subject test-retest reliability, between-subject activation overlap, and behavioural prediction accuracy. We demonstrate that preprocessing choices function as implicit model regularizers, and that improvements due to pipeline optimization generalize across a range of simple to complex experimental tasks and analysis models. Results are shown for brief scanning sessions (<3 minutes each), demonstrating that with pipeline optimization, it is possible to obtain reliable results and brain-behaviour correlations in relatively small datasets. PMID:26161667
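
    A minimal sketch of the selection step, assuming each candidate pipeline has already been scored on the two data-driven metrics named in the abstract, prediction (P) and reproducibility (R); the distance-to-ideal summary used here is one common choice, not necessarily the paper's exact criterion.

    ```python
    import numpy as np

    def choose_pipeline(results):
        """Pick the pipeline whose (P, R) pair lies closest to the ideal
        point (1, 1) in the prediction-reproducibility plane."""
        return min(results, key=lambda name: np.hypot(1 - results[name][0],
                                                      1 - results[name][1]))

    results = {                     # hypothetical (P, R) scores per pipeline
        "motion-correct + 6mm smooth": (0.78, 0.61),
        "motion-correct + 8mm smooth + detrend": (0.81, 0.70),
        "minimal preprocessing": (0.66, 0.55),
    }
    print(choose_pipeline(results))
    ```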

  8. Aptamer Microarrays

    SciTech Connect

    Angel-Syrett, Heather; Collett, Jim; Ellington, Andrew D.

    2009-01-02

    In vitro selection can yield specific, high-affinity aptamers. We and others have devised methods for the automated selection of aptamers, and have begun to use these reagents for the construction of arrays. Arrayed aptamers have proven to be almost as sensitive as their solution phase counterparts, and when ganged together can provide both specific and general diagnostic signals for proteins and other analytes. We describe here technical details regarding the production and processing of aptamer microarrays, including blocking, washing, drying, and scanning. We will also discuss the challenges involved in developing standardized and reproducible methods for binding and quantitating protein targets. While signals from fluorescent analytes or sandwiches are typically captured, it has proven possible for immobilized aptamers to be uniquely coupled to amplification methods not available to protein reagents, thus allowing for protein-binding signals to be greatly amplified. Into the future, many of the biosensor methods described in this book can potentially be adapted to array formats, thus further expanding the utility of and applications for aptamer arrays.

  9. Efficient Preprocessing technique using Web log mining

    NASA Astrophysics Data System (ADS)

    Raiyani, Sheetal A.; Jain, Shailendra

    2012-11-01

    Web usage mining can be described as the discovery and analysis of user access patterns through mining of log files and associated data from a particular website. Numerous visitors interact daily with web sites around the world; enormous amounts of data are generated, and this information can be very valuable to a company for understanding customer behavior. In this paper, a complete preprocessing scheme comprising data cleaning and user and session identification activities is presented to improve data quality. User identification, a key issue in the preprocessing phase, aims to identify unique web users. Traditional user identification is based on the site structure, supported by some heuristic rules, which reduces the efficiency of user identification. To overcome this difficulty we introduce the proposed technique DUI (Distinct User Identification), based on IP address, agent, session time, and pages referred during the desired session time. This can be used in counter-terrorism, fraud detection, and detection of unusual access to secure data, and, through detection of users' regular access behavior, to improve the overall design and performance of subsequent preprocessing.
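
    The DUI heuristic described above can be sketched in a few lines of Python; the field names and the 30-minute session gap are illustrative assumptions.

    ```python
    from datetime import datetime, timedelta

    SESSION_GAP = timedelta(minutes=30)

    def identify_users(log_entries):
        """Group log entries into users by (IP, agent) and split each user's
        requests into sessions whenever the inter-request gap exceeds
        SESSION_GAP."""
        users = {}
        for e in sorted(log_entries, key=lambda e: e["time"]):
            key = (e["ip"], e["agent"])
            sessions = users.setdefault(key, [])
            if sessions and e["time"] - sessions[-1][-1]["time"] <= SESSION_GAP:
                sessions[-1].append(e)               # continue current session
            else:
                sessions.append([e])                 # open a new session
        return users

    log = [
        {"ip": "10.0.0.1", "agent": "Firefox", "time": datetime(2012, 11, 1, 9, 0)},
        {"ip": "10.0.0.1", "agent": "Firefox", "time": datetime(2012, 11, 1, 9, 10)},
        {"ip": "10.0.0.1", "agent": "Firefox", "time": datetime(2012, 11, 1, 11, 0)},
    ]
    print({k: len(v) for k, v in identify_users(log).items()})  # 1 user, 2 sessions
    ```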

  10. Preprocessing Moist Lignocellulosic Biomass for Biorefinery Feedstocks

    SciTech Connect

    Neal Yancey; Christopher T. Wright; Craig Conner; J. Richard Hess

    2009-06-01

    Biomass preprocessing is one of the primary operations in the feedstock assembly system of a lignocellulosic biorefinery. Preprocessing is generally accomplished using industrial grinders to format biomass materials into a suitable biorefinery feedstock for conversion to ethanol and other bioproducts. Many factors affect machine efficiency and the physical characteristics of preprocessed biomass. For example, moisture content of the biomass as received from the point of production has a significant impact on overall system efficiency and can significantly affect the characteristics (particle size distribution, flowability, storability, etc.) of the size-reduced biomass. Many different grinder configurations are available on the market, each with advantages under specific conditions. Ultimately, the capacity and/or efficiency of the grinding process can be enhanced by selecting the grinder configuration that optimizes grinder performance based on moisture content and screen size. This paper discusses the relationships of biomass moisture with respect to preprocessing system performance and product physical characteristics and compares data obtained on corn stover, switchgrass, and wheat straw as model feedstocks during Vermeer HG 200 grinder testing. During the tests, grinder screen configuration and biomass moisture content were varied and tested to provide a better understanding of their relative impact on machine performance and the resulting feedstock physical characteristics and uniformity relative to each crop tested.

  11. The Stanford Tissue Microarray Database.

    PubMed

    Marinelli, Robert J; Montgomery, Kelli; Liu, Chih Long; Shah, Nigam H; Prapong, Wijan; Nitzberg, Michael; Zachariah, Zachariah K; Sherlock, Gavin J; Natkunam, Yasodha; West, Robert B; van de Rijn, Matt; Brown, Patrick O; Ball, Catherine A

    2008-01-01

    The Stanford Tissue Microarray Database (TMAD; http://tma.stanford.edu) is a public resource for disseminating annotated tissue images and associated expression data. Stanford University pathologists, researchers and their collaborators worldwide use TMAD for designing, viewing, scoring and analyzing their tissue microarrays. The use of tissue microarrays allows hundreds of human tissue cores to be simultaneously probed by antibodies to detect protein abundance (Immunohistochemistry; IHC), or by labeled nucleic acids (in situ hybridization; ISH) to detect transcript abundance. TMAD archives multi-wavelength fluorescence and bright-field images of tissue microarrays for scoring and analysis. As of July 2007, TMAD contained 205 161 images archiving 349 distinct probes on 1488 tissue microarray slides. Of these, 31 306 images for 68 probes on 125 slides have been released to the public. To date, 12 publications have been based on these raw public data. TMAD incorporates the NCI Thesaurus ontology for searching tissues in the cancer domain. Image processing researchers can extract images and scores for training and testing classification algorithms. The production server uses the Apache HTTP Server, Oracle Database and Perl application code. Source code is available to interested researchers under a no-cost license. PMID:17989087

  12. The Effects of Pre-processing Strategies for Pediatric Cochlear Implant Recipients

    PubMed Central

    Rakszawski, Bernadette; Wright, Rose; Cadieux, Jamie H.; Davidson, Lisa S.; Brenner, Christine

    2016-01-01

    Background Cochlear implants (CIs) have been shown to improve children’s speech recognition over traditional amplification when severe to profound sensorineural hearing loss is present. Despite improvements, understanding speech at low-level intensities or in the presence of background noise remains difficult. In an effort to improve speech understanding in challenging environments, Cochlear Ltd. offers pre-processing strategies that apply various algorithms prior to mapping the signal to the internal array. Two of these strategies include Autosensitivity Control™ (ASC) and Adaptive Dynamic Range Optimization (ADRO®). Based on previous research, the manufacturer’s default pre-processing strategy for pediatrics’ everyday programs combines ASC+ADRO®. Purpose The purpose of this study is to compare pediatric speech perception performance across various pre-processing strategies while applying a specific programming protocol utilizing increased threshold (T) levels to ensure access to very low-level sounds. Research Design This was a prospective, cross-sectional, observational study. Participants completed speech perception tasks in four pre-processing conditions: no pre-processing, ADRO®, ASC, ASC+ADRO®. Study Sample Eleven pediatric Cochlear Ltd. cochlear implant users were recruited: six bilateral, one unilateral, and four bimodal. Intervention Four programs, with the participants’ everyday map, were loaded into the processor with different pre-processing strategies applied in each of the four positions: no pre-processing, ADRO®, ASC, and ASC+ADRO®. Data Collection and Analysis Participants repeated CNC words presented at 50 and 70 dB SPL in quiet and HINT sentences presented adaptively with competing R-Space noise at 60 and 70 dB SPL. Each measure was completed as participants listened with each of the four pre-processing strategies listed above. Test order and condition were randomized. A repeated-measures analysis of variance (ANOVA) was used to

  13. Reliable RANSAC Using a Novel Preprocessing Model

    PubMed Central

    Wang, Xiaoyan; Zhang, Hui; Liu, Sheng

    2013-01-01

    Geometric assumption and verification with RANSAC has become a crucial step in establishing correspondences between local features, owing to its wide applications in biomedical feature analysis and vision computing. However, conventional RANSAC is very time-consuming due to redundant sampling, especially when dealing with numerous matching pairs. This paper presents a novel preprocessing model that extracts a reduced set of reliable correspondences from the initial matching dataset. Both geometric model generation and verification are carried out on this reduced set, which leads to considerable speedups. Building on this, the paper proposes a reliable RANSAC framework using the preprocessing model, which was implemented and verified using Harris and SIFT features, respectively. Compared with traditional RANSAC, experimental results show that our method is more efficient. PMID:23509601
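
    The pattern — prune the match set first, then let RANSAC verify a geometric model on the survivors — looks like this with OpenCV. Here Lowe's ratio test stands in for the paper's preprocessing model, so this is an illustration of the framework rather than its exact algorithm.

    ```python
    import numpy as np
    import cv2

    def filtered_homography(kp1, des1, kp2, des2, ratio=0.7):
        """Reduce the initial matches to reliable ones, then fit a
        homography with RANSAC on the reduced set only."""
        matcher = cv2.BFMatcher(cv2.NORM_L2)
        good = [m for m, n in matcher.knnMatch(des1, des2, k=2)
                if m.distance < ratio * n.distance]          # preprocessing step
        src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
        dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
        H, inlier_mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
        return H, inlier_mask
    ```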

  14. Infrared Mueller matrix acquisition and preprocessing system.

    PubMed

    Carrieri, Arthur H; Owens, David J; Schultz, Jonathan C

    2008-09-20

    An analog Mueller matrix acquisition and preprocessing system (AMMS) was developed for a photopolarimetric-based sensor with 9.1-12.0 microm optical bandwidth, which is the middle infrared wavelength-tunable region of sensor transmitter and "fingerprint" spectral band for chemical-biological (analyte) standoff detection. AMMS facilitates delivery of two alternate polarization-modulated CO(2) laser beams onto subject analyte that excite/relax molecular vibrational resonance in its analytic mass, primes the photoelastic-modulation engine of the sensor, establishes optimum throughput radiance per backscattering cross section, acquires Mueller elements modulo two laser beams in hexadecimal format, preprocesses (normalize, subtract, filter) these data, and formats the results into digitized identification metrics. Feed forwarding of formatted Mueller matrix metrics through an optimally trained and validated neural network provides pattern recognition and type classification of interrogated analyte. PMID:18806864

  15. The preprocessing of multispectral data. II. [of Landsat satellite

    NASA Technical Reports Server (NTRS)

    Quiel, F.

    1976-01-01

    It is pointed out that a correction of atmospheric effects is an important requirement for a full utilization of the possibilities provided by preprocessing techniques. The most significant characteristics of original and preprocessed data are considered, taking into account the solution of classification problems by means of the preprocessing procedure. Improvements obtainable with different preprocessing techniques are illustrated with the aid of examples involving Landsat data regarding an area in Colorado.

  16. Consensus gene regulatory networks: combining multiple microarray gene expression datasets

    NASA Astrophysics Data System (ADS)

    Peeling, Emma; Tucker, Allan

    2007-09-01

    In this paper we present a method for modelling gene regulatory networks by forming a consensus Bayesian network model from multiple microarray gene expression datasets. Our method is based on combining Bayesian network graph topologies and does not require any special pre-processing of the datasets, such as re-normalisation. We evaluate our method on a synthetic regulatory network and part of the yeast heat-shock response regulatory network using publicly available yeast microarray datasets. Results are promising; the consensus networks formed provide a broader view of the potential underlying network, obtaining an increased true positive rate over networks constructed from a single data source.
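
    One simple way to form such a consensus topology is edge voting across the individually learned networks; a minimal sketch (the paper's method of combining Bayesian network graphs may differ in detail):

    ```python
    import numpy as np

    def consensus_network(networks, min_votes=2):
        """Keep a directed edge iff it appears in at least `min_votes` of
        the networks learned from the separate microarray datasets."""
        return (sum(networks) >= min_votes).astype(int)

    # Three 4-gene networks (adjacency matrices) learned from three datasets.
    nets = [np.array([[0,1,0,0],[0,0,1,0],[0,0,0,1],[0,0,0,0]]),
            np.array([[0,1,0,0],[0,0,1,0],[0,0,0,0],[0,0,0,0]]),
            np.array([[0,1,0,0],[0,0,0,0],[0,0,0,1],[0,0,0,0]])]
    print(consensus_network(nets))       # retains edges supported twice or more
    ```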

  17. Groundtruth approach to accurate quantitation of fluorescence microarrays

    SciTech Connect

    Mascio-Kegelmeyer, L; Tomascik-Cheeseman, L; Burnett, M S; van Hummelen, P; Wyrobek, A J

    2000-12-01

    To more accurately measure fluorescent signals from microarrays, we calibrated our acquisition and analysis systems by using groundtruth samples comprised of known quantities of red and green gene-specific DNA probes hybridized to cDNA targets. We imaged the slides with a full-field, white light CCD imager and analyzed them with our custom analysis software. Here we compare, for multiple genes, results obtained with and without preprocessing (alignment, color crosstalk compensation, dark field subtraction, and integration time). We also evaluate the accuracy of various image processing and analysis techniques (background subtraction, segmentation, quantitation and normalization). This methodology calibrates and validates our system for accurate quantitative measurement of microarrays. Specifically, we show that preprocessing the images produces results significantly closer to the known ground-truth for these samples.
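
    Of the preprocessing steps listed, colour crosstalk compensation has a particularly simple linear form: if a measured (green, red) pixel is modelled as the true dye signal times a mixing matrix, compensation is multiplication by the inverse. A sketch with an illustrative, made-up mixing matrix:

    ```python
    import numpy as np

    # Hypothetical crosstalk matrix from single-dye controls: measured = M @ true.
    M = np.array([[1.00, 0.08],   # green channel sees 8% of the red dye
                  [0.05, 1.00]])  # red channel sees 5% of the green dye

    def compensate(pixels_gr):
        """Undo crosstalk for row-stacked (green, red) intensity pairs."""
        return pixels_gr @ np.linalg.inv(M).T

    measured = np.array([[1000.0, 130.0], [240.0, 980.0]])
    print(compensate(measured))   # closer to the true dye-specific signals
    ```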

  18. Experimental variability and data pre-processing as factors affecting the discrimination power of some chemometric approaches (PCA, CA and a new algorithm based on linear regression) applied to (+/-)ESI/MS and RPLC/UV data: Application on green tea extracts.

    PubMed

    Iorgulescu, E; Voicu, V A; Sârbu, C; Tache, F; Albu, F; Medvedovici, A

    2016-08-01

    The influence of the experimental variability (instrumental repeatability, instrumental intermediate precision and sample preparation variability) and data pre-processing (normalization, peak alignment, background subtraction) on the discrimination power of multivariate data analysis methods (Principal Component Analysis -PCA- and Cluster Analysis -CA-), as well as a new algorithm based on linear regression, was studied. Data used in the study were obtained through positive or negative ion monitoring electrospray mass spectrometry ((±)ESI/MS) and reversed phase liquid chromatography/UV spectrometric detection (RPLC/UV) applied to green tea extracts. Extraction in ethanol and heated water infusion were used as sample preparation procedures. The multivariate methods were applied directly to mass spectra and chromatograms, involving strictly a holistic comparison of shapes, without assignment of any structural identity to compounds. An alternative data interpretation based on linear regression analysis mutually applied to data series is also discussed. Slopes, intercepts and correlation coefficients produced by linear regression analysis applied to pairs of very large experimental data series retain the information resulting from high-frequency instrumental acquisition rates, better defining the profiles being compared. Consequently, each type of sample or comparison between samples produces in the Cartesian space an ellipsoidal volume defined by the normal variation intervals of the slope, intercept and correlation coefficient. Distances between volumes graphically illustrate (dis)similarities between compared data. The instrumental intermediate precision had the major effect on the discrimination power of the multivariate data analysis methods. Mass spectra produced through ionization from liquid state in atmospheric pressure conditions of bulk complex mixtures resulting from extracted materials of natural origins provided an excellent data …
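
    The regression-based comparison amounts to regressing one high-rate profile on another and reading off the slope, intercept and correlation coefficient; similar samples give a slope near 1, an intercept near 0 and r close to 1. A minimal sketch on synthetic profiles (not the authors' data or code):

      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(0)
      t = np.linspace(0, 30, 3000)  # a high acquisition rate gives long series
      profile_a = np.exp(-(t - 12) ** 2) + 0.5 * np.exp(-(t - 20) ** 2 / 2)
      profile_b = 1.1 * profile_a + 0.02 + rng.normal(0, 0.01, t.size)

      # Mutually regress the two data series point by point.
      slope, intercept, r, p, se = stats.linregress(profile_a, profile_b)
      print(f"slope={slope:.3f} intercept={intercept:.3f} r={r:.4f}")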

  19. Microarrays, Integrated Analytical Systems

    NASA Astrophysics Data System (ADS)

    Combinatorial chemistry is used to find materials that form sensor microarrays. This book discusses the fundamentals, then proceeds to the many applications of microarrays, from measuring gene expression (DNA microarrays) to protein-protein interactions, peptide chemistry, carbohydrate chemistry, electrochemical detection, and microfluidics.

  20. Acquisition and preprocessing of LANDSAT data

    NASA Technical Reports Server (NTRS)

    Horn, T. N.; Brown, L. E.; Anonsen, W. H. (Principal Investigator)

    1979-01-01

    The original configuration of the GSFC data acquisition, preprocessing, and transmission subsystem, designed to provide LANDSAT data inputs to the LACIE system at JSC, is described. Enhancements made to support LANDSAT -2, and modifications for LANDSAT -3 are discussed. Registration performance throughout the 3 year period of LACIE operations satisfied the 1 pixel root-mean-square requirements established in 1974, with more than two of every three attempts at data registration proving successful, notwithstanding cosmetic faults or content inadequacies to which the process is inherently susceptible. The cloud/snow rejection rate experienced throughout the last 3 years has approached 50%, as expected in most LANDSAT data use situations.

  1. Preprocessing and compression of Hyperspectral images captured onboard UAVs

    NASA Astrophysics Data System (ADS)

    Herrero, Rolando; Cadirola, Martin; Ingle, Vinay K.

    2015-10-01

    Advancements in image sensors and signal processing have led to the successful development of lightweight hyperspectral imaging systems that are critical to the deployment of Photometry and Remote Sensing (PaRS) capabilities in unmanned aerial vehicles (UAVs). In general, hyperspectral data cubes include a few dozen spectral bands that are extremely useful for remote sensing applications, ranging from detection of land vegetation to monitoring of atmospheric products derived from the processing of lower-level radiance images. Because these data cubes are captured in the challenging environment of UAVs, where resources are limited, source encoding by means of compression is a fundamental mechanism that considerably improves overall system performance and reliability. In this paper, we focus on images captured by a state-of-the-art commercial hyperspectral camera and show the results of applying ultraspectral data compression to the obtained data set. Specifically, the compression scheme that we introduce integrates two stages: (1) preprocessing and (2) compression itself. The outcomes of this procedure are linear prediction coefficients and an error signal that, when encoded, results in a compressed version of the original image. The preprocessing and compression algorithms are then optimized and have their time complexity analyzed to guarantee successful deployment on low-power ARM-based embedded processors in the context of UAVs. Lastly, we compare the proposed architecture against other well-known schemes and show that the compression scheme presented in this paper outperforms all of them, delivering both lower compressed data rates and lower distortion.
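
    The prediction/residual idea behind the two stages can be sketched for a single band, assuming ordinary least-squares fitting of the prediction coefficients; the authors' actual codec, band handling and entropy coder are not reproduced here.

      import numpy as np

      def lp_residual(x, order=4):
          """Fit linear-prediction coefficients by least squares and return
          them with the prediction-error (residual) signal."""
          rows = np.array([x[i - order:i] for i in range(order, x.size)])
          coeffs, *_ = np.linalg.lstsq(rows, x[order:], rcond=None)
          return coeffs, x[order:] - rows @ coeffs

      rng = np.random.default_rng(1)
      band = np.cumsum(rng.normal(0, 1, 2000))  # smooth stand-in for one band
      coeffs, res = lp_residual(band)
      # The residual has far less energy than the band itself, so encoding
      # the coefficients plus the residual compresses well.
      print(round(band.var(), 1), round(res.var(), 3))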

  2. Acoustic Biometric System Based on Preprocessing Techniques and Linear Support Vector Machines.

    PubMed

    del Val, Lara; Izquierdo-Fuente, Alberto; Villacorta, Juan J; Raboso, Mariano

    2015-01-01

    Drawing on the results of an acoustic biometric system based on a MSE classifier, a new biometric system has been implemented. This new system preprocesses acoustic images, extracts several parameters and finally classifies them using a Support Vector Machine (SVM). The preprocessing techniques used are spatial filtering; segmentation, based on a Gaussian Mixture Model (GMM), to separate the person from the background; masking, to reduce the dimensions of the images; and binarization, to reduce the size of each image. An analysis of classification error and a study of the sensitivity of the error versus the computational burden of each implemented algorithm are presented. This allows the selection of the most relevant algorithms, according to the benefits required by the system. A significant improvement of the biometric system has been achieved by reducing the classification error, the computational burden and the storage requirements. PMID:26091392
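
    The GMM segmentation, masking and binarization stages can be sketched on a synthetic acoustic image with scikit-learn; the image size, intensities and two-component model below are stand-ins, not the authors' system.

      import numpy as np
      from sklearn.mixture import GaussianMixture

      rng = np.random.default_rng(2)
      image = rng.normal(0.1, 0.05, (64, 64))                 # background
      image[20:44, 24:40] += rng.normal(0.8, 0.1, (24, 16))   # "person" region

      # Two-component GMM on pixel intensities separates person from background.
      gmm = GaussianMixture(n_components=2, random_state=0).fit(image.reshape(-1, 1))
      labels = gmm.predict(image.reshape(-1, 1)).reshape(image.shape)
      person = labels == np.argmax(gmm.means_.ravel())  # higher-mean component

      rows, cols = np.nonzero(person)                   # masking: crop to the person
      mask = image[rows.min():rows.max() + 1, cols.min():cols.max() + 1]
      binary = (mask > mask.mean()).astype(np.uint8)    # binarization
      print(binary.shape, binary.mean())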

  4. Pre-Processing Effect on the Accuracy of Event-Based Activity Segmentation and Classification through Inertial Sensors.

    PubMed

    Fida, Benish; Bernabucci, Ivan; Bibbo, Daniele; Conforto, Silvia; Schmid, Maurizio

    2015-01-01

    Inertial sensors are increasingly being used to recognize and classify physical activities in a variety of applications. For monitoring and fitness applications, it is crucial to develop methods able to segment each activity cycle, e.g., a gait cycle, so that the successive classification step may be more accurate. To increase detection accuracy, pre-processing is often used, with a concurrent increase in computational cost. In this paper, the effect of pre-processing operations on the detection and classification of locomotion activities was investigated, to check whether the presence of pre-processing significantly contributes to an increase in accuracy. The pre-processing stages evaluated in this study were inclination correction and de-noising. Level walking, step ascending, descending and running were monitored using a shank-mounted inertial sensor. Raw and filtered segments, obtained from a modified version of a rule-based gait detection algorithm optimized for sequential processing, were processed to extract time- and frequency-based features for physical activity classification through a support vector machine classifier. The proposed method accurately detected >99% of gait cycles from raw data and produced >98% accuracy on these segmented gait cycles. Pre-processing did not substantially increase classification accuracy, thus highlighting the possibility of reducing the amount of pre-processing for real-time applications. PMID:26378544
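
    A de-noising stage of the kind evaluated here is typically a zero-phase low-pass filter applied before event detection; the sampling rate, cut-off and filter order below are assumptions, not the paper's settings.

      import numpy as np
      from scipy.signal import butter, filtfilt

      fs = 100.0                                     # assumed sampling rate (Hz)
      t = np.arange(0, 10, 1 / fs)
      gait = np.sin(2 * np.pi * 1.0 * t)             # ~1 Hz gait-like component
      raw = gait + 0.3 * np.random.default_rng(3).normal(size=t.size)

      b, a = butter(4, 5.0 / (fs / 2), btype="low")  # 5 Hz low-pass, zero phase
      filtered = filtfilt(b, a, raw)

      # A crude cycle detector (positive-going zero crossings) shows how much
      # the de-noising changes segmentation on this toy signal.
      def crossings(x):
          return int(np.sum((x[:-1] < 0) & (x[1:] >= 0)))

      print(crossings(raw), crossings(filtered))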

  6. Predictor for the effect of amino acid composition on CD4+ T cell epitopes preprocessing.

    PubMed

    Hoze, Ehud; Tsaban, Lea; Maman, Yaakov; Louzoun, Yoram

    2013-05-31

    Predictive tools for all stages of CD8+ T cell epitope processing have reached maturity. Good prediction algorithms have been developed for proteasomal cleavage, TAP and MHC class I peptide binding. The same cannot be said of CD4+ T cell epitopes. While multiple algorithms of varying accuracy have been proposed for MHC class II peptide binding, the preprocessing of CD4+ T cell epitopes still lacks a good prediction algorithm. CD4+ T cell epitope generation includes several stages, not all of which are well defined. Here we group these stages to produce a generic preprocessing-stage predictor for the cleavage processes preceding the presentation of epitopes to CD4+ T cells. The predictor is learned using a combination of in vitro cleavage experiments and observed naturally processed MHC class II binding peptides. The properties of the predictor highlight the effect of different factors on CD4+ T cell epitope preprocessing. The most important factor emerging from the predictor is the secondary structure of the cleaved region in the protein. The effect of the secondary structure is expected, since CD4+ T cell epitopes are not denatured before cleavage. A website based on this predictor is available at: http://peptibase.cs.biu.ac.il/PepCleave_cd4/. PMID:23481624

  7. Study of data preprocess for HJ-1A satellite HSI image

    NASA Astrophysics Data System (ADS)

    Gao, Hai-liang; Gu, Xing-fa; Yu, Tao; He, Hua-ying; Zhu, Ling-ya; Wang, Feng

    2015-08-01

    The Hyper Spectral Imager (HSI) is the first Chinese space-borne hyperspectral sensor, aboard the HJ-1A satellite. We have developed a data preprocessing flow for HSI images that includes destriping, atmospheric correction and spectral filtering. In this paper, the product levels of HSI images are introduced first, and a destriping method for HSI level 2 images is proposed. An atmospheric correction method based on radiative transfer is then summarized to retrieve ground reflectance from HSI images. Furthermore, a new spectral filtering method for the ground reflectance spectra obtained after atmospheric correction is proposed, based on a reference ground spectral database. Lastly, an HSI image acquired over Lake Dali in Inner Mongolia was used to evaluate the preprocessing method. The HSI image after destriping was compared with the original HSI image, showing that the stripe noise has been removed effectively. Both un-smoothed and smoothed reflectance spectra produced by the proposed preprocessing method are compared with reflectance spectra derived with the well-known FLAASH method. The results show that the spectra become much smoother after application of the spectral filtering algorithm, and that the spectra obtained with this new preprocessing method closely match those of the FLAASH method.
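
    Destriping of push-broom imagery is often done by moment matching, forcing every detector column to a common mean and standard deviation. The sketch below shows that generic approach on synthetic data; it is not necessarily the destriping method proposed in the paper.

      import numpy as np

      rng = np.random.default_rng(4)
      scene = rng.normal(100.0, 10.0, (200, 128))   # stripe-free stand-in scene
      gains = rng.normal(1.0, 0.05, 128)            # per-column detector gains
      offsets = rng.normal(0.0, 2.0, 128)
      striped = scene * gains + offsets             # columnar stripe noise

      # Moment matching: normalize each column, then restore the image-wide
      # mean and standard deviation.
      col_mean, col_std = striped.mean(axis=0), striped.std(axis=0)
      destriped = (striped - col_mean) / col_std * striped.std() + striped.mean()
      print(np.abs(destriped.mean(axis=0) - destriped.mean()).max())  # ~0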

  8. Microarrays in hematology.

    PubMed

    Walker, Josef; Flower, Darren; Rigley, Kevin

    2002-01-01

    Microarrays are fast becoming routine tools for the high-throughput analysis of gene expression in a wide range of biologic systems, including hematology. Although a number of approaches can be taken when implementing microarray-based studies, all are capable of providing important insights into biologic function. Although some technical issues have not been resolved, microarrays will continue to make a significant impact on hematologically important research. PMID:11753074

  9. A preprocessing tool for removing artifact from cardiac RR interval recordings using three-dimensional spatial distribution mapping.

    PubMed

    Stapelberg, Nicolas J C; Neumann, David L; Shum, David H K; McConnell, Harry; Hamilton-Craig, Ian

    2016-04-01

    Artifact is common in cardiac RR interval data recorded for heart rate variability (HRV) analysis. A novel algorithm for artifact detection and interpolation in RR interval data is described. It is based on spatial distribution mapping of RR interval magnitude and relationships to adjacent values in three dimensions. The characteristics of normal physiological RR intervals and artifact intervals were established using 24-h recordings from 20 technician-assessed human cardiac recordings. The algorithm was incorporated into a preprocessing tool and validated using 30 artificial RR (ARR) interval data files, to which known quantities of artifact (0.5%, 1%, 2%, 3%, 5%, 7%, 10%) were added. The impact of preprocessing ARR files with 1% added artifact was also assessed using 10 time-domain and frequency-domain HRV metrics. The preprocessing tool was then used to preprocess 69 24-h human cardiac recordings. The tool was able to remove artifact from technician-assessed human cardiac recordings (sensitivity 0.84, SD = 0.09; specificity 1.00, SD = 0.01) and artificial data files. The removal of artifact had a low impact on time-domain and frequency-domain HRV metrics (0% to 2.5% change in values). This novel preprocessing tool can be used with human 24-h cardiac recordings to remove artifact while minimally affecting physiological data, and therefore has a low impact on HRV measures of that data. PMID:26751605
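
    The detect-and-interpolate pattern can be sketched as follows; the simple median-deviation rule is only a stand-in for the paper's three-dimensional spatial-distribution criterion, and all values are synthetic.

      import numpy as np

      rng = np.random.default_rng(5)
      rr = rng.normal(800.0, 30.0, 500)             # RR intervals in ms
      rr[[50, 200, 201, 350]] = [300.0, 1900.0, 250.0, 1600.0]  # injected artifact

      # Flag intervals deviating strongly from the record median.
      med = np.median(rr)
      bad = np.abs(rr - med) > 0.3 * med

      # Interpolate the flagged samples from their valid neighbours.
      idx = np.arange(rr.size)
      clean = rr.copy()
      clean[bad] = np.interp(idx[bad], idx[~bad], rr[~bad])
      print(int(bad.sum()), np.round(clean[[50, 200, 201, 350]]))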

  10. Comparison of planar images and SPECT with Bayesian preprocessing for the demonstration of facial anatomy and craniomandibular disorders

    SciTech Connect

    Kircos, L.T.; Ortendahl, D.A.; Hattner, R.S.; Faulkner, D.; Taylor, R.L.

    1984-01-01

    Craniomandibular disorders involving the facial anatomy may be difficult to demonstrate in planar images. Although bone scanning is generally more sensitive than radiography, facial bone anatomy is complex, and focal areas of increased or decreased radiotracer may become obscured by overlapping structures in planar images. Thus SPECT appears ideally suited to examination of the facial skeleton. A series of patients with craniomandibular disorders of unknown origin were imaged using 20 mCi Tc-99m MDP. Planar and SPECT (Siemens 7500 ZLC Orbiter) images were obtained four hours after injection. The SPECT images were reconstructed with a filtered back-projection algorithm. To improve image contrast and resolution in SPECT images, the rotation views were pre-processed with a Bayesian deblurring algorithm which has previously been shown to offer improved contrast and resolution in planar images. SPECT images using the pre-processed rotation views were obtained and compared to the SPECT images without pre-processing and to the planar images. TMJ arthropathy involving either the glenoid fossa or the mandibular condyle, orthopedic changes involving the mandible or maxilla, localized dental pathosis, as well as changes in structures peripheral to the facial skeleton were identified. Bayesian pre-processed SPECT depicted the facial skeleton more clearly and demonstrated the bony changes associated with craniomandibular disorders more obviously than either planar images or SPECT without pre-processing.

  11. Measurement data preprocessing in a radar-based system for monitoring of human movements

    NASA Astrophysics Data System (ADS)

    Morawski, Roman Z.; Miȩkina, Andrzej; Bajurko, Paweł R.

    2015-02-01

    The importance of research on new technologies that could be employed in care services for elderly people is highlighted. The need to examine the applicability of various sensor systems for non-invasive monitoring of the movements and vital bodily functions, such as heart beat or breathing rhythm, of elderly persons in their home environment is justified. An extensive overview of the literature concerning existing monitoring techniques is provided. The technological potential of radar sensors is indicated. A new class of algorithms for preprocessing of measurement data from impulse radar sensors, applied to the monitoring of elderly people, is proposed. Preliminary results of numerical experiments performed on those algorithms are presented.

  12. Data preprocessing methods of FT-NIR spectral data for the classification cooking oil

    NASA Astrophysics Data System (ADS)

    Ruah, Mas Ezatul Nadia Mohd; Rasaruddin, Nor Fazila; Fong, Sim Siong; Jaafar, Mohd Zuli

    2014-12-01

    This work describes data pre-processing methods for FT-NIR spectroscopy datasets of cooking oils and their quality parameters using chemometric methods. Pre-processing of near-infrared (NIR) spectral data has become an integral part of chemometrics modelling. Hence, this work is dedicated to investigating the utility and effectiveness of pre-processing algorithms, namely row scaling, column scaling and single scaling with Standard Normal Variate (SNV). The combinations of these scaling methods have an impact on exploratory analysis and classification via Principal Component Analysis (PCA) plots. The samples were divided into palm oil and non-palm cooking oil. The classification model was built using FT-NIR cooking oil spectra datasets in absorbance mode over the range 4000 cm-1 to 14000 cm-1. A Savitzky-Golay derivative was applied before developing the classification model. The data were then separated into a training set and a test set using the Duplex method, with the number of samples per class kept equal to 2/3 of the class with the minimum number of samples. The t-statistic was employed as a variable selection method to determine which variables are significant for the classification models. Data pre-processing was evaluated using the modified silhouette width (mSW), PCA plots and the percentage correctly classified (%CC). The results show that different pre-processing strategies lead to substantially different model performance. The effects of the several pre-processing methods, i.e. row scaling, column standardisation and single scaling with Standard Normal Variate, are indicated by mSW and %CC. With a two-PC model, all five classifiers except Quadratic Distance Analysis gave high %CC.
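
    SNV, the scaling transform used throughout, centres and scales each spectrum by its own mean and standard deviation, so baseline shifts and gain differences collapse. A minimal implementation with a quick check:

      import numpy as np

      def snv(spectra):
          """Standard Normal Variate: scale each spectrum (row) by its own
          mean and standard deviation."""
          spectra = np.asarray(spectra, dtype=float)
          mu = spectra.mean(axis=1, keepdims=True)
          sd = spectra.std(axis=1, keepdims=True)
          return (spectra - mu) / sd

      # Two synthetic spectra differing only by gain and baseline become
      # identical after SNV.
      x = np.linspace(4000, 14000, 500)
      base = np.exp(-((x - 9000) / 800.0) ** 2)
      out = snv([base, 2.5 * base + 0.3])
      print(np.allclose(out[0], out[1]))  # True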

  13. Antibiotic treatment algorithm development based on a microarray nucleic acid assay for rapid bacterial identification and resistance determination from positive blood cultures.

    PubMed

    Rödel, Jürgen; Karrasch, Matthias; Edel, Birgit; Stoll, Sylvia; Bohnert, Jürgen; Löffler, Bettina; Saupe, Angela; Pfister, Wolfgang

    2016-03-01

    Rapid diagnosis of bloodstream infections remains a challenge for the early targeting of an antibiotic therapy in sepsis patients. In recent studies, the reliability of the Nanosphere Verigene Gram-positive and Gram-negative blood culture (BC-GP and BC-GN) assays for the rapid identification of bacteria and resistance genes directly from positive BCs has been demonstrated. In this work, we have developed a model to define treatment recommendations by combining Verigene test results with knowledge on local antibiotic resistance patterns of bacterial pathogens. The data of 275 positive BCs were analyzed. Two hundred sixty-three isolates (95.6%) were included in the Verigene assay panels, and 257 isolates (93.5%) were correctly identified. The agreement of the detection of resistance genes with subsequent phenotypic susceptibility testing was 100%. The hospital antibiogram was used to develop a treatment algorithm on the basis of Verigene results that may contribute to a faster patient management. PMID:26712265

  14. A perceptual preprocess method for 3D-HEVC

    NASA Astrophysics Data System (ADS)

    Shi, Yawen; Wang, Yongfang; Wang, Yubing

    2015-08-01

    A perceptual preprocessing method for 3D-HEVC coding is proposed in this paper. First, we propose a new just-noticeable-difference (JND) model, which accounts for luminance contrast masking, spatial masking and temporal masking, as well as saliency characteristics and depth information. We use the spectral residual approach to obtain the saliency map and build a visual saliency factor from it. To distinguish the sensitivity of objects at different depths, we segment each texture frame into foreground and background with an automatic threshold selection algorithm applied to the corresponding depth information, and build a depth weighting factor. A JND modulation factor, formed as a linear combination of the visual saliency factor and the depth weighting factor, adjusts the JND threshold. We then apply the proposed JND model to 3D-HEVC for residual filtering and distortion coefficient processing. In the filtering process, the residual value is set to zero if the JND threshold is greater than the residual value; otherwise the JND threshold is subtracted from the residual value. Experimental results demonstrate that the proposed method achieves an average bit rate reduction of 15.11% compared to the original coding scheme with HTM 12.1, while maintaining the same subjective quality.
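
    The residual filtering rule takes only a few lines. The sketch below applies it to residual magnitudes and restores the sign afterwards, which is one plausible reading of the rule rather than the paper's exact implementation.

      import numpy as np

      def jnd_filter(residual, jnd):
          """Zero residuals whose magnitude is within the JND threshold,
          otherwise shrink the magnitude by the threshold."""
          mag = np.abs(residual)
          return np.where(mag <= jnd, 0.0, mag - jnd) * np.sign(residual)

      residual = np.array([-12.0, -3.0, 0.0, 2.0, 9.0])
      jnd = np.full(5, 4.0)
      print(jnd_filter(residual, jnd))  # [-8. -0.  0.  0.  5.]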

  15. Microarrays--status and prospects.

    PubMed

    Venkatasubbarao, Srivatsa

    2004-12-01

    Microarrays have become an extremely important research tool for life science researchers and are also beginning to be used in diagnostic, treatment and monitoring applications. This article provides a detailed description of microarrays prepared by in situ synthesis, deposition using microspotting methods, nonplanar bead arrays, flow-through microarrays, optical fiber bundle arrays and nanobarcodes. The problems and challenges in the development of microarrays, development of standards and diagnostic microarrays are described. Tables summarizing the vendor list of various derivatized microarray surfaces, commercially sold premade microarrays, bead arrays and unique microarray products in development are also included. PMID:15542153

  16. An automated method for gridding and clustering-based segmentation of cDNA microarray images.

    PubMed

    Giannakeas, Nikolaos; Fotiadis, Dimitrios I

    2009-01-01

    Microarrays are widely used to quantify gene expression levels. Microarray image analysis is one of the tools that are necessary when dealing with vast amounts of biological data. In this work we propose a new method for the automated analysis of microarray images. The proposed method consists of two stages: gridding and segmentation. Initially, the microarray images are preprocessed using template matching, and block and spot finding take place. Then, the non-expressed spots are detected and a grid is fitted to the image using a Voronoi diagram. In the segmentation stage, K-means and Fuzzy C-means (FCM) clustering are employed. The proposed method was evaluated using images from the Stanford Microarray Database (SMD). The results presented for the segmentation stage show the efficiency of our Fuzzy C-means-based method compared to two previously developed K-means-based methods. The proposed method can handle images with artefacts and is fully automated. PMID:19046850
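
    The clustering-based segmentation stage can be illustrated with two-cluster K-means on a synthetic spot patch; Fuzzy C-means, the variant favoured in the paper, needs an additional library (e.g. scikit-fuzzy), so K-means stands in here.

      import numpy as np
      from sklearn.cluster import KMeans

      # Synthetic spot patch: a bright disc (signal) on a dim background.
      yy, xx = np.mgrid[0:21, 0:21]
      patch = 50.0 + 5.0 * np.random.default_rng(6).normal(size=(21, 21))
      patch[(yy - 10) ** 2 + (xx - 10) ** 2 <= 36] += 120.0

      # Cluster pixel intensities into foreground and background.
      km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(patch.reshape(-1, 1))
      fg = (km.labels_ == np.argmax(km.cluster_centers_.ravel())).reshape(patch.shape)
      print(round(patch[fg].mean() - patch[~fg].mean(), 1))  # spot signal estimate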

  17. Full automatic preprocessing of digital map for 2.5D ray tracing propagation model in urban microcellular environment

    NASA Astrophysics Data System (ADS)

    Liu, Zhong-Yu; Guo, Li-Xin; Tao, Wei

    2013-08-01

    Given the importance of the digital map to ray-tracing (RT) algorithms, intelligent preprocessing techniques for the geometric information of buildings are improved, taking into account the characteristics of the quasi-three-dimensional (2.5D) RT method. Using these techniques, geometrical features that have little or no effect on the prediction results are removed from the digital map, and the number of blocking tests performed while executing the RT routine is reduced. With the proposed preprocessing of the digital map in urban microcellular environments, the improvement in computational efficiency is clearly demonstrated without appreciably affecting the accuracy of the propagation prediction.

  18. Application of preprocessing filtering on Decision Tree C4.5 and rough set theory

    NASA Astrophysics Data System (ADS)

    Chan, Joseph C. C.; Lin, Tsau Y.

    2001-03-01

    This paper compares two artificial intelligence methods, the C4.5 decision tree and rough set theory, on stock market data. The C4.5 decision tree is reviewed alongside rough set theory. An enhanced window application is developed to facilitate pre-processing filtering by introducing feature (attribute) transformations, which allow users to input formulas and create new attributes. The application also produces three varieties of data set, using delaying, averaging, and summation. The results demonstrate the improvement that pre-processing with feature (attribute) transformations brings to the C4.5 decision tree. Moreover, the comparison between C4.5 and rough set theory is based on clarity, automation, accuracy, dimensionality, raw data, and speed, and is supported by the rule sets generated by both algorithms on three different sets of data.
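
    The three attribute transformations named (delaying, averaging, summation) map directly onto lagged and rolling features. A minimal sketch on hypothetical closing prices:

      import pandas as pd

      prices = pd.DataFrame({"close": [10.0, 10.5, 10.2, 10.8, 11.1, 10.9, 11.4]})
      prices["close_lag3"] = prices["close"].shift(3)           # delaying
      prices["close_avg3"] = prices["close"].rolling(3).mean()  # averaging
      prices["close_sum3"] = prices["close"].rolling(3).sum()   # summation
      print(prices.dropna())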

  19. Microarray Analysis in Glioblastomas.

    PubMed

    Bhawe, Kaumudi M; Aghi, Manish K

    2016-01-01

    Microarray analysis in glioblastomas is done using either cell lines or patient samples as starting material. A survey of the current literature points to transcript-based microarrays and immunohistochemistry (IHC)-based tissue microarrays as being the preferred methods of choice in cancers of neurological origin. Microarray analysis may be carried out for various purposes including the following: i. To correlate gene expression signatures of glioblastoma cell lines or tumors with response to chemotherapy (DeLay et al., Clin Cancer Res 18(10):2930-2942, 2012). ii. To correlate gene expression patterns with biological features like proliferation or invasiveness of the glioblastoma cells (Jiang et al., PLoS One 8(6):e66008, 2013). iii. To discover new tumor classificatory systems based on gene expression signature, and to correlate therapeutic response and prognosis with these signatures (Huse et al., Annu Rev Med 64(1):59-70, 2013; Verhaak et al., Cancer Cell 17(1):98-110, 2010). While investigators can sometimes use archived tumor gene expression data available from repositories such as the NCBI Gene Expression Omnibus to answer their questions, new arrays must often be run to adequately answer specific questions. Here, we provide a detailed description of microarray methodologies, how to select the appropriate methodology for a given question, and analytical strategies that can be used. Experimental methodology for protein microarrays is outside the scope of this chapter, but basic sample preparation techniques for transcript-based microarrays are included here. PMID:26113463

  20. Comparing Binaural Pre-processing Strategies I

    PubMed Central

    Krawczyk-Becker, Martin; Marquardt, Daniel; Völker, Christoph; Hu, Hongmei; Herzke, Tobias; Coleman, Graham; Adiloğlu, Kamil; Ernst, Stephan M. A.; Gerkmann, Timo; Doclo, Simon; Kollmeier, Birger; Hohmann, Volker; Dietz, Mathias

    2015-01-01

    In a collaborative research project, several monaural and binaural noise reduction algorithms have been comprehensively evaluated. In this article, eight selected noise reduction algorithms were assessed using instrumental measures, with a focus on the instrumental evaluation of speech intelligibility. Four distinct, reverberant scenarios were created to reflect everyday listening situations: a stationary speech-shaped noise, a multitalker babble noise, a single interfering talker, and a realistic cafeteria noise. Three instrumental measures were employed to assess predicted speech intelligibility and predicted sound quality: the intelligibility-weighted signal-to-noise ratio, the short-time objective intelligibility measure, and the perceptual evaluation of speech quality. The results show substantial improvements in predicted speech intelligibility as well as sound quality for the proposed algorithms. The evaluated coherence-based noise reduction algorithm was able to provide improvements in predicted audio signal quality. For the tested single-channel noise reduction algorithm, improvements in intelligibility-weighted signal-to-noise ratio were observed in all but the nonstationary cafeteria ambient noise scenario. Binaural minimum variance distortionless response beamforming algorithms performed particularly well in all noise scenarios. PMID:26721920

  1. An Overview of DNA Microarray Grid Alignment and Foreground Separation Approaches

    NASA Astrophysics Data System (ADS)

    Bajcsy, Peter

    2006-12-01

    This paper overviews DNA microarray grid alignment and foreground separation approaches. Microarray grid alignment and foreground separation are the basic processing steps of DNA microarray images that affect the quality of gene expression information, and hence impact our confidence in any data-derived biological conclusions. Thus, understanding microarray data processing steps becomes critical for performing optimal microarray data analysis. In the past, the grid alignment and foreground separation steps have not been covered extensively in the survey literature. We present several classifications of existing algorithms, and describe the fundamental principles of these algorithms. Challenges related to automation and reliability of processed image data are outlined at the end of this overview paper.

  2. SeqTrim: a high-throughput pipeline for pre-processing any type of sequence read

    PubMed Central

    2010-01-01

    Background High-throughput automated sequencing has enabled an exponential growth rate of sequencing data. This requires increasing sequence quality and reliability in order to avoid database contamination with artefactual sequences. The arrival of pyrosequencing exacerbates this problem and necessitates customisable pre-processing algorithms. Results SeqTrim has been implemented both as a Web application and as a standalone command line application. Already-published and newly-designed algorithms have been included to identify sequence inserts, to remove low quality, vector, adaptor, low complexity and contaminant sequences, and to detect chimeric reads. The availability of several input and output formats allows its inclusion in sequence processing workflows. Due to its specific algorithms, SeqTrim outperforms other pre-processors implemented as Web services or standalone applications. It performs equally well with sequences from EST libraries, SSH libraries, genomic DNA libraries and pyrosequencing reads, and does not lead to over-trimming. Conclusions SeqTrim is an efficient pipeline designed for pre-processing of any type of sequence read, including next-generation sequencing. It is easily configurable and provides a friendly interface that allows users to know what happened with sequences at every pre-processing stage, and to verify pre-processing of an individual sequence if desired. The recommended pipeline reveals more information about each sequence than previously described pre-processors and can discard more sequencing or experimental artefacts. PMID:20089148
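
    One representative pre-processing stage, 3' quality trimming, can be sketched with a toy sliding-window rule; this is not SeqTrim's algorithm, and the threshold and window size are invented.

      def quality_trim(seq, quals, min_q=20, window=5):
          """Cut the 3' end back until the mean quality of the last
          `window` bases reaches `min_q`."""
          for end in range(len(seq), window - 1, -1):
              if sum(quals[end - window:end]) / window >= min_q:
                  return seq[:end]
          return ""

      seq = "ACGTACGTACGTAAT"
      quals = [35, 34, 36, 33, 30, 31, 29, 28, 25, 22, 18, 12, 9, 7, 5]
      print(quality_trim(seq, quals))  # ACGTACGTACGT (low-quality tail removed)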

  3. A survey of visual preprocessing and shape representation techniques

    NASA Technical Reports Server (NTRS)

    Olshausen, Bruno A.

    1988-01-01

    Many recent theories and methods proposed for visual preprocessing and shape representation are summarized. The survey brings together research from the fields of biology, psychology, computer science, electrical engineering, and most recently, neural networks. It was motivated by the need to preprocess images for a sparse distributed memory (SDM), but the techniques presented may also prove useful for applying other associative memories to visual pattern recognition. The material of this survey is divided into three sections: an overview of biological visual processing; methods of preprocessing (extracting parts of shape, texture, motion, and depth); and shape representation and recognition (form invariance, primitives and structural descriptions, and theories of attention).

  4. Comparing Binaural Pre-processing Strategies II

    PubMed Central

    Hu, Hongmei; Krawczyk-Becker, Martin; Marquardt, Daniel; Herzke, Tobias; Coleman, Graham; Adiloğlu, Kamil; Bomke, Katrin; Plotz, Karsten; Gerkmann, Timo; Doclo, Simon; Kollmeier, Birger; Hohmann, Volker; Dietz, Mathias

    2015-01-01

    Several binaural audio signal enhancement algorithms were evaluated with respect to their potential to improve speech intelligibility in noise for users of bilateral cochlear implants (CIs). 50% speech reception thresholds (SRT50) were assessed using an adaptive procedure in three distinct, realistic noise scenarios. All scenarios were highly nonstationary, complex, and included a significant amount of reverberation. Other aspects, such as the perfectly frontal target position, were idealized laboratory settings, allowing the algorithms to perform better than in corresponding real-world conditions. Eight bilaterally implanted CI users, wearing devices from three manufacturers, participated in the study. In all noise conditions, a substantial improvement in SRT50 compared to the unprocessed signal was observed for most of the algorithms tested, with the largest improvements generally provided by binaural minimum variance distortionless response (MVDR) beamforming algorithms. The largest overall improvement in speech intelligibility was achieved by an adaptive binaural MVDR in a spatially separated, single competing talker noise scenario. A no-pre-processing condition and adaptive differential microphones without a binaural link served as the two baseline conditions. SRT50 improvements provided by the binaural MVDR beamformers surpassed the performance of the adaptive differential microphones in most cases. Speech intelligibility improvements predicted by instrumental measures were shown to account for some but not all aspects of the perceptually obtained SRT50 improvements measured in bilaterally implanted CI users. PMID:26721921

  5. Characterizing the continuously acquired cardiovascular time series during hemodialysis, using median hybrid filter preprocessing noise reduction

    PubMed Central

    Wilson, Scott; Bowyer, Andrea; Harrap, Stephen B

    2015-01-01

    The clinical characterization of cardiovascular dynamics during hemodialysis (HD) has important pathophysiological implications for diagnosis, cardiovascular risk assessment, and treatment efficacy. Currently the diagnosis of significant intradialytic systolic blood pressure (SBP) changes among HD patients is imprecise and opportunistic, reliant upon the presence of hypotensive symptoms in conjunction with coincident but isolated noninvasive brachial cuff blood pressure (NIBP) readings. Considering hemodynamic variables as a time series makes a continuous recording approach more desirable than intermittent measures; however, in the clinical environment, the data signal is susceptible to corruption by both impulsive and Gaussian-type noise. Signal preprocessing is an attractive solution to this problem. Prospectively collected continuous noninvasive SBP data over the short-break intradialytic period in ten patients were preprocessed using a novel median hybrid filter (MHF) algorithm and compared with 50 time-coincident pairs of intradialytic NIBP measures from routine HD practice. The median hybrid preprocessing technique for continuously acquired cardiovascular data yielded a dynamic regression without significant noise and artifact, suitable for high-level profiling of time-dependent SBP behavior. Signal accuracy is highly comparable with standard NIBP measurement, with the added clinical benefit of dynamic real-time hemodynamic information. PMID:25678827
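
    A common FIR median hybrid (FMH) filter replaces each sample with the median of the preceding window's mean, the sample itself and the following window's mean, suppressing impulsive artifact while preserving trends. The sketch below shows that generic variant; it is not necessarily the authors' exact filter, and the window length and data are invented.

      import numpy as np

      def fmh_filter(x, k=5):
          """FIR median hybrid filter; edge samples are left unfiltered."""
          y = x.astype(float)
          for n in range(k, x.size - k):
              left = x[n - k:n].mean()
              right = x[n + 1:n + k + 1].mean()
              y[n] = np.median([left, x[n], right])
          return y

      rng = np.random.default_rng(7)
      sbp = 120 + np.cumsum(rng.normal(0, 0.2, 400))    # drifting SBP-like trace
      sbp[100] += 60                                    # impulsive artifact
      print(abs(fmh_filter(sbp)[100] - sbp[100]) > 30)  # True: spike suppressed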

  6. Flexibility and utility of pre-processing methods in converting STXM setups for ptychography - Final Paper

    SciTech Connect

    Fromm, Catherine

    2015-08-20

    Ptychography is an advanced diffraction-based imaging technique that can achieve resolution of 5 nm and below. It is done by scanning a sample through a beam of focused X-rays using discrete yet overlapping scan steps. Scattering data are collected on a CCD camera, and the phase of the scattered light is reconstructed with sophisticated iterative algorithms. Because the experimental setup is similar, ptychography setups can be created by retrofitting existing STXM beamlines with new hardware. The other challenge comes in the reconstruction of the collected scattering images. Scattering data must be adjusted and packaged with experimental parameters to calibrate the reconstruction software. The necessary pre-processing of data prior to reconstruction is unique to each beamline setup, and even to the optical alignment used on a particular day. Pre-processing software must be developed to be flexible and efficient in order to allow experimenters appropriate control and freedom in the analysis of their hard-won data. This paper describes the implementation of pre-processing software which successfully connects data collection steps to reconstruction steps, letting the user accomplish accurate and reliable ptychography.

  7. Data Analysis Strategies for Protein Microarrays

    PubMed Central

    Díez, Paula; Dasilva, Noelia; González-González, María; Matarraz, Sergio; Casado-Vela, Juan; Orfao, Alberto; Fuentes, Manuel

    2012-01-01

    Microarrays constitute a new platform which allows the discovery and characterization of proteins. According to different features, such as content, surface or detection system, there are many types of protein microarrays which can be applied for the identification of disease biomarkers and the characterization of protein expression patterns. However, the analysis and interpretation of the amount of information generated by microarrays remain a challenge. Further data analysis strategies are essential to obtain representative and reproducible results. The experimental design is therefore key, since the number of samples and dyes, among other aspects, defines the appropriate analysis method to be used. In this sense, several algorithms have been proposed so far to overcome analytical difficulties derived from fluorescence overlapping and/or background noise. Each kind of microarray is developed to fulfill a specific purpose. Therefore, the selection of appropriate analytical and data analysis strategies is crucial to achieve successful biological conclusions. In the present review, we focus on current algorithms and main strategies for data interpretation.

  8. Nanotechnologies in protein microarrays.

    PubMed

    Krizkova, Sona; Heger, Zbynek; Zalewska, Marta; Moulick, Amitava; Adam, Vojtech; Kizek, Rene

    2015-01-01

    Protein microarray technology has become an important research tool for the study and detection of proteins, protein-protein interactions and a number of other applications. The utilization of nanoparticle-based materials and nanotechnology-based techniques for immobilization allows us to extend the surface available for biomolecule immobilization, resulting in enhanced substrate binding properties, decreased background signals and enhanced reporter systems for more sensitive assays. In contemporary microarray systems, multiple nanotechnology-based techniques are generally combined. In this review, applications of nanoparticles and nanotechnologies in creating protein microarrays, immobilizing proteins and detection are summarized. We anticipate that advanced nanotechnologies can be exploited to expand promising fields of protein identification, monitoring of protein-protein or drug-protein interactions, and protein structures. PMID:26039143

  9. Comparing Binaural Pre-processing Strategies III: Speech Intelligibility of Normal-Hearing and Hearing-Impaired Listeners.

    PubMed

    Völker, Christoph; Warzybok, Anna; Ernst, Stephan M A

    2015-01-01

    A comprehensive evaluation of eight signal pre-processing strategies, including directional microphones, coherence filters, single-channel noise reduction, binaural beamformers, and their combinations, was undertaken with normal-hearing (NH) and hearing-impaired (HI) listeners. Speech reception thresholds (SRTs) were measured in three noise scenarios (multitalker babble, cafeteria noise, and a single competing talker). Predictions from three common instrumental measures were compared with the general perceptual benefit provided by the algorithms. The individual SRTs measured without pre-processing and the individual benefits were objectively estimated using the binaural speech intelligibility model. Ten listeners with NH and 12 HI listeners participated; the participants varied in age and pure-tone threshold levels. Although HI listeners required a better signal-to-noise ratio than NH listeners to obtain 50% intelligibility, no differences in SRT benefit from the different algorithms were found between the two groups. With the exception of single-channel noise reduction, all algorithms showed an improvement in SRT of between 2.1 dB (in cafeteria noise) and 4.8 dB (in the single-competing-talker condition). Predictions of the binaural speech intelligibility model explained 83% of the measured variance of the individual SRTs in the no-pre-processing condition. Regarding the benefit from the algorithms, the instrumental measures were not able to predict the perceptual data in all tested noise conditions. The comparable benefit observed for both groups suggests a possible application of noise reduction schemes for listeners with different hearing status. Although the model can predict the individual SRTs without pre-processing, further development is necessary to predict the benefits obtained from the algorithms at an individual level. PMID:26721922

  10. Design of radial basis function neural network classifier realized with the aid of data preprocessing techniques: design and analysis

    NASA Astrophysics Data System (ADS)

    Oh, Sung-Kwun; Kim, Wook-Dong; Pedrycz, Witold

    2016-05-01

    In this paper, we introduce a new architecture of an optimized Radial Basis Function (RBF) neural network classifier developed with the aid of fuzzy clustering and data preprocessing techniques, and discuss its comprehensive design methodology. In the preprocessing part, a Linear Discriminant Analysis (LDA) or Principal Component Analysis (PCA) algorithm forms the front end of the network, and the transformed data produced there are used as the inputs of the network. In the premise part, the Fuzzy C-Means (FCM) algorithm determines the receptive fields associated with the condition parts of the rules. The connection weights of the classifier are of functional nature and come as polynomial functions forming the consequent part. A Particle Swarm Optimization algorithm optimizes a number of essential parameters needed to improve the accuracy of the classifier. Those optimized parameters include the type of data preprocessing, the dimensionality of the feature vectors produced by the LDA (or PCA), the number of clusters (rules), the fuzzification coefficient used in the FCM algorithm and the orders of the polynomials of the networks. The performance of the proposed classifier is reported for several benchmark data sets and compared with the performance of other classifiers reported in previous studies.
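
    The overall shape of this pipeline can be sketched with scikit-learn, using K-means prototypes as a stand-in for the FCM receptive fields and omitting the PSO tuning; an architectural sketch under those substitutions, not the authors' classifier.

      import numpy as np
      from sklearn.datasets import load_iris
      from sklearn.decomposition import PCA
      from sklearn.cluster import KMeans
      from sklearn.linear_model import LogisticRegression

      X, y = load_iris(return_X_y=True)
      Z = PCA(n_components=2).fit_transform(X)   # preprocessing front end

      # Prototype-based receptive fields and Gaussian RBF activations.
      centers = KMeans(n_clusters=6, n_init=10, random_state=0).fit(Z).cluster_centers_
      width = 1.0  # receptive-field width (a parameter PSO would tune)
      d2 = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
      Phi = np.exp(-d2 / (2 * width ** 2))

      clf = LogisticRegression(max_iter=1000).fit(Phi, y)  # linear read-out
      print(f"training accuracy: {clf.score(Phi, y):.3f}")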

  11. AMIC@: All MIcroarray Clusterings @ once

    PubMed Central

    Geraci, Filippo; Pellegrini, Marco; Renda, M. Elena

    2008-01-01

    The AMIC@ Web Server offers a light-weight multi-method clustering engine for microarray gene-expression data. AMIC@ is a highly interactive tool that stresses user-friendliness and robustness by adopting AJAX technology, thus allowing an effective interleaved execution of different clustering algorithms and inspection of results. Among the salient features AMIC@ offers are: (i) automatic file format detection; (ii) suggestions on the number of clusters, using a variant of the stability-based method of Tibshirani et al.; (iii) intuitive visual inspection of the data via heatmaps; and (iv) measurement of clustering quality using cluster homogeneity. Large data sets can be processed efficiently by selecting algorithms (such as FPF-SB and k-Boost) specifically designed for this purpose. For very large data sets, the user can opt for batch-mode use of the system by means of the Clustering wizard, which runs all algorithms at once and delivers the results via email. AMIC@ is freely available and open to all users with no login requirement at the following URL: http://bioalgo.iit.cnr.it/amica. PMID:18477631
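
    The multi-method idea, running several clusterings over the same expression matrix and scoring each, can be sketched as follows; the algorithms and the silhouette score are stand-ins for AMIC@'s own FPF-SB, k-Boost and homogeneity measure.

      import numpy as np
      from sklearn.cluster import KMeans, AgglomerativeClustering
      from sklearn.metrics import silhouette_score

      # Synthetic expression matrix with three gene groups (rows = genes).
      rng = np.random.default_rng(8)
      expr = np.vstack([rng.normal(m, 0.5, (40, 20)) for m in (-2, 0, 2)])

      for name, algo in [("k-means", KMeans(n_clusters=3, n_init=10, random_state=0)),
                         ("hierarchical", AgglomerativeClustering(n_clusters=3))]:
          labels = algo.fit_predict(expr)
          print(name, round(silhouette_score(expr, labels), 3))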

  13. Enhancing Interdisciplinary Mathematics and Biology Education: A Microarray Data Analysis Course Bridging These Disciplines

    PubMed Central

    Evans, Irene M.

    2010-01-01

    BIO2010 put forth the goal of improving the mathematical educational background of biology students. The analysis and interpretation of microarray high-dimensional data can be very challenging and is best done by a statistician and a biologist working and teaching in a collaborative manner. We set up such a collaboration and designed a course on microarray data analysis. We started using Genome Consortium for Active Teaching (GCAT) materials and Microarray Genome and Clustering Tool software and added R statistical software along with Bioconductor packages. In response to student feedback, one microarray data set was fully analyzed in class, starting from preprocessing to gene discovery to pathway analysis using the latter software. A class project was to conduct a similar analysis where students analyzed their own data or data from a published journal paper. This exercise showed the impact that filtering, preprocessing, and different normalization methods had on gene inclusion in the final data set. We conclude that this course achieved its goals to equip students with skills to analyze data from a microarray experiment. We offer our insight about collaborative teaching as well as how other faculty might design and implement a similar interdisciplinary course. PMID:20810954

  14. EARLINET Single Calculus Chain - technical - Part 1: Pre-processing of raw lidar data

    NASA Astrophysics Data System (ADS)

    D'Amico, G.; Amodeo, A.; Mattis, I.; Freudenthaler, V.; Pappalardo, G.

    2015-10-01

    In this paper we describe an automatic tool for the pre-processing of lidar data called ELPP (EARLINET Lidar Pre-Processor). It is one of two calculus modules of the EARLINET Single Calculus Chain (SCC), the automatic tool for the analysis of EARLINET data. The ELPP is an open source module that executes instrumental corrections and data handling of the raw lidar signals, making the lidar data ready to be processed by the optical retrieval algorithms. According to the specific lidar configuration, the ELPP automatically performs dead-time correction, atmospheric and electronic background subtraction, gluing of lidar signals, and trigger-delay correction. Moreover, the signal-to-noise ratio of the pre-processed signals can be improved by means of configurable time integration of the raw signals and/or spatial smoothing. The ELPP delivers the statistical uncertainties of the final products by means of error propagation or Monte Carlo simulations. During the development of the ELPP module, particular attention has been paid to making the tool flexible enough to handle all lidar configurations currently used within the EARLINET community. Moreover, it has been designed in a modular way to allow an easy extension to lidar configurations not yet implemented. The primary goal of the ELPP module is to enable the application of quality-assured procedures in the lidar data analysis starting from the raw lidar data. This provides the added value of full traceability of each delivered lidar product. Several tests have been performed to check the proper functioning of the ELPP module. The whole SCC has been tested with the same synthetic data sets that were used for the EARLINET algorithm inter-comparison exercise. The ELPP module has been successfully employed for the automatic near-real-time pre-processing of the raw lidar data measured during several EARLINET inter-comparison campaigns as well as during intense field campaigns.
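
    Two of the listed corrections, dead-time correction and background subtraction, can be sketched for a photon-counting channel, assuming the standard non-paralyzable dead-time model R_true = R_meas / (1 - R_meas * tau); the numbers are invented and this is not ELPP code.

      import numpy as np

      tau = 4e-9  # detector dead time in seconds (hypothetical)

      # Measured count rates (Hz) per range bin; the farthest bins contain
      # only solar/electronic background.
      rate_meas = np.array([2.0e7, 1.5e7, 8.0e6, 3.0e6, 1.1e6, 1.0e6, 1.0e6])

      # Non-paralyzable dead-time correction.
      rate_true = rate_meas / (1.0 - rate_meas * tau)

      # Background subtraction using the far-range bins.
      background = rate_true[-2:].mean()
      print(rate_true - background)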

  15. EARLINET Single Calculus Chain - technical - Part 1: Pre-processing of raw lidar data

    NASA Astrophysics Data System (ADS)

    D'Amico, Giuseppe; Amodeo, Aldo; Mattis, Ina; Freudenthaler, Volker; Pappalardo, Gelsomina

    2016-02-01

    In this paper we describe an automatic tool for the pre-processing of aerosol lidar data called ELPP (EARLINET Lidar Pre-Processor). It is one of two calculus modules of the EARLINET Single Calculus Chain (SCC), the automatic tool for the analysis of EARLINET data. ELPP is an open source module that executes instrumental corrections and data handling of the raw lidar signals, making the lidar data ready to be processed by the optical retrieval algorithms. According to the specific lidar configuration, ELPP automatically performs dead-time correction, atmospheric and electronic background subtraction, gluing of lidar signals, and trigger-delay correction. Moreover, the signal-to-noise ratio of the pre-processed signals can be improved by means of configurable time integration of the raw signals and/or spatial smoothing. ELPP delivers the statistical uncertainties of the final products by means of error propagation or Monte Carlo simulations. During the development of ELPP, particular attention has been paid to making the tool flexible enough to handle all lidar configurations currently used within the EARLINET community. Moreover, it has been designed in a modular way to allow an easy extension to lidar configurations not yet implemented. The primary goal of ELPP is to enable the application of quality-assured procedures in the lidar data analysis starting from the raw lidar data. This provides the added value of full traceability of each delivered lidar product. Several tests have been performed to check the proper functioning of ELPP. The whole SCC has been tested with the same synthetic data sets that were used for the EARLINET algorithm inter-comparison exercise. ELPP has been successfully employed for the automatic near-real-time pre-processing of the raw lidar data measured during several EARLINET inter-comparison campaigns as well as during intense field campaigns.

  16. Analysis of microarray experiments of gene expression profiling

    PubMed Central

    Tarca, Adi L.; Romero, Roberto; Draghici, Sorin

    2008-01-01

    The study of gene expression profiling of cells and tissue has become a major tool for discovery in medicine. Microarray experiments allow description of genome-wide expression changes in health and disease. The results of such experiments are expected to change the methods employed in the diagnosis and prognosis of disease in obstetrics and gynecology. Moreover, an unbiased and systematic study of gene expression profiling should allow the establishment of a new taxonomy of disease for obstetric and gynecologic syndromes. Thus, a new era is emerging in which reproductive processes and disorders could be characterized using molecular tools and fingerprinting. The design, analysis, and interpretation of microarray experiments require specialized knowledge that is not part of the standard curriculum of our discipline. This article describes the types of studies that can be conducted with microarray experiments (class comparison, class prediction, class discovery). We discuss key issues pertaining to experimental design, data preprocessing, and gene selection methods. Common types of data representation are illustrated. Potential pitfalls in the interpretation of microarray experiments, as well as the strengths and limitations of this technology, are highlighted. This article is intended to assist clinicians in appraising the quality of the scientific evidence now reported in the obstetric and gynecologic literature. PMID:16890548

  17. Microarrays for Undergraduate Classes

    ERIC Educational Resources Information Center

    Hancock, Dale; Nguyen, Lisa L.; Denyer, Gareth S.; Johnston, Jill M.

    2006-01-01

    A microarray experiment is presented that, in six laboratory sessions, takes undergraduate students from the tissue sample right through to data analysis. The model chosen, the murine erythroleukemia cell line, can be easily cultured in sufficient quantities for class use. Large changes in gene expression can be induced in these cells by…

  18. Real-time multilevel process monitoring and control of CR image acquisition and preprocessing for PACS and ICU

    NASA Astrophysics Data System (ADS)

    Zhang, Jianguo; Wong, Stephen T. C.; Andriole, Katherine P.; Wong, Albert W. K.; Huang, H. K.

    1996-05-01

    The purpose of this paper is to present a control theory and a fault tolerance algorithm developed for real-time monitoring and control of acquisition and preprocessing of computed radiographs for PACS and Intensive Care Unit operations. This monitoring and control system uses an event-driven, multilevel processing approach to remove computational bottlenecks and to improve system reliability. Its computational performance and processing reliability are evaluated and compared with those of the traditional, single-level processing approach.

  19. Microarray data classified by artificial neural networks.

    PubMed

    Linder, Roland; Richards, Tereza; Wagner, Mathias

    2007-01-01

    Systems biology has enjoyed explosive growth in both the number of people participating in this area of research and the number of publications on the topic. The field of systems biology encompasses the in silico analysis of high-throughput data as provided by DNA or protein microarrays. Along with the increasing availability of microarray data, attention is focused on methods of analyzing the expression rates. One important type of analysis is the classification task, for example, distinguishing different types of cell functions or tumors. Recently, interest has been awakened toward artificial neural networks (ANN), which have many appealing characteristics such as an exceptional degree of accuracy; nonlinear relationships and independence from certain assumptions regarding the data distribution are also considered. The current work reviews advantages as well as disadvantages of neural networks in the context of microarray analysis. Comparisons are drawn to alternative methods. Selected solutions are discussed, and finally algorithms for the effective combination of multiple ANNs are presented. The development of approaches that use ANN-processed microarray data to run cell and tissue simulations may be slated for future investigation. PMID:18220242
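
    As a concrete, hedged illustration of the classification task discussed above (not any specific network reviewed in the article), a feed-forward network can be trained on an expression matrix with a few lines of scikit-learn; the data below are synthetic stand-ins for a real samples-by-genes matrix.

      import numpy as np
      from sklearn.model_selection import train_test_split
      from sklearn.neural_network import MLPClassifier
      from sklearn.preprocessing import StandardScaler

      # X: samples x genes expression matrix; y: class labels (e.g., tumor type).
      rng = np.random.default_rng(0)
      X = rng.normal(size=(60, 500))
      y = rng.integers(0, 2, size=60)

      X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
      scaler = StandardScaler().fit(X_tr)              # gene-wise standardization
      clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
      clf.fit(scaler.transform(X_tr), y_tr)
      print("test accuracy:", clf.score(scaler.transform(X_te), y_te))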

  20. Navigating Public Microarray Databases

    PubMed Central

    Bähler, Jürg

    2004-01-01

    With the ever-escalating amount of data being produced by genome-wide microarray studies, it is of increasing importance that these data are captured in public databases so that researchers can use this information to complement and enhance their own studies. Many groups have set up databases of expression data, ranging from large repositories, which are designed to comprehensively capture all published data, through to more specialized databases. The public repositories, such as ArrayExpress at the European Bioinformatics Institute, contain complete datasets in raw format in addition to processed data, whilst the specialist databases tend to provide downstream analysis of normalized data from more focused studies and data sources. Here we provide a guide to the use of these public microarray resources. PMID:18629145

  1. Microarrays under the microscope.

    PubMed

    Wildsmith, S E; Elcock, F J

    2001-02-01

    Microarray technology is a rapidly advancing area, which is gaining popularity in many biological disciplines from drug target identification to predictive toxicology. Over the past few years, there has been a dramatic increase in the number of methods and techniques available for carrying out this form of gene expression analysis. The techniques and associated peripherals, such as slide types, deposition methods, robotics, and scanning equipment, are undergoing constant improvement, helping to drive the technology forward in terms of robustness and ease of use. These rapid developments, combined with the number of options available and the associated hyperbole, can prove daunting for the new user. This review aims to guide the researcher through the various steps of conducting microarray experiments, from initial strategy to analysing the data, with critical examination of the benefits and disadvantages along the way. PMID:11212888

  2. Navigating public microarray databases.

    PubMed

    Penkett, Christopher J; Bähler, Jürg

    2004-01-01

    With the ever-escalating amount of data being produced by genome-wide microarray studies, it is of increasing importance that these data are captured in public databases so that researchers can use this information to complement and enhance their own studies. Many groups have set up databases of expression data, ranging from large repositories, which are designed to comprehensively capture all published data, through to more specialized databases. The public repositories, such as ArrayExpress at the European Bioinformatics Institute, contain complete datasets in raw format in addition to processed data, whilst the specialist databases tend to provide downstream analysis of normalized data from more focused studies and data sources. Here we provide a guide to the use of these public microarray resources. PMID:18629145

  3. The Minimal Preprocessing Pipelines for the Human Connectome Project

    PubMed Central

    Glasser, Matthew F.; Sotiropoulos, Stamatios N; Wilson, J Anthony; Coalson, Timothy S; Fischl, Bruce; Andersson, Jesper L; Xu, Junqian; Jbabdi, Saad; Webster, Matthew; Polimeni, Jonathan R; Van Essen, David C; Jenkinson, Mark

    2013-01-01

    The Human Connectome Project (HCP) faces the challenging task of bringing multiple magnetic resonance imaging (MRI) modalities together in a common automated preprocessing framework across a large cohort of subjects. The MRI data acquired by the HCP differ in many ways from data acquired on conventional 3 Tesla scanners and often require newly developed preprocessing methods. We describe the minimal preprocessing pipelines for structural, functional, and diffusion MRI that were developed by the HCP to accomplish many low level tasks, including spatial artifact/distortion removal, surface generation, cross-modal registration, and alignment to standard space. These pipelines are specially designed to capitalize on the high quality data offered by the HCP. The final standard space makes use of a recently introduced CIFTI file format and the associated grayordinates spatial coordinate system. This allows for combined cortical surface and subcortical volume analyses while reducing the storage and processing requirements for high spatial and temporal resolution data. Here, we provide the minimum image acquisition requirements for the HCP minimal preprocessing pipelines and additional advice for investigators interested in replicating the HCP’s acquisition protocols or using these pipelines. Finally, we discuss some potential future improvements for the pipelines. PMID:23668970

  4. OPSN: The IMS COMSYS 1 and 2 Data Preprocessing System.

    ERIC Educational Resources Information Center

    Yu, John

    The Instructional Management System (IMS) developed by the Southwest Regional Laboratory (SWRL) processes student and teacher-generated data through the use of an optical scanner that produces a magnetic tape (Scan Tape) for input to IMS. A series of computer routines, OPSN, preprocesses the Scan Tape and prepares the data for transmission to the…

  5. A review of independent component analysis application to microarray gene expression data

    PubMed Central

    Kong, Wei; Vanderburg, Charles R.; Gunshin, Hiromi; Rogers, Jack T.; Huang, Xudong

    2010-01-01

    Independent component analysis (ICA) methods have received growing attention as effective data-mining tools for microarray gene expression data. As a technique of higher-order statistical analysis, ICA is capable of extracting biologically relevant gene expression features from microarray data. Herein we have reviewed the latest applications and the extended algorithms of ICA in gene clustering, classification, and identification. The theoretical frameworks of ICA have been described to further illustrate its feature extraction function in microarray data analysis. PMID:19007336

  6. Tiling Microarray Analysis Tools

    SciTech Connect

    Nix, Davis Austin

    2005-05-04

    TiMAT is a package of 23 command-line Java applications for use in the analysis of Affymetrix tiled genomic microarray data. TiMAT enables: 1) rebuilding the genome annotation for entire tiled arrays (repeat filtering, chromosomal coordinate assignment); 2) post-processing of oligo intensity values (quantile normalization, median scaling, PMMM transformation); 3) significance testing (Wilcoxon rank sum and signed rank tests, intensity difference and ratio tests) and interval refinement (filtering based on multiple statistics, overlap comparisons); and 4) data visualization (detailed thumbnail/zoomed views with Interval Plots and data export to Affymetrix's Integrated Genome Browser) and data reports (spreadsheet summaries and detailed profiles).

  7. Linguistic Preprocessing and Tagging for Problem Report Trend Analysis

    NASA Technical Reports Server (NTRS)

    Beil, Robert J.; Malin, Jane T.

    2012-01-01

    Mr. Robert Beil, Systems Engineer at Kennedy Space Center (KSC), requested the NASA Engineering and Safety Center (NESC) develop a prototype tool suite that combines complementary software technology used at Johnson Space Center (JSC) and KSC for problem report preprocessing and semantic tag extraction, to improve input to data mining and trend analysis. This document contains the outcome of the assessment and the Findings, Observations and NESC Recommendations.

  8. Integration of geometric modeling and advanced finite element preprocessing

    NASA Technical Reports Server (NTRS)

    Shephard, Mark S.; Finnigan, Peter M.

    1987-01-01

    The structure of a geometry-based finite element preprocessing system is presented. The key features of the system are the use of geometric operators to support all geometric calculations required for analysis model generation, and the use of a hierarchic boundary-based data structure for the major data sets within the system. The approach presented can support the finite element modeling procedures used today as well as the fully automated procedures under development.

  9. CLUM: a cluster program for analyzing microarray data.

    PubMed

    Irigoien, I; Fernandez, E; Vives, S; Arenas, C

    2008-08-01

    Microarray technology is increasingly being applied in biological and medical research to address a wide range of problems. Cluster analysis has proven to be a very useful tool for investigating the structure of microarray data. This paper presents a program for clustering microarray data based on the so-called path-distance. At each step the algorithm produces a partition into two clusters, and no prior assumptions on the structure of the clusters are required. It assigns each object (gene or sample) to only one cluster and finds the global optimum of the function that quantifies the adequacy of a given partition of the sample into k clusters. The program was tested on experimental data sets, demonstrating the robustness of the algorithm. PMID:18825964

  10. Image pre-processing for optimizing automated photogrammetry performances

    NASA Astrophysics Data System (ADS)

    Guidi, G.; Gonizzi, S.; Micoli, L. L.

    2014-05-01

    The purpose of this paper is to analyze how optical pre-processing with polarizing filters and digital pre-processing with HDR imaging may improve the automated 3D modeling pipeline based on SfM and image matching, with special emphasis on optically non-cooperative surfaces of shiny or dark materials. Because homologous points are detected automatically, highlights due to shiny materials, or nearly uniform dark patches produced by low-reflectance materials, may produce erroneous matching involving wrong 3D point estimations and, consequently, holes and topological errors in the mesh generated from the associated dense 3D point cloud. This is due to the limited dynamic range of the 8-bit digital images that are matched with each other to generate 3D data. The same 256 levels can be employed more usefully if the actual dynamic range is compressed, avoiding luminance clipping in the darker and lighter image areas. Such an approach is considered here using both optical filtering and HDR processing with tone mapping, with experimental evaluation on different Cultural Heritage objects characterized by non-cooperative optical behavior. Three test images of each object have been captured from different positions, changing the shooting conditions (filter/no filter) and the image processing (no processing/HDR processing), in order to have the same three camera orientations with different optical and digital pre-processing, and to apply the same automated process to each photo set.
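
    For the digital side of this pipeline, a plausible OpenCV sketch (assuming bracketed exposures of a static scene; the file names and exposure times below are hypothetical, not from the paper) merges the exposures into an HDR radiance map and tone-maps it back to 8 bits before feature matching:

      import cv2
      import numpy as np

      # Hypothetical bracketed exposures of the same static object and camera pose.
      paths = ["shot_short.jpg", "shot_mid.jpg", "shot_long.jpg"]
      times = np.array([1/250, 1/60, 1/15], dtype=np.float32)  # exposure times [s]
      images = [cv2.imread(p) for p in paths]

      response = cv2.createCalibrateDebevec().process(images, times)
      hdr = cv2.createMergeDebevec().process(images, times, response)  # float32 radiance

      ldr = cv2.createTonemap(gamma=2.2).process(hdr)       # compress dynamic range
      ldr = np.nan_to_num(ldr) * 255
      cv2.imwrite("shot_tonemapped.png", np.clip(ldr, 0, 255).astype(np.uint8))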

  11. Face recognition by using optical correlator with wavelet preprocessing

    NASA Astrophysics Data System (ADS)

    Strzelecki, Jacek; Chalasinska-Macukow, Katarzyna

    2004-08-01

    A method of face recognition using an optical correlator with wavelet preprocessing is presented. The wavelet transform is used to improve the performance of a standard Vander Lugt correlator with a phase-only filter (POF). The influence of various wavelet transforms of images of human faces on the recognition results has been analyzed. The quality of the face recognition process was tested according to two criteria: the peak-to-correlation energy ratio (PCE) and the discrimination capability (DC). Additionally, proper localization of the correlation peak has been controlled. During the preprocessing step a set of three wavelets -- Mexican hat, Haar, and Gabor -- was used with various scales. In addition, Gabor wavelets were tested for various orientation angles. During the recognition procedure the input scene and POF are transformed by the same wavelet. We show the results of computer simulation for a variety of images of human faces: original images without any distortions, noisy images, and images with non-uniform illumination. A comparison of recognition results obtained with and without wavelet preprocessing is given.

  12. Application of filtering techniques in preprocessing magnetic data

    NASA Astrophysics Data System (ADS)

    Liu, Haijun; Yi, Yongping; Yang, Hongxia; Hu, Guochuang; Liu, Guoming

    2010-08-01

    High-precision magnetic exploration is a popular geophysical technique thanks to its simplicity and effectiveness. Interpretation in high-precision magnetic exploration is always difficult because of noise and disturbance factors, so it is necessary to find an effective preprocessing method to remove the influence of interference factors before further processing. The common way to do this is filtering, and many kinds of filtering methods exist. In this paper we introduce in detail three popular filtering techniques: the regularized filtering technique, the sliding-average filtering technique, and the compensation smoothing filtering technique. We then designed the workflow of a filtering program based on these techniques and implemented it with the help of DELPHI. To check it, we applied it to preprocess magnetic data from a site in China. Comparing the initial contour map with the filtered contour map clearly shows the effect of our program: the filtered contour map is very smooth, and the high-frequency parts of the data have been removed. After filtering, we separated useful signals from noisy signals, minor anomalies from major anomalies, and local anomalies from regional anomalies, which makes it easy to focus on the useful information. Our program can be used to preprocess magnetic data, and the results show its effectiveness.
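
    Of the three techniques, the sliding-average filter is the simplest to illustrate. A minimal Python/SciPy sketch (the original program was written in DELPHI; the function names and window size here are illustrative only) smooths a gridded magnetic map and splits it into regional and local parts:

      import numpy as np
      from scipy.ndimage import uniform_filter

      def sliding_average(grid, size=5):
          """Sliding-average (moving-mean) smoothing of gridded magnetic data;
          suppresses high-frequency noise while keeping regional anomalies."""
          return uniform_filter(grid, size=size, mode="nearest")

      def separate(grid, size=5):
          """Regional field = smoothed grid; local/minor anomalies = residual."""
          regional = sliding_average(grid, size)
          return regional, grid - regional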

  13. Review of feed forward neural network classification preprocessing techniques

    NASA Astrophysics Data System (ADS)

    Asadi, Roya; Kareem, Sameem Abdul

    2014-06-01

    A key strength of artificial Feed Forward Neural Network (FFNN) classification models is that they learn from input data through their weights. Data preprocessing and pre-training are the contributing factors in developing efficient techniques for low training time and high classification accuracy. In this study, we investigate and review the powerful preprocessing functions of FFNN models. Currently, weights are initialized at random, which is a main source of problems. Multilayer auto-encoder networks, the most recent such technique, are, like other related techniques, unable to solve these problems. Weight Linear Analysis (WLA) is a combination of data preprocessing and pre-training that generates real weights through the use of normalized input values. By using WLA, an FFNN model increases classification accuracy and improves training time, training in a single epoch without any training cycles, without computing the gradient of the mean-square-error function, and without updating the weights. The results of comparison and evaluation show that WLA is a powerful technique in the FFNN classification area.

  14. Compressive Sensing DNA Microarrays

    PubMed Central

    2009-01-01

    Compressive sensing microarrays (CSMs) are DNA-based sensors that operate using group testing and compressive sensing (CS) principles. In contrast to conventional DNA microarrays, in which each genetic sensor is designed to respond to a single target, in a CSM, each sensor responds to a set of targets. We study the problem of designing CSMs that simultaneously account for both the constraints from CS theory and the biochemistry of probe-target DNA hybridization. An appropriate cross-hybridization model is proposed for CSMs, and several methods are developed for probe design and CS signal recovery based on the new model. Lab experiments suggest that in order to achieve accurate hybridization profiling, consensus probe sequences are required to have sequence homology of at least 80% with all targets to be detected. Furthermore, out-of-equilibrium datasets are usually as accurate as those obtained from equilibrium conditions. Consequently, one can use CSMs in applications in which only short hybridization times are allowed. PMID:19158952
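
    The abstract does not name a specific recovery algorithm, so as a generic stand-in the sketch below uses iterative soft-thresholding (ISTA), a standard sparse-recovery routine of the kind CS decoding relies on; the sensing matrix and sparsity pattern are synthetic and purely illustrative.

      import numpy as np

      def ista(A, y, lam=0.1, n_iter=200):
          """Iterative soft-thresholding for min ||Ax - y||^2 / 2 + lam * ||x||_1."""
          L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
          x = np.zeros(A.shape[1])
          for _ in range(n_iter):
              g = x + A.T @ (y - A @ x) / L      # gradient step
              x = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)  # shrinkage
          return x

      # m sensors responding to n >> m targets; x is the sparse target abundance.
      rng = np.random.default_rng(1)
      A = rng.normal(size=(30, 120)) / np.sqrt(30)
      x_true = np.zeros(120); x_true[[5, 40, 77]] = [1.0, 0.6, 0.8]
      x_hat = ista(A, A @ x_true + 0.01 * rng.normal(size=30))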

  15. Automatic video object detection and mask signal removal for efficient video preprocessing

    NASA Astrophysics Data System (ADS)

    He, Zhihai

    2004-01-01

    In this work, we consider a generic definition of video object: a group of pixels with temporal motion coherence. The generic video object (GVO) is a superset of the conventional video objects discussed in the literature. Because of its motion coherence, the GVO can be easily recognized by the human visual system. However, due to its arbitrary spatial distribution, the GVO cannot be easily detected by existing algorithms, which often assume spatial homogeneity of the video objects. In this work, we introduce the concept of extended optical flow and develop a dynamic programming framework for GVO detection. Using this mathematical optimization formulation, whose solution is given by the Viterbi algorithm, the proposed object detection algorithm is able to discover the motion path of the GVO automatically and refine its spatial location progressively. We apply the GVO detection algorithm to extract and remove the so-called "video mask" signals in video sequences. Our experimental results show that this type of vision-guided video pre-processing significantly improves compression efficiency.
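
    A minimal sketch of the dynamic-programming idea (not the paper's exact formulation): given per-frame candidate detections, a Viterbi-style recursion picks the candidate sequence that maximizes detection score minus a motion-coherence penalty. The function name and the quadratic penalty form are assumptions made for illustration.

      import numpy as np

      def viterbi_path(scores, positions, motion_penalty=1.0):
          """scores[t][k]: score of candidate k in frame t;
          positions[t][k]: its (x, y) location. Returns one index per frame."""
          T = len(scores)
          best = [np.asarray(scores[0], float)]
          back = []
          for t in range(1, T):
              prev = np.asarray(positions[t - 1], float)
              cur = np.asarray(positions[t], float)
              # pairwise squared displacement between consecutive-frame candidates
              d2 = ((prev[:, None, :] - cur[None, :, :]) ** 2).sum(-1)
              total = (best[-1][:, None] - motion_penalty * d2
                       + np.asarray(scores[t], float)[None, :])
              back.append(total.argmax(0))      # best predecessor per candidate
              best.append(total.max(0))
          path = [int(best[-1].argmax())]       # backtrack the optimal sequence
          for bp in reversed(back):
              path.append(int(bp[path[-1]]))
          return path[::-1]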

  16. Integrating data from heterogeneous DNA microarray platforms.

    PubMed

    Valente, Eduardo; Rocha, Miguel

    2015-01-01

    DNA microarrays are one of the most used technologies for gene expression measurement. However, there are several distinct microarray platforms, from different manufacturers, each with its own measurement protocol, resulting in data that can hardly be compared or directly integrated. Data integration from multiple sources aims to improve the assertiveness of statistical tests, reducing the data dimensionality problem. The integration of heterogeneous DNA microarray platforms comprehends a set of tasks that range from the re-annotation of the features used on gene expression, to data normalization and batch effect elimination. In this work, a complete methodology for gene expression data integration and application is proposed, which comprehends a transcript-based re-annotation process and several methods for batch effect attenuation. The integrated data will be used to select the best feature set and learning algorithm for a brain tumor classification case study. The integration will consider data from heterogeneous Agilent and Affymetrix platforms, collected from public gene expression databases, such as The Cancer Genome Atlas and Gene Expression Omnibus. PMID:26673932

  17. The Genopolis Microarray Database

    PubMed Central

    Splendiani, Andrea; Brandizi, Marco; Even, Gael; Beretta, Ottavio; Pavelka, Norman; Pelizzola, Mattia; Mayhaus, Manuel; Foti, Maria; Mauri, Giancarlo; Ricciardi-Castagnoli, Paola

    2007-01-01

    Background Gene expression databases are key resources for microarray data management and analysis and the importance of a proper annotation of their content is well understood. Public repositories as well as microarray database systems that can be implemented by single laboratories exist. However, there is not yet a tool that can easily support a collaborative environment where different users with different rights of access to data can interact to define a common highly coherent content. The scope of the Genopolis database is to provide a resource that allows different groups performing microarray experiments related to a common subject to create a common coherent knowledge base and to analyse it. The Genopolis database has been implemented as a dedicated system for the scientific community studying dendritic cell and macrophage functions and host-parasite interactions. Results The Genopolis Database system allows the community to build an object-based MIAME-compliant annotation of their experiments and to store images, raw and processed data from the Affymetrix GeneChip® platform. It supports dynamic definition of controlled vocabularies and provides automated and supervised steps to control the coherence of data and annotations. It allows a precise control of the visibility of the database content to different subgroups in the community and facilitates exports of its content to public repositories. It provides an interactive user interface for data analysis: this allows users to visualize data matrices based on functional lists and sample characterization, and to navigate to other data matrices defined by similarity of expression values as well as functional characterizations of genes involved. A collaborative environment is also provided for the definition and sharing of functional annotation by users. Conclusion The Genopolis Database supports a community in building a common coherent knowledge base and analysing it. This fills a gap between a local

  18. DNA Microarray-Based Diagnostics.

    PubMed

    Marzancola, Mahsa Gharibi; Sedighi, Abootaleb; Li, Paul C H

    2016-01-01

    The DNA microarray technology is currently a useful biomedical tool which has been developed for a variety of diagnostic applications. However, the development pathway has not been smooth and the technology has faced some challenges. The reliability of the microarray data and also the clinical utility of the results in the early days were criticized. These criticisms added to the severe competition from other techniques, such as next-generation sequencing (NGS), impacting the growth of microarray-based tests in the molecular diagnostic market.Thanks to the advances in the underlying technologies as well as the tremendous effort offered by the research community and commercial vendors, these challenges have mostly been addressed. Nowadays, the microarray platform has achieved sufficient standardization and method validation as well as efficient probe printing, liquid handling and signal visualization. Integration of various steps of the microarray assay into a harmonized and miniaturized handheld lab-on-a-chip (LOC) device has been a goal for the microarray community. In this respect, notable progress has been achieved in coupling the DNA microarray with the liquid manipulation microsystem as well as the supporting subsystem that will generate the stand-alone LOC device.In this chapter, we discuss the major challenges that microarray technology has faced in its almost two decades of development and also describe the solutions to overcome the challenges. In addition, we review the advancements of the technology, especially the progress toward developing the LOC devices for DNA diagnostic applications. PMID:26614075

  19. Tiling Microarray Analysis Tools

    Energy Science and Technology Software Center (ESTSC)

    2005-05-04

    TiMAT is a package of 23 command-line Java applications for use in the analysis of Affymetrix tiled genomic microarray data. TiMAT enables: 1) rebuilding the genome annotation for entire tiled arrays (repeat filtering, chromosomal coordinate assignment); 2) post-processing of oligo intensity values (quantile normalization, median scaling, PMMM transformation); 3) significance testing (Wilcoxon rank sum and signed rank tests, intensity difference and ratio tests) and interval refinement (filtering based on multiple statistics, overlap comparisons); and 4) data visualization (detailed thumbnail/zoomed views with Interval Plots and data export to Affymetrix's Integrated Genome Browser) and data reports (spreadsheet summaries and detailed profiles).

  20. Living-Cell Microarrays

    PubMed Central

    Yarmush, Martin L.; King, Kevin R.

    2011-01-01

    Living cells are remarkably complex. To unravel this complexity, living-cell assays have been developed that allow delivery of experimental stimuli and measurement of the resulting cellular responses. High-throughput adaptations of these assays, known as living-cell microarrays, which are based on microtiter plates, high-density spotting, microfabrication, and microfluidics technologies, are being developed for two general applications: (a) to screen large-scale chemical and genomic libraries and (b) to systematically investigate the local cellular microenvironment. These emerging experimental platforms offer exciting opportunities to rapidly identify genetic determinants of disease, to discover modulators of cellular function, and to probe the complex and dynamic relationships between cells and their local environment. PMID:19413510

  1. A new approach to pre-processing digital image for wavelet-based watermark

    NASA Astrophysics Data System (ADS)

    Agreste, Santa; Andaloro, Guido

    2008-11-01

    The growth of the Internet has increased the phenomenon of digital piracy of multimedia objects such as software, images, video, audio and text. It is therefore strategic to identify and develop stable numerical methods and algorithms with low computational cost that can address these problems. We describe a digital watermarking algorithm for color image protection and authenticity: it is robust, non-blind, and wavelet-based. The use of the Discrete Wavelet Transform is motivated by its good time-frequency features and its good match with Human Visual System directives. These two combined elements are important for building an invisible and robust watermark. Moreover, our algorithm can work with any image, thanks to a pre-processing step that includes resize techniques adapting the original image size to the wavelet transform. The watermark signal is calculated in correlation with the image features and statistical properties. In the detection step we apply a re-synchronization between the original and watermarked image according to the Neyman-Pearson statistical criterion. Experimentation on a large set of different images has shown the watermark to be resistant against geometric, filtering, and StirMark attacks, with a low false-alarm rate.

  2. Preprocessing and parameterizing bioimpedance spectroscopy measurements by singular value decomposition.

    PubMed

    Nejadgholi, Isar; Caytak, Herschel; Bolic, Miodrag; Batkin, Izmail; Shirmohammadi, Shervin

    2015-05-01

    In several applications of bioimpedance spectroscopy, the measured spectrum is parameterized by being fitted into the Cole equation. However, the extracted Cole parameters seem to be inconsistent from one measurement session to another, which leads to a high standard deviation of extracted parameters. This inconsistency is modeled with a source of random variations added to the voltage measurement carried out in the time domain. These random variations may originate from biological variations that are irrelevant to the evidence that we are investigating. Yet, they affect the voltage measured by using a bioimpedance device based on which magnitude and phase of impedance are calculated. By means of simulated data, we showed that Cole parameters are highly affected by this type of variation. We further showed that singular value decomposition (SVD) is an effective tool for parameterizing bioimpedance measurements, which results in more consistent parameters than Cole parameters. We propose to apply SVD as a preprocessing method to reconstruct denoised bioimpedance measurements. In order to evaluate the method, we calculated the relative difference between parameters extracted from noisy and clean simulated bioimpedance spectra. Both mean and standard deviation of this relative difference are shown to effectively decrease when Cole parameters are extracted from preprocessed data in comparison to being extracted from raw measurements. We evaluated the performance of the proposed method in distinguishing three arm positions, for a set of experiments including eight subjects. It is shown that Cole parameters of different positions are not distinguishable when extracted from raw measurements. However, one arm position can be distinguished based on SVD scores. Moreover, all three positions are shown to be distinguished by two parameters, R0/R∞ and Fc, when Cole parameters are extracted from preprocessed measurements. These results suggest that SVD could be considered as an
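
    The core of the proposed preprocessing reduces to a truncated SVD reconstruction, as in the short NumPy sketch below; the choice of k = 3 components, the matrix layout (rows = repeated sweeps, columns = frequencies), and the function name are assumptions for illustration.

      import numpy as np

      def svd_denoise(B, k=3):
          """Reconstruct a matrix of repeated bioimpedance spectra from its first
          k singular components; the scores U[:, :k] * s[:k] can also serve
          directly as per-sweep measurement parameters."""
          U, s, Vt = np.linalg.svd(B, full_matrices=False)
          B_hat = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]   # rank-k approximation
          scores = U[:, :k] * s[:k]                        # per-sweep SVD scores
          return B_hat, scores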

  3. Microarray platform for omics analysis

    NASA Astrophysics Data System (ADS)

    Mecklenburg, Michael; Xie, Bin

    2001-09-01

    Microarray technology has revolutionized genetic analysis. However, limitations in genome analysis have led to renewed interest in establishing 'omic' strategies. As we enter the post-genomic era, new microarray technologies are needed to address these new classes of 'omic' targets, such as proteins, as well as lipids and carbohydrates. We have developed a microarray platform that combines self-assembling monolayers with the biotin-streptavidin system to provide a robust, versatile immobilization scheme. A hydrophobic film is patterned on the surface, creating an array of tension wells that eliminates evaporation effects, thereby reducing the shear stress to which biomolecules are exposed during immobilization. The streptavidin linker layer makes it possible to adapt and/or develop microarray-based assays using virtually any class of biomolecules, including carbohydrates, peptides, antibodies, and receptors, as well as the more traditional DNA-based arrays. Our microarray technology is designed to furnish seamless compatibility across the various 'omic' platforms by providing a common blueprint for fabricating and analyzing arrays. The prototype microarray uses a microscope slide footprint patterned with 2 by 96 flat wells. Data on the microarray platform will be presented.

  4. Development, characterization and experimental validation of a cultivated sunflower (Helianthus annuus L.) gene expression oligonucleotide microarray.

    PubMed

    Fernandez, Paula; Soria, Marcelo; Blesa, David; DiRienzo, Julio; Moschen, Sebastian; Rivarola, Maximo; Clavijo, Bernardo Jose; Gonzalez, Sergio; Peluffo, Lucila; Príncipi, Dario; Dosio, Guillermo; Aguirrezabal, Luis; García-García, Francisco; Conesa, Ana; Hopp, Esteban; Dopazo, Joaquín; Heinz, Ruth Amelia; Paniego, Norma

    2012-01-01

    Oligonucleotide-based microarrays with accurate gene coverage represent a key strategy for transcriptional studies in orphan species such as sunflower, H. annuus L., which lacks full genome sequences. The goal of this study was the development and functional annotation of a comprehensive sunflower unigene collection and the design and validation of a custom sunflower oligonucleotide-based microarray. A large scale EST (>130,000 ESTs) curation, assembly and sequence annotation was performed using Blast2GO (www.blast2go.de). The EST assembly comprises 41,013 putative transcripts (12,924 contigs and 28,089 singletons). The resulting Sunflower Unigen Resource (SUR version 1.0) was used to design an oligonucleotide-based Agilent microarray for cultivated sunflower. This microarray includes a total of 42,326 features: 1,417 Agilent controls, 74 control probes for sunflower replicated 10 times (740 controls) and 40,169 different non-control probes. Microarray performance was validated using a model experiment examining the induction of senescence by water deficit. Pre-processing and differential expression analysis of Agilent microarrays was performed using the Bioconductor limma package. The analyses based on p-values calculated by eBayes (p<0.01) allowed the detection of 558 differentially expressed genes between water stress and control conditions; from these, ten genes were further validated by qPCR. Over-represented ontologies were identified using FatiScan in the Babelomics suite. This work generated a curated and trustable sunflower unigene collection, and a custom, validated sunflower oligonucleotide-based microarray using Agilent technology. Both the curated unigene collection and the validated oligonucleotide microarray provide key resources for sunflower genome analysis, transcriptional studies, and molecular breeding for crop improvement. PMID:23110046

  5. Development, Characterization and Experimental Validation of a Cultivated Sunflower (Helianthus annuus L.) Gene Expression Oligonucleotide Microarray

    PubMed Central

    Fernandez, Paula; Soria, Marcelo; Blesa, David; DiRienzo, Julio; Moschen, Sebastian; Rivarola, Maximo; Clavijo, Bernardo Jose; Gonzalez, Sergio; Peluffo, Lucila; Príncipi, Dario; Dosio, Guillermo; Aguirrezabal, Luis; García-García, Francisco; Conesa, Ana; Hopp, Esteban; Dopazo, Joaquín; Heinz, Ruth Amelia; Paniego, Norma

    2012-01-01

    Oligonucleotide-based microarrays with accurate gene coverage represent a key strategy for transcriptional studies in orphan species such as sunflower, H. annuus L., which lacks full genome sequences. The goal of this study was the development and functional annotation of a comprehensive sunflower unigene collection and the design and validation of a custom sunflower oligonucleotide-based microarray. A large scale EST (>130,000 ESTs) curation, assembly and sequence annotation was performed using Blast2GO (www.blast2go.de). The EST assembly comprises 41,013 putative transcripts (12,924 contigs and 28,089 singletons). The resulting Sunflower Unigen Resource (SUR version 1.0) was used to design an oligonucleotide-based Agilent microarray for cultivated sunflower. This microarray includes a total of 42,326 features: 1,417 Agilent controls, 74 control probes for sunflower replicated 10 times (740 controls) and 40,169 different non-control probes. Microarray performance was validated using a model experiment examining the induction of senescence by water deficit. Pre-processing and differential expression analysis of Agilent microarrays was performed using the Bioconductor limma package. The analyses based on p-values calculated by eBayes (p<0.01) allowed the detection of 558 differentially expressed genes between water stress and control conditions; from these, ten genes were further validated by qPCR. Over-represented ontologies were identified using FatiScan in the Babelomics suite. This work generated a curated and trustable sunflower unigene collection, and a custom, validated sunflower oligonucleotide-based microarray using Agilent technology. Both the curated unigene collection and the validated oligonucleotide microarray provide key resources for sunflower genome analysis, transcriptional studies, and molecular breeding for crop improvement. PMID:23110046

  6. Radar data pre-processing for reliable rain field estimation

    NASA Astrophysics Data System (ADS)

    Daliakopoulos, Ioannis N.; Tsanis, Ioannis K.

    2010-05-01

    A comparative analysis of different pre-processing methods applied to radar data for the minimization of the uncertainty of the produced Z-R relationship is conducted. The study focuses on measurements from 3 ground precipitation stations which are located in close proximity to the Souda Bay C-Band radar in Crete, Greece. While precipitation and reflectivity measurements were both collected in almost synchronized 10 minute intervals, uncertainties related to timing issues are discussed and measurements are aggregated to various scales up to 12 hours. Reflectivity measurements are also transformed and resampled in space, from polar coordinates to regular grids of 500 to 5000m resolution. The tradeoffs of both spatial and temporal transformation are discussed. Data is also filtered for noise using simple thresholding, the Wiener filter and combinations of both methods. The effects of the three pre-processing procedures are studied with respect to the final fit of the data to acceptable Z-R equations for the generation of reliable precipitation fields.
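
    For reference, the Z-R step itself is a two-parameter power law, Z = aR^b. Below is a minimal sketch of fitting it in log-log space from matched radar/gauge pairs, and of inverting it with the classic Marshall-Palmer coefficients (a = 200, b = 1.6); the function names are illustrative, not from the paper.

      import numpy as np

      def fit_zr(Z_linear, R):
          """Fit Z = a * R**b by least squares in log-log space from matched
          reflectivity (mm^6 m^-3) and gauge rain-rate (mm/h) pairs."""
          b, log_a = np.polyfit(np.log10(R), np.log10(Z_linear), 1)
          return 10 ** log_a, b

      def rain_rate(dBZ, a=200.0, b=1.6):
          """Invert Z = a * R**b (Marshall-Palmer defaults) for rain rate."""
          Z = 10 ** (dBZ / 10.0)                 # dBZ -> linear reflectivity
          return (Z / a) ** (1.0 / b)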

  7. Chemistry of Natural Glycan Microarray

    PubMed Central

    Song, Xuezheng; Heimburg-Molinaro, Jamie; Cummings, Richard D.; Smith, David F.

    2014-01-01

    Glycan microarrays have become indispensable tools for studying protein-glycan interactions. Along with chemo-enzymatic synthesis, glycans isolated from natural sources have played important roles in array development and will continue to be a major source of glycans. N- and O-glycans from glycoproteins, and glycans from glycosphingolipids can be released from corresponding glycoconjugates with relatively mature methods, although isolation of large numbers and quantities of glycans are still very challenging. Glycosylphosphatidylinositol (GPI)-anchors and glycosaminoglycans (GAGs) are less represented on current glycan microarrays. Glycan microarray development has been greatly facilitated by bifunctional fluorescent linkers, which can be applied in a “Shotgun Glycomics” approach to incorporate isolated natural glycans. Glycan presentation on microarrays may affect glycan binding by GBPs, often through multivalent recognition by the GBP. PMID:24487062

  8. Microarray Analysis of Microbial Weathering

    NASA Astrophysics Data System (ADS)

    Olsson-Francis, K.; van Houdt, R.; Leys, N.; Mergeay, M.; Cockell, C. S.

    2010-04-01

    Microarray analysis of the heavy-metal-resistant bacterium Cupriavidus metallidurans CH34 was used to investigate the genes involved in weathering. The results demonstrated that large porin and membrane transporter genes were up-regulated.

  9. CARMAweb: comprehensive R- and bioconductor-based web service for microarray data analysis.

    PubMed

    Rainer, Johannes; Sanchez-Cabo, Fatima; Stocker, Gernot; Sturn, Alexander; Trajanoski, Zlatko

    2006-07-01

    CARMAweb (Comprehensive R-based Microarray Analysis web service) is a web application designed for the analysis of microarray data. CARMAweb performs data preprocessing (background correction, quality control and normalization), detection of differentially expressed genes, cluster analysis, dimension reduction and visualization, classification, and Gene Ontology-term analysis. This web application accepts raw data from a variety of imaging software tools for the most widely used microarray platforms: Affymetrix GeneChips, spotted two-color microarrays and Applied Biosystems (ABI) microarrays. R and packages from the Bioconductor project are used as an analytical engine in combination with the R function Sweave, which allows automatic generation of analysis reports. These report files contain all R commands used to perform the analysis and guarantee therefore a maximum transparency and reproducibility for each analysis. The web application is implemented in Java based on the latest J2EE (Java 2 Enterprise Edition) software technology. CARMAweb is freely available at https://carmaweb.genome.tugraz.at. PMID:16845058

  10. An automated method for gridding in microarray images.

    PubMed

    Giannakeas, Nikolaos; Fotiadis, Dimitrios I; Politou, Anastasia S

    2006-01-01

    Microarray technology is a powerful tool for analyzing the expression of a large number of genes in parallel. A typical microarray image consists of a few thousand spots which determine the level of gene expression in the sample. In this paper we propose a method which automatically addresses each spot area in the image. Initially, a preliminary segmentation of the image is produced using a template-matching algorithm. Next, grid and spot finding are realized. The positions of non-expressed spots are located and finally a Voronoi diagram is employed to fit the grid on the image. Our method has been evaluated on a set of five images consisting of 45,960 spots from the Stanford microarray database, and the reported accuracy for spot detection was 93%. PMID:17946343
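
    A much-simplified stand-in for the grid-finding step (not the paper's template-matching plus Voronoi method): peaks in the row and column intensity projections of a background-subtracted array image give candidate spot-center coordinates. The minimum pitch is an assumed parameter.

      import numpy as np
      from scipy.signal import find_peaks

      def estimate_grid(img, min_pitch=8):
          """Rough grid finding: sum intensities along each axis and take the
          peaks of the projections as candidate grid-line positions."""
          rows = img.sum(axis=1)
          cols = img.sum(axis=0)
          r_peaks, _ = find_peaks(rows, distance=min_pitch)
          c_peaks, _ = find_peaks(cols, distance=min_pitch)
          return r_peaks, c_peaks   # spot-center row/column coordinates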

  11. Adaptive-weighted bilateral filtering and other pre-processing techniques for optical coherence tomography.

    PubMed

    Anantrasirichai, N; Nicholson, Lindsay; Morgan, James E; Erchova, Irina; Mortlock, Katie; North, Rachel V; Albon, Julie; Achim, Alin

    2014-09-01

    This paper presents novel pre-processing image enhancement algorithms for retinal optical coherence tomography (OCT). These images contain a large amount of speckle causing them to be grainy and of very low contrast. To make these images valuable for clinical interpretation, we propose a novel method to remove speckle, while preserving useful information contained in each retinal layer. The process starts with multi-scale despeckling based on a dual-tree complex wavelet transform (DT-CWT). We further enhance the OCT image through a smoothing process that uses a novel adaptive-weighted bilateral filter (AWBF). This offers the desirable property of preserving texture within the OCT image layers. The enhanced OCT image is then segmented to extract inner retinal layers that contain useful information for eye research. Our layer segmentation technique is also performed in the DT-CWT domain. Finally we describe an OCT/fundus image registration algorithm which is helpful when two modalities are used together for diagnosis and for information fusion. PMID:25034317
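
    As a baseline for the smoothing stage, a plain (non-adaptive) bilateral filter can be applied with OpenCV: it smooths speckle while the range kernel preserves edges between retinal layers. The paper's AWBF adapts the weights, which this sketch does not attempt; the file name and kernel parameters are hypothetical.

      import cv2

      oct_img = cv2.imread("oct_bscan.png", cv2.IMREAD_GRAYSCALE)  # hypothetical file
      # d: neighborhood diameter; sigmaColor: range kernel; sigmaSpace: spatial kernel
      smoothed = cv2.bilateralFilter(oct_img, d=9, sigmaColor=25, sigmaSpace=9)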

  12. Design and implementation of a preprocessing system for a sodium lidar

    NASA Technical Reports Server (NTRS)

    Voelz, D. G.; Sechrist, C. F., Jr.

    1983-01-01

    A preprocessing system, designed and constructed for use with the University of Illinois sodium lidar system, was developed to increase the altitude resolution and range of the lidar system and also to decrease the processing burden of the main lidar computer. The preprocessing system hardware and the software required to implement the system are described. Some preliminary results of an airborne sodium lidar experiment conducted with the preprocessing system installed in the sodium lidar are presented.

  13. Data acquisition and preprocessing techniques for remote sensing field research

    NASA Technical Reports Server (NTRS)

    Biehl, L. L.; Robinson, B. F.

    1983-01-01

    A crops and soils data base has been developed at Purdue University's Laboratory for Applications of Remote Sensing using spectral and agronomic measurements made by several government and university researchers. The data are being used to (1) quantitatively determine the relationships of spectral and agronomic characteristics of crops and soils, (2) define future sensor systems, and (3) develop advanced data analysis techniques. Researchers follow defined data acquisition and preprocessing techniques to provide fully annotated and calibrated sets of spectral, agronomic, and meteorological data. These procedures enable the researcher to combine his data with that acquired by other researchers for remote sensing research. The key elements or requirements for developing a field research data base of spectral data that can be transported across sites and years are appropriate experiment design, accurate spectral data calibration, defined field procedures, and thorough experiment documentation.

  14. Preprocessing of Satellite Data for Urban Object Extraction

    NASA Astrophysics Data System (ADS)

    Krauß, T.

    2015-03-01

    Very high resolution (VHR) DSMs (digital surface models) derived from stereo- or multi-stereo images from current VHR satellites like WorldView-2 or Pléiades can be produced up to the ground sampling distance (GSD) of the sensors in the range of 50 cm to 1 m. From such DSMs the digital terrain model (DTM) representing the ground and also a so-called nDEM (normalized digital elevation model) describing the height of objects above the ground can be derived. In parallel these sensors deliver multispectral imagery which can be used for a spectral classification of the imagery. Fusion of the multispectral classification and the nDEM allows a simple classification and detection of urban objects. In further processing steps these detected urban objects can be modeled and exported in a suitable description language like CityGML. In this work we present the pre-processing steps up to the classification and detection of the urban objects. The modeling is not part of this work. The pre-processing steps described here cover briefly the coregistration of the input images and the generation of the DSM. In more detail the improvement of the DSM, the extraction of the DTM and nDEM, the multispectral classification and the object detection and extraction are explained. The methods described are applied to two test regions from two satellites: first the center of Munich, acquired by WorldView-2, and second the center of Melbourne, acquired by Pléiades. From both acquisitions a stereo-pair from the panchromatic bands is used for creation of the DSM and the pan-sharpened multispectral images are used for spectral classification. Finally the quality of the detected urban objects is discussed.
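
    One simple, commonly used approximation of the DTM-extraction step (a crude stand-in for the methods detailed in the paper) is a grey-scale morphological opening of the DSM, after which the nDEM follows by subtraction; the window size and height threshold below are assumptions.

      import numpy as np
      from scipy.ndimage import grey_opening

      def ndem_from_dsm(dsm, window=101):
          """Approximate the DTM by a morphological opening of the DSM (removes
          raised objects narrower than `window` pixels); nDEM = DSM - DTM."""
          dtm = grey_opening(dsm, size=(window, window))
          ndem = dsm - dtm                  # per-pixel object height above ground
          return dtm, ndem

      def object_mask(ndem, min_height=2.5):
          """Buildings/vegetation: pixels rising clearly above the ground."""
          return ndem > min_height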

  15. Inferring genetic networks from microarray data.

    SciTech Connect

    May, Elebeoba Eni; Davidson, George S.; Martin, Shawn Bryan; Werner-Washburne, Margaret C.; Faulon, Jean-Loup Michel

    2004-06-01

    In theory, it should be possible to infer realistic genetic networks from time series microarray data. In practice, however, network discovery has proved problematic. The three major challenges are: (1) inferring the network; (2) estimating the stability of the inferred network; and (3) making the network visually accessible to the user. Here we describe a method, tested on publicly available time series microarray data, which addresses these concerns. The inference of genetic networks from genome-wide experimental data is an important biological problem which has received much attention. Approaches to this problem have typically included application of clustering algorithms [6]; the use of Boolean networks [12, 1, 10]; the use of Bayesian networks [8, 11]; and the use of continuous models [21, 14, 19]. Overviews of the problem and general approaches to network inference can be found in [4, 3]. Our approach to network inference is similar to earlier methods in that we use both clustering and Boolean network inference. However, we have attempted to extend the process to better serve the end-user, the biologist. In particular, we have incorporated a system to assess the reliability of our network, and we have developed tools which allow interactive visualization of the proposed network.

  16. Segmentation of prostate cancer tissue microarray images

    NASA Astrophysics Data System (ADS)

    Cline, Harvey E.; Can, Ali; Padfield, Dirk

    2006-02-01

    Prostate cancer is diagnosed by histopathology interpretation of hematoxylin and eosin (H&E)-stained tissue sections. Gland and nuclei distributions vary with the disease grade. The morphological features vary with the advance of cancer, where the epithelial regions grow into the stroma. An efficient pathology slide image analysis method involved using a tissue microarray with known disease stages. Digital 24-bit RGB images were acquired for each tissue element on the slide with both 10X and 40X objectives. Initial segmentation at low magnification was accomplished using prior spectral characteristics from a training tissue set composed of four tissue clusters, namely glands, epithelia, stroma and nuclei. The segmentation method was automated by using the training RGB values as an initial guess and iterating the averaging process 10 times to find the four cluster centers. Labels were assigned to the nearest cluster center in red-blue spectral feature space. An automatic threshold algorithm separated the glands from the tissue. A visual pseudo-color representation of 60 segmented tissue microarray images was generated, where white, pink, red and blue represent glands, epithelia, stroma and nuclei, respectively. The higher magnification images provided refined nuclei morphology. The nuclei were detected with an RGB color space principal component analysis that resulted in a grey-scale image. Shape metrics such as compactness, elongation, and minimum and maximum diameters were calculated based on the eigenvalues of the best-fitting ellipses to the nuclei.
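
    The iterative averaging described above is essentially Lloyd-style k-means refinement in RGB feature space, started from training-derived centers. A minimal NumPy sketch, with hypothetical function and variable names:

      import numpy as np

      def refine_centers(pixels, centers, n_iter=10):
          """pixels: (N, 3) float RGB values; centers: (4, 3) training-derived
          guesses. Assign each pixel to its nearest center, recompute centers
          as class means, and repeat n_iter times."""
          centers = centers.astype(float).copy()
          for _ in range(n_iter):
              d = ((pixels[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
              labels = d.argmin(axis=1)          # nearest-center assignment
              for k in range(len(centers)):
                  if np.any(labels == k):
                      centers[k] = pixels[labels == k].mean(axis=0)
          return centers, labels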

  17. Comparing Bacterial DNA Microarray Fingerprints

    SciTech Connect

    Willse, Alan R.; Chandler, Darrell P.; White, Amanda M.; Protic, Miroslava; Daly, Don S.; Wunschel, Sharon C.

    2005-08-15

    Detecting subtle genetic differences between microorganisms is an important problem in molecular epidemiology and microbial forensics. In a typical investigation, gel electrophoresis is used to compare randomly amplified DNA fragments between microbial strains, where the patterns of DNA fragment sizes are proxies for a microbe's genotype. The limited genomic sample captured on a gel is often insufficient to discriminate nearly identical strains. This paper examines the application of microarray technology to DNA fingerprinting as a high-resolution alternative to gel-based methods. The so-called universal microarray, which uses short oligonucleotide probes that do not target specific genes or species, is intended to be applicable to all microorganisms because it does not require prior knowledge of genomic sequence. In principle, closely related strains can be distinguished if the number of probes on the microarray is sufficiently large, i.e., if the genome is sufficiently sampled. In practice, we confront noisy data, imperfectly matched hybridizations, and a high-dimensional inference problem. We describe the statistical problems of microarray fingerprinting, outline similarities with and differences from more conventional microarray applications, and illustrate the statistical fingerprinting problem for 10 closely related strains from three Bacillus species, and 3 strains from non-Bacillus species.

  18. Study on construction of a medical x-ray direct digital radiography system and hybrid preprocessing methods.

    PubMed

    Ren, Yong; Wu, Sheng; Wang, Mijian; Cen, Zhongjie

    2014-01-01

    We construct a medical X-ray direct digital radiography (DDR) system based on a CCD (charge-coupled device) camera. For the original images captured from X-ray exposure, the computer first executes image flat-field correction and image gamma correction, and then carries out image contrast enhancement. A hybrid image contrast enhancement algorithm, based on the sharp frequency localization contourlet transform (SFL-CT) and contrast-limited adaptive histogram equalization (CLAHE), is proposed and verified on clinical DDR images. Experimental results show that, for medical X-ray DDR images, the proposed comprehensive preprocessing algorithm can not only greatly enhance contrast and detail information, but also improve the resolution capability of the DDR system. PMID:25013452
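
    The preprocessing stages ahead of the SFL-CT enhancement are standard image operations. A hedged OpenCV/NumPy sketch of flat-field correction, gamma correction, and CLAHE (parameter values are illustrative, not those of the paper):

      import cv2
      import numpy as np

      def flat_field(raw, flat, dark):
          """Classic flat-field correction: (raw - dark) / (flat - dark), rescaled."""
          num = raw.astype(np.float32) - dark
          den = np.clip(flat.astype(np.float32) - dark, 1e-6, None)
          out = num / den
          return np.clip(out * 255.0 / out.max(), 0, 255).astype(np.uint8)

      def gamma_correct(img, gamma=0.6):
          """Per-pixel power-law mapping applied via an 8-bit lookup table."""
          lut = (np.linspace(0, 1, 256) ** gamma * 255).astype(np.uint8)
          return cv2.LUT(img, lut)

      def enhance(img):
          """Local contrast enhancement with contrast-limited adaptive HE."""
          clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
          return clahe.apply(img)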

  19. Study on Construction of a Medical X-Ray Direct Digital Radiography System and Hybrid Preprocessing Methods

    PubMed Central

    Ren, Yong; Wu, Sheng; Wang, Mijian; Cen, Zhongjie

    2014-01-01

    We construct a medical X-ray direct digital radiography (DDR) system based on a CCD (charge-coupled device) camera. For the original images captured from X-ray exposure, the computer first executes image flat-field correction and image gamma correction, and then carries out image contrast enhancement. A hybrid image contrast enhancement algorithm, based on the sharp frequency localization contourlet transform (SFL-CT) and contrast-limited adaptive histogram equalization (CLAHE), is proposed and verified on clinical DDR images. Experimental results show that, for medical X-ray DDR images, the proposed comprehensive preprocessing algorithm can not only greatly enhance contrast and detail information, but also improve the resolution capability of the DDR system. PMID:25013452

  20. Characteristic attributes in cancer microarrays.

    PubMed

    Sarkar, I N; Planet, P J; Bael, T E; Stanley, S E; Siddall, M; DeSalle, R; Figurski, D H

    2002-04-01

    Rapid advances in genome sequencing and gene expression microarray technologies are providing unprecedented opportunities to identify specific genes involved in complex biological processes, such as development, signal transduction, and disease. The vast amount of data generated by these technologies has presented new challenges in bioinformatics. To help organize and interpret microarray data, new and efficient computational methods are needed to: (1) distinguish accurately between different biological or clinical categories (e.g., malignant vs. benign), and (2) identify specific genes that play a role in determining those categories. Here we present a novel and simple method that exhaustively scans microarray data for unambiguous gene expression patterns. Such patterns of data can be used as the basis for classification into biological or clinical categories. The method, termed the Characteristic Attribute Organization System (CAOS), is derived from fundamental precepts in systematic biology. In CAOS we define two types of characteristic attributes ('pure' and 'private') that may exist in gene expression microarray data. We also consider additional attributes ('compound') that are composed of expression states of more than one gene that are not characteristic on their own. CAOS was tested on three well-known cancer DNA microarray data sets for its ability to classify new microarray samples. We found CAOS to be a highly accurate and robust class prediction technique. In addition, CAOS identified specific genes, not emphasized in other analyses, that may be crucial to the biology of certain types of cancer. The success of CAOS in this study has significant implications for basic research and the future development of reliable methods for clinical diagnostic tools. PMID:12474425

  1. Image microarrays (IMA): Digital pathology's missing tool

    PubMed Central

    Hipp, Jason; Cheng, Jerome; Pantanowitz, Liron; Hewitt, Stephen; Yagi, Yukako; Monaco, James; Madabhushi, Anant; Rodriguez-canales, Jaime; Hanson, Jeffrey; Roy-Chowdhuri, Sinchita; Filie, Armando C.; Feldman, Michael D.; Tomaszewski, John E.; Shih, Natalie NC.; Brodsky, Victor; Giaccone, Giuseppe; Emmert-Buck, Michael R.; Balis, Ulysses J.

    2011-01-01

    Introduction: The increasing availability of whole slide imaging (WSI) data sets (digital slides) from glass slides offers new opportunities for the development of computer-aided diagnostic (CAD) algorithms. With the all-digital pathology workflow that these data sets will enable in the near future, literally millions of digital slides will be generated and stored. Consequently, the field in general, and pathologists specifically, will need tools to help extract actionable information from this new and vast collective repository. Methods: To address this limitation, we designed and implemented a tool (dCORE) to enable the systematic capture of image tiles with constrained size and resolution that contain desired histopathologic features. Results: In this communication, we describe a user-friendly tool that will enable pathologists to mine digital slide archives to create image microarrays (IMAs). IMAs are to digital slides as tissue microarrays (TMAs) are to cell blocks. Thus, a single digital slide could be transformed into an array of hundreds to thousands of high quality digital images, with each containing key diagnostic morphologies and appropriate controls. Current manual digital image cut-and-paste methods that allow for the creation of a grid of images (such as an IMA) of matching resolutions are tedious. Conclusion: The ability to create IMAs representing hundreds to thousands of vetted morphologic features has numerous applications in education, proficiency testing, consensus case review, and research. Lastly, in a manner analogous to the way conventional TMA technology has significantly accelerated in situ studies of tissue specimens, the use of IMAs has similar potential to significantly accelerate CAD algorithm development. PMID:22200030

  2. Microarrayed Materials for Stem Cells

    PubMed Central

    Mei, Ying

    2013-01-01

    Stem cells hold remarkable promise for applications in disease modeling, cancer therapy and regenerative medicine. Despite the significant progress made during the last decade, designing materials to control stem cell fate remains challenging. As an alternative, materials microarray technology has received great attention because it allows for high throughput materials synthesis and screening at a reasonable cost. Here, we discuss recent developments in materials microarray technology and their applications in stem cell engineering. Future opportunities in the field will also be reviewed. PMID:24311967

  3. Immunoprofiling Using NAPPA Protein Microarrays

    PubMed Central

    Sibani, Sahar; LaBaer, Joshua

    2012-01-01

    Protein microarrays provide an efficient method to immunoprofile patients in an effort to rapidly identify disease immunosignatures. The validity of using autoantibodies in diagnosis has been demonstrated in type 1 diabetes, rheumatoid arthritis, and systemic lupus, and is now being strongly considered in cancer. Several types of protein microarrays exist including antibody and antigen arrays. In this chapter, we describe the immunoprofiling application for one type of antigen array called NAPPA (nucleic acids programmable protein array). We provide a guideline for setting up the screening study and designing protein arrays to maximize the likelihood of obtaining quality data. PMID:21370064

  4. Fourier Lucas-Kanade algorithm.

    PubMed

    Lucey, Simon; Navarathna, Rajitha; Ashraf, Ahmed Bilal; Sridharan, Sridha

    2013-06-01

    In this paper, we propose a framework for both gradient descent image and object alignment in the Fourier domain. Our method centers upon the classical Lucas & Kanade (LK) algorithm where we represent the source and template/model in the complex 2D Fourier domain rather than in the spatial 2D domain. We refer to our approach as the Fourier LK (FLK) algorithm. The FLK formulation is advantageous when one preprocesses the source image and template/model with a bank of filters (e.g., oriented edges, Gabor, etc.) as 1) it can handle substantial illumination variations, 2) the inefficient preprocessing filter bank step can be subsumed within the FLK algorithm as a sparse diagonal weighting matrix, 3) unlike traditional LK, the computational cost is invariant to the number of filters and as a result is far more efficient, and 4) this approach can be extended to the Inverse Compositional (IC) form of the LK algorithm where nearly all steps (including Fourier transform and filter bank preprocessing) can be precomputed, leading to an extremely efficient and robust approach to gradient descent image matching. Further, these computational savings translate to nonrigid object alignment tasks that are considered extensions of the LK algorithm, such as those found in Active Appearance Models (AAMs). PMID:23599053
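
    Point 2 above rests on Parseval's theorem: filtering with a bank of linear filters and summing squared errors in the spatial domain equals a single diagonally weighted squared error in the Fourier domain. A minimal numpy illustration of that identity, with a made-up two-filter bank, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.standard_normal((32, 32))     # source patch x
tmpl = rng.standard_normal((32, 32))    # template patch t
# Toy "filter bank": horizontal and vertical difference filters.
bank = [np.array([[1.0, -1.0]]), np.array([[1.0], [-1.0]])]

def padded_fft(f, shape):
    out = np.zeros(shape)
    out[:f.shape[0], :f.shape[1]] = f
    return np.fft.fft2(out)

F_img, F_tmpl = np.fft.fft2(img), np.fft.fft2(tmpl)

# Spatial-domain cost: sum_i ||g_i * x - g_i * t||^2 (circular convolution),
# whose cost grows with the number of filters.
spatial = 0.0
for g in bank:
    G = padded_fft(g, img.shape)
    diff = np.fft.ifft2(G * (F_img - F_tmpl)).real
    spatial += np.sum(diff ** 2)

# Fourier-domain cost: one diagonal weighting S = sum_i |G_i|^2 applied
# once -- invariant to the number of filters (Parseval's theorem).
S = sum(np.abs(padded_fft(g, img.shape)) ** 2 for g in bank)
fourier = np.sum(S * np.abs(F_img - F_tmpl) ** 2) / img.size

print(np.allclose(spatial, fourier))    # True
```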

  5. An integrated approach to the simultaneous selection of variables, mathematical pre-processing and calibration samples in partial least-squares multivariate calibration.

    PubMed

    Allegrini, Franco; Olivieri, Alejandro C

    2013-10-15

    A new optimization strategy for multivariate partial-least-squares (PLS) regression analysis is described. It was achieved by integrating three efficient strategies to improve PLS calibration models: (1) variable selection based on ant colony optimization, (2) mathematical pre-processing selection by a genetic algorithm, and (3) sample selection through a distance-based procedure. Outlier detection has also been included as part of the model optimization. All the above procedures have been combined into a single algorithm, whose aim is to find the best PLS calibration model within a Monte Carlo-type philosophy. Simulated and experimental examples are employed to illustrate the success of the proposed approach. PMID:24054659
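
    A stripped-down illustration of the Monte Carlo flavour of such a search: repeatedly propose a mathematical pre-processing choice plus a random variable subset, score the pair by cross-validated PLS error, and keep the best. This sketch uses scikit-learn on synthetic data; the candidate pre-processing steps and the random proposal scheme are stand-ins for the paper's ant colony and genetic algorithm components.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
X = rng.standard_normal((60, 80))                  # synthetic "spectra"
y = X[:, :5] @ rng.standard_normal(5) + 0.1 * rng.standard_normal(60)

# Candidate mathematical pre-processing steps (stand-ins for the real
# options): identity, standard normal variate, first derivative.
preprocs = {
    "none":  lambda A: A,
    "snv":   lambda A: (A - A.mean(1, keepdims=True)) / A.std(1, keepdims=True),
    "deriv": lambda A: np.diff(A, axis=1),
}

best = (np.inf, None)
for trial in range(200):                           # Monte Carlo proposals
    name = str(rng.choice(list(preprocs)))
    Xp = preprocs[name](X)
    mask = rng.random(Xp.shape[1]) < 0.3           # random variable subset
    if mask.sum() < 5:
        continue
    rmse = -cross_val_score(PLSRegression(n_components=3), Xp[:, mask], y,
                            cv=5, scoring="neg_root_mean_squared_error").mean()
    if rmse < best[0]:
        best = (rmse, (name, int(mask.sum())))

print("best RMSECV %.3f with %s, %d variables" % (best[0], *best[1]))
```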

  6. Validation of MIMGO: a method to identify differentially expressed GO terms in a microarray dataset

    PubMed Central

    2012-01-01

    Background We previously proposed an algorithm for the identification of GO terms that commonly annotate genes whose expression is upregulated or downregulated in some microarray data compared with other microarray data. We call these “differentially expressed GO terms” and have named the algorithm “matrix-assisted identification method of differentially expressed GO terms” (MIMGO). MIMGO can also identify microarray data in which genes annotated with a differentially expressed GO term are upregulated or downregulated. However, MIMGO has not yet been validated on a real microarray dataset using all available GO terms. Findings We combined Gene Set Enrichment Analysis (GSEA) with MIMGO to identify differentially expressed GO terms in a yeast cell cycle microarray dataset. GSEA followed by MIMGO (GSEA + MIMGO) correctly identified (p < 0.05) microarray data in which genes annotated to differentially expressed GO terms are upregulated. We found that GSEA + MIMGO was slightly less effective than, or comparable to, GSEA (Pearson), a method that uses Pearson’s correlation as a metric, at detecting true differentially expressed GO terms. However, unlike other methods including GSEA (Pearson), GSEA + MIMGO can comprehensively identify the microarray data in which genes annotated with a differentially expressed GO term are upregulated or downregulated. Conclusions MIMGO is a reliable method to identify differentially expressed GO terms comprehensively. PMID:23232071

  7. Microfluidic microarray systems and methods thereof

    DOEpatents

    West, Jay A. A.; Hukari, Kyle W.; Hux, Gary A.

    2009-04-28

    Disclosed are systems that include a manifold in fluid communication with a microfluidic chip having a microarray, an illuminator, and a detector in optical communication with the microarray. Methods for using these systems for biological detection are also disclosed.

  8. Technical Advances of the Recombinant Antibody Microarray Technology Platform for Clinical Immunoproteomics

    PubMed Central

    Delfani, Payam; Dexlin Mellby, Linda; Nordström, Malin; Holmér, Andreas; Ohlsson, Mattias; Borrebaeck, Carl A. K.; Wingren, Christer

    2016-01-01

    In the quest for deciphering disease-associated biomarkers, high-performing tools for multiplexed protein expression profiling of crude clinical samples will be crucial. Affinity proteomics, mainly represented by antibody-based microarrays, has during recent years been established as a proteomic tool providing unique opportunities for parallelized protein expression profiling. But despite the progress, several main technical features and assay procedures remain to be fully resolved. Among these issues, the handling of protein microarray data, i.e. the biostatistics, is one of the key issues to resolve. In this study, we have therefore further optimized, validated, and standardized our in-house designed recombinant antibody microarray technology platform. To this end, we addressed the main remaining technical issues (e.g. antibody quality, array production, sample labelling, and selected assay conditions) and, most importantly, key biostatistics subjects (e.g. array data pre-processing and biomarker panel condensation). This represents one of the first antibody array studies in which these key biostatistics subjects have been studied in detail. Here, we thus present the next generation of the recombinant antibody microarray technology platform designed for clinical immunoproteomics. PMID:27414037

  9. Microarray analysis: Uses and Limitations

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The use of microarray technology has exploded in recent years. All areas of biological research have found applications for this powerful platform. From human disease studies to microbial detection systems, a plethora of uses for this technology are currently in place, with new uses being developed ...

  10. Microarray Developed on Plastic Substrates.

    PubMed

    Bañuls, María-José; Morais, Sergi B; Tortajada-Genaro, Luis A; Maquieira, Ángel

    2016-01-01

    There is huge potential interest in using synthetic polymers as versatile solid supports for analytical microarraying. Chemical modification of polycarbonate (PC) for covalent immobilization of probes, micro-printing of protein or nucleic acid probes, development of indirect immunoassays, and development of hybridization protocols are described and discussed. PMID:26614067

  11. Preprocessing functions for computed radiography images in a PACS environment

    NASA Astrophysics Data System (ADS)

    McNitt-Gray, Michael F.; Pietka, Ewa; Huang, H. K.

    1992-05-01

    In a picture archiving and communications system (PACS), images are acquired from several modalities including computed radiography (CR). This modality has unique image characteristics and presents several problems that need to be resolved before the image is available for viewing at a display workstation. A set of preprocessing functions has been applied to all CR images in a PACS environment to enhance the display of images. The first function reformats CR images that are acquired with different plate sizes to a standard size for display. Another function removes the distracting white background caused by the collimation used at the time of exposure. A third function determines the orientation of each image and rotates those images that are in nonstandard positions into a standard viewing position. Another function creates a default look-up table based on the gray levels actually used by the image (instead of the allocated gray levels). Finally, there is a function which creates (for chest images only) piece-wise linear look-up tables that can be applied to enhance different tissue densities. These functions have all been implemented in a PACS environment. Each of these functions has been very successful in improving the viewing conditions of CR images, and together they contribute to the clinical acceptance of PACS by reducing the effort required to display CR images.
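
    The default look-up table idea can be mimicked in a few lines: window the display range to the gray levels actually present in the image rather than the full allocated bit depth. A hypothetical sketch (the percentile cut-offs and 8-bit output are illustrative assumptions):

```python
import numpy as np

def default_lut(image, out_levels=256):
    """Map the gray levels actually used by the image (robust 0.5th to
    99.5th percentiles, to ignore stray pixels) linearly onto the full
    display range, instead of using the allocated bit depth."""
    lo, hi = np.percentile(image, [0.5, 99.5])
    scaled = np.clip((image.astype(float) - lo) / (hi - lo), 0.0, 1.0)
    return (scaled * (out_levels - 1)).astype(np.uint8)

# Synthetic 12-bit CR image that only occupies a narrow band of levels.
rng = np.random.default_rng(0)
cr = rng.integers(1800, 2300, size=(64, 64))
disp = default_lut(cr)
print(cr.min(), cr.max(), "->", disp.min(), disp.max())
```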

  12. Multimodal image fusion with SIMS: Preprocessing with image registration.

    PubMed

    Tarolli, Jay Gage; Bloom, Anna; Winograd, Nicholas

    2016-06-01

    In order to utilize complementary imaging techniques to supply higher resolution data for fusion with secondary ion mass spectrometry (SIMS) chemical images, there are a number of aspects that, if not given proper consideration, could produce results which are easy to misinterpret. One of the most critical aspects is that the two input images must be of exactly the same analysis area. With the desire to explore new higher resolution data sources that exist outside of the mass spectrometer, this requirement becomes even more important. To ensure that two input images are of the same region, an implementation of the Insight Segmentation and Registration Toolkit (ITK) was developed to act as a preprocessing step before performing image fusion. This implementation of ITK allows for several degrees of movement between two input images to be accounted for, including translation, rotation, and scale transforms. First, the implementation was confirmed to accurately register two multimodal images by supplying a known transform. Once validated, two model systems, a copper mesh grid and a group of RAW 264.7 cells, were used to demonstrate the use of the ITK implementation to register a SIMS image with a microscopy image for the purpose of performing image fusion. PMID:26772745
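
    The registration step described (translation, rotation, and scale recovered before fusion) maps naturally onto ITK's registration framework. Below is a minimal SimpleITK sketch of that idea, not the authors' implementation; the file names are placeholders and the metric and optimizer settings are illustrative.

```python
import SimpleITK as sitk

# File names are placeholders for the two registered modalities.
fixed = sitk.ReadImage("sims_total_ion.tif", sitk.sitkFloat32)
moving = sitk.ReadImage("microscopy.tif", sitk.sitkFloat32)

reg = sitk.ImageRegistrationMethod()
reg.SetMetricAsMattesMutualInformation(numberOfHistogramBins=50)
reg.SetOptimizerAsRegularStepGradientDescent(
    learningRate=1.0, minStep=1e-4, numberOfIterations=200)
reg.SetInterpolator(sitk.sitkLinear)

# Similarity transform = translation + rotation + isotropic scale,
# the degrees of freedom mentioned in the abstract.
initial = sitk.CenteredTransformInitializer(
    fixed, moving, sitk.Similarity2DTransform(),
    sitk.CenteredTransformInitializerFilter.GEOMETRY)
reg.SetInitialTransform(initial, inPlace=False)

transform = reg.Execute(fixed, moving)
aligned = sitk.Resample(moving, fixed, transform, sitk.sitkLinear, 0.0)
sitk.WriteImage(aligned, "microscopy_registered.tif")
```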

  13. TOPSAR data focusing based on azimuth scaling preprocessing

    NASA Astrophysics Data System (ADS)

    Xu, Wei; Huang, Pingping; Deng, Yunkai

    2011-07-01

    Both Doppler spectral aliasing and azimuth output time folding simultaneously exist in TOPSAR (Terrain Observation by Progressive Scans) raw data. Resampling in both the Doppler frequency and azimuth time domains can resolve the azimuth aliasing problem, but at the cost of seriously increased computational complexity and memory consumption. Exploiting the special characteristics of the TOPSAR raw data support in the slow time/frequency domain (TFD), the presented azimuth scaling preprocessing step is introduced not only to resolve the Doppler spectral aliasing problem but also to limit the increase in azimuth samples. Furthermore, the correction of the sawtoothed azimuth antenna pattern (AAP) becomes easy to implement. A conventional stripmap processor can then be adopted to focus the residual TOPSAR raw data, although the resulting TOPSAR image is azimuth-aliased. The mosaic approach, originally presented to unfold azimuth-aliased ScanSAR images, is exploited to resolve the problem of azimuth output folding in TOPSAR mode. Simulation results and pulse response parameters are given to validate the presented imaging approach.

  14. Software for Preprocessing Data from Rocket-Engine Tests

    NASA Technical Reports Server (NTRS)

    Cheng, Chiu-Fu

    2004-01-01

    Three computer programs have been written to preprocess digitized outputs of sensors during rocket-engine tests at Stennis Space Center (SSC). The programs apply exclusively to the SSC E test-stand complex and utilize the SSC file format. The programs are the following: Engineering Units Generator (EUGEN) converts sensor-output-measurement data to engineering units. The inputs to EUGEN are raw binary test-data files, which include the voltage data, a list identifying the data channels, and time codes. EUGEN effects conversion by use of a file that contains calibration coefficients for each channel. QUICKLOOK enables immediate viewing of a few selected channels of data, in contradistinction to viewing only after post-test processing (which can take 30 minutes to several hours depending on the number of channels and other test parameters) of data from all channels. QUICKLOOK converts the selected data into a form in which they can be plotted in engineering units by use of Winplot (a free graphing program written by Rick Paris). EUPLOT provides a quick means for looking at data files generated by EUGEN without the necessity of relying on the PV-WAVE based plotting software.
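
    Conversion to engineering units of the kind EUGEN performs typically amounts to evaluating a per-channel calibration polynomial on the raw voltages. A hypothetical sketch; the channel names, coefficients, and in-memory layout are invented for illustration and do not reflect the SSC file format:

```python
import numpy as np

# Per-channel calibration coefficients (highest order first), as might be
# read from a calibration file: engineering_units = polyval(coeffs, volts).
calibration = {
    "chamber_pressure": [250.0, -1.5],    # psi per volt, plus offset
    "lox_flow":         [0.02, 3.1, 0.0], # quadratic fit
}

def to_engineering_units(channel, volts):
    """Apply the channel's calibration polynomial to raw voltage samples."""
    return np.polyval(calibration[channel], np.asarray(volts, dtype=float))

volts = np.array([0.0, 1.0, 2.5])
print(to_engineering_units("chamber_pressure", volts))  # [-1.5, 248.5, 623.5]
```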

  17. Design of a focal plane array with analog neural preprocessing

    NASA Astrophysics Data System (ADS)

    Koren, Ivo; Dohndorf, Juergen; Schluessler, Jens-Uwe; Werner, Joerg; Kroenig, Arndt; Ramacher, Ulrich

    1996-12-01

    The design of a CMOS focal plane array with 128 by 128 pixels and analog neural preprocessing is presented. Optical input to the array is provided by substrate-well photodiodes. A two-dimensional neural grid with next-neighbor connectivity, implemented as a differential current-mode circuit, is capable of spatial low-pass filtering combined with contrast enhancement or binarization. The gain, spatial filter and nonlinearity parameters of the neural network are controlled externally using analog currents. This allows the multipliers and sigmoid transducers to be operated in weak inversion for a wide parameter sweep range as well as in moderate or strong inversion for a larger signal to pattern-noise ratio. The cell outputs are sequentially read out by an offset-compensated differential switched-capacitor multiplexer with column preamplifiers. The analog output buffer is designed for pixel rates up to 1 pixel/microsecond and 2 by 100 pF load capacitance. All digital clocks controlling the analog data path are generated on-chip. The clock timing is programmable via a serial computer interface. Using a 1 micrometer double-poly double-metal CMOS process, one pixel cell occupies 96 × 96 μm² and the total chip size is about 2.3 cm². Operating the neural network in weak inversion, the power dissipation of the analog circuitry is less than 100 mW.

  18. Macular Preprocessing of Linear Acceleratory Stimuli: Implications for the Clinic

    NASA Technical Reports Server (NTRS)

    Ross, M. D.; Hargens, Alan R. (Technical Monitor)

    1996-01-01

    Three-dimensional reconstructions of innervation patterns in rat maculae were carried out using serial section images sent to a Silicon Graphics workstation from a transmission electron microscope. Contours were extracted from mosaicked sections, then registered and visualized using Biocomputation Center software. The purposes were to determine innervation patterns of type II cells and the areas encompassed by vestibular afferent receptive fields. Terminals on type II cells typically are elongated and compartmentalized into parts varying in vesicular content; reciprocal and serial synapses are common. The terminals originate as processes of nearby calyces or from nerve fibers passing to calyces outside the immediate vicinity. Thus, receptive fields of the afferents overlap in unique ways. Multiple processes are frequent; from 4 to 6 afferents supply 12-16 terminals on a type II cell. Processes commonly communicate with two type II cells. The morphology indicates that extensive preprocessing of linear acceleratory stimuli occurs peripherally, as is true also of visual and olfactory systems. Clinically, this means that loss of individual nerve fibers may not be noticed behaviorally, due to redundancy (receptive field overlap). However, peripheral processing implies the presence of neuroactive agents whose loss can acutely or chronically alter normal peripheral function and cause balance disorders.

  19. Classifying Human Voices by Using Hybrid SFX Time-Series Preprocessing and Ensemble Feature Selection

    PubMed Central

    Wong, Raymond

    2013-01-01

    Voice biometrics exploits a physiological characteristic: each person's voice is unique. Due to this uniqueness, voice classification has found useful applications in classifying speakers' gender, mother tongue or ethnicity (accent), emotion states, identity verification, verbal command control, and so forth. In this paper, we adopt a new preprocessing method named Statistical Feature Extraction (SFX) for extracting important features for training a classification model, based on piecewise transformation treating an audio waveform as a time-series. Using SFX we can faithfully remodel the statistical characteristics of the time-series; together with spectral analysis, a substantial number of features is extracted in combination. An ensemble is utilized to select only the influential features to be used in classification model induction. We focus on comparing the effects of various popular data mining algorithms on multiple datasets. Our experiment consists of classification tests over four typical categories of human voice data, namely Female and Male, Emotional Speech, Speaker Identification, and Language Recognition. The experiments yield encouraging results supporting the fact that heuristically choosing significant features from both time and frequency domains indeed produces better performance in voice classification than traditional signal processing techniques alone, like wavelets and LPC-to-CC. PMID:24288684
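
    The piecewise statistical treatment of a waveform can be sketched simply: split the signal into windows and concatenate a few summary statistics per window into a fixed-length feature vector. This is a hedged illustration of the general approach, not the authors' SFX code; the chosen statistics are assumptions.

```python
import numpy as np

def piecewise_stats(signal, n_segments=8):
    """Split a 1-D waveform into equal segments and summarize each with a
    few statistics, yielding a fixed-length feature vector for a classifier."""
    feats = []
    for seg in np.array_split(np.asarray(signal, dtype=float), n_segments):
        feats += [seg.mean(), seg.std(), seg.min(), seg.max(),
                  np.mean(np.abs(np.diff(seg)))]    # mean absolute slope
    return np.array(feats)

# Toy "voice" signal: a noisy chirp.
t = np.linspace(0.0, 1.0, 8000)
signal = np.sin(2 * np.pi * (100 + 200 * t) * t)
signal += 0.05 * np.random.default_rng(0).standard_normal(t.size)
print(piecewise_stats(signal).shape)   # (40,) = 8 segments x 5 statistics
```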

  20. Image preprocessing method for particle image velocimetry (PIV) image interrogation near a fluid-solid surface

    NASA Astrophysics Data System (ADS)

    Zhu, Yiding; Jia, Lichao; Bai, Ye; Yuan, Huijing; Lee, Cunbiao

    2014-11-01

    Accurate particle image velocimetry (PIV) measurements near a moving wall are a great challenge. The problem is compounded by the very large in-plane displacements on PIV images commonly encountered in measurements of high speed flow. An improved image preprocessing method is presented in this paper. A wall detection technique is first used to determine the wall position and the movement of the solid body. Virtual particle images are imposed in the solid region, with displacements evaluated from the body movement. The estimation near the wall is then smoothed using data from both sides of the shear layer to reduce the large random uncertainties. Interrogations in the following iterative steps then converge to the correct results and provide accurate predictions for particle tracking velocimetry (PTV). Significant improvement is seen in Monte Carlo simulations and experimental tests, such as measurements near a flapping flag or compressor plates. The algorithm also successfully extracted the small flow structures of the second-mode wave in the hypersonic boundary layer from PIV images with low signal-to-noise ratios (SNR) where the traditional method was not successful.

  1. The Microarray Revolution: Perspectives from Educators

    ERIC Educational Resources Information Center

    Brewster, Jay L.; Beason, K. Beth; Eckdahl, Todd T.; Evans, Irene M.

    2004-01-01

    In recent years, microarray analysis has become a key experimental tool, enabling the analysis of genome-wide patterns of gene expression. This review approaches the microarray revolution with a focus upon four topics: 1) the early development of this technology and its application to cancer diagnostics; 2) a primer of microarray research,…

  2. Understanding the effects of pre-processing on extracted signal features from gait accelerometry signals

    PubMed Central

    Millecamps, Alexandre; Brach, Jennifer S.; Lowry, Kristin A.; Perera, Subashan; Redfern, Mark S.

    2015-01-01

    Gait accelerometry is an important approach for gait assessment. Previous contributions have adopted various pre-processing approaches for gait accelerometry signals, but none have thoroughly investigated the effects of such pre-processing operations on the obtained results. Therefore, this paper investigated the influence of pre-processing operations on signal features extracted from gait accelerometry signals. These signals were collected from 35 participants aged over 65 years: 14 of them were healthy controls (HC), 10 had Parkinson’s disease (PD) and 11 had peripheral neuropathy (PN). The participants walked on a treadmill at their preferred speed. Signal features in the time, frequency and time-frequency domains were computed for both raw and pre-processed signals. The pre-processing stage consisted of applying tilt correction and de-noising operations to the acquired signals. We first examined the effects of these operations separately, followed by an investigation of their joint effects. Several important observations were made based on the obtained results. First, the denoising operation alone had almost no effect in comparison to the trends observed in the raw data. Second, the tilt correction affected the reported results to a certain degree, which could lead to a better discrimination between groups. Third, the combination of the two pre-processing operations yielded similar trends as the tilt correction alone. These results indicate that while gait accelerometry is a valuable approach for gait assessment, one has to carefully adopt any pre-processing steps as they alter the observed findings. PMID:25935124
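
    The two operations studied here are commonly implemented as follows: tilt correction rotates the axes so the mean acceleration (assumed to be gravity) aligns with the vertical, and denoising applies a zero-phase low-pass filter. A hedged sketch of both under those assumed conventions, not the paper's exact pipeline:

```python
import numpy as np
from scipy.signal import butter, filtfilt

def tilt_correct(acc):
    """Rotate tri-axial accelerometer data so the mean acceleration
    vector (assumed to be gravity) maps onto the vertical axis."""
    g = acc.mean(axis=0)
    g /= np.linalg.norm(g)
    z = np.array([0.0, 0.0, 1.0])
    v = np.cross(g, z)
    c, s = np.dot(g, z), np.linalg.norm(v)
    if s < 1e-12:                       # already aligned with vertical
        return acc.copy()
    vx = np.array([[0, -v[2], v[1]], [v[2], 0, -v[0]], [-v[1], v[0], 0]])
    R = np.eye(3) + vx + vx @ vx * ((1 - c) / s**2)   # Rodrigues' formula
    return acc @ R.T

def denoise(acc, fs=100.0, cutoff=20.0):
    """Zero-phase low-pass Butterworth filter applied to each axis."""
    b, a = butter(4, cutoff / (fs / 2), btype="low")
    return filtfilt(b, a, acc, axis=0)

# Toy data: gravity tilted 10 degrees plus sensor noise.
rng = np.random.default_rng(0)
th = np.deg2rad(10)
acc = np.tile([np.sin(th), 0.0, np.cos(th)], (1000, 1))
acc += 0.02 * rng.standard_normal((1000, 3))
corrected = denoise(tilt_correct(acc))
print(corrected.mean(axis=0).round(3))  # approximately [0, 0, 1]
```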

  4. Biclustering of microarray data with MOSPO based on crowding distance

    PubMed Central

    Liu, Junwan; Li, Zhoujun; Hu, Xiaohua; Chen, Yiming

    2009-01-01

    Background High-throughput microarray technologies have generated and accumulated massive amounts of gene expression datasets that contain expression levels of thousands of genes under hundreds of different experimental conditions. The microarray datasets are usually presented in 2D matrices, where rows represent genes and columns represent experimental conditions. The analysis of such datasets can discover local structures composed by sets of genes that show coherent expression patterns under subsets of experimental conditions. It leads to the development of sophisticated algorithms capable of extracting novel and useful knowledge from a biomedical point of view. In the medical domain, these patterns are useful for understanding various diseases, and aid in more accurate diagnosis, prognosis, treatment planning, as well as drug discovery. Results In this work we present the CMOPSOB (Crowding distance based Multi-objective Particle Swarm Optimization Biclustering), a novel clustering approach for microarray datasets to cluster genes and conditions highly related in sub-portions of the microarray data. The objective of biclustering is to find sub-matrices, i.e. maximal subgroups of genes and subgroups of conditions where the genes exhibit highly correlated activities over a subset of conditions. Since these objectives are mutually conflicting, they become suitable candidates for multi-objective modelling. Our approach CMOPSOB is based on a heuristic search technique, multi-objective particle swarm optimization, which simulates the movements of a flock of birds which aim to find food. In the meantime, the nearest neighbour search strategies based on crowding distance and ϵ-dominance can rapidly converge to the Pareto front and guarantee diversity of solutions. We compare the potential of this methodology with other biclustering algorithms by analyzing two common and public datasets of gene expression profiles. In all cases our method can find localized structures
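
    Crowding distance, the NSGA-II diversity measure that CMOPSOB borrows for leader selection, is compact in code: for each objective, sort the front and accumulate the normalized gap between each solution's neighbours, with boundary solutions set to infinity. A generic sketch, not the CMOPSOB implementation:

```python
import numpy as np

def crowding_distance(objectives):
    """objectives: (n_solutions, n_objectives) array for one Pareto front.
    Returns the NSGA-II crowding distance of each solution; boundary
    solutions get infinity so they are always preferred as leaders."""
    n, m = objectives.shape
    dist = np.zeros(n)
    for j in range(m):
        order = np.argsort(objectives[:, j])
        span = objectives[order[-1], j] - objectives[order[0], j]
        dist[order[0]] = dist[order[-1]] = np.inf
        if span == 0:
            continue
        # Normalized gap between each interior solution's two neighbours.
        gaps = (objectives[order[2:], j] - objectives[order[:-2], j]) / span
        dist[order[1:-1]] += gaps
    return dist

front = np.array([[1.0, 5.0], [2.0, 3.0], [3.0, 2.0], [5.0, 1.0]])
print(crowding_distance(front))  # inf for extremes, larger = less crowded
```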

  5. Automated Pre-processing for NMR Assignments with Reduced Tedium

    Energy Science and Technology Software Center (ESTSC)

    2004-05-11

    An important rate-limiting step in the resonance assignment process is accurate identification of resonance peaks in NMR spectra. NMR spectra are noisy. Hence, automatic peak-picking programs must navigate between the Scylla of reliable but incomplete picking and the Charybdis of noisy but complete picking. Each of these extremes complicates the assignment process: incomplete peak-picking results in the loss of essential connectivities, while noisy picking conceals the true connectivities under a combinatorial explosion of false positives. Intermediate processing can simplify the assignment process by preferentially removing false peaks from noisy peak lists. This is accomplished by requiring consensus between multiple NMR experiments, exploiting a priori information about NMR spectra, and drawing on empirical statistical distributions of chemical shifts extracted from the BioMagResBank. Experienced NMR practitioners currently apply many of these techniques "by hand", which is tedious, and may appear arbitrary to the novice. To increase efficiency, we have created a systematic and automated approach to this process, known as APART. Automated pre-processing has three main advantages: reduced tedium, standardization, and pedagogy. In the hands of experienced spectroscopists, the main advantage is reduced tedium (a rapid increase in the ratio of true peaks to false peaks with minimal effort). When a project is passed from hand to hand, the main advantage is standardization. APART automatically documents the peak filtering process by archiving its original recommendations, the accompanying justifications, and whether a user accepted or overrode a given filtering recommendation. In the hands of a novice, this tool can reduce the stumbling block of learning to differentiate between real peaks and noise, by providing real-time examples of how such decisions are made.
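
    One of the filters mentioned, consensus between multiple NMR experiments, reduces to keeping a picked peak only when enough independent peak lists corroborate it within a chemical-shift tolerance. A hypothetical sketch of that single filter (not the APART code; the tolerance and vote threshold are illustrative):

```python
def consensus_filter(peaks, reference_lists, tol=0.05, min_votes=2):
    """Keep a peak only if at least `min_votes` of the reference peak
    lists contain a peak within `tol` ppm of it.
    peaks, reference_lists[i]: lists of 1-D chemical shifts (ppm)."""
    kept = []
    for p in peaks:
        votes = sum(
            any(abs(p - q) <= tol for q in ref) for ref in reference_lists
        )
        if votes >= min_votes:
            kept.append(p)
    return kept

noisy = [8.21, 8.90, 7.43, 6.02]            # picked from a noisy spectrum
refs = [[8.20, 7.45], [8.23, 7.40, 3.10]]   # two corroborating experiments
print(consensus_filter(noisy, refs))        # [8.21, 7.43]
```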

  6. Ontology-Based Analysis of Microarray Data.

    PubMed

    Giuseppe, Agapito; Milano, Marianna

    2016-01-01

    The importance of semantic-based methods and algorithms for the analysis and management of biological data is growing for two main reasons. From a biological side, the knowledge contained in ontologies is increasingly accurate and complete; from a computational side, recent algorithms make valuable use of such knowledge. Here we focus on semantic-based management and analysis of protein interaction networks, referring to all approaches that analyse protein-protein interaction data using knowledge encoded in biological ontologies. Semantic approaches for studying high-throughput data have been largely used in the past to mine genomic and expression data. Recently, the emergence of network approaches for investigating molecular machineries has stimulated, in parallel, the introduction of semantic-based techniques for the analysis and management of network data. The application of these computational approaches to the study of microarray data can broaden their application scenario and simultaneously help the understanding of disease development and progression. PMID:25971913

  7. Tissue microarrays: applications in genomic research.

    PubMed

    Watanabe, Aprill; Cornelison, Robert; Hostetter, Galen

    2005-03-01

    The widespread application of tissue microarrays in cancer research and the clinical pathology laboratory demonstrates a versatile and portable technology. The rapid integration of tissue microarrays into biomarker discovery and validation processes reflects the forward thinking of researchers who have pioneered the high-density tissue microarray. The precise arrangement of hundreds of archival clinical tissue samples into a composite tissue microarray block is now a proven method for the efficient and standardized analysis of molecular markers. With applications in cancer research, tissue microarrays are a valuable tool in validating candidate markers discovered in highly sensitive genome-wide microarray experiments. With applications in clinical pathology, tissue microarrays are used widely in immunohistochemistry quality control and quality assurance. The timeline of a biomarker implicated in prostate neoplasia, which was identified by complementary DNA expression profiling, validated by tissue microarrays and is now used as a prognostic immunohistochemistry marker, is reviewed. The tissue microarray format provides opportunities for digital imaging acquisition, image processing and database integration. Advances in digital imaging help to alleviate previous bottlenecks in the research pipeline, permit computer image scoring and convey telepathology opportunities for remote image analysis. The tissue microarray industry now includes public and private sectors with varying degrees of research utility and offers a range of potential tissue microarray applications in basic research, prognostic oncology and drug discovery. PMID:15833047

  8. The Current Status of DNA Microarrays

    NASA Astrophysics Data System (ADS)

    Shi, Leming; Perkins, Roger G.; Tong, Weida

    DNA microarray technology that allows simultaneous assay of thousands of genes in a single experiment has steadily advanced to become a mainstream method used in research, and has reached a stage that envisions its use in medical applications and personalized medicine. Many different strategies have been developed for manufacturing DNA microarrays. In this chapter, we discuss the manufacturing characteristics of seven microarray platforms that were used in a recently completed large study by the MicroArray Quality Control (MAQC) consortium, which evaluated the concordance of results across these platforms. The platforms can be grouped into three categories: (1) in situ synthesis of oligonucleotide probes on microarrays (Affymetrix GeneChip® arrays based on photolithography synthesis and Agilent's arrays based on inkjet synthesis); (2) spotting of presynthesized oligonucleotide probes on microarrays (GE Healthcare's CodeLink system, Applied Biosystems' Genome Survey Microarrays, and the custom microarrays printed with Operon's oligonucleotide set); and (3) deposition of presynthesized oligonucleotide probes on bead-based microarrays (Illumina's BeadChip microarrays). We conclude this chapter with our views on the challenges and opportunities toward acceptance of DNA microarray data in clinical and regulatory settings.

  10. Hyperspectral microarray scanning: impact on the accuracy and reliability of gene expression data

    PubMed Central

    Timlin, Jerilyn A; Haaland, David M; Sinclair, Michael B; Aragon, Anthony D; Martinez, M Juanita; Werner-Washburne, Margaret

    2005-01-01

    Background Commercial microarray scanners and software cannot distinguish between spectrally overlapping emission sources, and hence cannot accurately identify or correct for emissions not originating from the labeled cDNA. We employed our hyperspectral microarray scanner coupled with multivariate data analysis algorithms that independently identify and quantitate emissions from all sources to investigate three artifacts that reduce the accuracy and reliability of microarray data: skew toward the green channel, dye separation, and variable background emissions. Results Here we demonstrate that several common microarray artifacts resulted from the presence of emission sources other than the labeled cDNA that can dramatically alter the accuracy and reliability of the array data. The microarrays utilized in this study were representative of a wide cross-section of the microarrays currently employed in genomic research. These findings reinforce the need for careful attention to detail to recognize and subsequently eliminate or quantify the presence of extraneous emissions in microarray images. Conclusion Hyperspectral scanning together with multivariate analysis offers a unique and detailed understanding of the sources of microarray emissions after hybridization. This opportunity to simultaneously identify and quantitate contaminant and background emissions in microarrays markedly improves the reliability and accuracy of the data and permits a level of quality control of microarray emissions previously unachievable. Using these tools, we can not only quantify the extent and contribution of extraneous emission sources to the signal, but also determine the consequences of failing to account for them and gain the insight necessary to adjust preparation protocols to prevent such problems from occurring. PMID:15888208
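
    Given reference emission spectra for the dyes and the contaminating sources, separating spectrally overlapping emissions is a per-pixel linear unmixing problem. A minimal sketch with nonnegative least squares (the Gaussian spectra are invented for illustration; the paper's multivariate analysis is more sophisticated):

```python
import numpy as np
from scipy.optimize import nnls

wl = np.linspace(500, 700, 120)                  # emission wavelengths (nm)
gauss = lambda mu, s: np.exp(-((wl - mu) / s) ** 2)
# Invented reference spectra: two dyes plus a broad background source.
components = np.column_stack([gauss(570, 15), gauss(670, 15), gauss(600, 80)])

# Simulated pixel spectrum: dye 2 plus background, no dye 1.
true_abundances = np.array([0.0, 1.2, 0.4])
pixel = components @ true_abundances
pixel += 0.01 * np.random.default_rng(0).standard_normal(wl.size)

# Nonnegative least squares recovers per-source abundances, so the dye
# signal can be reported free of background and spectral cross-talk.
abundances, _ = nnls(components, pixel)
print(abundances.round(2))                       # close to [0.0, 1.2, 0.4]
```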

  11. Microarray analysis in pulmonary hypertension.

    PubMed

    Hoffmann, Julia; Wilhelm, Jochen; Olschewski, Andrea; Kwapiszewska, Grazyna

    2016-07-01

    Microarrays are a powerful and effective tool that allows the detection of genome-wide gene expression differences between controls and disease conditions. They have been broadly applied to investigate the pathobiology of diverse forms of pulmonary hypertension, namely group 1, including patients with idiopathic pulmonary arterial hypertension, and group 3, including pulmonary hypertension associated with chronic lung diseases such as chronic obstructive pulmonary disease and idiopathic pulmonary fibrosis. To date, numerous human microarray studies have been conducted to analyse global (lung homogenate samples), compartment-specific (laser capture microdissection), cell type-specific (isolated primary cells) and circulating cell (peripheral blood) expression profiles. Combined, they provide important information on development, progression and the end-stage disease. In the future, system biology approaches, expression of noncoding RNAs that regulate coding RNAs, and direct comparison between animal models and human disease might be of importance. PMID:27076594

  13. Phenotypic MicroRNA Microarrays

    PubMed Central

    Kwon, Yong-Jun; Heo, Jin Yeong; Kim, Hi Chul; Kim, Jin Yeop; Liuzzi, Michel; Soloveva, Veronica

    2013-01-01

    Microarray technology has become a very popular approach in cases where multiple experiments need to be conducted repeatedly or done with a variety of samples. In our lab, we are applying our high-density spot microarray approach to microscopy visualization of the effects of transiently introduced siRNA or cDNA on cellular morphology or phenotype. In this publication, we discuss the possibility of using this micro-scale high-throughput process to study the role of microRNAs in the biology of selected cellular models. After reverse-transfection of microRNAs and siRNA, the resulting cellular phenotype showed that the microRNAs regulated NF-κB expression comparably to the siRNA. The ability to print microRNA molecules for reverse transfection into cells opens up a wide horizon for phenotypic high-content screening of microRNA libraries using cellular disease models.

  14. Self-Assembling Protein Microarrays

    NASA Astrophysics Data System (ADS)

    Ramachandran, Niroshan; Hainsworth, Eugenie; Bhullar, Bhupinder; Eisenstein, Samuel; Rosen, Benjamin; Lau, Albert Y.; C. Walter, Johannes; LaBaer, Joshua

    2004-07-01

    Protein microarrays provide a powerful tool for the study of protein function. However, they are not widely used, in part because of the challenges in producing proteins to spot on the arrays. We generated protein microarrays by printing complementary DNAs onto glass slides and then translating target proteins with mammalian reticulocyte lysate. Epitope tags fused to the proteins allowed them to be immobilized in situ. This obviated the need to purify proteins, avoided protein stability problems during storage, and captured sufficient protein for functional studies. We used the technology to map pairwise interactions among 29 human DNA replication initiation proteins, recapitulate the regulation of Cdt1 binding to select replication proteins, and map its geminin-binding domain.

  15. Washing scaling of GeneChip microarray expression

    PubMed Central

    2010-01-01

    Background Post-hybridization washing is an essential part of microarray experiments. Both the quality of the experimental washing protocol and adequate consideration of washing in intensity calibration ultimately affect the quality of the expression estimates extracted from the microarray intensities. Results We conducted experiments on GeneChip microarrays with altered protocols for washing, scanning and staining to study the probe-level intensity changes as a function of the number of washing cycles. For calibration and analysis of the intensity data we make use of the 'hook' method which allows intensity contributions due to non-specific and specific hybridization of perfect match (PM) and mismatch (MM) probes to be disentangled in a sequence specific manner. On average, washing according to the standard protocol removes about 90% of the non-specific background and about 30-50% and less than 10% of the specific targets from the MM and PM, respectively. Analysis of the washing kinetics shows that the signal-to-noise ratio doubles roughly every ten stringent washing cycles. Washing can be characterized by time-dependent rate constants which reflect the heterogeneous character of target binding to microarray probes. We propose an empirical washing function which estimates the survival of probe bound targets. It depends on the intensity contribution due to specific and non-specific hybridization per probe which can be estimated for each probe using existing methods. The washing function allows probe intensities to be calibrated for the effect of washing. On a relative scale, proper calibration for washing markedly increases expression measures, especially in the limit of small and large values. Conclusions Washing is among the factors which potentially distort expression measures. The proposed first-order correction method allows direct implementation in existing calibration algorithms for microarray data. We provide an experimental 'washing data set' which might
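
    The flavour of such a correction can be conveyed with a toy two-component model: measured intensity is the sum of specific and non-specific contributions, each decaying with its own washing rate, and calibration divides by the estimated survival fraction of the specific signal. The single-exponential form and rate constants below are illustrative assumptions, not the paper's fitted washing function; k_n is chosen so the signal-to-noise ratio roughly doubles every ten cycles, in line with the abstract.

```python
import numpy as np

def washed_intensity(S, N, cycles, k_s=0.01, k_n=0.08):
    """Toy washing model: specifically (S) and non-specifically (N) bound
    targets decay exponentially with the number of stringent washing
    cycles; k_n - k_s = 0.07 makes the signal-to-noise ratio double
    roughly every ten cycles (illustrative values)."""
    return S * np.exp(-k_s * cycles) + N * np.exp(-k_n * cycles)

def wash_corrected(intensity, cycles, k_s=0.01):
    """First-order correction: rescale the measured intensity by the
    estimated survival fraction of specifically bound targets."""
    return intensity / np.exp(-k_s * cycles)

S, N = 1000.0, 900.0                 # true specific / non-specific signal
for n in (0, 10, 30):
    I = washed_intensity(S, N, n)
    print(n, round(I, 1), round(wash_corrected(I, n), 1))
```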

  16. Reordering based integrative expression profiling for microarray classification

    PubMed Central

    2012-01-01

    Background Current network-based microarray analysis uses the information of interactions among concerned genes/gene products, but still considers each gene expression individually. We propose an organized knowledge-supervised approach - Integrative eXpression Profiling (IXP) - to improve microarray classification accuracy and help discover groups of genes that have been too weak to detect individually by traditional methods. To implement IXP, an ant colony optimization reordering (ACOR) algorithm is used to group functionally related genes in an ordered way. Results Using Alzheimer's disease (AD) as an example, we demonstrate how to apply the ACOR-based IXP approach to microarray classification. Using a microarray dataset - GSE1297, with 31 samples - as the training set, the result of blinded classification on another microarray dataset - GSE5281, with 151 samples - shows that our approach can improve accuracy from 74.83% to 82.78%. A recently-published 1372-probe signature for AD can only achieve 61.59% accuracy under the same conditions. The ACOR-based IXP approach also outperforms IXP approaches based on classic network ranking, graph clustering, and random-ordering methods in an overall classification performance comparison. Conclusions The ACOR-based IXP approach can serve as a knowledge-supervised feature transformation approach that increases classification accuracy dramatically, by transforming each gene expression profile into an integrated expression profile whose features are input into standard classifiers. The IXP approach integrates both gene expression information and organized knowledge - disease gene/protein network topology - represented as both network node weights (local topological properties) and network node orders (global topological characteristics). PMID:22536860

  17. Performance of Multi-User Transmitter Pre-Processing Assisted Multi-Cell IDMA System for Downlink Transmission

    NASA Astrophysics Data System (ADS)

    Partibane, B.; Nagarajan, V.; Vishvaksenan, K. S.; Kalidoss, R.

    2015-06-01

    In this paper, we present the performance of a multi-user transmitter pre-processing (MUTP) assisted coded interleave-division multiple access (IDMA) system over correlated frequency-selective channels for downlink communication. We realize MUTP using the singular value decomposition (SVD) technique, which exploits the channel state information (CSI) of all the active users acquired via feedback channels. We consider the MUTP technique to alleviate the effects of co-channel interference (CCI) and multiple access interference (MAI). To be specific, we estimate the CSI using a least square error (LSE) algorithm at each of the mobile stations (MSs), perform vector quantization using Lloyd's algorithm, and feed back the bits representing the quantized magnitudes and phases to the base station (BS) through a dedicated low-rate noisy channel. Finally, we recover the quantized bits at the BS to formulate the pre-processing matrix. The performance of MUTP-aided IDMA systems is evaluated for five types of delay spread distributions pertaining to long-term evolution (LTE) and Stanford University Interim (SUI) channel models. We also compare the performance of MUTP with a minimum mean square error (MMSE) detector for the coded IDMA system. The considered pre-processing scheme alleviates the effects of CCI with less complex signal detection at the MSs when compared to the MMSE detector. Further, our simulation results reveal that the SVD-based MUTP assisted coded IDMA system outperforms the MMSE detector in terms of achievable bit error rate (BER) with a low signal-to-noise ratio (SNR) requirement by mitigating the effects of CCI and MAI.
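
    The core of SVD-based transmitter pre-processing can be shown in isolation: precoding with the right singular vectors of the channel and shaping with the left singular vectors turns the MIMO link into parallel scalar subchannels, which is what removes the interference. A toy flat-fading numpy sketch with perfect CSI and no noise, far simpler than the paper's coded IDMA chain:

```python
import numpy as np

rng = np.random.default_rng(0)
nt, nr = 4, 4
# Flat-fading MIMO channel assumed known at the base station via feedback.
H = (rng.standard_normal((nr, nt)) + 1j * rng.standard_normal((nr, nt))) / np.sqrt(2)

U, s, Vh = np.linalg.svd(H)
x = rng.choice([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j], size=nt)  # QPSK symbols

tx = Vh.conj().T @ x            # transmitter pre-processing: precode with V
y = H @ tx                      # propagation (noise omitted for clarity)
rx = U.conj().T @ y             # receiver shaping with U^H

# The pre-processed link decouples into parallel scalar subchannels:
print(np.allclose(rx, s * x))   # True: rx_k = sigma_k * x_k
```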

  18. Optical detection of nanoparticle-enhanced human papillomavirus genotyping microarrays.

    PubMed

    Li, Xue Zhe; Kim, Sookyung; Cho, Wonhyung; Lee, Seung-Yop

    2013-02-01

    In this study, we propose a new detection method for nanoparticle-enhanced human papillomavirus (HPV) genotyping microarrays using a DVD optical pick-up with a photodiode. The HPV genotyping DNA chip was labeled using Au/Ag core-shell nanoparticles prepared on a treated glass substrate. The biological information of the HPV genotyping target DNA was then detected, for cervical cancer diagnosis, by measuring the difference in optical signal between the DNA spots and the background. Moreover, an approximately linear relationship between the concentration of the HPV genotyping target DNA and the optical signal, which depends on the density of Au/Ag core-shell nanoparticles, was obtained by applying a spot-finding algorithm. It is shown that the nanoparticle-labeled HPV genotyping target DNA can be measured and quantified by collecting the low-cost photodiode signal on the treated glass chip, replacing high-cost fluorescence microarray scanners that use a photomultiplier tube. PMID:23413051

  19. A Hybrid BPSO-CGA Approach for Gene Selection and Classification of Microarray Data

    PubMed Central

    Chuang, Li-Yeh; Yang, Cheng-Huei; Li, Jung-Chike

    2012-01-01

    Microarray analysis promises to detect variations in gene expression and changes in the transcription rates of an entire genome in vivo. Microarray gene expression profiles indicate the relative abundance of mRNA corresponding to the genes. The selection of relevant genes from microarray data poses a formidable challenge to researchers due to the high dimensionality of features, the multiclass categories involved, and the usually small sample size. A classification process is often employed to decrease the dimensionality of the microarray data. In order to correctly analyze microarray data, the goal is to find an optimal subset of features (genes) that adequately represents the original set of features. A hybrid method of binary particle swarm optimization (BPSO) and a combat genetic algorithm (CGA) is proposed to perform the gene selection. The K-nearest neighbor (K-NN) method with leave-one-out cross-validation (LOOCV) served as a classifier. The proposed BPSO-CGA approach is evaluated on ten microarray data sets from the literature. The experimental results indicate that the proposed method not only effectively reduces the number of selected genes, but also achieves a low classification error rate. PMID:21210743
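
    The fitness evaluation at the heart of such wrappers, K-NN accuracy under LOOCV for a candidate gene mask, is straightforward with scikit-learn. A hedged sketch of the fitness function alone (the BPSO/CGA search itself is omitted and the data are synthetic):

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import LeaveOneOut, cross_val_score

rng = np.random.default_rng(0)
X = rng.standard_normal((40, 200))           # 40 samples, 200 "genes"
y = (X[:, 3] + X[:, 17] > 0).astype(int)     # two informative genes

def fitness(mask, X, y, k=1):
    """LOOCV accuracy of a K-NN classifier on the selected genes.
    `mask` is a binary vector, one bit per gene (a BPSO particle)."""
    if mask.sum() == 0:
        return 0.0
    knn = KNeighborsClassifier(n_neighbors=k)
    scores = cross_val_score(knn, X[:, mask.astype(bool)], y, cv=LeaveOneOut())
    return scores.mean()

all_genes = np.ones(200, dtype=int)
good_subset = np.zeros(200, dtype=int)
good_subset[[3, 17]] = 1
print(fitness(all_genes, X, y), fitness(good_subset, X, y))
```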

  20. Multisensor data fusion algorithm development

    SciTech Connect

    Yocky, D.A.; Chadwick, M.D.; Goudy, S.P.; Johnson, D.K.

    1995-12-01

    This report presents a two-year LDRD research effort into multisensor data fusion. We approached the problem by addressing the available types of data, preprocessing that data, and developing fusion algorithms using that data. The report reflects these three distinct areas. First, the possible data sets for fusion are identified. Second, automated registration techniques for imagery data are analyzed. Third, two fusion techniques are presented. The first fusion algorithm is based on the two-dimensional discrete wavelet transform. Using test images, the wavelet algorithm is compared against intensity modulation and intensity-hue-saturation image fusion algorithms that are available in commercial software. The wavelet approach outperforms the other two fusion techniques by preserving spectral/spatial information more precisely. The wavelet fusion algorithm was also applied to Landsat Thematic Mapper and SPOT panchromatic imagery data. The second algorithm is based on a linear-regression technique. We analyzed the technique using the same Landsat and SPOT data.
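
    A common form of wavelet-based fusion, in the spirit of the report's first algorithm: decompose both registered images, average the approximation band, keep the larger-magnitude detail coefficient at each position, and reconstruct. A minimal PyWavelets sketch of that generic scheme, not the report's exact algorithm:

```python
import numpy as np
import pywt

def wavelet_fuse(a, b, wavelet="db2", level=2):
    """Fuse two registered grayscale images of equal size by averaging
    approximation coefficients and taking the larger-magnitude detail
    coefficient at each position (preserves edges from both inputs)."""
    ca = pywt.wavedec2(a, wavelet, level=level)
    cb = pywt.wavedec2(b, wavelet, level=level)
    fused = [(ca[0] + cb[0]) / 2.0]                 # approximation band
    for da, db in zip(ca[1:], cb[1:]):              # detail bands per level
        fused.append(tuple(np.where(np.abs(x) >= np.abs(y), x, y)
                           for x, y in zip(da, db)))
    return pywt.waverec2(fused, wavelet)

# Toy inputs: same scene, complementary detail.
x = np.outer(np.linspace(0, 1, 64), np.ones(64))    # smooth gradient
y = np.zeros((64, 64))
y[20:44, 20:44] = 1.0                               # sharp square
print(wavelet_fuse(x, y).shape)                     # (64, 64)
```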

  1. ArrayWiki: an enabling technology for sharing public microarray data repositories and meta-analyses

    PubMed Central

    Stokes, Todd H; Torrance, JT; Li, Henry; Wang, May D

    2008-01-01

    Background A survey of microarray databases reveals that most of the repository contents and data models are heterogeneous (i.e., data obtained from different chip manufacturers), and that the repositories provide only basic biological keywords linking to PubMed. As a result, it is difficult to find datasets using research context or analysis parameter information beyond a few keywords. For example, to reduce the "curse-of-dimension" problem in microarray analysis, the number of samples is often increased by merging array data from different datasets. Knowing chip data parameters such as pre-processing steps (e.g., normalization, artefact removal, etc.), and knowing any previous biological validation of the dataset, is essential due to the heterogeneity of the data. However, most microarray repositories do not have meta-data information in the first place, and do not have a mechanism to add or insert this information. Thus, there is a critical need to create "intelligent" microarray repositories that (1) enable update of meta-data with the raw array data, and (2) provide standardized archiving protocols to minimize bias from the raw data sources. Results To address the problems discussed, we have developed a community-maintained system called ArrayWiki that unites disparate meta-data of microarray meta-experiments from multiple primary sources with four key features. First, ArrayWiki provides a user-friendly knowledge management interface in addition to a programmable interface using standards developed by Wikipedia. Second, ArrayWiki includes automated quality control processes (caCORRECT) and novel visualization methods (BioPNG, Gel Plots), which provide extra information about data quality unavailable in other microarray repositories. Third, it provides a user-curation capability through the familiar Wiki interface. Fourth, ArrayWiki provides users with simple text-based searches across all experiment meta-data, and exposes data to search engine crawlers

  2. Gene Expression Browser: large-scale and cross-experiment microarray data integration, management, search & visualization

    PubMed Central

    2010-01-01

    Background In the last decade, a large amount of microarray gene expression data has been accumulated in public repositories. Integrating and analyzing high-throughput gene expression data have become key activities for exploring gene functions, gene networks and biological pathways. Effectively utilizing these invaluable microarray data remains challenging due to a lack of powerful tools to integrate large-scale gene-expression information across diverse experiments and to search and visualize a large number of gene-expression data points. Results Gene Expression Browser is a microarray data integration, management and processing system with web-based search and visualization functions. An innovative method has been developed to define a treatment over a control for every microarray experiment to standardize and make microarray data from different experiments homogeneous. In the browser, data are pre-processed offline and the resulting data points are visualized online with a 2-layer dynamic web display. Users can view all treatments over control that affect the expression of a selected gene via Gene View, and view all genes that change in a selected treatment over control via treatment over control View. Users can also check the changes of expression profiles of a set of either the treatments over control or genes via Slide View. In addition, the relationships between genes and treatments over control are computed according to gene expression ratio and are shown as co-responsive genes and co-regulation treatments over control. Conclusion Gene Expression Browser is composed of a set of software tools, including a data extraction tool, a microarray data-management system, a data-annotation tool, a microarray data-processing pipeline, and a data search & visualization tool. The browser is deployed as a free public web service (http://www.ExpressionBrowser.com) that integrates 301 ATH1 gene microarray experiments from public data repositories (viz. the Gene

  3. Integrated Amplification Microarrays for Infectious Disease Diagnostics

    PubMed Central

    Chandler, Darrell P.; Bryant, Lexi; Griesemer, Sara B.; Gu, Rui; Knickerbocker, Christopher; Kukhtin, Alexander; Parker, Jennifer; Zimmerman, Cynthia; George, Kirsten St.; Cooney, Christopher G.

    2012-01-01

    This overview describes microarray-based tests that combine solution-phase amplification chemistry and microarray hybridization within a single microfluidic chamber. The integrated biochemical approach improves microarray workflow for diagnostic applications by reducing the number of steps and minimizing the potential for sample or amplicon cross-contamination. Examples described herein illustrate a basic, integrated approach for DNA and RNA genomes, and a simple consumable architecture for incorporating wash steps while retaining an entirely closed system. It is anticipated that integrated microarray biochemistry will provide an opportunity to significantly reduce the complexity and cost of microarray consumables, equipment, and workflow, which in turn will enable a broader spectrum of users to exploit the intrinsic multiplexing power of microarrays for infectious disease diagnostics.

  4. THE ABRF MARG MICROARRAY SURVEY 2005: TAKING THE PULSE ON THE MICROARRAY FIELD

    EPA Science Inventory

    Over the past several years microarray technology has evolved into a critical component of any discovery based program. Since 1999, the Association of Biomolecular Resource Facilities (ABRF) Microarray Research Group (MARG) has conducted biennial surveys designed to generate a pr...

  5. Living Cell Microarrays: An Overview of Concepts.

    PubMed

    Jonczyk, Rebecca; Kurth, Tracy; Lavrentieva, Antonina; Walter, Johanna-Gabriela; Scheper, Thomas; Stahl, Frank

    2016-01-01

    Living cell microarrays are a highly efficient cellular screening system. Due to the low number of cells required per spot, cell microarrays enable the use of primary and stem cells and provide resolution close to the single-cell level. Apart from a variety of conventional static designs, microfluidic microarray systems have also been established. An alternative format is a microarray consisting of three-dimensional cell constructs ranging from cell spheroids to cells encapsulated in hydrogel. These systems provide an in vivo-like microenvironment and are preferably used for the investigation of cellular physiology, cytotoxicity, and drug screening. Thus, many different high-tech microarray platforms are currently available. Disadvantages of many systems include their high cost, the requirement of specialized equipment for their manufacture, and the poor comparability of results between different platforms. In this article, we provide an overview of static, microfluidic, and 3D cell microarrays. In addition, we describe a simple method for the printing of living cell microarrays on modified microscope glass slides using standard DNA microarray equipment available in most laboratories. Applications in research and diagnostics are discussed, e.g., the selective and sensitive detection of biomarkers. Finally, we highlight current limitations and the future prospects of living cell microarrays. PMID:27600077

  7. Protein microarrays as tools for functional proteomics.

    PubMed

    LaBaer, Joshua; Ramachandran, Niroshan

    2005-02-01

    Protein microarrays present an innovative and versatile approach to study protein abundance and function at an unprecedented scale. Given the chemical and structural complexity of the proteome, the development of protein microarrays has been challenging. Despite these challenges there has been a marked increase in the use of protein microarrays to map interactions of proteins with various other molecules, and to identify potential disease biomarkers, especially in the area of cancer biology. In this review, we discuss some of the promising advances made in the development and use of protein microarrays. PMID:15701447

  8. Photoelectrochemical synthesis of DNA microarrays

    PubMed Central

    Chow, Brian Y.; Emig, Christopher J.; Jacobson, Joseph M.

    2009-01-01

    Optical addressing of semiconductor electrodes represents a powerful technology that enables the independent and parallel control of a very large number of electrical phenomena at the solid-electrolyte interface. To date, it has been used in a wide range of applications including electrophoretic manipulation, biomolecule sensing, and stimulating networks of neurons. Here, we have adapted this approach for the parallel addressing of redox reactions, and report the construction of a DNA microarray synthesis platform based on semiconductor photoelectrochemistry (PEC). An amorphous silicon photoconductor is activated by an optical projection system to create virtual electrodes capable of electrochemically generating protons; these PEC-generated protons then cleave the acid-labile dimethoxytrityl protecting groups of DNA phosphoramidite synthesis reagents with the requisite spatial selectivity to generate DNA microarrays. Furthermore, a thin-film porous glass dramatically increases the amount of DNA synthesized per chip by over an order of magnitude versus uncoated glass. This platform demonstrates that PEC can be used toward combinatorial bio-polymer and small molecule synthesis. PMID:19706433

  9. THE ABRF-MARG MICROARRAY SURVEY 2004: TAKING THE PULSE OF THE MICROARRAY FIELD

    EPA Science Inventory

    Over the past several years, the field of microarrays has grown and evolved drastically. In its continued efforts to track this evolution, the ABRF-MARG has once again conducted a survey of international microarray facilities and individual microarray users. The goal of the surve...

  10. 2008 Microarray Research Group (MARG Survey): Sensing the State of Microarray Technology

    EPA Science Inventory

    Over the past several years, the field of microarrays has grown and evolved drastically. In its continued efforts to track this evolution and transformation, the ABRF-MARG has once again conducted a survey of international microarray facilities and individual microarray users. Th...

  11. Nucleosome positioning from tiling microarray data

    PubMed Central

    Yassour, Moran; Kaplan, Tommy; Jaimovich, Ariel; Friedman, Nir

    2008-01-01

    Motivation: The packaging of DNA around nucleosomes in eukaryotic cells plays a crucial role in the regulation of gene expression and other DNA-related processes. To better understand the regulatory role of nucleosomes, it is important to pinpoint their positions at high (5–10 bp) resolution. Toward this end, several recent works used dense tiling arrays to map nucleosomes in a high-throughput manner. These data were then parsed and hand-curated, and the positions of nucleosomes were assessed. Results: In this manuscript, we present a fully automated algorithm to analyze such data and predict the exact location of nucleosomes. We introduce a method, based on a probabilistic graphical model, to increase the resolution of our predictions even beyond that of the microarray used. We show how to build such a model and how to compile it into a simple Hidden Markov Model, allowing for fast and accurate inference of nucleosome positions. We applied our model to nucleosomal data from mid-log yeast cells reported by Yuan et al. and compared our predictions to those of the original paper; to a more recent method that uses five times denser tiling arrays, as explained by Lee et al.; and to a curated set of literature-based nucleosome positions. Our results suggest that by applying our algorithm to the same data used by Yuan et al., our fully automated model traced 13% more nucleosomes and increased the overall accuracy by about 20%. We believe that such an improvement opens the way for a better understanding of the regulatory mechanisms controlling gene expression, and how they are encoded in the DNA. Contact: nir@cs.huji.ac.il PMID:18586706
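
    The compiled model can be illustrated in miniature: a two-state (linker/nucleosome) Hidden Markov Model decoded with the Viterbi algorithm over probe log-ratios. A minimal numpy sketch; the Gaussian emission parameters, the transition probability and the two-state simplification are illustrative assumptions, not the authors' actual model.

      import numpy as np

      def viterbi_nucleosomes(log_ratios, means=(0.0, 1.0), sds=(0.5, 0.5),
                              p_stay=0.95):
          """Two-state (0=linker, 1=nucleosome) Viterbi decoding of tiling-array
          log-ratios under Gaussian emissions. All parameters are illustrative."""
          lr = np.asarray(log_ratios, dtype=float)
          n = lr.size
          emit = np.empty((n, 2))                  # log emission densities
          for s in (0, 1):
              emit[:, s] = -0.5 * ((lr - means[s]) / sds[s]) ** 2 - np.log(sds[s])
          log_trans = np.log(np.array([[p_stay, 1 - p_stay],
                                       [1 - p_stay, p_stay]]))
          score = emit[0] + np.log(0.5)            # uniform initial distribution
          back = np.zeros((n, 2), dtype=int)
          for t in range(1, n):
              cand = score[:, None] + log_trans    # cand[i, j]: state i -> j
              back[t] = cand.argmax(axis=0)
              score = cand.max(axis=0) + emit[t]
          path = np.empty(n, dtype=int)            # trace back the best path
          path[-1] = score.argmax()
          for t in range(n - 1, 0, -1):
              path[t - 1] = back[t, path[t]]
          return path

      # Toy usage: elevated log-ratios suggest nucleosome-occupied probes.
      calls = viterbi_nucleosomes([0.1, 0.9, 1.2, 1.1, 0.2, -0.1])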

  12. Adaptive filtering image preprocessing for smart FPA technology

    NASA Astrophysics Data System (ADS)

    Brooks, Geoffrey W.

    1995-05-01

    This paper discusses two applications of adaptive filters for image processing on parallel architectures. The first, based on the results of previously accomplished work, summarizes the analyses of various adaptive filters implemented for pixel-level image prediction. FIR filters, fixed and adaptive IIR filters, and various variable-step-size algorithms were compared, with a focus on algorithm complexity against the ability to predict future pixel values. A Gaussian smoothing operation with varying spatial and temporal constants was also applied to compare random noise reduction. The second application is a suggestion to use memory-adaptive IIR filters for detecting and tracking motion within an image. Objects within an image are made of edges, or segments, with varying degrees of motion. A previously published application describes FIR filters connecting pixels and using correlations to determine motion and direction. That implementation seems limited to detecting motion coinciding with the FIR filter operation rate and the associated harmonics. Upgrading the FIR structures to adaptive IIR structures can eliminate these limitations. These and other pixel-level adaptive filtering applications require data memory for filter parameters and some basic computational capability. Tradeoffs have to be made between chip real estate and these desired features. System tradeoffs will also have to be made as to where it makes the most sense to perform each level of processing. Although smart pixels may not be ready to implement adaptive filters, applications such as these should give the smart pixel designer some long-range goals.
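
    The pixel-level prediction idea can be sketched with a normalized-LMS adaptive FIR predictor running on a single pixel's temporal samples; the filter order and step size below are arbitrary choices for illustration, not values from the paper.

      import numpy as np

      def nlms_predict(x, order=4, mu=0.5, eps=1e-8):
          """One-step-ahead prediction of a pixel time series with a
          normalized-LMS adaptive FIR filter; order and mu are illustrative."""
          x = np.asarray(x, dtype=float)
          w = np.zeros(order)                      # adaptive filter weights
          preds = np.zeros_like(x)
          for t in range(order, x.size):
              u = x[t - order:t][::-1]             # most recent samples first
              preds[t] = w @ u
              err = x[t] - preds[t]                # prediction error drives adaptation
              w += mu * err * u / (u @ u + eps)    # normalized step size
          return preds

      # Toy usage: track a slowly drifting, noisy pixel intensity.
      pixel = np.sin(np.linspace(0, 6, 200)) + 0.05 * np.random.randn(200)
      predicted = nlms_predict(pixel)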

  13. Preprocessing Inconsistent Linear System for a Meaningful Least Squares Solution

    NASA Technical Reports Server (NTRS)

    Sen, Syamal K.; Shaykhian, Gholam Ali

    2011-01-01

    Mathematical models of many physical/statistical problems are systems of linear equations. Due to measurement and possible human errors/mistakes in modeling/data, as well as due to certain assumptions made to reduce complexity, inconsistency (contradiction) is injected into the model, viz. the linear system. While any inconsistent system, irrespective of the degree of inconsistency, always has a least-squares solution, one needs to check whether an equation is too inconsistent or, equivalently, too contradictory. Such an equation will affect/distort the least-squares solution to such an extent that it is rendered unacceptable/unfit for use in a real-world application. We propose an algorithm which (i) prunes numerically redundant linear equations from the system, as these do not add any new information to the model, (ii) detects contradictory linear equations along with their degree of contradiction (inconsistency index), (iii) removes those equations presumed to be too contradictory, and then (iv) obtains the minimum-norm least-squares solution of the acceptably inconsistent reduced linear system. The algorithm, presented in Matlab, reduces the computational and storage complexities and also improves the accuracy of the solution. It also provides the necessary warning about the existence of too much contradiction in the model. In addition, we suggest a thorough relook into the mathematical modeling to determine why unacceptable contradiction has occurred, prompting us to make the necessary corrections/modifications to the models, both mathematical and, if necessary, physical.
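
    The abstract does not reproduce the algorithm itself; the sketch below conveys the general idea under stated assumptions, using per-equation residuals scaled by their median as a crude inconsistency index and numpy's pseudoinverse for the minimum-norm least-squares solution. The threshold is hypothetical.

      import numpy as np

      def prune_and_solve(A, b, tol=2.0):
          """Illustrative only: flag equations whose residual under the full
          least-squares fit is disproportionately large (a crude inconsistency
          index), drop them, and return the minimum-norm least-squares solution
          of the reduced system. The threshold tol is hypothetical."""
          A, b = np.asarray(A, float), np.asarray(b, float)
          x = np.linalg.pinv(A) @ b                   # min-norm LSQ, full system
          r = np.abs(A @ x - b)                       # per-equation residuals
          index = r / (np.median(r) + 1e-12)          # relative inconsistency index
          keep = index < tol                          # drop "too contradictory" rows
          x_red = np.linalg.pinv(A[keep]) @ b[keep]   # min-norm LSQ, reduced system
          return x_red, np.where(~keep)[0]

      # Toy usage: the last equation duplicates the first but contradicts it badly.
      A = [[1, 0], [0, 1], [1, 1], [1, -1], [1, 0]]
      b = [1.0, 2.0, 3.0, -1.0, 30.0]
      x, dropped = prune_and_solve(A, b)              # drops equation index 4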

  14. Autonomous system for Web-based microarray image analysis.

    PubMed

    Bozinov, Daniel

    2003-12-01

    Software-based feature extraction from DNA microarray images still requires human intervention on various levels. Manual adjustment of grid and metagrid parameters, precise alignment of superimposed grid templates and gene spots, or simply identification of large-scale artifacts have to be performed beforehand to reliably analyze DNA signals and correctly quantify their expression values. Ideally, a Web-based system with input solely confined to a single microarray image, and a data table as output containing measurements for all gene spots, would directly transform raw image data into abstracted gene expression tables. Sophisticated algorithms with advanced iterative correction procedures can overcome the inherent challenges in image processing. Herein is introduced an integrated software system with a Java-based interface on the client side that allows for decentralized access and enables the scientist to instantly employ the most up-to-date software version at any given time. This software tool extends PixClust, as used in Extractiff, and incorporates Java Web Start deployment technology. Ultimately, this setup is destined for high-throughput pipelines in genome-wide medical diagnostics labs or microarray core facilities aimed at providing fully automated service to their users. PMID:15376911

  15. MAGMA: analysis of two-channel microarrays made easy.

    PubMed

    Rehrauer, Hubert; Zoller, Stefan; Schlapbach, Ralph

    2007-07-01

    The web application MAGMA provides a simple and intuitive interface to identify differentially expressed genes from two-channel microarray data. While the underlying algorithms are not superior to those of similar web applications, MAGMA is particularly user friendly and can be used without prior training. The user interface guides the novice user through the most typical microarray analysis workflow consisting of data upload, annotation, normalization and statistical analysis. It automatically generates R-scripts that document MAGMA's entire data processing steps, thereby allowing the user to regenerate all results in his local R installation. The implementation of MAGMA follows the model-view-controller design pattern that strictly separates the R-based statistical data processing, the web-representation and the application logic. This modular design makes the application flexible and easily extendible by experts in one of the fields: statistical microarray analysis, web design or software development. State-of-the-art Java Server Faces technology was used to generate the web interface and to perform user input processing. MAGMA's object-oriented modular framework makes it easily extendible and applicable to other fields and demonstrates that modern Java technology is also suitable for rather small and concise academic projects. MAGMA is freely available at www.magma-fgcz.uzh.ch. PMID:17517778

  16. Classification of Microarray Data Using Kernel Fuzzy Inference System

    PubMed Central

    Kumar Rath, Santanu

    2014-01-01

    DNA microarray classification techniques have gained popularity in both research and practice. In real data analysis, such as microarray data, the dataset contains a huge number of insignificant and irrelevant features that tend to obscure the useful information. Feature selection seeks the features of highest significance and relevance to the classes, as these determine the classification of samples into their respective classes. In this paper, the kernel fuzzy inference system (K-FIS) algorithm is applied to classify microarray data (leukemia), using the t-test as a feature selection method. Kernel functions are used to map original data points into a higher-dimensional (possibly infinite-dimensional) feature space defined by a (usually nonlinear) function ϕ through a mathematical process called the kernel trick. This paper also presents a comparative study of classification using K-FIS along with a support vector machine (SVM) for different sets of features (genes). Performance parameters available in the literature, such as precision, recall, specificity, F-measure, ROC curve, and accuracy, are considered to analyze the efficiency of the classification model. From the proposed approach, it is apparent that the K-FIS model obtains results similar to the SVM model, an indication that the proposed approach relies on the kernel function.
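
    A rough scikit-learn sketch of the pipeline outlined above, i.e., t-test feature ranking followed by a kernelized classifier. K-FIS itself is not a standard library component, so an RBF-kernel SVM (the paper's comparison model) stands in for the kernel classifier; the data are synthetic and, for brevity, selection is done outside the cross-validation loop, which a rigorous analysis would nest.

      import numpy as np
      from scipy.stats import ttest_ind
      from sklearn.model_selection import cross_val_score
      from sklearn.svm import SVC

      rng = np.random.default_rng(0)
      X = rng.normal(size=(72, 5000))          # 72 samples x 5000 genes, synthetic
      y = rng.integers(0, 2, size=72)          # two classes, e.g. ALL vs AML
      X[y == 1, :20] += 1.0                    # plant 20 informative genes

      # t-test feature selection: rank genes by the between-class t statistic.
      t_stat, _ = ttest_ind(X[y == 0], X[y == 1], axis=0)
      top = np.argsort(-np.abs(t_stat))[:50]

      # Kernelized classifier: the RBF kernel implicitly maps points into a
      # high-dimensional feature space (the "kernel trick").
      clf = SVC(kernel="rbf", gamma="scale", C=1.0)
      print(cross_val_score(clf, X[:, top], y, cv=5).mean())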

  17. Algorithms and Algorithmic Languages.

    ERIC Educational Resources Information Center

    Veselov, V. M.; Koprov, V. M.

    This paper is intended as an introduction to a number of problems connected with the description of algorithms and algorithmic languages, particularly the syntaxes and semantics of algorithmic languages. The terms "letter, word, alphabet" are defined and described. The concept of the algorithm is defined and the relation between the algorithm and…

  18. Automatic image analysis and spot classification for detection of pathogenic Escherichia coli on glass slide DNA microarrays

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A computer algorithm was created to inspect scanned images from DNA microarray slides developed to rapidly detect and genotype E. coli O157 virulent strains. The algorithm computes centroid locations for signal and background pixels in RGB space and defines a plane perpendicular to the line connect...

  19. On the Development of Parafoveal Preprocessing: Evidence from the Incremental Boundary Paradigm

    PubMed Central

    Marx, Christina; Hutzler, Florian; Schuster, Sarah; Hawelka, Stefan

    2016-01-01

    Parafoveal preprocessing of upcoming words and the resultant preview benefit are key aspects of fluent reading. Evidence regarding the development of parafoveal preprocessing during reading acquisition, however, is scarce. The present developmental (cross-sectional) eye tracking study estimated the magnitude of parafoveal preprocessing of beginning readers with a novel variant of the classical boundary paradigm. Additionally, we assessed the association of parafoveal preprocessing with several reading-related psychometric measures. The participants were children learning to read the regular German orthography with about 1, 3, and 5 years of formal reading instruction (Grade 2, 4, and 6, respectively). We found evidence of parafoveal preprocessing in each Grade. However, an effective use of parafoveal information was related to the individual reading fluency of the participants (i.e., the reading rate expressed as words-per-minute) which substantially overlapped between the Grades. The size of the preview benefit was furthermore associated with the children’s performance in rapid naming tasks and with their performance in a pseudoword reading task. The latter task assessed the children’s efficiency in phonological decoding and our findings show that the best decoders exhibited the largest preview benefit. PMID:27148123

  1. Prediction of rainfall time series using modular artificial neural networks coupled with data-preprocessing techniques

    NASA Astrophysics Data System (ADS)

    Wu, C. L.; Chau, K. W.; Fan, C.

    2010-07-01

    Summary This study is an attempt to seek a relatively optimal data-driven model for rainfall forecasting from three aspects: model inputs, modeling methods, and data-preprocessing techniques. Four rain data records from different regions, namely two monthly and two daily series, are examined. A comparison of seven input techniques, either linear or nonlinear, indicates that linear correlation analysis (LCA) is capable of identifying model inputs reasonably. A proposed model, modular artificial neural network (MANN), is compared with three benchmark models, viz. artificial neural network (ANN), K-nearest-neighbors (K-NN), and linear regression (LR). Prediction is performed in the context of two modes including normal mode (viz., without data preprocessing) and data preprocessing mode. Results from the normal mode indicate that MANN performs the best among all four models, but the advantage of MANN over ANN is not significant in monthly rainfall series forecasting. Under the data preprocessing mode, each of LR, K-NN and ANN is respectively coupled with three data-preprocessing techniques including moving average (MA), principal component analysis (PCA), and singular spectrum analysis (SSA). Results indicate that the improvement of model performance generated by SSA is considerable whereas those of MA or PCA are slight. Moreover, when MANN is coupled with SSA, results show that advantages of MANN over other models are quite noticeable, particularly for daily rainfall forecasting. Therefore, the proposed optimal rainfall forecasting model can be derived from MANN coupled with SSA.
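
    Of the three preprocessing techniques compared, singular spectrum analysis (SSA) gave the largest improvement. A minimal numpy SSA, which embeds the series in a trajectory (Hankel) matrix, decomposes it by SVD and reconstructs a smooth component by diagonal averaging, is sketched below; the window length and component count are arbitrary choices.

      import numpy as np

      def ssa_smooth(x, window=12, n_components=3):
          """Minimal singular spectrum analysis: embed, decompose, reconstruct.
          Window length and component count are illustrative choices."""
          x = np.asarray(x, float)
          n = x.size
          k = n - window + 1
          # Trajectory (Hankel) matrix: lagged copies of the series as columns.
          traj = np.column_stack([x[i:i + window] for i in range(k)])
          u, s, vt = np.linalg.svd(traj, full_matrices=False)
          approx = (u[:, :n_components] * s[:n_components]) @ vt[:n_components]
          # Diagonal averaging (Hankelization) maps the matrix back to a series.
          recon = np.zeros(n)
          counts = np.zeros(n)
          for i in range(window):
              for j in range(k):
                  recon[i + j] += approx[i, j]
                  counts[i + j] += 1
          return recon / counts

      # Toy usage: extract the smooth seasonal component of a noisy monthly series.
      t = np.arange(240)
      rain = 50 + 20 * np.sin(2 * np.pi * t / 12) + 10 * np.random.randn(240)
      smooth = ssa_smooth(rain)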

  2. Effect of data normalization on fuzzy clustering of DNA microarray data

    PubMed Central

    Kim, Seo Young; Lee, Jae Won; Bae, Jong Sung

    2006-01-01

    Background Microarray technology has made it possible to simultaneously measure the expression levels of large numbers of genes in a short time. Gene expression data is information rich; however, extensive data mining is required to identify the patterns that characterize the underlying mechanisms of action. Clustering is an important tool for finding groups of genes with similar expression patterns in microarray data analysis. However, hard clustering methods, which assign each gene exactly to one cluster, are poorly suited to the analysis of microarray datasets because in such datasets the clusters of genes frequently overlap. Results In this study we applied the fuzzy partitional clustering method known as Fuzzy C-Means (FCM) to overcome the limitations of hard clustering. To identify the effect of data normalization, we used three normalization methods, the two common scale and location transformations and Lowess normalization methods, to normalize three microarray datasets and three simulated datasets. First we determined the optimal parameters for FCM clustering. We found that the optimal fuzzification parameter in the FCM analysis of a microarray dataset depended on the normalization method applied to the dataset during preprocessing. We additionally evaluated the effect of normalization of noisy datasets on the results obtained when hard clustering or FCM clustering was applied to those datasets. The effects of normalization were evaluated using both simulated datasets and microarray datasets. A comparative analysis showed that the clustering results depended on the normalization method used and the noisiness of the data. In particular, the selection of the fuzzification parameter value for the FCM method was sensitive to the normalization method used for datasets with large variations across samples. Conclusion Lowess normalization is more robust for clustering of genes from general microarray data than the two common scale and location adjustment methods
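
    Fuzzy C-Means is compact enough to sketch directly: each gene receives a membership degree in every cluster rather than a hard assignment, with the fuzzification parameter m controlling how soft the partition is. A plain numpy version under the standard FCM update rules; the per-gene scaling shown is only one of the normalization choices discussed above.

      import numpy as np

      def fuzzy_c_means(X, c=4, m=1.5, n_iter=100, seed=0):
          """Standard Fuzzy C-Means returning (centers, memberships);
          m > 1 is the fuzzification parameter discussed in the abstract."""
          rng = np.random.default_rng(seed)
          u = rng.random((X.shape[0], c))
          u /= u.sum(axis=1, keepdims=True)        # memberships sum to 1 per gene
          for _ in range(n_iter):
              w = u ** m
              centers = (w.T @ X) / w.sum(axis=0)[:, None]
              d = np.linalg.norm(X[:, None, :] - centers[None], axis=2) + 1e-12
              u = 1.0 / d ** (2.0 / (m - 1.0))     # soft assignment update
              u /= u.sum(axis=1, keepdims=True)
          return centers, u

      # Genes are normalized first; the abstract's point is that this choice
      # (scale/location vs. Lowess) changes the optimal m and the clusters found.
      X = np.random.default_rng(1).normal(size=(500, 10))  # 500 genes x 10 arrays
      Xn = (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)
      centers, memberships = fuzzy_c_means(Xn)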

  3. Generation of attributes for learning algorithms

    SciTech Connect

    Hu, Yuh-Jyh; Kibler, D.

    1996-12-31

    Inductive algorithms rely strongly on their representational biases. Constructive induction can mitigate representational inadequacies. This paper introduces the notion of a relative gain measure and describes a new constructive induction algorithm (GALA) which is independent of the learning algorithm. Unlike most previous research on constructive induction, our methods are designed as a preprocessing step before standard machine learning algorithms are applied. We present results which demonstrate the effectiveness of GALA on artificial and real domains for several learners: C4.5, CN2, perceptron and backpropagation.

  4. Microarrays Made Simple: "DNA Chips" Paper Activity

    ERIC Educational Resources Information Center

    Barnard, Betsy

    2006-01-01

    DNA microarray technology is revolutionizing biological science. DNA microarrays (also called DNA chips) allow simultaneous screening of many genes for changes in expression between different cells. Now researchers can obtain information about genes in days or weeks that used to take months or years. The paper activity described in this article…

  5. Protein-Based Microarray for the Detection of Pathogenic Bacteria

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Microarrays have been used for gene expression and protein interaction studies, but recently, multianalyte diagnostic assays have employed the microarray platform. We developed a microarray immunoassay for bacteria, with biotinylated capture antibodies on streptavidin slides. To complete the fluor...

  6. Tissue Microarrays in Clinical Oncology

    PubMed Central

    Voduc, David; Kenney, Challayne; Nielsen, Torsten O.

    2008-01-01

    The tissue microarray is a recently-implemented, high-throughput technology for the analysis of molecular markers in oncology. This research tool permits the rapid assessment of a biomarker in thousands of tumor samples, using commonly available laboratory assays such as immunohistochemistry and in-situ hybridization. Although introduced less than a decade ago, the TMA has proven to be invaluable in the study of tumor biology, the development of diagnostic tests, and the investigation of oncological biomarkers. This review describes the impact of TMA-based research in clinical oncology and its potential future applications. Technical aspects of TMA construction, and the advantages and disadvantages inherent to this technology are also discussed. PMID:18314063

  7. DNA Microarrays for Identifying Fishes

    PubMed Central

    Nölte, M.; Weber, H.; Silkenbeumer, N.; Hjörleifsdottir, S.; Hreggvidsson, G. O.; Marteinsson, V.; Kappel, K.; Planes, S.; Tinti, F.; Magoulas, A.; Garcia Vazquez, E.; Turan, C.; Hervet, C.; Campo Falgueras, D.; Antoniou, A.; Landi, M.; Blohm, D.

    2008-01-01

    In many cases marine organisms and especially their diverse developmental stages are difficult to identify by morphological characters. DNA-based identification methods offer an analytically powerful addition or even an alternative. In this study, a DNA microarray has been developed to be able to investigate its potential as a tool for the identification of fish species from European seas based on mitochondrial 16S rDNA sequences. Eleven commercially important fish species were selected for a first prototype. Oligonucleotide probes were designed based on the 16S rDNA sequences obtained from 230 individuals of 27 fish species. In addition, more than 1200 sequences of 380 species served as sequence background against which the specificity of the probes was tested in silico. Single target hybridisations with Cy5-labelled, PCR-amplified 16S rDNA fragments from each of the 11 species on microarrays containing the complete set of probes confirmed their suitability. True-positive, fluorescence signals obtained were at least one order of magnitude stronger than false-positive cross-hybridisations. Single nontarget hybridisations resulted in cross-hybridisation signals at approximately 27% of the cases tested, but all of them were at least one order of magnitude lower than true-positive signals. This study demonstrates that the 16S rDNA gene is suitable for designing oligonucleotide probes, which can be used to differentiate 11 fish species. These data are a solid basis for the second step to create a “Fish Chip” for approximately 50 fish species relevant in marine environmental and fisheries research, as well as control of fisheries products. PMID:18270778

  8. Preprocessed barley, rye, and triticale as a feedstock for an integrated fuel ethanol-feedlot plant

    SciTech Connect

    Sosulski, K.; Wang, Sunmin; Ingledew, W.M.

    1997-12-31

    Rye, triticale, and barley were evaluated as starch feedstock to replace wheat for ethanol production. Preprocessing of grain by abrasion on a Satake mill reduced fiber and increased starch concentrations in feed-stock for fermentations. Higher concentrations of starch in flours from preprocessed cereal grains would increase plant throughput by 8-23% since more starch is processed in the same weight of feedstock. Increased concentrations of starch for fermentation resulted in higher concentrations of ethanol in beer. Energy requirements to produce one L of ethanol from preprocessed grains were reduced, the natural gas by 3.5-11.4%, whereas power consumption was reduced by 5.2-15.6%. 7 refs., 7 figs., 4 tabs.

  9. Optimization of Preprocessing and Densification of Sorghum Stover at Full-scale Operation

    SciTech Connect

    Neal A. Yancey; Jaya Shankar Tumuluru; Craig C. Conner; Christopher T. Wright

    2011-08-01

    Transportation costs can be a prohibitive step in bringing biomass to a preprocessing location or biofuel refinery. One alternative to transporting biomass in baled or loose format to a preprocessing location is to utilize a mobile preprocessing system that can be relocated to the various locations where biomass is stored, preprocess and densify the biomass, then ship it to the refinery as needed. The Idaho National Laboratory has a full-scale 'Process Demonstration Unit' (PDU), which includes a stage-1 grinder, hammer mill, drier, pellet mill, and cooler with the associated conveyance system components. Testing at bench and pilot scale has been conducted to determine the effects of moisture and crop variety on preprocessing efficiency and product quality. The INL's PDU provides an opportunity to test, on full industrial-scale systems, the conclusions drawn at bench and pilot scale. Each component of the PDU is operated from a central operating station, where data are collected to determine power consumption rates for each step in the process. The power for each electrical motor in the system is monitored from the control station to watch for problems and determine optimal conditions for system performance. The data can then be reviewed to observe how changes in biomass input parameters (moisture and crop type, for example), mechanical changes (screen size, biomass drying, pellet size, grinding speed, etc.), or other variations affect the power consumption of the system. Sorghum in four-foot round bales was tested in the system using a series of six screen sizes: 3/16 in., 1 in., 2 in., 3 in., 4 in., and 6 in. The effects on power consumption, product quality, and production rate were measured to determine optimal conditions.

  10. Probe Region Expression Estimation for RNA-Seq Data for Improved Microarray Comparability

    PubMed Central

    Uziela, Karolis; Honkela, Antti

    2015-01-01

    Rapidly growing public gene expression databases contain a wealth of data for building an unprecedentedly detailed picture of human biology and disease. This data comes from many diverse measurement platforms that make integrating it all difficult. Although RNA-sequencing (RNA-seq) is attracting the most attention, at present, the rate of new microarray studies submitted to public databases far exceeds the rate of new RNA-seq studies. There is clearly a need for methods that make it easier to combine data from different technologies. In this paper, we propose a new method for processing RNA-seq data that yields gene expression estimates that are much more similar to corresponding estimates from microarray data, hence greatly improving cross-platform comparability. The method we call PREBS is based on estimating the expression from RNA-seq reads overlapping the microarray probe regions, and processing these estimates with standard microarray summarisation algorithms. Using paired microarray and RNA-seq samples from TCGA LAML data set we show that PREBS expression estimates derived from RNA-seq are more similar to microarray-based expression estimates than those from other RNA-seq processing methods. In an experiment to retrieve paired microarray samples from a database using an RNA-seq query sample, gene signatures defined based on PREBS expression estimates were found to be much more accurate than those from other methods. PREBS also allows new ways of using RNA-seq data, such as expression estimation for microarray probe sets. An implementation of the proposed method is available in the Bioconductor package “prebs.” PMID:25966034
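
    The core of the method, counting RNA-seq reads that overlap microarray probe regions and passing those counts to standard summarisation, can be illustrated with plain interval arithmetic. The real implementation is the Bioconductor package "prebs"; the probe identifiers and coordinates below are invented, and the O(P*R) scan is for clarity only.

      def probe_region_counts(probes, reads):
          """Count reads overlapping each probe region. probes maps an id to
          (chrom, start, end); reads is a list of (chrom, start, end). A naive
          O(P*R) scan for clarity; real tools use interval indexes."""
          counts = {pid: 0 for pid in probes}
          for pid, (pchrom, pstart, pend) in probes.items():
              for rchrom, rstart, rend in reads:
                  if rchrom == pchrom and rstart < pend and rend > pstart:
                      counts[pid] += 1
          return counts

      # Invented coordinates: two probes of one probe set, three aligned reads.
      probes = {"someprobeset:p1": ("chr6", 30852000, 30852025),
                "someprobeset:p2": ("chr6", 30852900, 30852925)}
      reads = [("chr6", 30851990, 30852090), ("chr6", 30852910, 30853010),
               ("chr6", 31000000, 31000100)]
      print(probe_region_counts(probes, reads))   # each probe overlaps one read
      # Probe-level counts would then pass through RMA-style summarisation
      # to yield probe-set expression values comparable to the array's.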

  11. Influence of Hemp Fibers Pre-processing on Low Density Polyethylene Matrix Composites Properties

    NASA Astrophysics Data System (ADS)

    Kukle, S.; Vidzickis, R.; Zelca, Z.; Belakova, D.; Kajaks, J.

    2016-04-01

    In the present research, LLDPE matrix composites reinforced with short hemp fibres at contents ranging from 30 to 50 wt%, with the fibres subjected to four different pre-processing technologies, were produced, and properties such as tensile strength and elongation at break, tensile modulus, melt flow index, micro hardness and water absorption dynamics were investigated. Capillary viscosimetry was used for fluidity evaluation, and the melt flow index (MFI) was evaluated for all variants. The MFI values of the fibres from two of the pre-processing variants were high enough to allow the hemp fibre content to be increased from 30 to 50 wt% with only a moderate increase in water sorption capability.

  12. Boosting model performance and interpretation by entangling preprocessing selection and variable selection.

    PubMed

    Gerretzen, Jan; Szymańska, Ewa; Bart, Jacob; Davies, Antony N; van Manen, Henk-Jan; van den Heuvel, Edwin R; Jansen, Jeroen J; Buydens, Lutgarde M C

    2016-09-28

    The aim of data preprocessing is to remove data artifacts, such as a baseline, scatter effects or noise, and to enhance the contextually relevant information. Many preprocessing methods exist to deliver one or more of these benefits, but which method or combination of methods should be used for the specific data being analyzed is difficult to select. Recently, we have shown that a preprocessing selection approach based on Design of Experiments (DoE) enables correct selection of highly appropriate preprocessing strategies within reasonable time frames. In that approach, the focus was solely on improving the predictive performance of the chemometric model. This is, however, only one of the two relevant criteria in modeling: interpretation of the model results can be just as important. Variable selection is often used to achieve such interpretation. Data artifacts, however, may hamper proper variable selection by masking the truly relevant variables. The choice of preprocessing therefore has a huge impact on the outcome of variable selection methods and may thus hamper an objective interpretation of the final model. To enhance such objective interpretation, we here integrate variable selection into the preprocessing selection approach that is based on DoE. We show that the entanglement of preprocessing selection and variable selection improves not only the interpretation, but also the predictive performance of the model. This is achieved by analyzing several experimental data sets for which the true relevant variables are available as prior knowledge. We show that a selection of variables is provided that complies more with the true informative variables compared to individual optimization of both model aspects. Importantly, the approach presented in this work is generic. Different types of models (e.g. PCR, PLS, …) can be incorporated into it, as well as different variable selection methods and different preprocessing methods, according to the taste and experience of

  13. View and design of basic element for smart imagers with image preprocessing

    NASA Astrophysics Data System (ADS)

    Shilin, Victor A.

    2005-06-01

    This paper is devoted to a review of the basic elements for smart imagers and discusses their principle of operation: CMOS APS imagers with focal-plane parallel image preprocessing for smart technical vision and electro-optical systems based on neural implementation. Using an analysis of the main features of biological vision, the desired characteristics of artificial vision are defined, and the image processing tasks that can be implemented by smart focal-plane preprocessing CMOS imagers with neural networks are determined. The eventual results are important for medicine and aerospace ecological monitoring, and bear on the complexity of, and approaches to, CMOS APS neural net implementation.

  14. PAA: an R/bioconductor package for biomarker discovery with protein microarrays

    PubMed Central

    Turewicz, Michael; Ahrens, Maike; May, Caroline; Marcus, Katrin; Eisenacher, Martin

    2016-01-01

    Summary: The R/Bioconductor package Protein Array Analyzer (PAA) facilitates a flexible analysis of protein microarrays for biomarker discovery (esp., ProtoArrays). It provides a complete data analysis workflow including preprocessing and quality control, uni- and multivariate feature selection as well as several different plots and results tables to outline and evaluate the analysis results. As a main feature, PAA’s multivariate feature selection methods are based on recursive feature elimination (e.g. SVM-recursive feature elimination, SVM-RFE) with stability ensuring strategies such as ensemble feature selection. This enables PAA to detect stable and reliable biomarker candidate panels. Availability and implementation: PAA is freely available (BSD 3-clause license) from http://www.bioconductor.org/packages/PAA/. Contact: michael.turewicz@rub.de or martin.eisenacher@rub.de PMID:26803161
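
    PAA is an R/Bioconductor package, but its central multivariate step, SVM-based recursive feature elimination, is easy to sketch with scikit-learn. This is an illustration of the technique, not PAA's code, and the data shape is invented.

      import numpy as np
      from sklearn.feature_selection import RFE
      from sklearn.svm import LinearSVC

      rng = np.random.default_rng(0)
      X = rng.normal(size=(60, 800))       # 60 sera x 800 protein features, synthetic
      y = rng.integers(0, 2, size=60)      # case/control labels
      X[y == 1, :10] += 1.5                # plant 10 discriminating proteins

      # SVM-RFE: repeatedly fit a linear SVM and drop the lowest-weight features.
      rfe = RFE(LinearSVC(C=0.1, max_iter=5000),
                n_features_to_select=10, step=0.1)
      rfe.fit(X, y)
      panel = np.where(rfe.support_)[0]    # candidate biomarker panel

    PAA's stability-ensuring strategies would rerun such an elimination on resampled data and keep only the features selected consistently across runs (ensemble feature selection).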

  15. Genetic programming based ensemble system for microarray data classification.

    PubMed

    Liu, Kun-Hong; Tong, Muchenxuan; Xie, Shu-Tong; Yee Ng, Vincent To

    2015-01-01

    Recently, more and more machine learning techniques have been applied to microarray data analysis. The aim of this study is to propose a genetic programming (GP) based new ensemble system (named GPES), which can be used to effectively classify different types of cancers. Decision trees are deployed as base classifiers in this ensemble framework with three operators: Min, Max, and Average. Each individual of the GP is an ensemble system, and they become more and more accurate in the evolutionary process. The feature selection technique and balanced subsampling technique are applied to increase the diversity in each ensemble system. The final ensemble committee is selected by a forward search algorithm, which is shown to be capable of fitting data automatically. The performance of GPES is evaluated using five binary class and six multiclass microarray datasets, and results show that the algorithm can achieve better results in most cases compared with some other ensemble systems. By using elaborate base classifiers or applying other sampling techniques, the performance of GPES may be further improved. PMID:25810748
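
    The GP search itself is involved, but the three combination operators are simple to state: an ensemble combines its members' class-probability vectors element-wise by Min, Max, or Average before taking the argmax. A sketch under that reading, with diversity injected by feature subsampling standing in for the paper's feature selection and balanced subsampling:

      import numpy as np
      from sklearn.datasets import make_classification
      from sklearn.model_selection import train_test_split
      from sklearn.tree import DecisionTreeClassifier

      X, y = make_classification(n_samples=200, n_features=50, n_informative=8,
                                 random_state=0)
      Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

      # Diversity via feature subsampling: each tree sees a random feature subset.
      rng = np.random.default_rng(0)
      members = []
      for _ in range(7):
          cols = rng.choice(X.shape[1], size=20, replace=False)
          tree = DecisionTreeClassifier(random_state=0).fit(Xtr[:, cols], ytr)
          members.append((tree, cols))

      probs = np.stack([t.predict_proba(Xte[:, c]) for t, c in members])

      # The three combination operators over member class probabilities.
      for name, combined in [("Min", probs.min(axis=0)), ("Max", probs.max(axis=0)),
                             ("Average", probs.mean(axis=0))]:
          print(name, (combined.argmax(axis=1) == yte).mean())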

  16. Validation of analytical breast cancer microarray analysis in medical laboratory.

    PubMed

    Darweesh, Amal Said; Louka, Manal Louis; Hana, Maha; Rashad, Shaymaa; El-Shinawi, Mohamed; Sharaf-Eldin, Ahmed; Kassim, Samar Kamal

    2014-10-01

    A previously reported microarray data analysis by the RISS algorithm on breast cancer showed over-expression of the growth factor receptor (Grb7), and it also highlighted the Tweety (TTYH1) gene as under-expressed in breast cancer for the first time. Our aim was to validate the results obtained from the microarray analysis with respect to these genes. Also, the relationship between their expression and the different prognostic indicators was addressed. RNA was extracted from the breast tissue of 30 patients with primary malignant breast cancer. Control samples from the same patients were harvested at a distance of ≥5 cm from the tumour. Semi-quantitative RT-PCR analysis was done on all samples. There was a significant difference between the malignant and control tissues as regards Grb7 expression. It was significantly related to the presence of lymph node metastasis, and to the stage and histological grade of the malignant tumours. There was a significant inverse relation between expression of Grb7 and expression of both oestrogen and progesterone receptors. Grb7 was found to be significantly related to the biological classification of breast cancer. TTYH1 was not expressed in either the malignant or the control samples. The RISS algorithm developed by our group was laboratory-validated for Grb7, but not for TTYH1. The newly developed software tool needs to be improved. PMID:25182704

  17. Microarray-integrated optoelectrofluidic immunoassay system.

    PubMed

    Han, Dongsik; Park, Je-Kyun

    2016-05-01

    A microarray-based analytical platform has been utilized as a powerful tool in biological assay fields. However, an analyte depletion problem due to the slow mass transport based on molecular diffusion causes low reaction efficiency, resulting in a limitation for practical applications. This paper presents a novel method to improve the efficiency of microarray-based immunoassay via an optically induced electrokinetic phenomenon by integrating an optoelectrofluidic device with a conventional glass slide-based microarray format. A sample droplet was loaded between the microarray slide and the optoelectrofluidic device on which a photoconductive layer was deposited. Under the application of an AC voltage, optically induced AC electroosmotic flows caused by a microarray-patterned light actively enhanced the mass transport of target molecules at the multiple assay spots of the microarray simultaneously, which reduced tedious reaction time from more than 30 min to 10 min. Based on this enhancing effect, a heterogeneous immunoassay with a tiny volume of sample (5 μl) was successfully performed in the microarray-integrated optoelectrofluidic system using immunoglobulin G (IgG) and anti-IgG, resulting in improved efficiency compared to the static environment. Furthermore, the application of multiplex assays was also demonstrated by multiple protein detection. PMID:27190571

  18. MARS: Microarray analysis, retrieval, and storage system

    PubMed Central

    Maurer, Michael; Molidor, Robert; Sturn, Alexander; Hartler, Juergen; Hackl, Hubert; Stocker, Gernot; Prokesch, Andreas; Scheideler, Marcel; Trajanoski, Zlatko

    2005-01-01

    Background Microarray analysis has become a widely used technique for the study of gene-expression patterns on a genomic scale. As more and more laboratories are adopting microarray technology, there is a need for powerful and easy to use microarray databases facilitating array fabrication, labeling, hybridization, and data analysis. The wealth of data generated by this high throughput approach renders adequate database and analysis tools crucial for the pursuit of insights into the transcriptomic behavior of cells. Results MARS (Microarray Analysis and Retrieval System) provides a comprehensive MIAME supportive suite for storing, retrieving, and analyzing multi color microarray data. The system comprises a laboratory information management system (LIMS), a quality control management, as well as a sophisticated user management system. MARS is fully integrated into an analytical pipeline of microarray image analysis, normalization, gene expression clustering, and mapping of gene expression data onto biological pathways. The incorporation of ontologies and the use of MAGE-ML enables an export of studies stored in MARS to public repositories and other databases accepting these documents. Conclusion We have developed an integrated system tailored to serve the specific needs of microarray based research projects using a unique fusion of Web based and standalone applications connected to the latest J2EE application server technology. The presented system is freely available for academic and non-profit institutions. More information can be found at . PMID:15836795

  19. Chromatographic alignment by warping and dynamic programming as a pre-processing tool for PARAFAC modelling of liquid chromatography-mass spectrometry data.

    PubMed

    Bylund, Dan; Danielsson, Rolf; Malmquist, Gunnar; Markides, Karin E

    2002-07-01

    Solutes analysed with LC-MS are characterised by their retention times and mass spectra, and quantified by the intensities measured. This highly selective information can be extracted by multiway modelling. However, for full use and interpretability it is necessary that the assumptions made for the model are valid. For PARAFAC modelling, the assumption is a trilinear data structure. With LC-MS, several factors, e.g. non-linear detector response and ionisation suppression, may introduce deviations from trilinearity. The single largest problem, however, is retention time shifts not related to the true sample variations. In this paper, a time warping algorithm for alignment of LC-MS data in the chromatographic direction has been examined. Several refinements have been implemented and the features are demonstrated for both simulated and real data. With moderate time shifts present in the data, pre-processing with this algorithm yields approximately trilinear data for which reasonable models can be made. PMID:12184621
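
    The warping-by-dynamic-programming idea can be conveyed with the classical dynamic time warping recursion applied to two chromatographic traces; the authors' algorithm and its refinements are not reproduced here, and real aligners add constraints (band limits, slope weights) omitted from this sketch.

      import numpy as np

      def dtw_path(ref, sample):
          """Classical dynamic time warping: returns the path aligning `sample`
          to `ref`, using squared intensity difference as the local cost."""
          n, m = len(ref), len(sample)
          D = np.full((n + 1, m + 1), np.inf)
          D[0, 0] = 0.0
          for i in range(1, n + 1):
              for j in range(1, m + 1):
                  cost = (ref[i - 1] - sample[j - 1]) ** 2
                  D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
          path, i, j = [], n, m                    # trace back from the corner
          while i > 0 and j > 0:
              path.append((i - 1, j - 1))
              step = np.argmin([D[i - 1, j - 1], D[i - 1, j], D[i, j - 1]])
              if step == 0:
                  i, j = i - 1, j - 1
              elif step == 1:
                  i -= 1
              else:
                  j -= 1
          return path[::-1]

      # Toy usage: the sample peak elutes three scans later than the reference.
      t = np.arange(60)
      ref = np.exp(-0.5 * ((t - 25) / 3.0) ** 2)
      sam = np.exp(-0.5 * ((t - 28) / 3.0) ** 2)
      alignment = dtw_path(ref, sam)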

  20. DNA Microarrays in Herbal Drug Research

    PubMed Central

    Chavan, Preeti; Joshi, Kalpana; Patwardhan, Bhushan

    2006-01-01

    Natural products are gaining increased applications in drug discovery and development. Being chemically diverse they are able to modulate several targets simultaneously in a complex system. Analysis of gene expression becomes necessary for better understanding of molecular mechanisms. Conventional strategies for expression profiling are optimized for single gene analysis. DNA microarrays serve as suitable high throughput tool for simultaneous analysis of multiple genes. Major practical applicability of DNA microarrays remains in DNA mutation and polymorphism analysis. This review highlights applications of DNA microarrays in pharmacodynamics, pharmacogenomics, toxicogenomics and quality control of herbal drugs and extracts. PMID:17173108

  1. Progress in the application of DNA microarrays.

    PubMed Central

    Lobenhofer, E K; Bushel, P R; Afshari, C A; Hamadeh, H K

    2001-01-01

    Microarray technology has been applied to a variety of different fields to address fundamental research questions. The use of microarrays, or DNA chips, to study the gene expression profiles of biologic samples began in 1995. Since that time, the fundamental concepts behind the chip, the technology required for making and using these chips, and the multitude of statistical tools for analyzing the data have been extensively reviewed. For this reason, the focus of this review will be not on the technology itself but on the application of microarrays as a research tool and the future challenges of the field. PMID:11673116

  2. Pipeline for macro- and microarray analyses.

    PubMed

    Vicentini, R; Menossi, M

    2007-05-01

    The pipeline for macro- and microarray analyses (PMmA) is a set of scripts with a web interface developed to analyze DNA array data generated by array image quantification software. PMmA is designed for use with single- or double-color array data and works as a pipeline in five classes (data format, normalization, data analysis, clustering, and array maps). It can also be used as a plugin in the BioArray Software Environment, an open-source database for array analysis, or in a local version of the web service. All scripts in PMmA were developed in the PERL programming language and statistical analysis functions were implemented in the R statistical language. Consequently, our package is platform-independent software. Our algorithms can correctly select almost 90% of the differentially expressed genes, showing superior performance compared to other methods of analysis. The pipeline software has been applied to public macroarray data of 1536 expressed sequence tags from sugarcane exposed to cold for 3 to 48 h. PMmA identified thirty cold-responsive genes previously unidentified in this public dataset. Fourteen genes were up-regulated, two had variable expression and the other fourteen were down-regulated in the treatments. These new findings were certainly a consequence of using a superior statistical analysis approach, since the original study did not take into account the dependence of data variability on the average signal intensity of each gene. The web interface, supplementary information, and the package source code are available, free, to non-commercial users at http://ipe.cbmeg.unicamp.br/pub/PMmA. PMID:17464422
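
    The flaw noted in the last point, ignoring how variability depends on average signal intensity, is what intensity-dependent (MA-plot loess) normalization addresses. A sketch using the lowess smoother from statsmodels; this is illustrative rather than PMmA's actual code, and the toy data are invented.

      import numpy as np
      from statsmodels.nonparametric.smoothers_lowess import lowess

      def loess_normalize(red, green, frac=0.3):
          """Intensity-dependent normalization on the MA scale: M = log-ratio,
          A = average log-intensity; subtracting the loess trend of M on A
          centres the corrected log-ratios at every intensity."""
          M = np.log2(red) - np.log2(green)
          A = 0.5 * (np.log2(red) + np.log2(green))
          trend = lowess(M, A, frac=frac, return_sorted=False)
          return M - trend

      # Toy usage: inject an intensity-dependent dye bias, then remove it.
      rng = np.random.default_rng(0)
      g = rng.lognormal(8, 1, 2000)
      r = g * 2 ** (0.04 * np.log2(g) + rng.normal(0, 0.2, 2000))
      M_corrected = loess_normalize(r, g)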

  3. Parafoveal Preprocessing in Reading Revisited: Evidence from a Novel Preview Manipulation

    ERIC Educational Resources Information Center

    Gagl, Benjamin; Hawelka, Stefan; Richlan, Fabio; Schuster, Sarah; Hutzler, Florian

    2014-01-01

    The study investigated parafoveal preprocessing by the means of the classical invisible boundary paradigm and a novel manipulation of the parafoveal previews (i.e., visual degradation). Eye movements were investigated on 5-letter target words with constraining (i.e., highly informative) initial letters or similarly constraining final letters.…

  4. Integrated Multi-Strategic Web Document Pre-Processing for Sentence and Word Boundary Detection.

    ERIC Educational Resources Information Center

    Shim, Junhyeok; Kim, Dongseok; Cha, Jeongwon; Lee, Gary Geunbae; Seo, Jungyun

    2002-01-01

    Discussion of natural language processing focuses on a multi-strategic integrated text preprocessing method for difficult problems of sentence boundary disambiguation and word boundary disambiguation of Web texts. Describes an evaluation of the method using Korean Web document collections. (Author/LRW)

  5. Multi-wavelength aerosol LIDAR signal pre-processing: practical considerations

    NASA Astrophysics Data System (ADS)

    Rodríguez-Gómez, A.; Rocadenbosch, F.; Sicard, M.; Lange, D.; Barragán, R.; Batet, O.; Comerón, A.; López Márquez, M. A.; Muñoz-Porcar, C.; Tiana, J.; Tomás, S.

    2015-12-01

    Elastic lidars provide range-resolved information about the aerosol content in the atmosphere. Nevertheless, a number of pre-processing techniques need to be applied before performing the inversion of the detected signal: range correction, time averaging, photon-counting channel dead-time correction, overlap correction, Rayleigh fitting and gluing of both channels.
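
    The first of these steps has a closed form: the background-subtracted, range-corrected signal is X(r) = (P(r) − b)·r². A minimal sketch, assuming the last bins of the profile are signal-free and usable for the background estimate b; dead-time, overlap and gluing corrections are omitted.

      import numpy as np

      def range_correct(power, r, bg_bins=100):
          """Background-subtract and range-correct an elastic lidar profile:
          X(r) = (P(r) - b) * r**2, with b estimated from the last bg_bins
          bins, assumed signal-free."""
          power = np.asarray(power, float)
          b = power[-bg_bins:].mean()
          return (power - b) * r ** 2

      # Toy usage: 7.5 m range bins (50 ns sampling), 2000 bins, constant offset.
      r = 7.5 * np.arange(1, 2001)
      profile = 1e3 / r**2 * np.exp(-2e-4 * r) + 0.5
      X = range_correct(profile, r)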

  6. Pre-processing SAR image stream to facilitate compression for transport on bandwidth-limited-link

    SciTech Connect

    Rush, Bobby G.; Riley, Robert

    2015-09-29

    Pre-processing is applied to a raw VideoSAR (or similar near-video rate) product to transform the image frame sequence into a product that resembles more closely the type of product for which conventional video codecs are designed, while sufficiently maintaining utility and visual quality of the product delivered by the codec.

  7. Integrating Microarray Data and GRNs.

    PubMed

    Koumakis, L; Potamias, G; Tsiknakis, M; Zervakis, M; Moustakis, V

    2016-01-01

    With the completion of the Human Genome Project and the emergence of high-throughput technologies, a vast amount of molecular and biological data are being produced. Two of the most important and significant data sources come from microarray gene-expression experiments and respective databanks (e.g., Gene Expression Omnibus-GEO (http://www.ncbi.nlm.nih.gov/geo)), and from molecular pathways and Gene Regulatory Networks (GRNs) stored and curated in public (e.g., Kyoto Encyclopedia of Genes and Genomes-KEGG (http://www.genome.jp/kegg/pathway.html), Reactome (http://www.reactome.org/ReactomeGWT/entrypoint.html)) as well as in commercial repositories (e.g., Ingenuity IPA (http://www.ingenuity.com/products/ipa)). The association of these two sources aims to give new insight into disease understanding and reveal new molecular targets in the treatment of specific phenotypes. Three major research lines and respective efforts that try to utilize and combine data from both of these sources can be identified, namely: (1) de novo reconstruction of GRNs, (2) identification of gene signatures, and (3) identification of differentially expressed GRN functional paths (i.e., sub-GRN paths that distinguish between different phenotypes). In this chapter, we give an overview of the existing methods that support the different types of gene-expression and GRN integration, with a focus on methodologies that aim to identify phenotype-discriminant GRNs or subnetworks, and we also present our methodology. PMID:26134183

  8. DNA microarrays in prostate cancer.

    PubMed

    Ho, Shuk-Mei; Lau, Kin-Mang

    2002-02-01

    DNA microarray technology provides a means to examine large numbers of molecular changes related to a biological process in a high throughput manner. This review discusses plausible utilities of this technology in prostate cancer research, including definition of prostate cancer predisposition, global profiling of gene expression patterns associated with cancer initiation and progression, identification of new diagnostic and prognostic markers, and discovery of novel patient classification schemes. The technology, at present, has only been explored in a limited fashion in prostate cancer research. Some hurdles to be overcome are the high cost of the technology, insufficient sample size and repeated experiments, and the inadequate use of bioinformatics. With the completion of the Human Genome Project and the advance of several highly complementary technologies, such as laser capture microdissection, unbiased RNA amplification, customized functional arrays (e.g., single-nucleotide polymorphism chips), and amenable bioinformatics software, this technology will become widely used by investigators in the field. The large amount of novel, unbiased hypotheses and insights generated by this technology is expected to have a significant impact on the diagnosis, treatment, and prevention of prostate cancer. Finally, this review emphasizes existing, but currently underutilized, data-mining tools, such as multivariate statistical analyses, neural networking, and machine learning techniques, to stimulate wider usage. PMID:12084220

  9. Increasing peptide identifications and decreasing search times for ETD spectra by pre-processing and calculation of parent precursor charge

    PubMed Central

    2012-01-01

    Background Electron Transfer Dissociation [ETD] can dissociate multiply charged precursor polypeptides, providing extensive peptide backbone cleavage. ETD spectra contain charge-reduced precursor peaks, usually of high intensity, whose pattern depends on the parent precursor charge. These charge-reduced precursor peaks and associated neutral-loss peaks should be removed before the spectra are searched for peptide identifications. ETD spectra can also contain ion-types other than c and z˙. Modifying search strategies to accommodate these ion-types may aid in increased peptide identifications. Additionally, if the precursor mass is measured using a lower-resolution instrument such as a linear ion trap, the charge of the precursor is often not known, reducing sensitivity and increasing search times. We implemented algorithms to remove these precursor peaks, to accommodate new ion-types in the noise-filtering routine in OMSSA, and to estimate any unknown precursor charge using Linear Discriminant Analysis [LDA]. Results Spectral pre-processing to remove precursor peaks and their associated neutral losses prior to protein sequence library searches resulted in a 9.8% increase in peptide identifications at a 1% False Discovery Rate [FDR] compared to the previous OMSSA filter. Modifications to the OMSSA noise filter to accommodate various ion-types resulted in a further 4.2% increase in peptide identifications at 1% FDR. Moreover, searching ETD spectra with charge states obtained from the precursor charge determination algorithm is shown to be up to 3.5 times faster than the general range-search method, with a minor 3.8% increase in sensitivity. Conclusion Overall, there is an 18.8% increase in peptide identifications at 1% FDR by incorporating the new precursor filter and noise filter and by using the charge determination algorithm, when compared to previous versions of OMSSA. PMID:22321509
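
    The precursor-filtering step can be stated exactly: for a precursor at m/z p with charge z, the charge-reduced species keep the precursor's total mass but carry fewer charges, so they appear near m/z p·z/(z−k) after transfer of k electrons. A sketch of removal windows derived from that relation; the tolerance is an illustrative setting, not OMSSA's actual parameter.

      def remove_charge_reduced(peaks, precursor_mz, z, tol=2.0):
          """Remove the precursor and its charge-reduced species from an ETD
          spectrum. peaks is a list of (mz, intensity); for precursor m/z p and
          charge z, the species after transfer of k electrons keeps the same
          total mass with charge z - k, i.e. sits near p * z / (z - k). k = 0
          covers the unreacted precursor. tol (in Th) is illustrative, and
          associated neutral losses would widen each window on the low-m/z side."""
          windows = [precursor_mz * z / (z - k) for k in range(z)]
          return [(mz, inten) for mz, inten in peaks
                  if all(abs(mz - w) > tol for w in windows)]

      # Toy usage: a 3+ precursor at m/z 500 yields charge-reduced peaks
      # near m/z 750 (2+) and m/z 1500 (1+).
      spectrum = [(250.1, 5.0), (500.0, 80.0), (750.1, 60.0), (900.3, 7.0),
                  (1499.8, 40.0)]
      print(remove_charge_reduced(spectrum, 500.0, 3))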

  10. Quality Visualization of Microarray Datasets Using Circos

    PubMed Central

    Koch, Martin; Wiese, Michael

    2012-01-01

    Quality control and normalization is considered the most important step in the analysis of microarray data. At present there are various methods available for quality assessments of microarray datasets. However there seems to be no standard visualization routine, which also depicts individual microarray quality. Here we present a convenient method for visualizing the results of standard quality control tests using Circos plots. In these plots various quality measurements are drawn in a circular fashion, thus allowing for visualization of the quality and all outliers of each distinct array within a microarray dataset. The proposed method is intended for use with the Affymetrix Human Genome platform (i.e., GPL 96, GPL570 and GPL571). Circos quality measurement plots are a convenient way for the initial quality estimate of Affymetrix datasets that are stored in publicly available databases.