Science.gov

Sample records for microarray preprocessing algorithms

  1. Micro-Analyzer: automatic preprocessing of Affymetrix microarray data.

    PubMed

    Guzzi, Pietro Hiram; Cannataro, Mario

    2013-08-01

    A current trend in genomics is the investigation of the cell mechanism using different technologies, in order to explain the relationship among genes, molecular processes and diseases. For instance, the combined use of gene-expression arrays and genomic arrays has been demonstrated as an effective instrument in clinical practice. Consequently, in a single experiment different kinds of microarrays may be used, resulting in the production of different types of binary data (images and textual raw data). The analysis of microarray data requires an initial preprocessing phase that makes raw data suitable for use on existing analysis platforms, such as the TIGR M4 (TM4) Suite. An additional challenge for emerging data analysis platforms is the ability to handle these different microarray formats, coupled with clinical data, in a combined way. In fact, the resulting integrated data may include both numerical and symbolic data (e.g. gene expression and SNPs for molecular data) as well as temporal data (e.g. the response to a drug, time to progression and survival rate) for clinical data. Raw data preprocessing is a crucial step in analysis but is often performed in a manual and error-prone way using different software tools. Thus, novel, platform-independent, and possibly open-source tools enabling the semi-automatic preprocessing and annotation of different microarray data are needed. The paper presents Micro-Analyzer (Microarray Analyzer), a cross-platform tool for the automatic normalization, summarization and annotation of Affymetrix gene expression and SNP binary data. It represents the evolution of the μ-CS tool, extending the preprocessing to SNP arrays, which were not supported in μ-CS. Micro-Analyzer is provided as a Java standalone tool and enables users to read, preprocess and analyse binary microarray data (gene expression and SNPs) by invoking the TM4 platform. It avoids: (i) the manual invocation of external tools (e.g. the Affymetrix Power
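
    A minimal sketch of quantile normalization, one standard normalization step that tools of this kind automate, is given below (Python). It is illustrative only, not Micro-Analyzer's actual code; the toy intensity matrix is invented, and ties are broken arbitrarily by the double argsort.

        import numpy as np

        def quantile_normalize(X):
            """Quantile-normalize a (probes x arrays) intensity matrix."""
            ranks = np.argsort(np.argsort(X, axis=0), axis=0)  # per-array ranks
            mean_quantiles = np.sort(X, axis=0).mean(axis=1)   # reference distribution
            return mean_quantiles[ranks]                       # map ranks back to values

        X = np.random.lognormal(size=(1000, 4))                # toy raw intensities
        X_norm = quantile_normalize(X)                         # columns now share a distribution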

  2. Optimisation algorithms for microarray biclustering.

    PubMed

    Perrin, Dimitri; Duhamel, Christophe

    2013-01-01

    In providing simultaneous information on expression profiles for thousands of genes, microarray technologies have, in recent years, been widely used to investigate mechanisms of gene expression. Clustering and classification of such data can, indeed, highlight patterns and provide insight into biological processes. A common approach is to consider genes and samples of microarray datasets as nodes in a bipartite graph, where edges are weighted, e.g. based on the expression levels. In this paper, using a previously-evaluated weighting scheme, we focus on search algorithms and evaluate, in the context of biclustering, several variations of Genetic Algorithms. We also introduce a new heuristic, "Propagate", which consists in recursively evaluating neighbour solutions with one more or one fewer active condition. The results obtained on three well-known datasets show that, for a given weighting scheme, optimal or near-optimal solutions can be identified. PMID:24109756
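
    The abstract suggests a local search over binary condition-activation vectors. The sketch below is one plausible reading of such a heuristic: flip one condition on or off, keep any improving neighbour, and repeat until no flip helps. The score function stands in for the paper's weighting scheme and is an assumption.

        def propagate(active, score):
            """Hill-climb over condition subsets; `active` is a list of bools."""
            best, best_val = list(active), score(active)
            improved = True
            while improved:
                improved = False
                for j in range(len(best)):
                    cand = best.copy()
                    cand[j] = not cand[j]          # one more / one fewer active condition
                    val = score(cand)
                    if val > best_val:
                        best, best_val, improved = cand, val, True
            return best, best_val

        # usage: best, val = propagate([True] * n_conditions, bicluster_weight)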

  3. User-friendly solutions for microarray quality control and pre-processing on ArrayAnalysis.org.

    PubMed

    Eijssen, Lars M T; Jaillard, Magali; Adriaens, Michiel E; Gaj, Stan; de Groot, Philip J; Müller, Michael; Evelo, Chris T

    2013-07-01

    Quality control (QC) is crucial for any scientific method producing data. Applying adequate QC introduces new challenges in the genomics field, where large amounts of data are produced with complex technologies. For DNA microarrays, specific algorithms for QC and pre-processing including normalization have been developed by the scientific community, especially for expression chips of the Affymetrix platform. Many of these have been implemented in the statistical scripting language R and are available from the Bioconductor repository. However, application is hampered by a lack of integrative tools that can be used by users of any experience level. To fill this gap, we developed a freely available tool for QC and pre-processing of Affymetrix gene expression results, extending, integrating and harmonizing functionality of Bioconductor packages. The tool can be easily accessed through a wizard-like web portal at http://www.arrayanalysis.org or downloaded for local use in R. The portal provides extensive documentation, including user guides, interpretation help with real output illustrations and detailed technical documentation. It assists newcomers to the field in performing state-of-the-art QC and pre-processing while offering data analysts an integral open-source package. Providing the scientific community with this easily accessible tool should improve data quality and promote data reuse and the adoption of standards. PMID:23620278

  4. An Efficient and Configurable Preprocessing Algorithm to Improve Stability Analysis.

    PubMed

    Sesia, Ilaria; Cantoni, Elena; Cernigliaro, Alice; Signorile, Giovanna; Fantino, Gianluca; Tavella, Patrizia

    2016-04-01

    The Allan variance (AVAR) is widely used to measure the stability of experimental time series. Specifically, AVAR is commonly used in space applications such as monitoring the clocks of the global navigation satellite systems (GNSSs). In these applications, the experimental data present some peculiar aspects which are not generally encountered when the measurements are carried out in a laboratory. Space clock data can in fact present outliers, jumps, and missing values, which corrupt the clock characterization. Therefore, efficient preprocessing is fundamental to ensure a proper data analysis and improve the stability estimation performed with the AVAR or other similar variances. In this work, we propose a preprocessing algorithm and its implementation in a robust software code (in MATLAB language) able to deal with time series of experimental data affected by nonstationarities and missing data; our method properly detects and removes anomalous behaviors, hence making the subsequent stability analysis more reliable. PMID:26540679
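
    For readers unfamiliar with the statistic, the following is a minimal overlapping Allan variance on phase data that simply skips second differences touching missing samples (NaN). It illustrates the kind of gap tolerance that preprocessing enables; it is not the paper's MATLAB code.

        import numpy as np

        def avar(x, tau0, m):
            """Overlapping Allan variance of phase data x at tau = m * tau0."""
            d2 = x[2*m:] - 2*x[m:-m] + x[:-2*m]    # overlapping second differences
            d2 = d2[~np.isnan(d2)]                 # drop terms hit by data gaps
            return np.sum(d2**2) / (2 * d2.size * (m * tau0)**2)

        # usage: avar(phase_data_with_nans, tau0=1.0, m=10)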

  5. Fully Automated Complementary DNA Microarray Segmentation using a Novel Fuzzy-based Algorithm

    PubMed Central

    Saberkari, Hamidreza; Bahrami, Sheyda; Shamsi, Mousa; Amoshahy, Mohammad Javad; Ghavifekr, Habib Badri; Sedaaghi, Mohammad Hossein

    2015-01-01

    DNA microarray is a powerful approach to study simultaneously the expression of thousands of genes in a single experiment. The average value of the fluorescent intensity can be calculated in a microarray experiment, and the calculated intensity values closely track the expression levels of particular genes. However, determining the appropriate position of every spot in microarray images is a major challenge, one that underpins the accurate classification of normal and abnormal (cancer) cells. In this paper, first a preprocessing approach is performed to eliminate the noise and artifacts present in microarray images using the nonlinear anisotropic diffusion filtering method. Then, the coordinate center of each spot is positioned utilizing mathematical morphology operations. Finally, the position of each spot is exactly determined through applying a novel hybrid model based on principal component analysis and the spatial fuzzy c-means clustering (SFCM) algorithm. Using a Gaussian kernel in the SFCM algorithm improves the quality of complementary DNA microarray segmentation. The performance of the proposed algorithm has been evaluated on real microarray images available in the Stanford Microarray Database. Results illustrate that the accuracy of microarray cell segmentation with the proposed algorithm reaches 100% and 98% for noiseless and noisy cells, respectively. PMID:26284175
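
    As background for the clustering step, here is a compact sketch of standard fuzzy c-means; the paper's SFCM variant adds a spatial term and a Gaussian kernel on top of updates like these, which this sketch does not include.

        import numpy as np

        def fcm(X, c, m=2.0, iters=100):
            """Fuzzy c-means on rows of X; returns centers and memberships."""
            U = np.random.dirichlet(np.ones(c), size=len(X))      # random memberships
            for _ in range(iters):
                Um = U**m
                centers = (Um.T @ X) / Um.sum(axis=0)[:, None]    # weighted centroids
                d = np.linalg.norm(X[:, None] - centers[None], axis=2) + 1e-12
                w = d ** (-2.0 / (m - 1.0))
                U = w / w.sum(axis=1, keepdims=True)              # membership update
            return centers, U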

  6. An improved preprocessing algorithm for haplotype inference by pure parsimony.

    PubMed

    Choi, Mun-Ho; Kang, Seung-Ho; Lim, Hyeong-Seok

    2014-08-01

    The identification of haplotypes, which encode the SNP alleles on a single chromosome, makes it possible to perform a haplotype-based association test with disease. Given a set of genotypes from a population, the process of recovering the haplotypes that explain the genotypes is called haplotype inference (HI). We propose an improved preprocessing method for solving haplotype inference by pure parsimony (HIPP), which excludes a large number of redundant haplotypes by detecting groups of haplotypes that are dispensable for optimal solutions. The method uses only inclusion relations between groups of haplotypes but dramatically reduces the number of candidate haplotypes; it therefore reduces the computational time and memory usage of real HIPP solvers. The proposed method can easily be coupled with a wide range of optimization methods that consider a set of candidate haplotypes explicitly. On simulated and well-known benchmark datasets, the experimental results show that our method coupled with a classical exact HIPP solver runs much faster than the state-of-the-art solver and can solve, in a reasonable time, a large number of instances that were previously unaffordable. PMID:25152045
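
    As a toy illustration of where candidate sets come from (this is not the paper's reduction), the snippet below enumerates the haplotypes compatible with a single genotype, coding each site as 0/1 for homozygous and 2 for heterozygous; HIPP solvers start from such candidates before preprocessing prunes them.

        from itertools import product

        def compatible_haplotypes(genotype):
            """All haplotypes consistent with one genotype (2 = heterozygous)."""
            choices = [(g,) if g in (0, 1) else (0, 1) for g in genotype]
            return list(product(*choices))

        print(compatible_haplotypes((0, 2, 1, 2)))   # four candidate haplotypes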

  7. Genetic Algorithm for Optimization: Preprocessing with n Dimensional Bisection and Error Estimation

    NASA Technical Reports Server (NTRS)

    Sen, S. K.; Shaykhian, Gholam Ali

    2006-01-01

    Knowledge of the appropriate values of the parameters of a genetic algorithm (GA), such as the population size, the shrunken search space containing the solution, and the crossover and mutation probabilities, is not available a priori for a general optimization problem. Recommended here is a polynomial-time preprocessing scheme that includes an n-dimensional bisection and that determines the foregoing parameters before deciding upon an appropriate GA for all problems of similar nature and type. Such preprocessing is not only fast but also enables us to get the global optimal solution and its reasonably narrow error bounds with a high degree of confidence.

  8. A biomimetic algorithm for the improved detection of microarray features

    NASA Astrophysics Data System (ADS)

    Nicolau, Dan V., Jr.; Nicolau, Dan V.; Maini, Philip K.

    2007-02-01

    One of the major difficulties of microarray technology relates to the processing of large and, importantly, error-loaded images of the dots on the chip surface. Whatever the source of these errors, those introduced in the first stage of data acquisition, segmentation, are passed down to the subsequent processes, with deleterious results. As it has been demonstrated recently that biological systems have evolved algorithms that are mathematically efficient, this contribution attempts to test an algorithm that mimics a bacterial "patented" strategy for the search of available space and nutrients to find, zero in on, and eventually delimit the features present on the microarray surface.

  9. Image preprocessing for improving computational efficiency in implementation of restoration and superresolution algorithms.

    PubMed

    Sundareshan, Malur K; Bhattacharjee, Supratik; Inampudi, Radhika; Pang, Ho-Yuen

    2002-12-10

    Computational complexity is a major impediment to the real-time implementation of image restoration and superresolution algorithms in many applications. Although powerful restoration algorithms have been developed within the past few years utilizing sophisticated mathematical machinery (based on statistical optimization and convex set theory), these algorithms are typically iterative in nature and require a sufficient number of iterations to be executed to achieve the desired resolution improvement that may be needed to meaningfully perform postprocessing image exploitation tasks in practice. Additionally, recent technological breakthroughs have facilitated novel sensor designs (focal plane arrays, for instance) that make it possible to capture megapixel imagery data at video frame rates. A major challenge in the processing of these large-format images is to complete the execution of the image processing steps within the frame capture times and to keep up with the output rate of the sensor so that all data captured by the sensor can be efficiently utilized. Consequently, development of novel methods that facilitate real-time implementation of image restoration and superresolution algorithms is of significant practical interest and is the primary focus of this study. The key to designing computationally efficient processing schemes lies in strategically introducing appropriate preprocessing steps together with the superresolution iterations to tailor optimized overall processing sequences for imagery data of specific formats. For substantiating this assertion, three distinct methods for tailoring a preprocessing filter and integrating it with the superresolution processing steps are outlined. These methods consist of a region-of-interest extraction scheme, a background-detail separation procedure, and a scene-derived information extraction step for implementing a set-theoretic restoration of the image that is less demanding in computation compared with the
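
    A toy version of the first idea, region-of-interest extraction, is sketched below: crop the frame to a bright subwindow and run the expensive iterative restoration only there. The threshold and padding values are arbitrary assumptions, not from the paper.

        import numpy as np

        def extract_roi(img, thresh=0.5, pad=8):
            """Bounding box of pixels above a fraction of the peak intensity."""
            ys, xs = np.nonzero(img > thresh * img.max())
            y0, y1 = max(ys.min() - pad, 0), min(ys.max() + pad, img.shape[0])
            x0, x1 = max(xs.min() - pad, 0), min(xs.max() + pad, img.shape[1])
            return img[y0:y1, x0:x1]    # superresolution iterations run on this crop only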

  10. Image preprocessing for improving computational efficiency in implementation of restoration and superresolution algorithms

    NASA Astrophysics Data System (ADS)

    Sundareshan, Malur K.; Bhattacharjee, Supratik; Inampudi, Radhika; Pang, Ho-Yuen

    2002-12-01

    Computational complexity is a major impediment to the real-time implementation of image restoration and superresolution algorithms in many applications. Although powerful restoration algorithms have been developed within the past few years utilizing sophisticated mathematical machinery (based on statistical optimization and convex set theory), these algorithms are typically iterative in nature and require a sufficient number of iterations to be executed to achieve the desired resolution improvement that may be needed to meaningfully perform postprocessing image exploitation tasks in practice. Additionally, recent technological breakthroughs have facilitated novel sensor designs (focal plane arrays, for instance) that make it possible to capture megapixel imagery data at video frame rates. A major challenge in the processing of these large-format images is to complete the execution of the image processing steps within the frame capture times and to keep up with the output rate of the sensor so that all data captured by the sensor can be efficiently utilized. Consequently, development of novel methods that facilitate real-time implementation of image restoration and superresolution algorithms is of significant practical interest and is the primary focus of this study. The key to designing computationally efficient processing schemes lies in strategically introducing appropriate preprocessing steps together with the superresolution iterations to tailor optimized overall processing sequences for imagery data of specific formats. For substantiating this assertion, three distinct methods for tailoring a preprocessing filter and integrating it with the superresolution processing steps are outlined. These methods consist of a region-of-interest extraction scheme, a background-detail separation procedure, and a scene-derived information extraction step for implementing a set-theoretic restoration of the image that is less demanding in computation compared with the

  11. Cancer Classification in Microarray Data using a Hybrid Selective Independent Component Analysis and υ-Support Vector Machine Algorithm.

    PubMed

    Saberkari, Hamidreza; Shamsi, Mousa; Joroughi, Mahsa; Golabi, Faegheh; Sedaaghi, Mohammad Hossein

    2014-10-01

    Microarray data have an important role in the identification and classification of cancer tissues. The scarcity of microarray samples in cancer research is a persistent concern, and it complicates the design of classifiers. For this reason, preprocessing gene selection techniques should be applied before classification to remove the noninformative genes from the microarray data. An appropriate gene selection method can significantly improve the performance of cancer classification. In this paper, we use selective independent component analysis (SICA) for decreasing the dimension of microarray data. Using this selective algorithm, we can solve the instability problem that occurs when conventional independent component analysis (ICA) methods are employed. First, the reconstruction error is analyzed and a selective set of independent components, those contributing little error when reconstructing a new sample, is retained. Then, several modified support vector machine (υ-SVM) sub-classifiers are trained simultaneously. Eventually, the best sub-classifier, with the highest recognition rate, is selected. The proposed algorithm is applied on three cancer datasets (leukemia, breast cancer and lung cancer), and its results are compared with other existing methods. The results illustrate that the proposed algorithm (SICA + υ-SVM) has higher accuracy and validity; for instance, it exhibits a relative improvement of 3.3% in correctness rate over the ICA + SVM and SVM algorithms on the lung cancer dataset. PMID:25426433
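
    A rough scikit-learn analogue of the pipeline (ICA for dimension reduction feeding a ν-SVM) is sketched below; the "selective" component scoring of SICA is not reproduced, and the component count and ν value are placeholders.

        from sklearn.decomposition import FastICA
        from sklearn.pipeline import make_pipeline
        from sklearn.svm import NuSVC

        clf = make_pipeline(FastICA(n_components=10, random_state=0),  # dimension reduction
                            NuSVC(nu=0.3))                             # ν-SVM classifier
        # clf.fit(X_train, y_train); clf.predict(X_test)   # X_train etc. are placeholders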

  12. Syndromic surveillance using veterinary laboratory data: data pre-processing and algorithm performance evaluation

    PubMed Central

    Dórea, Fernanda C.; McEwen, Beverly J.; McNab, W. Bruce; Revie, Crawford W.; Sanchez, Javier

    2013-01-01

    Diagnostic test orders to an animal laboratory were explored as a data source for monitoring trends in the incidence of clinical syndromes in cattle. Four years of real data and over 200 simulated outbreak signals were used to compare pre-processing methods that could remove temporal effects in the data, as well as temporal aberration detection algorithms that provided high sensitivity and specificity. Weekly differencing demonstrated solid performance in removing day-of-week effects, even in series with low daily counts. For aberration detection, the results indicated that no single algorithm showed performance superior to all others across the range of outbreak scenarios simulated. Exponentially weighted moving average charts and Holt–Winters exponential smoothing demonstrated complementary performance, with the latter offering an automated method to adjust to changes in the time series that will likely occur in the future. Shewhart charts provided lower sensitivity but earlier detection in some scenarios. Cumulative sum charts did not appear to add value to the system; however, the poor performance of this algorithm was attributed to characteristics of the data monitored. These findings indicate that automated monitoring aimed at early detection of temporal aberrations will likely be most effective when a range of algorithms are implemented in parallel. PMID:23576782
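
    The two pieces the abstract highlights, seven-day differencing and an EWMA chart, can be sketched as follows; the fixed asymptotic control limit is a simplification of what a production surveillance system would use.

        import numpy as np

        def ewma_alarms(counts, lam=0.3, L=3.0):
            """Flag aberrations in a daily count series after weekly differencing."""
            y = counts[7:] - counts[:-7]                      # removes day-of-week effects
            limit = L * np.std(y) * np.sqrt(lam / (2 - lam))  # asymptotic EWMA limit
            z, alarms = 0.0, []
            for v in y:
                z = lam * v + (1 - lam) * z                   # exponentially weighted mean
                alarms.append(z > limit)
            return np.array(alarms)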

  13. DNA Microarray Data Analysis: A Novel Biclustering Algorithm Approach

    NASA Astrophysics Data System (ADS)

    Tchagang, Alain B.; Tewfik, Ahmed H.

    2006-12-01

    Biclustering algorithms refer to a distinct class of clustering algorithms that perform simultaneous row-column clustering. Biclustering problems arise in DNA microarray data analysis, collaborative filtering, market research, information retrieval, text mining, electoral trends, exchange analysis, and so forth. When dealing with DNA microarray experimental data for example, the goal of biclustering algorithms is to find submatrices, that is, subgroups of genes and subgroups of conditions, where the genes exhibit highly correlated activities for every condition. In this study, we develop novel biclustering algorithms using basic linear algebra and arithmetic tools. The proposed biclustering algorithms can be used to search for all biclusters with constant values, biclusters with constant values on rows, biclusters with constant values on columns, and biclusters with coherent values from a set of data in a timely manner and without solving any optimization problem. We also show how one of the proposed biclustering algorithms can be adapted to identify biclusters with coherent evolution. The algorithms developed in this study discover all valid biclusters of each type, while almost all previous biclustering approaches will miss some.
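
    For the "coherent values" case, the usual additive model is a_ij = mu + alpha_i + beta_j, so the residual after double-centering a candidate submatrix should vanish. The quick check below illustrates that property; it is not the paper's algorithm.

        import numpy as np

        def is_coherent(B, tol=1e-9):
            """True if submatrix B fits the additive bicluster model."""
            resid = B - B.mean(1, keepdims=True) - B.mean(0, keepdims=True) + B.mean()
            return bool(np.abs(resid).max() < tol)

        B = np.add.outer([1., 3., 4.], [0., 2., 5.])   # perfectly coherent toy bicluster
        print(is_coherent(B))                          # True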

  14. Clustering Short Time-Series Microarray

    NASA Astrophysics Data System (ADS)

    Ping, Loh Wei; Hasan, Yahya Abu

    2008-01-01

    Most microarray analyses are carried out on static gene expressions. However, the dynamical study of microarrays has lately gained more attention. Most research on time-series microarrays emphasizes the bioscience and medical aspects, with little attention to the numerical aspect. This study attempts to analyze short time-series microarrays mathematically using the STEM clustering tool, which formally preprocesses the data before clustering. We next introduce the Circular Mould Distance (CMD) algorithm, which combines preprocessing and clustering analysis. The two methods are then compared in terms of efficiency.

  15. Rank-based algorithms for analysis of microarrays

    NASA Astrophysics Data System (ADS)

    Liu, Wei-min; Mei, Rui; Bartell, Daniel M.; Di, Xiaojun; Webster, Teresa A.; Ryder, Tom

    2001-06-01

    Analysis of microarray data often involves extracting information from the raw intensities of probe cells and making certain calls. Rank-based algorithms are powerful tools for providing probability values of hypothesis tests, especially when the distribution of the intensities is unknown. For our current gene expression arrays, a gene is detected by a set of probe pairs consisting of perfect match and mismatch cells. The one-sided upper-tail Wilcoxon signed rank test is used in our algorithms for absolute calls (whether a gene is detected or not), as well as comparative calls (whether a gene's expression increases, decreases, or shows no significant change in one sample compared with another). We also test the possibility of using only perfect match cells to make calls. This paper focuses on absolute calls. We have developed error analysis methods and software tools that allow us to compare the accuracy of the calls in the presence or absence of mismatch cells at different target concentrations. The usage of nonparametric rank-based tests is not limited to absolute and comparative calls on gene expression chips. They can also be applied to other oligonucleotide microarrays for genotyping and mutation detection, as well as to spotted arrays.
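
    A minimal version of a detection ("absolute") call from PM/MM probe pairs using SciPy's one-sided Wilcoxon signed-rank test is shown below; the p-value cutoff and toy intensities are invented, not the published thresholds.

        import numpy as np
        from scipy.stats import wilcoxon

        def detection_call(pm, mm, alpha=0.04):
            """'Present' if PM intensities significantly exceed MM intensities."""
            stat, p = wilcoxon(pm - mm, alternative='greater')  # upper-tail test
            return 'Present' if p < alpha else 'Absent'

        pm = np.array([120., 95., 210., 180., 160., 140.])      # perfect match cells
        mm = np.array([ 80., 90., 100., 120., 110., 100.])      # mismatch cells
        print(detection_call(pm, mm))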

  16. Benchmarking a memetic algorithm for ordering microarray data.

    PubMed

    Moscato, P; Mendes, A; Berretta, R

    2007-03-01

    This work introduces a new algorithm for "gene ordering". Given a matrix of gene expression data values, the task is to find a permutation of the gene names list such that genes with similar expression patterns should be relatively close in the permutation. The algorithm is based on a combined approach that integrates a constructive heuristic with evolutionary and Tabu Search techniques in a single methodology. To evaluate the benefits of this method, we compared our results with the current outputs provided by several widely used algorithms in functional genomics. We also compared the results with our own hierarchical clustering method when used in isolation. We show that the use of images, corrupted with known levels of noise, helps to illustrate some aspects of the performance of the algorithms and provide a complementary benchmark for the analysis. The use of these images, with known high-quality solutions, facilitates in some cases the assessment of the methods and helps the software development, validation and reproducibility of results. We also propose two quantitative measures of performance for gene ordering. Using these measures, we make a comparison with probably the most used algorithm (due to Eisen and collaborators, PNAS 1998) using a microarray dataset available on the public domain (the complete yeast cell cycle dataset). PMID:16870322

  17. Effective preprocessing in #SAT

    NASA Astrophysics Data System (ADS)

    Guo, Qin; Sang, Juan; He, Yong-mei

    2011-12-01

    Preprocessing #SAT instances can reduce their size considerably and decrease the solving time. In this paper we investigate the use of hyper-binary resolution and equality reduction to preprocess #SAT instances, and we present a preprocessing algorithm, Preprocess MC, which combines unit propagation, hyper-binary resolution, and equality reduction. Experiments show that these techniques not only reduce the size of the #SAT formula but also improve the ability of model counters to solve #SAT problems.
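
    Of the three combined techniques, unit propagation is the simplest to sketch. The snippet below operates on a CNF clause list of signed integers; conflict (empty clause) handling is omitted for brevity.

        def unit_propagate(clauses):
            """Repeatedly assign unit literals and simplify the clause list."""
            assignment = {}
            while True:
                units = [c[0] for c in clauses if len(c) == 1]
                if not units:
                    return clauses, assignment
                lit = units[0]
                assignment[abs(lit)] = lit > 0
                clauses = [[l for l in c if l != -lit]        # shrink clauses with -lit
                           for c in clauses if lit not in c]  # drop satisfied clauses

        # unit_propagate([[1], [-1, 2], [-2, 3, 4]]) -> ([[3, 4]], {1: True, 2: True})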

  18. Microarrays

    ERIC Educational Resources Information Center

    Plomin, Robert; Schalkwyk, Leonard C.

    2007-01-01

    Microarrays are revolutionizing genetics by making it possible to genotype hundreds of thousands of DNA markers and to assess the expression (RNA transcripts) of all of the genes in the genome. Microarrays are slides the size of a postage stamp that contain millions of DNA sequences to which single-stranded DNA or RNA can hybridize. This…

  19. Artifact Removal from Biosignal using Fixed Point ICA Algorithm for Pre-processing in Biometric Recognition

    NASA Astrophysics Data System (ADS)

    Mishra, Puneet; Singla, Sunil Kumar

    2013-01-01

    In the modern world of automation, biological signals, especially the Electroencephalogram (EEG) and Electrocardiogram (ECG), are gaining wide attention as a source of biometric information. Earlier studies have shown that EEG and ECG vary between individuals, and every individual has a distinct EEG and ECG spectrum. EEG (which can be recorded from the scalp due to the effect of millions of neurons) may contain noise signals such as eye blinks, eye movement, muscular movement, line noise, etc. Similarly, ECG may contain artifacts such as line noise, tremor artifacts, baseline wandering, etc. These noise signals must be separated from the EEG and ECG signals to obtain accurate results. This paper proposes a technique for the removal of the eye blink artifact from EEG and ECG signals using the fixed-point or FastICA algorithm of Independent Component Analysis (ICA). For validation, the FastICA algorithm has been applied to a synthetic signal prepared by adding random noise to an ECG signal. The FastICA algorithm separates the signal into two independent components, i.e. the pure ECG and the artifact signal. Similarly, the same algorithm has been applied to remove the artifacts (Electrooculogram or eye blink) from the EEG signal.
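
    A sketch of the unmix / zero / remix idea using scikit-learn's FastICA (a fixed-point ICA implementation) follows. Identifying which estimated source is the blink component is assumed to be done separately, e.g. by visual inspection; ICA does not order its components.

        from sklearn.decomposition import FastICA

        def remove_artifact(signals, artifact_idx):
            """signals: (n_samples, n_channels); returns cleaned signals."""
            ica = FastICA(n_components=signals.shape[1], random_state=0)
            S = ica.fit_transform(signals)        # estimated independent sources
            S[:, artifact_idx] = 0.0              # zero out the eye-blink component
            return ica.inverse_transform(S)       # remix back to channel space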

  20. LANDSAT data preprocessing

    NASA Technical Reports Server (NTRS)

    Austin, W. W.

    1983-01-01

    The effect on LANDSAT data of a Sun angle correction, an intersatellite LANDSAT-2 and LANDSAT-3 data range adjustment, and the atmospheric correction algorithm was evaluated. Fourteen 1978 crop year LACIE sites were used as the site data set. The preprocessing techniques were applied to multispectral scanner channel data, and the transformed data were plotted and used to analyze the effectiveness of the preprocessing techniques. Ratio transformations effectively reduce the need for preprocessing techniques to be applied directly to the data. Subtractive transformations are more sensitive to Sun angle and atmospheric corrections than ratios. Preprocessing techniques, other than those applied at the Goddard Space Flight Center, should only be applied as an option of the user. Although performed on LANDSAT data, the study results are also applicable to meteorological satellite data.

  1. Genetic Bee Colony (GBC) algorithm: A new gene selection method for microarray cancer classification.

    PubMed

    Alshamlan, Hala M; Badr, Ghada H; Alohali, Yousef A

    2015-06-01

    Naturally inspired evolutionary algorithms have proven effective for solving feature selection and classification problems. Artificial Bee Colony (ABC) is a relatively new swarm intelligence method. In this paper, we propose a new hybrid gene selection method, namely the Genetic Bee Colony (GBC) algorithm. The proposed algorithm combines a Genetic Algorithm (GA) with the Artificial Bee Colony (ABC) algorithm, with the goal of integrating the advantages of both. The proposed algorithm is applied to microarray gene expression profiles in order to select the most predictive and informative genes for cancer classification. In order to test the accuracy performance of the proposed algorithm, extensive experiments were conducted. Three binary microarray datasets are used: colon, leukemia, and lung. In addition, three multi-class microarray datasets are used: SRBCT, lymphoma, and leukemia. Results of the GBC algorithm are compared with our recently proposed technique, mRMR combined with the Artificial Bee Colony algorithm (mRMR-ABC). We also compared the combination of mRMR with GA (mRMR-GA) and with Particle Swarm Optimization (mRMR-PSO). In addition, we compared the GBC algorithm with other related algorithms recently published in the literature, using all benchmark datasets. The GBC algorithm shows superior performance, achieving the highest classification accuracy along with the lowest average number of selected genes. This proves that the GBC algorithm is a promising approach for solving the gene selection problem in both binary and multi-class cancer classification. PMID:25880524

  2. Crossword: A Fully Automated Algorithm for the Segmentation and Quality Control of Protein Microarray Images

    PubMed Central

    2015-01-01

    Biological assays formatted as microarrays have become a critical tool for the generation of the comprehensive data sets required for systems-level understanding of biological processes. Manual annotation of data extracted from images of microarrays, however, remains a significant bottleneck, particularly for protein microarrays due to the sensitivity of this technology to weak artifact signal. In order to automate the extraction and curation of data from protein microarrays, we describe an algorithm called Crossword that logically combines information from multiple approaches to fully automate microarray segmentation. Automated artifact removal is also accomplished by segregating structured pixels from the background noise using iterative clustering and pixel connectivity. Correlation of the location of structured pixels across image channels is used to identify and remove artifact pixels from the image prior to data extraction. This component improves the accuracy of data sets while reducing the requirement for time-consuming visual inspection of the data. Crossword enables a fully automated protocol that is robust to significant spatial and intensity aberrations. Overall, the average amount of user intervention is reduced by an order of magnitude and the data quality is increased through artifact removal and reduced user variability. The increase in throughput should aid the further implementation of microarray technologies in clinical studies. PMID:24417579

  3. An efficient algorithm for the stochastic simulation of the hybridization of DNA to microarrays

    PubMed Central

    2009-01-01

    Background Although oligonucleotide microarray technology is ubiquitous in genomic research, reproducibility and standardization of expression measurements still concern many researchers. Cross-hybridization between microarray probes and non-target ssDNA has been implicated as a primary factor in sensitivity and selectivity loss. Since hybridization is a chemical process, it may be modeled at a population level using a combination of material balance equations and thermodynamics. However, the hybridization reaction network may be exceptionally large for commercial arrays, which often possess at least one reporter per transcript. Quantification of the kinetics and equilibrium of exceptionally large chemical systems of this type is numerically infeasible with customary approaches. Results In this paper, we present a robust and computationally efficient algorithm for the simulation of hybridization processes underlying microarray assays. Our method may be utilized to identify the extent to which nucleic acid targets (e.g. cDNA) will cross-hybridize with probes, and by extension, characterize probe robustness using the information specified by MAGE-TAB. Using this algorithm, we characterize cross-hybridization in a modified commercial microarray assay. Conclusions By integrating stochastic simulation with thermodynamic prediction tools for DNA hybridization, one may robustly and rapidly characterize the selectivity of a proposed microarray design at the probe and "system" levels. Our code is available at http://www.laurenzi.net. PMID:20003312
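
    A toy Gillespie-style stochastic simulation of a single probe-target binding reaction P + T <-> PT conveys the flavour of the approach; the paper's algorithm handles networks vastly larger than this, and the rate constants below are invented.

        import math, random

        def ssa(P, T, PT, k_on, k_off, t_end):
            """Simulate binding/dissociation counts until time t_end."""
            t = 0.0
            while t < t_end:
                a1, a2 = k_on * P * T, k_off * PT     # reaction propensities
                a0 = a1 + a2
                if a0 == 0:
                    break
                t += -math.log(random.random()) / a0  # exponential waiting time
                if random.random() * a0 < a1:
                    P, T, PT = P - 1, T - 1, PT + 1   # a binding event
                else:
                    P, T, PT = P + 1, T + 1, PT - 1   # a dissociation event
            return PT

        print(ssa(P=500, T=300, PT=0, k_on=1e-4, k_off=1e-2, t_end=100.0))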

  4. MIClique: An algorithm to identify differentially coexpressed disease gene subset from microarray data.

    PubMed

    Zhang, Huanping; Song, Xiaofeng; Wang, Huinan; Zhang, Xiaobai

    2009-01-01

    Computational analysis of microarray data has provided an effective way to identify disease-related genes. Traditional disease gene selection methods for microarray data, such as statistical tests, focus on differentially expressed genes in different samples by prioritizing genes individually. These traditional methods may miss differentially coexpressed (DCE) gene subsets because they ignore the interaction between genes. In this paper, the MIClique algorithm is proposed to identify DCE gene subsets based on mutual information and clique analysis. Mutual information is used to measure the coexpression relationship between each pair of genes in two different kinds of samples. Clique analysis is a commonly used method in biological networks, where cliques generally represent biological modules of similar function. By applying the MIClique algorithm to real gene expression data, some DCE gene subsets which are correlated under one experimental condition but uncorrelated under another are detected from the graphs of the colon and leukemia datasets. PMID:20169000
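
    The pairwise score such an approach builds on can be sketched as a histogram-based mutual information estimate between two discretized expression profiles (an illustration, not the paper's exact estimator; the bin count is arbitrary).

        import numpy as np

        def mutual_info(x, y, bins=8):
            """Mutual information between two expression profiles, in nats."""
            pxy, _, _ = np.histogram2d(x, y, bins=bins)
            pxy /= pxy.sum()                           # joint distribution
            px, py = pxy.sum(axis=1), pxy.sum(axis=0)  # marginals
            nz = pxy > 0
            return float((pxy[nz] * np.log(pxy[nz] / np.outer(px, py)[nz])).sum())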

  5. Novel algorithm for coexpression detection in time-varying microarray data sets.

    PubMed

    Yin, Zong-Xian; Chiang, Jung-Hsien

    2008-01-01

    When analyzing the results of microarray experiments, biologists generally use unsupervised categorization tools. However, such tools regard each time point as an independent dimension and utilize the Euclidean distance to compute the similarities between expressions. Furthermore, some of these methods require the number of clusters to be determined in advance, which is clearly impossible in the case of a new dataset. Therefore, this study proposes a novel scheme, designated as the Variation-based Coexpression Detection (VCD) algorithm, to analyze the trends of expressions based on their variation over time. The proposed algorithm has two advantages. First, it is unnecessary to determine the number of clusters in advance since the algorithm automatically detects those genes whose profiles are grouped together and creates patterns for these groups. Second, the algorithm features a new measurement criterion for calculating the degree of change of the expressions between adjacent time points and evaluating their trend similarities. Three real-world microarray datasets are employed to evaluate the performance of the proposed algorithm. PMID:18245881

  6. Krylov subspace algorithms for computing GeneRank for the analysis of microarray data mining.

    PubMed

    Wu, Gang; Zhang, Ying; Wei, Yimin

    2010-04-01

    GeneRank is a new engine technology for the analysis of microarray experiments. It combines gene expression information with a network structure derived from gene annotations or expression profile correlations. Using matrix decomposition techniques, we first give a matrix analysis of the GeneRank model. We reformulate the GeneRank vector as a linear combination of three parts in the general case when the matrix in question is non-diagonalizable. We then propose two Krylov subspace methods for computing GeneRank. Numerical experiments show that, when the GeneRank problem is very large, the new algorithms are appropriate choices. PMID:20426695
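
    The GeneRank vector r solves the linear system (I - d * W * D^{-1}) r = (1 - d) * ex, where W is the gene network adjacency matrix, D its diagonal degree matrix, ex the expression vector and d a damping factor, so a Krylov solver such as SciPy's GMRES applies directly. The degree guard below is an added assumption for isolated genes.

        import numpy as np
        from scipy.sparse import diags, identity
        from scipy.sparse.linalg import gmres

        def generank(W, ex, d=0.85):
            """Solve the GeneRank system with GMRES; W is a sparse adjacency matrix."""
            deg = np.asarray(W.sum(axis=1)).ravel()
            Dinv = diags(1.0 / np.maximum(deg, 1))     # guard against degree-zero genes
            A = identity(W.shape[0]) - d * (W @ Dinv)
            r, _ = gmres(A, (1 - d) * ex)
            return r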

  7. SPACE: an algorithm to predict and quantify alternatively spliced isoforms using microarrays.

    PubMed

    Anton, Miguel A; Gorostiaga, Dorleta; Guruceaga, Elizabeth; Segura, Victor; Carmona-Saez, Pedro; Pascual-Montano, Alberto; Pio, Ruben; Montuenga, Luis M; Rubio, Angel

    2008-01-01

    Exon and exon+junction microarrays are promising tools for studying alternative splicing. Current analytical tools applied to these arrays lack two relevant features: the ability to predict unknown spliced forms and the ability to quantify the concentration of known and unknown isoforms. SPACE is an algorithm that has been developed to (1) estimate the number of different transcripts expressed under several conditions, (2) predict the precursor mRNA splicing structure and (3) quantify the transcript concentrations including unknown forms. The results presented here show its robustness and accuracy for real and simulated data. PMID:18312629

  8. mRMR-ABC: A Hybrid Gene Selection Algorithm for Cancer Classification Using Microarray Gene Expression Profiling

    PubMed Central

    Alshamlan, Hala; Badr, Ghada; Alohali, Yousef

    2015-01-01

    An artificial bee colony (ABC) is a relatively recent swarm intelligence optimization approach. In this paper, we propose the first attempt at applying the ABC algorithm to the analysis of a microarray gene expression profile. In addition, we propose an innovative feature selection algorithm, minimum redundancy maximum relevance (mRMR), and combine it with an ABC algorithm, mRMR-ABC, to select informative genes from microarray profiles. The new approach is based on a support vector machine (SVM) algorithm to measure the classification accuracy for the selected genes. We evaluate the performance of the proposed mRMR-ABC algorithm by conducting extensive experiments on six binary and multiclass gene expression microarray datasets. Furthermore, we compare our proposed mRMR-ABC algorithm with previously known techniques. We reimplemented two of these techniques for the sake of a fair comparison using the same parameters. These two techniques are mRMR combined with a genetic algorithm (mRMR-GA) and mRMR combined with a particle swarm optimization algorithm (mRMR-PSO). The experimental results show that the proposed mRMR-ABC algorithm achieves accurate classification performance using a small number of predictive genes when tested on both binary and multiclass datasets, compared to previously suggested methods. This shows that mRMR-ABC is a promising approach for solving gene selection and cancer classification problems. PMID:25961028
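
    A compact mRMR-style greedy selection can be sketched with an F-statistic for relevance and absolute correlation for redundancy; this is a common simplification, not the exact scoring used in the paper.

        import numpy as np
        from sklearn.feature_selection import f_classif

        def mrmr(X, y, k):
            """Greedily pick k columns of X maximizing relevance minus redundancy."""
            relevance = f_classif(X, y)[0]               # per-gene F-statistic
            corr = np.abs(np.corrcoef(X, rowvar=False))  # gene-gene redundancy
            selected = [int(np.argmax(relevance))]
            while len(selected) < k:
                score = relevance - corr[:, selected].mean(axis=1)
                score[selected] = -np.inf                # never re-pick a gene
                selected.append(int(np.argmax(score)))
            return selected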

  9. mRMR-ABC: A Hybrid Gene Selection Algorithm for Cancer Classification Using Microarray Gene Expression Profiling.

    PubMed

    Alshamlan, Hala; Badr, Ghada; Alohali, Yousef

    2015-01-01

    An artificial bee colony (ABC) is a relatively recent swarm intelligence optimization approach. In this paper, we propose the first attempt at applying the ABC algorithm to the analysis of a microarray gene expression profile. In addition, we propose an innovative feature selection algorithm, minimum redundancy maximum relevance (mRMR), and combine it with an ABC algorithm, mRMR-ABC, to select informative genes from microarray profiles. The new approach is based on a support vector machine (SVM) algorithm to measure the classification accuracy for the selected genes. We evaluate the performance of the proposed mRMR-ABC algorithm by conducting extensive experiments on six binary and multiclass gene expression microarray datasets. Furthermore, we compare our proposed mRMR-ABC algorithm with previously known techniques. We reimplemented two of these techniques for the sake of a fair comparison using the same parameters. These two techniques are mRMR combined with a genetic algorithm (mRMR-GA) and mRMR combined with a particle swarm optimization algorithm (mRMR-PSO). The experimental results show that the proposed mRMR-ABC algorithm achieves accurate classification performance using a small number of predictive genes when tested on both binary and multiclass datasets, compared to previously suggested methods. This shows that mRMR-ABC is a promising approach for solving gene selection and cancer classification problems. PMID:25961028

  10. Forward-Masked Frequency Selectivity Improvements in Simulated and Actual Cochlear Implant Users Using a Preprocessing Algorithm.

    PubMed

    Langner, Florian; Jürgens, Tim

    2016-01-01

    Frequency selectivity can be quantified using masking paradigms, such as psychophysical tuning curves (PTCs). Normal-hearing (NH) listeners show sharp PTCs that are level- and frequency-dependent, whereas frequency selectivity is strongly reduced in cochlear implant (CI) users. This study aims at (a) assessing individual shapes of PTCs in CI users, (b) comparing these shapes to those of simulated CI listeners (NH listeners hearing through a CI simulation), and (c) increasing the sharpness of PTCs using a biologically inspired dynamic compression algorithm, BioAid, which has been shown to sharpen the PTC shape in hearing-impaired listeners. A three-alternative-forced-choice forward-masking technique was used to assess PTCs in 8 CI users (with their own speech processor) and 11 NH listeners (with and without listening through a vocoder to simulate electric hearing). CI users showed flat PTCs with large interindividual variability in shape, whereas simulated CI listeners had PTCs of the same average flatness, but more homogeneous shapes across listeners. The algorithm BioAid was used to process the stimuli before entering the CI users' speech processor or the vocoder simulation. This algorithm was able to partially restore frequency selectivity in both groups, particularly in seven out of eight CI users, meaning significantly sharper PTCs than in the unprocessed condition. The results indicate that algorithms can improve the large-scale sharpness of frequency selectivity in some CI users. This finding may be useful for the design of sound coding strategies particularly for situations in which high frequency selectivity is desired, such as for music perception. PMID:27604785

  11. Classifier dependent feature preprocessing methods

    NASA Astrophysics Data System (ADS)

    Rodriguez, Benjamin M., II; Peterson, Gilbert L.

    2008-04-01

    In mobile applications, computational complexity is an issue that limits sophisticated algorithms from being implemented on these devices. This paper provides an initial solution to applying pattern recognition systems on mobile devices by combining existing preprocessing algorithms for recognition. In pattern recognition systems, it is essential to properly apply feature preprocessing tools prior to training classification models in order to reduce computational complexity and improve the overall classification accuracy. The feature preprocessing tools extended for the mobile environment are feature ranking, feature extraction, data preparation and outlier removal. Most desktop systems today are capable of running the majority of the available classification algorithms without concern for processing time, but the same is not true of mobile platforms. As an application of pattern recognition for mobile devices, the recognition system targets the problem of steganalysis, determining whether an image contains hidden information. The measure of performance shows that feature preprocessing increases the overall steganalysis classification accuracy by an average of 22%. The methods in this paper are tested on a workstation and a Nokia 6620 (Symbian operating system) camera phone with similar results.

  12. Evaluation of multivariate calibration models with different pre-processing and processing algorithms for a novel resolution and quantitation of spectrally overlapped quaternary mixture in syrup

    NASA Astrophysics Data System (ADS)

    Moustafa, Azza A.; Hegazy, Maha A.; Mohamed, Dalia; Ali, Omnia

    2016-02-01

    A novel approach for the resolution and quantitation of a severely overlapped quaternary mixture of carbinoxamine maleate (CAR), pholcodine (PHL), ephedrine hydrochloride (EPH) and sunset yellow (SUN) in syrup was demonstrated utilizing different spectrophotometric multivariate calibration methods. The applied methods use different processing and pre-processing algorithms. The proposed methods were partial least squares (PLS), concentration residuals augmented classical least squares (CRACLS), and a novel method: continuous wavelet transforms coupled with partial least squares (CWT-PLS). These methods were applied to a training set in the concentration ranges of 40-100 μg/mL, 40-160 μg/mL, 100-500 μg/mL and 8-24 μg/mL for the four components, respectively. The methods require no preliminary separation step or chemical pretreatment. The validity of the methods was evaluated with an external validation set. The selectivity of the developed methods was demonstrated by analyzing the drugs in their combined pharmaceutical formulation without any interference from additives. The obtained results were statistically compared with the official and reported methods, and no significant difference was observed regarding either accuracy or precision.
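
    The PLS step can be illustrated with scikit-learn on synthetic Beer's-law mixture spectra (all data below are invented); the wavelet pre-processing of CWT-PLS would transform the spectra beforehand and is omitted here.

        import numpy as np
        from sklearn.cross_decomposition import PLSRegression

        rng = np.random.default_rng(0)
        C = rng.uniform(0.0, 1.0, size=(25, 4))          # concentrations: CAR, PHL, EPH, SUN
        S = rng.normal(size=(4, 200))                    # pure-component "spectra"
        X = C @ S + 0.01 * rng.normal(size=(25, 200))    # additive mixture spectra

        pls = PLSRegression(n_components=4).fit(X, C)    # multivariate calibration
        print(pls.predict(X[:1]))                        # predicted concentrations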

  13. Evaluation of multivariate calibration models with different pre-processing and processing algorithms for a novel resolution and quantitation of spectrally overlapped quaternary mixture in syrup.

    PubMed

    Moustafa, Azza A; Hegazy, Maha A; Mohamed, Dalia; Ali, Omnia

    2016-02-01

    A novel approach for the resolution and quantitation of a severely overlapped quaternary mixture of carbinoxamine maleate (CAR), pholcodine (PHL), ephedrine hydrochloride (EPH) and sunset yellow (SUN) in syrup was demonstrated utilizing different spectrophotometric multivariate calibration methods. The applied methods use different processing and pre-processing algorithms. The proposed methods were partial least squares (PLS), concentration residuals augmented classical least squares (CRACLS), and a novel method: continuous wavelet transforms coupled with partial least squares (CWT-PLS). These methods were applied to a training set in the concentration ranges of 40-100 μg/mL, 40-160 μg/mL, 100-500 μg/mL and 8-24 μg/mL for the four components, respectively. The methods require no preliminary separation step or chemical pretreatment. The validity of the methods was evaluated with an external validation set. The selectivity of the developed methods was demonstrated by analyzing the drugs in their combined pharmaceutical formulation without any interference from additives. The obtained results were statistically compared with the official and reported methods, and no significant difference was observed regarding either accuracy or precision. PMID:26519913

  14. A novel biclustering algorithm of binary microarray data: BiBinCons and BiBinAlter.

    PubMed

    Saber, Haifa Ben; Elloumi, Mourad

    2015-01-01

    The biclustering of microarray data has been the subject of extensive research. None of the existing biclustering algorithms is perfect, and the construction of biologically significant groups of biclusters for large microarray data is still a problem that requires continuous work. Biological validation of biclusters of microarray data is one of the most important open issues; so far, there are no general guidelines in the literature on how to biologically validate extracted biclusters. In this paper, we develop two biclustering algorithms for binary microarray data, adopting the Iterative Row and Column Clustering Combination (IRCCC) approach, called BiBinCons and BiBinAlter. The BiBinAlter algorithm is an improvement of BiBinCons, differing by the use of the EvalStab and IndHomog evaluation functions in addition to the CroBin one (Bioinformatics 20:1993-2003, 2004). BiBinAlter can extract biclusters of good quality with better p-values. PMID:26628919

  15. Application of genetic algorithms and constructive neural networks for the analysis of microarray cancer data

    PubMed Central

    2014-01-01

    Background Extracting relevant information from microarray data is a very complex task due to the characteristics of the data sets, which comprise a large number of features while few samples are generally available. In this sense, feature selection is a very important aspect of the analysis, helping in the tasks of identifying relevant genes and also maximizing predictive information. Methods Due to its simplicity and speed, Stepwise Forward Selection (SFS) is a widely used feature selection technique. In this work, we carry out a comparative study of SFS and Genetic Algorithms (GA) as general frameworks for the analysis of microarray data, with the aim of identifying groups of genes with high predictive capability and biological relevance. Six standard and machine learning-based techniques (Linear Discriminant Analysis (LDA), Support Vector Machines (SVM), Naive Bayes (NB), the C-MANTEC Constructive Neural Network, K-Nearest Neighbors (kNN) and the Multilayer Perceptron (MLP)) are used within both frameworks on six freely available public datasets for the task of predicting cancer outcome. Results Better cancer outcome prediction was obtained using the GA framework, noting that this approach, in comparison to the SFS one, leads to a larger selection set and uses a larger number of comparisons between genetic profiles, and thus is computationally more intensive. The GA framework also permitted obtaining a set of genes that can be considered more biologically relevant. Regarding the different classifiers used, standard feedforward neural networks (MLP), LDA and SVM led to similar, best results, while C-MANTEC and kNN followed closely but with lower accuracy. Further, C-MANTEC, MLP and LDA permitted obtaining a more limited set of genes in comparison to SVM, NB and kNN, and in particular C-MANTEC resulted in the most robust classifier in terms of changes in the parameter settings. Conclusions This study shows that if prediction accuracy is the objective, the GA

  16. K-Boost: a scalable algorithm for high-quality clustering of microarray gene expression data.

    PubMed

    Geraci, Filippo; Leoncini, Mauro; Montangero, Manuela; Pellegrini, Marco; Renda, M Elena

    2009-06-01

    Microarray technology for profiling gene expression levels is a popular tool in modern biological research. Applications range from tissue classification to the detection of metabolic networks, from drug discovery to time-critical personalized medicine. Given the increase in size and complexity of the data sets produced, their analysis is becoming problematic in terms of time/quality trade-offs. Clustering genes with similar expression profiles is a key initial step for subsequent manipulations and the increasing volumes of data to be analyzed requires methods that are at the same time efficient (completing an analysis in minutes rather than hours) and effective (identifying significant clusters with high biological correlations). In this paper, we propose K-Boost, a clustering algorithm based on a combination of the furthest-point-first (FPF) heuristic for solving the metric k-center problem, a stability-based method for determining the number of clusters, and a k-means-like cluster refinement. K-Boost runs in O(|N| × k) time, where N is the input matrix and k is the number of proposed clusters. Experiments show that this low complexity is usually coupled with a very good quality of the computed clusterings, which we measure using both internal and external criteria. Supporting data can be found as online Supplementary Material at www.liebertonline.com. PMID:19522668
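
    At the core of K-Boost is the furthest-point-first (FPF) heuristic, a classical 2-approximation for the metric k-center problem: greedily pick as the next center the point farthest from the centers chosen so far. A minimal sketch (the stability-based choice of k and the k-means-like refinement are not shown):

        import numpy as np

        def fpf(X, k):
            """Return indices of k centers chosen furthest-point-first."""
            centers = [0]                            # arbitrary first center
            d = np.linalg.norm(X - X[0], axis=1)     # distance to nearest center
            for _ in range(k - 1):
                nxt = int(np.argmax(d))              # farthest remaining point
                centers.append(nxt)
                d = np.minimum(d, np.linalg.norm(X - X[nxt], axis=1))
            return centers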

  17. GTI: A Novel Algorithm for Identifying Outlier Gene Expression Profiles from Integrated Microarray Datasets

    PubMed Central

    Mpindi, John Patrick; Sara, Henri; Haapa-Paananen, Saija; Kilpinen, Sami; Pisto, Tommi; Bucher, Elmar; Ojala, Kalle; Iljin, Kristiina; Vainio, Paula; Björkman, Mari; Gupta, Santosh; Kohonen, Pekka; Nees, Matthias; Kallioniemi, Olli

    2011-01-01

    Background Meta-analysis of gene expression microarray datasets presents significant challenges for statistical analysis. We developed and validated a new bioinformatic method for the identification of genes upregulated in subsets of samples of a given tumour type (‘outlier genes’), a hallmark of potential oncogenes. Methodology A new statistical method (the gene tissue index, GTI) was developed by modifying and adapting algorithms originally developed for statistical problems in economics. We compared the potential of the GTI to detect outlier genes in meta-datasets with four previously defined statistical methods, COPA, the OS statistic, the t-test and ORT, using simulated data. We demonstrated that the GTI performed equally well to existing methods in a single study simulation. Next, we evaluated the performance of the GTI in the analysis of combined Affymetrix gene expression data from several published studies covering 392 normal samples of tissue from the central nervous system, 74 astrocytomas, and 353 glioblastomas. According to the results, the GTI was better able than most of the previous methods to identify known oncogenic outlier genes. In addition, the GTI identified 29 novel outlier genes in glioblastomas, including TYMS and CDKN2A. The over-expression of these genes was validated in vivo by immunohistochemical staining data from clinical glioblastoma samples. Immunohistochemical data were available for 65% (19 of 29) of these genes, and 17 of these 19 genes (90%) showed a typical outlier staining pattern. Furthermore, raltitrexed, a specific inhibitor of TYMS used in the therapy of tumour types other than glioblastoma, also effectively blocked cell proliferation in glioblastoma cell lines, thus highlighting this outlier gene candidate as a potential therapeutic target. Conclusions/Significance Taken together, these results support the GTI as a novel approach to identify potential oncogene outliers and drug targets. The algorithm is implemented in

  18. The LeFE algorithm: embracing the complexity of gene expression in the interpretation of microarray data.

    PubMed

    Eichler, Gabriel S; Reimers, Mark; Kane, David; Weinstein, John N

    2007-01-01

    Interpretation of microarray data remains a challenge, and most methods fail to consider the complex, nonlinear regulation of gene expression. To address that limitation, we introduce Learner of Functional Enrichment (LeFE), a statistical/machine learning algorithm based on Random Forest, and demonstrate it on several diverse datasets: smoker/never smoker, breast cancer classification, and cancer drug sensitivity. We also compare it with previously published algorithms, including Gene Set Enrichment Analysis. LeFE regularly identifies statistically significant functional themes consistent with known biology. PMID:17845722

  19. Neighborhood inverse consistency preprocessing

    SciTech Connect

    Freuder, E.C.; Elfe, C.D.

    1996-12-31

    Constraint satisfaction consistency preprocessing methods are used to reduce search effort. Time and especially space costs limit the amount of preprocessing that will be cost effective. A new form of consistency preprocessing, neighborhood inverse consistency, can achieve more problem pruning than the usual arc consistency preprocessing in a cost effective manner. There are two basic ideas: (1) Common forms of consistency enforcement basically operate by identifying and remembering solutions to subproblems for which a consistent value cannot be found for some additional problem variable. The space required for this memory can quickly become prohibitive. Inverse consistency basically operates by removing values for variables that are not consistent with any solution to some subproblem involving additional variables. The space requirement is at worst linear. (2) Typically consistency preprocessing achieves some level of consistency uniformly throughout the problem. A subproblem solution will be tested against each additional variable that constrains any subproblem variable. Neighborhood consistency focuses attention on the subproblem formed by the variables that are all constrained by the value in question. By targeting highly relevant subproblems we hope to "skim the cream", obtaining a high payoff for a limited cost.

  20. Exploring the feasibility of next-generation sequencing and microarray data meta-analysis

    PubMed Central

    Wu, Po-Yen; Phan, John H.; Wang, May D.

    2016-01-01

    Emerging next-generation sequencing (NGS) technology potentially resolves many issues that prevent widespread clinical use of gene expression microarrays. However, the number of publicly available NGS datasets is still smaller than that of microarrays. This paper explores the possibilities for combining information from both microarray and NGS gene expression datasets for the discovery of differentially expressed genes (DEGs). We evaluate several existing methods in detecting DEGs using individual datasets as well as combined NGS and microarray datasets. Results indicate that analysis of combined NGS and microarray data is feasible, but successful detection of DEGs may depend on careful selection of algorithms as well as on data normalization and pre-processing. PMID:22256102

  1. An MCMC Algorithm for Target Estimation in Real-Time DNA Microarrays

    NASA Astrophysics Data System (ADS)

    Vikalo, Haris; Gokdemir, Mahsuni

    2010-12-01

    DNA microarrays detect the presence and quantify the amounts of nucleic acid molecules of interest. They rely on a chemical attraction between the target molecules and their Watson-Crick complements, which serve as biological sensing elements (probes). The attraction between these biomolecules leads to binding, in which probes capture target analytes. Recently developed real-time DNA microarrays are capable of observing kinetics of the binding process. They collect noisy measurements of the amount of captured molecules at discrete points in time. Molecular binding is a random process which, in this paper, is modeled by a stochastic differential equation. The target analyte quantification is posed as a parameter estimation problem, and solved using a Markov Chain Monte Carlo technique. In simulation studies where we test the robustness with respect to the measurement noise, the proposed technique significantly outperforms previously proposed methods. Moreover, the proposed approach is tested and verified on experimental data.
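
    The paper's stochastic-differential-equation model is not reproduced here; the sketch below substitutes a deterministic Langmuir-type binding curve with Gaussian measurement noise and estimates its parameters with a random-walk Metropolis-Hastings sampler, illustrating the MCMC estimation step. All numerical settings are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.linspace(0.5, 30, 40)                  # discrete measurement times
k_true, xmax_true, sigma = 0.15, 1.0, 0.02    # sigma: assumed known noise level
y = xmax_true * (1 - np.exp(-k_true * t)) + rng.normal(0, sigma, t.size)

def log_post(k, xmax):
    """Gaussian log-likelihood with flat priors on the positive axis."""
    if k <= 0 or xmax <= 0:
        return -np.inf
    resid = y - xmax * (1 - np.exp(-k * t))
    return -0.5 * np.sum(resid ** 2) / sigma ** 2

theta = np.array([0.5, 0.5])                  # initial (k, xmax)
lp = log_post(*theta)
samples = []
for it in range(20000):
    prop = theta + rng.normal(0, 0.02, 2)     # random-walk proposal
    lp_prop = log_post(*prop)
    if np.log(rng.uniform()) < lp_prop - lp:  # Metropolis acceptance
        theta, lp = prop, lp_prop
    if it > 5000:                             # discard burn-in
        samples.append(theta.copy())
samples = np.array(samples)
print("posterior mean k=%.3f, xmax=%.3f" % tuple(samples.mean(axis=0)))
```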

  2. Biomarker Discovery Based on Hybrid Optimization Algorithm and Artificial Neural Networks on Microarray Data for Cancer Classification.

    PubMed

    Moteghaed, Niloofar Yousefi; Maghooli, Keivan; Pirhadi, Shiva; Garshasbi, Masoud

    2015-01-01

    The improvement of high-throughput gene-profiling microarray technology has enabled the monitoring of the expression values of thousands of genes simultaneously. Detailed examination of changes in the expression levels of genes can help physicians to diagnose efficiently, classify tumors and cancer types, and select effective treatments. Finding genes that can correctly classify groups of cancers using hybrid optimization algorithms is the main purpose of this paper. Here, a hybrid particle swarm optimization and genetic algorithm method is used for gene selection, and an artificial neural network (ANN) is adopted as the classifier. In this work, we have improved the ability of the algorithm to solve the classification problem by finding a small group of biomarkers as well as the best parameters for the classifier. The proposed approach is tested on three benchmark gene expression datasets: blood (acute myeloid leukemia, acute lymphoblastic leukemia), colon, and breast. We used 10-fold cross-validation to estimate accuracy, and a decision tree algorithm to find the relations between the biomarkers from a biological point of view. To test the ability of the trained ANN models to categorize the cancers, we analyzed additional blinded samples that were not previously used in the training procedure. Experimental results show that the proposed method can reduce the dimension of the dataset, confirm the most informative gene subset, and improve classification accuracy with the best parameters for each dataset. PMID:26120567
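
    A compact sketch of the selection loop follows, using a plain genetic algorithm over boolean gene masks with a small scikit-learn neural network as the fitness classifier. The PSO velocity update of the hybrid method is omitted, and the synthetic data and all hyperparameters are illustrative assumptions:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(2)
X, y = make_classification(n_samples=120, n_features=200, n_informative=8,
                           random_state=0)

def fitness(mask):
    """Cross-validated ANN accuracy, lightly penalized by panel size."""
    if mask.sum() == 0:
        return 0.0
    clf = MLPClassifier(hidden_layer_sizes=(8,), max_iter=500, random_state=0)
    acc = cross_val_score(clf, X[:, mask], y, cv=3).mean()
    return acc - 0.002 * mask.sum()          # favour small biomarker panels

pop = rng.random((20, X.shape[1])) < 0.05    # initial sparse gene masks
for gen in range(15):
    scores = np.array([fitness(m) for m in pop])
    pop = pop[np.argsort(scores)[::-1]][:10]  # selection: keep the best half
    children = []
    for _ in range(10):
        pa, pb = pop[rng.integers(10, size=2)]
        cut = rng.integers(X.shape[1])
        child = np.concatenate([pa[:cut], pb[cut:]])  # one-point crossover
        flip = rng.random(X.shape[1]) < 0.01          # mutation
        children.append(child ^ flip)
    pop = np.vstack([pop, children])
best = pop[np.argmax([fitness(m) for m in pop])]
print("selected genes:", np.flatnonzero(best))
```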

  3. Biomarker Discovery Based on Hybrid Optimization Algorithm and Artificial Neural Networks on Microarray Data for Cancer Classification

    PubMed Central

    Moteghaed, Niloofar Yousefi; Maghooli, Keivan; Pirhadi, Shiva; Garshasbi, Masoud

    2015-01-01

    The improvement of high-throughput gene-profiling microarray technology has enabled the monitoring of the expression values of thousands of genes simultaneously. Detailed examination of changes in the expression levels of genes can help physicians to diagnose efficiently, classify tumors and cancer types, and select effective treatments. Finding genes that can correctly classify groups of cancers using hybrid optimization algorithms is the main purpose of this paper. Here, a hybrid particle swarm optimization and genetic algorithm method is used for gene selection, and an artificial neural network (ANN) is adopted as the classifier. In this work, we have improved the ability of the algorithm to solve the classification problem by finding a small group of biomarkers as well as the best parameters for the classifier. The proposed approach is tested on three benchmark gene expression datasets: blood (acute myeloid leukemia, acute lymphoblastic leukemia), colon, and breast. We used 10-fold cross-validation to estimate accuracy, and a decision tree algorithm to find the relations between the biomarkers from a biological point of view. To test the ability of the trained ANN models to categorize the cancers, we analyzed additional blinded samples that were not previously used in the training procedure. Experimental results show that the proposed method can reduce the dimension of the dataset, confirm the most informative gene subset, and improve classification accuracy with the best parameters for each dataset. PMID:26120567

  4. The preprocessed doacross loop

    NASA Technical Reports Server (NTRS)

    Saltz, Joel H.; Mirchandaney, Ravi

    1990-01-01

    Dependencies between loop iterations cannot always be characterized during program compilation. Doacross loops typically make use of a priori knowledge of inter-iteration dependencies to carry out the required synchronizations. A type of doacross loop is proposed that allows the iterations of a loop to be scheduled among processors without advance knowledge of inter-iteration dependencies. The proposed method requires that parallelizable preprocessing and postprocessing steps be carried out during program execution.

  5. Classification of Non-Small Cell Lung Cancer Using Significance Analysis of Microarray-Gene Set Reduction Algorithm

    PubMed Central

    Zhang, Lei; Wang, Linlin; Du, Bochuan; Wang, Tianjiao; Tian, Pu

    2016-01-01

    Among non-small cell lung cancers (NSCLC), adenocarcinoma (AC) and squamous cell carcinoma (SCC) are the two major histology subtypes, accounting for roughly 40% and 30% of all lung cancer cases, respectively. Since AC and SCC differ in their cell of origin, location within the lung, and growth pattern, they are considered distinct diseases. Gene expression signatures have been demonstrated to be an effective tool for distinguishing AC and SCC. Gene set analysis is generally regarded as irrelevant to the identification of gene expression signatures. Nevertheless, we found that one specific gene set analysis method, significance analysis of microarray-gene set reduction (SAMGSR), can be adopted directly to select relevant features and construct gene expression signatures. In this study, we applied SAMGSR to an NSCLC gene expression dataset. When compared with several novel feature selection algorithms, for example LASSO, SAMGSR has equivalent or better performance in terms of predictive ability and model parsimony. Therefore, SAMGSR is indeed a feature selection algorithm. Additionally, we applied SAMGSR to the AC and SCC subtypes separately to discriminate their respective stages, that is, stage II versus stage I. The small overlap between the two resulting gene signatures illustrates that AC and SCC are indeed distinct diseases. Therefore, stratified analyses on subtypes are recommended when diagnostic or prognostic signatures of these two NSCLC subtypes are constructed. PMID:27446945

  6. Preoperative overnight parenteral nutrition (TPN) improves skeletal muscle protein metabolism indicated by microarray algorithm analyses in a randomized trial.

    PubMed

    Iresjö, Britt-Marie; Engström, Cecilia; Lundholm, Kent

    2016-06-01

    Loss of muscle mass is associated with increased risk of morbidity and mortality in hospitalized patients. Uncertainties about the efficiency of treatment by short-term artificial nutrition remain, specifically regarding improvement of protein balance in skeletal muscles. In this study, algorithmic microarray analysis was applied to map cellular changes related to muscle protein metabolism in human skeletal muscle tissue during provision of overnight preoperative total parenteral nutrition (TPN). Twenty-two patients (11 per group) scheduled for upper GI surgery due to malignant or benign disease received a continuous peripheral all-in-one TPN infusion (30 kcal/kg/day, 0.16 gN/kg/day) or saline infusion for 12 h prior to the operation. Biopsies from the rectus abdominis muscle were taken at the start of the operation for isolation of muscle RNA. RNA expression microarray analyses were performed with Agilent Sureprint G3, 8 × 60K arrays using one-color labeling. 447 mRNAs were differentially expressed between study and control patients (P < 0.1). mRNAs related to ribosomal biogenesis, mRNA processing, and translation were upregulated during overnight nutrition, particularly the anabolic signaling factor S6K1 (P < 0.01-0.1). Transcripts of genes associated with lysosomal degradation showed consistently lower expression during TPN, while mRNAs for ubiquitin-mediated degradation of proteins, as well as transcripts related to the intracellular signaling pathways PI3 kinase/MAP kinase, were either increased or decreased. In conclusion, muscle mRNA alterations during overnight standard TPN infusions at a constant rate altered mRNAs associated with mTOR signaling, increased initiation of protein translation, and suppressed autophagy/lysosomal degradation of proteins. This indicates that overnight preoperative parenteral nutrition is effective in promoting muscle protein metabolism. PMID:27273879

  7. Comparing Binaural Pre-processing Strategies III

    PubMed Central

    Warzybok, Anna; Ernst, Stephan M. A.

    2015-01-01

    A comprehensive evaluation of eight signal pre-processing strategies, including directional microphones, coherence filters, single-channel noise reduction, binaural beamformers, and their combinations, was undertaken with normal-hearing (NH) and hearing-impaired (HI) listeners. Speech reception thresholds (SRTs) were measured in three noise scenarios (multitalker babble, cafeteria noise, and single competing talker). Predictions of three common instrumental measures were compared with the general perceptual benefit caused by the algorithms. The individual SRTs measured without pre-processing and individual benefits were objectively estimated using the binaural speech intelligibility model. Ten listeners with NH and 12 HI listeners participated. The participants varied in age and pure-tone threshold levels. Although HI listeners required a better signal-to-noise ratio to obtain 50% intelligibility than listeners with NH, no differences in SRT benefit from the different algorithms were found between the two groups. With the exception of single-channel noise reduction, all algorithms showed an improvement in SRT of between 2.1 dB (in cafeteria noise) and 4.8 dB (in single competing talker condition). Model predictions with binaural speech intelligibility model explained 83% of the measured variance of the individual SRTs in the no pre-processing condition. Regarding the benefit from the algorithms, the instrumental measures were not able to predict the perceptual data in all tested noise conditions. The comparable benefit observed for both groups suggests a possible application of noise reduction schemes for listeners with different hearing status. Although the model can predict the individual SRTs without pre-processing, further development is necessary to predict the benefits obtained from the algorithms at an individual level. PMID:26721922

  8. Retinex Preprocessing for Improved Multi-Spectral Image Classification

    NASA Technical Reports Server (NTRS)

    Thompson, B.; Rahman, Z.; Park, S.

    2000-01-01

    The goal of multi-image classification is to identify and label "similar regions" within a scene. The ability to correctly classify a remotely sensed multi-image of a scene is affected by the ability of the classification process to adequately compensate for the effects of atmospheric variations and sensor anomalies. Better classification may be obtained if the multi-image is preprocessed before classification, so as to reduce the adverse effects of image formation. In this paper, we discuss the overall impact on multi-spectral image classification when the retinex image enhancement algorithm is used to preprocess multi-spectral images. The retinex is a multi-purpose image enhancement algorithm that performs dynamic range compression, reduces the dependence on lighting conditions, and generally enhances apparent spatial resolution. The retinex has been successfully applied to the enhancement of many different types of grayscale and color images. We show in this paper that retinex preprocessing improves the spatial structure of multi-spectral images and thus provides better within-class variations than would otherwise be obtained without the preprocessing. For a series of multi-spectral images obtained with diffuse and direct lighting, we show that without retinex preprocessing the class spectral signatures vary substantially with the lighting conditions. Whereas multi-dimensional clustering without preprocessing produced one-class homogeneous regions, the classification on the preprocessed images produced multi-class non-homogeneous regions. This lack of homogeneity is explained by the interaction between different agronomic treatments applied to the regions: the preprocessed images are closer to ground truth. The principle advantage that the retinex offers is that for different lighting conditions classifications derived from the retinex preprocessed images look remarkably "similar", and thus more consistent, whereas classifications derived from the original
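
    The retinex variant cited in the paper is multi-purpose and typically multiscale; the single-scale form below captures the core operation (log of the image minus log of its Gaussian-blurred surround), applied band by band before classification. Parameter values are illustrative assumptions:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def single_scale_retinex(band, sigma=30.0):
    """log(image) - log(surround): compresses dynamic range and reduces
    the dependence of each band on the illumination field."""
    band = band.astype(float) + 1.0          # avoid log(0)
    return np.log(band) - np.log(gaussian_filter(band, sigma) + 1.0)

def retinex_preprocess(cube, sigma=30.0):
    """Apply retinex to each spectral band of an (H, W, B) multi-spectral cube,
    then rescale each band to [0, 1] before feeding a classifier."""
    out = np.stack([single_scale_retinex(cube[..., b], sigma)
                    for b in range(cube.shape[-1])], axis=-1)
    mn = out.min(axis=(0, 1))
    mx = out.max(axis=(0, 1))
    return (out - mn) / (mx - mn + 1e-12)
```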

  9. Microarrays: an overview.

    PubMed

    Lee, Norman H; Saeed, Alexander I

    2007-01-01

    Gene expression microarrays are being used widely to address a myriad of complex biological questions. To gather meaningful expression data, it is crucial to have a firm understanding of the steps involved in the application of microarrays. The available microarray platforms are discussed along with their advantages and disadvantages. Additional considerations include study design, quality control and systematic assessment of microarray performance, RNA-labeling strategies, sample allocation, signal amplification schemes, defining the number of appropriate biological replicates, data normalization, statistical approaches to identify differentially regulated genes, and clustering algorithms for data visualization. In this chapter, the underlying principles regarding microarrays are reviewed, to serve as a guide when navigating through this powerful technology. PMID:17332646

  10. Context-based preprocessing of molecular docking data

    PubMed Central

    2013-01-01

    Background Data preprocessing is a major step in data mining. In data preprocessing, several known techniques can be applied, or new ones developed, to improve data quality such that the mining results become more accurate and intelligible. Bioinformatics is one area with a high demand for generation of comprehensive models from large datasets. In this article, we propose a context-based data preprocessing approach to mine data from molecular docking simulation results. The test cases used a fully-flexible receptor (FFR) model of Mycobacterium tuberculosis InhA enzyme (FFR_InhA) and four different ligands. Results We generated an initial set of attributes as well as their respective instances. To improve this initial set, we applied two selection strategies. The first was based on our context-based approach while the second used the CFS (Correlation-based Feature Selection) machine learning algorithm. Additionally, we produced an extra dataset containing features selected by combining our context strategy and the CFS algorithm. To demonstrate the effectiveness of the proposed method, we evaluated its performance based on various predictive (RMSE, MAE, Correlation, and Nodes) and context (Precision, Recall and FScore) measures. Conclusions Statistical analysis of the results shows that the proposed context-based data preprocessing approach significantly improves predictive and context measures and outperforms the CFS algorithm. Context-based data preprocessing improves mining results by producing superior interpretable models, which makes it well-suited for practical applications in molecular docking simulations using FFR models. PMID:24564276

  11. Compact Circuit Preprocesses Accelerometer Output

    NASA Technical Reports Server (NTRS)

    Bozeman, Richard J., Jr.

    1993-01-01

    Compact electronic circuit transfers dc power to, and preprocesses ac output of, accelerometer and associated preamplifier. Incorporated into accelerometer case during initial fabrication or retrofit onto commercial accelerometer. Made of commercial integrated circuits and other conventional components; made smaller by use of micrologic and surface-mount technology.

  12. Arabic handwritten: pre-processing and segmentation

    NASA Astrophysics Data System (ADS)

    Maliki, Makki; Jassim, Sabah; Al-Jawad, Naseer; Sellahewa, Harin

    2012-06-01

    This paper is concerned with the pre-processing and segmentation tasks that influence the performance of Optical Character Recognition (OCR) systems and handwritten/printed text recognition. In Arabic, these tasks are adversely affected by the fact that many words are made up of sub-words, that many sub-words have one or more associated diacritics not connected to the sub-word's body, and that multiple sub-words may overlap. To overcome these problems we investigate and develop segmentation techniques that first segment a document into sub-words, link the diacritics with their sub-words, and remove possible overlaps between words and sub-words. We also investigate two approaches for the pre-processing tasks of estimating sub-word baselines and determining parameters that yield appropriate slope correction and slant removal. We investigate the use of linear regression on sub-word pixels to determine their central x and y coordinates, as well as their high-density part. We also develop a new incremental rotation procedure, performed on sub-words, that determines the best rotation angle needed to realign baselines. We demonstrate the benefits of these proposals by conducting extensive experiments on publicly available and in-house databases. These algorithms help improve character segmentation accuracy by transforming handwritten Arabic text into a form that can benefit from the analysis techniques developed for printed text.
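
    A minimal sketch of the baseline-estimation step follows: linear regression over the foreground pixel coordinates of a binary sub-word image gives a skew angle, which is then removed by rotation. This is a one-shot least-squares stand-in for the incremental rotation procedure described above, and the angle's sign convention depends on the image coordinate system:

```python
import numpy as np
from scipy.ndimage import rotate

def estimate_baseline_angle(img):
    """Fit a line through the ink pixels of a binary sub-word image by
    least squares and return the corresponding skew angle in degrees."""
    ys, xs = np.nonzero(img)             # foreground pixel coordinates
    slope, _ = np.polyfit(xs, ys, 1)     # y = slope * x + intercept
    return np.degrees(np.arctan(slope))

def deskew(img):
    """Rotate the sub-word so its fitted baseline becomes horizontal."""
    return rotate(img, estimate_baseline_angle(img), reshape=True, order=0)
```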

  13. Image preprocessing study on KPCA-based face recognition

    NASA Astrophysics Data System (ADS)

    Li, Xuan; Li, Dehua

    2015-12-01

    Face recognition, as an important biometric identification method with friendly, natural, and convenient advantages, has attracted more and more attention. This paper investigates a face recognition system comprising face detection, feature extraction, and face recognition, mainly by examining the relevant theory and the key technology of the various preprocessing methods used in the face detection process; using the KPCA method, it focuses on the different recognition results obtained under different preprocessing methods. In this paper, we choose the YCbCr color space for skin segmentation and integral projection for face location. We use the erosion and dilation of the opening and closing operations, together with an illumination compensation method, to preprocess the face images, and then apply a face recognition method based on kernel principal component analysis; the experiments were carried out on a typical face database, with the algorithms implemented on the MATLAB platform. Experimental results show that, under certain conditions, the kernel-based PCA algorithm makes the extracted features represent the original image information better, since a nonlinear feature extraction method is used, and can thus obtain a higher recognition rate. In the image preprocessing stage, we found that different operations on the images can produce different results, and hence different recognition rates in the recognition stage. At the same time, in the kernel principal component analysis, the value of the power of the polynomial function can affect the recognition result.
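
    A brief sketch of the recognition stage follows, pairing polynomial-kernel KPCA with a nearest-neighbour classifier so that the effect of the polynomial degree on the recognition rate can be observed. It uses the Olivetti faces set as a stand-in database (downloaded by scikit-learn, so network access is assumed) rather than the paper's data, and Python replaces MATLAB:

```python
from sklearn.datasets import fetch_olivetti_faces   # assumes download access
from sklearn.decomposition import KernelPCA
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

faces = fetch_olivetti_faces()
Xtr, Xte, ytr, yte = train_test_split(faces.data, faces.target,
                                      test_size=0.25, random_state=0,
                                      stratify=faces.target)
# The polynomial degree plays the role of the "power of the polynomial
# function" whose choice affects the recognition rate.
for degree in (2, 3, 4):
    model = make_pipeline(
        KernelPCA(n_components=60, kernel="poly", degree=degree),
        KNeighborsClassifier(n_neighbors=1))
    print(degree, model.fit(Xtr, ytr).score(Xte, yte))
```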

  14. Preprocessing and analysis of the ECG signals

    NASA Astrophysics Data System (ADS)

    Zhu, Jianmin; Zhang, Xiaolan; Wang, Zhongyu; Wang, Xiaoling

    2008-10-01

    To support automatic analysis and suppress high-frequency interference in ECG signals, this paper applies a low-pass filter to preprocess the ECG signals and proposes a QRS complex detection method based on the wavelet transform. The method uses the Marr wavelet to decompose and filter the ECG signals with the Mallat algorithm, exploits the relationship between the wavelet transform and signal singularity to detect the QRS complex with an amplitude-threshold method at scale 3, and detects the P wave and R wave at scale 4. Meanwhile, a composite detection method is used for re-detection, thus improving the detection accuracy. Finally, records from the widely accepted MIT/BIH ECG database are used to test the algorithm, and the results show that the correct detection rate of the algorithm exceeds 99.8 percent. The detection method in this paper is simple and fast, and is easy to realize in real-time detection systems for clinical diagnosis.
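
    The sketch below illustrates the detection principle on a synthetic stand-in signal: a continuous wavelet transform with the Mexican-hat ("Marr") wavelet, followed by amplitude thresholding of the coefficients. The single scale, threshold factor, and synthetic ECG are illustrative assumptions; the paper itself works on real MIT/BIH records and uses separate scales for QRS and P/R detection:

```python
import numpy as np
import pywt                                  # PyWavelets
from scipy.signal import find_peaks

fs = 360.0                                   # MIT/BIH sampling rate, Hz
t = np.arange(0, 10, 1 / fs)
# Synthetic ECG stand-in: a sharp "QRS" pulse every second plus noise.
ecg = np.zeros_like(t)
ecg[(np.arange(10) * fs).astype(int) + 50] = 1.0
ecg = np.convolve(ecg, np.hanning(15), mode="same")
ecg += 0.05 * np.random.default_rng(3).normal(size=t.size)

# Mexican-hat ("Marr") wavelet transform at a scale near the QRS width.
coefs, _ = pywt.cwt(ecg, scales=[8], wavelet="mexh")
detail = np.abs(coefs[0])
thresh = 0.4 * detail.max()                  # amplitude threshold
peaks, _ = find_peaks(detail, height=thresh, distance=int(0.25 * fs))
print("detected QRS count:", len(peaks))     # expect 10
```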

  15. Biclustering of time series microarray data.

    PubMed

    Meng, Jia; Huang, Yufei

    2012-01-01

    Clustering is a popular data exploration technique widely used in microarray data analysis. In this chapter, we review the ideas and algorithms of biclustering and its applications in time series microarray analysis. We first introduce the concept and importance of biclustering and its different variations. We then focus our discussion on the popular iterative signature algorithm (ISA) for searching biclusters in microarray datasets. Next, we discuss in detail the enrichment constraint time-dependent ISA (ECTDISA) for identifying biologically meaningful temporal transcription modules from time series microarray datasets. In the end, we provide an example of an ECTDISA application to time series microarray data of Kaposi's Sarcoma-associated Herpesvirus (KSHV) infection. PMID:22130875
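
    A minimal NumPy sketch of the plain ISA iteration follows (the enrichment-constrained, time-dependent ECTDISA variant is not reproduced): starting from random gene seeds, condition scores and gene scores are alternately thresholded until a fixed point is reached. The thresholds, seed density, and planted test data are illustrative assumptions:

```python
import numpy as np

def isa(E, n_seeds=50, tg=2.0, tc=1.5, n_iter=50, seed=0):
    """Minimal iterative signature algorithm on a genes x conditions matrix."""
    rng = np.random.default_rng(seed)
    Eg = (E - E.mean(1, keepdims=True)) / E.std(1, keepdims=True)  # per gene
    Ec = (E - E.mean(0, keepdims=True)) / E.std(0, keepdims=True)  # per condition
    modules = []
    for _ in range(n_seeds):
        genes = rng.random(E.shape[0]) < 0.05        # random seed gene set
        conds = np.zeros(E.shape[1], dtype=bool)
        for _ in range(n_iter):
            if not genes.any():
                break
            sc = Ec[genes].mean(axis=0)              # condition scores
            conds = sc > sc.mean() + tc * sc.std()
            if not conds.any():
                break
            sg = Eg[:, conds].mean(axis=1)           # gene scores
            new_genes = sg > sg.mean() + tg * sg.std()
            if (new_genes == genes).all():           # fixed point: a bicluster
                break
            genes = new_genes
        if genes.any() and conds.any():
            modules.append((np.flatnonzero(genes), np.flatnonzero(conds)))
    return modules

rng = np.random.default_rng(1)
E = rng.normal(size=(500, 60))
E[np.ix_(range(60), range(15))] += 4.0               # planted bicluster
mods = isa(E)
print(len(mods), "module(s) found;", mods[0] if mods else "none")
```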

  16. Analysis of High-Throughput ELISA Microarray Data

    SciTech Connect

    White, Amanda M.; Daly, Don S.; Zangar, Richard C.

    2011-02-23

    Our research group develops analytical methods and software for the high-throughput analysis of quantitative enzyme-linked immunosorbent assay (ELISA) microarrays. ELISA microarrays differ from DNA microarrays in several fundamental aspects and most algorithms for analysis of DNA microarray data are not applicable to ELISA microarrays. In this review, we provide an overview of the steps involved in ELISA microarray data analysis and how the statistically sound algorithms we have developed provide an integrated software suite to address the needs of each data-processing step. The algorithms discussed are available in a set of open-source software tools (http://www.pnl.gov/statistics/ProMAT).

  17. Protein Microarrays

    NASA Astrophysics Data System (ADS)

    Ricard-Blum, S.

    Proteins are key actors in the life of the cell, involved in many physiological and pathological processes. Since variations in the expression of messenger RNA are not systematically correlated with variations in the protein levels, the latter better reflect the way a cell functions. Protein microarrays thus supply complementary information to DNA chips. They are used in particular to analyse protein expression profiles, to detect proteins within complex biological media, and to study protein-protein interactions, which give information about the functions of those proteins [3-9]. They have the same advantages as DNA microarrays for high-throughput analysis, miniaturisation, and the possibility of automation. Section 18.1 gives a brief overview of proteins. Following this, Sect. 18.2 describes how protein microarrays can be made on flat supports, explaining how proteins can be produced and immobilised on a solid support, and discussing the different kinds of substrate and detection method. Section 18.3 discusses the particular format of protein microarrays in suspension. The diversity of protein microarrays and their applications are then reported in Sect. 18.4, with applications to therapeutics (protein-drug interactions) and diagnostics. The prospects for future developments of protein microarrays are then outlined in the conclusion. The bibliography provides an extensive list of reviews and detailed references for those readers who wish to go further in this area. Indeed, the aim of the present chapter is not to give an exhaustive or detailed analysis of the state of the art, but rather to provide the reader with the basic elements needed to understand how proteins are designed and used.

  18. EMAAS: An extensible grid-based Rich Internet Application for microarray data analysis and management

    PubMed Central

    Barton, G; Abbott, J; Chiba, N; Huang, DW; Huang, Y; Krznaric, M; Mack-Smith, J; Saleem, A; Sherman, BT; Tiwari, B; Tomlinson, C; Aitman, T; Darlington, J; Game, L; Sternberg, MJE; Butcher, SA

    2008-01-01

    Background Microarray experimentation requires the application of complex analysis methods as well as the use of non-trivial computer technologies to manage the resultant large data sets. This, together with the proliferation of tools and techniques for microarray data analysis, makes it very challenging for a laboratory scientist to keep up-to-date with the latest developments in this field. Our aim was to develop a distributed e-support system for microarray data analysis and management. Results EMAAS (Extensible MicroArray Analysis System) is a multi-user rich internet application (RIA) providing simple, robust access to up-to-date resources for microarray data storage and analysis, combined with integrated tools to optimise real time user support and training. The system leverages the power of distributed computing to perform microarray analyses, and provides seamless access to resources located at various remote facilities. The EMAAS framework allows users to import microarray data from several sources to an underlying database, to pre-process, quality assess and analyse the data, to perform functional analyses, and to track data analysis steps, all through a single easy to use web portal. This interface offers distance support to users both in the form of video tutorials and via live screen feeds using the web conferencing tool EVO. A number of analysis packages, including R-Bioconductor and Affymetrix Power Tools have been integrated on the server side and are available programmatically through the Postgres-PLR library or on grid compute clusters. Integrated distributed resources include the functional annotation tool DAVID, GeneCards and the microarray data repositories GEO, CELSIUS and MiMiR. EMAAS currently supports analysis of Affymetrix 3' and Exon expression arrays, and the system is extensible to cater for other microarray and transcriptomic platforms. Conclusion EMAAS enables users to track and perform microarray data management and analysis tasks

  19. Research on pre-processing of QR Code

    NASA Astrophysics Data System (ADS)

    Sun, Haixing; Xia, Haojie; Dong, Ning

    2013-10-01

    QR codes encode many kinds of information and offer several advantages: large storage capacity, high reliability, high-speed reading from any direction, small printed size, and efficient representation of Chinese characters. In order to obtain a clearer binarized image from a complex background and improve the recognition rate of QR codes, this paper investigates pre-processing methods for QR (Quick Response) codes and presents algorithms and results of image pre-processing for QR code recognition. The conventional method is improved by modifying Sauvola's adaptive thresholding method. Additionally, a QR code extraction step that adapts to different image sizes and a flexible image correction approach are introduced, improving the efficiency and accuracy of QR code image processing.
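
    For reference, Sauvola's adaptive threshold computes T = m(x, y) * (1 + k * (s(x, y) / R - 1)) from the local mean m and local standard deviation s. The sketch below implements this standard form with SciPy; the window size, k, and R are common defaults and illustrative assumptions, not the paper's modified variant:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def sauvola_binarize(gray, window=25, k=0.2, R=128.0):
    """Sauvola's adaptive threshold: T = m * (1 + k * (s / R - 1)),
    where m and s are the local mean and standard deviation."""
    gray = gray.astype(float)
    m = uniform_filter(gray, window)
    s = np.sqrt(np.clip(uniform_filter(gray ** 2, window) - m ** 2, 0, None))
    T = m * (1 + k * (s / R - 1))
    return gray > T       # True = background, False = dark QR modules
```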

  20. Preprocessing cotton to prevent byssinosis

    PubMed Central

    Merchant, James A.; Lumsden, John C.; Kilburn, Kaye H.; Germino, Victor H.; Hamilton, John D.; Lynn, William S.; Byrd, H.; Baucom, D.

    1973-01-01

    Merchant, J. A., Lumsden, J. C., Kilburn, K. H., Germino, V. H., Hamilton, J. D., Lynn, W. S., Byrd, H., and Baucom, D. (1973). British Journal of Industrial Medicine, 30, 237-247. Preprocessing cotton to prevent byssinosis. A fundamental approach of cleaning or deactivating cotton prior to manufacturing has long been advocated to prevent byssinosis, but no trial had been conducted to test the feasibility of such an approach. In the study described, it was possible to be directed by both biological observations and the results of manufacturing trials. An exposure chamber was built in a cotton textile mill which had been previously studied as part of a large cross-sectional survey. The chamber was provided with an independent air conditioning system and a carding machine which served as a dust generator. Sixteen subjects, who had shown reductions in expiratory flow rate with exposure to cotton dust, were chosen to form a panel for exposure to raw cottons and cottons which had been preprocessed by heating, washing, and steaming. Indicators of effects were symptoms of chest tightness and/or dyspnoea, change in FEV1·0, and fine dust levels over 6 hours of exposure. Exposure of the panel to no cotton dust resulted in no change in FEV1·0 and served as the control for subsequent trials. Exposure to strict middling cotton resulted in a byssinosis symptom prevalence of 22%, a significant decrement in FEV1·0 of 2·9%, and a fine dust level of 0·26 mg/m3. Exposure to strict low middling cotton resulted in a byssinosis symptom prevalence of 79%, a decrement in FEV1·0 of 8·5%, and a fine dust level of 0·89 mg/m3. Oven heating strict low middling cotton resulted in a byssinosis symptom prevalence of 56% and a relatively greater drop in FEV1·0 of 8·3% for 0·48 mg/m3 of fine dust. Washing the strict low grade cotton eliminated detectable biological effects with a symptom prevalence of 8%, an increase of 1·4% in FEV1·0, and a dust level of 0·16 mg/m3, but the cotton

  1. Evaluation of the efficiency of continuous wavelet transform as processing and preprocessing algorithm for resolution of overlapped signals in univariate and multivariate regression analyses; an application to ternary and quaternary mixtures

    NASA Astrophysics Data System (ADS)

    Hegazy, Maha A.; Lotfy, Hayam M.; Mowaka, Shereen; Mohamed, Ekram Hany

    2016-07-01

    Wavelets have been adapted for a vast number of signal-processing applications due to the amount of information that can be extracted from a signal. In this work, a comparative study was conducted on the efficiency of the continuous wavelet transform (CWT) as a signal-processing tool in univariate regression and as a pre-processing tool in multivariate analysis using partial least squares (CWT-PLS). These were applied to the complex spectral signals of ternary and quaternary mixtures. The CWT-PLS method succeeded in the simultaneous determination of a quaternary mixture of drotaverine (DRO), caffeine (CAF), paracetamol (PAR) and p-aminophenol (PAP, the major impurity of paracetamol). In contrast, the univariate CWT failed to determine the quaternary mixture components simultaneously; it was able to determine only PAR and PAP, as well as the ternary mixtures of DRO, CAF, and PAR and of CAF, PAR, and PAP. During the CWT calculations, different wavelet families were tested. The univariate CWT method was validated according to the ICH guidelines. For the development of the CWT-PLS model, a calibration set was prepared by means of an orthogonal experimental design, and its absorption spectra were recorded and processed by CWT. The CWT-PLS model was constructed by regression between the wavelet coefficients and the concentration matrices, and validation was performed with both cross-validation and external validation sets. Both methods were successfully applied for the determination of the studied drugs in pharmaceutical formulations.
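
    A small sketch of the CWT-PLS pipeline follows: each spectrum is replaced by its CWT coefficients at a single scale, and PLS regression maps the coefficients to the concentration matrix. The simulated Gaussian-band spectra, wavelet choice, scale, and component count are illustrative assumptions standing in for the paper's measured data:

```python
import numpy as np
import pywt
from sklearn.cross_decomposition import PLSRegression

wavelengths = np.linspace(200, 400, 256)

def cwt_features(spectra, scale=16, wavelet="mexh"):
    """Replace each absorption spectrum by its CWT coefficients at one scale."""
    return np.vstack([pywt.cwt(s, scales=[scale], wavelet=wavelet)[0][0]
                      for s in spectra])

def band(center, width):
    return np.exp(-0.5 * ((wavelengths - center) / width) ** 2)

rng = np.random.default_rng(4)
C = rng.uniform(0.1, 1.0, (25, 4))                # concentrations, 4 analytes
S = np.stack([band(250, 8), band(260, 10), band(300, 12), band(310, 9)])
A = C @ S + rng.normal(0, 0.002, (25, wavelengths.size))  # mixture spectra

pls = PLSRegression(n_components=4).fit(cwt_features(A), C)
print(np.round(pls.predict(cwt_features(A[:3])), 2))
print(np.round(C[:3], 2))                          # compare with the truth
```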

  2. Evaluation of the efficiency of continuous wavelet transform as processing and preprocessing algorithm for resolution of overlapped signals in univariate and multivariate regression analyses; an application to ternary and quaternary mixtures.

    PubMed

    Hegazy, Maha A; Lotfy, Hayam M; Mowaka, Shereen; Mohamed, Ekram Hany

    2016-07-01

    Wavelets have been adapted for a vast number of signal-processing applications due to the amount of information that can be extracted from a signal. In this work, a comparative study was conducted on the efficiency of the continuous wavelet transform (CWT) as a signal-processing tool in univariate regression and as a pre-processing tool in multivariate analysis using partial least squares (CWT-PLS). These were applied to the complex spectral signals of ternary and quaternary mixtures. The CWT-PLS method succeeded in the simultaneous determination of a quaternary mixture of drotaverine (DRO), caffeine (CAF), paracetamol (PAR) and p-aminophenol (PAP, the major impurity of paracetamol). In contrast, the univariate CWT failed to determine the quaternary mixture components simultaneously; it was able to determine only PAR and PAP, as well as the ternary mixtures of DRO, CAF, and PAR and of CAF, PAR, and PAP. During the CWT calculations, different wavelet families were tested. The univariate CWT method was validated according to the ICH guidelines. For the development of the CWT-PLS model, a calibration set was prepared by means of an orthogonal experimental design, and its absorption spectra were recorded and processed by CWT. The CWT-PLS model was constructed by regression between the wavelet coefficients and the concentration matrices, and validation was performed with both cross-validation and external validation sets. Both methods were successfully applied for the determination of the studied drugs in pharmaceutical formulations. PMID:27070527

  3. Chromosome Microarray.

    PubMed

    Anderson, Sharon

    2016-01-01

    Over the last half century, knowledge about genetics, genetic testing, and its complexity has flourished. Completion of the Human Genome Project provided a foundation upon which the accuracy of genetics, genomics, and the integration of bioinformatics knowledge and testing has grown exponentially. What is lagging, however, is the effort to reach and engage nurses in this rapidly changing field. The purpose of this article is to familiarize nurses with several frequently ordered genetic tests, including chromosome analysis and fluorescence in situ hybridization, followed by a comprehensive review of chromosome microarray. It explains the complexity of microarray testing, including how the testing is performed and how the results are analyzed. A case report demonstrates how this technology is applied in clinical practice and reveals the benefits and limitations of this scientific and bioinformatic genetic technology. Clinical implications for maternal-child nurses across practice levels are discussed. PMID:27276104

  4. PREPROCESSING MAGNETIC FIELDS WITH CHROMOSPHERIC LONGITUDINAL FIELDS

    SciTech Connect

    Yamamoto, Tetsuya T.; Kusano, K.

    2012-06-20

    Nonlinear force-free field (NLFFF) extrapolation is a powerful tool for the modeling of the magnetic field in the solar corona. However, since the photospheric magnetic field does not in general satisfy the force-free condition, some kind of processing is required to assimilate data into the model. In this paper, we report the results of new preprocessing for the NLFFF extrapolation. Through this preprocessing, we expect to obtain magnetic field data similar to those in the chromosphere. In our preprocessing, we add a new term concerning chromospheric longitudinal fields into the optimization function proposed by Wiegelmann et al. We perform a parameter survey of six free parameters to find minimum force- and torque-freeness with the simulated-annealing method. Analyzed data are a photospheric vector magnetogram of AR 10953 observed with the Hinode spectropolarimeter and a chromospheric longitudinal magnetogram observed with SOLIS spectropolarimeter. It is found that some preprocessed fields show the smallest force- and torque-freeness and are very similar to the chromospheric longitudinal fields. On the other hand, other preprocessed fields show noisy maps, although the force- and torque-freeness are of the same order. By analyzing preprocessed noisy maps in the wave number space, we found that small and large wave number components balance out on the force-free index. We also discuss our iteration limit of the simulated-annealing method and magnetic structure broadening in the chromosphere.

  5. ArraySolver: An Algorithm for Colour-Coded Graphical Display and Wilcoxon Signed-Rank Statistics for Comparing Microarray Gene Expression Data

    PubMed Central

    2004-01-01

    The massive surge in the production of microarray data poses a great challenge for proper analysis and interpretation. In recent years numerous computational tools have been developed to extract meaningful interpretations of microarray gene expression data. However, a convenient tool for two-group comparison of microarray data is still lacking, and users have to rely on commercial statistical packages that may be costly and require special skills, in addition to extra time and effort for transferring data from one platform to another. Various statistical methods, including the t-test, analysis of variance, the Pearson test and the Mann–Whitney U test, have been reported for comparing microarray data, whereas the utilization of the Wilcoxon signed-rank test, which is an appropriate test for two-group comparison of gene expression data, has largely been neglected in microarray studies. The aim of this investigation was to build an integrated tool, ArraySolver, for colour-coded graphical display and comparison of gene expression data using the Wilcoxon signed-rank test. Software validation showed similar outputs from ArraySolver and SPSS for large datasets, whereas the former program appeared to be more accurate for 25 or fewer pairs (n ≤ 25), suggesting its potential application in analysing molecular signatures that usually contain small numbers of genes. The main advantages of ArraySolver are easy data selection, a convenient report format, accurate statistics and the familiar Excel platform. PMID:18629036
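
    The core computation is a standard paired non-parametric test; a minimal sketch with SciPy follows. The synthetic paired expression values and the colour-coding rule are illustrative assumptions standing in for ArraySolver's Excel front end:

```python
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(5)
control = rng.lognormal(mean=5, sigma=0.3, size=20)   # paired expression values
treated = control * rng.lognormal(mean=0.15, sigma=0.1, size=20)

stat, p = wilcoxon(treated, control)     # paired, non-parametric two-group test
fold = np.median(treated / control)
colour = "red" if fold > 1 else "green"  # colour-coded up/down call
print(f"W={stat:.0f}, p={p:.4f}, median fold change={fold:.2f} ({colour})")
```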

  6. DNA Microarrays

    NASA Astrophysics Data System (ADS)

    Nguyen, C.; Gidrol, X.

    Genomics has revolutionised biological and biomedical research. This revolution was predictable on the basis of its two driving forces: the ever increasing availability of genome sequences and the development of new technology able to exploit them. Up until now, technical limitations meant that molecular biology could only analyse one or two parameters per experiment, providing relatively little information compared with the great complexity of the systems under investigation. This gene by gene approach is inadequate to understand biological systems containing several thousand genes. It is essential to have an overall view of the DNA, RNA, and relevant proteins. A simple inventory of the genome is not sufficient to understand the functions of the genes, or indeed the way that cells and organisms work. For this purpose, functional studies based on whole genomes are needed. Among these new large-scale methods of molecular analysis, DNA microarrays provide a way of studying the genome and the transcriptome. The idea of integrating a large amount of data derived from a support with very small area has led biologists to call these chips, borrowing the term from the microelectronics industry. At the beginning of the 1990s, the development of DNA chips on nylon membranes [1, 2], then on glass [3] and silicon [4] supports, made it possible for the first time to carry out simultaneous measurements of the equilibrium concentration of all the messenger RNA (mRNA) or transcribed RNA in a cell. These microarrays offer a wide range of applications, in both fundamental and clinical research, providing a method for genome-wide characterisation of changes occurring within a cell or tissue, as for example in polymorphism studies, detection of mutations, and quantitative assays of gene copies. With regard to the transcriptome, it provides a way of characterising differentially expressed genes, profiling given biological states, and identifying regulatory channels.

  7. An Automated, Adaptive Framework for Optimizing Preprocessing Pipelines in Task-Based Functional MRI

    PubMed Central

    Churchill, Nathan W.; Spring, Robyn; Afshin-Pour, Babak; Dong, Fan; Strother, Stephen C.

    2015-01-01

    BOLD fMRI is sensitive to blood-oxygenation changes correlated with brain function; however, it is limited by relatively weak signal and significant noise confounds. Many preprocessing algorithms have been developed to control noise and improve signal detection in fMRI. Although the chosen set of preprocessing and analysis steps (the “pipeline”) significantly affects signal detection, pipelines are rarely quantitatively validated in the neuroimaging literature, due to complex preprocessing interactions. This paper outlines and validates an adaptive resampling framework for evaluating and optimizing preprocessing choices by optimizing data-driven metrics of task prediction and spatial reproducibility. Compared to standard “fixed” preprocessing pipelines, this optimization approach significantly improves independent validation measures of within-subject test-retest, and between-subject activation overlap, and behavioural prediction accuracy. We demonstrate that preprocessing choices function as implicit model regularizers, and that improvements due to pipeline optimization generalize across a range of simple to complex experimental tasks and analysis models. Results are shown for brief scanning sessions (<3 minutes each), demonstrating that with pipeline optimization, it is possible to obtain reliable results and brain-behaviour correlations in relatively small datasets. PMID:26161667

  8. Aptamer Microarrays

    SciTech Connect

    Angel-Syrett, Heather; Collett, Jim; Ellington, Andrew D.

    2009-01-02

    In vitro selection can yield specific, high-affinity aptamers. We and others have devised methods for the automated selection of aptamers, and have begun to use these reagents for the construction of arrays. Arrayed aptamers have proven to be almost as sensitive as their solution phase counterparts, and when ganged together can provide both specific and general diagnostic signals for proteins and other analytes. We describe here technical details regarding the production and processing of aptamer microarrays, including blocking, washing, drying, and scanning. We will also discuss the challenges involved in developing standardized and reproducible methods for binding and quantitating protein targets. While signals from fluorescent analytes or sandwiches are typically captured, it has proven possible for immobilized aptamers to be uniquely coupled to amplification methods not available to protein reagents, thus allowing for protein-binding signals to be greatly amplified. Into the future, many of the biosensor methods described in this book can potentially be adapted to array formats, thus further expanding the utility of and applications for aptamer arrays.

  9. Efficient Preprocessing technique using Web log mining

    NASA Astrophysics Data System (ADS)

    Raiyani, Sheetal A.; jain, Shailendra

    2012-11-01

    Web usage mining can be described as the discovery and analysis of user access patterns through mining of log files and associated data from a particular website. Large numbers of visitors interact daily with web sites around the world; enormous amounts of data are generated, and this information can be very valuable to companies seeking to understand customer behaviour. This paper presents a complete preprocessing approach, comprising data cleaning and user and session identification, to improve data quality. User identification, a key issue in the preprocessing phase, aims to identify unique web users. Traditional user identification is based on the site structure, supported by heuristic rules, which reduces the efficiency of identification. To resolve this difficulty we introduce the proposed technique DUI (Distinct User Identification), based on IP address, user agent, session time, and pages referred within the desired session time. The results can be used in counter-terrorism, fraud detection, and detection of unusual access to secure data, and detection of the regular access behaviour of users can improve the overall design and performance of subsequent preprocessing.
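
    A minimal sketch of the DUI idea follows: log entries are grouped into distinct users by the (IP address, user agent) pair, and each user's click-stream is split into sessions on an inactivity gap. The 30-minute timeout and the log format are illustrative assumptions, and the referred-pages heuristic mentioned above is omitted:

```python
from datetime import datetime, timedelta

TIMEOUT = timedelta(minutes=30)              # assumed session timeout

def identify_users_and_sessions(log):
    """Group log entries into distinct users by (IP, agent), then split each
    user's click-stream into sessions on a 30-minute inactivity gap."""
    users = {}
    for entry in sorted(log, key=lambda e: e["time"]):
        key = (entry["ip"], entry["agent"])  # distinct-user key
        users.setdefault(key, []).append(entry)
    sessions = []
    for key, entries in users.items():
        current = [entries[0]]
        for prev, cur in zip(entries, entries[1:]):
            if cur["time"] - prev["time"] > TIMEOUT:
                sessions.append((key, current))
                current = []
            current.append(cur)
        sessions.append((key, current))
    return sessions

log = [
    {"ip": "10.0.0.1", "agent": "Firefox", "time": datetime(2012, 11, 1, 9, 0), "page": "/"},
    {"ip": "10.0.0.1", "agent": "Firefox", "time": datetime(2012, 11, 1, 9, 5), "page": "/a"},
    {"ip": "10.0.0.1", "agent": "Chrome",  "time": datetime(2012, 11, 1, 9, 6), "page": "/"},
    {"ip": "10.0.0.1", "agent": "Firefox", "time": datetime(2012, 11, 1, 11, 0), "page": "/b"},
]
for user, sess in identify_users_and_sessions(log):
    print(user, [e["page"] for e in sess])   # Firefox user yields two sessions
```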

  10. Preprocessing Moist Lignocellulosic Biomass for Biorefinery Feedstocks

    SciTech Connect

    Neal Yancey; Christopher T. Wright; Craig Conner; J. Richard Hess

    2009-06-01

    Biomass preprocessing is one of the primary operations in the feedstock assembly system of a lignocellulosic biorefinery. Preprocessing is generally accomplished using industrial grinders to format biomass materials into a suitable biorefinery feedstock for conversion to ethanol and other bioproducts. Many factors affect machine efficiency and the physical characteristics of preprocessed biomass. For example, moisture content of the biomass as received from the point of production has a significant impact on overall system efficiency and can significantly affect the characteristics (particle size distribution, flowability, storability, etc.) of the size-reduced biomass. Many different grinder configurations are available on the market, each with advantages under specific conditions. Ultimately, the capacity and/or efficiency of the grinding process can be enhanced by selecting the grinder configuration that optimizes grinder performance based on moisture content and screen size. This paper discusses the relationships of biomass moisture with respect to preprocessing system performance and product physical characteristics and compares data obtained on corn stover, switchgrass, and wheat straw as model feedstocks during Vermeer HG 200 grinder testing. During the tests, grinder screen configuration and biomass moisture content were varied and tested to provide a better understanding of their relative impact on machine performance and the resulting feedstock physical characteristics and uniformity relative to each crop tested.

  11. The Stanford Tissue Microarray Database.

    PubMed

    Marinelli, Robert J; Montgomery, Kelli; Liu, Chih Long; Shah, Nigam H; Prapong, Wijan; Nitzberg, Michael; Zachariah, Zachariah K; Sherlock, Gavin J; Natkunam, Yasodha; West, Robert B; van de Rijn, Matt; Brown, Patrick O; Ball, Catherine A

    2008-01-01

    The Stanford Tissue Microarray Database (TMAD; http://tma.stanford.edu) is a public resource for disseminating annotated tissue images and associated expression data. Stanford University pathologists, researchers and their collaborators worldwide use TMAD for designing, viewing, scoring and analyzing their tissue microarrays. The use of tissue microarrays allows hundreds of human tissue cores to be simultaneously probed by antibodies to detect protein abundance (Immunohistochemistry; IHC), or by labeled nucleic acids (in situ hybridization; ISH) to detect transcript abundance. TMAD archives multi-wavelength fluorescence and bright-field images of tissue microarrays for scoring and analysis. As of July 2007, TMAD contained 205 161 images archiving 349 distinct probes on 1488 tissue microarray slides. Of these, 31 306 images for 68 probes on 125 slides have been released to the public. To date, 12 publications have been based on these raw public data. TMAD incorporates the NCI Thesaurus ontology for searching tissues in the cancer domain. Image processing researchers can extract images and scores for training and testing classification algorithms. The production server uses the Apache HTTP Server, Oracle Database and Perl application code. Source code is available to interested researchers under a no-cost license. PMID:17989087

  12. The Effects of Pre-processing Strategies for Pediatric Cochlear Implant Recipients

    PubMed Central

    Rakszawski, Bernadette; Wright, Rose; Cadieux, Jamie H.; Davidson, Lisa S.; Brenner, Christine

    2016-01-01

    Background Cochlear implants (CIs) have been shown to improve children’s speech recognition over traditional amplification when severe to profound sensorineural hearing loss is present. Despite improvements, understanding speech at low-level intensities or in the presence of background noise remains difficult. In an effort to improve speech understanding in challenging environments, Cochlear Ltd. offers pre-processing strategies that apply various algorithms prior to mapping the signal to the internal array. Two of these strategies include Autosensitivity Control™ (ASC) and Adaptive Dynamic Range Optimization (ADRO®). Based on previous research, the manufacturer’s default pre-processing strategy for pediatrics’ everyday programs combines ASC+ADRO®. Purpose The purpose of this study is to compare pediatric speech perception performance across various pre-processing strategies while applying a specific programming protocol utilizing increased threshold (T) levels to ensure access to very low-level sounds. Research Design This was a prospective, cross-sectional, observational study. Participants completed speech perception tasks in four pre-processing conditions: no pre-processing, ADRO®, ASC, ASC+ADRO®. Study Sample Eleven pediatric Cochlear Ltd. cochlear implant users were recruited: six bilateral, one unilateral, and four bimodal. Intervention Four programs, with the participants’ everyday map, were loaded into the processor with different pre-processing strategies applied in each of the four positions: no pre-processing, ADRO®, ASC, and ASC+ADRO®. Data Collection and Analysis Participants repeated CNC words presented at 50 and 70 dB SPL in quiet and HINT sentences presented adaptively with competing R-Space noise at 60 and 70 dB SPL. Each measure was completed as participants listened with each of the four pre-processing strategies listed above. Test order and condition were randomized. A repeated-measures analysis of variance (ANOVA) was used to

  13. Reliable RANSAC Using a Novel Preprocessing Model

    PubMed Central

    Wang, Xiaoyan; Zhang, Hui; Liu, Sheng

    2013-01-01

    Geometric assumption and verification with RANSAC has become a crucial step for establishing correspondences between local features, due to its wide applications in biomedical feature analysis and vision computing. However, conventional RANSAC is very time-consuming due to redundant sampling, especially when dealing with numerous matching pairs. This paper presents a novel preprocessing model that extracts a reduced set of reliable correspondences from the initial matching dataset. Both geometric model generation and verification are carried out on this reduced set, which leads to considerable speedups. The paper then proposes a reliable RANSAC framework using this preprocessing model, which was implemented and verified using Harris and SIFT features, respectively. Compared with traditional RANSAC, experimental results show that our method is more efficient. PMID:23509601

  14. Infrared Mueller matrix acquisition and preprocessing system.

    PubMed

    Carrieri, Arthur H; Owens, David J; Schultz, Jonathan C

    2008-09-20

    An analog Mueller matrix acquisition and preprocessing system (AMMS) was developed for a photopolarimetric-based sensor with 9.1-12.0 microm optical bandwidth, which is the middle infrared wavelength-tunable region of sensor transmitter and "fingerprint" spectral band for chemical-biological (analyte) standoff detection. AMMS facilitates delivery of two alternate polarization-modulated CO(2) laser beams onto subject analyte that excite/relax molecular vibrational resonance in its analytic mass, primes the photoelastic-modulation engine of the sensor, establishes optimum throughput radiance per backscattering cross section, acquires Mueller elements modulo two laser beams in hexadecimal format, preprocesses (normalize, subtract, filter) these data, and formats the results into digitized identification metrics. Feed forwarding of formatted Mueller matrix metrics through an optimally trained and validated neural network provides pattern recognition and type classification of interrogated analyte. PMID:18806864

  15. The preprocessing of multispectral data. II. [of Landsat satellite

    NASA Technical Reports Server (NTRS)

    Quiel, F.

    1976-01-01

    It is pointed out that a correction of atmospheric effects is an important requirement for a full utilization of the possibilities provided by preprocessing techniques. The most significant characteristics of original and preprocessed data are considered, taking into account the solution of classification problems by means of the preprocessing procedure. Improvements obtainable with different preprocessing techniques are illustrated with the aid of examples involving Landsat data regarding an area in Colorado.

  16. Consensus gene regulatory networks: combining multiple microarray gene expression datasets

    NASA Astrophysics Data System (ADS)

    Peeling, Emma; Tucker, Allan

    2007-09-01

    In this paper we present a method for modelling gene regulatory networks by forming a consensus Bayesian network model from multiple microarray gene expression datasets. Our method is based on combining Bayesian network graph topologies and does not require any special pre-processing of the datasets, such as re-normalisation. We evaluate our method on a synthetic regulatory network and part of the yeast heat-shock response regulatory network using publicly available yeast microarray datasets. Results are promising; the consensus networks formed provide a broader view of the potential underlying network, obtaining an increased true positive rate over networks constructed from a single data source.
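
    A minimal sketch of combining graph topologies follows: obtain a directed edge set per dataset (e.g., from a structure-learning run), then keep the edges that reach a vote threshold. The majority-vote rule is an illustrative assumption standing in for the paper's exact combination scheme, and the edge sets are toy inputs:

```python
from collections import Counter

def consensus_network(edge_sets, min_votes=2):
    """Combine Bayesian network topologies learned from different microarray
    datasets: keep a directed edge if it appears in at least `min_votes`
    of the individual graphs."""
    votes = Counter(e for edges in edge_sets for e in set(edges))
    return {e for e, n in votes.items() if n >= min_votes}

net_a = {("gene1", "gene2"), ("gene2", "gene3"), ("gene4", "gene5")}
net_b = {("gene1", "gene2"), ("gene2", "gene3")}
net_c = {("gene1", "gene2"), ("gene5", "gene4")}
print(consensus_network([net_a, net_b, net_c]))
# {('gene1', 'gene2'), ('gene2', 'gene3')}  (set order may vary)
```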

  17. Groundtruth approach to accurate quantitation of fluorescence microarrays

    SciTech Connect

    Mascio-Kegelmeyer, L; Tomascik-Cheeseman, L; Burnett, M S; van Hummelen, P; Wyrobek, A J

    2000-12-01

    To more accurately measure fluorescent signals from microarrays, we calibrated our acquisition and analysis systems by using groundtruth samples comprised of known quantities of red and green gene-specific DNA probes hybridized to cDNA targets. We imaged the slides with a full-field, white light CCD imager and analyzed them with our custom analysis software. Here we compare, for multiple genes, results obtained with and without preprocessing (alignment, color crosstalk compensation, dark field subtraction, and integration time). We also evaluate the accuracy of various image processing and analysis techniques (background subtraction, segmentation, quantitation and normalization). This methodology calibrates and validates our system for accurate quantitative measurement of microarrays. Specifically, we show that preprocessing the images produces results significantly closer to the known ground-truth for these samples.

  18. Experimental variability and data pre-processing as factors affecting the discrimination power of some chemometric approaches (PCA, CA and a new algorithm based on linear regression) applied to (+/-)ESI/MS and RPLC/UV data: Application on green tea extracts.

    PubMed

    Iorgulescu, E; Voicu, V A; Sârbu, C; Tache, F; Albu, F; Medvedovici, A

    2016-08-01

    The influence of the experimental variability (instrumental repeatability, instrumental intermediate precision and sample preparation variability) and data pre-processing (normalization, peak alignment, background subtraction) on the discrimination power of multivariate data analysis methods (Principal Component Analysis -PCA- and Cluster Analysis -CA-) as well as a new algorithm based on linear regression was studied. Data used in the study were obtained through positive or negative ion monitoring electrospray mass spectrometry (±ESI/MS) and reversed phase liquid chromatography/UV spectrometric detection (RPLC/UV) applied to green tea extracts. Extractions in ethanol and heated water infusion were used as sample preparation procedures. The multivariate methods were directly applied to mass spectra and chromatograms, involving strictly a holistic comparison of shapes, without assignment of any structural identity to compounds. An alternative data interpretation based on linear regression analysis mutually applied to data series is also discussed. Slopes, intercepts and correlation coefficients produced by linear regression analysis applied to pairs of very large experimental data series successfully retain information resulting from high-frequency instrumental acquisition rates, better defining the profiles being compared. Consequently, each type of sample or comparison between samples produces in the Cartesian space an ellipsoidal volume defined by the normal variation intervals of the slope, intercept and correlation coefficient. Distances between volumes graphically illustrate (dis)similarities between the compared data. The instrumental intermediate precision had the major effect on the discrimination power of the multivariate data analysis methods. Mass spectra produced through ionization from liquid state in atmospheric pressure conditions of bulk complex mixtures resulting from extracted materials of natural origins provided an excellent data
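
    A minimal sketch of the regression-based comparison described above (our reading of the general idea, not the authors' code): two full-resolution profiles are regressed against each other, and the resulting (slope, intercept, correlation) triplet places the comparison as a point in a 3-D Cartesian space.

        import numpy as np
        from scipy import stats

        def compare_profiles(series_a, series_b):
            slope, intercept, r, _, _ = stats.linregress(series_a, series_b)
            return slope, intercept, r

        x = np.linspace(0, 10, 5000)                  # e.g. a chromatogram time axis
        a = np.exp(-(x - 4) ** 2) + 0.01 * np.random.randn(x.size)
        b = 1.05 * a + 0.002 + 0.01 * np.random.randn(x.size)  # a near-replicate
        print(compare_profiles(a, b))  # slope ~1.05, intercept ~0.002, r near 1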

  19. Microarrays, Integrated Analytical Systems

    NASA Astrophysics Data System (ADS)

    Combinatorial chemistry is used to find materials that form sensor microarrays. This book discusses the fundamentals, and then proceeds to the many applications of microarrays, from measuring gene expression (DNA microarrays) to protein-protein interactions, peptide chemistry, carbohydrate chemistry, electrochemical detection, and microfluidics.

  20. Acquisition and preprocessing of LANDSAT data

    NASA Technical Reports Server (NTRS)

    Horn, T. N.; Brown, L. E.; Anonsen, W. H. (Principal Investigator)

    1979-01-01

    The original configuration of the GSFC data acquisition, preprocessing, and transmission subsystem, designed to provide LANDSAT data inputs to the LACIE system at JSC, is described. Enhancements made to support LANDSAT -2, and modifications for LANDSAT -3 are discussed. Registration performance throughout the 3 year period of LACIE operations satisfied the 1 pixel root-mean-square requirements established in 1974, with more than two of every three attempts at data registration proving successful, notwithstanding cosmetic faults or content inadequacies to which the process is inherently susceptible. The cloud/snow rejection rate experienced throughout the last 3 years has approached 50%, as expected in most LANDSAT data use situations.

  1. Preprocessing and compression of Hyperspectral images captured onboard UAVs

    NASA Astrophysics Data System (ADS)

    Herrero, Rolando; Cadirola, Martin; Ingle, Vinay K.

    2015-10-01

    Advancements in image sensors and signal processing have led to the successful development of lightweight hyperspectral imaging systems that are critical to the deployment of Photometry and Remote Sensing (PaRS) capabilities in unmanned aerial vehicles (UAVs). In general, hyperspectral data cubes include a few dozen spectral bands that are extremely useful for remote sensing applications that range from detection of land vegetation to monitoring of atmospheric products derived from the processing of lower level radiance images. Because these data cubes are captured in the challenging environment of UAVs, where resources are limited, source encoding by means of compression is a fundamental mechanism that considerably improves the overall system performance and reliability. In this paper, we focus on the hyperspectral images captured by a state-of-the-art commercial hyperspectral camera and show the results of applying ultraspectral data compression to the obtained data set. Specifically, the compression scheme that we introduce integrates two stages: (1) preprocessing and (2) compression itself. The outcomes of this procedure are linear prediction coefficients and an error signal that, when encoded, results in a compressed version of the original image. Next, the preprocessing and compression algorithms are optimized and have their time complexity analyzed to guarantee their successful deployment on low-power ARM-based embedded processors in the context of UAVs. Lastly, we compare the proposed architecture against other well-known schemes and show how the compression scheme presented in this paper outperforms all of them, delivering both lower bit rates and lower distortion.
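
    The core of the second stage, linear prediction followed by residual (error-signal) coding, can be sketched as follows; the predictor order and the synthetic band are assumptions, not the optimized embedded implementation described in the paper.

        import numpy as np

        def lpc_residual(signal, order=4):
            """Least-squares prediction coefficients and the residual to be encoded."""
            X = np.column_stack([signal[i:len(signal) - order + i] for i in range(order)])
            y = signal[order:]
            coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
            return coeffs, y - X @ coeffs

        band = np.cumsum(np.random.randn(1024))  # stand-in for one spectral band
        coeffs, err = lpc_residual(band)
        print(err.std() < band.std())            # residual is cheaper to encode: True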

  2. Acoustic Biometric System Based on Preprocessing Techniques and Linear Support Vector Machines

    PubMed Central

    del Val, Lara; Izquierdo-Fuente, Alberto; Villacorta, Juan J.; Raboso, Mariano

    2015-01-01

    Drawing on the results of an acoustic biometric system based on a MSE classifier, a new biometric system has been implemented. This new system preprocesses acoustic images, extracts several parameters and finally classifies them, based on Support Vector Machine (SVM). The preprocessing techniques used are spatial filtering, segmentation—based on a Gaussian Mixture Model (GMM) to separate the person from the background, masking—to reduce the dimensions of images—and binarization—to reduce the size of each image. An analysis of classification error and a study of the sensitivity of the error versus the computational burden of each implemented algorithm are presented. This allows the selection of the most relevant algorithms, according to the benefits required by the system. A significant improvement of the biometric system has been achieved by reducing the classification error, the computational burden and the storage requirements. PMID:26091392

  3. Acoustic Biometric System Based on Preprocessing Techniques and Linear Support Vector Machines.

    PubMed

    del Val, Lara; Izquierdo-Fuente, Alberto; Villacorta, Juan J; Raboso, Mariano

    2015-01-01

    Drawing on the results of an acoustic biometric system based on a MSE classifier, a new biometric system has been implemented. This new system preprocesses acoustic images, extracts several parameters and finally classifies them, based on Support Vector Machine (SVM). The preprocessing techniques used are spatial filtering, segmentation-based on a Gaussian Mixture Model (GMM) to separate the person from the background, masking-to reduce the dimensions of images-and binarization-to reduce the size of each image. An analysis of classification error and a study of the sensitivity of the error versus the computational burden of each implemented algorithm are presented. This allows the selection of the most relevant algorithms, according to the benefits required by the system. A significant improvement of the biometric system has been achieved by reducing the classification error, the computational burden and the storage requirements. PMID:26091392
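
    The two size-reduction steps named in both records, masking and binarization, can be sketched as follows (assumed details such as the region of interest and the global threshold are ours, not the authors'):

        import numpy as np

        def mask_and_binarize(image, rows, cols, threshold=None):
            roi = image[np.ix_(rows, cols)]     # masking: keep only the region of interest
            if threshold is None:
                threshold = roi.mean()          # simple global threshold (assumption)
            return (roi >= threshold).astype(np.uint8)  # binarization: 1 bit per pixel

        img = np.random.rand(40, 60)                        # stand-in acoustic image
        rows, cols = np.arange(10, 30), np.arange(20, 50)   # assumed ROI
        features = mask_and_binarize(img, rows, cols)
        print(features.shape, features.dtype)               # (20, 30) uint8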

  4. Pre-Processing Effect on the Accuracy of Event-Based Activity Segmentation and Classification through Inertial Sensors.

    PubMed

    Fida, Benish; Bernabucci, Ivan; Bibbo, Daniele; Conforto, Silvia; Schmid, Maurizio

    2015-01-01

    Inertial sensors are increasingly being used to recognize and classify physical activities in a variety of applications. For monitoring and fitness applications, it is crucial to develop methods able to segment each activity cycle, e.g., a gait cycle, so that the successive classification step may be more accurate. To increase detection accuracy, pre-processing is often used, with a concurrent increase in computational cost. In this paper, the effect of pre-processing operations on the detection and classification of locomotion activities was investigated, to check whether the presence of pre-processing significantly contributes to an increase in accuracy. The pre-processing stages evaluated in this study were inclination correction and de-noising. Level walking, step ascending, descending and running were monitored by using a shank-mounted inertial sensor. Raw and filtered segments, obtained from a modified version of a rule-based gait detection algorithm optimized for sequential processing, were processed to extract time and frequency-based features for physical activity classification through a support vector machine classifier. The proposed method accurately detected >99% gait cycles from raw data and produced >98% accuracy on these segmented gait cycles. Pre-processing did not substantially increase classification accuracy, thus highlighting the possibility of reducing the amount of pre-processing for real-time applications. PMID:26378544

  5. Pre-Processing Effect on the Accuracy of Event-Based Activity Segmentation and Classification through Inertial Sensors

    PubMed Central

    Fida, Benish; Bernabucci, Ivan; Bibbo, Daniele; Conforto, Silvia; Schmid, Maurizio

    2015-01-01

    Inertial sensors are increasingly being used to recognize and classify physical activities in a variety of applications. For monitoring and fitness applications, it is crucial to develop methods able to segment each activity cycle, e.g., a gait cycle, so that the successive classification step may be more accurate. To increase detection accuracy, pre-processing is often used, with a concurrent increase in computational cost. In this paper, the effect of pre-processing operations on the detection and classification of locomotion activities was investigated, to check whether the presence of pre-processing significantly contributes to an increase in accuracy. The pre-processing stages evaluated in this study were inclination correction and de-noising. Level walking, step ascending, descending and running were monitored by using a shank-mounted inertial sensor. Raw and filtered segments, obtained from a modified version of a rule-based gait detection algorithm optimized for sequential processing, were processed to extract time and frequency-based features for physical activity classification through a support vector machine classifier. The proposed method accurately detected >99% gait cycles from raw data and produced >98% accuracy on these segmented gait cycles. Pre-processing did not substantially increase classification accuracy, thus highlighting the possibility of reducing the amount of pre-processing for real-time applications. PMID:26378544
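
    The two pre-processing stages evaluated in these records, de-noising and inclination correction, might look as follows in outline; the sampling rate, cutoff frequency and gravity-based tilt estimate are assumptions rather than the authors' exact settings.

        import numpy as np
        from scipy.signal import butter, filtfilt

        def denoise(acc, fs=100.0, cutoff=10.0):
            b, a = butter(4, cutoff / (fs / 2))    # 4th-order low-pass Butterworth
            return filtfilt(b, a, acc, axis=0)

        def correct_inclination(acc):
            g = acc.mean(axis=0)                   # estimate gravity from the mean
            g = g / np.linalg.norm(g)
            return acc @ g                         # tilt-invariant vertical component

        acc = np.random.randn(1000, 3) + np.array([0.1, 9.7, 1.0])  # shank IMU, 100 Hz (assumed)
        vertical = correct_inclination(denoise(acc))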

  6. Predictor for the effect of amino acid composition on CD4+ T cell epitopes preprocessing.

    PubMed

    Hoze, Ehud; Tsaban, Lea; Maman, Yaakov; Louzoun, Yoram

    2013-05-31

    Predictive tools for all levels of CD8+ T cell epitope processing have reached a maturation level. Good prediction algorithms have been developed for proteasomal cleavage, TAP and MHC class I peptide binding. The same cannot be said of CD4+ T cell epitopes. While multiple algorithms of varying accuracy have been proposed for MHC class II peptide binding, the preprocessing of CD4+ T cell epitopes still lacks a good prediction algorithm. CD4+ T cell epitope generation includes several stages, not all of which are well defined. Here we group these stages to produce a generic preprocessing-stage predictor for the cleavage processes preceding the presentation of epitopes to CD4+ T cells. The predictor is learnt using a combination of in vitro cleavage experiments and observed naturally processed MHC class II binding peptides. The properties of the predictor highlight the effect of different factors on CD4+ T cell epitope preprocessing. The most important factor emerging from the predictor is the secondary structure of the cleaved region in the protein. The effect of the secondary structure is expected, since CD4+ T cell epitopes are not denatured before cleavage. A website developed based on this predictor is available at: http://peptibase.cs.biu.ac.il/PepCleave_cd4/. PMID:23481624

  7. Study of data preprocess for HJ-1A satellite HSI image

    NASA Astrophysics Data System (ADS)

    Gao, Hai-liang; Gu, Xing-fa; Yu, Tao; He, Hua-ying; Zhu, Ling-ya; Wang, Feng

    2015-08-01

    Hyper Spectral Imager (HSI) is the first Chinese space-borne hyperspectral sensor, aboard the HJ-1A satellite. We have developed a data preprocessing flow for HSI images, which includes destriping, atmospheric correction and spectral filtering. In this paper, the product levels of HSI images are introduced first, and a destriping method for HSI level 2 images is proposed. Then an atmospheric correction method based on the radiative transfer mechanism is summarized to retrieve ground reflectance from HSI images. Furthermore, a new spectral filtering method for the ground reflectance spectra after atmospheric correction is proposed, based on a reference ground spectral database. Lastly, an HSI image acquired over Lake Dali in Inner Mongolia was used to evaluate the effect of the preprocessing methods. The HSI image after destriping was compared with the original HSI image, which shows that the stripe noise has been removed effectively. Both the un-smoothed reflectance spectra and the spectra smoothed with the proposed preprocessing method are compared with the reflectance spectra derived with the well-known FLAASH method. The results show that the spectra become much smoother after application of the spectral filtering algorithm. It was also found that the spectra obtained with this new preprocessing method give results similar to those of the FLAASH method.
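
    One simple destriping approach consistent with the description above is per-column moment matching; this is a hedged sketch of the general technique, not the specific HSI level 2 method.

        import numpy as np

        def destripe(band):
            """Match each detector column's mean/std to the band-wide statistics."""
            col_mean = band.mean(axis=0)
            col_std = band.std(axis=0) + 1e-12
            return (band - col_mean) / col_std * band.std() + band.mean()

        band = np.random.rand(200, 128) * 100
        band[:, ::8] *= 1.3                     # simulate stripes on every 8th column
        clean = destripe(band)
        print(clean.std(axis=0).round(1)[:8])   # column spreads now roughly equal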

  8. Microarrays in hematology.

    PubMed

    Walker, Josef; Flower, Darren; Rigley, Kevin

    2002-01-01

    Microarrays are fast becoming routine tools for the high-throughput analysis of gene expression in a wide range of biologic systems, including hematology. Although a number of approaches can be taken when implementing microarray-based studies, all are capable of providing important insights into biologic function. Although some technical issues have not been resolved, microarrays will continue to make a significant impact on hematologically important research. PMID:11753074

  9. A preprocessing tool for removing artifact from cardiac RR interval recordings using three-dimensional spatial distribution mapping.

    PubMed

    Stapelberg, Nicolas J C; Neumann, David L; Shum, David H K; McConnell, Harry; Hamilton-Craig, Ian

    2016-04-01

    Artifact is common in cardiac RR interval data that is recorded for heart rate variability (HRV) analysis. A novel algorithm for artifact detection and interpolation in RR interval data is described. It is based on spatial distribution mapping of RR interval magnitude and relationships to adjacent values in three dimensions. The characteristics of normal physiological RR intervals and artifact intervals were established using 24-h recordings from 20 technician-assessed human cardiac recordings. The algorithm was incorporated into a preprocessing tool and validated using 30 artificial RR (ARR) interval data files, to which known quantities of artifact (0.5%, 1%, 2%, 3%, 5%, 7%, 10%) were added. The impact of preprocessing ARR files with 1% added artifact was also assessed using 10 time domain and frequency domain HRV metrics. The preprocessing tool was also used to preprocess 69 24-h human cardiac recordings. The tool was able to remove artifact from technician-assessed human cardiac recordings (sensitivity 0.84, SD = 0.09, specificity of 1.00, SD = 0.01) and artificial data files. The removal of artifact had a low impact on time domain and frequency domain HRV metrics (ranging from 0% to 2.5% change in values). This novel preprocessing tool can be used with human 24-h cardiac recordings to remove artifact while minimally affecting physiological data and therefore having a low impact on HRV measures of that data. PMID:26751605
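
    One plausible reading of the three-dimensional spatial distribution mapping, sketched here with assumed thresholds (not the validated tool's parameters): each RR interval is placed at (value, difference to previous, difference to next), and points far from the cloud of normal beats are flagged and interpolated.

        import numpy as np

        def clean_rr(rr, z=4.0):
            pts = np.column_stack([rr[1:-1], np.diff(rr)[:-1], np.diff(rr)[1:]])
            center = np.median(pts, axis=0)
            scale = np.median(np.abs(pts - center), axis=0) * 1.4826 + 1e-9  # robust SD
            bad = np.any(np.abs(pts - center) / scale > z, axis=1)
            idx = np.arange(1, len(rr) - 1)
            out = rr.copy()
            out[idx[bad]] = np.interp(idx[bad], idx[~bad], out[idx[~bad]])  # fill artifacts
            return out

        rr = np.random.normal(800, 30, 500)  # RR intervals in ms
        rr[100] = 2400                       # a missed-beat artifact
        print(clean_rr(rr)[100].round())     # back near 800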

  10. Comparison of planar images and SPECT with Bayesian preprocessing for the demonstration of facial anatomy and craniomandibular disorders

    SciTech Connect

    Kircos, L.T.; Ortendahl, D.A.; Hattner, R.S.; Faulkner, D.; Taylor, R.L.

    1984-01-01

    Craniomandibular disorders involving the facial anatomy may be difficult to demonstrate in planar images. Although bone scanning is generally more sensitive than radiography, facial bone anatomy is complex and focal areas of increased or decreased radiotracer may become obscured by overlapping structures in planar images. Thus SPECT appears ideally suited to examination of the facial skeleton. A series of patients with craniomandibular disorders of unknown origin were imaged using 20 mCi Tc-99m MDP. Planar and SPECT (Siemens 7500 ZLC Orbiter) images were obtained four hours after injection. The SPECT images were reconstructed with a filtered back-projection algorithm. In order to improve image contrast and resolution in SPECT images, the rotation views were pre-processed with a Bayesian deblurring algorithm, which has previously been shown to offer improved contrast and resolution in planar images. SPECT images using the pre-processed rotation views were obtained and compared to the SPECT images without pre-processing and to the planar images. TMJ arthropathy involving either the glenoid fossa or the mandibular condyle, orthopedic changes involving the mandible or maxilla, localized dental pathosis, as well as changes in structures peripheral to the facial skeleton were identified. Bayesian pre-processed SPECT depicted the facial skeleton more clearly and provided a more obvious demonstration of the bony changes associated with craniomandibular disorders than either planar images or SPECT without pre-processing.

  11. Measurement data preprocessing in a radar-based system for monitoring of human movements

    NASA Astrophysics Data System (ADS)

    Morawski, Roman Z.; Miȩkina, Andrzej; Bajurko, Paweł R.

    2015-02-01

    The importance of research on new technologies that could be employed in care services for elderly people is highlighted. The need to examine the applicability of various sensor systems for non-invasive monitoring of the movements and vital bodily functions, such as heart beat or breathing rhythm, of elderly persons in their home environment is justified. An extensive overview of the literature concerning existing monitoring techniques is provided. A technological potential behind radar sensors is indicated. A new class of algorithms for preprocessing of measurement data from impulse radar sensors, when applied for elderly people monitoring, is proposed. Preliminary results of numerical experiments performed on those algorithms are demonstrated.

  12. Data preprocessing methods of FT-NIR spectral data for the classification of cooking oil

    NASA Astrophysics Data System (ADS)

    Ruah, Mas Ezatul Nadia Mohd; Rasaruddin, Nor Fazila; Fong, Sim Siong; Jaafar, Mohd Zuli

    2014-12-01

    This recent work describes data pre-processing methods for FT-NIR spectroscopy datasets of cooking oil and its quality parameters, using chemometric methods. Pre-processing of near-infrared (NIR) spectral data has become an integral part of chemometrics modelling. Hence, this work is dedicated to investigating the utility and effectiveness of pre-processing algorithms, namely row scaling, column scaling and single scaling with Standard Normal Variate (SNV). The combinations of these scaling methods have an impact on exploratory analysis and classification via Principal Component Analysis (PCA) plots. The samples were divided into palm oil and non-palm cooking oil. The classification model was built using FT-NIR cooking oil spectral datasets in absorbance mode over the range 4000-14000 cm-1. A Savitzky-Golay derivative was applied before developing the classification model. The data were then separated into a training set and a test set using the Duplex method, with the number of samples in each class kept equal to 2/3 of the class with the minimum number of samples. The t-statistic was employed as a variable selection method in order to select the variables that are significant for the classification models. The evaluation of the data pre-processing considered the modified silhouette width (mSW), PCA and the Percentage Correctly Classified (%CC). The results show that different pre-processing strategies lead to substantially different model performance. The effects of the several pre-processing methods, i.e. row scaling, column standardisation and single scaling with Standard Normal Variate, are indicated by mSW and %CC. With a two-PC model, all five classifiers gave high %CC except Quadratic Distance Analysis.
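
    Standard Normal Variate scaling, mentioned above, is simple to state: each spectrum is centred and divided by its own standard deviation, removing multiplicative scatter differences between samples. A minimal sketch with assumed toy spectra:

        import numpy as np

        def snv(spectra):
            """spectra: (n_samples, n_wavenumbers) FT-NIR absorbance matrix."""
            mean = spectra.mean(axis=1, keepdims=True)
            std = spectra.std(axis=1, keepdims=True)
            return (spectra - mean) / std

        X = np.random.rand(12, 1500) * np.random.rand(12, 1)  # toy spectra (assumed)
        Xs = snv(X)
        print(Xs.mean(axis=1).round(6), Xs.std(axis=1).round(6))  # ~0 and ~1 per row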

  13. Antibiotic treatment algorithm development based on a microarray nucleic acid assay for rapid bacterial identification and resistance determination from positive blood cultures.

    PubMed

    Rödel, Jürgen; Karrasch, Matthias; Edel, Birgit; Stoll, Sylvia; Bohnert, Jürgen; Löffler, Bettina; Saupe, Angela; Pfister, Wolfgang

    2016-03-01

    Rapid diagnosis of bloodstream infections remains a challenge for the early targeting of an antibiotic therapy in sepsis patients. In recent studies, the reliability of the Nanosphere Verigene Gram-positive and Gram-negative blood culture (BC-GP and BC-GN) assays for the rapid identification of bacteria and resistance genes directly from positive BCs has been demonstrated. In this work, we have developed a model to define treatment recommendations by combining Verigene test results with knowledge on local antibiotic resistance patterns of bacterial pathogens. The data of 275 positive BCs were analyzed. Two hundred sixty-three isolates (95.6%) were included in the Verigene assay panels, and 257 isolates (93.5%) were correctly identified. The agreement of the detection of resistance genes with subsequent phenotypic susceptibility testing was 100%. The hospital antibiogram was used to develop a treatment algorithm on the basis of Verigene results that may contribute to a faster patient management. PMID:26712265

  14. A perceptual preprocess method for 3D-HEVC

    NASA Astrophysics Data System (ADS)

    Shi, Yawen; Wang, Yongfang; Wang, Yubing

    2015-08-01

    A perceptual preprocessing method for 3D-HEVC coding is proposed in this paper. First, we propose a new JND model that accounts for the luminance contrast masking effect, the spatial masking effect, the temporal masking effect, saliency characteristics and depth information. We utilize the spectral residual approach to obtain the saliency map and build a visual saliency factor based on it. In order to distinguish the sensitivity of objects at different depths, we segment each texture frame into foreground and background with an automatic threshold selection algorithm using the corresponding depth information, and then build a depth weighting factor. A JND modulation factor, formed as a linear combination of the visual saliency factor and the depth weighting factor, adjusts the JND threshold. We then apply the proposed JND model to 3D-HEVC for residual filtering and distortion coefficient processing. In the filtering process, a residual value is set to zero if the JND threshold is greater than its magnitude; otherwise the JND threshold is subtracted from it. Experimental results demonstrate that the proposed method achieves an average bit rate reduction of 15.11% compared to the original coding scheme with HTM12.1, while maintaining the same subjective quality.
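
    The residual filtering rule described above amounts to soft-thresholding the residual by the JND value; here is a direct sketch (the JND map itself, combining masking, saliency and depth factors, is assumed to be given).

        import numpy as np

        def jnd_filter_residual(residual, jnd):
            # Below the JND threshold: imperceptible, set to zero.
            # Above it: shrink the magnitude by the threshold.
            return np.where(np.abs(residual) <= jnd, 0.0,
                            residual - np.sign(residual) * jnd)

        res = np.array([-6.0, -1.0, 0.5, 2.0, 9.0])
        jnd = np.array([2.0, 2.0, 2.0, 3.0, 3.0])
        print(jnd_filter_residual(res, jnd))  # [-4.  0.  0.  0.  6.]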

  15. Microarrays--status and prospects.

    PubMed

    Venkatasubbarao, Srivatsa

    2004-12-01

    Microarrays have become an extremely important research tool for life science researchers and are also beginning to be used in diagnostic, treatment and monitoring applications. This article provides a detailed description of microarrays prepared by in situ synthesis, deposition using microspotting methods, nonplanar bead arrays, flow-through microarrays, optical fiber bundle arrays and nanobarcodes. The problems and challenges in the development of microarrays, development of standards and diagnostic microarrays are described. Tables summarizing the vendor list of various derivatized microarray surfaces, commercially sold premade microarrays, bead arrays and unique microarray products in development are also included. PMID:15542153

  16. An automated method for gridding and clustering-based segmentation of cDNA microarray images.

    PubMed

    Giannakeas, Nikolaos; Fotiadis, Dimitrios I

    2009-01-01

    Microarrays are widely used to quantify gene expression levels. Microarray image analysis is one of the tools which are necessary when dealing with vast amounts of biological data. In this work we propose a new method for the automated analysis of microarray images. The proposed method consists of two stages: gridding and segmentation. Initially, the microarray images are preprocessed using template matching, and block and spot finding takes place. Then, the non-expressed spots are detected and a grid is fitted to the image using a Voronoi diagram. In the segmentation stage, K-means and Fuzzy C-means (FCM) clustering are employed. The proposed method was evaluated using images from the Stanford Microarray Database (SMD). The results presented for the segmentation stage show the efficiency of our Fuzzy C-means-based approach compared to two previously developed K-means-based methods. The proposed method can handle images with artefacts and it is fully automated. PMID:19046850
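
    In the spirit of the segmentation stage described above, this plain-NumPy sketch splits the pixels of one grid cell into spot and background with 2-class K-means on intensity (the initialisation and the synthetic cell contents are assumptions):

        import numpy as np

        def kmeans_segment(cell, iters=20):
            x = cell.ravel().astype(float)
            centers = np.array([x.min(), x.max()])  # init: darkest vs brightest
            for _ in range(iters):
                labels = np.abs(x[:, None] - centers).argmin(axis=1)
                for k in (0, 1):
                    if np.any(labels == k):
                        centers[k] = x[labels == k].mean()
            labels = np.abs(x[:, None] - centers).argmin(axis=1)
            return (labels == 1).reshape(cell.shape)  # True = foreground (spot)

        yy, xx = np.mgrid[:21, :21]
        cell = 50 + 10 * np.random.rand(21, 21)
        cell[(yy - 10) ** 2 + (xx - 10) ** 2 < 36] += 200  # synthetic expressed spot
        print(kmeans_segment(cell).sum())                  # ~113 px, the spot area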

  17. Full automatic preprocessing of digital map for 2.5D ray tracing propagation model in urban microcellular environment

    NASA Astrophysics Data System (ADS)

    Liu, Zhong-Yu; Guo, Li-Xin; Tao, Wei

    2013-08-01

    Due to the importance of the digital map to the ray-tracing (RT) algorithm, intelligent preprocessing techniques for the geometric information of buildings are improved, taking into account the characteristics of the quasi-three-dimensional (2.5D) RT method. By using these techniques, geometrical factors which have little or no effect on the prediction results are removed from the digital map, and the number of blocking tests executed in the RT routine is reduced. With the proposed preprocessing of the digital map in urban microcellular environments, the improvement in computational efficiency is clearly demonstrated without sensibly affecting the accuracy of the propagation prediction.

  18. Application of preprocessing filtering on Decision Tree C4.5 and rough set theory

    NASA Astrophysics Data System (ADS)

    Chan, Joseph C. C.; Lin, Tsau Y.

    2001-03-01

    This paper compares two artificial intelligence methods, Decision Tree C4.5 and Rough Set Theory, on stock market data. Decision Tree C4.5 is reviewed alongside Rough Set Theory. An enhanced window application is developed to facilitate pre-processing filtering by introducing feature (attribute) transformations, which allow users to input formulas and create new attributes. The application also produces three varieties of data set, using delaying, averaging, and summation. The results demonstrate the improvement that pre-processing with feature (attribute) transformations brings to Decision Tree C4.5. Moreover, the comparison between Decision Tree C4.5 and Rough Set Theory is based on clarity, automation, accuracy, dimensionality, raw data, and speed, and is supported by the rule sets generated by both algorithms on three different sets of data.
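
    The three data-set variants mentioned above (delaying, averaging and summation) correspond to standard lag and rolling-window transformations; a small illustrative sketch with assumed column names and window lengths:

        import pandas as pd

        prices = pd.Series([10.0, 10.2, 10.1, 10.5, 10.4, 10.8], name="close")
        features = pd.DataFrame({
            "delayed_1": prices.shift(1),        # delaying
            "avg_3": prices.rolling(3).mean(),   # averaging
            "sum_3": prices.rolling(3).sum(),    # summation
        })
        print(features)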

  19. Microarray Analysis in Glioblastomas.

    PubMed

    Bhawe, Kaumudi M; Aghi, Manish K

    2016-01-01

    Microarray analysis in glioblastomas is done using either cell lines or patient samples as starting material. A survey of the current literature points to transcript-based microarrays and immunohistochemistry (IHC)-based tissue microarrays as being the preferred methods of choice in cancers of neurological origin. Microarray analysis may be carried out for various purposes including the following: i. To correlate gene expression signatures of glioblastoma cell lines or tumors with response to chemotherapy (DeLay et al., Clin Cancer Res 18(10):2930-2942, 2012). ii. To correlate gene expression patterns with biological features like proliferation or invasiveness of the glioblastoma cells (Jiang et al., PLoS One 8(6):e66008, 2013). iii. To discover new tumor classificatory systems based on gene expression signature, and to correlate therapeutic response and prognosis with these signatures (Huse et al., Annu Rev Med 64(1):59-70, 2013; Verhaak et al., Cancer Cell 17(1):98-110, 2010). While investigators can sometimes use archived tumor gene expression data available from repositories such as the NCBI Gene Expression Omnibus to answer their questions, new arrays must often be run to adequately answer specific questions. Here, we provide a detailed description of microarray methodologies, how to select the appropriate methodology for a given question, and analytical strategies that can be used. Experimental methodology for protein microarrays is outside the scope of this chapter, but basic sample preparation techniques for transcript-based microarrays are included here. PMID:26113463

  20. Comparing Binaural Pre-processing Strategies I

    PubMed Central

    Krawczyk-Becker, Martin; Marquardt, Daniel; Völker, Christoph; Hu, Hongmei; Herzke, Tobias; Coleman, Graham; Adiloğlu, Kamil; Ernst, Stephan M. A.; Gerkmann, Timo; Doclo, Simon; Kollmeier, Birger; Hohmann, Volker; Dietz, Mathias

    2015-01-01

    In a collaborative research project, several monaural and binaural noise reduction algorithms have been comprehensively evaluated. In this article, eight selected noise reduction algorithms were assessed using instrumental measures, with a focus on the instrumental evaluation of speech intelligibility. Four distinct, reverberant scenarios were created to reflect everyday listening situations: a stationary speech-shaped noise, a multitalker babble noise, a single interfering talker, and a realistic cafeteria noise. Three instrumental measures were employed to assess predicted speech intelligibility and predicted sound quality: the intelligibility-weighted signal-to-noise ratio, the short-time objective intelligibility measure, and the perceptual evaluation of speech quality. The results show substantial improvements in predicted speech intelligibility as well as sound quality for the proposed algorithms. The evaluated coherence-based noise reduction algorithm was able to provide improvements in predicted audio signal quality. For the tested single-channel noise reduction algorithm, improvements in intelligibility-weighted signal-to-noise ratio were observed in all but the nonstationary cafeteria ambient noise scenario. Binaural minimum variance distortionless response beamforming algorithms performed particularly well in all noise scenarios. PMID:26721920

  1. An Overview of DNA Microarray Grid Alignment and Foreground Separation Approaches

    NASA Astrophysics Data System (ADS)

    Bajcsy, Peter

    2006-12-01

    This paper overviews DNA microarray grid alignment and foreground separation approaches. Microarray grid alignment and foreground separation are the basic processing steps of DNA microarray images that affect the quality of gene expression information, and hence impact our confidence in any data-derived biological conclusions. Thus, understanding microarray data processing steps becomes critical for performing optimal microarray data analysis. In the past, the grid alignment and foreground separation steps have not been covered extensively in the survey literature. We present several classifications of existing algorithms, and describe the fundamental principles of these algorithms. Challenges related to automation and reliability of processed image data are outlined at the end of this overview paper.

  2. SeqTrim: a high-throughput pipeline for pre-processing any type of sequence read

    PubMed Central

    2010-01-01

    Background High-throughput automated sequencing has enabled an exponential growth rate of sequencing data. This requires increasing sequence quality and reliability in order to avoid database contamination with artefactual sequences. The arrival of pyrosequencing enhances this problem and necessitates customisable pre-processing algorithms. Results SeqTrim has been implemented both as a Web and as a standalone command line application. Already-published and newly-designed algorithms have been included to identify sequence inserts, to remove low quality, vector, adaptor, low complexity and contaminant sequences, and to detect chimeric reads. The availability of several input and output formats allows its inclusion in sequence processing workflows. Due to its specific algorithms, SeqTrim outperforms other pre-processors implemented as Web services or standalone applications. It performs equally well with sequences from EST libraries, SSH libraries, genomic DNA libraries and pyrosequencing reads and does not lead to over-trimming. Conclusions SeqTrim is an efficient pipeline designed for pre-processing of any type of sequence read, including next-generation sequencing. It is easily configurable and provides a friendly interface that allows users to know what happened with sequences at every pre-processing stage, and to verify pre-processing of an individual sequence if desired. The recommended pipeline reveals more information about each sequence than previously described pre-processors and can discard more sequencing or experimental artefacts. PMID:20089148
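
    One representative pre-processing step of this kind, 3'-end quality trimming with a sliding window, can be sketched as follows (window size and cutoff are assumptions, not SeqTrim's published defaults):

        def quality_trim(seq, quals, window=10, min_q=20):
            """Trim the 3' end until the windowed mean quality reaches min_q."""
            for end in range(len(seq), window - 1, -1):
                if sum(quals[end - window:end]) / window >= min_q:
                    return seq[:end], quals[:end]
            return "", []

        seq = "ACGTACGTACGTACGTACGT"
        quals = [35] * 12 + [8] * 8           # quality collapses near the 3' end
        trimmed, _ = quality_trim(seq, quals)
        print(trimmed)                        # keeps the high-quality prefix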

  3. A survey of visual preprocessing and shape representation techniques

    NASA Technical Reports Server (NTRS)

    Olshausen, Bruno A.

    1988-01-01

    Many recent theories and methods proposed for visual preprocessing and shape representation are summarized. The survey brings together research from the fields of biology, psychology, computer science, electrical engineering, and most recently, neural networks. It was motivated by the need to preprocess images for a sparse distributed memory (SDM), but the techniques presented may also prove useful for applying other associative memories to visual pattern recognition. The material of this survey is divided into three sections: an overview of biological visual processing; methods of preprocessing (extracting parts of shape, texture, motion, and depth); and shape representation and recognition (form invariance, primitives and structural descriptions, and theories of attention).

  4. Comparing Binaural Pre-processing Strategies II

    PubMed Central

    Hu, Hongmei; Krawczyk-Becker, Martin; Marquardt, Daniel; Herzke, Tobias; Coleman, Graham; Adiloğlu, Kamil; Bomke, Katrin; Plotz, Karsten; Gerkmann, Timo; Doclo, Simon; Kollmeier, Birger; Hohmann, Volker; Dietz, Mathias

    2015-01-01

    Several binaural audio signal enhancement algorithms were evaluated with respect to their potential to improve speech intelligibility in noise for users of bilateral cochlear implants (CIs). 50% speech reception thresholds (SRT50) were assessed using an adaptive procedure in three distinct, realistic noise scenarios. All scenarios were highly nonstationary, complex, and included a significant amount of reverberation. Other aspects, such as the perfectly frontal target position, were idealized laboratory settings, allowing the algorithms to perform better than in corresponding real-world conditions. Eight bilaterally implanted CI users, wearing devices from three manufacturers, participated in the study. In all noise conditions, a substantial improvement in SRT50 compared to the unprocessed signal was observed for most of the algorithms tested, with the largest improvements generally provided by binaural minimum variance distortionless response (MVDR) beamforming algorithms. The largest overall improvement in speech intelligibility was achieved by an adaptive binaural MVDR in a spatially separated, single competing talker noise scenario. A no-pre-processing condition and adaptive differential microphones without a binaural link served as the two baseline conditions. SRT50 improvements provided by the binaural MVDR beamformers surpassed the performance of the adaptive differential microphones in most cases. Speech intelligibility improvements predicted by instrumental measures were shown to account for some but not all aspects of the perceptually obtained SRT50 improvements measured in bilaterally implanted CI users. PMID:26721921

  5. Characterizing the continuously acquired cardiovascular time series during hemodialysis, using median hybrid filter preprocessing noise reduction

    PubMed Central

    Wilson, Scott; Bowyer, Andrea; Harrap, Stephen B

    2015-01-01

    The clinical characterization of cardiovascular dynamics during hemodialysis (HD) has important pathophysiological implications from diagnostic, cardiovascular risk assessment, and treatment efficacy perspectives. Currently the diagnosis of significant intradialytic systolic blood pressure (SBP) changes among HD patients is imprecise and opportunistic, reliant upon the presence of hypotensive symptoms in conjunction with coincident but isolated noninvasive brachial cuff blood pressure (NIBP) readings. Considering hemodynamic variables as a time series makes a continuous recording approach more desirable than intermittent measures; however, in the clinical environment, the data signal is susceptible to corruption due to both impulsive and Gaussian-type noise. Signal preprocessing is an attractive solution to this problem. Prospectively collected continuous noninvasive SBP data over the short-break intradialytic period in ten patients were preprocessed using a novel median hybrid filter (MHF) algorithm and compared with 50 time-coincident pairs of intradialytic NIBP measures from routine HD practice. The median hybrid preprocessing technique for continuously acquired cardiovascular data yielded a dynamic regression without significant noise and artifact, suitable for high-level profiling of time-dependent SBP behavior. Signal accuracy is highly comparable with standard NIBP measurement, with the added clinical benefit of dynamic real-time hemodynamic information. PMID:25678827
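
    A common form of median hybrid filter, the FIR median hybrid (FMH) filter, takes the median of a backward average, the current sample and a forward average; the sketch below is this general technique with an assumed window length, not the study's exact MHF algorithm.

        import numpy as np

        def median_hybrid(x, k=5):
            y = x.astype(float).copy()
            for i in range(k, len(x) - k):
                back = x[i - k:i].mean()          # backward linear sub-filter
                fwd = x[i + 1:i + 1 + k].mean()   # forward linear sub-filter
                y[i] = np.median([back, x[i], fwd])
            return y

        sbp = 120 + 5 * np.sin(np.arange(300) / 20) + np.random.randn(300)
        sbp[150] = 300                            # impulsive measurement artifact
        print(median_hybrid(sbp)[150].round())    # restored to ~120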

  6. Flexibility and utility of pre-processing methods in converting STXM setups for ptychography - Final Paper

    SciTech Connect

    Fromm, Catherine

    2015-08-20

    Ptychography is an advanced diffraction-based imaging technique that can achieve resolution of 5 nm and below. It is done by scanning a sample through a beam of focused x-rays using discrete yet overlapping scan steps. Scattering data are collected on a CCD camera, and the phase of the scattered light is reconstructed with sophisticated iterative algorithms. Because the experimental setup is similar, ptychography setups can be created by retrofitting existing STXM beamlines with new hardware. The other challenge lies in the reconstruction of the collected scattering images. Scattering data must be adjusted and packaged with experimental parameters to calibrate the reconstruction software. The necessary pre-processing of data prior to reconstruction is unique to each beamline setup, and even to the optical alignment used on a particular day. Pre-processing software must be developed to be flexible and efficient in order to allow experimenters appropriate control and freedom in the analysis of their hard-won data. This paper describes the implementation of pre-processing software which successfully connects data collection steps to reconstruction steps, letting the user accomplish accurate and reliable ptychography.

  7. Data Analysis Strategies for Protein Microarrays

    PubMed Central

    Díez, Paula; Dasilva, Noelia; González-González, María; Matarraz, Sergio; Casado-Vela, Juan; Orfao, Alberto; Fuentes, Manuel

    2012-01-01

    Microarrays constitute a new platform which allows the discovery and characterization of proteins. According to different features, such as content, surface or detection system, there are many types of protein microarrays which can be applied for the identification of disease biomarkers and the characterization of protein expression patterns. However, the analysis and interpretation of the amount of information generated by microarrays remain a challenge. Further data analysis strategies are essential to obtain representative and reproducible results. Therefore, the experimental design is key, since the number of samples and dyes, among other aspects, defines the appropriate analysis method to be used. In this sense, several algorithms have been proposed so far to overcome the analytical difficulties derived from fluorescence overlapping and/or background noise. Each kind of microarray is developed to fulfill a specific purpose. Therefore, the selection of appropriate analytical and data analysis strategies is crucial to achieve successful biological conclusions. In the present review, we focus on current algorithms and main strategies for data interpretation.

  8. Nanotechnologies in protein microarrays.

    PubMed

    Krizkova, Sona; Heger, Zbynek; Zalewska, Marta; Moulick, Amitava; Adam, Vojtech; Kizek, Rene

    2015-01-01

    Protein microarray technology has become an important research tool for the study and detection of proteins, protein-protein interactions and a number of other applications. The utilization of nanoparticle-based materials and nanotechnology-based techniques for immobilization allows us not only to extend the surface available for biomolecule immobilization, resulting in enhanced substrate binding properties and decreased background signals, but also to enhance reporter systems for more sensitive assays. Generally, multiple nanotechnology-based techniques are combined in contemporary microarray systems. In this review, applications of nanoparticles and nanotechnologies in the creation of protein microarrays, protein immobilization and detection are summarized. We anticipate that advanced nanotechnologies can be exploited to expand promising fields such as protein identification, monitoring of protein-protein or drug-protein interactions, and protein structures. PMID:26039143

  9. Comparing Binaural Pre-processing Strategies III: Speech Intelligibility of Normal-Hearing and Hearing-Impaired Listeners.

    PubMed

    Völker, Christoph; Warzybok, Anna; Ernst, Stephan M A

    2015-01-01

    A comprehensive evaluation of eight signal pre-processing strategies, including directional microphones, coherence filters, single-channel noise reduction, binaural beamformers, and their combinations, was undertaken with normal-hearing (NH) and hearing-impaired (HI) listeners. Speech reception thresholds (SRTs) were measured in three noise scenarios (multitalker babble, cafeteria noise, and single competing talker). Predictions of three common instrumental measures were compared with the general perceptual benefit caused by the algorithms. The individual SRTs measured without pre-processing and individual benefits were objectively estimated using the binaural speech intelligibility model. Ten listeners with NH and 12 HI listeners participated. The participants varied in age and pure-tone threshold levels. Although HI listeners required a better signal-to-noise ratio to obtain 50% intelligibility than listeners with NH, no differences in SRT benefit from the different algorithms were found between the two groups. With the exception of single-channel noise reduction, all algorithms showed an improvement in SRT of between 2.1 dB (in cafeteria noise) and 4.8 dB (in single competing talker condition). Model predictions with the binaural speech intelligibility model explained 83% of the measured variance of the individual SRTs in the no pre-processing condition. Regarding the benefit from the algorithms, the instrumental measures were not able to predict the perceptual data in all tested noise conditions. The comparable benefit observed for both groups suggests a possible application of noise reduction schemes for listeners with different hearing status. Although the model can predict the individual SRTs without pre-processing, further development is necessary to predict the benefits obtained from the algorithms at an individual level. PMID:26721922

  10. Design of radial basis function neural network classifier realized with the aid of data preprocessing techniques: design and analysis

    NASA Astrophysics Data System (ADS)

    Oh, Sung-Kwun; Kim, Wook-Dong; Pedrycz, Witold

    2016-05-01

    In this paper, we introduce a new architecture of optimized Radial Basis Function neural network classifier developed with the aid of fuzzy clustering and data preprocessing techniques and discuss its comprehensive design methodology. In the preprocessing part, the Linear Discriminant Analysis (LDA) or Principal Component Analysis (PCA) algorithm forms a front end of the network. The transformed data produced here are used as the inputs of the network. In the premise part, the Fuzzy C-Means (FCM) algorithm determines the receptive field associated with the condition part of the rules. The connection weights of the classifier are of functional nature and come as polynomial functions forming the consequent part. The Particle Swarm Optimization algorithm optimizes a number of essential parameters needed to improve the accuracy of the classifier. Those optimized parameters include the type of data preprocessing, the dimensionality of the feature vectors produced by the LDA (or PCA), the number of clusters (rules), the fuzzification coefficient used in the FCM algorithm and the orders of the polynomials of networks. The performance of the proposed classifier is reported for several benchmarking data-sets and is compared with the performance of other classifiers reported in the previous studies.
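
    A rough pipeline sketch of the architecture described above, with loudly flagged simplifications: PCA as the front end, randomly chosen centres standing in for the FCM receptive fields, and a least-squares linear readout instead of the PSO-tuned polynomial consequents.

        import numpy as np

        def rbf_fit(X, y, n_centers=5, width=1.0, n_components=3):
            Xc = X - X.mean(axis=0)
            _, _, Vt = np.linalg.svd(Xc, full_matrices=False)   # PCA front end
            Z = Xc @ Vt[:n_components].T
            centers = Z[np.random.choice(len(Z), n_centers, replace=False)]
            Phi = np.exp(-((Z[:, None] - centers[None]) ** 2).sum(-1) / (2 * width ** 2))
            W, *_ = np.linalg.lstsq(Phi, np.eye(y.max() + 1)[y], rcond=None)
            return Vt[:n_components], centers, W

        X = np.random.randn(100, 10)
        y = (X[:, 0] > 0).astype(int)      # toy two-class problem (assumption)
        V, C, W = rbf_fit(X, y)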

  11. AMIC@: All MIcroarray Clusterings @ once

    PubMed Central

    Geraci, Filippo; Pellegrini, Marco; Renda, M. Elena

    2008-01-01

    The AMIC@ Web Server offers a light-weight multi-method clustering engine for microarray gene-expression data. AMIC@ is a highly interactive tool that stresses user-friendliness and robustness by adopting AJAX technology, thus allowing an effective interleaved execution of different clustering algorithms and inspection of results. Among the salient features AMIC@ offers, there are: (i) automatic file format detection, (ii) suggestions on the number of clusters using a variant of the stability-based method of Tibshirani et al., (iii) intuitive visual inspection of the data via heatmaps and (iv) measurements of the clustering quality using cluster homogeneity. Large data sets can be processed efficiently by selecting algorithms (such as FPF-SB and k-Boost), specifically designed for this purpose. In case of very large data sets, the user can opt for a batch-mode use of the system by means of the Clustering wizard that runs all algorithms at once and delivers the results via email. AMIC@ is freely available and open to all users with no login requirement at the following URL http://bioalgo.iit.cnr.it/amica. PMID:18477631

  12. AMIC@: All MIcroarray Clusterings @ once.

    PubMed

    Geraci, Filippo; Pellegrini, Marco; Renda, M Elena

    2008-07-01

    The AMIC@ Web Server offers a light-weight multi-method clustering engine for microarray gene-expression data. AMIC@ is a highly interactive tool that stresses user-friendliness and robustness by adopting AJAX technology, thus allowing an effective interleaved execution of different clustering algorithms and inspection of results. Among the salient features AMIC@ offers, there are: (i) automatic file format detection, (ii) suggestions on the number of clusters using a variant of the stability-based method of Tibshirani et al., (iii) intuitive visual inspection of the data via heatmaps and (iv) measurements of the clustering quality using cluster homogeneity. Large data sets can be processed efficiently by selecting algorithms (such as FPF-SB and k-Boost), specifically designed for this purpose. In case of very large data sets, the user can opt for a batch-mode use of the system by means of the Clustering wizard that runs all algorithms at once and delivers the results via email. AMIC@ is freely available and open to all users with no login requirement at the following URL http://bioalgo.iit.cnr.it/amica. PMID:18477631

  13. Enhancing Interdisciplinary Mathematics and Biology Education: A Microarray Data Analysis Course Bridging These Disciplines

    PubMed Central

    Evans, Irene M.

    2010-01-01

    BIO2010 put forth the goal of improving the mathematical educational background of biology students. The analysis and interpretation of microarray high-dimensional data can be very challenging and is best done by a statistician and a biologist working and teaching in a collaborative manner. We set up such a collaboration and designed a course on microarray data analysis. We started using Genome Consortium for Active Teaching (GCAT) materials and Microarray Genome and Clustering Tool software and added R statistical software along with Bioconductor packages. In response to student feedback, one microarray data set was fully analyzed in class, starting from preprocessing to gene discovery to pathway analysis using the latter software. A class project was to conduct a similar analysis where students analyzed their own data or data from a published journal paper. This exercise showed the impact that filtering, preprocessing, and different normalization methods had on gene inclusion in the final data set. We conclude that this course achieved its goals to equip students with skills to analyze data from a microarray experiment. We offer our insight about collaborative teaching as well as how other faculty might design and implement a similar interdisciplinary course. PMID:20810954

  14. EARLINET Single Calculus Chain - technical - Part 1: Pre-processing of raw lidar data

    NASA Astrophysics Data System (ADS)

    D'Amico, G.; Amodeo, A.; Mattis, I.; Freudenthaler, V.; Pappalardo, G.

    2015-10-01

    In this paper we describe an automatic tool for the pre-processing of lidar data called ELPP (EARLINET Lidar Pre-Processor). It is one of two calculus modules of the EARLINET Single Calculus Chain (SCC), the automatic tool for the analysis of EARLINET data. The ELPP is an open source module that executes instrumental corrections and data handling of the raw lidar signals, making the lidar data ready to be processed by the optical retrieval algorithms. According to the specific lidar configuration, the ELPP automatically performs dead-time correction, atmospheric and electronic background subtraction, gluing of lidar signals, and trigger-delay correction. Moreover, the signal-to-noise ratio of the pre-processed signals can be improved by means of configurable time integration of the raw signals and/or spatial smoothing. The ELPP delivers the statistical uncertainties of the final products by means of error propagation or Monte Carlo simulations. During the development of the ELPP module, particular attention has been paid to make the tool flexible enough to handle all lidar configurations currently used within the EARLINET community. Moreover, it has been designed in a modular way to allow an easy extension to lidar configurations not yet implemented. The primary goal of the ELPP module is to enable the application of quality-assured procedures in the lidar data analysis starting from the raw lidar data. This provides the added value of full traceability of each delivered lidar product. Several tests have been performed to check the proper functioning of the ELPP module. The whole SCC has been tested with the same synthetic data sets, which were used for the EARLINET algorithm inter-comparison exercise. The ELPP module has been successfully employed for the automatic near-real-time pre-processing of the raw lidar data measured during several EARLINET inter-comparison campaigns as well as during intense field campaigns.

  15. EARLINET Single Calculus Chain - technical - Part 1: Pre-processing of raw lidar data

    NASA Astrophysics Data System (ADS)

    D'Amico, Giuseppe; Amodeo, Aldo; Mattis, Ina; Freudenthaler, Volker; Pappalardo, Gelsomina

    2016-02-01

    In this paper we describe an automatic tool for the pre-processing of aerosol lidar data called ELPP (EARLINET Lidar Pre-Processor). It is one of two calculus modules of the EARLINET Single Calculus Chain (SCC), the automatic tool for the analysis of EARLINET data. ELPP is an open source module that executes instrumental corrections and data handling of the raw lidar signals, making the lidar data ready to be processed by the optical retrieval algorithms. According to the specific lidar configuration, ELPP automatically performs dead-time correction, atmospheric and electronic background subtraction, gluing of lidar signals, and trigger-delay correction. Moreover, the signal-to-noise ratio of the pre-processed signals can be improved by means of configurable time integration of the raw signals and/or spatial smoothing. ELPP delivers the statistical uncertainties of the final products by means of error propagation or Monte Carlo simulations. During the development of ELPP, particular attention has been paid to make the tool flexible enough to handle all lidar configurations currently used within the EARLINET community. Moreover, it has been designed in a modular way to allow an easy extension to lidar configurations not yet implemented. The primary goal of ELPP is to enable the application of quality-assured procedures in the lidar data analysis starting from the raw lidar data. This provides the added value of full traceability of each delivered lidar product. Several tests have been performed to check the proper functioning of ELPP. The whole SCC has been tested with the same synthetic data sets, which were used for the EARLINET algorithm inter-comparison exercise. ELPP has been successfully employed for the automatic near-real-time pre-processing of the raw lidar data measured during several EARLINET inter-comparison campaigns as well as during intense field campaigns.
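
    Two of the ELPP corrections named above, dead-time correction and atmospheric background subtraction, are compact enough to sketch; the non-paralyzable dead-time model, the dead-time value and the far-range background window are assumptions, not EARLINET-prescribed settings.

        import numpy as np

        def deadtime_correct(rate_mhz, tau_ns=3.7):
            """Non-paralyzable model: N_true = N / (1 - N * tau)."""
            return rate_mhz / (1.0 - rate_mhz * tau_ns * 1e-3)

        def subtract_background(profile, n_far_bins=500):
            return profile - profile[-n_far_bins:].mean()  # far bins ~ pure background

        raw = np.exp(-np.linspace(0, 5, 4000)) * 50 + 0.8  # toy photon-counting return
        signal = subtract_background(deadtime_correct(raw))
        print(signal[-500:].mean().round(3))               # ~0 after subtraction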

  16. Analysis of microarray experiments of gene expression profiling

    PubMed Central

    Tarca, Adi L.; Romero, Roberto; Draghici, Sorin

    2008-01-01

    The study of gene expression profiling of cells and tissue has become a major tool for discovery in medicine. Microarray experiments allow description of genome-wide expression changes in health and disease. The results of such experiments are expected to change the methods employed in the diagnosis and prognosis of disease in obstetrics and gynecology. Moreover, an unbiased and systematic study of gene expression profiling should allow the establishment of a new taxonomy of disease for obstetric and gynecologic syndromes. Thus, a new era is emerging in which reproductive processes and disorders could be characterized using molecular tools and fingerprinting. The design, analysis, and interpretation of microarray experiments require specialized knowledge that is not part of the standard curriculum of our discipline. This article describes the types of studies that can be conducted with microarray experiments (class comparison, class prediction, class discovery). We discuss key issues pertaining to experimental design, data preprocessing, and gene selection methods. Common types of data representation are illustrated. Potential pitfalls in the interpretation of microarray experiments, as well as the strengths and limitations of this technology, are highlighted. This article is intended to assist clinicians in appraising the quality of the scientific evidence now reported in the obstetric and gynecologic literature. PMID:16890548

  17. Microarrays for Undergraduate Classes

    ERIC Educational Resources Information Center

    Hancock, Dale; Nguyen, Lisa L.; Denyer, Gareth S.; Johnston, Jill M.

    2006-01-01

    A microarray experiment is presented that, in six laboratory sessions, takes undergraduate students from the tissue sample right through to data analysis. The model chosen, the murine erythroleukemia cell line, can be easily cultured in sufficient quantities for class use. Large changes in gene expression can be induced in these cells by…

  18. Real-time multilevel process monitoring and control of CR image acquisition and preprocessing for PACS and ICU

    NASA Astrophysics Data System (ADS)

    Zhang, Jianguo; Wong, Stephen T. C.; Andriole, Katherine P.; Wong, Albert W. K.; Huang, H. K.

    1996-05-01

    The purpose of this paper is to present a control theory and a fault tolerance algorithm developed for real time monitoring and control of acquisition and preprocessing of computed radiographs for PACS and Intensive Care Unit operations. This monitoring and control system uses the event-driven, multilevel processing approach to remove computational bottlenecks and to improve system reliability. Its computational performance and processing reliability are evaluated and compared with those of the traditional, single level processing approach.

  19. Microarray data classified by artificial neural networks.

    PubMed

    Linder, Roland; Richards, Tereza; Wagner, Mathias

    2007-01-01

    Systems biology has enjoyed explosive growth in both the number of people participating in this area of research and the number of publications on the topic. The field of systems biology encompasses the in silico analysis of high-throughput data as provided by DNA or protein microarrays. Along with the increasing availability of microarray data, attention is focused on methods of analyzing the expression rates. One important type of analysis is the classification task, for example, distinguishing different types of cell functions or tumors. Recently, interest has turned toward artificial neural networks (ANN), which have many appealing characteristics, such as an exceptional degree of accuracy, the ability to capture nonlinear relationships, and independence from certain assumptions about the data distribution. The current work reviews advantages as well as disadvantages of neural networks in the context of microarray analysis. Comparisons are drawn to alternative methods. Selected solutions are discussed, and finally algorithms for the effective combination of multiple ANNs are presented. The development of approaches that use ANN-processed microarray data to drive cell and tissue simulations is a topic for future investigation. PMID:18220242
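
    As a minimal sketch of the classification task described above, assuming scikit-learn is available, the following trains a small feed-forward network on a toy expression matrix; the data and network size are placeholders, not the configurations reviewed in the paper.

        import numpy as np
        from sklearn.model_selection import cross_val_score
        from sklearn.neural_network import MLPClassifier
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import StandardScaler

        # Toy expression matrix: 60 samples x 500 genes, two classes.
        rng = np.random.default_rng(0)
        X = rng.normal(size=(60, 500))
        y = rng.integers(0, 2, size=60)

        # Standardize each gene, then fit a small feed-forward network.
        clf = make_pipeline(StandardScaler(),
                            MLPClassifier(hidden_layer_sizes=(32,),
                                          max_iter=2000, random_state=0))
        print(cross_val_score(clf, X, y, cv=5).mean())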

  20. OPSN: The IMS COMSYS 1 and 2 Data Preprocessing System.

    ERIC Educational Resources Information Center

    Yu, John

    The Instructional Management System (IMS) developed by the Southwest Regional Laboratory (SWRL) processes student and teacher-generated data through the use of an optical scanner that produces a magnetic tape (Scan Tape) for input to IMS. A series of computer routines, OPSN, preprocesses the Scan Tape and prepares the data for transmission to the…

  1. The Minimal Preprocessing Pipelines for the Human Connectome Project

    PubMed Central

    Glasser, Matthew F.; Sotiropoulos, Stamatios N; Wilson, J Anthony; Coalson, Timothy S; Fischl, Bruce; Andersson, Jesper L; Xu, Junqian; Jbabdi, Saad; Webster, Matthew; Polimeni, Jonathan R; Van Essen, David C; Jenkinson, Mark

    2013-01-01

    The Human Connectome Project (HCP) faces the challenging task of bringing multiple magnetic resonance imaging (MRI) modalities together in a common automated preprocessing framework across a large cohort of subjects. The MRI data acquired by the HCP differ in many ways from data acquired on conventional 3 Tesla scanners and often require newly developed preprocessing methods. We describe the minimal preprocessing pipelines for structural, functional, and diffusion MRI that were developed by the HCP to accomplish many low level tasks, including spatial artifact/distortion removal, surface generation, cross-modal registration, and alignment to standard space. These pipelines are specially designed to capitalize on the high quality data offered by the HCP. The final standard space makes use of a recently introduced CIFTI file format and the associated grayordinates spatial coordinate system. This allows for combined cortical surface and subcortical volume analyses while reducing the storage and processing requirements for high spatial and temporal resolution data. Here, we provide the minimum image acquisition requirements for the HCP minimal preprocessing pipelines and additional advice for investigators interested in replicating the HCP’s acquisition protocols or using these pipelines. Finally, we discuss some potential future improvements for the pipelines. PMID:23668970

  2. Microarrays under the microscope.

    PubMed

    Wildsmith, S E; Elcock, F J

    2001-02-01

    Microarray technology is a rapidly advancing area, which is gaining popularity in many biological disciplines from drug target identification to predictive toxicology. Over the past few years, there has been a dramatic increase in the number of methods and techniques available for carrying out this form of gene expression analysis. The techniques and associated peripherals, such as slide types, deposition methods, robotics, and scanning equipment, are undergoing constant improvement, helping to drive the technology forward in terms of robustness and ease of use. These rapid developments, combined with the number of options available and the associated hyperbole, can prove daunting for the new user. This review aims to guide the researcher through the various steps of conducting microarray experiments, from initial strategy to analysing the data, with critical examination of the benefits and disadvantages along the way. PMID:11212888

  3. Navigating Public Microarray Databases

    PubMed Central

    Bähler, Jürg

    2004-01-01

    With the ever-escalating amount of data being produced by genome-wide microarray studies, it is of increasing importance that these data are captured in public databases so that researchers can use this information to complement and enhance their own studies. Many groups have set up databases of expression data, ranging from large repositories, which are designed to comprehensively capture all published data, through to more specialized databases. The public repositories, such as ArrayExpress at the European Bioinformatics Institute, contain complete datasets in raw format in addition to processed data, whilst the specialist databases tend to provide downstream analysis of normalized data from more focused studies and data sources. Here we provide a guide to the use of these public microarray resources. PMID:18629145

  4. Navigating public microarray databases.

    PubMed

    Penkett, Christopher J; Bähler, Jürg

    2004-01-01

    With the ever-escalating amount of data being produced by genome-wide microarray studies, it is of increasing importance that these data are captured in public databases so that researchers can use this information to complement and enhance their own studies. Many groups have set up databases of expression data, ranging from large repositories, which are designed to comprehensively capture all published data, through to more specialized databases. The public repositories, such as ArrayExpress at the European Bioinformatics Institute, contain complete datasets in raw format in addition to processed data, whilst the specialist databases tend to provide downstream analysis of normalized data from more focused studies and data sources. Here we provide a guide to the use of these public microarray resources. PMID:18629145

  5. A review of independent component analysis application to microarray gene expression data

    PubMed Central

    Kong, Wei; Vanderburg, Charles R.; Gunshin, Hiromi; Rogers, Jack T.; Huang, Xudong

    2010-01-01

    Independent component analysis (ICA) methods have received growing attention as effective data-mining tools for microarray gene expression data. As a technique of higher-order statistical analysis, ICA is capable of extracting biologically relevant gene expression features from microarray data. Herein we have reviewed the latest applications and the extended algorithms of ICA in gene clustering, classification, and identification. The theoretical frameworks of ICA have been described to further illustrate its feature extraction function in microarray data analysis. PMID:19007336
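
    A minimal sketch of the feature-extraction use of ICA mentioned above, assuming scikit-learn's FastICA and a toy genes-by-samples matrix; the dimensions and component count are illustrative.

        import numpy as np
        from sklearn.decomposition import FastICA

        rng = np.random.default_rng(1)
        X = rng.normal(size=(2000, 20))    # genes x samples (toy data)

        # Decompose expression into statistically independent modes:
        # S holds gene loadings per mode, A maps modes onto samples.
        ica = FastICA(n_components=5, random_state=1)
        S = ica.fit_transform(X)           # (genes, components)
        A = ica.mixing_                    # (samples, components)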

  6. Tiling Microarray Analysis Tools

    SciTech Connect

    Nix, Davis Austin

    2005-05-04

    TiMAT is a package of 23 command line Java applications for use in the analysis of Affymetrix tiled genomic microarray data. TiMAT enables: 1) rebuilding the genome annotation for entire tiled arrays (repeat filtering, chromosomal coordinate assignment); 2) post-processing of oligo intensity values (quantile normalization, median scaling, PMMM transformation); 3) significance testing (Wilcoxon rank sum and signed rank tests, intensity difference and ratio tests) and interval refinement (filtering based on multiple statistics, overlap comparisons); and 4) data visualization (detailed thumbnail/zoomed view with Interval Plots and data export to Affymetrix's Integrated Genome Browser) and data reports (spreadsheet summaries and detailed profiles).
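
    Of the post-processing steps TiMAT lists, quantile normalization is simple enough to sketch in a few lines of numpy; this is a generic textbook version, not TiMAT's Java implementation, and it handles ties crudely.

        import numpy as np

        def quantile_normalize(X):
            # X: oligos x arrays intensity matrix.  Force every array
            # (column) to share the same empirical distribution: rank
            # each column, then replace rank r by the mean of the r-th
            # smallest values across arrays.
            ranks = np.argsort(np.argsort(X, axis=0), axis=0)
            reference = np.sort(X, axis=0).mean(axis=1)
            return reference[ranks]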

  7. Linguistic Preprocessing and Tagging for Problem Report Trend Analysis

    NASA Technical Reports Server (NTRS)

    Beil, Robert J.; Malin, Jane T.

    2012-01-01

    Mr. Robert Beil, Systems Engineer at Kennedy Space Center (KSC), requested the NASA Engineering and Safety Center (NESC) develop a prototype tool suite that combines complementary software technology used at Johnson Space Center (JSC) and KSC for problem report preprocessing and semantic tag extraction, to improve input to data mining and trend analysis. This document contains the outcome of the assessment and the Findings, Observations and NESC Recommendations.

  8. Integration of geometric modeling and advanced finite element preprocessing

    NASA Technical Reports Server (NTRS)

    Shephard, Mark S.; Finnigan, Peter M.

    1987-01-01

    The structure to a geometry based finite element preprocessing system is presented. The key features of the system are the use of geometric operators to support all geometric calculations required for analysis model generation, and the use of a hierarchic boundary based data structure for the major data sets within the system. The approach presented can support the finite element modeling procedures used today as well as the fully automated procedures under development.

  9. Image pre-processing for optimizing automated photogrammetry performances

    NASA Astrophysics Data System (ADS)

    Guidi, G.; Gonizzi, S.; Micoli, L. L.

    2014-05-01

    The purpose of this paper is to analyze how optical pre-processing with polarizing filters and digital pre-processing with HDR imaging may improve the automated 3D modeling pipeline based on SFM and Image Matching, with special emphasis on optically non-cooperative surfaces of shiny or dark materials. Because of the automatic detection of homologous points, the presence of highlights due to shiny materials, or nearly uniform dark patches produced by low reflectance materials, may produce erroneous matching involving wrong 3D point estimations, and consequently holes and topological errors on the mesh originated by the associated dense 3D cloud. This is due to the limited dynamic range of the 8 bit digital images that are matched with each other for generating 3D data. The same 256 levels can be more usefully employed if the actual dynamic range is compressed, avoiding luminance clipping on the darker and lighter image areas. Such an approach is considered here both using optical filtering and HDR processing with tone mapping, with experimental evaluation on different Cultural Heritage objects characterized by non-cooperative optical behavior. Three test images of each object have been captured from different positions, changing the shooting conditions (filter/no-filter) and the image processing (no processing/HDR processing), in order to have the same 3 camera orientations with different optical and digital pre-processing, and applying the same automated process to each photo set.
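
    A hedged sketch of the digital half of this pipeline (HDR merging followed by tone mapping) using OpenCV; the file names and exposure times are hypothetical, and the paper's own processing chain may differ in operator and parameters.

        import cv2
        import numpy as np

        # Hypothetical bracketed exposures of the same view.
        exposures = [cv2.imread(f) for f in ("e1.jpg", "e2.jpg", "e3.jpg")]
        times = np.array([1 / 60, 1 / 15, 1 / 4], dtype=np.float32)

        # Merge to HDR, then tone-map back to 8 bits so the matching
        # stage sees neither clipped highlights nor crushed shadows.
        hdr = cv2.createMergeDebevec().process(exposures, times)
        ldr = cv2.createTonemapDrago(gamma=1.0).process(hdr)
        out = np.clip(ldr * 255, 0, 255).astype(np.uint8)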

  10. Review of feed forward neural network classification preprocessing techniques

    NASA Astrophysics Data System (ADS)

    Asadi, Roya; Kareem, Sameem Abdul

    2014-06-01

    The best feature of Feed Forward Neural Network (FFNN) classification models, as artificial intelligence techniques, is that they learn from input data through their weights. Data preprocessing and pre-training are the contributing factors in developing efficient techniques for low training time and high classification accuracy. In this study, we investigate and review the powerful preprocessing functions of FFNN models. Currently, weights are initialized at random, which is the main source of problems. Multilayer auto-encoder networks, the latest such technique, are, like other related techniques, unable to solve these problems. Weight Linear Analysis (WLA) is a combination of data pre-processing and pre-training that generates real weights from normalized input values. Using WLA, an FFNN model increases classification accuracy and reduces training time to a single epoch, without any further training cycles, gradient computation of the mean square error function, or weight updates. The results of comparison and evaluation show that WLA is a powerful technique in the FFNN classification area.

  11. Face recognition by using optical correlator with wavelet preprocessing

    NASA Astrophysics Data System (ADS)

    Strzelecki, Jacek; Chalasinska-Macukow, Katarzyna

    2004-08-01

    The method of face recognition using an optical correlator with wavelet preprocessing is presented. The wavelet transform is used to improve the performance of a standard Vander Lugt correlator with phase only filter (POF). The influence of various wavelet transforms of images of human faces on the recognition results has been analyzed. The quality of the face recognition process was tested according to two criteria: the peak to correlation energy ratio (PCE) and the discrimination capability (DC). Additionally, proper localization of the correlation peak has been controlled. During the preprocessing step a set of three wavelets (Mexican hat, Haar, and Gabor) with various scales was used. In addition, Gabor wavelets were tested for various orientation angles. During the recognition procedure the input scene and POF are transformed by the same wavelet. We show the results of the computer simulation for a variety of images of human faces: original images without any distortions, noisy images, and images with non-uniform illumination. A comparison of recognition results obtained with and without wavelet preprocessing is given.

  12. Application of filtering techniques in preprocessing magnetic data

    NASA Astrophysics Data System (ADS)

    Liu, Haijun; Yi, Yongping; Yang, Hongxia; Hu, Guochuang; Liu, Guoming

    2010-08-01

    High precision magnetic exploration is a popular geophysical technique owing to its simplicity and effectiveness. Interpretation in high precision magnetic exploration is always difficult because of noise and other disturbance factors, so it is necessary to find an effective preprocessing method to remove the influence of interference before further processing. The common way to do this is filtering, and many filtering methods exist. In this paper we describe in detail three popular filtering techniques: regularized filtering, sliding-average filtering, and compensation smoothing filtering. We then designed the workflow of a filtering program based on these techniques and implemented it in Delphi. To verify it, we applied it to preprocess magnetic data from a site in China. Comparing the initial contour map with the filtered contour map clearly shows the effect of our program: the filtered contour map is very smooth, and the high-frequency components of the data have been removed. After filtering, we separated useful signals from noise, minor anomalies from major anomalies, and local anomalies from regional anomalies, making it easy to focus on the useful information. Our program can be used to preprocess magnetic data, and the results demonstrate its effectiveness.
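
    A small sketch of the two filters named above that have standard library equivalents, assuming scipy; the window sizes are placeholders, and the paper's regularized and compensation-smoothing filters are not reproduced here.

        import numpy as np
        from scipy.signal import convolve2d, wiener

        def sliding_average(grid, size=5):
            # Sliding-average filter for a 2-D magnetic anomaly grid.
            kernel = np.ones((size, size)) / size**2
            return convolve2d(grid, kernel, mode="same", boundary="symm")

        # grid = np.loadtxt("magnetic_grid.txt")   # hypothetical input file
        # smooth = sliding_average(grid, size=5)   # suppress high frequencies
        # denoised = wiener(grid, mysize=5)        # adaptive Wiener alternative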

  13. CLUM: a cluster program for analyzing microarray data.

    PubMed

    Irigoien, I; Fernandez, E; Vives, S; Arenas, C

    2008-08-01

    Microarray technology is increasingly being applied in biological and medical research to address a wide range of problems. Cluster analysis has proven to be a very useful tool for investigating the structure of microarray data. This paper presents a program for clustering microarray data, which is based on the so-called path distance. The algorithm gives in each step a partition into two clusters, and no prior assumptions on the structure of the clusters are required. It assigns each object (gene or sample) to only one cluster and gives the global optimum for the function that quantifies the adequacy of a given partition of the sample into k clusters. The program was tested on experimental data sets, showing the robustness of the algorithm. PMID:18825964

  14. Compressive Sensing DNA Microarrays

    PubMed Central

    2009-01-01

    Compressive sensing microarrays (CSMs) are DNA-based sensors that operate using group testing and compressive sensing (CS) principles. In contrast to conventional DNA microarrays, in which each genetic sensor is designed to respond to a single target, in a CSM, each sensor responds to a set of targets. We study the problem of designing CSMs that simultaneously account for both the constraints from CS theory and the biochemistry of probe-target DNA hybridization. An appropriate cross-hybridization model is proposed for CSMs, and several methods are developed for probe design and CS signal recovery based on the new model. Lab experiments suggest that in order to achieve accurate hybridization profiling, consensus probe sequences are required to have sequence homology of at least 80% with all targets to be detected. Furthermore, out-of-equilibrium datasets are usually as accurate as those obtained from equilibrium conditions. Consequently, one can use CSMs in applications in which only short hybridization times are allowed. PMID:19158952
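
    A toy sketch of the sparse-recovery idea behind CSMs, assuming scikit-learn's Lasso as the CS decoder; the sensing matrix, noise level, and regularization strength are illustrative, not the probe-design model developed in the paper.

        import numpy as np
        from sklearn.linear_model import Lasso

        rng = np.random.default_rng(2)
        m, n, k = 30, 200, 3      # 30 composite probes, 200 targets, 3 present

        Phi = rng.binomial(1, 0.1, size=(m, n)).astype(float)  # probe/target affinity
        x_true = np.zeros(n)
        x_true[rng.choice(n, k, replace=False)] = rng.uniform(1, 3, k)
        y = Phi @ x_true + 0.01 * rng.normal(size=m)           # array readout

        # Recover which targets are present (and abundance) from few probes.
        x_hat = Lasso(alpha=0.01, positive=True).fit(Phi, y).coef_
        print(sorted(np.nonzero(x_hat > 0.1)[0]), sorted(np.nonzero(x_true)[0]))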

  15. Automatic video object detection and mask signal removal for efficient video preprocessing

    NASA Astrophysics Data System (ADS)

    He, Zhihai

    2004-01-01

    In this work, we consider a generic definition of video object, which is a group of pixels with temporal motion coherence. The generic video object (GVO) is the superset of the conventional video objects discussed in the literature. Because of its motion coherence, the GVO can be easily recognized by the human visual system. However, due to its arbitrary spatial distribution, the GVO cannot be easily detected by existing algorithms, which often assume spatial homogeneity of the video objects. In this work, we introduce the concept of extended optical flow and develop a dynamic programming framework for GVO detection. Using this mathematical optimization formulation, whose solution is given by the Viterbi algorithm, the proposed object detection algorithm is able to discover the motion path of the GVO automatically and refine its spatial location progressively. We apply the GVO detection algorithm to extract and remove the so-called "video mask" signals in the video sequence. Our experimental results show that this type of vision-guided video pre-processing significantly improves the compression efficiency.

  16. Integrating data from heterogeneous DNA microarray platforms.

    PubMed

    Valente, Eduardo; Rocha, Miguel

    2015-01-01

    DNA microarrays are one of the most used technologies for gene expression measurement. However, there are several distinct microarray platforms, from different manufacturers, each with its own measurement protocol, resulting in data that can hardly be compared or directly integrated. Data integration from multiple sources aims to improve the power of statistical tests while reducing the data dimensionality problem. The integration of heterogeneous DNA microarray platforms comprises a set of tasks that range from the re-annotation of the features used on gene expression, to data normalization and batch effect elimination. In this work, a complete methodology for gene expression data integration and application is proposed, comprising a transcript-based re-annotation process and several methods for batch effect attenuation. The integrated data will be used to select the best feature set and learning algorithm for a brain tumor classification case study. The integration will consider data from heterogeneous Agilent and Affymetrix platforms, collected from public gene expression databases, such as The Cancer Genome Atlas and Gene Expression Omnibus. PMID:26673932
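
    One of the simplest batch-effect attenuation baselines is per-batch mean-centering; the sketch below shows that baseline only, assuming a samples-by-genes matrix, and stands in for the more elaborate methods the paper evaluates.

        import numpy as np

        def mean_center_batches(X, batch_ids):
            # Shift each batch so its per-gene mean equals the global
            # per-gene mean.  X: samples x genes; batch_ids: length-n array.
            X = X.astype(float).copy()
            grand_mean = X.mean(axis=0)
            for b in np.unique(batch_ids):
                mask = batch_ids == b
                X[mask] -= X[mask].mean(axis=0) - grand_mean
            return X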

  17. The Genopolis Microarray Database

    PubMed Central

    Splendiani, Andrea; Brandizi, Marco; Even, Gael; Beretta, Ottavio; Pavelka, Norman; Pelizzola, Mattia; Mayhaus, Manuel; Foti, Maria; Mauri, Giancarlo; Ricciardi-Castagnoli, Paola

    2007-01-01

    Background Gene expression databases are key resources for microarray data management and analysis and the importance of a proper annotation of their content is well understood. Public repositories as well as microarray database systems that can be implemented by single laboratories exist. However, there is not yet a tool that can easily support a collaborative environment where different users with different rights of access to data can interact to define a common highly coherent content. The scope of the Genopolis database is to provide a resource that allows different groups performing microarray experiments related to a common subject to create a common coherent knowledge base and to analyse it. The Genopolis database has been implemented as a dedicated system for the scientific community studying dendritic and macrophage cells functions and host-parasite interactions. Results The Genopolis Database system allows the community to build an object based MIAME compliant annotation of their experiments and to store images, raw and processed data from the Affymetrix GeneChip® platform. It supports dynamical definition of controlled vocabularies and provides automated and supervised steps to control the coherence of data and annotations. It allows a precise control of the visibility of the database content to different sub groups in the community and facilitates exports of its content to public repositories. It provides an interactive user interface for data analysis: this allows users to visualize data matrices based on functional lists and sample characterization, and to navigate to other data matrices defined by similarity of expression values as well as functional characterizations of genes involved. A collaborative environment is also provided for the definition and sharing of functional annotation by users. Conclusion The Genopolis Database supports a community in building a common coherent knowledge base and analysing it. This fills a gap between a local

  18. DNA Microarray-Based Diagnostics.

    PubMed

    Marzancola, Mahsa Gharibi; Sedighi, Abootaleb; Li, Paul C H

    2016-01-01

    DNA microarray technology is currently a useful biomedical tool which has been developed for a variety of diagnostic applications. However, the development pathway has not been smooth and the technology has faced some challenges. The reliability of the microarray data and also the clinical utility of the results in the early days were criticized. These criticisms, added to severe competition from other techniques such as next-generation sequencing (NGS), impacted the growth of microarray-based tests in the molecular diagnostic market. Thanks to the advances in the underlying technologies as well as the tremendous effort offered by the research community and commercial vendors, these challenges have mostly been addressed. Nowadays, the microarray platform has achieved sufficient standardization and method validation as well as efficient probe printing, liquid handling and signal visualization. Integration of various steps of the microarray assay into a harmonized and miniaturized handheld lab-on-a-chip (LOC) device has been a goal for the microarray community. In this respect, notable progress has been achieved in coupling the DNA microarray with the liquid manipulation microsystem as well as the supporting subsystem that will generate the stand-alone LOC device. In this chapter, we discuss the major challenges that microarray technology has faced in its almost two decades of development and also describe the solutions to overcome the challenges. In addition, we review the advancements of the technology, especially the progress toward developing the LOC devices for DNA diagnostic applications. PMID:26614075

  19. Living-Cell Microarrays

    PubMed Central

    Yarmush, Martin L.; King, Kevin R.

    2011-01-01

    Living cells are remarkably complex. To unravel this complexity, living-cell assays have been developed that allow delivery of experimental stimuli and measurement of the resulting cellular responses. High-throughput adaptations of these assays, known as living-cell microarrays, which are based on microtiter plates, high-density spotting, microfabrication, and microfluidics technologies, are being developed for two general applications: (a) to screen large-scale chemical and genomic libraries and (b) to systematically investigate the local cellular microenvironment. These emerging experimental platforms offer exciting opportunities to rapidly identify genetic determinants of disease, to discover modulators of cellular function, and to probe the complex and dynamic relationships between cells and their local environment. PMID:19413510

  20. Tiling Microarray Analysis Tools

    Energy Science and Technology Software Center (ESTSC)

    2005-05-04

    TiMAT is a package of 23 command line Java applications for use in the analysis of Affymetrix tiled genomic microarray data. TiMAT enables: 1) rebuilding the genome annotation for entire tiled arrays (repeat filtering, chromosomal coordinate assignment); 2) post-processing of oligo intensity values (quantile normalization, median scaling, PMMM transformation); 3) significance testing (Wilcoxon rank sum and signed rank tests, intensity difference and ratio tests) and interval refinement (filtering based on multiple statistics, overlap comparisons); and 4) data visualization (detailed thumbnail/zoomed view with Interval Plots and data export to Affymetrix's Integrated Genome Browser) and data reports (spreadsheet summaries and detailed profiles).

  1. Preprocessing and parameterizing bioimpedance spectroscopy measurements by singular value decomposition.

    PubMed

    Nejadgholi, Isar; Caytak, Herschel; Bolic, Miodrag; Batkin, Izmail; Shirmohammadi, Shervin

    2015-05-01

    In several applications of bioimpedance spectroscopy, the measured spectrum is parameterized by being fitted into the Cole equation. However, the extracted Cole parameters seem to be inconsistent from one measurement session to another, which leads to a high standard deviation of extracted parameters. This inconsistency is modeled with a source of random variations added to the voltage measurement carried out in the time domain. These random variations may originate from biological variations that are irrelevant to the evidence that we are investigating. Yet, they affect the voltage measured by using a bioimpedance device, based on which the magnitude and phase of impedance are calculated. By means of simulated data, we showed that Cole parameters are highly affected by this type of variation. We further showed that singular value decomposition (SVD) is an effective tool for parameterizing bioimpedance measurements, which results in more consistent parameters than Cole parameters. We propose to apply SVD as a preprocessing method to reconstruct denoised bioimpedance measurements. In order to evaluate the method, we calculated the relative difference between parameters extracted from noisy and clean simulated bioimpedance spectra. Both the mean and standard deviation of this relative difference are shown to effectively decrease when Cole parameters are extracted from preprocessed data in comparison to being extracted from raw measurements. We evaluated the performance of the proposed method in distinguishing three arm positions, for a set of experiments including eight subjects. It is shown that Cole parameters of different positions are not distinguishable when extracted from raw measurements. However, one arm position can be distinguished based on SVD scores. Moreover, all three positions are shown to be distinguished by two parameters, R0/R∞ and Fc, when Cole parameters are extracted from preprocessed measurements. These results suggest that SVD could be considered as an
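
    A minimal numpy sketch of the SVD preprocessing described above, on a toy measurements-by-frequencies matrix: reconstruct the spectra from the leading singular components and keep the projections ("SVD scores") as features. The rank and toy data are illustrative assumptions.

        import numpy as np

        rng = np.random.default_rng(3)
        spectra = np.linspace(1.0, 0.5, 50) + 0.05 * rng.normal(size=(8, 50))

        U, s, Vt = np.linalg.svd(spectra, full_matrices=False)
        rank = 2
        denoised = U[:, :rank] @ np.diag(s[:rank]) @ Vt[:rank]  # cleaned spectra
        scores = U[:, :rank] * s[:rank]                         # per-measurement features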

  2. A new approach to pre-processing digital image for wavelet-based watermark

    NASA Astrophysics Data System (ADS)

    Agreste, Santa; Andaloro, Guido

    2008-11-01

    The growth of the Internet has increased the phenomenon of digital piracy of multimedia objects such as software, images, video, audio and text. It is therefore strategic to identify and develop stable, computationally inexpensive methods and numerical algorithms that address these problems. We describe a digital watermarking algorithm for color image protection and authenticity that is robust, not blind, and wavelet-based. The use of the Discrete Wavelet Transform is motivated by its good time-frequency features and a good match with Human Visual System directives. These two combined elements are important for building an invisible and robust watermark. Moreover, our algorithm can work with any image, thanks to a pre-processing step that resizes the original image as required by the wavelet transform. The watermark signal is calculated in correlation with the image features and statistical properties. In the detection step we apply a re-synchronization between the original and watermarked image according to the Neyman-Pearson statistical criterion. Experimentation on a large set of different images has shown resistance against geometric, filtering, and StirMark attacks with a low rate of false alarms.
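
    A toy sketch of watermark embedding in a DWT detail band, assuming the PyWavelets package; the band choice, strength, and wavelet are illustrative and do not reproduce the authors' correlation-based embedding or Neyman-Pearson detector.

        import numpy as np
        import pywt

        img = np.random.rand(256, 256)         # stand-in grayscale cover image
        rng = np.random.default_rng(5)

        # Embed in a mid-frequency detail band of a one-level DWT, where
        # the HVS is less sensitive and mild filtering does less damage.
        cA, (cH, cV, cD) = pywt.dwt2(img, "haar")
        wm = rng.normal(size=cH.shape)         # pseudo-random watermark
        alpha = 0.02                           # embedding strength
        cH_marked = cH + alpha * np.abs(cH) * wm
        marked = pywt.idwt2((cA, (cH_marked, cV, cD)), "haar")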

  3. Microarray platform for omics analysis

    NASA Astrophysics Data System (ADS)

    Mecklenburg, Michael; Xie, Bin

    2001-09-01

    Microarray technology has revolutionized genetic analysis. However, limitations in genome analysis have led to renewed interest in establishing 'omic' strategies. As we enter the post-genomic era, new microarray technologies are needed to address these new classes of 'omic' targets, such as proteins, as well as lipids and carbohydrates. We have developed a microarray platform that combines self-assembling monolayers with the biotin-streptavidin system to provide a robust, versatile immobilization scheme. A hydrophobic film is patterned on the surface creating an array of tension wells that eliminates evaporation effects, thereby reducing the shear stress to which biomolecules are exposed during immobilization. The streptavidin linker layer makes it possible to adapt and/or develop microarray based assays using virtually any class of biomolecules including carbohydrates, peptides, antibodies, and receptors, as well as the more traditional DNA based arrays. Our microarray technology is designed to furnish seamless compatibility across the various 'omic' platforms by providing a common blueprint for fabricating and analyzing arrays. The prototype microarray uses a microscope slide footprint patterned with 2 by 96 flat wells. Data on the microarray platform will be presented.

  4. Development, Characterization and Experimental Validation of a Cultivated Sunflower (Helianthus annuus L.) Gene Expression Oligonucleotide Microarray

    PubMed Central

    Fernandez, Paula; Soria, Marcelo; Blesa, David; DiRienzo, Julio; Moschen, Sebastian; Rivarola, Maximo; Clavijo, Bernardo Jose; Gonzalez, Sergio; Peluffo, Lucila; Príncipi, Dario; Dosio, Guillermo; Aguirrezabal, Luis; García-García, Francisco; Conesa, Ana; Hopp, Esteban; Dopazo, Joaquín; Heinz, Ruth Amelia; Paniego, Norma

    2012-01-01

    Oligonucleotide-based microarrays with accurate gene coverage represent a key strategy for transcriptional studies in orphan species such as sunflower, H. annuus L., which lacks full genome sequences. The goal of this study was the development and functional annotation of a comprehensive sunflower unigene collection and the design and validation of a custom sunflower oligonucleotide-based microarray. A large scale EST (>130,000 ESTs) curation, assembly and sequence annotation was performed using Blast2GO (www.blast2go.de). The EST assembly comprises 41,013 putative transcripts (12,924 contigs and 28,089 singletons). The resulting Sunflower Unigen Resource (SUR version 1.0) was used to design an oligonucleotide-based Agilent microarray for cultivated sunflower. This microarray includes a total of 42,326 features: 1,417 Agilent controls, 74 control probes for sunflower replicated 10 times (740 controls) and 40,169 different non-control probes. Microarray performance was validated using a model experiment examining the induction of senescence by water deficit. Pre-processing and differential expression analysis of Agilent microarrays was performed using the Bioconductor limma package. The analyses based on p-values calculated by eBayes (p<0.01) allowed the detection of 558 differentially expressed genes between water stress and control conditions; from these, ten genes were further validated by qPCR. Over-represented ontologies were identified using FatiScan in the Babelomics suite. This work generated a curated and trustable sunflower unigene collection, and a custom, validated sunflower oligonucleotide-based microarray using Agilent technology. Both the curated unigene collection and the validated oligonucleotide microarray provide key resources for sunflower genome analysis, transcriptional studies, and molecular breeding for crop improvement. PMID:23110046

  5. Development, characterization and experimental validation of a cultivated sunflower (Helianthus annuus L.) gene expression oligonucleotide microarray.

    PubMed

    Fernandez, Paula; Soria, Marcelo; Blesa, David; DiRienzo, Julio; Moschen, Sebastian; Rivarola, Maximo; Clavijo, Bernardo Jose; Gonzalez, Sergio; Peluffo, Lucila; Príncipi, Dario; Dosio, Guillermo; Aguirrezabal, Luis; García-García, Francisco; Conesa, Ana; Hopp, Esteban; Dopazo, Joaquín; Heinz, Ruth Amelia; Paniego, Norma

    2012-01-01

    Oligonucleotide-based microarrays with accurate gene coverage represent a key strategy for transcriptional studies in orphan species such as sunflower, H. annuus L., which lacks full genome sequences. The goal of this study was the development and functional annotation of a comprehensive sunflower unigene collection and the design and validation of a custom sunflower oligonucleotide-based microarray. A large scale EST (>130,000 ESTs) curation, assembly and sequence annotation was performed using Blast2GO (www.blast2go.de). The EST assembly comprises 41,013 putative transcripts (12,924 contigs and 28,089 singletons). The resulting Sunflower Unigen Resource (SUR version 1.0) was used to design an oligonucleotide-based Agilent microarray for cultivated sunflower. This microarray includes a total of 42,326 features: 1,417 Agilent controls, 74 control probes for sunflower replicated 10 times (740 controls) and 40,169 different non-control probes. Microarray performance was validated using a model experiment examining the induction of senescence by water deficit. Pre-processing and differential expression analysis of Agilent microarrays was performed using the Bioconductor limma package. The analyses based on p-values calculated by eBayes (p<0.01) allowed the detection of 558 differentially expressed genes between water stress and control conditions; from these, ten genes were further validated by qPCR. Over-represented ontologies were identified using FatiScan in the Babelomics suite. This work generated a curated and trustable sunflower unigene collection, and a custom, validated sunflower oligonucleotide-based microarray using Agilent technology. Both the curated unigene collection and the validated oligonucleotide microarray provide key resources for sunflower genome analysis, transcriptional studies, and molecular breeding for crop improvement. PMID:23110046

  6. Radar data pre-processing for reliable rain field estimation

    NASA Astrophysics Data System (ADS)

    Daliakopoulos, Ioannis N.; Tsanis, Ioannis K.

    2010-05-01

    A comparative analysis of different pre-processing methods applied to radar data for the minimization of the uncertainty of the produced Z-R relationship is conducted. The study focuses on measurements from 3 ground precipitation stations which are located in close proximity to the Souda Bay C-Band radar in Crete, Greece. While precipitation and reflectivity measurements were both collected in almost synchronized 10 minute intervals, uncertainties related to timing issues are discussed and measurements are aggregated to various scales up to 12 hours. Reflectivity measurements are also transformed and resampled in space, from polar coordinates to regular grids of 500 to 5000m resolution. The tradeoffs of both spatial and temporal transformation are discussed. Data is also filtered for noise using simple thresholding, the Wiener filter and combinations of both methods. The effects of the three pre-processing procedures are studied with respect to the final fit of the data to acceptable Z-R equations for the generation of reliable precipitation fields.
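
    The end product of such pre-processing is a cleaner Z-R fit; as a worked illustration, the power law Z = aR^b is linear in log space and can be fitted in two lines. The paired values below are toy numbers, not the Souda Bay data.

        import numpy as np

        Z = np.array([200.0, 450.0, 900.0, 1600.0, 2800.0])  # reflectivity [mm^6 m^-3]
        R = np.array([1.0, 2.1, 4.0, 6.8, 11.0])             # rain rate [mm/h]

        # Fit Z = a * R^b  <=>  log Z = log a + b * log R
        b, log_a = np.polyfit(np.log(R), np.log(Z), 1)
        print(f"Z = {np.exp(log_a):.1f} * R^{b:.2f}")  # cf. Marshall-Palmer Z = 200 R^1.6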

  7. Chemistry of Natural Glycan Microarray

    PubMed Central

    Song, Xuezheng; Heimburg-Molinaro, Jamie; Cummings, Richard D.; Smith, David F.

    2014-01-01

    Glycan microarrays have become indispensable tools for studying protein-glycan interactions. Along with chemo-enzymatic synthesis, glycans isolated from natural sources have played important roles in array development and will continue to be a major source of glycans. N- and O-glycans from glycoproteins, and glycans from glycosphingolipids can be released from corresponding glycoconjugates with relatively mature methods, although isolation of large numbers and quantities of glycans are still very challenging. Glycosylphosphatidylinositol (GPI)-anchors and glycosaminoglycans (GAGs) are less represented on current glycan microarrays. Glycan microarray development has been greatly facilitated by bifunctional fluorescent linkers, which can be applied in a “Shotgun Glycomics” approach to incorporate isolated natural glycans. Glycan presentation on microarrays may affect glycan binding by GBPs, often through multivalent recognition by the GBP. PMID:24487062

  8. Microarray Analysis of Microbial Weathering

    NASA Astrophysics Data System (ADS)

    Olsson-Francis, K.; van Houdt, R.; Leys, N.; Mergeay, M.; Cockell, C. S.

    2010-04-01

    Microarray analysis of the heavy metal resistant bacterium Cupriavidus metallidurans CH34 was used to investigate the genes involved in weathering. The results demonstrated that large porin and membrane transporter genes were upregulated.

  9. CARMAweb: comprehensive R- and bioconductor-based web service for microarray data analysis.

    PubMed

    Rainer, Johannes; Sanchez-Cabo, Fatima; Stocker, Gernot; Sturn, Alexander; Trajanoski, Zlatko

    2006-07-01

    CARMAweb (Comprehensive R-based Microarray Analysis web service) is a web application designed for the analysis of microarray data. CARMAweb performs data preprocessing (background correction, quality control and normalization), detection of differentially expressed genes, cluster analysis, dimension reduction and visualization, classification, and Gene Ontology-term analysis. This web application accepts raw data from a variety of imaging software tools for the most widely used microarray platforms: Affymetrix GeneChips, spotted two-color microarrays and Applied Biosystems (ABI) microarrays. R and packages from the Bioconductor project are used as an analytical engine in combination with the R function Sweave, which allows automatic generation of analysis reports. These report files contain all R commands used to perform the analysis and therefore guarantee maximum transparency and reproducibility for each analysis. The web application is implemented in Java based on the latest J2EE (Java 2 Enterprise Edition) software technology. CARMAweb is freely available at https://carmaweb.genome.tugraz.at. PMID:16845058

  10. An automated method for gridding in microarray images.

    PubMed

    Giannakeas, Nikolaos; Fotiadis, Dimitrios I; Politou, Anastasia S

    2006-01-01

    Microarray technology is a powerful tool for analyzing the expression of a large number of genes in parallel. A typical microarray image consists of a few thousand spots which determine the level of gene expression in the sample. In this paper we propose a method which automatically addresses each spot area in the image. Initially, a preliminary segmentation of the image is produced using a template matching algorithm. Next, grid and spot finding are realized. The position of non-expressed spots is located and finally a Voronoi diagram is employed to fit the grid on the image. Our method has been evaluated on a set of five images consisting of 45960 spots from the Stanford microarray database, and the reported accuracy for spot detection was 93%. PMID:17946343

  11. Adaptive-weighted bilateral filtering and other pre-processing techniques for optical coherence tomography.

    PubMed

    Anantrasirichai, N; Nicholson, Lindsay; Morgan, James E; Erchova, Irina; Mortlock, Katie; North, Rachel V; Albon, Julie; Achim, Alin

    2014-09-01

    This paper presents novel pre-processing image enhancement algorithms for retinal optical coherence tomography (OCT). These images contain a large amount of speckle, causing them to be grainy and of very low contrast. To make these images valuable for clinical interpretation, we propose a novel method to remove speckle while preserving useful information contained in each retinal layer. The process starts with multi-scale despeckling based on a dual-tree complex wavelet transform (DT-CWT). We further enhance the OCT image through a smoothing process that uses a novel adaptive-weighted bilateral filter (AWBF). This offers the desirable property of preserving texture within the OCT image layers. The enhanced OCT image is then segmented to extract inner retinal layers that contain useful information for eye research. Our layer segmentation technique is also performed in the DT-CWT domain. Finally we describe an OCT/fundus image registration algorithm which is helpful when two modalities are used together for diagnosis and for information fusion. PMID:25034317
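
    For orientation, a plain bilateral filter (OpenCV) gives the edge-preserving smoothing behaviour that the paper's adaptive-weighted variant (AWBF) refines; the parameters below are illustrative, and the AWBF itself is not reproduced here.

        import cv2
        import numpy as np

        # Stand-in 8-bit OCT B-scan (random data for the sketch).
        oct_img = (np.random.rand(256, 256) * 255).astype(np.uint8)

        # d is the neighbourhood diameter; sigmaColor and sigmaSpace set
        # the range (intensity) and spatial falloff of the weights.
        smoothed = cv2.bilateralFilter(oct_img, d=9, sigmaColor=50, sigmaSpace=7)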

  12. Design and implementation of a preprocessing system for a sodium lidar

    NASA Technical Reports Server (NTRS)

    Voelz, D. G.; Sechrist, C. F., Jr.

    1983-01-01

    A preprocessing system, designed and constructed for use with the University of Illinois sodium lidar system, was developed to increase the altitude resolution and range of the lidar system and also to decrease the processing burden of the main lidar computer. The preprocessing system hardware and the software required to implement the system are described. Some preliminary results of an airborne sodium lidar experiment conducted with the preprocessing system installed in the sodium lidar are presented.

  13. Data acquisition and preprocessing techniques for remote sensing field research

    NASA Technical Reports Server (NTRS)

    Biehl, L. L.; Robinson, B. F.

    1983-01-01

    A crops and soils data base has been developed at Purdue University's Laboratory for Applications of Remote Sensing using spectral and agronomic measurements made by several government and university researchers. The data are being used to (1) quantitatively determine the relationships of spectral and agronomic characteristics of crops and soils, (2) define future sensor systems, and (3) develop advanced data analysis techniques. Researchers follow defined data acquisition and preprocessing techniques to provide fully annotated and calibrated sets of spectral, agronomic, and meteorological data. These procedures enable the researcher to combine his data with that acquired by other researchers for remote sensing research. The key elements or requirements for developing a field research data base of spectral data that can be transported across sites and years are appropriate experiment design, accurate spectral data calibration, defined field procedures, and thorough experiment documentation.

  14. Preprocessing of Satellite Data for Urban Object Extraction

    NASA Astrophysics Data System (ADS)

    Krauß, T.

    2015-03-01

    Very high resolution (VHR) DSMs (digital surface models) derived from stereo or multi-stereo images from current VHR satellites like WorldView-2 or Pléiades can be produced up to the ground sampling distance (GSD) of the sensors, in the range of 50 cm to 1 m. From such DSMs the digital terrain model (DTM), representing the ground, and also a so-called nDEM (normalized digital elevation model), describing the height of objects above the ground, can be derived. In parallel these sensors deliver multispectral imagery which can be used for a spectral classification of the imagery. Fusion of the multispectral classification and the nDEM allows a simple classification and detection of urban objects. In further processing steps these detected urban objects can be modeled and exported in a suitable description language like CityGML. In this work we present the pre-processing steps up to the classification and detection of the urban objects. The modeling is not part of this work. The pre-processing steps described here cover briefly the coregistration of the input images and the generation of the DSM. In more detail, the improvement of the DSM, the extraction of the DTM and nDEM, the multispectral classification, and the object detection and extraction are explained. The methods described are applied to two test regions from two satellites: first the center of Munich, acquired by WorldView-2, and second the center of Melbourne, acquired by Pléiades. From both acquisitions a stereo pair from the panchromatic bands is used for creation of the DSM and the pan-sharpened multispectral images are used for spectral classification. Finally the quality of the detected urban objects is discussed.
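
    A common approximation of the DTM/nDEM step described above, assuming scipy: take a grey-scale morphological opening of the DSM with a window larger than the biggest building footprint, and subtract. The window size and height threshold are illustrative, not the authors' method.

        import numpy as np
        from scipy.ndimage import grey_opening

        dsm = 10 + 2 * np.random.rand(200, 200)   # stand-in height grid [m]

        dtm = grey_opening(dsm, size=(31, 31))    # approximate ground surface
        ndem = dsm - dtm                          # object heights above ground
        buildings = ndem > 2.5                    # simple height threshold [m]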

  15. Inferring genetic networks from microarray data.

    SciTech Connect

    May, Elebeoba Eni; Davidson, George S.; Martin, Shawn Bryan; Werner-Washburne, Margaret C.; Faulon, Jean-Loup Michel

    2004-06-01

    In theory, it should be possible to infer realistic genetic networks from time series microarray data. In practice, however, network discovery has proved problematic. The three major challenges are: (1) inferring the network; (2) estimating the stability of the inferred network; and (3) making the network visually accessible to the user. Here we describe a method, tested on publicly available time series microarray data, which addresses these concerns. The inference of genetic networks from genome-wide experimental data is an important biological problem which has received much attention. Approaches to this problem have typically included application of clustering algorithms [6]; the use of Boolean networks [12, 1, 10]; the use of Bayesian networks [8, 11]; and the use of continuous models [21, 14, 19]. Overviews of the problem and general approaches to network inference can be found in [4, 3]. Our approach to network inference is similar to earlier methods in that we use both clustering and Boolean network inference. However, we have attempted to extend the process to better serve the end-user, the biologist. In particular, we have incorporated a system to assess the reliability of our network, and we have developed tools which allow interactive visualization of the proposed network.

  16. Segmentation of prostate cancer tissue microarray images

    NASA Astrophysics Data System (ADS)

    Cline, Harvey E.; Can, Ali; Padfield, Dirk

    2006-02-01

    Prostate cancer is diagnosed by histopathology interpretation of hematoxylin and eosin (H and E)-stained tissue sections. Gland and nuclei distributions vary with the disease grade. The morphological features vary with the advance of cancer, where the epithelial regions grow into the stroma. An efficient pathology slide image analysis method involved using a tissue microarray with known disease stages. Digital 24-bit RGB images were acquired for each tissue element on the slide with both 10X and 40X objectives. Initial segmentation at low magnification was accomplished using prior spectral characteristics from a training tissue set composed of four tissue clusters, namely glands, epithelia, stroma and nuclei. The segmentation method was automated by using the training RGB values as an initial guess and iterating the averaging process 10 times to find the four cluster centers. Labels were assigned to the nearest cluster center in red-blue spectral feature space. An automatic threshold algorithm separated the glands from the tissue. A visual pseudo-color representation of 60 segmented tissue microarray images was generated, in which white, pink, red, and blue represent glands, epithelia, stroma and nuclei, respectively. The higher magnification images provided refined nuclei morphology. The nuclei were detected with an RGB color space principal component analysis that resulted in a grey scale image. Shape metrics such as compactness, elongation, and minimum and maximum diameters were calculated based on the eigenvalues of the best-fitting ellipses to the nuclei.
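
    The iterative averaging described above is essentially k-means clustering with fixed initial centres; a compact numpy version, with hypothetical training colours for the four tissue classes, might look as follows.

        import numpy as np

        def segment_rgb(img, centers, n_iter=10):
            # Alternate nearest-centre assignment and per-cluster averaging.
            pixels = img.reshape(-1, 3).astype(float)
            centers = centers.astype(float).copy()
            for _ in range(n_iter):
                d = np.linalg.norm(pixels[:, None, :] - centers[None], axis=2)
                labels = d.argmin(axis=1)
                for k in range(len(centers)):
                    if np.any(labels == k):
                        centers[k] = pixels[labels == k].mean(axis=0)
            return labels.reshape(img.shape[:2]), centers

        # Hypothetical seeds for glands, epithelia, stroma, nuclei:
        # seeds = np.array([[240, 240, 240], [230, 150, 170],
        #                   [200, 60, 80], [60, 60, 150]])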

  17. Comparing Bacterial DNA Microarray Fingerprints

    SciTech Connect

    Willse, Alan R.; Chandler, Darrell P.; White, Amanda M.; Protic, Miroslava; Daly, Don S.; Wunschel, Sharon C.

    2005-08-15

    Detecting subtle genetic differences between microorganisms is an important problem in molecular epidemiology and microbial forensics. In a typical investigation, gel electrophoresis is used to compare randomly amplified DNA fragments between microbial strains, where the patterns of DNA fragment sizes are proxies for a microbe's genotype. The limited genomic sample captured on a gel is often insufficient to discriminate nearly identical strains. This paper examines the application of microarray technology to DNA fingerprinting as a high-resolution alternative to gel-based methods. The so-called universal microarray, which uses short oligonucleotide probes that do not target specific genes or species, is intended to be applicable to all microorganisms because it does not require prior knowledge of genomic sequence. In principle, closely related strains can be distinguished if the number of probes on the microarray is sufficiently large, i.e., if the genome is sufficiently sampled. In practice, we confront noisy data, imperfectly matched hybridizations, and a high-dimensional inference problem. We describe the statistical problems of microarray fingerprinting, outline similarities with and differences from more conventional microarray applications, and illustrate the statistical fingerprinting problem for 10 closely related strains from three Bacillus species, and 3 strains from non-Bacillus species.

  18. Study on Construction of a Medical X-Ray Direct Digital Radiography System and Hybrid Preprocessing Methods

    PubMed Central

    Ren, Yong; Wu, Sheng; Wang, Mijian; Cen, Zhongjie

    2014-01-01

    We construct a medical X-ray direct digital radiography (DDR) system based on a CCD (charge-coupled device) camera. For the original images captured from X-ray exposure, the computer first executes image flat-field correction and image gamma correction, and then carries out image contrast enhancement. A hybrid image contrast enhancement algorithm, based on sharp frequency localization-contourlet transform (SFL-CT) and contrast limited adaptive histogram equalization (CLAHE), is proposed and verified on clinical DDR images. Experimental results show that, for medical X-ray DDR images, the proposed comprehensive preprocessing algorithm can not only greatly enhance the contrast and detail information, but also improve the resolution capability of the DDR system. PMID:25013452
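
    The CLAHE half of the hybrid scheme is available directly in OpenCV; the sketch below shows that step only (the SFL-CT part is not reproduced), with illustrative parameters.

        import cv2
        import numpy as np

        # Stand-in 8-bit radiograph after flat-field and gamma correction.
        ddr = (np.random.rand(512, 512) * 255).astype(np.uint8)

        # clipLimit bounds local contrast amplification, limiting noise
        # blow-up in homogeneous regions; tiles are equalized locally.
        clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
        enhanced = clahe.apply(ddr)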

  19. Study on construction of a medical x-ray direct digital radiography system and hybrid preprocessing methods.

    PubMed

    Ren, Yong; Wu, Sheng; Wang, Mijian; Cen, Zhongjie

    2014-01-01

    We construct a medical X-ray direct digital radiography (DDR) system based on a CCD (charge-coupled device) camera. For the original images captured from X-ray exposure, the computer first executes image flat-field correction and image gamma correction, and then carries out image contrast enhancement. A hybrid image contrast enhancement algorithm, based on sharp frequency localization-contourlet transform (SFL-CT) and contrast limited adaptive histogram equalization (CLAHE), is proposed and verified on clinical DDR images. Experimental results show that, for medical X-ray DDR images, the proposed comprehensive preprocessing algorithm can not only greatly enhance the contrast and detail information, but also improve the resolution capability of the DDR system. PMID:25013452

  20. Characteristic attributes in cancer microarrays.

    PubMed

    Sarkar, I N; Planet, P J; Bael, T E; Stanley, S E; Siddall, M; DeSalle, R; Figurski, D H

    2002-04-01

    Rapid advances in genome sequencing and gene expression microarray technologies are providing unprecedented opportunities to identify specific genes involved in complex biological processes, such as development, signal transduction, and disease. The vast amount of data generated by these technologies has presented new challenges in bioinformatics. To help organize and interpret microarray data, new and efficient computational methods are needed to: (1) distinguish accurately between different biological or clinical categories (e.g., malignant vs. benign), and (2) identify specific genes that play a role in determining those categories. Here we present a novel and simple method that exhaustively scans microarray data for unambiguous gene expression patterns. Such patterns of data can be used as the basis for classification into biological or clinical categories. The method, termed the Characteristic Attribute Organization System (CAOS), is derived from fundamental precepts in systematic biology. In CAOS we define two types of characteristic attributes ('pure' and 'private') that may exist in gene expression microarray data. We also consider additional attributes ('compound') that are composed of expression states of more than one gene that are not characteristic on their own. CAOS was tested on three well-known cancer DNA microarray data sets for its ability to classify new microarray samples. We found CAOS to be a highly accurate and robust class prediction technique. In addition, CAOS identified specific genes, not emphasized in other analyses, that may be crucial to the biology of certain types of cancer. The success of CAOS in this study has significant implications for basic research and the future development of reliable methods for clinical diagnostic tools. PMID:12474425

  1. Image microarrays (IMA): Digital pathology's missing tool

    PubMed Central

    Hipp, Jason; Cheng, Jerome; Pantanowitz, Liron; Hewitt, Stephen; Yagi, Yukako; Monaco, James; Madabhushi, Anant; Rodriguez-canales, Jaime; Hanson, Jeffrey; Roy-Chowdhuri, Sinchita; Filie, Armando C.; Feldman, Michael D.; Tomaszewski, John E.; Shih, Natalie NC.; Brodsky, Victor; Giaccone, Giuseppe; Emmert-Buck, Michael R.; Balis, Ulysses J.

    2011-01-01

    Introduction: The increasing availability of whole slide imaging (WSI) data sets (digital slides) from glass slides offers new opportunities for the development of computer-aided diagnostic (CAD) algorithms. With the all-digital pathology workflow that these data sets will enable in the near future, literally millions of digital slides will be generated and stored. Consequently, the field in general, and pathologists specifically, will need tools to help extract actionable information from this new and vast collective repository. Methods: To address this need, we designed and implemented a tool (dCORE) to enable the systematic capture of image tiles with constrained size and resolution that contain desired histopathologic features. Results: In this communication, we describe a user-friendly tool that will enable pathologists to mine digital slide archives to create image microarrays (IMAs). IMAs are to digital slides as tissue microarrays (TMAs) are to cell blocks. Thus, a single digital slide could be transformed into an array of hundreds to thousands of high quality digital images, with each containing key diagnostic morphologies and appropriate controls. Current manual digital image cut-and-paste methods that allow for the creation of a grid of images (such as an IMA) of matching resolutions are tedious. Conclusion: The ability to create IMAs representing hundreds to thousands of vetted morphologic features has numerous applications in education, proficiency testing, consensus case review, and research. Lastly, in a manner analogous to the way conventional TMA technology has significantly accelerated in situ studies of tissue specimens, the use of IMAs has similar potential to significantly accelerate CAD algorithm development. PMID:22200030

  2. Microarrayed Materials for Stem Cells

    PubMed Central

    Mei, Ying

    2013-01-01

    Stem cells hold remarkable promise for applications in disease modeling, cancer therapy and regenerative medicine. Despite the significant progress made during the last decade, designing materials to control stem cell fate remains challenging. As an alternative, materials microarray technology has received great attention because it allows for high throughput materials synthesis and screening at a reasonable cost. Here, we discuss recent developments in materials microarray technology and their applications in stem cell engineering. Future opportunities in the field will also be reviewed. PMID:24311967

  3. Immunoprofiling Using NAPPA Protein Microarrays

    PubMed Central

    Sibani, Sahar; LaBaer, Joshua

    2012-01-01

    Protein microarrays provide an efficient method to immunoprofile patients in an effort to rapidly identify disease immunosignatures. The validity of using autoantibodies in diagnosis has been demonstrated in type 1 diabetes, rheumatoid arthritis, and systemic lupus, and is now being strongly considered in cancer. Several types of protein microarrays exist including antibody and antigen arrays. In this chapter, we describe the immunoprofiling application for one type of antigen array called NAPPA (nucleic acids programmable protein array). We provide a guideline for setting up the screening study and designing protein arrays to maximize the likelihood of obtaining quality data. PMID:21370064

  4. An integrated approach to the simultaneous selection of variables, mathematical pre-processing and calibration samples in partial least-squares multivariate calibration.

    PubMed

    Allegrini, Franco; Olivieri, Alejandro C

    2013-10-15

    A new optimization strategy for multivariate partial-least-squares (PLS) regression analysis is described. It was achieved by integrating three efficient strategies to improve PLS calibration models: (1) variable selection based on ant colony optimization, (2) mathematical pre-processing selection by a genetic algorithm, and (3) sample selection through a distance-based procedure. Outlier detection has also been included as part of the model optimization. All the above procedures have been combined into a single algorithm, whose aim is to find the best PLS calibration model within a Monte Carlo-type philosophy. Simulated and experimental examples are employed to illustrate the success of the proposed approach. PMID:24054659
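
    A minimal Python sketch of the evaluation step such a combined optimizer needs: scoring one candidate, i.e. a variable subset plus a mathematical pre-processing choice, by cross-validated PLS error. The candidate encoding and the two pre-processing options are assumptions; the paper's ant colony and genetic algorithm machinery would generate and evolve the candidates.

      import numpy as np
      from sklearn.cross_decomposition import PLSRegression
      from sklearn.model_selection import cross_val_score
      from sklearn.preprocessing import StandardScaler

      def candidate_score(X, y, variables, preprocess, n_components=3):
          """Cross-validated fitness of one (variables, preprocessing) pair."""
          Xs = X[:, variables]
          if preprocess == "autoscale":
              Xs = StandardScaler().fit_transform(Xs)
          elif preprocess == "first_derivative":
              Xs = np.gradient(Xs, axis=1)
          pls = PLSRegression(n_components=n_components)
          # negative MSE, so higher is better for the optimizer
          return cross_val_score(pls, Xs, y, cv=5,
                                 scoring="neg_mean_squared_error").mean()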

  5. Fourier Lucas-Kanade algorithm.

    PubMed

    Lucey, Simon; Navarathna, Rajitha; Ashraf, Ahmed Bilal; Sridharan, Sridha

    2013-06-01

    In this paper, we propose a framework for both gradient descent image and object alignment in the Fourier domain. Our method centers upon the classical Lucas & Kanade (LK) algorithm where we represent the source and template/model in the complex 2D Fourier domain rather than in the spatial 2D domain. We refer to our approach as the Fourier LK (FLK) algorithm. The FLK formulation is advantageous when one preprocesses the source image and template/model with a bank of filters (e.g., oriented edges, Gabor, etc.) as 1) it can handle substantial illumination variations, 2) the inefficient preprocessing filter bank step can be subsumed within the FLK algorithm as a sparse diagonal weighting matrix, 3) unlike traditional LK, the computational cost is invariant to the number of filters and as a result is far more efficient, and 4) this approach can be extended to the Inverse Compositional (IC) form of the LK algorithm where nearly all steps (including Fourier transform and filter bank preprocessing) can be precomputed, leading to an extremely efficient and robust approach to gradient descent image matching. Further, these computational savings translate to nonrigid object alignment tasks that are considered extensions of the LK algorithm, such as those found in Active Appearance Models (AAMs). PMID:23599053
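
    The identity behind point 2 - filtering both images with a whole bank of filters is equivalent to a single diagonal weighting in the Fourier domain - can be checked numerically. The random 'filter bank' below is a stand-in for oriented-edge or Gabor filters.

      import numpy as np

      rng = np.random.default_rng(0)
      src, tmpl = rng.standard_normal((64, 64)), rng.standard_normal((64, 64))
      bank = [rng.standard_normal((64, 64)) for _ in range(8)]

      D = np.fft.fft2(src) - np.fft.fft2(tmpl)

      # spatial view: filter both images with every filter, sum the SSDs
      cost_spatial = sum(np.sum(np.fft.ifft2(np.fft.fft2(f) * D).real ** 2)
                         for f in bank)

      # Fourier view: the bank collapses to one diagonal weight S (Parseval)
      S = sum(np.abs(np.fft.fft2(f)) ** 2 for f in bank)
      cost_fourier = np.sum(S * np.abs(D) ** 2) / src.size

      print(np.allclose(cost_spatial, cost_fourier))  # -> True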

  6. Validation of MIMGO: a method to identify differentially expressed GO terms in a microarray dataset

    PubMed Central

    2012-01-01

    Background We previously proposed an algorithm for the identification of GO terms that commonly annotate genes whose expression is upregulated or downregulated in some microarray data compared with in other microarray data. We call these “differentially expressed GO terms” and have named the algorithm “matrix-assisted identification method of differentially expressed GO terms” (MIMGO). MIMGO can also identify microarray data in which genes annotated with a differentially expressed GO term are upregulated or downregulated. However, MIMGO has not yet been validated on a real microarray dataset using all available GO terms. Findings We combined Gene Set Enrichment Analysis (GSEA) with MIMGO to identify differentially expressed GO terms in a yeast cell cycle microarray dataset. GSEA followed by MIMGO (GSEA + MIMGO) correctly identified (p < 0.05) microarray data in which genes annotated to differentially expressed GO terms are upregulated. We found that GSEA + MIMGO was slightly less effective than, or comparable to, GSEA (Pearson), a method that uses Pearson’s correlation as a metric, at detecting true differentially expressed GO terms. However, unlike other methods including GSEA (Pearson), GSEA + MIMGO can comprehensively identify the microarray data in which genes annotated with a differentially expressed GO term are upregulated or downregulated. Conclusions MIMGO is a reliable method to identify differentially expressed GO terms comprehensively. PMID:23232071
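
    For readers unfamiliar with the GSEA half of the pipeline, a simplified running-sum enrichment score can be sketched as follows. This is the unweighted variant; standard GSEA additionally weights hits by the ranking statistic.

      import numpy as np

      def enrichment_score(ranked_genes, gene_set):
          """Walk down a ranked gene list, stepping up on gene-set members
          and down otherwise; return the maximum deviation from zero."""
          hits = np.isin(ranked_genes, list(gene_set))
          n_hit, n_miss = hits.sum(), (~hits).sum()
          steps = np.where(hits, 1.0 / n_hit, -1.0 / n_miss)
          running = np.cumsum(steps)
          return running[np.argmax(np.abs(running))]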

  7. TOPSAR data focusing based on azimuth scaling preprocessing

    NASA Astrophysics Data System (ADS)

    Xu, Wei; Huang, Pingping; Deng, Yunkai

    2011-07-01

    Both Doppler spectral aliasing and azimuth output time folding simultaneously exist in TOPSAR (Terrain Observation by Progressive Scans) raw data. Resampling in both the Doppler frequency and azimuth time domains can resolve the azimuth aliasing problem, but at the cost of seriously increased computational complexity and memory consumption. Exploiting the special characteristics of the TOPSAR raw-data support in the slow time/frequency domain (TFD), the presented azimuth scaling preprocessing step not only resolves the Doppler spectral aliasing problem but also reduces the number of additional azimuth samples. Furthermore, correction of the sawtoothed azimuth antenna pattern (AAP) becomes easy to implement. A conventional stripmap processor can then focus the residual TOPSAR raw data, although the result is an azimuth-aliased TOPSAR image. The mosaic approach, originally presented to unfold azimuth-aliased ScanSAR images, is exploited to resolve the azimuth output folding in TOPSAR mode. Simulation results and pulse-response parameters are given to validate the presented imaging approach.
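
    The heart of such a preprocessing step is a quadratic-phase (de-ramping) multiplication in azimuth. The sketch below shows only that multiply, under the assumption that the rotation-induced Doppler rate k_rot is known; the output-sample scaling and AAP correction discussed in the paper are omitted.

      import numpy as np

      def azimuth_deramp(raw, t, k_rot):
          """Multiply TOPSAR raw data by the conjugate of the antenna-steering
          Doppler chirp so each target's azimuth spectrum folds back into the
          sampled band. raw: (azimuth, range); t: azimuth slow-time axis."""
          ramp = np.exp(-1j * np.pi * k_rot * t ** 2)
          return raw * ramp[:, None]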

  8. Preprocessing functions for computed radiography images in a PACS environment

    NASA Astrophysics Data System (ADS)

    McNitt-Gray, Michael F.; Pietka, Ewa; Huang, H. K.

    1992-05-01

    In a picture archiving and communications system (PACS), images are acquired from several modalities including computed radiography (CR). This modality has unique image characteristics and presents several problems that need to be resolved before the image is available for viewing at a display workstation. A set of preprocessing functions has been applied to all CR images in a PACS environment to enhance the display of images. The first function reformats CR images that are acquired with different plate sizes to a standard size for display. Another function removes the distracting white background caused by the collimation used at the time of exposure. A third function determines the orientation of each image and rotates those images that are in nonstandard positions into a standard viewing position. Another function creates a default look-up table based on the gray levels actually used by the image (instead of the allocated gray levels). Finally, there is a function which creates (for chest images only) piece-wise linear look-up tables that can be applied to enhance different tissue densities. These functions have all been implemented in a PACS environment. Each of these functions has been very successful in improving the viewing conditions of CR images and contributes to the clinical acceptance of PACS by reducing the effort required to display CR images.
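
    The default look-up table idea - window the display range to the gray levels the image actually uses rather than the allocated range - might look like the sketch below; the percentile clipping and the 12-bit depth are assumptions.

      import numpy as np

      def default_lut(image, bits=12):
          """Map the used gray-level range to 8-bit display values."""
          lo, hi = np.percentile(image, [0.5, 99.5])  # ignore stray outliers
          lut = np.clip((np.arange(2 ** bits) - lo) / (hi - lo), 0.0, 1.0)
          return (lut * 255).astype(np.uint8)

      # display = default_lut(cr_image)[cr_image]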

  9. Multimodal image fusion with SIMS: Preprocessing with image registration.

    PubMed

    Tarolli, Jay Gage; Bloom, Anna; Winograd, Nicholas

    2016-06-01

    In order to utilize complementary imaging techniques to supply higher resolution data for fusion with secondary ion mass spectrometry (SIMS) chemical images, there are a number of aspects that, if not given proper consideration, could produce results which are easy to misinterpret. One of the most critical aspects is that the two input images must be of the same exact analysis area. With the desire to explore new higher resolution data sources that exist outside of the mass spectrometer, this requirement becomes even more important. To ensure that two input images are of the same region, an implementation of the insight segmentation and registration toolkit (ITK) was developed to act as a preprocessing step before performing image fusion. This implementation of ITK allows for several degrees of movement between two input images to be accounted for, including translation, rotation, and scale transforms. First, the implementation was confirmed to accurately register two multimodal images by supplying a known transform. Once validated, two model systems, a copper mesh grid and a group of RAW 264.7 cells, were used to demonstrate the use of the ITK implementation to register a SIMS image with a microscopy image for the purpose of performing image fusion. PMID:26772745
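
    A minimal SimpleITK sketch of such a registration preprocessing step, accounting for translation, rotation, and scale with a 2D similarity transform. The file names, metric, and optimizer settings are illustrative, not the authors' exact configuration.

      import SimpleITK as sitk

      fixed = sitk.ReadImage("sims_image.png", sitk.sitkFloat32)
      moving = sitk.ReadImage("microscopy_image.png", sitk.sitkFloat32)

      R = sitk.ImageRegistrationMethod()
      R.SetMetricAsMattesMutualInformation(50)   # robust across modalities
      R.SetOptimizerAsRegularStepGradientDescent(2.0, 1e-4, 200)
      init = sitk.CenteredTransformInitializer(
          fixed, moving, sitk.Similarity2DTransform(),
          sitk.CenteredTransformInitializerFilter.GEOMETRY)
      R.SetInitialTransform(init, False)         # optimize a copy of init
      R.SetInterpolator(sitk.sitkLinear)

      tx = R.Execute(fixed, moving)
      aligned = sitk.Resample(moving, fixed, tx, sitk.sitkLinear, 0.0)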

  10. Macular Preprocessing of Linear Acceleratory Stimuli: Implications for the Clinic

    NASA Technical Reports Server (NTRS)

    Ross, M. D.; Hargens, Alan R. (Technical Monitor)

    1996-01-01

    Three-dimensional reconstructions of innervation patterns in rat maculae were carried out using serial section images sent to a Silicon Graphics workstation from a transmission electron microscope. Contours were extracted from mosaicked sections, then registered and visualized using Biocomputation Center software. Purposes were to determine innervation patterns of type II cells and areas encompassed by vestibular afferent receptive fields. Terminals on type II cells typically are elongated and compartmentalized into parts varying in vesicular content; reciprocal and serial synapses are common. The terminals originate as processes of nearby calyces or from nerve fibers passing to calyces outside the immediate vicinity. Thus, receptive fields of the afferents overlap in unique ways. Multiple processes are frequent; from 4 to 6 afferents supply 12-16 terminals on a type II cell. Processes commonly communicate with two type II cells. The morphology indicates that extensive preprocessing of linear acceleratory stimuli occurs peripherally, as is true also of visual and olfactory systems. Clinically, this means that loss of individual nerve fibers may not be noticed behaviorally, due to redundancy (receptive field overlap). However, peripheral processing implies the presence of neuroactive agents whose loss can acutely or chronically alter normal peripheral function and cause balance disorders.

  11. Software for Preprocessing Data from Rocket-Engine Tests

    NASA Technical Reports Server (NTRS)

    Cheng, Chiu-Fu

    2004-01-01

    Three computer programs have been written to preprocess digitized outputs of sensors during rocket-engine tests at Stennis Space Center (SSC). The programs apply exclusively to the SSC E test-stand complex and utilize the SSC file format. The programs are the following: Engineering Units Generator (EUGEN) converts sensor-output-measurement data to engineering units. The inputs to EUGEN are raw binary test-data files, which include the voltage data, a list identifying the data channels, and time codes. EUGEN effects conversion by use of a file that contains calibration coefficients for each channel. QUICKLOOK enables immediate viewing of a few selected channels of data, in contradistinction to viewing only after post-test processing (which can take 30 minutes to several hours depending on the number of channels and other test parameters) of data from all channels. QUICKLOOK converts the selected data into a form in which they can be plotted in engineering units by use of Winplot (a free graphing program written by Rick Paris). EUPLOT provides a quick means for looking at data files generated by EUGEN without the necessity of relying on the PV-WAVE based plotting software.
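
    The voltage-to-engineering-units conversion EUGEN performs can be sketched as a per-channel polynomial calibration; the dictionary-based coefficient format below is a hypothetical stand-in for the SSC calibration file.

      import numpy as np

      def to_engineering_units(volts, coeffs):
          """volts: (samples, channels) raw voltages; coeffs: {channel:
          [c0, c1, ...]} polynomial calibration, lowest order first."""
          out = np.array(volts, dtype=float)
          for ch, c in coeffs.items():
              out[:, ch] = np.polynomial.polynomial.polyval(volts[:, ch], c)
          return out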

  12. Software for Preprocessing Data From Rocket-Engine Tests

    NASA Technical Reports Server (NTRS)

    Cheng, Chiu-Fu

    2002-01-01

    Three computer programs have been written to preprocess digitized outputs of sensors during rocket-engine tests at Stennis Space Center (SSC). The programs apply exclusively to the SSC "E" test-stand complex and utilize the SSC file format. The programs are the following: 1) Engineering Units Generator (EUGEN) converts sensor-output-measurement data to engineering units. The inputs to EUGEN are raw binary test-data files, which include the voltage data, a list identifying the data channels, and time codes. EUGEN effects conversion by use of a file that contains calibration coefficients for each channel; 2) QUICKLOOK enables immediate viewing of a few selected channels of data, in contradistinction to viewing only after post-test processing (which can take 30 minutes to several hours depending on the number of channels and other test parameters) of data from all channels. QUICKLOOK converts the selected data into a form in which they can be plotted in engineering units by use of Winplot (a free graphing program written by Rick Paris); and 3) EUPLOT provides a quick means for looking at data files generated by EUGEN without the necessity of relying on the PV-WAVE-based plotting software.

  13. Software for Preprocessing Data From Rocket-Engine Tests

    NASA Technical Reports Server (NTRS)

    Cheng, Chiu-Fu

    2003-01-01

    Three computer programs have been written to preprocess digitized outputs of sensors during rocket-engine tests at Stennis Space Center (SSC). The programs apply exclusively to the SSC E test-stand complex and utilize the SSC file format. The programs are the following: (1) Engineering Units Generator (EUGEN) converts sensor-output-measurement data to engineering units. The inputs to EUGEN are raw binary test-data files, which include the voltage data, a list identifying the data channels, and time codes. EUGEN effects conversion by use of a file that contains calibration coefficients for each channel. (2) QUICKLOOK enables immediate viewing of a few selected channels of data, in contradistinction to viewing only after post-test processing (which can take 30 minutes to several hours depending on the number of channels and other test parameters) of data from all channels. QUICKLOOK converts the selected data into a form in which they can be plotted in engineering units by use of Winplot. (3) EUPLOT provides a quick means for looking at data files generated by EUGEN without the necessity of relying on the PV-WAVE-based plotting software.

  14. Design of a focal plane array with analog neural preprocessing

    NASA Astrophysics Data System (ADS)

    Koren, Ivo; Dohndorf, Juergen; Schluessler, Jens-Uwe; Werner, Joerg; Kroenig, Arndt; Ramacher, Ulrich

    1996-12-01

    The design of a CMOS focal plane array with 128 by 128 pixels and analog neural preprocessing is presented. Optical input to the array is provided by substrate-well photodiodes. A two-dimensional neural grid with next-neighbor connectivity, implemented as a differential current-mode circuit, is capable of spatial low-pass filtering combined with contrast enhancement or binarization. The gain, spatial filter and nonlinearity parameters of the neural network are controlled externally using analog currents. This allows the multipliers and sigmoid transducers to be operated in weak inversion for a wide parameter sweep range, as well as in moderate or strong inversion for a larger signal to pattern-noise ratio. The cell outputs are sequentially read out by an offset-compensated differential switched-capacitor multiplexer with column preamplifiers. The analog output buffer is designed for pixel rates up to 1 pixel/microsecond and 2 by 100 pF load capacitance. All digital clocks controlling the analog data path are generated on-chip. The clock timing is programmable via a serial computer interface. Using a 1 micrometer double-poly double-metal CMOS process, one pixel cell occupies 96 × 96 μm² and the total chip size is about 2.3 cm². Operating the neural network in weak inversion, the power dissipation of the analog circuitry is less than 100 mW.
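
    A software analogue of the neural grid's behaviour - local low-pass filtering combined with a gain-controlled sigmoid that sweeps from contrast enhancement toward binarization - might look like this sketch; the kernel weights and gain are illustrative, not the chip's circuit parameters.

      import numpy as np
      from scipy.ndimage import convolve

      def neural_preprocess(img, gain=4.0, coupling=0.5):
          """Next-neighbor smoothing followed by a sigmoid nonlinearity."""
          kernel = np.array([[0.0, coupling, 0.0],
                             [coupling, 1.0, coupling],
                             [0.0, coupling, 0.0]])
          lowpass = convolve(img, kernel / kernel.sum(), mode="nearest")
          return np.tanh(gain * (img - lowpass))  # signed contrast output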

  15. Microfluidic microarray systems and methods thereof

    DOEpatents

    West, Jay A. A.; Hukari, Kyle W.; Hux, Gary A.

    2009-04-28

    Disclosed are systems that include a manifold in fluid communication with a microfluidic chip having a microarray, an illuminator, and a detector in optical communication with the microarray. Methods for using these systems for biological detection are also disclosed.

  16. Technical Advances of the Recombinant Antibody Microarray Technology Platform for Clinical Immunoproteomics

    PubMed Central

    Delfani, Payam; Dexlin Mellby, Linda; Nordström, Malin; Holmér, Andreas; Ohlsson, Mattias; Borrebaeck, Carl A. K.; Wingren, Christer

    2016-01-01

    In the quest for deciphering disease-associated biomarkers, high-performing tools for multiplexed protein expression profiling of crude clinical samples will be crucial. Affinity proteomics, mainly represented by antibody-based microarrays, has in recent years been established as a proteomic tool providing unique opportunities for parallelized protein expression profiling. But despite the progress, several main technical features and assay procedures remain to be (fully) resolved. Among these issues, the handling of protein microarray data, i.e. the biostatistics part, is one of the key features to solve. In this study, we have therefore further optimized, validated, and standardized our in-house designed recombinant antibody microarray technology platform. To this end, we addressed the main remaining technical issues (e.g. antibody quality, array production, sample labelling, and selected assay conditions) and, most importantly, key biostatistics subjects (e.g. array data pre-processing and biomarker panel condensation). This represents one of the first antibody array studies in which these key biostatistics subjects have been studied in detail. Here, we thus present the next generation of the recombinant antibody microarray technology platform designed for clinical immunoproteomics. PMID:27414037

  17. Microarray analysis: Uses and Limitations

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The use of microarray technology has exploded in recent years. All areas of biological research have found application for this powerful platform. From human disease studies to microbial detection systems, a plethora of uses for this technology are currently in place with new uses being developed ...

  18. Microarray Developed on Plastic Substrates.

    PubMed

    Bañuls, María-José; Morais, Sergi B; Tortajada-Genaro, Luis A; Maquieira, Ángel

    2016-01-01

    There is huge potential interest in using synthetic polymers as versatile solid supports for analytical microarraying. Chemical modification of polycarbonate (PC) for covalent immobilization of probes, micro-printing of protein or nucleic acid probes, development of indirect immunoassays, and development of hybridization protocols are described and discussed. PMID:26614067

  19. Classifying Human Voices by Using Hybrid SFX Time-Series Preprocessing and Ensemble Feature Selection

    PubMed Central

    Wong, Raymond

    2013-01-01

    Voice is a physiological biometric characteristic that differs for each individual person. Due to this uniqueness, voice classification has found useful applications in classifying speakers' gender, mother tongue or ethnicity (accent), emotion states, identity verification, verbal command control, and so forth. In this paper, we adopt a new preprocessing method named Statistical Feature Extraction (SFX) for extracting important features for training a classification model, based on piecewise transformation treating an audio waveform as a time-series. Using SFX we can faithfully remodel the statistical characteristics of the time-series; together with spectral analysis, a substantial number of features are extracted in combination. An ensemble is utilized in selecting only the influential features to be used in classification model induction. We focus on comparing the effects of various popular data mining algorithms on multiple datasets. Our experiment consists of classification tests over four typical categories of human voice data, namely Female and Male, Emotional Speech, Speaker Identification, and Language Recognition. The experiments yield encouraging results supporting the fact that heuristically choosing significant features from both time and frequency domains indeed produces better performance in voice classification than traditional signal processing techniques alone, like wavelets and LPC-to-CC. PMID:24288684
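
    A piecewise statistical treatment of a waveform in the spirit of SFX might look like the sketch below; the window length and the particular statistics are assumptions, not the paper's exact feature set.

      import numpy as np
      from scipy import stats

      def sfx_features(wave, win=1024):
          """Split a 1-D waveform into windows and keep a few descriptive
          statistics per window as time-domain features."""
          frames = wave[: len(wave) // win * win].reshape(-1, win)
          return np.column_stack([frames.mean(axis=1),
                                  frames.std(axis=1),
                                  stats.skew(frames, axis=1),
                                  stats.kurtosis(frames, axis=1),
                                  np.abs(frames).max(axis=1)])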

  20. Image preprocessing method for particle image velocimetry (PIV) image interrogation near a fluid-solid surface

    NASA Astrophysics Data System (ADS)

    Zhu, Yiding; Jia, Lichao; Bai, Ye; Yuan, Huijing; Lee, Cunbiao

    2014-11-01

    Accurate particle image velocimetry (PIV) measurements near a moving wall are a great challenge. The problem is compounded by the very large in-plane displacements on PIV images commonly encountered in measurements of high-speed flows. An improved image preprocessing method is presented in this paper. A wall detection technique is first used to determine the wall position and the movement of the solid body. Virtual particle images are imposed in the solid region, and their displacements are evaluated from the body movement. The estimation near the wall is then smoothed using data from both sides of the shear layer to reduce the large random uncertainties. Interrogations in the following iterative steps then converge to the correct results and provide accurate predictions for particle tracking velocimetry (PTV). Significant improvement is seen in Monte Carlo simulations and experimental tests, such as measurements near a flapping flag or compressor plates. The algorithm also successfully extracted the small flow structures of the 2nd mode wave in the hypersonic boundary layer from PIV images with low signal-to-noise ratios (SNR), where the traditional method was not successful.

  1. The Microarray Revolution: Perspectives from Educators

    ERIC Educational Resources Information Center

    Brewster, Jay L.; Beason, K. Beth; Eckdahl, Todd T.; Evans, Irene M.

    2004-01-01

    In recent years, microarray analysis has become a key experimental tool, enabling the analysis of genome-wide patterns of gene expression. This review approaches the microarray revolution with a focus upon four topics: 1) the early development of this technology and its application to cancer diagnostics; 2) a primer of microarray research,…

  2. Understanding the effects of pre-processing on extracted signal features from gait accelerometry signals.

    PubMed

    Millecamps, Alexandre; Lowry, Kristin A; Brach, Jennifer S; Perera, Subashan; Redfern, Mark S; Sejdić, Ervin

    2015-07-01

    Gait accelerometry is an important approach for gait assessment. Previous contributions have adopted various pre-processing approaches for gait accelerometry signals, but none have thoroughly investigated the effects of such pre-processing operations on the obtained results. Therefore, this paper investigated the influence of pre-processing operations on signal features extracted from gait accelerometry signals. These signals were collected from 35 participants aged over 65 years: 14 of them were healthy controls (HC), 10 had Parkinson's disease (PD) and 11 had peripheral neuropathy (PN). The participants walked on a treadmill at preferred speed. Signal features in time, frequency and time-frequency domains were computed for both raw and pre-processed signals. The pre-processing stage consisted of applying tilt correction and denoising operations to acquired signals. We first examined the effects of these operations separately, followed by the investigation of their joint effects. Several important observations were made based on the obtained results. First, the denoising operation alone had almost no effects in comparison to the trends observed in the raw data. Second, the tilt correction affected the reported results to a certain degree, which could lead to a better discrimination between groups. Third, the combination of the two pre-processing operations yielded similar trends as the tilt correction alone. These results indicated that while gait accelerometry is a valuable approach for the gait assessment, one has to carefully adopt any pre-processing steps as they alter the observed findings. PMID:25935124
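
    One common form of tilt correction - rotating the triaxial signal so its mean (gravity) vector aligns with the vertical axis - can be sketched with Rodrigues' rotation formula; the paper's exact correction may differ.

      import numpy as np

      def tilt_correct(acc):
          """acc: (samples, 3) accelerations; returns the rotated signal."""
          g = acc.mean(axis=0)
          g = g / np.linalg.norm(g)                  # estimated gravity axis
          v = np.array([0.0, 0.0, 1.0])              # target vertical
          axis = np.cross(g, v)
          s, c = np.linalg.norm(axis), float(np.dot(g, v))
          if s < 1e-12:                              # already aligned
              return acc.copy()
          k = axis / s
          K = np.array([[0, -k[2], k[1]],
                        [k[2], 0, -k[0]],
                        [-k[1], k[0], 0]])
          R = np.eye(3) + s * K + (1 - c) * (K @ K)  # Rodrigues' formula
          return acc @ R.T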

  3. Understanding the effects of pre-processing on extracted signal features from gait accelerometry signals

    PubMed Central

    Millecamps, Alexandre; Brach, Jennifer S.; Lowry, Kristin A.; Perera, Subashan; Redfern, Mark S.

    2015-01-01

    Gait accelerometry is an important approach for gait assessment. Previous contributions have adopted various pre-processing approaches for gait accelerometry signals, but none have thoroughly investigated the effects of such pre-processing operations on the obtained results. Therefore, this paper investigated the influence of pre-processing operations on signal features extracted from gait accelerometry signals. These signals were collected from 35 participants aged over 65 years: 14 of them were healthy controls (HC), 10 had Parkinson's disease (PD) and 11 had peripheral neuropathy (PN). The participants walked on a treadmill at preferred speed. Signal features in time, frequency and time-frequency domains were computed for both raw and pre-processed signals. The pre-processing stage consisted of applying tilt correction and de-noising operations to acquired signals. We first examined the effects of these operations separately, followed by the investigation of their joint effects. Several important observations were made based on the obtained results. First, the denoising operation alone had almost no effects in comparison to the trends observed in the raw data. Second, the tilt correction affected the reported results to a certain degree, which could lead to a better discrimination between groups. Third, the combination of the two pre-processing operations yielded similar trends as the tilt correction alone. These results indicated that while gait accelerometry is a valuable approach for the gait assessment, one has to carefully adopt any pre-processing steps as they alter the observed findings. PMID:25935124

  4. Biclustering of microarray data with MOSPO based on crowding distance

    PubMed Central

    Liu, Junwan; Li, Zhoujun; Hu, Xiaohua; Chen, Yiming

    2009-01-01

    Background High-throughput microarray technologies have generated and accumulated massive amounts of gene expression datasets that contain expression levels of thousands of genes under hundreds of different experimental conditions. The microarray datasets are usually presented in 2D matrices, where rows represent genes and columns represent experimental conditions. The analysis of such datasets can discover local structures composed by sets of genes that show coherent expression patterns under subsets of experimental conditions. It leads to the development of sophisticated algorithms capable of extracting novel and useful knowledge from a biomedical point of view. In the medical domain, these patterns are useful for understanding various diseases, and aid in more accurate diagnosis, prognosis, treatment planning, as well as drug discovery. Results In this work we present the CMOPSOB (Crowding distance based Multi-objective Particle Swarm Optimization Biclustering), a novel clustering approach for microarray datasets to cluster genes and conditions highly related in sub-portions of the microarray data. The objective of biclustering is to find sub-matrices, i.e. maximal subgroups of genes and subgroups of conditions where the genes exhibit highly correlated activities over a subset of conditions. Since these objectives are mutually conflicting, they become suitable candidates for multi-objective modelling. Our approach CMOPSOB is based on a heuristic search technique, multi-objective particle swarm optimization, which simulates the movements of a flock of birds which aim to find food. In the meantime, the nearest neighbour search strategies based on crowding distance and ϵ-dominance can rapidly converge to the Pareto front and guarantee diversity of solutions. We compare the potential of this methodology with other biclustering algorithms by analyzing two common and public datasets of gene expression profiles. In all cases our method can find localized structures
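
    The crowding distance that preserves diversity along the Pareto front is the standard NSGA-II quantity; a sketch for an objective matrix F follows.

      import numpy as np

      def crowding_distance(F):
          """F: (solutions, objectives) objective values; a larger distance
          marks a less crowded, more diversity-preserving solution."""
          n, m = F.shape
          d = np.zeros(n)
          for j in range(m):
              order = np.argsort(F[:, j])
              d[order[0]] = d[order[-1]] = np.inf    # keep boundary points
              span = F[order[-1], j] - F[order[0], j]
              if span == 0:
                  continue
              d[order[1:-1]] += (F[order[2:], j] - F[order[:-2], j]) / span
          return d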

  5. Automated Pre-processing for NMR Assignments with Reduced Tedium

    Energy Science and Technology Software Center (ESTSC)

    2004-05-11

    An important rate-limiting step in the resonance assignment process is accurate identification of resonance peaks in NMR spectra. NMR spectra are noisy. Hence, automatic peak-picking programs must navigate between the Scylla of reliable but incomplete picking, and the Charybdis of noisy but complete picking. Each of these extremes complicates the assignment process: incomplete peak-picking results in the loss of essential connectivities, while noisy picking conceals the true connectivities under a combinatorial explosion of false positives. Intermediate processing can simplify the assignment process by preferentially removing false peaks from noisy peak lists. This is accomplished by requiring consensus between multiple NMR experiments, exploiting a priori information about NMR spectra, and drawing on empirical statistical distributions of chemical shift extracted from the BioMagResBank. Experienced NMR practitioners currently apply many of these techniques "by hand", which is tedious, and may appear arbitrary to the novice. To increase efficiency, we have created a systematic and automated approach to this process, known as APART. Automated pre-processing has three main advantages: reduced tedium, standardization, and pedagogy. In the hands of experienced spectroscopists, the main advantage is reduced tedium (a rapid increase in the ratio of true peaks to false peaks with minimal effort). When a project is passed from hand to hand, the main advantage is standardization. APART automatically documents the peak filtering process by archiving its original recommendations, the accompanying justifications, and whether a user accepted or overrode a given filtering recommendation. In the hands of a novice, this tool can reduce the stumbling block of learning to differentiate between real peaks and noise, by providing real-time examples of how such decisions are made.
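
    The consensus idea - keep a peak only if a matching chemical shift appears in enough independent experiments - can be sketched as follows; the tolerance and vote threshold are illustrative, not APART's actual parameters.

      def consensus_filter(peak_lists, tol=0.04, min_votes=2):
          """peak_lists: per-experiment lists of 1-D chemical shifts (ppm).
          Returns (experiment index, peak) pairs with enough support."""
          kept = []
          for i, peaks in enumerate(peak_lists):
              others = [p for j, p in enumerate(peak_lists) if j != i]
              for peak in peaks:
                  votes = 1 + sum(any(abs(peak - q) <= tol for q in plist)
                                  for plist in others)
                  if votes >= min_votes:
                      kept.append((i, peak))
          return kept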

  6. Ontology-Based Analysis of Microarray Data.

    PubMed

    Agapito, Giuseppe; Milano, Marianna

    2016-01-01

    The importance of semantic-based methods and algorithms for the analysis and management of biological data is growing for two main reasons: from a biological side, the knowledge contained in ontologies is increasingly accurate and complete; from a computational side, recent algorithms make valuable use of such knowledge. Here we focus on semantic-based management and analysis of protein interaction networks, referring to all approaches that analyse protein-protein interaction data using knowledge encoded in biological ontologies. Semantic approaches for studying high-throughput data have been largely used in the past to mine genomic and expression data. Recently, the emergence of network approaches for investigating molecular machineries has in parallel stimulated the introduction of semantic-based techniques for the analysis and management of network data. Applying these computational approaches to the study of microarray data can broaden their application scenario and simultaneously help the understanding of disease development and progression. PMID:25971913

  7. The Current Status of DNA Microarrays

    NASA Astrophysics Data System (ADS)

    Shi, Leming; Perkins, Roger G.; Tong, Weida

    DNA microarray technology that allows simultaneous assay of thousands of genes in a single experiment has steadily advanced to become a mainstream method used in research, and has reached a stage that envisions its use in medical applications and personalized medicine. Many different strategies have been developed for manufacturing DNA microarrays. In this chapter, we discuss the manufacturing characteristics of seven microarray platforms that were used in a recently completed large study by the MicroArray Quality Control (MAQC) consortium, which evaluated the concordance of results across these platforms. The platforms can be grouped into three categories: (1) in situ synthesis of oligonucleotide probes on microarrays (Affymetrix GeneChip® arrays based on photolithography synthesis and Agilent's arrays based on inkjet synthesis); (2) spotting of presynthesized oligonucleotide probes on microarrays (GE Healthcare's CodeLink system, Applied Biosystems' Genome Survey Microarrays, and the custom microarrays printed with Operon's oligonucleotide set); and (3) deposition of presynthesized oligonucleotide probes on bead-based microarrays (Illumina's BeadChip microarrays). We conclude this chapter with our views on the challenges and opportunities toward acceptance of DNA microarray data in clinical and regulatory settings.

  8. The Current Status of DNA Microarrays

    NASA Astrophysics Data System (ADS)

    Shi, Leming; Perkins, Roger G.; Tong, Weida

    DNA microarray technology that allows simultaneous assay of thousands of genes in a single experiment has steadily advanced to become a mainstream method used in research, and has reached a stage that envisions its use in medical applications and personalized medicine. Many different strategies have been developed for manufacturing DNA microarrays. In this chapter, we discuss the manufacturing characteristics of seven microarray platforms that were used in a recently completed large study by the MicroArray Quality Control (MAQC) consortium, which evaluated the concordance of results across these platforms. The platforms can be grouped into three categories: (1) in situ synthesis of oligonucleotide probes on microarrays (Affymetrix GeneChip® arrays based on photolithography synthesis and Agilent's arrays based on inkjet synthesis); (2) spotting of presynthesized oligonucleotide probes on microarrays (GE Healthcare's CodeLink system, Applied Biosystems' Genome Survey Microarrays, and the custom microarrays printed with Operon's oligonucleotide set); and (3) deposition of presynthesized oligonucleotide probes on bead-based microarrays (Illumina's BeadChip microarrays). We conclude this chapter with our views on the challenges and opportunities toward acceptance of DNA microarray data in clinical and regulatory settings.

  9. Tissue microarrays: applications in genomic research.

    PubMed

    Watanabe, Aprill; Cornelison, Robert; Hostetter, Galen

    2005-03-01

    The widespread application of tissue microarrays in cancer research and the clinical pathology laboratory demonstrates a versatile and portable technology. The rapid integration of tissue microarrays into biomarker discovery and validation processes reflects the forward thinking of researchers who have pioneered the high-density tissue microarray. The precise arrangement of hundreds of archival clinical tissue samples into a composite tissue microarray block is now a proven method for the efficient and standardized analysis of molecular markers. With applications in cancer research, tissue microarrays are a valuable tool in validating candidate markers discovered in highly sensitive genome-wide microarray experiments. With applications in clinical pathology, tissue microarrays are used widely in immunohistochemistry quality control and quality assurance. The timeline of a biomarker implicated in prostate neoplasia, which was identified by complementary DNA expression profiling, validated by tissue microarrays and is now used as a prognostic immunohistochemistry marker, is reviewed. The tissue microarray format provides opportunities for digital imaging acquisition, image processing and database integration. Advances in digital imaging help to alleviate previous bottlenecks in the research pipeline, permit computer image scoring and convey telepathology opportunities for remote image analysis. The tissue microarray industry now includes public and private sectors with varying degrees of research utility and offers a range of potential tissue microarray applications in basic research, prognostic oncology and drug discovery. PMID:15833047

  10. Hyperspectral microarray scanning: impact on the accuracy and reliability of gene expression data

    PubMed Central

    Timlin, Jerilyn A; Haaland, David M; Sinclair, Michael B; Aragon, Anthony D; Martinez, M Juanita; Werner-Washburne, Margaret

    2005-01-01

    Background Commercial microarray scanners and software cannot distinguish between spectrally overlapping emission sources, and hence cannot accurately identify or correct for emissions not originating from the labeled cDNA. We employed our hyperspectral microarray scanner coupled with multivariate data analysis algorithms that independently identify and quantitate emissions from all sources to investigate three artifacts that reduce the accuracy and reliability of microarray data: skew toward the green channel, dye separation, and variable background emissions. Results Here we demonstrate that several common microarray artifacts resulted from the presence of emission sources other than the labeled cDNA that can dramatically alter the accuracy and reliability of the array data. The microarrays utilized in this study were representative of a wide cross-section of the microarrays currently employed in genomic research. These findings reinforce the need for careful attention to detail to recognize and subsequently eliminate or quantify the presence of extraneous emissions in microarray images. Conclusion Hyperspectral scanning together with multivariate analysis offers a unique and detailed understanding of the sources of microarray emissions after hybridization. This opportunity to simultaneously identify and quantitate contaminant and background emissions in microarrays markedly improves the reliability and accuracy of the data and permits a level of quality control of microarray emissions previously unachievable. Using these tools, we can not only quantify the extent and contribution of extraneous emission sources to the signal, but also determine the consequences of failing to account for them and gain the insight necessary to adjust preparation protocols to prevent such problems from occurring. PMID:15888208

  11. Microarray analysis in pulmonary hypertension

    PubMed Central

    Hoffmann, Julia; Wilhelm, Jochen; Olschewski, Andrea

    2016-01-01

    Microarrays are a powerful and effective tool that allows the detection of genome-wide gene expression differences between controls and disease conditions. They have been broadly applied to investigate the pathobiology of diverse forms of pulmonary hypertension, namely group 1, including patients with idiopathic pulmonary arterial hypertension, and group 3, including pulmonary hypertension associated with chronic lung diseases such as chronic obstructive pulmonary disease and idiopathic pulmonary fibrosis. To date, numerous human microarray studies have been conducted to analyse global (lung homogenate samples), compartment-specific (laser capture microdissection), cell type-specific (isolated primary cells) and circulating cell (peripheral blood) expression profiles. Combined, they provide important information on development, progression and the end-stage disease. In the future, system biology approaches, expression of noncoding RNAs that regulate coding RNAs, and direct comparison between animal models and human disease might be of importance. PMID:27076594

  12. Phenotypic MicroRNA Microarrays

    PubMed Central

    Kwon, Yong-Jun; Heo, Jin Yeong; Kim, Hi Chul; Kim, Jin Yeop; Liuzzi, Michel; Soloveva, Veronica

    2013-01-01

    Microarray technology has become a very popular approach in cases where multiple experiments need to be conducted repeatedly or done with a variety of samples. In our lab, we are applying our high-density spot microarray approach to microscopy visualization of the effects of transiently introduced siRNA or cDNA on cellular morphology or phenotype. In this publication, we discuss the possibility of using this micro-scale high-throughput process to study the role of microRNAs in the biology of selected cellular models. After reverse-transfection of microRNAs and siRNA, the cellular phenotype generated by the microRNAs regulated NF-κB expression comparably to the siRNA. The ability to print microRNA molecules for reverse transfection into cells is opening up a wide horizon for the phenotypic high-content screening of microRNA libraries using cellular disease models.

  13. Microarray analysis in pulmonary hypertension.

    PubMed

    Hoffmann, Julia; Wilhelm, Jochen; Olschewski, Andrea; Kwapiszewska, Grazyna

    2016-07-01

    Microarrays are a powerful and effective tool that allows the detection of genome-wide gene expression differences between controls and disease conditions. They have been broadly applied to investigate the pathobiology of diverse forms of pulmonary hypertension, namely group 1, including patients with idiopathic pulmonary arterial hypertension, and group 3, including pulmonary hypertension associated with chronic lung diseases such as chronic obstructive pulmonary disease and idiopathic pulmonary fibrosis. To date, numerous human microarray studies have been conducted to analyse global (lung homogenate samples), compartment-specific (laser capture microdissection), cell type-specific (isolated primary cells) and circulating cell (peripheral blood) expression profiles. Combined, they provide important information on development, progression and the end-stage disease. In the future, system biology approaches, expression of noncoding RNAs that regulate coding RNAs, and direct comparison between animal models and human disease might be of importance. PMID:27076594

  14. Self-Assembling Protein Microarrays

    NASA Astrophysics Data System (ADS)

    Ramachandran, Niroshan; Hainsworth, Eugenie; Bhullar, Bhupinder; Eisenstein, Samuel; Rosen, Benjamin; Lau, Albert Y.; Walter, Johannes C.; LaBaer, Joshua

    2004-07-01

    Protein microarrays provide a powerful tool for the study of protein function. However, they are not widely used, in part because of the challenges in producing proteins to spot on the arrays. We generated protein microarrays by printing complementary DNAs onto glass slides and then translating target proteins with mammalian reticulocyte lysate. Epitope tags fused to the proteins allowed them to be immobilized in situ. This obviated the need to purify proteins, avoided protein stability problems during storage, and captured sufficient protein for functional studies. We used the technology to map pairwise interactions among 29 human DNA replication initiation proteins, recapitulate the regulation of Cdt1 binding to select replication proteins, and map its geminin-binding domain.

  15. Reordering based integrative expression profiling for microarray classification

    PubMed Central

    2012-01-01

    Background Current network-based microarray analysis uses the information of interactions among concerned genes/gene products, but still considers each gene expression individually. We propose an organized knowledge-supervised approach - Integrative eXpression Profiling (IXP) - to improve microarray classification accuracy and to help discover groups of genes that have been too weak to detect individually by traditional means. To implement IXP, an ant colony optimization reordering (ACOR) algorithm is used to group functionally related genes in an ordered way. Results Using Alzheimer's disease (AD) as an example, we demonstrate how to apply the ACOR-based IXP approach to microarray classifications. Using a microarray dataset - GSE1297, with 31 samples - as the training set, the result for the blinded classification on another microarray dataset - GSE5281, with 151 samples - shows that our approach can improve accuracy from 74.83% to 82.78%. A recently-published 1372-probe signature for AD can only achieve 61.59% accuracy under the same conditions. The ACOR-based IXP approach also performs better than IXP approaches based on classic network ranking, graph clustering, and random-ordering methods in an overall classification performance comparison. Conclusions The ACOR-based IXP approach can serve as a knowledge-supervised feature transformation method that increases classification accuracy dramatically by transforming each gene expression profile into an integrated expression profile, which is then input as features to standard classifiers. The IXP approach integrates both gene expression information and organized knowledge - disease gene/protein network topology - represented as both network node weights (local topological properties) and network node orders (global topological characteristics). PMID:22536860

  16. Washing scaling of GeneChip microarray expression

    PubMed Central

    2010-01-01

    Background Post-hybridization washing is an essential part of microarray experiments. Both the quality of the experimental washing protocol and adequate consideration of washing in intensity calibration ultimately affect the quality of the expression estimates extracted from the microarray intensities. Results We conducted experiments on GeneChip microarrays with altered protocols for washing, scanning and staining to study the probe-level intensity changes as a function of the number of washing cycles. For calibration and analysis of the intensity data we make use of the 'hook' method which allows intensity contributions due to non-specific and specific hybridization of perfect match (PM) and mismatch (MM) probes to be disentangled in a sequence specific manner. On average, washing according to the standard protocol removes about 90% of the non-specific background and about 30-50% and less than 10% of the specific targets from the MM and PM, respectively. Analysis of the washing kinetics shows that the signal-to-noise ratio doubles roughly every ten stringent washing cycles. Washing can be characterized by time-dependent rate constants which reflect the heterogeneous character of target binding to microarray probes. We propose an empirical washing function which estimates the survival of probe bound targets. It depends on the intensity contribution due to specific and non-specific hybridization per probe which can be estimated for each probe using existing methods. The washing function allows probe intensities to be calibrated for the effect of washing. On a relative scale, proper calibration for washing markedly increases expression measures, especially in the limit of small and large values. Conclusions Washing is among the factors which potentially distort expression measures. The proposed first-order correction method allows direct implementation in existing calibration algorithms for microarray data. We provide an experimental 'washing data set' which might
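
    A first-order washing function of the kind proposed - specific and non-specific bound fractions decaying at different rates as washing cycles accumulate - can be sketched as follows; the rate constants are illustrative, not the paper's fitted values.

      import numpy as np

      def washed_intensity(i_specific, i_nonspecific, n_cycles,
                           k_specific=0.005, k_nonspecific=0.1):
          """Modelled probe intensity surviving n washing cycles."""
          return (i_specific * np.exp(-k_specific * n_cycles)
                  + i_nonspecific * np.exp(-k_nonspecific * n_cycles))

      # calibration would divide measured intensities by the modelled survival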

  17. Performance of Multi-User Transmitter Pre-Processing Assisted Multi-Cell IDMA System for Downlink Transmission

    NASA Astrophysics Data System (ADS)

    Partibane, B.; Nagarajan, V.; Vishvaksenan, K. S.; Kalidoss, R.

    2015-06-01

    In this paper, we present the performance of multi-user transmitter pre-processing (MUTP) assisted coded interleave-division multiple access (IDMA) systems over correlated frequency-selective channels for downlink communication. We realize MUTP using the singular value decomposition (SVD) technique, which exploits the channel state information (CSI) of all the active users, acquired via feedback channels. We consider the MUTP technique to alleviate the effects of co-channel interference (CCI) and multiple access interference (MAI). To be specific, we estimate the CSI using a least square error (LSE) algorithm at each of the mobile stations (MSs), perform vector quantization using Lloyd's algorithm, and feed back the bits representing the quantized magnitudes and phases to the base station (BS) through a dedicated low-rate noisy channel. Finally, we recover the quantized bits at the BS to formulate the pre-processing matrix. The performance of MUTP-aided IDMA systems is evaluated for five types of delay spread distributions pertaining to long-term evolution (LTE) and Stanford University Interim (SUI) channel models. We also compare the performance of MUTP with a minimum mean square error (MMSE) detector for the coded IDMA system. The considered MUTP scheme alleviates the effects of CCI with less complex signal detection at the MSs compared to the MMSE detector. Further, our simulation results reveal that the SVD-based MUTP-assisted coded IDMA system outperforms the MMSE detector in terms of achievable bit error rate (BER) with a lower signal-to-noise ratio (SNR) requirement, by mitigating the effects of CCI and MAI.
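
    The SVD step at the core of such a precoder is easy to sketch: transmit along the right singular vectors of the fed-back channel estimate so the receiver sees parallel subchannels. The quantization stage and the IDMA layering are omitted here.

      import numpy as np

      def svd_precoder(H):
          """H: (n_rx, n_tx) channel matrix reconstructed at the BS.
          Returns transmit precoder, receive combiner, subchannel gains."""
          U, s, Vh = np.linalg.svd(H, full_matrices=False)
          return Vh.conj().T, U.conj().T, s

      # with P, W, s = svd_precoder(H):  W @ H @ P == diag(s)  (up to noise)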

  18. Optical detection of nanoparticle-enhanced human papillomavirus genotyping microarrays.

    PubMed

    Li, Xue Zhe; Kim, Sookyung; Cho, Wonhyung; Lee, Seung-Yop

    2013-02-01

    In this study, we propose a new detection method for nanoparticle-enhanced human papillomavirus (HPV) genotyping microarrays using a DVD optical pick-up with a photodiode. The HPV genotyping DNA chip was labeled using Au/Ag core-shell nanoparticles, prepared on a treatment glass substrate. The bio-information of the HPV genotyping target DNA was then detected by measuring the difference in optical signals between the DNA spots and the background parts for cervical cancer diagnosis. Moreover, an approximately linear relationship between the concentration of the HPV genotyping target DNA and the optical signal, depending on the density of the Au/Ag core-shell nanoparticles, was obtained by applying a spot-finding algorithm. It is shown that the nanoparticle-labeled HPV genotyping target DNA can be measured and quantified by collecting the low-cost photodiode signal on the treatment glass chip, replacing high-cost fluorescence microarray scanners that use a photomultiplier tube. PMID:23413051

  19. A Hybrid BPSO-CGA Approach for Gene Selection and Classification of Microarray Data

    PubMed Central

    Chuang, Li-Yeh; Yang, Cheng-Huei; Li, Jung-Chike

    2012-01-01

    Microarray analysis promises to detect variations in gene expression and changes in the transcription rates of an entire genome in vivo. Microarray gene expression profiles indicate the relative abundance of mRNA corresponding to the genes. The selection of relevant genes from microarray data poses a formidable challenge to researchers due to the high dimensionality of features, the multiclass categories involved, and the usually small sample size. A classification process is often employed to decrease the dimensionality of the microarray data. In order to correctly analyze microarray data, the goal is to find an optimal subset of features (genes) which adequately represents the original set of features. A hybrid method of binary particle swarm optimization (BPSO) and a combat genetic algorithm (CGA) is used to perform the microarray gene selection. The K-nearest neighbor (K-NN) method with leave-one-out cross-validation (LOOCV) served as a classifier. The proposed BPSO-CGA approach is evaluated on ten microarray datasets from the literature. The experimental results indicate that the proposed method not only effectively reduces the number of selected genes, but also achieves a low classification error rate. PMID:21210743
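
    The fitness evaluation driving the search - K-NN accuracy under leave-one-out cross-validation on a candidate gene subset - is directly expressible with scikit-learn; the helper name is ours.

      from sklearn.model_selection import LeaveOneOut, cross_val_score
      from sklearn.neighbors import KNeighborsClassifier

      def loocv_accuracy(X, y, gene_subset, k=1):
          """Mean LOOCV accuracy of K-NN restricted to the selected genes."""
          clf = KNeighborsClassifier(n_neighbors=k)
          return cross_val_score(clf, X[:, gene_subset], y,
                                 cv=LeaveOneOut()).mean()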

  20. Multisensor data fusion algorithm development

    SciTech Connect

    Yocky, D.A.; Chadwick, M.D.; Goudy, S.P.; Johnson, D.K.

    1995-12-01

    This report presents a two-year LDRD research effort into multisensor data fusion. We approached the problem by addressing the available types of data, preprocessing that data, and developing fusion algorithms using that data. The report reflects these three distinct areas. First, the possible data sets for fusion are identified. Second, automated registration techniques for imagery data are analyzed. Third, two fusion techniques are presented. The first fusion algorithm is based on the two-dimensional discrete wavelet transform. Using test images, the wavelet algorithm is compared against intensity modulation and intensity-hue-saturation image fusion algorithms that are available in commercial software. The wavelet approach outperforms the other two fusion techniques by preserving spectral/spatial information more precisely. The wavelet fusion algorithm was also applied to Landsat Thematic Mapper and SPOT panchromatic imagery data. The second algorithm is based on a linear-regression technique. We analyzed the technique using the same Landsat and SPOT data.
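
    A minimal version of the wavelet fusion rule described - keep the multispectral band's low-frequency (spectral) content and inject the panchromatic image's high-frequency (spatial) detail - can be written with PyWavelets; the single-level 'db2' decomposition is an assumption.

      import pywt

      def wavelet_fuse(pan, ms):
          """pan, ms: 2-D arrays on the same grid; returns the fused band."""
          approx_ms, _ = pywt.dwt2(ms, "db2")
          _, detail_pan = pywt.dwt2(pan, "db2")
          return pywt.idwt2((approx_ms, detail_pan), "db2")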

  1. ArrayWiki: an enabling technology for sharing public microarray data repositories and meta-analyses

    PubMed Central

    Stokes, Todd H; Torrance, JT; Li, Henry; Wang, May D

    2008-01-01

    Background A survey of microarray databases reveals that most of the repository contents and data models are heterogeneous (i.e., data obtained from different chip manufacturers), and that the repositories provide only basic biological keywords linking to PubMed. As a result, it is difficult to find datasets using research context or analysis parameters information beyond a few keywords. For example, to reduce the "curse-of-dimension" problem in microarray analysis, the number of samples is often increased by merging array data from different datasets. Knowing chip data parameters such as pre-processing steps (e.g., normalization, artefact removal, etc.), and knowing any previous biological validation of the dataset, is essential due to the heterogeneity of the data. However, most of the microarray repositories do not have meta-data information in the first place, and do not have a mechanism to add or insert this information. Thus, there is a critical need to create "intelligent" microarray repositories that (1) enable update of meta-data with the raw array data, and (2) provide standardized archiving protocols to minimize bias from the raw data sources. Results To address the problems discussed, we have developed a community-maintained system called ArrayWiki that unites disparate meta-data of microarray meta-experiments from multiple primary sources with four key features. First, ArrayWiki provides a user-friendly knowledge management interface in addition to a programmable interface using standards developed by Wikipedia. Second, ArrayWiki includes automated quality control processes (caCORRECT) and novel visualization methods (BioPNG, Gel Plots), which provide extra information about data quality unavailable in other microarray repositories. Third, it provides a user-curation capability through the familiar Wiki interface. Fourth, ArrayWiki provides users with simple text-based searches across all experiment meta-data, and exposes data to search engine crawlers

  2. Gene Expression Browser: large-scale and cross-experiment microarray data integration, management, search & visualization

    PubMed Central

    2010-01-01

    Background In the last decade, a large amount of microarray gene expression data has been accumulated in public repositories. Integrating and analyzing high-throughput gene expression data have become key activities for exploring gene functions, gene networks and biological pathways. Effectively utilizing these invaluable microarray data remains challenging due to a lack of powerful tools to integrate large-scale gene-expression information across diverse experiments and to search and visualize a large number of gene-expression data points. Results Gene Expression Browser is a microarray data integration, management and processing system with web-based search and visualization functions. An innovative method has been developed to define a treatment over a control for every microarray experiment to standardize and make microarray data from different experiments homogeneous. In the browser, data are pre-processed offline and the resulting data points are visualized online with a 2-layer dynamic web display. Users can view all treatments over control that affect the expression of a selected gene via Gene View, and view all genes that change in a selected treatment over control via Treatment over Control View. Users can also check the changes of expression profiles of a set of either the treatments over control or genes via Slide View. In addition, the relationships between genes and treatments over control are computed according to gene expression ratio and are shown as co-responsive genes and co-regulation treatments over control. Conclusion Gene Expression Browser is composed of a set of software tools, including a data extraction tool, a microarray data-management system, a data-annotation tool, a microarray data-processing pipeline, and a data search & visualization tool. The browser is deployed as a free public web service (http://www.ExpressionBrowser.com) that integrates 301 ATH1 gene microarray experiments from public data repositories (viz. the Gene

  3. Integrated Amplification Microarrays for Infectious Disease Diagnostics

    PubMed Central

    Chandler, Darrell P.; Bryant, Lexi; Griesemer, Sara B.; Gu, Rui; Knickerbocker, Christopher; Kukhtin, Alexander; Parker, Jennifer; Zimmerman, Cynthia; St. George, Kirsten; Cooney, Christopher G.

    2012-01-01

    This overview describes microarray-based tests that combine solution-phase amplification chemistry and microarray hybridization within a single microfluidic chamber. The integrated biochemical approach improves microarray workflow for diagnostic applications by reducing the number of steps and minimizing the potential for sample or amplicon cross-contamination. Examples described herein illustrate a basic, integrated approach for DNA and RNA genomes, and a simple consumable architecture for incorporating wash steps while retaining an entirely closed system. It is anticipated that integrated microarray biochemistry will provide an opportunity to significantly reduce the complexity and cost of microarray consumables, equipment, and workflow, which in turn will enable a broader spectrum of users to exploit the intrinsic multiplexing power of microarrays for infectious disease diagnostics.

  4. THE ABRF MARG MICROARRAY SURVEY 2005: TAKING THE PULSE ON THE MICROARRAY FIELD

    EPA Science Inventory

    Over the past several years microarray technology has evolved into a critical component of any discovery based program. Since 1999, the Association of Biomolecular Resource Facilities (ABRF) Microarray Research Group (MARG) has conducted biennial surveys designed to generate a pr...

  5. Living Cell Microarrays: An Overview of Concepts.

    PubMed

    Jonczyk, Rebecca; Kurth, Tracy; Lavrentieva, Antonina; Walter, Johanna-Gabriela; Scheper, Thomas; Stahl, Frank

    2016-01-01

    Living cell microarrays are a highly efficient cellular screening system. Due to the low number of cells required per spot, cell microarrays enable the use of primary and stem cells and provide resolution close to the single-cell level. Apart from a variety of conventional static designs, microfluidic microarray systems have also been established. An alternative format is a microarray consisting of three-dimensional cell constructs ranging from cell spheroids to cells encapsulated in hydrogel. These systems provide an in vivo-like microenvironment and are preferably used for the investigation of cellular physiology, cytotoxicity, and drug screening. Thus, many different high-tech microarray platforms are currently available. Disadvantages of many systems include their high cost, the requirement of specialized equipment for their manufacture, and the poor comparability of results between different platforms. In this article, we provide an overview of static, microfluidic, and 3D cell microarrays. In addition, we describe a simple method for the printing of living cell microarrays on modified microscope glass slides using standard DNA microarray equipment available in most laboratories. Applications in research and diagnostics are discussed, e.g., the selective and sensitive detection of biomarkers. Finally, we highlight current limitations and the future prospects of living cell microarrays. PMID:27600077

  7. Protein microarrays as tools for functional proteomics.

    PubMed

    LaBaer, Joshua; Ramachandran, Niroshan

    2005-02-01

    Protein microarrays present an innovative and versatile approach to study protein abundance and function at an unprecedented scale. Given the chemical and structural complexity of the proteome, the development of protein microarrays has been challenging. Despite these challenges there has been a marked increase in the use of protein microarrays to map interactions of proteins with various other molecules, and to identify potential disease biomarkers, especially in the area of cancer biology. In this review, we discuss some of the promising advances made in the development and use of protein microarrays. PMID:15701447

  8. Photoelectrochemical synthesis of DNA microarrays

    PubMed Central

    Chow, Brian Y.; Emig, Christopher J.; Jacobson, Joseph M.

    2009-01-01

    Optical addressing of semiconductor electrodes represents a powerful technology that enables the independent and parallel control of a very large number of electrical phenomena at the solid-electrolyte interface. To date, it has been used in a wide range of applications including electrophoretic manipulation, biomolecule sensing, and stimulating networks of neurons. Here, we have adapted this approach for the parallel addressing of redox reactions, and report the construction of a DNA microarray synthesis platform based on semiconductor photoelectrochemistry (PEC). An amorphous silicon photoconductor is activated by an optical projection system to create virtual electrodes capable of electrochemically generating protons; these PEC-generated protons then cleave the acid-labile dimethoxytrityl protecting groups of DNA phosphoramidite synthesis reagents with the requisite spatial selectivity to generate DNA microarrays. Furthermore, a thin-film porous glass dramatically increases the amount of DNA synthesized per chip by over an order of magnitude versus uncoated glass. This platform demonstrates that PEC can be used toward combinatorial bio-polymer and small molecule synthesis. PMID:19706433

  9. THE ABRF-MARG MICROARRAY SURVEY 2004: TAKING THE PULSE OF THE MICROARRAY FIELD

    EPA Science Inventory

    Over the past several years, the field of microarrays has grown and evolved drastically. In its continued efforts to track this evolution, the ABRF-MARG has once again conducted a survey of international microarray facilities and individual microarray users. The goal of the surve...

  10. 2008 Microarray Research Group (MARG Survey): Sensing the State of Microarray Technology

    EPA Science Inventory

    Over the past several years, the field of microarrays has grown and evolved drastically. In its continued efforts to track this evolution and transformation, the ABRF-MARG has once again conducted a survey of international microarray facilities and individual microarray users. Th...

  11. Nucleosome positioning from tiling microarray data

    PubMed Central

    Yassour, Moran; Kaplan, Tommy; Jaimovich, Ariel; Friedman, Nir

    2008-01-01

    Motivation: The packaging of DNA around nucleosomes in eukaryotic cells plays a crucial role in the regulation of gene expression and other DNA-related processes. To better understand the regulatory role of nucleosomes, it is important to pinpoint their positions at high (5–10 bp) resolution. Toward this end, several recent works used dense tiling arrays to map nucleosomes in a high-throughput manner. These data were then parsed and hand-curated, and the positions of nucleosomes were assessed. Results: In this manuscript, we present a fully automated algorithm to analyze such data and predict the exact location of nucleosomes. We introduce a method, based on a probabilistic graphical model, to increase the resolution of our predictions even beyond that of the microarray used. We show how to build such a model and how to compile it into a simple Hidden Markov Model, allowing for fast and accurate inference of nucleosome positions. We applied our model to nucleosomal data from mid-log yeast cells reported by Yuan et al. and compared our predictions to those of the original paper; to a more recent method by Lee et al. that uses five-times-denser tiling arrays; and to a curated set of literature-based nucleosome positions. Our results suggest that, applied to the same data used by Yuan et al., our fully automated model traced 13% more nucleosomes and increased the overall accuracy by about 20%. We believe that such an improvement opens the way for a better understanding of the regulatory mechanisms controlling gene expression, and how they are encoded in the DNA. Contact: nir@cs.huji.ac.il PMID:18586706
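
    As a rough sketch of the compiled Hidden Markov Model mentioned above, the Viterbi recursion below decodes a most-likely path over two hypothetical hidden states (linker vs. nucleosome) from per-probe log-likelihoods. The state layout and all inputs are illustrative assumptions; the authors' actual model is richer and reaches sub-probe resolution.

        import numpy as np

        def viterbi(log_emit, log_trans, log_init):
            # log_emit:  (T, S) per-probe log-likelihood of each hidden state
            # log_trans: (S, S) log transition probabilities (from, to)
            # log_init:  (S,)   log initial-state probabilities
            T, S = log_emit.shape
            back = np.zeros((T, S), dtype=int)
            score = log_init + log_emit[0]
            for t in range(1, T):
                cand = score[:, None] + log_trans   # score of every (from, to) move
                back[t] = cand.argmax(axis=0)
                score = cand.max(axis=0) + log_emit[t]
            path = [int(score.argmax())]            # backtrack from the best end state
            for t in range(T - 1, 0, -1):
                path.append(int(back[t, path[-1]]))
            return path[::-1]                       # e.g. 0 = linker, 1 = nucleosome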

  12. Preprocessing Inconsistent Linear System for a Meaningful Least Squares Solution

    NASA Technical Reports Server (NTRS)

    Sen, Syamal K.; Shaykhian, Gholam Ali

    2011-01-01

    Mathematical models of many physical/statistical problems are systems of linear equations. Due to measurement and possible human errors/mistakes in modeling/data, as well as due to certain assumptions made to reduce complexity, inconsistency (contradiction) is injected into the model, viz. the linear system. While any inconsistent system, irrespective of its degree of inconsistency, always has a least-squares solution, one needs to check whether an equation is too inconsistent or, equivalently, too contradictory. Such an equation will affect/distort the least-squares solution to such an extent that it becomes unacceptable/unfit for use in a real-world application. We propose an algorithm which (i) prunes numerically redundant linear equations from the system, as these do not add any new information to the model, (ii) detects contradictory linear equations along with their degree of contradiction (inconsistency index), (iii) removes those equations presumed to be too contradictory, and then (iv) obtains the minimum-norm least-squares solution of the acceptably inconsistent reduced linear system. The algorithm, presented in Matlab, reduces the computational and storage complexities and also improves the accuracy of the solution. It also provides the necessary warning about the existence of too much contradiction in the model. In addition, we suggest a thorough relook into the mathematical modeling to determine why unacceptable contradiction has occurred, thus prompting us to make the necessary corrections/modifications to the models - both mathematical and, if necessary, physical.
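
    A minimal NumPy sketch of steps (ii)-(iv) follows, with two loudly flagged assumptions: a robust z-score of the absolute residuals stands in for the paper's inconsistency index (which this record does not specify), and step (i), pruning redundant equations, is omitted.

        import numpy as np

        def prune_and_solve(A, b, tol=2.5):
            # Initial least-squares fit, then flag equations whose residuals
            # are outliers (the "too contradictory" ones) and re-solve.
            A, b = np.asarray(A, float), np.asarray(b, float)
            x0 = np.linalg.lstsq(A, b, rcond=None)[0]
            r = np.abs(b - A @ x0)
            med = np.median(r)
            mad = np.median(np.abs(r - med)) or 1e-12   # robust spread estimate
            index = (r - med) / (1.4826 * mad)          # inconsistency index (assumed form)
            keep = index < tol
            # Minimum-norm least-squares solution of the reduced system.
            x = np.linalg.pinv(A[keep]) @ b[keep]
            return x, index, keep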

  13. Adaptive filtering image preprocessing for smart FPA technology

    NASA Astrophysics Data System (ADS)

    Brooks, Geoffrey W.

    1995-05-01

    This paper discusses two applications of adaptive filters for image processing on parallel architectures. The first, based on the results of previously accomplished work, summarizes the analyses of various adaptive filters implemented for pixel-level image prediction. FIR filters, fixed and adaptive IIR filters, and various variable-step-size algorithms were compared, with a focus on algorithm complexity against the ability to predict future pixel values. A gaussian smoothing operation with varying spatial and temporal constants was also applied for comparison of random-noise reduction. The second application is a suggestion to use memory-adaptive IIR filters for detecting and tracking motion within an image. Objects within an image are made of edges, or segments, with varying degrees of motion. A previously published application describes FIR filters connecting pixels and using correlations to determine motion and direction. That implementation seems limited to detecting motion coinciding with the FIR filter operation rate and the associated harmonics. Upgrading the FIR structures to adaptive IIR structures can eliminate these limitations. These and other pixel-level adaptive filtering applications require data memory for filter parameters and some basic computational capability. Tradeoffs have to be made between chip real estate and these desired features. System tradeoffs will also have to be made as to where it makes the most sense to do which level of processing. Although smart pixels may not be ready to implement adaptive filters, applications such as these should give the smart pixel designer some long-range goals.
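
    A normalized-LMS FIR predictor is about the simplest representative of the pixel-level prediction filters compared above; the sketch below is a generic textbook version (filter order and step size are illustrative, not the paper's configuration). Persistently large prediction errors at a pixel suggest motion or scene change.

        import numpy as np

        def nlms_predict(x, order=4, mu=0.5, eps=1e-8):
            # One-step-ahead prediction of a single pixel's time series with
            # a normalized-LMS adaptive FIR filter; returns the error signal.
            x = np.asarray(x, dtype=float)
            w = np.zeros(order)
            err = np.zeros(len(x))
            for n in range(order, len(x)):
                u = x[n - order:n][::-1]                 # most recent samples first
                err[n] = x[n] - w @ u                    # prediction error
                w += mu * err[n] * u / (eps + u @ u)     # NLMS weight update
            return err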

  14. MAGMA: analysis of two-channel microarrays made easy.

    PubMed

    Rehrauer, Hubert; Zoller, Stefan; Schlapbach, Ralph

    2007-07-01

    The web application MAGMA provides a simple and intuitive interface to identify differentially expressed genes from two-channel microarray data. While the underlying algorithms are not superior to those of similar web applications, MAGMA is particularly user friendly and can be used without prior training. The user interface guides the novice user through the most typical microarray analysis workflow, consisting of data upload, annotation, normalization and statistical analysis. It automatically generates R scripts that document MAGMA's entire data-processing workflow, thereby allowing the user to regenerate all results in a local R installation. The implementation of MAGMA follows the model-view-controller design pattern, which strictly separates the R-based statistical data processing, the web representation and the application logic. This modular design makes the application flexible and easily extendible by experts in one of the fields: statistical microarray analysis, web design or software development. State-of-the-art Java Server Faces technology was used to generate the web interface and to perform user input processing. MAGMA's object-oriented modular framework makes it easily extendible and applicable to other fields, and demonstrates that modern Java technology is also suitable for rather small and concise academic projects. MAGMA is freely available at www.magma-fgcz.uzh.ch. PMID:17517778

  15. Autonomous system for Web-based microarray image analysis.

    PubMed

    Bozinov, Daniel

    2003-12-01

    Software-based feature extraction from DNA microarray images still requires human intervention on various levels. Manual adjustment of grid and metagrid parameters, precise alignment of superimposed grid templates and gene spots, or simply identification of large-scale artifacts have to be performed beforehand to reliably analyze DNA signals and correctly quantify their expression values. Ideally, a Web-based system whose input is solely confined to a single microarray image, and whose output is a data table containing measurements for all gene spots, would directly transform raw image data into abstracted gene expression tables. Sophisticated algorithms with advanced procedures for iterative correction can overcome the inherent challenges in image processing. Herein is introduced an integrated software system with a Java-based interface on the client side that allows for decentralized access and, furthermore, enables the scientist to instantly employ the most updated software version at any given time. This software tool extends PixClust, as used in Extractiff, with Java Web Start deployment technology. Ultimately, this setup is destined for high-throughput pipelines in genome-wide medical diagnostics labs or microarray core facilities aimed at providing fully automated service to their users. PMID:15376911

  16. Classification of Microarray Data Using Kernel Fuzzy Inference System

    PubMed Central

    Kumar Rath, Santanu

    2014-01-01

    The DNA microarray classification technique has gained popularity in both research and practice. Real data, such as microarray data, contain huge numbers of insignificant and irrelevant features that tend to obscure useful information. Feature selection therefore retains the features with high significance and high relevance to the classes, since these determine the classification of samples into their respective classes. In this paper, the kernel fuzzy inference system (K-FIS) algorithm is applied to classify microarray data (leukemia), using the t-test as a feature selection method. Kernel functions are used to map original data points into a higher-dimensional (possibly infinite-dimensional) feature space defined by a (usually nonlinear) function ϕ through a mathematical process called the kernel trick. This paper also presents a comparative study of classification using K-FIS along with support vector machines (SVM) for different sets of features (genes). Performance parameters available in the literature, such as precision, recall, specificity, F-measure, ROC curve, and accuracy, are considered to analyze the efficiency of the classification model. The proposed approach shows that the K-FIS model obtains results similar to the SVM model, an indication that the proposed approach relies on the kernel function.
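
    The two ingredients named above, t-test feature selection and the kernel trick, can be sketched generically as follows (plain illustrations, not the K-FIS implementation; `k` and `gamma` are arbitrary values chosen here).

        import numpy as np
        from scipy import stats

        def top_genes_by_ttest(X, y, k=50):
            # Rank genes by the absolute two-sample t statistic (class 0 vs 1).
            t, _ = stats.ttest_ind(X[y == 0], X[y == 1], equal_var=False)
            return np.argsort(-np.abs(t))[:k]

        def rbf_kernel(A, B, gamma=0.01):
            # K(a, b) = exp(-gamma * ||a - b||^2): inner products in the
            # (possibly infinite-dimensional) feature space of phi,
            # computed without ever forming phi explicitly.
            sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
            return np.exp(-gamma * sq)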

  17. Algorithms and Algorithmic Languages.

    ERIC Educational Resources Information Center

    Veselov, V. M.; Koprov, V. M.

    This paper is intended as an introduction to a number of problems connected with the description of algorithms and algorithmic languages, particularly the syntaxes and semantics of algorithmic languages. The terms "letter, word, alphabet" are defined and described. The concept of the algorithm is defined and the relation between the algorithm and…

  18. Automatic image analysis and spot classification for detection of pathogenic Escherichia coli on glass slide DNA microarrays

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A computer algorithm was created to inspect scanned images from DNA microarray slides developed to rapidly detect and genotype E. coli O157 virulent strains. The algorithm computes centroid locations for signal and background pixels in RGB space and defines a plane perpendicular to the line connect...

  19. Prediction of rainfall time series using modular artificial neural networks coupled with data-preprocessing techniques

    NASA Astrophysics Data System (ADS)

    Wu, C. L.; Chau, K. W.; Fan, C.

    2010-07-01

    Summary This study is an attempt to seek a relatively optimal data-driven model for rainfall forecasting from three aspects: model inputs, modeling methods, and data-preprocessing techniques. Four rain data records from different regions, namely two monthly and two daily series, are examined. A comparison of seven input techniques, either linear or nonlinear, indicates that linear correlation analysis (LCA) is capable of identifying model inputs reasonably. A proposed model, modular artificial neural network (MANN), is compared with three benchmark models, viz. artificial neural network (ANN), K-nearest-neighbors (K-NN), and linear regression (LR). Prediction is performed in the context of two modes including normal mode (viz., without data preprocessing) and data preprocessing mode. Results from the normal mode indicate that MANN performs the best among all four models, but the advantage of MANN over ANN is not significant in monthly rainfall series forecasting. Under the data preprocessing mode, each of LR, K-NN and ANN is respectively coupled with three data-preprocessing techniques including moving average (MA), principal component analysis (PCA), and singular spectrum analysis (SSA). Results indicate that the improvement of model performance generated by SSA is considerable whereas those of MA or PCA are slight. Moreover, when MANN is coupled with SSA, results show that advantages of MANN over other models are quite noticeable, particularly for daily rainfall forecasting. Therefore, the proposed optimal rainfall forecasting model can be derived from MANN coupled with SSA.
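
    Singular spectrum analysis, the preprocessing technique found most beneficial above, amounts to embedding the series, decomposing by SVD, and reconstructing from the leading components. A compact sketch follows; the window length and component count are illustrative assumptions, not the study's settings.

        import numpy as np

        def ssa_smooth(x, window=12, n_components=2):
            # Embed the series into an L x K trajectory matrix, keep the
            # leading SVD components, and Hankelize back to a series.
            x = np.asarray(x, dtype=float)
            N, L = len(x), window
            K = N - L + 1
            traj = np.column_stack([x[i:i + L] for i in range(K)])
            U, s, Vt = np.linalg.svd(traj, full_matrices=False)
            approx = (U[:, :n_components] * s[:n_components]) @ Vt[:n_components]
            out = np.zeros(N)
            cnt = np.zeros(N)
            for i in range(L):                 # average over anti-diagonals
                for j in range(K):
                    out[i + j] += approx[i, j]
                    cnt[i + j] += 1
            return out / cnt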

  20. On the Development of Parafoveal Preprocessing: Evidence from the Incremental Boundary Paradigm

    PubMed Central

    Marx, Christina; Hutzler, Florian; Schuster, Sarah; Hawelka, Stefan

    2016-01-01

    Parafoveal preprocessing of upcoming words and the resultant preview benefit are key aspects of fluent reading. Evidence regarding the development of parafoveal preprocessing during reading acquisition, however, is scarce. The present developmental (cross-sectional) eye tracking study estimated the magnitude of parafoveal preprocessing of beginning readers with a novel variant of the classical boundary paradigm. Additionally, we assessed the association of parafoveal preprocessing with several reading-related psychometric measures. The participants were children learning to read the regular German orthography with about 1, 3, and 5 years of formal reading instruction (Grade 2, 4, and 6, respectively). We found evidence of parafoveal preprocessing in each Grade. However, an effective use of parafoveal information was related to the individual reading fluency of the participants (i.e., the reading rate expressed as words-per-minute) which substantially overlapped between the Grades. The size of the preview benefit was furthermore associated with the children’s performance in rapid naming tasks and with their performance in a pseudoword reading task. The latter task assessed the children’s efficiency in phonological decoding and our findings show that the best decoders exhibited the largest preview benefit. PMID:27148123

  2. Effect of data normalization on fuzzy clustering of DNA microarray data

    PubMed Central

    Kim, Seo Young; Lee, Jae Won; Bae, Jong Sung

    2006-01-01

    Background Microarray technology has made it possible to simultaneously measure the expression levels of large numbers of genes in a short time. Gene expression data is information rich; however, extensive data mining is required to identify the patterns that characterize the underlying mechanisms of action. Clustering is an important tool for finding groups of genes with similar expression patterns in microarray data analysis. However, hard clustering methods, which assign each gene exactly to one cluster, are poorly suited to the analysis of microarray datasets because in such datasets the clusters of genes frequently overlap. Results In this study we applied the fuzzy partitional clustering method known as Fuzzy C-Means (FCM) to overcome the limitations of hard clustering. To identify the effect of data normalization, we used three normalization methods, the two common scale and location transformations and the Lowess normalization method, to normalize three microarray datasets and three simulated datasets. First, we determined the optimal parameters for FCM clustering. We found that the optimal fuzzification parameter in the FCM analysis of a microarray dataset depended on the normalization method applied to the dataset during preprocessing. We additionally evaluated the effect of normalization of noisy datasets on the results obtained when hard clustering or FCM clustering was applied to those datasets. The effects of normalization were evaluated using both simulated datasets and microarray datasets. A comparative analysis showed that the clustering results depended on the normalization method used and the noisiness of the data. In particular, the selection of the fuzzification parameter value for the FCM method was sensitive to the normalization method used for datasets with large variations across samples. Conclusion Lowess normalization is more robust for clustering of genes from general microarray data than the two common scale and location adjustment methods.
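
    Fuzzy C-Means itself alternates two closed-form updates, sketched below with an arbitrary fuzzification parameter m and no convergence test; per the abstract, the best choice of m depends on how the data were normalized beforehand.

        import numpy as np

        def fuzzy_c_means(X, c=3, m=2.0, n_iter=100, seed=0):
            # Alternate the membership and centroid updates of standard FCM.
            X = np.asarray(X, dtype=float)
            rng = np.random.default_rng(seed)
            U = rng.dirichlet(np.ones(c), size=len(X))       # soft memberships
            for _ in range(n_iter):
                W = U ** m
                centers = (W.T @ X) / W.sum(axis=0)[:, None]
                d = np.linalg.norm(X[:, None, :] - centers[None], axis=2) + 1e-12
                U = d ** (-2.0 / (m - 1.0))                  # u_ik ~ d_ik^(-2/(m-1))
                U /= U.sum(axis=1, keepdims=True)
            return centers, U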

  3. Generation of attributes for learning algorithms

    SciTech Connect

    Hu, Yuh-Jyh; Kibler, D.

    1996-12-31

    Inductive algorithms rely strongly on their representational biases. Constructive induction can mitigate representational inadequacies. This paper introduces the notion of a relative gain measure and describes a new constructive induction algorithm (GALA) which is independent of the learning algorithm. Unlike most previous research on constructive induction, our methods are designed as a preprocessing step before standard machine learning algorithms are applied. We present results which demonstrate the effectiveness of GALA on artificial and real domains for several learners: C4.5, CN2, perceptron and backpropagation.

  4. Microarrays Made Simple: "DNA Chips" Paper Activity

    ERIC Educational Resources Information Center

    Barnard, Betsy

    2006-01-01

    DNA microarray technology is revolutionizing biological science. DNA microarrays (also called DNA chips) allow simultaneous screening of many genes for changes in expression between different cells. Now researchers can obtain information about genes in days or weeks that used to take months or years. The paper activity described in this article…

  5. Protein-Based Microarray for the Detection of Pathogenic Bacteria

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Microarrays have been used for gene expression and protein interaction studies, but recently, multianalyte diagnostic assays have employed the microarray platform. We developed a microarray immunoassay for bacteria, with biotinylated capture antibodies on streptavidin slides. To complete the fluor...

  6. Tissue Microarrays in Clinical Oncology

    PubMed Central

    Voduc, David; Kenney, Challayne; Nielsen, Torsten O.

    2008-01-01

    The tissue microarray (TMA) is a recently implemented, high-throughput technology for the analysis of molecular markers in oncology. This research tool permits the rapid assessment of a biomarker in thousands of tumor samples, using commonly available laboratory assays such as immunohistochemistry and in-situ hybridization. Although introduced less than a decade ago, the TMA has proven to be invaluable in the study of tumor biology, the development of diagnostic tests, and the investigation of oncological biomarkers. This review describes the impact of TMA-based research in clinical oncology and its potential future applications. Technical aspects of TMA construction, and the advantages and disadvantages inherent to this technology, are also discussed. PMID:18314063

  7. DNA Microarrays for Identifying Fishes

    PubMed Central

    Nölte, M.; Weber, H.; Silkenbeumer, N.; Hjörleifsdottir, S.; Hreggvidsson, G. O.; Marteinsson, V.; Kappel, K.; Planes, S.; Tinti, F.; Magoulas, A.; Garcia Vazquez, E.; Turan, C.; Hervet, C.; Campo Falgueras, D.; Antoniou, A.; Landi, M.; Blohm, D.

    2008-01-01

    In many cases marine organisms and especially their diverse developmental stages are difficult to identify by morphological characters. DNA-based identification methods offer an analytically powerful addition or even an alternative. In this study, a DNA microarray has been developed to be able to investigate its potential as a tool for the identification of fish species from European seas based on mitochondrial 16S rDNA sequences. Eleven commercially important fish species were selected for a first prototype. Oligonucleotide probes were designed based on the 16S rDNA sequences obtained from 230 individuals of 27 fish species. In addition, more than 1200 sequences of 380 species served as sequence background against which the specificity of the probes was tested in silico. Single target hybridisations with Cy5-labelled, PCR-amplified 16S rDNA fragments from each of the 11 species on microarrays containing the complete set of probes confirmed their suitability. True-positive, fluorescence signals obtained were at least one order of magnitude stronger than false-positive cross-hybridisations. Single nontarget hybridisations resulted in cross-hybridisation signals at approximately 27% of the cases tested, but all of them were at least one order of magnitude lower than true-positive signals. This study demonstrates that the 16S rDNA gene is suitable for designing oligonucleotide probes, which can be used to differentiate 11 fish species. These data are a solid basis for the second step to create a “Fish Chip” for approximately 50 fish species relevant in marine environmental and fisheries research, as well as control of fisheries products. PMID:18270778

  8. Preprocessed barley, rye, and triticale as a feedstock for an integrated fuel ethanol-feedlot plant

    SciTech Connect

    Sosulski, K.; Wang, Sunmin; Ingledew, W.M.

    1997-12-31

    Rye, triticale, and barley were evaluated as starch feedstocks to replace wheat for ethanol production. Preprocessing of grain by abrasion on a Satake mill reduced fiber and increased starch concentrations in the feedstock for fermentation. Higher concentrations of starch in flours from preprocessed cereal grains would increase plant throughput by 8-23%, since more starch is processed in the same weight of feedstock. Increased concentrations of starch for fermentation resulted in higher concentrations of ethanol in beer. Energy requirements to produce one L of ethanol from preprocessed grains were reduced: natural gas by 3.5-11.4% and power consumption by 5.2-15.6%. 7 refs., 7 figs., 4 tabs.

  9. Optimization of Preprocessing and Densification of Sorghum Stover at Full-scale Operation

    SciTech Connect

    Neal A. Yancey; Jaya Shankar Tumuluru; Craig C. Conner; Christopher T. Wright

    2011-08-01

    Transportation costs can be a prohibitive step in bringing biomass to a preprocessing location or biofuel refinery. One alternative to transporting biomass in baled or loose format to a preprocessing location is to utilize a mobile preprocessing system that can be relocated to the various locations where biomass is stored, preprocess and densify the biomass, then ship it to the refinery as needed. The Idaho National Laboratory has a full-scale 'Process Demonstration Unit' (PDU), which includes a stage-1 grinder, hammer mill, drier, pellet mill, and cooler, with the associated conveyance-system components. Testing at bench and pilot scale has been conducted to determine the effects of moisture and crop variety on preprocessing efficiency and product quality. The INL's PDU provides an opportunity to test, on full industrial-scale systems, the conclusions made at the bench and pilot scale. Each component of the PDU is operated from a central operating station where data are collected to determine power consumption rates for each step in the process. The power for each electrical motor in the system is monitored from the control station to watch for problems and determine optimal conditions for system performance. The data can then be viewed to observe how changes in biomass input parameters (moisture and crop type, for example), mechanical changes (screen size, biomass drying, pellet size, grinding speed, etc.), or other variations affect the power consumption of the system. Sorghum in four-foot round bales was tested in the system using a series of six different screen sizes: 3/16 in., 1 in., 2 in., 3 in., 4 in., and 6 in. The effects on power consumption, product quality, and production rate were measured to determine optimal conditions.

  10. Probe Region Expression Estimation for RNA-Seq Data for Improved Microarray Comparability

    PubMed Central

    Uziela, Karolis; Honkela, Antti

    2015-01-01

    Rapidly growing public gene expression databases contain a wealth of data for building an unprecedentedly detailed picture of human biology and disease. This data comes from many diverse measurement platforms that make integrating it all difficult. Although RNA-sequencing (RNA-seq) is attracting the most attention, at present, the rate of new microarray studies submitted to public databases far exceeds the rate of new RNA-seq studies. There is clearly a need for methods that make it easier to combine data from different technologies. In this paper, we propose a new method for processing RNA-seq data that yields gene expression estimates that are much more similar to corresponding estimates from microarray data, hence greatly improving cross-platform comparability. The method we call PREBS is based on estimating the expression from RNA-seq reads overlapping the microarray probe regions, and processing these estimates with standard microarray summarisation algorithms. Using paired microarray and RNA-seq samples from TCGA LAML data set we show that PREBS expression estimates derived from RNA-seq are more similar to microarray-based expression estimates than those from other RNA-seq processing methods. In an experiment to retrieve paired microarray samples from a database using an RNA-seq query sample, gene signatures defined based on PREBS expression estimates were found to be much more accurate than those from other methods. PREBS also allows new ways of using RNA-seq data, such as expression estimation for microarray probe sets. An implementation of the proposed method is available in the Bioconductor package “prebs.” PMID:25966034
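
    The core PREBS idea, scoring microarray probe regions by the RNA-seq reads overlapping them, can be sketched with a toy counter (single chromosome, fixed read length; the actual package works on genomic alignments and feeds the values into standard microarray summarisation such as RMA).

        import numpy as np

        def probe_region_log_counts(read_starts, read_len, probes):
            # Count reads overlapping each probe region [a, b): a read
            # [s, s + read_len) overlaps when s < b and s + read_len > a.
            starts = np.sort(np.asarray(read_starts))
            counts = []
            for a, b in probes:
                lo = np.searchsorted(starts, a - read_len + 1, side="left")
                hi = np.searchsorted(starts, b, side="left")
                counts.append(hi - lo)
            # log2 counts stand in for probe intensities before summarisation.
            return np.log2(np.asarray(counts, dtype=float) + 1.0)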

  11. Boosting model performance and interpretation by entangling preprocessing selection and variable selection.

    PubMed

    Gerretzen, Jan; Szymańska, Ewa; Bart, Jacob; Davies, Antony N; van Manen, Henk-Jan; van den Heuvel, Edwin R; Jansen, Jeroen J; Buydens, Lutgarde M C

    2016-09-28

    The aim of data preprocessing is to remove data artifacts (such as a baseline, scatter effects or noise) and to enhance the contextually relevant information. Many preprocessing methods exist to deliver one or more of these benefits, but which method or combination of methods should be used for the specific data being analyzed is difficult to select. Recently, we have shown that a preprocessing selection approach based on Design of Experiments (DoE) enables correct selection of highly appropriate preprocessing strategies within reasonable time frames. In that approach, the focus was solely on improving the predictive performance of the chemometric model. This is, however, only one of the two relevant criteria in modeling: interpretation of the model results can be just as important. Variable selection is often used to achieve such interpretation. Data artifacts, however, may hamper proper variable selection by masking the truly relevant variables. The choice of preprocessing therefore has a huge impact on the outcome of variable selection methods and may thus hamper an objective interpretation of the final model. To enhance such objective interpretation, we here integrate variable selection into the preprocessing selection approach based on DoE. We show that the entanglement of preprocessing selection and variable selection improves not only the interpretation, but also the predictive performance of the model. This is achieved by analyzing several experimental data sets for which the truly relevant variables are available as prior knowledge. We show that a selection of variables is provided that complies more with the true informative variables compared to individual optimization of both model aspects. Importantly, the approach presented in this work is generic. Different types of models (e.g. PCR, PLS, …) can be incorporated into it, as well as different variable selection methods and different preprocessing methods, according to the taste and experience of
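
    In spirit, the selection approach scores combinations of candidate preprocessing steps by cross-validated model performance. The sketch below uses brute-force enumeration in place of the DoE design, omits the variable-selection stage, and invents two stand-in steps (`snv`, `detrend`) and a regression response; it illustrates the search loop only.

        import itertools
        import numpy as np
        from sklearn.linear_model import RidgeCV
        from sklearn.model_selection import cross_val_score

        def snv(X):        # standard normal variate: per-spectrum scatter correction
            return (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)

        def detrend(X):    # subtract a per-spectrum linear baseline
            t = np.arange(X.shape[1])
            coef = np.polynomial.polynomial.polyfit(t, X.T, 1)
            return X - np.polynomial.polynomial.polyval(t, coef)

        STEPS = {"snv": snv, "detrend": detrend}

        def pick_preprocessing(X, y, cv=5):
            # Score every on/off combination of candidate steps by CV.
            best = ((), -np.inf)
            for r in range(len(STEPS) + 1):
                for combo in itertools.combinations(STEPS, r):
                    Z = X.copy()
                    for name in combo:
                        Z = STEPS[name](Z)
                    score = cross_val_score(RidgeCV(), Z, y, cv=cv).mean()
                    if score > best[1]:
                        best = (combo, score)
            return best   # (chosen steps, CV score)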

  12. View and design of basic element for smart imagers with image preprocessing

    NASA Astrophysics Data System (ADS)

    Shilin, Victor A.

    2005-06-01

    This paper is devoted to a view of basic elements for smart imagers and discusses their principle of work: CMOS APS imagers with focal-plane parallel image preprocessing for smart technical vision and electro-optical systems based on neural implementation. Using an analysis of the main features of biological vision, the desired artificial vision characteristics are defined. Image-processing tasks that can be implemented by smart focal-plane preprocessing CMOS imagers with neural networks are determined. The eventual results are important for medicine and aerospace ecological monitoring, as are the complexity of, and ways toward, CMOS APS neural-net implementation.

  13. Influence of Hemp Fibers Pre-processing on Low Density Polyethylene Matrix Composites Properties

    NASA Astrophysics Data System (ADS)

    Kukle, S.; Vidzickis, R.; Zelca, Z.; Belakova, D.; Kajaks, J.

    2016-04-01

    In the present research, LLDPE matrix composites reinforced with short hemp fibres, with fibre content in the range from 30 to 50 wt% and subjected to four different pre-processing technologies, were produced, and properties such as tensile strength and elongation at break, tensile modulus, melt flow index, micro-hardness and water absorption dynamics were investigated. Capillary viscosimetry was used for fluidity evaluation, and the melt flow index (MFI) was evaluated for all variants. The MFI for two of the fibre pre-processing variants was high enough to allow increasing the hemp fibre content from 30 to 50 wt% with a moderate increase in water sorption capability.

  14. PAA: an R/bioconductor package for biomarker discovery with protein microarrays

    PubMed Central

    Turewicz, Michael; Ahrens, Maike; May, Caroline; Marcus, Katrin; Eisenacher, Martin

    2016-01-01

    Summary: The R/Bioconductor package Protein Array Analyzer (PAA) facilitates a flexible analysis of protein microarrays for biomarker discovery (esp., ProtoArrays). It provides a complete data analysis workflow including preprocessing and quality control, uni- and multivariate feature selection as well as several different plots and results tables to outline and evaluate the analysis results. As a main feature, PAA’s multivariate feature selection methods are based on recursive feature elimination (e.g. SVM-recursive feature elimination, SVM-RFE) with stability ensuring strategies such as ensemble feature selection. This enables PAA to detect stable and reliable biomarker candidate panels. Availability and implementation: PAA is freely available (BSD 3-clause license) from http://www.bioconductor.org/packages/PAA/. Contact: michael.turewicz@rub.de or martin.eisenacher@rub.de PMID:26803161
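
    PAA itself is R/Bioconductor, but the SVM-RFE backbone it builds on is easy to outline; below is a plain, binary-class skeleton in Python (without PAA's ensemble/stability layer), dropping a fraction of the lowest-|weight| features per round.

        import numpy as np
        from sklearn.svm import LinearSVC

        def svm_rfe(X, y, n_keep=20, drop_frac=0.1):
            # Recursive feature elimination with a linear SVM (binary y assumed,
            # so coef_ has a single row of feature weights).
            keep = np.arange(X.shape[1])
            while len(keep) > n_keep:
                w = LinearSVC(dual=False).fit(X[:, keep], y).coef_.ravel()
                n_drop = max(1, min(int(drop_frac * len(keep)), len(keep) - n_keep))
                keep = keep[np.argsort(np.abs(w))[n_drop:]]
            return keep   # indices of surviving features (candidate biomarkers)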

  15. Genetic programming based ensemble system for microarray data classification.

    PubMed

    Liu, Kun-Hong; Tong, Muchenxuan; Xie, Shu-Tong; Ng, Vincent To Yee

    2015-01-01

    Recently, more and more machine learning techniques have been applied to microarray data analysis. The aim of this study is to propose a genetic programming (GP) based new ensemble system (named GPES), which can be used to effectively classify different types of cancers. Decision trees are deployed as base classifiers in this ensemble framework with three operators: Min, Max, and Average. Each individual of the GP is an ensemble system, and they become more and more accurate in the evolutionary process. The feature selection technique and balanced subsampling technique are applied to increase the diversity in each ensemble system. The final ensemble committee is selected by a forward search algorithm, which is shown to be capable of fitting data automatically. The performance of GPES is evaluated using five binary class and six multiclass microarray datasets, and results show that the algorithm can achieve better results in most cases compared with some other ensemble systems. By using elaborate base classifiers or applying other sampling techniques, the performance of GPES may be further improved. PMID:25810748
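
    The Min/Max/Average operators act on the class-probability outputs of the decision-tree base classifiers. Below is a simplified, non-evolutionary sketch of that fusion step; the subsampling is plain random rather than GPES's balanced scheme, and it assumes every subsample contains all classes so the probability columns align.

        import numpy as np
        from sklearn.tree import DecisionTreeClassifier

        OPERATORS = {
            "min": lambda P: P.min(axis=0),
            "max": lambda P: P.max(axis=0),
            "avg": lambda P: P.mean(axis=0),
        }

        def fit_subsampled_trees(X, y, n_trees=11, frac=0.7, seed=0):
            # Diversity via random subsampling (GPES uses balanced subsampling).
            rng = np.random.default_rng(seed)
            trees = []
            for _ in range(n_trees):
                idx = rng.choice(len(X), size=int(frac * len(X)), replace=False)
                trees.append(DecisionTreeClassifier(max_depth=3).fit(X[idx], y[idx]))
            return trees

        def ensemble_predict(trees, X, op="avg"):
            # Stack per-tree class probabilities and fuse with one operator.
            P = np.stack([t.predict_proba(X) for t in trees])   # (trees, n, classes)
            return OPERATORS[op](P).argmax(axis=1)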

  16. Validation of analytical breast cancer microarray analysis in medical laboratory.

    PubMed

    Darweesh, Amal Said; Louka, Manal Louis; Hana, Maha; Rashad, Shaymaa; El-Shinawi, Mohamed; Sharaf-Eldin, Ahmed; Kassim, Samar Kamal

    2014-10-01

    A previously reported microarray data analysis of breast cancer by the RISS algorithm showed over-expression of the growth factor receptor-bound protein 7 (Grb7) gene, and it also highlighted the Tweety (TTYH1) gene as under-expressed in breast cancer for the first time. Our aim was to validate the results obtained from the microarray analysis with respect to these genes. The relationship between their expression and different prognostic indicators was also addressed. RNA was extracted from the breast tissue of 30 patients with primary malignant breast cancer. Control samples from the same patients were harvested at a distance of ≥5 cm from the tumour. Semi-quantitative RT-PCR analysis was done on all samples. There was a significant difference between the malignant and control tissues as regards Grb7 expression, which was significantly related to the presence of lymph node metastasis and to the stage and histological grade of the malignant tumours. There was a significant inverse relation between the expression of Grb7 and the expression of both oestrogen and progesterone receptors. Grb7 was found to be significantly related to the biological classification of breast cancer. TTYH1 was not expressed in either the malignant or the control samples. The RISS algorithm developed by our group was laboratory-validated for Grb7, but not for TTYH1. The newly developed software tool needs to be improved. PMID:25182704

  17. MARS: Microarray analysis, retrieval, and storage system

    PubMed Central

    Maurer, Michael; Molidor, Robert; Sturn, Alexander; Hartler, Juergen; Hackl, Hubert; Stocker, Gernot; Prokesch, Andreas; Scheideler, Marcel; Trajanoski, Zlatko

    2005-01-01

    Background Microarray analysis has become a widely used technique for the study of gene-expression patterns on a genomic scale. As more and more laboratories are adopting microarray technology, there is a need for powerful and easy-to-use microarray databases facilitating array fabrication, labeling, hybridization, and data analysis. The wealth of data generated by this high-throughput approach renders adequate database and analysis tools crucial for the pursuit of insights into the transcriptomic behavior of cells. Results MARS (Microarray Analysis and Retrieval System) provides a comprehensive MIAME-supportive suite for storing, retrieving, and analyzing multi-color microarray data. The system comprises a laboratory information management system (LIMS), quality-control management, and a sophisticated user-management system. MARS is fully integrated into an analytical pipeline of microarray image analysis, normalization, gene expression clustering, and mapping of gene expression data onto biological pathways. The incorporation of ontologies and the use of MAGE-ML enable the export of studies stored in MARS to public repositories and other databases accepting these documents. Conclusion We have developed an integrated system tailored to serve the specific needs of microarray-based research projects using a unique fusion of Web-based and standalone applications connected to the latest J2EE application-server technology. The presented system is freely available for academic and non-profit institutions. More information can be found at . PMID:15836795

  18. Microarray-integrated optoelectrofluidic immunoassay system.

    PubMed

    Han, Dongsik; Park, Je-Kyun

    2016-05-01

    A microarray-based analytical platform has been utilized as a powerful tool in biological assay fields. However, an analyte depletion problem due to the slow mass transport based on molecular diffusion causes low reaction efficiency, resulting in a limitation for practical applications. This paper presents a novel method to improve the efficiency of microarray-based immunoassay via an optically induced electrokinetic phenomenon by integrating an optoelectrofluidic device with a conventional glass slide-based microarray format. A sample droplet was loaded between the microarray slide and the optoelectrofluidic device on which a photoconductive layer was deposited. Under the application of an AC voltage, optically induced AC electroosmotic flows caused by a microarray-patterned light actively enhanced the mass transport of target molecules at the multiple assay spots of the microarray simultaneously, which reduced tedious reaction time from more than 30 min to 10 min. Based on this enhancing effect, a heterogeneous immunoassay with a tiny volume of sample (5 μl) was successfully performed in the microarray-integrated optoelectrofluidic system using immunoglobulin G (IgG) and anti-IgG, resulting in improved efficiency compared to the static environment. Furthermore, the application of multiplex assays was also demonstrated by multiple protein detection. PMID:27190571

  19. Chromatographic alignment by warping and dynamic programming as a pre-processing tool for PARAFAC modelling of liquid chromatography-mass spectrometry data.

    PubMed

    Bylund, Dan; Danielsson, Rolf; Malmquist, Gunnar; Markides, Karin E

    2002-07-01

    Solutes analysed with LC-MS are characterised by their retention times and mass spectra, and quantified by the intensities measured. This highly selective information can be extracted by multiway modelling. However, for full use and interpretability it is necessary that the assumptions made for the model are valid. For PARAFAC modelling, the assumption is a trilinear data structure. With LC-MS, several factors, e.g. non-linear detector response and ionisation suppression may introduce deviations from trilinearity. The single largest problem, however, is the retention time shifts not related to the true sample variations. In this paper, a time warping algorithm for alignment of LC-MS data in the chromatographic direction has been examined. Several refinements have been implemented and the features are demonstrated for both simulated and real data. With moderate time shifts present in the data, pre-processing with this algorithm yields approximately trilinear data for which reasonable models can be made. PMID:12184621
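
    The unconstrained core of such an alignment, dynamic programming over a cumulative cost matrix (classic dynamic time warping), is sketched below; the refinements that matter in practice for LC-MS (segment constraints, slack limits, spectral weighting) are omitted.

        import numpy as np

        def dtw_path(a, b):
            # Align two chromatographic traces; returns (i, j) index pairs.
            n, m = len(a), len(b)
            D = np.full((n + 1, m + 1), np.inf)
            D[0, 0] = 0.0
            for i in range(1, n + 1):
                for j in range(1, m + 1):
                    cost = (a[i - 1] - b[j - 1]) ** 2
                    D[i, j] = cost + min(D[i - 1, j - 1], D[i - 1, j], D[i, j - 1])
            path, i, j = [], n, m          # trace the optimal warping back
            while i > 0 and j > 0:
                path.append((i - 1, j - 1))
                step = int(np.argmin([D[i - 1, j - 1], D[i - 1, j], D[i, j - 1]]))
                if step == 0:
                    i, j = i - 1, j - 1
                elif step == 1:
                    i -= 1
                else:
                    j -= 1
            return path[::-1]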

  20. DNA Microarrays in Herbal Drug Research

    PubMed Central

    Chavan, Preeti; Joshi, Kalpana; Patwardhan, Bhushan

    2006-01-01

    Natural products are gaining increased applications in drug discovery and development. Being chemically diverse they are able to modulate several targets simultaneously in a complex system. Analysis of gene expression becomes necessary for better understanding of molecular mechanisms. Conventional strategies for expression profiling are optimized for single gene analysis. DNA microarrays serve as suitable high throughput tool for simultaneous analysis of multiple genes. Major practical applicability of DNA microarrays remains in DNA mutation and polymorphism analysis. This review highlights applications of DNA microarrays in pharmacodynamics, pharmacogenomics, toxicogenomics and quality control of herbal drugs and extracts. PMID:17173108

  1. Progress in the application of DNA microarrays.

    PubMed Central

    Lobenhofer, E K; Bushel, P R; Afshari, C A; Hamadeh, H K

    2001-01-01

    Microarray technology has been applied to a variety of different fields to address fundamental research questions. The use of microarrays, or DNA chips, to study the gene expression profiles of biologic samples began in 1995. Since that time, the fundamental concepts behind the chip, the technology required for making and using these chips, and the multitude of statistical tools for analyzing the data have been extensively reviewed. For this reason, the focus of this review will be not on the technology itself but on the application of microarrays as a research tool and the future challenges of the field. PMID:11673116

  2. Pipeline for macro- and microarray analyses.

    PubMed

    Vicentini, R; Menossi, M

    2007-05-01

    The pipeline for macro- and microarray analyses (PMmA) is a set of scripts with a web interface developed to analyze DNA array data generated by array image quantification software. PMmA is designed for use with single- or double-color array data and works as a pipeline with five classes (data format, normalization, data analysis, clustering, and array maps). It can also be used as a plugin in the BioArray Software Environment, an open-source database for array analysis, or in a local version of the web service. All scripts in PMmA were developed in the PERL programming language, and the statistical analysis functions were implemented in the R statistical language. Consequently, our package is platform-independent software. Our algorithms can correctly select almost 90% of the differentially expressed genes, showing superior performance compared to other methods of analysis. The pipeline software has been applied to public macroarray data of 1536 sugarcane expressed sequence tags from plants exposed to cold for 3 to 48 h. PMmA identified thirty cold-responsive genes previously unidentified in this public dataset. Fourteen genes were up-regulated, two showed variable expression and the other fourteen were down-regulated in the treatments. These new findings certainly were a consequence of using a superior statistical analysis approach, since the original study did not take into account the dependence of data variability on the average signal intensity of each gene. The web interface, supplementary information, and the package source code are available, free, to non-commercial users at http://ipe.cbmeg.unicamp.br/pub/PMmA. PMID:17464422

  3. Parafoveal Preprocessing in Reading Revisited: Evidence from a Novel Preview Manipulation

    ERIC Educational Resources Information Center

    Gagl, Benjamin; Hawelka, Stefan; Richlan, Fabio; Schuster, Sarah; Hutzler, Florian

    2014-01-01

    The study investigated parafoveal preprocessing by the means of the classical invisible boundary paradigm and a novel manipulation of the parafoveal previews (i.e., visual degradation). Eye movements were investigated on 5-letter target words with constraining (i.e., highly informative) initial letters or similarly constraining final letters.…

  4. Integrated Multi-Strategic Web Document Pre-Processing for Sentence and Word Boundary Detection.

    ERIC Educational Resources Information Center

    Shim, Junhyeok; Kim, Dongseok; Cha, Jeongwon; Lee, Gary Geunbae; Seo, Jungyun

    2002-01-01

    Discussion of natural language processing focuses on a multi-strategic integrated text preprocessing method for difficult problems of sentence boundary disambiguation and word boundary disambiguation of Web texts. Describes an evaluation of the method using Korean Web document collections. (Author/LRW)

  5. Pre-processing SAR image stream to facilitate compression for transport on bandwidth-limited-link

    SciTech Connect

    Rush, Bobby G.; Riley, Robert

    2015-09-29

    Pre-processing is applied to a raw VideoSAR (or similar near-video rate) product to transform the image frame sequence into a product that resembles more closely the type of product for which conventional video codecs are designed, while sufficiently maintaining utility and visual quality of the product delivered by the codec.

  6. Multi-wavelength aerosol LIDAR signal pre-processing: practical considerations

    NASA Astrophysics Data System (ADS)

    Rodríguez-Gómez, A.; Rocadenbosch, F.; Sicard, M.; Lange, D.; Barragán, R.; Batet, O.; Comerón, A.; López Márquez, M. A.; Muñoz-Porcar, C.; Tiana, J.; Tomás, S.

    2015-12-01

    Elastic lidars provide range-resolved information about the aerosol content in the atmosphere. Nevertheless, a number of pre-processing techniques need to be applied before performing the inversion of the detected signal: range correction, time averaging, photon-counting channel dead-time correction, overlap correction, Rayleigh fitting and gluing of both channels.
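
    Two of the listed steps are simple enough to sketch: range correction and the gluing of the analog and photon-counting channels of the same profile (here via a linear rescaling fitted in an assumed overlap window; real systems choose this window from signal-level criteria).

        import numpy as np

        def range_correct(signal, r):
            # Range-corrected lidar signal P(r) * r^2 (background already subtracted).
            return signal * r ** 2

        def glue(analog, photon, lo, hi):
            # Fit photon ~ a * analog + b in the overlap bins [lo:hi], rescale
            # the analog channel, and splice the two channels together.
            a, b = np.polyfit(analog[lo:hi], photon[lo:hi], 1)
            scaled = a * analog + b
            return np.concatenate([scaled[:hi], photon[hi:]])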

  7. Integrating Microarray Data and GRNs.

    PubMed

    Koumakis, L; Potamias, G; Tsiknakis, M; Zervakis, M; Moustakis, V

    2016-01-01

    With the completion of the Human Genome Project and the emergence of high-throughput technologies, a vast amount of molecular and biological data are being produced. Two of the most important and significant data sources come from microarray gene-expression experiments and respective databanks (e.g., Gene Expression Omnibus-GEO (http://www.ncbi.nlm.nih.gov/geo)), and from molecular pathways and Gene Regulatory Networks (GRNs) stored and curated in public (e.g., Kyoto Encyclopedia of Genes and Genomes-KEGG (http://www.genome.jp/kegg/pathway.html), Reactome (http://www.reactome.org/ReactomeGWT/entrypoint.html)) as well as in commercial repositories (e.g., Ingenuity IPA (http://www.ingenuity.com/products/ipa)). The association of these two sources aims to give new insight into disease understanding and to reveal new molecular targets in the treatment of specific phenotypes. Three major research lines and respective efforts that try to utilize and combine data from both of these sources can be identified, namely: (1) de novo reconstruction of GRNs, (2) identification of gene signatures, and (3) identification of differentially expressed GRN functional paths (i.e., sub-GRN paths that distinguish between different phenotypes). In this chapter, we give an overview of the existing methods that support the different types of gene-expression and GRN integration, with a focus on methodologies that aim to identify phenotype-discriminant GRNs or subnetworks, and we also present our methodology. PMID:26134183

  8. DNA microarrays in prostate cancer.

    PubMed

    Ho, Shuk-Mei; Lau, Kin-Mang

    2002-02-01

    DNA microarray technology provides a means to examine large numbers of molecular changes related to a biological process in a high-throughput manner. This review discusses plausible utilities of this technology in prostate cancer research, including definition of prostate cancer predisposition, global profiling of gene expression patterns associated with cancer initiation and progression, identification of new diagnostic and prognostic markers, and discovery of novel patient classification schemes. The technology, at present, has been explored only in a limited fashion in prostate cancer research. Some hurdles to be overcome are the high cost of the technology, insufficient sample size and replication, and the inadequate use of bioinformatics. With the completion of the Human Genome Project and the advance of several highly complementary technologies, such as laser capture microdissection, unbiased RNA amplification, customized functional arrays (e.g., single-nucleotide polymorphism chips), and amenable bioinformatics software, this technology will become widely used by investigators in the field. The large amount of novel, unbiased hypotheses and insights generated by this technology is expected to have a significant impact on the diagnosis, treatment, and prevention of prostate cancer. Finally, this review emphasizes existing, but currently underutilized, data-mining tools, such as multivariate statistical analyses, neural networking, and machine learning techniques, to stimulate wider usage. PMID:12084220

  9. Increasing peptide identifications and decreasing search times for ETD spectra by pre-processing and calculation of parent precursor charge

    PubMed Central

    2012-01-01

    Background Electron Transfer Dissociation [ETD] can dissociate multiply charged precursor polypeptides, providing extensive peptide backbone cleavage. ETD spectra contain charge-reduced precursor peaks, usually of high intensity, whose pattern depends on the parent precursor charge. These charge-reduced precursor peaks and associated neutral-loss peaks should be removed before these spectra are searched for peptide identifications. ETD spectra can also contain ion-types other than c and z˙. Modifying search strategies to accommodate these ion-types may aid in increased peptide identifications. Additionally, if the precursor mass is measured using a lower-resolution instrument such as a linear ion trap, the charge of the precursor is often not known, reducing sensitivity and increasing search times. We implemented algorithms to remove these precursor peaks, accommodate new ion-types in the noise-filtering routine of OMSSA, and estimate any unknown precursor charge using Linear Discriminant Analysis [LDA]. Results Spectral pre-processing to remove precursor peaks and their associated neutral losses prior to protein sequence library searches resulted in a 9.8% increase in peptide identifications at a 1% False Discovery Rate [FDR] compared to the previous OMSSA filter. Modifications to the OMSSA noise filter to accommodate various ion-types resulted in a further 4.2% increase in peptide identifications at 1% FDR. Moreover, searching ETD spectra with charge states obtained from the precursor charge determination algorithm is shown to be up to 3.5 times faster than the general range-search method, with a modest 3.8% increase in sensitivity. Conclusion Overall, there is an 18.8% increase in peptide identifications at 1% FDR by incorporating the new precursor filter, noise filter and charge determination algorithm, when compared to previous versions of OMSSA. PMID:22321509
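
    The precursor filtering step described above is easy to picture in code. Below is a minimal sketch, not the OMSSA implementation: it assumes a centroided spectrum given as (m/z, intensity) pairs and clears fixed ±Da windows around the precursor and each charge-reduced species; the paper's additional neutral-loss windows and intensity handling are omitted, and the window width is an illustrative assumption.

        # Sketch of charge-reduced precursor peak removal for an ETD spectrum.
        def remove_precursor_peaks(peaks, precursor_mz, precursor_z, window=2.0):
            """peaks: list of (m/z, intensity) pairs from a centroided spectrum."""
            # Mass of the fully protonated precursor, M + z*H (electron mass ignored).
            protonated_mass = precursor_mz * precursor_z
            # m/z of the precursor and each charge-reduced species [M + zH]^n+.
            targets = [protonated_mass / n for n in range(1, precursor_z + 1)]
            return [(mz, inten) for mz, inten in peaks
                    if all(abs(mz - t) > window for t in targets)]

        spectrum = [(300.2, 40.0), (650.8, 900.0), (976.0, 1500.0), (1301.5, 120.0)]
        # Removes the peaks near 651 (precursor) and 1302 (charge-reduced species).
        print(remove_precursor_peaks(spectrum, precursor_mz=651.0, precursor_z=2))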

  10. Quality Visualization of Microarray Datasets Using Circos

    PubMed Central

    Koch, Martin; Wiese, Michael

    2012-01-01

    Quality control and normalization are considered the most important steps in the analysis of microarray data. At present there are various methods available for quality assessment of microarray datasets. However, there seems to be no standard visualization routine which also depicts individual microarray quality. Here we present a convenient method for visualizing the results of standard quality control tests using Circos plots. In these plots, various quality measurements are drawn in a circular fashion, thus allowing for visualization of the quality and all outliers of each distinct array within a microarray dataset. The proposed method is intended for use with the Affymetrix Human Genome platform (i.e., GPL96, GPL570 and GPL571). Circos quality measurement plots are a convenient way to obtain an initial quality estimate of Affymetrix datasets that are stored in publicly available databases.
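
    The circular layout itself can be approximated without Circos. The sketch below is a rough stand-in using matplotlib polar axes rather than the Circos tool: one concentric ring per quality measure, one angular sector per array. The metric names (RLE, NUSE) are common Affymetrix quality measures used purely as placeholders, and the values are random.

        import numpy as np
        import matplotlib.pyplot as plt

        def circular_quality_plot(metrics, labels):
            """metrics: dict of name -> per-array scores in [0, 1]."""
            n = len(labels)
            theta = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
            ax = plt.subplot(projection="polar")
            base = 1.0
            for name, scores in metrics.items():
                # One ring per quality measure; bar height encodes the score.
                ax.bar(theta, scores, width=2.0 * np.pi / n * 0.9,
                       bottom=base, alpha=0.7, label=name)
                base += 1.2
            ax.set_xticks(theta)
            ax.set_xticklabels(labels, fontsize=7)
            ax.set_yticks([])
            ax.legend(loc="upper right", fontsize=7)
            plt.show()

        rng = np.random.default_rng(1)
        circular_quality_plot({"RLE": rng.random(12), "NUSE": rng.random(12)},
                              ["chip%02d" % i for i in range(1, 13)])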

  11. Microarray: an approach for current drug targets.

    PubMed

    Gomase, Virendra S; Tagore, Somnath; Kale, Karbhari V

    2008-03-01

    Microarrays are a powerful tool with multiple applications in both the clinical and the cellular and molecular biology arenas, including early assessment of the probable biological importance of drug targets, pharmacogenomics, toxicogenomics and single-nucleotide polymorphisms (SNPs). A list of new drug candidates along with proposed targets for intervention is described. Recent advances in microarray analysis of organisms and the availability of genomic sequences provide a wide range of novel targets for drug design. This review covers the different microarray technologies; methods for comparative gene expression studies; applications of microarrays in medicine and pharmacogenomics; and current drug targets in research relevant to common diseases, as they relate to clinical practice and future perspectives. PMID:18336225

  12. Value of Distributed Preprocessing of Biomass Feedstocks to a Bioenergy Industry

    SciTech Connect

    Christopher T Wright

    2006-07-01

    Biomass preprocessing is one of the primary operations in the feedstock assembly system and the front-end of a biorefinery. Its purpose is to chop, grind, or otherwise format the biomass into a suitable feedstock for conversion to ethanol and other bioproducts. Many variables such as equipment cost and efficiency, and feedstock moisture content, particle size, bulk density, compressibility, and flowability affect the location and implementation of this unit operation. Previous conceptual designs show this operation to be located at the front-end of the biorefinery. However, data are presented that show distributed preprocessing at the field-side or in a fixed preprocessing facility can provide significant cost benefits by producing a higher-value feedstock with improved handling, transporting, and merchandising potential. In addition, data supporting the preferential deconstruction of feedstock materials due to their bio-composite structure identify the potential for significant improvements in equipment efficiencies and compositional quality upgrades. These data are collected from full-scale low- and high-capacity hammermill grinders with various screen sizes. Multiple feedstock varieties with a range of moisture values were used in the preprocessing tests. The comparative values of the different grinding configurations, feedstock varieties, and moisture levels are assessed through post-grinding analysis of the different particle fractions separated with a medium-scale forage particle separator and a Rototap separator. The results show that distributed preprocessing produces a material that has bulk flowable properties and fractionation benefits that can improve the ease of transporting, handling and conveying the material to the biorefinery and improve the biochemical and thermochemical conversion processes.

  13. Preprocessing strategy influences graph-based exploration of altered functional networks in major depression.

    PubMed

    Borchardt, Viola; Lord, Anton Richard; Li, Meng; van der Meer, Johan; Heinze, Hans-Jochen; Bogerts, Bernhard; Breakspear, Michael; Walter, Martin

    2016-04-01

    Resting-state fMRI studies have gained widespread use in exploratory studies of neuropsychiatric disorders. Graph metrics derived from whole brain functional connectivity studies have been used to reveal disease-related variations in many neuropsychiatric disorders including major depression (MDD). These techniques show promise in developing diagnostics for these often difficult to identify disorders. However, the analysis of resting-state datasets is increasingly beset by a myriad of approaches and methods, each with underlying assumptions. Choosing the most appropriate preprocessing parameters a priori is difficult. Nevertheless, the specific methodological choice influences graph-theoretical network topologies as well as regional metrics. The aim of this study was to systematically compare different preprocessing strategies by evaluating their influence on group differences between healthy participants (HC) and depressive patients. We thus investigated the effects of common preprocessing variants, including global mean-signal regression (GMR), temporal filtering, detrending, and network sparsity on group differences between brain networks of HC and MDD patients measured by global and nodal graph theoretical metrics. Occurrence of group differences in global metrics was absent in the majority of tested preprocessing variants, whereas in local graph metrics it was sparse, variable, and highly dependent on the combination of preprocessing variant and sparsity threshold. Sparsity thresholds between 16 and 22% were shown to have the greatest potential to reveal differences between HC and MDD patients in global and local network metrics. Our study offers an overview of consequences of methodological decisions and which neurobiological characteristics of MDD they implicate, adding further caution to this rapidly growing field. Hum Brain Mapp 37:1422-1442, 2016. © 2016 Wiley Periodicals, Inc. PMID:26888761
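
    The sparsity thresholding that drives these results is a small operation in itself. The following sketch, assuming a simple Pearson-correlation connectivity matrix and binary graphs, keeps the strongest connections at a given edge density and computes one global metric with networkx; it illustrates the step, not the study's pipeline, and the data are random.

        import numpy as np
        import networkx as nx

        def threshold_to_sparsity(corr, sparsity):
            """Binarize a correlation matrix at a given edge density."""
            n = corr.shape[0]
            iu = np.triu_indices(n, k=1)
            weights = corr[iu]
            k = int(round(sparsity * weights.size))   # number of edges to keep
            cutoff = np.sort(weights)[-k]              # k-th largest weight
            adj = np.zeros_like(corr, dtype=bool)
            adj[iu] = weights >= cutoff
            return nx.from_numpy_array((adj | adj.T).astype(int))

        corr = np.corrcoef(np.random.randn(90, 200))   # 90 regions, 200 time points
        for s in (0.16, 0.19, 0.22):                   # sparsity range from the study
            g = threshold_to_sparsity(corr, s)
            print(s, nx.average_clustering(g))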

  14. Contributions to Statistical Problems Related to Microarray Data

    ERIC Educational Resources Information Center

    Hong, Feng

    2009-01-01

    Microarrays are a high-throughput technology to measure gene expression. Analysis of microarray data brings many interesting and challenging problems. This thesis consists of three studies related to microarray data. First, we propose a Bayesian model for microarray data and use Bayes Factors to identify differentially expressed genes. Second, we…

  15. Construction of citrus gene coexpression networks from microarray data using random matrix theory.

    PubMed

    Du, Dongliang; Rawat, Nidhi; Deng, Zhanao; Gmitter, Fred G

    2015-01-01

    After the sequencing of citrus genomes, gene function annotation is becoming a new challenge. Gene coexpression analysis can be employed for function annotation using publicly available microarray data sets. In this study, 230 sweet orange (Citrus sinensis) microarrays were used to construct seven coexpression networks, including one condition-independent and six condition-dependent (Citrus canker, Huanglongbing, leaves, flavedo, albedo, and flesh) networks. In total, these networks contain 37 633 edges among 6256 nodes (genes), which accounts for 52.11% of the measurable genes on the citrus microarray. Then, these networks were partitioned into functional modules using the Markov Cluster Algorithm. Significantly enriched Gene Ontology biological process terms and KEGG pathway terms were detected for 343 and 60 modules, respectively. Finally, independent verification of these networks was performed using independent expression data for 371 genes. This study provides new targets for further functional analyses in citrus. PMID:26504573
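
    The construction step, before any module detection, reduces to computing gene-gene correlations and keeping the strongest links. The sketch below illustrates only that step: it assumes a fixed correlation cutoff, whereas the study derives the cutoff from random matrix theory (the transition in the eigenvalue spacing distribution), and it leaves out the Markov Cluster step entirely.

        import numpy as np

        def coexpression_edges(expr, gene_ids, cutoff=0.9):
            """expr: genes x arrays matrix; returns (gene_i, gene_j, r) edges."""
            r = np.corrcoef(expr)                   # gene-gene Pearson correlations
            iu = np.triu_indices_from(r, k=1)
            keep = np.abs(r[iu]) >= cutoff
            return [(gene_ids[i], gene_ids[j], float(r[i, j]))
                    for i, j in zip(iu[0][keep], iu[1][keep])]

        expr = np.random.randn(500, 230)            # 500 genes x 230 arrays
        edges = coexpression_edges(expr, ["g%d" % i for i in range(500)])
        print(len(edges), "edges")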

  17. Algorithms for optimal dyadic decision trees

    SciTech Connect

    Hush, Don; Porter, Reid

    2009-01-01

    A new algorithm for constructing optimal dyadic decision trees was recently introduced, analyzed, and shown to be very effective for low dimensional data sets. This paper enhances and extends this algorithm by: introducing an adaptive grid search for the regularization parameter that guarantees optimal solutions for all relevant tree sizes, revising the core tree-building algorithm so that its run time is substantially smaller for most regularization parameter values on the grid, and incorporating new data structures and data pre-processing steps that provide significant run time enhancement in practice.

  18. Evaluation of Surface Chemistries for Antibody Microarrays

    SciTech Connect

    Seurynck-Servoss, Shannon L.; White, Amanda M.; Baird, Cheryl L.; Rodland, Karin D.; Zangar, Richard C.

    2007-12-01

    Antibody microarrays are an emerging technology that promises to be a powerful tool for the detection of disease biomarkers. The current technology for protein microarrays has been primarily derived from DNA microarrays and is not fully characterized for use with proteins. For example, there are a myriad of surface chemistries that are commercially available for antibody microarrays, but no rigorous studies that compare these different surfaces. Therefore, we have used an enzyme-linked immunosorbent assay (ELISA) microarray platform to analyze 16 different commercially available slide types. Full standard curves were generated for 24 different assays. We found that this approach provides a rigorous and quantitative system for comparing the different slide types based on spot size and morphology, slide noise, spot background, lower limit of detection, and reproducibility. These studies demonstrate that the properties of the slide surface affect the activity of immobilized antibodies and the quality of data produced. Although many slide types can produce useful data, glass slides coated with poly-L-lysine or aminosilane, with or without activation with a crosslinker, consistently produce superior results in the ELISA microarray analyses we performed.

  19. The Impact of Photobleaching on Microarray Analysis

    PubMed Central

    von der Haar, Marcel; Preuß, John-Alexander; von der Haar, Kathrin; Lindner, Patrick; Scheper, Thomas; Stahl, Frank

    2015-01-01

    DNA-Microarrays have become a potent technology for high-throughput analysis of genetic regulation. However, the wide dynamic range of signal intensities of fluorophore-based microarrays exceeds the dynamic range of a single array scan by far, thus limiting the key benefit of microarray technology: parallelization. The implementation of multi-scan techniques represents a promising approach to overcome these limitations. These techniques are, in turn, limited by the fluorophores’ susceptibility to photobleaching when exposed to the scanner’s laser light. In this paper, the photobleaching characteristics of cyanine-3 and cyanine-5 as part of solid-state DNA microarrays are studied. The effects of initial fluorophore intensity as well as laser-scanner-dependent variables such as the photomultiplier tube’s voltage on bleaching and imaging are investigated. The resulting data are used to develop a model capable of simulating the expected degree of signal intensity reduction caused by photobleaching for each fluorophore individually, allowing for the removal of photobleaching-induced, systematic bias in multi-scan procedures. Single-scan applications also benefit as they rely on pre-scans to determine the optimal scanner settings. These findings constitute a step towards standardization of microarray experiments and analysis and may help to increase the lab-to-lab comparability of microarray experiment results. PMID:26378589
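
    A simple way to see how such a model can remove photobleaching bias is to fit a decay curve to repeated scans of the same spot and invert it. The sketch below assumes a single-exponential decay per scan, which is only a stand-in for the paper's fluorophore- and scanner-dependent model; the intensity values are synthetic.

        import numpy as np
        from scipy.optimize import curve_fit

        def decay(n, i0, k):
            return i0 * np.exp(-k * n)             # intensity after n scans

        scans = np.arange(6)                        # repeated scans of one spot
        obs = np.array([1000, 890, 795, 710, 633, 566], float)
        (i0, k), _ = curve_fit(decay, scans, obs, p0=(obs[0], 0.1))
        corrected = obs * np.exp(k * scans)         # undo the estimated bleaching
        print(f"estimated bleaching per scan: {1 - np.exp(-k):.1%}")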

  20. Terrain matching image pre-process and its format transform in autonomous underwater navigation

    NASA Astrophysics Data System (ADS)

    Cao, Xuejun; Zhang, Feizhou; Yang, Dongkai; Yang, Bogang

    2007-06-01

    Image-matching precision directly influences the final precision of the integrated navigation system. Image-matching-assisted navigation spatially matches two underwater scene images of the same scene, acquired by two different sensors, in order to determine the relative displacement between them. In this way, the vehicle's location can be obtained within a reference image of known geographic coordinates, and the precise location information from image matching is transmitted to the INS to eliminate its location error and greatly enhance the navigation precision of the vehicle. Digital image data analysis and processing for image matching in underwater passive navigation is therefore important. For underwater geographic data analysis, we focus on the acquisition, processing, analysis, representation and measurement of database information. These analysis items form one of the important components of underwater terrain matching and help characterize the seabed terrain of the navigation area, so that the most advantageous seabed terrain district and a dependable navigation algorithm can be selected. In this way, the precision and reliability of the terrain-aided navigation system can be improved. The pre-processing and format transformation of digital images during underwater image matching are described in this paper. The terrain of the navigation areas needs further study to provide reliable terrain characteristics and underwater cover data for navigation. By realizing sea-route selection, danger-district prediction, and navigation algorithm analysis, TAN can achieve higher location precision and probability, and hence provide technological support for image matching in underwater passive navigation.

  1. Data Mining for Tectonic Tremor in a Large Global Seismogram Database using Preprocessed Data Quality Measurements

    NASA Astrophysics Data System (ADS)

    Rasor, B. A.; Brudzinski, M. R.

    2013-12-01

    The collision of plates at subduction zones yields the potential for disastrous earthquakes, yet the processes that lead up to these events are still largely unclear and make them difficult to forecast. Recent advancements in seismic monitoring have revealed subtle ground vibrations termed tectonic tremor that occur as long-lived swarms of narrow-bandwidth activity, different from local earthquakes of comparable amplitude that create brief signals of broader, higher frequency. The close proximity of detected tremor events to the lower edge of the seismogenic zone along the subduction interface suggests a potential triggering relationship between tremor and megathrust earthquakes. Most tremor catalogs are constructed with detection methods that involve an exhausting download of years of high-sample-rate seismic data, as well as large computation power to process the large data volume and identify temporal patterns of tremor activity. We have developed a tremor detection method that employs the underutilized Quality Analysis Control Kit (QuACK), originally built to analyze station performance and identify instrument problems across the many seismic networks that contribute data to one of the largest seismogram databases in the world (IRIS DMC). The QuACK dataset stores seismogram amplitudes at a wide range of frequencies calculated every hour since 2005 for most stations archived in the IRIS DMC. Such a preprocessed dataset is advantageous considering several tremor detection techniques use hourly seismic amplitudes in the frequency band where tremor is most active (2-5 Hz) to characterize the time history of tremor. Yet these previous detection techniques have relied on downloading years of 40-100 sample-per-second data to make the calculations, which typically takes several days on a 36-node high-performance cluster to calculate the amplitude variations for a single station. Processing times are even longer for recently developed detection algorithms that utilize

  2. Identifying Cancer Biomarkers From Microarray Data Using Feature Selection and Semisupervised Learning

    PubMed Central

    Maulik, Ujjwal

    2014-01-01

    Microarrays have now gone from obscurity to being almost ubiquitous in biological research. At the same time, the statistical methodology for microarray analysis has progressed from simple visual assessments of results to novel algorithms for analyzing changes in expression profiles. In a micro-RNA (miRNA) or gene-expression profiling experiment, the expression levels of thousands of genes/miRNAs are simultaneously monitored to study the effects of certain treatments, diseases, and developmental stages on their expressions. Microarray-based gene expression profiling can be used to identify genes whose expressions are changed in response to pathogens or other organisms by comparing gene expression in infected to that in uninfected cells or tissues. Recent studies have revealed that patterns of altered microarray expression profiles in cancer can serve as molecular biomarkers for tumor diagnosis, prognosis of disease-specific outcomes, and prediction of therapeutic responses. Microarray data sets containing expression profiles of a number of miRNAs or genes are used to identify biomarkers, which have dysregulation in normal and malignant tissues. However, small sample size remains a bottleneck in designing successful classification methods. On the other hand, an adequate amount of microarray data without clinical annotation can be employed as an additional source of information. In this paper, a combination of kernelized fuzzy rough set (KFRS) and semisupervised support vector machine (S3VM) is proposed for predicting cancer biomarkers from one miRNA and three gene expression data sets. Biomarkers are discovered employing three feature selection methods, including KFRS. The effectiveness of the proposed KFRS and S3VM combination on the microarray data sets is demonstrated, and the cancer biomarkers identified from miRNA data are reported. Furthermore, biological significance tests are conducted for miRNA cancer biomarkers. PMID:27170887
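
    To make the overall shape of such a pipeline concrete, here is a minimal scikit-learn sketch, with plain univariate feature selection standing in for the kernelized fuzzy rough set selector and a self-training SVM standing in for the paper's S3VM; the data, the label layout (-1 marks unlabeled samples) and the value k=50 are all illustrative.

        import numpy as np
        from sklearn.feature_selection import SelectKBest, f_classif
        from sklearn.semi_supervised import SelfTrainingClassifier
        from sklearn.svm import SVC

        rng = np.random.default_rng(0)
        X = rng.normal(size=(120, 2000))       # samples x miRNA/gene features
        # 40 labeled samples (two classes), 80 unlabeled samples marked -1.
        y = np.r_[np.zeros(20), np.ones(20), -np.ones(80)].astype(int)

        labeled = y != -1
        selector = SelectKBest(f_classif, k=50).fit(X[labeled], y[labeled])
        X_sel = selector.transform(X)

        clf = SelfTrainingClassifier(SVC(kernel="rbf", probability=True))
        clf.fit(X_sel, y)                       # unlabeled rows inform the fit
        print(clf.predict(X_sel[:5]))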

  3. Advanced Recording and Preprocessing of Physiological Signals. [data processing equipment for flow measurement of blood flow by ultrasonics

    NASA Technical Reports Server (NTRS)

    Bentley, P. B.

    1975-01-01

    The measurement of the volume flow-rate of blood in an artery or vein requires both an estimate of the flow velocity and its spatial distribution and the corresponding cross-sectional area. Transcutaneous measurements of these parameters can be performed using ultrasonic techniques that are analogous to the measurement of moving objects by use of a radar. Modern digital data recording and preprocessing methods were applied to the measurement of blood-flow velocity by means of the CW Doppler ultrasonic technique. Only the average flow velocity was measured and no distribution or size information was obtained. Evaluations of current flowmeter design and performance, ultrasonic transducer fabrication methods, and other related items are given. The main thrust was the development of effective data-handling and processing methods by application of modern digital techniques. The evaluation resulted in useful improvements in both the flowmeter instrumentation and the ultrasonic transducers. Effective digital processing algorithms that provided enhanced blood-flow measurement accuracy and sensitivity were developed. Block diagrams illustrative of the equipment setup are included.

  4. Detection of NASBA amplified bacterial tmRNA molecules on SLICSel designed microarray probes

    PubMed Central

    2011-01-01

    Background We present a comprehensive technological solution for bacterial diagnostics using tmRNA as a marker molecule. A robust probe design algorithm for microbial detection microarrays is implemented. The probes were evaluated for specificity and, combined with NASBA (Nucleic Acid Sequence Based Amplification) amplification, for sensitivity. Results We developed a new web-based program, SLICSel, for the design of hybridization probes, based on nearest-neighbor thermodynamic modeling. A SLICSel minimum binding-energy difference criterion of 4 kcal/mol was sufficient for the design of Streptococcus pneumoniae tmRNA-specific microarray probes. With lower binding-energy difference criteria, additional hybridization specificity tests on the microarray were needed to eliminate non-specific probes. Using SLICSel-designed microarray probes and NASBA we were able to detect S. pneumoniae tmRNA from a series of total RNA dilutions equivalent to the RNA content of 0.1-10 CFU. Conclusions The described technological solution, and both of its separate components SLICSel and NASBA-microarray technology independently, are applicable to many different areas of microbial diagnostics. PMID:21356118
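
    The selection criterion is easy to express once binding energies are available. A minimal sketch, assuming ΔG values (kcal/mol, more negative meaning stronger binding) precomputed from a nearest-neighbor model for the target and all non-target sequences; the data structure and function name are invented for illustration.

        # Keep a probe only if its binding to the intended target is at least
        # `min_diff` kcal/mol stronger (more negative) than to any non-target.
        def specific_probes(candidates, min_diff=4.0):
            """candidates: list of (probe_id, dG_target, [dG_nontarget, ...])."""
            keep = []
            for probe_id, dg_target, dg_nontargets in candidates:
                if all(dg_target <= dg - min_diff for dg in dg_nontargets):
                    keep.append(probe_id)
            return keep

        print(specific_probes([("p1", -22.0, [-15.0, -16.5]),
                               ("p2", -20.0, [-18.5])]))   # -> ['p1']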

  5. A statistical method to incorporate biological knowledge for generating testable novel gene regulatory interactions from microarray experiments

    PubMed Central

    Larsen, Peter; Almasri, Eyad; Chen, Guanrao; Dai, Yang

    2007-01-01

    Background The incorporation of prior biological knowledge in the analysis of microarray data has become important in the reconstruction of transcription regulatory networks in a cell. Most of the current research has been focused on the integration of multiple sets of microarray data as well as curated databases for a genome scale reconstruction. However, individual researchers are more interested in the extraction of the most useful information from the data of their hypothesis-driven microarray experiments. How to compile the prior biological knowledge from literature to facilitate new hypothesis generation from a microarray experiment is the focus of this work. We propose a novel method based on the statistical analysis of reported gene interactions in PubMed literature. Results Using Gene Ontology (GO) Molecular Function annotation for reported gene regulatory interactions in PubMed literature, a statistical analysis method was proposed for the derivation of a likelihood of interaction (LOI) score for a pair of genes. The LOI-score and the Pearson correlation coefficient of gene profiles were utilized to check if a pair of query genes would be in the above specified interaction. The method was validated in the analysis of two gene sets formed from the yeast Saccharomyces cerevisiae cell cycle microarray data. It was found that a high percentage of identified interactions share GO Biological Process annotations (39.5% for a 102-interaction enriched gene set and 23.0% for a larger 999 cyclically expressed gene set). Conclusion This method can uncover novel biologically relevant gene interactions. With stringent confidence levels, small interaction networks can be identified for further establishment of a hypothesis testable by biological experiment. This procedure is computationally inexpensive and can be used as a preprocessing procedure for screening potential biologically relevant gene pairs subject to the analysis with sophisticated statistical methods. PMID
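
    Once LOI scores are in hand, the screening step amounts to intersecting two rankings. The sketch below assumes precomputed LOI scores in a dictionary and uses illustrative thresholds; it simply keeps gene pairs whose literature-derived score and expression correlation both pass.

        import numpy as np
        from itertools import combinations

        def candidate_interactions(expr, genes, loi, min_loi=2.0, min_r=0.7):
            """expr: genes x samples; loi: dict (gene_a, gene_b) -> LOI score."""
            r = np.corrcoef(expr)                  # expression correlations
            out = []
            for (i, a), (j, b) in combinations(enumerate(genes), 2):
                score = loi.get((a, b), loi.get((b, a), 0.0))
                if score >= min_loi and abs(r[i, j]) >= min_r:
                    out.append((a, b, score, float(r[i, j])))
            return out   # O(n^2) pairs; fine for hypothesis-driven gene sets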

  6. New indicator for optimal preprocessing and wavelength selection of near-infrared spectra.

    PubMed

    Skibsted, E T S; Boelens, H F M; Westerhuis, J A; Witte, D T; Smilde, A K

    2004-03-01

    Preprocessing of near-infrared spectra to remove unwanted, i.e., non-related, spectral variation and selection of informative wavelengths are considered crucial steps prior to the construction of a quantitative calibration model. The standard methodology when comparing various preprocessing techniques and selecting different wavelengths is to compare prediction statistics computed with an independent set of data not used to make the actual calibration model. When the errors of the reference values are large, when no such values are available at all, or when only a limited number of samples are available, other methods are needed to evaluate the preprocessing method and wavelength selection. In this work we present a new indicator (SE) that only requires blank sample spectra, i.e., spectra of samples that are mixtures of the interfering constituents (everything except the analyte), a pure analyte spectrum, or alternatively, a sample spectrum where the analyte is present. The indicator is based on computing the net analyte signal of the analyte and the total error, i.e., instrumental noise and bias. By comparing the indicator values when different preprocessing techniques and wavelength selections are applied to the spectra, the optimal preprocessing technique and the optimal wavelength selection can be determined without knowledge of reference values, i.e., the indicator minimizes the non-related spectral variation. The SE indicator is compared to two other indicators that also use net analyte signal computations. To demonstrate the feasibility of the SE indicator, two near-infrared spectral data sets from the pharmaceutical industry were used, i.e., diffuse reflectance spectra of powder samples and transmission spectra of tablets. Especially in pharmaceutical spectroscopic applications, it is expected beforehand that the non-related spectral variation is rather large and it is important to remove it. The indicator gave excellent results with respect to wavelength selection and optimal
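
    The net analyte signal at the core of the indicator is a projection. A minimal numpy sketch, assuming preprocessed blank spectra in the rows of a matrix and a pure analyte spectrum as a vector: the analyte spectrum is projected onto the orthogonal complement of the interferent space, and the norm of what survives is the usable signal. The SE indicator additionally folds in instrumental noise and bias, which is omitted here.

        import numpy as np

        def net_analyte_signal(analyte, blanks):
            """analyte: (n_wl,) pure spectrum; blanks: (n_blank, n_wl) blank spectra."""
            B = np.atleast_2d(blanks)
            P = B.T @ np.linalg.pinv(B.T)   # projector onto the interferent space
            return analyte - P @ analyte     # component orthogonal to interferents

        # A larger norm after a given preprocessing/wavelength choice means more
        # analyte signal survives relative to the interfering constituents.
        rng = np.random.default_rng(2)
        blanks = rng.normal(size=(8, 200))
        analyte = rng.normal(size=200)
        print(np.linalg.norm(net_analyte_signal(analyte, blanks)))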

  7. A hybrid imputation approach for microarray missing value estimation

    PubMed Central

    2015-01-01

    Background Missing data is an inevitable phenomenon in gene expression microarray experiments due to instrument failure or human error, and it has a negative impact on the performance of downstream analysis. Most existing approaches suffer from this prevalent problem. Imputation is one of the frequently used methods for processing missing data, and many developments have been achieved in the research on estimating missing values. The challenging task is how to improve imputation accuracy for data with a large missing rate. Methods In this paper, inspired by the idea of collaborative training, we propose a novel hybrid imputation method, called Recursive Mutual Imputation (RMI). Specifically, RMI exploits global correlation information and local structure in the data, captured by two popular methods, Bayesian Principal Component Analysis (BPCA) and Local Least Squares (LLS), respectively. The mutual strategy is implemented by sharing the estimated data sequences at each recursive step. Meanwhile, we consider the imputation sequence based on the number of missing entries in the target gene. Furthermore, a weight-based integration method is utilized in the final assembling step. Results We evaluate RMI against three state-of-the-art algorithms (BPCA, LLS, and Iterated Local Least Squares imputation (ItrLLS)) on four publicly available microarray datasets. Experimental results clearly demonstrate that RMI significantly outperforms comparative methods in terms of Normalized Root Mean Square Error (NRMSE), especially for datasets with large missing rates and fewer complete genes. Conclusions It is noted that our proposed hybrid imputation approach incorporates both global and local information of microarray genes, achieving lower NRMSE values than either single approach alone. Besides, this study highlights the need to consider the imputation sequence of missing entries for imputation methods. PMID:26330180
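
    The recursion itself is simple even though the component imputers are not. The skeleton below keeps the structure of mutual imputation, with gene-wise and array-wise means standing in for BPCA and LLS and a fixed weight standing in for the paper's weighting scheme; each round, both "methods" see the other's latest estimates through the shared filled matrix.

        import numpy as np

        def mutual_impute(X, n_rounds=5, w=0.5):
            """X: genes x samples with np.nan marking missing entries."""
            mask = np.isnan(X)
            filled = np.where(mask, np.nanmean(X), X)         # crude initialization
            for _ in range(n_rounds):
                row_est = filled.mean(axis=1, keepdims=True)   # "method A": gene means
                col_est = filled.mean(axis=0, keepdims=True)   # "method B": array means
                filled = np.where(mask, w * row_est + (1 - w) * col_est, X)
            return filled

        X = np.random.lognormal(5.0, 0.5, size=(100, 10))
        X[np.random.rand(100, 10) < 0.2] = np.nan              # 20% missing rate
        print(np.isnan(mutual_impute(X)).any())                # -> False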

  8. Chromosomal Microarray versus Karyotyping for Prenatal Diagnosis

    PubMed Central

    Wapner, Ronald J.; Martin, Christa Lese; Levy, Brynn; Ballif, Blake C.; Eng, Christine M.; Zachary, Julia M.; Savage, Melissa; Platt, Lawrence D.; Saltzman, Daniel; Grobman, William A.; Klugman, Susan; Scholl, Thomas; Simpson, Joe Leigh; McCall, Kimberly; Aggarwal, Vimla S.; Bunke, Brian; Nahum, Odelia; Patel, Ankita; Lamb, Allen N.; Thom, Elizabeth A.; Beaudet, Arthur L.; Ledbetter, David H.; Shaffer, Lisa G.; Jackson, Laird

    2013-01-01

    Background Chromosomal microarray analysis has emerged as a primary diagnostic tool for the evaluation of developmental delay and structural malformations in children. We aimed to evaluate the accuracy, efficacy, and incremental yield of chromosomal microarray analysis as compared with karyotyping for routine prenatal diagnosis. Methods Samples from women undergoing prenatal diagnosis at 29 centers were sent to a central karyotyping laboratory. Each sample was split in two; standard karyotyping was performed on one portion and the other was sent to one of four laboratories for chromosomal microarray. Results We enrolled a total of 4406 women. Indications for prenatal diagnosis were advanced maternal age (46.6%), abnormal result on Down’s syndrome screening (18.8%), structural anomalies on ultrasonography (25.2%), and other indications (9.4%). In 4340 (98.8%) of the fetal samples, microarray analysis was successful; 87.9% of samples could be used without tissue culture. Microarray analysis of the 4282 nonmosaic samples identified all the aneuploidies and unbalanced rearrangements identified on karyotyping but did not identify balanced translocations and fetal triploidy. In samples with a normal karyotype, microarray analysis revealed clinically relevant deletions or duplications in 6.0% with a structural anomaly and in 1.7% of those whose indications were advanced maternal age or positive screening results. Conclusions In the context of prenatal diagnostic testing, chromosomal microarray analysis identified additional, clinically significant cytogenetic information as compared with karyotyping and was equally efficacious in identifying aneuploidies and unbalanced rearrangements but did not identify balanced translocations and triploidies. (Funded by the Eunice Kennedy Shriver National Institute of Child Health and Human Development and others; ClinicalTrials.gov number, NCT01279733.) PMID:23215555

  9. Knowledge-based analysis of microarrays for the discovery of transcriptional regulation relationships

    PubMed Central

    2010-01-01

    Background The large amount of high-throughput genomic data has facilitated the discovery of the regulatory relationships between transcription factors and their target genes. While early methods for discovery of transcriptional regulation relationships from microarray data often focused on the high-throughput experimental data alone, more recent approaches have explored the integration of external knowledge bases of gene interactions. Results In this work, we develop an algorithm that provides improved performance in the prediction of transcriptional regulatory relationships by supplementing the analysis of microarray data with a new method of integrating information from an existing knowledge base. Using a well-known dataset of yeast microarrays and the Yeast Proteome Database, a comprehensive collection of known information of yeast genes, we show that knowledge-based predictions demonstrate better sensitivity and specificity in inferring new transcriptional interactions than predictions from microarray data alone. We also show that comprehensive, direct and high-quality knowledge bases provide better prediction performance. Comparison of our results with ChIP-chip data and growth fitness data suggests that our predicted genome-wide regulatory pairs in yeast are reasonable candidates for follow-up biological verification. Conclusion High quality, comprehensive, and direct knowledge bases, when combined with appropriate bioinformatic algorithms, can significantly improve the discovery of gene regulatory relationships from high throughput gene expression data. PMID:20122245

  10. A Fully Automatic Method for Gridding Bright Field Images of Bead-Based Microarrays.

    PubMed

    Datta, Abhik; Wai-Kin Kong, Adams; Yow, Kin-Choong

    2016-07-01

    In this paper, a fully automatic method for gridding bright field images of bead-based microarrays is proposed. Numerous techniques have been developed for gridding fluorescence images of traditional spotted microarrays, but to the best of our knowledge, no algorithm has yet been developed for gridding bright field images of bead-based microarrays. The proposed gridding method is designed for automatic quality control during fabrication and assembly of bead-based microarrays. The method begins by estimating the grid parameters using an evolutionary algorithm. This is followed by a grid-fitting step that rigidly aligns an ideal grid with the image. Finally, a grid refinement step deforms the ideal grid to better fit the image. The grid fitting and refinement are performed locally and the final grid is a nonlinear (piecewise affine) grid. To deal with extreme corruptions in the image, the initial grid parameter estimation and grid-fitting steps employ robust search techniques. The proposed method does not have any free parameters that need tuning. The method is capable of identifying the grid structure even in the presence of extreme amounts of artifacts and distortions. Evaluation results on a variety of images are presented. PMID:26011899
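
    One ingredient of grid parameter estimation can be shown compactly: the grid pitch leaves a periodic signature in the image's row and column intensity profiles. The sketch below estimates that pitch from the autocorrelation of a profile; it only illustrates the idea, not the published evolutionary search or the piecewise affine refinement, and the search window is an arbitrary assumption.

        import numpy as np

        def estimate_pitch(img, axis=0, min_pitch=5, max_pitch=200):
            """Estimate grid spacing (pixels) along one axis of a 2-D image."""
            profile = img.sum(axis=axis).astype(float)
            profile -= profile.mean()
            ac = np.correlate(profile, profile, mode="full")[profile.size - 1:]
            # First strong autocorrelation peak past lag 0 ~= grid pitch.
            return min_pitch + int(np.argmax(ac[min_pitch:max_pitch]))

        # Synthetic image with spots every 17 pixels in both directions.
        y, x = np.mgrid[0:340, 0:340]
        img = ((y % 17 < 3) & (x % 17 < 3)).astype(float)
        print(estimate_pitch(img, axis=0), estimate_pitch(img, axis=1))   # 17 17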

  11. Quantifying the Antibody Binding on Protein Microarrays using Microarray Nonlinear Calibration

    PubMed Central

    Yu, Xiaobo; Wallstrom, Garrick; Magee, Dewey Mitchell; Qiu, Ji; Mendoza, D. Eliseo A.; Wang, Jie; Bian, Xiaofang; Graves, Morgan; LaBaer, Joshua

    2015-01-01

    To address the issue of quantification for antibody assays with protein microarrays, we developed a Microarray Nonlinear Calibration (MiNC) method for quantifying antibody binding to the surface of microarray spots. We found that MiNC significantly increased the linear dynamic range and reduced assay variations. A serological analysis of guinea pig Mycobacterium tuberculosis models showed that a larger number of putative antigen targets were identified with MiNC, which is consistent with the improved assay performance of protein microarrays. We expect that our cumulative results will provide scientists with a new appreciation of antibody assays with protein microarrays. Our MiNC method has the potential to be employed in biomedical research with multiplex antibody assays that need quantitation, including the discovery of antibody biomarkers, clinical diagnostics with multi-antibody signatures and construction of immune mathematical models. PMID:23662896
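
    Nonlinear calibration of this kind typically means fitting a saturating binding curve to standards and inverting it for unknowns. The sketch below uses a four-parameter logistic (4PL), a common choice for antibody binding data; the functional form, the standards, and the signal values are illustrative assumptions, not the MiNC model itself.

        import numpy as np
        from scipy.optimize import curve_fit

        def four_pl(x, bottom, top, ec50, hill):
            return bottom + (top - bottom) / (1.0 + (ec50 / x) ** hill)

        conc = np.array([0.1, 0.3, 1, 3, 10, 30, 100])           # standards
        signal = np.array([210, 380, 900, 2300, 4100, 5200, 5600], float)
        p, _ = curve_fit(four_pl, conc, signal,
                         p0=(signal.min(), signal.max(), 5.0, 1.0), maxfev=5000)

        def invert(y):
            """Map a measured signal back onto the calibration curve."""
            bottom, top, ec50, hill = p
            return ec50 / (((top - bottom) / (y - bottom) - 1.0) ** (1.0 / hill))

        print(invert(3000.0))    # signal -> estimated binding quantity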

  12. Radar signal pre-processing to suppress surface bounce and multipath

    DOEpatents

    Paglieroni, David W; Mast, Jeffrey E; Beer, N. Reginald

    2013-12-31

    A method and system for detecting the presence of subsurface objects within a medium is provided. In some embodiments, the imaging and detection system operates in a multistatic mode to collect radar return signals generated by an array of transceiver antenna pairs that is positioned across the surface and that travels down the surface. The imaging and detection system pre-processes that return signal to suppress certain undesirable effects. The imaging and detection system then generates synthetic aperture radar images from real aperture radar images generated from the pre-processed return signal. The imaging and detection system then post-processes the synthetic aperture radar images to improve detection of subsurface objects. The imaging and detection system identifies peaks in the energy levels of the post-processed image frame, which indicates the presence of a subsurface object.

  13. KONFIG and REKONFIG: Two interactive preprocessing programs for the Navy/NASA Engine Program (NNEP)

    NASA Technical Reports Server (NTRS)

    Fishbach, L. H.

    1981-01-01

    The NNEP is a computer program that is currently being used to simulate the thermodynamic cycle performance of almost all types of turbine engines by many government, industry, and university personnel. The NNEP uses arrays of input data to set up the engine simulation and component matching method as well as to describe the characteristics of the components. A preprocessing program (KONFIG) is described in which the user at a terminal on a time-shared computer can interactively prepare the arrays of data required. It is intended to make it easier for the occasional or new user to operate NNEP. Another preprocessing program (REKONFIG) in which the user can modify the component specifications of a previously configured NNEP dataset is also described. It is intended to aid in preparing data for parametric studies and/or studies of similar engines such as mixed-flow turbofans, turboshafts, etc.

  14. Data preprocessing and preliminary results of the moon-based ultraviolet telescope on CE-3 lander

    NASA Astrophysics Data System (ADS)

    Wang, f.

    2015-10-01

    The moon-based ultraviolet telescope (MUVT) is one of the payloads on the Chang'e-3 (CE-3) lunar lander. Because of the advantages of having no atmospheric disturbances and the slow rotation of the Moon, we can make long-term continuous observations of a series of important remote celestial objects in the near-ultraviolet band, and perform a sky survey of selected areas. We can find characteristic changes in celestial brightness with time by analyzing image data from the MUVT, and deduce the radiation mechanism and physical properties of these celestial objects after comparing with a physical model. In order to explain the scientific purposes of the MUVT, this article analyzes the preprocessing of MUVT image data and makes a preliminary evaluation of data quality. The results demonstrate that the methods used for data collection and preprocessing are effective, and the Level 2A and 2B image data satisfy the requirements of follow-up scientific research.

  15. The Role of GRAIL Orbit Determination in Preprocessing of Gravity Science Measurements

    NASA Technical Reports Server (NTRS)

    Kruizinga, Gerhard; Asmar, Sami; Fahnestock, Eugene; Harvey, Nate; Kahan, Daniel; Konopliv, Alex; Oudrhiri, Kamal; Paik, Meegyeong; Park, Ryan; Strekalov, Dmitry; Watkins, Michael; Yuan, Dah-Ning

    2013-01-01

    The Gravity Recovery And Interior Laboratory (GRAIL) mission has constructed a lunar gravity field with unprecedented uniform accuracy on the farside and nearside of the Moon. GRAIL lunar gravity field determination begins with preprocessing of the gravity science measurements by applying corrections for time tag error, general relativity, measurement noise and biases. Gravity field determination requires the generation of spacecraft ephemerides of an accuracy not attainable with the pre-GRAIL lunar gravity fields. Therefore, a bootstrapping strategy was developed, iterating between science data preprocessing and lunar gravity field estimation in order to construct sufficiently accurate orbit ephemerides. This paper describes the GRAIL measurements, their dependence on the spacecraft ephemerides, and the role of orbit determination in the bootstrapping strategy. Simulation results are presented that validate the bootstrapping strategy, followed by bootstrapping results for flight data, which have led to the latest GRAIL lunar gravity fields.

  16. Immune-Signatures for Lung Cancer Diagnostics: Evaluation of Protein Microarray Data Normalization Strategies

    PubMed Central

    Brezina, Stefanie; Soldo, Regina; Kreuzhuber, Roman; Hofer, Philipp; Gsur, Andrea; Weinhaeusel, Andreas

    2015-01-01

    New minimal invasive diagnostic methods for early detection of lung cancer are urgently needed. It is known that the immune system responds to tumors with production of tumor-autoantibodies. Protein microarrays are a suitable highly multiplexed platform for identification of autoantibody signatures against tumor-associated antigens (TAA). These microarrays can be probed using 0.1 mg immunoglobulin G (IgG), purified from 10 µL of plasma. We used a microarray comprising recombinant proteins derived from 15,417 cDNA clones for the screening of 100 lung cancer samples, including 25 samples of each main histological entity of lung cancer, and 100 controls. Since this number of samples cannot be processed at once, the resulting data showed non-biological variances due to “batch effects”. Our aim was to evaluate quantile normalization, “distance-weighted discrimination” (DWD), and “ComBat” for their effectiveness in data pre-processing for elucidating diagnostic immune-signatures. “ComBat” data adjustment outperformed the other methods and allowed us to identify classifiers for all lung cancer cases versus controls and small-cell, squamous cell, large-cell, and adenocarcinoma of the lung with an accuracy of 85%, 94%, 96%, 92%, and 83% (sensitivity of 0.85, 0.92, 0.96, 0.88, 0.83; specificity of 0.85, 0.96, 0.96, 0.96, 0.83), respectively. These promising data would be the basis for further validation using targeted autoantibody tests.
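
    Of the three strategies compared, quantile normalization is the simplest to write down, so a minimal sketch of it is given below for orientation; ComBat, the method that performed best here, explicitly models and removes batch effects and is not reproduced in this sketch.

        import numpy as np

        def quantile_normalize(X):
            """X: features x arrays. Forces every array to share one distribution."""
            ranks = np.argsort(np.argsort(X, axis=0), axis=0)   # within-array ranks
            mean_quantiles = np.sort(X, axis=0).mean(axis=1)    # averaged distribution
            return mean_quantiles[ranks]

        X = np.random.lognormal(4.0, 1.0, size=(1000, 6))       # 1000 features, 6 arrays
        Xn = quantile_normalize(X)
        s = np.sort(Xn, axis=0)
        print(np.allclose(s[:, 0], s[:, 1]))                    # -> True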

  17. PERFORMANCE CHARACTERISTICS OF 65-MER OLIGONUCLEOTIDE MICROARRAYS

    PubMed Central

    Lee, Myoyong; Xiang, Charlie C.; Trent, Jeffrey M.; Bittner, Michael L.

    2009-01-01

    Microarray fabrication using pre-synthesized long oligonucleotides is becoming increasingly important, but no study of large-scale array production has yet been published. We addressed the issue of fabricating oligonucleotide microarrays by spotting commercial, pre-synthesized 65-mers with 5′ amines representing 7500 murine genes. Amine-modified oligonucleotides were immobilized on glass slides having aldehyde groups via transient Schiff base formation followed by reduction to produce a covalent conjugate. When RNA derived from the same source was used for Cy3 and Cy5 labeling and hybridized to the same array, signal intensities spanning three orders of magnitude were observed, and the coefficient of variation between the two channels for all spots was 8-10%. To ascertain the reproducibility of ratio determination of these arrays, two triplicate hybridizations (with fluorochrome reversal) comparing RNAs from a fibroblast (NIH3T3) and a breast cancer (JC) cell line were carried out. The 95% confidence interval for all spots in the six hybridizations was 0.60-1.66. This level of reproducibility allows use of the full range of pattern finding and discriminant analysis typically applied to cDNA microarrays. Further comparative testing was carried out with oligonucleotide microarrays, cDNA microarrays and RT-PCR assays to examine the comparability of results across these different methodologies. PMID:17617369

  18. Advancing microarray assembly with acoustic dispensing technology.

    PubMed

    Wong, E Y; Diamond, S L

    2009-01-01

    In the assembly of microarrays and microarray-based chemical assays and enzymatic bioassays, most approaches use pins for contact spotting. Acoustic dispensing is a technology capable of nanoliter transfers by using acoustic energy to eject liquid sample from an open source well. Although typically used for well plate transfers, when applied to microarraying it avoids the drawbacks of undesired physical contact with the sample; difficulty in assembling multicomponent reactions on a chip by readdressing; a rigid mode of printing that lacks patterning capabilities; and time-consuming wash steps. We demonstrated the utility of acoustic dispensing by delivering human cathepsin L in a drop-on-drop fashion into individual 50-nanoliter, prespotted reaction volumes to activate enzyme reactions at targeted positions on a microarray. We generated variable-sized spots ranging from 200 to 750 µm (and higher) and handled the transfer of fluorescent bead suspensions with increasing source well concentrations of 0.1 to 10 × 10^8 beads/mL in a linear fashion. There are no tips that can clog, and liquid dispensing CVs are generally below 5%. This platform expands the toolbox for generating analytical arrays and meets needs associated with spatially addressed assembly of multicomponent microarrays on the nanoliter scale. PMID:19035650

  19. A Synthetic Kinome Microarray Data Generator

    PubMed Central

    Maleki, Farhad; Kusalik, Anthony

    2015-01-01

    Cellular pathways involve the phosphorylation and dephosphorylation of proteins. Peptide microarrays called kinome arrays facilitate the measurement of the phosphorylation activity of hundreds of proteins in a single experiment. Analyzing the data from kinome microarrays is a multi-step process. Typically, various techniques are possible for a particular step, and it is necessary to compare and evaluate them. Such evaluations require data for which correct analysis results are known. Unfortunately, such kinome data is not readily available in the community. Further, there are no established techniques for creating artificial kinome datasets with known results and with the same characteristics as real kinome datasets. In this paper, a methodology for generating synthetic kinome array data is proposed. The methodology relies on actual intensity measurements from kinome microarray experiments and preserves their subtle characteristics. The utility of the methodology is demonstrated by evaluating methods for eliminating heterogeneous variance in kinome microarray data. Phosphorylation intensities from kinome microarrays often exhibit such heterogeneous variance and its presence can negatively impact downstream statistical techniques that rely on homogeneity of variance. It is shown that using the output from the proposed synthetic data generator, it is possible to critically compare two variance stabilization methods.
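
    The heart of such a generator is drawing intensities whose noise scales with signal, so that the "correct answer" is known when scoring a variance-stabilizing method. A minimal sketch under that assumption follows; the parameters are illustrative, not the paper's fitted values.

        import numpy as np

        rng = np.random.default_rng(0)

        def synthetic_kinome(n_peptides=300, n_replicates=9, cv=0.25):
            """Return (true_means, data) with mean-dependent (heterogeneous) variance."""
            true_means = rng.lognormal(mean=7.0, sigma=1.0, size=n_peptides)
            # Noise standard deviation proportional to the true mean.
            noise = rng.normal(0.0, cv * true_means[:, None],
                               size=(n_peptides, n_replicates))
            return true_means, true_means[:, None] + noise

        truth, data = synthetic_kinome()
        # Replicate spread tracks the true mean, i.e. variance is heterogeneous.
        print(np.corrcoef(truth, data.std(axis=1))[0, 1])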

  20. t-Test at the Probe Level: An Alternative Method to Identify Statistically Significant Genes for Microarray Data

    PubMed Central

    Boareto, Marcelo; Caticha, Nestor

    2014-01-01

    Microarray data analysis typically consists of identifying a list of differentially expressed genes (DEG), i.e., the genes that are differentially expressed between two experimental conditions. Variance shrinkage methods have been considered a better choice than the standard t-test for selecting the DEG because they correct the dependence of the error on the expression level. This dependence is mainly caused by errors in background correction, which more severely affect genes with low expression values. Here, we propose a new method for identifying the DEG that overcomes this issue and does not require background correction or variance shrinkage. Unlike current methods, our methodology is easy to understand and implement. It consists of applying the standard t-test directly on the normalized intensity data, which is possible because the probe intensity is proportional to the gene expression level and because the t-test is scale- and location-invariant. This methodology considerably improves the sensitivity and robustness of the list of DEG when compared with the t-test applied to preprocessed data and to the most widely used shrinkage methods, Significance Analysis of Microarrays (SAM) and Linear Models for Microarray Data (LIMMA). Our approach is useful especially when the genes of interest have small differences in expression and therefore get ignored by standard variance shrinkage methods.
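
    Because the method is just the standard t-test applied at the probe level, it is short to sketch. The example below assumes normalized probe intensities for one gene, pools probes as extra replicates within each condition, and applies scipy's two-sample t-test; the data are synthetic.

        import numpy as np
        from scipy.stats import ttest_ind

        def probe_level_ttest(probes_a, probes_b):
            """probes_a/b: (n_probes, n_arrays) normalized intensities for one
            gene under conditions A and B; probes pooled as extra replicates."""
            return ttest_ind(probes_a.ravel(), probes_b.ravel())

        a = np.random.lognormal(5.0, 0.3, size=(11, 4))   # 11 probes, 4 arrays
        b = np.random.lognormal(5.2, 0.3, size=(11, 4))
        t, p = probe_level_ttest(a, b)
        print(f"t = {t:.2f}, p = {p:.3g}")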

  1. Optimization of the performances of correlation filters by pre-processing the input plane

    NASA Astrophysics Data System (ADS)

    Bouzidi, F.; Elbouz, M.; Alfalou, A.; Brosseau, C.; Fakhfakh, A.

    2016-01-01

    We report findings on the optimization of the performance of correlation filters. First, we propound and validate an optimization of ROC curves adapted to the correlation technique. Then, analysis suggests that a pre-processing of the input plane leads to a compromise between the robustness of the adapted filter and the discrimination of the inverse filter for face recognition applications. Rewardingly, our technical results demonstrate that this method is remarkably effective at increasing the performance of a VanderLugt correlator.

  2. Hyperspectral imaging in medicine: image pre-processing problems and solutions in Matlab.

    PubMed

    Koprowski, Robert

    2015-11-01

    The paper presents problems and solutions related to hyperspectral image pre-processing. New methods of preliminary image analysis are proposed. The paper shows problems occurring in Matlab when trying to analyse this type of image. Moreover, new methods are discussed which provide the source code in Matlab that can be used in practice without any licensing restrictions. A sample application and results of hyperspectral image analysis are also presented. PMID:25676816

  3. Learning-based image preprocessing for robust computer-aided detection

    NASA Astrophysics Data System (ADS)

    Raghupathi, Laks; Devarakota, Pandu R.; Wolf, Matthias

    2013-03-01

    Recent studies have shown that low-dose computed tomography (LDCT) can be an effective screening tool to reduce lung cancer mortality. Computer-aided detection (CAD) would be a beneficial second reader for radiologists in such cases. Studies demonstrate that while iterative reconstructions (IR) improve LDCT diagnostic quality, they degrade CAD performance significantly (increased false positives) when applied directly. For improving CAD performance, solutions such as retraining with newer data or applying a standard preprocessing technique may not suffice due to the high prevalence of CT scanners and non-uniform acquisition protocols. Here, we present a learning-based framework that can adaptively transform a wide variety of input data to boost an existing CAD performance. This not only enhances robustness but also applicability in clinical workflows. Our solution consists of applying a suitable pre-processing filter automatically on the given image based on its characteristics. This requires the preparation of ground truth (GT) for choosing an appropriate filter, resulting in improved CAD performance. Accordingly, we propose an efficient consolidation process with a novel metric. Using key anatomical landmarks, we then derive consistent feature descriptors for the classification scheme, which then uses a priority mechanism to automatically choose an optimal preprocessing filter. We demonstrate CAD prototype performance improvement using hospital-scale datasets acquired from North America, Europe and Asia. Though we demonstrated our results for a lung nodule CAD, this scheme is straightforward to extend to other post-processing tools dedicated to other organs and modalities.

  4. Long synthetic oligonucleotides for microarray expression measurement

    NASA Astrophysics Data System (ADS)

    Li, Jiong; Wang, Hong; Liu, Heping; Zhang, M.; Zhang, Chunxiu; Lu, Zu-Hong; Gao, Xiang; Kong, Dong

    2001-09-01

    There are generally two kinds of DNA microarray used for genomic-scale gene expression profiling of mRNA, cDNA arrays and DNA chips, but both of them suffer from some drawbacks. To meet further requirements, another type of oligonucleotide microarray, with long synthetic oligonucleotides, was produced. This type of microarray has the advantages of low cost, minimal cross-hybridization, and being flexible and easy to make, which makes it well suited to small laboratories with special purposes. In this paper, we devised probes with different lengths, GC contents and gene positions to optimize the probe design. Experiments showed 70-mer probes are suitable for both sufficient sensitivity and reasonable costs. Higher GC content produces stronger signal intensity and thus better sensitivity, and probes designed within 300 bp of the 3′ untranslated region of a gene should be best for both sensitivity and specificity.

  5. Protein microarrays for parasite antigen discovery.

    PubMed

    Driguez, Patrick; Doolan, Denise L; Molina, Douglas M; Loukas, Alex; Trieu, Angela; Felgner, Phil L; McManus, Donald P

    2015-01-01

    The host serological profile to a parasitic infection, such as schistosomiasis, can be used to define potential vaccine and diagnostic targets. Determining the host antibody response using traditional approaches is hindered by the large number of putative antigens in any parasite proteome. Parasite protein microarrays offer the potential for a high-throughput host antibody screen to simplify this task. In order to construct the array, parasite proteins are selected from available genomic sequence and protein databases using bioinformatic tools. Selected open reading frames are PCR amplified, incorporated into a vector for cell-free protein expression, and printed robotically onto glass slides. The protein microarrays can be probed with antisera from infected/immune animals or humans and the antibody reactivity measured with fluorophore labeled antibodies on a confocal laser microarray scanner to identify potential targets for diagnosis or therapeutic or prophylactic intervention. PMID:25388117

  6. Applications of protein microarrays for biomarker discovery

    PubMed Central

    Ramachandran, Niroshan; Srivastava, Sanjeeva; LaBaer, Joshua

    2011-01-01

    The search for new biomarkers for diagnosis, prognosis and therapeutic monitoring of diseases continues in earnest despite dwindling success at finding novel reliable markers. Some of the current markers in clinical use do not provide optimal sensitivity and specificity, with the prostate-specific antigen (PSA) being one of many such examples. The emergence of proteomic techniques and systems approaches to study disease pathophysiology has rekindled the quest for new biomarkers. In particular, the use of protein microarrays has surged as a powerful tool for large-scale testing of biological samples. Approximately half the reports on protein microarrays have been published in the last two years, especially in the area of biomarker discovery. In this review, we will discuss the application of protein microarray technologies that offer unique opportunities to find novel biomarkers. PMID:21136793

  7. Hybridization and Selective Release of DNA Microarrays

    SciTech Connect

    Beer, N R; Baker, B; Piggott, T; Maberry, S; Hara, C M; DeOtte, J; Benett, W; Mukerjee, E; Dzenitis, J; Wheeler, E K

    2011-11-29

    DNA microarrays contain sequence specific probes arrayed in distinct spots numbering from 10,000 to over 1,000,000, depending on the platform. This tremendous degree of multiplexing gives microarrays great potential for environmental background sampling, broad-spectrum clinical monitoring, and continuous biological threat detection. In practice, their use in these applications is not common due to limited information content, long processing times, and high cost. This work focused on characterizing the phenomena of microarray hybridization and selective release that will allow these limitations to be addressed, which will revolutionize the ways that microarrays can be used for LLNL's Global Security missions. The goals of this project were two-fold: automated, faster hybridizations and selective release of hybridized features. The first study area involves hybridization kinetics and mass-transfer effects. The standard hybridization protocol uses an overnight incubation to achieve the best possible signal for any sample type, as well as for convenience in manual processing. There is potential to significantly shorten this time based on better understanding and control of the rate-limiting processes and knowledge of the progress of the hybridization. In the hybridization work, a custom microarray flow cell was used to manipulate the chemical and thermal environment of the array and autonomously image the changes over time during hybridization. The second study area is selective release. Microarrays easily generate hybridization patterns and signatures, but there is still an unmet need for methodologies enabling rapid and selective analysis of these patterns and signatures. Detailed analysis of individual spots by subsequent sequencing could potentially yield significant information for rapidly mutating and emerging (or deliberately engineered) pathogens. In the selective release work, optical energy deposition with coherent light quickly provides the thermal energy to

  8. Overview of DNA microarrays: types, applications, and their future.

    PubMed

    Bumgarner, Roger

    2013-01-01

    This unit provides an overview of DNA microarrays. Microarrays are a technology in which thousands of nucleic acids are bound to a surface and are used to measure the relative concentration of nucleic acid sequences in a mixture via hybridization and subsequent detection of the hybridization events. This overview first discusses the history of microarrays and the antecedent technologies that led to their development. This is followed by discussion of the methods of manufacture of microarrays and the most common biological applications. The unit ends with a brief description of the limitations of microarrays and discusses how microarrays are being rapidly replaced by DNA sequencing technologies. PMID:23288464

  9. The use of microarrays in microbial ecology

    SciTech Connect

    Andersen, G.L.; He, Z.; DeSantis, T.Z.; Brodie, E.L.; Zhou, J.

    2009-09-15

    Microarrays have proven to be a useful and high-throughput method to provide targeted DNA sequence information for up to many thousands of specific genetic regions in a single test. A microarray consists of multiple DNA oligonucleotide probes that, under high-stringency conditions, hybridize only to specific complementary nucleic acid sequences (targets). A fluorescent signal indicates the presence and, in many cases, the abundance of the genetic regions of interest. In this chapter we look at how microarrays are used in microbial ecology, especially with the recent increase in microbial community DNA sequence data. Of particular interest to microbial ecologists, phylogenetic microarrays are used for the analysis of phylotypes in a community, and functional gene arrays are used for the analysis of functional genes and, by inference, phylotypes in environmental samples. A phylogenetic microarray developed by the Andersen laboratory, the PhyloChip, is discussed as an example of a microarray that targets the known diversity within the 16S rRNA gene to determine microbial community composition. Using multiple confirmatory probes to increase the confidence of detection, and a mismatch probe for every perfect-match probe to minimize the effect of cross-hybridization by non-target regions, the PhyloChip is able to simultaneously identify any of thousands of taxa present in an environmental sample. The PhyloChip is shown to reveal greater diversity within a community than rRNA gene sequencing, due to the placement of the entire gene product on the microarray, compared with the analysis of up to thousands of individual molecules by traditional sequencing methods. A functional gene array developed by the Zhou laboratory, the GeoChip, is discussed as an example of a microarray that dynamically identifies functional activities of multiple members within a community. The recent version of the GeoChip contains more than 24,000 50-mer probes.
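
    The perfect-match/mismatch (PM/MM) probe-pair logic described above can be illustrated with a small Python scoring sketch: a taxon is called present only when most of its confirmatory probe pairs respond. The thresholds and the voting rule are assumptions for illustration, not the published PhyloChip detection criteria.

      import numpy as np

      # A probe pair is "positive" when PM clearly exceeds MM in both
      # ratio and absolute difference; a taxon is called present when a
      # large fraction of its confirmatory pairs are positive.
      def call_taxon(pm, mm, ratio=1.3, min_diff=50.0, frac_required=0.9):
          positive = (pm > ratio * mm) & (pm - mm > min_diff)
          return positive.mean() >= frac_required

      rng = np.random.default_rng(1)
      pm = rng.normal(800, 50, size=24)   # 24 confirmatory probe pairs
      mm = rng.normal(300, 50, size=24)   # mismatch controls
      print(call_taxon(pm, mm))           # True for this synthetic signal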

  10. Pineal function: impact of microarray analysis.

    PubMed

    Klein, David C; Bailey, Michael J; Carter, David A; Kim, Jong-so; Shi, Qiong; Ho, Anthony K; Chik, Constance L; Gaildrat, Pascaline; Morin, Fabrice; Ganguly, Surajit; Rath, Martin F; Møller, Morten; Sugden, David; Rangel, Zoila G; Munson, Peter J; Weller, Joan L; Coon, Steven L

    2010-01-27

    Microarray analysis has provided a new understanding of pineal function by identifying genes that are highly expressed in this tissue relative to other tissues and also by identifying over 600 genes that are expressed on a 24-h schedule. This effort has highlighted surprising similarity to the retina and has provided reason to explore new avenues of study including intracellular signaling, signal transduction, transcriptional cascades, thyroid/retinoic acid hormone signaling, metal biology, RNA splicing, and the role the pineal gland plays in the immune/inflammation response. The new foundation that microarray analysis has provided will broadly support future research on pineal function. PMID:19622385

  11. MicroRNA expression profiling using microarrays.

    PubMed

    Love, Cassandra; Dave, Sandeep

    2013-01-01

    MicroRNAs are small noncoding RNAs which are able to regulate gene expression at both the transcriptional and translational levels. There is a growing recognition of the role of microRNAs in nearly every tissue type and cellular process. Thus there is an increasing need for accurate quantitation of microRNA expression in a variety of tissues. Microarrays provide a robust method for the examination of microRNA expression. In this chapter, we describe detailed methods for the use of microarrays to measure microRNA expression and discuss methods for the analysis of microRNA expression data. PMID:23666707

  12. Protein Microarrays for the Detection of Biothreats

    NASA Astrophysics Data System (ADS)

    Herr, Amy E.

    Although protein microarrays have proven to be an important tool in proteomics research, the technology is also emerging as useful for public health and defense applications. Recent progress in the measurement and characterization of biothreat agents is reviewed in this chapter. Details concerning the validation of various protein microarray formats, from contact-printed sandwich assays to supported lipid bilayers, are presented. The reviewed technologies have important implications for the in vitro characterization of toxin-ligand interactions, serotyping of bacteria, and screening of potential biothreat inhibitors, and as core components of biosensors, among other research and engineering applications.

  13. Quality Control Usage in High-Density Microarrays Reveals Differential Gene Expression Profiles in Ovarian Cancer.

    PubMed

    Villegas-Ruiz, Vanessa; Moreno, Jose; Jacome-Lopez, Karina; Zentella-Dehesa, Alejandro; Juarez-Mendez, Sergio

    2016-01-01

    There are several existing reports of microarray chip use for assessment of altered gene expression in different diseases. In fact, over 1.5 million assays of this kind have been performed over the last twenty years, influencing clinical and translational research studies. The most commonly used DNA microarray platform is the Affymetrix GeneChip, with its associated quality control software and GeneChip Probe Arrays. These chips are created using several quality controls to confirm the success of each assay, but their actual impact on gene expression profiles had not been analyzed until the appearance of several bioinformatics tools for this purpose. We here performed a data mining analysis, in this case specifically focused on ovarian cancer, as well as healthy ovarian tissue and ovarian cell lines, in order to confirm quality control results and associated variation in gene expression profiles. The microarray data used in our research were downloaded from ArrayExpress and Gene Expression Omnibus (GEO) and analyzed with Expression Console Software using the RMA, MAS5 and PLIER algorithms. The gene expression profiles were obtained using Partek Genomics Suite v6.6 and data were visualized using principal component analysis, heat maps, and Venn diagrams. Microarray quality control analysis showed that roughly 40% of the microarray files were false negatives, demonstrating over- and under-estimation of expressed genes. Additionally, we confirmed the results by performing a second analysis using independent samples. About 70% of the significantly expressed genes were correlated in both analyses. These results demonstrate the importance of appropriate microarray processing for obtaining a reliable gene expression profile. PMID:27268623
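
    The kind of post-normalization check described above can be sketched in Python: project the normalized samples onto principal components and look for structure driven by processing batch rather than biology. The expression matrix below is synthetic, and RMA/MAS5/PLIER summarization is assumed to have been done upstream.

      import numpy as np
      from sklearn.decomposition import PCA

      rng = np.random.default_rng(0)
      n_genes, n_samples = 2000, 12
      expr = rng.normal(8.0, 1.0, size=(n_genes, n_samples))
      expr[:, 6:] += 0.5   # simulate a batch/QC offset in half the samples

      pca = PCA(n_components=2)
      scores = pca.fit_transform(expr.T)   # samples as rows
      print("explained variance:", np.round(pca.explained_variance_ratio_, 3))
      for i, (pc1, pc2) in enumerate(scores):
          # Samples 6-11 should separate from 0-5 along PC1, flagging
          # a processing artifact rather than a biological signal.
          print("sample %2d: PC1=%7.2f PC2=%7.2f" % (i, pc1, pc2))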

  14. Microarray data mining: A novel optimization-based approach to uncover biologically coherent structures

    PubMed Central

    Tan, Meng P; Smith, Erin N; Broach, James R; Floudas, Christodoulos A

    2008-01-01

    Background DNA microarray technology allows for the measurement of genome-wide expression patterns. Within the resultant mass of data lies the problem of analyzing and presenting information on this genomic scale, and a first step towards the rapid and comprehensive interpretation of this data is gene clustering with respect to the expression patterns. Classifying genes into clusters can lead to interesting biological insights. In this study, we describe an iterative clustering approach to uncover biologically coherent structures from DNA microarray data based on a novel clustering algorithm EP_GOS_Clust. Results We apply our proposed iterative algorithm to three sets of experimental DNA microarray data from experiments with the yeast Saccharomyces cerevisiae and show that the proposed iterative approach improves biological coherence. Comparison with other clustering techniques suggests that our iterative algorithm provides superior performance with regard to biological coherence. An important consequence of our approach is that an increasing proportion of genes find membership in clusters of high biological coherence and that the average cluster specificity improves. Conclusion The results from these clustering experiments provide a robust basis for extracting motifs and trans-acting factors that determine particular patterns of expression. In addition, the biological coherence of the clusters is iteratively assessed independently of the clustering. Thus, this method will not be severely impacted by functional annotations that are missing, inaccurate, or sparse. PMID:18538024

  15. A Combinational Clustering Based Method for cDNA Microarray Image Segmentation

    PubMed Central

    Shao, Guifang; Li, Tiejun; Zuo, Wangda; Wu, Shunxiang; Liu, Tundong

    2015-01-01

    Microarray technology plays an important role in drawing useful biological conclusions by analyzing thousands of gene expressions simultaneously. In particular, image analysis is a key step in microarray analysis and its accuracy strongly depends on segmentation. The pioneering works on clustering-based segmentation have shown that the k-means and moving k-means clustering algorithms are two commonly used methods in microarray image processing. However, they often produce unsatisfactory results because real microarray images contain noise, artifacts and spots that vary in size, shape and contrast. To improve the segmentation accuracy, in this article we present a combinational clustering based segmentation approach that may be more reliable and able to segment spots automatically. First, this new method starts with a very simple but effective contrast enhancement operation to improve the image quality. Then, an automatic gridding based on the maximum between-class variance is applied to separate the spots into independent areas. Next, within each spot region, moving k-means clustering is first conducted to separate the spot from the background, and k-means clustering is then added for those spots whose entire boundary could not be obtained. Finally, a refinement step is used to correct false segmentations and recover missing or inseparable spots. In addition, quantitative comparisons between the improved method and four other segmentation algorithms--edge detection, thresholding, k-means clustering and moving k-means clustering--are carried out on cDNA microarray images from six different data sets. Experiments on six different data sets, 1) Stanford Microarray Database (SMD), 2) Gene Expression Omnibus (GEO), 3) Baylor College of Medicine (BCM), 4) Swiss Institute of Bioinformatics (SIB), 5) Joe DeRisi’s individual tiff files (DeRisi), and 6) University of California, San Francisco (UCSF), indicate that the improved approach is
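
    The per-spot clustering step is the core of such methods. A minimal Python sketch of intensity-based two-class k-means on one spot region follows; the contrast enhancement, between-class-variance gridding, moving k-means and refinement steps of the full method are omitted, and the synthetic spot is only for demonstration.

      import numpy as np
      from sklearn.cluster import KMeans

      def segment_spot(region):
          """Return a boolean mask of the spot pixels in a 2-D region."""
          pixels = region.reshape(-1, 1).astype(float)
          labels = KMeans(n_clusters=2, n_init=10,
                          random_state=0).fit_predict(pixels)
          # Take the brighter of the two clusters to be the spot.
          spot = int(pixels[labels == 1].mean() > pixels[labels == 0].mean())
          return (labels == spot).reshape(region.shape)

      rng = np.random.default_rng(2)
      region = rng.normal(20, 5, size=(32, 32))            # background
      yy, xx = np.mgrid[:32, :32]
      region[(yy - 16) ** 2 + (xx - 16) ** 2 < 100] += 80  # synthetic spot
      print(segment_spot(region).sum(), "pixels labeled as spot")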

  16. A Combinational Clustering Based Method for cDNA Microarray Image Segmentation.

    PubMed

    Shao, Guifang; Li, Tiejun; Zuo, Wangda; Wu, Shunxiang; Liu, Tundong

    2015-01-01

    Microarray technology plays an important role in drawing useful biological conclusions by analyzing thousands of gene expressions simultaneously. In particular, image analysis is a key step in microarray analysis and its accuracy strongly depends on segmentation. The pioneering works on clustering-based segmentation have shown that the k-means and moving k-means clustering algorithms are two commonly used methods in microarray image processing. However, they often produce unsatisfactory results because real microarray images contain noise, artifacts and spots that vary in size, shape and contrast. To improve the segmentation accuracy, in this article we present a combinational clustering based segmentation approach that may be more reliable and able to segment spots automatically. First, this new method starts with a very simple but effective contrast enhancement operation to improve the image quality. Then, an automatic gridding based on the maximum between-class variance is applied to separate the spots into independent areas. Next, within each spot region, moving k-means clustering is first conducted to separate the spot from the background, and k-means clustering is then added for those spots whose entire boundary could not be obtained. Finally, a refinement step is used to correct false segmentations and recover missing or inseparable spots. In addition, quantitative comparisons between the improved method and four other segmentation algorithms--edge detection, thresholding, k-means clustering and moving k-means clustering--are carried out on cDNA microarray images from six different data sets. Experiments on six different data sets, 1) Stanford Microarray Database (SMD), 2) Gene Expression Omnibus (GEO), 3) Baylor College of Medicine (BCM), 4) Swiss Institute of Bioinformatics (SIB), 5) Joe DeRisi's individual tiff files (DeRisi), and 6) University of California, San Francisco (UCSF), indicate that the improved approach is

  17. Analysis of microarray leukemia data using an efficient MapReduce-based K-nearest-neighbor classifier.

    PubMed

    Kumar, Mukesh; Rath, Nitish Kumar; Rath, Santanu Kumar

    2016-04-01

    Microarray-based gene expression profiling has emerged as an efficient technique for classification, prognosis, diagnosis, and treatment of cancer. Frequent changes in the behavior of this disease generate an enormous volume of data. Microarray data satisfy both the veracity and velocity properties of big data, as they keep changing with time. Therefore, the analysis of microarray datasets in a small amount of time is essential. Such datasets often contain a large number of expression values, but only a fraction of them correspond to genes that are significantly expressed. The precise identification of the genes of interest that are responsible for causing cancer is imperative in microarray data analysis. Most existing schemes employ a two-phase process: feature selection/extraction followed by classification. In this paper, various statistical methods (tests) based on MapReduce are proposed for selecting relevant features. After feature selection, a MapReduce-based K-nearest-neighbor (mrKNN) classifier is also employed to classify microarray data. These algorithms are implemented in a Hadoop framework. A comparative analysis is done on these MapReduce-based models using microarray datasets of various dimensions. From the obtained results, it is observed that these models consume much less execution time than conventional models in processing big data. PMID:26975600
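
    The map/reduce decomposition of KNN can be sketched without Hadoop: each mapper finds the k nearest training samples to a query within its own partition, and the reducer merges the partial lists into a global top-k before voting. This is a generic illustration of the mrKNN idea under those assumptions, with plain arrays standing in for HDFS partitions.

      import numpy as np
      from functools import reduce

      def mapper(part_X, part_y, query, k):
          # Local k nearest neighbors within one data partition.
          d = np.linalg.norm(part_X - query, axis=1)
          idx = np.argsort(d)[:k]
          return list(zip(d[idx], part_y[idx]))   # (distance, label) pairs

      def reducer(a, b, k=5):
          return sorted(a + b)[:k]                # keep the global top-k

      rng = np.random.default_rng(0)
      X = rng.normal(size=(300, 50))              # 50 "genes" per sample
      y = (X[:, 0] > 0).astype(int)
      query = rng.normal(size=50)

      parts = np.array_split(np.arange(300), 4)   # 4 map partitions
      partials = [mapper(X[p], y[p], query, k=5) for p in parts]
      top_k = reduce(lambda a, b: reducer(a, b, k=5), partials)
      labels = [lab for _, lab in top_k]
      print("predicted class:", max(set(labels), key=labels.count))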

  18. Multi-objective dynamic population shuffled frog-leaping biclustering of microarray data

    PubMed Central

    2012-01-01

    Background Multi-objective optimization (MOO) involves optimization problems with multiple objectives. Generally, these objectives measure very different aspects of the solutions, and the aspects are often in conflict with each other. MOO first obtains a Pareto set, and then looks for both commonality and systematic variations across the set. For large-scale data sets, heuristic search algorithms such as evolutionary algorithms (EAs) combined with MOO techniques are ideal. DNA microarray technology can measure the transcriptional response of a complete genome to different experimental conditions and yields many large-scale datasets. Biclustering techniques can simultaneously cluster the rows and columns of a dataset and help extract more accurate information from those datasets. Biclustering requires the optimization of several conflicting objectives and can therefore be solved with MOO methods. As a heuristics-based optimization approach, particle swarm optimization (PSO) simulates the movements of a bird flock finding food. The shuffled frog-leaping algorithm (SFL) is a population-based cooperative search metaphor combining the benefits of the local search of PSO and the global shuffling of information of the shuffled complex evolution technique; SFL is used to solve optimization problems over large-scale datasets. Results This paper integrates a dynamic population strategy and the shuffled frog-leaping algorithm into biclustering of microarray data, and proposes a novel multi-objective dynamic population shuffled frog-leaping biclustering (MODPSFLB) algorithm to mine maximal biclusters from microarray data. Experimental results show that the proposed MODPSFLB algorithm can effectively find significant biological structures in terms of related biological processes, components and molecular functions. Conclusions The proposed MODPSFLB algorithm has good diversity and fast convergence of Pareto solutions and can become a powerful tool for systematic functional analysis in genome research. PMID:22759615
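
    Two conflicting objectives that microarray biclustering typically balances are bicluster size (to maximize) and coherence, often measured by the Cheng-Church mean squared residue (to minimize). Whether MODPSFLB uses exactly these objectives is an assumption; the Python sketch below shows only the generic residue computation.

      import numpy as np

      def mean_squared_residue(A, rows, cols):
          # Cheng-Church residue: deviation of each entry from the
          # additive model of row mean + column mean - overall mean.
          B = A[np.ix_(rows, cols)]
          r_mean = B.mean(axis=1, keepdims=True)
          c_mean = B.mean(axis=0, keepdims=True)
          residue = B - r_mean - c_mean + B.mean()
          return float((residue ** 2).mean())

      rng = np.random.default_rng(3)
      A = rng.normal(size=(100, 40))
      rows, cols = np.arange(10), np.arange(8)
      # Plant a perfectly additive (coherent) block: residue ~ 0.
      A[np.ix_(rows, cols)] = np.arange(10)[:, None] + np.arange(8)[None, :]
      print("coherent block MSR:", mean_squared_residue(A, rows, cols))
      print("random block MSR:  ",
            mean_squared_residue(A, np.arange(50, 60), np.arange(20, 28)))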

  19. PRACTICAL STRATEGIES FOR PROCESSING AND ANALYZING SPOTTED OLIGONUCLEOTIDE MICROARRAY DATA

    EPA Science Inventory

    Thoughtful data analysis is as important as experimental design, biological sample quality, and appropriate experimental procedures for making microarrays a useful supplement to traditional toxicology. In the present study, spotted oligonucleotide microarrays were used to profile...

  20. MICROARRAY DATA ANALYSIS USING MULTIPLE STATISTICAL MODELS

    EPA Science Inventory

    Microarray Data Analysis Using Multiple Statistical Models

    Wenjun Bao1, Judith E. Schmid1, Amber K. Goetz1, Ming Ouyang2, William J. Welsh2,Andrew I. Brooks3,4, ChiYi Chu3,Mitsunori Ogihara3,4, Yinhe Cheng5, David J. Dix1. 1National Health and Environmental Effects Researc...

  1. Microarrays (DNA Chips) for the Classroom Laboratory

    ERIC Educational Resources Information Center

    Barnard, Betsy; Sussman, Michael; BonDurant, Sandra Splinter; Nienhuis, James; Krysan, Patrick

    2006-01-01

    We have developed and optimized the necessary laboratory materials to make DNA microarray technology accessible to all high school students at a fraction of both cost and data size. The primary component is a DNA chip/array that students "print" by hand and then analyze using research tools that have been adapted for classroom use. The primary…

  2. DISC-BASED IMMUNOASSAY MICROARRAYS. (R825433)

    EPA Science Inventory

    Microarray technology as applied to areas that include genomics, diagnostics, environmental, and drug discovery, is an interesting research topic for which different chip-based devices have been developed. As an alternative, we have explored the principle of compact disc-based...

  3. Diagnostic Oligonucleotide Microarray Fingerprinting of Bacillus Isolates

    SciTech Connect

    Chandler, Darrell P.; Alferov, Oleg; Chernov, Boris; Daly, Don S.; Golova, Julia; Perov, Alexander N.; Protic, Miroslava; Robison, Richard; Shipma, Matthew; White, Amanda M.; Willse, Alan R.

    2006-01-01

    A diagnostic, genome-independent microbial fingerprinting method using DNA oligonucleotide microarrays was used for high-resolution differentiation between closely related Bacillus strains, including two strains of Bacillus anthracis that are monomorphic (indistinguishable) via amplified fragment length polymorphism fingerprinting techniques. Replicated hybridizations on 391-probe nonamer arrays were used to construct a prototype fingerprint library for quantitative comparisons. Descriptive analysis of the fingerprints, including phylogenetic reconstruction, is consistent with previous taxonomic organization of the genus. Newly developed statistical analysis methods were used to quantitatively compare and objectively confirm apparent differences in microarray fingerprints with the statistical rigor required for microbial forensics and clinical diagnostics. These data suggest that a relatively simple fingerprinting microarray and statistical analysis method can differentiate between species in the Bacillus cereus complex, and between strains of B. anthracis. A synthetic DNA standard was used to understand underlying microarray and process-level variability, leading to specific recommendations for the development of a standard operating procedure and/or continued technology enhancements for microbial forensics and diagnostics.

  4. Shrinkage covariance matrix approach for microarray data

    NASA Astrophysics Data System (ADS)

    Karjanto, Suryaefiza; Aripin, Rasimah

    2013-04-01

    Microarray technology was developed for the purpose of monitoring the expression levels of thousands of genes. A microarray data set typically consists of tens of thousands of genes (variables) measured on just dozens of samples, due to various constraints including the high cost of producing microarray chips. As a result, the widely used standard covariance estimator is not appropriate for this setting. One affected technique is Hotelling's T2 statistic, a multivariate test statistic for comparing means between two groups: it requires that the number of observations (n) exceed the number of genes (p), but in microarray studies it is common that n < p, which leads to a biased estimate of the covariance matrix. In this study, Hotelling's T2 statistic with a shrinkage approach is proposed to estimate the covariance matrix for testing differential gene expression. The performance of this approach is then compared with other commonly used multivariate tests, using a widely analysed diabetes data set as an illustration. The results across the methods are consistent, implying that this approach provides an alternative to existing techniques.
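
    A minimal Python sketch of a shrinkage-based Hotelling's T2 follows: the pooled covariance is replaced by a Ledoit-Wolf shrinkage estimate, which remains well conditioned and invertible when n < p. Whether this matches the paper's exact shrinkage target is an assumption; it illustrates the general approach only.

      import numpy as np
      from sklearn.covariance import LedoitWolf

      def shrinkage_T2(X1, X2):
          n1, n2 = len(X1), len(X2)
          diff = X1.mean(axis=0) - X2.mean(axis=0)
          # Pool the group-centered samples, then shrink the covariance.
          pooled = np.vstack([X1 - X1.mean(axis=0), X2 - X2.mean(axis=0)])
          S = LedoitWolf().fit(pooled).covariance_
          return (n1 * n2 / (n1 + n2)) * diff @ np.linalg.solve(S, diff)

      rng = np.random.default_rng(4)
      p = 100                                   # genes (p > n)
      X1 = rng.normal(0.0, 1.0, size=(15, p))   # group 1: 15 samples
      X2 = rng.normal(0.3, 1.0, size=(12, p))   # group 2: shifted mean
      print("shrinkage T2 =", round(shrinkage_T2(X1, X2), 1))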

  5. Raman-based microarray readout: a review.

    PubMed

    Haisch, Christoph

    2016-07-01

    For a quarter of a century, microarrays have been part of the routine analytical toolbox. Label-based fluorescence detection is still the commonest optical readout strategy. Since the 1990s, a continuously increasing number of label-based as well as label-free experiments on Raman-based microarray readout concepts have been reported. This review summarizes the possible concepts and methods and their advantages and challenges. A common label-based strategy is based on the binding of selective receptors as well as Raman reporter molecules to plasmonic nanoparticles in a sandwich immunoassay, which results in surface-enhanced Raman scattering signals of the reporter molecule. Alternatively, capture of the analytes can be performed by receptors on a microarray surface. Addition of plasmonic nanoparticles again leads to a surface-enhanced Raman scattering signal, not of a label but directly of the analyte. This approach is mostly proposed for bacteria and cell detection. However, although many promising readout strategies have been discussed in numerous publications, rarely have any of them made the step from proof of concept to a practical application, let alone routine use. Graphical Abstract Possible realization of a SERS (Surface-Enhanced Raman Scattering) system for microarray readout. PMID:26973235

  6. Examining microarray slide quality for the EPA using SNL's hyperspectral microarray scanner.

    SciTech Connect

    Rohde, Rachel M.; Timlin, Jerilyn Ann

    2005-11-01

    This report summarizes research performed at Sandia National Laboratories (SNL) in collaboration with the Environmental Protection Agency (EPA) to assess microarray quality on arrays from two platforms of interest to the EPA. Custom microarrays from two novel, commercially produced array platforms were imaged with SNL's unique hyperspectral imaging technology and multivariate data analysis was performed to investigate sources of emission on the arrays. No extraneous sources of emission were evident in any of the array areas scanned. This led to the conclusions that either of these array platforms could produce high quality, reliable microarray data for the EPA toxicology programs. Hyperspectral imaging results are presented and recommendations for microarray analyses using these platforms are detailed within the report.

  7. An evolutionary and visual framework for clustering of DNA microarray data.

    PubMed

    Castellanos-Garzón, José A; Díaz, Fernando

    2013-01-01

    This paper presents a case study to show the competence of our evolutionary and visual framework for cluster analysis of DNA microarray data. The proposed framework joins a genetic algorithm for hierarchical clustering with a set of visual components for cluster tasks provided by a tool. The cluster visualization tool allows us to display different views of the clustering results as a means of visual cluster validation. The results of the genetic algorithm for clustering show that it can find better solutions than the other methods for the selected data set, demonstrating the reliability of the proposed framework. PMID:24231146

  8. Microarray analysis at single molecule resolution

    PubMed Central

    Mureşan, Leila; Jacak, Jarosław; Klement, Erich Peter; Hesse, Jan; Schütz, Gerhard J.

    2010-01-01

    Bioanalytical chip-based assays have been enormously improved in sensitivity in recent years; detection of trace amounts of substances down to the level of individual fluorescent molecules has become state-of-the-art technology. The impact of such detection methods, however, has not yet been fully exploited, mainly due to a lack of appropriate mathematical tools for robust data analysis. One particular example relates to the analysis of microarray data. While classical microarray analysis works at resolutions of two to 20 micrometers and quantifies the abundance of target molecules by determining average pixel intensities, a novel high-resolution approach [1] directly visualizes individual bound molecules as diffraction-limited peaks. The quantification via counting that this makes possible is less susceptible to labeling artifacts and background noise. We have developed an approach for the analysis of high-resolution microarray images. It consists, first, of a single-molecule detection step based on undecimated wavelet transforms and, second, of a spot identification step via a spatial statistics approach (corresponding to the segmentation step in classical microarray analysis). The detection method was tested on simulated images with a concentration range of 0.001 to 0.5 molecules per square micron and signal-to-noise ratio (SNR) between 0.9 and 31.6. For SNR above 15 the relative false-negative error was below 15%. Separation of foreground from background proved reliable provided the foreground density exceeds the background by a factor of 2. The method has also been applied to real data from high-resolution microarray measurements. PMID:20123580

  9. Facilitating functional annotation of chicken microarray data

    PubMed Central

    2009-01-01

    Background Modeling results from chicken microarray studies is challenging for researchers due to the little functional annotation associated with these arrays. The Affymetrix GeneChip chicken genome array, one of the biggest arrays serving as a key research tool for the study of chicken functional genomics, is among the few arrays that link gene products to Gene Ontology (GO). However, the GO annotation data presented by Affymetrix are incomplete; for example, they do not show references linked to manually annotated functions. In addition, there is no tool that allows microarray researchers to directly retrieve functional annotations for their datasets from the annotated arrays. This costs researchers a substantial amount of time searching multiple GO databases for functional information. Results We have improved the breadth of functional annotations of the gene products associated with probesets on the Affymetrix chicken genome array by 45% and the quality of annotation by 14%. We have also identified the most significant diseases and disorders, different types of genes, and known drug targets represented on the Affymetrix chicken genome array. To facilitate functional annotation of other arrays and microarray experimental datasets we developed an Array GO Mapper (AGOM) tool to help researchers quickly retrieve the corresponding functional information for their datasets. Conclusion Results from this study will directly facilitate annotation of other chicken arrays and microarray experimental datasets. Researchers will be able to quickly turn their microarray datasets into more reliable biological functional information by using the AGOM tool. The diseases, disorders, gene types and drug targets revealed in the study will allow researchers to learn more about how genes function in complex biological systems and may lead to new drug discovery and development of therapies. The GO annotation data generated will be available for public use via the AgBase website and will be updated on a regular basis.

  10. Identifying Fishes through DNA Barcodes and Microarrays

    PubMed Central

    Kochzius, Marc; Seidel, Christian; Antoniou, Aglaia; Botla, Sandeep Kumar; Campo, Daniel; Cariani, Alessia; Vazquez, Eva Garcia; Hauschild, Janet; Hervet, Caroline; Hjörleifsdottir, Sigridur; Hreggvidsson, Gudmundur; Kappel, Kristina; Landi, Monica; Magoulas, Antonios; Marteinsson, Viggo; Nölte, Manfred; Planes, Serge; Tinti, Fausto; Turan, Cemal; Venugopal, Moleyur N.; Weber, Hannes; Blohm, Dietmar

    2010-01-01

    Background International fish trade reached an import value of 62.8 billion Euro in 2006, of which 44.6% is covered by the European Union. Species identification is a key problem throughout the life cycle of fishes: from eggs and larvae to adults in fisheries research and control, as well as processed fish products in consumer protection. Methodology/Principal Findings This study aims to evaluate the applicability of the three mitochondrial genes 16S rRNA (16S), cytochrome b (cyt b), and cytochrome oxidase subunit I (COI) for the identification of 50 European marine fish species by combining techniques of “DNA barcoding” and microarrays. In a DNA barcoding approach, neighbour-joining (NJ) phylogenetic trees of 369 16S, 212 cyt b, and 447 COI sequences indicated that cyt b and COI are suitable for unambiguous identification, whereas 16S failed to discriminate closely related flatfish and gurnard species. In the course of probe design for DNA microarray development, each of the markers yielded a high number of potentially species-specific probes in silico, although many of them were rejected based on microarray hybridisation experiments. None of the markers provided probes to discriminate the sibling flatfish and gurnard species. However, since 16S probes were less negatively influenced by the “position of label” effect and showed the lowest rejection rate and the highest mean signal intensity, 16S is more suitable for DNA microarray probe design than cyt b and COI. The large proportion of COI probes rejected after hybridisation experiments (>90%) renders this DNA barcoding marker rather unsuitable for this high-throughput technology. Conclusions/Significance Based on these data, a DNA microarray containing 64 functional oligonucleotide probes for the identification of 30 out of the 50 fish species investigated was developed. It represents the next step towards an automated and easy-to-handle method to identify fish, ichthyoplankton, and fish products.

  11. Acquisition, preprocessing, and reconstruction of ultralow dose volumetric CT scout for organ-based CT scan planning

    SciTech Connect

    Yin, Zhye De Man, Bruno; Yao, Yangyang; Wu, Mingye; Montillo, Albert; Edic, Peter M.; Kalra, Mannudeep

    2015-05-15

    Purpose: Traditionally, 2D radiographic preparatory scan images (scout scans) are used to plan diagnostic CT scans. However, a 3D CT volume with a full 3D organ segmentation map could provide superior information for customized scan planning and other purposes. A practical challenge is to design the volumetric scout acquisition and processing steps to provide good image quality (at least good enough to enable 3D organ segmentation) while delivering a radiation dose similar to that of the conventional 2D scout. Methods: The authors explored various acquisition methods, scan parameters, postprocessing methods, and reconstruction methods through simulation and cadaver data studies to achieve an ultralow dose 3D scout while simultaneously reducing the noise and maintaining the edge strength around the target organ. Results: In a simulation study, the 3D scout with the proposed acquisition, preprocessing, and reconstruction strategy provided a similar level of organ segmentation capability as a traditional 240 mAs diagnostic scan, based on noise and normalized edge strength metrics. At the same time, the proposed approach delivers only 1.25% of the dose of a traditional scan. In a cadaver study, the authors’ pictorial-structures based organ localization algorithm successfully located the major abdominal-thoracic organs from the ultralow dose 3D scout obtained with the proposed strategy. Conclusions: The authors demonstrated that images with a similar degree of segmentation capability (interpretability) as conventional dose CT scans can be achieved with an ultralow dose 3D scout acquisition and suitable postprocessing. Furthermore, the authors applied these techniques to real cadaver CT scans with a CTDI dose level of less than 0.1 mGy and successfully generated a 3D organ localization map.

  12. Comparative evaluation of preprocessing freeware on chromatography/mass spectrometry data for signature discovery.

    PubMed

    Coble, Jamie B; Fraga, Carlos G

    2014-09-01

    Preprocessing software, which converts large instrumental data sets into a manageable format for data analysis, is crucial for the discovery of chemical signatures in metabolomics, chemical forensics, and other signature-focused disciplines. Here, four freely available and published preprocessing tools known as MetAlign, MZmine, SpectConnect, and XCMS were evaluated for impurity profiling using nominal mass GC/MS data and accurate mass LC/MS data. Both data sets were previously collected from the analysis of replicate samples from multiple stocks of a nerve-agent precursor and method blanks. Parameters were optimized for each of the four tools for the untargeted detection, matching, and cataloging of chromatographic peaks from impurities present in the stock samples. The peak table generated by each preprocessing tool was analyzed to determine the number of impurity components detected in all replicate samples per stock and absent in the method blanks. A cumulative set of impurity components was then generated using all available peak tables and used as a reference to calculate the percent of component detections for each tool, in which 100% indicated the detection of every known component present in a stock. For the nominal mass GC/MS data, MetAlign had the most component detections followed by MZmine, SpectConnect, and XCMS with detection percentages of 83, 60, 47, and 41%, respectively. For the accurate mass LC/MS data, the order was MetAlign, XCMS, and MZmine with detection percentages of 80, 45, and 35%, respectively. SpectConnect did not function for the accurate mass LC/MS data. Larger detection percentages were obtained by combining the top performer with at least one of the other tools such as 96% by combining MetAlign with MZmine for the GC/MS data and 93% by combining MetAlign with XCMS for the LC/MS data. In terms of quantitative performance, the reported peak intensities from each tool had averaged absolute biases (relative to peak intensities obtained

  13. Comparative Evaluation of Preprocessing Freeware on Chromatography/Mass Spectrometry Data for Signature Discovery

    SciTech Connect

    Coble, Jamie B.; Fraga, Carlos G.

    2014-07-07

    Preprocessing software is crucial for the discovery of chemical signatures in metabolomics, chemical forensics, and other signature-focused disciplines that involve analyzing large data sets from chemical instruments. Here, four freely available and published preprocessing tools known as metAlign, MZmine, SpectConnect, and XCMS were evaluated for impurity profiling using nominal mass GC/MS data and accurate mass LC/MS data. Both data sets were previously collected from the analysis of replicate samples from multiple stocks of a nerve-agent precursor. Each of the four tools had their parameters set for the untargeted detection of chromatographic peaks from impurities present in the stocks. The peak table generated by each preprocessing tool was analyzed to determine the number of impurity components detected in all replicate samples per stock. A cumulative set of impurity components was then generated using all available peak tables and used as a reference to calculate the percent of component detections for each tool, in which 100% indicated the detection of every component. For the nominal mass GC/MS data, metAlign performed the best followed by MZmine, SpectConnect, and XCMS with detection percentages of 83, 60, 47, and 42%, respectively. For the accurate mass LC/MS data, the order was metAlign, XCMS, and MZmine with detection percentages of 80, 45, and 35%, respectively. SpectConnect did not function for the accurate mass LC/MS data. Larger detection percentages were obtained by combining the top performer with at least one of the other tools such as 96% by combining metAlign with MZmine for the GC/MS data and 93% by combining metAlign with XCMS for the LC/MS data. In terms of quantitative performance, the reported peak intensities had average absolute biases of 41, 4.4, 1.3 and 1.3% for SpectConnect, metAlign, XCMS, and MZmine, respectively, for the GC/MS data. For the LC/MS data, the average absolute biases were 22, 4.5, and 3.1% for metAlign, MZmine, and XCMS

  14. Reservoir computing with a slowly modulated mask signal for preprocessing using a mutually coupled optoelectronic system

    NASA Astrophysics Data System (ADS)

    Tezuka, Miwa; Kanno, Kazutaka; Bunsen, Masatoshi

    2016-08-01

    Reservoir computing is a machine-learning paradigm based on information processing in the human brain. We numerically demonstrate reservoir computing with a slowly modulated mask signal for preprocessing, using a mutually coupled optoelectronic system. The performance of our system is quantitatively evaluated by a chaotic time-series prediction task. Our system produces performance comparable to reservoir computing with a single feedback system and a fast modulated mask signal. We show that it is possible to slow down the modulation speed of the mask signal by using the mutually coupled system in reservoir computing.

  15. Are hybrid models integrated with data preprocessing techniques suitable for monthly streamflow forecasting? Some experiment evidences

    NASA Astrophysics Data System (ADS)

    Zhang, Xiaoli; Peng, Yong; Zhang, Chi; Wang, Bende

    2015-11-01

    A number of hydrological studies have proven the superior prediction performance of hybrid models coupled with data preprocessing techniques. However, many studies first decompose the entire data series into components and only later divide each component into calibration and validation datasets to establish models, which leaks future information into the decomposition and reconstruction processes. As a consequence, the components used to forecast the value at a particular moment are computed using information from future values, which would not be available at that moment in a true forecasting exercise. Since most papers do not present their model framework in detail, it is difficult to identify whether they are performing a real forecast or not. Even though several other papers have explicitly stated which experiment they are performing, a comparison between results in the hindcast and forecast experiments is still missing. Therefore, it is necessary to investigate and compare the performance of these hybrid models in the two experiments in order to establish whether they are suitable for real forecasting. By combining three preprocessing techniques, namely wavelet analysis (WA), empirical mode decomposition (EMD) and singular spectrum analysis (SSA), with two modeling methods (the ANN and ARMA models), six hybrid models are developed in this study: WA-ANN, WA-ARMA, EMD-ANN, EMD-ARMA, SSA-ANN and SSA-ARMA. The preprocessing techniques are used to decompose the data series into sub-series, and these sub-series are then modeled using ANN and ARMA models. These models are examined in hindcasting and forecasting of the monthly streamflow of two sites in the Yangtze River of China. The results of this study indicate that the six hybrid models perform better in the hindcast experiment compared with the original ANN and ARMA models, while the hybrid models in the forecast experiment perform worse than the original models and the
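
    The leakage issue described above can be shown in miniature in Python: decomposing the full series before splitting lets future values influence the components used for "past" predictions, whereas a forecast-safe setup splits first and computes each component causally. A centered moving average stands in here for WA/EMD/SSA; the principle is the same.

      import numpy as np

      def centered_ma(x, w=5):      # uses future values -> hindcast only
          return np.convolve(x, np.ones(w) / w, mode="same")

      def causal_ma(x, w=5):        # uses only past values -> forecast-safe
          out = np.empty_like(x, dtype=float)
          for t in range(len(x)):
              out[t] = x[max(0, t - w + 1):t + 1].mean()
          return out

      rng = np.random.default_rng(5)
      x = np.sin(np.linspace(0, 20, 200)) + rng.normal(0, 0.2, 200)
      split = 150
      leaky = centered_ma(x)[:split]   # component "sees" x[split:] near the edge
      safe = causal_ma(x[:split])      # computed from calibration data only
      print("max |leaky - safe| near split:",
            np.abs(leaky[-5:] - safe[-5:]).max())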

  16. Fast randomized point location without preprocessing in two- and three-dimensional Delaunay triangulations

    SciTech Connect

    Muecke, E.P.; Saias, I.; Zhu, B.

    1996-05-01

    This paper studies the point location problem in Delaunay triangulations without preprocessing and additional storage. The proposed procedure locates the query point simply by walking through the triangulation, after selecting a good starting point by random sampling. The analysis generalizes and extends a recent result for d = 2 dimensions by proving this procedure to take expected time close to O(n^(1/(d+1))) for point location in Delaunay triangulations of n random points in d = 3 dimensions. Empirical results in both two and three dimensions show that this procedure is efficient in practice.
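
    A 2-D Python sketch of this "jump and walk" strategy follows: sample a few random sites, start from a triangle touching the nearest sampled site, then walk across edges toward the query using orientation tests. scipy's Delaunay is used only to build the triangulation; degenerate cases (query exactly on an edge) are not handled, and the result is cross-checked against scipy's own locator.

      import numpy as np
      from scipy.spatial import Delaunay

      def orient(a, b, c):
          return (b[0]-a[0])*(c[1]-a[1]) - (b[1]-a[1])*(c[0]-a[0])

      def jump_and_walk(tri, pts, q, n_samples=15, rng=None):
          rng = rng if rng is not None else np.random.default_rng()
          sample = rng.choice(len(pts), size=n_samples, replace=False)
          start = sample[np.argmin(np.linalg.norm(pts[sample] - q, axis=1))]
          simplex = tri.vertex_to_simplex[start]   # a triangle touching it
          while True:
              verts = tri.simplices[simplex]
              for k in range(3):
                  a, b = pts[verts[(k+1) % 3]], pts[verts[(k+2) % 3]]
                  # q and the opposite vertex on different sides of (a, b)?
                  if orient(a, b, q) * orient(a, b, pts[verts[k]]) < 0:
                      simplex = tri.neighbors[simplex][k]
                      break
              else:
                  return simplex                   # q inside current triangle
              if simplex == -1:
                  return -1                        # walked out of the hull

      rng = np.random.default_rng(6)
      pts = rng.random((500, 2))
      tri = Delaunay(pts)
      q = np.array([0.5, 0.5])
      print(jump_and_walk(tri, pts, q, rng=rng) == tri.find_simplex(q))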

  17. A fast meteor detection algorithm

    NASA Astrophysics Data System (ADS)

    Gural, P.

    2016-01-01

    A low-latency meteor detection algorithm for use with fast steering mirrors was previously developed to track and telescopically follow meteors in real time (Gural, 2007). It has been rewritten as a generic clustering and tracking software module for meteor detection that meets the demanding throughput requirements of a Raspberry Pi while also maintaining a high probability of detection. The software interface is generalized to work with various forms of front-end video pre-processing approaches and provides a rich product set of parameterized line detection metrics. Discussion will include the Maximum Temporal Pixel (MTP) compression technique as a fast thresholding option for feeding the detection module, the detection algorithm trade-offs for maximum processing throughput, details on the clustering and tracking methodology, processing products, performance metrics, and a general interface description.
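
    The MTP idea is simple enough to sketch in a few lines of Python: a block of N video frames is collapsed to the per-pixel maximum (plus the frame index of that maximum), so a moving meteor streak survives as a bright track in a single image that can be thresholded cheaply. The synthetic video and threshold below are illustrative only.

      import numpy as np

      def mtp_compress(frames):
          """frames: (N, H, W) -> (max_image, argmax_frame_index)."""
          return frames.max(axis=0), frames.argmax(axis=0)

      rng = np.random.default_rng(7)
      frames = rng.normal(10, 2, size=(64, 48, 64))  # noise-only video block
      for t in range(20, 40):                        # synthetic moving meteor
          frames[t, 24, t] += 40
      mtp, when = mtp_compress(frames)
      track = mtp > mtp.mean() + 5 * mtp.std()       # cheap threshold on one image
      print("track pixels:", int(track.sum()))       # the streak stands out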

  18. Hierarchical Gene Selection and Genetic Fuzzy System for Cancer Microarray Data Classification

    PubMed Central

    Nguyen, Thanh; Khosravi, Abbas; Creighton, Douglas; Nahavandi, Saeid

    2015-01-01

    This paper introduces a novel approach to gene selection based on a substantial modification of the analytic hierarchy process (AHP). The modified AHP systematically integrates the outcomes of individual filter methods to select the most informative genes for microarray classification. Five individual ranking methods, including t-test, entropy, receiver operating characteristic (ROC) curve, Wilcoxon and signal-to-noise ratio, are employed to rank genes. These ranked genes are then used as inputs for the modified AHP. Additionally, a method that uses the fuzzy standard additive model (FSAM) for cancer classification based on genes selected by AHP is also proposed in this paper. Traditional FSAM learning is a hybrid process comprising unsupervised structure learning and supervised parameter tuning. A genetic algorithm (GA) is incorporated between unsupervised and supervised training to optimize the number of fuzzy rules. The integration of the GA enables FSAM to deal with the high-dimensional, low-sample nature of microarray data and thus enhances the efficiency of the classification. Experiments are carried out on numerous microarray datasets. Results demonstrate the dominance of the AHP-based gene selection over the single ranking methods. Furthermore, the combination AHP-FSAM shows great accuracy in microarray data classification compared to various competing classifiers. The proposed approach is therefore useful for medical practitioners and clinicians as a decision support system that can be implemented in real medical practice. PMID:25823003
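
    The fusion of several filter rankings can be sketched in Python with plain mean-rank aggregation; note this is a simplified stand-in for the paper's pairwise AHP weighting, and only three of the five filters (t-test, Wilcoxon, signal-to-noise ratio) are included.

      import numpy as np
      from scipy import stats

      def filter_ranks(X1, X2):
          # Three filter scores per gene (columns of X1/X2).
          t = np.abs(stats.ttest_ind(X1, X2, axis=0).statistic)
          w = np.abs(np.array([stats.ranksums(X1[:, j], X2[:, j]).statistic
                               for j in range(X1.shape[1])]))
          snr = np.abs(X1.mean(0) - X2.mean(0)) / (X1.std(0) + X2.std(0) + 1e-12)
          # Rank 0 = best under each criterion; aggregate by mean rank.
          ranks = [np.argsort(np.argsort(-s)) for s in (t, w, snr)]
          return np.mean(ranks, axis=0)

      rng = np.random.default_rng(8)
      X1 = rng.normal(0, 1, size=(20, 200))
      X2 = rng.normal(0, 1, size=(20, 200))
      X2[:, :5] += 1.5                           # 5 truly informative genes
      agg = filter_ranks(X1, X2)
      print("top genes:", np.argsort(agg)[:5])   # should recover genes 0..4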

  19. Query Large Scale Microarray Compendium Datasets Using a Model-Based Bayesian Approach with Variable Selection

    PubMed Central

    Hu, Ming; Qin, Zhaohui S.

    2009-01-01

    In microarray gene expression data analysis, it is often of interest to identify genes that share similar expression profiles with a particular gene, such as a key regulatory protein. Multiple studies have been conducted using various correlation measures to identify co-expressed genes. While these approaches work well for small datasets, the heterogeneity introduced by increased sample size inevitably reduces their sensitivity and specificity. This is because most co-expression relationships do not extend to all experimental conditions. With the rapid increase in the size of microarray datasets, identifying functionally related genes from large and diverse microarray gene expression datasets is a key challenge. We develop a model-based gene expression query algorithm built under the Bayesian model selection framework. It is capable of detecting co-expression profiles under a subset of samples/experimental conditions. In addition, it allows linearly transformed expression patterns to be recognized and is robust against sporadic outliers in the data. Both features are critically important for increasing the power of identifying co-expressed genes in large-scale gene expression datasets. Our simulation studies suggest that this method outperforms existing correlation coefficient or mutual-information-based query tools. When we apply this new method to the Escherichia coli microarray compendium data, it identifies a majority of known regulons as well as novel potential target genes of numerous key transcription factors. PMID:19214232

  20. Robust Feature Selection from Microarray Data Based on Cooperative Game Theory and Qualitative Mutual Information

    PubMed Central

    Mortazavi, Atiyeh; Moattar, Mohammad Hossein

    2016-01-01

    High dimensionality of microarray data sets may lead to low efficiency and overfitting. In this paper, a multiphase cooperative game theoretic feature selection approach is proposed for microarray data classification. In the first phase, due to the high dimension of microarray data sets, the features are reduced using one of two filter-based feature selection methods, namely mutual information and Fisher ratio. In the second phase, the Shapley index is used to evaluate the power of each feature. The main innovation of the proposed approach is to employ Qualitative Mutual Information (QMI) for this purpose. Qualitative Mutual Information makes the selected features more stable, and this stability helps to deal with the problems of data imbalance and scarcity. In the third phase, a forward selection scheme is applied which uses a scoring function to weight each feature. The performance of the proposed method is compared with other popular feature selection algorithms such as Fisher ratio, minimum redundancy maximum relevance, and previous work on cooperative game based feature selection. The average classification accuracy on eleven microarray data sets shows that the proposed method improves both average accuracy and average stability compared to the other approaches. PMID:27127506

  1. Measuring information flow in cellular networks by the systems biology method through microarray data

    PubMed Central

    Chen, Bor-Sen; Li, Cheng-Wei

    2015-01-01

    In general, it is very difficult to measure the information flow in a cellular network directly. In this study, based on an information flow model and microarray data, we measured the information flow in cellular networks indirectly by using a systems biology method. First, we used a recursive least square parameter estimation algorithm to identify the system parameters of coupling signal transduction pathways and the cellular gene regulatory network (GRN). Then, based on the identified parameters and systems theory, we estimated the signal transductivities of the coupling signal transduction pathways from the extracellular signals to each downstream protein and the information transductivities of the GRN between transcription factors in response to environmental events. According to the proposed method, the information flow, which is characterized by signal transductivity in coupling signaling pathways and information transductivity in the GRN, can be estimated by microarray temporal data or microarray sample data. It can also be estimated by other high-throughput data such as next-generation sequencing or proteomic data. Finally, the information flows of the signal transduction pathways and the GRN in leukemia cancer cells and non-leukemia normal cells were also measured to analyze the systematic dysfunction in this cancer from microarray sample data. The results show that the signal transductivities of signal transduction pathways change substantially from normal cells to leukemia cancer cells. PMID:26082788
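
    The parameter-identification step can be illustrated with a minimal recursive least squares (RLS) routine in Python: a linear model y_t = phi_t . theta is updated one time point at a time, as one would when fitting pathway/GRN coupling parameters to time-series expression data. The forgetting factor and regressor layout are illustrative assumptions, not the paper's exact formulation.

      import numpy as np

      def rls(Phi, y, lam=0.99, delta=100.0):
          n = Phi.shape[1]
          theta = np.zeros(n)
          P = delta * np.eye(n)                      # initial covariance
          for phi, yt in zip(Phi, y):
              k = P @ phi / (lam + phi @ P @ phi)    # gain vector
              theta = theta + k * (yt - phi @ theta) # prediction-error update
              P = (P - np.outer(k, phi @ P)) / lam
          return theta

      rng = np.random.default_rng(9)
      true_theta = np.array([0.8, -0.5, 0.3])
      Phi = rng.normal(size=(200, 3))      # regulator activities over time
      y = Phi @ true_theta + rng.normal(0, 0.05, 200)
      print(np.round(rls(Phi, y), 2))      # approximately [ 0.8 -0.5  0.3]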

  2. A Bayesian Approach to Pathway Analysis by Integrating Gene–Gene Functional Directions and Microarray Data

    PubMed Central

    Zhao, Yifang; Chen, Ming-Hui; Pei, Baikang; Rowe, David; Shin, Dong-Guk; Xie, Wangang; Yu, Fang; Kuo, Lynn

    2012-01-01

    Many statistical methods have been developed to screen for differentially expressed genes associated with specific phenotypes in microarray data. However, it remains a major challenge to synthesize the observed expression patterns with abundant biological knowledge for a more complete understanding of the biological functions among genes. Various methods, including clustering analysis of genes, neural networks, Bayesian networks and pathway analysis, have been developed toward this goal. In most of these procedures, the activation and inhibition relationships among genes have hardly been utilized in the modeling steps. We propose two novel Bayesian models to integrate microarray data with the putative pathway structures obtained from the KEGG database and the directional gene–gene interactions in the medical literature. We define the symmetric Kullback–Leibler divergence of a pathway, and use it to identify the pathway(s) most supported by the microarray data. A Markov chain Monte Carlo sampling algorithm is given for posterior computation in the hierarchical model. The proposed method is shown to select the most supported pathway in an illustrative example. Finally, we apply the methodology to a real microarray data set to understand the gene expression profile of osteoblast lineage at defined stages of differentiation. We observe that our method correctly identifies the pathways that are reported to play essential roles in modulating bone mass. PMID:23482678
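
    The symmetric Kullback-Leibler divergence used for scoring, D(p, q) = KL(p||q) + KL(q||p), is easy to compute; a small Python sketch follows. The distributions here are illustrative stand-ins for the model-based quantities in the paper.

      import numpy as np

      def sym_kl(p, q, eps=1e-12):
          # Symmetrized KL divergence between two discrete distributions.
          p = np.asarray(p, float) + eps
          q = np.asarray(q, float) + eps
          p, q = p / p.sum(), q / q.sum()
          return float(np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))

      p = [0.5, 0.3, 0.2]
      print(round(sym_kl(p, [0.4, 0.4, 0.2]), 4))    # small: close distributions
      print(round(sym_kl(p, [0.05, 0.05, 0.9]), 4))  # much larger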

  3. Multivariate curve resolution for hyperspectral image analysis: applications to microarray technology.

    SciTech Connect

    Van Benthem, Mark Hilary; Sinclair, Michael B.; Haaland, David Michael; Martinez, M. Juanita (University of New Mexico, Albuquerque, NM); Timlin, Jerilyn Ann; Werner-Washburne, Margaret C. (University of New Mexico, Albuquerque, NM); Aragon, Anthony D. (University of New Mexico, Albuquerque, NM)

    2003-01-01

    Multivariate curve resolution (MCR) using constrained alternating least squares algorithms represents a powerful capability for the quantitative analysis of hyperspectral image data. We demonstrate the application of MCR using data from a new hyperspectral fluorescence imaging microarray scanner for monitoring gene expression in cells from thousands of genes on the array. The new scanner collects the entire fluorescence spectrum from each pixel of the scanned microarray. Application of MCR with nonnegativity and equality constraints reveals several sources of undesired fluorescence that emit in the same wavelength range as the reporter fluorophores. MCR analysis of the hyperspectral images confirms that one of the sources of fluorescence is contaminant fluorescence under the printed DNA spots that is spot localized. Thus, traditional background subtraction methods used with data collected from current commercial microarray scanners will lead to errors in determining the relative expression of low-expressed genes. With the new scanner and MCR analysis, we generate relative concentration maps of the background, impurity, and fluorescent labels over the entire image. Since the concentration maps of the fluorescent labels are relatively unaffected by the presence of background and impurity emissions, the accuracy and useful dynamic range of the gene expression data are both greatly improved over those obtained by commercial microarray scanners.
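
    A bare-bones MCR-ALS with nonnegativity can be sketched in Python: a hyperspectral data matrix D (pixels x wavelengths) is factored as D ~ C @ S by alternating least-squares solves for concentrations C and spectra S, with negative values clipped to zero. The equality constraints and scanner specifics from the paper are omitted; this shows only the generic alternating scheme.

      import numpy as np

      def mcr_als(D, n_components, n_iter=200, seed=0):
          rng = np.random.default_rng(seed)
          S = np.abs(rng.normal(size=(n_components, D.shape[1])))
          for _ in range(n_iter):
              # Solve for C given S, then S given C, clipping negatives.
              C = np.clip(np.linalg.lstsq(S.T, D.T, rcond=None)[0].T, 0, None)
              S = np.clip(np.linalg.lstsq(C, D, rcond=None)[0], 0, None)
          return C, S

      rng = np.random.default_rng(10)
      true_S = np.abs(rng.normal(size=(2, 50)))    # label + background spectra
      true_C = np.abs(rng.normal(size=(300, 2)))   # per-pixel concentrations
      D = true_C @ true_S + rng.normal(0, 0.01, (300, 50))
      C, S = mcr_als(D, 2)
      print("relative reconstruction error:",
            round(np.linalg.norm(D - C @ S) / np.linalg.norm(D), 4))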

  4. Using attribute behavior diversity to build accurate decision tree committees for microarray data.

    PubMed

    Han, Qian; Dong, Guozhu

    2012-08-01

    DNA microarrays (gene chips), frequently used in biological and medical studies, measure the expressions of thousands of genes per sample. Using microarray data to build accurate classifiers for diseases is an important task. This paper introduces an algorithm, called Committee of Decision Trees by Attribute Behavior Diversity (CABD), to build highly accurate ensembles of decision trees for such data. Since a committee's accuracy is greatly influenced by the diversity among its member classifiers, CABD uses two new ideas to "optimize" that diversity, namely (1) the concept of attribute behavior-based similarity between attributes, and (2) the concept of attribute usage diversity among trees. The ideas are effective for microarray data, since such data have many features and behavior similarity between genes can be high. Experiments on microarray data for six cancers show that CABD outperforms previous ensemble methods significantly and outperforms SVM, and show that the diversified features used by CABD's decision tree committee can be used to improve performance of other classifiers such as SVM. CABD has potential for other high-dimensional data, and its ideas may apply to ensembles of other classifier types. PMID:22809418

  5. Hierarchical gene selection and genetic fuzzy system for cancer microarray data classification.

    PubMed

    Nguyen, Thanh; Khosravi, Abbas; Creighton, Douglas; Nahavandi, Saeid

    2015-01-01

    This paper introduces a novel approach to gene selection based on a substantial modification of the analytic hierarchy process (AHP). The modified AHP systematically integrates the outcomes of individual filter methods to select the most informative genes for microarray classification. Five individual ranking methods, including t-test, entropy, receiver operating characteristic (ROC) curve, Wilcoxon and signal-to-noise ratio, are employed to rank genes. These ranked genes are then used as inputs for the modified AHP. Additionally, a method that uses the fuzzy standard additive model (FSAM) for cancer classification based on genes selected by AHP is also proposed in this paper. Traditional FSAM learning is a hybrid process comprising unsupervised structure learning and supervised parameter tuning. A genetic algorithm (GA) is incorporated between unsupervised and supervised training to optimize the number of fuzzy rules. The integration of the GA enables FSAM to deal with the high-dimensional, low-sample-size nature of microarray data and thus enhances the efficiency of the classification. Experiments are carried out on numerous microarray datasets. Results demonstrate the dominance of the AHP-based gene selection over the single ranking methods. Furthermore, the combination of AHP and FSAM shows great accuracy in microarray data classification compared to various competing classifiers. The proposed approach is therefore useful for medical practitioners and clinicians as a decision support system that can be implemented in real medical practice. PMID:25823003
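
    The modified AHP weighting itself is not described in the abstract; a simplified stand-in that aggregates filter rankings by average rank (two of the five filters shown) conveys the integration step:

        import numpy as np
        from scipy import stats

        def aggregate_ranks(X, y):
            # Average-rank aggregation of two filter scores (t-statistic and
            # signal-to-noise ratio); a simplified stand-in for the modified AHP.
            a, b = X[y == 0], X[y == 1]
            t = np.abs(stats.ttest_ind(a, b, axis=0, equal_var=False).statistic)
            snr = np.abs(a.mean(0) - b.mean(0)) / (a.std(0) + b.std(0) + 1e-12)
            ranks = np.vstack([stats.rankdata(-t), stats.rankdata(-snr)])
            return ranks.mean(axis=0)        # smaller aggregate rank = better gene

        rng = np.random.default_rng(0)
        X = rng.normal(size=(40, 200))
        y = np.repeat([0, 1], 20)
        X[y == 1, :5] += 1.5                 # five informative genes, by construction
        print(np.argsort(aggregate_ranks(X, y))[:5])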

  6. Segmentation of complementary DNA microarray images by wavelet-based Markov random field model.

    PubMed

    Athanasiadis, Emmanouil I; Cavouras, Dionisis A; Glotsos, Dimitris Th; Georgiadis, Pantelis V; Kalatzis, Ioannis K; Nikiforidis, George C

    2009-11-01

    A wavelet-based modification of the Markov random field (WMRF) model is proposed for segmenting complementary DNA (cDNA) microarray images. For evaluation purposes, five simulated and a set of five real microarray images were used. The one-level stationary wavelet transform (SWT) of each microarray image was used to form two images: a denoised image, using a hard thresholding filter, and a magnitude image, from the amplitudes of the horizontal and vertical components of the SWT. Elements from these two images were suitably combined to form the WMRF model for segmenting spots from their background. The WMRF was compared against the conventional MRF and the fuzzy C-means (FCM) algorithms on simulated and real microarray images, and their performances were evaluated by means of the segmentation matching factor (SMF) and the coefficient of determination (r²). Additionally, the WMRF was compared against SPOT and SCANALYZE, with performances evaluated by the mean absolute error (MAE) and the coefficient of variation (CV). The WMRF performed more accurately than the MRF and FCM (SMF: 92.66, 92.15, and 89.22; r²: 0.92, 0.90, and 0.84, respectively) and achieved higher reproducibility than the MRF, SPOT, and SCANALYZE (MAE: 497, 1215, 1180, and 503; CV: 0.88, 1.15, 0.93, and 0.90, respectively). PMID:19783509

  7. AKITA: Application Knowledge Interface to Algorithms

    NASA Astrophysics Data System (ADS)

    Barros, Paul; Mathis, Allison; Newman, Kevin; Wilder, Steven

    2013-05-01

    We propose a methodology for using sensor metadata and targeted preprocessing to select which algorithms from a large suite are most appropriate for a given data set. Rather than applying several general-purpose algorithms or requiring a human operator to oversee the analysis of the data, our method allows the most effective algorithm to be chosen automatically, conserving computational, network and human resources. For example, the amount of video data being produced daily is far greater than can ever be analyzed. Computer vision algorithms can help sift for the relevant data, but not every algorithm is suited to every data type, nor is it efficient to run them all. A full-body detector won't work well when the camera is zoomed in or when it is raining and all the people are occluded by foul weather gear. However, leveraging metadata knowledge of the camera settings and the conditions under which the data was collected (generated by automatic preprocessing), face or umbrella detectors could be applied instead, increasing the likelihood of a correct reading. The Lockheed Martin AKITA™ system is a modular knowledge layer which uses knowledge of the system and environment to determine how to most efficiently and usefully process whatever data it is given.
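
    As a toy illustration of metadata-driven algorithm selection (the dispatch rules, keys and detector names below are invented for the example, not AKITA's):

        # A toy dispatcher: pick a detector from hypothetical camera metadata.
        def choose_detector(metadata):
            # metadata: dict with hypothetical keys 'zoom_level' and 'weather'.
            if metadata.get("weather") == "rain":
                return "umbrella_detector"    # people likely occluded by rain gear
            if metadata.get("zoom_level", 1.0) > 2.0:
                return "face_detector"        # zoomed in: full bodies out of frame
            return "full_body_detector"       # default wide-view algorithm

        print(choose_detector({"zoom_level": 3.0, "weather": "clear"}))  # face_detector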

  8. Influence of microarrays experiments missing values on the stability of gene groups by hierarchical clustering

    PubMed Central

    de Brevern, Alexandre G; Hazout, Serge; Malpertuy, Alain

    2004-01-01

    Background Microarray technologies produce large amounts of data. Hierarchical clustering is commonly used to identify clusters of co-expressed genes. However, microarray datasets often contain missing values (MVs), a major drawback for the use of clustering methods. Usually the MVs are left untreated, replaced by zero, or estimated by the k-Nearest Neighbor (kNN) approach. This paper studies the stability of gene clusters, defined by various hierarchical clustering algorithms, in microarray experiments with and without MVs. Results In this study, we show that MVs have important effects on the stability of gene clusters. Moreover, the magnitude of the gene misallocations depends on the aggregation algorithm. The most appropriate aggregation methods (e.g. complete-linkage and Ward) are highly sensitive to MVs, surprisingly even for a very small proportion of MVs (e.g. 1%). In most cases, the MVs must be replaced by expected values. MV replacement by the kNN approach clearly improves the identification of co-expressed gene clusters. Nevertheless, we observe that the kNN approach is less suitable for extreme values of gene expression. Conclusion The presence of MVs (even at a low rate) is a major factor of gene cluster instability. In addition, the impact depends on the hierarchical clustering algorithm used. Some methods should be used carefully. Nevertheless, the kNN approach constitutes an efficient method for restoring missing expression values, with a low error level. Our study highlights the need for statistical treatment of microarray data to avoid misinterpretation. PMID:15324460
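
    A minimal kNN imputation sketch, using scikit-learn's KNNImputer on a synthetic genes-by-arrays matrix with roughly the 1% missing-value rate the study highlights; each MV is replaced from the k most similar genes:

        import numpy as np
        from sklearn.impute import KNNImputer

        rng = np.random.default_rng(0)
        X = rng.normal(size=(100, 8))          # genes x arrays, log-expression
        mask = rng.random(X.shape) < 0.01      # ~1% missing values
        X[mask] = np.nan

        # Impute each MV from the 5 nearest genes (NaN-aware Euclidean distance).
        X_imputed = KNNImputer(n_neighbors=5).fit_transform(X)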

  9. Viral diagnosis in Indian livestock using customized microarray chips

    PubMed Central

    Yadav, Brijesh S; Pokhriyal, Mayank; Ratta, Barkha; Kumar, Ajay; Saxena, Meeta; Sharma, Bhaskar

    2015-01-01

    Viral diagnosis in Indian livestock using customized microarray chips has gained momentum in recent years, and it is now possible to design customized microarray chips for viruses infecting livestock in India. Customized microarray chips identified Bovine herpes virus-1 (BHV-1), Canine Adeno Virus-1 (CAV-1), and Canine Parvo Virus-2 (CPV-2) in clinical samples. Probes identified by the microarray were further confirmed using RT-PCR in all clinical and known samples. Therefore, the application of microarray chips during viral disease outbreaks in Indian livestock is possible where conventional methods are unsuitable. It should be noted that customized application requires a detailed cost-efficiency calculation. PMID:26912948

  10. Advancing translational research with next-generation protein microarrays.

    PubMed

    Yu, Xiaobo; Petritis, Brianne; LaBaer, Joshua

    2016-04-01

    Protein microarrays are a high-throughput technology used increasingly in translational research, seeking to apply basic science findings to enhance human health. In addition to assessing protein levels, posttranslational modifications, and signaling pathways in patient samples, protein microarrays have aided in the identification of potential protein biomarkers of disease and infection. In this perspective, the different types of full-length protein microarrays that are used in translational research are reviewed. Specific studies employing these microarrays are presented to highlight their potential in finding solutions to real clinical problems. Finally, the criteria that should be considered when developing next-generation protein microarrays are provided. PMID:26749402

  11. Gene Expression Signature in Endemic Osteoarthritis by Microarray Analysis

    PubMed Central

    Wang, Xi; Ning, Yujie; Zhang, Feng; Yu, Fangfang; Tan, Wuhong; Lei, Yanxia; Wu, Cuiyan; Zheng, Jingjing; Wang, Sen; Yu, Hanjie; Li, Zheng; Lammi, Mikko J.; Guo, Xiong

    2015-01-01

    Kashin-Beck Disease (KBD) is an endemic osteochondropathy with an unknown pathogenesis. Diagnosis of KBD is effective only in advanced cases, which eliminates the possibility of early treatment and leads to an inevitable exacerbation of symptoms. Therefore, we aim to identify an accurate blood-based gene signature for the detection of KBD. Previously published gene expression profile data on cartilage and peripheral blood mononuclear cells (PBMCs) from adults with KBD were compared to select potential target genes. Microarray analysis was conducted to evaluate the expression of the target genes in a cohort of 100 KBD patients and 100 healthy controls. A gene expression signature was identified using a training set, which was subsequently validated using an independent test set with a minimum redundancy maximum relevance (mRMR) algorithm and support vector machine (SVM) algorithm. Fifty unique genes were differentially expressed between KBD patients and healthy controls. A 20-gene signature was identified that distinguished between KBD patients and controls with 90% accuracy, 85% sensitivity, and 95% specificity. This study identified a 20-gene signature that accurately distinguishes between patients with KBD and controls using peripheral blood samples. These results promote the further development of blood-based genetic biomarkers for detection of KBD. PMID:25997002

  12. Data Acquisition and Preprocessing in Studies on Humans: What Is Not Taught in Statistics Classes?

    PubMed Central

    Zhu, Yeyi; Hernandez, Ladia M.; Mueller, Peter; Dong, Yongquan; Forman, Michele R.

    2013-01-01

    The aim of this paper is to address issues in research that may be missing from statistics classes and important for (bio-)statistics students. In the context of a case study, we discuss data acquisition and preprocessing steps that fill the gap between research questions posed by subject matter scientists and statistical methodology for formal inference. Issues include participant recruitment, data collection training and standardization, variable coding, data review and verification, data cleaning and editing, and documentation. Despite the critical importance of these details in research, most of these issues are rarely discussed in an applied statistics program. One reason for the lack of more formal training is the difficulty in addressing the many challenges that can possibly arise in the course of a study in a systematic way. This article can help to bridge this gap between research questions and formal statistical inference by using an illustrative case study for a discussion. We hope that reading and discussing this paper and practicing data preprocessing exercises will sensitize statistics students to these important issues and achieve optimal conduct, quality control, analysis, and interpretation of a study. PMID:24511148

  13. Applying Enhancement Filters in the Pre-processing of Images of Lymphoma

    NASA Astrophysics Data System (ADS)

    Henrique Silva, Sérgio; Zanchetta do Nascimento, Marcelo; Alves Neves, Leandro; Ramos Batista, Valério

    2015-01-01

    Lymphoma is a type of cancer that affects the immune system, and is classified as Hodgkin or non-Hodgkin. It is among the ten most common cancers worldwide, accounting for three to four percent of all malignant neoplasms diagnosed. Our work presents a study of filters for enhancing images of lymphoma at the pre-processing step. Here the enhancement is useful for removing noise from the digital images. We analysed noise caused by different sources, such as room vibration, scraps and defocusing, in the following classes of lymphoma: follicular, mantle cell and B-cell chronic lymphocytic leukemia. Gaussian, median and mean-shift filters were applied to different colour models (RGB, Lab and HSV). Afterwards, we performed a quantitative analysis of the images by means of the Structural Similarity Index, in order to evaluate the similarity between the images. In all cases we obtained a similarity of at least 75%, which rises to 99% if one considers only HSV. We conclude that HSV is an important choice of colour model for pre-processing histological images of lymphoma, because in this case the resulting image gets the best enhancement.
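
    A sketch of the paper's evaluation protocol, i.e. filtering in a colour model and scoring with SSIM, assuming a recent scikit-image and a random stand-in image rather than a histological slide:

        import numpy as np
        from skimage.color import rgb2hsv
        from skimage.filters import gaussian
        from skimage.metrics import structural_similarity

        rng = np.random.default_rng(0)
        original = rng.random((128, 128, 3))                       # stand-in RGB image
        noisy = np.clip(original + 0.05 * rng.normal(size=original.shape), 0, 1)

        # Gaussian filtering in HSV, then SSIM against the original in HSV.
        filtered_hsv = gaussian(rgb2hsv(noisy), sigma=1, channel_axis=-1)
        ssim = structural_similarity(rgb2hsv(original), filtered_hsv,
                                     channel_axis=-1, data_range=1.0)
        print(f"SSIM in HSV: {ssim:.3f}")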

  14. Preprocessing of A-scan GPR data based on energy features

    NASA Astrophysics Data System (ADS)

    Dogan, Mesut; Turhan-Sayan, Gonul

    2016-05-01

    There is an increasing demand for noninvasive real-time detection and classification of buried objects in various civil and military applications. The problem of detection and annihilation of landmines is particularly important due to strong safety concerns. The requirement for a fast real-time decision process is as important as the requirements for high detection rates and low false alarm rates. In this paper, we introduce and demonstrate a computationally simple, time-efficient, energy-based preprocessing approach that can be used in ground penetrating radar (GPR) applications to eliminate reflections from the air-ground boundary and to locate buried objects simultaneously, in one simple step. The instantaneous power signals, the total energy values and the cumulative energy curves are extracted from the A-scan GPR data. The cumulative energy curves, in particular, are shown to be useful for detecting the presence and location of buried objects in a fast and simple way while preserving the spectral content of the original A-scan data for subsequent physics-based target classification. The proposed method is demonstrated using GPR data collected at the outdoor test lanes of IPA Defense, Ankara. Cylindrically shaped plastic containers were buried in fine-medium sand to simulate buried landmines. These plastic containers were half-filled with ammonium nitrate containing metal pins. The results of this pilot study are highly promising and motivate further research on energy-based preprocessing features for the landmine detection problem.
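
    The energy features are simple to reproduce; a sketch on a synthetic A-scan trace (the trace and echo parameters below are illustrative, not the IPA Defense data):

        import numpy as np

        def cumulative_energy(a_scan):
            # Instantaneous power and normalized cumulative energy of one A-scan.
            power = np.asarray(a_scan, dtype=float) ** 2
            cum = np.cumsum(power)
            return power, cum / cum[-1]

        # Synthetic A-scan: strong ground bounce early, weaker buried-object echo later.
        t = np.linspace(0, 1, 500)
        trace = np.exp(-((t - 0.1) / 0.01)**2) + 0.4 * np.exp(-((t - 0.6) / 0.02)**2)
        power, cum = cumulative_energy(trace)
        # A late jump in `cum` after the air-ground reflection suggests a buried target.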

  15. Fast generation of digitally reconstructed radiograph through an efficient preprocessing of ray attenuation values

    NASA Astrophysics Data System (ADS)

    Ghafurian, Soheil; Metaxas, Dimitris N.; Tan, Virak; Li, Kang

    2016-03-01

    Digitally reconstructed radiographs (DRRs) are simulations of radiographic images produced through a perspective projection of a three-dimensional (3D) image (volume) onto a two-dimensional (2D) image plane. The traditional method for the generation of DRRs, namely ray-casting, is computationally intensive and accounts for most of the solution time in 3D/2D medical image registration frameworks, where a large number of DRRs is required. A few alternative methods for faster DRR generation have been proposed, the most successful of which are based on the idea of pre-calculating the attenuation values of possible rays. Despite achieving good quality, these methods support a limited range of motion for the volume and entail long pre-calculation times. In this paper, we propose a new preprocessing procedure and data structure for the calculation of ray attenuation values. This method supports all possible volume positions with practically small memory requirements, in addition to reducing the complexity of the problem from O(n³) to O(n²). In our experiments, we generated DRRs of high quality in 63 milliseconds with a preprocessing time of 99.48 seconds and a memory size of 7.45 megabytes.

  16. Robust symmetrical number system preprocessing for minimizing encoding errors in photonic analog-to-digital converters

    NASA Astrophysics Data System (ADS)

    Arvizo, Mylene R.; Calusdian, James; Hollinger, Kenneth B.; Pace, Phillip E.

    2011-08-01

    A photonic analog-to-digital converter (ADC) preprocessing architecture based on the robust symmetrical number system (RSNS) is presented. The RSNS preprocessing architecture is a modular scheme in which a modulus number of comparators are used at the output of each Mach-Zehnder modulator channel. The number of comparators with a logic 1 in each channel represents the integer values within each RSNS modulus sequence. When considered together, the integers within each sequence change one at a time at the next code position, resulting in an integer Gray code property. The RSNS ADC has the feature that the maximum nonlinearity is less than a least significant bit (LSB). Although the observed dynamic range (greatest length of combined sequences that contain no ambiguities) of the RSNS ADC is less than that of the optimum symmetrical number system ADC, the integer Gray code properties make it attractive for error control. A prototype is presented to demonstrate the feasibility of the concept and to show the important RSNS property that the largest nonlinearity is always less than an LSB. Also discussed are practical considerations related to multi-gigahertz implementations.

  17. A Technical Review on Biomass Processing: Densification, Preprocessing, Modeling and Optimization

    SciTech Connect

    Jaya Shankar Tumuluru; Christopher T. Wright

    2010-06-01

    It is now a well-acclaimed fact that burning fossil fuels and deforestation are major contributors to climate change. Biomass from plants can serve as an alternative renewable and carbon-neutral raw material for the production of bioenergy. Low densities of 40–60 kg/m³ for lignocellulosic and 200–400 kg/m³ for woody biomass limit their application for energy purposes. Prior to use in energy applications these materials need to be densified. Densified biomass can have bulk densities over 10 times that of the raw material, helping to significantly reduce the technical limitations associated with storage, loading and transportation. Pelleting, briquetting, and extrusion processing are commonly used methods for densification. The aim of the present research is to develop a comprehensive review of biomass processing that includes densification, preprocessing, modeling and optimization. The specific objectives include carrying out a technical review of (a) mechanisms of particle bonding during densification; (b) methods of densification including extrusion, briquetting, pelleting, and agglomeration; (c) effects of process and feedstock variables and biomass biochemical composition on densification; (d) effects of preprocessing such as grinding, preheating, steam explosion, and torrefaction on biomass quality and binding characteristics; (e) models for understanding the compression characteristics; and (f) procedures for response surface modeling and optimization.

  18. A wavelet-based data pre-processing analysis approach in mass spectrometry.

    PubMed

    Li, Xiaoli; Li, Jin; Yao, Xin

    2007-04-01

    Recently, mass spectrometry analysis has become an effective and rapid approach for detecting early-stage cancer. To identify proteomic patterns in serum that discriminate cancer patients from normal individuals, machine learning methods, such as feature selection and classification, have already been applied to the analysis of mass spectrometry (MS) data with some success. However, the performance of existing machine learning methods for MS data analysis still needs improving. This paper proposes a wavelet-based pre-processing approach to MS data analysis. The approach applies wavelet-based transforms to MS data with the aim of de-noising data that are potentially contaminated during acquisition. The effects of the choice of wavelet function and decomposition level on the de-noising performance have also been investigated in this study. Our comparative experimental results demonstrate that the proposed de-noising pre-processing approach has the potential to remove noise embedded in MS data, which can lead to improved performance for existing machine learning methods in cancer detection. PMID:16982045
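
    As a sketch of the generic wavelet de-noising recipe (decompose, threshold the detail coefficients, reconstruct), assuming PyWavelets and a universal soft threshold; the paper's specific wavelet and level choices were themselves the subject of its investigation:

        import numpy as np
        import pywt

        def wavelet_denoise(signal, wavelet="db4", level=4):
            # Soft-threshold detail coefficients; noise sigma from finest level's MAD.
            coeffs = pywt.wavedec(signal, wavelet, level=level)
            sigma = np.median(np.abs(coeffs[-1])) / 0.6745
            thresh = sigma * np.sqrt(2 * np.log(len(signal)))   # universal threshold
            coeffs[1:] = [pywt.threshold(c, thresh, mode="soft") for c in coeffs[1:]]
            return pywt.waverec(coeffs, wavelet)[: len(signal)]

        # Hypothetical noisy spectrum: two peaks plus Gaussian noise.
        x = np.linspace(0, 1, 1024)
        spectrum = np.exp(-((x - 0.3) / 0.01)**2) + 0.5 * np.exp(-((x - 0.7) / 0.02)**2)
        noisy = spectrum + 0.05 * np.random.default_rng(0).normal(size=x.size)
        clean = wavelet_denoise(noisy)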

  19. Parafoveal preprocessing in reading revisited: evidence from a novel preview manipulation.

    PubMed

    Gagl, Benjamin; Hawelka, Stefan; Richlan, Fabio; Schuster, Sarah; Hutzler, Florian

    2014-03-01

    The study investigated parafoveal preprocessing by means of the classical invisible boundary paradigm and a novel manipulation of the parafoveal previews (i.e., visual degradation). Eye movements were investigated on 5-letter target words with constraining (i.e., highly informative) initial letters or similarly constraining final letters. Visual degradation was administered to all of the letters, none of the letters, the initial 2 letters, or the final 2 letters of the parafoveal preview of the target words. Critically, the manipulation of the parafoveal previews did not interfere with foveal processing. Thus, we had a proper baseline to which we could relate our main findings, which were as follows: First, the valid (i.e., nondegraded) preview of the target words' final letters led to shorter fixation times compared to the baseline condition (i.e., the degradation of all letters). Second, this preview benefit for the final letters was comparable to the benefit of previewing the initial letters. Third, the preview of a constraining initial letter sequence, however, yielded a larger preview benefit than the preview of an unconstraining initial letter sequence. The latter finding indicates that preprocessing constraining initial letters is particularly conducive to foveal word recognition. PMID:24041397

  20. Contour Error Map Algorithm

    NASA Technical Reports Server (NTRS)

    Merceret, Francis; Lane, John; Immer, Christopher; Case, Jonathan; Manobianco, John

    2005-01-01

    The contour error map (CEM) algorithm and the software that implements the algorithm are means of quantifying correlations between sets of time-varying data that are binarized and registered on spatial grids. The present version of the software is intended for use in evaluating numerical weather forecasts against observational sea-breeze data. In cases in which observational data come from off-grid stations, it is necessary to preprocess the observational data to transform them into gridded data. First, the wind direction is gridded and binarized so that D(i,j;n) is the input to CEM based on forecast data and d(i,j;n) is the input to CEM based on gridded observational data. Here, i and j are spatial indices representing 1.25-km intervals along the west-to-east and south-to-north directions, respectively; and n is a time index representing 5-minute intervals. A binary value of D or d = 0 corresponds to an offshore wind, whereas a value of D or d = 1 corresponds to an onshore wind. CEM includes two notable subalgorithms: One identifies and verifies sea-breeze boundaries; the other, which can be invoked optionally, performs an image-erosion function for the purpose of attempting to eliminate river-breeze contributions in the wind fields.
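
    A sketch of the binarization step that produces the CEM inputs D(i,j;n) and d(i,j;n); the onshore sector used below is illustrative, since the real onshore/offshore split depends on the coastline orientation:

        import numpy as np

        def binarize_wind(direction_deg, onshore_min=0.0, onshore_max=180.0):
            # Map gridded wind directions (degrees) to the CEM binary field:
            # 1 = onshore, 0 = offshore. The sector bounds here are illustrative.
            d = np.mod(direction_deg, 360.0)
            return ((d >= onshore_min) & (d < onshore_max)).astype(np.uint8)

        # D[i, j] for one 5-minute time step on a coarse grid.
        directions = np.array([[45.0, 200.0], [135.0, 350.0]])
        print(binarize_wind(directions))    # [[1 0] [1 0]]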

  1. PMD: A Resource for Archiving and Analyzing Protein Microarray data

    PubMed Central

    Xu, Zhaowei; Huang, Likun; Zhang, Hainan; Li, Yang; Guo, Shujuan; Wang, Nan; Wang, Shi-hua; Chen, Ziqing; Wang, Jingfang; Tao, Sheng-ce

    2016-01-01

    Protein microarray is a powerful technology for both basic research and clinical study. However, because there is no database specifically tailored for protein microarrays, the majority of the valuable original protein microarray data is still not publicly accessible. To address this issue, we constructed the Protein Microarray Database (PMD), which is specifically designed for archiving and analyzing protein microarray data. In PMD, users can easily browse and search the entire database by experiment name, protein microarray type, and sample information. Additionally, PMD integrates several data analysis tools and provides an automated data analysis pipeline for users. With just one click, users can obtain a comprehensive analysis report for their protein microarray data. The report includes preliminary data analysis, such as data normalization and candidate identification, and an in-depth bioinformatics analysis of the candidates, which includes functional annotation, pathway analysis, and protein-protein interaction network analysis. PMD is now freely available at www.proteinmicroarray.cn. PMID:26813635

  2. Plasmonically amplified fluorescence bioassay with microarray format

    NASA Astrophysics Data System (ADS)

    Gogalic, S.; Hageneder, S.; Ctortecka, C.; Bauch, M.; Khan, I.; Preininger, Claudia; Sauer, U.; Dostalek, J.

    2015-05-01

    Plasmonic amplification of the fluorescence signal in bioassays with a microarray detection format is reported. A crossed relief diffraction grating was designed to couple an excitation laser beam to surface plasmons at a wavelength overlapping the absorption and emission bands of the fluorophore Dy647 that was used as a label. The surface of the periodically corrugated sensor chip was coated with a surface plasmon-supporting gold layer and a thin SU8 polymer film carrying epoxy groups. These groups were employed for the covalent immobilization of capture antibodies at arrays of spots. The plasmonic amplification of the fluorescence signal on the developed microarray chip was tested using an interleukin 8 sandwich immunoassay. The readout was performed ex situ after drying the chip, using a commercial scanner with a high-numerical-aperture collecting lens. The results reveal an enhancement of the fluorescence signal by a factor of 5 when compared to a regular glass chip.

  3. Immobilization Techniques for Microarray: Challenges and Applications

    PubMed Central

    Nimse, Satish Balasaheb; Song, Keumsoo; Sonawane, Mukesh Digambar; Sayyed, Danishmalik Rafiq; Kim, Taisun

    2014-01-01

    The highly programmable positioning of molecules (biomolecules, nanoparticles, nanobeads, nanocomposite materials) on surfaces has potential applications in the fields of biosensors, biomolecular electronics, and nanodevices. However, conventional techniques, including self-assembled monolayers, fail to position the molecules on the nanometer scale to produce highly organized monolayers on the surface. The present article elaborates on different techniques for the immobilization of biomolecules on surfaces to produce microarrays, and on their diagnostic applications. The advantages and the drawbacks of various methods are compared. This article also sheds light on the applications of the different technologies for the detection and discrimination of viral/bacterial genotypes and the detection of biomarkers. A brief survey with 115 references covering the last 10 years on the biological applications of microarrays in various fields is also provided. PMID:25429408

  4. Use of microarray technologies in toxicology research.

    PubMed

    Vrana, Kent E; Freeman, Willard M; Aschner, Michael

    2003-06-01

    Microarray technology provides a unique tool for the determination of gene expression at the level of messenger RNA (mRNA). The simultaneous measurement of the entire human genome (thousands of genes) will facilitate the uncovering of specific gene expression patterns that are associated with disease. One important application of microarray technology, within the context of neurotoxicological studies, is its use as a screening tool for the identification of molecular mechanisms of toxicity. Such approaches enable researchers to identify those genes and their products (either single or whole pathways) that are involved in conferring resistance or sensitivity to toxic substances. This review addresses: (1) the potential uses of array data; (2) the various array platforms, highlighting both their advantages and disadvantages; (3) insights into data analysis and presentation strategies; and (4) concrete examples of DNA array studies in neurotoxicological research. PMID:12782098

  5. A Flexible Microarray Data Simulation Model

    PubMed Central

    Dembélé, Doulaye

    2013-01-01

    Microarray technology allows monitoring of gene expression profiling at the genome level. This is useful for searching for genes involved in a disease. The performance of the methods used to select interesting genes is most often judged after other analyses (qPCR validation, searches in databases...), which are also subject to error. A good evaluation of gene selection methods is possible with data whose characteristics are known, that is to say, synthetic data. We propose a model to simulate microarray data with characteristics similar to the data commonly produced by current platforms. The parameters used in this model are described to allow the user to generate data with varying characteristics. In order to show the flexibility of the proposed model, a commented example is given and illustrated. An R package is available for immediate use.
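
    The published tool is an R package; as a language-neutral illustration of the same idea, here is a toy two-class log-expression simulator with a controllable number of differentially expressed genes (parameters and defaults are invented for the example):

        import numpy as np

        def simulate_microarray(n_genes=1000, n_per_class=10, n_diff=50,
                                effect=1.0, sigma=0.5, seed=0):
            # Toy two-class simulator: the first n_diff genes are shifted by
            # `effect` in class 2; all values are on a log-expression scale.
            rng = np.random.default_rng(seed)
            base = rng.normal(7.0, 1.0, size=n_genes)    # gene-specific baselines
            X1 = base + rng.normal(0, sigma, size=(n_per_class, n_genes))
            X2 = base + rng.normal(0, sigma, size=(n_per_class, n_genes))
            X2[:, :n_diff] += effect                     # differential expression
            return np.vstack([X1, X2]), np.repeat([0, 1], n_per_class)

        X, y = simulate_microarray()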

  6. Microarrays: how many do you need?

    PubMed

    Zien, Alexander; Fluck, Juliane; Zimmer, Ralf; Lengauer, Thomas

    2003-01-01

    We estimate the number of microarrays that is required in order to gain reliable results from a common type of study: the pairwise comparison of different classes of samples. We show that current knowledge allows for the construction of models that look realistic with respect to searches for individual differentially expressed genes, and derive prototypical parameters from real data sets. Such models allow investigation of the dependence of the required number of samples on the relevant parameters: the biological variability of the samples within each class, the fold changes in expression that are desired to be detected, the detection sensitivity of the microarrays, and the acceptable error rates of the results. We supply experimentalists with general conclusions as well as a freely accessible Java applet at www.scai.fhg.de/special/bio/howmanyarrays/ for fine-tuning simulations to their particular settings. PMID:12935350
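
    The applet's simulation model is richer, but the core dependence of sample size on within-class variability, detectable fold change, and error rates can be illustrated with a standard two-sample t-test power calculation (using statsmodels; the numbers are hypothetical):

        from statsmodels.stats.power import TTestIndPower

        # Arrays needed per class to detect a 2-fold change (1 unit on the log2
        # scale) against a biological SD of 0.8 log2 units, at alpha=0.05 and
        # 90% power. Cohen's d = delta / sigma.
        effect_size = 1.0 / 0.8
        n = TTestIndPower().solve_power(effect_size=effect_size,
                                        alpha=0.05, power=0.9)
        print(f"~{n:.1f} arrays per class")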

  7. Glycan microarrays for decoding the glycome

    PubMed Central

    Rillahan, Cory D.; Paulson, James C.

    2011-01-01

    In the last decade glycan microarrays have revolutionized the analysis of the specificity of glycan binding proteins, providing information that simultaneously illuminates the biology mediated by them and decodes the information content of the glycome. Numerous methods have emerged for arraying glycans in a ‘chip’ format, and glycan libraries have been assembled that address the diversity of the human glycome. Such arrays have been successfully used for analysis of glycan binding proteins that mediate mammalian biology, host-pathogen interactions, immune recognition of glycans relevant to vaccine production and cancer antigens. This review covers the development of glycan microarrays and applications that have provided insights into the roles of mammalian and microbial glycan binding proteins. PMID:21469953

  8. A Two Dimensional Overlapped Subaperture Polar Format Algorithm Based on Stepped-chirp Signal

    PubMed Central

    Mao, Xinhua; Zhu, Daiyin; Nie, Xin; Zhu, Zhaoda

    2008-01-01

    In this work, a 2-D subaperture polar format algorithm (PFA) based on a stepped-chirp signal is proposed. Instead of traditional pulse-synthesis preprocessing, the presented method integrates the pulse synthesis process into the range subaperture processing. Meanwhile, due to the multi-resolution property of subaperture processing, the algorithm is able to compensate for the space-variant phase error caused by radar motion during the period of a pulse cluster. Point-target simulations have validated the presented algorithm.

  9. Design of a combinatorial dna microarray for protein-dnainteraction studies

    SciTech Connect

    Mintseris, Julian; Eisen, Michael B.

    2006-07-07

    Background: Discovery of precise specificity of transcription factors is an important step on the way to understanding the complex mechanisms of gene regulation in eukaryotes. Recently, double-stranded protein-binding microarrays were developed as a potentially scalable approach to tackle transcription factor binding site identification. Results: Here we present an algorithmic approach to experimental design of a microarray that allows for testing full specificity of a transcription factor binding to all possible DNA binding sites of a given length, with optimally efficient use of the array. This design is universal, works for any factor that binds a sequence motif and is not species-specific. Furthermore, simulation results show that data produced with the designed arrays is easier to analyze and would result in more precise identification of binding sites. Conclusion: In this study, we present a design of a double-stranded DNA microarray for protein-DNA interaction studies and show that our algorithm allows optimally efficient use of the arrays for this purpose. We believe such a design will prove useful for transcription factor binding site identification and other biological problems.

  10. Design of a combinatorial DNA microarray for protein-DNA interaction studies

    PubMed Central

    Mintseris, Julian; Eisen, Michael B

    2006-01-01

    Background Discovery of precise specificity of transcription factors is an important step on the way to understanding the complex mechanisms of gene regulation in eukaryotes. Recently, double-stranded protein-binding microarrays were developed as a potentially scalable approach to tackle transcription factor binding site identification. Results Here we present an algorithmic approach to experimental design of a microarray that allows for testing full specificity of a transcription factor binding to all possible DNA binding sites of a given length, with optimally efficient use of the array. This design is universal, works for any factor that binds a sequence motif and is not species-specific. Furthermore, simulation results show that data produced with the designed arrays is easier to analyze and would result in more precise identification of binding sites. Conclusion In this study, we present a design of a double stranded DNA microarray for protein-DNA interaction studies and show that our algorithm allows optimally efficient use of the arrays for this purpose. We believe such a design will prove useful for transcription factor binding site identification and other biological problems. PMID:17018151
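
    The abstract does not name the construction, but covering all length-k binding sites with optimally efficient use of the array is the defining property of a de Bruijn sequence over the DNA alphabet; the standard FKM construction is sketched below (the authors' design must additionally handle double-strandedness, which this sketch ignores):

        def de_bruijn(alphabet, n):
            # FKM algorithm: shortest cyclic sequence containing every length-n
            # string over `alphabet` exactly once. For linear (non-cyclic) reads,
            # append the first n-1 characters to the end.
            k = len(alphabet)
            a = [0] * k * n
            seq = []

            def db(t, p):
                if t > n:
                    if n % p == 0:
                        seq.extend(a[1 : p + 1])
                else:
                    a[t] = a[t - p]
                    db(t + 1, p)
                    for j in range(a[t - p] + 1, k):
                        a[t] = j
                        db(t + 1, t)

            db(1, 1)
            return "".join(alphabet[i] for i in seq)

        s = de_bruijn("ACGT", 4)    # cyclic cover of all 256 tetramers
        print(len(s))               # 256 = 4**4, one position per tetramer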

  11. Epitope Identification from Fixed-complexity Random-sequence Peptide Microarrays

    PubMed Central

    Richer, Josh; Johnston, Stephen Albert; Stafford, Phillip

    2015-01-01

    Antibodies play an important role in modern science and medicine. They are essential in many biological assays and have emerged as an important class of therapeutics. Unfortunately, current methods for mapping antibody epitopes require costly synthesis or enrichment steps, and no low-cost universal platform exists. In order to address this, we tested a random-sequence peptide microarray consisting of over 330,000 unique peptide sequences sampling 83% of all possible tetramers and 27% of pentamers. It is a single, unbiased platform that can be used in many different types of tests, it does not rely on informatic selection of peptides for a particular proteome, and it does not require iterative rounds of selection. In order to optimize the platform, we developed an algorithm that considers the significance of k-length peptide subsequences (k-mers) within selected peptides that come from the microarray. We tested eight monoclonal antibodies and seven infectious disease cohorts. The method correctly identified five of the eight monoclonal epitopes and identified both reported and unreported epitope candidates in the infectious disease cohorts. This algorithm could greatly enhance the utility of random-sequence peptide microarrays by enabling rapid epitope mapping and antigen identification. PMID:25368412
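
    The significance model for k-mers is not detailed in the abstract, but its first step, counting k-length subsequences across the selected peptides, is straightforward; significance scoring would then compare these counts against a background expectation (not shown). The peptide sequences below are invented:

        from collections import Counter

        def kmer_counts(peptides, k=4):
            # Count length-k subsequences across a set of selected peptides.
            counts = Counter()
            for pep in peptides:
                for i in range(len(pep) - k + 1):
                    counts[pep[i : i + k]] += 1
            return counts

        # Hypothetical peptides enriched by an antibody; the shared tetramer
        # 'DYKD' would surface as a candidate epitope core.
        selected = ["GADYKDHS", "TDYKDPLV", "MKDYKDAA"]
        print(kmer_counts(selected).most_common(3))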

  12. Metadata Management and Semantics in Microarray Repositories

    PubMed Central

    Kocabaş, F; Can, T; Baykal, N

    2011-01-01

    The number of microarray and other high-throughput experiments on primary repositories keeps increasing, as do the size and complexity of the results, in response to biomedical investigations. Initiatives have been started on standardization of content, object model, exchange format and ontology. However, there are backlogs and an inability to exchange data between microarray repositories, which indicate that there is a great need for a standard format and data management. We have introduced a metadata framework that includes a metadata card and semantic nets that make experimental results visible, understandable and usable. These are encoded in syntax encoding schemes and represented in RDF (Resource Description Framework); they can be integrated with other metadata cards and semantic nets, and can be exchanged, shared and queried. We demonstrated the performance and potential benefits through a case study on a selected microarray repository. We concluded that the backlogs can be reduced and that exchange of information and asking of knowledge discovery questions can become possible with the use of this metadata framework. PMID:24052712

  13. Development and Applications of the Lectin Microarray.

    PubMed

    Hirabayashi, Jun; Kuno, Atsushi; Tateno, Hiroaki

    2015-01-01

    The lectin microarray is an emerging technology for glycomics. It has already found extensive use in diverse fields of glycobiology by providing simple procedures for differential glycan profiling in a rapid and high-throughput manner. Since its first appearance in the literature in 2005, many application methods have been developed essentially on the same platform, comprising a series of glycan-binding proteins immobilized on an appropriate substrate such as a glass slide. Because the lectin microarray strategy does not require prior liberation of glycans from the core protein in glycoprotein analysis, it should encourage researchers not familiar with glycotechnology to use glycan analysis in future work. This feasibility should provide a broader range of experimental scientists with good opportunities to investigate novel aspects of glycoscience. Applications of the technology include not only basic sciences but also the growing fields of bio-industry. This chapter describes first the essence of glycan profiling and the basic fabrication of the lectin microarray for this purpose. In the latter part the focus is on diverse applications to both structural and functional glycomics, with emphasis on the wide applicability now available with this new technology. Finally, the importance of developing advanced lectin engineering is discussed. PMID:25821171

  14. RNAi microarray analysis in cultured mammalian cells.

    PubMed

    Mousses, Spyro; Caplen, Natasha J; Cornelison, Robert; Weaver, Don; Basik, Mark; Hautaniemi, Sampsa; Elkahloun, Abdel G; Lotufo, Roberto A; Choudary, Ashish; Dougherty, Edward R; Suh, Ed; Kallioniemi, Olli

    2003-10-01

    RNA interference (RNAi) mediated by small interfering RNAs (siRNAs) is a powerful new tool for analyzing gene knockdown phenotypes in living mammalian cells. To facilitate large-scale, high-throughput functional genomics studies using RNAi, we have developed a microarray-based technology for highly parallel analysis. Specifically, siRNAs in a transfection matrix were first arrayed on glass slides, overlaid with a monolayer of adherent cells, incubated to allow reverse transfection, and assessed for the effects of gene silencing by digital image analysis at a single cell level. Validation experiments with HeLa cells stably expressing GFP showed spatially confined, sequence-specific, time- and dose-dependent inhibition of green fluorescence for those cells growing directly on microspots containing siRNA targeting the GFP sequence. Microarray-based siRNA transfections analyzed with a custom-made quantitative image analysis system produced results that were identical to those from traditional well-based transfection, quantified by flow cytometry. Finally, to integrate experimental details, image analysis, data display, and data archiving, we developed a prototype information management system for high-throughput cell-based analyses. In summary, this RNAi microarray platform, together with ongoing efforts to develop large-scale human siRNA libraries, should facilitate genomic-scale cell-based analyses of gene function. PMID:14525932

  15. An imputation approach for oligonucleotide microarrays.

    PubMed

    Li, Ming; Wen, Yalu; Lu, Qing; Fu, Wenjiang J

    2013-01-01

    Oligonucleotide microarrays are commonly adopted for detecting and quantifying the abundance of molecules in biological samples. Analysis of microarray data starts with recording and interpreting hybridization signals from CEL images. However, many CEL images may be blemished by noise from various sources, observed as "bright spots", "dark clouds", and "shadowy circles", etc. It is crucial that these image defects are correctly identified and properly processed. Existing approaches mainly focus on detecting defect areas and removing affected intensities. In this article, we propose to use a mixed effect model for imputing the affected intensities. The proposed imputation procedure is a single-array-based approach which does not require any biological replicate or between-array normalization. We further examine its performance by using Affymetrix high-density SNP arrays. The results show that this imputation procedure significantly reduces genotyping error rates. We also discuss the necessary adjustments for its potential extension to other oligonucleotide microarrays, such as gene expression profiling. The R source code for the implementation of the approach is freely available upon request. PMID:23505547

  16. [Genomic medicine. Polymorphisms and microarray applications].

    PubMed

    Spalvieri, Mónica P; Rotenberg, Rosa G

    2004-01-01

    This update presents new concepts related to the significance of DNA variations among individuals, as well as their detection using a new technology. The sequencing of the human genome is only the beginning of what will enable us to understand genetic diversity. The unit of DNA variability is the single nucleotide polymorphism (SNP). At present, studies on SNPs are restricted to basic research, but the large number of papers on this subject makes their entry into clinical practice feasible. We illustrate here the use of SNPs as molecular markers in ethnic genotyping, gene expression in some diseases and as potential targets in pharmacological response, and also introduce the technology of arrays. Microarray experiments allow the quantification and comparison of gene expression on a large scale, at the same time, by using special chips and array designs. Conventional methods provide data on up to 20 genes, while a single microarray may provide information about thousands of them simultaneously, leading to more rapid and accurate genotyping. Biotechnology improvements will facilitate our knowledge of each gene sequence, the frequency and exact location of SNPs, and their influence on cellular behavior. Although the experimental efficiency and validity of results from microarrays are still controversial, the knowledge and characterization of a patient's genetic profile will undoubtedly lead to advances in the prevention, diagnosis, prognosis and treatment of human diseases. PMID:15637833

  17. High-Throughput Enzyme Kinetics Using Microarrays

    SciTech Connect

    Guoxin Lu; Edward S. Yeung

    2007-11-01

    We report a microanalytical method to study enzyme kinetics. The technique involves immobilizing horseradish peroxidase on a poly-L-lysine (PLL)-coated glass slide in a microarray format, followed by applying substrate solution onto the enzyme microarray. Enzyme molecules are immobilized on the PLL-coated glass slide through electrostatic interactions, and no further modification of the enzyme or glass slide is needed. In situ detection of the products generated on the enzyme spots is made possible by monitoring the light intensity of each spot using a scientific-grade charge-coupled device (CCD). Reactions of substrate solutions of various types and concentrations can be carried out sequentially on one enzyme microarray. To account for the loss of enzyme from washing in between runs, a standard substrate solution is used for calibration. Substantially reduced amounts of substrate solution are consumed for each reaction on each enzyme spot. The Michaelis constant K_m obtained by using this method is comparable to the result for homogeneous solutions. Absorbance detection allows universal monitoring, and no chemical modification of the substrate is needed. High-throughput studies of native enzyme kinetics for multiple enzymes are therefore possible in a simple, rapid, and low-cost manner.
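
    Estimating K_m from the per-spot kinetics amounts to fitting the Michaelis-Menten equation v = Vmax·S/(K_m + S); a sketch with SciPy on hypothetical initial-rate data:

        import numpy as np
        from scipy.optimize import curve_fit

        def michaelis_menten(S, Vmax, Km):
            return Vmax * S / (Km + S)

        # Hypothetical initial-rate data from one enzyme spot:
        # substrate concentrations and measured rates (arbitrary units).
        S = np.array([0.1, 0.25, 0.5, 1.0, 2.0, 5.0, 10.0])
        v = np.array([0.09, 0.19, 0.32, 0.48, 0.63, 0.79, 0.87])

        (Vmax, Km), _ = curve_fit(michaelis_menten, S, v, p0=(1.0, 1.0))
        print(f"Vmax = {Vmax:.2f}, Km = {Km:.2f}")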

  18. Data Pre-Processing Method to Remove Interference of Gas Bubbles and Cell Clusters During Anaerobic and Aerobic Yeast Fermentations in a Stirred Tank Bioreactor

    NASA Astrophysics Data System (ADS)

    Princz, S.; Wenzel, U.; Miller, R.; Hessling, M.

    2014-11-01

    One aerobic and four anaerobic batch fermentations of the yeast Saccharomyces cerevisiae were conducted in a stirred bioreactor and monitored inline by NIR spectroscopy and a transflectance dip probe. From the acquired NIR spectra, chemometric partial least squares regression (PLSR) models for predicting biomass, glucose and ethanol were constructed. The spectra were measured directly in the fermentation broth and successfully inspected for adulteration using our novel data pre-processing method. These adulterations manifested as strong fluctuations in the shape and offset of the absorption spectra. They resulted from cells, cell clusters, or gas bubbles intercepting the optical path of the dip probe. In the proposed data pre-processing method, adulterated signals are removed by passing the time-scanned, non-averaged spectra through two filter algorithms with a 5% quantile cutoff. The filtered spectra containing meaningful data are then averaged. A second step checks whether the whole time scan is analyzable. If true, the average is calculated and used to prepare the PLSR models. This new method distinctly improved the prediction results. To dissociate possible correlations between analyte concentrations, such as glucose and ethanol, the feeding analytes were alternately supplied at different concentrations (spiking) at the end of the four anaerobic fermentations. This procedure yielded anaerobic PLSR models with low prediction errors: 0.31 g/l for biomass, 3.41 g/l for glucose, and 2.17 g/l for ethanol. The maximum concentrations were 14 g/l biomass, 167 g/l glucose, and 80 g/l ethanol. Data from the aerobic fermentation, carried out under high agitation and high aeration, were incorporated to realize combined PLSR models, which have not been previously reported to our knowledge.
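
    The paper's two filter algorithms are not specified beyond the 5% quantile cutoff; a simplified single-filter sketch that discards scans with outlying baseline offsets and averages the remainder:

        import numpy as np

        def filter_and_average(scans, q=0.05):
            # Drop scans whose mean absorbance falls outside the [q, 1-q] quantile
            # band (a proxy for bubble/cell-cluster artifacts), then average.
            offsets = scans.mean(axis=1)              # per-scan baseline offset
            lo, hi = np.quantile(offsets, [q, 1 - q])
            keep = (offsets >= lo) & (offsets <= hi)
            if keep.sum() < 0.5 * len(scans):         # too few clean scans:
                return None                           # whole time scan unusable
            return scans[keep].mean(axis=0)

        rng = np.random.default_rng(0)
        scans = rng.normal(1.0, 0.01, size=(32, 256))  # one time scan of spectra
        scans[3] += 0.5                                # a bubble-distorted scan
        avg = filter_and_average(scans)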

  19. Image change detection algorithms: a systematic survey.

    PubMed

    Radke, Richard J; Andra, Srinivas; Al-Kofahi, Omar; Roysam, Badrinath

    2005-03-01

    Detecting regions of change in multiple images of the same scene taken at different times is of widespread interest due to a large number of applications in diverse disciplines, including remote sensing, surveillance, medical diagnosis and treatment, civil infrastructure, and underwater sensing. This paper presents a systematic survey of the common processing steps and core decision rules in modern change detection algorithms, including significance and hypothesis testing, predictive models, the shading model, and background modeling. We also discuss important preprocessing methods, approaches to enforcing the consistency of the change mask, and principles for evaluating and comparing the performance of change detection algorithms. It is hoped that our classification of algorithms into a relatively small number of categories will provide useful guidance to the algorithm designer. PMID:15762326

  20. Gene ARMADA: an integrated multi-analysis platform for microarray data implemented in MATLAB

    PubMed Central

    Chatziioannou, Aristotelis; Moulos, Panagiotis; Kolisis, Fragiskos N

    2009-01-01

    Background The microarray data analysis realm is ever growing through the development of various tools, open source and commercial. However, there is an absence of predefined, rational algorithmic analysis workflows or standardized batch processing to incorporate all steps, from raw data import up to the derivation of significantly differentially expressed gene lists. This absence obfuscates the analytical procedure and obstructs the massive comparative processing of genomic microarray datasets. Moreover, the solutions provided depend heavily on the programming skills of the user, whereas GUI-embedded solutions do not provide direct support for various raw image analysis formats or a versatile and flexible combination of signal processing methods. Results We describe here Gene ARMADA (Automated Robust MicroArray Data Analysis), a MATLAB-implemented platform with a Graphical User Interface. This suite integrates all steps of microarray data analysis including automated data import, noise correction and filtering, normalization, statistical selection of differentially expressed genes, clustering, classification and annotation. In its current version, Gene ARMADA fully supports two-colour cDNA and Affymetrix oligonucleotide arrays, plus custom arrays for which experimental details are given in tabular form (Excel spreadsheet, comma separated values, tab-delimited text formats). It also supports the analysis of already processed results through its versatile import editor. Besides being fully automated, Gene ARMADA incorporates numerous functionalities of the Statistics and Bioinformatics Toolboxes of MATLAB. In addition, it provides numerous visualization and exploration tools plus customizable export data formats for seamless integration with other analysis tools or MATLAB, for further processing. Gene ARMADA requires MATLAB 7.4 (R2007a) or higher and is also distributed as a stand-alone application with MATLAB Component Runtime

  1. DNA Microarray for Detection of Gastrointestinal Viruses

    PubMed Central

    Martínez, Miguel A.; Soto-del Río, María de los Dolores; Gutiérrez, Rosa María; Chiu, Charles Y.; Greninger, Alexander L.; Contreras, Juan Francisco; López, Susana; Arias, Carlos F.

    2014-01-01

    Gastroenteritis is a clinical illness of humans and other animals that is characterized by vomiting and diarrhea and caused by a variety of pathogens, including viruses. An increasing number of viral species have been associated with gastroenteritis or have been found in stool samples as new molecular tools have been developed. In this work, a DNA microarray capable in theory of parallel detection of more than 100 viral species was developed and tested. Initial validation was done with 10 different virus species, and an additional 5 species were validated using clinical samples. Detection limits of 1 × 10³ virus particles of Human adenovirus C (HAdV), Human astrovirus (HAstV), and group A Rotavirus (RV-A) were established. Furthermore, when exogenous RNA was added, the limit for RV-A detection decreased by one log. In a small group of clinical samples from children with gastroenteritis (n = 76), the microarray detected at least one viral species in 92% of the samples. Single infection was identified in 63 samples (83%), and coinfection with more than one virus was identified in 7 samples (9%). The most abundant virus species were RV-A (58%), followed by Anellovirus (15.8%), HAstV (6.6%), HAdV (5.3%), Norwalk virus (6.6%), Human enterovirus (HEV) (9.2%), Human parechovirus (1.3%), Sapporo virus (1.3%), and Human bocavirus (1.3%). To further test the specificity and sensitivity of the microarray, the results were verified by reverse transcription-PCR (RT-PCR) detection of 5 gastrointestinal viruses. The RT-PCR assay detected a virus in 59 samples (78%). The microarray showed good performance for detection of RV-A, HAstV, and calicivirus, while the sensitivity for HAdV and HEV was low. Furthermore, some discrepancies in detection of mixed infections were observed and were addressed by reverse transcription-quantitative PCR (RT-qPCR) of the viruses involved. It was observed that differences in the amount of genetic material favored the detection of the most abundant

  2. Computerized system for recognition of autism on the basis of gene expression microarray data.

    PubMed

    Latkowski, Tomasz; Osowski, Stanislaw

    2015-01-01

    The aim of this paper is to provide a means to recognize a case of autism using gene expression microarrays. The crucial task is to discover the most important genes that are closely associated with autism. The paper presents an application of different methods of gene selection to select the most representative input attributes for an ensemble of classifiers. The set of classifiers is responsible for distinguishing autism data from the reference class. Simultaneous application of several gene selection methods enables analysis of the ill-conditioned gene expression matrix from different points of view. The results of selection, combined with a genetic algorithm and an SVM classifier, have shown increased accuracy of autism recognition. Early recognition of autism is extremely important for the treatment of children and increases the probability of their recovery and return to normal social communication. The results of this research can find practical application in early recognition of autism on the basis of gene expression microarray analysis. PMID:25464350

  3. An overview of innovations and industrial solutions in Protein Microarray Technology.

    PubMed

    Gupta, Shabarni; Manubhai, K P; Kulkarni, Vishwesh; Srivastava, Sanjeeva

    2016-04-01

    The complexity of protein array technology is reflected in the fact that instrumentation and data analysis are subject to change depending on the biological question, the technical compatibility of instruments, and the software used in each experiment. Industry has played a pivotal role in establishing standards for future deliberations in the sustenance of these technologies, in the form of protein array chips, arrayers, scanning devices, and data analysis software. This has enhanced the outreach of protein microarray technology to researchers across the globe. These developments have encouraged a surge in the adoption of "nonclassical" approaches such as DNA-based protein arrays, micro-contact printing, label-free protein detection, and algorithms for data analysis. This review provides a unique overview of the industrial solutions available for protein microarray based studies. It aims at assessing the developments in various commercial platforms, thus providing a holistic overview of the various modalities, options, and compatibility, and summarizing the journey of this powerful high-throughput technology. PMID:27089056

  4. Facilitating access to pre-processed research evidence in public health

    PubMed Central

    2010-01-01

    Background Evidence-informed decision making is accepted in Canada and worldwide as necessary for the provision of effective health services. This process involves: 1) clearly articulating a practice-based issue; 2) searching for and accessing relevant evidence; 3) appraising methodological rigor and choosing the most synthesized evidence of the highest quality and relevance to the practice issue and setting that is available; and 4) extracting, interpreting, and translating knowledge, in light of the local context and resources, into practice, program and policy decisions. While the public health sector in Canada is working toward evidence-informed decision making, considerable barriers, including efficient access to synthesized resources, exist. Methods In this paper we map to a previously developed 6 level pyramid of pre-processed research evidence, relevant resources that include public health-related effectiveness evidence. The resources were identified through extensive searches of both the published and unpublished domains. Results Many resources with public health-related evidence were identified. While there were very few resources dedicated solely to public health evidence, many clinically focused resources include public health-related evidence, making tools such as the pyramid, that identify these resources, particularly helpful for public health decisions makers. A practical example illustrates the application of this model and highlights its potential to reduce the time and effort that would be required by public health decision makers to address their practice-based issues. Conclusions This paper describes an existing hierarchy of pre-processed evidence and its adaptation to the public health setting. A number of resources with public health-relevant content that are either freely accessible or requiring a subscription are identified. This will facilitate easier and faster access to pre-processed, public health-relevant evidence, with the intent of

  5. Detecting variants with Metabolic Design, a new software tool to design probes for explorative functional DNA microarray development

    PubMed Central

    2010-01-01

    Background Microorganisms display vast diversity, and each one has its own set of genes, cell components and metabolic reactions. To assess their huge unexploited metabolic potential in different ecosystems, we need high throughput tools, such as functional microarrays, that allow the simultaneous analysis of thousands of genes. However, most classical functional microarrays use specific probes that monitor only known sequences, and so fail to cover the full microbial gene diversity present in complex environments. We have thus developed an algorithm, implemented in the user-friendly program Metabolic Design, to design efficient explorative probes. Results First we have validated our approach by studying eight enzymes involved in the degradation of polycyclic aromatic hydrocarbons from the model strain Sphingomonas paucimobilis sp. EPA505 using a designed microarray of 8,048 probes. As expected, microarray assays identified the targeted set of genes induced during biodegradation kinetics experiments with various pollutants. We have then confirmed the identity of these new genes by sequencing, and corroborated the quantitative discrimination of our microarray by quantitative real-time PCR. Finally, we have assessed metabolic capacities of microbial communities in soil contaminated with aromatic hydrocarbons. Results show that our probe design (sensitivity and explorative quality) can be used to study a complex environment efficiently. Conclusions We successfully use our microarray to detect gene expression encoding enzymes involved in polycyclic aromatic hydrocarbon degradation for the model strain. In addition, DNA microarray experiments performed on soil polluted by organic pollutants without prior sequence assumptions demonstrate high specificity and sensitivity for gene detection. Metabolic Design is thus a powerful, efficient tool that can be used to design explorative probes and monitor metabolic pathways in complex environments, and it may also be used to

  6. Evaluation of pre-processing, thresholding and post-processing steps for very small target detection in infrared images

    NASA Astrophysics Data System (ADS)

    Yardımcı, Ozan; Ulusoy, İlkay

    2016-05-01

    Pre-processing, thresholding and post-processing stages are especially important for very small target detection in infrared images. This study measures the effect of each stage on the final detection performance. Various methods for each stage are compared based on the final detection performance, defined by precision and recall values. The best method for each stage is selected and validated. For the pre-processing stage, local block based methods perform best for nearly all thresholding methods. The best thresholding method is chosen as the one that does not need any user-defined parameter. Finally, the post-processing method best suited to the best-performing pre-processing and thresholding methods is selected.
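
    A hedged sketch of the three-stage pipeline the record evaluates: a local block based pre-processing step, a threshold, a size-based post-processing step, and a precision/recall score against a ground-truth mask. The block size, the k-sigma threshold rule and the 9-pixel size cap are illustrative choices, not the specific methods compared in the paper.

```python
# Small-target detection pipeline sketch (parameters are assumptions).
import numpy as np
from scipy.ndimage import uniform_filter, label

def detect_small_targets(img, block=15, k=4.0):
    img = np.asarray(img, dtype=float)
    # Pre-processing: subtract the local block mean so small bright targets
    # stand out against slowly varying background clutter.
    residual = img - uniform_filter(img, size=block)
    # Thresholding: keep pixels k standard deviations above the residual
    # mean; a parameter-free rule could replace this.
    mask = residual > residual.mean() + k * residual.std()
    # Post-processing: drop connected components larger than a few pixels,
    # since the targets of interest are very small.
    labels, _ = label(mask)
    sizes = np.bincount(labels.ravel())
    keep = (sizes <= 9) & (np.arange(len(sizes)) > 0)
    return keep[labels]

def precision_recall(pred, truth):
    tp = np.logical_and(pred, truth).sum()
    return tp / max(pred.sum(), 1), tp / max(truth.sum(), 1)
```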

  7. A fuzzy based feature selection from independent component subspace for machine learning classification of microarray data.

    PubMed

    Aziz, Rabia; Verma, C K; Srivastava, Namita

    2016-06-01

    Feature (gene) selection and classification of microarray data are two of the most interesting machine learning challenges. In the present work, two existing feature selection/extraction algorithms, namely independent component analysis (ICA) and fuzzy backward feature elimination (FBFE), are used in a new combination. The main objective of this paper is to select the independent components of the DNA microarray data using FBFE to improve the performance of support vector machine (SVM) and Naïve Bayes (NB) classifiers, while keeping the computational expense affordable. To show the validity of the proposed method, it is applied to reduce the number of genes for five DNA microarray datasets, namely colon cancer, acute leukemia, prostate cancer, lung cancer II, and high-grade glioma. These datasets are then classified using SVM and NB classifiers. Experimental results on these five microarray datasets demonstrate that the genes selected by the proposed approach effectively improve the performance of SVM and NB classifiers in terms of classification accuracy. We compare our proposed method with principal component analysis (PCA) as a standard extraction algorithm and find that the proposed method obtains better classification accuracy, using SVM and NB classifiers, with a smaller number of selected genes than PCA. For each dataset, the curve of average error rate against number of genes indicates the number of genes required for the highest accuracy with our proposed method for both classifiers. ROC analysis identifies the best subset of genes for both classifiers on the different datasets with the proposed method. PMID:27081632
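
    A minimal sketch of the ICA-plus-elimination idea described above: extract independent components from the expression matrix, then greedily drop the component whose removal hurts cross-validated accuracy the least. The paper's elimination step is fuzzy-based (FBFE); the plain greedy loop below is a simplified stand-in, and all parameters are illustrative.

```python
# ICA feature extraction followed by simple backward elimination (sketch).
import numpy as np
from sklearn.decomposition import FastICA
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

def ica_backward_select(X, y, n_components=10, n_keep=5):
    # Represent each sample by its coefficients over the components.
    S = FastICA(n_components=n_components, random_state=0).fit_transform(X)
    active = list(range(n_components))
    while len(active) > n_keep:
        # Score every candidate subset with one component removed; drop the
        # component whose removal leaves the best accuracy.
        scores = [
            cross_val_score(GaussianNB(),
                            S[:, [c for c in active if c != d]], y, cv=3).mean()
            for d in active
        ]
        del active[int(np.argmax(scores))]
    return S[:, active], active
```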

  8. Dynamic biclustering of microarray data by multi-objective immune optimization

    PubMed Central

    2011-01-01

    Abstract Background New microarray technologies yield large-scale datasets. Microarray datasets are usually presented as 2D matrices, where rows represent genes and columns represent experimental conditions. Systematic analysis of those datasets provides an increasing amount of information, which is urgently needed in the post-genomic era. Biclustering, a technique developed to allow simultaneous clustering of rows and columns of a dataset, might be useful for extracting more accurate information from those datasets. Biclustering requires the optimization of two conflicting objectives (residue and volume), a task well suited to a multi-objective artificial immune system capable of performing a multi-population search. As a heuristic search technique, artificial immune systems (AISs) can be considered a new computational paradigm inspired by the immunological system of vertebrates and designed to solve a wide range of optimization problems. During biclustering, several objectives in conflict with each other have to be optimized simultaneously, so a multi-objective optimization model is suitable for solving the biclustering problem. Results Based on a dynamic population, this paper proposes a novel dynamic multi-objective immune optimization biclustering (DMOIOB) algorithm to mine coherent patterns from microarray data. Experimental results on two common and public datasets of gene expression profiles show that our approach can effectively find significant localized structures related to sets of genes that show consistent expression patterns across subsets of experimental conditions. The mined patterns present a significant biological relevance in terms of related biological processes, components and molecular functions in a species-independent manner. Conclusions The proposed DMOIOB algorithm is an efficient tool for analyzing large microarray datasets. It achieves good diversity and rapid convergence. PMID:21989068
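
    The "residue" objective mentioned above is commonly the Cheng-Church mean squared residue, which measures how far a sub-matrix deviates from an additive model of row and column effects; "volume" is simply the number of cells. A short sketch of the residue computation (the immune-optimization search itself is beyond this snippet):

```python
# Mean squared residue of a bicluster (rows x cols sub-matrix).
import numpy as np

def mean_squared_residue(X, rows, cols):
    B = X[np.ix_(rows, cols)]
    residue = B - B.mean(axis=1, keepdims=True) \
                - B.mean(axis=0, keepdims=True) + B.mean()
    return (residue ** 2).mean()

# A biclustering search trades low residue against large volume
# (len(rows) * len(cols)), which makes the problem multi-objective.
```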

  9. ARACNe-based inference, using curated microarray data, of Arabidopsis thaliana root transcriptional regulatory networks

    PubMed Central

    2014-01-01

    Background Uncovering the complex transcriptional regulatory networks (TRNs) that underlie plant and animal development remains a challenge. However, a vast amount of data from public microarray experiments is available, which can be subject to inference algorithms in order to recover reliable TRN architectures. Results In this study we present a simple bioinformatics methodology that uses public, carefully curated microarray data and the mutual information algorithm ARACNe in order to obtain a database of transcriptional interactions. We used data from Arabidopsis thaliana root samples to show that the transcriptional regulatory networks derived from this database successfully recover previously identified root transcriptional modules and to propose new transcription factors for the SHORT ROOT/SCARECROW and PLETHORA pathways. We further show that these networks are a powerful tool to integrate and analyze high-throughput expression data, as exemplified by our analysis of a SHORT ROOT induction time-course microarray dataset, and are a reliable source for the prediction of novel root gene functions. In particular, we used our database to predict novel genes involved in root secondary cell-wall synthesis and identified the MADS-box TF XAL1/AGL12 as an unexpected participant in this process. Conclusions This study demonstrates that network inference using carefully curated microarray data yields reliable TRN architectures. In contrast to previous efforts to obtain root TRNs, that have focused on particular functional modules or tissues, our root transcriptional interactions provide an overview of the transcriptional pathways present in Arabidopsis thaliana roots and will likely yield a plethora of novel hypotheses to be tested experimentally. PMID:24739361
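
    A compact sketch of the ARACNe step used above: estimate pairwise mutual information (MI) between expression profiles, then apply the data processing inequality (DPI) to remove the weakest edge of every fully connected triplet, keeping mostly direct interactions. MI is estimated here by simple equal-width discretization; ARACNe proper uses kernel estimators, so this is an approximation.

```python
# ARACNe-style network inference sketch (discretized MI + DPI pruning).
import numpy as np
from sklearn.metrics import mutual_info_score

def aracne(X, n_bins=8, eps=0.0):
    """X: samples x genes matrix -> DPI-pruned MI adjacency matrix."""
    n = X.shape[1]
    binned = np.stack(
        [np.digitize(X[:, i], np.histogram_bin_edges(X[:, i], n_bins))
         for i in range(n)], axis=1)
    mi = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            mi[i, j] = mi[j, i] = mutual_info_score(binned[:, i], binned[:, j])
    pruned = mi.copy()
    for i in range(n):
        for j in range(i + 1, n):
            for k in range(n):
                if k == i or k == j:
                    continue
                # DPI: if (i, j) is the weakest edge of triangle (i, j, k),
                # treat it as indirect and drop it.
                if mi[i, j] < min(mi[i, k], mi[j, k]) - eps:
                    pruned[i, j] = pruned[j, i] = 0.0
                    break
    return pruned
```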

  10. Assessing Statistical Significance in Microarray Experiments Using the Distance Between Microarrays

    PubMed Central

    Hayden, Douglas; Lazar, Peter; Schoenfeld, David

    2009-01-01

    We propose permutation tests based on the pairwise distances between microarrays to compare location, variability, or equivalence of gene expression between two populations. For these tests the entire microarray or some pre-specified subset of genes is the unit of analysis. The pairwise distances only have to be computed once so the procedure is not computationally intensive despite the high dimensionality of the data. An R software package, permtest, implementing the method is freely available from the Comprehensive R Archive Network at http://cran.r-project.org. PMID:19529777
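
    The authors' implementation is the R package permtest; the Python sketch below is an illustrative re-statement of the idea for a two-group location test. Pairwise distances are computed once, the statistic contrasts mean between-group with mean within-group distance, and labels are permuted to obtain a p-value.

```python
# Distance-based two-group permutation test (illustrative sketch).
import numpy as np
from scipy.spatial.distance import pdist, squareform

def distance_permutation_test(X, labels, n_perm=999, seed=0):
    """X: arrays x genes matrix; labels: numpy array of 0/1 group labels."""
    D = squareform(pdist(X))      # all pairwise distances, computed once
    rng = np.random.default_rng(seed)

    def statistic(lab):
        a, b = lab == 0, lab == 1
        between = D[np.ix_(a, b)].mean()
        # Off-diagonal mean within each group (diagonal terms are zero).
        within_a = D[np.ix_(a, a)].sum() / (a.sum() * (a.sum() - 1))
        within_b = D[np.ix_(b, b)].sum() / (b.sum() * (b.sum() - 1))
        return between - 0.5 * (within_a + within_b)

    observed = statistic(labels)
    perms = [statistic(rng.permutation(labels)) for _ in range(n_perm)]
    p = (1 + sum(s >= observed for s in perms)) / (n_perm + 1)
    return observed, p
```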

  11. Evaluation of the robustness of the preprocessing technique improving reversible compressibility of CT images: Tested on various CT examinations

    SciTech Connect

    Jeon, Chang Ho; Kim, Bohyoung; Gu, Bon Seung; Lee, Jong Min; Kim, Kil Joong; Lee, Kyoung Ho; Kim, Tae Ki

    2013-10-15

    Purpose: To modify a previously proposed preprocessing technique that improves the compressibility of computed tomography (CT) images, so that it covers the diversity of three-dimensional configurations of different body parts, and to evaluate the robustness of the technique in terms of segmentation correctness and increase in reversible compression ratio (CR) for various CT examinations. Methods: This study had institutional review board approval with waiver of informed patient consent. A preprocessing technique was previously proposed to improve the compressibility of CT images by replacing pixel values outside the body region with a constant value, thereby maximizing data redundancy. Since the technique was developed for chest CT images only, the authors modified the segmentation method to cover the diversity of three-dimensional configurations of different body parts. The modified version was evaluated as follows. In 368 randomly selected CT examinations (352 787 images), each image was preprocessed using the modified preprocessing technique. Radiologists visually confirmed whether the segmented region covered the body region. The images with and without the preprocessing were reversibly compressed using Joint Photographic Experts Group (JPEG), JPEG2000 two-dimensional (2D), and JPEG2000 three-dimensional (3D) compressions. The percentage increase in CR per examination (CR_I) was measured. Results: The rate of correct segmentation was 100.0% (95% CI: 99.9%, 100.0%) for all the examinations. The medians of CR_I were 26.1% (95% CI: 24.9%, 27.1%), 40.2% (38.5%, 41.1%), and 34.5% (32.7%, 36.2%) in JPEG, JPEG2000 2D, and JPEG2000 3D, respectively. Conclusions: In various CT examinations, the modified preprocessing technique can increase the CR by 25% or more without degrading diagnostic information.
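
    A hedged sketch of the pre-processing idea evaluated above: segment the body region, replace everything outside it with a single constant value, and let the lossless coder exploit the resulting redundancy. The threshold-plus-largest-component segmentation below is a simplification; the paper's modified segmentation is what handles the diversity of body parts.

```python
# Background flattening of a CT slice before lossless compression (sketch).
import numpy as np
from scipy.ndimage import binary_fill_holes, label

def flatten_background(ct_slice, air_hu=-500, fill_value=-1000):
    body = ct_slice > air_hu                     # crude air/body threshold
    labels, n = label(body)
    if n == 0:
        return np.full_like(ct_slice, fill_value)
    largest = np.argmax(np.bincount(labels.ravel())[1:]) + 1
    mask = binary_fill_holes(labels == largest)  # keep body, fill cavities
    out = ct_slice.copy()
    out[~mask] = fill_value                      # constant outside the body
    return out
```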

  12. Feasibility investigation of integrated optics Fourier transform devices. [holographic subtraction for multichannel data preprocessing]

    NASA Technical Reports Server (NTRS)

    Verber, C. M.; Vahey, D. W.; Wood, V. E.; Kenan, R. P.; Hartman, N. F.

    1977-01-01

    The possibility of producing an integrated optics data processing device based upon Fourier transformations or other parallel processing techniques, and the ways in which such techniques may be used to upgrade the performance of present and projected NASA systems were investigated. Activities toward this goal include; (1) production of near-diffraction-limited geodesic lenses in glass waveguides; (2) development of grinding and polishing techniques for the production of geodesic lenses in LiNbO3 waveguides; (3) development of a characterization technique for waveguide lenses; and (4) development of a theory for corrected aspheric geodesic lenses. A holographic subtraction system was devised which should be capable of rapid on-board preprocessing of a large number of parallel data channels. The principle involved is validated in three demonstrations.

  13. Analog signal pre-processing for the Fermilab Main Injector BPM upgrade

    SciTech Connect

    Saewert, A.L.; Rapisarda, S.M.; Wendt, M.; /Fermilab

    2006-05-01

    An analog signal pre-processing scheme was developed, in the framework of the Fermilab Main Injector Beam Position Monitor (BPM) Upgrade, to interface BPM pickup signals to the new digital receiver based read-out system. A key component is the 8-channel electronics module, which uses separate frequency selective gain stages to acquire 53 MHz bunched proton, and 2.5 MHz anti-proton signals. Related hardware includes a filter and combiner box to sum pickup electrode signals in the tunnel. A controller module allows local/remote control of gain settings and activation of gain stages, and supplies test signals. Theory of operation, system overview, and some design details are presented, as well as first beam measurements of the prototype hardware.

  14. Intelligent Text Retrieval and Knowledge Acquisition from Texts for NASA Applications: Preprocessing Issues

    NASA Technical Reports Server (NTRS)

    2001-01-01

    In this contract, which is a component of a larger contract that we plan to submit in the coming months, we plan to study the preprocessing issues which arise in applying natural language processing techniques to NASA-KSC problem reports. The goals of this work will be to deal with the issues of: a) automatically obtaining the problem reports from NASA-KSC databases, b) the format of these reports, and c) the conversion of these reports to a format that will be adequate for our natural language software. At the end of this contract, we expect that these problems will be solved and that we will be ready to apply our natural language software to a text database of over 1000 KSC problem reports.

  15. Color Image Watermarking Scheme Based on Efficient Preprocessing and Support Vector Machines

    NASA Astrophysics Data System (ADS)

    Fındık, Oğuz; Bayrak, Mehmet; Babaoğlu, Ismail; Çomak, Emre

    This paper suggests a new block-based watermarking technique utilizing preprocessing and support vector machines (PPSVMW) to protect the intellectual property rights of color images. A binary test set is employed to train the support vector machine (SVM). Before the binary data are added to the original image, the blocks are separated into two parts to train the SVM for better accuracy. The watermark's 1-valued bits were randomly added to the first block part and the 0-valued bits to the second. The watermark is embedded by modifying the blue-channel pixel value in the middle of each block to compose the watermarked image. The SVM was trained with the set bits and three other features, namely the averages of pixel differences in three distinct shapes extracted from each block, and hence the watermark can be extracted without the original image. The results of the PPSVMW technique proposed in this study were compared with those of Tsai's technique, and our technique proved more efficient.

  16. Advances in Software Tools for Pre-processing and Post-processing of Overset Grid Computations

    NASA Technical Reports Server (NTRS)

    Chan, William M.

    2004-01-01

    Recent developments in three pieces of software for performing pre-processing and post-processing work on numerical computations using overset grids are presented. The first is the OVERGRID graphical interface which provides a unified environment for the visualization, manipulation, generation and diagnostics of geometry and grids. Modules are also available for automatic boundary conditions detection, flow solver input preparation, multiple component dynamics input preparation and dynamics animation, simple solution viewing for moving components, and debris trajectory analysis input preparation. The second is a grid generation script library that enables rapid creation of grid generation scripts. A sample of recent applications will be described. The third is the OVERPLOT graphical interface for displaying and analyzing history files generated by the flow solver. Data displayed include residuals, component forces and moments, number of supersonic and reverse flow points, and various dynamics parameters.

  17. Experimental design, preprocessing, normalization and differential expression analysis of small RNA sequencing experiments

    PubMed Central

    2011-01-01

    Prior to the advent of new, deep sequencing methods, small RNA (sRNA) discovery was dependent on Sanger sequencing, which was time-consuming and limited knowledge to only the most abundant sRNA. The innovation of large-scale, next-generation sequencing has exponentially increased knowledge of the biology, diversity and abundance of sRNA populations. In this review, we discuss issues involved in the design of sRNA sequencing experiments, including choosing a sequencing platform, inherent biases that affect sRNA measurements and replication. We outline the steps involved in preprocessing sRNA sequencing data and review both the principles behind and the current options for normalization. Finally, we discuss differential expression analysis in the absence and presence of biological replicates. While our focus is on sRNA sequencing experiments, many of the principles discussed are applicable to the sequencing of other RNA populations. PMID:21356093

  18. Reductive leaching of low-grade manganese ore with pre-processed cornstalk

    NASA Astrophysics Data System (ADS)

    Yi, Ai-fei; Wu, Meng-ni; Liu, Peng-wei; Feng, Ya-li; Li, Hao-ran

    2015-12-01

    Cornstalk is usually used directly as a reductant in the reductive leaching of manganese. However, low utilization of the cornstalk leads to a low manganese dissolution ratio. In this research, pretreatment of the cornstalk was proposed to improve the manganese dissolution ratio. Cornstalk was preprocessed with a heated sulfuric acid solution (1.2 M sulfuric acid) for 10 min at 80°C. Thereafter, both the pretreated solution and the residue were used as reductants for manganese leaching. This method not only exhibited superior activity for hydrolyzing cornstalk but also enhanced manganese dissolution. These effects were attributed to an increase in the amount of reducing sugars resulting from lignin hydrolysis. Through acid pretreatment of the cornstalk, the manganese dissolution ratio was improved from 50.14% to 83.46%. The present work demonstrates for the first time the effective acid pretreatment of cornstalk to provide a cost-effective reductant for manganese leaching.

  19. Preprocessing of backprojection images in the McClellan Nuclear Radiation Center tomography system

    SciTech Connect

    Gibbons, M. R., LLNL

    1998-02-19

    Neutron tomography is being investigated as a nondestructive technique for quantitative assessment of low-atomic-mass impurity concentrations in metals. Neutrons maximize the sensitivity, given their higher cross sections for low-Z isotopes, while tomography provides the three-dimensional density information. The specific application is the detection of hydrogen down to 200 ppm by weight in aircraft engine compressor blades. A number of preprocessing corrections have been implemented for the backprojection images in order to achieve the detection requirements at a testing rate of three blades per hour. Among the procedures are corrections for neutron scattering and beam hardening. With these procedures, the artifacts in tomographic reconstructions are shown to be less than the signal for 100 ppm hydrogen in titanium alloy samples.

  20. Preprocessing communication unit (PCU) with short message service (SMS) communication channels for AVL tracking

    NASA Astrophysics Data System (ADS)

    Young, Andrew S.; Skobla, Joseph

    2003-08-01

    The Preprocessing GPS - SMS Communication Unit (PCU) is a mobile tracking device used within AVL tracking systems for determining the location of vehicles. It was designed primarily to utilize the SMS service of the GSM network for communicating. The use of SMS messages is part of an effort aimed at providing a cost-effective alternative for tracking the location of vehicles. Its primary function is to send information about user location across a GSM network to a Central Base Station (CBS) from which assets are being tracked. Though SMS is the main bearer, the unit is also capable of using the Circuit Switched Data service (CSD) to send and receive data from the Base Station (BS). The PCU was developed as a small hardware unit based on the Microchip microcontroller, with a multiplexer switching two RS-232 serial inputs. One input is dedicated to the GPS receiver and the second one to the wireless modem.

  1. The MARK-AGE extended database: data integration and pre-processing.

    PubMed

    Baur, J; Kötter, T; Moreno-Villanueva, M; Sindlinger, T; Berthold, M R; Bürkle, A; Junk, M

    2015-11-01

    MARK-AGE is a recently completed European population study, where bioanalytical and anthropometric data were collected from human subjects at a large scale. To facilitate data analysis and mathematical modelling, an extended database had to be constructed, integrating the data sources that were part of the project. This step involved checking, transformation and documentation of data. The success of downstream analysis mainly depends on the preparation and quality of the integrated data. Here, we present the pre-processing steps applied to the MARK-AGE data to ensure high quality and reliability in the MARK-AGE Extended Database. Various kinds of obstacles that arose during the project are highlighted and solutions are presented. PMID:26004672

  2. Pre-Processing Code System for Data in ENDF/B Format.

    Energy Science and Technology Software Center (ESTSC)

    2015-04-01

    Version 08 PREPRO2015-2 is a modular set of computer codes, each of which reads evaluated nuclear data in the ENDF/B format, processes the data and outputs it in the ENDF/B format. Each code performs one or more independent operations on the data. The codes are named "the pre-processing" codes, because they are designed to pre-process ENDF/B data for later, further processing for use in applications. These codes are designed to operate on virtually any type of computer, with the included capability of optimization on any given computer. They can process datasets in any ENDF/B format, ENDF/B-I through ENDF/B-VII. This package contains updated content. Additional information is available on the PREPRO website: http://www-nds.iaea.org/ndspub/endf/prepro/.

  3. Pre-Processing Code System for Data in ENDF/B Format.

    SciTech Connect

    CULLEN, D. E.

    2015-04-01

    Version 08 PREPRO2015-2 is a modular set of computer codes, each of which reads evaluated nuclear data in the ENDF/B format, processes the data and outputs it in the ENDF/B format. Each code performs one or more independent operations on the data. The codes are named "the pre-processing" codes, because they are designed to pre-process ENDF/B data for later, further processing for use in applications. These codes are designed to operate on virtually any type of computer, with the included capability of optimization on any given computer. They can process datasets in any ENDF/B format, ENDF/B-I through ENDF/B-VII. This package contains updated content. Additional information is available on the PREPRO website: http://www-nds.iaea.org/ndspub/endf/prepro/.

  4. CMS Preprocessing Subsystem user's guide: Software Version 2.0

    SciTech Connect

    Didier, B.T.; Gash, J.D.; Greitzer, F.L.; Havre, S.L.; Ramsdell, J.V.; Turney, C.R.

    1993-12-01

    The Common Mapping Standard (CMS) Data Production System (CDPS) produces and distributes CMS data in compliance with the Common Mapping Standard Interface Control Document. CDPS is composed of two subsystems, the CMS Preprocessing Subsystem (CPS) and the CMS Distribution Subsystem (CDS). This guide describes the operation of CPS. CPS is responsible for the management of source data and the production of CMS data from source data. The CPS system was developed for use on a workstation running Ultrix 4.2, the X Window System Version X11R4, and motif Version 1.1. This subsystem is organized into four major functional groups and supports production of CMS data from source chart, indose, and elevation data products.

  5. From voids to Coma: the prevalence of pre-processing in the local Universe

    NASA Astrophysics Data System (ADS)

    Cybulski, Ryan; Yun, Min S.; Fazio, Giovanni G.; Gutermuth, Robert A.

    2014-04-01

    We examine the effects of pre-processing across the Coma Supercluster, including 3505 galaxies over ~500 deg^2, by quantifying the degree to which star-forming (SF) activity is quenched as a function of environment. We characterize environment using the complementary techniques of Voronoi Tessellation, to measure the density field, and the Minimal Spanning Tree, to define continuous structures, and so we measure SF activity as a function of local density and the type of environment (cluster, group, filament, and void), and quantify the degree to which environment contributes to quenching of SF activity. Our sample covers over two orders of magnitude in stellar mass (10^8.5-10^11 M⊙), and consequently, we trace the effects of environment on SF activity for dwarf and massive galaxies, distinguishing so-called mass quenching from environment quenching. Environmentally driven quenching of SF activity, measured relative to the void galaxies, occurs to progressively greater degrees in filaments, groups, and clusters, and this trend holds for dwarf and massive galaxies alike. A similar trend is found using g - r colours, but with a more significant disparity between galaxy mass bins driven by increased internal dust extinction in massive galaxies. The SFR distributions of massive SF galaxies have no significant environmental dependence, but the distributions for dwarf SF galaxies are found to be statistically distinct in most environments. Pre-processing plays a significant role at low redshift, as environmentally driven galaxy evolution affects nearly half of the galaxies in the group environment, and a significant fraction of the galaxies in the more diffuse filaments. Our study underscores the need for sensitivity to dwarf galaxies to separate mass-driven from environmentally driven effects, and the use of unbiased tracers of SF activity.

  6. Predicting monthly streamflow using data-driven models coupled with data-preprocessing techniques

    NASA Astrophysics Data System (ADS)

    Wu, C. L.; Chau, K. W.; Li, Y. S.

    2009-08-01

    In this paper, the accuracy performance of monthly streamflow forecasts is discussed when using data-driven modeling techniques on the streamflow series. A crisp distributed support vector regression (CDSVR) model was proposed for monthly streamflow prediction in comparison with four other models: autoregressive moving average (ARMA), K-nearest neighbors (KNN), artificial neural networks (ANNs), and crisp distributed artificial neural networks (CDANN). With respect to the distributed models of CDSVR and CDANN, the fuzzy C-means (FCM) clustering technique first split the flow data into three subsets (low, medium, and high levels) according to the magnitudes of the data, and then three single SVRs (or ANNs) were fitted to the three subsets. This paper gives a detailed analysis of the reconstruction of dynamics that was used to identify the configuration of all models except for ARMA. To improve the model performance, the data-preprocessing techniques of singular spectrum analysis (SSA) and/or moving average (MA) were coupled with all five models. Some discussions were presented (1) on the number of neighbors in KNN; (2) on the configuration of ANN; and (3) on the investigation of effects of MA and SSA. Two streamflow series from different locations in China (Xiangjiaba and Danjiangkou) were applied for the analysis of forecasting. Forecasts were conducted at four different horizons (1-, 3-, 6-, and 12-month-ahead forecasts). The results showed that models fed by preprocessed data performed better than models fed by original data, and CDSVR outperformed the other models except at the 6-month-ahead horizon for Danjiangkou. From the perspective of the streamflow series, the SSA exhibited better effects on the Danjiangkou data because its raw discharge series was more complex than the discharge of Xiangjiaba. The MA considerably improved the performance of ANN, CDANN, and CDSVR by adjusting the correlation relationship between input components and output of models. It was also found that the
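
    A minimal sketch of the coupling described above: smooth the series with a moving average (MA), embed lagged values as inputs, and fit a support vector regressor for a one-step-ahead monthly forecast. The lag count and MA window are illustrative; the paper tunes the configuration per catchment and also evaluates SSA.

```python
# MA-preprocessed one-step-ahead streamflow forecast with SVR (sketch).
import numpy as np
from sklearn.svm import SVR

def ma_smooth(x, window=3):
    return np.convolve(x, np.ones(window) / window, mode="valid")

def forecast_one_step(series, n_lags=4):
    s = ma_smooth(np.asarray(series, dtype=float))
    # Embed the series: each row holds n_lags consecutive values.
    X = np.array([s[i:i + n_lags] for i in range(len(s) - n_lags)])
    y = s[n_lags:]
    model = SVR(kernel="rbf").fit(X[:-1], y[:-1])  # hold out the last point
    return model.predict(X[-1:])[0], y[-1]         # (prediction, actual)
```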

  7. Chang'E-3 data pre-processing system based on scientific workflow

    NASA Astrophysics Data System (ADS)

    Tan, Xu; Liu, Jianjun; Wang, Yuanyuan; Yan, Wei; Zhang, Xiaoxia; Li, Chunlai

    2016-04-01

    The Chang'E-3 (CE3) mission has obtained a huge amount of lunar scientific data. Data pre-processing is an important segment of the CE3 ground research and application system. With a dramatic increase in the demand for data research and application, a Chang'E-3 data pre-processing system (CEDPS) based on scientific workflow is proposed, with the purpose of making scientists more flexible and productive by automating data-driven processing. The system should allow the planning, conduct and control of the data processing procedure with the following possibilities: • describing a data processing task, including: 1) defining input/output data, 2) defining the data relationships, 3) defining the sequence of tasks, 4) defining the communication between tasks, 5) defining mathematical formulas, 6) defining the relationship between tasks and data; • automatic processing of tasks. Accordingly, describing a task is the key point determining whether the system is flexible. We design a workflow designer, a visual environment for capturing processes as workflows, and discuss its three-level model: 1) the data relationships are established through a product tree; 2) the process model is constructed based on a directed acyclic graph (DAG), in which a set of process workflow constructs, including Sequence, Loop, Merge and Fork, are compositional with one another; 3) to reduce the modeling complexity of the mathematical formulas using a DAG, semantic modeling based on MathML is adopted. On top of that, we present how the CE3 data are processed with CEDPS.

  8. Identification of significant features in DNA microarray data

    PubMed Central

    Bair, Eric

    2013-01-01

    DNA microarrays are a relatively new technology that can simultaneously measure the expression level of thousands of genes. They have become an important tool for a wide variety of biological experiments. One of the most common goals of DNA microarray experiments is to identify genes associated with biological processes of interest. Conventional statistical tests often produce poor results when applied to microarray data owing to small sample sizes, noisy data, and correlation among the expression levels of the genes. Thus, novel statistical methods are needed to identify significant genes in DNA microarray experiments. This article discusses the challenges inherent in DNA microarray analysis and describes a series of statistical techniques that can be used to overcome these challenges. The problem of multiple hypothesis testing and its relation to microarray studies are also considered, along with several possible solutions. PMID:24244802

  9. High-throughput allogeneic antibody detection using protein microarrays.

    PubMed

    Paul, Jed; Sahaf, Bita; Perloff, Spenser; Schoenrock, Kelsi; Wu, Fang; Nakasone, Hideki; Coller, John; Miklos, David

    2016-05-01

    Enzyme-linked immunosorbent assays (ELISAs) have traditionally been used to detect alloantibodies in patient plasma samples post hematopoietic cell transplantation (HCT); however, protein microarrays have the potential to be multiplexed, more sensitive, and higher throughput than ELISAs. Here, we describe the development of a novel and sensitive microarray method for detection of allogeneic antibodies against minor histocompatibility antigens encoded on the Y chromosome, called HY antigens. Six microarray surfaces were tested for their ability to bind recombinant protein and peptide HY antigens. Significant allogeneic immune responses were determined in male patients with female donors by considering normal male donor responses as baseline. HY microarray results were also compared with our previous ELISA results. Our overall goal was to maximize antibody detection for both recombinant protein and peptide epitopes. For detection of HY antigens, the Epoxy (Schott) protein microarray surface was both most sensitive and reliable and has become the standard surface in our microarray platform. PMID:26902899

  10. An Enhanced TIMESAT Algorithm for Estimating Vegetation Phenology Metrics from MODIS Data

    NASA Technical Reports Server (NTRS)

    Tan, Bin; Morisette, Jeffrey T.; Wolfe, Robert E.; Gao, Feng; Ederer, Gregory A.; Nightingale, Joanne; Pedelty, Jeffrey A.

    2012-01-01

    An enhanced TIMESAT algorithm was developed for retrieving vegetation phenology metrics from 250 m and 500 m spatial resolution Moderate Resolution Imaging Spectroradiometer (MODIS) vegetation indexes (VI) over North America. MODIS VI data were pre-processed using snow-cover and land surface temperature data, and temporally smoothed with the enhanced TIMESAT algorithm. An objective third derivative test was applied to define key phenology dates and retrieve a set of phenology metrics. This algorithm has been applied to two MODIS VIs: Normalized Difference Vegetation Index (NDVI) and Enhanced Vegetation Index (EVI). In this paper, we describe the algorithm and use EVI as an example to compare three sets of TIMESAT algorithm/MODIS VI combinations: a) original TIMESAT algorithm with original MODIS VI, b) original TIMESAT algorithm with pre-processed MODIS VI, and c) enhanced TIMESAT and pre-processed MODIS VI. All retrievals were compared with ground phenology observations, some made available through the National Phenology Network. Our results show that for MODIS data in middle to high latitude regions, snow and land surface temperature information is critical in retrieving phenology metrics from satellite observations. The results also show that the enhanced TIMESAT algorithm can better accommodate growing season start and end dates that vary significantly from year to year. The TIMESAT algorithm improvements contribute to more spatial coverage and more accurate retrievals of the phenology metrics. Among three sets of TIMESAT/MODIS VI combinations, the start of the growing season metric predicted by the enhanced TIMESAT algorithm using pre-processed MODIS VIs has the best associations with ground observed vegetation greenup dates.
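
    A sketch of the objective third-derivative test described above: after temporal smoothing of the VI series, zero crossings of the third derivative mark the fastest changes in curvature, which serve as candidate key phenology dates. The Savitzky-Golay smoother stands in for the TIMESAT fitting, and the window settings are assumptions.

```python
# Third-derivative test on a smoothed vegetation-index series (sketch).
import numpy as np
from scipy.signal import savgol_filter

def phenology_candidates(vi, window=13, poly=4):
    smooth = savgol_filter(np.asarray(vi, dtype=float),
                           window_length=window, polyorder=poly)
    d3 = np.gradient(np.gradient(np.gradient(smooth)))
    # Indices where the third derivative changes sign are candidate
    # season start/end dates.
    return np.flatnonzero(np.diff(np.sign(d3)) != 0)
```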

  11. An enhanced TIMESAT algorithm for estimating vegetation phenology metrics from MODIS data

    USGS Publications Warehouse

    Tan, B.; Morisette, J.T.; Wolfe, R.E.; Gao, F.; Ederer, G.A.; Nightingale, J.; Pedelty, J.A.

    2011-01-01

    An enhanced TIMESAT algorithm was developed for retrieving vegetation phenology metrics from 250 m and 500 m spatial resolution Moderate Resolution Imaging Spectroradiometer (MODIS) vegetation indexes (VI) over North America. MODIS VI data were pre-processed using snow-cover and land surface temperature data, and temporally smoothed with the enhanced TIMESAT algorithm. An objective third derivative test was applied to define key phenology dates and retrieve a set of phenology metrics. This algorithm has been applied to two MODIS VIs: Normalized Difference Vegetation Index (NDVI) and Enhanced Vegetation Index (EVI). In this paper, we describe the algorithm and use EVI as an example to compare three sets of TIMESAT algorithm/MODIS VI combinations: a) original TIMESAT algorithm with original MODIS VI, b) original TIMESAT algorithm with pre-processed MODIS VI, and c) enhanced TIMESAT and pre-processed MODIS VI. All retrievals were compared with ground phenology observations, some made available through the National Phenology Network. Our results show that for MODIS data in middle to high latitude regions, snow and land surface temperature information is critical in retrieving phenology metrics from satellite observations. The results also show that the enhanced TIMESAT algorithm can better accommodate growing season start and end dates that vary significantly from year to year. The TIMESAT algorithm improvements contribute to more spatial coverage and more accurate retrievals of the phenology metrics. Among three sets of TIMESAT/MODIS VI combinations, the start of the growing season metric predicted by the enhanced TIMESAT algorithm using pre-processed MODIS VIs has the best associations with ground observed vegetation greenup dates. © 2010 IEEE.

  12. Prenatal chromosomal microarray for the Catholic physician

    PubMed Central

    Bringman, Jay J.

    2014-01-01

    Prenatal chromosomal microarray (CMA) is a test that is used to diagnose certain genetic problems in the fetus. While the test has been used in the pediatric setting for several years, it is now being introduced for use in the prenatal setting. The test offers great hope for detection of certain genetic defects in the fetus so that early intervention can be performed to improve the outcome for that individual. As with many biotechnical advances, CMA comes with certain bioethical issues that need to be addressed prior to its implementation. This paper is intended to provide guidance to all those that provide counseling regarding genetic testing options during pregnancy. PMID:24899750

  13. Protein Microarrays--Without a Trace

    SciTech Connect

    Camarero, J A

    2007-04-05

    Many experimental approaches in biology and biophysics, as well as applications in diagnosis and drug discovery, require proteins to be immobilized on solid supports. Protein microarrays, for example, provide a high-throughput format to study biomolecular interactions. The technique employed for protein immobilization is a key to the success of these applications. Recent biochemical developments are allowing, for the first time, the selective and traceless immobilization of proteins generated by cell-free systems without the need for purification and/or reconcentration prior to the immobilization step.

  14. ProMAT: protein microarray analysis tool

    SciTech Connect

    White, Amanda M.; Daly, Don S.; Varnum, Susan M.; Anderson, Kevin K.; Bollinger, Nikki; Zangar, Richard C.

    2006-04-04

    Summary: ProMAT is a software tool for statistically analyzing data from ELISA microarray experiments. The software estimates standard curves, sample protein concentrations and their uncertainties for multiple assays. ProMAT generates a set of comprehensive figures for assessing results and diagnosing process quality. The tool is available for Windows or Mac, and is distributed as open-source Java and R code. Availability: ProMAT is available at http://www.pnl.gov/statistics/ProMAT. ProMAT requires Java version 1.5.0 and R version 1.9.1 (or more recent versions) which are distributed with the tool.

  15. Genetic algorithms

    NASA Technical Reports Server (NTRS)

    Wang, Lui; Bayer, Steven E.

    1991-01-01

    Genetic algorithms are mathematical, highly parallel, adaptive search procedures (i.e., problem solving methods) based loosely on the processes of natural genetics and Darwinian survival of the fittest. Basic genetic algorithm concepts and applications are introduced, and results are presented from a project to develop a software tool that will enable the widespread use of genetic algorithm technology.

  16. Refractive index change detection based on porous silicon microarray

    NASA Astrophysics Data System (ADS)

    Chen, Weirong; Jia, Zhenhong; Li, Peng; Lv, Guodong; Lv, Xiaoyi

    2016-05-01

    By combining photolithography with the electrochemical anodization method, a microarray device of porous silicon (PS) photonic crystal was fabricated on the crystalline silicon substrate. The optical properties of the microarray were analyzed with the transfer matrix method. The relationship between refractive index and reflectivity of each array element of the microarray at 633 nm was also studied, and the array surface reflectivity changes were observed through digital imaging. By means of the reflectivity measurement method, reflectivity changes below 10^-3 can be observed based on the PS microarray. The results of this study can be applied to the detection of biosensor arrays.

  17. Studying cellular processes and detecting disease with protein microarrays

    SciTech Connect

    Zangar, Richard C.; Varnum, Susan M.; Bollinger, Nikki

    2005-10-31

    Protein microarrays are a rapidly developing analytic tool with diverse applications in biomedical research. These applications include profiling of disease markers or autoimmune responses, understanding molecular pathways, protein modifications and protein activities. One factor that is driving this expanding usage is the wide variety of experimental formats that protein microarrays can take. In this review, we provide a short, conceptual overview of the different approaches for protein microarray. We then examine some of the most significant applications of these microarrays to date, with an emphasis on how global protein analyses can be used to facilitate biomedical research.

  18. Adaptive color image watermarking algorithm

    NASA Astrophysics Data System (ADS)

    Feng, Gui; Lin, Qiwei

    2008-03-01

    As a major method for protecting intellectual property rights, digital watermarking techniques have been widely studied and used. However, owing to the problems of data volume and color shift, watermarking techniques for color images have been studied less extensively, although color images are the principal medium in multimedia applications. Considering the characteristics of the Human Visual System (HVS), an adaptive color image watermarking algorithm is proposed in this paper. In this algorithm, the HSI color model is adopted for both the host and watermark images, and the DCT coefficients of the intensity component (I) of the host color image are used for watermark data embedding; while embedding the watermark, the number of embedded bits is adaptively changed with the complexity of the host image. The watermark image is first preprocessed by a two-level wavelet decomposition. At the same time, to enhance the anti-attack ability and security of the watermarking algorithm, the watermark image is scrambled. According to their significance, some watermark bits are selected and others discarded to form the actual embedded data. The experimental results show that the proposed watermarking algorithm is robust to several common attacks and has good perceptual quality at the same time.
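
    A hedged sketch of the embedding step described above: take block DCTs of the intensity channel and nudge one mid-frequency coefficient per block according to the watermark bit, here with a simple parity (quantization) rule. The HVS-adaptive bit allocation and the wavelet scrambling of the watermark are omitted, and all parameters are illustrative rather than the paper's.

```python
# Block-DCT watermark embedding in an intensity channel (sketch).
import numpy as np
from scipy.fft import dctn, idctn

def embed_bits(intensity, bits, block=8, coeff=(3, 2), strength=6.0):
    img = intensity.astype(float).copy()
    it = iter(bits)
    for r in range(0, img.shape[0] - block + 1, block):
        for c in range(0, img.shape[1] - block + 1, block):
            bit = next(it, None)
            if bit is None:
                return img                      # all bits embedded
            B = dctn(img[r:r + block, c:c + block], norm="ortho")
            # Force the parity of the quantized mid-band coefficient to
            # match the watermark bit; extraction reads the parity back.
            q = np.round(B[coeff] / strength)
            q += (q % 2 != bit)
            B[coeff] = q * strength
            img[r:r + block, c:c + block] = idctn(B, norm="ortho")
    return img
```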

  19. An Efficient Ensemble Learning Method for Gene Microarray Classification

    PubMed Central

    Shadgar, Bita

    2013-01-01

    Gene microarray analysis and classification have been demonstrated as an effective approach to the diagnosis of diseases and cancers. However, it has also been revealed that basic classification techniques have intrinsic drawbacks in achieving accurate gene classification and cancer diagnosis. On the other hand, classifier ensembles have received increasing attention in various applications. Here, we address the gene classification issue using the RotBoost ensemble methodology. This method is a combination of the Rotation Forest and AdaBoost techniques, which in turn preserves both desirable features of an ensemble architecture, that is, accuracy and diversity. To select a concise subset of informative genes, 5 different feature selection algorithms are considered. To assess the efficiency of RotBoost, other non-ensemble/ensemble techniques including Decision Trees, Support Vector Machines, Rotation Forest, AdaBoost, and Bagging are also deployed. Experimental results have revealed that the combination of the fast correlation-based feature selection method with the ICA-based RotBoost ensemble is highly effective for gene classification. In fact, the proposed method can create ensemble classifiers which outperform not only the classifiers produced by conventional machine learning but also the classifiers generated by two widely used conventional ensemble learning methods, that is, Bagging and AdaBoost. PMID:24024194

  20. Multiplex planar microarrays for disease prognosis, diagnosis and theranosis

    PubMed Central

    Lea, Peter

    2015-01-01

    Advanced diagnostic methods and algorithms for immune disorders provide qualitative and quantitative multiplex measurement for pre-clinical prognostic and clinical diagnostic biomarkers specific for diseases. Choice of therapy is confirmed by modulating diagnostic efficacy of companion, theranostic drug concentrations. Assay methods identify, monitor and manage autoimmune diseases, or risk thereof, in subjects who have, or who are related to individuals with, autoimmune disease. These same diagnostic protocols also integrate qualitative and quantitative assay test protocol designs for responder patient assessment, risk analysis and management of disease when integrating multiplex planar microarray diagnostic tests, patient theranostic companion diagnostic methods and test panels for simultaneous assessment and management of dysimmune and inflammatory disorders, autoimmunity, allergy and cancer. Proprietary assay methods are provided to identify, monitor and manage dysimmune conditions, or risk thereof, in subjects with pathological alterations in the immune system, or who are related to individuals with these conditions. The protocols can be used for confirmatory testing of subjects who exhibit symptoms of dysimmunity, as well as subjects who are apparently healthy and do not exhibit symptoms of altered immune function. The protocols also provide for methods of determining whether a subject has, is at risk for, or is a candidate for disease therapy, guided by companion diagnosis and immunosuppressive therapy, as well as therapeutic drug monitoring and theranostic testing of disease biomarkers in response to immuno-absorption therapy. The multiplex test panels provide the components that are integral for performing the methods to recognized clinical standards. PMID:26309820

  1. Discovering Pair-wise Synergies in Microarray Data

    PubMed Central

    Chen, Yuan; Cao, Dan; Gao, Jun; Yuan, Zheming

    2016-01-01

    Informative gene selection can have important implications for the improvement of cancer diagnosis and the identification of new drug targets. Individual-gene-ranking methods ignore interactions between genes. Furthermore, popular pair-wise gene evaluation methods, e.g. TSP and TSG, are unable to discover pair-wise interactions. Several efforts to discover pair-wise synergy have been made based on the information approach, such as EMBP and FeatKNN. However, the methods which are employed to estimate mutual information, e.g. binarization, histogram-based and KNN estimators, depend on known data or domain characteristics. Recently, Reshef et al. proposed a novel maximal information coefficient (MIC) measure to capture a wide range of associations between two variables that has the property of generality. An extension from MIC(X; Y) to MIC(X1; X2; Y) is therefore desired. We developed an approximation algorithm for estimating MIC(X1; X2; Y) where Y is a discrete variable. MIC(X1; X2; Y) is employed to detect pair-wise synergy in simulation and cancer microarray data. The results indicate that MIC(X1; X2; Y) also has the property of generality. It can discover synergic genes that are undetectable by reference feature selection methods such as MIC(X; Y) and TSG. Synergic genes can distinguish different phenotypes. Finally, the biological relevance of these synergic genes is validated with GO annotation and the OUgene database. PMID:27470995

  2. Discovering Pair-wise Synergies in Microarray Data.

    PubMed

    Chen, Yuan; Cao, Dan; Gao, Jun; Yuan, Zheming

    2016-01-01

    Informative gene selection can have important implications for the improvement of cancer diagnosis and the identification of new drug targets. Individual-gene-ranking methods ignore interactions between genes. Furthermore, popular pair-wise gene evaluation methods, e.g. TSP and TSG, are unable to discover pair-wise interactions. Several efforts to discover pair-wise synergy have been made based on the information approach, such as EMBP and FeatKNN. However, the methods which are employed to estimate mutual information, e.g. binarization, histogram-based and KNN estimators, depend on known data or domain characteristics. Recently, Reshef et al. proposed a novel maximal information coefficient (MIC) measure to capture a wide range of associations between two variables that has the property of generality. An extension from MIC(X; Y) to MIC(X1; X2; Y) is therefore desired. We developed an approximation algorithm for estimating MIC(X1; X2; Y) where Y is a discrete variable. MIC(X1; X2; Y) is employed to detect pair-wise synergy in simulation and cancer microarray data. The results indicate that MIC(X1; X2; Y) also has the property of generality. It can discover synergic genes that are undetectable by reference feature selection methods such as MIC(X; Y) and TSG. Synergic genes can distinguish different phenotypes. Finally, the biological relevance of these synergic genes is validated with GO annotation and the OUgene database. PMID:27470995
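
    The pair-wise synergy sought above is often scored as the information a gene pair carries about the phenotype jointly, beyond its parts: I(X1, X2; Y) - I(X1; Y) - I(X2; Y). The paper estimates this with a MIC-style grid search; the fixed-grid discretization below is a simplified stand-in meant only to make the quantity concrete.

```python
# Synergy score of a gene pair via discretized mutual information (sketch).
import numpy as np
from sklearn.metrics import mutual_info_score

def synergy(x1, x2, y, n_bins=5):
    """x1, x2: expression vectors; y: discrete phenotype labels."""
    b1 = np.digitize(x1, np.histogram_bin_edges(x1, n_bins))
    b2 = np.digitize(x2, np.histogram_bin_edges(x2, n_bins))
    joint = b1 * (n_bins + 2) + b2     # encode the pair as one variable
    return (mutual_info_score(joint, y)
            - mutual_info_score(b1, y)
            - mutual_info_score(b2, y))
```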

  3. Spot addressing for microarray images structured in hexagonal grids.

    PubMed

    Giannakeas, Nikolaos; Kalatzis, Fanis; Tsipouras, Markos G; Fotiadis, Dimitrios I

    2012-04-01

    In this work, an efficient method is proposed for spot addressing in images generated by the scanning of hexagonally structured microarrays. Initially, the blocks of the image are separated using the projections of the image. Next, all the blocks of the image are processed separately for the detection of each spot. The spot addressing procedure begins with the detection of the high-intensity objects, which are probably the spots of the image. Next, the Growing Concentric Hexagon algorithm, which uses the properties of the hexagonal grid, is introduced for the detection of the non-hybridized spots. Finally, the Voronoi diagram is applied to the centers of the detected spots for the gridding of the image. The method is evaluated using spots generated from the scanning of the Illumina BeadChip, which is used for the detection of single nucleotide polymorphisms in the human genome and uses a hexagonal structure for the location of the spots. For the evaluation, the detected centers for each spot in the image are compared to the centers of the annotation, obtaining up to 98% accuracy for the spot addressing procedure. PMID:21924515

  4. GeneRank: Using search engine technology for the analysis of microarray experiments

    PubMed Central

    Morrison, Julie L; Breitling, Rainer; Higham, Desmond J; Gilbert, David R

    2005-01-01

    Background Interpretation of simple microarray experiments is usually based on the fold-change of gene expression between a reference and a "treated" sample where the treatment can be of many types from drug exposure to genetic variation. Interpretation of the results usually combines lists of differentially expressed genes with previous knowledge about their biological function. Here we evaluate a method – based on the PageRank algorithm employed by the popular search engine Google – that tries to automate some of this procedure to generate prioritized gene lists by exploiting biological background information. Results GeneRank is an intuitive modification of PageRank that maintains many of its mathematical properties. It combines gene expression information with a network structure derived from gene annotations (gene ontologies) or expression profile correlations. Using both simulated and real data we find that the algorithm offers an improved ranking of genes compared to pure expression change rankings. Conclusion Our modification of the PageRank algorithm provides an alternative method of evaluating microarray experimental results which combines prior knowledge about the underlying network. GeneRank offers an improvement compared to assessing the importance of a gene based on its experimentally observed fold-change alone and may be used as a basis for further analytical developments. PMID:16176585
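
    A short sketch of the GeneRank iteration: PageRank's uniform teleportation vector is replaced by normalized expression changes, so network connectivity and fold-change are blended by the damping factor d. The power-iteration form below follows the standard GeneRank formulation, with a symmetric 0/1 adjacency matrix assumed.

```python
# GeneRank power iteration (sketch; W is a symmetric gene-gene adjacency).
import numpy as np

def generank(W, expr_change, d=0.5, tol=1e-10, max_iter=1000):
    deg = np.maximum(W.sum(axis=1), 1)             # avoid division by zero
    ex = np.abs(expr_change) / np.abs(expr_change).sum()
    r = np.full(len(ex), 1.0 / len(ex))
    for _ in range(max_iter):
        # Teleport to genes by expression change with prob. (1 - d),
        # otherwise follow a network edge from a random neighbour.
        r_new = (1 - d) * ex + d * (W @ (r / deg))
        if np.abs(r_new - r).sum() < tol:
            break
        r = r_new
    return r
```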

  5. Enzyme Microarrays Assembled by Acoustic Dispensing Technology

    PubMed Central

    Wong, E. Y.; Diamond, S. L.

    2008-01-01

    Miniaturizing bioassays to the nanoliter scale for high-throughput screening reduces the consumption of reagents that are expensive or difficult to handle. Utilizing acoustic dispensing technology, nanodroplets containing 10 µM ATP (3 µCi/µL 32P) and reaction buffer in 10% glycerol were positionally dispensed to the surface of glass slides to form 40 nL compartments (100 droplets/slide) for Pim1 (Proviral integration site 1) kinase reactions. The reactions were activated by dispensing 4 nL of various levels of a pyridocarbazolo-cyclopentadienyl ruthenium-complex Pim1 inhibitor, followed by dispensing 4 nL of a Pim1 kinase and peptide substrate solution to achieve final concentrations of 150 nM enzyme and 10 µM substrate. The microarray was incubated at 30°C (97% Rh) for 1.5 hr. The spots were then blotted to phosphocellulose membranes to capture phosphorylated substrate. Using phosphor imaging to quantify the washed membranes, the assay showed that, for doses of inhibitor from 0.75 µM to 3 µM, Pim1 was increasingly inhibited. Signal-to-background ratios were as high as 165 and average coefficients of variation (CVs) for the assay were ~20%. CVs for dispensing typical working buffers were under 5%. Thus, microarrays assembled by acoustic dispensing are promising as cost-effective tools that can be used in protein assay development. PMID:18616925

  6. Laser direct writing of biomolecule microarrays

    NASA Astrophysics Data System (ADS)

    Serra, P.; Fernández-Pradas, J. M.; Berthet, F. X.; Colina, M.; Elvira, J.; Morenza, J. L.

    Protein-based biosensors are highly efficient tools for protein detection and identification. The production of these devices requires the manipulation of tiny amounts of protein solutions in conditions preserving their biological properties. In this work, laser induced forward transfer (LIFT) was used for spotting an array of a purified bacterial antigen in order to check the viability of this technique for the production of protein microarrays. A pulsed Nd:YAG laser beam (355 nm wavelength, 10 ns pulse duration) was used to transfer droplets of a solution containing the Treponema pallidum 17 kDa protein antigen on a glass slide. Optical microscopy showed that a regular array of micrometric droplets could be precisely and uniformly spotted onto a solid substrate. Subsequently, it was proved that LIFT deposition of a T. pallidum 17 kDa antigen onto nylon-coated glass slides preserves its antigenic reactivity and diagnostic properties. These results support that LIFT is suitable for the production of protein microarrays and pave the way for future diagnostics applications.

  7. Mining microarray expression data by literature profiling

    PubMed Central

    Chaussabel, Damien; Sher, Alan

    2002-01-01

    Background The rapidly expanding fields of genomics and proteomics have prompted the development of computational methods for managing, analyzing and visualizing expression data derived from microarray screening. Nevertheless, the lack of efficient techniques for assessing the biological implications of gene-expression data remains an important obstacle in exploiting this information. Results To address this need, we have developed a mining technique based on the analysis of literature profiles generated by extracting the frequencies of certain terms from thousands of abstracts stored in the Medline literature database. Terms are then filtered on the basis of both repetitive occurrence and co-occurrence among multiple gene entries. Finally, clustering analysis is performed on the retained frequency values, shaping a coherent picture of the functional relationship among large and heterogeneous lists of genes. Such data treatment also provides information on the nature and pertinence of the associations that were formed. Conclusions The analysis of patterns of term occurrence in abstracts constitutes a means of exploring the biological significance of large and heterogeneous lists of genes. This approach should contribute to optimizing the exploitation of microarray technologies by providing investigators with an interface between complex expression data and large literature resources. PMID:12372143
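
    As a rough illustration of this literature-profiling idea, the sketch below builds term-frequency profiles per gene from pooled abstract text, filters terms by repeated occurrence within a gene's literature and co-occurrence across genes, and then clusters the genes. The thresholds and library choices are assumptions for illustration, not the authors' implementation.

    ```python
    import numpy as np
    from sklearn.feature_extraction.text import CountVectorizer
    from scipy.cluster.hierarchy import linkage, fcluster

    def profile_and_cluster(gene_abstracts, min_count=3, min_genes=2, n_clusters=4):
        """gene_abstracts: dict of gene symbol -> concatenated abstract text."""
        genes = list(gene_abstracts)
        tf = CountVectorizer(stop_words="english").fit_transform(
            [gene_abstracts[g] for g in genes]).toarray()
        # Keep terms that recur within a gene's literature (repetitive occurrence)
        # and do so for several genes (co-occurrence).
        keep = (tf >= min_count).sum(axis=0) >= min_genes
        profiles = tf[:, keep].astype(float)
        # Cluster genes on the retained term-frequency profiles (assumes each
        # gene keeps at least one non-zero term).
        tree = linkage(profiles, method="average", metric="cosine")
        labels = fcluster(tree, n_clusters, criterion="maxclust")
        return dict(zip(genes, labels))
    ```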

  8. Enzyme microarrays assembled by acoustic dispensing technology.

    PubMed

    Wong, E Y; Diamond, S L

    2008-10-01

    Miniaturizing bioassays to the nanoliter scale for high-throughput screening reduces the consumption of reagents that are expensive or difficult to handle. Through the use of acoustic dispensing technology, nanodroplets containing 10 µM ATP (3 µCi/µL 32P) and reaction buffer in 10% glycerol were positionally dispensed to the surface of glass slides to form 40-nL compartments (100 droplets/slide) for Pim1 (proviral integration site 1) kinase reactions. The reactions were activated by dispensing 4 nL of various levels of a pyridocarbazolo-cyclopentadienyl ruthenium complex Pim1 inhibitor, followed by dispensing 4 nL of a Pim1 kinase and peptide substrate solution to achieve final concentrations of 150 nM enzyme and 10 µM substrate. The microarray was incubated at 30 °C (97% Rh) for 1.5 h. The spots were then blotted to phosphocellulose membranes to capture phosphorylated substrate. With phosphor imaging to quantify the washed membranes, the assay showed that, for doses of inhibitor from 0.75 to 3 µM, Pim1 was increasingly inhibited. Signal-to-background ratios were as high as 165, and average coefficients of variation for the assay were approximately 20%. Coefficients of variation for dispensing typical working buffers were under 5%. Thus, microarrays assembled by acoustic dispensing are promising as cost-effective tools that can be used in protein assay development. PMID:18616925

  9. Preprocessing and calibration of optical diffuse reflectance signal for estimation of soil physical and chemical properties in the central USA

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Optical diffuse reflectance sensing in visible and near-infrared wavelength ranges is one approach to rapidly quantify soil properties for site-specific management. The objectives of this study were to investigate effects of preprocessing of reflectance data and determine the accuracy of the reflect...

  10. Convergence Properties of an Iterative Procedure of Ipsatizing and Standardizing a Data Matrix, with Applications to Parafac/Candecomp Preprocessing.

    ERIC Educational Resources Information Center

    ten Berge, Jos M. F.; Kiers, Henk A. L.

    1989-01-01

    Centering a matrix row-wise and rescaling it column-wise to a unit sum of squares requires an iterative procedure. It is shown that this procedure converges to a stable solution that need not be centered row-wise. The results bear directly on several types of preprocessing methods in Parafac/Candecomp. (Author/TJH)
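
    A compact sketch of the iteration discussed above; convergence is checked on the change in the matrix itself, since, as the paper notes, the stable solution need not remain row-centered. Tolerances and the iteration cap are illustrative assumptions.

    ```python
    import numpy as np

    def ipsatize_standardize(X, tol=1e-10, max_iter=500):
        """Alternate row-centering with column rescaling to unit sum of squares.
        Assumes no column becomes identically zero during the iteration."""
        X = np.array(X, dtype=float)
        prev = np.full_like(X, np.inf)
        for _ in range(max_iter):
            X = X - X.mean(axis=1, keepdims=True)      # ipsatize: center each row
            X = X / np.sqrt((X ** 2).sum(axis=0))      # unit column sum of squares
            if np.abs(X - prev).max() < tol:           # stop once the matrix is stable
                break
            prev = X.copy()
        return X
    ```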

  11. Label-free detection repeatability of protein microarrays by oblique-incidence reflectivity difference method

    NASA Astrophysics Data System (ADS)

    Dai, Jun; Li, Lin; Wang, JingYi; He, LiPing; Lu, HuiBin; Ruan, KangCheng; Jin, KuiJuan; Yang, GuoZhen

    2012-12-01

    We examine the repeatability of the oblique-incidence reflectivity difference (OIRD) method for label-free detection of biological molecular interactions using protein microarrays. The experimental results show that repeatability is consistent both within a given microarray and from microarray to microarray, indicating that OIRD is a promising label-free detection technique for biological microarrays.

  12. TAMEE: data management and analysis for tissue microarrays

    PubMed Central

    Thallinger, Gerhard G; Baumgartner, Kerstin; Pirklbauer, Martin; Uray, Martina; Pauritsch, Elke; Mehes, Gabor; Buck, Charles R; Zatloukal, Kurt; Trajanoski, Zlatko

    2007-01-01

    Background With the introduction of tissue microarrays (TMAs) researchers can investigate gene and protein expression in tissues on a high-throughput scale. TMAs generate a wealth of data calling for extended, high level data management. Enhanced data analysis and systematic data management are required for traceability and reproducibility of experiments and provision of results in a timely and reliable fashion. Robust and scalable applications have to be utilized, which allow secure data access, manipulation and evaluation for researchers from different laboratories. Results TAMEE (Tissue Array Management and Evaluation Environment) is a web-based database application for the management and analysis of data resulting from the production and application of TMAs. It facilitates storage of production and experimental parameters, of images generated throughout the TMA workflow, and of results from core evaluation. Database content consistency is achieved using structured classifications of parameters. This allows the extraction of high quality results for subsequent biologically-relevant data analyses. Tissue cores in the images of stained tissue sections are automatically located and extracted and can be evaluated using a set of predefined analysis algorithms. Additional evaluation algorithms can be easily integrated into the application via a plug-in interface. Downstream analysis of results is facilitated via a flexible query generator. Conclusion We have developed an integrated system tailored to the specific needs of research projects using high density TMAs. It covers the complete workflow of TMA production, experimental use and subsequent analysis. The system is freely available for academic and non-profit institutions from . PMID:17343750

  13. Differentiation of whole bacterial cells based on high-throughput microarray chip printing and infrared microspectroscopic readout.

    PubMed

    Al-Khaldi, Sufian F; Mossoba, Magdi M; Burke, Tara L; Fry, Frederick S

    2009-10-01

    Using robotic automation, a microarray printing protocol for whole bacterial cells was developed for subsequent label-free and nondestructive infrared microspectroscopic detection. Using this contact microspotting system, 24 microorganisms were printed on zinc selenide slides; these were 6 species of Listeria, 10 species of Vibrio, 2 strains of Photobacterium damselae, Yersinia enterocolitica 289, Bacillus cereus ATCC 14529, Staphylococcus aureus, ATCC 19075 (serotype 104 B), Shigella sonnei 20143, Klebsiella pneumoniae KP73, Enterobacter cloacae, Citrobacter freundii 200, and Escherichia coli. Microarrays consisting of separate spots of bacterial deposits gave consistent and reproducible infrared spectra, which were differentiated by unsupervised pattern recognition algorithms. Two multivariate analysis algorithms, principal component analysis and hierarchical cluster analysis, successfully separated most, but not all, of the bacteria investigated down to the species level. PMID:19630511
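
    For readers unfamiliar with the two pattern-recognition steps named above, the sketch below shows a generic PCA-plus-hierarchical-clustering pass over a matrix of IR spectra. The normalization, component count and linkage choice are illustrative assumptions, not the study's protocol.

    ```python
    import numpy as np
    from sklearn.decomposition import PCA
    from scipy.cluster.hierarchy import linkage, fcluster

    def cluster_spectra(spectra, n_components=10, n_groups=6):
        """spectra: (n_samples, n_wavenumbers) absorbance matrix."""
        X = spectra / np.linalg.norm(spectra, axis=1, keepdims=True)  # vector-normalize rows
        scores = PCA(n_components=n_components).fit_transform(X)      # unsupervised projection
        labels = fcluster(linkage(scores, method="ward"),             # hierarchical clustering
                          n_groups, criterion="maxclust")
        return scores, labels
    ```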

  14. Determination of B-Cell Epitopes in Patients with Celiac Disease: Peptide Microarrays

    PubMed Central

    Choung, Rok Seon; Marietta, Eric V.; Van Dyke, Carol T.; Brantner, Tricia L.; Rajasekaran, John; Pasricha, Pankaj J.; Wang, Tianhao; Bei, Kang; Krishna, Karthik; Krishnamurthy, Hari K.; Snyder, Melissa R.; Jayaraman, Vasanth; Murray, Joseph A.

    2016-01-01

    Background Most antibodies recognize conformational or discontinuous epitopes that have a specific 3-dimensional shape; however, determination of discontinuous B-cell epitopes is a major challenge in bioscience. Moreover, the current methods for identifying peptide epitopes often involve laborious, high-cost peptide screening programs. Here, we present a novel microarray method for identifying discontinuous B-cell epitopes in celiac disease (CD) by using a silicon-based peptide array and computational methods. Methods Using a novel silicon-based microarray platform with a multi-pillar chip, overlapping 12-mer peptide sequences of all native and deamidated gliadins, which are known to trigger CD, were synthesized in situ and used to identify peptide epitopes. Results Using a computational algorithm that considered disease specificity of peptide sequences, 2 distinct epitope sets were identified. Further, by combining the most discriminative 3-mer gliadin sequences with randomly interpolated 3- or 6-mer peptide sequences, novel discontinuous epitopes were identified and further optimized to maximize disease discrimination. The final discontinuous epitope sets were tested in a confirmatory cohort of CD patients and controls, yielding 99% sensitivity and 100% specificity. Conclusions These novel sets of epitopes derived from gliadin have a high degree of accuracy in differentiating CD from controls, compared with standard serologic tests. The method of ultra-high-density peptide microarray described here would be broadly useful to develop high-fidelity diagnostic tests and explore pathogenesis. PMID:26824466

  15. GEPAS, a web-based tool for microarray data analysis and interpretation

    PubMed Central

    Tárraga, Joaquín; Medina, Ignacio; Carbonell, José; Huerta-Cepas, Jaime; Minguez, Pablo; Alloza, Eva; Al-Shahrour, Fátima; Vegas-Azcárate, Susana; Goetz, Stefan; Escobar, Pablo; Garcia-Garcia, Francisco; Conesa, Ana; Montaner, David; Dopazo, Joaquín

    2008-01-01

    Gene Expression Profile Analysis Suite (GEPAS) is one of the most complete and extensively used web-based packages for microarray data analysis. During its more than 5 years of activity it has continuously been updated to keep pace with the state-of-the-art in the changing microarray data analysis arena. GEPAS offers diverse analysis options that include well established as well as novel algorithms for normalization, gene selection, class prediction, clustering and functional profiling of the experiment. New options for time-course (or dose-response) experiments, microarray-based class prediction, new clustering methods and new tests for differential expression have been included. The new pipeliner module allows automating the execution of sequential analysis steps by means of a simple but powerful graphic interface. An extensive re-engineering of GEPAS has been carried out which includes the use of web services and Web 2.0 technology features, a new user interface with persistent sessions and a new extended database of gene identifiers. GEPAS is nowadays the most cited web tool in its field; it is extensively used by researchers in many countries, and its records indicate an average usage rate of 500 experiments per day. GEPAS is available at http://www.gepas.org. PMID:18508806

  16. Gene selection and classification for cancer microarray data based on machine learning and similarity measures

    PubMed Central

    2011-01-01

    Background Microarray data have a high dimension of variables and a small sample size. In microarray data analyses, two important issues are how to choose genes, which provide reliable and good prediction for disease status, and how to determine the final gene set that is best for classification. Associations among genetic markers mean one can exploit information redundancy to potentially reduce classification cost in terms of time and money. Results To deal with redundant information and improve classification, we propose a gene selection method, Recursive Feature Addition, which combines supervised learning and statistical similarity measures. To determine the final optimal gene set for prediction and classification, we propose an algorithm, Lagging Prediction Peephole Optimization. By using six benchmark microarray gene expression data sets, we compared Recursive Feature Addition with recently developed gene selection methods: Support Vector Machine Recursive Feature Elimination, Leave-One-Out Calculation Sequential Forward Selection and several others. Conclusions On average, with the use of popular learning machines including Nearest Mean Scaled Classifier, Support Vector Machine, Naive Bayes Classifier and Random Forest, Recursive Feature Addition outperformed other methods. Our studies also showed that Lagging Prediction Peephole Optimization is superior to random strategy; Recursive Feature Addition with Lagging Prediction Peephole Optimization obtained better testing accuracies than the gene selection method varSelRF. PMID:22369383
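
    The greedy flavour of Recursive Feature Addition can be sketched as follows: repeatedly add the candidate gene that most improves cross-validated accuracy, preferring, among near-equals, genes least correlated with those already chosen. The classifier, fold count and redundancy measure below are illustrative assumptions rather than the published procedure.

    ```python
    import numpy as np
    from sklearn.naive_bayes import GaussianNB
    from sklearn.model_selection import cross_val_score

    def recursive_feature_addition(X, y, n_genes=20, clf=None):
        """X: (n_samples, n_genes); y: class labels. Returns selected gene indices."""
        clf = clf or GaussianNB()
        chosen, remaining = [], list(range(X.shape[1]))
        while len(chosen) < n_genes and remaining:
            candidates = []
            for g in remaining:
                acc = cross_val_score(clf, X[:, chosen + [g]], y, cv=5).mean()
                # Redundancy with already-selected genes (statistical similarity).
                red = max((abs(np.corrcoef(X[:, g], X[:, c])[0, 1]) for c in chosen),
                          default=0.0)
                candidates.append((acc, -red, g))    # high accuracy, low redundancy
            _, _, best = max(candidates)
            chosen.append(best)
            remaining.remove(best)
        return chosen
    ```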

  17. ZODET: Software for the Identification, Analysis and Visualisation of Outlier Genes in Microarray Expression Data

    PubMed Central

    Roden, Daniel L.; Sewell, Gavin W.; Lobley, Anna; Levine, Adam P.; Smith, Andrew M.; Segal, Anthony W.

    2014-01-01

    Summary Complex human diseases can show significant heterogeneity between patients with the same phenotypic disorder. An outlier detection strategy was developed to identify variants at the level of gene transcription that are of potential biological and phenotypic importance. Here we describe a graphical software package (z-score outlier detection (ZODET)) that enables identification and visualisation of gross abnormalities in gene expression (outliers) in individuals, using whole genome microarray data. The mean and standard deviation of expression in a healthy control cohort are used to detect both over- and under-expressed probes in individual test subjects. We compared the potential of ZODET to detect outlier genes in gene expression datasets with a previously described statistical method, gene tissue index (GTI), using a simulated expression dataset and a publicly available monocyte-derived macrophage microarray dataset. Taken together, these results support ZODET as a novel approach to identify outlier genes of potential pathogenic relevance in complex human diseases. The algorithm is implemented using R packages and Java. Availability The software is freely available from http://www.ucl.ac.uk/medicine/molecular-medicine/publications/microarray-outlier-analysis. PMID:24416128
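
    The core z-score computation is simple enough to state directly. The sketch below (the cut-off value is an illustrative assumption) flags grossly over- or under-expressed probes in one test subject against a healthy control cohort.

    ```python
    import numpy as np

    def zscore_outliers(controls, subject, z_cut=3.0):
        """controls: (n_controls, n_probes); subject: (n_probes,)."""
        mu = controls.mean(axis=0)
        sd = controls.std(axis=0, ddof=1)
        sd = np.where(sd > 0, sd, np.inf)      # zero-variance probes are never flagged
        z = (subject - mu) / sd
        return np.where(z > z_cut)[0], np.where(z < -z_cut)[0]   # over-, under-expressed
    ```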

  18. Unsupervised assessment of microarray data quality using a Gaussian mixture model

    PubMed Central

    Howard, Brian E; Sick, Beate; Heber, Steffen

    2009-01-01

    Background Quality assessment of microarray data is an important and often challenging aspect of gene expression analysis. This task frequently involves the examination of a variety of summary statistics and diagnostic plots. The interpretation of these diagnostics is often subjective, and generally requires careful expert scrutiny. Results We show how an unsupervised classification technique based on the Expectation-Maximization (EM) algorithm and the naïve Bayes model can be used to automate microarray quality assessment. The method is flexible and can be easily adapted to accommodate alternate quality statistics and platforms. We evaluate our approach using Affymetrix 3' gene expression and exon arrays and compare the performance of this method to a similar supervised approach. Conclusion This research illustrates the efficacy of an unsupervised classification approach for the purpose of automated microarray data quality assessment. Since our approach requires only unannotated training data, it is easy to customize and to keep up-to-date as technology evolves. In contrast to other "black box" classification systems, this method also allows for intuitive explanations. PMID:19545436
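
    A toy version of the idea: fit a two-component mixture with diagonal covariances (matching the conditional-independence assumption of a naïve Bayes model) to per-array quality statistics and call one component "problematic". The statistics, the component-labeling rule and the library are assumptions for illustration, not the paper's implementation.

    ```python
    import numpy as np
    from sklearn.mixture import GaussianMixture

    def flag_arrays(quality_stats):
        """quality_stats: (n_arrays, n_statistics), e.g. background level,
        degradation slope, percent-present calls (illustrative choices)."""
        gm = GaussianMixture(n_components=2, covariance_type="diag",
                             random_state=0).fit(quality_stats)
        labels = gm.predict(quality_stats)
        # Assumption: the first statistic increases on poor-quality arrays,
        # so the component with the larger mean there is "problematic".
        bad_component = int(np.argmax(gm.means_[:, 0]))
        return labels == bad_component, gm.predict_proba(quality_stats)
    ```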

  19. A centroid-based gene selection method for microarray data classification.

    PubMed

    Guo, Shun; Guo, Donghui; Chen, Lifei; Jiang, Qingshan

    2016-07-01

    For classification problems based on microarray data, the data typically contains a large number of irrelevant and redundant features. In this paper, a new gene selection method is proposed to choose the best subset of features for microarray data with the irrelevant and redundant features removed. We formulate the selection problem as an L1-regularized optimization problem, based on a newly defined linear discriminant analysis criterion. Instead of calculating the mean of the samples, a kernel-based approach is used to estimate the class centroid to define both the between-class separability and the within-class compactness for the criterion. Theoretical analysis indicates that the globally optimal solution of the L1-regularized criterion can be reached under a general condition, from which an efficient algorithm for the feature selection problem is derived, with time complexity linear in the number of features and the number of samples. The experimental results on ten publicly available microarray datasets demonstrate that the proposed method performs effectively and competitively compared with state-of-the-art methods. PMID:27056739
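
    As a simplified stand-in for the criterion described above (plain class means rather than the paper's kernel-estimated centroids, and no L1-regularized optimization), one can score each gene by between-class centroid separation relative to within-class compactness and keep the top-ranked genes.

    ```python
    import numpy as np

    def centroid_separability(X, y):
        """X: (n_samples, n_genes); y: class labels. Returns one score per gene."""
        classes = np.unique(y)
        overall = X.mean(axis=0)
        between = np.zeros(X.shape[1])
        within = np.zeros(X.shape[1])
        for c in classes:
            Xc = X[y == c]
            between += len(Xc) * (Xc.mean(axis=0) - overall) ** 2   # separability
            within += ((Xc - Xc.mean(axis=0)) ** 2).sum(axis=0)     # compactness
        return between / (within + 1e-12)

    # Example: keep the 50 best-separating genes.
    # top = np.argsort(-centroid_separability(X, y))[:50]
    ```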

  20. Multiplexed fluorescent microarray for human salivary protein analysis using polymer microspheres and fiber-optic bundles.

    PubMed

    Nie, Shuai; Benito-Peña, Elena; Zhang, Huaibin; Wu, Yue; Walt, David R

    2013-01-01

    Herein, we describe a protocol for simultaneously measuring six proteins in saliva using a fiber-optic microsphere-based antibody array. The immuno-array technology employed combines the advantages of microsphere-based suspension array fabrication with the use of fluorescence microscopy. As described in the video protocol, commercially available 4.5 μm polymer microspheres were encoded into seven different types, differentiated by the concentration of two fluorescent dyes physically trapped inside the microspheres. The encoded microspheres containing surface carboxyl groups were modified with monoclonal capture antibodies through EDC/NHS coupling chemistry. To assemble the protein microarray, the different types of encoded and functionalized microspheres were mixed and randomly deposited in 4.5 μm microwells, which were chemically etched at the proximal end of a fiber-optic bundle. The fiber-optic bundle was used as both a carrier and for imaging the microspheres. Once assembled, the microarray was used to capture proteins in the saliva supernatant collected from the clinic. The detection was based on a sandwich immunoassay using a mixture of biotinylated detection antibodies for different analytes with a streptavidin-conjugated fluorescent probe, R-phycoerythrin. The microarray was imaged by fluorescence microscopy in three different channels, two for microsphere registration and one for the assay signal. The fluorescence micrographs were then decoded and analyzed using a homemade algorithm in MATLAB. PMID:24145242

  1. Stability of gene contributions and identification of outliers in multivariate analysis of microarray data

    PubMed Central

    Baty, Florent; Jaeger, Daniel; Preiswerk, Frank; Schumacher, Martin M; Brutsche, Martin H

    2008-01-01

    Background Multivariate ordination methods are powerful tools for the exploration of complex data structures present in microarray data. These methods have several advantages compared to common gene-by-gene approaches. However, due to their exploratory nature, multivariate ordination methods do not allow direct statistical testing of the stability of genes. Results In this study, we developed a computationally efficient algorithm for: i) the assessment of the significance of gene contributions and ii) the identification of sample outliers in multivariate analysis of microarray data. The approach is based on the use of resampling methods including bootstrapping and jackknifing. A statistical package of R functions was developed. This package includes tools for both inferring the statistical significance of gene contributions and identifying outliers among samples. Conclusion The methodology was successfully applied to three published data sets with varying levels of signal intensities. Its relevance was compared with alternative methods. Overall, it proved to be particularly effective for the evaluation of the stability of microarray data. PMID:18570644
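
    The resampling idea can be illustrated with an ordinary bootstrap of first-axis gene loadings from a PCA ordination. The use of PCA, the replicate count and the percentile interval are illustrative assumptions; the paper's package covers several ordination methods and jackknifing as well.

    ```python
    import numpy as np
    from sklearn.decomposition import PCA

    def bootstrap_loadings(X, n_boot=500, seed=0):
        """X: (n_samples, n_genes). Flags genes whose first-axis loading has a
        bootstrap 95% interval excluding zero (a 'stable' contribution)."""
        rng = np.random.default_rng(seed)
        ref = PCA(n_components=1).fit(X).components_[0]
        boots = np.empty((n_boot, X.shape[1]))
        for b in range(n_boot):
            idx = rng.integers(0, X.shape[0], X.shape[0])     # resample the samples
            comp = PCA(n_components=1).fit(X[idx]).components_[0]
            boots[b] = comp if comp @ ref >= 0 else -comp     # resolve sign ambiguity
        lo, hi = np.percentile(boots, [2.5, 97.5], axis=0)
        return (lo > 0) | (hi < 0), (lo, hi)
    ```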

  2. Scientific data products and the data pre-processing subsystem of the Chang'e-3 mission

    NASA Astrophysics Data System (ADS)

    Tan, Xu; Liu, Jian-Jun; Li, Chun-Lai; Feng, Jian-Qing; Ren, Xin; Wang, Fen-Fei; Yan, Wei; Zuo, Wei; Wang, Xiao-Qian; Zhang, Zhou-Bin

    2014-12-01

    The Chang'e-3 (CE-3) mission is China's first exploration mission on the surface of the Moon that uses a lander and a rover. Eight instruments that form the scientific payloads have the following objectives: (1) investigate the morphological features and geological structures at the landing site; (2) integrated in-situ analysis of minerals and chemical compositions; (3) integrated exploration of the structure of the lunar interior; (4) exploration of the lunar-terrestrial space environment and the lunar surface environment, and acquisition of Moon-based ultraviolet astronomical observations. The Ground Research and Application System (GRAS) is in charge of data acquisition and pre-processing, management of the payload in orbit, and managing the data products and their applications. The Data Pre-processing Subsystem (DPS) is a part of GRAS. The task of DPS is the pre-processing of raw data from the eight instruments that are part of CE-3, including channel processing, unpacking, package sorting, calibration and correction, identification of geographical location, calculation of the probe azimuth and zenith angles and the solar azimuth and zenith angles, and quality checks. These processes produce Level 0, Level 1 and Level 2 data. The computing platform of this subsystem is comprised of a high-performance computing cluster, including a real-time subsystem used for processing Level 0 data and a post-time subsystem for generating Level 1 and Level 2 data. This paper describes the CE-3 data pre-processing method, the data pre-processing subsystem, data classification, data validity and data products that are used for scientific studies.

  3. The Importance of Normalization on Large and Heterogeneous Microarray Datasets

    EPA Science Inventory

    DNA microarray technology is a powerful functional genomics tool increasingly used for investigating global gene expression in environmental studies. Microarrays can also be used in identifying biological networks, as they give insight on the complex gene-to-gene interactions, ne...

  4. Experimental Approaches to Microarray Analysis of Tumor Samples

    ERIC Educational Resources Information Center

    Furge, Laura Lowe; Winter, Michael B.; Meyers, Jacob I.; Furge, Kyle A.

    2008-01-01

    Comprehensive measurement of gene expression using high-density nucleic acid arrays (i.e. microarrays) has become an important tool for investigating the molecular differences in clinical and research samples. Consequently, inclusion of discussion in biochemistry, molecular biology, or other appropriate courses of microarray technologies has…

  5. Removal of hybridization and scanning noise from microarrays.

    PubMed

    Gopalappa, Chaitra; Das, Tapas K; Enkemann, Steven; Eschrich, Steven

    2009-09-01

    Microarray technology for measuring gene expression values has created significant opportunities for advances in disease diagnosis and individualized treatment planning. However, the random noise introduced by the sample preparation, hybridization, and scanning stages of microarray processing creates significant inaccuracies in the gene expression levels, and hence presents a major barrier in realizing the anticipated advances. The literature presents several methodologies for noise reduction, which can be broadly categorized as: 1) model based approaches for estimation and removal of hybridization noise; 2) approaches using commonly available image denoising tools; and 3) approaches involving the need for control sample(s). In this paper, we present a novel methodology for identifying and removing hybridization and scanning noise from microarray images, using a dual-tree-complex-wavelet-transform-based multiresolution analysis coupled with bivariate shrinkage thresholding. The key features of our methodology include consideration of inherent features and type of noise specific to microarray images, and the ability to work with a single microarray without needing a control. Our methodology is first benchmarked on a fabricated dataset that mimics a real microarray probe dataset. Thereafter, our methodology is tested on datasets obtained from a number of Affymetrix GeneChip human genome HG-U133 Plus 2.0 arrays, processed on the HCT-116 cell line at the Microarray Core Facility of Moffitt Cancer Center and Research Institute. The results indicate an appreciable improvement in the quality of the microarray data. PMID:20051337
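
    For orientation, the sketch below shows plain discrete-wavelet soft-threshold denoising of a scanned image. It is a deliberately simplified stand-in, not the paper's dual-tree complex wavelet transform with bivariate shrinkage; the wavelet, decomposition level and threshold rule are assumptions.

    ```python
    import numpy as np
    import pywt

    def denoise_scan(img, wavelet="db4", level=3):
        """Soft-threshold wavelet denoising of a 2-D scan (simplified stand-in)."""
        coeffs = pywt.wavedec2(np.asarray(img, dtype=float), wavelet, level=level)
        # Robust noise estimate from the finest diagonal subband (median absolute deviation).
        sigma = np.median(np.abs(coeffs[-1][-1])) / 0.6745
        thr = sigma * np.sqrt(2 * np.log(img.size))           # universal threshold
        denoised = [coeffs[0]] + [
            tuple(pywt.threshold(c, thr, mode="soft") for c in detail)
            for detail in coeffs[1:]
        ]
        return pywt.waverec2(denoised, wavelet)
    ```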

  6. Applications of microarray technology in breast cancer research

    PubMed Central

    Cooper, Colin S

    2001-01-01

    Microarrays provide a versatile platform for utilizing information from the Human Genome Project to benefit human health. This article reviews the ways in which microarray technology may be used in breast cancer research. Its diverse applications include monitoring chromosome gains and losses, tumour classification, drug discovery and development, DNA resequencing, mutation detection and investigating the mechanism of tumour development. PMID:11305951

  7. Demonstrating a Multi-drug Resistant Mycobacterium tuberculosis Amplification Microarray

    PubMed Central

    Linger, Yvonne; Kukhtin, Alexander; Golova, Julia; Perov, Alexander; Qu, Peter; Knickerbocker, Christopher; Cooney, Christopher G.; Chandler, Darrell P.

    2014-01-01

    Simplifying microarray workflow is a necessary first step for creating MDR-TB microarray-based diagnostics that can be routinely used in lower-resource environments. An amplification microarray combines asymmetric PCR amplification, target size selection, target labeling, and microarray hybridization within a single solution and into a single microfluidic chamber. A batch processing method is demonstrated with a 9-plex asymmetric master mix and low-density gel element microarray for genotyping multi-drug resistant Mycobacterium tuberculosis (MDR-TB). The protocol described here can be completed in 6 hr and provide correct genotyping with at least 1,000 cell equivalents of genomic DNA. Incorporating on-chip wash steps is feasible, which will result in an entirely closed amplicon method and system. The extent of multiplexing with an amplification microarray is ultimately constrained by the number of primer pairs that can be combined into a single master mix and still achieve desired sensitivity and specificity performance metrics, rather than the number of probes that are immobilized on the array. Likewise, the total analysis time can be shortened or lengthened depending on the specific intended use, research question, and desired limits of detection. Nevertheless, the general approach significantly streamlines microarray workflow for the end user by reducing the number of manually intensive and time-consuming processing steps, and provides a simplified biochemical and microfluidic path for translating microarray-based diagnostics into routine clinical practice. PMID:24796567

  8. Application of Microarray Technology to Investigate Salmonella and Antimicrobial Resistance

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Microarrays have been developed for the study of various aspects of Salmonella, which is a model system for investigating pathogenesis. Microarrays were used to analyze the gene expression of Salmonella in various environments that mimic the host environment and these studies have helped to elucidat...

  9. An automatic method for producing robust regression models from hyperspectral data using multiple simple genetic algorithms

    NASA Astrophysics Data System (ADS)

    Sykas, Dimitris; Karathanassi, Vassilia

    2015-06-01

    This paper presents a new method for automatically determining the optimum regression model, which enables the estimation of a parameter. The concept lies in the combination of k spectral pre-processing algorithms (SPPAs) that enhance spectral features correlated to the desired parameter. Initially, a pre-processing algorithm takes a single spectral signature as input and transforms it according to the SPPA function. A k-step combination of SPPAs applies k pre-processing algorithms serially: the result of each SPPA is used as input to the next, and so on, until the k desired pre-processed signatures are reached. These signatures are then used as input to three different regression methods: the Normalized band Difference Regression (NDR), the Multiple Linear Regression (MLR) and the Partial Least Squares Regression (PLSR). Three Simple Genetic Algorithms (SGAs) are used, one for each regression method, to select the optimum combination of k SPPAs. The performance of the SGAs is evaluated based on the RMS error of the regression models. The evaluation indicates not only the optimum SPPA combination but also the regression method that produces the optimum prediction model. The proposed method was applied to soil spectral measurements in order to predict Soil Organic Matter (SOM). In this study, the maximum value assigned to k was 3. PLSR yielded the highest accuracy, while NDR's accuracy was satisfactory given its low complexity. The MLR method showed severe drawbacks in the presence of noise, owing to collinearity among the spectral bands. Most of the regression methods required a 3-step combination of SPPAs to achieve the highest performance. The selected pre-processing algorithms differed for each regression method, since each regression method handles the explanatory variables in a different way.
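
    The chained-preprocessing idea is easy to demonstrate with a tiny stand-in: enumerate k-step chains of candidate SPPAs, score each chain by cross-validated PLSR performance, and keep the best. An exhaustive search is used here instead of the paper's simple genetic algorithms, and the candidate SPPAs are illustrative choices.

    ```python
    import numpy as np
    from itertools import product
    from sklearn.cross_decomposition import PLSRegression
    from sklearn.model_selection import cross_val_score

    SPPAS = {                                   # illustrative pre-processing steps
        "none":  lambda s: s,
        "deriv": lambda s: np.gradient(s, axis=1),
        "snv":   lambda s: (s - s.mean(1, keepdims=True)) / s.std(1, keepdims=True),
        "log":   lambda s: np.log1p(np.clip(s, 0, None)),
    }

    def best_chain(spectra, y, k=3):
        """Score every k-step SPPA chain by cross-validated PLSR R^2."""
        best_score, best_combo = -np.inf, None
        for chain in product(SPPAS, repeat=k):
            X = spectra
            for name in chain:                  # apply the SPPAs serially
                X = SPPAS[name](X)
            score = cross_val_score(PLSRegression(n_components=5), X, y, cv=5).mean()
            if score > best_score:
                best_score, best_combo = score, chain
        return best_combo, best_score
    ```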

  10. HoughFeature, a novel method for assessing drug effects in three-color cDNA microarray experiments

    PubMed Central

    Zhao, Hongya; Yan, Hong

    2007-01-01

    Background Three-color microarray experiments can be performed to assess drug effects on the genomic scale. The methodology may be useful in shortening the cycle, reducing the cost, and improving the efficiency in drug discovery and development compared with the commonly used dual-color technology. A visualization tool, the hexaMplot, is able to show the interrelations of gene expressions in normal-disease-drug samples in three-color microarray data. However, it is not enough to assess the complicated drug therapeutic effects based on the plot alone. It is important to explore more effective tools so that a deeper insight into gene expression patterns can be gained with three-color microarrays. Results Based on the celebrated Hough transform, a novel algorithm, HoughFeature, is proposed to extract line features in the hexaMplot corresponding to different drug effects. Drug therapy results can then be divided into a number of levels in relation to different groups of genes. We apply the framework to experimental microarray data to assess the complex effects of Rg1 (an extract of Chinese medicine) on Hcy-related HUVECs in detail. Differentially expressed genes are classified into 15 functional groups corresponding to different levels of drug effects. Conclusion Our study shows that the HoughFeature algorithm can reveal natural cluster patterns in gene expression data of normal-disease-drug samples. It provides both qualitative and quantitative information about up- or down-regulated genes. The methodology can be employed to predict disease susceptibility in gene therapy and assess drug effects on the disease based on three-color microarray data. PMID:17634089
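
    The line-extraction step can be approximated with the classical Hough transform on a binarized hexaMplot-style image. This is a generic stand-in via scikit-image, not the authors' HoughFeature implementation, and the peak count is an assumption.

    ```python
    import numpy as np
    from skimage.transform import hough_line, hough_line_peaks

    def extract_lines(binary_img, n_lines=15):
        """Return (votes, angle, distance) for the dominant lines; each line
        groups genes sharing a common level of drug effect."""
        h, angles, dists = hough_line(binary_img)
        return list(zip(*hough_line_peaks(h, angles, dists, num_peaks=n_lines)))
    ```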

  11. A Comparative Study of Normalization Methods Used in Statistical Analysis of Oligonucleotide Microarray Data

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Normalization methods used in the statistical analysis of oligonucleotide microarray data were evaluated. The oligonucleotide microarray is considered an efficient analytical tool for analyzing thousands of genes simultaneously in a single experiment. However, systematic variation in microarray, ori...

  12. Genomic-Wide Analysis with Microarrays in Human Oncology

    PubMed Central

    Inaoka, Kenichi; Inokawa, Yoshikuni; Nomoto, Shuji

    2015-01-01

    DNA microarray technologies have advanced rapidly and had a profound impact on examining gene expression on a genomic scale in research. This review discusses the history and development of microarray and DNA chip devices, and specific microarrays are described along with their methods and applications. In particular, microarrays have detected many novel cancer-related genes by comparing cancer tissues and non-cancerous tissues in oncological research. Recently, new methods have been in development, such as the double-combination array and triple-combination array, which allow more effective analysis of gene expression and epigenetic changes. Analysis of gene expression alterations in precancerous regions compared with normal regions and array analysis in drug-resistant cancer tissues are also successfully performed. Although next-generation sequencing is a similar method of genome analysis, several important differences distinguish these two techniques and their applications. Development of novel microarray technologies is expected to contribute to further cancer research.

  13. An ultralow background substrate for protein microarray technology.

    PubMed

    Feng, Hui; Zhang, Qingyang; Ma, Hongwei; Zheng, Bo

    2015-08-21

    We herein report an ultralow background substrate for protein microarrays. Conventional protein microarray substrates often suffer from non-specific protein adsorption and inhomogeneous spot morphology. Consequently, surface treatment and a suitable printing solution are required to improve the microarray performance. In the current work, we improved the situation by developing a new microarray substrate based on a fluorinated ethylene propylene (FEP) membrane. A polydopamine microspot array was fabricated on the FEP membrane, with proteins conjugated to the FEP surface through polydopamine. Uniform microspots were obtained on FEP without the application of a special printing solution. The modified FEP membrane demonstrated ultralow background signal and was applied in protein and peptide microarray analysis. PMID:26134063

  14. Tools and databases of the KOMICS web portal for preprocessing, mining, and dissemination of metabolomics data.

    PubMed

    Sakurai, Nozomu; Ara, Takeshi; Enomoto, Mitsuo; Motegi, Takeshi; Morishita, Yoshihiko; Kurabayashi, Atsushi; Iijima, Yoko; Ogata, Yoshiyuki; Nakajima, Daisuke; Suzuki, Hideyuki; Shibata, Daisuke

    2014-01-01

    A metabolome--the collection of comprehensive quantitative data on metabolites in an organism--has been increasingly utilized for applications such as data-intensive systems biology, disease diagnostics, biomarker discovery, and assessment of food quality. A considerable number of tools and databases have been developed to date for the analysis of data generated by various combinations of chromatography and mass spectrometry. We report here a web portal named KOMICS (The Kazusa Metabolomics Portal), where the tools and databases that we developed are available for free to academic users. KOMICS includes the tools and databases for preprocessing, mining, visualization, and publication of metabolomics data. Improvements in the annotation of unknown metabolites and dissemination of comprehensive metabolomic data are the primary aims behind the development of this portal. For this purpose, PowerGet and FragmentAlign include a manual curation function for the results of metabolite feature alignments. A metadata-specific wiki-based database, Metabolonote, functions as a hub of web resources related to the submitters' work. This feature is expected to increase citation of the submitters' work, thereby promoting data publication. As an example of the practical use of KOMICS, a workflow for a study on Jatropha curcas is presented. The tools and databases available at KOMICS should contribute to enhanced production, interpretation, and utilization of metabolomic Big Data. PMID:24949426

  15. Tree leaves extraction in natural images: comparative study of preprocessing tools and segmentation methods.

    PubMed

    Grand-Brochier, Manuel; Vacavant, Antoine; Cerutti, Guillaume; Kurtz, Camille; Weber, Jonathan; Tougne, Laure

    2015-05-01

    In this paper, we propose a comparative study of various segmentation methods applied to the extraction of tree leaves from natural images. This study follows the design of a mobile application, developed by Cerutti et al. (published in ReVeS Participation--Tree Species Classification Using Random Forests and Botanical Features. CLEF 2012), to highlight the impact of the choices made for the segmentation step. All the tests are based on a database of 232 images of tree leaves depicted on natural backgrounds from smartphone acquisitions. We also study the improvements, in terms of performance, obtained using preprocessing tools, such as the interaction between the user and the application through an input stroke, as well as the use of color distance maps. The results presented in this paper show that the method developed by Cerutti et al. (denoted Guided Active Contour) obtains the best score for almost all observation criteria. Finally, we detail our online benchmark composed of 14 unsupervised methods and 6 supervised ones. PMID:25667351

  16. Robust preprocessing for stimulus-based functional MRI of the moving fetus.

    PubMed

    You, Wonsang; Evangelou, Iordanis E; Zun, Zungho; Andescavage, Nickie; Limperopoulos, Catherine

    2016-04-01

    Fetal motion manifests as signal degradation and image artifact in the acquired time series of blood oxygen level dependent (BOLD) functional magnetic resonance imaging (fMRI) studies. We present a robust preprocessing pipeline to specifically address fetal and placental motion-induced artifacts in stimulus-based fMRI with slowly cycled block design in the living fetus. In the proposed pipeline, motion correction is optimized to the experimental paradigm, and it is performed separately in each phase as well as in each region of interest (ROI), recognizing that each phase and organ experiences different types of motion. To obtain the averaged BOLD signals for each ROI, both misaligned volumes and noisy voxels are automatically detected and excluded, and the missing data are then imputed by statistical estimation based on local polynomial smoothing. Our experimental results demonstrate that the proposed pipeline was effective in mitigating the motion-induced artifacts in stimulus-based fMRI data of the fetal brain and placenta. PMID:27081665
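
    The imputation step lends itself to a short sketch: excluded time points in a per-ROI BOLD series are re-estimated from a local polynomial fit over neighbouring valid samples. The window size and polynomial order below are illustrative assumptions, not the pipeline's settings.

    ```python
    import numpy as np

    def impute_bold(signal, valid, order=3, half_window=15):
        """signal: (T,) averaged BOLD series; valid: (T,) boolean mask of kept points."""
        t = np.arange(signal.size)
        out = signal.astype(float).copy()
        for i in np.where(~valid)[0]:
            lo, hi = max(0, i - half_window), min(signal.size, i + half_window + 1)
            mask = valid[lo:hi]
            tt, vv = t[lo:hi][mask], out[lo:hi][mask]
            if tt.size > order:                       # enough neighbours to fit
                out[i] = np.polyval(np.polyfit(tt, vv, order), i)
        return out
    ```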

  17. Pre-processing in sentence comprehension: Sensitivity to likely upcoming meaning and structure

    PubMed Central

    DeLong, Katherine A.; Troyer, Melissa; Kutas, Marta

    2016-01-01

    For more than a decade, views of sentence comprehension have been shifting toward wider acceptance of a role for linguistic pre-processing—that is, anticipation, expectancy, (neural) pre-activation, or prediction—of upcoming semantic content and syntactic structure. In this survey, we begin by examining the implications of each of these “brands” of predictive comprehension, including the issue of potential costs and consequences to not encountering highly constrained sentence input. We then describe a number of studies (many using online methodologies) that provide results consistent with prospective sensitivity to various grains and levels of semantic and syntactic information, acknowledging that such pre-processing is likely to occur in other linguistic and extralinguistic domains, as well. This review of anticipatory findings also includes some discussion on the relationship of priming to prediction. We conclude with a brief examination of some possible limits to prediction, and with a suggestion for future work to probe whether and how various strands of prediction may integrate during real-time comprehension.

  18. Noise reduction in ultrasonic computerized tomography by preprocessing for projection data

    NASA Astrophysics Data System (ADS)

    Norose, Yoko; Mizutani, Koichi; Wakatsuki, Naoto; Ebihara, Tadashi

    2015-07-01

    In this study, ultrasonic computerized tomography (CT) based on time-of-flight (TOF) measurements has been used for the nondestructive inspection of steel billets with high acoustic attenuation. One of the remaining problems of this method is noise in CT images, which makes it difficult to distinguish defects from noise. Conventionally, noise is suppressed by a low-pass filter (LPF) in the process of filtered back projection (FBP). However, it has been found that there is residual noise even after filtering. To cope with this problem, the noise observed in ultrasonic testing was examined in this study. As a result, it was found that the TOF data used for CT processing contain impulse noise, which remains in the CT image even after filtering, owing to transducer directivity. To remove impulse noise selectively, we propose a noise reduction technique for ultrasonic CT for steel billet inspection, namely, preprocessing (outlier detection and removal) of the TOF data. The performance of the proposed technique was evaluated experimentally. The obtained results suggest that the proposed technique can remove impulse noise selectively and markedly improve the quality of the CT image. Hence, the proposed technique can improve the performance of ultrasonic CT for steel billet inspection.
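
    The proposed pre-processing amounts to removing impulse-like outliers from each TOF projection before filtered back projection. A minimal version compares each sample with a running median and replaces points that deviate too far; the kernel size and threshold below are illustrative assumptions.

    ```python
    import numpy as np
    from scipy.signal import medfilt

    def clean_tof(tof, kernel=5, k=3.0):
        """tof: (n,) time-of-flight samples of one projection."""
        med = medfilt(tof, kernel_size=kernel)          # running median (kernel must be odd)
        resid = tof - med
        scale = 1.4826 * np.median(np.abs(resid - np.median(resid)))  # robust sigma
        outliers = np.abs(resid) > k * max(scale, 1e-12)
        cleaned = np.where(outliers, med, tof)          # replace impulse noise only
        return cleaned, outliers
    ```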

  19. The effects of physical and chemical preprocessing on the flowability of corn stover

    SciTech Connect

    Crawford, Nathan C.; Nagle, Nick; Sievers, David A.; Stickel, Jonathan J.

    2015-12-20

    Continuous and reliable feeding of biomass is essential for successful biofuel production. However, the challenges associated with biomass solids handling are commonly overlooked. In this study, we examine the effects of preprocessing (particle size reduction, moisture content, chemical additives, etc.) on the flow properties of corn stover. Compressibility, flow properties (interparticle friction, cohesion, unconfined yield stress, etc.), and wall friction were examined for five corn stover samples: ground, milled (dry and wet), acid impregnated, and deacetylated. The ground corn stover was found to be the least compressible and most flowable material. The water and acid impregnated stovers had similar compressibilities. Yet, the wet corn stover was less flowable than the acid impregnated sample, which displayed a flow index equivalent to the dry, milled corn stover. The deacetylated stover, on the other hand, was the most compressible and least flowable examined material. However, all of the tested stover samples had internal friction angles >30°, which could present additional feeding and handling challenges. All of the "wetted" materials (water, acid, and deacetylated) displayed reduced flowabilities (excluding the acid impregnated sample), and enhanced compressibilities and wall friction angles, indicating the potential for added handling issues, which was corroborated via theoretical hopper design calculations. All of the "wetted" corn stovers require larger theoretical hopper outlet diameters and steeper hopper walls than the examined "dry" stovers.

  20. The effects of physical and chemical preprocessing on the flowability of corn stover

    DOE PAGESBeta

    Crawford, Nathan C.; Nagle, Nick; Sievers, David A.; Stickel, Jonathan J.

    2015-12-20

    Continuous and reliable feeding of biomass is essential for successful biofuel production. However, the challenges associated with biomass solids handling are commonly overlooked. In this study, we examine the effects of preprocessing (particle size reduction, moisture content, chemical additives, etc.) on the flow properties of corn stover. Compressibility, flow properties (interparticle friction, cohesion, unconfined yield stress, etc.), and wall friction were examined for five corn stover samples: ground, milled (dry and wet), acid impregnated, and deacetylated. The ground corn stover was found to be the least compressible and most flowable material. The water and acid impregnated stovers had similar compressibilities. Yet, the wet corn stover was less flowable than the acid impregnated sample, which displayed a flow index equivalent to the dry, milled corn stover. The deacetylated stover, on the other hand, was the most compressible and least flowable examined material. However, all of the tested stover samples had internal friction angles >30°, which could present additional feeding and handling challenges. All of the "wetted" materials (water, acid, and deacetylated) displayed reduced flowabilities (excluding the acid impregnated sample), and enhanced compressibilities and wall friction angles, indicating the potential for added handling issues, which was corroborated via theoretical hopper design calculations. All of the "wetted" corn stovers require larger theoretical hopper outlet diameters and steeper hopper walls than the examined "dry" stovers.

  1. Pre-Processing of Point-Data from Contact and Optical 3D Digitization Sensors

    PubMed Central

    Budak, Igor; Vukelić, Djordje; Bračun, Drago; Hodolič, Janko; Soković, Mirko

    2012-01-01

    Contemporary 3D digitization systems employed in reverse engineering (RE) feature ever-growing scanning speeds and the ability to generate a large quantity of points in a unit of time. Although advantageous for the quality and efficiency of RE modelling, the huge number of data points can become a serious practical problem later on, when the CAD model is generated. In addition, 3D digitization processes are very often plagued by measuring errors, which can be attributed to the very nature of the measuring systems, the characteristics of the digitized objects, and subjective errors by the operator; these errors also contribute to problems in the CAD model generation process. This paper presents an integral system for the pre-processing of point data, i.e., filtering, smoothing and reduction, based on a cross-sectional RE approach. In the course of the proposed system's development, major emphasis was placed on the module for point data reduction, which was designed according to a novel approach with integrated deviation analysis and fuzzy logic reasoning. The developed system was verified through its application in three case studies, on point data from objects of versatile geometries obtained by contact and laser 3D digitization systems. The obtained results demonstrate the effectiveness of the system. PMID:22368513

  2. Defining properties of speech spectrogram images to allow effective pre-processing prior to pattern recognition

    NASA Astrophysics Data System (ADS)

    Al-Darkazali, Mohammed; Young, Rupert; Chatwin, Chris; Birch, Philip

    2013-03-01

    The speech signal of a word is a combination of frequencies that produces specific transition frequency shapes. These can be regarded as written text in some unknown 'script'. Before attempting to read the speech spectrogram image using image processing techniques, we first need to define the properties of the spectrogram image, reduce its clutter, and select the methods to be employed for image matching. Thus, methods to convert the speech signal to a spectrogram image are initially employed, followed by reduction of the noise in the signal by capturing the energy associated with the formants of the speech signal. This is followed by normalisation of the size of the image and of its resolution in both the frequency and time axes. Finally, template matching methods are employed to recognise portions of text and isolated words. The paper describes the pre-processing methods employed and outlines the use of normalised grey-level correlation for the recognition of words.
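
    The matching stage described above reduces to normalised grey-level correlation between a word template and the spectrogram image. A minimal sketch follows; the window parameters, log compression and the scikit-image matcher are illustrative assumptions, not the paper's implementation.

    ```python
    import numpy as np
    from scipy.signal import spectrogram
    from skimage.feature import match_template

    def find_word(audio, template_img, fs=16000):
        """Locate a (smaller) word template inside the utterance spectrogram."""
        _, _, S = spectrogram(audio, fs=fs, nperseg=512, noverlap=384)
        img = np.log1p(S)                # compress dynamic range, keep formant energy
        img = img / img.max()            # normalize grey levels
        ncc = match_template(img, template_img)   # normalized cross-correlation map
        peak = np.unravel_index(np.argmax(ncc), ncc.shape)
        return peak, float(ncc.max())
    ```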

  3. Preprocessing: Geocoding of AVIRIS data using navigation, engineering, DEM, and radar tracking system data

    NASA Technical Reports Server (NTRS)

    Meyer, Peter; Larson, Steven A.; Hansen, Earl G.; Itten, Klaus I.

    1993-01-01

    Remotely sensed data have geometric characteristics and a representation which depend on the type of acquisition system used. To correlate such data over large regions with other real-world representation tools, like conventional maps or a Geographic Information System (GIS), for verification purposes or for further treatment within different data sets, a coregistration has to be performed. In addition to the geometric characteristics of the sensor, there are two other dominating factors which affect the geometry: the stability of the platform and the topography. There are two basic approaches to a geometric correction on a pixel-by-pixel basis: (1) a parametric approach using the location of the airplane and inertial navigation system data to simulate the observation geometry; and (2) a non-parametric approach using tie points or ground control points. It is well known that the non-parametric approach is not reliable enough for the unstable flight conditions of airborne systems, and is not satisfactory in areas with significant topography, e.g. mountains and hills. The present work describes a parametric preprocessing procedure which corrects effects of flight line and attitude variation as well as topographic influences, and is described in more detail by Meyer.

  4. The PREP pipeline: standardized preprocessing for large-scale EEG analysis.

    PubMed

    Bigdely-Shamlo, Nima; Mullen, Tim; Kothe, Christian; Su, Kyung-Min; Robbins, Kay A

    2015-01-01

    The technology to collect brain imaging and physiological measures has become portable and ubiquitous, opening the possibility of large-scale analysis of real-world human imaging. By its nature, such data is large and complex, making automated processing essential. This paper shows how lack of attention to the very early stages of an EEG preprocessing pipeline can reduce the signal-to-noise ratio and introduce unwanted artifacts into the data, particularly for computations done in single precision. We demonstrate that ordinary average referencing improves the signal-to-noise ratio, but that noisy channels can contaminate the results. We also show that identification of noisy channels depends on the reference and examine the complex interaction of filtering, noisy channel identification, and referencing. We introduce a multi-stage robust referencing scheme to deal with the noisy channel-reference interaction. We propose a standardized early-stage EEG processing pipeline (PREP) and discuss the application of the pipeline to more than 600 EEG datasets. The pipeline includes an automatically generated report for each dataset processed. Users can download the PREP pipeline as a freely available MATLAB library from http://eegstudy.org/prepcode. PMID:26150785
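
    The interaction the paper highlights (noisy channels contaminating the very reference used to find them) motivates iterating between referencing and channel screening. The toy sketch below re-estimates an average reference after excluding channels with extreme robust z-scores; the thresholds and iteration count are illustrative, not PREP's actual criteria.

    ```python
    import numpy as np

    def robust_average_reference(eeg, z_cut=5.0, n_iter=4):
        """eeg: (n_channels, n_samples) array (float64 recommended)."""
        good = np.ones(eeg.shape[0], dtype=bool)
        ref = eeg.mean(axis=0)
        for _ in range(n_iter):
            dev = np.std(eeg - ref, axis=1)               # deviation from current reference
            mad = np.median(np.abs(dev - np.median(dev)))
            z = (dev - np.median(dev)) / (1.4826 * mad + 1e-12)
            good = z < z_cut                              # screen out grossly noisy channels
            if not good.any():
                break
            ref = eeg[good].mean(axis=0)                  # re-reference on good channels only
        return eeg - ref, good
    ```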

  5. The PREP pipeline: standardized preprocessing for large-scale EEG analysis

    PubMed Central

    Bigdely-Shamlo, Nima; Mullen, Tim; Kothe, Christian; Su, Kyung-Min; Robbins, Kay A.

    2015-01-01

    The technology to collect brain imaging and physiological measures has become portable and ubiquitous, opening the possibility of large-scale analysis of real-world human imaging. By its nature, such data is large and complex, making automated processing essential. This paper shows how lack of attention to the very early stages of an EEG preprocessing pipeline can reduce the signal-to-noise ratio and introduce unwanted artifacts into the data, particularly for computations done in single precision. We demonstrate that ordinary average referencing improves the signal-to-noise ratio, but that noisy channels can contaminate the results. We also show that identification of noisy channels depends on the reference and examine the complex interaction of filtering, noisy channel identification, and referencing. We introduce a multi-stage robust referencing scheme to deal with the noisy channel-reference interaction. We propose a standardized early-stage EEG processing pipeline (PREP) and discuss the application of the pipeline to more than 600 EEG datasets. The pipeline includes an automatically generated report for each dataset processed. Users can download the PREP pipeline as a freely available MATLAB library from http://eegstudy.org/prepcode. PMID:26150785

  6. Tools and Databases of the KOMICS Web Portal for Preprocessing, Mining, and Dissemination of Metabolomics Data

    PubMed Central

    Sakurai, Nozomu; Ara, Takeshi; Enomoto, Mitsuo; Motegi, Takeshi; Morishita, Yoshihiko; Kurabayashi, Atsushi; Iijima, Yoko; Ogata, Yoshiyuki; Nakajima, Daisuke; Suzuki, Hideyuki; Shibata, Daisuke

    2014-01-01

    A metabolome—the collection of comprehensive quantitative data on metabolites in an organism—has been increasingly utilized for applications such as data-intensive systems biology, disease diagnostics, biomarker discovery, and assessment of food quality. A considerable number of tools and databases have been developed to date for the analysis of data generated by various combinations of chromatography and mass spectrometry. We report here a web portal named KOMICS (The Kazusa Metabolomics Portal), where the tools and databases that we developed are available for free to academic users. KOMICS includes the tools and databases for preprocessing, mining, visualization, and publication of metabolomics data. Improvements in the annotation of unknown metabolites and dissemination of comprehensive metabolomic data are the primary aims behind the development of this portal. For this purpose, PowerGet and FragmentAlign include a manual curation function for the results of metabolite feature alignments. A metadata-specific wiki-based database, Metabolonote, functions as a hub of web resources related to the submitters' work. This feature is expected to increase citation of the submitters' work, thereby promoting data publication. As an example of the practical use of KOMICS, a workflow for a study on Jatropha curcas is presented. The tools and databases available at KOMICS should contribute to enhanced production, interpretation, and utilization of metabolomic Big Data. PMID:24949426

  7. Parafoveal preprocessing of word initial trigrams during reading in adults and children.

    PubMed

    Pagán, Ascensión; Blythe, Hazel I; Liversedge, Simon P

    2016-03-01

    Although previous research has shown that letter position information for the first letter of a parafoveal word is encoded less flexibly than that of internal word-beginning letters (Johnson, Perea & Rayner, 2007; White et al., 2008), it is not clear how positional encoding operates over the initial trigram in English. This experiment explored the preprocessing of letter identity and position information of a parafoveal word's initial trigram by adults and children using the boundary paradigm during normal sentence reading. Seven previews were generated: an identity preview (captain), and transposed-letter and substituted-letter nonwords in Positions 1 and 2 (acptain-imptain), Positions 1 and 3 (pactain-gartain), and Positions 2 and 3 (cpatain-cgotain). Results showed a transposed letter effect (TLE) in the Positions 1 and 3 condition for gaze duration on the pretarget word, and a TLE in the Positions 1 and 2 and Positions 2 and 3 conditions, but not in the Positions 1 and 3 condition, on the target word for both adults and children. These findings suggest that children, like adults, extract letter identity and position information flexibly using a spatial coding mechanism, supporting isolated word recognition models such as the SOLAR (Davis, 1999, 2010) and SERIOL (Whitney, 2001) models. PMID:26348198

  8. Chapter 9 - Methylation Analysis by Microarray

    PubMed Central

    Deatherage, Daniel E.; Potter, Dustin; Yan, Pearlly S.; Huang, Tim H.-M.; Lin, Shili

    2010-01-01

    Differential Methylation Hybridization (DMH) is a high-throughput DNA methylation screening tool that utilizes methylation-sensitive restriction enzymes to profile methylated fragments by hybridizing them to a CpG island microarray. This array contains probes spanning all the 27,800 islands annotated in the UCSC Genome Browser. Herein we describe a DMH protocol with clearly identified quality control points. In this manner, samples that are unlikely to provide good read-outs for differential methylation profiles between the test and the control samples will be identified and repeated with appropriate modifications. The step-by-step laboratory DMH protocol is described. In addition, we provide descriptions regarding DMH data analysis, including image quantification, background correction, and statistical procedures for both exploratory analysis and more formal inferences. Issues regarding quality control are addressed as well. PMID:19488875

  9. Uses of Dendrimers for DNA Microarrays

    PubMed Central

    Caminade, Anne-Marie; Padié, Clément; Laurent, Régis; Maraval, Alexandrine; Majoral, Jean-Pierre

    2006-01-01

    Biosensors such as DNA microarrays and microchips are gaining increasing importance in medicinal, forensic, and environmental analyses. Such devices are based on the detection of supramolecular interactions called hybridizations that occur between complementary oligonucleotides, one linked to a solid surface (the probe) and the other to be analyzed (the target). This paper focuses on the improvements that hyperbranched and perfectly defined nanomolecules called dendrimers can provide to this methodology. Two main uses of dendrimers for this purpose have been described to date: either the dendrimer is used as a linker between the solid surface and the probe oligonucleotide, or the dendrimer is used as a multilabeled entity linked to the target oligonucleotide. In the first case the dendrimer generally induces a higher loading of probes and easier hybridization, because the probe is held away from the solid phase. In the second case the high number of localized labels (generally fluorescent) increases sensitivity, allowing the detection of small quantities of biological entities.

  10. Meta-analysis of incomplete microarray studies.

    PubMed

    Zollinger, Alix; Davison, Anthony C; Goldstein, Darlene R

    2015-10-01

    Meta-analysis of microarray studies to produce an overall gene list is relatively straightforward when complete data are available. When some studies lack information (providing only a ranked list of genes, for example), it is common to reduce all studies to ranked lists prior to combining them. Since this entails a loss of information, we consider a hierarchical Bayes approach to meta-analysis using different types of information from different studies: the full data matrix, summary statistics, or ranks. The model uses an informative prior for the parameter of interest to aid the detection of differentially expressed genes. Simulations show that the new approach can give substantial power gains compared with classical meta-analysis and list aggregation methods. A meta-analysis of 11 published studies with different data types identifies genes known to be involved in ovarian cancer and shows significant enrichment. PMID:25987649

  11. DNA microarrays on a mesospaced surface

    NASA Astrophysics Data System (ADS)

    Hong, Bong Jin; Park, Joon Won

    2004-12-01

    A dendron having nine carboxylic acid groups at the end of the branches and a protected amine at the apex was allowed to form a molecular layer on the aminosilylated surface through multipoint ionic attraction. It was found that a compact and smooth monolayer was obtained under appropriate conditions. The film quality was maintained successfully after deprotection of the CBZ group with trimethylsilyl iodide. The surface density of the primary amine after the deprotection was measured by fluorometry, and 0.1-0.2 amine groups per nm² were observed. This implies that the spacing between the amine functional groups is 24-34 Å in a hexagonal close-packed (hcp) model. In addition, DNA microarrays were fabricated successfully on the dendron-modified surface.
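
    The reported spacing follows from the measured surface density under the hcp assumption; as a consistency check (our arithmetic, not a calculation reproduced from the abstract), with each amine occupying a hexagonal-lattice site of spacing s:

```latex
\[
  n = \frac{2}{\sqrt{3}\,s^{2}}
  \quad\Longrightarrow\quad
  s = \sqrt{\frac{2}{\sqrt{3}\,n}} ,
\]
% so n = 0.2 nm^{-2} gives s ~ 2.4 nm = 24 A, and n = 0.1 nm^{-2} gives
% s ~ 3.4 nm = 34 A, matching the quoted 24-34 A range.
```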

  12. Digital microarray analysis for digital artifact genomics

    NASA Astrophysics Data System (ADS)

    Jaenisch, Holger; Handley, James; Williams, Deborah

    2013-06-01

    We implement a Spatial Voting (SV) based analogy of microarray analysis for digital gene marker identification in malware code sections. We examine a famous set of malware formally analyzed by Mandiant and code-named Advanced Persistent Threat (APT1). APT1 is a Chinese organization formed with the specific intent to infiltrate and exploit US resources. Mandiant provided a detailed behavior and string analysis report for the 288 malware samples available. We performed an independent analysis using a new alternative to traditional dynamic analysis and static analysis that we call Spatial Analysis (SA). We perform unsupervised SA on the APT1-originating malware code sections and report our findings. We also show the results of SA performed on some members of the families associated by Mandiant. We conclude that SV-based SA is a practical, fast alternative to dynamic analysis and static analysis.

  13. Giant Magnetoresistive Sensors for DNA Microarray

    PubMed Central

    Xu, Liang; Yu, Heng; Han, Shu-Jen; Osterfeld, Sebastian; White, Robert L.; Pourmand, Nader; Wang, Shan X.

    2009-01-01

    Giant magnetoresistive (GMR) sensors are developed for a DNA microarray. Compared with the conventional fluorescent sensors, GMR sensors are cheaper, more sensitive, can generate fully electronic signals, and can be easily integrated with electronics and microfluidics. The GMR sensor used in this work has a bottom spin valve structure with an MR ratio of 12%. The single-strand target DNA detected has a length of 20 bases. Assays with DNA concentrations down to 10 pM were performed, with a dynamic range of 3 logs. A double modulation technique was used in signal detection to reduce the 1/f noise in the sensor while circumventing electromagnetic interference. The logarithmic relationship between the magnetic signal and the target DNA concentration can be described by the Temkin isotherm. Furthermore, GMR sensors integrated with microfluidics have great potential for improving the sensitivity to 1 pM or below, and the total assay time can be reduced to less than 1 hour. PMID:20824116
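
    The logarithmic signal-concentration relationship mentioned above can be fit directly; the sketch below fits a Temkin-form curve, signal = a + b ln(c), to hypothetical calibration points spanning the reported 3-log dynamic range (the data values are illustrative, not from the paper).

```python
import numpy as np
from scipy.optimize import curve_fit

def temkin(conc, a, b):
    # Temkin isotherm: signal varies linearly with the log of concentration.
    return a + b * np.log(conc)

# Hypothetical calibration points over 10 pM - 10 nM (illustrative values).
conc = np.array([1e-11, 1e-10, 1e-9, 1e-8])    # mol/L
signal = np.array([2.1, 6.3, 10.2, 14.5])      # arbitrary sensor units

(a, b), _ = curve_fit(temkin, conc, signal)
print(f"signal ~ {a:.2f} + {b:.2f} * ln(c)")
```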

  14. Functional assessment of time course microarray data

    PubMed Central

    Nueda, María José; Sebastián, Patricia; Tarazona, Sonia; García-García, Francisco; Dopazo, Joaquín; Ferrer, Alberto; Conesa, Ana

    2009-01-01

    Motivation: Time-course microarray experiments study the progress of gene expression along time across one or several experimental conditions. Most developed analysis methods focus on the clustering or the differential expression analysis of genes and do not integrate functional information. The assessment of the functional aspects of time-course transcriptomics data requires the use of approaches that exploit the activation dynamics of the functional categories to which genes are annotated. Methods: We present three novel methodologies for the functional assessment of time-course microarray data. i) maSigFun derives from the maSigPro method, a regression-based strategy to model time-dependent expression patterns and identify genes with differences across series. maSigFun fits a regression model for groups of genes labeled by a functional class and selects those categories which have a significant model. ii) PCA-maSigFun fits a PCA model of each functional class-defined expression matrix to extract orthogonal patterns of expression change, which are then assessed for their fit to a time-dependent regression model. iii) ASCA-functional uses the ASCA model to rank genes according to their correlation to principal time expression patterns and assess functional enrichment in a GSA fashion. We used simulated and experimental datasets to study these novel approaches, and results were compared to alternative methodologies. Results: Synthetic and experimental data showed that the different methods are able to capture different aspects of the relationship between genes, functions and co-expression that are biologically meaningful. The methods should not be considered competitive; rather, they provide different insights into the molecular and functional dynamic events taking place within the biological system under study. PMID:19534758

  15. Lipid Microarray Biosensor for Biotoxin Detection.

    SciTech Connect

    Singh, Anup K.; Throckmorton, Daniel J.; Moran-Mirabal, Jose C.; Edel, Joshua B.; Meyer, Grant D.; Craighead, Harold G.

    2006-05-01

    We present the use of micron-sized lipid domains, patterned onto planar substrates and within microfluidic channels, to assay the binding of bacterial toxins via total internal reflection fluorescence microscopy (TIRFM). The lipid domains were patterned using a polymer lift-off technique and consisted of ganglioside-populated DSPC:cholesterol supported lipid bilayers (SLBs). Lipid patterns were formed on the substrates by vesicle fusion followed by polymer lift-off, which revealed micron-sized SLBs containing either ganglioside GT1b or GM1. The ganglioside-populated SLB arrays were then exposed to either Cholera toxin subunit B (CTB) or Tetanus toxin fragment C (TTC). Binding was assayed on planar substrates by TIRFM down to 1 nM concentration for CTB and 100 nM for TTC. Apparent binding constants extracted from three different models applied to the binding curves suggest that binding of a protein to a lipid-based receptor is strongly affected by the lipid composition of the SLB and by the substrate on which the bilayer is formed. Patterning of SLBs inside microfluidic channels also allowed the preparation of lipid domains with different compositions on a single device. Arrays within microfluidic channels were used to achieve segregation and selective binding from a binary mixture of the toxin fragments in one device. The binding and segregation within the microfluidic channels was assayed with epifluorescence as proof of concept. We propose that the method used for patterning the lipid microarrays on planar substrates and within microfluidic channels can be easily adapted to proteins or nucleic acids and can be used for biosensor applications and cell stimulation assays under different flow conditions. KEYWORDS: Microarray, ganglioside, polymer lift-off, cholera toxin, tetanus toxin, TIRFM, binding constant.

  16. Development of pre-processing method for use of meteorological ensemble predictions as input to hydrological models: case study of the Huai River Basin, China

    NASA Astrophysics Data System (ADS)

    Song, W.; Xu, X.; Duan, Q.; van Andel, S. J.; Lobbrecht, A. H.; Solomatine, D. P.

    2012-04-01

    Hydrological models are run with precipitation and other meteorological data as input. Conventionally, observed meteorological data with adequate spatial and temporal coverage are the most reliable and accurate input for hydrological models. However, when an early warning several days ahead needs to be provided (for example, for flood forecasts), in many cases Numerical Weather Prediction (NWP) output has to be used. Hydro-meteorological forecasters and researchers aim to understand and minimize the resulting forecast uncertainties. This research focuses on testing existing methods and developing new methods to improve the integration between meteorological models and hydrological models. The output of the fixed version of the NCEP GFS meteorological ensemble prediction system for the Huai river basin is used in this research, with lead times of 1 to 15 days. A number of pre-processing methods are to be tested, such as quantile-to-quantile correction methods, analog methods, and logistic regression. Apart from single methods, a multi-method system could be developed as well, employing BMA, ANN, or other model combination algorithms. Furthermore, the processed ensemble meteorological data can be fed into hydrological models to generate ensemble discharge forecasts. The performance of this approach is tested by comparing its output with ensemble discharge forecasts based on the raw meteorological ensemble forecast input.
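
    Of the pre-processing methods listed, quantile-to-quantile correction is the most self-contained to sketch: each raw forecast value is replaced by the observed-climatology value at the same empirical quantile. The climatologies and values below are synthetic placeholders; operational systems fit the mapping per lead time and season.

```python
import numpy as np

def quantile_map(forecast, obs_climatology, fcst_climatology):
    """Quantile-to-quantile bias correction of an ensemble member.

    Each forecast value is mapped to the observed value occupying the same
    quantile in the historical distributions.
    """
    # Empirical quantile of each forecast value within the forecast climatology.
    q = np.searchsorted(np.sort(fcst_climatology), forecast) / len(fcst_climatology)
    q = np.clip(q, 0.0, 1.0)
    # Corresponding value at the same quantile of the observed climatology.
    return np.quantile(obs_climatology, q)

rng = np.random.default_rng(1)
obs_hist = rng.gamma(2.0, 5.0, 1000)    # historical observed precipitation
fcst_hist = rng.gamma(2.0, 4.0, 1000)   # historical raw forecasts (biased low)
raw = np.array([3.0, 10.0, 25.0])
print(quantile_map(raw, obs_hist, fcst_hist))  # bias-corrected values
```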

  17. Faster algorithms for RNA-folding using the Four-Russians method

    PubMed Central

    2014-01-01

    Background: The secondary structure that maximizes the number of non-crossing matchings between complementary bases of an RNA sequence of length n can be computed in O(n³) time using Nussinov’s dynamic programming algorithm. The Four-Russians method is a technique that reduces the running time for certain dynamic programming algorithms by a multiplicative factor after a preprocessing step where solutions to all smaller subproblems of a fixed size are exhaustively enumerated and solved. Frid and Gusfield designed an O(n³/log n) algorithm for RNA folding using the Four-Russians technique. In their algorithm the preprocessing is interleaved with the algorithm computation. Theoretical results: We simplify the algorithm and the analysis by doing the preprocessing once prior to the algorithm computation. We call this the two-vector method. We also show variants where, instead of exhaustive preprocessing, we only solve the subproblems encountered in the main algorithm once and memoize the results. We give a simple proof of correctness and explore the practical advantages over the earlier method. The Nussinov algorithm admits an O(n²) time parallel algorithm. We show a parallel algorithm using the two-vector idea that improves the time bound to O(n²/log n). Practical results: We have implemented the parallel algorithm on graphics processing units using the CUDA platform. We discuss the organization of the data structures to exploit coalesced memory access for fast running times. The ideas to organize the data structures also help in improving the running time of the serial algorithms. For sequences of length up to 6000 bases the parallel algorithm takes only about 2.5 seconds and the two-vector serial method takes about 57 seconds on a desktop and 15 seconds on a server. Among the serial algorithms, the two-vector and memoized versions are faster than the Frid-Gusfield algorithm by a factor of 3, and are faster than Nussinov by up to a factor of 20. The source-code for the
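
    For reference, the O(n³) Nussinov recurrence that all of these variants accelerate can be written directly as a dynamic program. A plain-Python sketch without the Four-Russians preprocessing (the test sequence is arbitrary):

```python
def nussinov(seq):
    """Maximum number of non-crossing complementary base pairs (Nussinov DP).

    Plain O(n^3) version; the Four-Russians variants described above speed up
    the inner maximization by tabulating solutions to small subproblems.
    """
    pairs = {('A', 'U'), ('U', 'A'), ('C', 'G'), ('G', 'C'), ('G', 'U'), ('U', 'G')}
    n = len(seq)
    dp = [[0] * n for _ in range(n)]
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i + 1][j]                 # base i left unpaired
            for k in range(i + 1, j + 1):       # base i paired with base k
                if (seq[i], seq[k]) in pairs:
                    right = dp[k + 1][j] if k + 1 <= j else 0
                    best = max(best, 1 + dp[i + 1][k - 1] + right)
            dp[i][j] = best
    return dp[0][n - 1]

print(nussinov("GGGAAAUCC"))  # prints the optimal pair count for a toy sequence
```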

  18. Time-Frequency Analysis of Peptide Microarray Data: Application to Brain Cancer Immunosignatures.

    PubMed

    O'Donnell, Brian; Maurer, Alexander; Papandreou-Suppappola, Antonia; Stafford, Phillip

    2015-01-01

    One of the gravest dangers facing cancer patients is an extended symptom-free lull between tumor initiation and the first diagnosis. Detection of tumors is critical for effective intervention. Using the body's immune system to detect and amplify tumor-specific signals may enable detection of cancer using an inexpensive immunoassay. Immunosignatures are one such assay: they provide a map of antibody interactions with random-sequence peptides. They enable detection of disease-specific patterns using classic train/test methods. However, to date, very little effort has gone into extracting information from the sequence of peptides that interact with disease-specific antibodies. Because it is difficult to represent all possible antigen peptides in a microarray format, we chose to synthesize only 330,000 peptides on a single immunosignature microarray. The 330,000 random-sequence peptides on the microarray represent 83% of all tetramers and 27% of all pentamers, creating an unbiased but substantial gap in the coverage of total sequence space. We therefore chose to examine many relatively short motifs from these random-sequence peptides. Time-variant analysis of recurrent subsequences provided a means to dissect amino acid sequences from the peptides while simultaneously retaining the antibody-peptide binding intensities. We first used a simple experiment in which monoclonal antibodies with known linear epitopes were exposed to these random-sequence peptides, and their binding intensities were used to create our algorithm. We then demonstrated the performance of the proposed algorithm by examining immunosignatures from patients with glioblastoma multiforme (GBM), an aggressive form of brain cancer. Eight different frameshift targets were identified from the random-sequence peptides using this technique. If immune-reactive antigens can be identified using a relatively simple immune assay, it might enable a diagnostic test with sufficient sensitivity to detect tumors in a

  19. Improved document image segmentation algorithm using multiresolution morphology

    NASA Astrophysics Data System (ADS)

    Bukhari, Syed Saqib; Shafait, Faisal; Breuel, Thomas M.

    2011-01-01

    Page segmentation into text and non-text elements is an essential preprocessing step before optical character recognition (OCR) operation. In case of poor segmentation, an OCR classification engine produces garbage characters due to the presence of non-text elements. This paper describes modifications to the text/non-text segmentation algorithm presented by Bloomberg, which is also available in his open-source Leptonica library. The modifications result in significant improvements and achieve better segmentation accuracy than the original algorithm on the UW-III, UNLV, and ICDAR 2009 page segmentation competition test images and on circuit diagram datasets.

  20. A Novel Binarization Algorithm for Ballistics Firearm Identification

    NASA Astrophysics Data System (ADS)

    Li, Dongguang

    The identification of ballistics specimens from imaging systems is of paramount importance in criminal investigation. Binarization plays a key role in the preprocessing stage of recognizing cartridges in ballistic imaging systems. Unfortunately, it is very difficult to obtain a satisfactory binary image using existing binarization algorithms. In this paper, we combine global and local thresholds to enhance image binarization. Importantly, we present a novel criterion for effectively detecting edges in the images. Comprehensive experiments have been conducted on sample ballistic images. The empirical results demonstrate that the proposed method provides a better solution than existing binarization algorithms.
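
    A generic version of the global-plus-local idea can be sketched with scikit-image: keep a pixel as foreground only when it passes both an Otsu global threshold and a locally adaptive threshold. This is a stand-in illustration of combining the two threshold types, not the authors' exact criterion or edge-detection step.

```python
import numpy as np
from skimage import filters

def combined_binarize(image, block_size=51, offset=0.02):
    """Combine a global (Otsu) and a local (adaptive) threshold.

    A pixel is foreground only when it exceeds both thresholds, which
    suppresses noise that either threshold alone would let through.
    """
    t_global = filters.threshold_otsu(image)
    t_local = filters.threshold_local(image, block_size=block_size, offset=offset)
    return (image > t_global) & (image > t_local)

rng = np.random.default_rng(2)
img = rng.random((128, 128))
img[40:90, 40:90] += 1.0            # bright synthetic "mark"
binary = combined_binarize(img / img.max())
print(binary.sum(), "foreground pixels")
```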

  1. Segment and fit thresholding: a new method for image analysis applied to microarray and immunofluorescence data.

    PubMed

    Ensink, Elliot; Sinha, Jessica; Sinha, Arkadeep; Tang, Huiyuan; Calderone, Heather M; Hostetter, Galen; Winter, Jordan; Cherba, David; Brand, Randall E; Allen, Peter J; Sempere, Lorenzo F; Haab, Brian B

    2015-10-01

    Experiments involving the high-throughput quantification of image data require algorithms for automation. A challenge in the development of such algorithms is to properly interpret signals over a broad range of image characteristics, without the need for manual adjustment of parameters. Here we present a new approach for locating signals in image data, called Segment and Fit Thresholding (SFT). The method assesses statistical characteristics of small segments of the image and determines the best-fit trends between the statistics. Based on the relationships, SFT identifies segments belonging to background regions; analyzes the background to determine optimal thresholds; and analyzes all segments to identify signal pixels. We optimized the initial settings for locating background and signal in antibody microarray and immunofluorescence data and found that SFT performed well over multiple, diverse image characteristics without readjustment of settings. When used for the automated analysis of multicolor, tissue-microarray images, SFT correctly found the overlap of markers with known subcellular localization, and it performed better than a fixed threshold and Otsu's method for selected images. SFT promises to advance the goal of full automation in image analysis. PMID:26339978
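
    The segment-statistics skeleton of SFT can be sketched as: tile the image, identify low-variability tiles as background, and derive a threshold from the background statistics. The sketch below omits the best-fit-trend step that gives SFT its name, so it is a simplified stand-in with illustrative parameters.

```python
import numpy as np

def sft_like_threshold(image, tile=16, bg_quantile=0.5, k=4.0):
    """Simplified segment-statistics thresholding in the spirit of SFT."""
    h, w = image.shape
    means, stds = [], []
    for y in range(0, h - tile + 1, tile):
        for x in range(0, w - tile + 1, tile):
            seg = image[y:y + tile, x:x + tile]
            means.append(seg.mean())
            stds.append(seg.std())
    means, stds = np.array(means), np.array(stds)
    # Background segments: low local variability (below the median segment std).
    bg = stds <= np.quantile(stds, bg_quantile)
    threshold = means[bg].mean() + k * stds[bg].mean()
    return image > threshold

rng = np.random.default_rng(3)
img = rng.normal(100, 5, (256, 256))
img[100:140, 100:140] += 80          # synthetic signal spot
print(sft_like_threshold(img).sum(), "signal pixels")
```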

  2. Segment and Fit Thresholding: A New Method for Image Analysis Applied to Microarray and Immunofluorescence Data

    PubMed Central

    Ensink, Elliot; Sinha, Jessica; Sinha, Arkadeep; Tang, Huiyuan; Calderone, Heather M.; Hostetter, Galen; Winter, Jordan; Cherba, David; Brand, Randall E.; Allen, Peter J.; Sempere, Lorenzo F.; Haab, Brian B.

    2016-01-01

    Certain experiments involve the high-throughput quantification of image data, thus requiring algorithms for automation. A challenge in the development of such algorithms is to properly interpret signals over a broad range of image characteristics, without the need for manual adjustment of parameters. Here we present a new approach for locating signals in image data, called Segment and Fit Thresholding (SFT). The method assesses statistical characteristics of small segments of the image and determines the best-fit trends between the statistics. Based on the relationships, SFT identifies segments belonging to background regions; analyzes the background to determine optimal thresholds; and analyzes all segments to identify signal pixels. We optimized the initial settings for locating background and signal in antibody microarray and immunofluorescence data and found that SFT performed well over multiple, diverse image characteristics without readjustment of settings. When used for the automated analysis of multi-color, tissue-microarray images, SFT correctly found the overlap of markers with known subcellular localization, and it performed better than a fixed threshold and Otsu’s method for selected images. SFT promises to advance the goal of full automation in image analysis. PMID:26339978

  3. Novel Microarrays for Simultaneous Serodiagnosis of Multiple Antiviral Antibodies

    PubMed Central

    Sivakumar, Ponnurengam Malliappan; Moritsugu, Nozomi; Obuse, Sei; Isoshima, Takashi; Tashiro, Hideo; Ito, Yoshihiro

    2013-01-01

    We developed an automated diagnostic system for the detection of virus-specific immunoglobulin Gs (IgGs) that was based on a microarray platform. We compared efficacies of our automated system with conventional enzyme immunoassays (EIAs). Viruses were immobilized to microarrays using a radical cross-linking reaction that was induced by photo-irradiation. A new photoreactive polymer containing perfluorophenyl azide (PFPA) and poly(ethylene glycol) methacrylate was prepared and coated on plates. Inactivated measles, rubella, mumps, and Varicella-Zoster virus antigens and recombinant Epstein-Barr virus antigen were added to coated plates and irradiated with ultraviolet light to facilitate immobilization. Virus-specific IgGs in healthy human sera were assayed using these prepared microarrays and the results obtained were compared with those from conventional EIAs. We observed high correlation (0.79–0.96) in the results between the automated microarray technique and EIAs. The microarray-based assay was more rapid, used less reagent and sample, and was easier to conduct compared with conventional EIA techniques. The automated microarray system was further improved by introducing reagent storage reservoirs inside the chamber, thereby conserving the use of expensive reagents and antibodies. We considered the microarray format to be suitable for rapid and multiple serological diagnoses of viral diseases that could be developed further for clinical applications. PMID:24367491

  4. Novel microarrays for simultaneous serodiagnosis of multiple antiviral antibodies.

    PubMed

    Sivakumar, Ponnurengam Malliappan; Moritsugu, Nozomi; Obuse, Sei; Isoshima, Takashi; Tashiro, Hideo; Ito, Yoshihiro

    2013-01-01

    We developed an automated diagnostic system for the detection of virus-specific immunoglobulin Gs (IgGs) that was based on a microarray platform. We compared efficacies of our automated system with conventional enzyme immunoassays (EIAs). Viruses were immobilized to microarrays using a radical cross-linking reaction that was induced by photo-irradiation. A new photoreactive polymer containing perfluorophenyl azide (PFPA) and poly(ethylene glycol) methacrylate was prepared and coated on plates. Inactivated measles, rubella, mumps, and Varicella-Zoster virus antigens and recombinant Epstein-Barr virus antigen were added to coated plates and irradiated with ultraviolet light to facilitate immobilization. Virus-specific IgGs in healthy human sera were assayed using these prepared microarrays and the results obtained were compared with those from conventional EIAs. We observed high correlation (0.79-0.96) in the results between the automated microarray technique and EIAs. The microarray-based assay was more rapid, used less reagent and sample, and was easier to conduct compared with conventional EIA techniques. The automated microarray system was further improved by introducing reagent storage reservoirs inside the chamber, thereby conserving the use of expensive reagents and antibodies. We considered the microarray format to be suitable for rapid and multiple serological diagnoses of viral diseases that could be developed further for clinical applications. PMID:24367491

  5. Microintaglio Printing for Soft Lithography-Based in Situ Microarrays

    PubMed Central

    Biyani, Manish; Ichiki, Takanori

    2015-01-01

    Advances in lithographic approaches to fabricating bio-microarrays have been extensively explored over the last two decades. However, the need for pattern flexibility, a high density, a high resolution, affordability and on-demand fabrication is promoting the development of unconventional routes for microarray fabrication. This review highlights the development and uses of a new molecular lithography approach, called “microintaglio printing technology”, for large-scale bio-microarray fabrication using a microreactor array (µRA)-based chip consisting of uniformly-arranged, femtoliter-size µRA molds. In this method, a single-molecule-amplified DNA microarray pattern is self-assembled onto a µRA mold and subsequently converted into a messenger RNA or protein microarray pattern by simultaneously producing and transferring (immobilizing) a messenger RNA or a protein from a µRA mold to a glass surface. Microintaglio printing allows the self-assembly and patterning of in situ-synthesized biomolecules into high-density (kilo-giga-density), ordered arrays on a chip surface with µm-order precision. This capability, which is difficult to achieve using conventional printing and microarray approaches, is expected to reshape proteomics. This review is written substantively rather than comprehensively, highlighting the versatility of microintaglio printing for developing a prerequisite platform for microarray technology for the postgenomic era.

  6. Directional-cosine and related pre-processing techniques - Possibilities and problems in earth-resources surveys

    NASA Technical Reports Server (NTRS)

    Quiel, F.

    1975-01-01

    The possibilities of using various pre-processing techniques (directional-cosine, ratios and ratio/sum) have been investigated in relation to an urban land-use problem in Marion County, Indiana (USA) and for geologic applications in the San Juan Mountains of Colorado. For Marion County, it proved possible to classify directional-cosine data from September 1972 into different land uses by applying statistics developed with data from a May 1973 ERTS frame, thereby demonstrating the possibilities of using this type of data for signature-extension purposes. In the Silverton (Colorado) area pre-processed data proved superior to original data when extracting useful information in mountainous areas without corresponding ground observations. This approach allowed meaningful classification and interpretation of the data. The main problems encountered as a result of atmospheric effects, mixing of different surface materials, and the performance characteristics of ERTS are elucidated.
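
    Directional-cosine pre-processing normalizes each pixel's band vector to unit length, so that two pixels differing only in overall brightness (for example, the same material in and out of shadow) map to the same values; this is what makes the transform useful for signature extension in mountainous terrain. A minimal numpy sketch with made-up band values:

```python
import numpy as np

def directional_cosines(bands):
    """Directional-cosine pre-processing of multispectral data.

    bands: (n_bands, n_pixels) array. Each pixel's band vector is divided by
    its Euclidean norm, keeping only the *direction* of the spectral vector
    and discarding overall brightness.
    """
    norm = np.linalg.norm(bands, axis=0, keepdims=True)
    return bands / np.maximum(norm, 1e-12)

# Four ERTS-like bands for three pixels (illustrative values).
pixels = np.array([[30., 60., 15.],
                   [40., 80., 20.],
                   [50., 100., 25.],
                   [20., 40., 10.]])
cosines = directional_cosines(pixels)
print(cosines)  # the three pixels differ only in brightness: identical cosines
```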

  7. lop-DWI: A Novel Scheme for Pre-Processing of Diffusion-Weighted Images in the Gradient Direction Domain

    PubMed Central

    Sepehrband, Farshid; Choupan, Jeiran; Caruyer, Emmanuel; Kurniawan, Nyoman D.; Gal, Yaniv; Tieng, Quang M.; McMahon, Katie L.; Vegh, Viktor; Reutens, David C.; Yang, Zhengyi

    2015-01-01

    We describe and evaluate a pre-processing method based on a periodic spiral sampling of diffusion-gradient directions for high angular resolution diffusion magnetic resonance imaging. Our pre-processing method incorporates prior knowledge about the acquired diffusion-weighted signal, facilitating noise reduction. Periodic spiral sampling of gradient direction encodings results in an acquired signal in each voxel that is pseudo-periodic with characteristics that allow separation of low-frequency signal from high-frequency noise. Consequently, it enhances local reconstruction of the orientation distribution function used to define fiber tracks in the brain. Denoising with periodic spiral sampling was tested using synthetic data and in vivo human brain images. Both the signal-to-noise ratio and the accuracy of local fiber track reconstruction were significantly improved using our method. PMID:25628600

  8. Algorithms evaluation for fundus images enhancement

    NASA Astrophysics Data System (ADS)

    Braem, V.; Marcos, M.; Bizai, G.; Drozdowicz, B.; Salvatelli, y. A.

    2011-12-01

    Color images of the retina inherently involve noise and illumination artifacts. In order to improve the diagnostic quality of the images, it is desirable to homogenize the non-uniform illumination and increase contrast while preserving color characteristics. The visual results of different pre-processing techniques can be very dissimilar, and an objective assessment is needed to select the most suitable technique. In this article, the performance of eight algorithms was evaluated with respect to correction of non-uniform illumination, contrast modification, and color preservation, and a general score was proposed for choosing the most suitable one. The results were rated favorably by experts, although some differences suggest that the image with the best statistical quality is not necessarily the one with the best diagnostic quality to the trained eye. This means that the best pre-processing algorithm for automatic classification may differ from the most suitable one for visual diagnosis. However, both should result in the same final diagnosis.

  9. FITPix data preprocessing pipeline for the Timepix single particle pixel detector

    NASA Astrophysics Data System (ADS)

    Kraus, V.; Holik, M.; Jakubek, J.; Georgiev, V.

    2012-04-01

    The semiconductor pixel detector Timepix contains an array of 256 × 256 square pixels with a pitch of 55 μm. The single quantum counting detector Timepix can also provide information about the energy or arrival time of a particle from every single pixel. This device is a powerful tool for radiation imaging and ionizing particle tracking. The Timepix device can be read out via a serial or parallel interface enabling speeds of 100 fps or 3200 fps, respectively. The device can be connected to a PC via the USB 2.0 based interface FITPix, which currently supports the serial output of Timepix reaching a speed of 90 fps. FITPix supports adjustable clock frequency and hardware triggering, which is a useful tool for the synchronized operation of multiple devices. The FITPix interface can handle up to 16 detectors in a daisy chain. The complete system including the FITPix interface and Timepix detector is controlled from the PC by the Pixelman software package. A pipeline structure is now implemented in the new version of the readout interface of FITPix. This version also supports parallel Timepix readout. The pipeline architecture brings the possibility of data preprocessing directly in the hardware. The first pipeline stage converts the raw Timepix data into the form of a matrix or stream of pixel values. Another stage performs further data processing such as event thresholding and data compression. Complex data processing currently performed by Pixelman in the PC is significantly reduced in this way. The described architecture together with the parallel readout increases data throughput, reaching a higher frame rate and reducing the dead time. Significant data compression is performed directly in the hardware, especially for sparse data sets from particle tracking applications. The data frame size is typically compressed by a factor of 10-100.

  10. Functional MRI Preprocessing in Lesioned Brains: Manual Versus Automated Region of Interest Analysis

    PubMed Central

    Garrison, Kathleen A.; Rogalsky, Corianne; Sheng, Tong; Liu, Brent; Damasio, Hanna; Winstein, Carolee J.; Aziz-Zadeh, Lisa S.

    2015-01-01

    Functional magnetic resonance imaging (fMRI) has significant potential in the study and treatment of neurological disorders and stroke. Region of interest (ROI) analysis in such studies allows for testing of strong a priori clinical hypotheses with improved statistical power. A commonly used automated approach to ROI analysis is to spatially normalize each participant’s structural brain image to a template brain image and define ROIs using an atlas. However, in studies of individuals with structural brain lesions, such as stroke, the gold standard approach may be to manually hand-draw ROIs on each participant’s non-normalized structural brain image. Automated approaches to ROI analysis are faster and more standardized, yet are susceptible to preprocessing error (e.g., normalization error) that can be greater in lesioned brains. The manual approach to ROI analysis has high demand for time and expertise, but may provide a more accurate estimate of brain response. In this study, commonly used automated and manual approaches to ROI analysis were directly compared by reanalyzing data from a previously published hypothesis-driven cognitive fMRI study, involving individuals with stroke. The ROI evaluated is the pars opercularis of the inferior frontal gyrus. Significant differences were identified in task-related effect size and percent-activated voxels in this ROI between the automated and manual approaches to ROI analysis. Task interactions, however, were consistent across ROI analysis approaches. These findings support the use of automated approaches to ROI analysis in studies of lesioned brains, provided they employ a task interaction design. PMID:26441816

  11. Hardware Design and Implementation of a Wavelet De-Noising Procedure for Medical Signal Preprocessing

    PubMed Central

    Chen, Szi-Wen; Chen, Yuan-Ho

    2015-01-01

    In this paper, a discrete wavelet transform (DWT) based de-noising method and its application to noise reduction for medical signal preprocessing are introduced. This work focuses on the hardware realization of a real-time wavelet de-noising procedure. The proposed de-noising circuit mainly consists of three modules: a DWT circuit, a thresholding circuit, and an inverse DWT (IDWT) circuit. We also proposed a novel adaptive thresholding scheme and incorporated it into our wavelet de-noising procedure. Performance was then evaluated on the architectural designs of both the software and hardware implementations. In addition, the de-noising circuit was also implemented by downloading the Verilog codes to a field programmable gate array (FPGA) based platform so that its ability in noise reduction may be further validated in actual practice. Simulation experiment results produced by applying a set of simulated noise-contaminated electrocardiogram (ECG) signals to the de-noising circuit showed that the circuit could not only desirably meet the requirement of real-time processing, but also achieve satisfactory performance for noise reduction, while the sharp features of the ECG signals can be well preserved. The proposed de-noising circuit was further synthesized using the Synopsys Design Compiler with an Artisan Taiwan Semiconductor Manufacturing Company (TSMC, Hsinchu, Taiwan) 40 nm standard cell library. The integrated circuit (IC) synthesis simulation results showed that the proposed design can achieve a clock frequency of 200 MHz, with a power consumption of only 17.4 mW at that frequency. PMID:26501290
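
    A software model of the DWT -> threshold -> IDWT chain is easy to express with PyWavelets; the universal threshold used below is a generic stand-in for the paper's own adaptive thresholding scheme, and the test signal is synthetic rather than a real ECG.

```python
import numpy as np
import pywt

def wavelet_denoise(signal, wavelet="db4", level=4):
    """Software model of the DWT -> threshold -> IDWT de-noising chain."""
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    # Noise scale from the finest detail coefficients (median absolute deviation).
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745
    thresh = sigma * np.sqrt(2 * np.log(len(signal)))   # universal threshold
    denoised = [coeffs[0]] + [pywt.threshold(c, thresh, mode="soft")
                              for c in coeffs[1:]]
    return pywt.waverec(denoised, wavelet)[: len(signal)]

t = np.linspace(0, 1, 1024)
clean = np.sin(2 * np.pi * 5 * t)          # stand-in for an ECG-like signal
noisy = clean + 0.3 * np.random.default_rng(4).standard_normal(t.size)
print(np.std(wavelet_denoise(noisy) - clean))  # residual noise, well below 0.3
```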

  12. Optimising chemical named entity recognition with pre-processing analytics, knowledge-rich features and heuristics

    PubMed Central

    2015-01-01

    Background The development of robust methods for chemical named entity recognition, a challenging natural language processing task, was previously hindered by the lack of publicly available, large-scale, gold standard corpora. The recent public release of a large chemical entity-annotated corpus as a resource for the CHEMDNER track of the Fourth BioCreative Challenge Evaluation (BioCreative IV) workshop greatly alleviated this problem and allowed us to develop a conditional random fields-based chemical entity recogniser. In order to optimise its performance, we introduced customisations in various aspects of our solution. These include the selection of specialised pre-processing analytics, the incorporation of chemistry knowledge-rich features in the training and application of the statistical model, and the addition of post-processing rules. Results Our evaluation shows that optimal performance is obtained when our customisations are integrated into the chemical entity recogniser. When its performance is compared with that of state-of-the-art methods, under comparable experimental settings, our solution achieves competitive advantage. We also show that our recogniser that uses a model trained on the CHEMDNER corpus is suitable for recognising names in a wide range of corpora, consistently outperforming two popular chemical NER tools. Conclusion The contributions resulting from this work are two-fold. Firstly, we present the details of a chemical entity recognition methodology that has demonstrated performance at a competitive, if not superior, level as that of state-of-the-art methods. Secondly, the developed suite of solutions has been made publicly available as a configurable workflow in the interoperable text mining workbench Argo. This allows interested users to conveniently apply and evaluate our solutions in the context of other chemical text mining tasks. PMID:25810777

  13. Hardware design and implementation of a wavelet de-noising procedure for medical signal preprocessing.

    PubMed

    Chen, Szi-Wen; Chen, Yuan-Ho

    2015-01-01

    In this paper, a discrete wavelet transform (DWT) based de-noising method and its application to noise reduction for medical signal preprocessing are introduced. This work focuses on the hardware realization of a real-time wavelet de-noising procedure. The proposed de-noising circuit mainly consists of three modules: a DWT circuit, a thresholding circuit, and an inverse DWT (IDWT) circuit. We also proposed a novel adaptive thresholding scheme and incorporated it into our wavelet de-noising procedure. Performance was then evaluated on the architectural designs of both the software and hardware implementations. In addition, the de-noising circuit was also implemented by downloading the Verilog codes to a field programmable gate array (FPGA) based platform so that its ability in noise reduction may be further validated in actual practice. Simulation experiment results produced by applying a set of simulated noise-contaminated electrocardiogram (ECG) signals to the de-noising circuit showed that the circuit could not only desirably meet the requirement of real-time processing, but also achieve satisfactory performance for noise reduction, while the sharp features of the ECG signals can be well preserved. The proposed de-noising circuit was further synthesized using the Synopsys Design Compiler with an Artisan Taiwan Semiconductor Manufacturing Company (TSMC, Hsinchu, Taiwan) 40 nm standard cell library. The integrated circuit (IC) synthesis simulation results showed that the proposed design can achieve a clock frequency of 200 MHz, with a power consumption of only 17.4 mW at that frequency. PMID:26501290

  14. Detection of epileptic seizure in EEG signals using linear least squares preprocessing.

    PubMed

    Roshan Zamir, Z

    2016-09-01

    An epileptic seizure is a transient event of abnormal excessive neuronal discharge in the brain. This unwanted event can be averted by detecting electrical changes in the brain that happen before the seizure takes place. The automatic detection of seizures is necessary since the visual screening of EEG recordings is a time-consuming task and requires experts to improve the diagnosis. Much of the prior research in detection of seizures has been developed based on artificial neural networks, genetic programming, and wavelet transforms. Although the highest achieved accuracy for classification is 100%, there are drawbacks, such as the existence of unbalanced datasets and the lack of investigation of performance consistency. To address these, four linear least squares-based preprocessing models are proposed to extract key features of an EEG signal in order to detect seizures. The first two models are newly developed. The original signal (EEG) is approximated by a sinusoidal curve. Its amplitude is formed by a polynomial function and compared with the predeveloped spline function. Different statistical measures, namely classification accuracy, true positive and negative rates, false positive and negative rates and precision, are utilised to assess the performance of the proposed models. These metrics are derived from confusion matrices obtained from classifiers. Different classifiers are used over the original dataset and the set of extracted features. The proposed models significantly reduce the dimension of the classification problem and the computational time while the classification accuracy is improved in most cases. The first and third models are promising feature extraction methods with a classification accuracy of 100%. Logistic, LazyIB1, LazyIB5, and J48 are the best classifiers. Their true positive and negative rates are 1 while false positive and negative rates are 0 and the corresponding precision values are 1. Numerical results suggest that these
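
    The core trick of linear least-squares preprocessing is that, for fixed candidate frequencies, a sinusoidal model (offset plus sine and cosine terms) is linear in its coefficients, so the fit is a single lstsq call and the fitted coefficients become low-dimensional features for a classifier. The sketch below is a generic version of that idea, not the paper's exact four models; the frequencies, sampling rate, and synthetic signal are illustrative.

```python
import numpy as np

def sinusoidal_ls_features(signal, freqs, fs):
    """Fit a sum of sinusoids to a signal by linear least squares.

    Returns the fitted coefficients (usable as features) and the fitted curve.
    """
    t = np.arange(len(signal)) / fs
    columns = [np.ones_like(t)]
    for f in freqs:
        columns += [np.sin(2 * np.pi * f * t), np.cos(2 * np.pi * f * t)]
    basis = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(basis, signal, rcond=None)
    return coef, basis @ coef

fs = 256.0                                  # illustrative sampling rate (Hz)
rng = np.random.default_rng(5)
eeg = (np.sin(2 * np.pi * 3 * np.arange(1024) / fs)
       + 0.5 * rng.standard_normal(1024))   # synthetic EEG-like segment
coef, fitted = sinusoidal_ls_features(eeg, freqs=[3, 7, 12], fs=fs)
print(coef)  # 7 coefficients replace 1024 samples as classifier input
```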

  15. MeteoIO: A Pre-Processing Library for Numerical Models

    NASA Astrophysics Data System (ADS)

    Bavay, M.; Egger, T.; Fierz, C.; Lehning, M.

    2012-04-01

    Using numerical models, which require large meteorological data sets, is sometimes difficult and problems can often be traced back to the Input/Output functionality. Complex models are usually developed by the environmental sciences community with a focus on the core modeling issues. As a consequence, the I/O routines are often error-prone, lacking flexibility and robustness. With the increasing use of such models in operational applications, this situation ceases to be simply uncomfortable and becomes a major issue. In parallel, the added requirements (in terms of robustness and flexibility) tremendously increase the cost of dealing with I/O. The MeteoIO library has been designed for the specific needs of numerical models that require meteorological data. The whole task of data pre-processing has been delegated to this library, namely retrieving, filtering and re-sampling the data if necessary as well as providing spatial interpolations. The focus has been to design an Application Programming Interface (API) that (i) provides a uniform interface to meteorological data in the models; (ii) hides the complexity of the processing taking place; (iii) guarantees a robust behavior in case of format or transmission errors, erroneous or missing data. Moreover, in an operational context, this error handling should avoid unnecessary interruptions in the simulation process. A strong emphasis has been put on simplicity and modularity in order to make it extremely easy to support new data formats or protocols and to allow contributors with diverse backgrounds to participate. This library can also be used in the context of High Performance Computing in a parallel environment. Finally, it is released under an Open Source license and is available at https://slfsmm.indefero.net/p/meteoio.

  16. Impact of functional MRI data preprocessing pipeline on default-mode network detectability in patients with disorders of consciousness

    PubMed Central

    Andronache, Adrian; Rosazza, Cristina; Sattin, Davide; Leonardi, Matilde; D'Incerti, Ludovico; Minati, Ludovico

    2013-01-01

    An emerging application of resting-state functional MRI (rs-fMRI) is the study of patients with disorders of consciousness (DoC), where integrity of default-mode network (DMN) activity is associated to the clinical level of preservation of consciousness. Due to the inherent inability to follow verbal instructions, arousal induced by scanning noise and postural pain, these patients tend to exhibit substantial levels of movement. This results in spurious, non-neural fluctuations of the rs-fMRI signal, which impair the evaluation of residual functional connectivity. Here, the effect of data preprocessing choices on the detectability of the DMN was systematically evaluated in a representative cohort of 30 clinically and etiologically heterogeneous DoC patients and 33 healthy controls. Starting from a standard preprocessing pipeline, additional steps were gradually inserted, namely band-pass filtering (BPF), removal of co-variance with the movement vectors, removal of co-variance with the global brain parenchyma signal, rejection of realignment outlier volumes and ventricle masking. Both independent-component analysis (ICA) and seed-based analysis (SBA) were performed, and DMN detectability was assessed quantitatively as well as visually. The results of the present study strongly show that the detection of DMN activity in the sub-optimal fMRI series acquired on DoC patients is contingent on the use of adequate filtering steps. ICA and SBA are differently affected but give convergent findings for high-grade preprocessing. We propose that future studies in this area should adopt the described preprocessing procedures as a minimum standard to reduce the probability of wrongly inferring that DMN activity is absent. PMID:23986694
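
    Two of the preprocessing steps evaluated above, band-pass filtering (BPF) and removal of co-variance with the movement vectors (and other regressors), reduce to a filter plus an ordinary least-squares regression per time series. A minimal numpy/scipy stand-in for those two steps (TR, band, and synthetic data are illustrative, not the paper's pipeline):

```python
import numpy as np
from scipy.signal import butter, filtfilt

def clean_timeseries(ts, confounds, fs, band=(0.01, 0.1)):
    """Band-pass filter a voxel/ROI time series and regress out confounds.

    ts: (n_timepoints,) signal; confounds: (n_timepoints, n_regressors),
    e.g. the six movement parameters and the global signal.
    """
    # Zero-phase Butterworth band-pass (the BPF step).
    b, a = butter(2, [band[0] / (fs / 2), band[1] / (fs / 2)], btype="band")
    filtered = filtfilt(b, a, ts)
    # Remove variance explained by the confound regressors (plus an intercept).
    design = np.column_stack([np.ones(len(ts)), confounds])
    beta, *_ = np.linalg.lstsq(design, filtered, rcond=None)
    return filtered - design @ beta

fs = 1 / 2.0                              # TR of 2 s
rng = np.random.default_rng(6)
n = 300
motion = rng.standard_normal((n, 6))      # six rigid-body movement parameters
signal = rng.standard_normal(n) + motion @ rng.standard_normal(6)
print(np.std(clean_timeseries(signal, motion, fs)))  # motion variance removed
```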

  17. Approach to reduce the computational image processing requirements for a computer vision system using sensor preprocessing and the Hotelling transform

    NASA Astrophysics Data System (ADS)

    Schei, Thomas R.; Wright, Cameron H. G.; Pack, Daniel J.

    2005-03-01

    We describe a new development approach to computer vision for compact, low-power, real-time systems such as mobile robots. We take advantage of preprocessing in a biomimetic vision sensor and employ a computational strategy using subspace methods and the Hotelling transform in an effort to reduce the computational imaging load. While the combination provides an overall reduction in the computational imaging requirements, its two components are not yet optimized for each other and require additional investigation.
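
    The Hotelling transform referred to here is the principal component (Karhunen-Loève) transform: project feature vectors onto the top eigenvectors of their covariance matrix, keeping only the directions of largest variance. A minimal sketch (patch size and component count are arbitrary choices, not the paper's):

```python
import numpy as np

def hotelling_transform(vectors, n_components):
    """Hotelling (principal component / Karhunen-Loève) transform.

    Projects feature vectors onto the eigenvectors of their covariance matrix
    with the largest eigenvalues, reducing the dimensionality of the imaging
    data handed to later processing stages.
    """
    mean = vectors.mean(axis=0)
    centered = vectors - mean
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigenvalues in ascending order
    top = eigvecs[:, ::-1][:, :n_components]     # largest-variance directions
    return centered @ top, mean, top

rng = np.random.default_rng(7)
patches = rng.standard_normal((500, 64))         # e.g. flattened 8x8 image patches
reduced, mean, basis = hotelling_transform(patches, n_components=8)
print(reduced.shape)                             # (500, 8): 8x less data per patch
```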

  18. Learning curves in classification with microarray data.

    PubMed

    Hess, Kenneth R; Wei, Caimiao

    2010-02-01

    The performance of many repeated tasks improves with experience and practice. This improvement tends to be rapid initially and then decreases. The term "learning curve" is often used to describe the phenomenon. In supervised machine learning, the performance of classification algorithms often increases with the number of observations used to train the algorithm. We use progressively larger samples of observations to train the algorithm and then plot performance against the number of training observations. This yields the familiar negatively accelerating learning curve. To quantify the learning curve, we fit inverse power law models to the progressively sampled data. We fit such learning curves to four large clinical cancer genomic datasets, using three classifiers (diagonal linear discriminant analysis, K-nearest-neighbor with three neighbors, and support vector machines) and four values for the number of top genes included (5, 50, 500, 5,000). The inverse power law models fit the progressively sampled data reasonably well and showed considerable diversity when multiple classifiers are applied to the same data. Some classifiers showed rapid and continued increase in performance as the number of training samples increased, while others showed little if any improvement. Assessing classifier efficiency is particularly important in genomic studies since samples are so expensive to obtain. It is important to employ an algorithm that uses the predictive information efficiently, but with a modest number of training samples (>50), learning curves can be used to assess the predictive efficiency of classification algorithms. PMID:20172367
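
    Fitting the inverse power law model is a one-liner with scipy; accuracy is modeled as an asymptote minus a decaying term, acc(n) ≈ a - b·n^(-c). The progressively sampled accuracies below are hypothetical, not values from the paper.

```python
import numpy as np
from scipy.optimize import curve_fit

def inverse_power_law(n, a, b, c):
    # Accuracy approaches the asymptote a as training-set size n grows.
    return a - b * n ** (-c)

# Hypothetical progressively sampled accuracies (illustrative only).
n_train = np.array([10, 20, 40, 80, 160, 320])
accuracy = np.array([0.62, 0.70, 0.76, 0.80, 0.83, 0.845])

(a, b, c), _ = curve_fit(inverse_power_law, n_train, accuracy,
                         p0=[0.9, 1.0, 0.5], maxfev=10000)
print(f"asymptotic accuracy ~ {a:.3f}, learning exponent ~ {c:.2f}")
```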

  19. Protein Microarrays with Novel Microfluidic Methods: Current Advances

    PubMed Central

    Dixit, Chandra K.; Aguirre, Gerson R.

    2014-01-01

    Microfluidic-based micromosaic technology has allowed the patterning of recognition elements in restricted micrometer-scale areas with high precision. This controlled patterning enabled the development of highly multiplexed arrays for multiple-analyte detection. This arraying technology was first introduced at the beginning of 2001 and holds tremendous potential to revolutionize microarray development and analyte detection. Later, several microfluidic methods were developed for microarray application. In this review we discuss these novel methods and approaches, which leverage the properties of microfluidic technologies to significantly improve various physical aspects of microarray technology, such as imprinting homogeneity, stability of the immobilized biomolecules, assay times, and the cost and bulk of instrumentation.

  20. A Perspective on DNA Microarrays in Pathology Research and Practice

    PubMed Central

    Pollack, Jonathan R.

    2007-01-01

    DNA microarray technology matured in the mid-1990s, and the past decade has witnessed a tremendous growth in its application. DNA microarrays have provided powerful tools for pathology researchers seeking to describe, classify, and understand human disease. There has also been great expectation that the technology would advance the practice of pathology. This review highlights some of the key contributions of DNA microarrays to experimental pathology, focusing in the area of cancer research. Also discussed are some of the current challenges in translating utility to clinical practice. PMID:17600117

  1. Deciphering the glycosaminoglycan code with the help of microarrays.

    PubMed

    de Paz, Jose L; Seeberger, Peter H

    2008-07-01

    Carbohydrate microarrays have become a powerful tool to elucidate the biological role of complex sugars. Microarrays are particularly useful for the study of glycosaminoglycans (GAGs), a key class of carbohydrates. The high-throughput chip format enables rapid screening of large numbers of potential GAG sequences produced via a complex biosynthesis while consuming very little sample. Here, we briefly highlight the most recent advances involving GAG microarrays built with synthetic or naturally derived oligosaccharides. These chips are powerful tools for characterizing GAG-protein interactions and determining structure-activity relationships for specific sequences. Thereby, they contribute to decoding the information contained in specific GAG sequences. PMID:18563243

  2. Imaging combined autoimmune and infectious disease microarrays

    NASA Astrophysics Data System (ADS)

    Ewart, Tom; Raha, Sandeep; Kus, Dorothy; Tarnopolsky, Mark

    2006-09-01

    Bacterial and viral pathogens are implicated in many severe autoimmune diseases, acting through such mechanisms as molecular mimicry and superantigen activation of T-cells. For example, Helicobacter pylori, a well-known cause of stomach ulcers and cancers, is also identified in ischaemic heart disease (mimicry of heat shock protein 65), autoimmune pancreatitis, systemic sclerosis, autoimmune thyroiditis (HLA DRB1*0301 allele susceptibility), and Crohn's disease. Successful antibiotic eradication of H. pylori often accompanies remission of these diseases. Yet current diagnostic devices, and test-limiting cost containment, impede recognition of the linkage, delaying both diagnosis and therapeutic intervention until the chronic debilitating stage. We designed a 15-minute, low-cost, 39-antigen microarray assay, combining autoimmune, viral and bacterial antigens. This enables point-of-care serodiagnosis and cost-effective, narrowly targeted, concurrent antibiotic and monoclonal anti-T-cell and anti-cytokine immunotherapy. Arrays of 26 pathogen and 13 autoimmune antigens with IgG and IgM dilution series were printed in triplicate on epoxysilane covalent binding slides with Teflon well masks. Sera diluted 1:20 were incubated 10 minutes, washed off, anti-IgG-Cy3 (green) and anti-IgM-Dy647 (red) were incubated for 5 minutes, washed off and the slide was read in an ArrayWoRx(e) scanning CCD imager (Applied Precision, Issaquah, WA). As a preliminary model for the combined infectious disease-autoimmune diagnostic microarray we surveyed 98 unidentified, outdated sera that were discarded after Hepatitis B antibody testing. In these, significant IgG or IgM autoantibody levels were found: dsDNA 5, ssDNA 11, Ro 2, RNP 7, SSB 4, gliadin 2, thyroglobulin 13 cases. Since control sera showed no autoantibodies, the high frequency of anti-DNA and anti-thyroglobulin antibodies found in infected sera lends increased support for linkage of infection to subsequent autoimmune disease. Expansion of the antigen

  3. New technologies for fabricating biological microarrays

    NASA Astrophysics Data System (ADS)

    Larson, Bradley James

    This dissertation contains the description of two technologies that we have developed to reduce the cost and improve the quality of spotted biological microarrays. The first is a device, called a fluid microplotter, that uses ultrasonics to deposit spots with diameters of less than 5 microns. It consists of a dispenser, composed of a micropipette fastened to a piece of PZT piezoelectric, attached to a precision positioning system. A gentle pumping of fluid to the surface occurs when the micropipette is driven at specific frequencies. Spots or continuous lines can be deposited in this manner. The small fluid features conserve expensive and limited-quantity biological reagents. We characterize the performance of the microplotter in depositing fluid and examine the theoretical underpinnings of its operation. We present an analytical expression for the diameter of a deposited spot as a function of droplet volume and wettability of a surface and compare it with experimental results. We also examine the resonant properties of the piezoelectric element used to drive the dispenser and relate that to the frequencies at which pumping occurs. Finally, we propose a mechanism to explain the pumping behavior within the microplotter dispenser. The second technology we present is a process that uses a cold plasma and a subsequent in vacuo vapor-phase reaction to terminate a variety of oxide surfaces with epoxide chemical groups. These epoxide groups can react with amine-containing biomolecules to form strong covalent linkages between the biomolecules and the treated surface. The use of a plasma activation step followed by an in vacuo vapor-phase reaction allows for the precise control of surface functional groups, rather than the mixture of functionalities normally produced. This process modifies a range of different oxide surfaces, is fast, consumes a minimal amount of reagents, and produces attachment densities for bound biomolecules that are comparable to or better than
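
    The abstract does not reproduce the spot-diameter expression; the standard spherical-cap relation from which such derivations start (our statement of the geometry, not necessarily the dissertation's final formula) gives the contact diameter D for a droplet of volume V and equilibrium contact angle θ:

```latex
\[
  D \;=\; 2\left(\frac{3V}{\pi}\right)^{1/3}
  \left(\frac{\sin^{3}\theta}{\,2 - 3\cos\theta + \cos^{3}\theta\,}\right)^{1/3} .
\]
% Larger contact angles (poorer wetting) give smaller spot diameters for the
% same deposited volume, consistent with the wettability dependence studied.
```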

  4. An intelligent pre-processing framework for standardizing medical images for CAD and other post-processing applications

    NASA Astrophysics Data System (ADS)

    Raghupathi, Lakshminarasimhan; Devarakota, Pandu R.; Wolf, Matthias

    2012-03-01

    There is an increasing need to provide end-users with seamless and secure access to healthcare information acquired from a diverse range of sources. This might include local and remote hospital sites equipped with devices from different vendors and practicing varied acquisition protocols, as well as heterogeneous external sources such as the Internet cloud. In such scenarios, image post-processing tools such as CAD (computer-aided diagnosis), which were hitherto developed using a smaller set of images, may not always work optimally on newer sets of images having entirely different characteristics. In this paper, we propose a framework that assesses the quality of a given input image and automatically applies an appropriate pre-processing method in such a manner that the image characteristics are normalized regardless of its source. We focus mainly on medical images, and the objective of the preprocessing method is to enable various image processing and workflow applications, such as CAD, to perform in a consistent manner. First, our system consists of an assessment step wherein an image is evaluated based on criteria such as noise, image sharpness, etc. Depending on the measured characteristic, we then apply an appropriate normalization technique, thus forming our overall pre-processing framework. A systematic evaluation of the proposed scheme is carried out on a large set of CT images acquired from various vendors, including images reconstructed with next-generation iterative methods. Results demonstrate that the images are normalized and thus suitable for an existing LungCAD prototype.
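
    The assess-then-normalize flow can be sketched as: measure simple noise and sharpness statistics, then branch to a matching correction. The estimators, thresholds, and remedies below are illustrative placeholders for the framework's rules, not the paper's actual criteria.

```python
import numpy as np
from scipy import ndimage

def assess_and_normalize(image, noise_limit=8.0, sharp_limit=50.0):
    """Assess image quality, then apply a matching normalization step.

    Noise is estimated from the median absolute deviation of the Laplacian;
    sharpness from the variance of the gradient magnitude. Thresholds and
    remedies are illustrative, not the paper's rules.
    """
    img = image.astype(float)
    lap = ndimage.laplace(img)
    noise = np.median(np.abs(lap - np.median(lap))) / 0.6745
    gx, gy = np.gradient(img)
    sharpness = np.var(np.hypot(gx, gy))
    if noise > noise_limit:                  # too noisy: smooth before CAD
        return ndimage.gaussian_filter(img, sigma=1.0)
    if sharpness < sharp_limit:              # too soft: unsharp-mask it
        blurred = ndimage.gaussian_filter(img, sigma=1.0)
        return img + 0.7 * (img - blurred)
    return img                               # already within the target range

rng = np.random.default_rng(8)
ct_slice = rng.normal(0, 20, (128, 128))     # synthetic noisy slice
print(assess_and_normalize(ct_slice).std())  # reduced after smoothing
```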

  5. A non-linear preprocessing for opto-digital image encryption using multiple-parameter discrete fractional Fourier transform

    NASA Astrophysics Data System (ADS)

    Azoug, Seif Eddine; Bouguezel, Saad

    2016-01-01

    In this paper, a novel opto-digital image encryption technique is proposed by introducing a new non-linear preprocessing and using the multiple-parameter discrete fractional Fourier transform (MPDFrFT). The non-linear preprocessing is performed digitally on the input image in the spatial domain using a piecewise linear chaotic map (PLCM) coupled with the bitwise exclusive OR (XOR). The resulting image is multiplied by a random phase mask before applying the MPDFrFT to whiten the image. Then, a chaotic permutation is performed on the output of the MPDFrFT using another PLCM different from the one used in the spatial domain. Finally, another MPDFrFT is applied to obtain the encrypted image. The parameters of the PLCMs together with the multiple fractional orders of the MPDFrFTs constitute the secret key for the proposed cryptosystem. Computer simulation results and security analysis are presented to show the robustness of the proposed opto-digital image encryption technique and the great importance of the new non-linear preprocessing introduced to enhance the security of the cryptosystem and overcome the problem of linearity encountered in the existing permutation-based opto-digital image encryption schemes.
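
    The PLCM has a standard closed form, so the digital preprocessing stage can be sketched compactly. The snippet below implements only the PLCM-keystream XOR step; the random phase mask, the MPDFrFT stages, and the chaotic permutation are omitted, and x0 and p are placeholder key values.

        import numpy as np

        def plcm_step(x, p):
            # Piecewise linear chaotic map on [0, 1] with control parameter p in (0, 0.5).
            if x < p:
                return x / p
            if x < 0.5:
                return (x - p) / (0.5 - p)
            return plcm_step(1.0 - x, p)  # the map is symmetric about x = 0.5

        def xor_preprocess(img, x0=0.3712, p=0.2901):
            """XOR the pixel bytes with a PLCM-generated keystream
            (x0 and p stand in for part of the secret key)."""
            flat = img.astype(np.uint8).ravel()
            stream = np.empty(flat.size, dtype=np.uint8)
            x = x0
            for i in range(flat.size):
                x = plcm_step(x, p)
                stream[i] = int(x * 256) % 256
            return (flat ^ stream).reshape(img.shape)

    Because XOR with a fixed keystream is an involution, applying xor_preprocess twice with the same key recovers the original image, which is what makes this stage usable in both encryption and decryption.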

  6. PathoQC: Computationally Efficient Read Preprocessing and Quality Control for High-Throughput Sequencing Data Sets

    PubMed Central

    Hong, Changjin; Manimaran, Solaiappan; Johnson, William Evan

    2014-01-01

    Quality control and read preprocessing are critical steps in the analysis of data sets generated from high-throughput genomic screens. In the most extreme cases, improper preprocessing can negatively affect downstream analyses and may lead to incorrect biological conclusions. Here, we present PathoQC, a streamlined toolkit that seamlessly combines the benefits of several popular quality control software approaches for preprocessing next-generation sequencing data. PathoQC provides a variety of quality control options appropriate for most high-throughput sequencing applications. PathoQC is primarily developed as a module in the PathoScope software suite for metagenomic analysis. However, PathoQC is also available as an open-source Python module that can run as a stand-alone application or can be easily integrated into any bioinformatics workflow. PathoQC achieves high performance by supporting parallel computation and is an effective tool that removes technical sequencing artifacts and facilitates robust downstream analysis. The PathoQC software package is available at http://sourceforge.net/projects/PathoScope/. PMID:25983538
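
    PathoQC's own interface is not shown here, but the kind of quality trimming such preprocessors perform is easy to sketch generically. The helper below (a hypothetical name, not a PathoQC function) trims low-quality bases from the 3' end of a read given its Phred scores.

        def trim_3prime(seq, quals, threshold=20):
            """Drop trailing bases whose Phred quality is below threshold."""
            end = len(seq)
            while end > 0 and quals[end - 1] < threshold:
                end -= 1
            return seq[:end], quals[:end]

        read = "ACGTACGTTT"
        scores = [30, 32, 31, 30, 28, 25, 18, 12, 8, 5]
        print(trim_3prime(read, scores))  # ('ACGTAC', [30, 32, 31, 30, 28, 25])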

  7. Conservative Patch Algorithm and Mesh Sequencing for PAB3D

    NASA Technical Reports Server (NTRS)

    Pao, S. P.; Abdol-Hamid, K. S.

    2005-01-01

    A mesh-sequencing algorithm and a conservative patched-grid-interface algorithm (hereafter, patch algorithm) have been incorporated into the PAB3D code, which is a computer program that solves the Navier-Stokes equations for the simulation of subsonic, transonic, or supersonic flows surrounding an aircraft or other complex aerodynamic shapes. These algorithms are efficient, flexible, and have added tremendously to the capabilities of PAB3D. The mesh-sequencing algorithm makes it possible to perform preliminary computations using only a fraction of the grid cells (provided the original cell count is divisible by an integer) along any grid coordinate axis, independently of the other axes. The patch algorithm addresses another critical need in multi-block grid situations where the cell faces of adjacent grid blocks may not coincide, leading to errors in calculating fluxes of conserved physical quantities across interfaces between the blocks. The patch algorithm, based on the Stokes integral formulation of the applicable conservation laws, effectively matches each of the interfacial cells on one side of the block interface to the corresponding fractional cell-area pieces on the other side. This approach is comprehensive and unified such that all interface topology is automatically processed without user intervention. This algorithm is implemented in a preprocessing code that creates a cell-by-cell database that will maintain flux conservation at any level of full or reduced grid density as the user may choose by way of the mesh-sequencing algorithm. These two algorithms have enhanced the numerical accuracy of the code, reduced the time and effort for grid preprocessing, and provided users with the flexibility of performing computations at any desired full or reduced grid resolution to suit their specific computational requirements.
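
    A one-dimensional toy version of the fractional-area matching conveys why the transfer is conservative (the real algorithm matches 3-D cell faces via the Stokes integral formulation): each row of weights sums to one, so flux leaving one side is fully accounted for on the other.

        def overlap_weights(edges_a, edges_b):
            """weights[i][j]: fraction of face i on grid A covered by face j on
            grid B, for two 1-D partitions of the same span."""
            weights = []
            for a0, a1 in zip(edges_a[:-1], edges_a[1:]):
                row = []
                for b0, b1 in zip(edges_b[:-1], edges_b[1:]):
                    inter = max(0.0, min(a1, b1) - max(a0, b0))
                    row.append(inter / (a1 - a0))
                weights.append(row)
            return weights

        # Grid A has 2 faces and grid B has 3 over the same interval [0, 1]:
        print(overlap_weights([0.0, 0.5, 1.0], [0.0, 0.25, 0.6, 1.0]))
        # [[0.5, 0.5, 0.0], [0.0, 0.2, 0.8]] -- each row sums to 1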

  8. Differential analysis for high density tiling microarray data

    PubMed Central

    Ghosh, Srinka; Hirsch, Heather A; Sekinger, Edward A; Kapranov, Philipp; Struhl, Kevin; Gingeras, Thomas R

    2007-01-01

    Background High density oligonucleotide tiling arrays are an effective and powerful platform for conducting unbiased genome-wide studies. The ab initio probe selection method employed in tiling arrays is unbiased, and thus ensures consistent sampling across coding and non-coding regions of the genome. These arrays are being increasingly used to study the associated processes of transcription, transcription factor binding, chromatin structure and their association. Studies of differential expression and/or regulation provide critical insight into the mechanics of transcription and regulation that occurs during the developmental program of a cell. The time-course experiment, which comprises an in-vivo system and the proposed analyses, is used to determine if annotated and un-annotated portions of the genome manifest a coordinated differential response to the induced developmental program. Results We have proposed a novel approach, based on a piece-wise function, to analyze genome-wide differential response. This enables segmentation of the response based on protein-coding and non-coding regions; for genes, the methodology also partitions differential response with a 5' versus 3' versus intra-genic bias. Conclusion The algorithm, built upon the framework of Significance Analysis of Microarrays, uses a generalized logic to define regions/patterns of coordinated differential change. By not adhering to the gene-centric paradigm, discordant differential expression patterns between exons and introns have been identified at an FDR of less than 12 percent. A co-localization of differential binding between RNA Polymerase II and tetra-acetylated histone has been quantified at a p-value < 0.003; it is most significant at the 5' end of genes, at a p-value < 10^-13. The prototype R code has been made available as supplementary material [see Additional file 1]. PMID:17892592
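
    The approach builds on Significance Analysis of Microarrays (SAM), whose moderated statistic damps the influence of low-variance probes with a fudge factor s0. A minimal per-probe sketch follows; s0 is fixed here for illustration, whereas SAM estimates it from the data.

        import numpy as np

        def sam_d(group1, group2, s0=0.1):
            """SAM-style moderated statistic for one probe across two conditions."""
            n1, n2 = len(group1), len(group2)
            pooled = ((n1 - 1) * np.var(group1, ddof=1) +
                      (n2 - 1) * np.var(group2, ddof=1)) / (n1 + n2 - 2)
            s = np.sqrt(pooled * (1 / n1 + 1 / n2))
            return (np.mean(group1) - np.mean(group2)) / (s + s0)

        print(round(sam_d([8.1, 8.3, 8.0], [6.9, 7.2, 7.0]), 2))  # ~4.9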

  9. SU-E-J-261: The Importance of Appropriate Image Preprocessing to Augment the Information of Radiomics Image Features

    SciTech Connect

    Zhang, L; Fried, D; Fave, X; Mackin, D; Yang, J; Court, L

    2015-06-15

    Purpose: To investigate how different image preprocessing techniques, their parameters, and different boundary handling techniques can augment the information of features and improve the features' differentiating capability. Methods: Twenty-seven NSCLC patients with a solid tumor volume and no visually obvious necrotic regions in the simulation CT images were identified. Fourteen of these patients had a necrotic region visible in their pre-treatment PET images (necrosis group), and thirteen had no visible necrotic region in the pre-treatment PET images (non-necrosis group). We investigated how image preprocessing can impact the ability of radiomics image features extracted from the CT to differentiate between the two groups. It was expected that the histogram in the necrosis group would be more negatively skewed, and the uniformity in the necrosis group lower. Therefore, we analyzed two first-order features, skewness and uniformity, on the image inside the GTV in the intensity range [−20HU, 180HU] under combinations of several image preprocessing techniques: (1) applying an isotropic Gaussian or anisotropic diffusion smoothing filter over a range of parameters (Gaussian smoothing: size=11, sigma=0:0.1:2.3; anisotropic smoothing: iteration=4, kappa=0:10:110); (2) applying a boundary-adapted Laplacian filter; and (3) applying an adaptive upper threshold for the intensity range. A 2-tailed T-test was used to evaluate the differentiating capability of CT features for pre-treatment PET necrosis. Results: Without any preprocessing, no differences in either skewness or uniformity were observed between the two groups. After applying appropriate Gaussian filters (sigma>=1.3) or anisotropic filters (kappa>=60) with the adaptive upper threshold, skewness was significantly more negative in the necrosis group (p<0.05). By applying the boundary-adapted Laplacian filtering after the appropriate Gaussian filters (0.5<=sigma<=1.1) or anisotropic filters (20<=kappa<=50), the uniformity was
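
    The two first-order features under study are straightforward to compute once the preprocessing is fixed. The sketch below assumes SciPy, a boolean GTV mask, and illustrative sigma and bin-count choices; uniformity is taken as the sum of squared histogram-bin probabilities.

        import numpy as np
        from scipy import ndimage, stats

        def first_order_features(ct, gtv_mask, sigma=1.3, hu_range=(-20, 180), bins=64):
            """Skewness and uniformity of the masked region after Gaussian smoothing."""
            smoothed = ndimage.gaussian_filter(ct.astype(float), sigma=sigma)
            vals = smoothed[gtv_mask]
            vals = vals[(vals >= hu_range[0]) & (vals <= hu_range[1])]
            hist, _ = np.histogram(vals, bins=bins, range=hu_range)
            p = hist / hist.sum()
            return float(stats.skew(vals)), float(np.sum(p ** 2))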

  10. Development and Optimization of a Thrombin Sandwich Aptamer Microarray

    PubMed Central

    Meneghello, Anna; Sosic, Alice; Antognoli, Agnese; Cretaio, Erica; Gatto, Barbara

    2012-01-01

    A sandwich microarray employing two distinct aptamers for human thrombin has been optimized for the detection of subnanomolar concentrations of the protein. The aptamer microarray demonstrates high specificity for thrombin, proving that a two-site binding assay with the TBA1 aptamer as the capture layer and the TBA2 aptamer as the detection layer can ensure this specificity at times and under conditions compatible with standard routine analysis of biological samples. Aptamer microarray sensitivity was evaluated directly by fluorescence analysis employing Cy5-labeled TBA2, and indirectly by the use of TBA2-biotin followed by detection with fluorescent streptavidin. Sub-nanomolar LODs were reached in all cases, including in the presence of serum, demonstrating that the optimized aptamer microarray can identify thrombin by a low-cost, sensitive and specific method.

  11. Cell-Based Microarrays for In Vitro Toxicology

    NASA Astrophysics Data System (ADS)

    Wegener, Joachim

    2015-07-01

    DNA/RNA and protein microarrays have proven their outstanding bioanalytical performance throughout the past decades, given the unprecedented level of parallelization by which molecular recognition assays can be performed and analyzed. Cell microarrays (CMAs) make use of similar construction principles. They are applied to profile a given cell population with respect to the expression of specific molecular markers and also to measure functional cell responses to drugs and chemicals. This review focuses on the use of cell-based microarrays for assessing the cytotoxicity of drugs, toxins, or chemicals in general. It also summarizes CMA construction principles with respect to the cell types that are used for such microarrays, the readout parameters to assess toxicity, and the various formats that have been established and applied. The review ends with a critical comparison of CMAs and well-established microtiter plate (MTP) approaches.

  12. Cell-Based Microarrays for In Vitro Toxicology.

    PubMed

    Wegener, Joachim

    2015-01-01

    DNA/RNA and protein microarrays have proven their outstanding bioanalytical performance throughout the past decades, given the unprecedented level of parallelization by which molecular recognition assays can be performed and analyzed. Cell microarrays (CMAs) make use of similar construction principles. They are applied to profile a given cell population with respect to the expression of specific molecular markers and also to measure functional cell responses to drugs and chemicals. This review focuses on the use of cell-based microarrays for assessing the cytotoxicity of drugs, toxins, or chemicals in general. It also summarizes CMA construction principles with respect to the cell types that are used for such microarrays, the readout parameters to assess toxicity, and the various formats that have been established and applied. The review ends with a critical comparison of CMAs and well-established microtiter plate (MTP) approaches. PMID:26077916

  13. Applications in high-content functional protein microarrays.

    PubMed

    Moore, Cedric D; Ajala, Olutobi Z; Zhu, Heng

    2016-02-01

    Protein microarray technology provides a versatile platform for characterization of hundreds to thousands of proteins in a parallel and high-throughput manner. Over the last decade, applications of functional protein microarrays in particular have flourished in studying protein function at a systems level and have led to the construction of networks and pathways describing these functions. Relevant areas of research include the detection of various binding properties of proteins, the study of enzyme-substrate relationships, the analysis of host-microbe interactions, and profiling antibody specificity. In addition, discovery of novel biomarkers in autoimmune diseases and cancers is emerging as a major clinical application of functional protein microarrays. In this review, we will summarize the recent advances of functional protein microarrays in both basic and clinical applications. PMID:26599287

  14. Improving Microarray Sample Size Using Bootstrap Data Combination

    PubMed Central

    Phan, John H.; Moffitt, Richard A.; Barrett, Andrea B.; Wang, May D.

    2016-01-01

    Microarray technology has enabled us to simultaneously measure the expression of thousands of genes. Using this high-throughput technology, we can examine subtle genetic changes between biological samples and build predictive models for clinical applications. Although microarrays have dramatically increased the rate of data collection, sample size is still a major issue when selecting features. Previous methods show that combining multiple microarray datasets improves feature selection using simple methods such as fold change. We propose a wrapper-based gene selection technique that combines bootstrap estimated classification errors for individual genes across multiple datasets and reduces the contribution of datasets with high variance. We use the bootstrap because it is an unbiased estimator of classification error that is also effective for small sample data. Coupled with data combination across multiple datasets, we show that our meta-analytic approach improves the biological relevance of gene selection using prostate and renal cancer microarray data. PMID:19164001
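
    A rough sketch of the idea follows, assuming a toy midpoint-threshold classifier per gene and inverse-variance weighting across datasets; the paper's wrapper and error estimator are more elaborate than this.

        import numpy as np

        def gene_bootstrap_errors(x, y, n_boot=200, seed=0):
            """Bootstrap classification errors for one gene: train a midpoint
            threshold on each stratified resample, score on the full data."""
            rng = np.random.default_rng(seed)
            i0, i1 = np.where(y == 0)[0], np.where(y == 1)[0]
            errs = np.empty(n_boot)
            for b in range(n_boot):
                m0 = x[rng.choice(i0, i0.size)].mean()
                m1 = x[rng.choice(i1, i1.size)].mean()
                thr = (m0 + m1) / 2
                pred = (x > thr) if m1 > m0 else (x <= thr)
                errs[b] = np.mean(pred.astype(int) != y)
            return errs

        def combined_error(per_dataset_errors):
            # Down-weight datasets whose bootstrap errors vary widely, echoing
            # the reduced contribution of high-variance datasets.
            w = np.array([1.0 / (e.var() + 1e-9) for e in per_dataset_errors])
            m = np.array([e.mean() for e in per_dataset_errors])
            return float((w * m).sum() / w.sum())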

  15. Color sorting algorithm based on K-means clustering algorithm

    NASA Astrophysics Data System (ADS)

    Zhang, BaoFeng; Huang, Qian

    2009-11-01

    In the process of raisin production, a variety of color impurities arise that need to be removed effectively. An efficient new raisin color-sorting algorithm is presented here. First, threshold-based image processing was applied for image preprocessing, and the gray-scale distribution characteristic of the raisin image was determined. To obtain the chromatic aberration image and reduce disturbance, we performed frame subtraction, in which the background image data are subtracted from the target image data. Second, a Haar wavelet filter was used to obtain a smoothed image of the raisins. The characteristics of the raisin images were then calculated from their different colors and from mildew, spots and other external features, so as to fully reflect the quality differences between raisins of different types. After the processing above, the images were analyzed with the K-means clustering method, which achieves adaptive extraction of the statistical features; on this basis, the image data were divided into different categories, making the categories of abnormal colors distinct. Using this algorithm, raisins of abnormal colors and those with mottles were eliminated. The sorting rate was up to 98.6%, and the ratio of normal raisins to sorted grains was less than one eighth.
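
    The clustering step can be sketched compactly with scikit-learn; flagging the cluster whose centre lies farthest from the dominant colour is one plausible reading of the abnormal-colour rule, and the RGB features and value of k are assumptions.

        import numpy as np
        from sklearn.cluster import KMeans

        def flag_abnormal_cluster(pixels_rgb, k=3):
            """Cluster foreground pixel colours; return labels plus the index of
            the cluster most distant from the dominant (largest) cluster."""
            km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(pixels_rgb)
            sizes = np.bincount(km.labels_, minlength=k)
            dominant = km.cluster_centers_[np.argmax(sizes)]
            dist = np.linalg.norm(km.cluster_centers_ - dominant, axis=1)
            return km.labels_, int(np.argmax(dist))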

  16. Comparison of molecular mechanisms of rheumatoid arthritis and osteoarthritis using gene microarrays

    PubMed Central

    LI, HONGQIANG; HAO, ZHENYONG; ZHAO, LIQIANG; LIU, WEI; HAN, YANLONG; BAI, YUNXING; WANG, JIAN

    2016-01-01

    The present study aimed to compare the molecular mechanisms of rheumatoid arthritis (RA) and osteoarthritis (OA). The microarray dataset no. GSE29746 was downloaded from Gene Expression Omnibus. After data pre-processing, differential expression analysis between the RA group and the control, as well as between the OA group and the control, was performed using the LIMMA package in R, and differentially expressed transcripts (DETs) with |log2fold change (FC)|>1 and P<0.01 were identified. DETs screened from each disease group were then subjected to functional annotation using DAVID. Next, DETs from each group were used to construct individual interaction networks using the BIND database, followed by sub-network mining using clusterONE. Significant functions of nodes in each sub-network were also investigated. In total, 19 and 281 DETs were screened from the RA and OA groups, respectively, with only six common DETs. DETs from the RA and OA groups were enriched in 8 and 130 gene ontology (GO) terms, respectively, with four common GO terms, of which two were associated with phospholipase C (PLC) activity. In addition, DETs screened from the OA group were enriched in immune response-associated GO terms, and those screened from the RA group were largely associated with biological processes linked with the cell cycle and chromosomes. Genes involved in PLC activity and its regulation were indicated to be altered in RA as well as in OA. Alterations in the expression of cell cycle-associated genes were indicated to be linked with the occurrence of OA, while genes participating in the immune response were involved in the occurrence of RA. PMID:27082252
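
    A simplified stand-in for the screening step, using a per-probe Welch t-test in place of LIMMA's moderated statistics and assuming log2-scale expression (rows = probes, columns = samples):

        import numpy as np
        from scipy import stats

        def screen_dets(expr, groups, lfc_cut=1.0, p_cut=0.01):
            """Return indices of probes with |log2 FC| > lfc_cut and P < p_cut."""
            g1, g0 = expr[:, groups == 1], expr[:, groups == 0]
            lfc = g1.mean(axis=1) - g0.mean(axis=1)  # log2 FC on log2-scale data
            p = stats.ttest_ind(g1, g0, axis=1, equal_var=False).pvalue
            return np.where((np.abs(lfc) > lfc_cut) & (p < p_cut))[0]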

  17. Screening for key genes associated with atopic dermatitis with DNA microarrays.

    PubMed

    Zhang, Zhong-Kui; Yang, Yong; Bai, Shu-Rong; Zhang, Gui-Zhen; Liu, Tai-Hua; Zhou, Zhou; Wang, Chun-Mei; Tang, Li-Jun; Wang, Jun; He, Si-Xian

    2014-03-01

    The aim of the present study was to identify key genes associated with atopic dermatitis (AD) using microarray data and bioinformatic analyses. The dataset GSE6012, downloaded from the Gene Expression Omnibus (GEO) database, contains gene expression data from 10 AD skin samples and 10 healthy skin samples. Following data preprocessing, differentially expressed genes (DEGs) were identified using the limma package of the R project. Interaction networks were constructed comprising DEGs that showed a node degree of >3, >5 and >10, using the Osprey software. Functional enrichment and pathway enrichment analyses of the network comprising all DEGs and of the network comprising DEGs with a high node degree were performed with the DAVID and WebGestalt toolkits, respectively. A total of 337 DEGs were identified. The functional enrichment analysis revealed that the list of DEGs was significantly enriched for proteins related to epidermis development (P=2.95E-07), including loricrin (LOR), keratin 17 (KRT17), small proline-rich repeat proteins (SPRRs) and involucrin (IVL). The chemokine signaling pathway was the most significantly enriched pathway (P=0.0490978) in the network of all DEGs and in the network consisting of high-node-degree DEGs (>10), which comprised the genes coding for chemokine receptor 7 (CCR7), chemokine ligand 19 (CCL19), signal transducer and activator of transcription 1 (STAT1), and phosphoinositide-3-kinase regulatory subunit 1 (PIK3R1). In conclusion, the list of AD-associated proteins identified in this study, including LOR, KRT17, SPRRs, IVL, CCR7, CCL19, PIK3R1 and STAT1, may prove useful for the development of methods to treat AD. Among these proteins, PIK3R1 and KRT17 are novel and promising targets for AD therapy. PMID:24452877
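
    The node-degree filtering used to build such networks is simple to express with networkx (assumed here as the graph library; the threshold mirrors the >10 network above):

        import networkx as nx

        def high_degree_subnetwork(edges, min_degree=10):
            """Keep only genes whose node degree exceeds min_degree;
            edges is any iterable of (gene, gene) pairs."""
            g = nx.Graph(edges)
            keep = [n for n, d in g.degree() if d > min_degree]
            return g.subgraph(keep).copy()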

  18. An examination of the regulatory mechanism of Pxdn mutation-induced eye disorders using microarray analysis

    PubMed Central

    YANG, YANG; XING, YIQIAO; LIANG, CHAOQUN; HU, LIYA; XU, FEI; MEI, QI

    2016-01-01

    The present study aimed to identify biomarkers for peroxidasin (Pxdn) mutation-induced eye disorders and study the underlying mechanisms involved in this process. The microarray dataset GSE49704 was used, which encompasses 4 mouse samples from embryos with a Pxdn mutation and 4 samples from normal tissues. After data preprocessing, the differentially expressed genes (DEGs) between Pxdn-mutant and normal tissues were identified using the t-test in the limma package, followed by functional enrichment analysis. The protein-protein interaction (PPI) network was constructed based on the STRING database, and the transcriptional regulatory (TR) network was established using the GeneCodis database. Subsequently, the overlapping DEGs with high degrees in the two networks were identified, as well as the sub-network extracted from the TR network. In total, 121 (75 upregulated and 46 downregulated) DEGs were identified, and these DEGs play important roles in biological processes (BPs), including neuron development and differentiation. A PPI network containing 25 nodes such as actin, alpha 1, skeletal muscle (Acta1) and troponin C type 2 (fast) (Tnnc2), and a TR network including 120 nodes were built. By comparing the two networks, seven crucial overlapping genes were identified, including cyclin-dependent kinase inhibitor 1B (Cdkn1b), Acta1 and troponin T type 3 (Tnnt3). In the sub-network, Cdkn1b was predicted as the target of miRNAs such as mmu-miR-24 and of transcription factors (TFs) including forkhead box O4 (FOXO4) and activating enhancer binding protein 4 (AP4). Thus, we suggest that seven crucial genes, including Cdkn1b, Acta1 and Tnnt3, play important roles in the progression of eye disorders such as glaucoma. We suggest that Cdkn1b exerts its effects via the inhibition of proliferation and is mediated by mmu-miR-24 and targeted by the TFs FOXO4 and AP4. PMID:27121343

  19. Skull removal in MR images using a modified artificial bee colony optimization algorithm.

    PubMed

    Taherdangkoo, Mohammad

    2014-01-01

    Removal of the skull from brain Magnetic Resonance (MR) images is an important preprocessing step required for other image analysis techniques such as brain tissue segmentation. In this paper, we propose a new algorithm based on the Artificial Bee Colony (ABC) optimization algorithm to remove the skull region from brain MR images. We modify the ABC algorithm using a different strategy for initializing the coordinates of scout bees and their direction of search. Moreover, we impose an additional constraint on the ABC algorithm to avoid the creation of discontinuous regions. We found that our algorithm successfully removed all bony skull from a sample of de-identified MR brain images acquired from different model scanners. Compared with previously introduced, well-known optimization algorithms such as Particle Swarm Optimization (PSO) and Ant Colony Optimization (ACO), the proposed algorithm demonstrates superior results and computational performance, suggesting its potential for clinical applications. PMID:25059256
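
    The paper's specific modifications (scout initialization, search direction, and the continuity constraint) are not reproduced here, but the employed/onlooker/scout cycle of a standard ABC optimizer can be sketched as follows; the toy one-dimensional objective stands in for a real skull/brain separation criterion.

        import numpy as np

        def abc_minimize(f, bounds, n_bees=20, limit=10, iters=100, seed=0):
            """Minimal standard ABC: employed and onlooker bees perturb food
            sources; scouts re-seed sources stuck for more than `limit` trials."""
            rng = np.random.default_rng(seed)
            lo, hi = np.array(bounds, dtype=float).T
            dim = lo.size
            x = rng.uniform(lo, hi, (n_bees, dim))
            fit = np.array([f(s) for s in x])
            trials = np.zeros(n_bees, dtype=int)
            for _ in range(iters):
                for phase in range(2):
                    if phase == 0:                        # employed bees: one pass each
                        order = np.arange(n_bees)
                    else:                                 # onlookers favour good sources
                        prob = fit.max() - fit + 1e-12
                        order = rng.choice(n_bees, n_bees, p=prob / prob.sum())
                    for i in order:
                        j = rng.integers(dim)             # coordinate to perturb
                        k = (i + 1 + rng.integers(n_bees - 1)) % n_bees  # partner != i
                        cand = x[i].copy()
                        cand[j] += rng.uniform(-1, 1) * (x[i, j] - x[k, j])
                        cand = np.clip(cand, lo, hi)
                        fc = f(cand)
                        if fc < fit[i]:
                            x[i], fit[i], trials[i] = cand, fc, 0
                        else:
                            trials[i] += 1
                worn = trials > limit                     # scout phase: re-seed
                x[worn] = rng.uniform(lo, hi, (int(worn.sum()), dim))
                fit[worn] = [f(s) for s in x[worn]]
                trials[worn] = 0
            best = int(np.argmin(fit))
            return x[best], float(fit[best])

        # Toy use: recover the value minimizing a 1-D objective on [0, 255].
        print(abc_minimize(lambda t: (t[0] - 128.0) ** 2, [(0.0, 255.0)]))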

  20. Emerging Use of Gene Expression Microarrays in Plant Physiology

    DOE PAGESBeta

    Wullschleger, Stan D.; Difazio, Stephen P.

    2003-01-01

    Microarrays have become an important technology for the global analysis of gene expression in humans, animals, plants, and microbes. Implemented in the context of a well-designed experiment, cDNA and oligonucleotide arrays can provide high-throughput, simultaneous analysis of transcript abundance for hundreds, if not thousands, of genes. However, despite widespread acceptance, the use of microarrays as a tool to better understand processes of interest to the plant physiologist is still being explored. To help illustrate current uses of microarrays in the plant sciences, several case studies that we believe demonstrate the emerging application of gene expression arrays in plant physiology were selected from among the many posters and presentations at the 2003 Plant and Animal Genome XI Conference. Based on this survey, microarrays are being used to assess gene expression in plants exposed to the experimental manipulation of air temperature, soil water content and aluminium concentration in the root zone. Analysis often includes characterizing transcript profiles for multiple post-treatment sampling periods and categorizing genes with common patterns of response using hierarchical clustering techniques. In addition, microarrays are also providing insights into developmental changes in gene expression associated with fibre and root elongation in cotton and maize, respectively. Technical and analytical limitations of microarrays are discussed and projects attempting to advance areas of microarray design and data analysis are highlighted. Finally, although much work remains, we conclude that microarrays are a valuable tool for the plant physiologist interested in the characterization and identification of individual genes and gene families with potential application in the fields of agriculture, horticulture and forestry.
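
    For the hierarchical clustering of response patterns mentioned above, a minimal sketch assuming SciPy and a genes-by-timepoints expression matrix; rows are z-scored so clusters reflect the shape of the response rather than its magnitude.

        import numpy as np
        from scipy.cluster.hierarchy import linkage, fcluster

        def cluster_response_patterns(profiles, n_clusters=4):
            """Assign genes to n_clusters groups of similar time-course shape."""
            mu = profiles.mean(axis=1, keepdims=True)
            sd = profiles.std(axis=1, keepdims=True) + 1e-12
            z = (profiles - mu) / sd
            tree = linkage(z, method="average", metric="correlation")
            return fcluster(tree, t=n_clusters, criterion="maxclust")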