Sample records for large-scale microarray study

  1. Microarray Data Processing Techniques for Genome-Scale Network Inference from Large Public Repositories.

    PubMed

    Chockalingam, Sriram; Aluru, Maneesha; Aluru, Srinivas

    2016-09-19

    Pre-processing of microarray data is a well-studied problem. Furthermore, all popular platforms come with their own recommended best practices for differential analysis of genes. However, for genome-scale network inference using microarray data collected from large public repositories, these methods filter out a considerable number of genes. This is primarily due to the effects of aggregating a diverse array of experiments with different technical and biological scenarios. Here we introduce a pre-processing pipeline suitable for inferring genome-scale gene networks from large microarray datasets. We show that partitioning of the available microarray datasets according to biological relevance into tissue- and process-specific categories significantly extends the limits of downstream network construction. We demonstrate the effectiveness of our pre-processing pipeline by inferring genome-scale networks for the model plant Arabidopsis thaliana using two different construction methods and a collection of 11,760 Affymetrix ATH1 microarray chips. Our pre-processing pipeline and the datasets used in this paper are made available at http://alurulab.cc.gatech.edu/microarray-pp.

  2. Transfection microarray and the applications.

    PubMed

    Miyake, Masato; Yoshikawa, Tomohiro; Fujita, Satoshi; Miyake, Jun

    2009-05-01

    Microarray transfection has been extensively studied for high-throughput functional analysis of mammalian cells. However, control of efficiency and reproducibility are the critical issues for practical use. By using solid-phase transfection accelerators and nano-scaffold, we provide a highly efficient and reproducible microarray-transfection device, "transfection microarray". The device would be applied to the limited number of available primary cells and stem cells not only for large-scale functional analysis but also reporter-based time-lapse cellular event analysis.

  3. Gene Expression Browser: Large-Scale and Cross-Experiment Microarray Data Management, Search & Visualization

    USDA-ARS?s Scientific Manuscript database

    The amount of microarray gene expression data in public repositories has been increasing exponentially for the last couple of decades. High-throughput microarray data integration and analysis has become a critical step in exploring the large amount of expression data for biological discovery. Howeve...

  4. Cloud-Scale Genomic Signals Processing for Robust Large-Scale Cancer Genomic Microarray Data Analysis.

    PubMed

    Harvey, Benjamin Simeon; Ji, Soo-Yeon

    2017-01-01

    As microarray data available to scientists continues to increase in size and complexity, it has become overwhelmingly important to find multiple ways to bring forth oncological inference to the bioinformatics community through the analysis of large-scale cancer genomic (LSCG) DNA and mRNA microarray data that is useful to scientists. Though there have been many attempts to elucidate the issue of bringing forth biological interpretation by means of wavelet preprocessing and classification, there has not been a research effort that focuses on a cloud-scale distributed parallel (CSDP) separable 1-D wavelet decomposition technique for denoising through differential expression thresholding and classification of LSCG microarray data. This research presents a novel methodology that utilizes a CSDP separable 1-D method for wavelet-based transformation in order to initialize a threshold which will retain significantly expressed genes through the denoising process for robust classification of cancer patients. Additionally, the overall study was implemented and encompassed within CSDP environment. The utilization of cloud computing and wavelet-based thresholding for denoising was used for the classification of samples within the Global Cancer Map, Cancer Cell Line Encyclopedia, and The Cancer Genome Atlas. The results proved that separable 1-D parallel distributed wavelet denoising in the cloud and differential expression thresholding increased the computational performance and enabled the generation of higher quality LSCG microarray datasets, which led to more accurate classification results.

  5. Development and application of a microarray meter tool to optimize microarray experiments

    PubMed Central

    Rouse, Richard JD; Field, Katrine; Lapira, Jennifer; Lee, Allen; Wick, Ivan; Eckhardt, Colleen; Bhasker, C Ramana; Soverchia, Laura; Hardiman, Gary

    2008-01-01

    Background Successful microarray experimentation requires a complex interplay between the slide chemistry, the printing pins, the nucleic acid probes and targets, and the hybridization milieu. Optimization of these parameters and a careful evaluation of emerging slide chemistries are a prerequisite to any large scale array fabrication effort. We have developed a 'microarray meter' tool which assesses the inherent variations associated with microarray measurement prior to embarking on large scale projects. Findings The microarray meter consists of nucleic acid targets (reference and dynamic range control) and probe components. Different plate designs containing identical probe material were formulated to accommodate different robotic and pin designs. We examined the variability in probe quality and quantity (as judged by the amount of DNA printed and remaining post-hybridization) using three robots equipped with capillary printing pins. Discussion The generation of microarray data with minimal variation requires consistent quality control of the (DNA microarray) manufacturing and experimental processes. Spot reproducibility is a measure primarily of the variations associated with printing. The microarray meter assesses array quality by measuring the DNA content for every feature. It provides a post-hybridization analysis of array quality by scoring probe performance using three metrics, a) a measure of variability in the signal intensities, b) a measure of the signal dynamic range and c) a measure of variability of the spot morphologies. PMID:18710498

  6. Analysis of host response to bacterial infection using error model based gene expression microarray experiments

    PubMed Central

    Stekel, Dov J.; Sarti, Donatella; Trevino, Victor; Zhang, Lihong; Salmon, Mike; Buckley, Chris D.; Stevens, Mark; Pallen, Mark J.; Penn, Charles; Falciani, Francesco

    2005-01-01

    A key step in the analysis of microarray data is the selection of genes that are differentially expressed. Ideally, such experiments should be properly replicated in order to infer both technical and biological variability, and the data should be subjected to rigorous hypothesis tests to identify the differentially expressed genes. However, in microarray experiments involving the analysis of very large numbers of biological samples, replication is not always practical. Therefore, there is a need for a method to select differentially expressed genes in a rational way from insufficiently replicated data. In this paper, we describe a simple method that uses bootstrapping to generate an error model from a replicated pilot study that can be used to identify differentially expressed genes in subsequent large-scale studies on the same platform, but in which there may be no replicated arrays. The method builds a stratified error model that includes array-to-array variability, feature-to-feature variability and the dependence of error on signal intensity. We apply this model to the characterization of the host response in a model of bacterial infection of human intestinal epithelial cells. We demonstrate the effectiveness of error model based microarray experiments and propose this as a general strategy for a microarray-based screening of large collections of biological samples. PMID:15800204

  7. Determination of Minimum Training Sample Size for Microarray-Based Cancer Outcome Prediction–An Empirical Assessment

    PubMed Central

    Cheng, Ningtao; Wu, Leihong; Cheng, Yiyu

    2013-01-01

    The promise of microarray technology in providing prediction classifiers for cancer outcome estimation has been confirmed by a number of demonstrable successes. However, the reliability of prediction results relies heavily on the accuracy of statistical parameters involved in classifiers. It cannot be reliably estimated with only a small number of training samples. Therefore, it is of vital importance to determine the minimum number of training samples and to ensure the clinical value of microarrays in cancer outcome prediction. We evaluated the impact of training sample size on model performance extensively based on 3 large-scale cancer microarray datasets provided by the second phase of MicroArray Quality Control project (MAQC-II). An SSNR-based (scale of signal-to-noise ratio) protocol was proposed in this study for minimum training sample size determination. External validation results based on another 3 cancer datasets confirmed that the SSNR-based approach could not only determine the minimum number of training samples efficiently, but also provide a valuable strategy for estimating the underlying performance of classifiers in advance. Once translated into clinical routine applications, the SSNR-based protocol would provide great convenience in microarray-based cancer outcome prediction in improving classifier reliability. PMID:23861920

  8. An evaluation of two-channel ChIP-on-chip and DNA methylation microarray normalization strategies

    PubMed Central

    2012-01-01

    Background The combination of chromatin immunoprecipitation with two-channel microarray technology enables genome-wide mapping of binding sites of DNA-interacting proteins (ChIP-on-chip) or sites with methylated CpG di-nucleotides (DNA methylation microarray). These powerful tools are the gateway to understanding gene transcription regulation. Since the goals of such studies, the sample preparation procedures, the microarray content and study design are all different from transcriptomics microarrays, the data pre-processing strategies traditionally applied to transcriptomics microarrays may not be appropriate. Particularly, the main challenge of the normalization of "regulation microarrays" is (i) to make the data of individual microarrays quantitatively comparable and (ii) to keep the signals of the enriched probes, representing DNA sequences from the precipitate, as distinguishable as possible from the signals of the un-enriched probes, representing DNA sequences largely absent from the precipitate. Results We compare several widely used normalization approaches (VSN, LOWESS, quantile, T-quantile, Tukey's biweight scaling, Peng's method) applied to a selection of regulation microarray datasets, ranging from DNA methylation to transcription factor binding and histone modification studies. Through comparison of the data distributions of control probes and gene promoter probes before and after normalization, and assessment of the power to identify known enriched genomic regions after normalization, we demonstrate that there are clear differences in performance between normalization procedures. Conclusion T-quantile normalization applied separately on the channels and Tukey's biweight scaling outperform other methods in terms of the conservation of enriched and un-enriched signal separation, as well as in identification of genomic regions known to be enriched. T-quantile normalization is preferable as it additionally improves comparability between microarrays. In contrast, popular normalization approaches like quantile, LOWESS, Peng's method and VSN normalization alter the data distributions of regulation microarrays to such an extent that using these approaches will impact the reliability of the downstream analysis substantially. PMID:22276688

  9. APPLICATION OF CDNA MICROARRAY TECHNOLOGY TO IN VITRO TOXICOLOGY AND THE SELECTION OF GENES FOR A REAL TIME RT-PCR-BASED SCREEN FOR OXIDATIVE STRESS IN HEP-G2 CELLS

    EPA Science Inventory

    Large-scale analysis of gene expression using cDNA microarrays promises the
    rapid detection of the mode of toxicity for drugs and other chemicals. cDNA
    microarrays were used to examine chemically-induced alterations of gene
    expression in HepG2 cells exposed to oxidative ...

  10. Genome-wide polymorphisms and development of a microarray platform to detect genetic variations in Plasmodium yoelii.

    PubMed

    Nair, Sethu C; Pattaradilokrat, Sittiporn; Zilversmit, Martine M; Dommer, Jennifer; Nagarajan, Vijayaraj; Stephens, Melissa T; Xiao, Wenming; Tan, John C; Su, Xin-Zhuan

    2014-01-01

    The rodent malaria parasite Plasmodium yoelii is an important model for studying malaria immunity and pathogenesis. One approach for studying malaria disease phenotypes is genetic mapping, which requires typing a large number of genetic markers from multiple parasite strains and/or progeny from genetic crosses. Hundreds of microsatellite (MS) markers have been developed to genotype the P. yoelii genome; however, typing a large number of MS markers can be labor intensive, time consuming, and expensive. Thus, development of high-throughput genotyping tools such as DNA microarrays that enable rapid and accurate large-scale genotyping of the malaria parasite will be highly desirable. In this study, we sequenced the genomes of two P. yoelii strains (33X and N67) and obtained a large number of single nucleotide polymorphisms (SNPs). Based on the SNPs obtained, we designed sets of oligonucleotide probes to develop a microarray that could interrogate ∼11,000 SNPs across the 14 chromosomes of the parasite in a single hybridization. Results from hybridizations of DNA samples of five P. yoelii strains or cloned lines (17XNL, YM, 33X, N67 and N67C) and two progeny from a genetic cross (N67×17XNL) to the microarray showed that the array had a high call rate (∼97%) and accuracy (99.9%) in calling SNPs, providing a simple and reliable tool for typing the P. yoelii genome. Our data show that the P. yoelii genome is highly polymorphic, although isogenic pairs of parasites were also detected. Additionally, our results indicate that the 33X parasite is a progeny of 17XNL (or YM) and an unknown parasite. The highly accurate and reliable microarray developed in this study will greatly facilitate our ability to study the genetic basis of important traits and the disease it causes. Published by Elsevier B.V.

  11. AccuTyping: new algorithms for automated analysis of data from high-throughput genotyping with oligonucleotide microarrays

    PubMed Central

    Hu, Guohong; Wang, Hui-Yun; Greenawalt, Danielle M.; Azaro, Marco A.; Luo, Minjie; Tereshchenko, Irina V.; Cui, Xiangfeng; Yang, Qifeng; Gao, Richeng; Shen, Li; Li, Honghua

    2006-01-01

    Microarray-based analysis of single nucleotide polymorphisms (SNPs) has many applications in large-scale genetic studies. To minimize the influence of experimental variation, microarray data usually need to be processed in different aspects including background subtraction, normalization and low-signal filtering before genotype determination. Although many algorithms are sophisticated for these purposes, biases are still present. In the present paper, new algorithms for SNP microarray data analysis and the software, AccuTyping, developed based on these algorithms are described. The algorithms take advantage of a large number of SNPs included in each assay, and the fact that the top and bottom 20% of SNPs can be safely treated as homozygous after sorting based on their ratios between the signal intensities. These SNPs are then used as controls for color channel normalization and background subtraction. Genotype calls are made based on the logarithms of signal intensity ratios using two cutoff values, which were determined after training the program with a dataset of ∼160 000 genotypes and validated by non-microarray methods. AccuTyping was used to determine >300 000 genotypes of DNA and sperm samples. The accuracy was shown to be >99%. AccuTyping can be downloaded from . PMID:16982644

  12. Principles of gene microarray data analysis.

    PubMed

    Mocellin, Simone; Rossi, Carlo Riccardo

    2007-01-01

    The development of several gene expression profiling methods, such as comparative genomic hybridization (CGH), differential display, serial analysis of gene expression (SAGE), and gene microarray, together with the sequencing of the human genome, has provided an opportunity to monitor and investigate the complex cascade of molecular events leading to tumor development and progression. The availability of such large amounts of information has shifted the attention of scientists towards a nonreductionist approach to biological phenomena. High throughput technologies can be used to follow changing patterns of gene expression over time. Among them, gene microarray has become prominent because it is easier to use, does not require large-scale DNA sequencing, and allows for the parallel quantification of thousands of genes from multiple samples. Gene microarray technology is rapidly spreading worldwide and has the potential to drastically change the therapeutic approach to patients affected with tumor. Therefore, it is of paramount importance for both researchers and clinicians to know the principles underlying the analysis of the huge amount of data generated with microarray technology.

  13. MALDI-TOF mass spectrometry for quantitative gene expression analysis of acid responses in Staphylococcus aureus.

    PubMed

    Rode, Tone Mari; Berget, Ingunn; Langsrud, Solveig; Møretrø, Trond; Holck, Askild

    2009-07-01

    Microorganisms are constantly exposed to new and altered growth conditions, and respond by changing gene expression patterns. Several methods for studying gene expression exist. During the last decade, the analysis of microarrays has been one of the most common approaches applied for large scale gene expression studies. A relatively new method for gene expression analysis is MassARRAY, which combines real competitive-PCR and MALDI-TOF (matrix-assisted laser desorption/ionization time-of-flight) mass spectrometry. In contrast to microarray methods, MassARRAY technology is suitable for analysing a larger number of samples, though for a smaller set of genes. In this study we compare the results from MassARRAY with microarrays on gene expression responses of Staphylococcus aureus exposed to acid stress at pH 4.5. RNA isolated from the same stress experiments was analysed using both the MassARRAY and the microarray methods. The MassARRAY and microarray methods showed good correlation. Both MassARRAY and microarray estimated somewhat lower fold changes compared with quantitative real-time PCR (qRT-PCR). The results confirmed the up-regulation of the urease genes in acidic environments, and also indicated the importance of metal ion regulation. This study shows that the MassARRAY technology is suitable for gene expression analysis in prokaryotes, and has advantages when a set of genes is being analysed for an organism exposed to many different environmental conditions.

  14. A universal reference sample derived from clone vector for improved detection of differential gene expression

    PubMed Central

    Khan, Rishi L; Gonye, Gregory E; Gao, Guang; Schwaber, James S

    2006-01-01

    Background Using microarrays by co-hybridizing two samples labeled with different dyes enables differential gene expression measurements and comparisons across slides while controlling for within-slide variability. Typically one dye produces weaker signal intensities than the other often causing signals to be undetectable. In addition, undetectable spots represent a large problem for two-color microarray designs and most arrays contain at least 40% undetectable spots even when labeled with reference samples such as Stratagene's Universal Reference RNAs™. Results We introduce a novel universal reference sample that produces strong signal for all spots on the array, increasing the average fraction of detectable spots to 97%. Maximizing detectable spots on the reference image channel also decreases the variability of microarray data allowing for reliable detection of smaller differential gene expression changes. The reference sample is derived from sequence contained in the parental EST clone vector pT7T3D-Pac and is called vector RNA (vRNA). We show that vRNA can also be used for quality control of microarray printing and PCR product quality, detection of hybridization anomalies, and simplification of spot finding and segmentation tasks. This reference sample can be made inexpensively in large quantities as a renewable resource that is consistent across experiments. Conclusion Results of this study show that vRNA provides a useful universal reference that yields high signal for almost all spots on a microarray, reduces variation and allows for comparisons between experiments and laboratories. Further, it can be used for quality control of microarray printing and PCR product quality, detection of hybridization anomalies, and simplification of spot finding and segmentation tasks. This type of reference allows for detection of small changes in differential expression while reference designs in general allow for large-scale multivariate experimental designs. vRNA in combination with reference designs enable systems biology microarray experiments of small physiologically relevant changes. PMID:16677381

  15. Signal amplification by rolling circle amplification on DNA microarrays

    PubMed Central

    Nallur, Girish; Luo, Chenghua; Fang, Linhua; Cooley, Stephanie; Dave, Varshal; Lambert, Jeremy; Kukanskis, Kari; Kingsmore, Stephen; Lasken, Roger; Schweitzer, Barry

    2001-01-01

    While microarrays hold considerable promise in large-scale biology on account of their massively parallel analytical nature, there is a need for compatible signal amplification procedures to increase sensitivity without loss of multiplexing. Rolling circle amplification (RCA) is a molecular amplification method with the unique property of product localization. This report describes the application of RCA signal amplification for multiplexed, direct detection and quantitation of nucleic acid targets on planar glass and gel-coated microarrays. As few as 150 molecules bound to the surface of microarrays can be detected using RCA. Because of the linear kinetics of RCA, nucleic acid target molecules may be measured with a dynamic range of four orders of magnitude. Consequently, RCA is a promising technology for the direct measurement of nucleic acids on microarrays without the need for a potentially biasing preamplification step. PMID:11726701

  16. Fabrication of a 3D micro/nano dual-scale carbon array and its demonstration as the microelectrodes for supercapacitors

    NASA Astrophysics Data System (ADS)

    Jiang, Shulan; Shi, Tielin; Gao, Yang; Long, Hu; Xi, Shuang; Tang, Zirong

    2014-04-01

    An easily accessible method is proposed for the fabrication of a 3D micro/nano dual-scale carbon array with a large surface area. The process mainly consists of three critical steps. Firstly, a hemispherical photoresist micro-array was obtained by the cost-effective nanoimprint lithography process. Then the micro-array was transformed into hierarchical structures with longitudinal nanowires on the microstructure surface by oxygen plasma etching. Finally, the micro/nano dual-scale carbon array was fabricated by carbonizing these hierarchical photoresist structures. It has also been demonstrated that the micro/nano dual-scale carbon array can be used as the microelectrodes for supercapacitors by the electrodeposition of a manganese dioxide (MnO2) film onto the hierarchical carbon structures with greatly enhanced electrochemical performance. The specific gravimetric capacitance of the deposited micro/nano dual-scale microelectrodes is estimated to be 337 F g-1 at the scan rate of 5 mV s-1. This proposed approach of fabricating a micro/nano dual-scale carbon array provides a facile way in large-scale microstructures’ manufacturing for a wide variety of applications, including sensors and on-chip energy storage devices.

  17. Parallel human genome analysis: microarray-based expression monitoring of 1000 genes.

    PubMed Central

    Schena, M; Shalon, D; Heller, R; Chai, A; Brown, P O; Davis, R W

    1996-01-01

    Microarrays containing 1046 human cDNAs of unknown sequence were printed on glass with high-speed robotics. These 1.0-cm2 DNA "chips" were used to quantitatively monitor differential expression of the cognate human genes using a highly sensitive two-color hybridization assay. Array elements that displayed differential expression patterns under given experimental conditions were characterized by sequencing. The identification of known and novel heat shock and phorbol ester-regulated genes in human T cells demonstrates the sensitivity of the assay. Parallel gene analysis with microarrays provides a rapid and efficient method for large-scale human gene discovery. Images Fig. 1 Fig. 2 Fig. 3 PMID:8855227

  18. GeneXplorer: an interactive web application for microarray data visualization and analysis.

    PubMed

    Rees, Christian A; Demeter, Janos; Matese, John C; Botstein, David; Sherlock, Gavin

    2004-10-01

    When publishing large-scale microarray datasets, it is of great value to create supplemental websites where either the full data, or selected subsets corresponding to figures within the paper, can be browsed. We set out to create a CGI application containing many of the features of some of the existing standalone software for the visualization of clustered microarray data. We present GeneXplorer, a web application for interactive microarray data visualization and analysis in a web environment. GeneXplorer allows users to browse a microarray dataset in an intuitive fashion. It provides simple access to microarray data over the Internet and uses only HTML and JavaScript to display graphic and annotation information. It provides radar and zoom views of the data, allows display of the nearest neighbors to a gene expression vector based on their Pearson correlations and provides the ability to search gene annotation fields. The software is released under the permissive MIT Open Source license, and the complete documentation and the entire source code are freely available for download from CPAN http://search.cpan.org/dist/Microarray-GeneXplorer/.

  19. Finding Groups in Gene Expression Data

    PubMed Central

    2005-01-01

    The vast potential of the genomic insight offered by microarray technologies has led to their widespread use since they were introduced a decade ago. Application areas include gene function discovery, disease diagnosis, and inferring regulatory networks. Microarray experiments enable large-scale, high-throughput investigations of gene activity and have thus provided the data analyst with a distinctive, high-dimensional field of study. Many questions in this field relate to finding subgroups of data profiles which are very similar. A popular type of exploratory tool for finding subgroups is cluster analysis, and many different flavors of algorithms have been used and indeed tailored for microarray data. Cluster analysis, however, implies a partitioning of the entire data set, and this does not always match the objective. Sometimes pattern discovery or bump hunting tools are more appropriate. This paper reviews these various tools for finding interesting subgroups. PMID:16046827

  20. ExprAlign - the identification of ESTs in non-model species by alignment of cDNA microarray expression profiles

    PubMed Central

    2009-01-01

    Background Sequence identification of ESTs from non-model species offers distinct challenges particularly when these species have duplicated genomes and when they are phylogenetically distant from sequenced model organisms. For the common carp, an environmental model of aquacultural interest, large numbers of ESTs remained unidentified using BLAST sequence alignment. We have used the expression profiles from large-scale microarray experiments to suggest gene identities. Results Expression profiles from ~700 cDNA microarrays describing responses of 7 major tissues to multiple environmental stressors were used to define a co-expression landscape. This was based on the Pearsons correlation coefficient relating each gene with all other genes, from which a network description provided clusters of highly correlated genes as 'mountains'. We show that these contain genes with known identities and genes with unknown identities, and that the correlation constitutes evidence of identity in the latter. This procedure has suggested identities to 522 of 2701 unknown carp ESTs sequences. We also discriminate several common carp genes and gene isoforms that were not discriminated by BLAST sequence alignment alone. Precision in identification was substantially improved by use of data from multiple tissues and treatments. Conclusion The detailed analysis of co-expression landscapes is a sensitive technique for suggesting an identity for the large number of BLAST unidentified cDNAs generated in EST projects. It is capable of detecting even subtle changes in expression profiles, and thereby of distinguishing genes with a common BLAST identity into different identities. It benefits from the use of multiple treatments or contrasts, and from the large-scale microarray data. PMID:19939286

  1. Systematic validation and atomic force microscopy of non-covalent short oligonucleotide barcode microarrays.

    PubMed

    Cook, Michael A; Chan, Chi-Kin; Jorgensen, Paul; Ketela, Troy; So, Daniel; Tyers, Mike; Ho, Chi-Yip

    2008-02-06

    Molecular barcode arrays provide a powerful means to analyze cellular phenotypes in parallel through detection of short (20-60 base) unique sequence tags, or "barcodes", associated with each strain or clone in a collection. However, costs of current methods for microarray construction, whether by in situ oligonucleotide synthesis or ex situ coupling of modified oligonucleotides to the slide surface are often prohibitive to large-scale analyses. Here we demonstrate that unmodified 20mer oligonucleotide probes printed on conventional surfaces show comparable hybridization signals to covalently linked 5'-amino-modified probes. As a test case, we undertook systematic cell size analysis of the budding yeast Saccharomyces cerevisiae genome-wide deletion collection by size separation of the deletion pool followed by determination of strain abundance in size fractions by barcode arrays. We demonstrate that the properties of a 13K unique feature spotted 20 mer oligonucleotide barcode microarray compare favorably with an analogous covalently-linked oligonucleotide array. Further, cell size profiles obtained with the size selection/barcode array approach recapitulate previous cell size measurements of individual deletion strains. Finally, through atomic force microscopy (AFM), we characterize the mechanism of hybridization to unmodified barcode probes on the slide surface. These studies push the lower limit of probe size in genome-scale unmodified oligonucleotide microarray construction and demonstrate a versatile, cost-effective and reliable method for molecular barcode analysis.

  2. An integrated bioinformatics approach to improve two-color microarray quality-control: impact on biological conclusions.

    PubMed

    van Haaften, Rachel I M; Luceri, Cristina; van Erk, Arie; Evelo, Chris T A

    2009-06-01

    Omics technology used for large-scale measurements of gene expression is rapidly evolving. This work pointed out the need of an extensive bioinformatics analyses for array quality assessment before and after gene expression clustering and pathway analysis. A study focused on the effect of red wine polyphenols on rat colon mucosa was used to test the impact of quality control and normalisation steps on the biological conclusions. The integration of data visualization, pathway analysis and clustering revealed an artifact problem that was solved with an adapted normalisation. We propose a possible point to point standard analysis procedure, based on a combination of clustering and data visualization for the analysis of microarray data.

  3. A proposed metric for assessing the measurement quality of individual microarrays

    PubMed Central

    Kim, Kyoungmi; Page, Grier P; Beasley, T Mark; Barnes, Stephen; Scheirer, Katherine E; Allison, David B

    2006-01-01

    Background High-density microarray technology is increasingly applied to study gene expression levels on a large scale. Microarray experiments rely on several critical steps that may introduce error and uncertainty in analyses. These steps include mRNA sample extraction, amplification and labeling, hybridization, and scanning. In some cases this may be manifested as systematic spatial variation on the surface of microarray in which expression measurements within an individual array may vary as a function of geographic position on the array surface. Results We hypothesized that an index of the degree of spatiality of gene expression measurements associated with their physical geographic locations on an array could indicate the summary of the physical reliability of the microarray. We introduced a novel way to formulate this index using a statistical analysis tool. Our approach regressed gene expression intensity measurements on a polynomial response surface of the microarray's Cartesian coordinates. We demonstrated this method using a fixed model and presented results from real and simulated datasets. Conclusion We demonstrated the potential of such a quantitative metric for assessing the reliability of individual arrays. Moreover, we showed that this procedure can be incorporated into laboratory practice as a means to set quality control specifications and as a tool to determine whether an array has sufficient quality to be retained in terms of spatial correlation of gene expression measurements. PMID:16430768

  4. MAPPI-DAT: data management and analysis for protein-protein interaction data from the high-throughput MAPPIT cell microarray platform.

    PubMed

    Gupta, Surya; De Puysseleyr, Veronic; Van der Heyden, José; Maddelein, Davy; Lemmens, Irma; Lievens, Sam; Degroeve, Sven; Tavernier, Jan; Martens, Lennart

    2017-05-01

    Protein-protein interaction (PPI) studies have dramatically expanded our knowledge about cellular behaviour and development in different conditions. A multitude of high-throughput PPI techniques have been developed to achieve proteome-scale coverage for PPI studies, including the microarray based Mammalian Protein-Protein Interaction Trap (MAPPIT) system. Because such high-throughput techniques typically report thousands of interactions, managing and analysing the large amounts of acquired data is a challenge. We have therefore built the MAPPIT cell microArray Protein Protein Interaction-Data management & Analysis Tool (MAPPI-DAT) as an automated data management and analysis tool for MAPPIT cell microarray experiments. MAPPI-DAT stores the experimental data and metadata in a systematic and structured way, automates data analysis and interpretation, and enables the meta-analysis of MAPPIT cell microarray data across all stored experiments. MAPPI-DAT is developed in Python, using R for data analysis and MySQL as data management system. MAPPI-DAT is cross-platform and can be ran on Microsoft Windows, Linux and OS X/macOS. The source code and a Microsoft Windows executable are freely available under the permissive Apache2 open source license at https://github.com/compomics/MAPPI-DAT. jan.tavernier@vib-ugent.be or lennart.martens@vib-ugent.be. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press.

  5. MiMiR – an integrated platform for microarray data sharing, mining and analysis

    PubMed Central

    Tomlinson, Chris; Thimma, Manjula; Alexandrakis, Stelios; Castillo, Tito; Dennis, Jayne L; Brooks, Anthony; Bradley, Thomas; Turnbull, Carly; Blaveri, Ekaterini; Barton, Geraint; Chiba, Norie; Maratou, Klio; Soutter, Pat; Aitman, Tim; Game, Laurence

    2008-01-01

    Background Despite considerable efforts within the microarray community for standardising data format, content and description, microarray technologies present major challenges in managing, sharing, analysing and re-using the large amount of data generated locally or internationally. Additionally, it is recognised that inconsistent and low quality experimental annotation in public data repositories significantly compromises the re-use of microarray data for meta-analysis. MiMiR, the Microarray data Mining Resource was designed to tackle some of these limitations and challenges. Here we present new software components and enhancements to the original infrastructure that increase accessibility, utility and opportunities for large scale mining of experimental and clinical data. Results A user friendly Online Annotation Tool allows researchers to submit detailed experimental information via the web at the time of data generation rather than at the time of publication. This ensures the easy access and high accuracy of meta-data collected. Experiments are programmatically built in the MiMiR database from the submitted information and details are systematically curated and further annotated by a team of trained annotators using a new Curation and Annotation Tool. Clinical information can be annotated and coded with a clinical Data Mapping Tool within an appropriate ethical framework. Users can visualise experimental annotation, assess data quality, download and share data via a web-based experiment browser called MiMiR Online. All requests to access data in MiMiR are routed through a sophisticated middleware security layer thereby allowing secure data access and sharing amongst MiMiR registered users prior to publication. Data in MiMiR can be mined and analysed using the integrated EMAAS open source analysis web portal or via export of data and meta-data into Rosetta Resolver data analysis package. Conclusion The new MiMiR suite of software enables systematic and effective capture of extensive experimental and clinical information with the highest MIAME score, and secure data sharing prior to publication. MiMiR currently contains more than 150 experiments corresponding to over 3000 hybridisations and supports the Microarray Centre's large microarray user community and two international consortia. The MiMiR flexible and scalable hardware and software architecture enables secure warehousing of thousands of datasets, including clinical studies, from microarray and potentially other -omics technologies. PMID:18801157

  6. MiMiR--an integrated platform for microarray data sharing, mining and analysis.

    PubMed

    Tomlinson, Chris; Thimma, Manjula; Alexandrakis, Stelios; Castillo, Tito; Dennis, Jayne L; Brooks, Anthony; Bradley, Thomas; Turnbull, Carly; Blaveri, Ekaterini; Barton, Geraint; Chiba, Norie; Maratou, Klio; Soutter, Pat; Aitman, Tim; Game, Laurence

    2008-09-18

    Despite considerable efforts within the microarray community for standardising data format, content and description, microarray technologies present major challenges in managing, sharing, analysing and re-using the large amount of data generated locally or internationally. Additionally, it is recognised that inconsistent and low quality experimental annotation in public data repositories significantly compromises the re-use of microarray data for meta-analysis. MiMiR, the Microarray data Mining Resource was designed to tackle some of these limitations and challenges. Here we present new software components and enhancements to the original infrastructure that increase accessibility, utility and opportunities for large scale mining of experimental and clinical data. A user friendly Online Annotation Tool allows researchers to submit detailed experimental information via the web at the time of data generation rather than at the time of publication. This ensures the easy access and high accuracy of meta-data collected. Experiments are programmatically built in the MiMiR database from the submitted information and details are systematically curated and further annotated by a team of trained annotators using a new Curation and Annotation Tool. Clinical information can be annotated and coded with a clinical Data Mapping Tool within an appropriate ethical framework. Users can visualise experimental annotation, assess data quality, download and share data via a web-based experiment browser called MiMiR Online. All requests to access data in MiMiR are routed through a sophisticated middleware security layer thereby allowing secure data access and sharing amongst MiMiR registered users prior to publication. Data in MiMiR can be mined and analysed using the integrated EMAAS open source analysis web portal or via export of data and meta-data into Rosetta Resolver data analysis package. The new MiMiR suite of software enables systematic and effective capture of extensive experimental and clinical information with the highest MIAME score, and secure data sharing prior to publication. MiMiR currently contains more than 150 experiments corresponding to over 3000 hybridisations and supports the Microarray Centre's large microarray user community and two international consortia. The MiMiR flexible and scalable hardware and software architecture enables secure warehousing of thousands of datasets, including clinical studies, from microarray and potentially other -omics technologies.

  7. BABAR: an R package to simplify the normalisation of common reference design microarray-based transcriptomic datasets

    PubMed Central

    2010-01-01

    Background The development of DNA microarrays has facilitated the generation of hundreds of thousands of transcriptomic datasets. The use of a common reference microarray design allows existing transcriptomic data to be readily compared and re-analysed in the light of new data, and the combination of this design with large datasets is ideal for 'systems'-level analyses. One issue is that these datasets are typically collected over many years and may be heterogeneous in nature, containing different microarray file formats and gene array layouts, dye-swaps, and showing varying scales of log2- ratios of expression between microarrays. Excellent software exists for the normalisation and analysis of microarray data but many data have yet to be analysed as existing methods struggle with heterogeneous datasets; options include normalising microarrays on an individual or experimental group basis. Our solution was to develop the Batch Anti-Banana Algorithm in R (BABAR) algorithm and software package which uses cyclic loess to normalise across the complete dataset. We have already used BABAR to analyse the function of Salmonella genes involved in the process of infection of mammalian cells. Results The only input required by BABAR is unprocessed GenePix or BlueFuse microarray data files. BABAR provides a combination of 'within' and 'between' microarray normalisation steps and diagnostic boxplots. When applied to a real heterogeneous dataset, BABAR normalised the dataset to produce a comparable scaling between the microarrays, with the microarray data in excellent agreement with RT-PCR analysis. When applied to a real non-heterogeneous dataset and a simulated dataset, BABAR's performance in identifying differentially expressed genes showed some benefits over standard techniques. Conclusions BABAR is an easy-to-use software tool, simplifying the simultaneous normalisation of heterogeneous two-colour common reference design cDNA microarray-based transcriptomic datasets. We show BABAR transforms real and simulated datasets to allow for the correct interpretation of these data, and is the ideal tool to facilitate the identification of differentially expressed genes or network inference analysis from transcriptomic datasets. PMID:20128918

  8. Stormbow: A Cloud-Based Tool for Reads Mapping and Expression Quantification in Large-Scale RNA-Seq Studies

    PubMed Central

    Zhao, Shanrong; Prenger, Kurt; Smith, Lance

    2013-01-01

    RNA-Seq is becoming a promising replacement to microarrays in transcriptome profiling and differential gene expression study. Technical improvements have decreased sequencing costs and, as a result, the size and number of RNA-Seq datasets have increased rapidly. However, the increasing volume of data from large-scale RNA-Seq studies poses a practical challenge for data analysis in a local environment. To meet this challenge, we developed Stormbow, a cloud-based software package, to process large volumes of RNA-Seq data in parallel. The performance of Stormbow has been tested by practically applying it to analyse 178 RNA-Seq samples in the cloud. In our test, it took 6 to 8 hours to process an RNA-Seq sample with 100 million reads, and the average cost was $3.50 per sample. Utilizing Amazon Web Services as the infrastructure for Stormbow allows us to easily scale up to handle large datasets with on-demand computational resources. Stormbow is a scalable, cost effective, and open-source based tool for large-scale RNA-Seq data analysis. Stormbow can be freely downloaded and can be used out of box to process Illumina RNA-Seq datasets. PMID:25937948

  9. Stormbow: A Cloud-Based Tool for Reads Mapping and Expression Quantification in Large-Scale RNA-Seq Studies.

    PubMed

    Zhao, Shanrong; Prenger, Kurt; Smith, Lance

    2013-01-01

    RNA-Seq is becoming a promising replacement to microarrays in transcriptome profiling and differential gene expression study. Technical improvements have decreased sequencing costs and, as a result, the size and number of RNA-Seq datasets have increased rapidly. However, the increasing volume of data from large-scale RNA-Seq studies poses a practical challenge for data analysis in a local environment. To meet this challenge, we developed Stormbow, a cloud-based software package, to process large volumes of RNA-Seq data in parallel. The performance of Stormbow has been tested by practically applying it to analyse 178 RNA-Seq samples in the cloud. In our test, it took 6 to 8 hours to process an RNA-Seq sample with 100 million reads, and the average cost was $3.50 per sample. Utilizing Amazon Web Services as the infrastructure for Stormbow allows us to easily scale up to handle large datasets with on-demand computational resources. Stormbow is a scalable, cost effective, and open-source based tool for large-scale RNA-Seq data analysis. Stormbow can be freely downloaded and can be used out of box to process Illumina RNA-Seq datasets.

  10. Quantitative characterization of conformational-specific protein-DNA binding using a dual-spectral interferometric imaging biosensor.

    PubMed

    Zhang, Xirui; Daaboul, George G; Spuhler, Philipp S; Dröge, Peter; Ünlü, M Selim

    2016-03-14

    DNA-binding proteins play crucial roles in the maintenance and functions of the genome and yet, their specific binding mechanisms are not fully understood. Recently, it was discovered that DNA-binding proteins recognize specific binding sites to carry out their functions through an indirect readout mechanism by recognizing and capturing DNA conformational flexibility and deformation. High-throughput DNA microarray-based methods that provide large-scale protein-DNA binding information have shown effective and comprehensive analysis of protein-DNA binding affinities, but do not provide information of DNA conformational changes in specific protein-DNA complexes. Building on the high-throughput capability of DNA microarrays, we demonstrate a quantitative approach that simultaneously measures the amount of protein binding to DNA and nanometer-scale DNA conformational change induced by protein binding in a microarray format. Both measurements rely on spectral interferometry on a layered substrate using a single optical instrument in two distinct modalities. In the first modality, we quantitate the amount of binding of protein to surface-immobilized DNA in each DNA spot using a label-free spectral reflectivity technique that accurately measures the surface densities of protein and DNA accumulated on the substrate. In the second modality, for each DNA spot, we simultaneously measure DNA conformational change using a fluorescence vertical sectioning technique that determines average axial height of fluorophores tagged to specific nucleotides of the surface-immobilized DNA. The approach presented in this paper, when combined with current high-throughput DNA microarray-based technologies, has the potential to serve as a rapid and simple method for quantitative and large-scale characterization of conformational specific protein-DNA interactions.

  11. Identification of consensus biomarkers for predicting non-genotoxic hepatocarcinogens

    PubMed Central

    Huang, Shan-Han; Tung, Chun-Wei

    2017-01-01

    The assessment of non-genotoxic hepatocarcinogens (NGHCs) is currently relying on two-year rodent bioassays. Toxicogenomics biomarkers provide a potential alternative method for the prioritization of NGHCs that could be useful for risk assessment. However, previous studies using inconsistently classified chemicals as the training set and a single microarray dataset concluded no consensus biomarkers. In this study, 4 consensus biomarkers of A2m, Ca3, Cxcl1, and Cyp8b1 were identified from four large-scale microarray datasets of the one-day single maximum tolerated dose and a large set of chemicals without inconsistent classifications. Machine learning techniques were subsequently applied to develop prediction models for NGHCs. The final bagging decision tree models were constructed with an average AUC performance of 0.803 for an independent test. A set of 16 chemicals with controversial classifications were reclassified according to the consensus biomarkers. The developed prediction models and identified consensus biomarkers are expected to be potential alternative methods for prioritization of NGHCs for further experimental validation. PMID:28117354

  12. Microintaglio Printing for Soft Lithography-Based in Situ Microarrays

    PubMed Central

    Biyani, Manish; Ichiki, Takanori

    2015-01-01

    Advances in lithographic approaches to fabricating bio-microarrays have been extensively explored over the last two decades. However, the need for pattern flexibility, a high density, a high resolution, affordability and on-demand fabrication is promoting the development of unconventional routes for microarray fabrication. This review highlights the development and uses of a new molecular lithography approach, called “microintaglio printing technology”, for large-scale bio-microarray fabrication using a microreactor array (µRA)-based chip consisting of uniformly-arranged, femtoliter-size µRA molds. In this method, a single-molecule-amplified DNA microarray pattern is self-assembled onto a µRA mold and subsequently converted into a messenger RNA or protein microarray pattern by simultaneously producing and transferring (immobilizing) a messenger RNA or a protein from a µRA mold to a glass surface. Microintaglio printing allows the self-assembly and patterning of in situ-synthesized biomolecules into high-density (kilo-giga-density), ordered arrays on a chip surface with µm-order precision. This holistic aim, which is difficult to achieve using conventional printing and microarray approaches, is expected to revolutionize and reshape proteomics. This review is not written comprehensively, but rather substantively, highlighting the versatility of microintaglio printing for developing a prerequisite platform for microarray technology for the postgenomic era. PMID:27600226

  13. MIPHENO: Data normalization for high throughput metabolic analysis.

    EPA Science Inventory

    High throughput methodologies such as microarrays, mass spectrometry and plate-based small molecule screens are increasingly used to facilitate discoveries from gene function to drug candidate identification. These large-scale experiments are typically carried out over the course...

  14. Systematic Validation and Atomic Force Microscopy of Non-Covalent Short Oligonucleotide Barcode Microarrays

    PubMed Central

    Cook, Michael A.; Chan, Chi-Kin; Jorgensen, Paul; Ketela, Troy; So, Daniel; Tyers, Mike; Ho, Chi-Yip

    2008-01-01

    Background Molecular barcode arrays provide a powerful means to analyze cellular phenotypes in parallel through detection of short (20–60 base) unique sequence tags, or “barcodes”, associated with each strain or clone in a collection. However, costs of current methods for microarray construction, whether by in situ oligonucleotide synthesis or ex situ coupling of modified oligonucleotides to the slide surface are often prohibitive to large-scale analyses. Methodology/Principal Findings Here we demonstrate that unmodified 20mer oligonucleotide probes printed on conventional surfaces show comparable hybridization signals to covalently linked 5′-amino-modified probes. As a test case, we undertook systematic cell size analysis of the budding yeast Saccharomyces cerevisiae genome-wide deletion collection by size separation of the deletion pool followed by determination of strain abundance in size fractions by barcode arrays. We demonstrate that the properties of a 13K unique feature spotted 20 mer oligonucleotide barcode microarray compare favorably with an analogous covalently-linked oligonucleotide array. Further, cell size profiles obtained with the size selection/barcode array approach recapitulate previous cell size measurements of individual deletion strains. Finally, through atomic force microscopy (AFM), we characterize the mechanism of hybridization to unmodified barcode probes on the slide surface. Conclusions/Significance These studies push the lower limit of probe size in genome-scale unmodified oligonucleotide microarray construction and demonstrate a versatile, cost-effective and reliable method for molecular barcode analysis. PMID:18253494

  15. Genetic Structures of Copy Number Variants Revealed by Genotyping Single Sperm

    PubMed Central

    Luo, Minjie; Cui, Xiangfeng; Fredman, David; Brookes, Anthony J.; Azaro, Marco A.; Greenawalt, Danielle M.; Hu, Guohong; Wang, Hui-Yun; Tereshchenko, Irina V.; Lin, Yong; Shentu, Yue; Gao, Richeng; Shen, Li; Li, Honghua

    2009-01-01

    Background Copy number variants (CNVs) occupy a significant portion of the human genome and may have important roles in meiotic recombination, human genome evolution and gene expression. Many genetic diseases may be underlain by CNVs. However, because of the presence of their multiple copies, variability in copy numbers and the diploidy of the human genome, detailed genetic structure of CNVs cannot be readily studied by available techniques. Methodology/Principal Findings Single sperm samples were used as the primary subjects for the study so that CNV haplotypes in the sperm donors could be studied individually. Forty-eight CNVs characterized in a previous study were analyzed using a microarray-based high-throughput genotyping method after multiplex amplification. Seventeen single nucleotide polymorphisms (SNPs) were also included as controls. Two single-base variants, either allelic or paralogous, could be discriminated for all markers. Microarray data were used to resolve SNP alleles and CNV haplotypes, to quantitatively assess the numbers and compositions of the paralogous segments in each CNV haplotype. Conclusions/Significance This is the first study of the genetic structure of CNVs on a large scale. Resulting information may help understand evolution of the human genome, gain insight into many genetic processes, and discriminate between CNVs and SNPs. The highly sensitive high-throughput experimental system with haploid sperm samples as subjects may be used to facilitate detailed large-scale CNV analysis. PMID:19384415

  16. RNA sequencing: current and prospective uses in metabolic research.

    PubMed

    Vikman, Petter; Fadista, Joao; Oskolkov, Nikolay

    2014-10-01

    Previous global RNA analysis was restricted to known transcripts in species with a defined transcriptome. Next generation sequencing has transformed transcriptomics by making it possible to analyse expressed genes with an exon level resolution from any tissue in any species without any a priori knowledge of which genes that are being expressed, splice patterns or their nucleotide sequence. In addition, RNA sequencing is a more sensitive technique compared with microarrays with a larger dynamic range, and it also allows for investigation of imprinting and allele-specific expression. This can be done for a cost that is able to compete with that of a microarray, making RNA sequencing a technique available to most researchers. Therefore RNA sequencing has recently become the state of the art with regards to large-scale RNA investigations and has to a large extent replaced microarrays. The only drawback is the large data amounts produced, which together with the complexity of the data can make a researcher spend far more time on analysis than performing the actual experiment. © 2014 Society for Endocrinology.

  17. Versatile Gene-Specific Sequence Tags for Arabidopsis Functional Genomics: Transcript Profiling and Reverse Genetics Applications

    PubMed Central

    Hilson, Pierre; Allemeersch, Joke; Altmann, Thomas; Aubourg, Sébastien; Avon, Alexandra; Beynon, Jim; Bhalerao, Rishikesh P.; Bitton, Frédérique; Caboche, Michel; Cannoot, Bernard; Chardakov, Vasil; Cognet-Holliger, Cécile; Colot, Vincent; Crowe, Mark; Darimont, Caroline; Durinck, Steffen; Eickhoff, Holger; de Longevialle, Andéol Falcon; Farmer, Edward E.; Grant, Murray; Kuiper, Martin T.R.; Lehrach, Hans; Léon, Céline; Leyva, Antonio; Lundeberg, Joakim; Lurin, Claire; Moreau, Yves; Nietfeld, Wilfried; Paz-Ares, Javier; Reymond, Philippe; Rouzé, Pierre; Sandberg, Goran; Segura, Maria Dolores; Serizet, Carine; Tabrett, Alexandra; Taconnat, Ludivine; Thareau, Vincent; Van Hummelen, Paul; Vercruysse, Steven; Vuylsteke, Marnik; Weingartner, Magdalena; Weisbeek, Peter J.; Wirta, Valtteri; Wittink, Floyd R.A.; Zabeau, Marc; Small, Ian

    2004-01-01

    Microarray transcript profiling and RNA interference are two new technologies crucial for large-scale gene function studies in multicellular eukaryotes. Both rely on sequence-specific hybridization between complementary nucleic acid strands, inciting us to create a collection of gene-specific sequence tags (GSTs) representing at least 21,500 Arabidopsis genes and which are compatible with both approaches. The GSTs were carefully selected to ensure that each of them shared no significant similarity with any other region in the Arabidopsis genome. They were synthesized by PCR amplification from genomic DNA. Spotted microarrays fabricated from the GSTs show good dynamic range, specificity, and sensitivity in transcript profiling experiments. The GSTs have also been transferred to bacterial plasmid vectors via recombinational cloning protocols. These cloned GSTs constitute the ideal starting point for a variety of functional approaches, including reverse genetics. We have subcloned GSTs on a large scale into vectors designed for gene silencing in plant cells. We show that in planta expression of GST hairpin RNA results in the expected phenotypes in silenced Arabidopsis lines. These versatile GST resources provide novel and powerful tools for functional genomics. PMID:15489341

  18. Gene Expression Analysis: Teaching Students to Do 30,000 Experiments at Once with Microarray

    ERIC Educational Resources Information Center

    Carvalho, Felicia I.; Johns, Christopher; Gillespie, Marc E.

    2012-01-01

    Genome scale experiments routinely produce large data sets that require computational analysis, yet there are few student-based labs that illustrate the design and execution of these experiments. In order for students to understand and participate in the genomic world, teaching labs must be available where students generate and analyze large data…

  19. Large scale aggregate microarray analysis reveals three distinct molecular subclasses of human preeclampsia.

    PubMed

    Leavey, Katherine; Bainbridge, Shannon A; Cox, Brian J

    2015-01-01

    Preeclampsia (PE) is a life-threatening hypertensive pathology of pregnancy affecting 3-5% of all pregnancies. To date, PE has no cure, early detection markers, or effective treatments short of the removal of what is thought to be the causative organ, the placenta, which may necessitate a preterm delivery. Additionally, numerous small placental microarray studies attempting to identify "PE-specific" genes have yielded inconsistent results. We therefore hypothesize that preeclampsia is a multifactorial disease encompassing several pathology subclasses, and that large cohort placental gene expression analysis will reveal these groups. To address our hypothesis, we utilized known bioinformatic methods to aggregate 7 microarray data sets across multiple platforms in order to generate a large data set of 173 patient samples, including 77 with preeclampsia. Unsupervised clustering of these patient samples revealed three distinct molecular subclasses of PE. This included a "canonical" PE subclass demonstrating elevated expression of known PE markers and genes associated with poor oxygenation and increased secretion, as well as two other subclasses potentially representing a poor maternal response to pregnancy and an immunological presentation of preeclampsia. Our analysis sheds new light on the heterogeneity of PE patients, and offers up additional avenues for future investigation. Hopefully, our subclassification of preeclampsia based on molecular diversity will finally lead to the development of robust diagnostics and patient-based treatments for this disorder.

  20. The Importance of Normalization on Large and Heterogeneous Microarray Datasets

    EPA Science Inventory

    DNA microarray technology is a powerful functional genomics tool increasingly used for investigating global gene expression in environmental studies. Microarrays can also be used in identifying biological networks, as they give insight on the complex gene-to-gene interactions, ne...

  1. The CD117 immunohistochemistry tissue microarray survey for quality assurance and interlaboratory comparison: a College of American Pathologists Cell Markers Committee Study.

    PubMed

    Dorfman, David M; Bui, Marilyn M; Tubbs, Raymond R; Hsi, Eric D; Fitzgibbons, Patrick L; Linden, Michael D; Rickert, Robert R; Roche, Patrick C

    2006-06-01

    We have developed tissue microarray-based surveys to allow laboratories to compare their performance in staining predictive immunohistochemical markers, including proto-oncogene CD117 (c-kit), which is characteristically expressed in gastrointestinal stromal tumors (GISTs). GISTs exhibit activating mutations in the c-kit proto-oncogene, which render them amenable to treatment with imatinib mesylate. Consequently, correct identification of c-Kit expression is important for the diagnosis and treatment of GISTs. To analyze CD117 immunohistochemical staining performance by a large number of clinical laboratories. A mechanical device was used to construct tissue microarrays consisting of 3 x 1-mm cores of 10 tumor samples, which can be used to generate hundreds of tissue sections from the arrayed cases, suitable for large-scale interlaboratory comparison of immunohistochemical staining. An initial survey of 63 laboratories and a second survey of 90 laboratories, performed in 2004 and 2005, exhibited >81% concordance for 7 of 10 cores, including all 4 GIST cases, which were immunoreactive for CD117 with >95% staining concordance. Three of the cores achieved less than 81% concordance of results, possibly due to the presence of foci of necrosis in one core and CD117-positive mast cells in 2 cores of CD117-negative neoplasms. There was good performance among a large number of laboratories performing CD117 immunohistochemical staining, with consistently higher concordance of results for CD117-positive GIST cases than for nonimmunoreactive cases. Tissue microarrays for CD117 and other predictive markers should be useful for interlaboratory comparisons, quality assurance, and education of participants regarding staining nuances such as the expression of CKIT by nonneoplastic mast cells.

  2. Microarray analysis identifies candidate genes for key roles in coral development

    PubMed Central

    Grasso, Lauretta C; Maindonald, John; Rudd, Stephen; Hayward, David C; Saint, Robert; Miller, David J; Ball, Eldon E

    2008-01-01

    Background Anthozoan cnidarians are amongst the simplest animals at the tissue level of organization, but are surprisingly complex and vertebrate-like in terms of gene repertoire. As major components of tropical reef ecosystems, the stony corals are anthozoans of particular ecological significance. To better understand the molecular bases of both cnidarian development in general and coral-specific processes such as skeletogenesis and symbiont acquisition, microarray analysis was carried out through the period of early development – when skeletogenesis is initiated, and symbionts are first acquired. Results Of 5081 unique peptide coding genes, 1084 were differentially expressed (P ≤ 0.05) in comparisons between four different stages of coral development, spanning key developmental transitions. Genes of likely relevance to the processes of settlement, metamorphosis, calcification and interaction with symbionts were characterised further and their spatial expression patterns investigated using whole-mount in situ hybridization. Conclusion This study is the first large-scale investigation of developmental gene expression for any cnidarian, and has provided candidate genes for key roles in many aspects of coral biology, including calcification, metamorphosis and symbiont uptake. One surprising finding is that some of these genes have clear counterparts in higher animals but are not present in the closely-related sea anemone Nematostella. Secondly, coral-specific processes (i.e. traits which distinguish corals from their close relatives) may be analogous to similar processes in distantly related organisms. This first large-scale application of microarray analysis demonstrates the potential of this approach for investigating many aspects of coral biology, including the effects of stress and disease. PMID:19014561

  3. Quantitative characterization of conformational-specific protein-DNA binding using a dual-spectral interferometric imaging biosensor

    NASA Astrophysics Data System (ADS)

    Zhang, Xirui; Daaboul, George G.; Spuhler, Philipp S.; Dröge, Peter; Ünlü, M. Selim

    2016-03-01

    DNA-binding proteins play crucial roles in the maintenance and functions of the genome and yet, their specific binding mechanisms are not fully understood. Recently, it was discovered that DNA-binding proteins recognize specific binding sites to carry out their functions through an indirect readout mechanism by recognizing and capturing DNA conformational flexibility and deformation. High-throughput DNA microarray-based methods that provide large-scale protein-DNA binding information have shown effective and comprehensive analysis of protein-DNA binding affinities, but do not provide information of DNA conformational changes in specific protein-DNA complexes. Building on the high-throughput capability of DNA microarrays, we demonstrate a quantitative approach that simultaneously measures the amount of protein binding to DNA and nanometer-scale DNA conformational change induced by protein binding in a microarray format. Both measurements rely on spectral interferometry on a layered substrate using a single optical instrument in two distinct modalities. In the first modality, we quantitate the amount of binding of protein to surface-immobilized DNA in each DNA spot using a label-free spectral reflectivity technique that accurately measures the surface densities of protein and DNA accumulated on the substrate. In the second modality, for each DNA spot, we simultaneously measure DNA conformational change using a fluorescence vertical sectioning technique that determines average axial height of fluorophores tagged to specific nucleotides of the surface-immobilized DNA. The approach presented in this paper, when combined with current high-throughput DNA microarray-based technologies, has the potential to serve as a rapid and simple method for quantitative and large-scale characterization of conformational specific protein-DNA interactions.DNA-binding proteins play crucial roles in the maintenance and functions of the genome and yet, their specific binding mechanisms are not fully understood. Recently, it was discovered that DNA-binding proteins recognize specific binding sites to carry out their functions through an indirect readout mechanism by recognizing and capturing DNA conformational flexibility and deformation. High-throughput DNA microarray-based methods that provide large-scale protein-DNA binding information have shown effective and comprehensive analysis of protein-DNA binding affinities, but do not provide information of DNA conformational changes in specific protein-DNA complexes. Building on the high-throughput capability of DNA microarrays, we demonstrate a quantitative approach that simultaneously measures the amount of protein binding to DNA and nanometer-scale DNA conformational change induced by protein binding in a microarray format. Both measurements rely on spectral interferometry on a layered substrate using a single optical instrument in two distinct modalities. In the first modality, we quantitate the amount of binding of protein to surface-immobilized DNA in each DNA spot using a label-free spectral reflectivity technique that accurately measures the surface densities of protein and DNA accumulated on the substrate. In the second modality, for each DNA spot, we simultaneously measure DNA conformational change using a fluorescence vertical sectioning technique that determines average axial height of fluorophores tagged to specific nucleotides of the surface-immobilized DNA. The approach presented in this paper, when combined with current high-throughput DNA microarray-based technologies, has the potential to serve as a rapid and simple method for quantitative and large-scale characterization of conformational specific protein-DNA interactions. Electronic supplementary information (ESI) available: DNA sequences and nomenclature (Table 1S); SDS-PAGE assay of IHF stock solution (Fig. 1S); determination of the concentration of IHF stock solution by Bradford assay (Fig. 2S); equilibrium binding isotherm fitting results of other DNA sequences (Table 2S); calculation of dissociation constants (Fig. 3S, 4S; Table 2S); geometric model for quantitation of DNA bending angle induced by specific IHF binding (Fig. 4S); customized flow cell assembly (Fig. 5S); real-time measurement of average fluorophore height change by SSFM (Fig. 6S); summary of binding parameters obtained from additive isotherm model fitting (Table 3S); average surface densities of 10 dsDNA spots and bound IHF at equilibrium (Table 4S); effects of surface densities on the binding and bending of dsDNA (Tables 5S, 6S and Fig. 7S-10S). See DOI: 10.1039/c5nr06785e

  4. Removing Batch Effects from Longitudinal Gene Expression - Quantile Normalization Plus ComBat as Best Approach for Microarray Transcriptome Data

    PubMed Central

    Müller, Christian; Schillert, Arne; Röthemeier, Caroline; Trégouët, David-Alexandre; Proust, Carole; Binder, Harald; Pfeiffer, Norbert; Beutel, Manfred; Lackner, Karl J.; Schnabel, Renate B.; Tiret, Laurence; Wild, Philipp S.; Blankenberg, Stefan

    2016-01-01

    Technical variation plays an important role in microarray-based gene expression studies, and batch effects explain a large proportion of this noise. It is therefore mandatory to eliminate technical variation while maintaining biological variability. Several strategies have been proposed for the removal of batch effects, although they have not been evaluated in large-scale longitudinal gene expression data. In this study, we aimed at identifying a suitable method for batch effect removal in a large study of microarray-based longitudinal gene expression. Monocytic gene expression was measured in 1092 participants of the Gutenberg Health Study at baseline and 5-year follow up. Replicates of selected samples were measured at both time points to identify technical variability. Deming regression, Passing-Bablok regression, linear mixed models, non-linear models as well as ReplicateRUV and ComBat were applied to eliminate batch effects between replicates. In a second step, quantile normalization prior to batch effect correction was performed for each method. Technical variation between batches was evaluated by principal component analysis. Associations between body mass index and transcriptomes were calculated before and after batch removal. Results from association analyses were compared to evaluate maintenance of biological variability. Quantile normalization, separately performed in each batch, combined with ComBat successfully reduced batch effects and maintained biological variability. ReplicateRUV performed perfectly in the replicate data subset of the study, but failed when applied to all samples. All other methods did not substantially reduce batch effects in the replicate data subset. Quantile normalization plus ComBat appears to be a valuable approach for batch correction in longitudinal gene expression data. PMID:27272489

  5. Gaussian mixture clustering and imputation of microarray data.

    PubMed

    Ouyang, Ming; Welsh, William J; Georgopoulos, Panos

    2004-04-12

    In microarray experiments, missing entries arise from blemishes on the chips. In large-scale studies, virtually every chip contains some missing entries and more than 90% of the genes are affected. Many analysis methods require a full set of data. Either those genes with missing entries are excluded, or the missing entries are filled with estimates prior to the analyses. This study compares methods of missing value estimation. Two evaluation metrics of imputation accuracy are employed. First, the root mean squared error measures the difference between the true values and the imputed values. Second, the number of mis-clustered genes measures the difference between clustering with true values and that with imputed values; it examines the bias introduced by imputation to clustering. The Gaussian mixture clustering with model averaging imputation is superior to all other imputation methods, according to both evaluation metrics, on both time-series (correlated) and non-time series (uncorrelated) data sets.

  6. Tissue microarrays and digital image analysis.

    PubMed

    Ryan, Denise; Mulrane, Laoighse; Rexhepaj, Elton; Gallagher, William M

    2011-01-01

    Tissue microarrays (TMAs) have recently emerged as very valuable tools for high-throughput pathological assessment, especially in the cancer research arena. This important technology, however, has yet to fully penetrate into the area of toxicology. Here, we describe the creation of TMAs representative of samples produced from conventional toxicology studies within a large-scale, multi-institutional pan-European project, PredTox. PredTox, short for Predictive Toxicology, formed part of an EU FP6 Integrated Project, Innovative Medicines for Europe (InnoMed), and aimed to study pre-clinically 16 compounds of known liver and/or kidney toxicity. In more detail, TMAs were constructed from materials corresponding to the full face sections of liver and kidney from rats treated with different drug candidates by members of the consortium. We also describe the process of digital slide scanning of kidney and liver sections, in the context of creating an online resource of histopathological data.

  7. Construction of a robust microarray from a non-model species (largemouth bass) using pyrosequencing technology

    PubMed Central

    Garcia-Reyero, Natàlia; Griffitt, Robert J.; Liu, Li; Kroll, Kevin J.; Farmerie, William G.; Barber, David S.; Denslow, Nancy D.

    2009-01-01

    A novel custom microarray for largemouth bass (Micropterus salmoides) was designed with sequences obtained from a normalized cDNA library using the 454 Life Sciences GS-20 pyrosequencer. This approach yielded in excess of 58 million bases of high-quality sequence. The sequence information was combined with 2,616 reads obtained by traditional suppressive subtractive hybridizations to derive a total of 31,391 unique sequences. Annotation and coding sequences were predicted for these transcripts where possible. 16,350 annotated transcripts were selected as target sequences for the design of the custom largemouth bass oligonucleotide microarray. The microarray was validated by examining the transcriptomic response in male largemouth bass exposed to 17β-œstradiol. Transcriptomic responses were assessed in liver and gonad, and indicated gene expression profiles typical of exposure to œstradiol. The results demonstrate the potential to rapidly create the tools necessary to assess large scale transcriptional responses in non-model species, paving the way for expanded impact of toxicogenomics in ecotoxicology. PMID:19936325

  8. Tissue microarray immunohistochemical detection of brachyury is not a prognostic indicator in chordoma.

    PubMed

    Zhang, Linlin; Guo, Shang; Schwab, Joseph H; Nielsen, G Petur; Choy, Edwin; Ye, Shunan; Zhang, Zhan; Mankin, Henry; Hornicek, Francis J; Duan, Zhenfeng

    2013-01-01

    Brachyury is a marker for notochord-derived tissues and neoplasms, such as chordoma. However, the prognostic relevance of brachyury expression in chordoma is still unknown. The improvement of tissue microarray technology has provided the opportunity to perform analyses of tumor tissues on a large scale in a uniform and consistent manner. This study was designed with the use of tissue microarray to determine the expression of brachyury. Brachyury expression in chordoma tissues from 78 chordoma patients was analyzed by immunohistochemical staining of tissue microarray. The clinicopathologic parameters, including gender, age, location of tumor and metastatic status were evaluated. Fifty-nine of 78 (75.64%) tumors showed nuclear staining for brachyury, and among them, 29 tumors (49.15%) showed 1+ (<30% positive cells) staining, 15 tumors (25.42%) had 2+ (31% to 60% positive cells) staining, and 15 tumors (25.42%) demonstrated 3+ (61% to 100% positive cells) staining. Brachyury nuclear staining was detected more frequently in sacral chordomas than in chordomas of the mobile spine. However, there was no significant relationship between brachyury expression and other clinical variables. By Kaplan-Meier analysis, brachyury expression failed to produce any significant relationship with the overall survival rate. In conclusion, brachyury expression is not a prognostic indicator in chordoma.

  9. Large-scale atlas of microarray data reveals biological landscape of gene expression in Arabidopsis

    USDA-ARS?s Scientific Manuscript database

    Transcriptome datasets from thousands of samples of the model plant Arabidopsis thaliana have been collectively generated by multiple individual labs. Although integration and meta-analysis of these samples has become routine in the plant research community, it is often hampered by the lack of metad...

  10. Segmental Duplications and Copy-Number Variation in the Human Genome

    PubMed Central

    Sharp, Andrew J. ; Locke, Devin P. ; McGrath, Sean D. ; Cheng, Ze ; Bailey, Jeffrey A. ; Vallente, Rhea U. ; Pertz, Lisa M. ; Clark, Royden A. ; Schwartz, Stuart ; Segraves, Rick ; Oseroff, Vanessa V. ; Albertson, Donna G. ; Pinkel, Daniel ; Eichler, Evan E. 

    2005-01-01

    The human genome contains numerous blocks of highly homologous duplicated sequence. This higher-order architecture provides a substrate for recombination and recurrent chromosomal rearrangement associated with genomic disease. However, an assessment of the role of segmental duplications in normal variation has not yet been made. On the basis of the duplication architecture of the human genome, we defined a set of 130 potential rearrangement hotspots and constructed a targeted bacterial artificial chromosome (BAC) microarray (with 2,194 BACs) to assess copy-number variation in these regions by array comparative genomic hybridization. Using our segmental duplication BAC microarray, we screened a panel of 47 normal individuals, who represented populations from four continents, and we identified 119 regions of copy-number polymorphism (CNP), 73 of which were previously unreported. We observed an equal frequency of duplications and deletions, as well as a 4-fold enrichment of CNPs within hotspot regions, compared with control BACs (P < .000001), which suggests that segmental duplications are a major catalyst of large-scale variation in the human genome. Importantly, segmental duplications themselves were also significantly enriched >4-fold within regions of CNP. Almost without exception, CNPs were not confined to a single population, suggesting that these either are recurrent events, having occurred independently in multiple founders, or were present in early human populations. Our study demonstrates that segmental duplications define hotspots of chromosomal rearrangement, likely acting as mediators of normal variation as well as genomic disease, and it suggests that the consideration of genomic architecture can significantly improve the ascertainment of large-scale rearrangements. Our specialized segmental duplication BAC microarray and associated database of structural polymorphisms will provide an important resource for the future characterization of human genomic disorders. PMID:15918152

  11. High dimensional biological data retrieval optimization with NoSQL technology.

    PubMed

    Wang, Shicai; Pandis, Ioannis; Wu, Chao; He, Sijin; Johnson, David; Emam, Ibrahim; Guitton, Florian; Guo, Yike

    2014-01-01

    High-throughput transcriptomic data generated by microarray experiments is the most abundant and frequently stored kind of data currently used in translational medicine studies. Although microarray data is supported in data warehouses such as tranSMART, when querying relational databases for hundreds of different patient gene expression records queries are slow due to poor performance. Non-relational data models, such as the key-value model implemented in NoSQL databases, hold promise to be more performant solutions. Our motivation is to improve the performance of the tranSMART data warehouse with a view to supporting Next Generation Sequencing data. In this paper we introduce a new data model better suited for high-dimensional data storage and querying, optimized for database scalability and performance. We have designed a key-value pair data model to support faster queries over large-scale microarray data and implemented the model using HBase, an implementation of Google's BigTable storage system. An experimental performance comparison was carried out against the traditional relational data model implemented in both MySQL Cluster and MongoDB, using a large publicly available transcriptomic data set taken from NCBI GEO concerning Multiple Myeloma. Our new key-value data model implemented on HBase exhibits an average 5.24-fold increase in high-dimensional biological data query performance compared to the relational model implemented on MySQL Cluster, and an average 6.47-fold increase on query performance on MongoDB. The performance evaluation found that the new key-value data model, in particular its implementation in HBase, outperforms the relational model currently implemented in tranSMART. We propose that NoSQL technology holds great promise for large-scale data management, in particular for high-dimensional biological data such as that demonstrated in the performance evaluation described in this paper. We aim to use this new data model as a basis for migrating tranSMART's implementation to a more scalable solution for Big Data.

  12. High dimensional biological data retrieval optimization with NoSQL technology

    PubMed Central

    2014-01-01

    Background High-throughput transcriptomic data generated by microarray experiments is the most abundant and frequently stored kind of data currently used in translational medicine studies. Although microarray data is supported in data warehouses such as tranSMART, when querying relational databases for hundreds of different patient gene expression records queries are slow due to poor performance. Non-relational data models, such as the key-value model implemented in NoSQL databases, hold promise to be more performant solutions. Our motivation is to improve the performance of the tranSMART data warehouse with a view to supporting Next Generation Sequencing data. Results In this paper we introduce a new data model better suited for high-dimensional data storage and querying, optimized for database scalability and performance. We have designed a key-value pair data model to support faster queries over large-scale microarray data and implemented the model using HBase, an implementation of Google's BigTable storage system. An experimental performance comparison was carried out against the traditional relational data model implemented in both MySQL Cluster and MongoDB, using a large publicly available transcriptomic data set taken from NCBI GEO concerning Multiple Myeloma. Our new key-value data model implemented on HBase exhibits an average 5.24-fold increase in high-dimensional biological data query performance compared to the relational model implemented on MySQL Cluster, and an average 6.47-fold increase on query performance on MongoDB. Conclusions The performance evaluation found that the new key-value data model, in particular its implementation in HBase, outperforms the relational model currently implemented in tranSMART. We propose that NoSQL technology holds great promise for large-scale data management, in particular for high-dimensional biological data such as that demonstrated in the performance evaluation described in this paper. We aim to use this new data model as a basis for migrating tranSMART's implementation to a more scalable solution for Big Data. PMID:25435347

  13. Sex genes for genomic analysis in human brain: internal controls for comparison of probe level data extraction.

    PubMed Central

    Galfalvy, Hanga C; Erraji-Benchekroun, Loubna; Smyrniotopoulos, Peggy; Pavlidis, Paul; Ellis, Steven P; Mann, J John; Sibille, Etienne; Arango, Victoria

    2003-01-01

    Background Genomic studies of complex tissues pose unique analytical challenges for assessment of data quality, performance of statistical methods used for data extraction, and detection of differentially expressed genes. Ideally, to assess the accuracy of gene expression analysis methods, one needs a set of genes which are known to be differentially expressed in the samples and which can be used as a "gold standard". We introduce the idea of using sex-chromosome genes as an alternative to spiked-in control genes or simulations for assessment of microarray data and analysis methods. Results Expression of sex-chromosome genes were used as true internal biological controls to compare alternate probe-level data extraction algorithms (Microarray Suite 5.0 [MAS5.0], Model Based Expression Index [MBEI] and Robust Multi-array Average [RMA]), to assess microarray data quality and to establish some statistical guidelines for analyzing large-scale gene expression. These approaches were implemented on a large new dataset of human brain samples. RMA-generated gene expression values were markedly less variable and more reliable than MAS5.0 and MBEI-derived values. A statistical technique controlling the false discovery rate was applied to adjust for multiple testing, as an alternative to the Bonferroni method, and showed no evidence of false negative results. Fourteen probesets, representing nine Y- and two X-chromosome linked genes, displayed significant sex differences in brain prefrontal cortex gene expression. Conclusion In this study, we have demonstrated the use of sex genes as true biological internal controls for genomic analysis of complex tissues, and suggested analytical guidelines for testing alternate oligonucleotide microarray data extraction protocols and for adjusting multiple statistical analysis of differentially expressed genes. Our results also provided evidence for sex differences in gene expression in the brain prefrontal cortex, supporting the notion of a putative direct role of sex-chromosome genes in differentiation and maintenance of sexual dimorphism of the central nervous system. Importantly, these analytical approaches are applicable to all microarray studies that include male and female human or animal subjects. PMID:12962547

  14. Sex genes for genomic analysis in human brain: internal controls for comparison of probe level data extraction.

    PubMed

    Galfalvy, Hanga C; Erraji-Benchekroun, Loubna; Smyrniotopoulos, Peggy; Pavlidis, Paul; Ellis, Steven P; Mann, J John; Sibille, Etienne; Arango, Victoria

    2003-09-08

    Genomic studies of complex tissues pose unique analytical challenges for assessment of data quality, performance of statistical methods used for data extraction, and detection of differentially expressed genes. Ideally, to assess the accuracy of gene expression analysis methods, one needs a set of genes which are known to be differentially expressed in the samples and which can be used as a "gold standard". We introduce the idea of using sex-chromosome genes as an alternative to spiked-in control genes or simulations for assessment of microarray data and analysis methods. Expression of sex-chromosome genes were used as true internal biological controls to compare alternate probe-level data extraction algorithms (Microarray Suite 5.0 [MAS5.0], Model Based Expression Index [MBEI] and Robust Multi-array Average [RMA]), to assess microarray data quality and to establish some statistical guidelines for analyzing large-scale gene expression. These approaches were implemented on a large new dataset of human brain samples. RMA-generated gene expression values were markedly less variable and more reliable than MAS5.0 and MBEI-derived values. A statistical technique controlling the false discovery rate was applied to adjust for multiple testing, as an alternative to the Bonferroni method, and showed no evidence of false negative results. Fourteen probesets, representing nine Y- and two X-chromosome linked genes, displayed significant sex differences in brain prefrontal cortex gene expression. In this study, we have demonstrated the use of sex genes as true biological internal controls for genomic analysis of complex tissues, and suggested analytical guidelines for testing alternate oligonucleotide microarray data extraction protocols and for adjusting multiple statistical analysis of differentially expressed genes. Our results also provided evidence for sex differences in gene expression in the brain prefrontal cortex, supporting the notion of a putative direct role of sex-chromosome genes in differentiation and maintenance of sexual dimorphism of the central nervous system. Importantly, these analytical approaches are applicable to all microarray studies that include male and female human or animal subjects.

  15. Cloud-scale genomic signals processing classification analysis for gene expression microarray data.

    PubMed

    Harvey, Benjamin; Soo-Yeon Ji

    2014-01-01

    As microarray data available to scientists continues to increase in size and complexity, it has become overwhelmingly important to find multiple ways to bring inference though analysis of DNA/mRNA sequence data that is useful to scientists. Though there have been many attempts to elucidate the issue of bringing forth biological inference by means of wavelet preprocessing and classification, there has not been a research effort that focuses on a cloud-scale classification analysis of microarray data using Wavelet thresholding in a Cloud environment to identify significantly expressed features. This paper proposes a novel methodology that uses Wavelet based Denoising to initialize a threshold for determination of significantly expressed genes for classification. Additionally, this research was implemented and encompassed within cloud-based distributed processing environment. The utilization of Cloud computing and Wavelet thresholding was used for the classification 14 tumor classes from the Global Cancer Map (GCM). The results proved to be more accurate than using a predefined p-value for differential expression classification. This novel methodology analyzed Wavelet based threshold features of gene expression in a Cloud environment, furthermore classifying the expression of samples by analyzing gene patterns, which inform us of biological processes. Moreover, enabling researchers to face the present and forthcoming challenges that may arise in the analysis of data in functional genomics of large microarray datasets.

  16. Comprehensive Analysis of DNA Methylation Data with RnBeads

    PubMed Central

    Walter, Jörn; Lengauer, Thomas; Bock, Christoph

    2014-01-01

    RnBeads is a software tool for large-scale analysis and interpretation of DNA methylation data, providing a user-friendly analysis workflow that yields detailed hypertext reports (http://rnbeads.mpi-inf.mpg.de). Supported assays include whole genome bisulfite sequencing, reduced representation bisulfite sequencing, Infinium microarrays, and any other protocol that produces high-resolution DNA methylation data. Important applications of RnBeads include the analysis of epigenome-wide association studies and epigenetic biomarker discovery in cancer cohorts. PMID:25262207

  17. GStream: Improving SNP and CNV Coverage on Genome-Wide Association Studies

    PubMed Central

    Alonso, Arnald; Marsal, Sara; Tortosa, Raül; Canela-Xandri, Oriol; Julià, Antonio

    2013-01-01

    We present GStream, a method that combines genome-wide SNP and CNV genotyping in the Illumina microarray platform with unprecedented accuracy. This new method outperforms previous well-established SNP genotyping software. More importantly, the CNV calling algorithm of GStream dramatically improves the results obtained by previous state-of-the-art methods and yields an accuracy that is close to that obtained by purely CNV-oriented technologies like Comparative Genomic Hybridization (CGH). We demonstrate the superior performance of GStream using microarray data generated from HapMap samples. Using the reference CNV calls generated by the 1000 Genomes Project (1KGP) and well-known studies on whole genome CNV characterization based either on CGH or genotyping microarray technologies, we show that GStream can increase the number of reliably detected variants up to 25% compared to previously developed methods. Furthermore, the increased genome coverage provided by GStream allows the discovery of CNVs in close linkage disequilibrium with SNPs, previously associated with disease risk in published Genome-Wide Association Studies (GWAS). These results could provide important insights into the biological mechanism underlying the detected disease risk association. With GStream, large-scale GWAS will not only benefit from the combined genotyping of SNPs and CNVs at an unprecedented accuracy, but will also take advantage of the computational efficiency of the method. PMID:23844243

  18. Living-Cell Microarrays

    PubMed Central

    Yarmush, Martin L.; King, Kevin R.

    2011-01-01

    Living cells are remarkably complex. To unravel this complexity, living-cell assays have been developed that allow delivery of experimental stimuli and measurement of the resulting cellular responses. High-throughput adaptations of these assays, known as living-cell microarrays, which are based on microtiter plates, high-density spotting, microfabrication, and microfluidics technologies, are being developed for two general applications: (a) to screen large-scale chemical and genomic libraries and (b) to systematically investigate the local cellular microenvironment. These emerging experimental platforms offer exciting opportunities to rapidly identify genetic determinants of disease, to discover modulators of cellular function, and to probe the complex and dynamic relationships between cells and their local environment. PMID:19413510

  19. Analysis of large-scale gene expression data.

    PubMed

    Sherlock, G

    2000-04-01

    The advent of cDNA and oligonucleotide microarray technologies has led to a paradigm shift in biological investigation, such that the bottleneck in research is shifting from data generation to data analysis. Hierarchical clustering, divisive clustering, self-organizing maps and k-means clustering have all been recently used to make sense of this mass of data.

  20. Isolation of Microarray-Grade Total RNA, MicroRNA, and DNA from a Single PAXgene Blood RNA Tube

    PubMed Central

    Kruhøffer, Mogens; Dyrskjøt, Lars; Voss, Thorsten; Lindberg, Raija L.P.; Wyrich, Ralf; Thykjaer, Thomas; Orntoft, Torben F.

    2007-01-01

    We have developed a procedure for isolation of microRNA and genomic DNA in addition to total RNA from whole blood stabilized in PAXgene Blood RNA tubes. The procedure is based on automatic extraction on a BioRobot MDx and includes isolation of DNA from a fraction of the stabilized blood and recovery of small RNA species that are otherwise lost. The procedure presented here is suitable for large-scale experiments and is amenable to further automation. Procured total RNA and DNA was tested using Affymetrix Expression and single-nucleotide polymorphism GeneChips, respectively, and isolated microRNA was tested using spotted locked nucleic acid-based microarrays. We conclude that the yield and quality of total RNA, microRNA, and DNA from a single PAXgene blood RNA tube is sufficient for downstream microarray analysis. PMID:17690207

  1. Decentralized Data Sharing of Tissue Microarrays for Investigative Research in Oncology

    PubMed Central

    Chen, Wenjin; Schmidt, Cristina; Parashar, Manish; Reiss, Michael; Foran, David J.

    2007-01-01

    Tissue microarray technology (TMA) is a relatively new approach for efficiently and economically assessing protein and gene expression across large ensembles of tissue specimens. Tissue microarray technology holds great potential for reducing the time and cost associated with conducting research in tissue banking, proteomics, and outcome studies. However, the sheer volume of images and other data generated from even limited studies involving tissue microarrays quickly approach the processing capacity and resources of a division or department. This challenge is compounded by the fact that large-scale projects in several areas of modern research rely upon multi-institutional efforts in which investigators and resources are spread out over multiple campuses, cities, and states. To address some of the data management issues several leading institutions have begun to develop their own “in-house” systems, independently, but such data will be only minimally useful if it isn’t accessible to others in the scientific community. Investigators at different institutions studying the same or related disorders might benefit from the synergy of sharing results. To facilitate sharing of TMA data across different database implementations, the Technical Standards Committee of the Association for Pathology Informatics organized workshops in efforts to establish a standardized TMA data exchange specification. The focus of our research does not relate to the establishment of standards for exchange, but rather builds on these efforts and concentrates on the design, development and deployment of a decentralized collaboratory for the unsupervised characterization, and seamless and secure discovery and sharing of TMA data. Specifically, we present a self-organizing, peer-to-peer indexing and discovery infrastructure for quantitatively assessing digitized TMA’s. The system utilizes a novel, optimized decentralized search engine that supports flexible querying, while guaranteeing that once information has been stored in the system, it will be found with bounded costs. PMID:19081778

  2. Porous Silicon Antibody Microarrays for Quantitative Analysis: Measurement of Free and Total PSA in Clinical Plasma Samples

    PubMed Central

    Tojo, Axel; Malm, Johan; Marko-Varga, György; Lilja, Hans; Laurell, Thomas

    2014-01-01

    The antibody microarrays have become widespread, but their use for quantitative analyses in clinical samples has not yet been established. We investigated an immunoassay based on nanoporous silicon antibody microarrays for quantification of total prostate-specific-antigen (PSA) in 80 clinical plasma samples, and provide quantitative data from a duplex microarray assay that simultaneously quantifies free and total PSA in plasma. To further develop the assay the porous silicon chips was placed into a standard 96-well microtiter plate for higher throughput analysis. The samples analyzed by this quantitative microarray were 80 plasma samples obtained from men undergoing clinical PSA testing (dynamic range: 0.14-44ng/ml, LOD: 0.14ng/ml). The second dataset, measuring free PSA (dynamic range: 0.40-74.9ng/ml, LOD: 0.47ng/ml) and total PSA (dynamic range: 0.87-295ng/ml, LOD: 0.76ng/ml), was also obtained from the clinical routine. The reference for the quantification was a commercially available assay, the ProStatus PSA Free/Total DELFIA. In an analysis of 80 plasma samples the microarray platform performs well across the range of total PSA levels. This assay might have the potential to substitute for the large-scale microtiter plate format in diagnostic applications. The duplex assay paves the way for a future quantitative multiplex assay, which analyses several prostate cancer biomarkers simultaneously. PMID:22921878

  3. Microcontact Printing of Thiol-Functionalized Ionic Liquid Microarrays for "Membrane-less" and "Spill-less" Gas Sensors.

    PubMed

    Gondosiswanto, Richard; Gunawan, Christian A; Hibbert, David B; Harper, Jason B; Zhao, Chuan

    2016-11-16

    Lab-on-a-chip systems have gained significant interest for both chemical synthesis and assays at the micro-to-nanoscale with a unique set of benefits. However, solvent volatility represents one of the major hurdles to the reliability and reproducibility of the lab-on-a-chip devices for large-scale applications. Here we demonstrate a strategy of combining nonvolatile and functionalized ionic liquids with microcontact printing for fabrication of "wall-less" microreactors and microfluidics with high reproducibility and high throughput. A range of thiol-functionalized ionic liquids have been synthesized and used as inks for microcontact printing of ionic liquid microdroplet arrays onto gold chips. The covalent bonds formed between the thiol-functionalized ionic liquids and the gold substrate offer enhanced stability of the ionic liquid microdroplets, compared to conventional nonfunctionalized ionic liquids, and these microdroplets remain stable in a range of nonpolar and polar solvents, including water. We further demonstrate the use of these open ionic liquid microarrays for fabrication of "membrane-less" and "spill-less" gas sensors with enhanced reproducibility and robustness. Ionic-liquid-based microarray and microfluidics fabricated using the described microcontact printing may provide a versatile platform for a diverse number of applications at scale.

  4. Performance analysis of clustering techniques over microarray data: A case study

    NASA Astrophysics Data System (ADS)

    Dash, Rasmita; Misra, Bijan Bihari

    2018-03-01

    Handling big data is one of the major issues in the field of statistical data analysis. In such investigation cluster analysis plays a vital role to deal with the large scale data. There are many clustering techniques with different cluster analysis approach. But which approach suits a particular dataset is difficult to predict. To deal with this problem a grading approach is introduced over many clustering techniques to identify a stable technique. But the grading approach depends on the characteristic of dataset as well as on the validity indices. So a two stage grading approach is implemented. In this study the grading approach is implemented over five clustering techniques like hybrid swarm based clustering (HSC), k-means, partitioning around medoids (PAM), vector quantization (VQ) and agglomerative nesting (AGNES). The experimentation is conducted over five microarray datasets with seven validity indices. The finding of grading approach that a cluster technique is significant is also established by Nemenyi post-hoc hypothetical test.

  5. A Customized DNA Microarray for Microbial Source Tracking ...

    EPA Pesticide Factsheets

    It is estimated that more than 160, 000 miles of rivers and streams in the United States are impaired due to the presence of waterborne pathogens. These pathogens typically originate from human and other animal fecal pollution sources; therefore, a rapid microbial source tracking (MST) method is needed to facilitate water quality assessment and impaired water remediation. We report a novel qualitative DNA microarray technology consisting of 453 probes for the detection of general fecal and host-associated bacteria, viruses, antibiotic resistance, and other environmentally relevant genetic indicators. A novel data normalization and reduction approach is also presented to help alleviate false positives often associated with high-density microarray applications. To evaluate the performance of the approach, DNA and cDNA was isolated from swine, cattle, duck, goose and gull fecal reference samples, as well as soiled poultry liter and raw municipal sewage. Based on nonmetric multidimensional scaling analysis of results, findings suggest that the novel microarray approach may be useful for pathogen detection and identification of fecal contamination in recreational waters. The ability to simultaneously detect a large collection of environmentally important genetic indicators in a single test has the potential to provide water quality managers with a wide range of information in a short period of time. Future research is warranted to measure microarray performance i

  6. DNA microarray analyses reveal a post-irradiation differential time-dependent gene expression profile in yeast cells exposed to X-rays and gamma-rays.

    PubMed

    Kimura, Shinzo; Ishidou, Emi; Kurita, Sakiko; Suzuki, Yoshiteru; Shibato, Junko; Rakwal, Randeep; Iwahashi, Hitoshi

    2006-07-21

    Ionizing radiation (IR) is the most enigmatic of genotoxic stress inducers in our environment that has been around from the eons of time. IR is generally considered harmful, and has been the subject of numerous studies, mostly looking at the DNA damaging effects in cells and the repair mechanisms therein. Moreover, few studies have focused on large-scale identification of cellular responses to IR, and to this end, we describe here an initial study on the transcriptional responses of the unicellular genome model, yeast (Saccharomyces cerevisiae strain S288C), by cDNA microarray. The effect of two different IR, X-rays, and gamma (gamma)-rays, was investigated by irradiating the yeast cells cultured in YPD medium with 50 Gy doses of X- and gamma-rays, followed by resuspension of the cells in YPD for time-course experiments. The samples were collected for microarray analysis at 20, 40, and 80 min after irradiation. Microarray analysis revealed a time-course transcriptional profile of changed gene expressions. Up-regulated genes belonged to the functional categories mainly related to cell cycle and DNA processing, cell rescue defense and virulence, protein and cell fate, and metabolism (X- and gamma-rays). Similarly, for X- and gamma-rays, the down-regulated genes belonged to mostly transcription and protein synthesis, cell cycle and DNA processing, control of cellular organization, cell fate, and C-compound and carbohydrate metabolism categories, respectively. This study provides for the first time a snapshot of the genome-wide mRNA expression profiles in X- and gamma-ray post-irradiated yeast cells and comparatively interprets/discusses the changed gene functional categories as effects of these two radiations vis-à-vis their energy levels.

  7. Development of DNA Microarrays for Metabolic Pathway and Bioprocess Monitoring

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gregory Stephanopoulos

    Transcriptional profiling experiments utilizing DNA microarrays to study the intracellular accumulation of PHB in Synechocystis has proved difficult in large part because strains that show significant differences in PHB which would justify global analysis of gene expression have not been isolated.

  8. Differential binding of calmodulin-related proteins to their targets revealed through high-density Arabidopsis protein microarrays

    PubMed Central

    Popescu, Sorina C.; Popescu, George V.; Bachan, Shawn; Zhang, Zimei; Seay, Montrell; Gerstein, Mark; Snyder, Michael; Dinesh-Kumar, S. P.

    2007-01-01

    Calmodulins (CaMs) are the most ubiquitous calcium sensors in eukaryotes. A number of CaM-binding proteins have been identified through classical methods, and many proteins have been predicted to bind CaMs based on their structural homology with known targets. However, multicellular organisms typically contain many CaM-like (CML) proteins, and a global identification of their targets and specificity of interaction is lacking. In an effort to develop a platform for large-scale analysis of proteins in plants we have developed a protein microarray and used it to study the global analysis of CaM/CML interactions. An Arabidopsis thaliana expression collection containing 1,133 ORFs was generated and used to produce proteins with an optimized medium-throughput plant-based expression system. Protein microarrays were prepared and screened with several CaMs/CMLs. A large number of previously known and novel CaM/CML targets were identified, including transcription factors, receptor and intracellular protein kinases, F-box proteins, RNA-binding proteins, and proteins of unknown function. Multiple CaM/CML proteins bound many binding partners, but the majority of targets were specific to one or a few CaMs/CMLs indicating that different CaM family members function through different targets. Based on our analyses, the emergent CaM/CML interactome is more extensive than previously predicted. Our results suggest that calcium functions through distinct CaM/CML proteins to regulate a wide range of targets and cellular activities. PMID:17360592

  9. Prediction of gene expression in embryonic structures of Drosophila melanogaster.

    PubMed

    Samsonova, Anastasia A; Niranjan, Mahesan; Russell, Steven; Brazma, Alvis

    2007-07-01

    Understanding how sets of genes are coordinately regulated in space and time to generate the diversity of cell types that characterise complex metazoans is a major challenge in modern biology. The use of high-throughput approaches, such as large-scale in situ hybridisation and genome-wide expression profiling via DNA microarrays, is beginning to provide insights into the complexities of development. However, in many organisms the collection and annotation of comprehensive in situ localisation data is a difficult and time-consuming task. Here, we present a widely applicable computational approach, integrating developmental time-course microarray data with annotated in situ hybridisation studies, that facilitates the de novo prediction of tissue-specific expression for genes that have no in vivo gene expression localisation data available. Using a classification approach, trained with data from microarray and in situ hybridisation studies of gene expression during Drosophila embryonic development, we made a set of predictions on the tissue-specific expression of Drosophila genes that have not been systematically characterised by in situ hybridisation experiments. The reliability of our predictions is confirmed by literature-derived annotations in FlyBase, by overrepresentation of Gene Ontology biological process annotations, and, in a selected set, by detailed gene-specific studies from the literature. Our novel organism-independent method will be of considerable utility in enriching the annotation of gene function and expression in complex multicellular organisms.

  10. Prediction of Gene Expression in Embryonic Structures of Drosophila melanogaster

    PubMed Central

    Samsonova, Anastasia A; Niranjan, Mahesan; Russell, Steven; Brazma, Alvis

    2007-01-01

    Understanding how sets of genes are coordinately regulated in space and time to generate the diversity of cell types that characterise complex metazoans is a major challenge in modern biology. The use of high-throughput approaches, such as large-scale in situ hybridisation and genome-wide expression profiling via DNA microarrays, is beginning to provide insights into the complexities of development. However, in many organisms the collection and annotation of comprehensive in situ localisation data is a difficult and time-consuming task. Here, we present a widely applicable computational approach, integrating developmental time-course microarray data with annotated in situ hybridisation studies, that facilitates the de novo prediction of tissue-specific expression for genes that have no in vivo gene expression localisation data available. Using a classification approach, trained with data from microarray and in situ hybridisation studies of gene expression during Drosophila embryonic development, we made a set of predictions on the tissue-specific expression of Drosophila genes that have not been systematically characterised by in situ hybridisation experiments. The reliability of our predictions is confirmed by literature-derived annotations in FlyBase, by overrepresentation of Gene Ontology biological process annotations, and, in a selected set, by detailed gene-specific studies from the literature. Our novel organism-independent method will be of considerable utility in enriching the annotation of gene function and expression in complex multicellular organisms. PMID:17658945

  11. Graph Based Study of Allergen Cross-Reactivity of Plant Lipid Transfer Proteins (LTPs) Using Microarray in a Multicenter Study

    PubMed Central

    Palacín, Arantxa; Gómez-Casado, Cristina; Rivas, Luis A.; Aguirre, Jacobo; Tordesillas, Leticia; Bartra, Joan; Blanco, Carlos; Carrillo, Teresa; Cuesta-Herranz, Javier; de Frutos, Consolación; Álvarez-Eire, Genoveva García; Fernández, Francisco J.; Gamboa, Pedro; Muñoz, Rosa; Sánchez-Monge, Rosa; Sirvent, Sofía; Torres, María J.; Varela-Losada, Susana; Rodríguez, Rosalía; Parro, Victor; Blanca, Miguel; Salcedo, Gabriel; Díaz-Perales, Araceli

    2012-01-01

    The study of cross-reactivity in allergy is key to both understanding. the allergic response of many patients and providing them with a rational treatment In the present study, protein microarrays and a co-sensitization graph approach were used in conjunction with an allergen microarray immunoassay. This enabled us to include a wide number of proteins and a large number of patients, and to study sensitization profiles among members of the LTP family. Fourteen LTPs from the most frequent plant food-induced allergies in the geographical area studied were printed into a microarray specifically designed for this research. 212 patients with fruit allergy and 117 food-tolerant pollen allergic subjects were recruited from seven regions of Spain with different pollen profiles, and their sera were tested with allergen microarray. This approach has proven itself to be a good tool to study cross-reactivity between members of LTP family, and could become a useful strategy to analyze other families of allergens. PMID:23272072

  12. At what scale should microarray data be analyzed?

    PubMed

    Huang, Shuguang; Yeo, Adeline A; Gelbert, Lawrence; Lin, Xi; Nisenbaum, Laura; Bemis, Kerry G

    2004-01-01

    The hybridization intensities derived from microarray experiments, for example Affymetrix's MAS5 signals, are very often transformed in one way or another before statistical models are fitted. The motivation for performing transformation is usually to satisfy the model assumptions such as normality and homogeneity in variance. Generally speaking, two types of strategies are often applied to microarray data depending on the analysis need: correlation analysis where all the gene intensities on the array are considered simultaneously, and gene-by-gene ANOVA where each gene is analyzed individually. We investigate the distributional properties of the Affymetrix GeneChip signal data under the two scenarios, focusing on the impact of analyzing the data at an inappropriate scale. The Box-Cox type of transformation is first investigated for the strategy of pooling genes. The commonly used log-transformation is particularly applied for comparison purposes. For the scenario where analysis is on a gene-by-gene basis, the model assumptions such as normality are explored. The impact of using a wrong scale is illustrated by log-transformation and quartic-root transformation. When all the genes on the array are considered together, the dependent relationship between the expression and its variation level can be satisfactorily removed by Box-Cox transformation. When genes are analyzed individually, the distributional properties of the intensities are shown to be gene dependent. Derivation and simulation show that some loss of power is incurred when a wrong scale is used, but due to the robustness of the t-test, the loss is acceptable when the fold-change is not very large.

  13. VLSI Microsystem for Rapid Bioinformatic Pattern Recognition

    NASA Technical Reports Server (NTRS)

    Fang, Wai-Chi; Lue, Jaw-Chyng

    2009-01-01

    A system comprising very-large-scale integrated (VLSI) circuits is being developed as a means of bioinformatics-oriented analysis and recognition of patterns of fluorescence generated in a microarray in an advanced, highly miniaturized, portable genetic-expression-assay instrument. Such an instrument implements an on-chip combination of polymerase chain reactions and electrochemical transduction for amplification and detection of deoxyribonucleic acid (DNA).

  14. Autonomous system for Web-based microarray image analysis.

    PubMed

    Bozinov, Daniel

    2003-12-01

    Software-based feature extraction from DNA microarray images still requires human intervention on various levels. Manual adjustment of grid and metagrid parameters, precise alignment of superimposed grid templates and gene spots, or simply identification of large-scale artifacts have to be performed beforehand to reliably analyze DNA signals and correctly quantify their expression values. Ideally, a Web-based system with input solely confined to a single microarray image and a data table as output containing measurements for all gene spots would directly transform raw image data into abstracted gene expression tables. Sophisticated algorithms with advanced procedures for iterative correction function can overcome imminent challenges in image processing. Herein is introduced an integrated software system with a Java-based interface on the client side that allows for decentralized access and furthermore enables the scientist to instantly employ the most updated software version at any given time. This software tool is extended from PixClust as used in Extractiff incorporated with Java Web Start deployment technology. Ultimately, this setup is destined for high-throughput pipelines in genome-wide medical diagnostics labs or microarray core facilities aimed at providing fully automated service to its users.

  15. High density DNA microarrays: algorithms and biomedical applications.

    PubMed

    Liu, Wei-Min

    2004-08-01

    DNA microarrays are devices capable of detecting the identity and abundance of numerous DNA or RNA segments in samples. They are used for analyzing gene expressions, identifying genetic markers and detecting mutations on a genomic scale. The fundamental chemical mechanism of DNA microarrays is the hybridization between probes and targets due to the hydrogen bonds of nucleotide base pairing. Since the cross hybridization is inevitable, and probes or targets may form undesirable secondary or tertiary structures, the microarray data contain noise and depend on experimental conditions. It is crucial to apply proper statistical algorithms to obtain useful signals from noisy data. After we obtained the signals of a large amount of probes, we need to derive the biomedical information such as the existence of a transcript in a cell, the difference of expression levels of a gene in multiple samples, and the type of a genetic marker. Furthermore, after the expression levels of thousands of genes or the genotypes of thousands of single nucleotide polymorphisms are determined, it is usually important to find a small number of genes or markers that are related to a disease, individual reactions to drugs, or other phenotypes. All these applications need careful data analyses and reliable algorithms.

  16. Evaluation of two outlier-detection-based methods for detecting tissue-selective genes from microarray data.

    PubMed

    Kadota, Koji; Konishi, Tomokazu; Shimizu, Kentaro

    2007-05-01

    Large-scale expression profiling using DNA microarrays enables identification of tissue-selective genes for which expression is considerably higher and/or lower in some tissues than in others. Among numerous possible methods, only two outlier-detection-based methods (an AIC-based method and Sprent's non-parametric method) can treat equally various types of selective patterns, but they produce substantially different results. We investigated the performance of these two methods for different parameter settings and for a reduced number of samples. We focused on their ability to detect selective expression patterns robustly. We applied them to public microarray data collected from 36 normal human tissue samples and analyzed the effects of both changing the parameter settings and reducing the number of samples. The AIC-based method was more robust in both cases. The findings confirm that the use of the AIC-based method in the recently proposed ROKU method for detecting tissue-selective expression patterns is correct and that Sprent's method is not suitable for ROKU.

  17. Performance of automated scoring of ER, PR, HER2, CK5/6 and EGFR in breast cancer tissue microarrays in the Breast Cancer Association Consortium

    PubMed Central

    Howat, William J; Blows, Fiona M; Provenzano, Elena; Brook, Mark N; Morris, Lorna; Gazinska, Patrycja; Johnson, Nicola; McDuffus, Leigh‐Anne; Miller, Jodi; Sawyer, Elinor J; Pinder, Sarah; van Deurzen, Carolien H M; Jones, Louise; Sironen, Reijo; Visscher, Daniel; Caldas, Carlos; Daley, Frances; Coulson, Penny; Broeks, Annegien; Sanders, Joyce; Wesseling, Jelle; Nevanlinna, Heli; Fagerholm, Rainer; Blomqvist, Carl; Heikkilä, Päivi; Ali, H Raza; Dawson, Sarah‐Jane; Figueroa, Jonine; Lissowska, Jolanta; Brinton, Louise; Mannermaa, Arto; Kataja, Vesa; Kosma, Veli‐Matti; Cox, Angela; Brock, Ian W; Cross, Simon S; Reed, Malcolm W; Couch, Fergus J; Olson, Janet E; Devillee, Peter; Mesker, Wilma E; Seyaneve, Caroline M; Hollestelle, Antoinette; Benitez, Javier; Perez, Jose Ignacio Arias; Menéndez, Primitiva; Bolla, Manjeet K; Easton, Douglas F; Schmidt, Marjanka K; Pharoah, Paul D; Sherman, Mark E

    2014-01-01

    Abstract Breast cancer risk factors and clinical outcomes vary by tumour marker expression. However, individual studies often lack the power required to assess these relationships, and large‐scale analyses are limited by the need for high throughput, standardized scoring methods. To address these limitations, we assessed whether automated image analysis of immunohistochemically stained tissue microarrays can permit rapid, standardized scoring of tumour markers from multiple studies. Tissue microarray sections prepared in nine studies containing 20 263 cores from 8267 breast cancers stained for two nuclear (oestrogen receptor, progesterone receptor), two membranous (human epidermal growth factor receptor 2 and epidermal growth factor receptor) and one cytoplasmic (cytokeratin 5/6) marker were scanned as digital images. Automated algorithms were used to score markers in tumour cells using the Ariol system. We compared automated scores against visual reads, and their associations with breast cancer survival. Approximately 65–70% of tissue microarray cores were satisfactory for scoring. Among satisfactory cores, agreement between dichotomous automated and visual scores was highest for oestrogen receptor (Kappa = 0.76), followed by human epidermal growth factor receptor 2 (Kappa = 0.69) and progesterone receptor (Kappa = 0.67). Automated quantitative scores for these markers were associated with hazard ratios for breast cancer mortality in a dose‐response manner. Considering visual scores of epidermal growth factor receptor or cytokeratin 5/6 as the reference, automated scoring achieved excellent negative predictive value (96–98%), but yielded many false positives (positive predictive value = 30–32%). For all markers, we observed substantial heterogeneity in automated scoring performance across tissue microarrays. Automated analysis is a potentially useful tool for large‐scale, quantitative scoring of immunohistochemically stained tissue microarrays available in consortia. However, continued optimization, rigorous marker‐specific quality control measures and standardization of tissue microarray designs, staining and scoring protocols is needed to enhance results. PMID:27499890

  18. SaDA: From Sampling to Data Analysis—An Extensible Open Source Infrastructure for Rapid, Robust and Automated Management and Analysis of Modern Ecological High-Throughput Microarray Data

    PubMed Central

    Singh, Kumar Saurabh; Thual, Dominique; Spurio, Roberto; Cannata, Nicola

    2015-01-01

    One of the most crucial characteristics of day-to-day laboratory information management is the collection, storage and retrieval of information about research subjects and environmental or biomedical samples. An efficient link between sample data and experimental results is absolutely important for the successful outcome of a collaborative project. Currently available software solutions are largely limited to large scale, expensive commercial Laboratory Information Management Systems (LIMS). Acquiring such LIMS indeed can bring laboratory information management to a higher level, but most of the times this requires a sufficient investment of money, time and technical efforts. There is a clear need for a light weighted open source system which can easily be managed on local servers and handled by individual researchers. Here we present a software named SaDA for storing, retrieving and analyzing data originated from microorganism monitoring experiments. SaDA is fully integrated in the management of environmental samples, oligonucleotide sequences, microarray data and the subsequent downstream analysis procedures. It is simple and generic software, and can be extended and customized for various environmental and biomedical studies. PMID:26047146

  19. SaDA: From Sampling to Data Analysis-An Extensible Open Source Infrastructure for Rapid, Robust and Automated Management and Analysis of Modern Ecological High-Throughput Microarray Data.

    PubMed

    Singh, Kumar Saurabh; Thual, Dominique; Spurio, Roberto; Cannata, Nicola

    2015-06-03

    One of the most crucial characteristics of day-to-day laboratory information management is the collection, storage and retrieval of information about research subjects and environmental or biomedical samples. An efficient link between sample data and experimental results is absolutely important for the successful outcome of a collaborative project. Currently available software solutions are largely limited to large scale, expensive commercial Laboratory Information Management Systems (LIMS). Acquiring such LIMS indeed can bring laboratory information management to a higher level, but most of the times this requires a sufficient investment of money, time and technical efforts. There is a clear need for a light weighted open source system which can easily be managed on local servers and handled by individual researchers. Here we present a software named SaDA for storing, retrieving and analyzing data originated from microorganism monitoring experiments. SaDA is fully integrated in the management of environmental samples, oligonucleotide sequences, microarray data and the subsequent downstream analysis procedures. It is simple and generic software, and can be extended and customized for various environmental and biomedical studies.

  20. The Porcelain Crab Transcriptome and PCAD, the Porcelain Crab Microarray and Sequence Database

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tagmount, Abderrahmane; Wang, Mei; Lindquist, Erika

    2010-01-27

    Background: With the emergence of a completed genome sequence of the freshwater crustacean Daphnia pulex, construction of genomic-scale sequence databases for additional crustacean sequences are important for comparative genomics and annotation. Porcelain crabs, genus Petrolisthes, have been powerful crustacean models for environmental and evolutionary physiology with respect to thermal adaptation and understanding responses of marine organisms to climate change. Here, we present a large-scale EST sequencing and cDNA microarray database project for the porcelain crab Petrolisthes cinctipes. Methodology/Principal Findings: A set of ~;;30K unique sequences (UniSeqs) representing ~;;19K clusters were generated from ~;;98K high quality ESTs from a set ofmore » tissue specific non-normalized and mixed-tissue normalized cDNA libraries from the porcelain crab Petrolisthes cinctipes. Homology for each UniSeq was assessed using BLAST, InterProScan, GO and KEGG database searches. Approximately 66percent of the UniSeqs had homology in at least one of the databases. All EST and UniSeq sequences along with annotation results and coordinated cDNA microarray datasets have been made publicly accessible at the Porcelain Crab Array Database (PCAD), a feature-enriched version of the Stanford and Longhorn Array Databases.Conclusions/Significance: The EST project presented here represents the third largest sequencing effort for any crustacean, and the largest effort for any crab species. Our assembly and clustering results suggest that our porcelain crab EST data set is equally diverse to the much larger EST set generated in the Daphnia pulex genome sequencing project, and thus will be an important resource to the Daphnia research community. Our homology results support the pancrustacea hypothesis and suggest that Malacostraca may be ancestral to Branchiopoda and Hexapoda. Our results also suggest that our cDNA microarrays cover as much of the transcriptome as can reasonably be captured in EST library sequencing approaches, and thus represent a rich resource for studies of environmental genomics.« less

  1. Meta-analysis methods for combining multiple expression profiles: comparisons, statistical characterization and an application guideline

    PubMed Central

    2013-01-01

    Background As high-throughput genomic technologies become accurate and affordable, an increasing number of data sets have been accumulated in the public domain and genomic information integration and meta-analysis have become routine in biomedical research. In this paper, we focus on microarray meta-analysis, where multiple microarray studies with relevant biological hypotheses are combined in order to improve candidate marker detection. Many methods have been developed and applied in the literature, but their performance and properties have only been minimally investigated. There is currently no clear conclusion or guideline as to the proper choice of a meta-analysis method given an application; the decision essentially requires both statistical and biological considerations. Results We performed 12 microarray meta-analysis methods for combining multiple simulated expression profiles, and such methods can be categorized for different hypothesis setting purposes: (1) HS A : DE genes with non-zero effect sizes in all studies, (2) HS B : DE genes with non-zero effect sizes in one or more studies and (3) HS r : DE gene with non-zero effect in "majority" of studies. We then performed a comprehensive comparative analysis through six large-scale real applications using four quantitative statistical evaluation criteria: detection capability, biological association, stability and robustness. We elucidated hypothesis settings behind the methods and further apply multi-dimensional scaling (MDS) and an entropy measure to characterize the meta-analysis methods and data structure, respectively. Conclusions The aggregated results from the simulation study categorized the 12 methods into three hypothesis settings (HS A , HS B , and HS r ). Evaluation in real data and results from MDS and entropy analyses provided an insightful and practical guideline to the choice of the most suitable method in a given application. All source files for simulation and real data are available on the author’s publication website. PMID:24359104

  2. Meta-analysis methods for combining multiple expression profiles: comparisons, statistical characterization and an application guideline.

    PubMed

    Chang, Lun-Ching; Lin, Hui-Min; Sibille, Etienne; Tseng, George C

    2013-12-21

    As high-throughput genomic technologies become accurate and affordable, an increasing number of data sets have been accumulated in the public domain and genomic information integration and meta-analysis have become routine in biomedical research. In this paper, we focus on microarray meta-analysis, where multiple microarray studies with relevant biological hypotheses are combined in order to improve candidate marker detection. Many methods have been developed and applied in the literature, but their performance and properties have only been minimally investigated. There is currently no clear conclusion or guideline as to the proper choice of a meta-analysis method given an application; the decision essentially requires both statistical and biological considerations. We performed 12 microarray meta-analysis methods for combining multiple simulated expression profiles, and such methods can be categorized for different hypothesis setting purposes: (1) HS(A): DE genes with non-zero effect sizes in all studies, (2) HS(B): DE genes with non-zero effect sizes in one or more studies and (3) HS(r): DE gene with non-zero effect in "majority" of studies. We then performed a comprehensive comparative analysis through six large-scale real applications using four quantitative statistical evaluation criteria: detection capability, biological association, stability and robustness. We elucidated hypothesis settings behind the methods and further apply multi-dimensional scaling (MDS) and an entropy measure to characterize the meta-analysis methods and data structure, respectively. The aggregated results from the simulation study categorized the 12 methods into three hypothesis settings (HS(A), HS(B), and HS(r)). Evaluation in real data and results from MDS and entropy analyses provided an insightful and practical guideline to the choice of the most suitable method in a given application. All source files for simulation and real data are available on the author's publication website.

  3. Deciphering the glycosaminoglycan code with the help of microarrays.

    PubMed

    de Paz, Jose L; Seeberger, Peter H

    2008-07-01

    Carbohydrate microarrays have become a powerful tool to elucidate the biological role of complex sugars. Microarrays are particularly useful for the study of glycosaminoglycans (GAGs), a key class of carbohydrates. The high-throughput chip format enables rapid screening of large numbers of potential GAG sequences produced via a complex biosynthesis while consuming very little sample. Here, we briefly highlight the most recent advances involving GAG microarrays built with synthetic or naturally derived oligosaccharides. These chips are powerful tools for characterizing GAG-protein interactions and determining structure-activity relationships for specific sequences. Thereby, they contribute to decoding the information contained in specific GAG sequences.

  4. Prostate Cancer Prevention Through Induction of Phase 2 Enzymes

    DTIC Science & Technology

    2001-04-01

    enzymes. During our Phase I Award, we identified sulforaphane as the most potent inducer of carcinogen defenses in the prostate cell. We have...characterized global effects of sulforaphane in prostate cancer cell lines using cDNA microarray technology that allows large-scale determination of changes...of sulforaphane ) and decreased risk of prostate cancer. These findings argue strongly for a preventive intervention trial involving supplementation

  5. Prior robust empirical Bayes inference for large-scale data by conditioning on rank with application to microarray data

    PubMed Central

    Liao, J. G.; Mcmurry, Timothy; Berg, Arthur

    2014-01-01

    Empirical Bayes methods have been extensively used for microarray data analysis by modeling the large number of unknown parameters as random effects. Empirical Bayes allows borrowing information across genes and can automatically adjust for multiple testing and selection bias. However, the standard empirical Bayes model can perform poorly if the assumed working prior deviates from the true prior. This paper proposes a new rank-conditioned inference in which the shrinkage and confidence intervals are based on the distribution of the error conditioned on rank of the data. Our approach is in contrast to a Bayesian posterior, which conditions on the data themselves. The new method is almost as efficient as standard Bayesian methods when the working prior is close to the true prior, and it is much more robust when the working prior is not close. In addition, it allows a more accurate (but also more complex) non-parametric estimate of the prior to be easily incorporated, resulting in improved inference. The new method’s prior robustness is demonstrated via simulation experiments. Application to a breast cancer gene expression microarray dataset is presented. Our R package rank.Shrinkage provides a ready-to-use implementation of the proposed methodology. PMID:23934072

  6. PageMan: an interactive ontology tool to generate, display, and annotate overview graphs for profiling experiments.

    PubMed

    Usadel, Björn; Nagel, Axel; Steinhauser, Dirk; Gibon, Yves; Bläsing, Oliver E; Redestig, Henning; Sreenivasulu, Nese; Krall, Leonard; Hannah, Matthew A; Poree, Fabien; Fernie, Alisdair R; Stitt, Mark

    2006-12-18

    Microarray technology has become a widely accepted and standardized tool in biology. The first microarray data analysis programs were developed to support pair-wise comparison. However, as microarray experiments have become more routine, large scale experiments have become more common, which investigate multiple time points or sets of mutants or transgenics. To extract biological information from such high-throughput expression data, it is necessary to develop efficient analytical platforms, which combine manually curated gene ontologies with efficient visualization and navigation tools. Currently, most tools focus on a few limited biological aspects, rather than offering a holistic, integrated analysis. Here we introduce PageMan, a multiplatform, user-friendly, and stand-alone software tool that annotates, investigates, and condenses high-throughput microarray data in the context of functional ontologies. It includes a GUI tool to transform different ontologies into a suitable format, enabling the user to compare and choose between different ontologies. It is equipped with several statistical modules for data analysis, including over-representation analysis and Wilcoxon statistical testing. Results are exported in a graphical format for direct use, or for further editing in graphics programs.PageMan provides a fast overview of single treatments, allows genome-level responses to be compared across several microarray experiments covering, for example, stress responses at multiple time points. This aids in searching for trait-specific changes in pathways using mutants or transgenics, analyzing development time-courses, and comparison between species. In a case study, we analyze the results of publicly available microarrays of multiple cold stress experiments using PageMan, and compare the results to a previously published meta-analysis.PageMan offers a complete user's guide, a web-based over-representation analysis as well as a tutorial, and is freely available at http://mapman.mpimp-golm.mpg.de/pageman/. PageMan allows multiple microarray experiments to be efficiently condensed into a single page graphical display. The flexible interface allows data to be quickly and easily visualized, facilitating comparisons within experiments and to published experiments, thus enabling researchers to gain a rapid overview of the biological responses in the experiments.

  7. A Review of Feature Extraction Software for Microarray Gene Expression Data

    PubMed Central

    Tan, Ching Siang; Ting, Wai Soon; Mohamad, Mohd Saberi; Chan, Weng Howe; Deris, Safaai; Ali Shah, Zuraini

    2014-01-01

    When gene expression data are too large to be processed, they are transformed into a reduced representation set of genes. Transforming large-scale gene expression data into a set of genes is called feature extraction. If the genes extracted are carefully chosen, this gene set can extract the relevant information from the large-scale gene expression data, allowing further analysis by using this reduced representation instead of the full size data. In this paper, we review numerous software applications that can be used for feature extraction. The software reviewed is mainly for Principal Component Analysis (PCA), Independent Component Analysis (ICA), Partial Least Squares (PLS), and Local Linear Embedding (LLE). A summary and sources of the software are provided in the last section for each feature extraction method. PMID:25250315

  8. ArrayNinja: An Open Source Platform for Unified Planning and Analysis of Microarray Experiments.

    PubMed

    Dickson, B M; Cornett, E M; Ramjan, Z; Rothbart, S B

    2016-01-01

    Microarray-based proteomic platforms have emerged as valuable tools for studying various aspects of protein function, particularly in the field of chromatin biochemistry. Microarray technology itself is largely unrestricted in regard to printable material and platform design, and efficient multidimensional optimization of assay parameters requires fluidity in the design and analysis of custom print layouts. This motivates the need for streamlined software infrastructure that facilitates the combined planning and analysis of custom microarray experiments. To this end, we have developed ArrayNinja as a portable, open source, and interactive application that unifies the planning and visualization of microarray experiments and provides maximum flexibility to end users. Array experiments can be planned, stored to a private database, and merged with the imaged results for a level of data interaction and centralization that is not currently attainable with available microarray informatics tools. © 2016 Elsevier Inc. All rights reserved.

  9. Directed module detection in a large-scale expression compendium.

    PubMed

    Fu, Qiang; Lemmens, Karen; Sanchez-Rodriguez, Aminael; Thijs, Inge M; Meysman, Pieter; Sun, Hong; Fierro, Ana Carolina; Engelen, Kristof; Marchal, Kathleen

    2012-01-01

    Public online microarray databases contain tremendous amounts of expression data. Mining these data sources can provide a wealth of information on the underlying transcriptional networks. In this chapter, we illustrate how the web services COLOMBOS and DISTILLER can be used to identify condition-dependent coexpression modules by exploring compendia of public expression data. COLOMBOS is designed for user-specified query-driven analysis, whereas DISTILLER generates a global regulatory network overview. The user is guided through both web services by means of a case study in which condition-dependent coexpression modules comprising a gene of interest (i.e., "directed") are identified.

  10. An efficient pseudomedian filter for tiling microrrays.

    PubMed

    Royce, Thomas E; Carriero, Nicholas J; Gerstein, Mark B

    2007-06-07

    Tiling microarrays are becoming an essential technology in the functional genomics toolbox. They have been applied to the tasks of novel transcript identification, elucidation of transcription factor binding sites, detection of methylated DNA and several other applications in several model organisms. These experiments are being conducted at increasingly finer resolutions as the microarray technology enjoys increasingly greater feature densities. The increased densities naturally lead to increased data analysis requirements. Specifically, the most widely employed algorithm for tiling array analysis involves smoothing observed signals by computing pseudomedians within sliding windows, a O(n2logn) calculation in each window. This poor time complexity is an issue for tiling array analysis and could prove to be a real bottleneck as tiling microarray experiments become grander in scope and finer in resolution. We therefore implemented Monahan's HLQEST algorithm that reduces the runtime complexity for computing the pseudomedian of n numbers to O(nlogn) from O(n2logn). For a representative tiling microarray dataset, this modification reduced the smoothing procedure's runtime by nearly 90%. We then leveraged the fact that elements within sliding windows remain largely unchanged in overlapping windows (as one slides across genomic space) to further reduce computation by an additional 43%. This was achieved by the application of skip lists to maintaining a sorted list of values from window to window. This sorted list could be maintained with simple O(log n) inserts and deletes. We illustrate the favorable scaling properties of our algorithms with both time complexity analysis and benchmarking on synthetic datasets. Tiling microarray analyses that rely upon a sliding window pseudomedian calculation can require many hours of computation. We have eased this requirement significantly by implementing efficient algorithms that scale well with genomic feature density. This result not only speeds the current standard analyses, but also makes possible ones where many iterations of the filter may be required, such as might be required in a bootstrap or parameter estimation setting. Source code and executables are available at http://tiling.gersteinlab.org/pseudomedian/.

  11. An efficient pseudomedian filter for tiling microrrays

    PubMed Central

    Royce, Thomas E; Carriero, Nicholas J; Gerstein, Mark B

    2007-01-01

    Background Tiling microarrays are becoming an essential technology in the functional genomics toolbox. They have been applied to the tasks of novel transcript identification, elucidation of transcription factor binding sites, detection of methylated DNA and several other applications in several model organisms. These experiments are being conducted at increasingly finer resolutions as the microarray technology enjoys increasingly greater feature densities. The increased densities naturally lead to increased data analysis requirements. Specifically, the most widely employed algorithm for tiling array analysis involves smoothing observed signals by computing pseudomedians within sliding windows, a O(n2logn) calculation in each window. This poor time complexity is an issue for tiling array analysis and could prove to be a real bottleneck as tiling microarray experiments become grander in scope and finer in resolution. Results We therefore implemented Monahan's HLQEST algorithm that reduces the runtime complexity for computing the pseudomedian of n numbers to O(nlogn) from O(n2logn). For a representative tiling microarray dataset, this modification reduced the smoothing procedure's runtime by nearly 90%. We then leveraged the fact that elements within sliding windows remain largely unchanged in overlapping windows (as one slides across genomic space) to further reduce computation by an additional 43%. This was achieved by the application of skip lists to maintaining a sorted list of values from window to window. This sorted list could be maintained with simple O(log n) inserts and deletes. We illustrate the favorable scaling properties of our algorithms with both time complexity analysis and benchmarking on synthetic datasets. Conclusion Tiling microarray analyses that rely upon a sliding window pseudomedian calculation can require many hours of computation. We have eased this requirement significantly by implementing efficient algorithms that scale well with genomic feature density. This result not only speeds the current standard analyses, but also makes possible ones where many iterations of the filter may be required, such as might be required in a bootstrap or parameter estimation setting. Source code and executables are available at . PMID:17555595

  12. Development and validation of a microarray for the investigation of the CAZymes encoded by the human gut microbiome.

    PubMed

    El Kaoutari, Abdessamad; Armougom, Fabrice; Leroy, Quentin; Vialettes, Bernard; Million, Matthieu; Raoult, Didier; Henrissat, Bernard

    2013-01-01

    Distal gut bacteria play a pivotal role in the digestion of dietary polysaccharides by producing a large number of carbohydrate-active enzymes (CAZymes) that the host otherwise does not produce. We report here the design of a custom microarray that we used to spot non-redundant DNA probes for more than 6,500 genes encoding glycoside hydrolases and lyases selected from 174 reference genomes from distal gut bacteria. The custom microarray was tested and validated by the hybridization of bacterial DNA extracted from the stool samples of lean, obese and anorexic individuals. Our results suggest that a microarray-based study can detect genes from low-abundance bacteria better than metagenomic-based studies. A striking example was the finding that a gene encoding a GH6-family cellulase was present in all subjects examined, whereas metagenomic studies have consistently failed to detect this gene in both human and animal gut microbiomes. In addition, an examination of eight stool samples allowed the identification of a corresponding CAZome core containing 46 families of glycoside hydrolases and polysaccharide lyases, which suggests the functional stability of the gut microbiota despite large taxonomical variations between individuals.

  13. The Development of Protein Microarrays and Their Applications in DNA-Protein and Protein-Protein Interaction Analyses of Arabidopsis Transcription Factors

    PubMed Central

    Gong, Wei; He, Kun; Covington, Mike; Dinesh-Kumar, S. P.; Snyder, Michael; Harmer, Stacey L.; Zhu, Yu-Xian; Deng, Xing Wang

    2009-01-01

    We used our collection of Arabidopsis transcription factor (TF) ORFeome clones to construct protein microarrays containing as many as 802 TF proteins. These protein microarrays were used for both protein-DNA and protein-protein interaction analyses. For protein-DNA interaction studies, we examined AP2/ERF family TFs and their cognate cis-elements. By careful comparison of the DNA-binding specificity of 13 TFs on the protein microarray with previous non-microarray data, we showed that protein microarrays provide an efficient and high throughput tool for genome-wide analysis of TF-DNA interactions. This microarray protein-DNA interaction analysis allowed us to derive a comprehensive view of DNA-binding profiles of AP2/ERF family proteins in Arabidopsis. It also revealed four TFs that bound the EE (evening element) and had the expected phased gene expression under clock-regulation, thus providing a basis for further functional analysis of their roles in clock regulation of gene expression. We also developed procedures for detecting protein interactions using this TF protein microarray and discovered four novel partners that interact with HY5, which can be validated by yeast two-hybrid assays. Thus, plant TF protein microarrays offer an attractive high-throughput alternative to traditional techniques for TF functional characterization on a global scale. PMID:19802365

  14. Solving satisfiability problems using a novel microarray-based DNA computer.

    PubMed

    Lin, Che-Hsin; Cheng, Hsiao-Ping; Yang, Chang-Biau; Yang, Chia-Ning

    2007-01-01

    An algorithm based on a modified sticker model accompanied with an advanced MEMS-based microarray technology is demonstrated to solve SAT problem, which has long served as a benchmark in DNA computing. Unlike conventional DNA computing algorithms needing an initial data pool to cover correct and incorrect answers and further executing a series of separation procedures to destroy the unwanted ones, we built solutions in parts to satisfy one clause in one step, and eventually solve the entire Boolean formula through steps. No time-consuming sample preparation procedures and delicate sample applying equipment were required for the computing process. Moreover, experimental results show the bound DNA sequences can sustain the chemical solutions during computing processes such that the proposed method shall be useful in dealing with large-scale problems.

  15. Patterns Cancer Prevention Through Induction of Phase 2 Enzymes

    DTIC Science & Technology

    2003-04-01

    2) enzymes. During our Phase I Award, we identified sulforaphane as the most potent inducer of carcinogen defenses in the prostate cell. We have...characterized global effects of sulforaphane in prostate cancer cell lines using cDNA microarray technology that allows large-scale determination of...changes in gene expression. These findings argue strongly for a preventive intervention trial involving with sulforaphane . During our Phase 2 Award, we used

  16. Evaluation of Two Outlier-Detection-Based Methods for Detecting Tissue-Selective Genes from Microarray Data

    PubMed Central

    Kadota, Koji; Konishi, Tomokazu; Shimizu, Kentaro

    2007-01-01

    Large-scale expression profiling using DNA microarrays enables identification of tissue-selective genes for which expression is considerably higher and/or lower in some tissues than in others. Among numerous possible methods, only two outlier-detection-based methods (an AIC-based method and Sprent’s non-parametric method) can treat equally various types of selective patterns, but they produce substantially different results. We investigated the performance of these two methods for different parameter settings and for a reduced number of samples. We focused on their ability to detect selective expression patterns robustly. We applied them to public microarray data collected from 36 normal human tissue samples and analyzed the effects of both changing the parameter settings and reducing the number of samples. The AIC-based method was more robust in both cases. The findings confirm that the use of the AIC-based method in the recently proposed ROKU method for detecting tissue-selective expression patterns is correct and that Sprent’s method is not suitable for ROKU. PMID:19936074

  17. CEM-designer: design of custom expression microarrays in the post-ENCODE Era.

    PubMed

    Arnold, Christian; Externbrink, Fabian; Hackermüller, Jörg; Reiche, Kristin

    2014-11-10

    Microarrays are widely used in gene expression studies, and custom expression microarrays are popular to monitor expression changes of a customer-defined set of genes. However, the complexity of transcriptomes uncovered recently make custom expression microarray design a non-trivial task. Pervasive transcription and alternative processing of transcripts generate a wealth of interweaved transcripts that requires well-considered probe design strategies and is largely neglected in existing approaches. We developed the web server CEM-Designer that facilitates microarray platform independent design of custom expression microarrays for complex transcriptomes. CEM-Designer covers (i) the collection and generation of a set of unique target sequences from different sources and (ii) the selection of a set of sensitive and specific probes that optimally represents the target sequences. Probe design itself is left to third party software to ensure that probes meet provider-specific constraints. CEM-Designer is available at http://designpipeline.bioinf.uni-leipzig.de. Copyright © 2014 Elsevier B.V. All rights reserved.

  18. Gene expression profiling of human mesenchymal stem cells derived from bone marrow during expansion and osteoblast differentiation.

    PubMed

    Kulterer, Birgit; Friedl, Gerald; Jandrositz, Anita; Sanchez-Cabo, Fatima; Prokesch, Andreas; Paar, Christine; Scheideler, Marcel; Windhager, Reinhard; Preisegger, Karl-Heinz; Trajanoski, Zlatko

    2007-03-12

    Human mesenchymal stem cells (MSC) with the capacity to differentiate into osteoblasts provide potential for the development of novel treatment strategies, such as improved healing of large bone defects. However, their low frequency in bone marrow necessitate ex vivo expansion for further clinical application. In this study we asked if MSC are developing in an aberrant or unwanted way during ex vivo long-term cultivation and if artificial cultivation conditions exert any influence on their stem cell maintenance. To address this question we first developed human oligonucleotide microarrays with 30.000 elements and then performed large-scale expression profiling of long-term expanded MSC and MSC during differentiation into osteoblasts. The results showed that MSC did not alter their osteogenic differentiation capacity, surface marker profile, and the expression profiles of MSC during expansion. Microarray analysis of MSC during osteogenic differentiation identified three candidate genes for further examination and functional analysis: ID4, CRYAB, and SORT1. Additionally, we were able to reconstruct the three developmental phases during osteoblast differentiation: proliferation, matrix maturation, and mineralization, and illustrate the activation of the SMAD signaling pathways by TGF-beta2 and BMPs. With a variety of assays we could show that MSC represent a cell population which can be expanded for therapeutic applications.

  19. Construction of a cDNA microarray derived from the ascidian Ciona intestinalis.

    PubMed

    Azumi, Kaoru; Takahashi, Hiroki; Miki, Yasufumi; Fujie, Manabu; Usami, Takeshi; Ishikawa, Hisayoshi; Kitayama, Atsusi; Satou, Yutaka; Ueno, Naoto; Satoh, Nori

    2003-10-01

    A cDNA microarray was constructed from a basal chordate, the ascidian Ciona intestinalis. The draft genome of Ciona has been read and inferred to contain approximately 16,000 protein-coding genes, and cDNAs for transcripts of 13,464 genes have been characterized and compiled as the "Ciona intestinalis Gene Collection Release I". In the present study, we constructed a cDNA microarray of these 13,464 Ciona genes. A preliminary experiment with Cy3- and Cy5-labeled probes showed extensive differential gene expression between fertilized eggs and larvae. In addition, there was a good correlation between results obtained by the present microarray analysis and those from previous EST analyses. This first microarray of a large collection of Ciona intestinalis cDNA clones should facilitate the analysis of global gene expression and gene networks during the embryogenesis of basal chordates.

  20. EDGE3: A web-based solution for management and analysis of Agilent two color microarray experiments

    PubMed Central

    Vollrath, Aaron L; Smith, Adam A; Craven, Mark; Bradfield, Christopher A

    2009-01-01

    Background The ability to generate transcriptional data on the scale of entire genomes has been a boon both in the improvement of biological understanding and in the amount of data generated. The latter, the amount of data generated, has implications when it comes to effective storage, analysis and sharing of these data. A number of software tools have been developed to store, analyze, and share microarray data. However, a majority of these tools do not offer all of these features nor do they specifically target the commonly used two color Agilent DNA microarray platform. Thus, the motivating factor for the development of EDGE3 was to incorporate the storage, analysis and sharing of microarray data in a manner that would provide a means for research groups to collaborate on Agilent-based microarray experiments without a large investment in software-related expenditures or extensive training of end-users. Results EDGE3 has been developed with two major functions in mind. The first function is to provide a workflow process for the generation of microarray data by a research laboratory or a microarray facility. The second is to store, analyze, and share microarray data in a manner that doesn't require complicated software. To satisfy the first function, EDGE3 has been developed as a means to establish a well defined experimental workflow and information system for microarray generation. To satisfy the second function, the software application utilized as the user interface of EDGE3 is a web browser. Within the web browser, a user is able to access the entire functionality, including, but not limited to, the ability to perform a number of bioinformatics based analyses, collaborate between research groups through a user-based security model, and access to the raw data files and quality control files generated by the software used to extract the signals from an array image. Conclusion Here, we present EDGE3, an open-source, web-based application that allows for the storage, analysis, and controlled sharing of transcription-based microarray data generated on the Agilent DNA platform. In addition, EDGE3 provides a means for managing RNA samples and arrays during the hybridization process. EDGE3 is freely available for download at . PMID:19732451

  1. EDGE(3): a web-based solution for management and analysis of Agilent two color microarray experiments.

    PubMed

    Vollrath, Aaron L; Smith, Adam A; Craven, Mark; Bradfield, Christopher A

    2009-09-04

    The ability to generate transcriptional data on the scale of entire genomes has been a boon both in the improvement of biological understanding and in the amount of data generated. The latter, the amount of data generated, has implications when it comes to effective storage, analysis and sharing of these data. A number of software tools have been developed to store, analyze, and share microarray data. However, a majority of these tools do not offer all of these features nor do they specifically target the commonly used two color Agilent DNA microarray platform. Thus, the motivating factor for the development of EDGE(3) was to incorporate the storage, analysis and sharing of microarray data in a manner that would provide a means for research groups to collaborate on Agilent-based microarray experiments without a large investment in software-related expenditures or extensive training of end-users. EDGE(3) has been developed with two major functions in mind. The first function is to provide a workflow process for the generation of microarray data by a research laboratory or a microarray facility. The second is to store, analyze, and share microarray data in a manner that doesn't require complicated software. To satisfy the first function, EDGE3 has been developed as a means to establish a well defined experimental workflow and information system for microarray generation. To satisfy the second function, the software application utilized as the user interface of EDGE(3) is a web browser. Within the web browser, a user is able to access the entire functionality, including, but not limited to, the ability to perform a number of bioinformatics based analyses, collaborate between research groups through a user-based security model, and access to the raw data files and quality control files generated by the software used to extract the signals from an array image. Here, we present EDGE(3), an open-source, web-based application that allows for the storage, analysis, and controlled sharing of transcription-based microarray data generated on the Agilent DNA platform. In addition, EDGE(3) provides a means for managing RNA samples and arrays during the hybridization process. EDGE(3) is freely available for download at http://edge.oncology.wisc.edu/.

  2. Manufacturing of microarrays.

    PubMed

    Petersen, David W; Kawasaki, Ernest S

    2007-01-01

    DNA microarray technology has become a powerful tool in the arsenal of the molecular biologist. Capitalizing on high precision robotics and the wealth of DNA sequences annotated from the genomes of a large number of organisms, the manufacture of microarrays is now possible for the average academic laboratory with the funds and motivation. Microarray production requires attention to both biological and physical resources, including DNA libraries, robotics, and qualified personnel. While the fabrication of microarrays is a very labor-intensive process, production of quality microarrays individually tailored on a project-by-project basis will help researchers shed light on future scientific questions.

  3. A probabilistic framework for microarray data analysis: fundamental probability models and statistical inference.

    PubMed

    Ogunnaike, Babatunde A; Gelmi, Claudio A; Edwards, Jeremy S

    2010-05-21

    Gene expression studies generate large quantities of data with the defining characteristic that the number of genes (whose expression profiles are to be determined) exceed the number of available replicates by several orders of magnitude. Standard spot-by-spot analysis still seeks to extract useful information for each gene on the basis of the number of available replicates, and thus plays to the weakness of microarrays. On the other hand, because of the data volume, treating the entire data set as an ensemble, and developing theoretical distributions for these ensembles provides a framework that plays instead to the strength of microarrays. We present theoretical results that under reasonable assumptions, the distribution of microarray intensities follows the Gamma model, with the biological interpretations of the model parameters emerging naturally. We subsequently establish that for each microarray data set, the fractional intensities can be represented as a mixture of Beta densities, and develop a procedure for using these results to draw statistical inference regarding differential gene expression. We illustrate the results with experimental data from gene expression studies on Deinococcus radiodurans following DNA damage using cDNA microarrays. Copyright (c) 2010 Elsevier Ltd. All rights reserved.

  4. From Coexpression to Coregulation: An Approach to Inferring Transcriptional Regulation Among Gene Classes from Large-Scale Expression Data

    NASA Technical Reports Server (NTRS)

    Mjolsness, Eric; Castano, Rebecca; Mann, Tobias; Wold, Barbara

    2000-01-01

    We provide preliminary evidence that existing algorithms for inferring small-scale gene regulation networks from gene expression data can be adapted to large-scale gene expression data coming from hybridization microarrays. The essential steps are (I) clustering many genes by their expression time-course data into a minimal set of clusters of co-expressed genes, (2) theoretically modeling the various conditions under which the time-courses are measured using a continuous-time analog recurrent neural network for the cluster mean time-courses, (3) fitting such a regulatory model to the cluster mean time courses by simulated annealing with weight decay, and (4) analysing several such fits for commonalities in the circuit parameter sets including the connection matrices. This procedure can be used to assess the adequacy of existing and future gene expression time-course data sets for determining transcriptional regulatory relationships such as coregulation.

  5. Generation of Antigen Microarrays to Screen for Autoantibodies in Heart Failure and Heart Transplantation.

    PubMed

    Chruscinski, Andrzej; Huang, Flora Y Y; Nguyen, Albert; Lioe, Jocelyn; Tumiati, Laura C; Kozuszko, Stella; Tinckam, Kathryn J; Rao, Vivek; Dunn, Shannon E; Persinger, Michael A; Levy, Gary A; Ross, Heather J

    2016-01-01

    Autoantibodies directed against endogenous proteins including contractile proteins and endothelial antigens are frequently detected in patients with heart failure and after heart transplantation. There is evidence that these autoantibodies contribute to cardiac dysfunction and correlate with clinical outcomes. Currently, autoantibodies are detected in patient sera using individual ELISA assays (one for each antigen). Thus, screening for many individual autoantibodies is laborious and consumes a large amount of patient sample. To better capture the broad-scale antibody reactivities that occur in heart failure and post-transplant, we developed a custom antigen microarray technique that can simultaneously measure IgM and IgG reactivities against 64 unique antigens using just five microliters of patient serum. We first demonstrated that our antigen microarray technique displayed enhanced sensitivity to detect autoantibodies compared to the traditional ELISA method. We then piloted this technique using two sets of samples that were obtained at our institution. In the first retrospective study, we profiled pre-transplant sera from 24 heart failure patients who subsequently received heart transplants. We identified 8 antibody reactivities that were higher in patients who developed cellular rejection (2 or more episodes of grade 2R rejection in first year after transplant as defined by revised criteria from the International Society for Heart and Lung Transplantation) compared with those who did have not have rejection episodes. In a second retrospective study with 31 patients, we identified 7 IgM reactivities that were higher in heart transplant recipients who developed antibody-mediated rejection (AMR) compared with control recipients, and in time course studies, these reactivities appeared prior to overt graft dysfunction. In conclusion, we demonstrated that the autoantibody microarray technique outperforms traditional ELISAs as it uses less patient sample, has increased sensitivity, and can detect autoantibodies in a multiplex fashion. Furthermore, our results suggest that this autoantibody array technology may help to identify patients at risk of rejection following heart transplantation and identify heart transplant recipients with AMR.

  6. Generation of Antigen Microarrays to Screen for Autoantibodies in Heart Failure and Heart Transplantation

    PubMed Central

    Chruscinski, Andrzej; Huang, Flora Y. Y.; Nguyen, Albert; Lioe, Jocelyn; Tumiati, Laura C.; Kozuszko, Stella; Tinckam, Kathryn J.; Rao, Vivek; Dunn, Shannon E.; Persinger, Michael A.; Levy, Gary A.; Ross, Heather J.

    2016-01-01

    Autoantibodies directed against endogenous proteins including contractile proteins and endothelial antigens are frequently detected in patients with heart failure and after heart transplantation. There is evidence that these autoantibodies contribute to cardiac dysfunction and correlate with clinical outcomes. Currently, autoantibodies are detected in patient sera using individual ELISA assays (one for each antigen). Thus, screening for many individual autoantibodies is laborious and consumes a large amount of patient sample. To better capture the broad-scale antibody reactivities that occur in heart failure and post-transplant, we developed a custom antigen microarray technique that can simultaneously measure IgM and IgG reactivities against 64 unique antigens using just five microliters of patient serum. We first demonstrated that our antigen microarray technique displayed enhanced sensitivity to detect autoantibodies compared to the traditional ELISA method. We then piloted this technique using two sets of samples that were obtained at our institution. In the first retrospective study, we profiled pre-transplant sera from 24 heart failure patients who subsequently received heart transplants. We identified 8 antibody reactivities that were higher in patients who developed cellular rejection (2 or more episodes of grade 2R rejection in first year after transplant as defined by revised criteria from the International Society for Heart and Lung Transplantation) compared with those who did have not have rejection episodes. In a second retrospective study with 31 patients, we identified 7 IgM reactivities that were higher in heart transplant recipients who developed antibody-mediated rejection (AMR) compared with control recipients, and in time course studies, these reactivities appeared prior to overt graft dysfunction. In conclusion, we demonstrated that the autoantibody microarray technique outperforms traditional ELISAs as it uses less patient sample, has increased sensitivity, and can detect autoantibodies in a multiplex fashion. Furthermore, our results suggest that this autoantibody array technology may help to identify patients at risk of rejection following heart transplantation and identify heart transplant recipients with AMR. PMID:26967734

  7. LS-CAP: an algorithm for identifying cytogenetic aberrations in hepatocellular carcinoma using microarray data.

    PubMed

    He, Xianmin; Wei, Qing; Sun, Meiqian; Fu, Xuping; Fan, Sichang; Li, Yao

    2006-05-01

    Biological techniques such as Array-Comparative genomic hybridization (CGH), fluorescent in situ hybridization (FISH) and affymetrix single nucleotide pleomorphism (SNP) array have been used to detect cytogenetic aberrations. However, on genomic scale, these techniques are labor intensive and time consuming. Comparative genomic microarray analysis (CGMA) has been used to identify cytogenetic changes in hepatocellular carcinoma (HCC) using gene expression microarray data. However, CGMA algorithm can not give precise localization of aberrations, fails to identify small cytogenetic changes, and exhibits false negatives and positives. Locally un-weighted smoothing cytogenetic aberrations prediction (LS-CAP) based on local smoothing and binomial distribution can be expected to address these problems. LS-CAP algorithm was built and used on HCC microarray profiles. Eighteen cytogenetic abnormalities were identified, among them 5 were reported previously, and 12 were proven by CGH studies. LS-CAP effectively reduced the false negatives and positives, and precisely located small fragments with cytogenetic aberrations.

  8. Parallel Clustering Algorithm for Large-Scale Biological Data Sets

    PubMed Central

    Wang, Minchao; Zhang, Wu; Ding, Wang; Dai, Dongbo; Zhang, Huiran; Xie, Hao; Chen, Luonan; Guo, Yike; Xie, Jiang

    2014-01-01

    Backgrounds Recent explosion of biological data brings a great challenge for the traditional clustering algorithms. With increasing scale of data sets, much larger memory and longer runtime are required for the cluster identification problems. The affinity propagation algorithm outperforms many other classical clustering algorithms and is widely applied into the biological researches. However, the time and space complexity become a great bottleneck when handling the large-scale data sets. Moreover, the similarity matrix, whose constructing procedure takes long runtime, is required before running the affinity propagation algorithm, since the algorithm clusters data sets based on the similarities between data pairs. Methods Two types of parallel architectures are proposed in this paper to accelerate the similarity matrix constructing procedure and the affinity propagation algorithm. The memory-shared architecture is used to construct the similarity matrix, and the distributed system is taken for the affinity propagation algorithm, because of its large memory size and great computing capacity. An appropriate way of data partition and reduction is designed in our method, in order to minimize the global communication cost among processes. Result A speedup of 100 is gained with 128 cores. The runtime is reduced from serval hours to a few seconds, which indicates that parallel algorithm is capable of handling large-scale data sets effectively. The parallel affinity propagation also achieves a good performance when clustering large-scale gene data (microarray) and detecting families in large protein superfamilies. PMID:24705246

  9. A curated collection of tissue microarray images and clinical outcome data of prostate cancer patients

    PubMed Central

    Zhong, Qing; Guo, Tiannan; Rechsteiner, Markus; Rüschoff, Jan H.; Rupp, Niels; Fankhauser, Christian; Saba, Karim; Mortezavi, Ashkan; Poyet, Cédric; Hermanns, Thomas; Zhu, Yi; Moch, Holger; Aebersold, Ruedi; Wild, Peter J.

    2017-01-01

    Microscopy image data of human cancers provide detailed phenotypes of spatially and morphologically intact tissues at single-cell resolution, thus complementing large-scale molecular analyses, e.g., next generation sequencing or proteomic profiling. Here we describe a high-resolution tissue microarray (TMA) image dataset from a cohort of 71 prostate tissue samples, which was hybridized with bright-field dual colour chromogenic and silver in situ hybridization probes for the tumour suppressor gene PTEN. These tissue samples were digitized and supplemented with expert annotations, clinical information, statistical models of PTEN genetic status, and computer source codes. For validation, we constructed an additional TMA dataset for 424 prostate tissues, hybridized with FISH probes for PTEN, and performed survival analysis on a subset of 339 radical prostatectomy specimens with overall, disease-specific and recurrence-free survival (maximum 167 months). For application, we further produced 6,036 image patches derived from two whole slides. Our curated collection of prostate cancer data sets provides reuse potential for both biomedical and computational studies. PMID:28291248

  10. An object model and database for functional genomics.

    PubMed

    Jones, Andrew; Hunt, Ela; Wastling, Jonathan M; Pizarro, Angel; Stoeckert, Christian J

    2004-07-10

    Large-scale functional genomics analysis is now feasible and presents significant challenges in data analysis, storage and querying. Data standards are required to enable the development of public data repositories and to improve data sharing. There is an established data format for microarrays (microarray gene expression markup language, MAGE-ML) and a draft standard for proteomics (PEDRo). We believe that all types of functional genomics experiments should be annotated in a consistent manner, and we hope to open up new ways of comparing multiple datasets used in functional genomics. We have created a functional genomics experiment object model (FGE-OM), developed from the microarray model, MAGE-OM and two models for proteomics, PEDRo and our own model (Gla-PSI-Glasgow Proposal for the Proteomics Standards Initiative). FGE-OM comprises three namespaces representing (i) the parts of the model common to all functional genomics experiments; (ii) microarray-specific components; and (iii) proteomics-specific components. We believe that FGE-OM should initiate discussion about the contents and structure of the next version of MAGE and the future of proteomics standards. A prototype database called RNA And Protein Abundance Database (RAPAD), based on FGE-OM, has been implemented and populated with data from microbial pathogenesis. FGE-OM and the RAPAD schema are available from http://www.gusdb.org/fge.html, along with a set of more detailed diagrams. RAPAD can be accessed by registration at the site.

  11. A comprehensive transcript index of the human genome generated using microarrays and computational approaches

    PubMed Central

    Schadt, Eric E; Edwards, Stephen W; GuhaThakurta, Debraj; Holder, Dan; Ying, Lisa; Svetnik, Vladimir; Leonardson, Amy; Hart, Kyle W; Russell, Archie; Li, Guoya; Cavet, Guy; Castle, John; McDonagh, Paul; Kan, Zhengyan; Chen, Ronghua; Kasarskis, Andrew; Margarint, Mihai; Caceres, Ramon M; Johnson, Jason M; Armour, Christopher D; Garrett-Engele, Philip W; Tsinoremas, Nicholas F; Shoemaker, Daniel D

    2004-01-01

    Background Computational and microarray-based experimental approaches were used to generate a comprehensive transcript index for the human genome. Oligonucleotide probes designed from approximately 50,000 known and predicted transcript sequences from the human genome were used to survey transcription from a diverse set of 60 tissues and cell lines using ink-jet microarrays. Further, expression activity over at least six conditions was more generally assessed using genomic tiling arrays consisting of probes tiled through a repeat-masked version of the genomic sequence making up chromosomes 20 and 22. Results The combination of microarray data with extensive genome annotations resulted in a set of 28,456 experimentally supported transcripts. This set of high-confidence transcripts represents the first experimentally driven annotation of the human genome. In addition, the results from genomic tiling suggest that a large amount of transcription exists outside of annotated regions of the genome and serves as an example of how this activity could be measured on a genome-wide scale. Conclusions These data represent one of the most comprehensive assessments of transcriptional activity in the human genome and provide an atlas of human gene expression over a unique set of gene predictions. Before the annotation of the human genome is considered complete, however, the previously unannotated transcriptional activity throughout the genome must be fully characterized. PMID:15461792

  12. A reference guide for tree analysis and visualization

    PubMed Central

    2010-01-01

    The quantities of data obtained by the new high-throughput technologies, such as microarrays or ChIP-Chip arrays, and the large-scale OMICS-approaches, such as genomics, proteomics and transcriptomics, are becoming vast. Sequencing technologies become cheaper and easier to use and, thus, large-scale evolutionary studies towards the origins of life for all species and their evolution becomes more and more challenging. Databases holding information about how data are related and how they are hierarchically organized expand rapidly. Clustering analysis is becoming more and more difficult to be applied on very large amounts of data since the results of these algorithms cannot be efficiently visualized. Most of the available visualization tools that are able to represent such hierarchies, project data in 2D and are lacking often the necessary user friendliness and interactivity. For example, the current phylogenetic tree visualization tools are not able to display easy to understand large scale trees with more than a few thousand nodes. In this study, we review tools that are currently available for the visualization of biological trees and analysis, mainly developed during the last decade. We describe the uniform and standard computer readable formats to represent tree hierarchies and we comment on the functionality and the limitations of these tools. We also discuss on how these tools can be developed further and should become integrated with various data sources. Here we focus on freely available software that offers to the users various tree-representation methodologies for biological data analysis. PMID:20175922

  13. Genomic profiling using array comparative genomic hybridization define distinct subtypes of diffuse large b-cell lymphoma: a review of the literature

    PubMed Central

    2012-01-01

    Diffuse large B-cell lymphoma (DLBCL) is the most common type of non-Hodgkin Lymphoma comprising of greater than 30% of adult non-Hodgkin Lymphomas. DLBCL represents a diverse set of lymphomas, defined as diffuse proliferation of large B lymphoid cells. Numerous cytogenetic studies including karyotypes and fluorescent in situ hybridization (FISH), as well as morphological, biological, clinical, microarray and sequencing technologies have attempted to categorize DLBCL into morphological variants, molecular and immunophenotypic subgroups, as well as distinct disease entities. Despite such efforts, most lymphoma remains undistinguishable and falls into DLBCL, not otherwise specified (DLBCL-NOS). The advent of microarray-based studies (chromosome, RNA, gene expression, etc) has provided a plethora of high-resolution data that could potentially facilitate the finer classification of DLBCL. This review covers the microarray data currently published for DLBCL. We will focus on these types of data; 1) array based CGH; 2) classical CGH; and 3) gene expression profiling studies. The aims of this review were three-fold: (1) to catalog chromosome loci that are present in at least 20% or more of distinct DLBCL subtypes; a detailed list of gains and losses for different subtypes was generated in a table form to illustrate specific chromosome loci affected in selected subtypes; (2) to determine common and distinct copy number alterations among the different subtypes and based on this information, characteristic and similar chromosome loci for the different subtypes were depicted in two separate chromosome ideograms; and, (3) to list re-classified subtypes and those that remained indistinguishable after review of the microarray data. To the best of our knowledge, this is the first effort to compile and review available literatures on microarray analysis data and their practical utility in classifying DLBCL subtypes. Although conventional cytogenetic methods such as Karyotypes and FISH have played a major role in classification schemes of lymphomas, better classification models are clearly needed to further understanding the biology, disease outcome and therapeutic management of DLBCL. In summary, microarray data reviewed here can provide better subtype specific classifications models for DLBCL. PMID:22967872

  14. Diversity Arrays Technology (DArT) for Pan-Genomic Evolutionary Studies of Non-Model Organisms

    PubMed Central

    James, Karen E.; Schneider, Harald; Ansell, Stephen W.; Evers, Margaret; Robba, Lavinia; Uszynski, Grzegorz; Pedersen, Niklas; Newton, Angela E.; Russell, Stephen J.; Vogel, Johannes C.; Kilian, Andrzej

    2008-01-01

    Background High-throughput tools for pan-genomic study, especially the DNA microarray platform, have sparked a remarkable increase in data production and enabled a shift in the scale at which biological investigation is possible. The use of microarrays to examine evolutionary relationships and processes, however, is predominantly restricted to model or near-model organisms. Methodology/Principal Findings This study explores the utility of Diversity Arrays Technology (DArT) in evolutionary studies of non-model organisms. DArT is a hybridization-based genotyping method that uses microarray technology to identify and type DNA polymorphism. Theoretically applicable to any organism (even one for which no prior genetic data are available), DArT has not yet been explored in exclusively wild sample sets, nor extensively examined in a phylogenetic framework. DArT recovered 1349 markers of largely low copy-number loci in two lineages of seed-free land plants: the diploid fern Asplenium viride and the haploid moss Garovaglia elegans. Direct sequencing of 148 of these DArT markers identified 30 putative loci including four routinely sequenced for evolutionary studies in plants. Phylogenetic analyses of DArT genotypes reveal phylogeographic and substrate specificity patterns in A. viride, a lack of phylogeographic pattern in Australian G. elegans, and additive variation in hybrid or mixed samples. Conclusions/Significance These results enable methodological recommendations including procedures for detecting and analysing DArT markers tailored specifically to evolutionary investigations and practical factors informing the decision to use DArT, and raise evolutionary hypotheses concerning substrate specificity and biogeographic patterns. Thus DArT is a demonstrably valuable addition to the set of existing molecular approaches used to infer biological phenomena such as adaptive radiations, population dynamics, hybridization, introgression, ecological differentiation and phylogeography. PMID:18301759

  15. ArraySolver: an algorithm for colour-coded graphical display and Wilcoxon signed-rank statistics for comparing microarray gene expression data.

    PubMed

    Khan, Haseeb Ahmad

    2004-01-01

    The massive surge in the production of microarray data poses a great challenge for proper analysis and interpretation. In recent years numerous computational tools have been developed to extract meaningful interpretation of microarray gene expression data. However, a convenient tool for two-groups comparison of microarray data is still lacking and users have to rely on commercial statistical packages that might be costly and require special skills, in addition to extra time and effort for transferring data from one platform to other. Various statistical methods, including the t-test, analysis of variance, Pearson test and Mann-Whitney U test, have been reported for comparing microarray data, whereas the utilization of the Wilcoxon signed-rank test, which is an appropriate test for two-groups comparison of gene expression data, has largely been neglected in microarray studies. The aim of this investigation was to build an integrated tool, ArraySolver, for colour-coded graphical display and comparison of gene expression data using the Wilcoxon signed-rank test. The results of software validation showed similar outputs with ArraySolver and SPSS for large datasets. Whereas the former program appeared to be more accurate for 25 or fewer pairs (n < or = 25), suggesting its potential application in analysing molecular signatures that usually contain small numbers of genes. The main advantages of ArraySolver are easy data selection, convenient report format, accurate statistics and the familiar Excel platform.

  16. ArraySolver: An Algorithm for Colour-Coded Graphical Display and Wilcoxon Signed-Rank Statistics for Comparing Microarray Gene Expression Data

    PubMed Central

    2004-01-01

    The massive surge in the production of microarray data poses a great challenge for proper analysis and interpretation. In recent years numerous computational tools have been developed to extract meaningful interpretation of microarray gene expression data. However, a convenient tool for two-groups comparison of microarray data is still lacking and users have to rely on commercial statistical packages that might be costly and require special skills, in addition to extra time and effort for transferring data from one platform to other. Various statistical methods, including the t-test, analysis of variance, Pearson test and Mann–Whitney U test, have been reported for comparing microarray data, whereas the utilization of the Wilcoxon signed-rank test, which is an appropriate test for two-groups comparison of gene expression data, has largely been neglected in microarray studies. The aim of this investigation was to build an integrated tool, ArraySolver, for colour-coded graphical display and comparison of gene expression data using the Wilcoxon signed-rank test. The results of software validation showed similar outputs with ArraySolver and SPSS for large datasets. Whereas the former program appeared to be more accurate for 25 or fewer pairs (n ≤ 25), suggesting its potential application in analysing molecular signatures that usually contain small numbers of genes. The main advantages of ArraySolver are easy data selection, convenient report format, accurate statistics and the familiar Excel platform. PMID:18629036

  17. Differential gene expression detection and sample classification using penalized linear regression models.

    PubMed

    Wu, Baolin

    2006-02-15

    Differential gene expression detection and sample classification using microarray data have received much research interest recently. Owing to the large number of genes p and small number of samples n (p > n), microarray data analysis poses big challenges for statistical analysis. An obvious problem owing to the 'large p small n' is over-fitting. Just by chance, we are likely to find some non-differentially expressed genes that can classify the samples very well. The idea of shrinkage is to regularize the model parameters to reduce the effects of noise and produce reliable inferences. Shrinkage has been successfully applied in the microarray data analysis. The SAM statistics proposed by Tusher et al. and the 'nearest shrunken centroid' proposed by Tibshirani et al. are ad hoc shrinkage methods. Both methods are simple, intuitive and prove to be useful in empirical studies. Recently Wu proposed the penalized t/F-statistics with shrinkage by formally using the (1) penalized linear regression models for two-class microarray data, showing good performance. In this paper we systematically discussed the use of penalized regression models for analyzing microarray data. We generalize the two-class penalized t/F-statistics proposed by Wu to multi-class microarray data. We formally derive the ad hoc shrunken centroid used by Tibshirani et al. using the (1) penalized regression models. And we show that the penalized linear regression models provide a rigorous and unified statistical framework for sample classification and differential gene expression detection.

  18. Construction of regulatory networks using expression time-series data of a genotyped population.

    PubMed

    Yeung, Ka Yee; Dombek, Kenneth M; Lo, Kenneth; Mittler, John E; Zhu, Jun; Schadt, Eric E; Bumgarner, Roger E; Raftery, Adrian E

    2011-11-29

    The inference of regulatory and biochemical networks from large-scale genomics data is a basic problem in molecular biology. The goal is to generate testable hypotheses of gene-to-gene influences and subsequently to design bench experiments to confirm these network predictions. Coexpression of genes in large-scale gene-expression data implies coregulation and potential gene-gene interactions, but provide little information about the direction of influences. Here, we use both time-series data and genetics data to infer directionality of edges in regulatory networks: time-series data contain information about the chronological order of regulatory events and genetics data allow us to map DNA variations to variations at the RNA level. We generate microarray data measuring time-dependent gene-expression levels in 95 genotyped yeast segregants subjected to a drug perturbation. We develop a Bayesian model averaging regression algorithm that incorporates external information from diverse data types to infer regulatory networks from the time-series and genetics data. Our algorithm is capable of generating feedback loops. We show that our inferred network recovers existing and novel regulatory relationships. Following network construction, we generate independent microarray data on selected deletion mutants to prospectively test network predictions. We demonstrate the potential of our network to discover de novo transcription-factor binding sites. Applying our construction method to previously published data demonstrates that our method is competitive with leading network construction algorithms in the literature.

  19. Assessing differential expression in two-color microarrays: a resampling-based empirical Bayes approach.

    PubMed

    Li, Dongmei; Le Pape, Marc A; Parikh, Nisha I; Chen, Will X; Dye, Timothy D

    2013-01-01

    Microarrays are widely used for examining differential gene expression, identifying single nucleotide polymorphisms, and detecting methylation loci. Multiple testing methods in microarray data analysis aim at controlling both Type I and Type II error rates; however, real microarray data do not always fit their distribution assumptions. Smyth's ubiquitous parametric method, for example, inadequately accommodates violations of normality assumptions, resulting in inflated Type I error rates. The Significance Analysis of Microarrays, another widely used microarray data analysis method, is based on a permutation test and is robust to non-normally distributed data; however, the Significance Analysis of Microarrays method fold change criteria are problematic, and can critically alter the conclusion of a study, as a result of compositional changes of the control data set in the analysis. We propose a novel approach, combining resampling with empirical Bayes methods: the Resampling-based empirical Bayes Methods. This approach not only reduces false discovery rates for non-normally distributed microarray data, but it is also impervious to fold change threshold since no control data set selection is needed. Through simulation studies, sensitivities, specificities, total rejections, and false discovery rates are compared across the Smyth's parametric method, the Significance Analysis of Microarrays, and the Resampling-based empirical Bayes Methods. Differences in false discovery rates controls between each approach are illustrated through a preterm delivery methylation study. The results show that the Resampling-based empirical Bayes Methods offer significantly higher specificity and lower false discovery rates compared to Smyth's parametric method when data are not normally distributed. The Resampling-based empirical Bayes Methods also offers higher statistical power than the Significance Analysis of Microarrays method when the proportion of significantly differentially expressed genes is large for both normally and non-normally distributed data. Finally, the Resampling-based empirical Bayes Methods are generalizable to next generation sequencing RNA-seq data analysis.

  20. Integrative analysis of RUNX1 downstream pathways and target genes

    PubMed Central

    Michaud, Joëlle; Simpson, Ken M; Escher, Robert; Buchet-Poyau, Karine; Beissbarth, Tim; Carmichael, Catherine; Ritchie, Matthew E; Schütz, Frédéric; Cannon, Ping; Liu, Marjorie; Shen, Xiaofeng; Ito, Yoshiaki; Raskind, Wendy H; Horwitz, Marshall S; Osato, Motomi; Turner, David R; Speed, Terence P; Kavallaris, Maria; Smyth, Gordon K; Scott, Hamish S

    2008-01-01

    Background The RUNX1 transcription factor gene is frequently mutated in sporadic myeloid and lymphoid leukemia through translocation, point mutation or amplification. It is also responsible for a familial platelet disorder with predisposition to acute myeloid leukemia (FPD-AML). The disruption of the largely unknown biological pathways controlled by RUNX1 is likely to be responsible for the development of leukemia. We have used multiple microarray platforms and bioinformatic techniques to help identify these biological pathways to aid in the understanding of why RUNX1 mutations lead to leukemia. Results Here we report genes regulated either directly or indirectly by RUNX1 based on the study of gene expression profiles generated from 3 different human and mouse platforms. The platforms used were global gene expression profiling of: 1) cell lines with RUNX1 mutations from FPD-AML patients, 2) over-expression of RUNX1 and CBFβ, and 3) Runx1 knockout mouse embryos using either cDNA or Affymetrix microarrays. We observe that our datasets (lists of differentially expressed genes) significantly correlate with published microarray data from sporadic AML patients with mutations in either RUNX1 or its cofactor, CBFβ. A number of biological processes were identified among the differentially expressed genes and functional assays suggest that heterozygous RUNX1 point mutations in patients with FPD-AML impair cell proliferation, microtubule dynamics and possibly genetic stability. In addition, analysis of the regulatory regions of the differentially expressed genes has for the first time systematically identified numerous potential novel RUNX1 target genes. Conclusion This work is the first large-scale study attempting to identify the genetic networks regulated by RUNX1, a master regulator in the development of the hematopoietic system and leukemia. The biological pathways and target genes controlled by RUNX1 will have considerable importance in disease progression in both familial and sporadic leukemia as well as therapeutic implications. PMID:18671852

  1. Fuzzy support vector machine for microarray imbalanced data classification

    NASA Astrophysics Data System (ADS)

    Ladayya, Faroh; Purnami, Santi Wulan; Irhamah

    2017-11-01

    DNA microarrays are data containing gene expression with small sample sizes and high number of features. Furthermore, imbalanced classes is a common problem in microarray data. This occurs when a dataset is dominated by a class which have significantly more instances than the other minority classes. Therefore, it is needed a classification method that solve the problem of high dimensional and imbalanced data. Support Vector Machine (SVM) is one of the classification methods that is capable of handling large or small samples, nonlinear, high dimensional, over learning and local minimum issues. SVM has been widely applied to DNA microarray data classification and it has been shown that SVM provides the best performance among other machine learning methods. However, imbalanced data will be a problem because SVM treats all samples in the same importance thus the results is bias for minority class. To overcome the imbalanced data, Fuzzy SVM (FSVM) is proposed. This method apply a fuzzy membership to each input point and reformulate the SVM such that different input points provide different contributions to the classifier. The minority classes have large fuzzy membership so FSVM can pay more attention to the samples with larger fuzzy membership. Given DNA microarray data is a high dimensional data with a very large number of features, it is necessary to do feature selection first using Fast Correlation based Filter (FCBF). In this study will be analyzed by SVM, FSVM and both methods by applying FCBF and get the classification performance of them. Based on the overall results, FSVM on selected features has the best classification performance compared to SVM.

  2. DEXTER: Disease-Expression Relation Extraction from Text.

    PubMed

    Gupta, Samir; Dingerdissen, Hayley; Ross, Karen E; Hu, Yu; Wu, Cathy H; Mazumder, Raja; Vijay-Shanker, K

    2018-01-01

    Gene expression levels affect biological processes and play a key role in many diseases. Characterizing expression profiles is useful for clinical research, and diagnostics and prognostics of diseases. There are currently several high-quality databases that capture gene expression information, obtained mostly from large-scale studies, such as microarray and next-generation sequencing technologies, in the context of disease. The scientific literature is another rich source of information on gene expression-disease relationships that not only have been captured from large-scale studies but have also been observed in thousands of small-scale studies. Expression information obtained from literature through manual curation can extend expression databases. While many of the existing databases include information from literature, they are limited by the time-consuming nature of manual curation and have difficulty keeping up with the explosion of publications in the biomedical field. In this work, we describe an automated text-mining tool, Disease-Expression Relation Extraction from Text (DEXTER) to extract information from literature on gene and microRNA expression in the context of disease. One of the motivations in developing DEXTER was to extend the BioXpress database, a cancer-focused gene expression database that includes data derived from large-scale experiments and manual curation of publications. The literature-based portion of BioXpress lags behind significantly compared to expression information obtained from large-scale studies and can benefit from our text-mined results. We have conducted two different evaluations to measure the accuracy of our text-mining tool and achieved average F-scores of 88.51 and 81.81% for the two evaluations, respectively. Also, to demonstrate the ability to extract rich expression information in different disease-related scenarios, we used DEXTER to extract information on differential expression information for 2024 genes in lung cancer, 115 glycosyltransferases in 62 cancers and 826 microRNA in 171 cancers. All extractions using DEXTER are integrated in the literature-based portion of BioXpress.Database URL: http://biotm.cis.udel.edu/DEXTER.

  3. Integrating Colon Cancer Microarray Data: Associating Locus-Specific Methylation Groups to Gene Expression-Based Classifications.

    PubMed

    Barat, Ana; Ruskin, Heather J; Byrne, Annette T; Prehn, Jochen H M

    2015-11-23

    Recently, considerable attention has been paid to gene expression-based classifications of colorectal cancers (CRC) and their association with patient prognosis. In addition to changes in gene expression, abnormal DNA-methylation is known to play an important role in cancer onset and development, and colon cancer is no exception to this rule. Large-scale technologies, such as methylation microarray assays and specific sequencing of methylated DNA, have been used to determine whole genome profiles of CpG island methylation in tissue samples. In this article, publicly available microarray-based gene expression and methylation data sets are used to characterize expression subtypes with respect to locus-specific methylation. A major objective was to determine whether integration of these data types improves previously characterized subtypes, or provides evidence for additional subtypes. We used unsupervised clustering techniques to determine methylation-based subgroups, which are subsequently annotated with three published expression-based classifications, comprising from three to six subtypes. Our results showed that, while methylation profiles provide a further basis for segregation of certain (Inflammatory and Goblet-like) finer-grained expression-based subtypes, they also suggest that other finer-grained subtypes are not distinctive and can be considered as a single subtype.

  4. Integrating Colon Cancer Microarray Data: Associating Locus-Specific Methylation Groups to Gene Expression-Based Classifications

    PubMed Central

    Barat, Ana; Ruskin, Heather J.; Byrne, Annette T.; Prehn, Jochen H. M.

    2015-01-01

    Recently, considerable attention has been paid to gene expression-based classifications of colorectal cancers (CRC) and their association with patient prognosis. In addition to changes in gene expression, abnormal DNA-methylation is known to play an important role in cancer onset and development, and colon cancer is no exception to this rule. Large-scale technologies, such as methylation microarray assays and specific sequencing of methylated DNA, have been used to determine whole genome profiles of CpG island methylation in tissue samples. In this article, publicly available microarray-based gene expression and methylation data sets are used to characterize expression subtypes with respect to locus-specific methylation. A major objective was to determine whether integration of these data types improves previously characterized subtypes, or provides evidence for additional subtypes. We used unsupervised clustering techniques to determine methylation-based subgroups, which are subsequently annotated with three published expression-based classifications, comprising from three to six subtypes. Our results showed that, while methylation profiles provide a further basis for segregation of certain (Inflammatory and Goblet-like) finer-grained expression-based subtypes, they also suggest that other finer-grained subtypes are not distinctive and can be considered as a single subtype. PMID:27600244

  5. A low-density cDNA microarray with a unique reference RNA: pattern recognition analysis for IFN efficacy prediction to HCV as a model.

    PubMed

    Daiba, Akito; Inaba, Niro; Ando, Satoshi; Kajiyama, Naoki; Yatsuhashi, Hiroshi; Terasaki, Hiroshi; Ito, Atsushi; Ogasawara, Masanori; Abe, Aki; Yoshioka, Junichi; Hayashida, Kazuhiro; Kaneko, Shuichi; Kohara, Michinori; Ito, Satoru

    2004-03-19

    We have designed and established a low-density (295 genes) cDNA microarray for the prediction of IFN efficacy in hepatitis C patients. To obtain a precise and consistent microarray data, we collected a data set from three spots for each gene (mRNA) and using three different scanning conditions. We also established an artificial reference RNA representing pseudo-inflammatory conditions from established hepatocyte cell lines supplemented with synthetic RNAs to 48 inflammatory genes. We also developed a novel algorithm that replaces the standard hierarchical-clustering method and allows handling of the large data set with ease. This algorithm utilizes a standard space database (SSDB) as a key scale to calculate the Mahalanobis distance (MD) from the center of gravity in the SSDB. We further utilized sMD (divided by parameter k: MD/k) to reduce MD number as a predictive value. The efficacy prediction of conventional IFN mono-therapy was 100% for non-responder (NR) vs. transient responder (TR)/sustained responder (SR) (P < 0.0005). Finally, we show that this method is acceptable for clinical application.

  6. Validation of Biomarker Proteins Using Reverse Capture Protein Microarrays.

    PubMed

    Jozwik, Catherine; Eidelman, Ofer; Starr, Joshua; Pollard, Harvey B; Srivastava, Meera

    2017-01-01

    Genomics has revolutionized large-scale and high-throughput sequencing and has led to the discovery of thousands of new proteins. Protein chip technology is emerging as a miniaturized and highly parallel platform that is suited to rapid, simultaneous screening of large numbers of proteins and the analysis of various protein-binding activities, enzyme substrate relationships, and posttranslational modifications. Specifically, reverse capture protein microarrays provide the most appropriate platform for identifying low-abundance, disease-specific biomarker proteins in a sea of high-abundance proteins from biological fluids such as blood, serum, plasma, saliva, urine, and cerebrospinal fluid as well as tissues and cells obtained by biopsy. Samples from hundreds of patients can be spotted in serial dilutions on many replicate glass slides. Each slide can then be probed with one specific antibody to the biomarker of interest. That antibody's titer can then be determined quantitatively for each patient, allowing for the statistical assessment and validation of the diagnostic or prognostic utility of that particular antigen. As the technology matures and the availability of validated, platform-compatible antibodies increases, the platform will move further into the desirable realm of discovery science for detecting and quantitating low-abundance signaling proteins. In this chapter, we describe methods for the successful application of the reverse capture protein microarray platform for which we have made substantial contributions to the development and application of this method, particularly in the use of body fluids other than serum/plasma.

  7. A model of binding on DNA microarrays: understanding the combined effect of probe synthesis failure, cross-hybridization, DNA fragmentation and other experimental details of affymetrix arrays

    PubMed Central

    2012-01-01

    Background DNA microarrays are used both for research and for diagnostics. In research, Affymetrix arrays are commonly used for genome wide association studies, resequencing, and for gene expression analysis. These arrays provide large amounts of data. This data is analyzed using statistical methods that quite often discard a large portion of the information. Most of the information that is lost comes from probes that systematically fail across chips and from batch effects. The aim of this study was to develop a comprehensive model for hybridization that predicts probe intensities for Affymetrix arrays and that could provide a basis for improved microarray analysis and probe development. The first part of the model calculates probe binding affinities to all the possible targets in the hybridization solution using the Langmuir isotherm. In the second part of the model we integrate details that are specific to each experiment and contribute to the differences between hybridization in solution and on the microarray. These details include fragmentation, wash stringency, temperature, salt concentration, and scanner settings. Furthermore, the model fits probe synthesis efficiency and target concentration parameters directly to the data. All the parameters used in the model have a well-established physical origin. Results For the 302 chips that were analyzed the mean correlation between expected and observed probe intensities was 0.701 with a range of 0.88 to 0.55. All available chips were included in the analysis regardless of the data quality. Our results show that batch effects arise from differences in probe synthesis, scanner settings, wash strength, and target fragmentation. We also show that probe synthesis efficiencies for different nucleotides are not uniform. Conclusions To date this is the most complete model for binding on microarrays. This is the first model that includes both probe synthesis efficiency and hybridization kinetics/cross-hybridization. These two factors are sequence dependent and have a large impact on probe intensity. The results presented here provide novel insight into the effect of probe synthesis errors on Affymetrix microarrays; furthermore, the algorithms developed in this work provide useful tools for the analysis of cross-hybridization, probe synthesis efficiency, fragmentation, wash stringency, temperature, and salt concentration on microarray intensities. PMID:23270536

  8. Hepatic gene expression patterns following trauma-hemorrhage: effect of posttreatment with estrogen.

    PubMed

    Yu, Huang-Ping; Pang, See-Tong; Chaudry, Irshad H

    2013-01-01

    The aim of this study was to examine the role of estrogen on hepatic gene expression profiles at an early time point following trauma-hemorrhage in rats. Groups of injured and sham controls receiving estrogen or vehicle were killed 2 h after injury and resuscitation, and liver tissue was harvested. Complementary RNA was synthesized from each RNA sample and hybridized to microarrays. A large number of genes were differentially expressed at the 2-h time point in injured animals with or without estrogen treatment. The upregulation or downregulation of a cohort of 14 of these genes was validated by reverse transcription-polymerase chain reaction. This large-scale microarray analysis shows that at the 2-h time point, there is marked alteration in hepatic gene expression following trauma-hemorrhage. However, estrogen treatment attenuated these changes in injured animals. Pathway analysis demonstrated predominant changes in the expression of genes involved in metabolism, immunity, and apoptosis. Upregulation of low-density lipoprotein receptor, protein phosphatase 1, regulatory subunit 3C, ring-finger protein 11, pyroglutamyl-peptidase I, bactericidal/permeability-increasing protein, integrin, αD, BCL2-like 11, leukemia inhibitory factor receptor, ATPase, Cu transporting, α polypeptide, and Mk1 protein was found in estrogen-treated trauma-hemorrhaged animals. Thus, estrogen produces hepatoprotection following trauma-hemorrhage likely via antiapoptosis and improving/restoring metabolism and immunity pathways.

  9. A genome-wide 20 K citrus microarray for gene expression analysis

    PubMed Central

    Martinez-Godoy, M Angeles; Mauri, Nuria; Juarez, Jose; Marques, M Carmen; Santiago, Julia; Forment, Javier; Gadea, Jose

    2008-01-01

    Background Understanding of genetic elements that contribute to key aspects of citrus biology will impact future improvements in this economically important crop. Global gene expression analysis demands microarray platforms with a high genome coverage. In the last years, genome-wide EST collections have been generated in citrus, opening the possibility to create new tools for functional genomics in this crop plant. Results We have designed and constructed a publicly available genome-wide cDNA microarray that include 21,081 putative unigenes of citrus. As a functional companion to the microarray, a web-browsable database [1] was created and populated with information about the unigenes represented in the microarray, including cDNA libraries, isolated clones, raw and processed nucleotide and protein sequences, and results of all the structural and functional annotation of the unigenes, like general description, BLAST hits, putative Arabidopsis orthologs, microsatellites, putative SNPs, GO classification and PFAM domains. We have performed a Gene Ontology comparison with the full set of Arabidopsis proteins to estimate the genome coverage of the microarray. We have also performed microarray hybridizations to check its usability. Conclusion This new cDNA microarray replaces the first 7K microarray generated two years ago and allows gene expression analysis at a more global scale. We have followed a rational design to minimize cross-hybridization while maintaining its utility for different citrus species. Furthermore, we also provide access to a website with full structural and functional annotation of the unigenes represented in the microarray, along with the ability to use this site to directly perform gene expression analysis using standard tools at different publicly available servers. Furthermore, we show how this microarray offers a good representation of the citrus genome and present the usefulness of this genomic tool for global studies in citrus by using it to catalogue genes expressed in citrus globular embryos. PMID:18598343

  10. ArrayPitope: Automated Analysis of Amino Acid Substitutions for Peptide Microarray-Based Antibody Epitope Mapping.

    PubMed

    Hansen, Christian Skjødt; Østerbye, Thomas; Marcatili, Paolo; Lund, Ole; Buus, Søren; Nielsen, Morten

    2017-01-01

    Identification of epitopes targeted by antibodies (B cell epitopes) is of critical importance for the development of many diagnostic and therapeutic tools. For clinical usage, such epitopes must be extensively characterized in order to validate specificity and to document potential cross-reactivity. B cell epitopes are typically classified as either linear epitopes, i.e. short consecutive segments from the protein sequence or conformational epitopes adapted through native protein folding. Recent advances in high-density peptide microarrays enable high-throughput, high-resolution identification and characterization of linear B cell epitopes. Using exhaustive amino acid substitution analysis of peptides originating from target antigens, these microarrays can be used to address the specificity of polyclonal antibodies raised against such antigens containing hundreds of epitopes. However, the interpretation of the data provided in such large-scale screenings is far from trivial and in most cases it requires advanced computational and statistical skills. Here, we present an online application for automated identification of linear B cell epitopes, allowing the non-expert user to analyse peptide microarray data. The application takes as input quantitative peptide data of fully or partially substituted overlapping peptides from a given antigen sequence and identifies epitope residues (residues that are significantly affected by substitutions) and visualize the selectivity towards each residue by sequence logo plots. Demonstrating utility, the application was used to identify and address the antibody specificity of 18 linear epitope regions in Human Serum Albumin (HSA), using peptide microarray data consisting of fully substituted peptides spanning the entire sequence of HSA and incubated with polyclonal rabbit anti-HSA (and mouse anti-rabbit-Cy3). The application is made available at: www.cbs.dtu.dk/services/ArrayPitope.

  11. ArrayPitope: Automated Analysis of Amino Acid Substitutions for Peptide Microarray-Based Antibody Epitope Mapping

    PubMed Central

    Hansen, Christian Skjødt; Østerbye, Thomas; Marcatili, Paolo; Lund, Ole; Buus, Søren

    2017-01-01

    Identification of epitopes targeted by antibodies (B cell epitopes) is of critical importance for the development of many diagnostic and therapeutic tools. For clinical usage, such epitopes must be extensively characterized in order to validate specificity and to document potential cross-reactivity. B cell epitopes are typically classified as either linear epitopes, i.e. short consecutive segments from the protein sequence or conformational epitopes adapted through native protein folding. Recent advances in high-density peptide microarrays enable high-throughput, high-resolution identification and characterization of linear B cell epitopes. Using exhaustive amino acid substitution analysis of peptides originating from target antigens, these microarrays can be used to address the specificity of polyclonal antibodies raised against such antigens containing hundreds of epitopes. However, the interpretation of the data provided in such large-scale screenings is far from trivial and in most cases it requires advanced computational and statistical skills. Here, we present an online application for automated identification of linear B cell epitopes, allowing the non-expert user to analyse peptide microarray data. The application takes as input quantitative peptide data of fully or partially substituted overlapping peptides from a given antigen sequence and identifies epitope residues (residues that are significantly affected by substitutions) and visualize the selectivity towards each residue by sequence logo plots. Demonstrating utility, the application was used to identify and address the antibody specificity of 18 linear epitope regions in Human Serum Albumin (HSA), using peptide microarray data consisting of fully substituted peptides spanning the entire sequence of HSA and incubated with polyclonal rabbit anti-HSA (and mouse anti-rabbit-Cy3). The application is made available at: www.cbs.dtu.dk/services/ArrayPitope. PMID:28095436

  12. Analysis of developmental gene conservation in the Actinomycetales using DNA/DNA microarray comparisons.

    PubMed

    Kirby, Ralph; Herron, Paul; Hoskisson, Paul

    2011-02-01

    Based on available genome sequences, Actinomycetales show significant gene synteny across a wide range of species and genera. In addition, many genera show varying degrees of complex morphological development. Using the presence of gene synteny as a basis, it is clear that an analysis of gene conservation across the Streptomyces and various other Actinomycetales will provide information on both the importance of genes and gene clusters and the evolution of morphogenesis in these bacteria. Genome sequencing, although becoming cheaper, is still relatively expensive for comparing large numbers of strains. Thus, a heterologous DNA/DNA microarray hybridization dataset based on a Streptomyces coelicolor microarray allows a cheaper and greater depth of analysis of gene conservation. This study, using both bioinformatical and microarray approaches, was able to classify genes previously identified as involved in morphogenesis in Streptomyces into various subgroups in terms of conservation across species and genera. This will allow the targeting of genes for further study based on their importance at the species level and at higher evolutionary levels.

  13. Cross species analysis of microarray expression data

    PubMed Central

    Lu, Yong; Huggins, Peter; Bar-Joseph, Ziv

    2009-01-01

    Motivation: Many biological systems operate in a similar manner across a large number of species or conditions. Cross-species analysis of sequence and interaction data is often applied to determine the function of new genes. In contrast to these static measurements, microarrays measure the dynamic, condition-specific response of complex biological systems. The recent exponential growth in microarray expression datasets allows researchers to combine expression experiments from multiple species to identify genes that are not only conserved in sequence but also operated in a similar way in the different species studied. Results: In this review we discuss the computational and technical challenges associated with these studies, the approaches that have been developed to address these challenges and the advantages of cross-species analysis of microarray data. We show how successful application of these methods lead to insights that cannot be obtained when analyzing data from a single species. We also highlight current open problems and discuss possible ways to address them. Contact: zivbj@cs.cmu.edu PMID:19357096

  14. geneCommittee: a web-based tool for extensively testing the discriminatory power of biologically relevant gene sets in microarray data classification.

    PubMed

    Reboiro-Jato, Miguel; Arrais, Joel P; Oliveira, José Luis; Fdez-Riverola, Florentino

    2014-01-30

    The diagnosis and prognosis of several diseases can be shortened through the use of different large-scale genome experiments. In this context, microarrays can generate expression data for a huge set of genes. However, to obtain solid statistical evidence from the resulting data, it is necessary to train and to validate many classification techniques in order to find the best discriminative method. This is a time-consuming process that normally depends on intricate statistical tools. geneCommittee is a web-based interactive tool for routinely evaluating the discriminative classification power of custom hypothesis in the form of biologically relevant gene sets. While the user can work with different gene set collections and several microarray data files to configure specific classification experiments, the tool is able to run several tests in parallel. Provided with a straightforward and intuitive interface, geneCommittee is able to render valuable information for diagnostic analyses and clinical management decisions based on systematically evaluating custom hypothesis over different data sets using complementary classifiers, a key aspect in clinical research. geneCommittee allows the enrichment of microarrays raw data with gene functional annotations, producing integrated datasets that simplify the construction of better discriminative hypothesis, and allows the creation of a set of complementary classifiers. The trained committees can then be used for clinical research and diagnosis. Full documentation including common use cases and guided analysis workflows is freely available at http://sing.ei.uvigo.es/GC/.

  15. Bacterial discrimination by means of a universal array approach mediated by LDR (ligase detection reaction)

    PubMed Central

    Busti, Elena; Bordoni, Roberta; Castiglioni, Bianca; Monciardini, Paolo; Sosio, Margherita; Donadio, Stefano; Consolandi, Clarissa; Rossi Bernardi, Luigi; Battaglia, Cristina; De Bellis, Gianluca

    2002-01-01

    Background PCR amplification of bacterial 16S rRNA genes provides the most comprehensive and flexible means of sampling bacterial communities. Sequence analysis of these cloned fragments can provide a qualitative and quantitative insight of the microbial population under scrutiny although this approach is not suited to large-scale screenings. Other methods, such as denaturing gradient gel electrophoresis, heteroduplex or terminal restriction fragment analysis are rapid and therefore amenable to field-scale experiments. A very recent addition to these analytical tools is represented by microarray technology. Results Here we present our results using a Universal DNA Microarray approach as an analytical tool for bacterial discrimination. The proposed procedure is based on the properties of the DNA ligation reaction and requires the design of two probes specific for each target sequence. One oligo carries a fluorescent label and the other a unique sequence (cZipCode or complementary ZipCode) which identifies a ligation product. Ligated fragments, obtained in presence of a proper template (a PCR amplified fragment of the 16s rRNA gene) contain either the fluorescent label or the unique sequence and therefore are addressed to the location on the microarray where the ZipCode sequence has been spotted. Such an array is therefore "Universal" being unrelated to a specific molecular analysis. Here we present the design of probes specific for some groups of bacteria and their application to bacterial diagnostics. Conclusions The combined use of selective probes, ligation reaction and the Universal Array approach yielded an analytical procedure with a good power of discrimination among bacteria. PMID:12243651

  16. Glycan microarray screening assay for glycosyltransferase specificities.

    PubMed

    Peng, Wenjie; Nycholat, Corwin M; Razi, Nahid

    2013-01-01

    Glycan microarrays represent a high-throughput approach to determining the specificity of glycan-binding proteins against a large set of glycans in a single format. This chapter describes the use of a glycan microarray platform for evaluating the activity and substrate specificity of glycosyltransferases (GTs). The methodology allows simultaneous screening of hundreds of immobilized glycan acceptor substrates by in situ incubation of a GT and its appropriate donor substrate on the microarray surface. Using biotin-conjugated donor substrate enables direct detection of the incorporated sugar residues on acceptor substrates on the array. In addition, the feasibility of the method has been validated using label-free donor substrate combined with lectin-based detection of product to assess enzyme activity. Here, we describe the application of both procedures to assess the specificity of a recombinant human α2-6 sialyltransferase. This technique is readily adaptable to studying other glycosyltransferases.

  17. A statistical model for investigating binding probabilities of DNA nucleotide sequences using microarrays.

    PubMed

    Lee, Mei-Ling Ting; Bulyk, Martha L; Whitmore, G A; Church, George M

    2002-12-01

    There is considerable scientific interest in knowing the probability that a site-specific transcription factor will bind to a given DNA sequence. Microarray methods provide an effective means for assessing the binding affinities of a large number of DNA sequences as demonstrated by Bulyk et al. (2001, Proceedings of the National Academy of Sciences, USA 98, 7158-7163) in their study of the DNA-binding specificities of Zif268 zinc fingers using microarray technology. In a follow-up investigation, Bulyk, Johnson, and Church (2002, Nucleic Acid Research 30, 1255-1261) studied the interdependence of nucleotides on the binding affinities of transcription proteins. Our article is motivated by this pair of studies. We present a general statistical methodology for analyzing microarray intensity measurements reflecting DNA-protein interactions. The log probability of a protein binding to a DNA sequence on an array is modeled using a linear ANOVA model. This model is convenient because it employs familiar statistical concepts and procedures and also because it is effective for investigating the probability structure of the binding mechanism.

  18. APPLICATION OF DNA MICROARRAYS TO REPRODUCTIVE TOXICOLOGY AND THE DEVELOPMENT OF A TESTIS ARRAY

    EPA Science Inventory

    With the advent of sequence information for entire mammalian genomes, it is now possible to analyze gene expression and gene polymorphisms on a genomic scale. The primary tool for analysis of gene expression is the DNA microarray. We have used commercially available cDNA micro...

  19. BIOMONITORING THE TOXICOGENOMIC RESPONSE TO ENDOCRINE DISRUPTING CHEMICALS IN HUMANS, LABORATORY SPECIES AND WILDLIFE

    EPA Science Inventory

    With the advent of sequence information for entire eukaryotic genomes, it is now possible to analyze gene expression on a genomic scale. The primary tool for genomic analysis of gene expression is the gene microarray. We have used commercially available and custom cDNA microarray...

  20. EgoNet: identification of human disease ego-network modules

    PubMed Central

    2014-01-01

    Background Mining novel biomarkers from gene expression profiles for accurate disease classification is challenging due to small sample size and high noise in gene expression measurements. Several studies have proposed integrated analyses of microarray data and protein-protein interaction (PPI) networks to find diagnostic subnetwork markers. However, the neighborhood relationship among network member genes has not been fully considered by those methods, leaving many potential gene markers unidentified. The main idea of this study is to take full advantage of the biological observation that genes associated with the same or similar diseases commonly reside in the same neighborhood of molecular networks. Results We present EgoNet, a novel method based on egocentric network-analysis techniques, to exhaustively search and prioritize disease subnetworks and gene markers from a large-scale biological network. When applied to a triple-negative breast cancer (TNBC) microarray dataset, the top selected modules contain both known gene markers in TNBC and novel candidates, such as RAD51 and DOK1, which play a central role in their respective ego-networks by connecting many differentially expressed genes. Conclusions Our results suggest that EgoNet, which is based on the ego network concept, allows the identification of novel biomarkers and provides a deeper understanding of their roles in complex diseases. PMID:24773628

  1. Advances in the Biology and Chemistry of Sialic Acids

    PubMed Central

    Chen, Xi; Varki, Ajit

    2010-01-01

    Sialic acids are a subset of nonulosonic acids, which are nine-carbon alpha-keto aldonic acids. Natural existing sialic acid-containing structures are presented in different sialic acid forms, various sialyl linkages, and on diverse underlying glycans. They play important roles in biological, pathological, and immunological processes. Sialobiology has been a challenging and yet attractive research area. Recent advances in chemical and chemoenzymatic synthesis as well as large-scale E. coli cell-based production have provided a large library of sialoside standards and derivatives in amounts sufficient for structure-activity relationship studies. Sialoglycan microarrays provide an efficient platform for quick identification of preferred ligands for sialic acid-binding proteins. Future research on sialic acid will continue to be at the interface of chemistry and biology. Research efforts will not only lead to a better understanding of the biological and pathological importance of sialic acids and their diversity, but could also lead to the development of therapeutics. PMID:20020717

  2. Development and validation of a mixed-tissue oligonucleotide DNA microarray for Atlantic bluefin tuna, Thunnus thynnus (Linnaeus, 1758).

    PubMed

    Trumbić, Željka; Bekaert, Michaël; Taggart, John B; Bron, James E; Gharbi, Karim; Mladineo, Ivona

    2015-11-25

    The largest of the tuna species, Atlantic bluefin tuna (Thunnus thynnus), inhabits the North Atlantic Ocean and the Mediterranean Sea and is considered to be an endangered species, largely a consequence of overfishing. T. thynnus aquaculture, referred to as fattening or farming, is a capture based activity dependent on yearly renewal from the wild. Thus, the development of aquaculture practices independent of wild resources can provide an important contribution towards ensuring security and sustainability of this species in the longer-term. The development of such practices is today greatly assisted by large scale transcriptomic studies. We have used pyrosequencing technology to sequence a mixed-tissue normalised cDNA library, derived from adult T. thynnus. A total of 976,904 raw sequence reads were assembled into 33,105 unique transcripts having a mean length of 893 bases and an N50 of 870. Of these, 33.4% showed similarity to known proteins or gene transcripts and 86.6% of them were matched to the congeneric Pacific bluefin tuna (Thunnus orientalis) genome, compared to 70.3% for the more distantly related Nile tilapia (Oreochromis niloticus) genome. Transcript sequences were used to develop a novel 15 K Agilent oligonucleotide DNA microarray for T. thynnus and comparative tissue gene expression profiles were inferred for gill, heart, liver, ovaries and testes. Functional contrasts were strongest between gills and ovaries. Gills were particularly associated with immune system, signal transduction and cell communication, while ovaries displayed signatures of glycan biosynthesis, nucleotide metabolism, transcription, translation, replication and repair. Sequence data generated from a novel mixed-tissue T. thynnus cDNA library provide an important transcriptomic resource that can be further employed for study of various aspects of T. thynnus ecology and genomics, with strong applications in aquaculture. Tissue-specific gene expression profiles inferred through the use of novel oligo-microarray can serve in the design of new and more focused transcriptomic studies for future research of tuna physiology and assessment of the welfare in a production environment.

  3. Genome-scale cluster analysis of replicated microarrays using shrinkage correlation coefficient.

    PubMed

    Yao, Jianchao; Chang, Chunqi; Salmi, Mari L; Hung, Yeung Sam; Loraine, Ann; Roux, Stanley J

    2008-06-18

    Currently, clustering with some form of correlation coefficient as the gene similarity metric has become a popular method for profiling genomic data. The Pearson correlation coefficient and the standard deviation (SD)-weighted correlation coefficient are the two most widely-used correlations as the similarity metrics in clustering microarray data. However, these two correlations are not optimal for analyzing replicated microarray data generated by most laboratories. An effective correlation coefficient is needed to provide statistically sufficient analysis of replicated microarray data. In this study, we describe a novel correlation coefficient, shrinkage correlation coefficient (SCC), that fully exploits the similarity between the replicated microarray experimental samples. The methodology considers both the number of replicates and the variance within each experimental group in clustering expression data, and provides a robust statistical estimation of the error of replicated microarray data. The value of SCC is revealed by its comparison with two other correlation coefficients that are currently the most widely-used (Pearson correlation coefficient and SD-weighted correlation coefficient) using statistical measures on both synthetic expression data as well as real gene expression data from Saccharomyces cerevisiae. Two leading clustering methods, hierarchical and k-means clustering were applied for the comparison. The comparison indicated that using SCC achieves better clustering performance. Applying SCC-based hierarchical clustering to the replicated microarray data obtained from germinating spores of the fern Ceratopteris richardii, we discovered two clusters of genes with shared expression patterns during spore germination. Functional analysis suggested that some of the genetic mechanisms that control germination in such diverse plant lineages as mosses and angiosperms are also conserved among ferns. This study shows that SCC is an alternative to the Pearson correlation coefficient and the SD-weighted correlation coefficient, and is particularly useful for clustering replicated microarray data. This computational approach should be generally useful for proteomic data or other high-throughput analysis methodology.

  4. Comparative genomic characterization of citrus-associated Xylella fastidiosa strains.

    PubMed

    da Silva, Vivian S; Shida, Cláudio S; Rodrigues, Fabiana B; Ribeiro, Diógenes C D; de Souza, Alessandra A; Coletta-Filho, Helvécio D; Machado, Marcos A; Nunes, Luiz R; de Oliveira, Regina Costa

    2007-12-21

    The xylem-inhabiting bacterium Xylella fastidiosa (Xf) is the causal agent of Pierce's disease (PD) in vineyards and citrus variegated chlorosis (CVC) in orange trees. Both of these economically-devastating diseases are caused by distinct strains of this complex group of microorganisms, which has motivated researchers to conduct extensive genomic sequencing projects with Xf strains. This sequence information, along with other molecular tools, have been used to estimate the evolutionary history of the group and provide clues to understand the capacity of Xf to infect different hosts, causing a variety of symptoms. Nonetheless, although significant amounts of information have been generated from Xf strains, a large proportion of these efforts has concentrated on the study of North American strains, limiting our understanding about the genomic composition of South American strains - which is particularly important for CVC-associated strains. This paper describes the first genome-wide comparison among South American Xf strains, involving 6 distinct citrus-associated bacteria. Comparative analyses performed through a microarray-based approach allowed identification and characterization of large mobile genetic elements that seem to be exclusive to South American strains. Moreover, a large-scale sequencing effort, based on Suppressive Subtraction Hybridization (SSH), identified 290 new ORFs, distributed in 135 Groups of Orthologous Elements, throughout the genomes of these bacteria. Results from microarray-based comparisons provide further evidence concerning activity of horizontally transferred elements, reinforcing their importance as major mediators in the evolution of Xf. Moreover, the microarray-based genomic profiles showed similarity between Xf strains 9a5c and Fb7, which is unexpected, given the geographical and chronological differences associated with the isolation of these microorganisms. The newly identified ORFs, obtained by SSH, represent an approximately 10% increase in our current knowledge of the South American Xf gene pool and include new putative virulence factors, as well as novel potential markers for strain identification. Surprisingly, this list of novel elements include sequences previously believed to be unique to North American strains, pointing to the necessity of revising the list of specific markers that may be used for identification of distinct Xf strains.

  5. Expression profiling of chickpea genes differentially regulated during a resistance response to Ascochyta rabiei.

    PubMed

    Coram, Tristan E; Pang, Edwin C K

    2006-11-01

    Using microarray technology and a set of chickpea (Cicer arietinum L.) unigenes, grasspea (Lathyrus sativus L.) expressed sequence tags (ESTs) and lentil (Lens culinaris Med.) resistance gene analogues, the ascochyta blight (Ascochyta rabiei (Pass.) L.) resistance response was studied in four chickpea genotypes, including resistant, moderately resistant, susceptible and wild relative (Cicer echinospermum L.) genotypes. The experimental system minimized environmental effects and was conducted in reference design, in which samples from mock-inoculated controls acted as reference against post-inoculation samples. Robust data quality was achieved through the use of three biological replicates (including a dye swap), the inclusion of negative controls and strict selection criteria for differentially expressed genes, including a fold change cut-off determined by self-self hybridizations, Student's t-test and multiple testing correction (P < 0.05). Microarray observations were also validated by quantitative reverse transcriptase-polymerase chain reaction (RT-PCR). The time course expression patterns of 756 microarray features resulted in the differential expression of 97 genes in at least one genotype at one time point. k-means clustering grouped the genes into clusters of similar observations for each genotype, and comparisons between A. rabiei-resistant and A. rabiei-susceptible genotypes revealed potential gene 'signatures' predictive of effective A. rabiei resistance. These genes included several pathogenesis-related proteins, SNAKIN2 antimicrobial peptide, proline-rich protein, disease resistance response protein DRRG49-C, environmental stress-inducible protein, leucine-zipper protein, polymorphic antigen membrane protein, Ca-binding protein and several unknown proteins. The potential involvement of these genes and their pathways of induction are discussed. This study represents the first large-scale gene expression profiling in chickpea, and future work will focus on the functional validation of the genes of interest.

  6. Identification of significant features by the Global Mean Rank test.

    PubMed

    Klammer, Martin; Dybowski, J Nikolaj; Hoffmann, Daniel; Schaab, Christoph

    2014-01-01

    With the introduction of omics-technologies such as transcriptomics and proteomics, numerous methods for the reliable identification of significantly regulated features (genes, proteins, etc.) have been developed. Experimental practice requires these tests to successfully deal with conditions such as small numbers of replicates, missing values, non-normally distributed expression levels, and non-identical distributions of features. With the MeanRank test we aimed at developing a test that performs robustly under these conditions, while favorably scaling with the number of replicates. The test proposed here is a global one-sample location test, which is based on the mean ranks across replicates, and internally estimates and controls the false discovery rate. Furthermore, missing data is accounted for without the need of imputation. In extensive simulations comparing MeanRank to other frequently used methods, we found that it performs well with small and large numbers of replicates, feature dependent variance between replicates, and variable regulation across features on simulation data and a recent two-color microarray spike-in dataset. The tests were then used to identify significant changes in the phosphoproteomes of cancer cells induced by the kinase inhibitors erlotinib and 3-MB-PP1 in two independently published mass spectrometry-based studies. MeanRank outperformed the other global rank-based methods applied in this study. Compared to the popular Significance Analysis of Microarrays and Linear Models for Microarray methods, MeanRank performed similar or better. Furthermore, MeanRank exhibits more consistent behavior regarding the degree of regulation and is robust against the choice of preprocessing methods. MeanRank does not require any imputation of missing values, is easy to understand, and yields results that are easy to interpret. The software implementing the algorithm is freely available for academic and commercial use.

  7. Fabrication of high quality cDNA microarray using a small amount of cDNA.

    PubMed

    Park, Chan Hee; Jeong, Ha Jin; Jung, Jae Jun; Lee, Gui Yeon; Kim, Sang-Chul; Kim, Tae Soo; Yang, Sang Hwa; Chung, Hyun Cheol; Rha, Sun Young

    2004-05-01

    DNA microarray technology has become an essential part of biological research. It enables the genome-scale analysis of gene expression in various types of model systems. Manufacturing high quality cDNA microarrays of microdeposition type depends on some key factors including a printing device, spotting pins, glass slides, spotting solution, and humidity during spotting. UsingEthe Microgrid II TAS model printing device, this study defined the optimal conditions for producing high density, high quality cDNA microarrays with the least amount of cDNA product. It was observed that aminosilane-modified slides were superior to other types of surface modified-slides. A humidity of 30+/-3% in a closed environment and the overnight drying of the spotted slides gave the best conditions for arraying. In addition, the cDNA dissolved in 30% DMSO gave the optimal conditions for spotting compared to the 1X ArrayIt, 3X SSC and 50% DMSO. Lastly, cDNA in the concentration range of 100-300 ng/ micro l was determined to be best for arraying and post-processing. Currently, the printing system in this study yields reproducible 9000 spots with a spot size 150 mm diameter, and a 200 nm spot spacing.

  8. Application of chemical arrays in screening elastase inhibitors.

    PubMed

    Gao, Feng; Du, Guan-Hua

    2006-06-01

    Protein chip technology provides a new and useful tool for high-throughput screening of drugs because of its high performance and low sample consumption. In order to screen elastase inhibitors on a large scale, we designed a composite microarray integrating enzyme chip containing chemical arrays on glass slides to screen for enzymatic inhibitors. The composite microarray includes an active proteinase film, screened chemical arrays distributed on the film, and substrate microarrays to demonstrate change of color. The detection principle is that elastase hydrolyzes synthetic colorless substrates and turns them into yellow products. Because yellow is difficult to detect, bromochlorophenol blue (BPB) was added into substrate solutions to facilitate the detection process. After the enzyme had catalyzed reactions for 2 h, effects of samples on enzymatic activity could be determined by detecting color change of the spots. When chemical samples inhibited enzymatic activity, substrates were blue instead of yellow products. If the enzyme retained its activity, the yellow color of the products combined with blue of BPB to make the spots green. Chromogenic differences demonstrated whether chemicals inhibited enzymatic activity or not. In this assay, 11,680 compounds were screened, and two valuable chemical hits were identified, which demonstrates that this assay is effective, sensitive and applicable for high-throughput screening (HTS).

  9. Comprehensive analysis of correlation coefficients estimated from pooling heterogeneous microarray data

    PubMed Central

    2013-01-01

    Background The synthesis of information across microarray studies has been performed by combining statistical results of individual studies (as in a mosaic), or by combining data from multiple studies into a large pool to be analyzed as a single data set (as in a melting pot of data). Specific issues relating to data heterogeneity across microarray studies, such as differences within and between labs or differences among experimental conditions, could lead to equivocal results in a melting pot approach. Results We applied statistical theory to determine the specific effect of different means and heteroskedasticity across 19 groups of microarray data on the sign and magnitude of gene-to-gene Pearson correlation coefficients obtained from the pool of 19 groups. We quantified the biases of the pooled coefficients and compared them to the biases of correlations estimated by an effect-size model. Mean differences across the 19 groups were the main factor determining the magnitude and sign of the pooled coefficients, which showed largest values of bias as they approached ±1. Only heteroskedasticity across the pool of 19 groups resulted in less efficient estimations of correlations than did a classical meta-analysis approach of combining correlation coefficients. These results were corroborated by simulation studies involving either mean differences or heteroskedasticity across a pool of N > 2 groups. Conclusions The combination of statistical results is best suited for synthesizing the correlation between expression profiles of a gene pair across several microarray studies. PMID:23822712

  10. Apparently low reproducibility of true differential expression discoveries in microarray studies.

    PubMed

    Zhang, Min; Yao, Chen; Guo, Zheng; Zou, Jinfeng; Zhang, Lin; Xiao, Hui; Wang, Dong; Yang, Da; Gong, Xue; Zhu, Jing; Li, Yanhui; Li, Xia

    2008-09-15

    Differentially expressed gene (DEG) lists detected from different microarray studies for a same disease are often highly inconsistent. Even in technical replicate tests using identical samples, DEG detection still shows very low reproducibility. It is often believed that current small microarray studies will largely introduce false discoveries. Based on a statistical model, we show that even in technical replicate tests using identical samples, it is highly likely that the selected DEG lists will be very inconsistent in the presence of small measurement variations. Therefore, the apparently low reproducibility of DEG detection from current technical replicate tests does not indicate low quality of microarray technology. We also demonstrate that heterogeneous biological variations existing in real cancer data will further reduce the overall reproducibility of DEG detection. Nevertheless, in small subsamples from both simulated and real data, the actual false discovery rate (FDR) for each DEG list tends to be low, suggesting that each separately determined list may comprise mostly true DEGs. Rather than simply counting the overlaps of the discovery lists from different studies for a complex disease, novel metrics are needed for evaluating the reproducibility of discoveries characterized with correlated molecular changes. Supplementaty information: Supplementary data are available at Bioinformatics online.

  11. A Systems Biology Methodology Combining Transcriptome and Interactome Datasets to Assess the Implications of Cytokinin Signaling for Plant Immune Networks.

    PubMed

    Kunz, Meik; Dandekar, Thomas; Naseem, Muhammad

    2017-01-01

    Cytokinins (CKs) play an important role in plant growth and development. Also, several studies highlight the modulatory implications of CKs for plant-pathogen interaction. However, the underlying mechanisms of CK mediating immune networks in plants are still not fully understood. A detailed analysis of high-throughput transcriptome (RNA-Seq and microarrays) datasets under modulated conditions of plant CKs and its mergence with cellular interactome (large-scale protein-protein interaction data) has the potential to unlock the contribution of CKs to plant defense. Here, we specifically describe a detailed systems biology methodology pertinent to the acquisition and analysis of various omics datasets that delineate the role of plant CKs in impacting immune pathways in Arabidopsis.

  12. Influence of age, sex, and strength training on human muscle gene expression determined by microarray

    PubMed Central

    ROTH, STEPHEN M.; FERRELL, ROBERT E.; PETERS, DAVID G.; METTER, E. JEFFREY; HURLEY, BEN F.; ROGERS, MARC A.

    2010-01-01

    The purpose of this study was to determine the influence of age, sex, and strength training (ST) on large-scale gene expression patterns in vastus lateralis muscle biopsies using high-density cDNA microarrays and quantitative PCR. Muscle samples from sedentary young (20–30 yr) and older (65–75 yr) men and women (5 per group) were obtained before and after a 9-wk unilateral heavy resistance ST program. RNA was hybridized to cDNA filter microarrays representing ~4,000 known human genes and comparisons were made among arrays to determine differential gene expression as a result of age and sex differences, and/or response to ST. Sex had the strongest influence on muscle gene expression, with differential expression (>1.7-fold) observed for ~200 genes between men and women (~75% with higher expression in men). Age contributed to differential expression as well, as ~50 genes were identified as differentially expressed (>1.7-fold) in relation to age, representing structural, metabolic, and regulatory gene classes. Sixty-nine genes were identified as being differentially expressed (>1.7-fold) in all groups in response to ST, and the majority of these were downregulated. Quantitative PCR was employed to validate expression levels for caldesmon, SWI/SNF (BAF60b), and four-and-a-half LIM domains 1. These significant differences suggest that in the analysis of skeletal muscle gene expression issues of sex, age, and habitual physical activity must be addressed, with sex being the most critical variable. PMID:12209020

  13. Elucidation of the effect of brain cortex tetrapeptide Cortagen on gene expression in mouse heart by microarray.

    PubMed

    Anisimov, Sergey V; Khavinson, Vladimir Kh; Anisimov, Vladimir N

    2004-01-01

    Aging is associated with significant alterations in gene expression in numerous organs and tissues. Anti-aging therapy with peptide bioregulators holds much promise for the correction of age-associated changes, making a screening for their molecular targets in tissues an important question of modern gerontology. The synthetic tetrapeptide Cortagen (Ala-Glu-Asp-Pro) was obtained by directed synthesis based on amino acid analysis of natural brain cortex peptide preparation Cortexin. In humans, Cortagen demonstrated a pronounced therapeutic effect upon the structural and functional posttraumatic recovery of peripheral nerve tissue. Importantly, other effects were also observed in cardiovascular and cerebrovascular parameters. Based on these latter observations, we hypothesized that acute course of Cortagen treatment, large-scale transcriptome analysis, and identification of transcripts with altered expression in heart would facilitate our understanding of the mechanisms responsible for this peptide biological effects. We therefore analyzed the expression of 15,247 transcripts in the heart of female 6-months CBA mice receiving injections of Cortagen for 5 consecutive days was studied by cDNA microarrays. Comparative analysis of cDNA microarray hybridisation with heart samples from control and experimental group revealed 234 clones (1,53% of the total number of clones) with significant changes of expression that matched 110 known genes belonging to various functional categories. Maximum up- and down-regulation was +5.42 and -2.86, respectively. Intercomparison of changes in cardiac expression profile induced by synthetic peptides (Cortagen, Vilon, Epitalon) and pineal peptide hormone melatonin revealed both common and specific effects of Cortagen upon gene expression in heart.

  14. Comparison of oral microbial profiles between children with severe early childhood caries and caries-free children using the human oral microbe identification microarray.

    PubMed

    Ma, Chen; Chen, Feng; Zhang, Yifei; Sun, Xiangyu; Tong, Peiyuan; Si, Yan; Zheng, Shuguo

    2015-01-01

    Early childhood caries (ECC) has become a prevalent public health problem among Chinese preschool children. The bacterial microflora is considered to be an important factor in the formation and progress of dental caries. However, high-throughput and large-scale studies of the primary dentition are lacking. The present study aimed to compare oral microbial profiles between children with severe ECC (SECC) and caries-free children. Both saliva and supragingival plaque samples were obtained from children with SECC (n = 20) and caries-free children (n = 20) aged 3 to 4 years. The samples were assayed using the Human Oral Microbe Identification Microarray (HOMIM). A total of 379 bacterial species were detected in both the saliva and supragingival plaque samples from all children. Thirteen (including Streptococcus) and two (Streptococcus and Actinomyces) bacterial species in supragingival plaque and saliva, respectively, showed significant differences in prevalence between the two groups. Of these, the frequency of Streptococcus mutans detection was significantly higher in both saliva (p = 0.026) and plaque (p = 0.006) samples from the SECC group than in those from the caries-free group. The findings of our study revealed differences in the oral microbiota between the SECC and caries-free groups Several genera, including Streptococcus, Porphyromonas, and Actinomyces, are strongly associated with SECC and can be potential biomarkers of dental caries in the primary dentition.

  15. Development of atmospheric pressure large area plasma jet for sterilisation and investigation of molecule and plasma interaction

    NASA Astrophysics Data System (ADS)

    Zerbe, Kristina; Iberler, Marcus; Jacoby, Joachim; Wagner, Christopher

    2016-09-01

    The intention of the project is the development and improvement of an atmospheric plasma jet based on various discharge forms (e.g. DBD, RF, micro-array) for sterilisation of biomedical equipment and investigation of biomolecules under the influence of plasma stress. The major objective is to design a plasma jet with a large area and an extended length. Due to the success on small scale plasma sterilisation the request of large area plasma has increased. Many applications of chemical disinfection in environmental and medical cleaning could thereby be complemented. Subsequently, the interaction between plasma and biomolecules should be investigated to improve plasma strerilisation. Special interest will be on non equilibrium plasma electrons affecting the chemical bindings of organic molecules.

  16. Optimised padlock probe ligation and microarray detection of multiple (non-authorised) GMOs in a single reaction

    PubMed Central

    Prins, Theo W; van Dijk, Jeroen P; Beenen, Henriek G; Van Hoef, AM Angeline; Voorhuijzen, Marleen M; Schoen, Cor D; Aarts, Henk JM; Kok, Esther J

    2008-01-01

    Background To maintain EU GMO regulations, producers of new GM crop varieties need to supply an event-specific method for the new variety. As a result methods are nowadays available for EU-authorised genetically modified organisms (GMOs), but only to a limited extent for EU-non-authorised GMOs (NAGs). In the last decade the diversity of genetically modified (GM) ingredients in food and feed has increased significantly. As a result of this increase GMO laboratories currently need to apply many different methods to establish to potential presence of NAGs in raw materials and complex derived products. Results In this paper we present an innovative method for detecting (approved) GMOs as well as the potential presence of NAGs in complex DNA samples containing different crop species. An optimised protocol has been developed for padlock probe ligation in combination with microarray detection (PPLMD) that can easily be scaled up. Linear padlock probes targeted against GMO-events, -elements and -species have been developed that can hybridise to their genomic target DNA and are visualised using microarray hybridisation. In a tenplex PPLMD experiment, different genomic targets in Roundup-Ready soya, MON1445 cotton and Bt176 maize were detected down to at least 1%. In single experiments, the targets were detected down to 0.1%, i.e. comparable to standard qPCR. Conclusion Compared to currently available methods this is a significant step forward towards multiplex detection in complex raw materials and derived products. It is shown that the PPLMD approach is suitable for large-scale detection of GMOs in real-life samples and provides the possibility to detect and/or identify NAGs that would otherwise remain undetected. PMID:19055784

  17. Optimised padlock probe ligation and microarray detection of multiple (non-authorised) GMOs in a single reaction.

    PubMed

    Prins, Theo W; van Dijk, Jeroen P; Beenen, Henriek G; Van Hoef, Am Angeline; Voorhuijzen, Marleen M; Schoen, Cor D; Aarts, Henk J M; Kok, Esther J

    2008-12-04

    To maintain EU GMO regulations, producers of new GM crop varieties need to supply an event-specific method for the new variety. As a result methods are nowadays available for EU-authorised genetically modified organisms (GMOs), but only to a limited extent for EU-non-authorised GMOs (NAGs). In the last decade the diversity of genetically modified (GM) ingredients in food and feed has increased significantly. As a result of this increase GMO laboratories currently need to apply many different methods to establish to potential presence of NAGs in raw materials and complex derived products. In this paper we present an innovative method for detecting (approved) GMOs as well as the potential presence of NAGs in complex DNA samples containing different crop species. An optimised protocol has been developed for padlock probe ligation in combination with microarray detection (PPLMD) that can easily be scaled up. Linear padlock probes targeted against GMO-events, -elements and -species have been developed that can hybridise to their genomic target DNA and are visualised using microarray hybridisation.In a tenplex PPLMD experiment, different genomic targets in Roundup-Ready soya, MON1445 cotton and Bt176 maize were detected down to at least 1%. In single experiments, the targets were detected down to 0.1%, i.e. comparable to standard qPCR. Compared to currently available methods this is a significant step forward towards multiplex detection in complex raw materials and derived products. It is shown that the PPLMD approach is suitable for large-scale detection of GMOs in real-life samples and provides the possibility to detect and/or identify NAGs that would otherwise remain undetected.

  18. EMAAS: An extensible grid-based Rich Internet Application for microarray data analysis and management

    PubMed Central

    Barton, G; Abbott, J; Chiba, N; Huang, DW; Huang, Y; Krznaric, M; Mack-Smith, J; Saleem, A; Sherman, BT; Tiwari, B; Tomlinson, C; Aitman, T; Darlington, J; Game, L; Sternberg, MJE; Butcher, SA

    2008-01-01

    Background Microarray experimentation requires the application of complex analysis methods as well as the use of non-trivial computer technologies to manage the resultant large data sets. This, together with the proliferation of tools and techniques for microarray data analysis, makes it very challenging for a laboratory scientist to keep up-to-date with the latest developments in this field. Our aim was to develop a distributed e-support system for microarray data analysis and management. Results EMAAS (Extensible MicroArray Analysis System) is a multi-user rich internet application (RIA) providing simple, robust access to up-to-date resources for microarray data storage and analysis, combined with integrated tools to optimise real time user support and training. The system leverages the power of distributed computing to perform microarray analyses, and provides seamless access to resources located at various remote facilities. The EMAAS framework allows users to import microarray data from several sources to an underlying database, to pre-process, quality assess and analyse the data, to perform functional analyses, and to track data analysis steps, all through a single easy to use web portal. This interface offers distance support to users both in the form of video tutorials and via live screen feeds using the web conferencing tool EVO. A number of analysis packages, including R-Bioconductor and Affymetrix Power Tools have been integrated on the server side and are available programmatically through the Postgres-PLR library or on grid compute clusters. Integrated distributed resources include the functional annotation tool DAVID, GeneCards and the microarray data repositories GEO, CELSIUS and MiMiR. EMAAS currently supports analysis of Affymetrix 3' and Exon expression arrays, and the system is extensible to cater for other microarray and transcriptomic platforms. Conclusion EMAAS enables users to track and perform microarray data management and analysis tasks through a single easy-to-use web application. The system architecture is flexible and scalable to allow new array types, analysis algorithms and tools to be added with relative ease and to cope with large increases in data volume. PMID:19032776

  19. Transcriptome Profiling of In-Vivo Produced Bovine Pre-implantation Embryos Using Two-color Microarray Platform.

    PubMed

    Salehi, Reza; Tsoi, Stephen C M; Colazo, Marcos G; Ambrose, Divakar J; Robert, Claude; Dyck, Michael K

    2017-01-30

    Early embryonic loss is a large contributor to infertility in cattle. Moreover, bovine becomes an interesting model to study human preimplantation embryo development due to their similar developmental process. Although genetic factors are known to affect early embryonic development, the discovery of such factors has been a serious challenge. Microarray technology allows quantitative measurement and gene expression profiling of transcript levels on a genome-wide basis. One of the main decisions that have to be made when planning a microarray experiment is whether to use a one- or two-color approach. Two-color design increases technical replication, minimizes variability, improves sensitivity and accuracy as well as allows having loop designs, defining the common reference samples. Although microarray is a powerful biological tool, there are potential pitfalls that can attenuate its power. Hence, in this technical paper we demonstrate an optimized protocol for RNA extraction, amplification, labeling, hybridization of the labeled amplified RNA to the array, array scanning and data analysis using the two-color analysis strategy.

  20. Inference of sigma factor controlled networks by using numerical modeling applied to microarray time series data of the germinating prokaryote.

    PubMed

    Strakova, Eva; Zikova, Alice; Vohradsky, Jiri

    2014-01-01

    A computational model of gene expression was applied to a novel test set of microarray time series measurements to reveal regulatory interactions between transcriptional regulators represented by 45 sigma factors and the genes expressed during germination of a prokaryote Streptomyces coelicolor. Using microarrays, the first 5.5 h of the process was recorded in 13 time points, which provided a database of gene expression time series on genome-wide scale. The computational modeling of the kinetic relations between the sigma factors, individual genes and genes clustered according to the similarity of their expression kinetics identified kinetically plausible sigma factor-controlled networks. Using genome sequence annotations, functional groups of genes that were predominantly controlled by specific sigma factors were identified. Using external binding data complementing the modeling approach, specific genes involved in the control of the studied process were identified and their function suggested.

  1. Genome-wide map of Apn1 binding sites under oxidative stress in Saccharomyces cerevisiae.

    PubMed

    Morris, Lydia P; Conley, Andrew B; Degtyareva, Natalya; Jordan, I King; Doetsch, Paul W

    2017-11-01

    The DNA is cells is continuously exposed to reactive oxygen species resulting in toxic and mutagenic DNA damage. Although the repair of oxidative DNA damage occurs primarily through the base excision repair (BER) pathway, the nucleotide excision repair (NER) pathway processes some of the same lesions. In addition, damage tolerance mechanisms, such as recombination and translesion synthesis, enable cells to tolerate oxidative DNA damage, especially when BER and NER capacities are exceeded. Thus, disruption of BER alone or disruption of BER and NER in Saccharomyces cerevisiae leads to increased mutations as well as large-scale genomic rearrangements. Previous studies demonstrated that a particular region of chromosome II is susceptible to chronic oxidative stress-induced chromosomal rearrangements, suggesting the existence of DNA damage and/or DNA repair hotspots. Here we investigated the relationship between oxidative damage and genomic instability utilizing chromatin immunoprecipitation combined with DNA microarray technology to profile DNA repair sites along yeast chromosomes under different oxidative stress conditions. We targeted the major yeast AP endonuclease Apn1 as a representative BER protein. Our results indicate that Apn1 target sequences are enriched for cytosine and guanine nucleotides. We predict that BER protects these sites in the genome because guanines and cytosines are thought to be especially susceptible to oxidative attack, thereby preventing large-scale genome destabilization from chronic accumulation of DNA damage. Information from our studies should provide insight into how regional deployment of oxidative DNA damage management systems along chromosomes protects against large-scale rearrangements. Copyright © 2017 John Wiley & Sons, Ltd. Copyright © 2017 John Wiley & Sons, Ltd.

  2. Revisiting adverse effects of cross-hybridization in Affymetrix gene expression data: do they matter for correlation analysis?

    PubMed

    Klebanov, Lev; Chen, Linlin; Yakovlev, Andrei

    2007-11-07

    This work was undertaken in response to a recently published paper by Okoniewski and Miller (BMC Bioinformatics 2006, 7: Article 276). The authors of that paper came to the conclusion that the process of multiple targeting in short oligonucleotide microarrays induces spurious correlations and this effect may deteriorate the inference on correlation coefficients. The design of their study and supporting simulations cast serious doubt upon the validity of this conclusion. The work by Okoniewski and Miller drove us to revisit the issue by means of experimentation with biological data and probabilistic modeling of cross-hybridization effects. We have identified two serious flaws in the study by Okoniewski and Miller: (1) The data used in their paper are not amenable to correlation analysis; (2) The proposed simulation model is inadequate for studying the effects of cross-hybridization. Using two other data sets, we have shown that removing multiply targeted probe sets does not lead to a shift in the histogram of sample correlation coefficients towards smaller values. A more realistic approach to mathematical modeling of cross-hybridization demonstrates that this process is by far more complex than the simplistic model considered by the authors. A diversity of correlation effects (such as the induction of positive or negative correlations) caused by cross-hybridization can be expected in theory but there are natural limitations on the ability to provide quantitative insights into such effects due to the fact that they are not directly observable. The proposed stochastic model is instrumental in studying general regularities in hybridization interaction between probe sets in microarray data. As the problem stands now, there is no compelling reason to believe that multiple targeting causes a large-scale effect on the correlation structure of Affymetrix gene expression data. Our analysis suggests that the observed long-range correlations in microarray data are of a biological nature rather than a technological flaw.

  3. A two-step hierarchical hypothesis set testing framework, with applications to gene expression data on ordered categories

    PubMed Central

    2014-01-01

    Background In complex large-scale experiments, in addition to simultaneously considering a large number of features, multiple hypotheses are often being tested for each feature. This leads to a problem of multi-dimensional multiple testing. For example, in gene expression studies over ordered categories (such as time-course or dose-response experiments), interest is often in testing differential expression across several categories for each gene. In this paper, we consider a framework for testing multiple sets of hypothesis, which can be applied to a wide range of problems. Results We adopt the concept of the overall false discovery rate (OFDR) for controlling false discoveries on the hypothesis set level. Based on an existing procedure for identifying differentially expressed gene sets, we discuss a general two-step hierarchical hypothesis set testing procedure, which controls the overall false discovery rate under independence across hypothesis sets. In addition, we discuss the concept of the mixed-directional false discovery rate (mdFDR), and extend the general procedure to enable directional decisions for two-sided alternatives. We applied the framework to the case of microarray time-course/dose-response experiments, and proposed three procedures for testing differential expression and making multiple directional decisions for each gene. Simulation studies confirm the control of the OFDR and mdFDR by the proposed procedures under independence and positive correlations across genes. Simulation results also show that two of our new procedures achieve higher power than previous methods. Finally, the proposed methodology is applied to a microarray dose-response study, to identify 17 β-estradiol sensitive genes in breast cancer cells that are induced at low concentrations. Conclusions The framework we discuss provides a platform for multiple testing procedures covering situations involving two (or potentially more) sources of multiplicity. The framework is easy to use and adaptable to various practical settings that frequently occur in large-scale experiments. Procedures generated from the framework are shown to maintain control of the OFDR and mdFDR, quantities that are especially relevant in the case of multiple hypothesis set testing. The procedures work well in both simulations and real datasets, and are shown to have better power than existing methods. PMID:24731138

  4. Large scale ZnTe nanostructures on polymer micro patterns via capillary force photolithography

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Florence, S. Sasi, E-mail: sshanmugaraj@jazanu.edu.sa; Can, N.; Adam, H.

    2016-06-10

    A novel approach to prepare micro patterns ZnTe nanostructures on Si (100) substrate using thermal evaporation is proposed by capillary Force Lithography (CFL) technique on a self-assembled sacrificial Polystyrene mask. Polystyrene thin films on Si substrates are used to fabricate surface micro-relief patterns. ZnTe nanoparticles have been deposited by thermal evaporation method. The deposited ZnTe nanoparticles properties were assessed by Atomic Force Microscope (AFM), Scanning Electron Microscope (SEM). SEM studies indicated that the particles are uniform in size and shape, well dispersed and spherical in shape. This study reports the micro-arrays of ZnTe nanoparticles on a self-assembled sacrificial PS maskmore » using a capillary flow photolithography process which showed excellent, morphological properties which can be used in photovoltaic devices for anti-reflection applications.« less

  5. PATIKA: an integrated visual environment for collaborative construction and analysis of cellular pathways.

    PubMed

    Demir, E; Babur, O; Dogrusoz, U; Gursoy, A; Nisanci, G; Cetin-Atalay, R; Ozturk, M

    2002-07-01

    Availability of the sequences of entire genomes shifts the scientific curiosity towards the identification of function of the genomes in large scale as in genome studies. In the near future, data produced about cellular processes at molecular level will accumulate with an accelerating rate as a result of proteomics studies. In this regard, it is essential to develop tools for storing, integrating, accessing, and analyzing this data effectively. We define an ontology for a comprehensive representation of cellular events. The ontology presented here enables integration of fragmented or incomplete pathway information and supports manipulation and incorporation of the stored data, as well as multiple levels of abstraction. Based on this ontology, we present the architecture of an integrated environment named Patika (Pathway Analysis Tool for Integration and Knowledge Acquisition). Patika is composed of a server-side, scalable, object-oriented database and client-side editors to provide an integrated, multi-user environment for visualizing and manipulating network of cellular events. This tool features automated pathway layout, functional computation support, advanced querying and a user-friendly graphical interface. We expect that Patika will be a valuable tool for rapid knowledge acquisition, microarray generated large-scale data interpretation, disease gene identification, and drug development. A prototype of Patika is available upon request from the authors.

  6. A fisheye viewer for microarray-based gene expression data

    PubMed Central

    Wu, Min; Thao, Cheng; Mu, Xiangming; Munson, Ethan V

    2006-01-01

    Background Microarray has been widely used to measure the relative amounts of every mRNA transcript from the genome in a single scan. Biologists have been accustomed to reading their experimental data directly from tables. However, microarray data are quite large and are stored in a series of files in a machine-readable format, so direct reading of the full data set is not feasible. The challenge is to design a user interface that allows biologists to usefully view large tables of raw microarray-based gene expression data. This paper presents one such interface – an electronic table (E-table) that uses fisheye distortion technology. Results The Fisheye Viewer for microarray-based gene expression data has been successfully developed to view MIAME data stored in the MAGE-ML format. The viewer can be downloaded from the project web site . The fisheye viewer was implemented in Java so that it could run on multiple platforms. We implemented the E-table by adapting JTable, a default table implementation in the Java Swing user interface library. Fisheye views use variable magnification to balance magnification for easy viewing and compression for maximizing the amount of data on the screen. Conclusion This Fisheye Viewer is a lightweight but useful tool for biologists to quickly overview the raw microarray-based gene expression data in an E-table. PMID:17038193

  7. Parallel Mutual Information Based Construction of Genome-Scale Networks on the Intel® Xeon Phi™ Coprocessor.

    PubMed

    Misra, Sanchit; Pamnany, Kiran; Aluru, Srinivas

    2015-01-01

    Construction of whole-genome networks from large-scale gene expression data is an important problem in systems biology. While several techniques have been developed, most cannot handle network reconstruction at the whole-genome scale, and the few that can, require large clusters. In this paper, we present a solution on the Intel Xeon Phi coprocessor, taking advantage of its multi-level parallelism including many x86-based cores, multiple threads per core, and vector processing units. We also present a solution on the Intel® Xeon® processor. Our solution is based on TINGe, a fast parallel network reconstruction technique that uses mutual information and permutation testing for assessing statistical significance. We demonstrate the first ever inference of a plant whole genome regulatory network on a single chip by constructing a 15,575 gene network of the plant Arabidopsis thaliana from 3,137 microarray experiments in only 22 minutes. In addition, our optimization for parallelizing mutual information computation on the Intel Xeon Phi coprocessor holds out lessons that are applicable to other domains.

  8. A New Distribution Family for Microarray Data †

    PubMed Central

    Kelmansky, Diana Mabel; Ricci, Lila

    2017-01-01

    The traditional approach with microarray data has been to apply transformations that approximately normalize them, with the drawback of losing the original scale. The alternative standpoint taken here is to search for models that fit the data, characterized by the presence of negative values, preserving their scale; one advantage of this strategy is that it facilitates a direct interpretation of the results. A new family of distributions named gpower-normal indexed by p∈R is introduced and it is proven that these variables become normal or truncated normal when a suitable gpower transformation is applied. Expressions are given for moments and quantiles, in terms of the truncated normal density. This new family can be used to model asymmetric data that include non-positive values, as required for microarray analysis. Moreover, it has been proven that the gpower-normal family is a special case of pseudo-dispersion models, inheriting all the good properties of these models, such as asymptotic normality for small variances. A combined maximum likelihood method is proposed to estimate the model parameters, and it is applied to microarray and contamination data. R codes are available from the authors upon request. PMID:28208652

  9. A New Distribution Family for Microarray Data.

    PubMed

    Kelmansky, Diana Mabel; Ricci, Lila

    2017-02-10

    The traditional approach with microarray data has been to apply transformations that approximately normalize them, with the drawback of losing the original scale. The alternative stand point taken here is to search for models that fit the data, characterized by the presence of negative values, preserving their scale; one advantage of this strategy is that it facilitates a direct interpretation of the results. A new family of distributions named gpower-normal indexed by p∈R is introduced and it is proven that these variables become normal or truncated normal when a suitable gpower transformation is applied. Expressions are given for moments and quantiles, in terms of the truncated normal density. This new family can be used to model asymmetric data that include non-positive values, as required for microarray analysis. Moreover, it has been proven that the gpower-normal family is a special case of pseudo-dispersion models, inheriting all the good properties of these models, such as asymptotic normality for small variances. A combined maximum likelihood method is proposed to estimate the model parameters, and it is applied to microarray and contamination data. Rcodes are available from the authors upon request.

  10. Genomic paradigms for food-borne enteric pathogen analysis at the USFDA: case studies highlighting method utility, integration and resolution.

    PubMed

    Elkins, C A; Kotewicz, M L; Jackson, S A; Lacher, D W; Abu-Ali, G S; Patel, I R

    2013-01-01

    Modern risk control and food safety practices involving food-borne bacterial pathogens are benefiting from new genomic technologies for rapid, yet highly specific, strain characterisations. Within the United States Food and Drug Administration (USFDA) Center for Food Safety and Applied Nutrition (CFSAN), optical genome mapping and DNA microarray genotyping have been used for several years to quickly assess genomic architecture and gene content, respectively, for outbreak strain subtyping and to enhance retrospective trace-back analyses. The application and relative utility of each method varies with outbreak scenario and the suspect pathogen, with comparative analytical power enhanced by database scale and depth. Integration of these two technologies allows high-resolution scrutiny of the genomic landscapes of enteric food-borne pathogens with notable examples including Shiga toxin-producing Escherichia coli (STEC) and Salmonella enterica serovars from a variety of food commodities. Moreover, the recent application of whole genome sequencing technologies to food-borne pathogen outbreaks and surveillance has enhanced resolution to the single nucleotide scale. This new wealth of sequence data will support more refined next-generation custom microarray designs, targeted re-sequencing and "genomic signature recognition" approaches involving a combination of genes and single nucleotide polymorphism detection to distil strain-specific fingerprinting to a minimised scale. This paper examines the utility of microarrays and optical mapping in analysing outbreaks, reviews best practices and the limits of these technologies for pathogen differentiation, and it considers future integration with whole genome sequencing efforts.

  11. Statistical issues in signal extraction from microarrays

    NASA Astrophysics Data System (ADS)

    Bergemann, Tracy; Quiaoit, Filemon; Delrow, Jeffrey J.; Zhao, Lue Ping

    2001-06-01

    Microarray technologies are increasingly used in biomedical research to study genome-wide expression profiles in the post genomic era. Their popularity is largely due to their high throughput and economical affordability. For example, microarrays have been applied to studies of cell cycle, regulatory circuitry, cancer cell lines, tumor tissues, and drug discoveries. One obstacle facing the continued success of applying microarray technologies, however, is the random variaton present on microarrays: within signal spots, between spots and among chips. In addition, signals extracted by available software packages seem to vary significantly. Despite a variety of software packages, it appears that there are two major approaches to signal extraction. One approach is to focus on the identification of signal regions and hence estimation of signal levels above background levels. The other approach is to use the distribution of intensity values as a way of identifying relevant signals. Building upon both approaches, the objective of our work is to develop a method that is statistically rigorous and also efficient and robust. Statistical issues to be considered here include: (1) how to refine grid alignment so that the overall variation is minimized, (2) how to estimate the signal levels relative to the local background levels as well as the variance of this estimate, and (3) how to integrate red and green channel signals so that the ratio of interest is stable, simultaneously relaxing distributional assumptions.

  12. A prototype for unsupervised analysis of tissue microarrays for cancer research and diagnostics.

    PubMed

    Chen, Wenjin; Reiss, Michael; Foran, David J

    2004-06-01

    The tissue microarray (TMA) technique enables researchers to extract small cylinders of tissue from histological sections and arrange them in a matrix configuration on a recipient paraffin block such that hundreds can be analyzed simultaneously. TMA offers several advantages over traditional specimen preparation by maximizing limited tissue resources and providing a highly efficient means for visualizing molecular targets. By enabling researchers to reliably determine the protein expression profile for specific types of cancer, it may be possible to elucidate the mechanism by which healthy tissues are transformed into malignancies. Currently, the primary methods used to evaluate arrays involve the interactive review of TMA samples while they are viewed under a microscope, subjectively evaluated, and scored by a technician. This process is extremely slow, tedious, and prone to error. In order to facilitate large-scale, multi-institutional studies, a more automated and reliable means for analyzing TMAs is needed. We report here a web-based prototype which features automated imaging, registration, and distributed archiving of TMAs in multiuser network environments. The system utilizes a principal color decomposition approach to identify and characterize the predominant staining signatures of specimens in color space. This strategy was shown to be reliable for detecting and quantifying the immunohistochemical expression levels for TMAs.

  13. [Transciptome among Mexicans: a large scale methodology to analyze the genetics expression profile of simultaneous samples in muscle, adipose tissue and lymphocytes obtained from the same individual].

    PubMed

    Bastarrachea, Raúl A; López-Alvarenga, Juan Carlos; Kent, Jack W; Laviada-Molina, Hugo A; Cerda-Flores, Ricardo M; Calderón-Garcidueñas, Ana Laura; Torres-Salazar, Amada; Torres-Salazar, Amanda; Nava-González, Edna J; Solis-Pérez, Elizabeth; Gallegos-Cabrales, Esther C; Cole, Shelley A; Comuzzie, Anthony G

    2008-01-01

    We describe the methodology used to analyze multiple transcripts using microarray techniques in simultaneous biopsies of muscle, adipose tissue and lymphocytes obtained from the same individual as part of the standard protocol of the Genetics of Metabolic Diseases in Mexico: GEMM Family Study. We recruited 4 healthy male subjects with BM1 20-41, who signed an informed consent letter. Subjects participated in a clinical examination that included anthropometric and body composition measurements, muscle biopsies (vastus lateralis) subcutaneous fat biopsies anda blood draw. All samples provided sufficient amplified RNA for microarray analysis. Total RNA was extracted from the biopsy samples and amplified for analysis. Of the 48,687 transcript targets queried, 39.4% were detectable in a least one of the studied tissues. Leptin was not detectable in lymphocytes, weakly expressed in muscle, but overexpressed and highly correlated with BMI in subcutaneous fat. Another example was GLUT4, which was detectable only in muscle and not correlated with BMI. Expression level concordance was 0.7 (p< 0.001) for the three tissues studied. We demonstrated the feasibility of carrying out simultaneous analysis of gene expression in multiple tissues, concordance of genetic expression in different tissues, and obtained confidence that this method corroborates the expected biological relationships among LEPand GLUT4. TheGEMM study will provide a broad and valuable overview on metabolic diseases, including obesity and type 2 diabetes.

  14. Molecular Genetics Information System (MOLGENIS): alternatives in developing local experimental genomics databases.

    PubMed

    Swertz, Morris A; De Brock, E O; Van Hijum, Sacha A F T; De Jong, Anne; Buist, Girbe; Baerends, Richard J S; Kok, Jan; Kuipers, Oscar P; Jansen, Ritsert C

    2004-09-01

    Genomic research laboratories need adequate infrastructure to support management of their data production and research workflow. But what makes infrastructure adequate? A lack of appropriate criteria makes any decision on buying or developing a system difficult. Here, we report on the decision process for the case of a molecular genetics group establishing a microarray laboratory. Five typical requirements for experimental genomics database systems were identified: (i) evolution ability to keep up with the fast developing genomics field; (ii) a suitable data model to deal with local diversity; (iii) suitable storage of data files in the system; (iv) easy exchange with other software; and (v) low maintenance costs. The computer scientists and the researchers of the local microarray laboratory considered alternative solutions for these five requirements and chose the following options: (i) use of automatic code generation; (ii) a customized data model based on standards; (iii) storage of datasets as black boxes instead of decomposing them in database tables; (iv) loosely linking to other programs for improved flexibility; and (v) a low-maintenance web-based user interface. Our team evaluated existing microarray databases and then decided to build a new system, Molecular Genetics Information System (MOLGENIS), implemented using code generation in a period of three months. This case can provide valuable insights and lessons to both software developers and a user community embarking on large-scale genomic projects. http://www.molgenis.nl

  15. MADGE: scalable distributed data management software for cDNA microarrays.

    PubMed

    McIndoe, Richard A; Lanzen, Aaron; Hurtz, Kimberly

    2003-01-01

    The human genome project and the development of new high-throughput technologies have created unparalleled opportunities to study the mechanism of diseases, monitor the disease progression and evaluate effective therapies. Gene expression profiling is a critical tool to accomplish these goals. The use of nucleic acid microarrays to assess the gene expression of thousands of genes simultaneously has seen phenomenal growth over the past five years. Although commercial sources of microarrays exist, investigators wanting more flexibility in the genes represented on the array will turn to in-house production. The creation and use of cDNA microarrays is a complicated process that generates an enormous amount of information. Effective data management of this information is essential to efficiently access, analyze, troubleshoot and evaluate the microarray experiments. We have developed a distributable software package designed to track and store the various pieces of data generated by a cDNA microarray facility. This includes the clone collection storage data, annotation data, workflow queues, microarray data, data repositories, sample submission information, and project/investigator information. This application was designed using a 3-tier client server model. The data access layer (1st tier) contains the relational database system tuned to support a large number of transactions. The data services layer (2nd tier) is a distributed COM server with full database transaction support. The application layer (3rd tier) is an internet based user interface that contains both client and server side code for dynamic interactions with the user. This software is freely available to academic institutions and non-profit organizations at http://www.genomics.mcg.edu/niddkbtc.

  16. The Effect of Gestational Age on Angiogenic Gene Expression in the Rat Placenta

    PubMed Central

    Vaswani, Kanchan; Hum, Melissa Wen-Ching; Chan, Hsiu-Wen; Ryan, Jennifer; Wood-Bradley, Ryan J.; Nitert, Marloes Dekker; Mitchell, Murray D.; Armitage, James A.; Rice, Gregory E.

    2013-01-01

    The placenta plays a central role in determining the outcome of pregnancy. It undergoes changes during gestation as the fetus develops and as demands for energy substrate transfer and gas exchange increase. The molecular mechanisms that coordinate these changes have yet to be fully elucidated. The study performed a large scale screen of the transcriptome of the rat placenta throughout mid-late gestation (E14.25–E20) with emphasis on characterizing gestational age associated changes in the expression of genes invoved in angiogenic pathways. Sprague Dawley dams were sacrificed at E14.25, E15.25, E17.25 and E20 (n = 6 per group) and RNA was isolated from one placenta per dam. Changes in placental gene expression were identifed using Illumina Rat Ref-12 Expression BeadChip Microarrays. Differentially expressed genes (>2-fold change, <1% false discovery rate, FDR) were functionally categorised by gene ontology pathway analysis. A subset of differentially expressed genes identified by microarrays were confirmed using Real-Time qPCR. The expression of thirty one genes involved in the angiogenic pathway was shown to change over time, using microarray analysis (22 genes displayed increased and 9 gene decreased expression). Five genes (4 up regulated: Cd36, Mmp14, Rhob and Angpt4 and 1 down regulated: Foxm1) involved in angiogenesis and blood vessel morphogenesis were subjected to further validation. qPCR confirmed late gestational increased expression of Cd36, Mmp14, Rhob and Angpt4 and a decrease in expression of Foxm1 before labour onset (P<0.0001). The observed acute, pre-labour changes in the expression of the 31 genes during gestation warrant further investigation to elucidate their role in pregnancy. PMID:24391823

  17. Peptides and Anti-peptide Antibodies for Small and Medium Scale Peptide and Anti-peptide Affinity Microarrays: Antigenic Peptide Selection, Immobilization, and Processing.

    PubMed

    Zhang, Fan; Briones, Andrea; Soloviev, Mikhail

    2016-01-01

    This chapter describes the principles of selection of antigenic peptides for the development of anti-peptide antibodies for use in microarray-based multiplex affinity assays and also with mass-spectrometry detection. The methods described here are mostly applicable to small to medium scale arrays. Although the same principles of peptide selection would be suitable for larger scale arrays (with 100+ features) the actual informatics software and printing methods may well be different. Because of the sheer number of proteins/peptides to be processed and analyzed dedicated software capable of processing all the proteins and an enterprise level array robotics may be necessary for larger scale efforts. This report aims to provide practical advice to those who develop or use arrays with up to ~100 different peptide or protein features.

  18. Developing an Efficient and General Strategy for Immobilization of Small Molecules onto Microarrays Using Isocyanate Chemistry.

    PubMed

    Zhu, Chenggang; Zhu, Xiangdong; Landry, James P; Cui, Zhaomeng; Li, Quanfu; Dang, Yongjun; Mi, Lan; Zheng, Fengyun; Fei, Yiyan

    2016-03-16

    Small-molecule microarray (SMM) is an effective platform for identifying lead compounds from large collections of small molecules in drug discovery, and efficient immobilization of molecular compounds is a pre-requisite for the success of such a platform. On an isocyanate functionalized surface, we studied the dependence of immobilization efficiency on chemical residues on molecular compounds, terminal residues on isocyanate functionalized surface, lengths of spacer molecules, and post-printing treatment conditions, and we identified a set of optimized conditions that enable us to immobilize small molecules with significantly improved efficiencies, particularly for those molecules with carboxylic acid residues that are known to have low isocyanate reactivity. We fabricated microarrays of 3375 bioactive compounds on isocyanate functionalized glass slides under these optimized conditions and confirmed that immobilization percentage is over 73%.

  19. Microarrays for Undergraduate Classes

    ERIC Educational Resources Information Center

    Hancock, Dale; Nguyen, Lisa L.; Denyer, Gareth S.; Johnston, Jill M.

    2006-01-01

    A microarray experiment is presented that, in six laboratory sessions, takes undergraduate students from the tissue sample right through to data analysis. The model chosen, the murine erythroleukemia cell line, can be easily cultured in sufficient quantities for class use. Large changes in gene expression can be induced in these cells by…

  20. Microarray and Real-Time PCR Analyses of the Responses of High-Arctic Soil Bacteria to Hydrocarbon Pollution and Bioremediation Treatments▿

    PubMed Central

    Yergeau, Etienne; Arbour, Mélanie; Brousseau, Roland; Juck, David; Lawrence, John R.; Masson, Luke; Whyte, Lyle G.; Greer, Charles W.

    2009-01-01

    High-Arctic soils have low nutrient availability, low moisture content, and very low temperatures and, as such, they pose a particular problem in terms of hydrocarbon bioremediation. An in-depth knowledge of the microbiology involved in this process is likely to be crucial to understand and optimize the factors most influencing bioremediation. Here, we compared two distinct large-scale field bioremediation experiments, located at the Canadian high-Arctic stations of Alert (ex situ approach) and Eureka (in situ approach). Bacterial community structure and function were assessed using microarrays targeting the 16S rRNA genes of bacteria found in cold environments and hydrocarbon degradation genes as well as quantitative reverse transcriptase PCR targeting key functional genes. The results indicated a large difference between sampling sites in terms of both soil microbiology and decontamination rates. A rapid reorganization of the bacterial community structure and functional potential as well as rapid increases in the expression of alkane monooxygenases and polyaromatic hydrocarbon-ring-hydroxylating dioxygenases were observed 1 month after the bioremediation treatment commenced in the Alert soils. In contrast, no clear changes in community structure were observed in Eureka soils, while key gene expression increased after a relatively long lag period (1 year). Such discrepancies are likely caused by differences in bioremediation treatments (i.e., ex situ versus in situ), weathering of the hydrocarbons, indigenous microbial communities, and environmental factors such as soil humidity and temperature. In addition, this study demonstrates the value of molecular tools for the monitoring of polar bacteria and their associated functions during bioremediation. PMID:19684169

  1. Microarray slide hybridization using fluorescently labeled cDNA.

    PubMed

    Ares, Manuel

    2014-01-01

    Microarray hybridization is used to determine the amount and genomic origins of RNA molecules in an experimental sample. Unlabeled probe sequences for each gene or gene region are printed in an array on the surface of a slide, and fluorescently labeled cDNA derived from the RNA target is hybridized to it. This protocol describes a blocking and hybridization protocol for microarray slides. The blocking step is particular to the chemistry of "CodeLink" slides, but it serves to remind us that almost every kind of microarray has a treatment step that occurs after printing but before hybridization. We recommend making sure of the precise treatment necessary for the particular chemistry used in the slides to be hybridized because the attachment chemistries differ significantly. Hybridization is similar to northern or Southern blots, but on a much smaller scale.

  2. A RNA-Seq Analysis of the Rat Supraoptic Nucleus Transcriptome: Effects of Salt Loading on Gene Expression

    PubMed Central

    Salinas, Yasmmyn D.; Shi, YiJun; Greenwood, Michael; Hoe, See Ziau; Murphy, David; Gainer, Harold

    2015-01-01

    Magnocellular neurons (MCNs) in the hypothalamo-neurohypophysial system (HNS) are highly specialized to release large amounts of arginine vasopressin (Avp) or oxytocin (Oxt) into the blood stream and play critical roles in the regulation of body fluid homeostasis. The MCNs are osmosensory neurons and are excited by exposure to hypertonic solutions and inhibited by hypotonic solutions. The MCNs respond to systemic hypertonic and hypotonic stimulation with large changes in the expression of their Avp and Oxt genes, and microarray studies have shown that these osmotic perturbations also cause large changes in global gene expression in the HNS. In this paper, we examine gene expression in the rat supraoptic nucleus (SON) under normosmotic and chronic salt-loading SL) conditions by the first time using “new-generation”, RNA sequencing (RNA-Seq) methods. We reliably detect 9,709 genes as present in the SON by RNA-Seq, and 552 of these genes were changed in expression as a result of chronic SL. These genes reflect diverse functions, and 42 of these are involved in either transcriptional or translational processes. In addition, we compare the SON transcriptomes resolved by RNA-Seq methods with the SON transcriptomes determined by Affymetrix microarray methods in rats under the same osmotic conditions, and find that there are 6,466 genes present in the SON that are represented in both data sets, although 1,040 of the expressed genes were found only in the microarray data, and 2,762 of the expressed genes are selectively found in the RNA-Seq data and not the microarray data. These data provide the research community a comprehensive view of the transcriptome in the SON under normosmotic conditions and the changes in specific gene expression evoked by salt loading. PMID:25897513

  3. Construction of an Ostrea edulis database from genomic and expressed sequence tags (ESTs) obtained from Bonamia ostreae infected haemocytes: Development of an immune-enriched oligo-microarray.

    PubMed

    Pardo, Belén G; Álvarez-Dios, José Antonio; Cao, Asunción; Ramilo, Andrea; Gómez-Tato, Antonio; Planas, Josep V; Villalba, Antonio; Martínez, Paulino

    2016-12-01

    The flat oyster, Ostrea edulis, is one of the main farmed oysters, not only in Europe but also in the United States and Canada. Bonamiosis due to the parasite Bonamia ostreae has been associated with high mortality episodes in this species. This parasite is an intracellular protozoan that infects haemocytes, the main cells involved in oyster defence. Due to the economical and ecological importance of flat oyster, genomic data are badly needed for genetic improvement of the species, but they are still very scarce. The objective of this study is to develop a sequence database, OedulisDB, with new genomic and transcriptomic resources, providing new data and convenient tools to improve our knowledge of the oyster's immune mechanisms. Transcriptomic and genomic sequences were obtained using 454 pyrosequencing and compiled into an O. edulis database, OedulisDB, consisting of two sets of 10,318 and 7159 unique sequences that represent the oyster's genome (WG) and de novo haemocyte transcriptome (HT), respectively. The flat oyster transcriptome was obtained from two strains (naïve and tolerant) challenged with B. ostreae, and from their corresponding non-challenged controls. Approximately 78.5% of 5619 HT unique sequences were successfully annotated by Blast search using public databases. A total of 984 sequences were identified as being related to immune response and several key immune genes were identified for the first time in flat oyster. Additionally, transcriptome information was used to design and validate the first oligo-microarray in flat oyster enriched with immune sequences from haemocytes. Our transcriptomic and genomic sequencing and subsequent annotation have largely increased the scarce resources available for this economically important species and have enabled us to develop an OedulisDB database and accompanying tools for gene expression analysis. This study represents the first attempt to characterize in depth the O. edulis haemocyte transcriptome in response to B. ostreae through massively sequencing and has aided to improve our knowledge of the immune mechanisms of flat oyster. The validated oligo-microarray and the establishment of a reference transcriptome will be useful for large-scale gene expression studies in this species. Copyright © 2016 Elsevier Ltd. All rights reserved.

  4. Implementation of spectral clustering with partitioning around medoids (PAM) algorithm on microarray data of carcinoma

    NASA Astrophysics Data System (ADS)

    Cahyaningrum, Rosalia D.; Bustamam, Alhadi; Siswantining, Titin

    2017-03-01

    Technology of microarray became one of the imperative tools in life science to observe the gene expression levels, one of which is the expression of the genes of people with carcinoma. Carcinoma is a cancer that forms in the epithelial tissue. These data can be analyzed such as the identification expressions hereditary gene and also build classifications that can be used to improve diagnosis of carcinoma. Microarray data usually served in large dimension that most methods require large computing time to do the grouping. Therefore, this study uses spectral clustering method which allows to work with any object for reduces dimension. Spectral clustering method is a method based on spectral decomposition of the matrix which is represented in the form of a graph. After the data dimensions are reduced, then the data are partitioned. One of the famous partition method is Partitioning Around Medoids (PAM) which is minimize the objective function with exchanges all the non-medoid points into medoid point iteratively until converge. Objectivity of this research is to implement methods spectral clustering and partitioning algorithm PAM to obtain groups of 7457 genes with carcinoma based on the similarity value. The result in this study is two groups of genes with carcinoma.

  5. NF2 tumor suppressor gene: a comprehensive and efficient detection of somatic mutations by denaturing HPLC and microarray-CGH.

    PubMed

    Szijan, Irene; Rochefort, Daniel; Bruder, Carl; Surace, Ezequiel; Machiavelli, Gloria; Dalamon, Viviana; Cotignola, Javier; Ferreiro, Veronica; Campero, Alvaro; Basso, Armando; Dumanski, Jan P; Rouleau, Guy A

    2003-01-01

    The NF2 tumor suppressor gene, located in chromosome 22q12, is involved in the development of multiple tumors of the nervous system, either associated with neurofibromatosis 2 or sporadic ones, mainly schwannomas and meningiomas. In order to evaluate the role of the NF2 gene in sporadic central nervous system (CNS) tumors, we analyzed NF2 mutations in 26 specimens: 14 meningiomas, 4 schwannomas, 4 metastases, and 4 other histopathological types of neoplasms. Denaturing high performance liquid chromatography (denaturing HPLC) and comparative genomic hybridization on a DNA microarray (microarray- CGH) were used as scanning methods for small mutations and gross rearrangements respectively. Small mutations were identified in six out of seventeen meningiomas and schwannomas, one mutation was novel. Large deletions were detected in six meningiomas. All mutations were predicted to result in truncated protein or in the absence of a large protein domain. No NF2 mutations were found in other histopathological types of CNS tumors. These results provide additional evidence that mutations in the NF2 gene play an important role in the development of sporadic meningiomas and schwannomas. Denaturing HPLC analysis of small mutations and microarray-CGH of large deletions are complementary, fast, and efficient methods for the detection of mutations in tumor tissues.

  6. Large-scale gene expression profiling data for the model moss Physcomitrella patens aid understanding of developmental progression, culture and stress conditions.

    PubMed

    Hiss, Manuel; Laule, Oliver; Meskauskiene, Rasa M; Arif, Muhammad A; Decker, Eva L; Erxleben, Anika; Frank, Wolfgang; Hanke, Sebastian T; Lang, Daniel; Martin, Anja; Neu, Christina; Reski, Ralf; Richardt, Sandra; Schallenberg-Rüdinger, Mareike; Szövényi, Peter; Tiko, Theodhor; Wiedemann, Gertrud; Wolf, Luise; Zimmermann, Philip; Rensing, Stefan A

    2014-08-01

    The moss Physcomitrella patens is an important model organism for studying plant evolution, development, physiology and biotechnology. Here we have generated microarray gene expression data covering the principal developmental stages, culture forms and some environmental/stress conditions. Example analyses of developmental stages and growth conditions as well as abiotic stress treatments demonstrate that (i) growth stage is dominant over culture conditions, (ii) liquid culture is not stressful for the plant, (iii) low pH might aid protoplastation by reduced expression of cell wall structure genes, (iv) largely the same gene pool mediates response to dehydration and rehydration, and (v) AP2/EREBP transcription factors play important roles in stress response reactions. With regard to the AP2 gene family, phylogenetic analysis and comparison with Arabidopsis thaliana shows commonalities as well as uniquely expressed family members under drought, light perturbations and protoplastation. Gene expression profiles for P. patens are available for the scientific community via the easy-to-use tool at https://www.genevestigator.com. By providing large-scale expression profiles, the usability of this model organism is further enhanced, for example by enabling selection of control genes for quantitative real-time PCR. Now, gene expression levels across a broad range of conditions can be accessed online for P. patens. © 2014 The Authors The Plant Journal © 2014 John Wiley & Sons Ltd.

  7. Using microarrays to identify positional candidate genes for QTL: the case study of ACTH response in pigs.

    PubMed

    Jouffe, Vincent; Rowe, Suzanne; Liaubet, Laurence; Buitenhuis, Bart; Hornshøj, Henrik; SanCristobal, Magali; Mormède, Pierre; de Koning, D J

    2009-07-16

    Microarray studies can supplement QTL studies by suggesting potential candidate genes in the QTL regions, which by themselves are too large to provide a limited selection of candidate genes. Here we provide a case study where we explore ways to integrate QTL data and microarray data for the pig, which has only a partial genome sequence. We outline various procedures to localize differentially expressed genes on the pig genome and link this with information on published QTL. The starting point is a set of 237 differentially expressed cDNA clones in adrenal tissue from two pig breeds, before and after treatment with adrenocorticotropic hormone (ACTH). Different approaches to localize the differentially expressed (DE) genes to the pig genome showed different levels of success and a clear lack of concordance for some genes between the various approaches. For a focused analysis on 12 genes, overlapping QTL from the public domain were presented. Also, differentially expressed genes underlying QTL for ACTH response were described. Using the latest version of the draft sequence, the differentially expressed genes were mapped to the pig genome. This enabled co-location of DE genes and previously studied QTL regions, but the draft genome sequence is still incomplete and will contain many errors. A further step to explore links between DE genes and QTL at the pathway level was largely unsuccessful due to the lack of annotation of the pig genome. This could be improved by further comparative mapping analyses but this would be time consuming. This paper provides a case study for the integration of QTL data and microarray data for a species with limited genome sequence information and annotation. The results illustrate the challenges that must be addressed but also provide a roadmap for future work that is applicable to other non-model species.

  8. Small Fish Species as Powerful Model Systems to Study Vertebrate Physiology in Space

    NASA Astrophysics Data System (ADS)

    Muller, M.; Aceto, J.; Dalcq, J.; Alestrom, P.; Nourizadeh-Lillabadi, R.; Goerlich, R.; Schiller, V.; Winkler, C.; Renn, J.; Eberius, M.; Slenzka, K.

    2008-06-01

    Small fish models, mainly zebrafish (Danio rerio) and medaka (Oryzias latipes), have been used for many years as powerful model systems for vertebrate developmental biology. Moreover, these species are increasingly recognized as valuable systems to study vertebrate physiology, pathology, pharmacology and toxicology, including in particular bone physiology. The biology of small fishes presents many advantages, such as transparency of the embryos, external and rapid development, small size and easy reproduction. Further characteristics are particularly useful for space research or for large scale screening approaches. Finally, many technologies for easily characterizing bones are available. Our objective is to investigate the changes induced by microgravity in small fish. By combining whole genome analysis (microarray, DNA methylation, chromatin modification) with live imaging of selected genes in transgenic animals, a comprehensive and integrated characterization of physiological changes in space could be gained, especially concerning bone physiology.

  9. KUTE-BASE: storing, downloading and exporting MIAME-compliant microarray experiments in minutes rather than hours.

    PubMed

    Draghici, Sorin; Tarca, Adi L; Yu, Longfei; Ethier, Stephen; Romero, Roberto

    2008-03-01

    The BioArray Software Environment (BASE) is a very popular MIAME-compliant, web-based microarray data repository. However in BASE, like in most other microarray data repositories, the experiment annotation and raw data uploading can be very timeconsuming, especially for large microarray experiments. We developed KUTE (Karmanos Universal daTabase for microarray Experiments), as a plug-in for BASE 2.0 that addresses these issues. KUTE provides an automatic experiment annotation feature and a completely redesigned data work-flow that dramatically reduce the human-computer interaction time. For instance, in BASE 2.0 a typical Affymetrix experiment involving 100 arrays required 4 h 30 min of user interaction time forexperiment annotation, and 45 min for data upload/download. In contrast, for the same experiment, KUTE required only 28 min of user interaction time for experiment annotation, and 3.3 min for data upload/download. http://vortex.cs.wayne.edu/kute/index.html.

  10. Chondrocyte channel transcriptomics

    PubMed Central

    Lewis, Rebecca; May, Hannah; Mobasheri, Ali; Barrett-Jolley, Richard

    2013-01-01

    To date, a range of ion channels have been identified in chondrocytes using a number of different techniques, predominantly electrophysiological and/or biomolecular; each of these has its advantages and disadvantages. Here we aim to compare and contrast the data available from biophysical and microarray experiments. This letter analyses recent transcriptomics datasets from chondrocytes, accessible from the European Bioinformatics Institute (EBI). We discuss whether such bioinformatic analysis of microarray datasets can potentially accelerate identification and discovery of ion channels in chondrocytes. The ion channels which appear most frequently across these microarray datasets are discussed, along with their possible functions. We discuss whether functional or protein data exist which support the microarray data. A microarray experiment comparing gene expression in osteoarthritis and healthy cartilage is also discussed and we verify the differential expression of 2 of these genes, namely the genes encoding large calcium-activated potassium (BK) and aquaporin channels. PMID:23995703

  11. Is this the real time for genomics?

    PubMed

    Guarnaccia, Maria; Gentile, Giulia; Alessi, Enrico; Schneider, Claudio; Petralia, Salvatore; Cavallaro, Sebastiano

    2014-01-01

    In the last decades, molecular biology has moved from gene-by-gene analysis to more complex studies using a genome-wide scale. Thanks to high-throughput genomic technologies, such as microarrays and next-generation sequencing, a huge amount of information has been generated, expanding our knowledge on the genetic basis of various diseases. Although some of this information could be transferred to clinical diagnostics, the technologies available are not suitable for this purpose. In this review, we will discuss the drawbacks associated with the use of traditional DNA microarrays in diagnostics, pointing out emerging platforms that could overcome these obstacles and offer a more reproducible, qualitative and quantitative multigenic analysis. New miniaturized and automated devices, called Lab-on-Chip, begin to integrate PCR and microarray on the same platform, offering integrated sample-to-result systems. The introduction of this kind of innovative devices may facilitate the transition of genome-based tests into clinical routine. Copyright © 2014. Published by Elsevier Inc.

  12. Mining subspace clusters from DNA microarray data using large itemset techniques.

    PubMed

    Chang, Ye-In; Chen, Jiun-Rung; Tsai, Yueh-Chi

    2009-05-01

    Mining subspace clusters from the DNA microarrays could help researchers identify those genes which commonly contribute to a disease, where a subspace cluster indicates a subset of genes whose expression levels are similar under a subset of conditions. Since in a DNA microarray, the number of genes is far larger than the number of conditions, those previous proposed algorithms which compute the maximum dimension sets (MDSs) for any two genes will take a long time to mine subspace clusters. In this article, we propose the Large Itemset-Based Clustering (LISC) algorithm for mining subspace clusters. Instead of constructing MDSs for any two genes, we construct only MDSs for any two conditions. Then, we transform the task of finding the maximal possible gene sets into the problem of mining large itemsets from the condition-pair MDSs. Since we are only interested in those subspace clusters with gene sets as large as possible, it is desirable to pay attention to those gene sets which have reasonable large support values in the condition-pair MDSs. From our simulation results, we show that the proposed algorithm needs shorter processing time than those previous proposed algorithms which need to construct gene-pair MDSs.

  13. Genome-Scale Protein Microarray Comparison of Human Antibody Responses in Plasmodium vivax Relapse and Reinfection

    PubMed Central

    Chuquiyauri, Raul; Molina, Douglas M.; Moss, Eli L.; Wang, Ruobing; Gardner, Malcolm J.; Brouwer, Kimberly C.; Torres, Sonia; Gilman, Robert H.; Llanos-Cuentas, Alejandro; Neafsey, Daniel E.; Felgner, Philip; Liang, Xiaowu; Vinetz, Joseph M.

    2015-01-01

    Large scale antibody responses in Plasmodium vivax malaria remains unexplored in the endemic setting. Protein microarray analysis of asexual-stage P. vivax was used to identify antigens recognized in sera from residents of hypoendemic Peruvian Amazon. Over 24 months, of 106 participants, 91 had two symptomatic P. vivax malaria episodes, 11 had three episodes, 3 had four episodes, and 1 had five episodes. Plasmodium vivax relapse was distinguished from reinfection by a merozoite surface protein-3α restriction fragment length polymorphism polymerase chain reaction (MSP3α PCR-RFLP) assay. Notably, P. vivax reinfection subjects did not have higher reactivity to the entire set of recognized P. vivax blood-stage antigens than relapse subjects, regardless of the number of malaria episodes. The most highly recognized P. vivax proteins were MSP 4, 7, 8, and 10 (PVX_003775, PVX_082650, PVX_097625, and PVX_114145); sexual-stage antigen s16 (PVX_000930); early transcribed membrane protein (PVX_090230); tryptophan-rich antigen (Pv-fam-a) (PVX_092995); apical merozoite antigen 1 (PVX_092275); and proteins of unknown function (PVX_081830, PVX_117680, PVX_118705, PVX_121935, PVX_097730, PVX_110935, PVX_115450, and PVX_082475). Genes encoding reactive proteins exhibited a significant enrichment of non-synonymous nucleotide variation, an observation suggesting immune selection. These data identify candidates for seroepidemiological tools to support malaria elimination efforts in P. vivax-endemic regions. PMID:26149860

  14. Functional proteomic analysis reveals the involvement of KIAA1199 in breast cancer growth, motility and invasiveness

    PubMed Central

    2014-01-01

    Background KIAA1199 is a recently identified novel gene that is up-regulated in human cancer with poor survival. Our proteomic study on signaling polarity in chemotactic cells revealed KIAA1199 as a novel protein target that may be involved in cellular chemotaxis and motility. In the present study, we examined the functional significance of KIAA1199 expression in breast cancer growth, motility and invasiveness. Methods We validated the previous microarray observation by tissue microarray immunohistochemistry using a TMA slide containing 12 breast tumor tissue cores and 12 corresponding normal tissues. We performed the shRNA-mediated knockdown of KIAA1199 in MDA-MB-231 and HS578T cells to study the role of this protein in cell proliferation, migration and apoptosis in vitro. We studied the effects of KIAA1199 knockdown in vivo in two groups of mice (n = 5). We carried out the SILAC LC-MS/MS based proteomic studies on the involvement of KIAA1199 in breast cancer. Results KIAA1199 mRNA and protein was significantly overexpressed in breast tumor specimens and cell lines as compared with non-neoplastic breast tissues from large-scale microarray and studies of breast cancer cell lines and tumors. To gain deeper insights into the novel role of KIAA1199 in breast cancer, we modulated KIAA1199 expression using shRNA-mediated knockdown in two breast cancer cell lines (MDA-MB-231 and HS578T), expressing higher levels of KIAA1199. The KIAA1199 knockdown cells showed reduced motility and cell proliferation in vitro. Moreover, when the knockdown cells were injected into the mammary fat pads of female athymic nude mice, there was a significant decrease in tumor incidence and growth. In addition, quantitative proteomic analysis revealed that knockdown of KIAA1199 in breast cancer (MDA-MB-231) cells affected a broad range of cellular functions including apoptosis, metabolism and cell motility. Conclusions Our findings indicate that KIAA1199 may play an important role in breast tumor growth and invasiveness, and that it may represent a novel target for biomarker development and a novel therapeutic target for breast cancer. PMID:24628760

  15. SpliceCenter: A suite of web-based bioinformatic applications for evaluating the impact of alternative splicing on RT-PCR, RNAi, microarray, and peptide-based studies

    PubMed Central

    Ryan, Michael C; Zeeberg, Barry R; Caplen, Natasha J; Cleland, James A; Kahn, Ari B; Liu, Hongfang; Weinstein, John N

    2008-01-01

    Background Over 60% of protein-coding genes in vertebrates express mRNAs that undergo alternative splicing. The resulting collection of transcript isoforms poses significant challenges for contemporary biological assays. For example, RT-PCR validation of gene expression microarray results may be unsuccessful if the two technologies target different splice variants. Effective use of sequence-based technologies requires knowledge of the specific splice variant(s) that are targeted. In addition, the critical roles of alternative splice forms in biological function and in disease suggest that assay results may be more informative if analyzed in the context of the targeted splice variant. Results A number of contemporary technologies are used for analyzing transcripts or proteins. To enable investigation of the impact of splice variation on the interpretation of data derived from those technologies, we have developed SpliceCenter. SpliceCenter is a suite of user-friendly, web-based applications that includes programs for analysis of RT-PCR primer/probe sets, effectors of RNAi, microarrays, and protein-targeting technologies. Both interactive and high-throughput implementations of the tools are provided. The interactive versions of SpliceCenter tools provide visualizations of a gene's alternative transcripts and probe target positions, enabling the user to identify which splice variants are or are not targeted. The high-throughput batch versions accept user query files and provide results in tabular form. When, for example, we used SpliceCenter's batch siRNA-Check to process the Cancer Genome Anatomy Project's large-scale shRNA library, we found that only 59% of the 50,766 shRNAs in the library target all known splice variants of the target gene, 32% target some but not all, and 9% do not target any currently annotated transcript. Conclusion SpliceCenter provides unique, user-friendly applications for assessing the impact of transcript variation on the design and interpretation of RT-PCR, RNAi, gene expression microarrays, antibody-based detection, and mass spectrometry proteomics. The tools are intended for use by bench biologists as well as bioinformaticists. PMID:18638396

  16. Probing the Xenopus laevis inner ear transcriptome for biological function

    PubMed Central

    2012-01-01

    Background The senses of hearing and balance depend upon mechanoreception, a process that originates in the inner ear and shares features across species. Amphibians have been widely used for physiological studies of mechanotransduction by sensory hair cells. In contrast, much less is known of the genetic basis of auditory and vestibular function in this class of animals. Among amphibians, the genus Xenopus is a well-characterized genetic and developmental model that offers unique opportunities for inner ear research because of the amphibian capacity for tissue and organ regeneration. For these reasons, we implemented a functional genomics approach as a means to undertake a large-scale analysis of the Xenopus laevis inner ear transcriptome through microarray analysis. Results Microarray analysis uncovered genes within the X. laevis inner ear transcriptome associated with inner ear function and impairment in other organisms, thereby supporting the inclusion of Xenopus in cross-species genetic studies of the inner ear. The use of gene categories (inner ear tissue; deafness; ion channels; ion transporters; transcription factors) facilitated the assignment of functional significance to probe set identifiers. We enhanced the biological relevance of our microarray data by using a variety of curation approaches to increase the annotation of the Affymetrix GeneChip® Xenopus laevis Genome array. In addition, annotation analysis revealed the prevalence of inner ear transcripts represented by probe set identifiers that lack functional characterization. Conclusions We identified an abundance of targets for genetic analysis of auditory and vestibular function. The orthologues to human genes with known inner ear function and the highly expressed transcripts that lack annotation are particularly interesting candidates for future analyses. We used informatics approaches to impart biologically relevant information to the Xenopus inner ear transcriptome, thereby addressing the impediment imposed by insufficient gene annotation. These findings heighten the relevance of Xenopus as a model organism for genetic investigations of inner ear organogenesis, morphogenesis, and regeneration. PMID:22676585

  17. Comprehensive Assessments of RNA-seq by the SEQC Consortium: FDA-Led Efforts Advance Precision Medicine.

    PubMed

    Xu, Joshua; Gong, Binsheng; Wu, Leihong; Thakkar, Shraddha; Hong, Huixiao; Tong, Weida

    2016-03-15

    Studies on gene expression in response to therapy have led to the discovery of pharmacogenomics biomarkers and advances in precision medicine. Whole transcriptome sequencing (RNA-seq) is an emerging tool for profiling gene expression and has received wide adoption in the biomedical research community. However, its value in regulatory decision making requires rigorous assessment and consensus between various stakeholders, including the research community, regulatory agencies, and industry. The FDA-led SEquencing Quality Control (SEQC) consortium has made considerable progress in this direction, and is the subject of this review. Specifically, three RNA-seq platforms (Illumina HiSeq, Life Technologies SOLiD, and Roche 454) were extensively evaluated at multiple sites to assess cross-site and cross-platform reproducibility. The results demonstrated that relative gene expression measurements were consistently comparable across labs and platforms, but not so for the measurement of absolute expression levels. As part of the quality evaluation several studies were included to evaluate the utility of RNA-seq in clinical settings and safety assessment. The neuroblastoma study profiled tumor samples from 498 pediatric neuroblastoma patients by both microarray and RNA-seq. RNA-seq offers more utilities than microarray in determining the transcriptomic characteristics of cancer. However, RNA-seq and microarray-based models were comparable in clinical endpoint prediction, even when including additional features unique to RNA-seq beyond gene expression. The toxicogenomics study compared microarray and RNA-seq profiles of the liver samples from rats exposed to 27 different chemicals representing multiple toxicity modes of action. Cross-platform concordance was dependent on chemical treatment and transcript abundance. Though both RNA-seq and microarray are suitable for developing gene expression based predictive models with comparable prediction performance, RNA-seq offers advantages over microarray in profiling genes with low expression. The rat BodyMap study provided a comprehensive rat transcriptomic body map by performing RNA-Seq on 320 samples from 11 organs in either sex of juvenile, adolescent, adult and aged Fischer 344 rats. Lastly, the transferability study demonstrated that signature genes of predictive models are reciprocally transferable between microarray and RNA-seq data for model development using a comprehensive approach with two large clinical data sets. This result suggests continued usefulness of legacy microarray data in the coming RNA-seq era. In conclusion, the SEQC project enhances our understanding of RNA-seq and provides valuable guidelines for RNA-seq based clinical application and safety evaluation to advance precision medicine.

  18. Fabrication of Carbohydrate Microarrays by Boronate Formation.

    PubMed

    Adak, Avijit K; Lin, Ting-Wei; Li, Ben-Yuan; Lin, Chun-Cheng

    2017-01-01

    The interactions between soluble carbohydrates and/or surface displayed glycans and protein receptors are essential to many biological processes and cellular recognition events. Carbohydrate microarrays provide opportunities for high-throughput quantitative analysis of carbohydrate-protein interactions. Over the past decade, various techniques have been implemented for immobilizing glycans on solid surfaces in a microarray format. Herein, we describe a detailed protocol for fabricating carbohydrate microarrays that capitalizes on the intrinsic reactivity of boronic acid toward carbohydrates to form stable boronate diesters. A large variety of unprotected carbohydrates ranging in structure from simple disaccharides and trisaccharides to considerably more complex human milk and blood group (oligo)saccharides have been covalently immobilized in a single step on glass slides, which were derivatized with high-affinity boronic acid ligands. The immobilized ligands in these microarrays maintain the receptor-binding activities including those of lectins and antibodies according to the structures of their pendant carbohydrates for rapid analysis of a number of carbohydrate-recognition events within 30 h. This method facilitates the direct construction of otherwise difficult to obtain carbohydrate microarrays from underivatized glycans.

  19. Profiling protein function with small molecule microarrays

    PubMed Central

    Winssinger, Nicolas; Ficarro, Scott; Schultz, Peter G.; Harris, Jennifer L.

    2002-01-01

    The regulation of protein function through posttranslational modification, local environment, and protein–protein interaction is critical to cellular function. The ability to analyze on a genome-wide scale protein functional activity rather than changes in protein abundance or structure would provide important new insights into complex biological processes. Herein, we report the application of a spatially addressable small molecule microarray to an activity-based profile of proteases in crude cell lysates. The potential of this small molecule-based profiling technology is demonstrated by the detection of caspase activation upon induction of apoptosis, characterization of the activated caspase, and inhibition of the caspase-executed apoptotic phenotype using the small molecule inhibitor identified in the microarray-based profile. PMID:12167675

  20. Comparative modular analysis of gene expression in vertebrate organs.

    PubMed

    Piasecka, Barbara; Kutalik, Zoltán; Roux, Julien; Bergmann, Sven; Robinson-Rechavi, Marc

    2012-03-29

    The degree of conservation of gene expression between homologous organs largely remains an open question. Several recent studies reported some evidence in favor of such conservation. Most studies compute organs' similarity across all orthologous genes, whereas the expression level of many genes are not informative about organ specificity. Here, we use a modularization algorithm to overcome this limitation through the identification of inter-species co-modules of organs and genes. We identify such co-modules using mouse and human microarray expression data. They are functionally coherent both in terms of genes and of organs from both organisms. We show that a large proportion of genes belonging to the same co-module are orthologous between mouse and human. Moreover, their zebrafish orthologs also tend to be expressed in the corresponding homologous organs. Notable exceptions to the general pattern of conservation are the testis and the olfactory bulb. Interestingly, some co-modules consist of single organs, while others combine several functionally related organs. For instance, amygdala, cerebral cortex, hypothalamus and spinal cord form a clearly discernible unit of expression, both in mouse and human. Our study provides a new framework for comparative analysis which will be applicable also to other sets of large-scale phenotypic data collected across different species.

  1. Peptidoglycan microarray as a novel tool to explore protein-ligand recognition.

    PubMed

    Wang, Ning; Hirata, Akiyoshi; Nokihara, Kiyoshi; Fukase, Koichi; Fujimoto, Yukari

    2016-11-04

    Peptidoglycan is a giant bag-shaped molecule essential for bacterial cell shape and resistance to osmotic stresses. The activity of a large number of bacterial surface proteins involved in cell growth and division requires binding to this macromolecule. Recognition of peptidoglycan by immune effectors is also crucial for the establishment of the immune response against pathogens. The availability of pure and chemically defined peptidoglycan fragments is a major technical bottleneck that has precluded systematic studies of the mechanisms underpinning protein-mediated peptidoglycan recognition. Here, we report a microarray strategy suitable to carry out comprehensive studies to characterize proteins-peptidoglycan interactions. We describe a method to introduce a functional group on peptidoglycan fragments allowing their stable immobilization on amorphous carbon chip plates to minimize nonspecific binding. Such peptidoglycan microarrays were used with a model peptidoglycan binding protein-the human peptidoglycan recognition protein-S (hPGRP-S). We propose that this strategy could be implemented to carry out high-throughput analyses to study peptidoglycan binding proteins. © 2016 Wiley Periodicals, Inc. Biopolymers (Pept Sci) 106: 422-429, 2016. © 2016 Wiley Periodicals, Inc.

  2. The emergence and diffusion of DNA microarray technology.

    PubMed

    Lenoir, Tim; Giannella, Eric

    2006-08-22

    The network model of innovation widely adopted among researchers in the economics of science and technology posits relatively porous boundaries between firms and academic research programs and a bi-directional flow of inventions, personnel, and tacit knowledge between sites of university and industry innovation. Moreover, the model suggests that these bi-directional flows should be considered as mutual stimulation of research and invention in both industry and academe, operating as a positive feedback loop. One side of this bi-directional flow--namely; the flow of inventions into industry through the licensing of university-based technologies--has been well studied; but the reverse phenomenon of the stimulation of university research through the absorption of new directions emanating from industry has yet to be investigated in much detail. We discuss the role of federal funding of academic research in the microarray field, and the multiple pathways through which federally supported development of commercial microarray technologies have transformed core academic research fields. Our study confirms the picture put forward by several scholars that the open character of networked economies is what makes them truly innovative. In an open system innovations emerge from the network. The emergence and diffusion of microarray technologies we have traced here provides an excellent example of an open system of innovation in action. Whether they originated in a startup company environment that operated like a think-tank, such as Affymax, the research labs of a large firm, such as Agilent, or within a research university, the inventors we have followed drew heavily on knowledge resources from all parts of the network in bringing microarray platforms to light. Federal funding for high-tech startups and new industrial development was important at several phases in the early history of microarrays, and federal funding of academic researchers using microarrays was fundamental to transforming the research agendas of several fields within academe. The typical story told about the role of federal funding emphasizes the spillovers from federally funded academic research to industry. Our study shows that the knowledge spillovers worked both ways, with federal funding of non-university research providing the impetus for reshaping the research agendas of several academic fields.

  3. The emergence and diffusion of DNA microarray technology

    PubMed Central

    Lenoir, Tim; Giannella, Eric

    2006-01-01

    The network model of innovation widely adopted among researchers in the economics of science and technology posits relatively porous boundaries between firms and academic research programs and a bi-directional flow of inventions, personnel, and tacit knowledge between sites of university and industry innovation. Moreover, the model suggests that these bi-directional flows should be considered as mutual stimulation of research and invention in both industry and academe, operating as a positive feedback loop. One side of this bi-directional flow – namely; the flow of inventions into industry through the licensing of university-based technologies – has been well studied; but the reverse phenomenon of the stimulation of university research through the absorption of new directions emanating from industry has yet to be investigated in much detail. We discuss the role of federal funding of academic research in the microarray field, and the multiple pathways through which federally supported development of commercial microarray technologies have transformed core academic research fields. Our study confirms the picture put forward by several scholars that the open character of networked economies is what makes them truly innovative. In an open system innovations emerge from the network. The emergence and diffusion of microarray technologies we have traced here provides an excellent example of an open system of innovation in action. Whether they originated in a startup company environment that operated like a think-tank, such as Affymax, the research labs of a large firm, such as Agilent, or within a research university, the inventors we have followed drew heavily on knowledge resources from all parts of the network in bringing microarray platforms to light. Federal funding for high-tech startups and new industrial development was important at several phases in the early history of microarrays, and federal funding of academic researchers using microarrays was fundamental to transforming the research agendas of several fields within academe. The typical story told about the role of federal funding emphasizes the spillovers from federally funded academic research to industry. Our study shows that the knowledge spillovers worked both ways, with federal funding of non-university research providing the impetus for reshaping the research agendas of several academic fields. PMID:16925816

  4. Tumor immunology.

    PubMed

    Mocellin, Simone; Lise, Mario; Nitti, Donato

    2007-01-01

    Advances in tumor immunology are supporting the clinical implementation of several immunological approaches to cancer in the clinical setting. However, the alternate success of current immunotherapeutic regimens underscores the fact that the molecular mechanisms underlying immune-mediated tumor rejection are still poorly understood. Given the complexity of the immune system network and the multidimensionality of tumor/host interactions, the comprehension of tumor immunology might greatly benefit from high-throughput microarray analysis, which can portrait the molecular kinetics of immune response on a genome-wide scale, thus accelerating the discovery pace and ultimately catalyzing the development of new hypotheses in cell biology. Although in its infancy, the implementation of microarray technology in tumor immunology studies has already provided investigators with novel data and intriguing new hypotheses on the molecular cascade leading to an effective immune response against cancer. Although the general principles of microarray-based gene profiling have rapidly spread in the scientific community, the need for mastering this technique to produce meaningful data and correctly interpret the enormous output of information generated by this technology is critical and represents a tremendous challenge for investigators, as outlined in the first section of this book. In the present Chapter, we report on some of the most significant results obtained with the application of DNA microarray in this oncology field.

  5. APPLICATION OF GENE ARRAY TECHNOLOGY IN THE RESEARCH OF CARDIOPULMONARY TOXICITY INDUCED BY PARTICULATE MATTER (PM) AND ITS CONSTITUENTS.

    EPA Science Inventory

    Because of its ability to provide a "snap-shot" view of expression of large number of genes simultaneously, the microarray technology may be a useful tool to uncover new mechanisms of toxicity. This proposal will use the state-of-the-art gene microarrays and a new bioinformatic t...

  6. Particle Radiation signals the Expression of Genes in stress-associated Pathways

    NASA Astrophysics Data System (ADS)

    Blakely, E.; Chang, P.; Bjornstad, K.; Dosanjh, M.; Cherbonnel, C.; Rosen, C.

    The explosive development of microarray screening methods has propelled genome research in a variety of biological systems allowing investigators to examine large-scale alterations in gene expression for research in toxicology pathology and therapy The radiation environment in space is complex and encompasses a variety of highly energetic and charged particles Estimation of biological responses after exposure to these types of radiation is important for NASA in their plans for long-term manned space missions Instead of using the 10 000 gene arrays that are in the marketplace we have chosen to examine particle radiation-induced changes in gene expression using a focused DNA microarray system to study the expression of about 100 genes specifically associated with both the upstream and downstream aspects of the TP53 stress-responsive pathway Genes that are regulated by TP53 include functional clusters that are implicated in cell cycle arrest apoptosis and DNA repair A cultured human lens epithelial cell model Blakely et al IOVS 41 3808 2000 was used for these studies Additional human normal and radiosensitive fibroblast cell lines have also been examined Lens cells were grown on matrix-coated substrate and exposed to 55 MeV u protons at the 88 cyclotron in LBNL or 1 GeV u Iron ions at the NASA Space Radiation Laboratory The other cells lines were grown on conventional tissue culture plasticware RNA and proteins were harvested at different times after irradiation RNA was isolated from sham-treated or select irradiated populations

  7. Large-scale atlas of microarray data reveals the distinct expression landscape of different tissues in Arabidopsis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    He, Fei; Maslov, Sergei; Yoo, Shinjae

    Here, transcriptome datasets from thousands of samples of the model plant Arabidopsis thaliana have been collectively generated by multiple individual labs. Although integration and meta-analysis of these samples has become routine in the plant research community, it is often hampered by the lack of metadata or differences in annotation styles by different labs. In this study, we carefully selected and integrated 6,057 Arabidopsis microarray expression samples from 304 experiments deposited to NCBI GEO. Metadata such as tissue type, growth condition, and developmental stage were manually curated for each sample. We then studied global expression landscape of the integrated dataset andmore » found that samples of the same tissue tend to be more similar to each other than to samples of other tissues, even in different growth conditions or developmental stages. Root has the most distinct transcriptome compared to aerial tissues, but the transcriptome of cultured root is more similar to those of aerial tissues as the former samples lost their cellular identity. Using a simple computational classification method, we showed that the tissue type of a sample can be successfully predicted based on its expression profile, opening the door for automatic metadata extraction and facilitating re-use of plant transcriptome data. As a proof of principle we applied our automated annotation pipeline to 708 RNA-seq samples from public repositories and verified accuracy of our predictions with samples’ metadata provided by authors.« less

  8. Large-scale atlas of microarray data reveals the distinct expression landscape of different tissues in Arabidopsis

    DOE PAGES

    He, Fei; Maslov, Sergei; Yoo, Shinjae; ...

    2016-05-25

    Here, transcriptome datasets from thousands of samples of the model plant Arabidopsis thaliana have been collectively generated by multiple individual labs. Although integration and meta-analysis of these samples has become routine in the plant research community, it is often hampered by the lack of metadata or differences in annotation styles by different labs. In this study, we carefully selected and integrated 6,057 Arabidopsis microarray expression samples from 304 experiments deposited to NCBI GEO. Metadata such as tissue type, growth condition, and developmental stage were manually curated for each sample. We then studied global expression landscape of the integrated dataset andmore » found that samples of the same tissue tend to be more similar to each other than to samples of other tissues, even in different growth conditions or developmental stages. Root has the most distinct transcriptome compared to aerial tissues, but the transcriptome of cultured root is more similar to those of aerial tissues as the former samples lost their cellular identity. Using a simple computational classification method, we showed that the tissue type of a sample can be successfully predicted based on its expression profile, opening the door for automatic metadata extraction and facilitating re-use of plant transcriptome data. As a proof of principle we applied our automated annotation pipeline to 708 RNA-seq samples from public repositories and verified accuracy of our predictions with samples’ metadata provided by authors.« less

  9. Differential gene transcription across the life cycle in Daphnia magna using a new all genome custom-made microarray.

    PubMed

    Campos, Bruno; Fletcher, Danielle; Piña, Benjamín; Tauler, Romà; Barata, Carlos

    2018-05-18

    Unravelling the link between genes and environment across the life cycle is a challenging goal that requires model organisms with well-characterized life-cycles, ecological interactions in nature, tractability in the laboratory, and available genomic tools. Very few well-studied invertebrate model species meet these requirements, being the waterflea Daphnia magna one of them. Here we report a full genome transcription profiling of D. magna during its life-cycle. The study was performed using a new microarray platform designed from the complete set of gene models representing the whole transcribed genome of D. magna. Up to 93% of the existing 41,317 D. magna gene models showed differential transcription patterns across the developmental stages of D. magna, 59% of which were functionally annotated. Embryos showed the highest number of unique transcribed genes, mainly related to DNA, RNA, and ribosome biogenesis, likely related to cellular proliferation and morphogenesis of the several body organs. Adult females showed an enrichment of transcripts for genes involved in reproductive processes. These female-specific transcripts were essentially absent in males, whose transcriptome was enriched in specific genes of male sexual differentiation genes, like doublesex. Our results define major characteristics of transcriptional programs involved in the life-cycle, differentiate males and females, and show that large scale gene-transcription data collected in whole animals can be used to identify genes involved in specific biological and biochemical processes.

  10. A comprehensive sensitivity analysis of microarray breast cancer classification under feature variability

    PubMed Central

    2009-01-01

    Background Large discrepancies in signature composition and outcome concordance have been observed between different microarray breast cancer expression profiling studies. This is often ascribed to differences in array platform as well as biological variability. We conjecture that other reasons for the observed discrepancies are the measurement error associated with each feature and the choice of preprocessing method. Microarray data are known to be subject to technical variation and the confidence intervals around individual point estimates of expression levels can be wide. Furthermore, the estimated expression values also vary depending on the selected preprocessing scheme. In microarray breast cancer classification studies, however, these two forms of feature variability are almost always ignored and hence their exact role is unclear. Results We have performed a comprehensive sensitivity analysis of microarray breast cancer classification under the two types of feature variability mentioned above. We used data from six state of the art preprocessing methods, using a compendium consisting of eight diferent datasets, involving 1131 hybridizations, containing data from both one and two-color array technology. For a wide range of classifiers, we performed a joint study on performance, concordance and stability. In the stability analysis we explicitly tested classifiers for their noise tolerance by using perturbed expression profiles that are based on uncertainty information directly related to the preprocessing methods. Our results indicate that signature composition is strongly influenced by feature variability, even if the array platform and the stratification of patient samples are identical. In addition, we show that there is often a high level of discordance between individual class assignments for signatures constructed on data coming from different preprocessing schemes, even if the actual signature composition is identical. Conclusion Feature variability can have a strong impact on breast cancer signature composition, as well as the classification of individual patient samples. We therefore strongly recommend that feature variability is considered in analyzing data from microarray breast cancer expression profiling experiments. PMID:19941644

  11. Self-directed student research through analysis of microarray datasets: a computer-based functional genomics practical class for masters-level students.

    PubMed

    Grenville-Briggs, Laura J; Stansfield, Ian

    2011-01-01

    This report describes a linked series of Masters-level computer practical workshops. They comprise an advanced functional genomics investigation, based upon analysis of a microarray dataset probing yeast DNA damage responses. The workshops require the students to analyse highly complex transcriptomics datasets, and were designed to stimulate active learning through experience of current research methods in bioinformatics and functional genomics. They seek to closely mimic a realistic research environment, and require the students first to propose research hypotheses, then test those hypotheses using specific sections of the microarray dataset. The complexity of the microarray data provides students with the freedom to propose their own unique hypotheses, tested using appropriate sections of the microarray data. This research latitude was highly regarded by students and is a strength of this practical. In addition, the focus on DNA damage by radiation and mutagenic chemicals allows them to place their results in a human medical context, and successfully sparks broad interest in the subject material. In evaluation, 79% of students scored the practical workshops on a five-point scale as 4 or 5 (totally effective) for student learning. More broadly, the general use of microarray data as a "student research playground" is also discussed. Copyright © 2011 Wiley Periodicals, Inc.

  12. A multiplex reverse transcription PCR and automated electronic microarray assay for detection and differentiation of seven viruses affecting swine.

    PubMed

    Erickson, A; Fisher, M; Furukawa-Stoffer, T; Ambagala, A; Hodko, D; Pasick, J; King, D P; Nfon, C; Ortega Polo, R; Lung, O

    2018-04-01

    Microarray technology can be useful for pathogen detection as it allows simultaneous interrogation of the presence or absence of a large number of genetic signatures. However, most microarray assays are labour-intensive and time-consuming to perform. This study describes the development and initial evaluation of a multiplex reverse transcription (RT)-PCR and novel accompanying automated electronic microarray assay for simultaneous detection and differentiation of seven important viruses that affect swine (foot-and-mouth disease virus [FMDV], swine vesicular disease virus [SVDV], vesicular exanthema of swine virus [VESV], African swine fever virus [ASFV], classical swine fever virus [CSFV], porcine respiratory and reproductive syndrome virus [PRRSV] and porcine circovirus type 2 [PCV2]). The novel electronic microarray assay utilizes a single, user-friendly instrument that integrates and automates capture probe printing, hybridization, washing and reporting on a disposable electronic microarray cartridge with 400 features. This assay accurately detected and identified a total of 68 isolates of the seven targeted virus species including 23 samples of FMDV, representing all seven serotypes, and 10 CSFV strains, representing all three genotypes. The assay successfully detected viruses in clinical samples from the field, experimentally infected animals (as early as 1 day post-infection (dpi) for FMDV and SVDV, 4 dpi for ASFV, 5 dpi for CSFV), as well as in biological material that were spiked with target viruses. The limit of detection was 10 copies/μl for ASFV, PCV2 and PRRSV, 100 copies/μl for SVDV, CSFV, VESV and 1,000 copies/μl for FMDV. The electronic microarray component had reduced analytical sensitivity for several of the target viruses when compared with the multiplex RT-PCR. The integration of capture probe printing allows custom onsite array printing as needed, while electrophoretically driven hybridization generates results faster than conventional microarrays that rely on passive hybridization. With further refinement, this novel, rapid, highly automated microarray technology has potential applications in multipathogen surveillance of livestock diseases. © 2017 Her Majesty the Queen in Right of Canada • Transboundary and Emerging Diseases.

  13. Gene selection for microarray data classification via subspace learning and manifold regularization.

    PubMed

    Tang, Chang; Cao, Lijuan; Zheng, Xiao; Wang, Minhui

    2017-12-19

    With the rapid development of DNA microarray technology, large amount of genomic data has been generated. Classification of these microarray data is a challenge task since gene expression data are often with thousands of genes but a small number of samples. In this paper, an effective gene selection method is proposed to select the best subset of genes for microarray data with the irrelevant and redundant genes removed. Compared with original data, the selected gene subset can benefit the classification task. We formulate the gene selection task as a manifold regularized subspace learning problem. In detail, a projection matrix is used to project the original high dimensional microarray data into a lower dimensional subspace, with the constraint that the original genes can be well represented by the selected genes. Meanwhile, the local manifold structure of original data is preserved by a Laplacian graph regularization term on the low-dimensional data space. The projection matrix can serve as an importance indicator of different genes. An iterative update algorithm is developed for solving the problem. Experimental results on six publicly available microarray datasets and one clinical dataset demonstrate that the proposed method performs better when compared with other state-of-the-art methods in terms of microarray data classification. Graphical Abstract The graphical abstract of this work.

  14. Analysis of microarray leukemia data using an efficient MapReduce-based K-nearest-neighbor classifier.

    PubMed

    Kumar, Mukesh; Rath, Nitish Kumar; Rath, Santanu Kumar

    2016-04-01

    Microarray-based gene expression profiling has emerged as an efficient technique for classification, prognosis, diagnosis, and treatment of cancer. Frequent changes in the behavior of this disease generates an enormous volume of data. Microarray data satisfies both the veracity and velocity properties of big data, as it keeps changing with time. Therefore, the analysis of microarray datasets in a small amount of time is essential. They often contain a large amount of expression, but only a fraction of it comprises genes that are significantly expressed. The precise identification of genes of interest that are responsible for causing cancer are imperative in microarray data analysis. Most existing schemes employ a two-phase process such as feature selection/extraction followed by classification. In this paper, various statistical methods (tests) based on MapReduce are proposed for selecting relevant features. After feature selection, a MapReduce-based K-nearest neighbor (mrKNN) classifier is also employed to classify microarray data. These algorithms are successfully implemented in a Hadoop framework. A comparative analysis is done on these MapReduce-based models using microarray datasets of various dimensions. From the obtained results, it is observed that these models consume much less execution time than conventional models in processing big data. Copyright © 2016 Elsevier Inc. All rights reserved.

  15. How large a training set is needed to develop a classifier for microarray data?

    PubMed

    Dobbin, Kevin K; Zhao, Yingdong; Simon, Richard M

    2008-01-01

    A common goal of gene expression microarray studies is the development of a classifier that can be used to divide patients into groups with different prognoses, or with different expected responses to a therapy. These types of classifiers are developed on a training set, which is the set of samples used to train a classifier. The question of how many samples are needed in the training set to produce a good classifier from high-dimensional microarray data is challenging. We present a model-based approach to determining the sample size required to adequately train a classifier. It is shown that sample size can be determined from three quantities: standardized fold change, class prevalence, and number of genes or features on the arrays. Numerous examples and important experimental design issues are discussed. The method is adapted to address ex post facto determination of whether the size of a training set used to develop a classifier was adequate. An interactive web site for performing the sample size calculations is provided. We showed that sample size calculations for classifier development from high-dimensional microarray data are feasible, discussed numerous important considerations, and presented examples.

  16. Evaluation of normalization methods for cDNA microarray data by k-NN classification

    PubMed Central

    Wu, Wei; Xing, Eric P; Myers, Connie; Mian, I Saira; Bissell, Mina J

    2005-01-01

    Background Non-biological factors give rise to unwanted variations in cDNA microarray data. There are many normalization methods designed to remove such variations. However, to date there have been few published systematic evaluations of these techniques for removing variations arising from dye biases in the context of downstream, higher-order analytical tasks such as classification. Results Ten location normalization methods that adjust spatial- and/or intensity-dependent dye biases, and three scale methods that adjust scale differences were applied, individually and in combination, to five distinct, published, cancer biology-related cDNA microarray data sets. Leave-one-out cross-validation (LOOCV) classification error was employed as the quantitative end-point for assessing the effectiveness of a normalization method. In particular, a known classifier, k-nearest neighbor (k-NN), was estimated from data normalized using a given technique, and the LOOCV error rate of the ensuing model was computed. We found that k-NN classifiers are sensitive to dye biases in the data. Using NONRM and GMEDIAN as baseline methods, our results show that single-bias-removal techniques which remove either spatial-dependent dye bias (referred later as spatial effect) or intensity-dependent dye bias (referred later as intensity effect) moderately reduce LOOCV classification errors; whereas double-bias-removal techniques which remove both spatial- and intensity effect reduce LOOCV classification errors even further. Of the 41 different strategies examined, three two-step processes, IGLOESS-SLFILTERW7, ISTSPLINE-SLLOESS and IGLOESS-SLLOESS, all of which removed intensity effect globally and spatial effect locally, appear to reduce LOOCV classification errors most consistently and effectively across all data sets. We also found that the investigated scale normalization methods do not reduce LOOCV classification error. Conclusion Using LOOCV error of k-NNs as the evaluation criterion, three double-bias-removal normalization strategies, IGLOESS-SLFILTERW7, ISTSPLINE-SLLOESS and IGLOESS-SLLOESS, outperform other strategies for removing spatial effect, intensity effect and scale differences from cDNA microarray data. The apparent sensitivity of k-NN LOOCV classification error to dye biases suggests that this criterion provides an informative measure for evaluating normalization methods. All the computational tools used in this study were implemented using the R language for statistical computing and graphics. PMID:16045803

  17. Evaluation of normalization methods for cDNA microarray data by k-NN classification.

    PubMed

    Wu, Wei; Xing, Eric P; Myers, Connie; Mian, I Saira; Bissell, Mina J

    2005-07-26

    Non-biological factors give rise to unwanted variations in cDNA microarray data. There are many normalization methods designed to remove such variations. However, to date there have been few published systematic evaluations of these techniques for removing variations arising from dye biases in the context of downstream, higher-order analytical tasks such as classification. Ten location normalization methods that adjust spatial- and/or intensity-dependent dye biases, and three scale methods that adjust scale differences were applied, individually and in combination, to five distinct, published, cancer biology-related cDNA microarray data sets. Leave-one-out cross-validation (LOOCV) classification error was employed as the quantitative end-point for assessing the effectiveness of a normalization method. In particular, a known classifier, k-nearest neighbor (k-NN), was estimated from data normalized using a given technique, and the LOOCV error rate of the ensuing model was computed. We found that k-NN classifiers are sensitive to dye biases in the data. Using NONRM and GMEDIAN as baseline methods, our results show that single-bias-removal techniques which remove either spatial-dependent dye bias (referred later as spatial effect) or intensity-dependent dye bias (referred later as intensity effect) moderately reduce LOOCV classification errors; whereas double-bias-removal techniques which remove both spatial- and intensity effect reduce LOOCV classification errors even further. Of the 41 different strategies examined, three two-step processes, IGLOESS-SLFILTERW7, ISTSPLINE-SLLOESS and IGLOESS-SLLOESS, all of which removed intensity effect globally and spatial effect locally, appear to reduce LOOCV classification errors most consistently and effectively across all data sets. We also found that the investigated scale normalization methods do not reduce LOOCV classification error. Using LOOCV error of k-NNs as the evaluation criterion, three double-bias-removal normalization strategies, IGLOESS-SLFILTERW7, ISTSPLINE-SLLOESS and IGLOESS-SLLOESS, outperform other strategies for removing spatial effect, intensity effect and scale differences from cDNA microarray data. The apparent sensitivity of k-NN LOOCV classification error to dye biases suggests that this criterion provides an informative measure for evaluating normalization methods. All the computational tools used in this study were implemented using the R language for statistical computing and graphics.

  18. Differential prioritization between relevance and redundancy in correlation-based feature selection techniques for multiclass gene expression data.

    PubMed

    Ooi, Chia Huey; Chetty, Madhu; Teng, Shyh Wei

    2006-06-23

    Due to the large number of genes in a typical microarray dataset, feature selection looks set to play an important role in reducing noise and computational cost in gene expression-based tissue classification while improving accuracy at the same time. Surprisingly, this does not appear to be the case for all multiclass microarray datasets. The reason is that many feature selection techniques applied on microarray datasets are either rank-based and hence do not take into account correlations between genes, or are wrapper-based, which require high computational cost, and often yield difficult-to-reproduce results. In studies where correlations between genes are considered, attempts to establish the merit of the proposed techniques are hampered by evaluation procedures which are less than meticulous, resulting in overly optimistic estimates of accuracy. We present two realistically evaluated correlation-based feature selection techniques which incorporate, in addition to the two existing criteria involved in forming a predictor set (relevance and redundancy), a third criterion called the degree of differential prioritization (DDP). DDP functions as a parameter to strike the balance between relevance and redundancy, providing our techniques with the novel ability to differentially prioritize the optimization of relevance against redundancy (and vice versa). This ability proves useful in producing optimal classification accuracy while using reasonably small predictor set sizes for nine well-known multiclass microarray datasets. For multiclass microarray datasets, especially the GCM and NCI60 datasets, DDP enables our filter-based techniques to produce accuracies better than those reported in previous studies which employed similarly realistic evaluation procedures.

  19. High-density polymer microarrays: identifying synthetic polymers that control human embryonic stem cell growth.

    PubMed

    Hansen, Anne; Mjoseng, Heidi K; Zhang, Rong; Kalloudis, Michail; Koutsos, Vasileios; de Sousa, Paul A; Bradley, Mark

    2014-06-01

    The fabrication of high-density polymer microarray is described, allowing the simultaneous and efficient evaluation of more than 7000 different polymers in a single-cellular-based screen. These high-density polymer arrays are applied in the search for synthetic substrates for hESCs culture. Up-scaling of the identified hit polymers enables long-term cellular cultivation and promoted successful stem-cell maintenance. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  20. From High-Throughput Microarray-Based Screening to Clinical Application: The Development of a Second Generation Multigene Test for Breast Cancer Prognosis

    PubMed Central

    Brase, Jan C.; Kronenwett, Ralf; Petry, Christoph; Denkert, Carsten; Schmidt, Marcus

    2013-01-01

    Several multigene tests have been developed for breast cancer patients to predict the individual risk of recurrence. Most of the first generation tests rely on proliferation-associated genes and are commonly carried out in central reference laboratories. Here, we describe the development of a second generation multigene assay, the EndoPredict test, a prognostic multigene expression test for estrogen receptor (ER) positive, human epidermal growth factor receptor (HER2) negative (ER+/HER2−) breast cancer patients. The EndoPredict gene signature was initially established in a large high-throughput microarray-based screening study. The key steps for biomarker identification are discussed in detail, in comparison to the establishment of other multigene signatures. After biomarker selection, genes and algorithms were transferred to a diagnostic platform (reverse transcription quantitative PCR (RT-qPCR)) to allow for assaying formalin-fixed, paraffin-embedded (FFPE) samples. A comprehensive analytical validation was performed and a prospective proficiency testing study with seven pathological laboratories finally proved that EndoPredict can be reliably used in the decentralized setting. Three independent large clinical validation studies (n = 2,257) demonstrated that EndoPredict offers independent prognostic information beyond current clinicopathological parameters and clinical guidelines. The review article summarizes several important steps that should be considered for the development process of a second generation multigene test and offers a means for transferring a microarray signature from the research laboratory to clinical practice. PMID:27605191

  1. Ontology-based meta-analysis of global collections of high-throughput public data.

    PubMed

    Kupershmidt, Ilya; Su, Qiaojuan Jane; Grewal, Anoop; Sundaresh, Suman; Halperin, Inbal; Flynn, James; Shekar, Mamatha; Wang, Helen; Park, Jenny; Cui, Wenwu; Wall, Gregory D; Wisotzkey, Robert; Alag, Satnam; Akhtari, Saeid; Ronaghi, Mostafa

    2010-09-29

    The investigation of the interconnections between the molecular and genetic events that govern biological systems is essential if we are to understand the development of disease and design effective novel treatments. Microarray and next-generation sequencing technologies have the potential to provide this information. However, taking full advantage of these approaches requires that biological connections be made across large quantities of highly heterogeneous genomic datasets. Leveraging the increasingly huge quantities of genomic data in the public domain is fast becoming one of the key challenges in the research community today. We have developed a novel data mining framework that enables researchers to use this growing collection of public high-throughput data to investigate any set of genes or proteins. The connectivity between molecular states across thousands of heterogeneous datasets from microarrays and other genomic platforms is determined through a combination of rank-based enrichment statistics, meta-analyses, and biomedical ontologies. We address data quality concerns through dataset replication and meta-analysis and ensure that the majority of the findings are derived using multiple lines of evidence. As an example of our strategy and the utility of this framework, we apply our data mining approach to explore the biology of brown fat within the context of the thousands of publicly available gene expression datasets. Our work presents a practical strategy for organizing, mining, and correlating global collections of large-scale genomic data to explore normal and disease biology. Using a hypothesis-free approach, we demonstrate how a data-driven analysis across very large collections of genomic data can reveal novel discoveries and evidence to support existing hypothesis.

  2. Development and application of a DNA microarray-based yeast two-hybrid system

    PubMed Central

    Suter, Bernhard; Fontaine, Jean-Fred; Yildirimman, Reha; Raskó, Tamás; Schaefer, Martin H.; Rasche, Axel; Porras, Pablo; Vázquez-Álvarez, Blanca M.; Russ, Jenny; Rau, Kirstin; Foulle, Raphaele; Zenkner, Martina; Saar, Kathrin; Herwig, Ralf; Andrade-Navarro, Miguel A.; Wanker, Erich E.

    2013-01-01

    The yeast two-hybrid (Y2H) system is the most widely applied methodology for systematic protein–protein interaction (PPI) screening and the generation of comprehensive interaction networks. We developed a novel Y2H interaction screening procedure using DNA microarrays for high-throughput quantitative PPI detection. Applying a global pooling and selection scheme to a large collection of human open reading frames, proof-of-principle Y2H interaction screens were performed for the human neurodegenerative disease proteins huntingtin and ataxin-1. Using systematic controls for unspecific Y2H results and quantitative benchmarking, we identified and scored a large number of known and novel partner proteins for both huntingtin and ataxin-1. Moreover, we show that this parallelized screening procedure and the global inspection of Y2H interaction data are uniquely suited to define specific PPI patterns and their alteration by disease-causing mutations in huntingtin and ataxin-1. This approach takes advantage of the specificity and flexibility of DNA microarrays and of the existence of solid-related statistical methods for the analysis of DNA microarray data, and allows a quantitative approach toward interaction screens in human and in model organisms. PMID:23275563

  3. Curated collection of yeast transcription factor DNA binding specificity data reveals novel structural and gene regulatory insights

    PubMed Central

    2011-01-01

    Background Transcription factors (TFs) play a central role in regulating gene expression by interacting with cis-regulatory DNA elements associated with their target genes. Recent surveys have examined the DNA binding specificities of most Saccharomyces cerevisiae TFs, but a comprehensive evaluation of their data has been lacking. Results We analyzed in vitro and in vivo TF-DNA binding data reported in previous large-scale studies to generate a comprehensive, curated resource of DNA binding specificity data for all characterized S. cerevisiae TFs. Our collection comprises DNA binding site motifs and comprehensive in vitro DNA binding specificity data for all possible 8-bp sequences. Investigation of the DNA binding specificities within the basic leucine zipper (bZIP) and VHT1 regulator (VHR) TF families revealed unexpected plasticity in TF-DNA recognition: intriguingly, the VHR TFs, newly characterized by protein binding microarrays in this study, recognize bZIP-like DNA motifs, while the bZIP TF Hac1 recognizes a motif highly similar to the canonical E-box motif of basic helix-loop-helix (bHLH) TFs. We identified several TFs with distinct primary and secondary motifs, which might be associated with different regulatory functions. Finally, integrated analysis of in vivo TF binding data with protein binding microarray data lends further support for indirect DNA binding in vivo by sequence-specific TFs. Conclusions The comprehensive data in this curated collection allow for more accurate analyses of regulatory TF-DNA interactions, in-depth structural studies of TF-DNA specificity determinants, and future experimental investigations of the TFs' predicted target genes and regulatory roles. PMID:22189060

  4. Design of 240,000 orthogonal 25mer DNA barcode probes.

    PubMed

    Xu, Qikai; Schlabach, Michael R; Hannon, Gregory J; Elledge, Stephen J

    2009-02-17

    DNA barcodes linked to genetic features greatly facilitate screening these features in pooled formats using microarray hybridization, and new tools are needed to design large sets of barcodes to allow construction of large barcoded mammalian libraries such as shRNA libraries. Here we report a framework for designing large sets of orthogonal barcode probes. We demonstrate the utility of this framework by designing 240,000 barcode probes and testing their performance by hybridization. From the test hybridizations, we also discovered new probe design rules that significantly reduce cross-hybridization after their introduction into the framework of the algorithm. These rules should improve the performance of DNA microarray probe designs for many applications.

  5. Design of 240,000 orthogonal 25mer DNA barcode probes

    PubMed Central

    Xu, Qikai; Schlabach, Michael R.; Hannon, Gregory J.; Elledge, Stephen J.

    2009-01-01

    DNA barcodes linked to genetic features greatly facilitate screening these features in pooled formats using microarray hybridization, and new tools are needed to design large sets of barcodes to allow construction of large barcoded mammalian libraries such as shRNA libraries. Here we report a framework for designing large sets of orthogonal barcode probes. We demonstrate the utility of this framework by designing 240,000 barcode probes and testing their performance by hybridization. From the test hybridizations, we also discovered new probe design rules that significantly reduce cross-hybridization after their introduction into the framework of the algorithm. These rules should improve the performance of DNA microarray probe designs for many applications. PMID:19171886

  6. Application of HLA-DRB1 genotyping by oligonucleotide micro-array technology in forensic medicine.

    PubMed

    Jiang, Bin; Li, Yao; Wu, Hai; He, Xianmin; Li, Chengtao; Li, Li; Tang, Rong; Xie, Yi; Mao, Yumin

    2006-10-16

    The human leukocyte antigen (HLA) system is known to be the most complex polymorphic system in the human genome. Among all of the HLA loci, HLA-DRB1 has the second largest number of alleles. The purpose of this study is to develop an oligonucleotide micro-array based HLA-DRB1 typing system for use in forensic identification, anthropology, tissue transplantation, and other genetic research fields. The system was developed by analyzing the HLA-DRB1 (DRB1) genotypes in 1198 unrelated healthy Chinese Han individuals originating from various parts of China and residing in Shanghai, China. Polymerase chain reaction (PCR) coupled with the oligonucleotide micro-array technology was used to detect and type HLA-DRB1 alleles of the sample individuals. The reliability, sensitivity, consistency and specificity were evaluated for use in forensic identification. Furthermore, a meta-analysis was carried out by comparing the allele frequencies of the HLA-DRB1 locus with those of other Chinese Han groups, Chinese minorities and other ethnic populations. All the DNA samples yielded a 273 bp amplification product, with no other amplification products in this length range. The minimum quantity of DNA detected by this method is 15 ng in a PCR reaction system of 25 microl. The population studied appeared to be not in Hardy-Weinberg equilibrium. Observed heterozygosity (Ho), expected heterozygosity (He), expected probability of exclusion (PE), polymorphic information content (PIC), and discrimination power (DP) of the HLA-DRB1 locus from the Shanghai Han ethnic group were evaluated to be 0.8022, 0.8870, 0.7741, 0.8771, 0.9750, respectively. A total of 25 HLA-DRB1 alleles were identified. HLA-DRB1*09XX, *04XX, *12XX and *15XX were the most frequent DRB1 alleles, which were observed in 58.76% of the sample. One hundred and sixteen genotypes were found. The five most frequent genotypes were: *04XX/*04XX (0.0626), *09XX/*09XX (0.0593), *04XX/*09XX (0.0551), *09XX/*15XX (0.0384) and *08XX/*12XX (0.0351). The meta-analysis showed that there were uniquely distributed features of DRB1 alleles among various ethnic populations and among the studied population groups from various regions with the same ethnic origin. An HLA-DRB1 genotyping system has been developed and established based on the oligonucleotide micro-array technology. The HLA-DRB1 typing of the Han population in Shanghai has revealed a relatively high heterogeneity. Information obtained in this study will be useful for medical and forensic applications as well as in anthropology research. Large-scale micro-array detection is highly accurate and reliable for DNA-based HLA-DRB1 genotyping. These results suggest that HLA-DRB1 DNA polymorphisms and the database of the Shanghai Han group have useful applications in processing forensic casework (as personal identification, paternity test), tracing population migration and genetic diagnosis.

  7. Unravelling Glucan Recognition Systems by Glycome Microarrays Using the Designer Approach and Mass Spectrometry*

    PubMed Central

    Palma, Angelina S.; Liu, Yan; Zhang, Hongtao; Zhang, Yibing; McCleary, Barry V.; Yu, Guangli; Huang, Qilin; Guidolin, Leticia S.; Ciocchini, Andres E.; Torosantucci, Antonella; Wang, Denong; Carvalho, Ana Luísa; Fontes, Carlos M. G. A.; Mulloy, Barbara; Childs, Robert A.; Feizi, Ten; Chai, Wengang

    2015-01-01

    Glucans are polymers of d-glucose with differing linkages in linear or branched sequences. They are constituents of microbial and plant cell-walls and involved in important bio-recognition processes, including immunomodulation, anticancer activities, pathogen virulence, and plant cell-wall biodegradation. Translational possibilities for these activities in medicine and biotechnology are considerable. High-throughput micro-methods are needed to screen proteins for recognition of specific glucan sequences as a lead to structure–function studies and their exploitation. We describe construction of a “glucome” microarray, the first sequence-defined glycome-scale microarray, using a “designer” approach from targeted ligand-bearing glucans in conjunction with a novel high-sensitivity mass spectrometric sequencing method, as a screening tool to assign glucan recognition motifs. The glucome microarray comprises 153 oligosaccharide probes with high purity, representing major sequences in glucans. Negative-ion electrospray tandem mass spectrometry with collision-induced dissociation was used for complete linkage analysis of gluco-oligosaccharides in linear “homo” and “hetero” and branched sequences. The system is validated using antibodies and carbohydrate-binding modules known to target α- or β-glucans in different biological contexts, extending knowledge on their specificities, and applied to reveal new information on glucan recognition by two signaling molecules of the immune system against pathogens: Dectin-1 and DC-SIGN. The sequencing of the glucan oligosaccharides by the MS method and their interrogation on the microarrays provides detailed information on linkage, sequence and chain length requirements of glucan-recognizing proteins, and are a sensitive means of revealing unsuspected sequences in the polysaccharides. PMID:25670804

  8. Transcriptome-Wide Mega-Analyses Reveal Joint Dysregulation of Immunologic Genes and Transcription Regulators in Brain and Blood in Schizophrenia

    PubMed Central

    Hess, Jonathan L.; Tylee, Daniel S.; Barve, Rahul; de Jong, Simone; Ophoff, Roel A.; Kumarasinghe, Nishantha; Tooney, Paul; Schall, Ulrich; Gardiner, Erin; Beveridge, Natalie Jane; Scott, Rodney J.; Yasawardene, Surangi; Perera, Antionette; Mendis, Jayan; Carr, Vaughan; Kelly, Brian; Cairns, Murray; Tsuang, Ming T.; Glatt, Stephen J.

    2016-01-01

    The application of microarray technology in schizophrenia research was heralded as paradigm-shifting, as it allowed for high-throughput assessment of cell and tissue function. This technology was widely adopted, initially in studies of postmortem brain tissue, and later in studies of peripheral blood. The collective body of schizophrenia microarray literature contains apparent inconsistencies between studies, with failures to replicate top hits, in part due to small sample sizes, cohort-specific effects, differences in array types, and other confounders. In an attempt to summarize existing studies of schizophrenia cases and non-related comparison subjects, we performed two mega-analyses of a combined set of microarray data from postmortem prefrontal cortices (n = 315) and from ex-vivo blood tissues (n = 578). We adjusted regression models per gene to remove non-significant covariates, providing best-estimates of transcripts dysregulated in schizophrenia. We also examined dysregulation of functionally related gene sets and gene co-expression modules, and assessed enrichment of cell types and genetic risk factors. The identities of the most significantly dysregulated genes were largely distinct for each tissue, but the findings indicated common emergent biological functions (e.g. immunity) and regulatory factors (e.g., predicted targets of transcription factors and miRNA species across tissues). Our network-based analyses converged upon similar patterns of heightened innate immune gene expression in both brain and blood in schizophrenia. We also constructed generalizable machine-learning classifiers using the blood-based microarray data. Our study provides an informative atlas for future pathophysiologic and biomarker studies of schizophrenia. PMID:27450777

  9. Transcriptome-wide mega-analyses reveal joint dysregulation of immunologic genes and transcription regulators in brain and blood in schizophrenia.

    PubMed

    Hess, Jonathan L; Tylee, Daniel S; Barve, Rahul; de Jong, Simone; Ophoff, Roel A; Kumarasinghe, Nishantha; Tooney, Paul; Schall, Ulrich; Gardiner, Erin; Beveridge, Natalie Jane; Scott, Rodney J; Yasawardene, Surangi; Perera, Antionette; Mendis, Jayan; Carr, Vaughan; Kelly, Brian; Cairns, Murray; Tsuang, Ming T; Glatt, Stephen J

    2016-10-01

    The application of microarray technology in schizophrenia research was heralded as paradigm-shifting, as it allowed for high-throughput assessment of cell and tissue function. This technology was widely adopted, initially in studies of postmortem brain tissue, and later in studies of peripheral blood. The collective body of schizophrenia microarray literature contains apparent inconsistencies between studies, with failures to replicate top hits, in part due to small sample sizes, cohort-specific effects, differences in array types, and other confounders. In an attempt to summarize existing studies of schizophrenia cases and non-related comparison subjects, we performed two mega-analyses of a combined set of microarray data from postmortem prefrontal cortices (n=315) and from ex-vivo blood tissues (n=578). We adjusted regression models per gene to remove non-significant covariates, providing best-estimates of transcripts dysregulated in schizophrenia. We also examined dysregulation of functionally related gene sets and gene co-expression modules, and assessed enrichment of cell types and genetic risk factors. The identities of the most significantly dysregulated genes were largely distinct for each tissue, but the findings indicated common emergent biological functions (e.g. immunity) and regulatory factors (e.g., predicted targets of transcription factors and miRNA species across tissues). Our network-based analyses converged upon similar patterns of heightened innate immune gene expression in both brain and blood in schizophrenia. We also constructed generalizable machine-learning classifiers using the blood-based microarray data. Our study provides an informative atlas for future pathophysiologic and biomarker studies of schizophrenia. Published by Elsevier B.V.

  10. Microarray-based screening of heat shock protein inhibitors.

    PubMed

    Schax, Emilia; Walter, Johanna-Gabriela; Märzhäuser, Helene; Stahl, Frank; Scheper, Thomas; Agard, David A; Eichner, Simone; Kirschning, Andreas; Zeilinger, Carsten

    2014-06-20

    Based on the importance of heat shock proteins (HSPs) in diseases such as cancer, Alzheimer's disease or malaria, inhibitors of these chaperons are needed. Today's state-of-the-art techniques to identify HSP inhibitors are performed in microplate format, requiring large amounts of proteins and potential inhibitors. In contrast, we have developed a miniaturized protein microarray-based assay to identify novel inhibitors, allowing analysis with 300 pmol of protein. The assay is based on competitive binding of fluorescence-labeled ATP and potential inhibitors to the ATP-binding site of HSP. Therefore, the developed microarray enables the parallel analysis of different ATP-binding proteins on a single microarray. We have demonstrated the possibility of multiplexing by immobilizing full-length human HSP90α and HtpG of Helicobacter pylori on microarrays. Fluorescence-labeled ATP was competed by novel geldanamycin/reblastatin derivatives with IC50 values in the range of 0.5 nM to 4 μM and Z(*)-factors between 0.60 and 0.96. Our results demonstrate the potential of a target-oriented multiplexed protein microarray to identify novel inhibitors for different members of the HSP90 family. Copyright © 2014 Elsevier B.V. All rights reserved.

  11. Mapping the affinity landscape of Thrombin-binding aptamers on 2΄F-ANA/DNA chimeric G-Quadruplex microarrays

    PubMed Central

    Abou Assi, Hala; Gómez-Pinto, Irene; González, Carlos

    2017-01-01

    Abstract In situ fabricated nucleic acids microarrays are versatile and very high-throughput platforms for aptamer optimization and discovery, but the chemical space that can be probed against a given target has largely been confined to DNA, while RNA and non-natural nucleic acid microarrays are still an essentially uncharted territory. 2΄-Fluoroarabinonucleic acid (2΄F-ANA) is a prime candidate for such use in microarrays. Indeed, 2΄F-ANA chemistry is readily amenable to photolithographic microarray synthesis and its potential in high affinity aptamers has been recently discovered. We thus synthesized the first microarrays containing 2΄F-ANA and 2΄F-ANA/DNA chimeric sequences to fully map the binding affinity landscape of the TBA1 thrombin-binding G-quadruplex aptamer containing all 32 768 possible DNA-to-2΄F-ANA mutations. The resulting microarray was screened against thrombin to identify a series of promising 2΄F-ANA-modified aptamer candidates with Kds significantly lower than that of the unmodified control and which were found to adopt highly stable, antiparallel-folded G-quadruplex structures. The solution structure of the TBA1 aptamer modified with 2΄F-ANA at position T3 shows that fluorine substitution preorganizes the dinucleotide loop into the proper conformation for interaction with thrombin. Overall, our work strengthens the potential of 2΄F-ANA in aptamer research and further expands non-genomic applications of nucleic acids microarrays. PMID:28100695

  12. Framework for Parallel Preprocessing of Microarray Data Using Hadoop

    PubMed Central

    2018-01-01

    Nowadays, microarray technology has become one of the popular ways to study gene expression and diagnosis of disease. National Center for Biology Information (NCBI) hosts public databases containing large volumes of biological data required to be preprocessed, since they carry high levels of noise and bias. Robust Multiarray Average (RMA) is one of the standard and popular methods that is utilized to preprocess the data and remove the noises. Most of the preprocessing algorithms are time-consuming and not able to handle a large number of datasets with thousands of experiments. Parallel processing can be used to address the above-mentioned issues. Hadoop is a well-known and ideal distributed file system framework that provides a parallel environment to run the experiment. In this research, for the first time, the capability of Hadoop and statistical power of R have been leveraged to parallelize the available preprocessing algorithm called RMA to efficiently process microarray data. The experiment has been run on cluster containing 5 nodes, while each node has 16 cores and 16 GB memory. It compares efficiency and the performance of parallelized RMA using Hadoop with parallelized RMA using affyPara package as well as sequential RMA. The result shows the speed-up rate of the proposed approach outperforms the sequential approach and affyPara approach. PMID:29796018

  13. MGDB: crossing the marker genes of a user microarray with a database of public-microarrays marker genes.

    PubMed

    Huerta, Mario; Munyi, Marc; Expósito, David; Querol, Enric; Cedano, Juan

    2014-06-15

    The microarrays performed by scientific teams grow exponentially. These microarray data could be useful for researchers around the world, but unfortunately they are underused. To fully exploit these data, it is necessary (i) to extract these data from a repository of the high-throughput gene expression data like Gene Expression Omnibus (GEO) and (ii) to make the data from different microarrays comparable with tools easy to use for scientists. We have developed these two solutions in our server, implementing a database of microarray marker genes (Marker Genes Data Base). This database contains the marker genes of all GEO microarray datasets and it is updated monthly with the new microarrays from GEO. Thus, researchers can see whether the marker genes of their microarray are marker genes in other microarrays in the database, expanding the analysis of their microarray to the rest of the public microarrays. This solution helps not only to corroborate the conclusions regarding a researcher's microarray but also to identify the phenotype of different subsets of individuals under investigation, to frame the results with microarray experiments from other species, pathologies or tissues, to search for drugs that promote the transition between the studied phenotypes, to detect undesirable side effects of the treatment applied, etc. Thus, the researcher can quickly add relevant information to his/her studies from all of the previous analyses performed in other studies as long as they have been deposited in public repositories. Marker-gene database tool: http://ibb.uab.es/mgdb © The Author 2014. Published by Oxford University Press.

  14. Privacy Preserving PCA on Distributed Bioinformatics Datasets

    ERIC Educational Resources Information Center

    Li, Xin

    2011-01-01

    In recent years, new bioinformatics technologies, such as gene expression microarray, genome-wide association study, proteomics, and metabolomics, have been widely used to simultaneously identify a huge number of human genomic/genetic biomarkers, generate a tremendously large amount of data, and dramatically increase the knowledge on human…

  15. Low-Density microarray technologies for rapid human norovirus genotyping

    USDA-ARS?s Scientific Manuscript database

    Human noroviruses (HuNoV) are the most common cause of food borne disease and viruses are likely responsible for a large proportion of foodborne diseases of unknown etiology. Recent advancements in molecular biology, bioinformatics, epidemiology, and risk analysis have aided the study of these agent...

  16. Genome-Scale Protein Microarray Comparison of Human Antibody Responses in Plasmodium vivax Relapse and Reinfection.

    PubMed

    Chuquiyauri, Raul; Molina, Douglas M; Moss, Eli L; Wang, Ruobing; Gardner, Malcolm J; Brouwer, Kimberly C; Torres, Sonia; Gilman, Robert H; Llanos-Cuentas, Alejandro; Neafsey, Daniel E; Felgner, Philip; Liang, Xiaowu; Vinetz, Joseph M

    2015-10-01

    Large scale antibody responses in Plasmodium vivax malaria remains unexplored in the endemic setting. Protein microarray analysis of asexual-stage P. vivax was used to identify antigens recognized in sera from residents of hypoendemic Peruvian Amazon. Over 24 months, of 106 participants, 91 had two symptomatic P. vivax malaria episodes, 11 had three episodes, 3 had four episodes, and 1 had five episodes. Plasmodium vivax relapse was distinguished from reinfection by a merozoite surface protein-3α restriction fragment length polymorphism polymerase chain reaction (MSP3α PCR-RFLP) assay. Notably, P. vivax reinfection subjects did not have higher reactivity to the entire set of recognized P. vivax blood-stage antigens than relapse subjects, regardless of the number of malaria episodes. The most highly recognized P. vivax proteins were MSP 4, 7, 8, and 10 (PVX_003775, PVX_082650, PVX_097625, and PVX_114145); sexual-stage antigen s16 (PVX_000930); early transcribed membrane protein (PVX_090230); tryptophan-rich antigen (Pv-fam-a) (PVX_092995); apical merozoite antigen 1 (PVX_092275); and proteins of unknown function (PVX_081830, PVX_117680, PVX_118705, PVX_121935, PVX_097730, PVX_110935, PVX_115450, and PVX_082475). Genes encoding reactive proteins exhibited a significant enrichment of non-synonymous nucleotide variation, an observation suggesting immune selection. These data identify candidates for seroepidemiological tools to support malaria elimination efforts in P. vivax-endemic regions. © The American Society of Tropical Medicine and Hygiene.

  17. Identification of reference genes for quantitative expression analysis using large-scale RNA-seq data of Arabidopsis thaliana and model crop plants.

    PubMed

    Kudo, Toru; Sasaki, Yohei; Terashima, Shin; Matsuda-Imai, Noriko; Takano, Tomoyuki; Saito, Misa; Kanno, Maasa; Ozaki, Soichi; Suwabe, Keita; Suzuki, Go; Watanabe, Masao; Matsuoka, Makoto; Takayama, Seiji; Yano, Kentaro

    2016-10-13

    In quantitative gene expression analysis, normalization using a reference gene as an internal control is frequently performed for appropriate interpretation of the results. Efforts have been devoted to exploring superior novel reference genes using microarray transcriptomic data and to evaluating commonly used reference genes by targeting analysis. However, because the number of specifically detectable genes is totally dependent on probe design in the microarray analysis, exploration using microarray data may miss some of the best choices for the reference genes. Recently emerging RNA sequencing (RNA-seq) provides an ideal resource for comprehensive exploration of reference genes since this method is capable of detecting all expressed genes, in principle including even unknown genes. We report the results of a comprehensive exploration of reference genes using public RNA-seq data from plants such as Arabidopsis thaliana (Arabidopsis), Glycine max (soybean), Solanum lycopersicum (tomato) and Oryza sativa (rice). To select reference genes suitable for the broadest experimental conditions possible, candidates were surveyed by the following four steps: (1) evaluation of the basal expression level of each gene in each experiment; (2) evaluation of the expression stability of each gene in each experiment; (3) evaluation of the expression stability of each gene across the experiments; and (4) selection of top-ranked genes, after ranking according to the number of experiments in which the gene was expressed stably. Employing this procedure, 13, 10, 12 and 21 top candidates for reference genes were proposed in Arabidopsis, soybean, tomato and rice, respectively. Microarray expression data confirmed that the expression of the proposed reference genes under broad experimental conditions was more stable than that of commonly used reference genes. These novel reference genes will be useful for analyzing gene expression profiles across experiments carried out under various experimental conditions.

  18. A biomimetic algorithm for the improved detection of microarray features

    NASA Astrophysics Data System (ADS)

    Nicolau, Dan V., Jr.; Nicolau, Dan V.; Maini, Philip K.

    2007-02-01

    One the major difficulties of microarray technology relate to the processing of large and - importantly - error-loaded images of the dots on the chip surface. Whatever the source of these errors, those obtained in the first stage of data acquisition - segmentation - are passed down to the subsequent processes, with deleterious results. As it has been demonstrated recently that biological systems have evolved algorithms that are mathematically efficient, this contribution attempts to test an algorithm that mimics a bacterial-"patented" algorithm for the search of available space and nutrients to find, "zero-in" and eventually delimitate the features existent on the microarray surface.

  19. Quantifying protein-protein interactions in high throughput using protein domain microarrays.

    PubMed

    Kaushansky, Alexis; Allen, John E; Gordus, Andrew; Stiffler, Michael A; Karp, Ethan S; Chang, Bryan H; MacBeath, Gavin

    2010-04-01

    Protein microarrays provide an efficient way to identify and quantify protein-protein interactions in high throughput. One drawback of this technique is that proteins show a broad range of physicochemical properties and are often difficult to produce recombinantly. To circumvent these problems, we have focused on families of protein interaction domains. Here we provide protocols for constructing microarrays of protein interaction domains in individual wells of 96-well microtiter plates, and for quantifying domain-peptide interactions in high throughput using fluorescently labeled synthetic peptides. As specific examples, we will describe the construction of microarrays of virtually every human Src homology 2 (SH2) and phosphotyrosine binding (PTB) domain, as well as microarrays of mouse PDZ domains, all produced recombinantly in Escherichia coli. For domains that mediate high-affinity interactions, such as SH2 and PTB domains, equilibrium dissociation constants (K(D)s) for their peptide ligands can be measured directly on arrays by obtaining saturation binding curves. For weaker binding domains, such as PDZ domains, arrays are best used to identify candidate interactions, which are then retested and quantified by fluorescence polarization. Overall, protein domain microarrays provide the ability to rapidly identify and quantify protein-ligand interactions with minimal sample consumption. Because entire domain families can be interrogated simultaneously, they provide a powerful way to assess binding selectivity on a proteome-wide scale and provide an unbiased perspective on the connectivity of protein-protein interaction networks.

  20. On hydrophilicity improvement of the porous anodic alumina film by hybrid nano/micro structuring

    NASA Astrophysics Data System (ADS)

    Wang, Weichao; Zhao, Wei; Wang, Kaige; Wang, Lei; Wang, Xuewen; Wang, Shuang; Zhang, Chen; Bai, Jintao

    2017-09-01

    In both, laboratory and industry, tremendous attention is paid to discover an effective technique to produce uniform, controllable and (super) hydrophilic surfaces over large areas that are useful in a wide range of applications. In this investigation, by combing porous anodic alumina (PAA) film with nano-structures and microarray of aluminum, the hydrophilicity of hybrid nano-micro structure has been significantly improved. It is found some factors can affect the hydrophilicity of film, such as the size and aspect ratio of microarray, the thickness of nano-PAA film etc. Comparing with pure nano-PAA films and microarray, the hybrid nano-micro structure can provide uniform surface with significantly better hydrophilicity. The improvement can be up to 84%. Also, this technique exhibits good stability and repeatability for industrial production. By optimizing the thickness of nano-PAA film and aspect ratio of micro-structures, super-hydrophilicity can be reached. This study has obvious prospect in the fields of chemical industry, biomedical engineering and lab-on-a-chip applications.

  1. Alteration of gene expression in the colon of colorectal cancer model rat by dietary sodium gluconate.

    PubMed

    Kameue, Chiyoko; Tsukahara, Takamitsu; Ushida, Kazunari

    2006-03-01

    Butyrate induces apoptosis of various cancer cell lines in a p53-independent manner and inhibits the proliferation of cancer cells. In a previous report, we reported a significant reduction in tumor incidence in rat colon as a result of dietary sodium gluconate (GNA). The stimulation of apoptosis through enhanced butyrate production in the large intestine was involved in the antitumorigenic effect of GNA. In the present study, a cDNA microarray analysis was performed to investigate the particular mechanism involved in the antitumorigenic effect of GNA. Some up-regulated genes suggested by microarray analysis were further evaluated using real-time PCR. A microarray revealed that GNA regulates the expression of retinoic acid receptor (RAR) and retinoid X receptor (RXR), and several genes known as the target of retinoids in cancer cells. In other words, the antitumorigenic effect of GNA may involve the regulation of the retinoid signaling pathway by butyrate in a retinoid-independent manner.

  2. Soybean defense responses to the soybean aphid.

    PubMed

    Li, Yan; Zou, Jijun; Li, Min; Bilgin, Damla D; Vodkin, Lila O; Hartman, Glen L; Clough, Steven J

    2008-01-01

    Transcript profiles in aphid (Aphis glycines)-resistant (cv. Dowling) and -susceptible (cv. Williams 82) soybean (Glycine max) cultivars using soybean cDNA microarrays were investigated. Large-scale soybean cDNA microarrays representing approx. 18 000 genes or c. 30% of the soybean genome were compared at 6 and 12 h post-application of aphids. In a separate experiment utilizing clip cages, expression of three defense-related genes were examined at 6, 12, 24, 48, and 72 h in both cultivars by quantitative real-time PCR. One hundred and forty genes showed specific responses for resistance; these included genes related to cell wall, defense, DNA/RNA, secondary metabolism, signaling and other processes. When an extended time period of sampling was investigated, earlier and greater induction of three defense-related genes was observed in the resistant cultivar; however, the induction declined after 24 or 48 h in the resistant cultivar but continued to increase in the susceptible cultivar after 24 h. Aphid-challenged resistant plants showed rapid differential gene expression patterns similar to the incompatible response induced by avirulent Pseudomonas syringae. Five genes were identified as differentially expressed between the two genotypes in the absence of aphids.

  3. Integrated poly(dimethysiloxane) with an intrinsic nonfouling property approaching "absolute" zero background in immunoassays.

    PubMed

    Ma, Hongwei; Wu, Yuanzi; Yang, Xiaoli; Liu, Xing; He, Jianan; Fu, Long; Wang, Jie; Xu, Hongke; Shi, Yi; Zhong, Renqian

    2010-08-01

    The key to achieve a highly sensitive and specific protein microarray assay is to prevent nonspecific protein adsorption to an "absolute" zero level because any signal amplification method will simultaneously amplify signal and noise. Here, we develop a novel solid supporting material, namely, polymer coated initiator integrated poly(dimethysiloxane) (iPDMS), which was able to achieve such "absolute" zero (i.e., below the detection limit of instrument). The implementation of this iPDMS enables practical and high-quality multiplexed enzyme-linked immunosorbent assay (ELISA) of 11 tumor markers. This iPDMS does not need any blocking steps and only require mild washing conditions. It also uses on an average 8-fold less capture antibodies compared with the mainstream nitrocellulose (NC) film. Besides saving time and materials, iPDMS achieved a limit-of-detection (LOD) as low as 19 pg mL(-1), which is sufficiently low for most current clinical diagnostic applications. We expect to see an immediate impact of this iPDMS on the realization of the great potential of protein microarray in research and practical uses such as large scale and high-throughput screening, clinical diagnosis, inspection, and quarantine.

  4. SimArray: a user-friendly and user-configurable microarray design tool

    PubMed Central

    Auburn, Richard P; Russell, Roslin R; Fischer, Bettina; Meadows, Lisa A; Sevillano Matilla, Santiago; Russell, Steven

    2006-01-01

    Background Microarrays were first developed to assess gene expression but are now also used to map protein-binding sites and to assess allelic variation between individuals. Regardless of the intended application, efficient production and appropriate array design are key determinants of experimental success. Inefficient production can make larger-scale studies prohibitively expensive, whereas poor array design makes normalisation and data analysis problematic. Results We have developed a user-friendly tool, SimArray, which generates a randomised spot layout, computes a maximum meta-grid area, and estimates the print time, in response to user-specified design decisions. Selected parameters include: the number of probes to be printed; the microtitre plate format; the printing pin configuration, and the achievable spot density. SimArray is compatible with all current robotic spotters that employ 96-, 384- or 1536-well microtitre plates, and can be configured to reflect most production environments. Print time and maximum meta-grid area estimates facilitate evaluation of each array design for its suitability. Randomisation of the spot layout facilitates correction of systematic biases by normalisation. Conclusion SimArray is intended to help both established researchers and those new to the microarray field to develop microarray designs with randomised spot layouts that are compatible with their specific production environment. SimArray is an open-source program and is available from . PMID:16509966

  5. Contributions to Statistical Problems Related to Microarray Data

    ERIC Educational Resources Information Center

    Hong, Feng

    2009-01-01

    Microarray is a high throughput technology to measure the gene expression. Analysis of microarray data brings many interesting and challenging problems. This thesis consists three studies related to microarray data. First, we propose a Bayesian model for microarray data and use Bayes Factors to identify differentially expressed genes. Second, we…

  6. Identification of rice genes associated with cosmic-ray response via co-expression gene network analysis.

    PubMed

    Hwang, Sun-Goo; Kim, Dong Sub; Hwang, Jung Eun; Han, A-Reum; Jang, Cheol Seong

    2014-05-15

    In order to better understand the biological systems that are affected in response to cosmic ray (CR), we conducted weighted gene co-expression network analysis using the module detection method. By using the Pearson's correlation coefficient (PCC) value, we evaluated complex gene-gene functional interactions between 680 CR-responsive probes from integrated microarray data sets, which included large-scale transcriptional profiling of 1000 microarray samples. These probes were divided into 6 distinct modules that contained 20 enriched gene ontology (GO) functions, such as oxidoreductase activity, hydrolase activity, and response to stimulus and stress. In particular, modules 1 and 2 commonly showed enriched annotation categories such as oxidoreductase activity, including enriched cis-regulatory elements known as ROS-specific regulators. These results suggest that the ROS-mediated irradiation response pathway is affected by CR in modules 1 and 2. We found 243 ionizing radiation (IR)-responsive probes that exhibited similarities in expression patterns in various irradiation microarray data sets. The expression patterns of 6 randomly selected IR-responsive genes were evaluated by quantitative reverse transcription polymerase chain reaction following treatment with CR, gamma rays (GR), and ion beam (IB); similar patterns were observed among these genes under these 3 treatments. Moreover, we constructed subnetworks of IR-responsive genes and evaluated the expression levels of their neighboring genes following GR treatment; similar patterns were observed among them. These results of network-based analyses might provide a clue to understanding the complex biological system related to the CR response in plants. Copyright © 2014 Elsevier B.V. All rights reserved.

  7. Development of a Schistosoma mansoni shotgun O-glycan microarray and application to the discovery of new antigenic schistosome glycan motifs.

    PubMed

    van Diepen, Angela; van der Plas, Arend-Jan; Kozak, Radoslaw P; Royle, Louise; Dunne, David W; Hokke, Cornelis H

    2015-06-01

    Upon infection with Schistosoma, antibody responses are mounted that are largely directed against glycans. Over the last few years significant progress has been made in characterising the antigenic properties of N-glycans of Schistosoma mansoni. Despite also being abundantly expressed by schistosomes, much less is understood about O-glycans and antibody responses to these have not yet been systematically analysed. Antibody binding to schistosome glycans can be analysed efficiently and quantitatively using glycan microarrays, but O-glycan array construction and exploration is lagging behind because no universal O-glycanase is available, and release of O-glycans has been dependent on chemical methods. Recently, a modified hydrazinolysis method has been developed that allows the release of O-glycans with free reducing termini and limited degradation, and we applied this method to obtain O-glycans from different S. mansoni life stages. Two-dimensional HPLC separation of 2-aminobenzoic acid-labelled O-glycans generated 362 O-glycan-containing fractions that were printed on an epoxide-modified glass slide, thereby generating the first shotgun O-glycan microarray containing naturally occurring schistosome O-glycans. Monoclonal antibodies and mass spectrometry showed that the O-glycan microarray contains well-known antigenic glycan motifs as well as numerous other, potentially novel, antibody targets. Incubations of the microarrays with sera from Schistosoma-infected humans showed substantial antibody responses to O-glycans in addition to those observed to the previously investigated N- and glycosphingolipid glycans. This underlines the importance of the inclusion of these often schistosome-specific O-glycans in glycan antigen studies and indicates that O-glycans contain novel antigenic motifs that have potential for use in diagnostic methods and studies aiming at the discovery of vaccine targets. Copyright © 2015 The Authors. Published by Elsevier Ltd.. All rights reserved.

  8. The XBabelPhish MAGE-ML and XML translator.

    PubMed

    Maier, Don; Wymore, Farrell; Sherlock, Gavin; Ball, Catherine A

    2008-01-18

    MAGE-ML has been promoted as a standard format for describing microarray experiments and the data they produce. Two characteristics of the MAGE-ML format compromise its use as a universal standard: First, MAGE-ML files are exceptionally large - too large to be easily read by most people, and often too large to be read by most software programs. Second, the MAGE-ML standard permits many ways of representing the same information. As a result, different producers of MAGE-ML create different documents describing the same experiment and its data. Recognizing all the variants is an unwieldy software engineering task, resulting in software packages that can read and process MAGE-ML from some, but not all producers. This Tower of MAGE-ML Babel bars the unencumbered exchange of microarray experiment descriptions couched in MAGE-ML. We have developed XBabelPhish - an XQuery-based technology for translating one MAGE-ML variant into another. XBabelPhish's use is not restricted to translating MAGE-ML documents. It can transform XML files independent of their DTD, XML schema, or semantic content. Moreover, it is designed to work on very large (> 200 Mb.) files, which are common in the world of MAGE-ML. XBabelPhish provides a way to inter-translate MAGE-ML variants for improved interchange of microarray experiment information. More generally, it can be used to transform most XML files, including very large ones that exceed the capacity of most XML tools.

  9. Compact cancer biomarkers discovery using a swarm intelligence feature selection algorithm.

    PubMed

    Martinez, Emmanuel; Alvarez, Mario Moises; Trevino, Victor

    2010-08-01

    Biomarker discovery is a typical application from functional genomics. Due to the large number of genes studied simultaneously in microarray data, feature selection is a key step. Swarm intelligence has emerged as a solution for the feature selection problem. However, swarm intelligence settings for feature selection fail to select small features subsets. We have proposed a swarm intelligence feature selection algorithm based on the initialization and update of only a subset of particles in the swarm. In this study, we tested our algorithm in 11 microarray datasets for brain, leukemia, lung, prostate, and others. We show that the proposed swarm intelligence algorithm successfully increase the classification accuracy and decrease the number of selected features compared to other swarm intelligence methods. Copyright © 2010 Elsevier Ltd. All rights reserved.

  10. Calibration and assessment of channel-specific biases in microarray data with extended dynamical range.

    PubMed

    Bengtsson, Henrik; Jönsson, Göran; Vallon-Christersson, Johan

    2004-11-12

    Non-linearities in observed log-ratios of gene expressions, also known as intensity dependent log-ratios, can often be accounted for by global biases in the two channels being compared. Any step in a microarray process may introduce such offsets and in this article we study the biases introduced by the microarray scanner and the image analysis software. By scanning the same spotted oligonucleotide microarray at different photomultiplier tube (PMT) gains, we have identified a channel-specific bias present in two-channel microarray data. For the scanners analyzed it was in the range of 15-25 (out of 65,535). The observed bias was very stable between subsequent scans of the same array although the PMT gain was greatly adjusted. This indicates that the bias does not originate from a step preceding the scanner detector parts. The bias varies slightly between arrays. When comparing estimates based on data from the same array, but from different scanners, we have found that different scanners introduce different amounts of bias. So do various image analysis methods. We propose a scanning protocol and a constrained affine model that allows us to identify and estimate the bias in each channel. Backward transformation removes the bias and brings the channels to the same scale. The result is that systematic effects such as intensity dependent log-ratios are removed, but also that signal densities become much more similar. The average scan, which has a larger dynamical range and greater signal-to-noise ratio than individual scans, can then be obtained. The study shows that microarray scanners may introduce a significant bias in each channel. Such biases have to be calibrated for, otherwise systematic effects such as intensity dependent log-ratios will be observed. The proposed scanning protocol and calibration method is simple to use and is useful for evaluating scanner biases or for obtaining calibrated measurements with extended dynamical range and better precision. The cross-platform R package aroma, which implements all described methods, is available for free from http://www.maths.lth.se/bioinformatics/.

  11. A Versatile Microarray Platform for Capturing Rare Cells

    NASA Astrophysics Data System (ADS)

    Brinkmann, Falko; Hirtz, Michael; Haller, Anna; Gorges, Tobias M.; Vellekoop, Michael J.; Riethdorf, Sabine; Müller, Volkmar; Pantel, Klaus; Fuchs, Harald

    2015-10-01

    Analyses of rare events occurring at extremely low frequencies in body fluids are still challenging. We established a versatile microarray-based platform able to capture single target cells from large background populations. As use case we chose the challenging application of detecting circulating tumor cells (CTCs) - about one cell in a billion normal blood cells. After incubation with an antibody cocktail, targeted cells are extracted on a microarray in a microfluidic chip. The accessibility of our platform allows for subsequent recovery of targets for further analysis. The microarray facilitates exclusion of false positive capture events by co-localization allowing for detection without fluorescent labelling. Analyzing blood samples from cancer patients with our platform reached and partly outreached gold standard performance, demonstrating feasibility for clinical application. Clinical researchers free choice of antibody cocktail without need for altered chip manufacturing or incubation protocol, allows virtual arbitrary targeting of capture species and therefore wide spread applications in biomedical sciences.

  12. Clustering approaches to identifying gene expression patterns from DNA microarray data.

    PubMed

    Do, Jin Hwan; Choi, Dong-Kug

    2008-04-30

    The analysis of microarray data is essential for large amounts of gene expression data. In this review we focus on clustering techniques. The biological rationale for this approach is the fact that many co-expressed genes are co-regulated, and identifying co-expressed genes could aid in functional annotation of novel genes, de novo identification of transcription factor binding sites and elucidation of complex biological pathways. Co-expressed genes are usually identified in microarray experiments by clustering techniques. There are many such methods, and the results obtained even for the same datasets may vary considerably depending on the algorithms and metrics for dissimilarity measures used, as well as on user-selectable parameters such as desired number of clusters and initial values. Therefore, biologists who want to interpret microarray data should be aware of the weakness and strengths of the clustering methods used. In this review, we survey the basic principles of clustering of DNA microarray data from crisp clustering algorithms such as hierarchical clustering, K-means and self-organizing maps, to complex clustering algorithms like fuzzy clustering.

  13. Specific Discrimination of Three Pathogenic Salmonella enterica subsp. enterica Serotypes by carB-Based Oligonucleotide Microarray

    PubMed Central

    Shin, Hwa Hui; Hwang, Byeong Hee; Seo, Jeong Hyun

    2014-01-01

    It is important to rapidly and selectively detect and analyze pathogenic Salmonella enterica subsp. enterica in contaminated food to reduce the morbidity and mortality of Salmonella infection and to guarantee food safety. In the present work, we developed an oligonucleotide microarray containing duplicate specific capture probes based on the carB gene, which encodes the carbamoyl phosphate synthetase large subunit, as a competent biomarker evaluated by genetic analysis to selectively and efficiently detect and discriminate three S. enterica subsp. enterica serotypes: Choleraesuis, Enteritidis, and Typhimurium. Using the developed microarray system, three serotype targets were successfully analyzed in a range as low as 1.6 to 3.1 nM and were specifically discriminated from each other without nonspecific signals. In addition, the constructed microarray did not have cross-reactivity with other common pathogenic bacteria and even enabled the clear discrimination of the target Salmonella serotype from a bacterial mixture. Therefore, these results demonstrated that our novel carB-based oligonucleotide microarray can be used as an effective and specific detection system for S. enterica subsp. enterica serotypes. PMID:24185846

  14. Specific discrimination of three pathogenic Salmonella enterica subsp. enterica serotypes by carB-based oligonucleotide microarray.

    PubMed

    Shin, Hwa Hui; Hwang, Byeong Hee; Seo, Jeong Hyun; Cha, Hyung Joon

    2014-01-01

    It is important to rapidly and selectively detect and analyze pathogenic Salmonella enterica subsp. enterica in contaminated food to reduce the morbidity and mortality of Salmonella infection and to guarantee food safety. In the present work, we developed an oligonucleotide microarray containing duplicate specific capture probes based on the carB gene, which encodes the carbamoyl phosphate synthetase large subunit, as a competent biomarker evaluated by genetic analysis to selectively and efficiently detect and discriminate three S. enterica subsp. enterica serotypes: Choleraesuis, Enteritidis, and Typhimurium. Using the developed microarray system, three serotype targets were successfully analyzed in a range as low as 1.6 to 3.1 nM and were specifically discriminated from each other without nonspecific signals. In addition, the constructed microarray did not have cross-reactivity with other common pathogenic bacteria and even enabled the clear discrimination of the target Salmonella serotype from a bacterial mixture. Therefore, these results demonstrated that our novel carB-based oligonucleotide microarray can be used as an effective and specific detection system for S. enterica subsp. enterica serotypes.

  15. Association analyses of large-scale glycan microarray data reveal novel host-specific substructures in influenza A virus binding glycans

    NASA Astrophysics Data System (ADS)

    Zhao, Nan; Martin, Brigitte E.; Yang, Chun-Kai; Luo, Feng; Wan, Xiu-Feng

    2015-10-01

    Influenza A viruses can infect a wide variety of animal species and, occasionally, humans. Infection occurs through the binding formed by viral surface glycoprotein hemagglutinin and certain types of glycan receptors on host cell membranes. Studies have shown that the α2,3-linked sialic acid motif (SA2,3Gal) in avian, equine, and canine species; the α2,6-linked sialic acid motif (SA2,6Gal) in humans; and SA2,3Gal and SA2,6Gal in swine are responsible for the corresponding host tropisms. However, more detailed and refined substructures that determine host tropisms are still not clear. Thus, in this study, we applied association mining on a set of glycan microarray data for 211 influenza viruses from five host groups: humans, swine, canine, migratory waterfowl, and terrestrial birds. The results suggest that besides Neu5Acα2-6Galβ, human-origin viruses could bind glycans with Neu5Acα2-8Neu5Acα2-8Neu5Ac and Neu5Gcα2-6Galβ1-4GlcNAc substructures; Galβ and GlcNAcβ terminal substructures, without sialic acid branches, were associated with the binding of human-, swine-, and avian-origin viruses; sulfated Neu5Acα2-3 substructures were associated with the binding of human- and swine-origin viruses. Finally, through three-dimensional structure characterization, we revealed that the role of glycan chain shapes is more important than that of torsion angles or of overall structural similarities in virus host tropisms.

  16. Mayday - integrative analytics for expression data

    PubMed Central

    2010-01-01

    Background DNA Microarrays have become the standard method for large scale analyses of gene expression and epigenomics. The increasing complexity and inherent noisiness of the generated data makes visual data exploration ever more important. Fast deployment of new methods as well as a combination of predefined, easy to apply methods with programmer's access to the data are important requirements for any analysis framework. Mayday is an open source platform with emphasis on visual data exploration and analysis. Many built-in methods for clustering, machine learning and classification are provided for dissecting complex datasets. Plugins can easily be written to extend Mayday's functionality in a large number of ways. As Java program, Mayday is platform-independent and can be used as Java WebStart application without any installation. Mayday can import data from several file formats, database connectivity is included for efficient data organization. Numerous interactive visualization tools, including box plots, profile plots, principal component plots and a heatmap are available, can be enhanced with metadata and exported as publication quality vector files. Results We have rewritten large parts of Mayday's core to make it more efficient and ready for future developments. Among the large number of new plugins are an automated processing framework, dynamic filtering, new and efficient clustering methods, a machine learning module and database connectivity. Extensive manual data analysis can be done using an inbuilt R terminal and an integrated SQL querying interface. Our visualization framework has become more powerful, new plot types have been added and existing plots improved. Conclusions We present a major extension of Mayday, a very versatile open-source framework for efficient micro array data analysis designed for biologists and bioinformaticians. Most everyday tasks are already covered. The large number of available plugins as well as the extension possibilities using compiled plugins and ad-hoc scripting allow for the rapid adaption of Mayday also to very specialized data exploration. Mayday is available at http://microarray-analysis.org. PMID:20214778

  17. Decreased triadin and increased calstabin2 expression in Great Danes with dilated cardiomyopathy.

    PubMed

    Oyama, M A; Chittur, S V; Reynolds, C A

    2009-01-01

    Dilated cardiomyopathy (DCM) is a common cardiac disease of Great Dane dogs, yet very little is known about the underlying molecular abnormalities that contribute to disease. Discover a set of genes that are differentially expressed in Great Dane dogs with DCM as a way to identify candidate genes for further study as well as to better understand the molecular abnormalities that underlie the disease. Three Great Dane dogs with end-stage DCM and 3 large breed control dogs. Prospective study. Transcriptional activity of 42,869 canine DNA sequences was determined with a canine-specific oligonucleotide microarray. Genome expression patterns of left ventricular tissue samples from affected Great Dane dogs were evaluated by measuring the relative amount of complementary RNA hybridization to the microarray probes and comparing it with expression from large breed dogs with noncardiac disease. Three hundred and twenty-three transcripts were differentially expressed (> or = 2-fold change). The transcript with the greatest degree of upregulation (+61.3-fold) was calstabin2 (FKBP12.6), whereas the transcript with the greatest degree of downregulation (-9.07-fold) was triadin. Calstabin2 and triadin are both regulatory components of the cardiac ryanodine receptor (RyR2) and are critical to normal intracellular Ca2+ release and excitation-contraction coupling. Great Dane dogs with DCM demonstrate abnormal calstabin2 and triadin expression. These changes likely affect Ca2+ flux within cardiac cells and may contribute to the pathophysiology of disease. Microarray-based analysis identifies calstabin2, triadin, and RyR2 function as targets of future study.

  18. Factorial microarray analysis of zebra mussel (Dreissena polymorpha: Dreissenidae, Bivalvia) adhesion

    PubMed Central

    2010-01-01

    Background The zebra mussel (Dreissena polymorpha) has been well known for its expertise in attaching to substances under the water. Studies in past decades on this underwater adhesion focused on the adhesive protein isolated from the byssogenesis apparatus of the zebra mussel. However, the mechanism of the initiation, maintenance, and determination of the attachment process remains largely unknown. Results In this study, we used a zebra mussel cDNA microarray previously developed in our lab and a factorial analysis to identify the genes that were involved in response to the changes of four factors: temperature (Factor A), current velocity (Factor B), dissolved oxygen (Factor C), and byssogenesis status (Factor D). Twenty probes in the microarray were found to be modified by one of the factors. The transcription products of four selected genes, DPFP-BG20_A01, EGP-BG97/192_B06, EGP-BG13_G05, and NH-BG17_C09 were unique to the zebra mussel foot based on the results of quantitative reverse transcription PCR (qRT-PCR). The expression profiles of these four genes under the attachment and non-attachment were also confirmed by qRT-PCR and the result is accordant to that from microarray assay. The in situ hybridization with the RNA probes of two identified genes DPFP-BG20_A01 and EGP-BG97/192_B06 indicated that both of them were expressed by a type of exocrine gland cell located in the middle part of the zebra mussel foot. Conclusions The results of this study suggested that the changes of D. polymorpha byssogenesis status and the environmental factors can dramatically affect the expression profiles of the genes unique to the foot. It turns out that the factorial design and analysis of the microarray experiment is a reliable method to identify the influence of multiple factors on the expression profiles of the probesets in the microarray; therein it provides a powerful tool to reveal the mechanism of zebra mussel underwater attachment. PMID:20509938

  19. Nanotechnology: moving from microarrays toward nanoarrays.

    PubMed

    Chen, Hua; Li, Jun

    2007-01-01

    Microarrays are important tools for high-throughput analysis of biomolecules. The use of microarrays for parallel screening of nucleic acid and protein profiles has become an industry standard. A few limitations of microarrays are the requirement for relatively large sample volumes and elongated incubation time, as well as the limit of detection. In addition, traditional microarrays make use of bulky instrumentation for the detection, and sample amplification and labeling are quite laborious, which increase analysis cost and delays the time for obtaining results. These problems limit microarray techniques from point-of-care and field applications. One strategy for overcoming these problems is to develop nanoarrays, particularly electronics-based nanoarrays. With further miniaturization, higher sensitivity, and simplified sample preparation, nanoarrays could potentially be employed for biomolecular analysis in personal healthcare and monitoring of trace pathogens. In this chapter, it is intended to introduce the concept and advantage of nanotechnology and then describe current methods and protocols for novel nanoarrays in three aspects: (1) label-free nucleic acids analysis using nanoarrays, (2) nanoarrays for protein detection by conventional optical fluorescence microscopy as well as by novel label-free methods such as atomic force microscopy, and (3) nanoarray for enzymatic-based assay. These nanoarrays will have significant applications in drug discovery, medical diagnosis, genetic testing, environmental monitoring, and food safety inspection.

  20. Transcriptomics as a tool for assessing the scalability of mammalian cell perfusion systems.

    PubMed

    Jayapal, Karthik P; Goudar, Chetan T

    2014-01-01

    DNA microarray-based transcriptomics have been used to determine the time course of laboratory and manufacturing-scale perfusion bioreactors in an attempt to characterize cell physiological state at these two bioreactor scales. Given the limited availability of genomic data for baby hamster kidney (BHK) cells, a Chinese hamster ovary (CHO)-based microarray was used following a feasibility assessment of cross-species hybridization. A heat shock experiment was performed using both BHK and CHO cells and resulting DNA microarray data were analyzed using a filtering criteria of perfect match (PM)/single base mismatch (MM) > 1.5 and PM-MM > 50 to exclude probes with low specificity or sensitivity for cross-species hybridizations. For BHK cells, 8910 probe sets (39 %) passed the cutoff criteria, whereas 12,961 probe sets (56 %) passed the cutoff criteria for CHO cells. Yet, the data from BHK cells allowed distinct clustering of heat shock and control samples as well as identification of biologically relevant genes as being differentially expressed, indicating the utility of cross-species hybridization. Subsequently, DNA microarray analysis was performed on time course samples from laboratory- and manufacturing-scale perfusion bioreactors that were operated under the same conditions. A majority of the variability (37 %) was associated with the first principal component (PC-1). Although PC-1 changed monotonically with culture duration, the trends were very similar in both the laboratory and manufacturing-scale bioreactors. Therefore, despite time-related changes to the cell physiological state, transcriptomic fingerprints were similar across the two bioreactor scales at any given instance in culture. Multiple genes were identified with time-course expression profiles that were very highly correlated (> 0.9) with bioprocess variables of interest. Although the current incomplete annotation limits the biological interpretation of these observations, their full potential may be realized in due course when richer genomic data become available. By taking a pragmatic approach of transcriptome fingerprinting, we have demonstrated the utility of systems biology to support the comparability of laboratory and manufacturing-scale perfusion systems. Scale-down model qualification is the first step in process characterization and hence is an integral component of robust regulatory filings. Augmenting the current paradigm, which relies primarily on cell culture and product quality information, with gene expression data can help make a substantially stronger case for similarity. With continued advances in systems biology approaches, we expect them to be seamlessly integrated into bioprocess development, which can translate into more robust and high yielding processes that can ultimately reduce cost of care for patients.

  1. WholePathwayScope: a comprehensive pathway-based analysis tool for high-throughput data

    PubMed Central

    Yi, Ming; Horton, Jay D; Cohen, Jonathan C; Hobbs, Helen H; Stephens, Robert M

    2006-01-01

    Background Analysis of High Throughput (HTP) Data such as microarray and proteomics data has provided a powerful methodology to study patterns of gene regulation at genome scale. A major unresolved problem in the post-genomic era is to assemble the large amounts of data generated into a meaningful biological context. We have developed a comprehensive software tool, WholePathwayScope (WPS), for deriving biological insights from analysis of HTP data. Result WPS extracts gene lists with shared biological themes through color cue templates. WPS statistically evaluates global functional category enrichment of gene lists and pathway-level pattern enrichment of data. WPS incorporates well-known biological pathways from KEGG (Kyoto Encyclopedia of Genes and Genomes) and Biocarta, GO (Gene Ontology) terms as well as user-defined pathways or relevant gene clusters or groups, and explores gene-term relationships within the derived gene-term association networks (GTANs). WPS simultaneously compares multiple datasets within biological contexts either as pathways or as association networks. WPS also integrates Genetic Association Database and Partial MedGene Database for disease-association information. We have used this program to analyze and compare microarray and proteomics datasets derived from a variety of biological systems. Application examples demonstrated the capacity of WPS to significantly facilitate the analysis of HTP data for integrative discovery. Conclusion This tool represents a pathway-based platform for discovery integration to maximize analysis power. The tool is freely available at . PMID:16423281

  2. Identification of suitable reference gene and biomarkers of serum miRNAs for osteoporosis

    PubMed Central

    Chen, Jian; Li, Kai; Pang, Qianqian; Yang, Chao; Zhang, Hongyu; Wu, Feng; Cao, Hongqing; Liu, Hongju; Wan, Yumin; Xia, Weibo; Wang, Jinfu; Dai, Zhongquan; Li, Yinghui

    2016-01-01

    Our objective was to identify suitable reference genes in serum miRNA for normalization and screen potential new biomarkers for osteoporosis diagnosis by a systematic study. Two types of osteoporosis models were used like as mechanical unloading and estrogen deficiency. Through a large-scale screening using microarray, qPCR validation and statistical algorithms, we first identified miR-25-3p as a suitable reference gene for both type of osteoporosis, which also showed stability during the differentiation processes of osteoblast and osteoclast. Then 15 serum miRNAs with differential expression in OVX rats were identified by microarray and qPCR validation. We further detected these 15 miRNAs in postmenopausal women and bedrest rhesus monkeys and evaluated their diagnostic value by ROC analysis. Among these miRNAs, miR-30b-5p was significantly down-regulated in postmenopausal women with osteopenia or osteoporosis; miR-103-3p, miR-142-3p, miR-328-3p were only significantly decreased in osteoporosis. They all showed positive correlations with BMD. Except miR328-3p, the other three miRNAs were also declined in the rhesus monkeys after long-duration bedrest. Their AUC values (all >0.75) proved the diagnostic potential. Our results provided a reliable normalization reference gene and verified a group of circulating miRNAs as non-invasive biomarkers in the detection of postmenopausal- and mechanical unloading- osteoporosis. PMID:27821865

  3. A Molecular Simulation Study of Antibody-Antigen Interactions on Surfaces for the Rational Design of Next-Generation Antibody Microarrays

    NASA Astrophysics Data System (ADS)

    Bush, Derek B.

    Antibody microarrays constitute a next-generation sensing platform that has the potential to revolutionize the way that molecular detection is conducted in many scientific fields. Unfortunately, current technologies have not found mainstream use because of reliability problems that undermine trust in their results. Although several factors are involved, it is believed that undesirable protein interactions with the array surface are a fundamental source of problems where little detail about the molecular-level biophysics are known. A better understanding of antibody stability and antibody-antigen binding on the array surface is needed to improve microarray technology. Despite the availability of many laboratory methods for studying protein stability and binding, these methods either do not work when the protein is attached to a surface or they do not provide the atomistic structural information that is needed to better understand protein behavior on the surface. As a result, molecular simulation has emerged as the primary method for studying proteins on surfaces because it can provide metrics and views of atomistic structures and molecular motion. Using an advanced, coarse-grain, protein-surface model this study investigated how antibodies react to and function on different types of surfaces. Three topics were addressed: (1) the stability of individual antibodies on surfaces, (2) antibody binding to small antigens while on a surface, and (3) antibody binding to large antigens while on a surface. The results indicate that immobilizing antibodies or antibody fragments in an upright orientation on a hydrophilic surface can provide the molecules with thermal stability similar to their native aqueous stability, enhance antigen binding strength, and minimize the entropic cost of binding. Furthermore, the results indicate that it is more difficult for large antigens to approach the surface than small antigens, that multiple binding sites can aid antigen binding, and that antigen flexiblity simultaneously helps and hinders the binding process as it approaches the surface. The results provide hope that next-generation microarrays and other devices decorated with proteins can be improved through rational design.

  4. Chromosomal microarray analysis as the first-tier test for the identification of pathogenic copy number variants in chromosome 9 pericentric regions and its challenge.

    PubMed

    Wang, Jia-Chi; Boyar, Fatih Z

    2016-01-01

    Chromosomal microarray analysis (CMA) has been recommended and practiced routinely in the large reference laboratories of U.S.A. as the first-tier test for the postnatal evaluation of individuals with intellectual disability, autism spectrum disorders, and/or multiple congenital anomalies. Using CMA as a diagnostic tool and without a routine setting of fluorescence in situ hybridization with labeled bacterial artificial chromosome probes (BAC-FISH) in the large reference laboratories becomes a challenge in the characterization of chromosome 9 pericentric region. This region has a very complex genomic structure and contains a variety of heterochromatic and euchromatic polymorphic variants. These variants were usually studied by G-banding, C-banding and BAC-FISH analysis. Chromosomal microarray analysis (CMA) was not recommended since it may lead to false positive results. Here, we presented a cohort of four cases, in which high-resolution CMA was used as the first-tier test or simultaneously with G-banding analysis on the proband to identify pathogenic copy number variants (CNVs) in the whole genome. CMA revealed large pathogenic CNVs from chromosome 9 in 3 cases which also revealed different G-banding patterns between the two chromosome 9 homologues. Although we demonstrated that high-resolution CMA played an important role in the identification of pathogenic copy number variants in chromosome 9 pericentric regions, the lack of BAC-FISH analysis or other useful tools renders significant challenges in the characterization of chromosome 9 pericentric regions. None; it is not a clinical trial, and the cases were retrospectively collected and analyzed.

  5. Novel calibration tools and validation concepts for microarray-based platforms used in molecular diagnostics and food safety control.

    PubMed

    Brunner, C; Hoffmann, K; Thiele, T; Schedler, U; Jehle, H; Resch-Genger, U

    2015-04-01

    Commercial platforms consisting of ready-to-use microarrays printed with target-specific DNA probes, a microarray scanner, and software for data analysis are available for different applications in medical diagnostics and food analysis, detecting, e.g., viral and bacteriological DNA sequences. The transfer of these tools from basic research to routine analysis, their broad acceptance in regulated areas, and their use in medical practice requires suitable calibration tools for regular control of instrument performance in addition to internal assay controls. Here, we present the development of a novel assay-adapted calibration slide for a commercialized DNA-based assay platform, consisting of precisely arranged fluorescent areas of various intensities obtained by incorporating different concentrations of a "green" dye and a "red" dye in a polymer matrix. These dyes present "Cy3" and "Cy5" analogues with improved photostability, chosen based upon their spectroscopic properties closely matching those of common labels for the green and red channel of microarray scanners. This simple tool allows to efficiently and regularly assess and control the performance of the microarray scanner provided with the biochip platform and to compare different scanners. It will be eventually used as fluorescence intensity scale for referencing of assays results and to enhance the overall comparability of diagnostic tests.

  6. Engineered human skin substitutes undergo large-scale genomic reprogramming and normal skin-like maturation after transplantation to athymic mice.

    PubMed

    Klingenberg, Jennifer M; McFarland, Kevin L; Friedman, Aaron J; Boyce, Steven T; Aronow, Bruce J; Supp, Dorothy M

    2010-02-01

    Bioengineered skin substitutes can facilitate wound closure in severely burned patients, but deficiencies limit their outcomes compared with native skin autografts. To identify gene programs associated with their in vivo capabilities and limitations, we extended previous gene expression profile analyses to now compare engineered skin after in vivo grafting with both in vitro maturation and normal human skin. Cultured skin substitutes were grafted on full-thickness wounds in athymic mice, and biopsy samples for microarray analyses were collected at multiple in vitro and in vivo time points. Over 10,000 transcripts exhibited large-scale expression pattern differences during in vitro and in vivo maturation. Using hierarchical clustering, 11 different expression profile clusters were partitioned on the basis of differential sample type and temporal stage-specific activation or repression. Analyses show that the wound environment exerts a massive influence on gene expression in skin substitutes. For example, in vivo-healed skin substitutes gained the expression of many native skin-expressed genes, including those associated with epidermal barrier and multiple categories of cell-cell and cell-basement membrane adhesion. In contrast, immunological, trichogenic, and endothelial gene programs were largely lacking. These analyses suggest important areas for guiding further improvement of engineered skin for both increased homology with native skin and enhanced wound healing.

  7. Perspectives of gene expression profiling for diagnosis and therapy in haematological malignancies.

    PubMed

    Bacher, Ulrike; Kohlmann, Alexander; Haferlach, Torsten

    2009-05-01

    Considering the heterogeneity of leukaemias and the widening spectrum of therapeutic strategies, novel diagnostic methods are urgently needed for haematological malignancies. For a decade, gene expression profiling (GEP) has been applied in leukaemia research. Thus, various studies demonstrated worldwide that the majority of genetically defined leukaemia subtypes are accurately predictable by GEP, for example, with respect to reciprocal rearrangements in acute myeloid leukaemia (AML). Moreover, novel prognostically relevant gene classifiers were developed as, for example, in normal karyotype AML. Considering the lymphatic malignancies, GEP studies defined novel clinically relevant subtypes in diffuse large B cell lymphoma (DLBCL), and improved the discrimination of Burkitt lymphoma and DLBCL cases, overcoming considerable overlaps of these entities that exist from morphological and genetic perspectives. Treatment-specific sensitivity assays are being developed for targeted drugs such as farnesyl transferase inhibitors in AML or imatinib in BCR-ABL1 positive acute lymphoblastic leukaemia (ALL). Irrespectively of these proceedings, an introduction of the microarray technology in haematological practice requires diagnostic algorithms and strategies for interaction with currently established diagnostic techniques. Large multicentre studies such as the MILE Study (Microarray Innovations in LEukemia) aim at translating this methodology into clinical routine workflows and to catalyze this process.

  8. Intra-Platform Repeatability and Inter-Platform Comparability of MicroRNA Microarray Technology

    PubMed Central

    Sato, Fumiaki; Tsuchiya, Soken; Terasawa, Kazuya; Tsujimoto, Gozoh

    2009-01-01

    Over the last decade, DNA microarray technology has provided a great contribution to the life sciences. The MicroArray Quality Control (MAQC) project demonstrated the way to analyze the expression microarray. Recently, microarray technology has been utilized to analyze a comprehensive microRNA expression profiling. Currently, several platforms of microRNA microarray chips are commercially available. Thus, we compared repeatability and comparability of five different microRNA microarray platforms (Agilent, Ambion, Exiqon, Invitrogen and Toray) using 309 microRNAs probes, and the Taqman microRNA system using 142 microRNA probes. This study demonstrated that microRNA microarray has high intra-platform repeatability and comparability to quantitative RT-PCR of microRNA. Among the five platforms, Agilent and Toray array showed relatively better performances than the others. However, the current lineup of commercially available microRNA microarray systems fails to show good inter-platform concordance, probably because of lack of an adequate normalization method and severe divergence in stringency of detection call criteria between different platforms. This study provided the basic information about the performance and the problems specific to the current microRNA microarray systems. PMID:19436744

  9. A highly oriented hybrid microarray modified electrode fabricated by a template-free method for ultrasensitive electrochemical DNA recognition

    NASA Astrophysics Data System (ADS)

    Shi, Lei; Chu, Zhenyu; Dong, Xueliang; Jin, Wanqin; Dempsey, Eithne

    2013-10-01

    Highly oriented growth of a hybrid microarray was realized by a facile template-free method on gold substrates for the first time. The proposed formation mechanism involves an interfacial structure-directing force arising from self-assembled monolayers (SAMs) between gold substrates and hybrid crystals. Different SAMs and variable surface coverage of the assembled molecules play a critical role in the interfacial directing forces and influence the morphologies of hybrid films. A highly oriented hybrid microarray was formed on the highly aligned and vertical SAMs of 1,4-benzenedithiol molecules with rigid backbones, which afforded an intense structure-directing power for the oriented growth of hybrid crystals. Additionally, the density of the microarray could be adjusted by controlling the surface coverage of assembled molecules. Based on the hybrid microarray modified electrode with a large specific area (ca. 10 times its geometrical area), a label-free electrochemical DNA biosensor was constructed for the detection of an oligonucleotide fragment of the avian flu virus H5N1. The DNA biosensor displayed a significantly low detection limit of 5 pM (S/N = 3), a wide linear response from 10 pM to 10 nM, as well as excellent selectivity, good regeneration and high stability. We expect that the proposed template-free method can provide a new reference for the fabrication of a highly oriented hybrid array and the as-prepared microarray modified electrode will be a promising paradigm in constructing highly sensitive and selective biosensors.Highly oriented growth of a hybrid microarray was realized by a facile template-free method on gold substrates for the first time. The proposed formation mechanism involves an interfacial structure-directing force arising from self-assembled monolayers (SAMs) between gold substrates and hybrid crystals. Different SAMs and variable surface coverage of the assembled molecules play a critical role in the interfacial directing forces and influence the morphologies of hybrid films. A highly oriented hybrid microarray was formed on the highly aligned and vertical SAMs of 1,4-benzenedithiol molecules with rigid backbones, which afforded an intense structure-directing power for the oriented growth of hybrid crystals. Additionally, the density of the microarray could be adjusted by controlling the surface coverage of assembled molecules. Based on the hybrid microarray modified electrode with a large specific area (ca. 10 times its geometrical area), a label-free electrochemical DNA biosensor was constructed for the detection of an oligonucleotide fragment of the avian flu virus H5N1. The DNA biosensor displayed a significantly low detection limit of 5 pM (S/N = 3), a wide linear response from 10 pM to 10 nM, as well as excellent selectivity, good regeneration and high stability. We expect that the proposed template-free method can provide a new reference for the fabrication of a highly oriented hybrid array and the as-prepared microarray modified electrode will be a promising paradigm in constructing highly sensitive and selective biosensors. Electronic supplementary information (ESI) available: Four-probe method for determining the conductivity of the hybrid crystal (Fig. S1); stability comparisons of the hybrid films (Fig. S2); FESEM images of the hybrid microarray (Fig. S3); electrochemical characterizations of the hybrid films (Fig. S4); DFT simulations (Fig. S5); cross-sectional FESEM image of the hybrid microarray (Fig. S6); regeneration and stability tests of the DNA biosensor (Fig. S7). See DOI: 10.1039/c3nr03097k

  10. Unique differentiation profile of mouse embryonic stem cells in rotary and stirred tank bioreactors.

    PubMed

    Fridley, Krista M; Fernandez, Irina; Li, Mon-Tzu Alice; Kettlewell, Robert B; Roy, Krishnendu

    2010-11-01

    Embryonic stem (ES)-cell-derived lineage-specific stem cells, for example, hematopoietic stem cells, could provide a potentially unlimited source for transplantable cells, especially for cell-based therapies. However, reproducible methods must be developed to maximize and scale-up ES cell differentiation to produce clinically relevant numbers of therapeutic cells. Bioreactor-based dynamic culture conditions are amenable to large-scale cell production, but few studies have evaluated how various bioreactor types and culture parameters influence ES cell differentiation, especially hematopoiesis. Our results indicate that cell seeding density and bioreactor speed significantly affect embryoid body formation and subsequent generation of hematopoietic stem and progenitor cells in both stirred tank (spinner flask) and rotary microgravity (Synthecon™) type bioreactors. In general, high percentages of hematopoietic stem and progenitor cells were generated in both bioreactors, especially at high cell densities. In addition, Synthecon bioreactors produced more sca-1(+) progenitors and spinner flasks generated more c-Kit(+) progenitors, demonstrating their unique differentiation profiles. cDNA microarray analysis of genes involved in pluripotency, germ layer formation, and hematopoietic differentiation showed that on day 7 of differentiation, embryoid bodies from both bioreactors consisted of all three germ layers of embryonic development. However, unique gene expression profiles were observed in the two bioreactors; for example, expression of specific hematopoietic genes were significantly more upregulated in the Synthecon cultures than in spinner flasks. We conclude that bioreactor type and culture parameters can be used to control ES cell differentiation, enhance unique progenitor cell populations, and provide means for large-scale production of transplantable therapeutic cells.

  11. LS Bound based gene selection for DNA microarray data.

    PubMed

    Zhou, Xin; Mao, K Z

    2005-04-15

    One problem with discriminant analysis of DNA microarray data is that each sample is represented by quite a large number of genes, and many of them are irrelevant, insignificant or redundant to the discriminant problem at hand. Methods for selecting important genes are, therefore, of much significance in microarray data analysis. In the present study, a new criterion, called LS Bound measure, is proposed to address the gene selection problem. The LS Bound measure is derived from leave-one-out procedure of LS-SVMs (least squares support vector machines), and as the upper bound for leave-one-out classification results it reflects to some extent the generalization performance of gene subsets. We applied this LS Bound measure for gene selection on two benchmark microarray datasets: colon cancer and leukemia. We also compared the LS Bound measure with other evaluation criteria, including the well-known Fisher's ratio and Mahalanobis class separability measure, and other published gene selection algorithms, including Weighting factor and SVM Recursive Feature Elimination. The strength of the LS Bound measure is that it provides gene subsets leading to more accurate classification results than the filter method while its computational complexity is at the level of the filter method. A companion website can be accessed at http://www.ntu.edu.sg/home5/pg02776030/lsbound/. The website contains: (1) the source code of the gene selection algorithm; (2) the complete set of tables and figures regarding the experimental study; (3) proof of the inequality (9). ekzmao@ntu.edu.sg.

  12. Rapid Characterization of Candidate Biomarkers for Pancreatic Cancer Using Cell Microarrays (CMAs)

    PubMed Central

    Kim, Min-Sik; Kuppireddy, Sarada V.; Sakamuri, Sruthi; Singal, Mukul; Getnet, Derese; Harsha, H. C.; Goel, Renu; Balakrishnan, Lavanya; Jacob, Harrys K. C.; Kashyap, Manoj K.; Tankala, Shantal G.; Maitra, Anirban; Iacobuzio-Donahue, Christine A.; Jaffee, Elizabeth; Goggins, Michael G.; Velculescu, Victor E.; Hruban, Ralph H.; Pandey, Akhilesh

    2013-01-01

    Tissue microarrays have become a valuable tool for high-throughput analysis using immunohistochemical labeling. However, the large majority of biochemical studies are carried out in cell lines to further characterize candidate biomarkers or therapeutic targets with subsequent studies in animals or using primary tissues. Thus, cell line-based microarrays could be a useful screening tool in some situations. Here, we constructed a cell microarray (CMA) containing a panel of 40 pancreatic cancer cell lines available from American Type Culture Collection in addition to those locally available at Johns Hopkins. As proof of principle, we performed immunocytochemical labeling of an epithelial cell adhesion molecule (Ep-CAM), a molecule generally expressed in the epithelium, on this pancreatic cancer CMA. In addition, selected molecules that have been previously shown to be differentially expressed in pancreatic cancer in the literature were validated. For example, we observed strong labeling of CA19-9 antigen, a prognostic and predictive marker for pancreatic cancer. We also carried out a bioinformatics analysis of a literature curated catalog of pancreatic cancer biomarkers developed previously by our group and identified two candidate biomarkers, HLA class I and transmembrane protease, serine 4 (TMPRSS4), and examined their expression in the cell lines represented on the pancreatic cancer CMAs. Our results demonstrate the utility of CMAs as a useful resource for rapid screening of molecules of interest and suggest that CMAs can become a universal standard platform in cancer research. PMID:22985314

  13. Assembly of 500,000 inter-specific catfish expressed sequence tags and large scale gene-associated marker development for whole genome association studies

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Catfish Genome Consortium; Wang, Shaolin; Peatman, Eric

    2010-03-23

    Background-Through the Community Sequencing Program, a catfish EST sequencing project was carried out through a collaboration between the catfish research community and the Department of Energy's Joint Genome Institute. Prior to this project, only a limited EST resource from catfish was available for the purpose of SNP identification. Results-A total of 438,321 quality ESTs were generated from 8 channel catfish (Ictalurus punctatus) and 4 blue catfish (Ictalurus furcatus) libraries, bringing the number of catfish ESTs to nearly 500,000. Assembly of all catfish ESTs resulted in 45,306 contigs and 66,272 singletons. Over 35percent of the unique sequences had significant similarities tomore » known genes, allowing the identification of 14,776 unique genes in catfish. Over 300,000 putative SNPs have been identified, of which approximately 48,000 are high-quality SNPs identified from contigs with at least four sequences and the minor allele presence of at least two sequences in the contig. The EST resource should be valuable for identification of microsatellites, genome annotation, large-scale expression analysis, and comparative genome analysis. Conclusions-This project generated a large EST resource for catfish that captured the majority of the catfish transcriptome. The parallel analysis of ESTs from two closely related Ictalurid catfishes should also provide powerful means for the evaluation of ancient and recent gene duplications, and for the development of high-density microarrays in catfish. The inter- and intra-specific SNPs identified from all catfish EST dataset assembly will greatly benefit the catfish introgression breeding program and whole genome association studies.« less

  14. Transcriptional analysis of the Arabidopsis ovule by massively parallel signature sequencing

    PubMed Central

    Sánchez-León, Nidia; Arteaga-Vázquez, Mario; Alvarez-Mejía, César; Mendiola-Soto, Javier; Durán-Figueroa, Noé; Rodríguez-Leal, Daniel; Rodríguez-Arévalo, Isaac; García-Campayo, Vicenta; García-Aguilar, Marcelina; Olmedo-Monfil, Vianey; Arteaga-Sánchez, Mario; Martínez de la Vega, Octavio; Nobuta, Kan; Vemaraju, Kalyan; Meyers, Blake C.; Vielle-Calzada, Jean-Philippe

    2012-01-01

    The life cycle of flowering plants alternates between a predominant sporophytic (diploid) and an ephemeral gametophytic (haploid) generation that only occurs in reproductive organs. In Arabidopsis thaliana, the female gametophyte is deeply embedded within the ovule, complicating the study of the genetic and molecular interactions involved in the sporophytic to gametophytic transition. Massively parallel signature sequencing (MPSS) was used to conduct a quantitative large-scale transcriptional analysis of the fully differentiated Arabidopsis ovule prior to fertilization. The expression of 9775 genes was quantified in wild-type ovules, additionally detecting >2200 new transcripts mapping to antisense or intergenic regions. A quantitative comparison of global expression in wild-type and sporocyteless (spl) individuals resulted in 1301 genes showing 25-fold reduced or null activity in ovules lacking a female gametophyte, including those encoding 92 signalling proteins, 75 transcription factors, and 72 RNA-binding proteins not reported in previous studies based on microarray profiling. A combination of independent genetic and molecular strategies confirmed the differential expression of 28 of them, showing that they are either preferentially active in the female gametophyte, or dependent on the presence of a female gametophyte to be expressed in sporophytic cells of the ovule. Among 18 genes encoding pentatricopeptide-repeat proteins (PPRs) that show transcriptional activity in wild-type but not spl ovules, CIHUATEOTL (At4g38150) is specifically expressed in the female gametophyte and necessary for female gametogenesis. These results expand the nature of the transcriptional universe present in the ovule of Arabidopsis, and offer a large-scale quantitative reference of global expression for future genomic and developmental studies. PMID:22442422

  15. Transcriptional analysis of the Arabidopsis ovule by massively parallel signature sequencing.

    PubMed

    Sánchez-León, Nidia; Arteaga-Vázquez, Mario; Alvarez-Mejía, César; Mendiola-Soto, Javier; Durán-Figueroa, Noé; Rodríguez-Leal, Daniel; Rodríguez-Arévalo, Isaac; García-Campayo, Vicenta; García-Aguilar, Marcelina; Olmedo-Monfil, Vianey; Arteaga-Sánchez, Mario; de la Vega, Octavio Martínez; Nobuta, Kan; Vemaraju, Kalyan; Meyers, Blake C; Vielle-Calzada, Jean-Philippe

    2012-06-01

    The life cycle of flowering plants alternates between a predominant sporophytic (diploid) and an ephemeral gametophytic (haploid) generation that only occurs in reproductive organs. In Arabidopsis thaliana, the female gametophyte is deeply embedded within the ovule, complicating the study of the genetic and molecular interactions involved in the sporophytic to gametophytic transition. Massively parallel signature sequencing (MPSS) was used to conduct a quantitative large-scale transcriptional analysis of the fully differentiated Arabidopsis ovule prior to fertilization. The expression of 9775 genes was quantified in wild-type ovules, additionally detecting >2200 new transcripts mapping to antisense or intergenic regions. A quantitative comparison of global expression in wild-type and sporocyteless (spl) individuals resulted in 1301 genes showing 25-fold reduced or null activity in ovules lacking a female gametophyte, including those encoding 92 signalling proteins, 75 transcription factors, and 72 RNA-binding proteins not reported in previous studies based on microarray profiling. A combination of independent genetic and molecular strategies confirmed the differential expression of 28 of them, showing that they are either preferentially active in the female gametophyte, or dependent on the presence of a female gametophyte to be expressed in sporophytic cells of the ovule. Among 18 genes encoding pentatricopeptide-repeat proteins (PPRs) that show transcriptional activity in wild-type but not spl ovules, CIHUATEOTL (At4g38150) is specifically expressed in the female gametophyte and necessary for female gametogenesis. These results expand the nature of the transcriptional universe present in the ovule of Arabidopsis, and offer a large-scale quantitative reference of global expression for future genomic and developmental studies.

  16. Towards MRI microarrays.

    PubMed

    Hall, Andrew; Mundell, Victoria J; Blanco-Andujar, Cristina; Bencsik, Martin; McHale, Glen; Newton, Michael I; Cave, Gareth W V

    2010-04-14

    Superparamagnetic iron oxide nanometre scale particles have been utilised as contrast agents to image staked target binding oligonucleotide arrays using MRI to correlate the signal intensity and T(2)* relaxation times in different NMR fluids.

  17. Power enhancement via multivariate outlier testing with gene expression arrays.

    PubMed

    Asare, Adam L; Gao, Zhong; Carey, Vincent J; Wang, Richard; Seyfert-Margolis, Vicki

    2009-01-01

    As the use of microarrays in human studies continues to increase, stringent quality assurance is necessary to ensure accurate experimental interpretation. We present a formal approach for microarray quality assessment that is based on dimension reduction of established measures of signal and noise components of expression followed by parametric multivariate outlier testing. We applied our approach to several data resources. First, as a negative control, we found that the Affymetrix and Illumina contributions to MAQC data were free from outliers at a nominal outlier flagging rate of alpha=0.01. Second, we created a tunable framework for artificially corrupting intensity data from the Affymetrix Latin Square spike-in experiment to allow investigation of sensitivity and specificity of quality assurance (QA) criteria. Third, we applied the procedure to 507 Affymetrix microarray GeneChips processed with RNA from human peripheral blood samples. We show that exclusion of arrays by this approach substantially increases inferential power, or the ability to detect differential expression, in large clinical studies. http://bioconductor.org/packages/2.3/bioc/html/arrayMvout.html and http://bioconductor.org/packages/2.3/bioc/html/affyContam.html affyContam (credentials: readonly/readonly)

  18. Maskless localized patterning of biomolecules on carbon nanotube microarray functionalized by ultrafine atmospheric pressure plasma jet using biotin-avidin system

    NASA Astrophysics Data System (ADS)

    Abuzairi, Tomy; Okada, Mitsuru; Purnamaningsih, Retno Wigajatri; Poespawati, Nji Raden; Iwata, Futoshi; Nagatsu, Masaaki

    2016-07-01

    Ultrafine plasma jet is a promising technology with great potential for nano- or micro-scale surface modification. In this letter, we demonstrated the use of ultrafine atmospheric pressure plasma jet (APPJ) for patterning bio-immobilization on vertically aligned carbon nanotube (CNT) microarray platform without a physical mask. The biotin-avidin system was utilized to demonstrate localized biomolecule patterning on the biosensor devices. Using ±7.5 kV square-wave pulses, the optimum condition of plasma jet with He/NH3 gas mixture and 2.5 s treatment period has been obtained to functionalize CNTs. The functionalized CNTs were covalently linked to biotin, bovine serum albumin (BSA), and avidin-(fluorescein isothiocyanate) FITC, sequentially. BSA was necessary as a blocking agent to protect the untreated CNTs from avidin adsorption. The localized patterning results have been evaluated from avidin-FITC fluorescence signals analyzed using a fluorescence microscope. The patterning of biomolecules on the CNT microarray platform using ultrafine APPJ provides a means for potential application of microarray biosensors based on CNTs.

  19. Approximate geodesic distances reveal biologically relevant structures in microarray data.

    PubMed

    Nilsson, Jens; Fioretos, Thoas; Höglund, Mattias; Fontes, Magnus

    2004-04-12

    Genome-wide gene expression measurements, as currently determined by the microarray technology, can be represented mathematically as points in a high-dimensional gene expression space. Genes interact with each other in regulatory networks, restricting the cellular gene expression profiles to a certain manifold, or surface, in gene expression space. To obtain knowledge about this manifold, various dimensionality reduction methods and distance metrics are used. For data points distributed on curved manifolds, a sensible distance measure would be the geodesic distance along the manifold. In this work, we examine whether an approximate geodesic distance measure captures biological similarities better than the traditionally used Euclidean distance. We computed approximate geodesic distances, determined by the Isomap algorithm, for one set of lymphoma and one set of lung cancer microarray samples. Compared with the ordinary Euclidean distance metric, this distance measure produced more instructive, biologically relevant, visualizations when applying multidimensional scaling. This suggests the Isomap algorithm as a promising tool for the interpretation of microarray data. Furthermore, the results demonstrate the benefit and importance of taking nonlinearities in gene expression data into account.

  20. The MGED Ontology: a resource for semantics-based description of microarray experiments.

    PubMed

    Whetzel, Patricia L; Parkinson, Helen; Causton, Helen C; Fan, Liju; Fostel, Jennifer; Fragoso, Gilberto; Game, Laurence; Heiskanen, Mervi; Morrison, Norman; Rocca-Serra, Philippe; Sansone, Susanna-Assunta; Taylor, Chris; White, Joseph; Stoeckert, Christian J

    2006-04-01

    The generation of large amounts of microarray data and the need to share these data bring challenges for both data management and annotation and highlights the need for standards. MIAME specifies the minimum information needed to describe a microarray experiment and the Microarray Gene Expression Object Model (MAGE-OM) and resulting MAGE-ML provide a mechanism to standardize data representation for data exchange, however a common terminology for data annotation is needed to support these standards. Here we describe the MGED Ontology (MO) developed by the Ontology Working Group of the Microarray Gene Expression Data (MGED) Society. The MO provides terms for annotating all aspects of a microarray experiment from the design of the experiment and array layout, through to the preparation of the biological sample and the protocols used to hybridize the RNA and analyze the data. The MO was developed to provide terms for annotating experiments in line with the MIAME guidelines, i.e. to provide the semantics to describe a microarray experiment according to the concepts specified in MIAME. The MO does not attempt to incorporate terms from existing ontologies, e.g. those that deal with anatomical parts or developmental stages terms, but provides a framework to reference terms in other ontologies and therefore facilitates the use of ontologies in microarray data annotation. The MGED Ontology version.1.2.0 is available as a file in both DAML and OWL formats at http://mged.sourceforge.net/ontologies/index.php. Release notes and annotation examples are provided. The MO is also provided via the NCICB's Enterprise Vocabulary System (http://nciterms.nci.nih.gov/NCIBrowser/Dictionary.do). Stoeckrt@pcbi.upenn.edu Supplementary data are available at Bioinformatics online.

  1. Methods to study legionella transcriptome in vitro and in vivo.

    PubMed

    Faucher, Sebastien P; Shuman, Howard A

    2013-01-01

    The study of transcriptome responses can provide insight into the regulatory pathways and genetic factors that contribute to a specific phenotype. For bacterial pathogens, it can identify putative new virulence systems and shed light on the mechanisms underlying the regulation of virulence factors. Microarrays have been previously used to study gene regulation in Legionella pneumophila. In the past few years a sharp reduction of the costs associated with microarray experiments together with the availability of relatively inexpensive custom-designed commercial microarrays has made microarray technology an accessible tool for the majority of researchers. Here we describe the methodologies to conduct microarray experiments from in vitro and in vivo samples.

  2. Gene Expression-Based Survival Prediction in Lung Adenocarcinoma: A Multi-Site, Blinded Validation Study

    PubMed Central

    Shedden, Kerby; Taylor, Jeremy M.G.; Enkemann, Steve A.; Tsao, Ming S.; Yeatman, Timothy J.; Gerald, William L.; Eschrich, Steve; Jurisica, Igor; Venkatraman, Seshan E.; Meyerson, Matthew; Kuick, Rork; Dobbin, Kevin K.; Lively, Tracy; Jacobson, James W.; Beer, David G.; Giordano, Thomas J.; Misek, David E.; Chang, Andrew C.; Zhu, Chang Qi; Strumpf, Dan; Hanash, Samir; Shepherd, Francis A.; Ding, Kuyue; Seymour, Lesley; Naoki, Katsuhiko; Pennell, Nathan; Weir, Barbara; Verhaak, Roel; Ladd-Acosta, Christine; Golub, Todd; Gruidl, Mike; Szoke, Janos; Zakowski, Maureen; Rusch, Valerie; Kris, Mark; Viale, Agnes; Motoi, Noriko; Travis, William; Sharma, Anupama

    2009-01-01

    Although prognostic gene expression signatures for survival in early stage lung cancer have been proposed, for clinical application it is critical to establish their performance across different subject populations and in different laboratories. Here we report a large, training-testing, multi-site blinded validation study to characterize the performance of several prognostic models based on gene expression for 442 lung adenocarcinomas. The hypotheses proposed examined whether microarray measurements of gene expression either alone or combined with basic clinical covariates (stage, age, sex) can be used to predict overall survival in lung cancer subjects. Several models examined produced risk scores that substantially correlated with actual subject outcome. Most methods performed better with clinical data, supporting the combined use of clinical and molecular information when building prognostic models for early stage lung cancer. This study also provides the largest available set of microarray data with extensive pathological and clinical annotation for lung adenocarcinomas. PMID:18641660

  3. Quantification of the epitope diversity of HIV-1-specific binding antibodies by peptide microarrays for global HIV-1 vaccine development

    DOE PAGES

    Stephenson, Kathryn E.; Neubauer, George H.; Reimer, Ulf; ...

    2014-11-14

    An effective vaccine against human immunodeficiency virus type 1 (HIV-1) will have to provide protection against a vast array of different HIV-1 strains. Current methods to measure HIV-1-specific binding antibodies following immunization typically focus on determining the magnitude of antibody responses, but the epitope diversity of antibody responses has remained largely unexplored. Here we describe the development of a global HIV-1 peptide microarray that contains 6564 peptides from across the HIV-1 proteome and covers the majority of HIV-1 sequences in the Los Alamos National Laboratory global HIV-1 sequence database. Using this microarray, we quantified the magnitude, breadth, and depth ofmore » IgG binding to linear HIV-1 sequences in HIV-1-infected humans and HIV-1-vaccinated humans, rhesus monkeys and guinea pigs. The microarray measured potentially important differences in antibody epitope diversity, particularly regarding the depth of epitope variants recognized at each binding site. Our data suggest that the global HIV-1 peptide microarray may be a useful tool for both preclinical and clinical HIV-1 research.« less

  4. Quantification of the epitope diversity of HIV-1-specific binding antibodies by peptide microarrays for global HIV-1 vaccine development

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Stephenson, Kathryn E.; Neubauer, George H.; Reimer, Ulf

    An effective vaccine against human immunodeficiency virus type 1 (HIV-1) will have to provide protection against a vast array of different HIV-1 strains. Current methods to measure HIV-1-specific binding antibodies following immunization typically focus on determining the magnitude of antibody responses, but the epitope diversity of antibody responses has remained largely unexplored. Here we describe the development of a global HIV-1 peptide microarray that contains 6564 peptides from across the HIV-1 proteome and covers the majority of HIV-1 sequences in the Los Alamos National Laboratory global HIV-1 sequence database. Using this microarray, we quantified the magnitude, breadth, and depth ofmore » IgG binding to linear HIV-1 sequences in HIV-1-infected humans and HIV-1-vaccinated humans, rhesus monkeys and guinea pigs. The microarray measured potentially important differences in antibody epitope diversity, particularly regarding the depth of epitope variants recognized at each binding site. Our data suggest that the global HIV-1 peptide microarray may be a useful tool for both preclinical and clinical HIV-1 research.« less

  5. Genome Consortium for Active Teaching: Meeting the Goals of BIO2010

    PubMed Central

    Ledbetter, Mary Lee S.; Hoopes, Laura L.M.; Eckdahl, Todd T.; Heyer, Laurie J.; Rosenwald, Anne; Fowlks, Edison; Tonidandel, Scott; Bucholtz, Brooke; Gottfried, Gail

    2007-01-01

    The Genome Consortium for Active Teaching (GCAT) facilitates the use of modern genomics methods in undergraduate education. Initially focused on microarray technology, but with an eye toward diversification, GCAT is a community working to improve the education of tomorrow's life science professionals. GCAT participants have access to affordable microarrays, microarray scanners, free software for data analysis, and faculty workshops. Microarrays provided by GCAT have been used by 141 faculty on 134 campuses, including 21 faculty that serve large numbers of underrepresented minority students. An estimated 9480 undergraduates a year will have access to microarrays by 2009 as a direct result of GCAT faculty workshops. Gains for students include significantly improved comprehension of topics in functional genomics and increased interest in research. Faculty reported improved access to new technology and gains in understanding thanks to their involvement with GCAT. GCAT's network of supportive colleagues encourages faculty to explore genomics through student research and to learn a new and complex method with their undergraduates. GCAT is meeting important goals of BIO2010 by making research methods accessible to undergraduates, training faculty in genomics and bioinformatics, integrating mathematics into the biology curriculum, and increasing participation by underrepresented minority students. PMID:17548873

  6. Genome Consortium for Active Teaching: meeting the goals of BIO2010.

    PubMed

    Campbell, A Malcolm; Ledbetter, Mary Lee S; Hoopes, Laura L M; Eckdahl, Todd T; Heyer, Laurie J; Rosenwald, Anne; Fowlks, Edison; Tonidandel, Scott; Bucholtz, Brooke; Gottfried, Gail

    2007-01-01

    The Genome Consortium for Active Teaching (GCAT) facilitates the use of modern genomics methods in undergraduate education. Initially focused on microarray technology, but with an eye toward diversification, GCAT is a community working to improve the education of tomorrow's life science professionals. GCAT participants have access to affordable microarrays, microarray scanners, free software for data analysis, and faculty workshops. Microarrays provided by GCAT have been used by 141 faculty on 134 campuses, including 21 faculty that serve large numbers of underrepresented minority students. An estimated 9480 undergraduates a year will have access to microarrays by 2009 as a direct result of GCAT faculty workshops. Gains for students include significantly improved comprehension of topics in functional genomics and increased interest in research. Faculty reported improved access to new technology and gains in understanding thanks to their involvement with GCAT. GCAT's network of supportive colleagues encourages faculty to explore genomics through student research and to learn a new and complex method with their undergraduates. GCAT is meeting important goals of BIO2010 by making research methods accessible to undergraduates, training faculty in genomics and bioinformatics, integrating mathematics into the biology curriculum, and increasing participation by underrepresented minority students.

  7. Efficient computation of optimal oligo-RNA binding.

    PubMed

    Hodas, Nathan O; Aalberts, Daniel P

    2004-01-01

    We present an algorithm that calculates the optimal binding conformation and free energy of two RNA molecules, one or both oligomeric. This algorithm has applications to modeling DNA microarrays, RNA splice-site recognitions and other antisense problems. Although other recent algorithms perform the same calculation in time proportional to the sum of the lengths cubed, O((N1 + N2)3), our oligomer binding algorithm, called bindigo, scales as the product of the sequence lengths, O(N1*N2). The algorithm performs well in practice with the aid of a heuristic for large asymmetric loops. To demonstrate its speed and utility, we use bindigo to investigate the binding proclivities of U1 snRNA to mRNA donor splice sites.

  8. Reverse phase protein microarrays: fluorometric and colorimetric detection.

    PubMed

    Gallagher, Rosa I; Silvestri, Alessandra; Petricoin, Emanuel F; Liotta, Lance A; Espina, Virginia

    2011-01-01

    The Reverse Phase Protein Microarray (RPMA) is an array platform used to quantitate proteins and their posttranslationally modified forms. RPMAs are applicable for profiling key cellular signaling pathways and protein networks, allowing direct comparison of the activation state of proteins from multiple samples within the same array. The RPMA format consists of proteins immobilized directly on a nitrocellulose substratum. The analyte is subsequently probed with a primary antibody and a series of reagents for signal amplification and detection. Due to the diversity, low concentration, and large dynamic range of protein analytes, RPMAs require stringent signal amplification methods, high quality image acquisition, and software capable of precisely analyzing spot intensities on an array. Microarray detection strategies can be either fluorescent or colorimetric. The choice of a detection system depends on (a) the expected analyte concentration, (b) type of microarray imaging system, and (c) type of sample. The focus of this chapter is to describe RPMA detection and imaging using fluorescent and colorimetric (diaminobenzidine (DAB)) methods.

  9. Glycome Diagnosis of Human Induced Pluripotent Stem Cells Using Lectin Microarray*

    PubMed Central

    Tateno, Hiroaki; Toyota, Masashi; Saito, Shigeru; Onuma, Yasuko; Ito, Yuzuru; Hiemori, Keiko; Fukumura, Mihoko; Matsushima, Asako; Nakanishi, Mio; Ohnuma, Kiyoshi; Akutsu, Hidenori; Umezawa, Akihiro; Horimoto, Katsuhisa; Hirabayashi, Jun; Asashima, Makoto

    2011-01-01

    Induced pluripotent stem cells (iPSCs) can now be produced from various somatic cell (SC) lines by ectopic expression of the four transcription factors. Although the procedure has been demonstrated to induce global change in gene and microRNA expressions and even epigenetic modification, it remains largely unknown how this transcription factor-induced reprogramming affects the total glycan repertoire expressed on the cells. Here we performed a comprehensive glycan analysis using 114 types of human iPSCs generated from five different SCs and compared their glycomes with those of human embryonic stem cells (ESCs; nine cell types) using a high density lectin microarray. In unsupervised cluster analysis of the results obtained by lectin microarray, both undifferentiated iPSCs and ESCs were clustered as one large group. However, they were clearly separated from the group of differentiated SCs, whereas all of the four SCs had apparently distinct glycome profiles from one another, demonstrating that SCs with originally distinct glycan profiles have acquired those similar to ESCs upon induction of pluripotency. Thirty-eight lectins discriminating between SCs and iPSCs/ESCs were statistically selected, and characteristic features of the pluripotent state were then obtained at the level of the cellular glycome. The expression profiles of relevant glycosyltransferase genes agreed well with the results obtained by lectin microarray. Among the 38 lectins, rBC2LCN was found to detect only undifferentiated iPSCs/ESCs and not differentiated SCs. Hence, the high density lectin microarray has proved to be valid for not only comprehensive analysis of glycans but also diagnosis of stem cells under the concept of the cellular glycome. PMID:21471226

  10. Identification of new autoantigens for primary biliary cirrhosis using human proteome microarrays.

    PubMed

    Hu, Chao-Jun; Song, Guang; Huang, Wei; Liu, Guo-Zhen; Deng, Chui-Wen; Zeng, Hai-Pan; Wang, Li; Zhang, Feng-Chun; Zhang, Xuan; Jeong, Jun Seop; Blackshaw, Seth; Jiang, Li-Zhi; Zhu, Heng; Wu, Lin; Li, Yong-Zhe

    2012-09-01

    Primary biliary cirrhosis (PBC) is a chronic cholestatic liver disease of unknown etiology and is considered to be an autoimmune disease. Autoantibodies are important tools for accurate diagnosis of PBC. Here, we employed serum profiling analysis using a human proteome microarray composed of about 17,000 full-length unique proteins and identified 23 proteins that correlated with PBC. To validate these results, we fabricated a PBC-focused microarray with 21 of these newly identified candidates and nine additional known PBC antigens. By screening the PBC microarrays with additional cohorts of 191 PBC patients and 321 controls (43 autoimmune hepatitis, 55 hepatitis B virus, 31 hepatitis C virus, 48 rheumatoid arthritis, 45 systematic lupus erythematosus, 49 systemic sclerosis, and 50 healthy), six proteins were confirmed as novel PBC autoantigens with high sensitivities and specificities, including hexokinase-1 (isoforms I and II), Kelch-like protein 7, Kelch-like protein 12, zinc finger and BTB domain-containing protein 2, and eukaryotic translation initiation factor 2C, subunit 1. To facilitate clinical diagnosis, we developed ELISA for Kelch-like protein 12 and zinc finger and BTB domain-containing protein 2 and tested large cohorts (297 PBC and 637 control sera) to confirm the sensitivities and specificities observed in the microarray-based assays. In conclusion, our research showed that a strategy using high content protein microarray combined with a smaller but more focused protein microarray can effectively identify and validate novel PBC-specific autoantigens and has the capacity to be translated to clinical diagnosis by means of an ELISA-based method.

  11. Identifying Fishes through DNA Barcodes and Microarrays.

    PubMed

    Kochzius, Marc; Seidel, Christian; Antoniou, Aglaia; Botla, Sandeep Kumar; Campo, Daniel; Cariani, Alessia; Vazquez, Eva Garcia; Hauschild, Janet; Hervet, Caroline; Hjörleifsdottir, Sigridur; Hreggvidsson, Gudmundur; Kappel, Kristina; Landi, Monica; Magoulas, Antonios; Marteinsson, Viggo; Nölte, Manfred; Planes, Serge; Tinti, Fausto; Turan, Cemal; Venugopal, Moleyur N; Weber, Hannes; Blohm, Dietmar

    2010-09-07

    International fish trade reached an import value of 62.8 billion Euro in 2006, of which 44.6% are covered by the European Union. Species identification is a key problem throughout the life cycle of fishes: from eggs and larvae to adults in fisheries research and control, as well as processed fish products in consumer protection. This study aims to evaluate the applicability of the three mitochondrial genes 16S rRNA (16S), cytochrome b (cyt b), and cytochrome oxidase subunit I (COI) for the identification of 50 European marine fish species by combining techniques of "DNA barcoding" and microarrays. In a DNA barcoding approach, neighbour Joining (NJ) phylogenetic trees of 369 16S, 212 cyt b, and 447 COI sequences indicated that cyt b and COI are suitable for unambiguous identification, whereas 16S failed to discriminate closely related flatfish and gurnard species. In course of probe design for DNA microarray development, each of the markers yielded a high number of potentially species-specific probes in silico, although many of them were rejected based on microarray hybridisation experiments. None of the markers provided probes to discriminate the sibling flatfish and gurnard species. However, since 16S-probes were less negatively influenced by the "position of label" effect and showed the lowest rejection rate and the highest mean signal intensity, 16S is more suitable for DNA microarray probe design than cty b and COI. The large portion of rejected COI-probes after hybridisation experiments (>90%) renders the DNA barcoding marker as rather unsuitable for this high-throughput technology. Based on these data, a DNA microarray containing 64 functional oligonucleotide probes for the identification of 30 out of the 50 fish species investigated was developed. It represents the next step towards an automated and easy-to-handle method to identify fish, ichthyoplankton, and fish products.

  12. Connectivity Mapping for Candidate Therapeutics Identification Using Next Generation Sequencing RNA-Seq Data

    PubMed Central

    McArt, Darragh G.; Dunne, Philip D.; Blayney, Jaine K.; Salto-Tellez, Manuel; Van Schaeybroeck, Sandra; Hamilton, Peter W.; Zhang, Shu-Dong

    2013-01-01

    The advent of next generation sequencing technologies (NGS) has expanded the area of genomic research, offering high coverage and increased sensitivity over older microarray platforms. Although the current cost of next generation sequencing is still exceeding that of microarray approaches, the rapid advances in NGS will likely make it the platform of choice for future research in differential gene expression. Connectivity mapping is a procedure for examining the connections among diseases, genes and drugs by differential gene expression initially based on microarray technology, with which a large collection of compound-induced reference gene expression profiles have been accumulated. In this work, we aim to test the feasibility of incorporating NGS RNA-Seq data into the current connectivity mapping framework by utilizing the microarray based reference profiles and the construction of a differentially expressed gene signature from a NGS dataset. This would allow for the establishment of connections between the NGS gene signature and those microarray reference profiles, alleviating the associated incurring cost of re-creating drug profiles with NGS technology. We examined the connectivity mapping approach on a publicly available NGS dataset with androgen stimulation of LNCaP cells in order to extract candidate compounds that could inhibit the proliferative phenotype of LNCaP cells and to elucidate their potential in a laboratory setting. In addition, we also analyzed an independent microarray dataset of similar experimental settings. We found a high level of concordance between the top compounds identified using the gene signatures from the two datasets. The nicotine derivative cotinine was returned as the top candidate among the overlapping compounds with potential to suppress this proliferative phenotype. Subsequent lab experiments validated this connectivity mapping hit, showing that cotinine inhibits cell proliferation in an androgen dependent manner. Thus the results in this study suggest a promising prospect of integrating NGS data with connectivity mapping. PMID:23840550

  13. Consistency of biological networks inferred from microarray and sequencing data.

    PubMed

    Vinciotti, Veronica; Wit, Ernst C; Jansen, Rick; de Geus, Eco J C N; Penninx, Brenda W J H; Boomsma, Dorret I; 't Hoen, Peter A C

    2016-06-24

    Sparse Gaussian graphical models are popular for inferring biological networks, such as gene regulatory networks. In this paper, we investigate the consistency of these models across different data platforms, such as microarray and next generation sequencing, on the basis of a rich dataset containing samples that are profiled under both techniques as well as a large set of independent samples. Our analysis shows that individual node variances can have a remarkable effect on the connectivity of the resulting network. Their inconsistency across platforms and the fact that the variability level of a node may not be linked to its regulatory role mean that, failing to scale the data prior to the network analysis, leads to networks that are not reproducible across different platforms and that may be misleading. Moreover, we show how the reproducibility of networks across different platforms is significantly higher if networks are summarised in terms of enrichment amongst functional groups of interest, such as pathways, rather than at the level of individual edges. Careful pre-processing of transcriptional data and summaries of networks beyond individual edges can improve the consistency of network inference across platforms. However, caution is needed at this stage in the (over)interpretation of gene regulatory networks inferred from biological data.

  14. Generation and analyses of human synthetic antibody libraries and their application for protein microarrays.

    PubMed

    Säll, Anna; Walle, Maria; Wingren, Christer; Müller, Susanne; Nyman, Tomas; Vala, Andrea; Ohlin, Mats; Borrebaeck, Carl A K; Persson, Helena

    2016-10-01

    Antibody-based proteomics offers distinct advantages in the analysis of complex samples for discovery and validation of biomarkers associated with disease. However, its large-scale implementation requires tools and technologies that allow development of suitable antibody or antibody fragments in a high-throughput manner. To address this we designed and constructed two human synthetic antibody fragment (scFv) libraries denoted HelL-11 and HelL-13. By the use of phage display technology, in total 466 unique scFv antibodies specific for 114 different antigens were generated. The specificities of these antibodies were analyzed in a variety of immunochemical assays and a subset was further evaluated for functionality in protein microarray applications. This high-throughput approach demonstrates the ability to rapidly generate a wealth of reagents not only for proteome research, but potentially also for diagnostics and therapeutics. In addition, this work provides a great example on how a synthetic approach can be used to optimize library designs. By having precise control of the diversity introduced into the antigen-binding sites, synthetic libraries offer increased understanding of how different diversity contributes to antibody binding reactivity and stability, thereby providing the key to future library optimization. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  15. Large Scale Immune Profiling of Infected Humans and Goats Reveals Differential Recognition of Brucella melitensis Antigens

    PubMed Central

    Liang, Li; Leng, Diana; Burk, Chad; Nakajima-Sasaki, Rie; Kayala, Matthew A.; Atluri, Vidya L.; Pablo, Jozelyn; Unal, Berkay; Ficht, Thomas A.; Gotuzzo, Eduardo; Saito, Mayuko; Morrow, W. John W.; Liang, Xiaowu; Baldi, Pierre; Gilman, Robert H.; Vinetz, Joseph M.; Tsolis, Renée M.; Felgner, Philip L.

    2010-01-01

    Brucellosis is a widespread zoonotic disease that is also a potential agent of bioterrorism. Current serological assays to diagnose human brucellosis in clinical settings are based on detection of agglutinating anti-LPS antibodies. To better understand the universe of antibody responses that develop after B. melitensis infection, a protein microarray was fabricated containing 1,406 predicted B. melitensis proteins. The array was probed with sera from experimentally infected goats and naturally infected humans from an endemic region in Peru. The assay identified 18 antigens differentially recognized by infected and non-infected goats, and 13 serodiagnostic antigens that differentiate human patients proven to have acute brucellosis from syndromically similar patients. There were 31 cross-reactive antigens in healthy goats and 20 cross-reactive antigens in healthy humans. Only two of the serodiagnostic antigens and eight of the cross-reactive antigens overlap between humans and goats. Based on these results, a nitrocellulose line blot containing the human serodiagnostic antigens was fabricated and applied in a simple assay that validated the accuracy of the protein microarray results in the diagnosis of humans. These data demonstrate that an experimentally infected natural reservoir host produces a fundamentally different immune response than a naturally infected accidental human host. PMID:20454614

  16. Synthesis of high-quality libraries of long (150mer) oligonucleotides by a novel depurination controlled process

    PubMed Central

    LeProust, Emily M.; Peck, Bill J.; Spirin, Konstantin; McCuen, Heather Brummel; Moore, Bridget; Namsaraev, Eugeni; Caruthers, Marvin H.

    2010-01-01

    We have achieved the ability to synthesize thousands of unique, long oligonucleotides (150mers) in fmol amounts using parallel synthesis of DNA on microarrays. The sequence accuracy of the oligonucleotides in such large-scale syntheses has been limited by the yields and side reactions of the DNA synthesis process used. While there has been significant demand for libraries of long oligos (150mer and more), the yields in conventional DNA synthesis and the associated side reactions have previously limited the availability of oligonucleotide pools to lengths <100 nt. Using novel array based depurination assays, we show that the depurination side reaction is the limiting factor for the synthesis of libraries of long oligonucleotides on Agilent Technologies’ SurePrint® DNA microarray platform. We also demonstrate how depurination can be controlled and reduced by a novel detritylation process to enable the synthesis of high quality, long (150mer) oligonucleotide libraries and we report the characterization of synthesis efficiency for such libraries. Oligonucleotide libraries prepared with this method have changed the economics and availability of several existing applications (e.g. targeted resequencing, preparation of shRNA libraries, site-directed mutagenesis), and have the potential to enable even more novel applications (e.g. high-complexity synthetic biology). PMID:20308161

  17. Single molecule characterization of DNA binding and strand displacement reactions on lithographic DNA origami microarrays.

    PubMed

    Scheible, Max B; Pardatscher, Günther; Kuzyk, Anton; Simmel, Friedrich C

    2014-03-12

    The combination of molecular self-assembly based on the DNA origami technique with lithographic patterning enables the creation of hierarchically ordered nanosystems, in which single molecules are positioned at precise locations on multiple length scales. Based on a hybrid assembly protocol utilizing DNA self-assembly and electron-beam lithography on transparent glass substrates, we here demonstrate a DNA origami microarray, which is compatible with the requirements of single molecule fluorescence and super-resolution microscopy. The spatial arrangement allows for a simple and reliable identification of single molecule events and facilitates automated read-out and data analysis. As a specific application, we utilize the microarray to characterize the performance of DNA strand displacement reactions localized on the DNA origami structures. We find considerable variability within the array, which results both from structural variations and stochastic reaction dynamics prevalent at the single molecule level.

  18. A comparison of bootstrap methods and an adjusted bootstrap approach for estimating the prediction error in microarray classification.

    PubMed

    Jiang, Wenyu; Simon, Richard

    2007-12-20

    This paper first provides a critical review on some existing methods for estimating the prediction error in classifying microarray data where the number of genes greatly exceeds the number of specimens. Special attention is given to the bootstrap-related methods. When the sample size n is small, we find that all the reviewed methods suffer from either substantial bias or variability. We introduce a repeated leave-one-out bootstrap (RLOOB) method that predicts for each specimen in the sample using bootstrap learning sets of size ln. We then propose an adjusted bootstrap (ABS) method that fits a learning curve to the RLOOB estimates calculated with different bootstrap learning set sizes. The ABS method is robust across the situations we investigate and provides a slightly conservative estimate for the prediction error. Even with small samples, it does not suffer from large upward bias as the leave-one-out bootstrap and the 0.632+ bootstrap, and it does not suffer from large variability as the leave-one-out cross-validation in microarray applications. Copyright (c) 2007 John Wiley & Sons, Ltd.

  19. The MicroArray Quality Control (MAQC)-II study of common practices for the development and validation of microarray-based predictive models

    EPA Science Inventory

    The second phase of the MicroArray Quality Control (MAQC-II) project evaluated common practices for developing and validating microarray-based models aimed at predicting toxicological and clinical endpoints. Thirty-six teams developed classifiers for 13 endpoints - some easy, som...

  20. Evolutionary genomics of Staphylococcus aureus: insights into the origin of methicillin-resistant strains and the toxic shock syndrome epidemic.

    PubMed

    Fitzgerald, J R; Sturdevant, D E; Mackie, S M; Gill, S R; Musser, J M

    2001-07-17

    An emerging theme in medical microbiology is that extensive variation exists in gene content among strains of many pathogenic bacterial species. However, this topic has not been investigated on a genome scale with strains recovered from patients with well-defined clinical conditions. Staphylococcus aureus is a major human pathogen and also causes economically important infections in cows and sheep. A DNA microarray representing >90% of the S. aureus genome was used to characterize genomic diversity, evolutionary relationships, and virulence gene distribution among 36 strains of divergent clonal lineages, including methicillin-resistant strains and organisms causing toxic shock syndrome. Genetic variation in S. aureus is very extensive, with approximately 22% of the genome comprised of dispensable genetic material. Eighteen large regions of difference were identified, and 10 of these regions have genes that encode putative virulence factors or proteins mediating antibiotic resistance. We find that lateral gene transfer has played a fundamental role in the evolution of S. aureus. The mec gene has been horizontally transferred into distinct S. aureus chromosomal backgrounds at least five times, demonstrating that methicillin-resistant strains have evolved multiple independent times, rather than from a single ancestral strain. This finding resolves a long-standing controversy in S. aureus research. The epidemic of toxic shock syndrome that occurred in the 1970s was caused by a change in the host environment, rather than rapid geographic dissemination of a new hypervirulent strain. DNA microarray analysis of large samples of clinically characterized strains provides broad insights into evolution, pathogenesis, and disease emergence.

  1. Evidence for somatic gene conversion and deletion in bipolar disorder, Crohn's disease, coronary artery disease, hypertension, rheumatoid arthritis, type-1 diabetes, and type-2 diabetes

    PubMed Central

    2011-01-01

    Background During gene conversion, genetic information is transferred unidirectionally between highly homologous but non-allelic regions of DNA. While germ-line gene conversion has been implicated in the pathogenesis of some diseases, somatic gene conversion has remained technically difficult to investigate on a large scale. Methods A novel analysis technique is proposed for detecting the signature of somatic gene conversion from SNP microarray data. The Wellcome Trust Case Control Consortium has gathered SNP microarray data for two control populations and cohorts for bipolar disorder (BD), cardiovascular disease (CAD), Crohn's disease (CD), hypertension (HT), rheumatoid arthritis (RA), type-1 diabetes (T1D) and type-2 diabetes (T2D). Using the new analysis technique, the seven disease cohorts are analyzed to identify cohort-specific SNPs at which conversion is predicted. The quality of the predictions is assessed by identifying known disease associations for genes in the homologous duplicons, and comparing the frequency of such associations with background rates. Results Of 28 disease/locus pairs meeting stringent conditions, 22 show various degrees of disease association, compared with only 8 of 70 in a mock study designed to measure the background association rate (P < 10-9). Additional candidate genes are identified using less stringent filtering conditions. In some cases, somatic deletions appear likely. RA has a distinctive pattern of events relative to other diseases. Similarities in patterns are apparent between BD and HT. Conclusions The associations derived represent the first evidence that somatic gene conversion could be a significant causative factor in each of the seven diseases. The specific genes provide potential insights about disease mechanisms, and are strong candidates for further study. Please see Commentary: http://www.biomedcentral.com/1741-7015/9/13/abstract. PMID:21291537

  2. Evidence for somatic gene conversion and deletion in bipolar disorder, Crohn's disease, coronary artery disease, hypertension, rheumatoid arthritis, type-1 diabetes, and type-2 diabetes.

    PubMed

    Ross, Kenneth Andrew

    2011-02-03

    During gene conversion, genetic information is transferred unidirectionally between highly homologous but non-allelic regions of DNA. While germ-line gene conversion has been implicated in the pathogenesis of some diseases, somatic gene conversion has remained technically difficult to investigate on a large scale. A novel analysis technique is proposed for detecting the signature of somatic gene conversion from SNP microarray data. The Wellcome Trust Case Control Consortium has gathered SNP microarray data for two control populations and cohorts for bipolar disorder (BD), cardiovascular disease (CAD), Crohn's disease (CD), hypertension (HT), rheumatoid arthritis (RA), type-1 diabetes (T1D) and type-2 diabetes (T2D). Using the new analysis technique, the seven disease cohorts are analyzed to identify cohort-specific SNPs at which conversion is predicted. The quality of the predictions is assessed by identifying known disease associations for genes in the homologous duplicons, and comparing the frequency of such associations with background rates. Of 28 disease/locus pairs meeting stringent conditions, 22 show various degrees of disease association, compared with only 8 of 70 in a mock study designed to measure the background association rate (P < 10-9). Additional candidate genes are identified using less stringent filtering conditions. In some cases, somatic deletions appear likely. RA has a distinctive pattern of events relative to other diseases. Similarities in patterns are apparent between BD and HT. The associations derived represent the first evidence that somatic gene conversion could be a significant causative factor in each of the seven diseases. The specific genes provide potential insights about disease mechanisms, and are strong candidates for further study.

  3. IKKε and TBK1 expression in gastric cancer.

    PubMed

    Lee, Seung Eun; Hong, Mineui; Cho, Junhun; Lee, Jeeyun; Kim, Kyoung-Mee

    2017-03-07

    Inhibitor of kappa B kinase epsilon (IKKε) and TANK-binding kinase 1 (TBK1) are non-canonical IKKs. IKKε and TBK1 share the kinase domain and are similar in their ability to activate the nuclear factor-kappa B signaling pathway. IKKε and TBK1 are overexpressed through multiple mechanisms in various human cancers. However, the expression of IKKε and TBK1 in gastric cancer and their role in prognosis have not been studied.To investigate overexpression of the IKKε and TBK1 proteins in gastric cancer and their relationship with clinicopathologic factors, we performed immunohistochemical staining using a tissue microarray. Tissue microarray samples were obtained from 1,107 gastric cancer patients who underwent R0 gastrectomy with extensive lymph node dissection and adjuvant chemotherapy.We identified expression of IKKε in 150 (13.6%) and TBK1 in 38 (3.4%) gastric cancers. Furthermore, co-expression of IKKε and TBK1 was identified in 1.5% of cases. Co-expression of IKKε and TBK1 was associated with differentiated intestinal histology and earlier T stage. In a multivariate binary logistic regression model, intestinal histologic type by Lauren classification and early AJCC stage were significant predictors for expression of IKKε and TBK1 proteins in gastric cancer. Changes in IKKε and TBK1 expression may be involved in the development of intestinal-type gastric cancer. The overexpression of IKKε and TBK1 should be considered in selected patients with intestinal-type gastric cancer.In conclusion, this is the first large-scale study investigating the relationships between expression of IKKε and TBK1 and clinicopathologic features of gastric cancer. The role of IKKε and TBK1 in intestinal-type gastric cancer pathogenesis should be elucidated by further investigation.

  4. Large-scale integrative network-based analysis identifies common pathways disrupted by copy number alterations across cancers

    PubMed Central

    2013-01-01

    Background Many large-scale studies analyzed high-throughput genomic data to identify altered pathways essential to the development and progression of specific types of cancer. However, no previous study has been extended to provide a comprehensive analysis of pathways disrupted by copy number alterations across different human cancers. Towards this goal, we propose a network-based method to integrate copy number alteration data with human protein-protein interaction networks and pathway databases to identify pathways that are commonly disrupted in many different types of cancer. Results We applied our approach to a data set of 2,172 cancer patients across 16 different types of cancers, and discovered a set of commonly disrupted pathways, which are likely essential for tumor formation in majority of the cancers. We also identified pathways that are only disrupted in specific cancer types, providing molecular markers for different human cancers. Analysis with independent microarray gene expression datasets confirms that the commonly disrupted pathways can be used to identify patient subgroups with significantly different survival outcomes. We also provide a network view of disrupted pathways to explain how copy number alterations affect pathways that regulate cell growth, cycle, and differentiation for tumorigenesis. Conclusions In this work, we demonstrated that the network-based integrative analysis can help to identify pathways disrupted by copy number alterations across 16 types of human cancers, which are not readily identifiable by conventional overrepresentation-based and other pathway-based methods. All the results and source code are available at http://compbio.cs.umn.edu/NetPathID/. PMID:23822816

  5. A comprehensive simulation study on classification of RNA-Seq data.

    PubMed

    Zararsız, Gökmen; Goksuluk, Dincer; Korkmaz, Selcuk; Eldem, Vahap; Zararsiz, Gozde Erturk; Duru, Izzet Parug; Ozturk, Ahmet

    2017-01-01

    RNA sequencing (RNA-Seq) is a powerful technique for the gene-expression profiling of organisms that uses the capabilities of next-generation sequencing technologies. Developing gene-expression-based classification algorithms is an emerging powerful method for diagnosis, disease classification and monitoring at molecular level, as well as providing potential markers of diseases. Most of the statistical methods proposed for the classification of gene-expression data are either based on a continuous scale (eg. microarray data) or require a normal distribution assumption. Hence, these methods cannot be directly applied to RNA-Seq data since they violate both data structure and distributional assumptions. However, it is possible to apply these algorithms with appropriate modifications to RNA-Seq data. One way is to develop count-based classifiers, such as Poisson linear discriminant analysis and negative binomial linear discriminant analysis. Another way is to bring the data closer to microarrays and apply microarray-based classifiers. In this study, we compared several classifiers including PLDA with and without power transformation, NBLDA, single SVM, bagging SVM (bagSVM), classification and regression trees (CART), and random forests (RF). We also examined the effect of several parameters such as overdispersion, sample size, number of genes, number of classes, differential-expression rate, and the transformation method on model performances. A comprehensive simulation study is conducted and the results are compared with the results of two miRNA and two mRNA experimental datasets. The results revealed that increasing the sample size, differential-expression rate and decreasing the dispersion parameter and number of groups lead to an increase in classification accuracy. Similar with differential-expression studies, the classification of RNA-Seq data requires careful attention when handling data overdispersion. We conclude that, as a count-based classifier, the power transformed PLDA and, as a microarray-based classifier, vst or rlog transformed RF and SVM classifiers may be a good choice for classification. An R/BIOCONDUCTOR package, MLSeq, is freely available at https://www.bioconductor.org/packages/release/bioc/html/MLSeq.html.

  6. Genotyping microarray: Mutation screening in Spanish families with autosomal dominant retinitis pigmentosa

    PubMed Central

    García-Hoyos, María; Cortón, Marta; Ávila-Fernández, Almudena; Riveiro-Álvarez, Rosa; Giménez, Ascensión; Hernan, Inma; Carballo, Miguel; Ayuso, Carmen

    2012-01-01

    Purpose Presently, 22 genes have been described in association with autosomal dominant retinitis pigmentosa (adRP); however, they explain only 50% of all cases, making genetic diagnosis of this disease difficult and costly. The aim of this study was to evaluate a specific genotyping microarray for its application to the molecular diagnosis of adRP in Spanish patients. Methods We analyzed 139 unrelated Spanish families with adRP. Samples were studied by using a genotyping microarray (adRP). All mutations found were further confirmed with automatic sequencing. Rhodopsin (RHO) sequencing was performed in all negative samples for the genotyping microarray. Results The adRP genotyping microarray detected the mutation associated with the disease in 20 of the 139 families with adRP. As in other populations, RHO was found to be the most frequently mutated gene in these families (7.9% of the microarray genotyped families). The rate of false positives (microarray results not confirmed with sequencing) and false negatives (mutations in RHO detected with sequencing but not with the genotyping microarray) were established, and high levels of analytical sensitivity (95%) and specificity (100%) were found. Diagnostic accuracy was 15.1%. Conclusions The adRP genotyping microarray is a quick, cost-efficient first step in the molecular diagnosis of Spanish patients with adRP. PMID:22736939

  7. Electrophoretic and field-effect graphene for all-electrical DNA array technology.

    PubMed

    Xu, Guangyu; Abbott, Jeffrey; Qin, Ling; Yeung, Kitty Y M; Song, Yi; Yoon, Hosang; Kong, Jing; Ham, Donhee

    2014-09-05

    Field-effect transistor biomolecular sensors based on low-dimensional nanomaterials boast sensitivity, label-free operation and chip-scale construction. Chemical vapour deposition graphene is especially well suited for multiplexed electronic DNA array applications, since its large two-dimensional morphology readily lends itself to top-down fabrication of transistor arrays. Nonetheless, graphene field-effect transistor DNA sensors have been studied mainly at single-device level. Here we create, from chemical vapour deposition graphene, field-effect transistor arrays with two features representing steps towards multiplexed DNA arrays. First, a robust array yield--seven out of eight transistors--is achieved with a 100-fM sensitivity, on par with optical DNA microarrays and at least 10 times higher than prior chemical vapour deposition graphene transistor DNA sensors. Second, each graphene acts as an electrophoretic electrode for site-specific probe DNA immobilization, and performs subsequent site-specific detection of target DNA as a field-effect transistor. The use of graphene as both electrode and transistor suggests a path towards all-electrical multiplexed graphene DNA arrays.

  8. In silico Pathway Activation Network Decomposition Analysis (iPANDA) as a method for biomarker development.

    PubMed

    Ozerov, Ivan V; Lezhnina, Ksenia V; Izumchenko, Evgeny; Artemov, Artem V; Medintsev, Sergey; Vanhaelen, Quentin; Aliper, Alexander; Vijg, Jan; Osipov, Andreyan N; Labat, Ivan; West, Michael D; Buzdin, Anton; Cantor, Charles R; Nikolsky, Yuri; Borisov, Nikolay; Irincheeva, Irina; Khokhlovich, Edward; Sidransky, David; Camargo, Miguel Luiz; Zhavoronkov, Alex

    2016-11-16

    Signalling pathway activation analysis is a powerful approach for extracting biologically relevant features from large-scale transcriptomic and proteomic data. However, modern pathway-based methods often fail to provide stable pathway signatures of a specific phenotype or reliable disease biomarkers. In the present study, we introduce the in silico Pathway Activation Network Decomposition Analysis (iPANDA) as a scalable robust method for biomarker identification using gene expression data. The iPANDA method combines precalculated gene coexpression data with gene importance factors based on the degree of differential gene expression and pathway topology decomposition for obtaining pathway activation scores. Using Microarray Analysis Quality Control (MAQC) data sets and pretreatment data on Taxol-based neoadjuvant breast cancer therapy from multiple sources, we demonstrate that iPANDA provides significant noise reduction in transcriptomic data and identifies highly robust sets of biologically relevant pathway signatures. We successfully apply iPANDA for stratifying breast cancer patients according to their sensitivity to neoadjuvant therapy.

  9. Using FAIRE (Formaldehyde-Assisted Isolation of Regulatory Elements) to isolate active regulatory DNA

    PubMed Central

    Simon, Jeremy M.; Giresi, Paul G.; Davis, Ian J.; Lieb, Jason D.

    2013-01-01

    Eviction or destabilization of nucleosomes from chromatin is a hallmark of functional regulatory elements of the eukaryotic genome. Historically identified by nuclease hypersensitivity, these regulatory elements are typically bound by transcription factors or other regulatory proteins. FAIRE (Formaldehyde-Assisted Isolation of Regulatory Elements) is an alternative approach to identify these genomic regions and has proven successful in a multitude of eukaryotic cell and tissue types. Cells or dissociated tissues are crosslinked briefly with formaldehyde, lysed, and sonicated. Sheared chromatin is subjected to phenol-chloroform extraction and the isolated DNA, typically encompassing 1–3% of the human genome, is purified. We provide guidelines for quantitative analysis by PCR, microarrays, or next-generation sequencing. Regulatory elements enriched by FAIRE display high concordance with those identified by nuclease hypersensitivity or ChIP, and the entire procedure can be completed in three days. FAIRE exhibits low technical variability, which allows its use in large-scale studies of chromatin from normal or diseased tissues. PMID:22262007

  10. In silico Pathway Activation Network Decomposition Analysis (iPANDA) as a method for biomarker development

    PubMed Central

    Ozerov, Ivan V.; Lezhnina, Ksenia V.; Izumchenko, Evgeny; Artemov, Artem V.; Medintsev, Sergey; Vanhaelen, Quentin; Aliper, Alexander; Vijg, Jan; Osipov, Andreyan N.; Labat, Ivan; West, Michael D.; Buzdin, Anton; Cantor, Charles R.; Nikolsky, Yuri; Borisov, Nikolay; Irincheeva, Irina; Khokhlovich, Edward; Sidransky, David; Camargo, Miguel Luiz; Zhavoronkov, Alex

    2016-01-01

    Signalling pathway activation analysis is a powerful approach for extracting biologically relevant features from large-scale transcriptomic and proteomic data. However, modern pathway-based methods often fail to provide stable pathway signatures of a specific phenotype or reliable disease biomarkers. In the present study, we introduce the in silico Pathway Activation Network Decomposition Analysis (iPANDA) as a scalable robust method for biomarker identification using gene expression data. The iPANDA method combines precalculated gene coexpression data with gene importance factors based on the degree of differential gene expression and pathway topology decomposition for obtaining pathway activation scores. Using Microarray Analysis Quality Control (MAQC) data sets and pretreatment data on Taxol-based neoadjuvant breast cancer therapy from multiple sources, we demonstrate that iPANDA provides significant noise reduction in transcriptomic data and identifies highly robust sets of biologically relevant pathway signatures. We successfully apply iPANDA for stratifying breast cancer patients according to their sensitivity to neoadjuvant therapy. PMID:27848968

  11. Quantitative phenotyping via deep barcode sequencing.

    PubMed

    Smith, Andrew M; Heisler, Lawrence E; Mellor, Joseph; Kaper, Fiona; Thompson, Michael J; Chee, Mark; Roth, Frederick P; Giaever, Guri; Nislow, Corey

    2009-10-01

    Next-generation DNA sequencing technologies have revolutionized diverse genomics applications, including de novo genome sequencing, SNP detection, chromatin immunoprecipitation, and transcriptome analysis. Here we apply deep sequencing to genome-scale fitness profiling to evaluate yeast strain collections in parallel. This method, Barcode analysis by Sequencing, or "Bar-seq," outperforms the current benchmark barcode microarray assay in terms of both dynamic range and throughput. When applied to a complex chemogenomic assay, Bar-seq quantitatively identifies drug targets, with performance superior to the benchmark microarray assay. We also show that Bar-seq is well-suited for a multiplex format. We completely re-sequenced and re-annotated the yeast deletion collection using deep sequencing, found that approximately 20% of the barcodes and common priming sequences varied from expectation, and used this revised list of barcode sequences to improve data quality. Together, this new assay and analysis routine provide a deep-sequencing-based toolkit for identifying gene-environment interactions on a genome-wide scale.

  12. A genome-scale map of expression for a mouse brain section obtained using voxelation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chin, Mark H.; Geng, Alex B.; Khan, Arshad H.

    Gene expression signatures in the mammalian brain hold the key to understanding neural development and neurological diseases. We have reconstructed 2- dimensional images of gene expression for 20,000 genes in a coronal slice of the mouse brain at the level of the striatum by using microarrays in combination with voxelation at a resolution of 1 mm3. Good reliability of the microarray results were confirmed using multiple replicates, subsequent quantitative RT-PCR voxelation, mass spectrometry voxelation and publicly available in situ hybridization data. Known and novel genes were identified with expression patterns localized to defined substructures within the brain. In addition, genesmore » with unexpected patterns were identified and cluster analysis identified a set of genes with a gradient of dorsal/ventral expression not restricted to known anatomical boundaries. The genome-scale maps of gene expression obtained using voxelation will be a valuable tool for the neuroscience community.« less

  13. Large-scale identification and comparative analysis of miRNA expression profile in the respiratory tree of the sea cucumber Apostichopus japonicus during aestivation.

    PubMed

    Chen, Muyan; Storey, Kenneth B

    2014-02-01

    The sea cucumber Apostichopus japonicus withstands high water temperatures in the summer by suppressing its metabolic rate and entering a state of aestivation. We hypothesized that changes in the expression of miRNAs could provide important post-transcriptional regulation of gene expression during hypometabolism via control over mRNA translation. The present study analyzed profiles of miRNA expression in the sea cucumber respiratory tree using Solexa deep sequencing technology. We identified 279 sea cucumber miRNAs, including 15 novel miRNAs specific to sea cucumber. Animals sampled during deep aestivation (DA; after at least 15 days of continuous torpor) were compared with animals from a non-aestivation (NA) state (animals that had passed through aestivation and returned to an active state). We identified 30 differentially expressed miRNAs ([RPM (reads per million) >10, |FC| (|fold change|)≥1, FDR (false discovery rate)<0.01]) during aestivation, which were validated by two other miRNA profiling methods: miRNA microarray and real-time PCR. Among the most prominent miRNA species, miR-124, miR-124-3p, miR-79, miR-9 and miR-2010 were significantly over-expressed during deep aestivation compared with non-aestivation animals, suggesting that these miRNAs may play important roles in metabolic rate suppression during aestivation. High-throughput sequencing data and microarray data have been submitted to the GEO database with accession number: 16902695. Copyright © 2014 Elsevier B.V. All rights reserved.

  14. Isolation of RNA From Peripheral Blood Cells: A Validation Study for Molecular Diagnostics by Microarray and Kinetic RT-PCR Assays - Application in Aerospace Medicine

    DTIC Science & Technology

    2004-01-01

    of RNA From Peripheral Blood Cells: A Validation Study for Molecular Diagnostics by Microarray and Kinetic RT-PCR Assays  Application in...VALIDATION STUDY FOR MOLECULAR DIAGNOSTICS BY MICROARRAY AND KINETIC RT-PCR ASSAYS  APPLICATION IN AEROSPACE MEDICINE INTRODUCTION Extraction of cellular

  15. Functional Analyses of NSF1 in Wine Yeast Using Interconnected Correlation Clustering and Molecular Analyses

    PubMed Central

    Bessonov, Kyrylo; Walkey, Christopher J.; Shelp, Barry J.; van Vuuren, Hennie J. J.; Chiu, David; van der Merwe, George

    2013-01-01

    Analyzing time-course expression data captured in microarray datasets is a complex undertaking as the vast and complex data space is represented by a relatively low number of samples as compared to thousands of available genes. Here, we developed the Interdependent Correlation Clustering (ICC) method to analyze relationships that exist among genes conditioned on the expression of a specific target gene in microarray data. Based on Correlation Clustering, the ICC method analyzes a large set of correlation values related to gene expression profiles extracted from given microarray datasets. ICC can be applied to any microarray dataset and any target gene. We applied this method to microarray data generated from wine fermentations and selected NSF1, which encodes a C2H2 zinc finger-type transcription factor, as the target gene. The validity of the method was verified by accurate identifications of the previously known functional roles of NSF1. In addition, we identified and verified potential new functions for this gene; specifically, NSF1 is a negative regulator for the expression of sulfur metabolism genes, the nuclear localization of Nsf1 protein (Nsf1p) is controlled in a sulfur-dependent manner, and the transcription of NSF1 is regulated by Met4p, an important transcriptional activator of sulfur metabolism genes. The inter-disciplinary approach adopted here highlighted the accuracy and relevancy of the ICC method in mining for novel gene functions using complex microarray datasets with a limited number of samples. PMID:24130853

  16. A Novel Pan-Flavivirus Detection and Identification Assay Based on RT-qPCR and Microarray

    PubMed Central

    Sachse, Konrad; Ziegler, Ute; Keller, Markus

    2017-01-01

    The genus Flavivirus includes arthropod-borne viruses responsible for a large number of infections in humans and economically important animals. While RT-PCR protocols for specific detection of most Flavivirus species are available, there has been also a demand for a broad-range Flavivirus assay covering all members of the genus. It is particularly challenging to balance specificity at genus level with equal sensitivity towards each target species. In the present study, a novel assay combining a SYBR Green-based RT-qPCR with a low-density DNA microarray has been developed. Validation experiments confirmed that the RT-qPCR exhibited roughly equal sensitivity of detection and quantification for all flaviviruses tested. These PCR products are subjected to hybridization on a microarray carrying 84 different oligonucleotide probes that represent all known Flavivirus species. This assay has been used as a screening and confirmation tool for Flavivirus presence in laboratory and field samples, and it performed successfully in international External Quality Assessment of NAT studies. Twenty-six Flavivirus strains were tested with the assay, showing equivalent or superior characteristics compared with the original or even with species-specific RT-PCRs. As an example, test results on West Nile virus detection in a panel of 340 mosquito pool samples from Greece are presented. PMID:28626758

  17. Expression profiling of cardiovascular disease

    PubMed Central

    2004-01-01

    Cardiovascular disease is the most important cause of morbidity and mortality in developed countries, causing twice as many deaths as cancer in the USA. The major cardiovascular diseases, including coronary artery disease (CAD), myocardial infarction (MI), congestive heart failure (CHF) and common congenital heart disease (CHD), are caused by multiple genetic and environmental factors, as well as the interactions between them. The underlying molecular pathogenic mechanisms for these disorders are still largely unknown, but gene expression may play a central role in the development and progression of cardiovascular disease. Microarrays are high-throughput genomic tools that allow the comparison of global expression changes in thousands of genes between normal and diseased cells/tissues. Microarrays have recently been applied to CAD/MI, CHF and CHD to profile changes in gene expression patterns in diseased and non-diseased patients. This same technology has also been used to characterise endothelial cells, vascular smooth muscle cells and inflammatory cells, with or without various treatments that mimic disease processes involved in CAD/MI. These studies have led to the identification of unique subsets of genes associated with specific diseases and disease processes. Ongoing microarray studies in the field will provide insights into the molecular mechanism of cardiovascular disease and may generate new diagnostic and therapeutic markers. PMID:15588496

  18. Development and application of automated systems for plasmid-based functional proteomics to improve syntheitc biology of engineered industrial microbes for high level expression of proteases for biofertilizer production

    USDA-ARS?s Scientific Manuscript database

    In addition to microarray technology, which provides a robust method to study protein function in a rapid, economical, and proteome-wide fashion, plasmid-based functional proteomics is an important technology for rapidly obtaining large quantities of protein and determining protein function across a...

  19. Knowledge-based analysis of microarrays for the discovery of transcriptional regulation relationships

    PubMed Central

    2010-01-01

    Background The large amount of high-throughput genomic data has facilitated the discovery of the regulatory relationships between transcription factors and their target genes. While early methods for discovery of transcriptional regulation relationships from microarray data often focused on the high-throughput experimental data alone, more recent approaches have explored the integration of external knowledge bases of gene interactions. Results In this work, we develop an algorithm that provides improved performance in the prediction of transcriptional regulatory relationships by supplementing the analysis of microarray data with a new method of integrating information from an existing knowledge base. Using a well-known dataset of yeast microarrays and the Yeast Proteome Database, a comprehensive collection of known information of yeast genes, we show that knowledge-based predictions demonstrate better sensitivity and specificity in inferring new transcriptional interactions than predictions from microarray data alone. We also show that comprehensive, direct and high-quality knowledge bases provide better prediction performance. Comparison of our results with ChIP-chip data and growth fitness data suggests that our predicted genome-wide regulatory pairs in yeast are reasonable candidates for follow-up biological verification. Conclusion High quality, comprehensive, and direct knowledge bases, when combined with appropriate bioinformatic algorithms, can significantly improve the discovery of gene regulatory relationships from high throughput gene expression data. PMID:20122245

  20. Knowledge-based analysis of microarrays for the discovery of transcriptional regulation relationships.

    PubMed

    Seok, Junhee; Kaushal, Amit; Davis, Ronald W; Xiao, Wenzhong

    2010-01-18

    The large amount of high-throughput genomic data has facilitated the discovery of the regulatory relationships between transcription factors and their target genes. While early methods for discovery of transcriptional regulation relationships from microarray data often focused on the high-throughput experimental data alone, more recent approaches have explored the integration of external knowledge bases of gene interactions. In this work, we develop an algorithm that provides improved performance in the prediction of transcriptional regulatory relationships by supplementing the analysis of microarray data with a new method of integrating information from an existing knowledge base. Using a well-known dataset of yeast microarrays and the Yeast Proteome Database, a comprehensive collection of known information of yeast genes, we show that knowledge-based predictions demonstrate better sensitivity and specificity in inferring new transcriptional interactions than predictions from microarray data alone. We also show that comprehensive, direct and high-quality knowledge bases provide better prediction performance. Comparison of our results with ChIP-chip data and growth fitness data suggests that our predicted genome-wide regulatory pairs in yeast are reasonable candidates for follow-up biological verification. High quality, comprehensive, and direct knowledge bases, when combined with appropriate bioinformatic algorithms, can significantly improve the discovery of gene regulatory relationships from high throughput gene expression data.

  1. Microarray-based identification of differentially expressed genes in extramammary Paget’s disease

    PubMed Central

    Lin, Jin-Ran; Liang, Jun; Zhang, Qiao-An; Huang, Qiong; Wang, Shang-Shang; Qin, Hai-Hong; Chen, Lian-Jun; Xu, Jin-Hua

    2015-01-01

    Extramammary Paget’s disease (EMPD) is a rare cutaneous malignancy accounting for approximately 1-2% of vulvar cancers. The rarity of this disease has caused difficulties in characterization and the molecular mechanism underlying EMPD development remains largely unclear. Here we used microarray analysis to identify differentially expressed genes in EMPD of the scrotum comparing with normal epithelium from healthy donors. Agilent single-channel microarray was used to compare the gene expression between 6 EMPD specimens and 6 normal scrotum epithelium samples. A total of 799 up-regulated genes and 723 down-regulated genes were identified in EMPD tissues. Real-time PCR was conducted to verify the differential expression of some representative genes, including ERBB4, TCF3, PAPSS2, PIK3R3, PRLR, SULT1A1, TCF7L1, and CREB3L4. Generally, the real-time PCR results were consistent with microarray data, and the expression of ERBB4, PRLR, TCF3, PIK3R3, SULT1A1, and TCF7L1 was significantly overexpressed in EMPD (P<0.05). Moreover, the overexpression of PRLR in EMPD, a receptor for the anterior pituitary hormone prolactin (PRL), was confirmed by immunohistochemistry. These data demonstrate that the differentially expressed genes from the microarray-based identification are tightly associated with EMPD occurrence. PMID:26221264

  2. Advances in cell-free protein array methods.

    PubMed

    Yu, Xiaobo; Petritis, Brianne; Duan, Hu; Xu, Danke; LaBaer, Joshua

    2018-01-01

    Cell-free protein microarrays represent a special form of protein microarray which display proteins made fresh at the time of the experiment, avoiding storage and denaturation. They have been used increasingly in basic and translational research over the past decade to study protein-protein interactions, the pathogen-host relationship, post-translational modifications, and antibody biomarkers of different human diseases. Their role in the first blood-based diagnostic test for early stage breast cancer highlights their value in managing human health. Cell-free protein microarrays will continue to evolve to become widespread tools for research and clinical management. Areas covered: We review the advantages and disadvantages of different cell-free protein arrays, with an emphasis on the methods that have been studied in the last five years. We also discuss the applications of each microarray method. Expert commentary: Given the growing roles and impact of cell-free protein microarrays in research and medicine, we discuss: 1) the current technical and practical limitations of cell-free protein microarrays; 2) the biomarker discovery and verification pipeline using protein microarrays; and 3) how cell-free protein microarrays will advance over the next five years, both in their technology and applications.

  3. The opportunities and challenges of large-scale molecular approaches to songbird neurobiology

    PubMed Central

    Mello, C.V.; Clayton, D.F.

    2014-01-01

    High-through put methods for analyzing genome structure and function are having a large impact in song-bird neurobiology. Methods include genome sequencing and annotation, comparative genomics, DNA microarrays and transcriptomics, and the development of a brain atlas of gene expression. Key emerging findings include the identification of complex transcriptional programs active during singing, the robust brain expression of non-coding RNAs, evidence of profound variations in gene expression across brain regions, and the identification of molecular specializations within song production and learning circuits. Current challenges include the statistical analysis of large datasets, effective genome curations, the efficient localization of gene expression changes to specific neuronal circuits and cells, and the dissection of behavioral and environmental factors that influence brain gene expression. The field requires efficient methods for comparisons with organisms like chicken, which offer important anatomical, functional and behavioral contrasts. As sequencing costs plummet, opportunities emerge for comparative approaches that may help reveal evolutionary transitions contributing to vocal learning, social behavior and other properties that make songbirds such compelling research subjects. PMID:25280907

  4. Design and coverage of high throughput genotyping arrays optimized for individuals of East Asian, African American, and Latino race/ethnicity using imputation and a novel hybrid SNP selection algorithm.

    PubMed

    Hoffmann, Thomas J; Zhan, Yiping; Kvale, Mark N; Hesselson, Stephanie E; Gollub, Jeremy; Iribarren, Carlos; Lu, Yontao; Mei, Gangwu; Purdy, Matthew M; Quesenberry, Charles; Rowell, Sarah; Shapero, Michael H; Smethurst, David; Somkin, Carol P; Van den Eeden, Stephen K; Walter, Larry; Webster, Teresa; Whitmer, Rachel A; Finn, Andrea; Schaefer, Catherine; Kwok, Pui-Yan; Risch, Neil

    2011-12-01

    Four custom Axiom genotyping arrays were designed for a genome-wide association (GWA) study of 100,000 participants from the Kaiser Permanente Research Program on Genes, Environment and Health. The array optimized for individuals of European race/ethnicity was previously described. Here we detail the development of three additional microarrays optimized for individuals of East Asian, African American, and Latino race/ethnicity. For these arrays, we decreased redundancy of high-performing SNPs to increase SNP capacity. The East Asian array was designed using greedy pairwise SNP selection. However, removing SNPs from the target set based on imputation coverage is more efficient than pairwise tagging. Therefore, we developed a novel hybrid SNP selection method for the African American and Latino arrays utilizing rounds of greedy pairwise SNP selection, followed by removal from the target set of SNPs covered by imputation. The arrays provide excellent genome-wide coverage and are valuable additions for large-scale GWA studies. Copyright © 2011 Elsevier Inc. All rights reserved.

  5. Revisiting the immune microenvironment of diffuse large B-cell lymphoma using a tissue microarray and immunohistochemistry: robust semi-automated analysis reveals CD3 and FoxP3 as potential predictors of response to R-CHOP

    PubMed Central

    Coutinho, Rita; Clear, Andrew J.; Mazzola, Emanuele; Owen, Andrew; Greaves, Paul; Wilson, Andrew; Matthews, Janet; Lee, Abigail; Alvarez, Rute; da Silva, Maria Gomes; Cabeçadas, José; Neuberg, Donna; Calaminici, Maria; Gribben, John G.

    2015-01-01

    Gene expression studies have identified the microenvironment as a prognostic player in diffuse large B-cell lymphoma. However, there is a lack of simple immune biomarkers that can be applied in the clinical setting and could be helpful in stratifying patients. Immunohistochemistry has been used for this purpose but the results are inconsistent. We decided to reinvestigate the immune microenvironment and its impact using immunohistochemistry, with two systems of image analysis, in a large set of patients with diffuse large B-cell lymphoma. Diagnostic tissue from 309 patients was arrayed onto tissue microarrays. Results from 161 chemoimmunotherapy-treated patients were used for outcome prediction. Positive cells, percentage stained area and numbers of pixels/area were quantified and results were compared with the purpose of inferring consistency between the two semi-automated systems. Measurement cutpoints were assessed using a recursive partitioning algorithm classifying results according to survival. Kaplan-Meier estimators and Fisher exact tests were evaluated to check for significant differences between measurement classes, and for dependence between pairs of measurements, respectively. Results were validated by multivariate analysis incorporating the International Prognostic Index. The concordance between the two systems of image analysis was surprisingly high, supporting their applicability for immunohistochemistry studies. Patients with a high density of CD3 and FoxP3 by both methods had a better outcome. Automated analysis should be the preferred method for immunohistochemistry studies. Following the use of two methods of semi-automated analysis we suggest that CD3 and FoxP3 play a role in predicting response to chemoimmunotherapy in diffuse large B-cell lymphoma. PMID:25425693

  6. Revisiting the immune microenvironment of diffuse large B-cell lymphoma using a tissue microarray and immunohistochemistry: robust semi-automated analysis reveals CD3 and FoxP3 as potential predictors of response to R-CHOP.

    PubMed

    Coutinho, Rita; Clear, Andrew J; Mazzola, Emanuele; Owen, Andrew; Greaves, Paul; Wilson, Andrew; Matthews, Janet; Lee, Abigail; Alvarez, Rute; da Silva, Maria Gomes; Cabeçadas, José; Neuberg, Donna; Calaminici, Maria; Gribben, John G

    2015-03-01

    Gene expression studies have identified the microenvironment as a prognostic player in diffuse large B-cell lymphoma. However, there is a lack of simple immune biomarkers that can be applied in the clinical setting and could be helpful in stratifying patients. Immunohistochemistry has been used for this purpose but the results are inconsistent. We decided to reinvestigate the immune microenvironment and its impact using immunohistochemistry, with two systems of image analysis, in a large set of patients with diffuse large B-cell lymphoma. Diagnostic tissue from 309 patients was arrayed onto tissue microarrays. Results from 161 chemoimmunotherapy-treated patients were used for outcome prediction. Positive cells, percentage stained area and numbers of pixels/area were quantified and results were compared with the purpose of inferring consistency between the two semi-automated systems. Measurement cutpoints were assessed using a recursive partitioning algorithm classifying results according to survival. Kaplan-Meier estimators and Fisher exact tests were evaluated to check for significant differences between measurement classes, and for dependence between pairs of measurements, respectively. Results were validated by multivariate analysis incorporating the International Prognostic Index. The concordance between the two systems of image analysis was surprisingly high, supporting their applicability for immunohistochemistry studies. Patients with a high density of CD3 and FoxP3 by both methods had a better outcome. Automated analysis should be the preferred method for immunohistochemistry studies. Following the use of two methods of semi-automated analysis we suggest that CD3 and FoxP3 play a role in predicting response to chemoimmunotherapy in diffuse large B-cell lymphoma. Copyright© Ferrata Storti Foundation.

  7. Optimization of cDNA microarrays procedures using criteria that do not rely on external standards.

    PubMed

    Bruland, Torunn; Anderssen, Endre; Doseth, Berit; Bergum, Hallgeir; Beisvag, Vidar; Laegreid, Astrid

    2007-10-18

    The measurement of gene expression using microarray technology is a complicated process in which a large number of factors can be varied. Due to the lack of standard calibration samples such as are used in traditional chemical analysis it may be a problem to evaluate whether changes done to the microarray procedure actually improve the identification of truly differentially expressed genes. The purpose of the present work is to report the optimization of several steps in the microarray process both in laboratory practices and in data processing using criteria that do not rely on external standards. We performed a cDNA microarry experiment including RNA from samples with high expected differential gene expression termed "high contrasts" (rat cell lines AR42J and NRK52E) compared to self-self hybridization, and optimized a pipeline to maximize the number of genes found to be differentially expressed in the "high contrasts" RNA samples by estimating the false discovery rate (FDR) using a null distribution obtained from the self-self experiment. The proposed high-contrast versus self-self method (HCSSM) requires only four microarrays per evaluation. The effects of blocking reagent dose, filtering, and background corrections methodologies were investigated. In our experiments a dose of 250 ng LNA (locked nucleic acid) dT blocker, no background correction and weight based filtering gave the largest number of differentially expressed genes. The choice of background correction method had a stronger impact on the estimated number of differentially expressed genes than the choice of filtering method. Cross platform microarray (Illumina) analysis was used to validate that the increase in the number of differentially expressed genes found by HCSSM was real. The results show that HCSSM can be a useful and simple approach to optimize microarray procedures without including external standards. Our optimizing method is highly applicable to both long oligo-probe microarrays which have become commonly used for well characterized organisms such as man, mouse and rat, as well as to cDNA microarrays which are still of importance for organisms with incomplete genome sequence information such as many bacteria, plants and fish.

  8. Optimization of cDNA microarrays procedures using criteria that do not rely on external standards

    PubMed Central

    Bruland, Torunn; Anderssen, Endre; Doseth, Berit; Bergum, Hallgeir; Beisvag, Vidar; Lægreid, Astrid

    2007-01-01

    Background The measurement of gene expression using microarray technology is a complicated process in which a large number of factors can be varied. Due to the lack of standard calibration samples such as are used in traditional chemical analysis it may be a problem to evaluate whether changes done to the microarray procedure actually improve the identification of truly differentially expressed genes. The purpose of the present work is to report the optimization of several steps in the microarray process both in laboratory practices and in data processing using criteria that do not rely on external standards. Results We performed a cDNA microarry experiment including RNA from samples with high expected differential gene expression termed "high contrasts" (rat cell lines AR42J and NRK52E) compared to self-self hybridization, and optimized a pipeline to maximize the number of genes found to be differentially expressed in the "high contrasts" RNA samples by estimating the false discovery rate (FDR) using a null distribution obtained from the self-self experiment. The proposed high-contrast versus self-self method (HCSSM) requires only four microarrays per evaluation. The effects of blocking reagent dose, filtering, and background corrections methodologies were investigated. In our experiments a dose of 250 ng LNA (locked nucleic acid) dT blocker, no background correction and weight based filtering gave the largest number of differentially expressed genes. The choice of background correction method had a stronger impact on the estimated number of differentially expressed genes than the choice of filtering method. Cross platform microarray (Illumina) analysis was used to validate that the increase in the number of differentially expressed genes found by HCSSM was real. Conclusion The results show that HCSSM can be a useful and simple approach to optimize microarray procedures without including external standards. Our optimizing method is highly applicable to both long oligo-probe microarrays which have become commonly used for well characterized organisms such as man, mouse and rat, as well as to cDNA microarrays which are still of importance for organisms with incomplete genome sequence information such as many bacteria, plants and fish. PMID:17949480

  9. A Human Lectin Microarray for Sperm Surface Glycosylation Analysis *

    PubMed Central

    Sun, Yangyang; Cheng, Li; Gu, Yihua; Xin, Aijie; Wu, Bin; Zhou, Shumin; Guo, Shujuan; Liu, Yin; Diao, Hua; Shi, Huijuan; Wang, Guangyu; Tao, Sheng-ce

    2016-01-01

    Glycosylation is one of the most abundant and functionally important protein post-translational modifications. As such, technology for efficient glycosylation analysis is in high demand. Lectin microarrays are a powerful tool for such investigations and have been successfully applied for a variety of glycobiological studies. However, most of the current lectin microarrays are primarily constructed from plant lectins, which are not well suited for studies of human glycosylation because of the extreme complexity of human glycans. Herein, we constructed a human lectin microarray with 60 human lectin and lectin-like proteins. All of the lectins and lectin-like proteins were purified from yeast, and most showed binding to human glycans. To demonstrate the applicability of the human lectin microarray, human sperm were probed on the microarray and strong bindings were observed for several lectins, including galectin-1, 7, 8, GalNAc-T6, and ERGIC-53 (LMAN1). These bindings were validated by flow cytometry and fluorescence immunostaining. Further, mass spectrometry analysis showed that galectin-1 binds several membrane-associated proteins including heat shock protein 90. Finally, functional assays showed that binding of galectin-8 could significantly enhance the acrosome reaction within human sperms. To our knowledge, this is the first construction of a human lectin microarray, and we anticipate it will find wide use for a range of human or mammalian studies, alone or in combination with plant lectin microarrays. PMID:27364157

  10. Wide screening of phage-displayed libraries identifies immune targets in planta.

    PubMed

    Rioja, Cristina; Van Wees, Saskia C; Charlton, Keith A; Pieterse, Corné M J; Lorenzo, Oscar; García-Sánchez, Susana

    2013-01-01

    Microbe-Associated Molecular Patterns and virulence effectors are recognized by plants as a first step to mount a defence response against potential pathogens. This recognition involves a large family of extracellular membrane receptors and other immune proteins located in different sub-cellular compartments. We have used phage-display technology to express and select for Arabidopsis proteins able to bind bacterial pathogens. To rapidly identify microbe-bound phage, we developed a monitoring method based on microarrays. This combined strategy allowed for a genome-wide screening of plant proteins involved in pathogen perception. Two phage libraries for high-throughput selection were constructed from cDNA of plants infected with Pseudomonas aeruginosa PA14, or from combined samples of the virulent isolate DC3000 of Pseudomonas syringae pv. tomato and its avirulent variant avrRpt2. These three pathosystems represent different degrees in the specificity of plant-microbe interactions. Libraries cover up to 2 × 10(7) different plant transcripts that can be displayed as functional proteins on the surface of T7 bacteriophage. A number of these were selected in a bio-panning assay for binding to Pseudomonas cells. Among the selected clones we isolated the ethylene response factor ATERF-1, which was able to bind the three bacterial strains in competition assays. ATERF-1 was rapidly exported from the nucleus upon infiltration of either alive or heat-killed Pseudomonas. Moreover, aterf-1 mutants exhibited enhanced susceptibility to infection. These findings suggest that ATERF-1 contains a microbe-recognition domain with a role in plant defence. To identify other putative pathogen-binding proteins on a genome-wide scale, the copy number of selected-vs.-total clones was compared by hybridizing phage cDNAs with Arabidopsis microarrays. Microarray analysis revealed a set of 472 candidates with significant fold change. Within this set defence-related genes, including well-known targets of bacterial effectors, are over-represented. Other genes non-previously related to defence can be associated through this study with general or strain-specific recognition of Pseudomonas.

  11. Vaginal microbial flora analysis by next generation sequencing and microarrays; can microbes indicate vaginal origin in a forensic context?

    PubMed

    Benschop, Corina C G; Quaak, Frederike C A; Boon, Mathilde E; Sijen, Titia; Kuiper, Irene

    2012-03-01

    Forensic analysis of biological traces generally encompasses the investigation of both the person who contributed to the trace and the body site(s) from which the trace originates. For instance, for sexual assault cases, it can be beneficial to distinguish vaginal samples from skin or saliva samples. In this study, we explored the use of microbial flora to indicate vaginal origin. First, we explored the vaginal microbiome for a large set of clinical vaginal samples (n = 240) by next generation sequencing (n = 338,184 sequence reads) and found 1,619 different sequences. Next, we selected 389 candidate probes targeting genera or species and designed a microarray, with which we analysed a diverse set of samples; 43 DNA extracts from vaginal samples and 25 DNA extracts from samples from other body sites, including sites in close proximity of or in contact with the vagina. Finally, we used the microarray results and next generation sequencing dataset to assess the potential for a future approach that uses microbial markers to indicate vaginal origin. Since no candidate genera/species were found to positively identify all vaginal DNA extracts on their own, while excluding all non-vaginal DNA extracts, we deduce that a reliable statement about the cellular origin of a biological trace should be based on the detection of multiple species within various genera. Microarray analysis of a sample will then render a microbial flora pattern that is probably best analysed in a probabilistic approach.

  12. Adaptable gene-specific dye bias correction for two-channel DNA microarrays.

    PubMed

    Margaritis, Thanasis; Lijnzaad, Philip; van Leenen, Dik; Bouwmeester, Diane; Kemmeren, Patrick; van Hooff, Sander R; Holstege, Frank C P

    2009-01-01

    DNA microarray technology is a powerful tool for monitoring gene expression or for finding the location of DNA-bound proteins. DNA microarrays can suffer from gene-specific dye bias (GSDB), causing some probes to be affected more by the dye than by the sample. This results in large measurement errors, which vary considerably for different probes and also across different hybridizations. GSDB is not corrected by conventional normalization and has been difficult to address systematically because of its variance. We show that GSDB is influenced by label incorporation efficiency, explaining the variation of GSDB across different hybridizations. A correction method (Gene- And Slide-Specific Correction, GASSCO) is presented, whereby sequence-specific corrections are modulated by the overall bias of individual hybridizations. GASSCO outperforms earlier methods and works well on a variety of publically available datasets covering a range of platforms, organisms and applications, including ChIP on chip. A sequence-based model is also presented, which predicts which probes will suffer most from GSDB, useful for microarray probe design and correction of individual hybridizations. Software implementing the method is publicly available.

  13. Adaptable gene-specific dye bias correction for two-channel DNA microarrays

    PubMed Central

    Margaritis, Thanasis; Lijnzaad, Philip; van Leenen, Dik; Bouwmeester, Diane; Kemmeren, Patrick; van Hooff, Sander R; Holstege, Frank CP

    2009-01-01

    DNA microarray technology is a powerful tool for monitoring gene expression or for finding the location of DNA-bound proteins. DNA microarrays can suffer from gene-specific dye bias (GSDB), causing some probes to be affected more by the dye than by the sample. This results in large measurement errors, which vary considerably for different probes and also across different hybridizations. GSDB is not corrected by conventional normalization and has been difficult to address systematically because of its variance. We show that GSDB is influenced by label incorporation efficiency, explaining the variation of GSDB across different hybridizations. A correction method (Gene- And Slide-Specific Correction, GASSCO) is presented, whereby sequence-specific corrections are modulated by the overall bias of individual hybridizations. GASSCO outperforms earlier methods and works well on a variety of publically available datasets covering a range of platforms, organisms and applications, including ChIP on chip. A sequence-based model is also presented, which predicts which probes will suffer most from GSDB, useful for microarray probe design and correction of individual hybridizations. Software implementing the method is publicly available. PMID:19401678

  14. Screening of lipid composition for scalable fabrication of solvent free lipid microarrays

    NASA Astrophysics Data System (ADS)

    Ghazanfari, Lida; Lenhert, Steven

    2016-12-01

    Liquid microdroplet arrays on surfaces are a promising approach to the miniaturization of laboratory processes such as high throughput screening. The fluid nature of these droplets poses unique challenges and opportunities in their fabrication and application, particularly for the scalable integration of multiple materials over large areas and immersion into cell culture solution. Here we use pin spotting and nanointaglio printing to screen a library of lipids and their mixtures for their compatibility with these fabrication processes, as well as stability upon immersion into aqueous solution. More than 200 combinations of natural and synthetic oils composed of fatty acids, triglycerides, and hydrocarbons were tested for their pin-spotting and nanointaglio print quality and their ability to contain the fluorescent compound TRITC upon immersion in water. A combination of castor oil and hexanoic acid at the ratio of 1:1 (w/w) was found optimal for producing reproducible patterns that are stable upon immersion into water. This method is capable of large scale nano-materials integration.

  15. Prenatal chromosomal microarray analysis in fetuses with congenital heart disease: a prospective cohort study.

    PubMed

    Wang, Yan; Cao, Li; Liang, Dong; Meng, Lulu; Wu, Yun; Qiao, Fengchang; Ji, Xiuqing; Luo, Chunyu; Zhang, Jingjing; Xu, Tianhui; Yu, Bin; Wang, Leilei; Wang, Ting; Pan, Qiong; Ma, Dingyuan; Hu, Ping; Xu, Zhengfeng

    2018-02-01

    Currently, chromosomal microarray analysis is considered the first-tier test in pediatric care and prenatal diagnosis. However, the diagnostic yield of chromosomal microarray analysis for prenatal diagnosis of congenital heart disease has not been evaluated based on a large cohort. Our aim was to evaluate the clinical utility of chromosomal microarray as the first-tier test for chromosomal abnormalities in fetuses with congenital heart disease. In this prospective study, 602 prenatal cases of congenital heart disease were investigated using single nucleotide polymorphism array over a 5-year period. Overall, pathogenic chromosomal abnormalities were identified in 125 (20.8%) of 602 prenatal cases of congenital heart disease, with 52.0% of them being numerical chromosomal abnormalities. The detection rates of likely pathogenic copy number variations and variants of uncertain significance were 1.3% and 6.0%, respectively. The detection rate of pathogenic chromosomal abnormalities in congenital heart disease plus additional structural anomalies (48.9% vs 14.3%, P < .0001) or intrauterine growth retardation group (50.0% vs 14.3%, P = .044) was significantly higher than that in isolated congenital heart disease group. Additionally, the detection rate in congenital heart disease with additional structural anomalies group was significantly higher than that in congenital heart disease with soft markers group (48.9% vs 19.8%, P < .0001). No significant difference was observed in the detection rates between congenital heart disease with additional structural anomalies and congenital heart disease with intrauterine growth retardation groups (48.9% vs 50.0%), congenital heart disease with soft markers and congenital heart disease with intrauterine growth retardation groups (19.8% vs 50.0%), or congenital heart disease with soft markers and isolated congenital heart disease groups (19.8% vs 14.3%). The detection rate in fetuses with congenital heart disease plus mild ventriculomegaly was significantly higher than in those with other types of soft markers (50.0% vs 15.6%, P < .05). Our study suggests chromosomal microarray analysis is a reliable and high-resolution technology and should be used as the first-tier test for prenatal diagnosis of congenital heart disease in clinical practice. Copyright © 2017 Elsevier Inc. All rights reserved.

  16. Cross-Study Homogeneity of Psoriasis Gene Expression in Skin across a Large Expression Range

    PubMed Central

    Kerkof, Keith; Timour, Martin; Russell, Christopher B.

    2013-01-01

    Background In psoriasis, only limited overlap between sets of genes identified as differentially expressed (psoriatic lesional vs. psoriatic non-lesional) was found using statistical and fold-change cut-offs. To provide a framework for utilizing prior psoriasis data sets we sought to understand the consistency of those sets. Methodology/Principal Findings Microarray expression profiling and qRT-PCR were used to characterize gene expression in PP and PN skin from psoriasis patients. cDNA (three new data sets) and cRNA hybridization (four existing data sets) data were compared using a common analysis pipeline. Agreement between data sets was assessed using varying qualitative and quantitative cut-offs to generate a DEG list in a source data set and then using other data sets to validate the list. Concordance increased from 67% across all probe sets to over 99% across more than 10,000 probe sets when statistical filters were employed. The fold-change behavior of individual genes tended to be consistent across the multiple data sets. We found that genes with <2-fold change values were quantitatively reproducible between pairs of data-sets. In a subset of transcripts with a role in inflammation changes detected by microarray were confirmed by qRT-PCR with high concordance. For transcripts with both PN and PP levels within the microarray dynamic range, microarray and qRT-PCR were quantitatively reproducible, including minimal fold-changes in IL13, TNFSF11, and TNFRSF11B and genes with >10-fold changes in either direction such as CHRM3, IL12B and IFNG. Conclusions/Significance Gene expression changes in psoriatic lesions were consistent across different studies, despite differences in patient selection, sample handling, and microarray platforms but between-study comparisons showed stronger agreement within than between platforms. We could use cut-offs as low as log10(ratio) = 0.1 (fold-change = 1.26), generating larger gene lists that validate on independent data sets. The reproducibility of PP signatures across data sets suggests that different sample sets can be productively compared. PMID:23308107

  17. PRACTICAL STRATEGIES FOR PROCESSING AND ANALYZING SPOTTED OLIGONUCLEOTIDE MICROARRAY DATA

    EPA Science Inventory

    Thoughtful data analysis is as important as experimental design, biological sample quality, and appropriate experimental procedures for making microarrays a useful supplement to traditional toxicology. In the present study, spotted oligonucleotide microarrays were used to profile...

  18. Micropatterned comet assay enables high throughput and sensitive DNA damage quantification

    PubMed Central

    Ge, Jing; Chow, Danielle N.; Fessler, Jessica L.; Weingeist, David M.; Wood, David K.; Engelward, Bevin P.

    2015-01-01

    The single cell gel electrophoresis assay, also known as the comet assay, is a versatile method for measuring many classes of DNA damage, including base damage, abasic sites, single strand breaks and double strand breaks. However, limited throughput and difficulties with reproducibility have limited its utility, particularly for clinical and epidemiological studies. To address these limitations, we created a microarray comet assay. The use of a micrometer scale array of cells increases the number of analysable comets per square centimetre and enables automated imaging and analysis. In addition, the platform is compatible with standard 24- and 96-well plate formats. Here, we have assessed the consistency and sensitivity of the microarray comet assay. We showed that the linear detection range for H2O2-induced DNA damage in human lymphoblastoid cells is between 30 and 100 μM, and that within this range, inter-sample coefficient of variance was between 5 and 10%. Importantly, only 20 comets were required to detect a statistically significant induction of DNA damage for doses within the linear range. We also evaluated sample-to-sample and experiment-to-experiment variation and found that for both conditions, the coefficient of variation was lower than what has been reported for the traditional comet assay. Finally, we also show that the assay can be performed using a 4× objective (rather than the standard 10× objective for the traditional assay). This adjustment combined with the microarray format makes it possible to capture more than 50 analysable comets in a single image, which can then be automatically analysed using in-house software. Overall, throughput is increased more than 100-fold compared to the traditional assay. Together, the results presented here demonstrate key advances in comet assay technology that improve the throughput, sensitivity, and robustness, thus enabling larger scale clinical and epidemiological studies. PMID:25527723

  19. Micropatterned comet assay enables high throughput and sensitive DNA damage quantification.

    PubMed

    Ge, Jing; Chow, Danielle N; Fessler, Jessica L; Weingeist, David M; Wood, David K; Engelward, Bevin P

    2015-01-01

    The single cell gel electrophoresis assay, also known as the comet assay, is a versatile method for measuring many classes of DNA damage, including base damage, abasic sites, single strand breaks and double strand breaks. However, limited throughput and difficulties with reproducibility have limited its utility, particularly for clinical and epidemiological studies. To address these limitations, we created a microarray comet assay. The use of a micrometer scale array of cells increases the number of analysable comets per square centimetre and enables automated imaging and analysis. In addition, the platform is compatible with standard 24- and 96-well plate formats. Here, we have assessed the consistency and sensitivity of the microarray comet assay. We showed that the linear detection range for H2O2-induced DNA damage in human lymphoblastoid cells is between 30 and 100 μM, and that within this range, inter-sample coefficient of variance was between 5 and 10%. Importantly, only 20 comets were required to detect a statistically significant induction of DNA damage for doses within the linear range. We also evaluated sample-to-sample and experiment-to-experiment variation and found that for both conditions, the coefficient of variation was lower than what has been reported for the traditional comet assay. Finally, we also show that the assay can be performed using a 4× objective (rather than the standard 10× objective for the traditional assay). This adjustment combined with the microarray format makes it possible to capture more than 50 analysable comets in a single image, which can then be automatically analysed using in-house software. Overall, throughput is increased more than 100-fold compared to the traditional assay. Together, the results presented here demonstrate key advances in comet assay technology that improve the throughput, sensitivity, and robustness, thus enabling larger scale clinical and epidemiological studies. © The Author 2014. Published by Oxford University Press on behalf of the Mutagenesis Society. All rights reserved. For permissions, please email: journals.permissions@oup.com.

  20. PhyloChip microarray analysis reveals altered gastrointestinal microbial communities in a rat model of colonic hypersensitivity

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nelson, T.A.; Holmes, S.; Alekseyenko, A.V.

    Irritable bowel syndrome (IBS) is a chronic, episodic gastrointestinal disorder that is prevalent in a significant fraction of western human populations; and changes in the microbiota of the large bowel have been implicated in the pathology of the disease. Using a novel comprehensive, high-density DNA microarray (PhyloChip) we performed a phylogenetic analysis of the microbial community of the large bowel in a rat model in which intracolonic acetic acid in neonates was used to induce long lasting colonic hypersensitivity and decreased stool water content and frequency, representing the equivalent of human constipation-predominant IBS. Our results revealed a significantly increased compositionalmore » difference in the microbial communities in rats with neonatal irritation as compared with controls. Even more striking was the dramatic change in the ratio of Firmicutes relative to Bacteroidetes, where neonatally irritated rats were enriched more with Bacteroidetes and also contained a different composition of species within this phylum. Our study also revealed differences at the level of bacterial families and species. The PhyloChip is a useful and convenient method to study enteric microflora. Further, this rat model system may be a useful experimental platform to study the causes and consequences of changes in microbial community composition associated with IBS.« less

  1. DNA Microarray for Rapid Detection and Identification of Food and Water Borne Bacteria: From Dry to Wet Lab.

    PubMed

    Ranjbar, Reza; Behzadi, Payam; Najafi, Ali; Roudi, Raheleh

    2017-01-01

    A rapid, accurate, flexible and reliable diagnostic method may significantly decrease the costs of diagnosis and treatment. Designing an appropriate microarray chip reduces noises and probable biases in the final result. The aim of this study was to design and construct a DNA Microarray Chip for a rapid detection and identification of 10 important bacterial agents. In the present survey, 10 unique genomic regions relating to 10 pathogenic bacterial agents including Escherichia coli (E.coli), Shigella boydii, Sh.dysenteriae, Sh.flexneri, Sh.sonnei, Salmonella typhi, S.typhimurium, Brucella sp., Legionella pneumophila, and Vibrio cholera were selected for designing specific long oligo microarray probes. For this reason, the in-silico operations including utilization of the NCBI RefSeq database, Servers of PanSeq and Gview, AlleleID 7.7 and Oligo Analyzer 3.1 was done. On the other hand, the in-vitro part of the study comprised stages of robotic microarray chip probe spotting, bacterial DNAs extraction and DNA labeling, hybridization and microarray chip scanning. In wet lab section, different tools and apparatus such as Nexterion® Slide E, Qarray mini spotter, NimbleGen kit, TrayMix TM S4, and Innoscan 710 were used. A DNA microarray chip including 10 long oligo microarray probes was designed and constructed for detection and identification of 10 pathogenic bacteria. The DNA microarray chip was capable to identify all 10 bacterial agents tested simultaneously. The presence of a professional bioinformatician as a probe designer is needed to design appropriate multifunctional microarray probes to increase the accuracy of the outcomes.

  2. Comparison of gene expression microarray data with count-based RNA measurements informs microarray interpretation.

    PubMed

    Richard, Arianne C; Lyons, Paul A; Peters, James E; Biasci, Daniele; Flint, Shaun M; Lee, James C; McKinney, Eoin F; Siegel, Richard M; Smith, Kenneth G C

    2014-08-04

    Although numerous investigations have compared gene expression microarray platforms, preprocessing methods and batch correction algorithms using constructed spike-in or dilution datasets, there remains a paucity of studies examining the properties of microarray data using diverse biological samples. Most microarray experiments seek to identify subtle differences between samples with variable background noise, a scenario poorly represented by constructed datasets. Thus, microarray users lack important information regarding the complexities introduced in real-world experimental settings. The recent development of a multiplexed, digital technology for nucleic acid measurement enables counting of individual RNA molecules without amplification and, for the first time, permits such a study. Using a set of human leukocyte subset RNA samples, we compared previously acquired microarray expression values with RNA molecule counts determined by the nCounter Analysis System (NanoString Technologies) in selected genes. We found that gene measurements across samples correlated well between the two platforms, particularly for high-variance genes, while genes deemed unexpressed by the nCounter generally had both low expression and low variance on the microarray. Confirming previous findings from spike-in and dilution datasets, this "gold-standard" comparison demonstrated signal compression that varied dramatically by expression level and, to a lesser extent, by dataset. Most importantly, examination of three different cell types revealed that noise levels differed across tissues. Microarray measurements generally correlate with relative RNA molecule counts within optimal ranges but suffer from expression-dependent accuracy bias and precision that varies across datasets. We urge microarray users to consider expression-level effects in signal interpretation and to evaluate noise properties in each dataset independently.

  3. Rational design of DNA sequences for nanotechnology, microarrays and molecular computers using Eulerian graphs.

    PubMed

    Pancoska, Petr; Moravek, Zdenek; Moll, Ute M

    2004-01-01

    Nucleic acids are molecules of choice for both established and emerging nanoscale technologies. These technologies benefit from large functional densities of 'DNA processing elements' that can be readily manufactured. To achieve the desired functionality, polynucleotide sequences are currently designed by a process that involves tedious and laborious filtering of potential candidates against a series of requirements and parameters. Here, we present a complete novel methodology for the rapid rational design of large sets of DNA sequences. This method allows for the direct implementation of very complex and detailed requirements for the generated sequences, thus avoiding 'brute force' filtering. At the same time, these sequences have narrow distributions of melting temperatures. The molecular part of the design process can be done without computer assistance, using an efficient 'human engineering' approach by drawing a single blueprint graph that represents all generated sequences. Moreover, the method eliminates the necessity for extensive thermodynamic calculations. Melting temperature can be calculated only once (or not at all). In addition, the isostability of the sequences is independent of the selection of a particular set of thermodynamic parameters. Applications are presented for DNA sequence designs for microarrays, universal microarray zip sequences and electron transfer experiments.

  4. Clustering-based spot segmentation of cDNA microarray images.

    PubMed

    Uslan, Volkan; Bucak, Ihsan Ömür

    2010-01-01

    Microarrays are utilized as that they provide useful information about thousands of gene expressions simultaneously. In this study segmentation step of microarray image processing has been implemented. Clustering-based methods, fuzzy c-means and k-means, have been applied for the segmentation step that separates the spots from the background. The experiments show that fuzzy c-means have segmented spots of the microarray image more accurately than the k-means.

  5. DNA microarrays of baculovirus genomes: differential expression of viral genes in two susceptible insect cell lines.

    PubMed

    Yamagishi, J; Isobe, R; Takebuchi, T; Bando, H

    2003-03-01

    We describe, for the first time, the generation of a viral DNA chip for simultaneous expression measurements of nearly all known open reading frames (ORFs) in the best-studied members of the family Baculoviridae, Autographa californica multiple nucleopolyhedrovirus (AcMNPV) and Bombyx mori nucleopolyhedrovirus (BmNPV). In this study, a viral DNA chip (Ac-BmNPV chip) was fabricated and used to characterize the viral gene expression profile for AcMNPV in different cell types. The viral chip is composed of microarrays of viral DNA prepared by robotic deposition of PCR-amplified viral DNA fragments on glass for ORFs in the NPV genome. Viral gene expression was monitored by hybridization to the DNA fragment microarrays with fluorescently labeled cDNAs prepared from infected Spodoptera frugiperda, Sf9 cells and Trichoplusia ni, TnHigh-Five cells, the latter a major producer of baculovirus and recombinant proteins. A comparison of expression profiles of known ORFs in AcMNPV elucidated six genes (ORF150, p10, pk2, and three late gene expression factor genes lef-3, p35 and lef- 6) the expression of each of which was regulated differently in the two cell lines. Most of these genes are known to be closely involved in the viral life cycle such as in DNA replication, late gene expression and the release of polyhedra from infected cells. These results imply that the differential expression of these viral genes accounts for the differences in viral replication between these two cell lines. Thus, these fabricated microarrays of NPV DNA which allow a rapid analysis of gene expression at the viral genome level should greatly speed the functional analysis of large genomes of NPV.

  6. Interim Report on SNP analysis and forensic microarray probe design for South American hemorrhagic fever viruses, tick-borne encephalitis virus, henipaviruses, Old World Arenaviruses, filoviruses, Crimean-Congo hemorrhagic fever viruses, Rift Valley fever

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jaing, C; Gardner, S

    The goal of this project is to develop forensic genotyping assays for select agent viruses, enhancing the current capabilities for the viral bioforensics and law enforcement community. We used a multipronged approach combining bioinformatics analysis, PCR-enriched samples, microarrays and TaqMan assays to develop high resolution and cost effective genotyping methods for strain level forensic discrimination of viruses. We have leveraged substantial experience and efficiency gained through year 1 on software development, SNP discovery, TaqMan signature design and phylogenetic signature mapping to scale up the development of forensics signatures in year 2. In this report, we have summarized the whole genomemore » wide SNP analysis and microarray probe design for forensics characterization of South American hemorrhagic fever viruses, tick-borne encephalitis viruses and henipaviruses, Old World Arenaviruses, filoviruses, Crimean-Congo hemorrhagic fever virus, Rift Valley fever virus and Japanese encephalitis virus.« less

  7. Large-Scale Mapping and Validation of Escherichia coli Transcriptional Regulation from a Compendium of Expression Profiles

    PubMed Central

    Thaden, Joshua T; Mogno, Ilaria; Wierzbowski, Jamey; Cottarel, Guillaume; Kasif, Simon; Collins, James J; Gardner, Timothy S

    2007-01-01

    Machine learning approaches offer the potential to systematically identify transcriptional regulatory interactions from a compendium of microarray expression profiles. However, experimental validation of the performance of these methods at the genome scale has remained elusive. Here we assess the global performance of four existing classes of inference algorithms using 445 Escherichia coli Affymetrix arrays and 3,216 known E. coli regulatory interactions from RegulonDB. We also developed and applied the context likelihood of relatedness (CLR) algorithm, a novel extension of the relevance networks class of algorithms. CLR demonstrates an average precision gain of 36% relative to the next-best performing algorithm. At a 60% true positive rate, CLR identifies 1,079 regulatory interactions, of which 338 were in the previously known network and 741 were novel predictions. We tested the predicted interactions for three transcription factors with chromatin immunoprecipitation, confirming 21 novel interactions and verifying our RegulonDB-based performance estimates. CLR also identified a regulatory link providing central metabolic control of iron transport, which we confirmed with real-time quantitative PCR. The compendium of expression data compiled in this study, coupled with RegulonDB, provides a valuable model system for further improvement of network inference algorithms using experimental data. PMID:17214507

  8. Identification of bovine leukemia virus tax function associated with host cell transcription, signaling, stress response and immune response pathway by microarray-based gene expression analysis

    PubMed Central

    2012-01-01

    Background Bovine leukemia virus (BLV) is associated with enzootic bovine leukosis and is closely related to human T-cell leukemia virus type I. The Tax protein of BLV is a transcriptional activator of viral replication and a key contributor to oncogenic potential. We previously identified interesting mutant forms of Tax with elevated (TaxD247G) or reduced (TaxS240P) transactivation effects on BLV replication and propagation. However, the effects of these mutations on functions other than transcriptional activation are unknown. In this study, to identify genes that play a role in the cascade of signal events regulated by wild-type and mutant Tax proteins, we used a large-scale host cell gene-profiling approach. Results Using a microarray containing approximately 18,400 human mRNA transcripts, we found several alterations after the expression of Tax proteins in genes involved in many cellular functions such as transcription, signal transduction, cell growth, apoptosis, stress response, and immune response, indicating that Tax protein has multiple biological effects on various cellular environments. We also found that TaxD247G strongly regulated more genes involved in transcription, signal transduction, and cell growth functions, contrary to TaxS240P, which regulated fewer genes. In addition, the expression of genes related to stress response significantly increased in the presence of TaxS240P as compared to wild-type Tax and TaxD247G. By contrast, the largest group of downregulated genes was related to immune response, and the majority of these genes belonged to the interferon family. However, no significant difference in the expression level of downregulated genes was observed among the Tax proteins. Finally, the expression of important cellular factors obtained from the human microarray results were validated at the RNA and protein levels by real-time quantitative reverse transcription-polymerase chain reaction and western blotting, respectively, after transfecting Tax proteins into bovine cells and human HeLa cells. Conclusion A comparative analysis of wild-type and mutant Tax proteins indicates that Tax protein exerts a significant impact on cellular functions as diverse as transcription, signal transduction, cell growth, stress response and immune response. Importantly, our study is the first report that shows the extent to which BLV Tax regulates the innate immune response. PMID:22455445

  9. Introduction to bioinformatics.

    PubMed

    Can, Tolga

    2014-01-01

    Bioinformatics is an interdisciplinary field mainly involving molecular biology and genetics, computer science, mathematics, and statistics. Data intensive, large-scale biological problems are addressed from a computational point of view. The most common problems are modeling biological processes at the molecular level and making inferences from collected data. A bioinformatics solution usually involves the following steps: Collect statistics from biological data. Build a computational model. Solve a computational modeling problem. Test and evaluate a computational algorithm. This chapter gives a brief introduction to bioinformatics by first providing an introduction to biological terminology and then discussing some classical bioinformatics problems organized by the types of data sources. Sequence analysis is the analysis of DNA and protein sequences for clues regarding function and includes subproblems such as identification of homologs, multiple sequence alignment, searching sequence patterns, and evolutionary analyses. Protein structures are three-dimensional data and the associated problems are structure prediction (secondary and tertiary), analysis of protein structures for clues regarding function, and structural alignment. Gene expression data is usually represented as matrices and analysis of microarray data mostly involves statistics analysis, classification, and clustering approaches. Biological networks such as gene regulatory networks, metabolic pathways, and protein-protein interaction networks are usually modeled as graphs and graph theoretic approaches are used to solve associated problems such as construction and analysis of large-scale networks.

  10. Building gene co-expression networks using transcriptomics data for systems biology investigations: Comparison of methods using microarray data

    PubMed Central

    Kadarmideen, Haja N; Watson-haigh, Nathan S

    2012-01-01

    Gene co-expression networks (GCN), built using high-throughput gene expression data are fundamental aspects of systems biology. The main aims of this study were to compare two popular approaches to building and analysing GCN. We use real ovine microarray transcriptomics datasets representing four different treatments with Metyrapone, an inhibitor of cortisol biosynthesis. We conducted several microarray quality control checks before applying GCN methods to filtered datasets. Then we compared the outputs of two methods using connectivity as a criterion, as it measures how well a node (gene) is connected within a network. The two GCN construction methods used were, Weighted Gene Co-expression Network Analysis (WGCNA) and Partial Correlation and Information Theory (PCIT) methods. Nodes were ranked based on their connectivity measures in each of the four different networks created by WGCNA and PCIT and node ranks in two methods were compared to identify those nodes which are highly differentially ranked (HDR). A total of 1,017 HDR nodes were identified across one or more of four networks. We investigated HDR nodes by gene enrichment analyses in relation to their biological relevance to phenotypes. We observed that, in contrast to WGCNA method, PCIT algorithm removes many of the edges of the most highly interconnected nodes. Removal of edges of most highly connected nodes or hub genes will have consequences for downstream analyses and biological interpretations. In general, for large GCN construction (with > 20000 genes) access to large computer clusters, particularly those with larger amounts of shared memory is recommended. PMID:23144540

  11. An Undergraduate Laboratory Exercise to Study the Effect of Darkness on Plant Gene Expression Using DNA Microarray

    ERIC Educational Resources Information Center

    Chang, Ming-Mei; Briggs, George M.

    2007-01-01

    DNA microarrays are microscopic arrays on a solid surface, typically a glass slide, on which DNA oligonucleotides are deposited or synthesized in a high-density matrix with a predetermined spatial order. Several types of DNA microarrays have been developed and used for various biological studies. Here, we developed an undergraduate laboratory…

  12. Extending Immunological Profiling in the Gilthead Sea Bream, Sparus aurata, by Enriched cDNA Library Analysis, Microarray Design and Initial Studies upon the Inflammatory Response to PAMPs.

    PubMed

    Boltaña, Sebastian; Castellana, Barbara; Goetz, Giles; Tort, Lluis; Teles, Mariana; Mulero, Victor; Novoa, Beatriz; Figueras, Antonio; Goetz, Frederick W; Gallardo-Escarate, Cristian; Planas, Josep V; Mackenzie, Simon

    2017-02-03

    This study describes the development and validation of an enriched oligonucleotide-microarray platform for Sparus aurata (SAQ) to provide a platform for transcriptomic studies in this species. A transcriptome database was constructed by assembly of gilthead sea bream sequences derived from public repositories of mRNA together with reads from a large collection of expressed sequence tags (EST) from two extensive targeted cDNA libraries characterizing mRNA transcripts regulated by both bacterial and viral challenge. The developed microarray was further validated by analysing monocyte/macrophage activation profiles after challenge with two Gram-negative bacterial pathogen-associated molecular patterns (PAMPs; lipopolysaccharide (LPS) and peptidoglycan (PGN)). Of the approximately 10,000 EST sequenced, we obtained a total of 6837 EST longer than 100 nt, with 3778 and 3059 EST obtained from the bacterial-primed and from the viral-primed cDNA libraries, respectively. Functional classification of contigs from the bacterial- and viral-primed cDNA libraries by Gene Ontology (GO) showed that the top five represented categories were equally represented in the two libraries: metabolism (approximately 24% of the total number of contigs), carrier proteins/membrane transport (approximately 15%), effectors/modulators and cell communication (approximately 11%), nucleoside, nucleotide and nucleic acid metabolism (approximately 7.5%) and intracellular transducers/signal transduction (approximately 5%). Transcriptome analyses using this enriched oligonucleotide platform identified differential shifts in the response to PGN and LPS in macrophage-like cells, highlighting responsive gene-cassettes tightly related to PAMP host recognition. As observed in other fish species, PGN is a powerful activator of the inflammatory response in S. aurata macrophage-like cells. We have developed and validated an oligonucleotide microarray (SAQ) that provides a platform enriched for the study of gene expression in S. aurata with an emphasis upon immunity and the immune response.

  13. Searching for links in the biotic characteristics and abiotic parameters of nine different biogas plants

    PubMed Central

    Walter, Andreas; Knapp, Brigitte A.; Farbmacher, Theresa; Ebner, Christian; Insam, Heribert; Franke‐Whittle, Ingrid H.

    2012-01-01

    Summary To find links between the biotic characteristics and abiotic process parameters in anaerobic digestion systems, the microbial communities of nine full‐scale biogas plants in South Tyrol (Italy) and Vorarlberg (Austria) were investigated using molecular techniques and the physical and chemical properties were monitored. DNA from sludge samples was subjected to microarray hybridization with the ANAEROCHIP microarray and results indicated that sludge samples grouped into two main clusters, dominated either by Methanosarcina or by Methanosaeta, both aceticlastic methanogens. Hydrogenotrophic methanogens were hardly detected or if detected, gave low hybridization signals. Results obtained using denaturing gradient gel electrophoresis (DGGE) supported the findings of microarray hybridization. Real‐time PCR targeting Methanosarcina and Methanosaeta was conducted to provide quantitative data on the dominating methanogens. Correlation analysis to determine any links between the microbial communities found by microarray analysis, and the physicochemical parameters investigated was conducted. It was shown that the sludge samples dominated by the genus Methanosarcina were positively correlated with higher concentrations of acetate, whereas sludge samples dominated by representatives of the genus Methanosaeta had lower acetate concentrations. No other correlations between biotic characteristics and abiotic parameters were found. Methanogenic communities in each reactor were highly stable and resilient over the whole year. PMID:22950603

  14. Rapid and simultaneous detection of ricin, staphylococcal enterotoxin B and saxitoxin by chemiluminescence-based microarray immunoassay.

    PubMed

    Szkola, A; Linares, E M; Worbs, S; Dorner, B G; Dietrich, R; Märtlbauer, E; Niessner, R; Seidel, M

    2014-11-21

    Simultaneous detection of small and large molecules on microarray immunoassays is a challenge that limits some applications in multiplex analysis. This is the case for biosecurity, where fast, cheap and reliable simultaneous detection of proteotoxins and small toxins is needed. Two highly relevant proteotoxins, ricin (60 kDa) and bacterial toxin staphylococcal enterotoxin B (SEB, 30 kDa) and the small phycotoxin saxitoxin (STX, 0.3 kDa) are potential biological warfare agents and require an analytical tool for simultaneous detection. Proteotoxins are successfully detected by sandwich immunoassays, whereas competitive immunoassays are more suitable for small toxins (<1 kDa). Based on this need, this work provides a novel and efficient solution based on anti-idiotypic antibodies for small molecules to combine both assay principles on one microarray. The biotoxin measurements are performed on a flow-through chemiluminescence microarray platform MCR3 in 18 minutes. The chemiluminescence signal was amplified by using a poly-horseradish peroxidase complex (polyHRP), resulting in low detection limits: 2.9 ± 3.1 μg L(-1) for ricin, 0.1 ± 0.1 μg L(-1) for SEB and 2.3 ± 1.7 μg L(-1) for STX. The developed multiplex system for the three biotoxins is completely novel, relevant in the context of biosecurity and establishes the basis for research on anti-idiotypic antibodies for microarray immunoassays.

  15. Implementation of spectral clustering on microarray data of carcinoma using k-means algorithm

    NASA Astrophysics Data System (ADS)

    Frisca, Bustamam, Alhadi; Siswantining, Titin

    2017-03-01

    Clustering is one of data analysis methods that aims to classify data which have similar characteristics in the same group. Spectral clustering is one of the most popular modern clustering algorithms. As an effective clustering technique, spectral clustering method emerged from the concepts of spectral graph theory. Spectral clustering method needs partitioning algorithm. There are some partitioning methods including PAM, SOM, Fuzzy c-means, and k-means. Based on the research that has been done by Capital and Choudhury in 2013, when using Euclidian distance k-means algorithm provide better accuracy than PAM algorithm. So in this paper we use k-means as our partition algorithm. The major advantage of spectral clustering is in reducing data dimension, especially in this case to reduce the dimension of large microarray dataset. Microarray data is a small-sized chip made of a glass plate containing thousands and even tens of thousands kinds of genes in the DNA fragments derived from doubling cDNA. Application of microarray data is widely used to detect cancer, for the example is carcinoma, in which cancer cells express the abnormalities in his genes. The purpose of this research is to classify the data that have high similarity in the same group and the data that have low similarity in the others. In this research, Carcinoma microarray data using 7457 genes. The result of partitioning using k-means algorithm is two clusters.

  16. MIGS-GPU: Microarray Image Gridding and Segmentation on the GPU.

    PubMed

    Katsigiannis, Stamos; Zacharia, Eleni; Maroulis, Dimitris

    2017-05-01

    Complementary DNA (cDNA) microarray is a powerful tool for simultaneously studying the expression level of thousands of genes. Nevertheless, the analysis of microarray images remains an arduous and challenging task due to the poor quality of the images that often suffer from noise, artifacts, and uneven background. In this study, the MIGS-GPU [Microarray Image Gridding and Segmentation on Graphics Processing Unit (GPU)] software for gridding and segmenting microarray images is presented. MIGS-GPU's computations are performed on the GPU by means of the compute unified device architecture (CUDA) in order to achieve fast performance and increase the utilization of available system resources. Evaluation on both real and synthetic cDNA microarray images showed that MIGS-GPU provides better performance than state-of-the-art alternatives, while the proposed GPU implementation achieves significantly lower computational times compared to the respective CPU approaches. Consequently, MIGS-GPU can be an advantageous and useful tool for biomedical laboratories, offering a user-friendly interface that requires minimum input in order to run.

  17. The MicroArray Quality Control (MAQC) project shows inter- and intraplatform reproducibility of gene expression measurements

    PubMed Central

    2012-01-01

    Over the last decade, the introduction of microarray technology has had a profound impact on gene expression research. The publication of studies with dissimilar or altogether contradictory results, obtained using different microarray platforms to analyze identical RNA samples, has raised concerns about the reliability of this technology. The MicroArray Quality Control (MAQC) project was initiated to address these concerns, as well as other performance and data analysis issues. Expression data on four titration pools from two distinct reference RNA samples were generated at multiple test sites using a variety of microarray-based and alternative technology platforms. Here we describe the experimental design and probe mapping efforts behind the MAQC project. We show intraplatform consistency across test sites as well as a high level of interplatform concordance in terms of genes identified as differentially expressed. This study provides a resource that represents an important first step toward establishing a framework for the use of microarrays in clinical and regulatory settings. PMID:16964229

  18. Women's experiences receiving abnormal prenatal chromosomal microarray testing results.

    PubMed

    Bernhardt, Barbara A; Soucier, Danielle; Hanson, Karen; Savage, Melissa S; Jackson, Laird; Wapner, Ronald J

    2013-02-01

    Genomic microarrays can detect copy-number variants not detectable by conventional cytogenetics. This technology is diffusing rapidly into prenatal settings even though the clinical implications of many copy-number variants are currently unknown. We conducted a qualitative pilot study to explore the experiences of women receiving abnormal results from prenatal microarray testing performed in a research setting. Participants were a subset of women participating in a multicenter prospective study "Prenatal Cytogenetic Diagnosis by Array-based Copy Number Analysis." Telephone interviews were conducted with 23 women receiving abnormal prenatal microarray results. We found that five key elements dominated the experiences of women who had received abnormal prenatal microarray results: an offer too good to pass up, blindsided by the results, uncertainty and unquantifiable risks, need for support, and toxic knowledge. As prenatal microarray testing is increasingly used, uncertain findings will be common, resulting in greater need for careful pre- and posttest counseling, and more education of and resources for providers so they can adequately support the women who are undergoing testing.

  19. The Glycan Microarray Story from Construction to Applications.

    PubMed

    Hyun, Ji Young; Pai, Jaeyoung; Shin, Injae

    2017-04-18

    Not only are glycan-mediated binding processes in cells and organisms essential for a wide range of physiological processes, but they are also implicated in various pathological processes. As a result, elucidation of glycan-associated biomolecular interactions and their consequences is of great importance in basic biological research and biomedical applications. In 2002, we and others were the first to utilize glycan microarrays in efforts aimed at the rapid analysis of glycan-associated recognition events. Because they contain a number of glycans immobilized in a dense and orderly manner on a solid surface, glycan microarrays enable multiple parallel analyses of glycan-protein binding events while utilizing only small amounts of glycan samples. Therefore, this microarray technology has become a leading edge tool in studies aimed at elucidating roles played by glycans and glycan binding proteins in biological systems. In this Account, we summarize our efforts on the construction of glycan microarrays and their applications in studies of glycan-associated interactions. Immobilization strategies of functionalized and unmodified glycans on derivatized glass surfaces are described. Although others have developed immobilization techniques, our efforts have focused on improving the efficiencies and operational simplicity of microarray construction. The microarray-based technology has been most extensively used for rapid analysis of the glycan binding properties of proteins. In addition, glycan microarrays have been employed to determine glycan-protein interactions quantitatively, detect pathogens, and rapidly assess substrate specificities of carbohydrate-processing enzymes. More recently, the microarrays have been employed to identify functional glycans that elicit cell surface lectin-mediated cellular responses. Owing to these efforts, it is now possible to use glycan microarrays to expand the understanding of roles played by glycans and glycan binding proteins in biological systems.

  20. cDNA microarrays as a tool for identification of biomineralization proteins in the coccolithophorid Emiliania huxleyi (Haptophyta).

    PubMed

    Quinn, Patrick; Bowers, Robert M; Zhang, Xiaoyu; Wahlund, Thomas M; Fanelli, Michael A; Olszova, Daniela; Read, Betsy A

    2006-08-01

    Marine unicellular coccolithophore algae produce species-specific calcite scales otherwise known as coccoliths. While the coccoliths and their elaborate architecture have attracted the attention of investigators from various scientific disciplines, our knowledge of the underpinnings of the process of biomineralization in this alga is still in its infancy. The processes of calcification and coccolithogenesis are highly regulated and likely to be complex, requiring coordinated expression of many genes and pathways. In this study, we have employed cDNA microarrays to investigate changes in gene expression associated with biomineralization in the most abundant coccolithophorid, Emiliania huxleyi. Expression profiling of cultures grown under calcifying and noncalcifying conditions has been carried out using cDNA microarrays corresponding to approximately 2,300 expressed sequence tags. A total of 127 significantly up- or down-regulated transcripts were identified using a P value of 0.01 and a change of >2.0-fold. Real-time reverse transcriptase PCR was used to test the overall validity of the microarray data, as well as the relevance of many of the proteins predicted to be associated with biomineralization, including a novel gamma-class carbonic anhydrase (A. R. Soto, H. Zheng, D. Shoemaker, J. Rodriguez, B. A. Read, and T. M. Wahlund, Appl. Environ. Microbiol. 72:5500-5511, 2006). Differentially regulated genes include those related to cellular metabolism, ion channels, transport proteins, vesicular trafficking, and cell signaling. The putative function of the vast majority of candidate transcripts could not be defined. Nonetheless, the data described herein represent profiles of the transcription changes associated with biomineralization-related pathways in E. huxleyi and have identified novel and potentially useful targets for more detailed analysis.

  1. cDNA Microarrays as a Tool for Identification of Biomineralization Proteins in the Coccolithophorid Emiliania huxleyi (Haptophyta)

    PubMed Central

    Quinn, Patrick; Bowers, Robert M.; Zhang, Xiaoyu; Wahlund, Thomas M.; Fanelli, Michael A.; Olszova, Daniela; Read, Betsy A.

    2006-01-01

    Marine unicellular coccolithophore algae produce species-specific calcite scales otherwise known as coccoliths. While the coccoliths and their elaborate architecture have attracted the attention of investigators from various scientific disciplines, our knowledge of the underpinnings of the process of biomineralization in this alga is still in its infancy. The processes of calcification and coccolithogenesis are highly regulated and likely to be complex, requiring coordinated expression of many genes and pathways. In this study, we have employed cDNA microarrays to investigate changes in gene expression associated with biomineralization in the most abundant coccolithophorid, Emiliania huxleyi. Expression profiling of cultures grown under calcifying and noncalcifying conditions has been carried out using cDNA microarrays corresponding to approximately 2,300 expressed sequence tags. A total of 127 significantly up- or down-regulated transcripts were identified using a P value of 0.01 and a change of >2.0-fold. Real-time reverse transcriptase PCR was used to test the overall validity of the microarray data, as well as the relevance of many of the proteins predicted to be associated with biomineralization, including a novel gamma-class carbonic anhydrase (A. R. Soto, H. Zheng, D. Shoemaker, J. Rodriguez, B. A. Read, and T. M. Wahlund, Appl. Environ. Microbiol. 72:5500-5511, 2006). Differentially regulated genes include those related to cellular metabolism, ion channels, transport proteins, vesicular trafficking, and cell signaling. The putative function of the vast majority of candidate transcripts could not be defined. Nonetheless, the data described herein represent profiles of the transcription changes associated with biomineralization-related pathways in E. huxleyi and have identified novel and potentially useful targets for more detailed analysis. PMID:16885305

  2. Empirical evaluation of data normalization methods for molecular classification.

    PubMed

    Huang, Huei-Chung; Qin, Li-Xuan

    2018-01-01

    Data artifacts due to variations in experimental handling are ubiquitous in microarray studies, and they can lead to biased and irreproducible findings. A popular approach to correct for such artifacts is through post hoc data adjustment such as data normalization. Statistical methods for data normalization have been developed and evaluated primarily for the discovery of individual molecular biomarkers. Their performance has rarely been studied for the development of multi-marker molecular classifiers-an increasingly important application of microarrays in the era of personalized medicine. In this study, we set out to evaluate the performance of three commonly used methods for data normalization in the context of molecular classification, using extensive simulations based on re-sampling from a unique pair of microRNA microarray datasets for the same set of samples. The data and code for our simulations are freely available as R packages at GitHub. In the presence of confounding handling effects, all three normalization methods tended to improve the accuracy of the classifier when evaluated in an independent test data. The level of improvement and the relative performance among the normalization methods depended on the relative level of molecular signal, the distributional pattern of handling effects (e.g., location shift vs scale change), and the statistical method used for building the classifier. In addition, cross-validation was associated with biased estimation of classification accuracy in the over-optimistic direction for all three normalization methods. Normalization may improve the accuracy of molecular classification for data with confounding handling effects; however, it cannot circumvent the over-optimistic findings associated with cross-validation for assessing classification accuracy.

  3. The MicroArray Quality Control (MAQC) project shows inter- and intraplatform reproducibility of gene expression measurements

    EPA Science Inventory

    Over the last decade, the introduction of microarray technology has had a profound impact on gene expression research. The publication of studies with dissimilar or altogether contradictory results, obtained using different microarray platforms to analyze identical RNA samples, ...

  4. Combined gene expression analysis of whole-tissue and microdissected pancreatic ductal adenocarcinoma identifies genes specifically overexpressed in tumor epithelia.

    PubMed

    Badea, Liviu; Herlea, Vlad; Dima, Simona Olimpia; Dumitrascu, Traian; Popescu, Irinel

    2008-01-01

    The precise details of pancreatic ductal adenocarcinoma (PDAC) pathogenesis are still insufficiently known, requiring the use of high-throughput methods. However, PDAC is especially difficult to study using microarrays due to its strong desmoplastic reaction, which involves a hyperproliferating stroma that effectively "masks" the contribution of the minoritary neoplastic epithelial cells. Thus it is not clear which of the genes that have been found differentially expressed between normal and whole tumor tissues are due to the tumor epithelia and which simply reflect the differences in cellular composition. To address this problem, laser microdissection studies have been performed, but these have to deal with much smaller tissue sample quantities and therefore have significantly higher experimental noise. In this paper we combine our own large sample whole-tissue study with a previously published smaller sample microdissection study by Grützmann et al. to identify the genes that are specifically overexpressed in PDAC tumor epithelia. The overlap of this list of genes with other microarray studies of pancreatic cancer as well as with the published literature is impressive. Moreover, we find a number of genes whose over-expression appears to be inversely correlated with patient survival: keratin 7, laminin gamma 2, stratifin, platelet phosphofructokinase, annexin A2, MAP4K4 and OACT2 (MBOAT2), which are all specifically upregulated in the neoplastic epithelia, rather than the tumor stroma. We improve on other microarray studies of PDAC by putting together the higher statistical power due to a larger number of samples with information about cell-type specific expression and patient survival.

  5. Bioinformatics and Microarray Data Analysis on the Cloud.

    PubMed

    Calabrese, Barbara; Cannataro, Mario

    2016-01-01

    High-throughput platforms such as microarray, mass spectrometry, and next-generation sequencing are producing an increasing volume of omics data that needs large data storage and computing power. Cloud computing offers massive scalable computing and storage, data sharing, on-demand anytime and anywhere access to resources and applications, and thus, it may represent the key technology for facing those issues. In fact, in the recent years it has been adopted for the deployment of different bioinformatics solutions and services both in academia and in the industry. Although this, cloud computing presents several issues regarding the security and privacy of data, that are particularly important when analyzing patients data, such as in personalized medicine. This chapter reviews main academic and industrial cloud-based bioinformatics solutions; with a special focus on microarray data analysis solutions and underlines main issues and problems related to the use of such platforms for the storage and analysis of patients data.

  6. Flow-pattern Guided Fabrication of High-density Barcode Antibody Microarray

    PubMed Central

    Ramirez, Lisa S.; Wang, Jun

    2016-01-01

    Antibody microarray as a well-developed technology is currently challenged by a few other established or emerging high-throughput technologies. In this report, we renovate the antibody microarray technology by using a novel approach for manufacturing and by introducing new features. The fabrication of our high-density antibody microarray is accomplished through perpendicularly oriented flow-patterning of single stranded DNAs and subsequent conversion mediated by DNA-antibody conjugates. This protocol outlines the critical steps in flow-patterning DNA, producing and purifying DNA-antibody conjugates, and assessing the quality of the fabricated microarray. The uniformity and sensitivity are comparable with conventional microarrays, while our microarray fabrication does not require the assistance of an array printer and can be performed in most research laboratories. The other major advantage is that the size of our microarray units is 10 times smaller than that of printed arrays, offering the unique capability of analyzing functional proteins from single cells when interfacing with generic microchip designs. This barcode technology can be widely employed in biomarker detection, cell signaling studies, tissue engineering, and a variety of clinical applications. PMID:26780370

  7. DNA microarrays and their use in dermatology.

    PubMed

    Mlakar, Vid; Glavac, Damjan

    2007-03-01

    Multiple different DNA microarray technologies are available on the market today. They can be used for studying either DNA or RNA with the purpose of identifying and explaining the role of genes involved in different processes. This paper reviews different DNA microarray platforms available for such studies and their usage in cases of malignant melanomas, psoriasis, and exposure of keratinocytes and melanocytes to UV illumination.

  8. Quantitative phenotyping via deep barcode sequencing

    PubMed Central

    Smith, Andrew M.; Heisler, Lawrence E.; Mellor, Joseph; Kaper, Fiona; Thompson, Michael J.; Chee, Mark; Roth, Frederick P.; Giaever, Guri; Nislow, Corey

    2009-01-01

    Next-generation DNA sequencing technologies have revolutionized diverse genomics applications, including de novo genome sequencing, SNP detection, chromatin immunoprecipitation, and transcriptome analysis. Here we apply deep sequencing to genome-scale fitness profiling to evaluate yeast strain collections in parallel. This method, Barcode analysis by Sequencing, or “Bar-seq,” outperforms the current benchmark barcode microarray assay in terms of both dynamic range and throughput. When applied to a complex chemogenomic assay, Bar-seq quantitatively identifies drug targets, with performance superior to the benchmark microarray assay. We also show that Bar-seq is well-suited for a multiplex format. We completely re-sequenced and re-annotated the yeast deletion collection using deep sequencing, found that ∼20% of the barcodes and common priming sequences varied from expectation, and used this revised list of barcode sequences to improve data quality. Together, this new assay and analysis routine provide a deep-sequencing-based toolkit for identifying gene–environment interactions on a genome-wide scale. PMID:19622793

  9. MICROARRAY QUALITY CONTROL PROJECT: A COMPREHENSIVE GENE EXPRESSION TECHNOLOGY SURVEY DEMONSTRATES MEASURABLE CONSISTENCY AND CONCORDANT RESULTS BETWEEN PLATFORMS

    EPA Science Inventory

    Over the last decade, the introduction of microarray technology has had a profound impact on gene expression research. The publication of studies with dissimilar or altogether contradictory results, obtained using different microarray platforms to analyze identical RNA samples, h...

  10. Comparative Analysis of CNV Calling Algorithms: Literature Survey and a Case Study Using Bovine High-Density SNP Data.

    PubMed

    Xu, Lingyang; Hou, Yali; Bickhart, Derek M; Song, Jiuzhou; Liu, George E

    2013-06-25

    Copy number variations (CNVs) are gains and losses of genomic sequence between two individuals of a species when compared to a reference genome. The data from single nucleotide polymorphism (SNP) microarrays are now routinely used for genotyping, but they also can be utilized for copy number detection. Substantial progress has been made in array design and CNV calling algorithms and at least 10 comparison studies in humans have been published to assess them. In this review, we first survey the literature on existing microarray platforms and CNV calling algorithms. We then examine a number of CNV calling tools to evaluate their impacts using bovine high-density SNP data. Large incongruities in the results from different CNV calling tools highlight the need for standardizing array data collection, quality assessment and experimental validation. Only after careful experimental design and rigorous data filtering can the impacts of CNVs on both normal phenotypic variability and disease susceptibility be fully revealed.

  11. Meta-review of protein network regulating obesity between validated obesity candidate genes in the white adipose tissue of high-fat diet-induced obese C57BL/6J mice.

    PubMed

    Kim, Eunjung; Kim, Eun Jung; Seo, Seung-Won; Hur, Cheol-Goo; McGregor, Robin A; Choi, Myung-Sook

    2014-01-01

    Worldwide obesity and related comorbidities are increasing, but identifying new therapeutic targets remains a challenge. A plethora of microarray studies in diet-induced obesity models has provided large datasets of obesity associated genes. In this review, we describe an approach to examine the underlying molecular network regulating obesity, and we discuss interactions between obesity candidate genes. We conducted network analysis on functional protein-protein interactions associated with 25 obesity candidate genes identified in a literature-driven approach based on published microarray studies of diet-induced obesity. The obesity candidate genes were closely associated with lipid metabolism and inflammation. Peroxisome proliferator activated receptor gamma (Pparg) appeared to be a core obesity gene, and obesity candidate genes were highly interconnected, suggesting a coordinately regulated molecular network in adipose tissue. In conclusion, the current network analysis approach may help elucidate the underlying molecular network regulating obesity and identify anti-obesity targets for therapeutic intervention.

  12. A novel universal DNA labeling and amplification system for rapid microarray-based detection of 117 antibiotic resistance genes in Gram-positive bacteria.

    PubMed

    Strauss, Christian; Endimiani, Andrea; Perreten, Vincent

    2015-01-01

    A rapid and simple DNA labeling system has been developed for disposable microarrays and has been validated for the detection of 117 antibiotic resistance genes abundant in Gram-positive bacteria. The DNA was fragmented and amplified using phi-29 polymerase and random primers with linkers. Labeling and further amplification were then performed by classic PCR amplification using biotinylated primers specific for the linkers. The microarray developed by Perreten et al. (Perreten, V., Vorlet-Fawer, L., Slickers, P., Ehricht, R., Kuhnert, P., Frey, J., 2005. Microarray-based detection of 90 antibiotic resistance genes of gram-positive bacteria. J.Clin.Microbiol. 43, 2291-2302.) was improved by additional oligonucleotides. A total of 244 oligonucleotides (26 to 37 nucleotide length and with similar melting temperatures) were spotted on the microarray, including genes conferring resistance to clinically important antibiotic classes like β-lactams, macrolides, aminoglycosides, glycopeptides and tetracyclines. Each antibiotic resistance gene is represented by at least 2 oligonucleotides designed from consensus sequences of gene families. The specificity of the oligonucleotides and the quality of the amplification and labeling were verified by analysis of a collection of 65 strains belonging to 24 species. Association between genotype and phenotype was verified for 6 antibiotics using 77 Staphylococcus strains belonging to different species and revealed 95% test specificity and a 93% predictive value of a positive test. The DNA labeling and amplification is independent of the species and of the target genes and could be used for different types of microarrays. This system has also the advantage to detect several genes within one bacterium at once, like in Staphylococcus aureus strain BM3318, in which up to 15 genes were detected. This new microarray-based detection system offers a large potential for applications in clinical diagnostic, basic research, food safety and surveillance programs for antimicrobial resistance. Copyright © 2014 Elsevier B.V. All rights reserved.

  13. A microarray of ubiquitylated proteins for profiling deubiquitylase activity reveals the critical roles of both chain and substrate.

    PubMed

    Loch, Christian M; Strickler, James E

    2012-11-01

    Substrate ubiquitylation is a reversible process critical to cellular homeostasis that is often dysregulated in many human pathologies including cancer and neurodegeneration. Elucidating the mechanistic details of this pathway could unlock a large store of information useful to the design of diagnostic and therapeutic interventions. Proteomic approaches to the questions at hand have generally utilized mass spectrometry (MS), which has been successful in identifying both ubiquitylation substrates and profiling pan-cellular chain linkages, but is generally unable to connect the two. Interacting partners of the deubiquitylating enzymes (DUBs) have also been reported by MS, although substrates of catalytically competent DUBs generally cannot be. Where they have been used towards the study of ubiquitylation, protein microarrays have usually functioned as platforms for the identification of substrates for specific E3 ubiquitin ligases. Here, we report on the first use of protein microarrays to identify substrates of DUBs, and in so doing demonstrate the first example of microarray proteomics involving multiple (i.e., distinct, sequential and opposing) enzymatic activities. This technique demonstrates the selectivity of DUBs for both substrate and type (mono- versus poly-) of ubiquitylation. This work shows that the vast majority of DUBs are monoubiquitylated in vitro, and are incapable of removing this modification from themselves. This work also underscores the critical role of utilizing both ubiquitin chains and substrates when attempting to characterize DUBs. This article is part of a Special Issue entitled: Ubiquitin Drug Discovery and Diagnostics. Copyright © 2012 Elsevier B.V. All rights reserved.

  14. Multiclassifier information fusion methods for microarray pattern recognition

    NASA Astrophysics Data System (ADS)

    Braun, Jerome J.; Glina, Yan; Judson, Nicholas; Herzig-Marx, Rachel

    2004-04-01

    This paper addresses automatic recognition of microarray patterns, a capability that could have a major significance for medical diagnostics, enabling development of diagnostic tools for automatic discrimination of specific diseases. The paper presents multiclassifier information fusion methods for microarray pattern recognition. The input space partitioning approach based on fitness measures that constitute an a-priori gauging of classification efficacy for each subspace is investigated. Methods for generation of fitness measures, generation of input subspaces and their use in the multiclassifier fusion architecture are presented. In particular, two-level quantification of fitness that accounts for the quality of each subspace as well as the quality of individual neighborhoods within the subspace is described. Individual-subspace classifiers are Support Vector Machine based. The decision fusion stage fuses the information from mulitple SVMs along with the multi-level fitness information. Final decision fusion stage techniques, including weighted fusion as well as Dempster-Shafer theory based fusion are investigated. It should be noted that while the above methods are discussed in the context of microarray pattern recognition, they are applicable to a broader range of discrimination problems, in particular to problems involving a large number of information sources irreducible to a low-dimensional feature space.

  15. Design and evaluation of Actichip, a thematic microarray for the study of the actin cytoskeleton

    PubMed Central

    Muller, Jean; Mehlen, André; Vetter, Guillaume; Yatskou, Mikalai; Muller, Arnaud; Chalmel, Frédéric; Poch, Olivier; Friederich, Evelyne; Vallar, Laurent

    2007-01-01

    Background The actin cytoskeleton plays a crucial role in supporting and regulating numerous cellular processes. Mutations or alterations in the expression levels affecting the actin cytoskeleton system or related regulatory mechanisms are often associated with complex diseases such as cancer. Understanding how qualitative or quantitative changes in expression of the set of actin cytoskeleton genes are integrated to control actin dynamics and organisation is currently a challenge and should provide insights in identifying potential targets for drug discovery. Here we report the development of a dedicated microarray, the Actichip, containing 60-mer oligonucleotide probes for 327 genes selected for transcriptome analysis of the human actin cytoskeleton. Results Genomic data and sequence analysis features were retrieved from GenBank and stored in an integrative database called Actinome. From these data, probes were designed using a home-made program (CADO4MI) allowing sequence refinement and improved probe specificity by combining the complementary information recovered from the UniGene and RefSeq databases. Actichip performance was analysed by hybridisation with RNAs extracted from epithelial MCF-7 cells and human skeletal muscle. Using thoroughly standardised procedures, we obtained microarray images with excellent quality resulting in high data reproducibility. Actichip displayed a large dynamic range extending over three logs with a limit of sensitivity between one and ten copies of transcript per cell. The array allowed accurate detection of small changes in gene expression and reliable classification of samples based on the expression profiles of tissue-specific genes. When compared to two other oligonucleotide microarray platforms, Actichip showed similar sensitivity and concordant expression ratios. Moreover, Actichip was able to discriminate the highly similar actin isoforms whereas the two other platforms did not. Conclusion Our data demonstrate that Actichip is a powerful alternative to commercial high density microarrays for cytoskeleton gene profiling in normal or pathological samples. Actichip is available upon request. PMID:17727702

  16. Design of microarray experiments for genetical genomics studies.

    PubMed

    Bueno Filho, Júlio S S; Gilmour, Steven G; Rosa, Guilherme J M

    2006-10-01

    Microarray experiments have been used recently in genetical genomics studies, as an additional tool to understand the genetic mechanisms governing variation in complex traits, such as for estimating heritabilities of mRNA transcript abundances, for mapping expression quantitative trait loci, and for inferring regulatory networks controlling gene expression. Several articles on the design of microarray experiments discuss situations in which treatment effects are assumed fixed and without any structure. In the case of two-color microarray platforms, several authors have studied reference and circular designs. Here, we discuss the optimal design of microarray experiments whose goals refer to specific genetic questions. Some examples are used to illustrate the choice of a design for comparing fixed, structured treatments, such as genotypic groups. Experiments targeting single genes or chromosomic regions (such as with transgene research) or multiple epistatic loci (such as within a selective phenotyping context) are discussed. In addition, microarray experiments in which treatments refer to families or to subjects (within family structures or complex pedigrees) are presented. In these cases treatments are more appropriately considered to be random effects, with specific covariance structures, in which the genetic goals relate to the estimation of genetic variances and the heritability of transcriptional abundances.

  17. SIMULATION AND VISUALIZATION OF FLOW PATTERN IN MICROARRAYS FOR LIQUID PHASE OLIGONUCLEOTIDE AND PEPTIDE SYNTHESIS

    PubMed Central

    O-Charoen, Sirimon; Srivannavit, Onnop; Gulari, Erdogan

    2008-01-01

    Microfluidic microarrays have been developed for economical and rapid parallel synthesis of oligonucleotide and peptide libraries. For a synthesis system to be reproducible and uniform, it is crucial to have a uniform reagent delivery throughout the system. Computational fluid dynamics (CFD) is used to model and simulate the microfluidic microarrays to study geometrical effects on flow patterns. By proper design geometry, flow uniformity could be obtained in every microreactor in the microarrays. PMID:17480053

  18. The application of DNA microarrays in gene expression analysis.

    PubMed

    van Hal, N L; Vorst, O; van Houwelingen, A M; Kok, E J; Peijnenburg, A; Aharoni, A; van Tunen, A J; Keijer, J

    2000-03-31

    DNA microarray technology is a new and powerful technology that will substantially increase the speed of molecular biological research. This paper gives a survey of DNA microarray technology and its use in gene expression studies. The technical aspects and their potential improvements are discussed. These comprise array manufacturing and design, array hybridisation, scanning, and data handling. Furthermore, it is discussed how DNA microarrays can be applied in the working fields of: safety, functionality and health of food and gene discovery and pathway engineering in plants.

  19. Fully Automated Complementary DNA Microarray Segmentation using a Novel Fuzzy-based Algorithm.

    PubMed

    Saberkari, Hamidreza; Bahrami, Sheyda; Shamsi, Mousa; Amoshahy, Mohammad Javad; Ghavifekr, Habib Badri; Sedaaghi, Mohammad Hossein

    2015-01-01

    DNA microarray is a powerful approach to study simultaneously, the expression of 1000 of genes in a single experiment. The average value of the fluorescent intensity could be calculated in a microarray experiment. The calculated intensity values are very close in amount to the levels of expression of a particular gene. However, determining the appropriate position of every spot in microarray images is a main challenge, which leads to the accurate classification of normal and abnormal (cancer) cells. In this paper, first a preprocessing approach is performed to eliminate the noise and artifacts available in microarray cells using the nonlinear anisotropic diffusion filtering method. Then, the coordinate center of each spot is positioned utilizing the mathematical morphology operations. Finally, the position of each spot is exactly determined through applying a novel hybrid model based on the principle component analysis and the spatial fuzzy c-means clustering (SFCM) algorithm. Using a Gaussian kernel in SFCM algorithm will lead to improving the quality in complementary DNA microarray segmentation. The performance of the proposed algorithm has been evaluated on the real microarray images, which is available in Stanford Microarray Databases. Results illustrate that the accuracy of microarray cells segmentation in the proposed algorithm reaches to 100% and 98% for noiseless/noisy cells, respectively.

  20. A microarray-based genotyping and genetic mapping approach for highly heterozygous outcrossing species enables localization of a large fraction of the unassembled Populus trichocarpa genome sequence.

    PubMed

    Drost, Derek R; Novaes, Evandro; Boaventura-Novaes, Carolina; Benedict, Catherine I; Brown, Ryan S; Yin, Tongming; Tuskan, Gerald A; Kirst, Matias

    2009-06-01

    Microarrays have demonstrated significant power for genome-wide analyses of gene expression, and recently have also revolutionized the genetic analysis of segregating populations by genotyping thousands of loci in a single assay. Although microarray-based genotyping approaches have been successfully applied in yeast and several inbred plant species, their power has not been proven in an outcrossing species with extensive genetic diversity. Here we have developed methods for high-throughput microarray-based genotyping in such species using a pseudo-backcross progeny of 154 individuals of Populus trichocarpa and P. deltoides analyzed with long-oligonucleotide in situ-synthesized microarray probes. Our analysis resulted in high-confidence genotypes for 719 single-feature polymorphism (SFP) and 1014 gene expression marker (GEM) candidates. Using these genotypes and an established microsatellite (SSR) framework map, we produced a high-density genetic map comprising over 600 SFPs, GEMs and SSRs. The abundance of gene-based markers allowed us to localize over 35 million base pairs of previously unplaced whole-genome shotgun (WGS) scaffold sequence to putative locations in the genome of P. trichocarpa. A high proportion of sampled scaffolds could be verified for their placement with independently mapped SSRs, demonstrating the previously un-utilized power that high-density genotyping can provide in the context of map-based WGS sequence reassembly. Our results provide a substantial contribution to the continued improvement of the Populus genome assembly, while demonstrating the feasibility of microarray-based genotyping in a highly heterozygous population. The strategies presented are applicable to genetic mapping efforts in all plant species with similarly high levels of genetic diversity.

  1. Reuse of imputed data in microarray analysis increases imputation efficiency

    PubMed Central

    Kim, Ki-Yeol; Kim, Byoung-Jin; Yi, Gwan-Su

    2004-01-01

    Background The imputation of missing values is necessary for the efficient use of DNA microarray data, because many clustering algorithms and some statistical analysis require a complete data set. A few imputation methods for DNA microarray data have been introduced, but the efficiency of the methods was low and the validity of imputed values in these methods had not been fully checked. Results We developed a new cluster-based imputation method called sequential K-nearest neighbor (SKNN) method. This imputes the missing values sequentially from the gene having least missing values, and uses the imputed values for the later imputation. Although it uses the imputed values, the efficiency of this new method is greatly improved in its accuracy and computational complexity over the conventional KNN-based method and other methods based on maximum likelihood estimation. The performance of SKNN was in particular higher than other imputation methods for the data with high missing rates and large number of experiments. Application of Expectation Maximization (EM) to the SKNN method improved the accuracy, but increased computational time proportional to the number of iterations. The Multiple Imputation (MI) method, which is well known but not applied previously to microarray data, showed a similarly high accuracy as the SKNN method, with slightly higher dependency on the types of data sets. Conclusions Sequential reuse of imputed data in KNN-based imputation greatly increases the efficiency of imputation. The SKNN method should be practically useful to save the data of some microarray experiments which have high amounts of missing entries. The SKNN method generates reliable imputed values which can be used for further cluster-based analysis of microarray data. PMID:15504240

  2. Direct labeling of serum proteins by fluorescent dye for antibody microarray.

    PubMed

    Klimushina, M V; Gumanova, N G; Metelskaya, V A

    2017-05-06

    Analysis of serum proteome by antibody microarray is used to identify novel biomarkers and to study signaling pathways including protein phosphorylation and protein-protein interactions. Labeling of serum proteins is important for optimal performance of the antibody microarray. Proper choice of fluorescent label and optimal concentration of protein loaded on the microarray ensure good quality of imaging that can be reliably scanned and processed by the software. We have optimized direct serum protein labeling using fluorescent dye Arrayit Green 540 (Arrayit Corporation, USA) for antibody microarray. Optimized procedure produces high quality images that can be readily scanned and used for statistical analysis of protein composition of the serum. Copyright © 2017 Elsevier Inc. All rights reserved.

  3. arrayCGHbase: an analysis platform for comparative genomic hybridization microarrays

    PubMed Central

    Menten, Björn; Pattyn, Filip; De Preter, Katleen; Robbrecht, Piet; Michels, Evi; Buysse, Karen; Mortier, Geert; De Paepe, Anne; van Vooren, Steven; Vermeesch, Joris; Moreau, Yves; De Moor, Bart; Vermeulen, Stefan; Speleman, Frank; Vandesompele, Jo

    2005-01-01

    Background The availability of the human genome sequence as well as the large number of physically accessible oligonucleotides, cDNA, and BAC clones across the entire genome has triggered and accelerated the use of several platforms for analysis of DNA copy number changes, amongst others microarray comparative genomic hybridization (arrayCGH). One of the challenges inherent to this new technology is the management and analysis of large numbers of data points generated in each individual experiment. Results We have developed arrayCGHbase, a comprehensive analysis platform for arrayCGH experiments consisting of a MIAME (Minimal Information About a Microarray Experiment) supportive database using MySQL underlying a data mining web tool, to store, analyze, interpret, compare, and visualize arrayCGH results in a uniform and user-friendly format. Following its flexible design, arrayCGHbase is compatible with all existing and forthcoming arrayCGH platforms. Data can be exported in a multitude of formats, including BED files to map copy number information on the genome using the Ensembl or UCSC genome browser. Conclusion ArrayCGHbase is a web based and platform independent arrayCGH data analysis tool, that allows users to access the analysis suite through the internet or a local intranet after installation on a private server. ArrayCGHbase is available at . PMID:15910681

  4. GOTree Machine (GOTM): a web-based platform for interpreting sets of interesting genes using Gene Ontology hierarchies

    PubMed Central

    Zhang, Bing; Schmoyer, Denise; Kirov, Stefan; Snoddy, Jay

    2004-01-01

    Background Microarray and other high-throughput technologies are producing large sets of interesting genes that are difficult to analyze directly. Bioinformatics tools are needed to interpret the functional information in the gene sets. Results We have created a web-based tool for data analysis and data visualization for sets of genes called GOTree Machine (GOTM). This tool was originally intended to analyze sets of co-regulated genes identified from microarray analysis but is adaptable for use with other gene sets from other high-throughput analyses. GOTree Machine generates a GOTree, a tree-like structure to navigate the Gene Ontology Directed Acyclic Graph for input gene sets. This system provides user friendly data navigation and visualization. Statistical analysis helps users to identify the most important Gene Ontology categories for the input gene sets and suggests biological areas that warrant further study. GOTree Machine is available online at . Conclusion GOTree Machine has a broad application in functional genomic, proteomic and other high-throughput methods that generate large sets of interesting genes; its primary purpose is to help users sort for interesting patterns in gene sets. PMID:14975175

  5. Non-Small-Cell Lung Cancer Molecular Signatures Recapitulate Lung Developmental Pathways

    PubMed Central

    Borczuk, Alain C.; Gorenstein, Lyall; Walter, Kristin L.; Assaad, Adel A.; Wang, Liqun; Powell, Charles A.

    2003-01-01

    Current paradigms hold that lung carcinomas arise from pleuripotent stem cells capable of differentiation into one or several histological types. These paradigms suggest lung tumor cell ontogeny is determined by consequences of gene expression that recapitulate events important in embryonic lung development. Using oligonucleotide microarrays, we acquired gene profiles from 32 microdissected non-small-cell lung tumors. We determined the 100 top-ranked marker genes for adenocarcinoma, squamous cell, large cell, and carcinoid using nearest neighbor analysis. Results were validated by immunostaining for 11 selected proteins using a tissue microarray representing 80 tumors. Gene expression data of lung development were accessed from a publicly available dataset generated with the murine Mu11k genome microarray. Self-organized mapping identified two temporally distinct clusters of murine orthologues. Supervised clustering of lung development data showed large-cell carcinoma gene orthologues were in a cluster expressed in pseudoglandular and canalicular stages whereas adenocarcinoma homologues were predominantly in a cluster expressed later in the terminal sac and alveolar stages of murine lung development. Representative large-cell genes (E2F3, MYBL2, HDAC2, CDK4, PCNA) are expressed in the nucleus and are associated with cell cycle and proliferation. In contrast, adenocarcinoma genes are associated with lung-specific transcription pathways (SFTPB, TTF-1), cell adhesion, and signal transduction. In sum, non-small-cell lung tumors histology gene profiles suggest mechanisms relevant to ontogeny and clinical course. Adenocarcinoma genes are associated with differentiation and glandular formation whereas large-cell genes are associated with proliferation and differentiation arrest. The identification of developmentally regulated pathways active in tumorigenesis provides insights into lung carcinogenesis and suggests early steps may differ according to the eventual tumor morphology. PMID:14578194

  6. Comparison of RNA-seq and microarray-based models for clinical endpoint prediction.

    PubMed

    Zhang, Wenqian; Yu, Ying; Hertwig, Falk; Thierry-Mieg, Jean; Zhang, Wenwei; Thierry-Mieg, Danielle; Wang, Jian; Furlanello, Cesare; Devanarayan, Viswanath; Cheng, Jie; Deng, Youping; Hero, Barbara; Hong, Huixiao; Jia, Meiwen; Li, Li; Lin, Simon M; Nikolsky, Yuri; Oberthuer, André; Qing, Tao; Su, Zhenqiang; Volland, Ruth; Wang, Charles; Wang, May D; Ai, Junmei; Albanese, Davide; Asgharzadeh, Shahab; Avigad, Smadar; Bao, Wenjun; Bessarabova, Marina; Brilliant, Murray H; Brors, Benedikt; Chierici, Marco; Chu, Tzu-Ming; Zhang, Jibin; Grundy, Richard G; He, Min Max; Hebbring, Scott; Kaufman, Howard L; Lababidi, Samir; Lancashire, Lee J; Li, Yan; Lu, Xin X; Luo, Heng; Ma, Xiwen; Ning, Baitang; Noguera, Rosa; Peifer, Martin; Phan, John H; Roels, Frederik; Rosswog, Carolina; Shao, Susan; Shen, Jie; Theissen, Jessica; Tonini, Gian Paolo; Vandesompele, Jo; Wu, Po-Yen; Xiao, Wenzhong; Xu, Joshua; Xu, Weihong; Xuan, Jiekun; Yang, Yong; Ye, Zhan; Dong, Zirui; Zhang, Ke K; Yin, Ye; Zhao, Chen; Zheng, Yuanting; Wolfinger, Russell D; Shi, Tieliu; Malkas, Linda H; Berthold, Frank; Wang, Jun; Tong, Weida; Shi, Leming; Peng, Zhiyu; Fischer, Matthias

    2015-06-25

    Gene expression profiling is being widely applied in cancer research to identify biomarkers for clinical endpoint prediction. Since RNA-seq provides a powerful tool for transcriptome-based applications beyond the limitations of microarrays, we sought to systematically evaluate the performance of RNA-seq-based and microarray-based classifiers in this MAQC-III/SEQC study for clinical endpoint prediction using neuroblastoma as a model. We generate gene expression profiles from 498 primary neuroblastomas using both RNA-seq and 44 k microarrays. Characterization of the neuroblastoma transcriptome by RNA-seq reveals that more than 48,000 genes and 200,000 transcripts are being expressed in this malignancy. We also find that RNA-seq provides much more detailed information on specific transcript expression patterns in clinico-genetic neuroblastoma subgroups than microarrays. To systematically compare the power of RNA-seq and microarray-based models in predicting clinical endpoints, we divide the cohort randomly into training and validation sets and develop 360 predictive models on six clinical endpoints of varying predictability. Evaluation of factors potentially affecting model performances reveals that prediction accuracies are most strongly influenced by the nature of the clinical endpoint, whereas technological platforms (RNA-seq vs. microarrays), RNA-seq data analysis pipelines, and feature levels (gene vs. transcript vs. exon-junction level) do not significantly affect performances of the models. We demonstrate that RNA-seq outperforms microarrays in determining the transcriptomic characteristics of cancer, while RNA-seq and microarray-based models perform similarly in clinical endpoint prediction. Our findings may be valuable to guide future studies on the development of gene expression-based predictive models and their implementation in clinical practice.

  7. sigReannot: an oligo-set re-annotation pipeline based on similarities with the Ensembl transcripts and Unigene clusters.

    PubMed

    Casel, Pierrot; Moreews, François; Lagarrigue, Sandrine; Klopp, Christophe

    2009-07-16

    Microarray is a powerful technology enabling to monitor tens of thousands of genes in a single experiment. Most microarrays are now using oligo-sets. The design of the oligo-nucleotides is time consuming and error prone. Genome wide microarray oligo-sets are designed using as large a set of transcripts as possible in order to monitor as many genes as possible. Depending on the genome sequencing state and on the assembly state the knowledge of the existing transcripts can be very different. This knowledge evolves with the different genome builds and gene builds. Once the design is done the microarrays are often used for several years. The biologists working in EADGENE expressed the need of up-to-dated annotation files for the oligo-sets they share including information about the orthologous genes of model species, the Gene Ontology, the corresponding pathways and the chromosomal location. The results of SigReannot on a chicken micro-array used in the EADGENE project compared to the initial annotations show that 23% of the oligo-nucleotide gene annotations were not confirmed, 2% were modified and 1% were added. The interest of this up-to-date annotation procedure is demonstrated through the analysis of real data previously published. SigReannot uses the oligo-nucleotide design procedure criteria to validate the probe-gene link and the Ensembl transcripts as reference for annotation. It therefore produces a high quality annotation based on reference gene sets.

  8. Qualitative assessment of gene expression in affymetrix genechip arrays

    NASA Astrophysics Data System (ADS)

    Nagarajan, Radhakrishnan; Upreti, Meenakshi

    2007-01-01

    Affymetrix Genechip microarrays are used widely to determine the simultaneous expression of genes in a given biological paradigm. Probes on the Genechip array are atomic entities which by definition are randomly distributed across the array and in turn govern the gene expression. In the present study, we make several interesting observations. We show that there is considerable correlation between the probe intensities across the array which defy the independence assumption. While the mechanism behind such correlations is unclear, we show that scaling behavior and the profiles of perfect match (PM) as well as mismatch (MM) probes are similar and immune-to-background subtraction. We believe that the observed correlations are possibly an outcome of inherent non-stationarities or patchiness in the array devoid of biological significance. This is demonstrated by inspecting their scaling behavior and profiles of the PM and MM probe intensities obtained from publicly available Genechip arrays from three eukaryotic genomes, namely: Drosophila melanogaster (fruit fly), Homo sapiens (humans) and Mus musculus (house mouse) across distinct biological paradigms and across laboratories, with and without background subtraction. The fluctuation functions were estimated using detrended fluctuation analysis (DFA) with fourth-order polynomial detrending. The results presented in this study provide new insights into correlation signatures of PM and MM probe intensities and suggests the choice of DFA as a tool for qualitative assessment of Affymetrix Genechip microarrays prior to their analysis. A more detailed investigation is necessary in order to understand the source of these correlations.

  9. Automated array-based genomic profiling in chronic lymphocytic leukemia: Development of a clinical tool and discovery of recurrent genomic alterations

    PubMed Central

    Schwaenen, Carsten; Nessling, Michelle; Wessendorf, Swen; Salvi, Tatjana; Wrobel, Gunnar; Radlwimmer, Bernhard; Kestler, Hans A.; Haslinger, Christian; Stilgenbauer, Stephan; Döhner, Hartmut; Bentz, Martin; Lichter, Peter

    2004-01-01

    B cell chronic lymphocytic leukemia (B-CLL) is characterized by a highly variable clinical course. Recurrent chromosomal imbalances provide significant prognostic markers. Risk-adapted therapy based on genomic alterations has become an option that is currently being tested in clinical trials. To supply a robust tool for such large scale studies, we developed a comprehensive DNA microarray dedicated to the automated analysis of recurrent genomic imbalances in B-CLL by array-based comparative genomic hybridization (matrix–CGH). Validation of this chip in a series of 106 B-CLL cases revealed a high specificity and sensitivity that fulfils the criteria for application in clinical oncology. This chip is immediately applicable within clinical B-CLL treatment trials that evaluate whether B-CLL cases with distinct chromosomal abnormalities should be treated with chemotherapy of different intensities and/or stem cell transplantation. Through the control set of DNA fragments equally distributed over the genome, recurrent genomic imbalances were discovered: trisomy of chromosome 19 and gain of the MYCN oncogene correlating with an elevation of MYCN mRNA expression. PMID:14730057

  10. Quantitative image analysis of cellular heterogeneity in breast tumors complements genomic profiling.

    PubMed

    Yuan, Yinyin; Failmezger, Henrik; Rueda, Oscar M; Ali, H Raza; Gräf, Stefan; Chin, Suet-Feung; Schwarz, Roland F; Curtis, Christina; Dunning, Mark J; Bardwell, Helen; Johnson, Nicola; Doyle, Sarah; Turashvili, Gulisa; Provenzano, Elena; Aparicio, Sam; Caldas, Carlos; Markowetz, Florian

    2012-10-24

    Solid tumors are heterogeneous tissues composed of a mixture of cancer and normal cells, which complicates the interpretation of their molecular profiles. Furthermore, tissue architecture is generally not reflected in molecular assays, rendering this rich information underused. To address these challenges, we developed a computational approach based on standard hematoxylin and eosin-stained tissue sections and demonstrated its power in a discovery and validation cohort of 323 and 241 breast tumors, respectively. To deconvolute cellular heterogeneity and detect subtle genomic aberrations, we introduced an algorithm based on tumor cellularity to increase the comparability of copy number profiles between samples. We next devised a predictor for survival in estrogen receptor-negative breast cancer that integrated both image-based and gene expression analyses and significantly outperformed classifiers that use single data types, such as microarray expression signatures. Image processing also allowed us to describe and validate an independent prognostic factor based on quantitative analysis of spatial patterns between stromal cells, which are not detectable by molecular assays. Our quantitative, image-based method could benefit any large-scale cancer study by refining and complementing molecular assays of tumor samples.

  11. Profiling the humoral immune response of acute and chronic Q fever by protein microarray.

    PubMed

    Vigil, Adam; Chen, Chen; Jain, Aarti; Nakajima-Sasaki, Rie; Jasinskas, Algimantas; Pablo, Jozelyn; Hendrix, Laura R; Samuel, James E; Felgner, Philip L

    2011-10-01

    Antigen profiling using comprehensive protein microarrays is a powerful tool for characterizing the humoral immune response to infectious pathogens. Coxiella burnetii is a CDC category B bioterrorist infectious agent with worldwide distribution. In order to assess the antibody repertoire of acute and chronic Q fever patients we have constructed a protein microarray containing 93% of the proteome of Coxiella burnetii, the causative agent of Q fever. Here we report the profile of the IgG and IgM seroreactivity in 25 acute Q fever patients in longitudinal samples. We found that both early and late time points of infection have a very consistent repertoire of IgM and IgG response, with a limited number of proteins undergoing increasing or decreasing seroreactivity. We also probed a large collection of acute and chronic Q fever patient samples and identified serological markers that can differentiate between the two disease states. In this comparative analysis we confirmed the identity of numerous IgG biomarkers of acute infection, identified novel IgG biomarkers for acute and chronic infections, and profiled for the first time the IgM antibody repertoire for both acute and chronic Q fever. Using these results we were able to devise a test that can distinguish acute from chronic Q fever. These results also provide a unique perspective on isotype switch and demonstrate the utility of protein microarrays for simultaneously examining the dynamic humoral immune response against thousands of proteins from a large number of patients. The results presented here identify novel seroreactive antigens for the development of recombinant protein-based diagnostics and subunit vaccines, and provide insight into the development of the antibody response.

  12. Moving Toward Integrating Gene Expression Profiling into High-throughput Testing:A Gene Expression Biomarker Accurately Predicts Estrogen Receptor α Modulation in a Microarray Compendium

    EPA Science Inventory

    Microarray profiling of chemical-induced effects is being increasingly used in medium and high-throughput formats. In this study, we describe computational methods to identify molecular targets from whole-genome microarray data using as an example the estrogen receptor α (ERα), ...

  13. Removing technical variability in RNA-seq data using conditional quantile normalization.

    PubMed

    Hansen, Kasper D; Irizarry, Rafael A; Wu, Zhijin

    2012-04-01

    The ability to measure gene expression on a genome-wide scale is one of the most promising accomplishments in molecular biology. Microarrays, the technology that first permitted this, were riddled with problems due to unwanted sources of variability. Many of these problems are now mitigated, after a decade's worth of statistical methodology development. The recently developed RNA sequencing (RNA-seq) technology has generated much excitement in part due to claims of reduced variability in comparison to microarrays. However, we show that RNA-seq data demonstrate unwanted and obscuring variability similar to what was first observed in microarrays. In particular, we find guanine-cytosine content (GC-content) has a strong sample-specific effect on gene expression measurements that, if left uncorrected, leads to false positives in downstream results. We also report on commonly observed data distortions that demonstrate the need for data normalization. Here, we describe a statistical methodology that improves precision by 42% without loss of accuracy. Our resulting conditional quantile normalization algorithm combines robust generalized regression to remove systematic bias introduced by deterministic features such as GC-content and quantile normalization to correct for global distortions.

  14. Classification of a large microarray data set: Algorithm comparison and analysis of drug signatures

    PubMed Central

    Natsoulis, Georges; El Ghaoui, Laurent; Lanckriet, Gert R.G.; Tolley, Alexander M.; Leroy, Fabrice; Dunlea, Shane; Eynon, Barrett P.; Pearson, Cecelia I.; Tugendreich, Stuart; Jarnagin, Kurt

    2005-01-01

    A large gene expression database has been produced that characterizes the gene expression and physiological effects of hundreds of approved and withdrawn drugs, toxicants, and biochemical standards in various organs of live rats. In order to derive useful biological knowledge from this large database, a variety of supervised classification algorithms were compared using a 597-microarray subset of the data. Our studies show that several types of linear classifiers based on Support Vector Machines (SVMs) and Logistic Regression can be used to derive readily interpretable drug signatures with high classification performance. Both methods can be tuned to produce classifiers of drug treatments in the form of short, weighted gene lists which upon analysis reveal that some of the signature genes have a positive contribution (act as “rewards” for the class-of-interest) while others have a negative contribution (act as “penalties”) to the classification decision. The combination of reward and penalty genes enhances performance by keeping the number of false positive treatments low. The results of these algorithms are combined with feature selection techniques that further reduce the length of the drug signatures, an important step towards the development of useful diagnostic biomarkers and low-cost assays. Multiple signatures with no genes in common can be generated for the same classification end-point. Comparison of these gene lists identifies biological processes characteristic of a given class. PMID:15867433

  15. Best practices for hybridization design in two-colour microarray analysis.

    PubMed

    Knapen, Dries; Vergauwen, Lucia; Laukens, Kris; Blust, Ronny

    2009-07-01

    Two-colour microarrays are a popular platform of choice in gene expression studies. Because two different samples are hybridized on a single microarray, and several microarrays are usually needed in a given experiment, there are many possible ways to combine samples on different microarrays. The actual combination employed is commonly referred to as the 'hybridization design'. Different types of hybridization designs have been developed, all aimed at optimizing the experimental setup for the detection of differentially expressed genes while coping with technical noise. Here, we first provide an overview of the different classes of hybridization designs, discussing their advantages and limitations, and then we illustrate the current trends in the use of different hybridization design types in contemporary research.

  16. Grapevine cell early activation of specific responses to DIMEB, a resveratrol elicitor

    PubMed Central

    Zamboni, Anita; Gatto, Pamela; Cestaro, Alessandro; Pilati, Stefania; Viola, Roberto; Mattivi, Fulvio; Moser, Claudio; Velasco, Riccardo

    2009-01-01

    Background In response to pathogen attack, grapevine synthesizes phytoalexins belonging to the family of stilbenes. Grapevine cell cultures represent a good model system for studying the basic mechanisms of plant response to biotic and abiotic elicitors. Among these, modified β-cyclodextrins seem to act as true elicitors inducing strong production of the stilbene resveratrol. Results The transcriptome changes of Vitis riparia × Vitis berlandieri grapevine cells in response to the modified β-cyclodextrin, DIMEB, were analyzed 2 and 6 h after treatment using a suppression subtractive hybridization experiment and a microarray analysis respectively. At both time points, we identified a specific set of induced genes belonging to the general phenylpropanoid metabolism, including stilbenes and hydroxycinnamates, and to defence proteins such as PR proteins and chitinases. At 6 h we also observed a down-regulation of the genes involved in cell division and cell-wall loosening. Conclusions We report the first large-scale study of the molecular effects of DIMEB, a resveratrol inducer, on grapevine cell cultures. This molecule seems to mimic a defence elicitor which enhances the physical barriers of the cell, stops cell division and induces phytoalexin synthesis. PMID:19660119

  17. Computational, Integrative, and Comparative Methods for the Elucidation of Genetic Coexpression Networks

    DOE PAGES

    Baldwin, Nicole E.; Chesler, Elissa J.; Kirov, Stefan; ...

    2005-01-01

    Gene expression microarray data can be used for the assembly of genetic coexpression network graphs. Using mRNA samples obtained from recombinant inbred Mus musculus strains, it is possible to integrate allelic variation with molecular and higher-order phenotypes. The depth of quantitative genetic analysis of microarray data can be vastly enhanced utilizing this mouse resource in combination with powerful computational algorithms, platforms, and data repositories. The resulting network graphs transect many levels of biological scale. This approach is illustrated with the extraction of cliques of putatively co-regulated genes and their annotation using gene ontology analysis and cis -regulatory element discovery. Themore » causal basis for co-regulation is detected through the use of quantitative trait locus mapping.« less

  18. Fuzzy Neural Network Applied to Gene Expression Profiling for Predicting the Prognosis of Diffuse Large B‐cell Lymphoma

    PubMed Central

    Ando, Tatsuya; Suguro, Miyuki; Hanai, Taizo; Kobayashi, Takeshi; Seto, Masao

    2002-01-01

    Diffuse large B‐cell lymphoma (DLBCL) is the largest category of aggressive lymphomas. Less than 50% of patients can be cured by combination chemotherapy. Microarray technologies have recently shown that the response to chemotherapy reflects the molecular heterogeneity in DLBCL. On the basis of published microarray data, we attempted to develop a long‐overdue method for the precise and simple prediction of survival of DLBCL patients. We developed a fuzzy neural network (FNN) model to analyze gene expression profiling data for DLBCL. From data on 5857 genes, this model identified four genes (CD10, AA807551, AA805611 and IRF‐4) that could be used to predict prognosis with 93% accuracy. FNNs are powerful tools for extracting significant biological markers affecting prognosis, and are applicable to various kinds of expression profiling data for any malignancy. PMID:12460461

  19. Empirical evaluation of data normalization methods for molecular classification

    PubMed Central

    Huang, Huei-Chung

    2018-01-01

    Background Data artifacts due to variations in experimental handling are ubiquitous in microarray studies, and they can lead to biased and irreproducible findings. A popular approach to correct for such artifacts is through post hoc data adjustment such as data normalization. Statistical methods for data normalization have been developed and evaluated primarily for the discovery of individual molecular biomarkers. Their performance has rarely been studied for the development of multi-marker molecular classifiers—an increasingly important application of microarrays in the era of personalized medicine. Methods In this study, we set out to evaluate the performance of three commonly used methods for data normalization in the context of molecular classification, using extensive simulations based on re-sampling from a unique pair of microRNA microarray datasets for the same set of samples. The data and code for our simulations are freely available as R packages at GitHub. Results In the presence of confounding handling effects, all three normalization methods tended to improve the accuracy of the classifier when evaluated in an independent test data. The level of improvement and the relative performance among the normalization methods depended on the relative level of molecular signal, the distributional pattern of handling effects (e.g., location shift vs scale change), and the statistical method used for building the classifier. In addition, cross-validation was associated with biased estimation of classification accuracy in the over-optimistic direction for all three normalization methods. Conclusion Normalization may improve the accuracy of molecular classification for data with confounding handling effects; however, it cannot circumvent the over-optimistic findings associated with cross-validation for assessing classification accuracy. PMID:29666754

  20. Transcriptome profiling of Pinus radiata juvenile wood with contrasting stiffness identifies putative candidate genes involved in microfibril orientation and cell wall mechanics

    PubMed Central

    2011-01-01

    Background The mechanical properties of wood are largely determined by the orientation of cellulose microfibrils in secondary cell walls. Several genes and their allelic variants have previously been found to affect microfibril angle (MFA) and wood stiffness; however, the molecular mechanisms controlling microfibril orientation and mechanical strength are largely uncharacterised. In the present study, cDNA microarrays were used to compare gene expression in developing xylem with contrasting stiffness and MFA in juvenile Pinus radiata trees in order to gain further insights into the molecular mechanisms underlying microfibril orientation and cell wall mechanics. Results Juvenile radiata pine trees with higher stiffness (HS) had lower MFA in the earlywood and latewood of each ring compared to low stiffness (LS) trees. Approximately 3.4 to 14.5% out of 3, 320 xylem unigenes on cDNA microarrays were differentially regulated in juvenile wood with contrasting stiffness and MFA. Greater variation in MFA and stiffness was observed in earlywood compared to latewood, suggesting earlywood contributes most to differences in stiffness; however, 3-4 times more genes were differentially regulated in latewood than in earlywood. A total of 108 xylem unigenes were differentially regulated in juvenile wood with HS and LS in at least two seasons, including 43 unigenes with unknown functions. Many genes involved in cytoskeleton development and secondary wall formation (cellulose and lignin biosynthesis) were preferentially transcribed in wood with HS and low MFA. In contrast, several genes involved in cell division and primary wall synthesis were more abundantly transcribed in LS wood with high MFA. Conclusions Microarray expression profiles in Pinus radiata juvenile wood with contrasting stiffness has shed more light on the transcriptional control of microfibril orientation and the mechanical properties of wood. The identified candidate genes provide an invaluable resource for further gene function and association genetics studies aimed at deepening our understanding of cell wall biomechanics with a view to improving the mechanical properties of wood. PMID:21962175

  1. High‐throughput automated scoring of Ki67 in breast cancer tissue microarrays from the Breast Cancer Association Consortium

    PubMed Central

    Howat, William J; Daley, Frances; Zabaglo, Lila; McDuffus, Leigh‐Anne; Blows, Fiona; Coulson, Penny; Raza Ali, H; Benitez, Javier; Milne, Roger; Brenner, Herman; Stegmaier, Christa; Mannermaa, Arto; Chang‐Claude, Jenny; Rudolph, Anja; Sinn, Peter; Couch, Fergus J; Tollenaar, Rob A.E.M.; Devilee, Peter; Figueroa, Jonine; Sherman, Mark E; Lissowska, Jolanta; Hewitt, Stephen; Eccles, Diana; Hooning, Maartje J; Hollestelle, Antoinette; WM Martens, John; HM van Deurzen, Carolien; Investigators, kConFab; Bolla, Manjeet K; Wang, Qin; Jones, Michael; Schoemaker, Minouk; Broeks, Annegien; van Leeuwen, Flora E; Van't Veer, Laura; Swerdlow, Anthony J; Orr, Nick; Dowsett, Mitch; Easton, Douglas; Schmidt, Marjanka K; Pharoah, Paul D; Garcia‐Closas, Montserrat

    2016-01-01

    Abstract Automated methods are needed to facilitate high‐throughput and reproducible scoring of Ki67 and other markers in breast cancer tissue microarrays (TMAs) in large‐scale studies. To address this need, we developed an automated protocol for Ki67 scoring and evaluated its performance in studies from the Breast Cancer Association Consortium. We utilized 166 TMAs containing 16,953 tumour cores representing 9,059 breast cancer cases, from 13 studies, with information on other clinical and pathological characteristics. TMAs were stained for Ki67 using standard immunohistochemical procedures, and scanned and digitized using the Ariol system. An automated algorithm was developed for the scoring of Ki67, and scores were compared to computer assisted visual (CAV) scores in a subset of 15 TMAs in a training set. We also assessed the correlation between automated Ki67 scores and other clinical and pathological characteristics. Overall, we observed good discriminatory accuracy (AUC = 85%) and good agreement (kappa = 0.64) between the automated and CAV scoring methods in the training set. The performance of the automated method varied by TMA (kappa range= 0.37–0.87) and study (kappa range = 0.39–0.69). The automated method performed better in satisfactory cores (kappa = 0.68) than suboptimal (kappa = 0.51) cores (p‐value for comparison = 0.005); and among cores with higher total nuclei counted by the machine (4,000–4,500 cells: kappa = 0.78) than those with lower counts (50–500 cells: kappa = 0.41; p‐value = 0.010). Among the 9,059 cases in this study, the correlations between automated Ki67 and clinical and pathological characteristics were found to be in the expected directions. Our findings indicate that automated scoring of Ki67 can be an efficient method to obtain good quality data across large numbers of TMAs from multicentre studies. However, robust algorithm development and rigorous pre‐ and post‐analytical quality control procedures are necessary in order to ensure satisfactory performance. PMID:27499923

  2. Experimental analysis of oligonucleotide microarray design criteria to detect deletions by comparative genomic hybridization.

    PubMed

    Flibotte, Stephane; Moerman, Donald G

    2008-10-21

    Microarray comparative genomic hybridization (CGH) is currently one of the most powerful techniques to measure DNA copy number in large genomes. In humans, microarray CGH is widely used to assess copy number variants in healthy individuals and copy number aberrations associated with various diseases, syndromes and disease susceptibility. In model organisms such as Caenorhabditis elegans (C. elegans) the technique has been applied to detect mutations, primarily deletions, in strains of interest. Although various constraints on oligonucleotide properties have been suggested to minimize non-specific hybridization and improve the data quality, there have been few experimental validations for CGH experiments. For genomic regions where strict design filters would limit the coverage it would also be useful to quantify the expected loss in data quality associated with relaxed design criteria. We have quantified the effects of filtering various oligonucleotide properties by measuring the resolving power for detecting deletions in the human and C. elegans genomes using NimbleGen microarrays. Approximately twice as many oligonucleotides are typically required to be affected by a deletion in human DNA samples in order to achieve the same statistical confidence as one would observe for a deletion in C. elegans. Surprisingly, the ability to detect deletions strongly depends on the oligonucleotide 15-mer count, which is defined as the sum of the genomic frequency of all the constituent 15-mers within the oligonucleotide. A similarity level above 80% to non-target sequences over the length of the probe produces significant cross-hybridization. We recommend the use of a fairly large melting temperature window of up to 10 degrees C, the elimination of repeat sequences, the elimination of homopolymers longer than 5 nucleotides, and a threshold of -1 kcal/mol on the oligonucleotide self-folding energy. We observed very little difference in data quality when varying the oligonucleotide length between 50 and 70, and even when using an isothermal design strategy. We have determined experimentally the effects of varying several key oligonucleotide microarray design criteria for detection of deletions in C. elegans and humans with NimbleGen's CGH technology. Our oligonucleotide design recommendations should be applicable for CGH analysis in most species.

  3. A Prognostic Gene Expression Profile That Predicts Circulating Tumor Cell Presence in Breast Cancer Patients

    PubMed Central

    Molloy, Timothy J.; Roepman, Paul; Naume, Bjørn; van't Veer, Laura J.

    2012-01-01

    The detection of circulating tumor cells (CTCs) in the peripheral blood and microarray gene expression profiling of the primary tumor are two promising new technologies able to provide valuable prognostic data for patients with breast cancer. Meta-analyses of several established prognostic breast cancer gene expression profiles in large patient cohorts have demonstrated that despite sharing few genes, their delineation of patients into “good prognosis” or “poor prognosis” are frequently very highly correlated, and combining prognostic profiles does not increase prognostic power. In the current study, we aimed to develop a novel profile which provided independent prognostic data by building a signature predictive of CTC status rather than outcome. Microarray gene expression data from an initial training cohort of 72 breast cancer patients for which CTC status had been determined in a previous study using a multimarker QPCR-based assay was used to develop a CTC-predictive profile. The generated profile was validated in two independent datasets of 49 and 123 patients and confirmed to be both predictive of CTC status, and independently prognostic. Importantly, the “CTC profile” also provided prognostic information independent of the well-established and powerful ‘70-gene’ prognostic breast cancer signature. This profile therefore has the potential to not only add prognostic information to currently-available microarray tests but in some circumstances even replace blood-based prognostic CTC tests at time of diagnosis for those patients already undergoing testing by multigene assays. PMID:22384245

  4. A Platform for Combined DNA and Protein Microarrays Based on Total Internal Reflection Fluorescence

    PubMed Central

    Asanov, Alexander; Zepeda, Angélica; Vaca, Luis

    2012-01-01

    We have developed a novel microarray technology based on total internal reflection fluorescence (TIRF) in combination with DNA and protein bioassays immobilized at the TIRF surface. Unlike conventional microarrays that exhibit reduced signal-to-background ratio, require several stages of incubation, rinsing and stringency control, and measure only end-point results, our TIRF microarray technology provides several orders of magnitude better signal-to-background ratio, performs analysis rapidly in one step, and measures the entire course of association and dissociation kinetics between target DNA and protein molecules and the bioassays. In many practical cases detection of only DNA or protein markers alone does not provide the necessary accuracy for diagnosing a disease or detecting a pathogen. Here we describe TIRF microarrays that detect DNA and protein markers simultaneously, which reduces the probabilities of false responses. Supersensitive and multiplexed TIRF DNA and protein microarray technology may provide a platform for accurate diagnosis or enhanced research studies. Our TIRF microarray system can be mounted on upright or inverted microscopes or interfaced directly with CCD cameras equipped with a single objective, facilitating the development of portable devices. As proof-of-concept we applied TIRF microarrays for detecting molecular markers from Bacillus anthracis, the pathogen responsible for anthrax. PMID:22438738

  5. Variance stabilization and normalization for one-color microarray data using a data-driven multiscale approach.

    PubMed

    Motakis, E S; Nason, G P; Fryzlewicz, P; Rutter, G A

    2006-10-15

    Many standard statistical techniques are effective on data that are normally distributed with constant variance. Microarray data typically violate these assumptions since they come from non-Gaussian distributions with a non-trivial mean-variance relationship. Several methods have been proposed that transform microarray data to stabilize variance and draw its distribution towards the Gaussian. Some methods, such as log or generalized log, rely on an underlying model for the data. Others, such as the spread-versus-level plot, do not. We propose an alternative data-driven multiscale approach, called the Data-Driven Haar-Fisz for microarrays (DDHFm) with replicates. DDHFm has the advantage of being 'distribution-free' in the sense that no parametric model for the underlying microarray data is required to be specified or estimated; hence, DDHFm can be applied very generally, not just to microarray data. DDHFm achieves very good variance stabilization of microarray data with replicates and produces transformed intensities that are approximately normally distributed. Simulation studies show that it performs better than other existing methods. Application of DDHFm to real one-color cDNA data validates these results. The R package of the Data-Driven Haar-Fisz transform (DDHFm) for microarrays is available in Bioconductor and CRAN.

  6. Comparative transcriptional profiling of human Merkel cells and Merkel cell carcinoma.

    PubMed

    Mouchet, Nicolas; Coquart, Nolwenn; Lebonvallet, Nicolas; Le Gall-Ianotto, Christelle; Mogha, Ariane; Fautrel, Alain; Boulais, Nicholas; Dréno, Brigitte; Martin, Ludovic; Hu, Weiguo; Galibert, Marie-Dominique; Misery, Laurent

    2014-12-01

    Merkel cell carcinoma is believed to be derived from Merkel cells after infection by Merkel cell polyomavirus (MCPyV) and other poorly understood events. Transcriptional profiling using cDNA microarrays was performed on cells from MCPy-negative and MCPy-positive Merkel cell carcinomas and isolated normal Merkel cells. This microarray revealed numerous significantly upregulated genes and some downregulated genes. The extensive list of genes that were identified in these experiments provides a large body of potentially valuable information of Merkel cell carcinoma carcinogenesis and could represent a source of potential targets for cancer therapy. © 2014 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  7. Optimal Control of Shock Wave Turbulent Boundary Layer Interactions Using Micro-Array Actuation

    NASA Technical Reports Server (NTRS)

    Anderson, Bernhard H.; Tinapple, Jon; Surber, Lewis

    2006-01-01

    The intent of this study on micro-array flow control is to demonstrate the viability and economy of Response Surface Methodology (RSM) to determine optimal designs of micro-array actuation for controlling the shock wave turbulent boundary layer interactions within supersonic inlets and compare these concepts to conventional bleed performance. The term micro-array refers to micro-actuator arrays which have heights of 25 to 40 percent of the undisturbed supersonic boundary layer thickness. This study covers optimal control of shock wave turbulent boundary layer interactions using standard micro-vane, tapered micro-vane, and standard micro-ramp arrays at a free stream Mach number of 2.0. The effectiveness of the three micro-array devices was tested using a shock pressure rise induced by the 10 shock generator, which was sufficiently strong as to separate the turbulent supersonic boundary layer. The overall design purpose of the micro-arrays was to alter the properties of the supersonic boundary layer by introducing a cascade of counter-rotating micro-vortices in the near wall region. In this manner, the impact of the shock wave boundary layer (SWBL) interaction on the main flow field was minimized without boundary bleed.

  8. Scaling down the size and increasing the throughput of glycosyltransferase assays: activity changes on stem cell differentiation.

    PubMed

    Patil, Shilpa A; Chandrasekaran, E V; Matta, Khushi L; Parikh, Abhirath; Tzanakakis, Emmanuel S; Neelamegham, Sriram

    2012-06-15

    Glycosyltransferases (glycoTs) catalyze the transfer of monosaccharides from nucleotide-sugars to carbohydrate-, lipid-, and protein-based acceptors. We examined strategies to scale down and increase the throughput of glycoT enzymatic assays because traditional methods require large reaction volumes and complex chromatography. Approaches tested used (i) microarray pin printing, an appropriate method when glycoT activity was high; (ii) microwells and microcentrifuge tubes, a suitable method for studies with cell lysates when enzyme activity was moderate; and (iii) C(18) pipette tips and solvent extraction, a method that enriched reaction product when the extent of reaction was low. In all cases, reverse-phase thin layer chromatography (RP-TLC) coupled with phosphorimaging quantified the reaction rate. Studies with mouse embryonic stem cells (mESCs) demonstrated an increase in overall β(1,3)galactosyltransferase and α(2,3)sialyltransferase activity and a decrease in α(1,3)fucosyltransferases when these cells differentiate toward cardiomyocytes. Enzymatic and lectin binding data suggest a transition from Lewis(x)-type structures in mESCs to sialylated Galβ1,3GalNAc-type glycans on differentiation, with more prominent changes in enzyme activity occurring at later stages when embryoid bodies differentiated toward cardiomyocytes. Overall, simple, rapid, quantitative, and scalable glycoT activity analysis methods are presented. These use a range of natural and synthetic acceptors for the analysis of complex biological specimens that have limited availability. Copyright © 2012 Elsevier Inc. All rights reserved.

  9. D-Light on promoters: a client-server system for the analysis and visualization of cis-regulatory elements

    PubMed Central

    2013-01-01

    Background The binding of transcription factors to DNA plays an essential role in the regulation of gene expression. Numerous experiments elucidated binding sequences which subsequently have been used to derive statistical models for predicting potential transcription factor binding sites (TFBS). The rapidly increasing number of genome sequence data requires sophisticated computational approaches to manage and query experimental and predicted TFBS data in the context of other epigenetic factors and across different organisms. Results We have developed D-Light, a novel client-server software package to store and query large amounts of TFBS data for any number of genomes. Users can add small-scale data to the server database and query them in a large scale, genome-wide promoter context. The client is implemented in Java and provides simple graphical user interfaces and data visualization. Here we also performed a statistical analysis showing what a user can expect for certain parameter settings and we illustrate the usage of D-Light with the help of a microarray data set. Conclusions D-Light is an easy to use software tool to integrate, store and query annotation data for promoters. A public D-Light server, the client and server software for local installation and the source code under GNU GPL license are available at http://biwww.che.sbg.ac.at/dlight. PMID:23617301

  10. BRCA-Monet: a breast cancer specific drug treatment mode-of-action network for treatment effective prediction using large scale microarray database.

    PubMed

    Ma, Chifeng; Chen, Hung-I; Flores, Mario; Huang, Yufei; Chen, Yidong

    2013-01-01

    Connectivity map (cMap) is a recent developed dataset and algorithm for uncovering and understanding the treatment effect of small molecules on different cancer cell lines. It is widely used but there are still remaining challenges for accurate predictions. Here, we propose BRCA-MoNet, a network of drug mode of action (MoA) specific to breast cancer, which is constructed based on the cMap dataset. A drug signature selection algorithm fitting the characteristic of cMap data, a quality control scheme as well as a novel query algorithm based on BRCA-MoNet are developed for more effective prediction of drug effects. BRCA-MoNet was applied to three independent data sets obtained from the GEO database: Estrodial treated MCF7 cell line, BMS-754807 treated MCF7 cell line, and a breast cancer patient microarray dataset. In the first case, BRCA-MoNet could identify drug MoAs likely to share same and reverse treatment effect. In the second case, the result demonstrated the potential of BRCA-MoNet to reposition drugs and predict treatment effects for drugs not in cMap data. In the third case, a possible procedure of personalized drug selection is showcased. The results clearly demonstrated that the proposed BRCA-MoNet approach can provide increased prediction power to cMap and thus will be useful for identification of new therapeutic candidates.

  11. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gardner, Shea N.; McLoughlin, Kevin; Be, Nicholas A.

    Venezuelan equine encephalitis virus (VEEV) is a mosquito-borne alphavirus that has caused large outbreaks of severe illness in both horses and humans. New approaches are needed to rapidly infer the origin of a newly discovered VEEV strain, estimate its equine amplification and resultant epidemic potential, and predict human virulence phenotype. We performed whole genome single nucleotide polymorphism (SNP) analysis of all available VEE antigenic complex genomes, verified that a SNP-based phylogeny accurately captured the features of a phylogenetic tree based on multiple sequence alignment, and developed a high resolution genome-wide SNP microarray. We used the microarray to analyze a broadmore » panel of VEEV isolates, found excellent concordance between array- and sequence-based SNP calls, genotyped unsequenced isolates, and placed them on a phylogeny with sequenced genomes. The microarray successfully genotyped VEEV directly from tissue samples of an infected mouse, bypassing the need for viral isolation, culture and genomic sequencing. Lastly, we identified genomic variants associated with serotypes and host species, revealing a complex relationship between genotype and phenotype.« less

  12. PhylArray: phylogenetic probe design algorithm for microarray.

    PubMed

    Militon, Cécile; Rimour, Sébastien; Missaoui, Mohieddine; Biderre, Corinne; Barra, Vincent; Hill, David; Moné, Anne; Gagne, Geneviève; Meier, Harald; Peyretaillade, Eric; Peyret, Pierre

    2007-10-01

    Microbial diversity is still largely unknown in most environments, such as soils. In order to get access to this microbial 'black-box', the development of powerful tools such as microarrays are necessary. However, the reliability of this approach relies on probe efficiency, in particular sensitivity, specificity and explorative power, in order to obtain an image of the microbial communities that is close to reality. We propose a new probe design algorithm that is able to select microarray probes targeting SSU rRNA at any phylogenetic level. This original approach, implemented in a program called 'PhylArray', designs a combination of degenerate and non-degenerate probes for each target taxon. Comparative experimental evaluations indicate that probes designed with PhylArray yield a higher sensitivity and specificity than those designed by conventional approaches. Applying the combined PhyArray/GoArrays strategy helps to optimize the hybridization performance of short probes. Finally, hybridizations with environmental targets have shown that the use of the PhylArray strategy can draw attention to even previously unknown bacteria.

  13. High-Throughput Lectin Microarray-Based Analysis of Live Cell Surface Glycosylation

    PubMed Central

    Li, Yu; Tao, Sheng-ce; Zhu, Heng; Schneck, Jonathan P.

    2011-01-01

    Lectins, plant-derived glycan-binding proteins, have long been used to detect glycans on cell surfaces. However, the techniques used to characterize serum or cells have largely been limited to mass spectrometry, blots, flow cytometry, and immunohistochemistry. While these lectin-based approaches are well established and they can discriminate a limited number of sugar isomers by concurrently using a limited number of lectins, they are not amenable for adaptation to a high-throughput platform. Fortunately, given the commercial availability of lectins with a variety of glycan specificities, lectins can be printed on a glass substrate in a microarray format to profile accessible cell-surface glycans. This method is an inviting alternative for analysis of a broad range of glycans in a high-throughput fashion and has been demonstrated to be a feasible method of identifying binding-accessible cell surface glycosylation on living cells. The current unit presents a lectin-based microarray approach for analyzing cell surface glycosylation in a high-throughput fashion. PMID:21400689

  14. Tracing phylogenomic events leading to diversity of Haemophilus influenzae and the emergence of Brazilian Purpuric Fever (BPF)-associated clones

    PubMed Central

    Papazisi, Leka; Ratnayake, Shashikala; Remortel, Brian G.; Bock, Geoffrey R.; Liang, Wei; Saeed, Alexander I.; Liu, Jia; Fleischmann, Robert D.; Kilian, Mogens; Peterson, Scott N.

    2010-01-01

    Here we report the use of a multi-genome DNA microarray to elucidate the genomic events associated with the emergence of the clonal variants of H. influenzae biogroup aegyptius causing Brazilian Purpuric Fever (BPF), an important pediatric disease with a high mortality rate. We performed directed genome sequencing of strain HK1212 unique loci to construct a species DNA microarray. Comparative genome hybridization using this microarray enabled us to determine and compare gene complements, and infer reliable phylogenomic relationships among members of the species. The higher genomic variability observed in the genomes of BPF-related strains (clones) and their close relatives may be characterized by significant gene flux related to a subset of functional role categories. We found that the acquisition of a large number of virulence determinants featuring numerous cell membrane proteins coupled to the loss of genes involved in transport, central biosynthetic pathways and in particular, energy production pathways to be characteristics of the BPF genomic variants. PMID:20654709

  15. DNA microarray technology in nutraceutical and food safety.

    PubMed

    Liu-Stratton, Yiwen; Roy, Sashwati; Sen, Chandan K

    2004-04-15

    The quality and quantity of diet is a key determinant of health and disease. Molecular diagnostics may play a key role in food safety related to genetically modified foods, food-borne pathogens and novel nutraceuticals. Functional outcomes in biology are determined, for the most part, by net balance between sets of genes related to the specific outcome in question. The DNA microarray technology offers a new dimension of strength in molecular diagnostics by permitting the simultaneous analysis of large sets of genes. Automation of assay and novel bioinformatics tools make DNA microarrays a robust technology for diagnostics. Since its development a few years ago, this technology has been used for the applications of toxicogenomics, pharmacogenomics, cell biology, and clinical investigations addressing the prevention and intervention of diseases. Optimization of this technology to specifically address food safety is a vast resource that remains to be mined. Efforts to develop diagnostic custom arrays and simplified bioinformatics tools for field use are warranted.

  16. Effect of storage time on gene expression data acquired from unfrozen archived newborn blood spots.

    PubMed

    Ho, Nhan T; Busik, Julia V; Resau, James H; Paneth, Nigel; Khoo, Sok Kean

    2016-11-01

    Unfrozen archived newborn blood spots (NBS) have been shown to retain sufficient messenger RNA (mRNA) for gene expression profiling. However, the effect of storage time at ambient temperature for NBS samples in relation to the quality of gene expression data is relatively unknown. Here, we evaluated mRNA expression from quantitative real-time PCR (qRT-PCR) and microarray data obtained from NBS samples stored at ambient temperature to determine the effect of storage time on the quality of gene expression. These data were generated in a previous case-control study examining NBS in 53 children with cerebral palsy (CP) and 53 matched controls. NBS sample storage period ranged from 3 to 16years at ambient temperature. We found persistently low RNA integrity numbers (RIN=2.3±0.71) and 28S/18S rRNA ratios (~0) across NBS samples for all storage periods. In both qRT-PCR and microarray data, the expression of three common housekeeping genes-beta cytoskeletal actin (ACTB), glyceraldehyde 3-phosphate dehydrogenase (GAPDH), and peptidylprolyl isomerase A (PPIA)-decreased with increased storage time. Median values of each microarray probe intensity at log 2 scale also decreased over time. After eight years of storage, probe intensity values were largely reduced to background intensity levels. Of 21,500 genes tested, 89% significantly decreased in signal intensity, with 13,551, 10,730, and 9925 genes detected within 5years, > 5 to <10years, and >10years of storage, respectively. We also examined the expression of two gender-specific genes (X inactivation-specific transcript, XIST and lysine-specific demethylase 5D, KDM5D) and seven gene sets representing the inflammatory, hypoxic, coagulative, and thyroidal pathways hypothesized to be related to CP risk to determine the effect of storage time on the detection of these biologically relevant genes. We found the gender-specific genes and CP-related gene sets detectable in all storage periods, but exhibited differential expression (between male vs. female or CP vs. control) only within the first six years of storage. We concluded that gene expression data quality deteriorates in unfrozen archived NBS over time and that differential gene expression profiling and analysis is recommended for those NBS samples collected and stored within six years at ambient temperature. Copyright © 2016 Elsevier Inc. All rights reserved.

  17. Development and validation of a flax (Linum usitatissimum L.) gene expression oligo microarray

    PubMed Central

    2010-01-01

    Background Flax (Linum usitatissimum L.) has been cultivated for around 9,000 years and is therefore one of the oldest cultivated species. Today, flax is still grown for its oil (oil-flax or linseed cultivars) and its cellulose-rich fibres (fibre-flax cultivars) used for high-value linen garments and composite materials. Despite the wide industrial use of flax-derived products, and our actual understanding of the regulation of both wood fibre production and oil biosynthesis more information must be acquired in both domains. Recent advances in genomics are now providing opportunities to improve our fundamental knowledge of these complex processes. In this paper we report the development and validation of a high-density oligo microarray platform dedicated to gene expression analyses in flax. Results Nine different RNA samples obtained from flax inner- and outer-stems, seeds, leaves and roots were used to generate a collection of 1,066,481 ESTs by massive parallel pyrosequencing. Sequences were assembled into 59,626 unigenes and 48,021 sequences were selected for oligo design and high-density microarray (Nimblegen 385K) fabrication with eight, non-overlapping 25-mers oligos per unigene. 18 independent experiments were used to evaluate the hybridization quality, precision, specificity and accuracy and all results confirmed the high technical quality of our microarray platform. Cross-validation of microarray data was carried out using quantitative qRT-PCR. Nine target genes were selected on the basis of microarray results and reflected the whole range of fold change (both up-regulated and down-regulated genes in different samples). A statistically significant positive correlation was obtained comparing expression levels for each target gene across all biological replicates both in qRT-PCR and microarray results. Further experiments illustrated the capacity of our arrays to detect differential gene expression in a variety of flax tissues as well as between two contrasted flax varieties. Conclusion All results suggest that our high-density flax oligo-microarray platform can be used as a very sensitive tool for analyzing gene expression in a large variety of tissues as well as in different cultivars. Moreover, this highly reliable platform can also be used for the quantification of mRNA transcriptional profiling in different flax tissues. PMID:20964859

  18. Development and validation of a flax (Linum usitatissimum L.) gene expression oligo microarray.

    PubMed

    Fenart, Stéphane; Ndong, Yves-Placide Assoumou; Duarte, Jorge; Rivière, Nathalie; Wilmer, Jeroen; van Wuytswinkel, Olivier; Lucau, Anca; Cariou, Emmanuelle; Neutelings, Godfrey; Gutierrez, Laurent; Chabbert, Brigitte; Guillot, Xavier; Tavernier, Reynald; Hawkins, Simon; Thomasset, Brigitte

    2010-10-21

    Flax (Linum usitatissimum L.) has been cultivated for around 9,000 years and is therefore one of the oldest cultivated species. Today, flax is still grown for its oil (oil-flax or linseed cultivars) and its cellulose-rich fibres (fibre-flax cultivars) used for high-value linen garments and composite materials. Despite the wide industrial use of flax-derived products, and our actual understanding of the regulation of both wood fibre production and oil biosynthesis more information must be acquired in both domains. Recent advances in genomics are now providing opportunities to improve our fundamental knowledge of these complex processes. In this paper we report the development and validation of a high-density oligo microarray platform dedicated to gene expression analyses in flax. Nine different RNA samples obtained from flax inner- and outer-stems, seeds, leaves and roots were used to generate a collection of 1,066,481 ESTs by massive parallel pyrosequencing. Sequences were assembled into 59,626 unigenes and 48,021 sequences were selected for oligo design and high-density microarray (Nimblegen 385K) fabrication with eight, non-overlapping 25-mers oligos per unigene. 18 independent experiments were used to evaluate the hybridization quality, precision, specificity and accuracy and all results confirmed the high technical quality of our microarray platform. Cross-validation of microarray data was carried out using quantitative qRT-PCR. Nine target genes were selected on the basis of microarray results and reflected the whole range of fold change (both up-regulated and down-regulated genes in different samples). A statistically significant positive correlation was obtained comparing expression levels for each target gene across all biological replicates both in qRT-PCR and microarray results. Further experiments illustrated the capacity of our arrays to detect differential gene expression in a variety of flax tissues as well as between two contrasted flax varieties. All results suggest that our high-density flax oligo-microarray platform can be used as a very sensitive tool for analyzing gene expression in a large variety of tissues as well as in different cultivars. Moreover, this highly reliable platform can also be used for the quantification of mRNA transcriptional profiling in different flax tissues.

  19. Microarray analysis in rat liver slices correctly predicts in vivo hepatotoxicity.

    PubMed

    Elferink, M G L; Olinga, P; Draaisma, A L; Merema, M T; Bauerschmidt, S; Polman, J; Schoonen, W G; Groothuis, G M M

    2008-06-15

    The microarray technology, developed for the simultaneous analysis of a large number of genes, may be useful for the detection of toxicity in an early stage of the development of new drugs. The effect of different hepatotoxins was analyzed at the gene expression level in the rat liver both in vivo and in vitro. As in vitro model system the precision-cut liver slice model was used, in which all liver cell types are present in their natural architecture. This is important since drug-induced toxicity often is a multi-cellular process involving not only hepatocytes but also other cell types such as Kupffer and stellate cells. As model toxic compounds lipopolysaccharide (LPS, inducing inflammation), paracetamol (necrosis), carbon tetrachloride (CCl(4), fibrosis and necrosis) and gliotoxin (apoptosis) were used. The aim of this study was to validate the rat liver slice system as in vitro model system for drug-induced toxicity studies. The results of the microarray studies show that the in vitro profiles of gene expression cluster per compound and incubation time, and when analyzed in a commercial gene expression database, can predict the toxicity and pathology observed in vivo. Each toxic compound induces a specific pattern of gene expression changes. In addition, some common genes were up- or down-regulated with all toxic compounds. These data show that the rat liver slice system can be an appropriate tool for the prediction of multi-cellular liver toxicity. The same experiments and analyses are currently performed for the prediction of human specific toxicity using human liver slices.

  20. Data submission and quality in microarray-based microRNA profiling

    PubMed Central

    Witwer, Kenneth W.

    2014-01-01

    Background Public sharing of scientific data has assumed greater importance in the ‘omics’ era. Transparency is necessary for confirmation and validation, and multiple examiners aid in extracting maximal value from large datasets. Accordingly, database submission and provision of the Minimum Information About a Microarray Experiment (MIAME) are required by most journals as a prerequisite for review or acceptance. Methods In this study, the level of data submission and MIAME compliance was reviewed for 127 articles that included microarray-based microRNA profiling and that were published from July, 2011 through April, 2012 in the journals that published the largest number of such articles—PLOS ONE, the Journal of Biological Chemistry, Blood, and Oncogene—along with articles from nine other journals, including Clinical Chemistry, that published smaller numbers of array-based articles. Results Overall, data submission was reported at publication for less than 40% of all articles, and almost 75% of articles were MIAME-noncompliant. On average, articles that included full data submission scored significantly higher on a quality metric than articles with limited or no data submission, and studies with adequate description of methods disproportionately included larger numbers of experimental repeats. Finally, for several articles that were not MIAME-compliant, data re-analysis revealed less than complete support for the published conclusions, in one case leading to retraction. Conclusions These findings buttress the hypothesis that reluctance to share data is associated with low study quality and suggest that most miRNA array investigations are underpowered and/or potentially compromised by a lack of appropriate reporting and data submission. PMID:23358751

  1. Data submission and quality in microarray-based microRNA profiling.

    PubMed

    Witwer, Kenneth W

    2013-02-01

    Public sharing of scientific data has assumed greater importance in the omics era. Transparency is necessary for confirmation and validation, and multiple examiners aid in extracting maximal value from large data sets. Accordingly, database submission and provision of the Minimum Information About a Microarray Experiment (MIAME)(3) are required by most journals as a prerequisite for review or acceptance. In this study, the level of data submission and MIAME compliance was reviewed for 127 articles that included microarray-based microRNA (miRNA) profiling and were published from July 2011 through April 2012 in the journals that published the largest number of such articles--PLOS ONE, the Journal of Biological Chemistry, Blood, and Oncogene--along with articles from 9 other journals, including Clinical Chemistry, that published smaller numbers of array-based articles. Overall, data submission was reported at publication for <40% of all articles, and almost 75% of articles were MIAME noncompliant. On average, articles that included full data submission scored significantly higher on a quality metric than articles with limited or no data submission, and studies with adequate description of methods disproportionately included larger numbers of experimental repeats. Finally, for several articles that were not MIAME compliant, data reanalysis revealed less than complete support for the published conclusions, in 1 case leading to retraction. These findings buttress the hypothesis that reluctance to share data is associated with low study quality and suggest that most miRNA array investigations are underpowered and/or potentially compromised by a lack of appropriate reporting and data submission. © 2012 American Association for Clinical Chemistry

  2. Microarray analysis in rat liver slices correctly predicts in vivo hepatotoxicity

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Elferink, M.G.L.; Olinga, P.; Draaisma, A.L.

    2008-06-15

    The microarray technology, developed for the simultaneous analysis of a large number of genes, may be useful for the detection of toxicity in an early stage of the development of new drugs. The effect of different hepatotoxins was analyzed at the gene expression level in the rat liver both in vivo and in vitro. As in vitro model system the precision-cut liver slice model was used, in which all liver cell types are present in their natural architecture. This is important since drug-induced toxicity often is a multi-cellular process involving not only hepatocytes but also other cell types such asmore » Kupffer and stellate cells. As model toxic compounds lipopolysaccharide (LPS, inducing inflammation), paracetamol (necrosis), carbon tetrachloride (CCl{sub 4}, fibrosis and necrosis) and gliotoxin (apoptosis) were used. The aim of this study was to validate the rat liver slice system as in vitro model system for drug-induced toxicity studies. The results of the microarray studies show that the in vitro profiles of gene expression cluster per compound and incubation time, and when analyzed in a commercial gene expression database, can predict the toxicity and pathology observed in vivo. Each toxic compound induces a specific pattern of gene expression changes. In addition, some common genes were up- or down-regulated with all toxic compounds. These data show that the rat liver slice system can be an appropriate tool for the prediction of multi-cellular liver toxicity. The same experiments and analyses are currently performed for the prediction of human specific toxicity using human liver slices.« less

  3. Progress in the application of DNA microarrays.

    PubMed Central

    Lobenhofer, E K; Bushel, P R; Afshari, C A; Hamadeh, H K

    2001-01-01

    Microarray technology has been applied to a variety of different fields to address fundamental research questions. The use of microarrays, or DNA chips, to study the gene expression profiles of biologic samples began in 1995. Since that time, the fundamental concepts behind the chip, the technology required for making and using these chips, and the multitude of statistical tools for analyzing the data have been extensively reviewed. For this reason, the focus of this review will be not on the technology itself but on the application of microarrays as a research tool and the future challenges of the field. PMID:11673116

  4. Multiplex amplification of large sets of human exons.

    PubMed

    Porreca, Gregory J; Zhang, Kun; Li, Jin Billy; Xie, Bin; Austin, Derek; Vassallo, Sara L; LeProust, Emily M; Peck, Bill J; Emig, Christopher J; Dahl, Fredrik; Gao, Yuan; Church, George M; Shendure, Jay

    2007-11-01

    A new generation of technologies is poised to reduce DNA sequencing costs by several orders of magnitude. But our ability to fully leverage the power of these technologies is crippled by the absence of suitable 'front-end' methods for isolating complex subsets of a mammalian genome at a scale that matches the throughput at which these platforms will routinely operate. We show that targeting oligonucleotides released from programmable microarrays can be used to capture and amplify approximately 10,000 human exons in a single multiplex reaction. Additionally, we show integration of this protocol with ultra-high-throughput sequencing for targeted variation discovery. Although the multiplex capture reaction is highly specific, we found that nonuniform capture is a key issue that will need to be resolved by additional optimization. We anticipate that highly multiplexed methods for targeted amplification will enable the comprehensive resequencing of human exons at a fraction of the cost of whole-genome resequencing.

  5. A rare sugar, d-allose, confers resistance to rice bacterial blight with upregulation of defense-related genes in Oryza sativa.

    PubMed

    Kano, Akihito; Gomi, Kenji; Yamasaki-Kokudo, Yumiko; Satoh, Masaru; Fukumoto, Takeshi; Ohtani, Kouhei; Tajima, Shigeyuki; Izumori, Ken; Tanaka, Keiji; Ishida, Yutaka; Tada, Yasuomi; Nishizawa, Yoko; Akimitsu, Kazuya

    2010-01-01

    We investigated responses of rice plant to three rare sugars, d-altrose, d-sorbose, and d-allose, due to establishment of mass production methods for these rare sugars. Root growth and shoot growth were significantly inhibited by d-allose but not by the other rare sugars. A large-scale gene expression analysis using a rice microarray revealed that d-allose treatment causes a high upregulation of many defense-related, pathogenesis-related (PR) protein genes in rice. The PR protein genes were not upregulated by other rare sugars. Furthermore, d-allose treatment of rice plants conferred limited resistance of the rice against the pathogen Xanthomonas oryzae pv. oryzae but the other tested sugars did not. These results indicate that d-allose has a growth inhibitory effect but might prove to be a candidate elicitor for reducing disease development in rice.

  6. xQTL workbench: a scalable web environment for multi-level QTL analysis.

    PubMed

    Arends, Danny; van der Velde, K Joeri; Prins, Pjotr; Broman, Karl W; Möller, Steffen; Jansen, Ritsert C; Swertz, Morris A

    2012-04-01

    xQTL workbench is a scalable web platform for the mapping of quantitative trait loci (QTLs) at multiple levels: for example gene expression (eQTL), protein abundance (pQTL), metabolite abundance (mQTL) and phenotype (phQTL) data. Popular QTL mapping methods for model organism and human populations are accessible via the web user interface. Large calculations scale easily on to multi-core computers, clusters and Cloud. All data involved can be uploaded and queried online: markers, genotypes, microarrays, NGS, LC-MS, GC-MS, NMR, etc. When new data types come available, xQTL workbench is quickly customized using the Molgenis software generator. xQTL workbench runs on all common platforms, including Linux, Mac OS X and Windows. An online demo system, installation guide, tutorials, software and source code are available under the LGPL3 license from http://www.xqtl.org. m.a.swertz@rug.nl.

  7. xQTL workbench: a scalable web environment for multi-level QTL analysis

    PubMed Central

    Arends, Danny; van der Velde, K. Joeri; Prins, Pjotr; Broman, Karl W.; Möller, Steffen; Jansen, Ritsert C.; Swertz, Morris A.

    2012-01-01

    Summary: xQTL workbench is a scalable web platform for the mapping of quantitative trait loci (QTLs) at multiple levels: for example gene expression (eQTL), protein abundance (pQTL), metabolite abundance (mQTL) and phenotype (phQTL) data. Popular QTL mapping methods for model organism and human populations are accessible via the web user interface. Large calculations scale easily on to multi-core computers, clusters and Cloud. All data involved can be uploaded and queried online: markers, genotypes, microarrays, NGS, LC-MS, GC-MS, NMR, etc. When new data types come available, xQTL workbench is quickly customized using the Molgenis software generator. Availability: xQTL workbench runs on all common platforms, including Linux, Mac OS X and Windows. An online demo system, installation guide, tutorials, software and source code are available under the LGPL3 license from http://www.xqtl.org. Contact: m.a.swertz@rug.nl PMID:22308096

  8. Gene expression profiling in the hippocampus of learned helpless and nonhelpless rats.

    PubMed

    Kohen, R; Kirov, S; Navaja, G P; Happe, H Kevin; Hamblin, M W; Snoddy, J R; Neumaier, J F; Petty, F

    2005-01-01

    In the learned helplessness (LH) animal model of depression, failure to attempt escape from avoidable environmental stress, LH, indicates behavioral despair, whereas nonhelpless (NH) behavior reflects behavioral resilience to the effects of environmental stress. Comparing hippocampal gene expression with large-scale oligonucleotide microarrays, we found that stress-resilient (NH) rats, although behaviorally indistinguishable from controls, showed a distinct gene expression profile compared to LH, sham stressed, and naïve control animals. Genes that were confirmed as differentially expressed in the NH group by quantitative PCR strongly correlated in their levels of expression across all four animal groups. Differential expression could not be confirmed at the protein level. We identified several shared degenerate sequence motifs in the 3' untranslated region (3'UTR) of differentially expressed genes that could be a factor in this tight correlation of expression levels among differentially expressed genes.

  9. BioStar models of clinical and genomic data for biomedical data warehouse design

    PubMed Central

    Wang, Liangjiang; Ramanathan, Murali

    2008-01-01

    Biomedical research is now generating large amounts of data, ranging from clinical test results to microarray gene expression profiles. The scale and complexity of these datasets give rise to substantial challenges in data management and analysis. It is highly desirable that data warehousing and online analytical processing technologies can be applied to biomedical data integration and mining. The major difficulty probably lies in the task of capturing and modelling diverse biological objects and their complex relationships. This paper describes multidimensional data modelling for biomedical data warehouse design. Since the conventional models such as star schema appear to be insufficient for modelling clinical and genomic data, we develop a new model called BioStar schema. The new model can capture the rich semantics of biomedical data and provide greater extensibility for the fast evolution of biological research methodologies. PMID:18048122

  10. Constructing Tissue Microarrays: Protocols and Methods Considering Potential Advantages and Disadvantages for Downstream Use.

    PubMed

    Bingle, Lynne; Fonseca, Felipe P; Farthing, Paula M

    2017-01-01

    Tissue microarrays were first constructed in the 1980s but were used by only a limited number of researchers for a considerable period of time. In the last 10 years there has been a dramatic increase in the number of publications describing the successful use of tissue microarrays in studies aimed at discovering and validating biomarkers. This, along with the increased availability of both manual and automated microarray builders on the market, has encouraged even greater use of this novel and powerful tool. This chapter describes the basic techniques required to build a tissue microarray using a manual method in order that the theory behind the practical steps can be fully explained. Guidance is given to ensure potential disadvantages of the technique are fully considered.

  11. Dynamic DNA cytosine methylation in the Populus trichocarpa genome: tissue-level variation and relationship to gene expression

    PubMed Central

    2012-01-01

    Background DNA cytosine methylation is an epigenetic modification that has been implicated in many biological processes. However, large-scale epigenomic studies have been applied to very few plant species, and variability in methylation among specialized tissues and its relationship to gene expression is poorly understood. Results We surveyed DNA methylation from seven distinct tissue types (vegetative bud, male inflorescence [catkin], female catkin, leaf, root, xylem, phloem) in the reference tree species black cottonwood (Populus trichocarpa). Using 5-methyl-cytosine DNA immunoprecipitation followed by Illumina sequencing (MeDIP-seq), we mapped a total of 129,360,151 36- or 32-mer reads to the P. trichocarpa reference genome. We validated MeDIP-seq results by bisulfite sequencing, and compared methylation and gene expression using published microarray data. Qualitative DNA methylation differences among tissues were obvious on a chromosome scale. Methylated genes had lower expression than unmethylated genes, but genes with methylation in transcribed regions ("gene body methylation") had even lower expression than genes with promoter methylation. Promoter methylation was more frequent than gene body methylation in all tissues except male catkins. Male catkins differed in demethylation of particular transposable element categories, in level of gene body methylation, and in expression range of genes with methylated transcribed regions. Tissue-specific gene expression patterns were correlated with both gene body and promoter methylation. Conclusions We found striking differences among tissues in methylation, which were apparent at the chromosomal scale and when genes and transposable elements were examined. In contrast to other studies in plants, gene body methylation had a more repressive effect on transcription than promoter methylation. PMID:22251412

  12. The effect of column purification on cDNA indirect labelling for microarrays

    PubMed Central

    Molas, M Lia; Kiss, John Z

    2007-01-01

    Background The success of the microarray reproducibility is dependent upon the performance of standardized procedures. Since the introduction of microarray technology for the analysis of global gene expression, reproducibility of results among different laboratories has been a major problem. Two of the main contributors to this variability are the use of different microarray platforms and different laboratory practices. In this paper, we address the latter question in terms of how variation in one of the steps of a labelling procedure affects the cDNA product prior to microarray hybridization. Results We used a standard procedure to label cDNA for microarray hybridization and employed different types of column chromatography for cDNA purification. After purifying labelled cDNA, we used the Agilent 2100 Bioanalyzer and agarose gel electrophoresis to assess the quality of the labelled cDNA before its hybridization onto a microarray platform. There were major differences in the cDNA profile (i.e. cDNA fragment lengths and abundance) as a result of using four different columns for purification. In addition, different columns have different efficiencies to remove rRNA contamination. This study indicates that the appropriate column to use in this type of protocol has to be experimentally determined. Finally, we present new evidence establishing the importance of testing the method of purification used during an indirect labelling procedure. Our results confirm the importance of assessing the quality of the sample in the labelling procedure prior to hybridization onto a microarray platform. Conclusion Standardization of column purification systems to be used in labelling procedures will improve the reproducibility of microarray results among different laboratories. In addition, implementation of a quality control check point of the labelled samples prior to microarray hybridization will prevent hybridizing a poor quality sample to expensive micorarrays. PMID:17597522

  13. The effect of column purification on cDNA indirect labelling for microarrays.

    PubMed

    Molas, M Lia; Kiss, John Z

    2007-06-27

    The success of the microarray reproducibility is dependent upon the performance of standardized procedures. Since the introduction of microarray technology for the analysis of global gene expression, reproducibility of results among different laboratories has been a major problem. Two of the main contributors to this variability are the use of different microarray platforms and different laboratory practices. In this paper, we address the latter question in terms of how variation in one of the steps of a labelling procedure affects the cDNA product prior to microarray hybridization. We used a standard procedure to label cDNA for microarray hybridization and employed different types of column chromatography for cDNA purification. After purifying labelled cDNA, we used the Agilent 2100 Bioanalyzer and agarose gel electrophoresis to assess the quality of the labelled cDNA before its hybridization onto a microarray platform. There were major differences in the cDNA profile (i.e. cDNA fragment lengths and abundance) as a result of using four different columns for purification. In addition, different columns have different efficiencies to remove rRNA contamination. This study indicates that the appropriate column to use in this type of protocol has to be experimentally determined. Finally, we present new evidence establishing the importance of testing the method of purification used during an indirect labelling procedure. Our results confirm the importance of assessing the quality of the sample in the labelling procedure prior to hybridization onto a microarray platform. Standardization of column purification systems to be used in labelling procedures will improve the reproducibility of microarray results among different laboratories. In addition, implementation of a quality control check point of the labelled samples prior to microarray hybridization will prevent hybridizing a poor quality sample to expensive micorarrays.

  14. Evolution of the MIDTAL microarray: the adaption and testing of oligonucleotide 18S and 28S rDNA probes and evaluation of subsequent microarray generations with Prymnesium spp. cultures and field samples.

    PubMed

    McCoy, Gary R; Touzet, Nicolas; Fleming, Gerard T A; Raine, Robin

    2015-07-01

    The toxic microalgal species Prymnesium parvum and Prymnesium polylepis are responsible for numerous fish kills causing economic stress on the aquaculture industry and, through the consumption of contaminated shellfish, can potentially impact on human health. Monitoring of toxic phytoplankton is traditionally carried out by light microscopy. However, molecular methods of identification and quantification are becoming more common place. This study documents the optimisation of the novel Microarrays for the Detection of Toxic Algae (MIDTAL) microarray from its initial stages to the final commercial version now available from Microbia Environnement (France). Existing oligonucleotide probes used in whole-cell fluorescent in situ hybridisation (FISH) for Prymnesium species from higher group probes to species-level probes were adapted and tested on the first-generation microarray. The combination and interaction of numerous other probes specific for a whole range of phytoplankton taxa also spotted on the chip surface caused high cross reactivity, resulting in false-positive results on the microarray. The probe sequences were extended for the subsequent second-generation microarray, and further adaptations of the hybridisation protocol and incubation temperatures significantly reduced false-positive readings from the first to the second-generation chip, thereby increasing the specificity of the MIDTAL microarray. Additional refinement of the subsequent third-generation microarray protocols with the addition of a poly-T amino linker to the 5' end of each probe further enhanced the microarray performance but also highlighted the importance of optimising RNA labelling efficiency when testing with natural seawater samples from Killary Harbour, Ireland.

  15. Spermatogenesis in mammals: proteomic insights.

    PubMed

    Chocu, Sophie; Calvel, Pierre; Rolland, Antoine D; Pineau, Charles

    2012-08-01

    Spermatogenesis is a highly sophisticated process involved in the transmission of genetic heritage. It includes halving ploidy, repackaging of the chromatin for transport, and the equipment of developing spermatids and eventually spermatozoa with the advanced apparatus (e.g., tightly packed mitochondrial sheat in the mid piece, elongating of the tail, reduction of cytoplasmic volume) to elicit motility once they reach the epididymis. Mammalian spermatogenesis is divided into three phases. In the first the primitive germ cells or spermatogonia undergo a series of mitotic divisions. In the second the spermatocytes undergo two consecutive divisions in meiosis to produce haploid spermatids. In the third the spermatids differentiate into spermatozoa in a process called spermiogenesis. Paracrine, autocrine, juxtacrine, and endocrine pathways all contribute to the regulation of the process. The array of structural elements and chemical factors modulating somatic and germ cell activity is such that the network linking the various cellular activities during spermatogenesis is unimaginably complex. Over the past two decades, advances in genomics have greatly improved our knowledge of spermatogenesis, by identifying numerous genes essential for the development of functional male gametes. Large-scale analyses of testicular function have deepened our insight into normal and pathological spermatogenesis. Progress in genome sequencing and microarray technology have been exploited for genome-wide expression studies, leading to the identification of hundreds of genes differentially expressed within the testis. However, although proteomics has now come of age, the proteomics-based investigation of spermatogenesis remains in its infancy. Here, we review the state-of-the-art of large-scale proteomic analyses of spermatogenesis, from germ cell development during sex determination to spermatogenesis in the adult. Indeed, a few laboratories have undertaken differential protein profiling expression studies and/or systematic analyses of testicular proteomes in entire organs or isolated cells from various species. We consider the pros and cons of proteomics for studying the testicular germ cell gene expression program. Finally, we address the use of protein datasets, through integrative genomics (i.e., combining genomics, transcriptomics, and proteomics), bioinformatics, and modelling.

  16. Characterization and simulation of cDNA microarray spots using a novel mathematical model

    PubMed Central

    Kim, Hye Young; Lee, Seo Eun; Kim, Min Jung; Han, Jin Il; Kim, Bo Kyung; Lee, Yong Sung; Lee, Young Seek; Kim, Jin Hyuk

    2007-01-01

    Background The quality of cDNA microarray data is crucial for expanding its application to other research areas, such as the study of gene regulatory networks. Despite the fact that a number of algorithms have been suggested to increase the accuracy of microarray gene expression data, it is necessary to obtain reliable microarray images by improving wet-lab experiments. As the first step of a cDNA microarray experiment, spotting cDNA probes is critical to determining the quality of spot images. Results We developed a governing equation of cDNA deposition during evaporation of a drop in the microarray spotting process. The governing equation included four parameters: the surface site density on the support, the extrapolated equilibrium constant for the binding of cDNA molecules with surface sites on glass slides, the macromolecular interaction factor, and the volume constant of a drop of cDNA solution. We simulated cDNA deposition from the single model equation by varying the value of the parameters. The morphology of the resulting cDNA deposit can be classified into three types: a doughnut shape, a peak shape, and a volcano shape. The spot morphology can be changed into a flat shape by varying the experimental conditions while considering the parameters of the governing equation of cDNA deposition. The four parameters were estimated by fitting the governing equation to the real microarray images. With the results of the simulation and the parameter estimation, the phenomenon of the formation of cDNA deposits in each type was investigated. Conclusion This study explains how various spot shapes can exist and suggests which parameters are to be adjusted for obtaining a good spot. This system is able to explore the cDNA microarray spotting process in a predictable, manageable and descriptive manner. We hope it can provide a way to predict the incidents that can occur during a real cDNA microarray experiment, and produce useful data for several research applications involving cDNA microarrays. PMID:18096047

  17. HeatmapGenerator: high performance RNAseq and microarray visualization software suite to examine differential gene expression levels using an R and C++ hybrid computational pipeline.

    PubMed

    Khomtchouk, Bohdan B; Van Booven, Derek J; Wahlestedt, Claes

    2014-01-01

    The graphical visualization of gene expression data using heatmaps has become an integral component of modern-day medical research. Heatmaps are used extensively to plot quantitative differences in gene expression levels, such as those measured with RNAseq and microarray experiments, to provide qualitative large-scale views of the transcriptonomic landscape. Creating high-quality heatmaps is a computationally intensive task, often requiring considerable programming experience, particularly for customizing features to a specific dataset at hand. Software to create publication-quality heatmaps is developed with the R programming language, C++ programming language, and OpenGL application programming interface (API) to create industry-grade high performance graphics. We create a graphical user interface (GUI) software package called HeatmapGenerator for Windows OS and Mac OS X as an intuitive, user-friendly alternative to researchers with minimal prior coding experience to allow them to create publication-quality heatmaps using R graphics without sacrificing their desired level of customization. The simplicity of HeatmapGenerator is that it only requires the user to upload a preformatted input file and download the publicly available R software language, among a few other operating system-specific requirements. Advanced features such as color, text labels, scaling, legend construction, and even database storage can be easily customized with no prior programming knowledge. We provide an intuitive and user-friendly software package, HeatmapGenerator, to create high-quality, customizable heatmaps generated using the high-resolution color graphics capabilities of R. The software is available for Microsoft Windows and Apple Mac OS X. HeatmapGenerator is released under the GNU General Public License and publicly available at: http://sourceforge.net/projects/heatmapgenerator/. The Mac OS X direct download is available at: http://sourceforge.net/projects/heatmapgenerator/files/HeatmapGenerator_MAC_OSX.tar.gz/download. The Windows OS direct download is available at: http://sourceforge.net/projects/heatmapgenerator/files/HeatmapGenerator_WINDOWS.zip/download.

  18. Single-Nucleotide Polymorphism-Microarray Ploidy Analysis of Paraffin-Embedded Products of Conception in Recurrent Pregnancy Loss Evaluations.

    PubMed

    Maslow, Bat-Sheva L; Budinetz, Tara; Sueldo, Carolina; Anspach, Erica; Engmann, Lawrence; Benadiva, Claudio; Nulsen, John C

    2015-07-01

    To compare the analysis of chromosome number from paraffin-embedded products of conception using single-nucleotide polymorphism (SNP) microarray with the recommended screening for the evaluation of couples presenting with recurrent pregnancy loss who do not have previous fetal cytogenetic data. We performed a retrospective cohort study including all women who presented for a new evaluation of recurrent pregnancy loss over a 2-year period (January 1, 2012, to December 31, 2013). All participants had at least two documented first-trimester losses and both the recommended screening tests and SNP microarray performed on at least one paraffin-embedded products of conception sample. Single-nucleotide polymorphism microarray identifies all 24 chromosomes (22 autosomes, X, and Y). Forty-two women with a total of 178 losses were included in the study. Paraffin-embedded products of conception from 62 losses were sent for SNP microarray. Single-nucleotide polymorphism microarray successfully diagnosed fetal chromosome number in 71% (44/62) of samples, of which 43% (19/44) were euploid and 57% (25/44) were noneuploid. Seven of 42 (17%) participants had abnormalities on recurrent pregnancy loss screening. The per-person detection rate for a cause of pregnancy loss was significantly higher in the SNP microarray (0.50; 95% confidence interval [CI] 0.36-0.64) compared with recurrent pregnancy loss evaluation (0.17; 95% CI 0.08-0.31) (P=.002). Participants with one or more euploid loss identified on paraffin-embedded products of conception were significantly more likely to have an abnormality on recurrent pregnancy loss screening than those with only noneuploid results (P=.028). The significance remained when controlling for age, number of losses, number of samples, and total pregnancies. These results suggest that SNP microarray testing of paraffin-embedded products of conception is a valuable tool for the evaluation of recurrent pregnancy loss in patients without prior fetal cytogenetic results. Recommended recurrent pregnancy loss screening was unnecessary in almost half the patients in our study. II.

  19. The Choice of the Filtering Method in Microarrays Affects the Inference Regarding Dosage Compensation of the Active X-Chromosome

    PubMed Central

    Zeller, Tanja; Wild, Philipp S.; Truong, Vinh; Trégouët, David-Alexandre; Munzel, Thomas; Ziegler, Andreas; Cambien, François; Blankenberg, Stefan; Tiret, Laurence

    2011-01-01

    Background The hypothesis of dosage compensation of genes of the X chromosome, supported by previous microarray studies, was recently challenged by RNA-sequencing data. It was suggested that microarray studies were biased toward an over-estimation of X-linked expression levels as a consequence of the filtering of genes below the detection threshold of microarrays. Methodology/Principal Findings To investigate this hypothesis, we used microarray expression data from circulating monocytes in 1,467 individuals. In total, 25,349 and 1,156 probes were unambiguously assigned to autosomes and the X chromosome, respectively. Globally, there was a clear shift of X-linked expressions toward lower levels than autosomes. We compared the ratio of expression levels of X-linked to autosomal transcripts (X∶AA) using two different filtering methods: 1. gene expressions were filtered out using a detection threshold irrespective of gene chromosomal location (the standard method in microarrays); 2. equal proportions of genes were filtered out separately on the X and on autosomes. For a wide range of filtering proportions, the X∶AA ratio estimated with the first method was not significantly different from 1, the value expected if dosage compensation was achieved, whereas it was significantly lower than 1 with the second method, leading to the rejection of the hypothesis of dosage compensation. We further showed in simulated data that the choice of the most appropriate method was dependent on biological assumptions regarding the proportion of actively expressed genes on the X chromosome comparative to the autosomes and the extent of dosage compensation. Conclusion/Significance This study shows that the method used for filtering out lowly expressed genes in microarrays may have a major impact according to the hypothesis investigated. The hypothesis of dosage compensation of X-linked genes cannot be firmly accepted or rejected using microarray-based data. PMID:21912656

  20. Electrosonic ejector microarray for drug and gene delivery.

    PubMed

    Zarnitsyn, Vladimir G; Meacham, J Mark; Varady, Mark J; Hao, Chunhai; Degertekin, F Levent; Fedorov, Andrei G

    2008-04-01

    We report on development and experimental characterization of a novel cell manipulation device-the electrosonic ejector microarray-which establishes a pathway for drug and/or gene delivery with control of biophysical action on the length scale of an individual cell. The device comprises a piezoelectric transducer for ultrasound wave generation, a reservoir for storing the sample mixture and a set of acoustic horn structures that form a nozzle array for focused application of mechanical energy. The nozzles are micromachined in silicon or plastic using simple and economical batch fabrication processes. When the device is driven at a particular resonant frequency of the acoustic horn structures, the sample mixture of cells and desired transfection agents/molecules suspended in culture medium is ejected from orifices located at the nozzle tips. During sample ejection, focused mechanical forces (pressure and shear) are generated on a microsecond time scale (dictated by nozzle size/geometry and ejection velocity) resulting in identical "active" microenvironments for each ejected cell. This process enables a number of cellular bioeffects, from uptake of small molecules and gene delivery/transfection to cell lysis. Specifically, we demonstrate successful calcein uptake and transfection of DNA plasmid encoding green fluorescent protein (GFP) into human malignant glioma cells (cell line LN443) using electrosonic microarrays with 36, 45 and 50 mum diameter nozzle orifices and operating at ultrasound frequencies between 0.91 and 0.98 MHz. Our results suggest that efficacy and the extent of bioeffects are mainly controlled by nozzle orifice size and the localized intensity of the applied acoustic field.

  1. A study of metaheuristic algorithms for high dimensional feature selection on microarray data

    NASA Astrophysics Data System (ADS)

    Dankolo, Muhammad Nasiru; Radzi, Nor Haizan Mohamed; Sallehuddin, Roselina; Mustaffa, Noorfa Haszlinna

    2017-11-01

    Microarray systems enable experts to examine gene profile at molecular level using machine learning algorithms. It increases the potentials of classification and diagnosis of many diseases at gene expression level. Though, numerous difficulties may affect the efficiency of machine learning algorithms which includes vast number of genes features comprised in the original data. Many of these features may be unrelated to the intended analysis. Therefore, feature selection is necessary to be performed in the data pre-processing. Many feature selection algorithms are developed and applied on microarray which including the metaheuristic optimization algorithms. This paper discusses the application of the metaheuristics algorithms for feature selection in microarray dataset. This study reveals that, the algorithms have yield an interesting result with limited resources thereby saving computational expenses of machine learning algorithms.

  2. Comprehensive analysis of differentially expressed profiles of lncRNAs and construction of miR-133b mediated ceRNA network in colorectal cancer.

    PubMed

    Wu, Hao; Wu, Runliu; Chen, Miao; Li, Daojiang; Dai, Jing; Zhang, Yi; Gao, Kai; Yu, Jun; Hu, Gui; Guo, Yihang; Lin, Changwei; Li, Xiaorong

    2017-03-28

    Growing evidence suggests that long non-coding RNAs (lncRNAs) play a key role in tumorigenesis. However, the mechanism remains largely unknown. Thousands of significantly dysregulated lncRNAs and mRNAs were identified by microarray. Furthermore, a miR-133b-meditated lncRNA-mRNA ceRNA network was revealed, a subset of which was validated in 14 paired CRC patient tumor/non-tumor samples. Gene set enrichment analysis (GSEA) results demonstrated that lncRNAs ENST00000520055 and ENST00000535511 shared KEGG pathways with miR-133b target genes. We used microarrays to survey the lncRNA and mRNA expression profiles of colorectal cancer and para-cancer tissues. Gene Ontology (GO) and KEGG pathway enrichment analyses were performed to explore the functions of the significantly dysregulated genes. An innovate method was employed that combined analyses of two microarray data sets to construct a miR-133b-mediated lncRNA-mRNA competing endogenous RNAs (ceRNA) network. Quantitative RT-PCR analysis was used to validate part of this network. GSEA was used to predict the potential functions of these lncRNAs. This study identifies and validates a new method to investigate the miR-133b-mediated lncRNA-mRNA ceRNA network and lays the foundation for future investigation into the role of lncRNAs in colorectal cancer.

  3. [Differential gene expression in incompatible interaction between Lilium regale Wilson and Fusarium oxysporum f. sp. lilii revealed by combined SSH and microarray analysis].

    PubMed

    Rao, J; Liu, D; Zhang, N; He, H; Ge, F; Chen, C

    2014-01-01

    Fusarium wilt, caused by a soilborne pathogen Fusarium oxysporum f. sp. lilii, is the major disease of lily (Lilium L.). In order to isolate the genes differentially expressed in a resistant reaction to F. oxysporum in L. regale Wilson, a cDNA library was constructed with L. regale root during F. oxysporum infection using the suppression subtractive hybridization (SSH), and a total of 585 unique expressed sequence tags (ESTs) were obtained. Furthermore, the gene expression profiles in the incompatible interaction between L. regale and F. oxysporum were revealed by oligonucleotide microarray analysis of 585 unique ESTs comparison to the compatible interaction between a susceptible Lilium Oriental Hybrid 'Siberia' and F. oxysporum. The result of expression profile analysis indicated that the genes encoding pathogenesis-related proteins (PRs), antioxidative stress enzymes, secondary metabolism enzymes, transcription factors, signal transduction proteins as well as a large number of unknown genes were involved in early defense response of L. regale to F. oxysporum infection. Moreover, the following quantitative reverse transcription PCR (QRT-PCR) analysis confirmed reliability of the oligonucleotide microarray data. In the present study, isolation of differentially expressed genes in L. regale during response to F. oxysporum helped to uncover the molecular mechanism associated with the resistance of L. regale against F. oxysporum.

  4. Overexpression of GRß in colonic mucosal cell line partly reflects altered gene expression in colonic mucosa of patients with inflammatory bowel disease.

    PubMed

    Nagy, Zsolt; Acs, Bence; Butz, Henriett; Feldman, Karolina; Marta, Alexa; Szabo, Peter M; Baghy, Kornelia; Pazmany, Tamas; Racz, Karoly; Liko, Istvan; Patocs, Attila

    2016-01-01

    The glucocorticoid receptor (GR) plays a crucial role in inflammatory responses. GR has several isoforms, of which the most deeply studied are the GRα and GRß. Recently it has been suggested that in addition to its negative dominant effect on GRα, the GRß may have a GRα-independent transcriptional activity. The GRß isoform was found to be frequently overexpressed in various autoimmune diseases, including inflammatory bowel disease (IBD). In this study, we wished to test whether the gene expression profile found in a GRß overexpressing intestinal cell line (Caco-2GRß) might mimic the gene expression alterations found in patients with IBD. Whole genome microarray analysis was performed in both normal and GRß overexpressing Caco-2 cell lines with and without dexamethasone treatment. IBD-related genes were identified from a meta-analysis of 245 microarrays available in online microarray deposits performed on intestinal mucosa samples from patients with IBD and healthy individuals. The differentially expressed genes were further studied using in silico pathway analysis. Overexpression of GRß altered a large proportion of genes that were not regulated by dexamethasone suggesting that GRß may have a GRα-independent role in the regulation of gene expression. About 10% of genes differentially expressed in colonic mucosa samples from IBD patients compared to normal subjects were also detected in Caco-2 GRß intestinal cell line. Common genes are involved in cell adhesion and cell proliferation. Overexpression of GRß in intestinal cells may affect appropriate mucosal repair and intact barrier function. The proposed novel role of GRß in intestinal epithelium warrants further studies. Copyright © 2015 Elsevier Ltd. All rights reserved.

  5. The Use of P63 Immunohistochemistry for the Identification of Squamous Cell Carcinoma of the Lung

    PubMed Central

    Conde, Esther; Angulo, Bárbara; Redondo, Pilar; Toldos, Oscar; García-García, Elena; Suárez-Gauthier, Ana; Rubio-Viqueira, Belén; Marrón, Carmen; García-Luján, Ricardo; Sánchez-Céspedes, Montse; López-Encuentra, Angel; Paz-Ares, Luis; López-Ríos, Fernando

    2010-01-01

    Introduction While some targeted agents should not be used in squamous cell carcinomas (SCCs), other agents might preferably target SCCs. In a previous microarray study, one of the top differentially expressed genes between adenocarcinomas (ACs) and SCCs is P63. It is a well-known marker of squamous differentiation, but surprisingly, its expression is not widely used for this purpose. Our goals in this study were (1) to further confirm our microarray data, (2) to analize the value of P63 immunohistochemistry (IHC) in reducing the number of large cell carcinoma (LCC) diagnoses in surgical specimens, and (3) to investigate the potential of P63 IHC to minimize the proportion of “carcinoma NOS (not otherwise specified)” in a prospective series of small tumor samples. Methods With these goals in mind, we studied (1) a tissue-microarray comprising 33 ACs and 99 SCCs on which we performed P63 IHC, (2) a series of 20 surgically resected LCCs studied for P63 and TTF-1 IHC, and (3) a prospective cohort of 66 small thoracic samples, including 32 carcinoma NOS, that were further classified by the result of P63 and TTF-1 IHC. Results The results in the three independent cohorts were as follows: (1) P63 IHC was differentially expressed in SCCs when compared to ACs (p<0.0001); (2) half of the 20 (50%) LCCs were positive for P63 and were reclassified as SCCs; and (3) all P63 positive cases (34%) were diagnosed as SCCs. Conclusions P63 IHC is useful for the identification of lung SCCs. PMID:20808915

  6. DFP: a Bioconductor package for fuzzy profile identification and gene reduction of microarray data

    PubMed Central

    Glez-Peña, Daniel; Álvarez, Rodrigo; Díaz, Fernando; Fdez-Riverola, Florentino

    2009-01-01

    Background Expression profiling assays done by using DNA microarray technology generate enormous data sets that are not amenable to simple analysis. The greatest challenge in maximizing the use of this huge amount of data is to develop algorithms to interpret and interconnect results from different genes under different conditions. In this context, fuzzy logic can provide a systematic and unbiased way to both (i) find biologically significant insights relating to meaningful genes, thereby removing the need for expert knowledge in preliminary steps of microarray data analyses and (ii) reduce the cost and complexity of later applied machine learning techniques being able to achieve interpretable models. Results DFP is a new Bioconductor R package that implements a method for discretizing and selecting differentially expressed genes based on the application of fuzzy logic. DFP takes advantage of fuzzy membership functions to assign linguistic labels to gene expression levels. The technique builds a reduced set of relevant genes (FP, Fuzzy Pattern) able to summarize and represent each underlying class (pathology). A last step constructs a biased set of genes (DFP, Discriminant Fuzzy Pattern) by intersecting existing fuzzy patterns in order to detect discriminative elements. In addition, the software provides new functions and visualisation tools that summarize achieved results and aid in the interpretation of differentially expressed genes from multiple microarray experiments. Conclusion DFP integrates with other packages of the Bioconductor project, uses common data structures and is accompanied by ample documentation. It has the advantage that its parameters are highly configurable, facilitating the discovery of biologically relevant connections between sets of genes belonging to different pathologies. This information makes it possible to automatically filter irrelevant genes thereby reducing the large volume of data supplied by microarray experiments. Based on these contributions GENECBR, a successful tool for cancer diagnosis using microarray datasets, has recently been released. PMID:19178723

  7. DFP: a Bioconductor package for fuzzy profile identification and gene reduction of microarray data.

    PubMed

    Glez-Peña, Daniel; Alvarez, Rodrigo; Díaz, Fernando; Fdez-Riverola, Florentino

    2009-01-29

    Expression profiling assays done by using DNA microarray technology generate enormous data sets that are not amenable to simple analysis. The greatest challenge in maximizing the use of this huge amount of data is to develop algorithms to interpret and interconnect results from different genes under different conditions. In this context, fuzzy logic can provide a systematic and unbiased way to both (i) find biologically significant insights relating to meaningful genes, thereby removing the need for expert knowledge in preliminary steps of microarray data analyses and (ii) reduce the cost and complexity of later applied machine learning techniques being able to achieve interpretable models. DFP is a new Bioconductor R package that implements a method for discretizing and selecting differentially expressed genes based on the application of fuzzy logic. DFP takes advantage of fuzzy membership functions to assign linguistic labels to gene expression levels. The technique builds a reduced set of relevant genes (FP, Fuzzy Pattern) able to summarize and represent each underlying class (pathology). A last step constructs a biased set of genes (DFP, Discriminant Fuzzy Pattern) by intersecting existing fuzzy patterns in order to detect discriminative elements. In addition, the software provides new functions and visualisation tools that summarize achieved results and aid in the interpretation of differentially expressed genes from multiple microarray experiments. DFP integrates with other packages of the Bioconductor project, uses common data structures and is accompanied by ample documentation. It has the advantage that its parameters are highly configurable, facilitating the discovery of biologically relevant connections between sets of genes belonging to different pathologies. This information makes it possible to automatically filter irrelevant genes thereby reducing the large volume of data supplied by microarray experiments. Based on these contributions GENECBR, a successful tool for cancer diagnosis using microarray datasets, has recently been released.

  8. The use of open source bioinformatics tools to dissect transcriptomic data.

    PubMed

    Nitsche, Benjamin M; Ram, Arthur F J; Meyer, Vera

    2012-01-01

    Microarrays are a valuable technology to study fungal physiology on a transcriptomic level. Various microarray platforms are available comprising both single and two channel arrays. Despite different technologies, preprocessing of microarray data generally includes quality control, background correction, normalization, and summarization of probe level data. Subsequently, depending on the experimental design, diverse statistical analysis can be performed, including the identification of differentially expressed genes and the construction of gene coexpression networks.We describe how Bioconductor, a collection of open source and open development packages for the statistical programming language R, can be used for dissecting microarray data. We provide fundamental details that facilitate the process of getting started with R and Bioconductor. Using two publicly available microarray datasets from Aspergillus niger, we give detailed protocols on how to identify differentially expressed genes and how to construct gene coexpression networks.

  9. The Carriage Of Multiresistant Bacteria After Travel (COMBAT) prospective cohort study: methodology and design.

    PubMed

    Arcilla, Maris S; van Hattem, Jarne M; Bootsma, Martin C J; van Genderen, Perry J; Goorhuis, Abraham; Schultsz, Constance; Stobberingh, Ellen E; Verbrugh, Henri A; de Jong, Menno D; Melles, Damian C; Penders, John

    2014-04-28

    Antimicrobial resistance (AMR) is one of the major threats to public health around the world. Besides the intense use and misuse of antimicrobial agents as the major force behind the increase in antimicrobial resistance, the exponential increase of international travel may also substantially contribute to the emergence and spread of AMR. However, knowledge on the extent to which international travel contributes to this is still limited. The Carriage Of Multiresistant Bacteria After Travel (COMBAT) study aims to 1. determine the acquisition rate of multiresistant Enterobacteriaceae during foreign travel 2. ascertain the duration of carriage of these micro-organisms 3. determine the transmission rate within households 4. identify risk factors for acquisition, persistence of carriage and transmission of multiresistant Enterobacteriaceae. The COMBAT-study is a large-scale multicenter longitudinal cohort study among travellers (n = 2001) and their non-travelling household members (n = 215). Faecal samples are collected before and immediately after travel and 1 month after return from all participants. Follow-up faecal samples are collected 3, 6 and 12 months after return from travellers (and their non-travelling household members) who acquired multiresistant Enterobacteriaceae. Questionnaires are collected from all participants at each time-point. Faecal samples are screened phenotypically for the presence of extended-spectrum beta-lactamase (ESBL) or carbapenemase-producing Enterobacteriaceae. Positive post-travel isolates from travellers with negative pre-travel samples are genotypically analysed for ESBL and carbapenemase genes with microarray and gene sequencing. The design and scale of the COMBAT-study will enable us to provide much needed detailed insights into the risks and dynamics of introduction and spread of ESBL- and carbapenemase-producing Enterobacteriaceae by healthy travellers and the potential need and measures to monitor or manage these risks. The study is registered at clinicaltrials.gov under accession number NCT01676974.

  10. Blood group genotyping for Jk(a)/Jk(b), Fy(a)/Fy(b), S/s, K/k, Kp(a)/Kp(b), Js(a)/Js(b), Co(a)/Co(b), and Lu(a)/Lu(b) with microarray beads.

    PubMed

    Karpasitou, Katerina; Drago, Francesca; Crespiatico, Loretta; Paccapelo, Cinzia; Truglio, Francesca; Frison, Sara; Scalamogna, Mario; Poli, Francesca

    2008-03-01

    Traditionally, blood group typing has been performed with serologic techniques, the classical method being the hemagglutination test. Serotyping, however, may present important limitations such as scarce availability of rare antisera, typing of recently transfused patients, and those with a positive direct antiglobulin test. Consequently, serologic tests are being complemented with molecular methods. The aim of this study was to develop a low-cost, high-throughput method for large-scale genotyping of red blood cells (RBCs). Single-nucleotide polymorphisms associated with some clinically important blood group antigens, as well as with certain rare blood antigens, were evaluated: Jk(a)/Jk(b), Fy(a)/Fy(b), S/s, K/k, Kp(a)/Kp(b), Js(a)/Js(b), Co(a)/Co(b), and Lu(a)/Lu(b). Polymerase chain reaction (PCR)-amplified targets were detected by direct hybridization to microspheres coupled to allele-specific oligonucleotides. Cutoff values for each genotype were established with phenotyped and/or genotyped samples. The method was validated with a blind panel of 92 blood donor samples. The results were fully concordant with those provided by hemagglutination assays and/or sequence-specific primer (SSP)-PCR. The method was subsequently evaluated with approximately 800 blood donor and patient samples. This study presents a flexible, quick, and economical method for complete genotyping of large donor cohorts for RBC alleles.

  11. CYANOBACTERIA AND CYANOTOXINS IN WATER SUPPLY RESERVOIRS – TO DEVELOP AND VALIDATE A MICROARRAY TO TEST FOR CYANOBACTERIA AND CYANOTOXIN GENES IN DRINKING WATER RESERVOIRS AS AN AID TO RISK ASSESSMENT AND MANAGEMENT OF WATER SUPPLIES

    EPA Science Inventory

    The objective of this study is to develop a microarray to test for cyanobacteria and cyanotoxin genes in drinking water reservoirs as an aid to risk assessment and manages of water supplies. The microarray will include probes recognizing important freshwater cyanobacterial tax...

  12. Microarray platform affords improved product analysis in mammalian cell growth studies

    PubMed Central

    Li, Lingyun; Migliore, Nicole; Schaefer, Eugene; Sharfstein, Susan T.; Dordick, Jonathan S.; Linhardt, Robert J.

    2014-01-01

    High throughput (HT) platforms serve as cost-efficient and rapid screening method for evaluating the effect of cell culture conditions and screening of chemicals. The aim of the current study was to develop a high-throughput cell-based microarray platform to assess the effect of culture conditions on Chinese hamster ovary (CHO) cells. Specifically, growth, transgene expression and metabolism of a GS/MSX CHO cell line, which produces a therapeutic monoclonal antibody, was examined using microarray system in conjunction with conventional shake flask platform in a non-proprietary medium. The microarray system consists of 60 nl spots of cells encapsulated in alginate and separated in groups via an 8-well chamber system attached to the chip. Results show the non-proprietary medium developed allows cell growth, production and normal glycosylation of recombinant antibody and metabolism of the recombinant CHO cells in both the microarray and shake flask platforms. In addition, 10.3 mM glutamate addition to the defined base media results in lactate metabolism shift in the recombinant GS/MSX CHO cells in the shake flask platform. Ultimately, the results demonstrate that the high-throughput microarray platform has the potential to be utilized for evaluating the impact of media additives on cellular processes, such as, cell growth, metabolism and productivity. PMID:24227746

  13. Circular RNA hsa_circ_0003575 regulates oxLDL induced vascular endothelial cells proliferation and angiogenesis.

    PubMed

    Li, Chen-Ye; Ma, Lan; Yu, Bo

    2017-11-01

    Circular RNAs (circRNAs) are a novel class of RNAs generated from back-splicing and characterized by covalently closed continuous loops. Recently, circRNAs have recently shown large regulation on cardiovascular system, including atherosclerosis. The present study aims to investigate the circRNA expression profile and identify their roles on vascular endothelial cells induced by oxLDL. Human circRNA microarray analysis revealed that total 943 differently expressed circRNAs were screened with 2 fold change. Hsa_circ_0003575 was validated to be significantly up-regulated in oxLDL induced HUVECs. Loss-of-function experiments indicated that hsa_circ_0003575 silencing promoted the proliferation and angiogenesis ability of HUVECs. Bioinformatics online programs predicted the potential circRNA-miRNA-mRNA network for hsa_circ_0003575. In summary, circRNA microarray analysis reveals the expression profiles of HUVECs and verifies the role of hsa_circ_0003575 on HUVECs, providing a therapeutic strategy for vascular endothelial cell injury of atherosclerosis. Copyright © 2017. Published by Elsevier Masson SAS.

  14. Emerging Use of Gene Expression Microarrays in Plant Physiology

    DOE PAGES

    Wullschleger, Stan D.; Difazio, Stephen P.

    2003-01-01

    Microarrays have become an important technology for the global analysis of gene expression in humans, animals, plants, and microbes. Implemented in the context of a well-designed experiment, cDNA and oligonucleotide arrays can provide highthroughput, simultaneous analysis of transcript abundance for hundreds, if not thousands, of genes. However, despite widespread acceptance, the use of microarrays as a tool to better understand processes of interest to the plant physiologist is still being explored. To help illustrate current uses of microarrays in the plant sciences, several case studies that we believe demonstrate the emerging application of gene expression arrays in plant physiology weremore » selected from among the many posters and presentations at the 2003 Plant and Animal Genome XI Conference. Based on this survey, microarrays are being used to assess gene expression in plants exposed to the experimental manipulation of air temperature, soil water content and aluminium concentration in the root zone. Analysis often includes characterizing transcript profiles for multiple post-treatment sampling periods and categorizing genes with common patterns of response using hierarchical clustering techniques. In addition, microarrays are also providing insights into developmental changes in gene expression associated with fibre and root elongation in cotton and maize, respectively. Technical and analytical limitations of microarrays are discussed and projects attempting to advance areas of microarray design and data analysis are highlighted. Finally, although much work remains, we conclude that microarrays are a valuable tool for the plant physiologist interested in the characterization and identification of individual genes and gene families with potential application in the fields of agriculture, horticulture and forestry.« less

  15. Profiling In Situ Microbial Community Structure with an Amplification Microarray

    PubMed Central

    Knickerbocker, Christopher; Bryant, Lexi; Golova, Julia; Wiles, Cory; Williams, Kenneth H.; Peacock, Aaron D.; Long, Philip E.

    2013-01-01

    The objectives of this study were to unify amplification, labeling, and microarray hybridization chemistries within a single, closed microfluidic chamber (an amplification microarray) and verify technology performance on a series of groundwater samples from an in situ field experiment designed to compare U(VI) mobility under conditions of various alkalinities (as HCO3−) during stimulated microbial activity accompanying acetate amendment. Analytical limits of detection were between 2 and 200 cell equivalents of purified DNA. Amplification microarray signatures were well correlated with 16S rRNA-targeted quantitative PCR results and hybridization microarray signatures. The succession of the microbial community was evident with and consistent between the two microarray platforms. Amplification microarray analysis of acetate-treated groundwater showed elevated levels of iron-reducing bacteria (Flexibacter, Geobacter, Rhodoferax, and Shewanella) relative to the average background profile, as expected. Identical molecular signatures were evident in the transect treated with acetate plus NaHCO3, but at much lower signal intensities and with a much more rapid decline (to nondetection). Azoarcus, Thaurea, and Methylobacterium were responsive in the acetate-only transect but not in the presence of bicarbonate. Observed differences in microbial community composition or response to bicarbonate amendment likely had an effect on measured rates of U reduction, with higher rates probable in the part of the field experiment that was amended with bicarbonate. The simplification in microarray-based work flow is a significant technological advance toward entirely closed-amplicon microarray-based tests and is generally extensible to any number of environmental monitoring applications. PMID:23160129

  16. Fibre optic microarrays.

    PubMed

    Walt, David R

    2010-01-01

    This tutorial review describes how fibre optic microarrays can be used to create a variety of sensing and measurement systems. This review covers the basics of optical fibres and arrays, the different microarray architectures, and describes a multitude of applications. Such arrays enable multiplexed sensing for a variety of analytes including nucleic acids, vapours, and biomolecules. Polymer-coated fibre arrays can be used for measuring microscopic chemical phenomena, such as corrosion and localized release of biochemicals from cells. In addition, these microarrays can serve as a substrate for fundamental studies of single molecules and single cells. The review covers topics of interest to chemists, biologists, materials scientists, and engineers.

  17. Genome Expression Pathway Analysis Tool – Analysis and visualization of microarray gene expression data under genomic, proteomic and metabolic context

    PubMed Central

    Weniger, Markus; Engelmann, Julia C; Schultz, Jörg

    2007-01-01

    Background Regulation of gene expression is relevant to many areas of biology and medicine, in the study of treatments, diseases, and developmental stages. Microarrays can be used to measure the expression level of thousands of mRNAs at the same time, allowing insight into or comparison of different cellular conditions. The data derived out of microarray experiments is highly dimensional and often noisy, and interpretation of the results can get intricate. Although programs for the statistical analysis of microarray data exist, most of them lack an integration of analysis results and biological interpretation. Results We have developed GEPAT, Genome Expression Pathway Analysis Tool, offering an analysis of gene expression data under genomic, proteomic and metabolic context. We provide an integration of statistical methods for data import and data analysis together with a biological interpretation for subsets of probes or single probes on the chip. GEPAT imports various types of oligonucleotide and cDNA array data formats. Different normalization methods can be applied to the data, afterwards data annotation is performed. After import, GEPAT offers various statistical data analysis methods, as hierarchical, k-means and PCA clustering, a linear model based t-test or chromosomal profile comparison. The results of the analysis can be interpreted by enrichment of biological terms, pathway analysis or interaction networks. Different biological databases are included, to give various information for each probe on the chip. GEPAT offers no linear work flow, but allows the usage of any subset of probes and samples as a start for a new data analysis. GEPAT relies on established data analysis packages, offers a modular approach for an easy extension, and can be run on a computer grid to allow a large number of users. It is freely available under the LGPL open source license for academic and commercial users at . Conclusion GEPAT is a modular, scalable and professional-grade software integrating analysis and interpretation of microarray gene expression data. An installation available for academic users can be found at . PMID:17543125

  18. Comparative study of classification algorithms for immunosignaturing data

    PubMed Central

    2012-01-01

    Background High-throughput technologies such as DNA, RNA, protein, antibody and peptide microarrays are often used to examine differences across drug treatments, diseases, transgenic animals, and others. Typically one trains a classification system by gathering large amounts of probe-level data, selecting informative features, and classifies test samples using a small number of features. As new microarrays are invented, classification systems that worked well for other array types may not be ideal. Expression microarrays, arguably one of the most prevalent array types, have been used for years to help develop classification algorithms. Many biological assumptions are built into classifiers that were designed for these types of data. One of the more problematic is the assumption of independence, both at the probe level and again at the biological level. Probes for RNA transcripts are designed to bind single transcripts. At the biological level, many genes have dependencies across transcriptional pathways where co-regulation of transcriptional units may make many genes appear as being completely dependent. Thus, algorithms that perform well for gene expression data may not be suitable when other technologies with different binding characteristics exist. The immunosignaturing microarray is based on complex mixtures of antibodies binding to arrays of random sequence peptides. It relies on many-to-many binding of antibodies to the random sequence peptides. Each peptide can bind multiple antibodies and each antibody can bind multiple peptides. This technology has been shown to be highly reproducible and appears promising for diagnosing a variety of disease states. However, it is not clear what is the optimal classification algorithm for analyzing this new type of data. Results We characterized several classification algorithms to analyze immunosignaturing data. We selected several datasets that range from easy to difficult to classify, from simple monoclonal binding to complex binding patterns in asthma patients. We then classified the biological samples using 17 different classification algorithms. Using a wide variety of assessment criteria, we found ‘Naïve Bayes’ far more useful than other widely used methods due to its simplicity, robustness, speed and accuracy. Conclusions ‘Naïve Bayes’ algorithm appears to accommodate the complex patterns hidden within multilayered immunosignaturing microarray data due to its fundamental mathematical properties. PMID:22720696

  19. CoryneRegNet 4.0 – A reference database for corynebacterial gene regulatory networks

    PubMed Central

    Baumbach, Jan

    2007-01-01

    Background Detailed information on DNA-binding transcription factors (the key players in the regulation of gene expression) and on transcriptional regulatory interactions of microorganisms deduced from literature-derived knowledge, computer predictions and global DNA microarray hybridization experiments, has opened the way for the genome-wide analysis of transcriptional regulatory networks. The large-scale reconstruction of these networks allows the in silico analysis of cell behavior in response to changing environmental conditions. We previously published CoryneRegNet, an ontology-based data warehouse of corynebacterial transcription factors and regulatory networks. Initially, it was designed to provide methods for the analysis and visualization of the gene regulatory network of Corynebacterium glutamicum. Results Now we introduce CoryneRegNet release 4.0, which integrates data on the gene regulatory networks of 4 corynebacteria, 2 mycobacteria and the model organism Escherichia coli K12. As the previous versions, CoryneRegNet provides a web-based user interface to access the database content, to allow various queries, and to support the reconstruction, analysis and visualization of regulatory networks at different hierarchical levels. In this article, we present the further improved database content of CoryneRegNet along with novel analysis features. The network visualization feature GraphVis now allows the inter-species comparisons of reconstructed gene regulatory networks and the projection of gene expression levels onto that networks. Therefore, we added stimulon data directly into the database, but also provide Web Service access to the DNA microarray analysis platform EMMA. Additionally, CoryneRegNet now provides a SOAP based Web Service server, which can easily be consumed by other bioinformatics software systems. Stimulons (imported from the database, or uploaded by the user) can be analyzed in the context of known transcriptional regulatory networks to predict putative contradictions or further gene regulatory interactions. Furthermore, it integrates protein clusters by means of heuristically solving the weighted graph cluster editing problem. In addition, it provides Web Service based access to up to date gene annotation data from GenDB. Conclusion The release 4.0 of CoryneRegNet is a comprehensive system for the integrated analysis of procaryotic gene regulatory networks. It is a versatile systems biology platform to support the efficient and large-scale analysis of transcriptional regulation of gene expression in microorganisms. It is publicly available at . PMID:17986320

  20. A measure of the signal-to-noise ratio of microarray samples and studies using gene correlations.

    PubMed

    Venet, David; Detours, Vincent; Bersini, Hugues

    2012-01-01

    The quality of gene expression data can vary dramatically from platform to platform, study to study, and sample to sample. As reliable statistical analysis rests on reliable data, determining such quality is of the utmost importance. Quality measures to spot problematic samples exist, but they are platform-specific, and cannot be used to compare studies. As a proxy for quality, we propose a signal-to-noise ratio for microarray data, the "Signal-to-Noise Applied to Gene Expression Experiments", or SNAGEE. SNAGEE is based on the consistency of gene-gene correlations. We applied SNAGEE to a compendium of 80 large datasets on 37 platforms, for a total of 24,380 samples, and assessed the signal-to-noise ratio of studies and samples. This allowed us to discover serious issues with three studies. We show that signal-to-noise ratios of both studies and samples are linked to the statistical significance of the biological results. We showed that SNAGEE is an effective way to measure data quality for most types of gene expression studies, and that it often outperforms existing techniques. Furthermore, SNAGEE is platform-independent and does not require raw data files. The SNAGEE R package is available in BioConductor.

  1. Automatic Identification and Quantification of Extra-Well Fluorescence in Microarray Images.

    PubMed

    Rivera, Robert; Wang, Jie; Yu, Xiaobo; Demirkan, Gokhan; Hopper, Marika; Bian, Xiaofang; Tahsin, Tasnia; Magee, D Mitchell; Qiu, Ji; LaBaer, Joshua; Wallstrom, Garrick

    2017-11-03

    In recent studies involving NAPPA microarrays, extra-well fluorescence is used as a key measure for identifying disease biomarkers because there is evidence to support that it is better correlated with strong antibody responses than statistical analysis involving intraspot intensity. Because this feature is not well quantified by traditional image analysis software, identification and quantification of extra-well fluorescence is performed manually, which is both time-consuming and highly susceptible to variation between raters. A system that could automate this task efficiently and effectively would greatly improve the process of data acquisition in microarray studies, thereby accelerating the discovery of disease biomarkers. In this study, we experimented with different machine learning methods, as well as novel heuristics, for identifying spots exhibiting extra-well fluorescence (rings) in microarray images and assigning each ring a grade of 1-5 based on its intensity and morphology. The sensitivity of our final system for identifying rings was found to be 72% at 99% specificity and 98% at 92% specificity. Our system performs this task significantly faster than a human, while maintaining high performance, and therefore represents a valuable tool for microarray image analysis.

  2. TOXICOGENOMIC ANALYSIS INCORPORATING OPERON-TRANSCRIPTIONAL COUPLING AND TOXICANT CONCENTRATRION-EXPRESSION RESPONSE: Analysis of MX-Treated Salmonella

    EPA Science Inventory

    What is the study? This study is the first to use microarray analysis in the Ames strains of Salmonella. The microarray chips were custom-designed for this study and are not commercially available, and we evaluated the well-studied drinking water mutagen, MX. Because much inform...

  3. Nanoscale integration is the next frontier for nanotechnology

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Picraux, Samuel T

    2009-01-01

    Nanoscale integration of materials and structures is the next critical step to exploit the promise of nanomaterials. Many novel and fascinating properties have been revealed for nanostructured materials. But if nanotechnology is to live up to its promise we must incorporate these nanoscale building blocks into functional systems that connect to the micro- and macroscale world. To do this we will inevitably need to understand and exploit the resulting combined unique properties of these integrated nanosystems. Much science waits to be discovered in the process. Nanoscale integration extends from the synthesis and fabrication of individual nanoscale building blocks, to themore » assembly of these building blocks into composite structures, and finally to the formation of complex functional systems. As illustrated in Figure 1, the building blocks may be homogeneous or heterogeneous, the composite materials may be nanocomposite or patterned structures, and the functional systems will involve additional combinations of materials. Nanoscale integration involves assembling diverse nanoscale materials across length scales to design and achieve new properties and functionality. At each stage size-dependent properties, the influence of surfaces in close proximity, and a multitude of interfaces all come into play. Whether the final system involves coherent electrons in a quantum computing approach, the combined flow of phonons and electrons for a high efficiency thermoelectric micro-generator, or a molecular recognition structure for bio-sensing, the combined effects of size, surface, and interface will be critical. In essence, one wants to combine the novel functions available through nanoscale science to achieve unique multi-functionalities not available in bulk materials. Perhaps the best-known example of integration is that of combining electronic components together into very large scale integrated circuits (VLSI). The integrated circuit has revolutionized electronics in many ways, from exploiting field-effect transistor devices and low power complementary logic to enable the electronic watch and hand calculator in the 1970's, to today's microprocessors and memories with billions of devices and a computational power not imagined a few decades ago. The manipulation of charges on a chip, the new concepts in combining devices for logic functions, and the new approaches to computation, information processing, and imaging have all emerged from Kilby and Noyce's simple concept of integrating devices on a single chip. Moving from hard to soft materials, a second more recent example of integration is the DNA microarray. These microarrays, with up to millions of elements in a planar array that can be optically read out, can simultaneously measure the expression of 10's of thousands of genes to study the effects of disease and treatment, or screen for single nucleotide polymorphisms for uses ranging from forensics to predisposition to disease. While still at an early stage, microarrays have revolutionized biosciences by providing the means to interrogate the complex genetic control of biological functions. Just as integrated circuits and microarrays have led to completely new functionalities and performance, the integration of nanoscale materials and structures is anticipated to lead to new performance and enable the design of new functionalities not previously envisioned. The fundamental questions underlying integration go beyond just complex fabrication or the engineering of known solutions; they lead to new discoveries and new science. The scientific challenges around nanoscale integration necessitate the development of new knowledge that is central to the advance of nanotechnology. To move forward one must address key science questions that arise in nanoscience integration and go beyond a single system or materials area. New science and discoveries especially await around three questions. How does one: (1) Control energy transfer and other interactions across interfaces and over mUltiple length scales? (2) Understand and control the interactions between nanoscale building blocks to assemble specific integrated structures? (3) Design and exploit interactions within assembled structures to achieve new properties and specific functionalities? These high level questions can serve to drive research, and to advance understanding of the complex phenomena and multifunctionality that may emerge from integration. For example, in photonics there is considerable effort to understand and control the response of nanoscale conducting structures on dielectrics, to allow one to localize, manipulate, and control electromagnetic energy in integrated systems such as in the field known as metamaterials. Essential to this area is a fundamental understanding of energy transfer across multiple length scales (question 1 above).« less

  4. A Self-Directed Method for Cell-Type Identification and Separation of Gene Expression Microarrays

    PubMed Central

    Zuckerman, Neta S.; Noam, Yair; Goldsmith, Andrea J.; Lee, Peter P.

    2013-01-01

    Gene expression analysis is generally performed on heterogeneous tissue samples consisting of multiple cell types. Current methods developed to separate heterogeneous gene expression rely on prior knowledge of the cell-type composition and/or signatures - these are not available in most public datasets. We present a novel method to identify the cell-type composition, signatures and proportions per sample without need for a-priori information. The method was successfully tested on controlled and semi-controlled datasets and performed as accurately as current methods that do require additional information. As such, this method enables the analysis of cell-type specific gene expression using existing large pools of publically available microarray datasets. PMID:23990767

  5. An oligonucleotide microarray to characterize multidrug resistant plasmids

    USDA-ARS?s Scientific Manuscript database

    Bacteria plasmids are fragments of extra-chromosomal double stranded deoxyribonucleic acid (DNA) that can contain a variety of genes beneficial to the host organism like antibiotic drug resistance. Many of the Enterobacteriaceae carry multiple drug resistance (MDR) genes on large plasmids of replic...

  6. A Versatile Method for Functionalizing Surfaces with Bioactive Glycans

    PubMed Central

    Cheng, Fang; Shang, Jing; Ratner, Daniel M.

    2011-01-01

    Microarrays and biosensors owe their functionality to our ability to display surface-bound biomolecules with retained biological function. Versatile, stable, and facile methods for the immobilization of bioactive compounds on surfaces have expanded the application of high-throughput ‘omics’-scale screening of molecular interactions by non-expert laboratories. Herein, we demonstrate the potential of simplified chemistries to fabricate a glycan microarray, utilizing divinyl sulfone (DVS)-modified surfaces for the covalent immobilization of natural and chemically derived carbohydrates, as well as glycoproteins. The bioactivity of the captured glycans was quantitatively examined by surface plasmon resonance imaging (SPRi). Composition and spectroscopic evidence of carbohydrate species on the DVS-modified surface were obtained by X-ray photoelectron spectroscopy (XPS) and time-of-flight secondary ion mass spectrometry (ToF-SIMS), respectively. The site-selective immobilization of glycans based on relative nucleophilicity (reducing sugar vs. amine- and sulfhydryl-derived saccharides) and anomeric configuration was also examined. Our results demonstrate straightforward and reproducible conjugation of a variety of functional biomolecules onto a vinyl sulfone-modified biosensor surface. The simplicity of this method will have a significant impact on glycomics research, as it expands the ability of non-synthetic laboratories to rapidly construct functional glycan microarrays and quantitative biosensors. PMID:21142056

  7. Employing image processing techniques for cancer detection using microarray images.

    PubMed

    Dehghan Khalilabad, Nastaran; Hassanpour, Hamid

    2017-02-01

    Microarray technology is a powerful genomic tool for simultaneously studying and analyzing the behavior of thousands of genes. The analysis of images obtained from this technology plays a critical role in the detection and treatment of diseases. The aim of the current study is to develop an automated system for analyzing data from microarray images in order to detect cancerous cases. The proposed system consists of three main phases, namely image processing, data mining, and the detection of the disease. The image processing phase performs operations such as refining image rotation, gridding (locating genes) and extracting raw data from images the data mining includes normalizing the extracted data and selecting the more effective genes. Finally, via the extracted data, cancerous cell is recognized. To evaluate the performance of the proposed system, microarray database is employed which includes Breast cancer, Myeloid Leukemia and Lymphomas from the Stanford Microarray Database. The results indicate that the proposed system is able to identify the type of cancer from the data set with an accuracy of 95.45%, 94.11%, and 100%, respectively. Copyright © 2017 Elsevier Ltd. All rights reserved.

  8. APPLICATION OF CDNA MICROARRAY TO THE STUDY OF ARSENIC TOXICOLOGY AND CARCINOGENESIS

    EPA Science Inventory

    Arsenic (As) is a common environmental toxicant and known human carcinogen. Epidemiological studies link As exposure to various disorders and cancers. However, the molecular mechanisms for As toxicity and carcinogenicity are not completely known. The cDNA microarray, a high-th...

  9. Discovery of prognostic biomarkers for predicting lung cancer metastasis using microarray and survival data.

    PubMed

    Huang, Hui-Ling; Wu, Yu-Chung; Su, Li-Jen; Huang, Yun-Ju; Charoenkwan, Phasit; Chen, Wen-Liang; Lee, Hua-Chin; Chu, William Cheng-Chung; Ho, Shinn-Ying

    2015-02-21

    Few studies have investigated prognostic biomarkers of distant metastases of lung cancer. One of the central difficulties in identifying biomarkers from microarray data is the availability of only a small number of samples, which results overtraining. Recently obtained evidence reveals that epithelial-mesenchymal transition (EMT) of tumor cells causes metastasis, which is detrimental to patients' survival. This work proposes a novel optimization approach to discovering EMT-related prognostic biomarkers to predict the distant metastasis of lung cancer using both microarray and survival data. This weighted objective function maximizes both the accuracy of prediction of distant metastasis and the area between the disease-free survival curves of the non-distant and distant metastases. Seventy-eight patients with lung cancer and a follow-up time of 120 months are used to identify a set of gene markers and an independent cohort of 26 patients is used to evaluate the identified biomarkers. The medical records of the 78 patients show a significant difference between the disease-free survival times of the 37 non-distant- and the 41 distant-metastasis patients. The experimental results thus obtained are as follows. 1) The use of disease-free survival curves can compensate for the shortcoming of insufficient samples and greatly increase the test accuracy by 11.10%; and 2) the support vector machine with a set of 17 transcripts, such as CCL16 and CDKN2AIP, can yield a leave-one-out cross-validation accuracy of 93.59%, a test accuracy of 76.92%, a large disease-free survival area of 74.81%, and a mean survival prediction error of 3.99 months. The identified putative biomarkers are examined using related studies and signaling pathways to reveal the potential effectiveness of the biomarkers in prospective confirmatory studies. The proposed new optimization approach to identifying prognostic biomarkers by combining multiple sources of data (microarray and survival) can facilitate the accurate selection of biomarkers that are most relevant to the disease while solving the problem of insufficient samples.

  10. Tumor Necrosis Factor-α Regulates Distinct Molecular Pathways and Gene Networks in Cultured Skeletal Muscle Cells

    PubMed Central

    Gupta, Sanjay K.; Dahiya, Saurabh; Lundy, Robert F.; Kumar, Ashok

    2010-01-01

    Background Skeletal muscle wasting is a debilitating consequence of large number of disease states and conditions. Tumor necrosis factor-α (TNF-α) is one of the most important muscle-wasting cytokine, elevated levels of which cause significant muscular abnormalities. However, the underpinning molecular mechanisms by which TNF-α causes skeletal muscle wasting are less well-understood. Methodology/Principal Findings We have used microarray, quantitative real-time PCR (QRT-PCR), Western blot, and bioinformatics tools to study the effects of TNF-α on various molecular pathways and gene networks in C2C12 cells (a mouse myoblastic cell line). Microarray analyses of C2C12 myotubes treated with TNF-α (10 ng/ml) for 18h showed differential expression of a number of genes involved in distinct molecular pathways. The genes involved in nuclear factor-kappa B (NF-kappaB) signaling, 26s proteasome pathway, Notch1 signaling, and chemokine networks are the most important ones affected by TNF-α. The expression of some of the genes in microarray dataset showed good correlation in independent QRT-PCR and Western blot assays. Analysis of TNF-treated myotubes showed that TNF-α augments the activity of both canonical and alternative NF-κB signaling pathways in myotubes. Bioinformatics analyses of microarray dataset revealed that TNF-α affects the activity of several important pathways including those involved in oxidative stress, hepatic fibrosis, mitochondrial dysfunction, cholesterol biosynthesis, and TGF-β signaling. Furthermore, TNF-α was found to affect the gene networks related to drug metabolism, cell cycle, cancer, neurological disease, organismal injury, and abnormalities in myotubes. Conclusions TNF-α regulates the expression of multiple genes involved in various toxic pathways which may be responsible for TNF-induced muscle loss in catabolic conditions. Our study suggests that TNF-α activates both canonical and alternative NF-κB signaling pathways in a time-dependent manner in skeletal muscle cells. The study provides novel insight into the mechanisms of action of TNF-α in skeletal muscle cells. PMID:20967264

  11. Identification of differentially expressed genes and false discovery rate in microarray studies.

    PubMed

    Gusnanto, Arief; Calza, Stefano; Pawitan, Yudi

    2007-04-01

    To highlight the development in microarray data analysis for the identification of differentially expressed genes, particularly via control of false discovery rate. The emergence of high-throughput technology such as microarrays raises two fundamental statistical issues: multiplicity and sensitivity. We focus on the biological problem of identifying differentially expressed genes. First, multiplicity arises due to testing tens of thousands of hypotheses, rendering the standard P value meaningless. Second, known optimal single-test procedures such as the t-test perform poorly in the context of highly multiple tests. The standard approach of dealing with multiplicity is too conservative in the microarray context. The false discovery rate concept is fast becoming the key statistical assessment tool replacing the P value. We review the false discovery rate approach and argue that it is more sensible for microarray data. We also discuss some methods to take into account additional information from the microarrays to improve the false discovery rate. There is growing consensus on how to analyse microarray data using the false discovery rate framework in place of the classical P value. Further research is needed on the preprocessing of the raw data, such as the normalization step and filtering, and on finding the most sensitive test procedure.

  12. Systematic spatial bias in DNA microarray hybridization is caused by probe spot position-dependent variability in lateral diffusion.

    PubMed

    Steger, Doris; Berry, David; Haider, Susanne; Horn, Matthias; Wagner, Michael; Stocker, Roman; Loy, Alexander

    2011-01-01

    The hybridization of nucleic acid targets with surface-immobilized probes is a widely used assay for the parallel detection of multiple targets in medical and biological research. Despite its widespread application, DNA microarray technology still suffers from several biases and lack of reproducibility, stemming in part from an incomplete understanding of the processes governing surface hybridization. In particular, non-random spatial variations within individual microarray hybridizations are often observed, but the mechanisms underpinning this positional bias remain incompletely explained. This study identifies and rationalizes a systematic spatial bias in the intensity of surface hybridization, characterized by markedly increased signal intensity of spots located at the boundaries of the spotted areas of the microarray slide. Combining observations from a simplified single-probe block array format with predictions from a mathematical model, the mechanism responsible for this bias is found to be a position-dependent variation in lateral diffusion of target molecules. Numerical simulations reveal a strong influence of microarray well geometry on the spatial bias. Reciprocal adjustment of the size of the microarray hybridization chamber to the area of surface-bound probes is a simple and effective measure to minimize or eliminate the diffusion-based bias, resulting in increased uniformity and accuracy of quantitative DNA microarray hybridization.

  13. Systematic Spatial Bias in DNA Microarray Hybridization Is Caused by Probe Spot Position-Dependent Variability in Lateral Diffusion

    PubMed Central

    Haider, Susanne; Horn, Matthias; Wagner, Michael; Stocker, Roman; Loy, Alexander

    2011-01-01

    Background The hybridization of nucleic acid targets with surface-immobilized probes is a widely used assay for the parallel detection of multiple targets in medical and biological research. Despite its widespread application, DNA microarray technology still suffers from several biases and lack of reproducibility, stemming in part from an incomplete understanding of the processes governing surface hybridization. In particular, non-random spatial variations within individual microarray hybridizations are often observed, but the mechanisms underpinning this positional bias remain incompletely explained. Methodology/Principal Findings This study identifies and rationalizes a systematic spatial bias in the intensity of surface hybridization, characterized by markedly increased signal intensity of spots located at the boundaries of the spotted areas of the microarray slide. Combining observations from a simplified single-probe block array format with predictions from a mathematical model, the mechanism responsible for this bias is found to be a position-dependent variation in lateral diffusion of target molecules. Numerical simulations reveal a strong influence of microarray well geometry on the spatial bias. Conclusions Reciprocal adjustment of the size of the microarray hybridization chamber to the area of surface-bound probes is a simple and effective measure to minimize or eliminate the diffusion-based bias, resulting in increased uniformity and accuracy of quantitative DNA microarray hybridization. PMID:21858215

  14. Crossword: A Fully Automated Algorithm for the Segmentation and Quality Control of Protein Microarray Images

    PubMed Central

    2015-01-01

    Biological assays formatted as microarrays have become a critical tool for the generation of the comprehensive data sets required for systems-level understanding of biological processes. Manual annotation of data extracted from images of microarrays, however, remains a significant bottleneck, particularly for protein microarrays due to the sensitivity of this technology to weak artifact signal. In order to automate the extraction and curation of data from protein microarrays, we describe an algorithm called Crossword that logically combines information from multiple approaches to fully automate microarray segmentation. Automated artifact removal is also accomplished by segregating structured pixels from the background noise using iterative clustering and pixel connectivity. Correlation of the location of structured pixels across image channels is used to identify and remove artifact pixels from the image prior to data extraction. This component improves the accuracy of data sets while reducing the requirement for time-consuming visual inspection of the data. Crossword enables a fully automated protocol that is robust to significant spatial and intensity aberrations. Overall, the average amount of user intervention is reduced by an order of magnitude and the data quality is increased through artifact removal and reduced user variability. The increase in throughput should aid the further implementation of microarray technologies in clinical studies. PMID:24417579

  15. The detection and differentiation of canine respiratory pathogens using oligonucleotide microarrays.

    PubMed

    Wang, Lih-Chiann; Kuo, Ya-Ting; Chueh, Ling-Ling; Huang, Dean; Lin, Jiunn-Horng

    2017-05-01

    Canine respiratory diseases are commonly seen in dogs along with co-infections with multiple respiratory pathogens, including viruses and bacteria. Virus infections in even vaccinated dogs were also reported. The clinical signs caused by different respiratory etiological agents are similar, which makes differential diagnosis imperative. An oligonucleotide microarray system was developed in this study. The wild type and vaccine strains of canine distemper virus (CDV), influenza virus, canine herpesvirus (CHV), Bordetella bronchiseptica and Mycoplasma cynos were detected and differentiated simultaneously on a microarray chip. The detection limit is 10, 10, 100, 50 and 50 copy numbers for CDV, influenza virus, CHV, B. bronchiseptica and M. cynos, respectively. The clinical test results of nasal swab samples showed that the microarray had remarkably better efficacy than the multiplex PCR-agarose gel method. The positive detection rate of microarray and agarose gel was 59.0% (n=33) and 41.1% (n=23) among the 56 samples, respectively. CDV vaccine strain and pathogen co-infections were further demonstrated by the microarray but not by the multiplex PCR-agarose gel. The oligonucleotide microarray provides a highly efficient diagnosis alternative that could be applied to clinical usage, greatly assisting in disease therapy and control. Copyright © 2017 Elsevier B.V. All rights reserved.

  16. [The problem of small "n" and big "P" in neuropsycho-pharmacology, or how to keep the rate of false discoveries under control].

    PubMed

    Petschner, Péter; Bagdy, György; Tóthfalusi, Laszló

    2015-03-01

    One of the characteristics of many methods used in neuropsychopharmacology is that a large number of parameters (P) are measured in relatively few subjects (n). Functional magnetic resonance imaging, electroencephalography (EEG) and genomic studies are typical examples. For example one microarray chip can contain thousands of probes. Therefore, in studies using microarray chips, P may be several thousand-fold larger than n. Statistical analysis of such studies is a challenging task and they are refereed to in the statistical literature such as the small "n" big "P" problem. The problem has many facets including the controversies associated with multiple hypothesis testing. A typical scenario in this context is, when two or more groups are compared by the individual attributes. If the increased classification error due to the multiple testing is neglected, then several highly significant differences will be discovered. But in reality, some of these significant differences are coincidental, not reproducible findings. Several methods were proposed to solve this problem. In this review we discuss two of the proposed solutions, algorithms to compare sets and statistical hypothesis tests controlling the false discovery rate.

  17. Inferring Molecular Processes Heterogeneity from Transcriptional Data.

    PubMed

    Gogolewski, Krzysztof; Wronowska, Weronika; Lech, Agnieszka; Lesyng, Bogdan; Gambin, Anna

    2017-01-01

    RNA microarrays and RNA-seq are nowadays standard technologies to study the transcriptional activity of cells. Most studies focus on tracking transcriptional changes caused by specific experimental conditions. Information referring to genes up- and downregulation is evaluated analyzing the behaviour of relatively large population of cells by averaging its properties. However, even assuming perfect sample homogeneity, different subpopulations of cells can exhibit diverse transcriptomic profiles, as they may follow different regulatory/signaling pathways. The purpose of this study is to provide a novel methodological scheme to account for possible internal, functional heterogeneity in homogeneous cell lines, including cancer ones. We propose a novel computational method to infer the proportion between subpopulations of cells that manifest various functional behaviour in a given sample. Our method was validated using two datasets from RNA microarray experiments. Both experiments aimed to examine cell viability in specific experimental conditions. The presented methodology can be easily extended to RNA-seq data as well as other molecular processes. Moreover, it complements standard tools to indicate most important networks from transcriptomic data and in particular could be useful in the analysis of cancer cell lines affected by biologically active compounds or drugs.

  18. Inferring Molecular Processes Heterogeneity from Transcriptional Data

    PubMed Central

    Wronowska, Weronika; Lesyng, Bogdan; Gambin, Anna

    2017-01-01

    RNA microarrays and RNA-seq are nowadays standard technologies to study the transcriptional activity of cells. Most studies focus on tracking transcriptional changes caused by specific experimental conditions. Information referring to genes up- and downregulation is evaluated analyzing the behaviour of relatively large population of cells by averaging its properties. However, even assuming perfect sample homogeneity, different subpopulations of cells can exhibit diverse transcriptomic profiles, as they may follow different regulatory/signaling pathways. The purpose of this study is to provide a novel methodological scheme to account for possible internal, functional heterogeneity in homogeneous cell lines, including cancer ones. We propose a novel computational method to infer the proportion between subpopulations of cells that manifest various functional behaviour in a given sample. Our method was validated using two datasets from RNA microarray experiments. Both experiments aimed to examine cell viability in specific experimental conditions. The presented methodology can be easily extended to RNA-seq data as well as other molecular processes. Moreover, it complements standard tools to indicate most important networks from transcriptomic data and in particular could be useful in the analysis of cancer cell lines affected by biologically active compounds or drugs. PMID:29362714

  19. Improvement in the amine glass platform by bubbling method for a DNA microarray

    PubMed Central

    Jee, Seung Hyun; Kim, Jong Won; Lee, Ji Hyeong; Yoon, Young Soo

    2015-01-01

    A glass platform with high sensitivity for sexually transmitted diseases microarray is described here. An amino-silane-based self-assembled monolayer was coated on the surface of a glass platform using a novel bubbling method. The optimized surface of the glass platform had highly uniform surface modifications using this method, as well as improved hybridization properties with capture probes in the DNA microarray. On the basis of these results, the improved glass platform serves as a highly reliable and optimal material for the DNA microarray. Moreover, in this study, we demonstrated that our glass platform, manufactured by utilizing the bubbling method, had higher uniformity, shorter processing time, lower background signal, and higher spot signal than the platforms manufactured by the general dipping method. The DNA microarray manufactured with a glass platform prepared using bubbling method can be used as a clinical diagnostic tool. PMID:26468293

  20. Improvement in the amine glass platform by bubbling method for a DNA microarray.

    PubMed

    Jee, Seung Hyun; Kim, Jong Won; Lee, Ji Hyeong; Yoon, Young Soo

    2015-01-01

    A glass platform with high sensitivity for sexually transmitted diseases microarray is described here. An amino-silane-based self-assembled monolayer was coated on the surface of a glass platform using a novel bubbling method. The optimized surface of the glass platform had highly uniform surface modifications using this method, as well as improved hybridization properties with capture probes in the DNA microarray. On the basis of these results, the improved glass platform serves as a highly reliable and optimal material for the DNA microarray. Moreover, in this study, we demonstrated that our glass platform, manufactured by utilizing the bubbling method, had higher uniformity, shorter processing time, lower background signal, and higher spot signal than the platforms manufactured by the general dipping method. The DNA microarray manufactured with a glass platform prepared using bubbling method can be used as a clinical diagnostic tool.

  1. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gentry, T.; Schadt, C.; Zhou, J.

    Microarray technology has the unparalleled potential tosimultaneously determine the dynamics and/or activities of most, if notall, of the microbial populations in complex environments such as soilsand sediments. Researchers have developed several types of arrays thatcharacterize the microbial populations in these samples based on theirphylogenetic relatedness or functional genomic content. Several recentstudies have used these microarrays to investigate ecological issues;however, most have only analyzed a limited number of samples withrelatively few experiments utilizing the full high-throughput potentialof microarray analysis. This is due in part to the unique analyticalchallenges that these samples present with regard to sensitivity,specificity, quantitation, and data analysis. Thismore » review discussesspecific applications of microarrays to microbial ecology research alongwith some of the latest studies addressing the difficulties encounteredduring analysis of complex microbial communities within environmentalsamples. With continued development, microarray technology may ultimatelyachieve its potential for comprehensive, high-throughput characterizationof microbial populations in near real-time.« less

  2. Network-based de-noising improves prediction from microarray data.

    PubMed

    Kato, Tsuyoshi; Murata, Yukio; Miura, Koh; Asai, Kiyoshi; Horton, Paul B; Koji, Tsuda; Fujibuchi, Wataru

    2006-03-20

    Prediction of human cell response to anti-cancer drugs (compounds) from microarray data is a challenging problem, due to the noise properties of microarrays as well as the high variance of living cell responses to drugs. Hence there is a strong need for more practical and robust methods than standard methods for real-value prediction. We devised an extended version of the off-subspace noise-reduction (de-noising) method to incorporate heterogeneous network data such as sequence similarity or protein-protein interactions into a single framework. Using that method, we first de-noise the gene expression data for training and test data and also the drug-response data for training data. Then we predict the unknown responses of each drug from the de-noised input data. For ascertaining whether de-noising improves prediction or not, we carry out 12-fold cross-validation for assessment of the prediction performance. We use the Pearson's correlation coefficient between the true and predicted response values as the prediction performance. De-noising improves the prediction performance for 65% of drugs. Furthermore, we found that this noise reduction method is robust and effective even when a large amount of artificial noise is added to the input data. We found that our extended off-subspace noise-reduction method combining heterogeneous biological data is successful and quite useful to improve prediction of human cell cancer drug responses from microarray data.

  3. Development of a rapid microarray-based DNA subtyping assay for the alleles of Shiga toxins 1 and 2 of Escherichia coli.

    PubMed

    Geue, Lutz; Stieber, Bettina; Monecke, Stefan; Engelmann, Ines; Gunzer, Florian; Slickers, Peter; Braun, Sascha D; Ehricht, Ralf

    2014-08-01

    In this study, we developed a new rapid, economic, and automated microarray-based genotyping test for the standardized subtyping of Shiga toxins 1 and 2 of Escherichia coli. The microarrays from Alere Technologies can be used in two different formats, the ArrayTube and the ArrayStrip (which enables high-throughput testing in a 96-well format). One microarray chip harbors all the gene sequences necessary to distinguish between all Stx subtypes, facilitating the identification of single and multiple subtypes within a single isolate in one experiment. Specific software was developed to automatically analyze all data obtained from the microarray. The assay was validated with 21 Shiga toxin-producing E. coli (STEC) reference strains that were previously tested by the complete set of conventional subtyping PCRs. The microarray results showed 100% concordance with the PCR results. Essentially identical results were detected when the standard DNA extraction method was replaced by a time-saving heat lysis protocol. For further validation of the microarray, we identified the Stx subtypes or combinations of the subtypes in 446 STEC field isolates of human and animal origin. In summary, this oligonucleotide array represents an excellent diagnostic tool that provides some advantages over standard PCR-based subtyping. The number of the spotted probes on the microarrays can be increased by additional probes, such as for novel alleles, species markers, or resistance genes, should the need arise. Copyright © 2014, American Society for Microbiology. All Rights Reserved.

  4. Ewing's Sarcoma: An Analysis of miRNA Expression Profiles and Target Genes in Paraffin-Embedded Primary Tumor Tissue.

    PubMed

    Parafioriti, Antonina; Bason, Caterina; Armiraglio, Elisabetta; Calciano, Lucia; Daolio, Primo Andrea; Berardocco, Martina; Di Bernardo, Andrea; Colosimo, Alessia; Luksch, Roberto; Berardi, Anna C

    2016-04-30

    The molecular mechanism responsible for Ewing's Sarcoma (ES) remains largely unknown. MicroRNAs (miRNAs), a class of small non-coding RNAs able to regulate gene expression, are deregulated in tumors and may serve as a tool for diagnosis and prediction. However, the status of miRNAs in ES has not yet been thoroughly investigated. This study compared global miRNAs expression in paraffin-embedded tumor tissue samples from 20 ES patients, affected by primary untreated tumors, with miRNAs expressed in normal human mesenchymal stromal cells (MSCs) by microarray analysis. A miRTarBase database was used to identify the predicted target genes for differentially expressed miRNAs. The miRNAs microarray analysis revealed distinct patterns of miRNAs expression between ES samples and normal MSCs. 58 of the 954 analyzed miRNAs were significantly differentially expressed in ES samples compared to MSCs. Moreover, the qRT-PCR analysis carried out on three selected miRNAs showed that miR-181b, miR-1915 and miR-1275 were significantly aberrantly regulated, confirming the microarray results. Bio-database analysis identified BCL-2 as a bona fide target gene of the miR-21, miR-181a, miR-181b, miR-29a, miR-29b, miR-497, miR-195, miR-let-7a, miR-34a and miR-1915. Using paraffin-embedded tissues from ES patients, this study has identified several potential target miRNAs and one gene that might be considered a novel critical biomarker for ES pathogenesis.

  5. The efficacy of microarray screening for autosomal recessive retinitis pigmentosa in routine clinical practice

    PubMed Central

    van Huet, Ramon A. C.; Pierrache, Laurence H.M.; Meester-Smoor, Magda A.; Klaver, Caroline C.W.; van den Born, L. Ingeborgh; Hoyng, Carel B.; de Wijs, Ilse J.; Collin, Rob W. J.; Hoefsloot, Lies H.

    2015-01-01

    Purpose To determine the efficacy of multiple versions of a commercially available arrayed primer extension (APEX) microarray chip for autosomal recessive retinitis pigmentosa (arRP). Methods We included 250 probands suspected of arRP who were genetically analyzed with the APEX microarray between January 2008 and November 2013. The mode of inheritance had to be autosomal recessive according to the pedigree (including isolated cases). If the microarray identified a heterozygous mutation, we performed Sanger sequencing of exons and exon–intron boundaries of that specific gene. The efficacy of this microarray chip with the additional Sanger sequencing approach was determined by the percentage of patients that received a molecular diagnosis. We also collected data from genetic tests other than the APEX analysis for arRP to provide a detailed description of the molecular diagnoses in our study cohort. Results The APEX microarray chip for arRP identified the molecular diagnosis in 21 (8.5%) of the patients in our cohort. Additional Sanger sequencing yielded a second mutation in 17 patients (6.8%), thereby establishing the molecular diagnosis. In total, 38 patients (15.2%) received a molecular diagnosis after analysis using the microarray and additional Sanger sequencing approach. Further genetic analyses after a negative result of the arRP microarray (n = 107) resulted in a molecular diagnosis of arRP (n = 23), autosomal dominant RP (n = 5), X-linked RP (n = 2), and choroideremia (n = 1). Conclusions The efficacy of the commercially available APEX microarray chips for arRP appears to be low, most likely caused by the limitations of this technique and the genetic and allelic heterogeneity of RP. Diagnostic yields up to 40% have been reported for next-generation sequencing (NGS) techniques that, as expected, thereby outperform targeted APEX analysis. PMID:25999674

  6. Clofibrate-induced gene expression changes in rat liver: a cross-laboratory analysis using membrane cDNA arrays.

    PubMed Central

    Baker, Valerie A; Harries, Helen M; Waring, Jeff F; Duggan, Colette M; Ni, Hong A; Jolly, Robert A; Yoon, Lawrence W; De Souza, Angus T; Schmid, Judith E; Brown, Roger H; Ulrich, Roger G; Rockett, John C

    2004-01-01

    Microarrays have the potential to significantly impact our ability to identify toxic hazards by the identification of mechanistically relevant markers of toxicity. To be useful for risk assessment, however, microarray data must be challenged to determine reliability and interlaboratory reproducibility. As part of a series of studies conducted by the International Life Sciences Institute Health and Environmental Science Institute Technical Committee on the Application of Genomics to Mechanism-Based Risk Assessment, the biological response in rats to the hepatotoxin clofibrate was investigated. Animals were treated with high (250 mg/kg/day) or low (25 mg/kg/day) doses for 1, 3, or 7 days in two laboratories. Clinical chemistry parameters were measured, livers removed for histopathological assessment, and gene expression analysis was conducted using cDNA arrays. Expression changes in genes involved in fatty acid metabolism (e.g., acyl-CoA oxidase), cell proliferation (e.g., topoisomerase II-Alpha), and fatty acid oxidation (e.g., cytochrome P450 4A1), consistent with the mechanism of clofibrate hepatotoxicity, were detected. Observed differences in gene expression levels correlated with the level of biological response induced in the two in vivo studies. Generally, there was a high level of concordance between the gene expression profiles generated from pooled and individual RNA samples. Quantitative real-time polymerase chain reaction was used to confirm modulations for a number of peroxisome proliferator marker genes. Though the results indicate some variability in the quantitative nature of the microarray data, this appears due largely to differences in experimental and data analysis procedures used within each laboratory. In summary, this study demonstrates the potential for gene expression profiling to identify toxic hazards by the identification of mechanistically relevant markers of toxicity. PMID:15033592

  7. Gene expression profiling of whole blood: Comparison of target preparation methods for accurate and reproducible microarray analysis

    PubMed Central

    Vartanian, Kristina; Slottke, Rachel; Johnstone, Timothy; Casale, Amanda; Planck, Stephen R; Choi, Dongseok; Smith, Justine R; Rosenbaum, James T; Harrington, Christina A

    2009-01-01

    Background Peripheral blood is an accessible and informative source of transcriptomal information for many human disease and pharmacogenomic studies. While there can be significant advantages to analyzing RNA isolated from whole blood, particularly in clinical studies, the preparation of samples for microarray analysis is complicated by the need to minimize artifacts associated with highly abundant globin RNA transcripts. The impact of globin RNA transcripts on expression profiling data can potentially be reduced by using RNA preparation and labeling methods that remove or block globin RNA during the microarray assay. We compared four different methods for preparing microarray hybridization targets from human whole blood collected in PAXGene tubes. Three of the methods utilized the Affymetrix one-cycle cDNA synthesis/in vitro transcription protocol but varied treatment of input RNA as follows: i. no treatment; ii. treatment with GLOBINclear; or iii. treatment with globin PNA oligos. In the fourth method cDNA targets were prepared with the Ovation amplification and labeling system. Results We find that microarray targets generated with labeling methods that reduce globin mRNA levels or minimize the impact of globin transcripts during hybridization detect more transcripts in the microarray assay compared with the standard Affymetrix method. Comparison of microarray results with quantitative PCR analysis of a panel of genes from the NF-kappa B pathway shows good correlation of transcript measurements produced with all four target preparation methods, although method-specific differences in overall correlation were observed. The impact of freezing blood collected in PAXGene tubes on data reproducibility was also examined. Expression profiles show little or no difference when RNA is extracted from either fresh or frozen blood samples. Conclusion RNA preparation and labeling methods designed to reduce the impact of globin mRNA transcripts can significantly improve the sensitivity of the DNA microarray expression profiling assay for whole blood samples. While blockage of globin transcripts during first strand cDNA synthesis with globin PNAs resulted in the best overall performance in this study, we conclude that selection of a protocol for expression profiling studies in blood should depend on several factors, including implementation requirements of the method and study design. RNA isolated from either freshly collected or frozen blood samples stored in PAXGene tubes can be used without altering gene expression profiles. PMID:19123946

  8. Fast imputation using medium or low-coverage sequence data

    USDA-ARS?s Scientific Manuscript database

    Accurate genotype imputation can greatly reduce costs and increase benefits by combining whole-genome sequence data of varying read depth and microarray genotypes of varying densities. For large populations, an efficient strategy chooses the two haplotypes most likely to form each genotype and updat...

  9. Screening and characterization of plant cell walls using carbohydrate microarrays.

    PubMed

    Sørensen, Iben; Willats, William G T

    2011-01-01

    Plant cells are surrounded by cell walls built largely from complex carbohydrates. The primary walls of growing plant cells consist of interdependent networks of three polysaccharide classes: cellulose, cross-linking glycans (also known as hemicelluloses), and pectins. Cellulose microfibrils are tethered together by cross-linking glycans, and this assembly forms the major load-bearing component of primary walls, which is infiltrated with pectic polymers. In the secondary walls of woody tissues, pectins are much reduced and walls are reinforced with the phenolic polymer lignin. Plant cell walls are essential for plant life and also have numerous industrial applications, ranging from wood to nutraceuticals. Enhancing our knowledge of cell wall biology and the effective use of cell wall materials is dependent to a large extent on being able to analyse their fine structures. We have developed a suite of techniques based on microarrays probed with monoclonal antibodies with specificity for cell wall components, and here we present practical protocols for this type of analysis.

  10. Repeated Measurements on Distinct Scales With Censoring—A Bayesian Approach Applied to Microarray Analysis of Maize

    PubMed Central

    Love, Tanzy; Carriquiry, Alicia

    2009-01-01

    We analyze data collected in a somatic embryogenesis experiment carried out on Zea mays at Iowa State University. The main objective of the study was to identify the set of genes in maize that actively participate in embryo development. Embryo tissue was sampled and analyzed at various time periods and under different mediums and light conditions. As is the case in many microarray experiments, the operator scanned each slide multiple times to find the slide-specific ‘optimal’ laser and sensor settings. The multiple readings of each slide are repeated measurements on different scales with differing censoring; they cannot be considered to be replicate measurements in the traditional sense. Yet it has been shown that the choice of reading can have an impact on genetic inference. We propose a hierarchical modeling approach to estimating gene expression that combines all available readings on each spot and accounts for censoring in the observed values. We assess the statistical properties of the proposed expression estimates using a simulation experiment. As expected, combining all available scans using an approach with good statistical properties results in expression estimates with noticeably lower bias and root mean squared error relative to other approaches that have been proposed in the literature. Inferences drawn from the somatic embryogenesis experiment, which motivated this work changed drastically when data were analyzed using the standard approaches or using the methodology we propose. PMID:19960120

  11. Importing MAGE-ML format microarray data into BioConductor.

    PubMed

    Durinck, Steffen; Allemeersch, Joke; Carey, Vincent J; Moreau, Yves; De Moor, Bart

    2004-12-12

    The microarray gene expression markup language (MAGE-ML) is a widely used XML (eXtensible Markup Language) standard for describing and exchanging information about microarray experiments. It can describe microarray designs, microarray experiment designs, gene expression data and data analysis results. We describe RMAGEML, a new Bioconductor package that provides a link between cDNA microarray data stored in MAGE-ML format and the Bioconductor framework for preprocessing, visualization and analysis of microarray experiments. http://www.bioconductor.org. Open Source.

  12. The future of microarray technology: networking the genome search.

    PubMed

    D'Ambrosio, C; Gatta, L; Bonini, S

    2005-10-01

    In recent years microarray technology has been increasingly used in both basic and clinical research, providing substantial information for a better understanding of genome-environment interactions responsible for diseases, as well as for their diagnosis and treatment. However, in genomic research using microarray technology there are several unresolved issues, including scientific, ethical and legal issues. Networks of excellence like GA(2)LEN may represent the best approach for teaching, cost reduction, data repositories, and functional studies implementation.

  13. Fiber-optic microarray for simultaneous detection of multiple harmful algal bloom species.

    PubMed

    Ahn, Soohyoun; Kulis, David M; Erdner, Deana L; Anderson, Donald M; Walt, David R

    2006-09-01

    Harmful algal blooms (HABs) are a serious threat to coastal resources, causing a variety of impacts on public health, regional economies, and ecosystems. Plankton analysis is a valuable component of many HAB monitoring and research programs, but the diversity of plankton poses a problem in discriminating toxic from nontoxic species using conventional detection methods. Here we describe a sensitive and specific sandwich hybridization assay that combines fiber-optic microarrays with oligonucleotide probes to detect and enumerate the HAB species Alexandrium fundyense, Alexandrium ostenfeldii, and Pseudo-nitzschia australis. Microarrays were prepared by loading oligonucleotide probe-coupled microspheres (diameter, 3 mum) onto the distal ends of chemically etched imaging fiber bundles. Hybridization of target rRNA from HAB cells to immobilized probes on the microspheres was visualized using Cy3-labeled secondary probes in a sandwich-type assay format. We applied these microarrays to the detection and enumeration of HAB cells in both cultured and field samples. Our study demonstrated a detection limit of approximately 5 cells for all three target organisms within 45 min, without a separate amplification step, in both sample types. We also developed a multiplexed microarray to detect the three HAB species simultaneously, which successfully detected the target organisms, alone and in combination, without cross-reactivity. Our study suggests that fiber-optic microarrays can be used for rapid and sensitive detection and potential enumeration of HAB species in the environment.

  14. Development and assessment of whole-genome oligonucleotide microarrays to analyze an anaerobic microbial community and its responses to oxidative stress.

    PubMed

    Scholten, Johannes C M; Culley, David E; Nie, Lei; Munn, Kyle J; Chow, Lely; Brockman, Fred J; Zhang, Weiwen

    2007-06-29

    The application of DNA microarray technology to investigate multiple-species microbial communities presents great challenges. In this study, we reported the design and quality assessment of four whole genome oligonucleotide microarrays for two syntroph bacteria, Desulfovibrio vulgaris and Syntrophobacter fumaroxidans, and two archaeal methanogens, Methanosarcina barkeri, and Methanospirillum hungatei, and their application to analyze global gene expression in a four-species microbial community in response to oxidative stress. In order to minimize the possibility of cross-hybridization, cross-genome comparison was performed to assure all probes unique to each genome so that the microarrays could provide species-level resolution. Microarray quality was validated by the good reproducibility of experimental measurements of multiple biological and analytical replicates. This study showed that S. fumaroxidans and M. hungatei responded to the oxidative stress with up-regulation of several genes known to be involved in reactive oxygen species (ROS) detoxification, such as catalase and rubrerythrin in S. fumaroxidans and thioredoxin and heat shock protein Hsp20 in M. hungatei. However, D. vulgaris seemed to be less sensitive to the oxidative stress as a member of a four-species community, since no gene involved in ROS detoxification was up-regulated. Our work demonstrated the successful application of microarrays to a multiple-species microbial community, and our preliminary results indicated that this approach could provide novel insights on the metabolism within microbial communities.

  15. Systematic evaluation of RNA quality, microarray data reliability and pathway analysis in fresh, fresh frozen and formalin-fixed paraffin-embedded tissue samples.

    PubMed

    Wimmer, Isabella; Tröscher, Anna R; Brunner, Florian; Rubino, Stephen J; Bien, Christian G; Weiner, Howard L; Lassmann, Hans; Bauer, Jan

    2018-04-20

    Formalin-fixed paraffin-embedded (FFPE) tissues are valuable resources commonly used in pathology. However, formalin fixation modifies nucleic acids challenging the isolation of high-quality RNA for genetic profiling. Here, we assessed feasibility and reliability of microarray studies analysing transcriptome data from fresh, fresh-frozen (FF) and FFPE tissues. We show that reproducible microarray data can be generated from only 2 ng FFPE-derived RNA. For RNA quality assessment, fragment size distribution (DV200) and qPCR proved most suitable. During RNA isolation, extending tissue lysis time to 10 hours reduced high-molecular-weight species, while additional incubation at 70 °C markedly increased RNA yields. Since FF- and FFPE-derived microarrays constitute different data entities, we used indirect measures to investigate gene signal variation and relative gene expression. Whole-genome analyses revealed high concordance rates, while reviewing on single-genes basis showed higher data variation in FFPE than FF arrays. Using an experimental model, gene set enrichment analysis (GSEA) of FFPE-derived microarrays and fresh tissue-derived RNA-Seq datasets yielded similarly affected pathways confirming the applicability of FFPE tissue in global gene expression analysis. Our study provides a workflow comprising RNA isolation, quality assessment and microarray profiling using minimal RNA input, thus enabling hypothesis-generating pathway analyses from limited amounts of precious, pathologically significant FFPE tissues.

  16. Evaluation of gene expression classification studies: factors associated with classification performance.

    PubMed

    Novianti, Putri W; Roes, Kit C B; Eijkemans, Marinus J C

    2014-01-01

    Classification methods used in microarray studies for gene expression are diverse in the way they deal with the underlying complexity of the data, as well as in the technique used to build the classification model. The MAQC II study on cancer classification problems has found that performance was affected by factors such as the classification algorithm, cross validation method, number of genes, and gene selection method. In this paper, we study the hypothesis that the disease under study significantly determines which method is optimal, and that additionally sample size, class imbalance, type of medical question (diagnostic, prognostic or treatment response), and microarray platform are potentially influential. A systematic literature review was used to extract the information from 48 published articles on non-cancer microarray classification studies. The impact of the various factors on the reported classification accuracy was analyzed through random-intercept logistic regression. The type of medical question and method of cross validation dominated the explained variation in accuracy among studies, followed by disease category and microarray platform. In total, 42% of the between study variation was explained by all the study specific and problem specific factors that we studied together.

  17. Characterization of genetic variability of Venezuelan equine encephalitis viruses

    DOE PAGES

    Gardner, Shea N.; McLoughlin, Kevin; Be, Nicholas A.; ...

    2016-04-07

    Venezuelan equine encephalitis virus (VEEV) is a mosquito-borne alphavirus that has caused large outbreaks of severe illness in both horses and humans. New approaches are needed to rapidly infer the origin of a newly discovered VEEV strain, estimate its equine amplification and resultant epidemic potential, and predict human virulence phenotype. We performed whole genome single nucleotide polymorphism (SNP) analysis of all available VEE antigenic complex genomes, verified that a SNP-based phylogeny accurately captured the features of a phylogenetic tree based on multiple sequence alignment, and developed a high resolution genome-wide SNP microarray. We used the microarray to analyze a broadmore » panel of VEEV isolates, found excellent concordance between array- and sequence-based SNP calls, genotyped unsequenced isolates, and placed them on a phylogeny with sequenced genomes. The microarray successfully genotyped VEEV directly from tissue samples of an infected mouse, bypassing the need for viral isolation, culture and genomic sequencing. Lastly, we identified genomic variants associated with serotypes and host species, revealing a complex relationship between genotype and phenotype.« less

  18. A microarray whole-genome gene expression dataset in a rat model of inflammatory corneal angiogenesis.

    PubMed

    Mukwaya, Anthony; Lindvall, Jessica M; Xeroudaki, Maria; Peebo, Beatrice; Ali, Zaheer; Lennikov, Anton; Jensen, Lasse Dahl Ejby; Lagali, Neil

    2016-11-22

    In angiogenesis with concurrent inflammation, many pathways are activated, some linked to VEGF and others largely VEGF-independent. Pathways involving inflammatory mediators, chemokines, and micro-RNAs may play important roles in maintaining a pro-angiogenic environment or mediating angiogenic regression. Here, we describe a gene expression dataset to facilitate exploration of pro-angiogenic, pro-inflammatory, and remodelling/normalization-associated genes during both an active capillary sprouting phase, and in the restoration of an avascular phenotype. The dataset was generated by microarray analysis of the whole transcriptome in a rat model of suture-induced inflammatory corneal neovascularisation. Regions of active capillary sprout growth or regression in the cornea were harvested and total RNA extracted from four biological replicates per group. High quality RNA was obtained for gene expression analysis using microarrays. Fold change of selected genes was validated by qPCR, and protein expression was evaluated by immunohistochemistry. We provide a gene expression dataset that may be re-used to investigate corneal neovascularisation, and may also have implications in other contexts of inflammation-mediated angiogenesis.

  19. Preliminary report for analysis of genome wide mutations from four ciprofloxacin resistant B. anthracis Sterne isolates generated by Illumina, 454 sequencing and microarrays for DHS

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jaing, Crystal; Vergez, Lisa; Hinckley, Aubree

    2011-06-21

    The objective of this project is to provide DHS a comprehensive evaluation of the current genomic technologies including genotyping, Taqman PCR, multiple locus variable tandem repeat analysis (MLVA), microarray and high-throughput DNA sequencing in the analysis of biothreat agents from complex environmental samples. As the result of a different DHS project, we have selected for and isolated a large number of ciprofloxacin resistant B. anthracis Sterne isolates. These isolates vary in the concentrations of ciprofloxacin that they can tolerate, suggesting multiple mutations in the samples. In collaboration with University of Houston, Eureka Genomics and Oak Ridge National Laboratory, we analyzedmore » the ciprofloxacin resistant B. anthracis Sterne isolates by microarray hybridization, Illumina and Roche 454 sequencing to understand the error rates and sensitivity of the different methods. The report provides an assessment of the results and a complete set of all protocols used and all data generated along with information to interpret the protocols and data sets.« less

  20. Tracing phylogenomic events leading to diversity of Haemophilus influenzae and the emergence of Brazilian Purpuric Fever (BPF)-associated clones.

    PubMed

    Papazisi, Leka; Ratnayake, Shashikala; Remortel, Brian G; Bock, Geoffrey R; Liang, Wei; Saeed, Alexander I; Liu, Jia; Fleischmann, Robert D; Kilian, Mogens; Peterson, Scott N

    2010-11-01

    Here we report the use of a multi-genome DNA microarray to elucidate the genomic events associated with the emergence of the clonal variants of Haemophilus influenzae biogroup aegyptius causing Brazilian Purpuric Fever (BPF), an important pediatric disease with a high mortality rate. We performed directed genome sequencing of strain HK1212 unique loci to construct a species DNA microarray. Comparative genome hybridization using this microarray enabled us to determine and compare gene complements, and infer reliable phylogenomic relationships among members of the species. The higher genomic variability observed in the genomes of BPF-related strains (clones) and their close relatives may be characterized by significant gene flux related to a subset of functional role categories. We found that the acquisition of a large number of virulence determinants featuring numerous cell membrane proteins coupled to the loss of genes involved in transport, central biosynthetic pathways and in particular, energy production pathways to be characteristics of the BPF genomic variants. Copyright © 2010 Elsevier Inc. All rights reserved.

Top