Sample records for existing microarray reanalysis

  1. DigOut: viewing differential expression genes as outliers.

    PubMed

    Yu, Hui; Tu, Kang; Xie, Lu; Li, Yuan-Yuan

    2010-12-01

    With regard to well-replicated two-conditional microarray datasets, the selection of differentially expressed (DE) genes is a well-studied computational topic, but for multi-conditional microarray datasets with limited or no replication, the same task has not been properly addressed by previous studies. This paper adopts multivariate outlier analysis to analyze replication-lacking multi-conditional microarray datasets, finding that it performs significantly better than the widely used limit fold change (LFC) model in a simulated comparative experiment. Compared with the LFC model, the multivariate outlier analysis also demonstrates improved stability against sample variations in a series of manipulated real expression datasets. The reanalysis of a real non-replicated multi-conditional expression dataset series leads to satisfactory results. In conclusion, a multivariate outlier analysis algorithm, like DigOut, is particularly useful for selecting DE genes from non-replicated multi-conditional gene expression datasets.
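    The DigOut abstract does not spell out its algorithm, but the general idea of treating DE genes as multivariate outliers can be sketched with a Mahalanobis-distance test across conditions. This is a minimal illustrative sketch, not DigOut itself; the function name and the distance cutoff are assumptions.

```python
import numpy as np

def mahalanobis_outliers(X, threshold=3.0):
    """Flag rows of X (genes x conditions) as multivariate outliers.

    Computes each gene's Mahalanobis distance from the centre of the
    expression matrix and flags genes whose distance exceeds
    `threshold`.  A pseudo-inverse guards against a singular
    covariance matrix when conditions are highly correlated."""
    mu = X.mean(axis=0)
    inv_cov = np.linalg.pinv(np.cov(X, rowvar=False))
    diff = X - mu
    d = np.sqrt(np.einsum('ij,jk,ik->i', diff, inv_cov, diff))
    return d > threshold, d
```

    On a matrix where most genes vary little across conditions, a gene with a strongly condition-dependent profile receives a large distance and is flagged.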

  2. EzArray: A web-based highly automated Affymetrix expression array data management and analysis system

    PubMed Central

    Zhu, Yuerong; Zhu, Yuelin; Xu, Wei

    2008-01-01

    Background Though microarray experiments are very popular in life science research, managing and analyzing microarray data are still challenging tasks for many biologists. Most microarray programs require sophisticated knowledge of mathematics and statistics, as well as computing skills, from their users. With accumulating microarray data deposited in public databases, easy-to-use programs to re-analyze previously published microarray data are in high demand. Results EzArray is a web-based Affymetrix expression array data management and analysis system for researchers who need to organize microarray data efficiently and have them analyzed instantly. EzArray organizes microarray data into projects that can be analyzed online with predefined or custom procedures. EzArray performs data preprocessing and detection of differentially expressed genes with statistical methods. All analysis procedures are optimized and highly automated, so that even novice users with limited prior knowledge of microarray data analysis can complete an initial analysis quickly. Since all input files, analysis parameters, and executed scripts can be downloaded, EzArray provides maximum reproducibility for each analysis. In addition, EzArray integrates with the Gene Expression Omnibus (GEO) and allows instantaneous re-analysis of published array data. Conclusion EzArray is a novel Affymetrix expression array data analysis and sharing system. EzArray provides easy-to-use tools for re-analyzing published microarray data and will help both novice and experienced users perform initial analysis of their microarray data from the location of data storage. We believe EzArray will be a useful system for facilities with microarray services and for laboratories with multiple members involved in microarray data analysis. EzArray is freely available from . PMID:18218103

  3. A benchmark for statistical microarray data analysis that preserves actual biological and technical variance.

    PubMed

    De Hertogh, Benoît; De Meulder, Bertrand; Berger, Fabrice; Pierre, Michael; Bareke, Eric; Gaigneaux, Anthoula; Depiereux, Eric

    2010-01-11

    Recent reanalysis of spike-in datasets underscored the need for new and more accurate benchmark datasets for statistical microarray analysis. We present here a fresh method using biologically-relevant data to evaluate the performance of statistical methods. Our novel method ranks the probesets from a dataset composed of publicly-available biological microarray data and extracts subset matrices with precise information/noise ratios. Our method can be used to determine the capability of different methods to better estimate variance for a given number of replicates. The mean-variance and mean-fold change relationships of the matrices revealed a closer approximation of biological reality. Performance analysis refined the results from benchmarks published previously. We show that the Shrinkage t test (close to Limma) was the best of the methods tested, except when two replicates were examined, where the Regularized t test and the Window t test performed slightly better. The R scripts used for the analysis are available at http://urbm-cluster.urbm.fundp.ac.be/~bdemeulder/.
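    The Shrinkage and Regularized t tests named above share one idea: stabilise the per-gene variance estimate so that genes with tiny variance cannot dominate the ranking. A minimal SAM-style sketch (a uniform "fudge factor" s0 added to the standard error) is shown below; it illustrates the idea only and is not the exact estimator benchmarked in the paper.

```python
import numpy as np

def regularized_t(a, b, s0=None):
    """Per-gene two-sample t statistics with a variance offset.

    `a` and `b` are (genes x replicates) matrices for the two
    conditions.  A positive offset s0 is added to every gene's pooled
    standard error; here s0 defaults to the median standard error, a
    SAM-style heuristic (an assumption of this sketch)."""
    na, nb = a.shape[1], b.shape[1]
    se = np.sqrt(a.var(axis=1, ddof=1) / na + b.var(axis=1, ddof=1) / nb)
    if s0 is None:
        s0 = np.median(se)
    return (a.mean(axis=1) - b.mean(axis=1)) / (se + s0)
```

    A gene with a large, consistent mean shift scores far higher than one whose small mean difference is paired with near-zero variance, which is exactly the failure mode of the ordinary t test with few replicates.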

  4. Array-Based Gene Discovery with Three Unrelated Subjects Shows SCARB2/LIMP-2 Deficiency Causes Myoclonus Epilepsy and Glomerulosclerosis

    PubMed Central

    Berkovic, Samuel F.; Dibbens, Leanne M.; Oshlack, Alicia; Silver, Jeremy D.; Katerelos, Marina; Vears, Danya F.; Lüllmann-Rauch, Renate; Blanz, Judith; Zhang, Ke Wei; Stankovich, Jim; Kalnins, Renate M.; Dowling, John P.; Andermann, Eva; Andermann, Frederick; Faldini, Enrico; D'Hooge, Rudi; Vadlamudi, Lata; Macdonell, Richard A.; Hodgson, Bree L.; Bayly, Marta A.; Savige, Judy; Mulley, John C.; Smyth, Gordon K.; Power, David A.; Saftig, Paul; Bahlo, Melanie

    2008-01-01

    Action myoclonus-renal failure syndrome (AMRF) is an autosomal-recessive disorder with the remarkable combination of focal glomerulosclerosis, frequently with glomerular collapse, and progressive myoclonus epilepsy associated with storage material in the brain. Here, we employed a novel combination of molecular strategies to find the responsible gene and show its effects in an animal model. Utilizing only three unrelated affected individuals and their relatives, we used homozygosity mapping with single-nucleotide polymorphism chips to localize AMRF. We then used microarray-expression analysis to prioritize candidates prior to sequencing. The disorder was mapped to 4q13-21, and microarray-expression analysis identified SCARB2/Limp2, which encodes a lysosomal-membrane protein, as the likely candidate. Mutations in SCARB2/Limp2 were found in all three families used for mapping and subsequently confirmed in two other unrelated AMRF families. The mutations were associated with lack of SCARB2 protein. Reanalysis of an existing Limp2 knockout mouse showed intracellular inclusions in cerebral and cerebellar cortex, and the kidneys showed subtle glomerular changes. This study highlights that recessive genes can be identified with a very small number of subjects. The ancestral lysosomal-membrane protein SCARB2/LIMP-2 is responsible for AMRF. The heterogeneous pathology in the kidney and brain suggests that SCARB2/Limp2 has pleiotropic effects that may be relevant to understanding the pathogenesis of other forms of glomerulosclerosis or collapse and myoclonic epilepsies. PMID:18308289

  5. Framework for reanalysis of publicly available Affymetrix® GeneChip® data sets based on functional regions of interest.

    PubMed

    Saka, Ernur; Harrison, Benjamin J; West, Kirk; Petruska, Jeffrey C; Rouchka, Eric C

    2017-12-06

    Since the introduction of microarrays in 1995, researchers world-wide have used both commercial and custom-designed microarrays for understanding differential expression of transcribed genes. Public databases such as ArrayExpress and the Gene Expression Omnibus (GEO) have made millions of samples readily available. One main drawback to microarray data analysis involves the selection of probes to represent a specific transcript of interest, particularly in light of the fact that transcript-specific knowledge (notably alternative splicing) is dynamic in nature. We therefore developed a framework for reannotating and reassigning probe groups for Affymetrix® GeneChip® technology based on functional regions of interest. This framework addresses three issues of Affymetrix® GeneChip® data analyses: removing nonspecific probes, updating probe target mapping based on the latest genome knowledge, and grouping probes into gene, transcript and region-based (UTR, individual exon, CDS) probe sets. Updated gene and transcript probe sets provide more specific analysis results based on current genomic and transcriptomic knowledge. The framework selects unique probes, aligns them to gene annotations and generates a custom Chip Description File (CDF). The analysis reveals that only 87% of the Affymetrix® GeneChip® HG-U133 Plus 2 probes uniquely align to the current hg38 human assembly without mismatches. We also tested the new mappings on publicly available rat and human data series (GSE48611 and GSE72551, obtained from GEO) and illustrate that functional grouping allows for the subtle detection of regions of interest likely to have phenotypical consequences. Through reanalysis of the publicly available data series GSE48611 and GSE72551, we profiled the contribution of UTR and CDS regions to overall gene expression levels. The comparison between region-based and gene-based results indicated that the genes detected as expressed by gene-based and region-based CDFs are highly consistent, and that region-based results allow detection of changes in transcript formation.
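    The regrouping step described above (drop nonspecific probes, then bucket the rest into gene/region probe sets, as in a custom CDF) can be sketched as follows. The input tuple format and the region labels are assumptions of this sketch, not the framework's actual file formats.

```python
from collections import defaultdict

def group_probes(alignments):
    """Group uniquely aligned probes into region-based probe sets.

    `alignments` is an iterable of (probe_id, gene, region) tuples,
    where region is e.g. '5UTR', 'CDS', 'exon3' or '3UTR'.  Probes
    that hit more than one gene are dropped as nonspecific; the rest
    are grouped by (gene, region), mimicking a custom CDF."""
    hits = defaultdict(set)
    for probe, gene, region in alignments:
        hits[probe].add((gene, region))
    probesets = defaultdict(list)
    for probe, targets in hits.items():
        if len({gene for gene, _ in targets}) != 1:
            continue  # nonspecific probe: aligns to several genes
        for gene, region in targets:
            probesets[(gene, region)].append(probe)
    return dict(probesets)
```

    A probe aligning to two different genes is excluded entirely, while a probe covering two regions of the same gene legitimately contributes to both region-level probe sets.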

  6. BABAR: an R package to simplify the normalisation of common reference design microarray-based transcriptomic datasets

    PubMed Central

    2010-01-01

    Background The development of DNA microarrays has facilitated the generation of hundreds of thousands of transcriptomic datasets. The use of a common reference microarray design allows existing transcriptomic data to be readily compared and re-analysed in the light of new data, and the combination of this design with large datasets is ideal for 'systems'-level analyses. One issue is that these datasets are typically collected over many years and may be heterogeneous in nature, containing different microarray file formats and gene array layouts, dye-swaps, and showing varying scales of log2-ratios of expression between microarrays. Excellent software exists for the normalisation and analysis of microarray data but many data have yet to be analysed as existing methods struggle with heterogeneous datasets; options include normalising microarrays on an individual or experimental group basis. Our solution was to develop the Batch Anti-Banana Algorithm in R (BABAR) algorithm and software package which uses cyclic loess to normalise across the complete dataset. We have already used BABAR to analyse the function of Salmonella genes involved in the process of infection of mammalian cells. Results The only input required by BABAR is unprocessed GenePix or BlueFuse microarray data files. BABAR provides a combination of 'within' and 'between' microarray normalisation steps and diagnostic boxplots. When applied to a real heterogeneous dataset, BABAR normalised the dataset to produce a comparable scaling between the microarrays, with the microarray data in excellent agreement with RT-PCR analysis. When applied to a real non-heterogeneous dataset and a simulated dataset, BABAR's performance in identifying differentially expressed genes showed some benefits over standard techniques. Conclusions BABAR is an easy-to-use software tool, simplifying the simultaneous normalisation of heterogeneous two-colour common reference design cDNA microarray-based transcriptomic datasets.
We show BABAR transforms real and simulated datasets to allow for the correct interpretation of these data, and is the ideal tool to facilitate the identification of differentially expressed genes or network inference analysis from transcriptomic datasets. PMID:20128918
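    BABAR's core operation, cyclic pairwise normalisation of log2 intensities across a whole dataset, can be sketched as below. A low-degree polynomial stands in here for the loess smoother BABAR actually uses, and the function name and defaults are assumptions of this sketch.

```python
import numpy as np

def cyclic_normalise(X, n_iter=2, degree=2):
    """Cyclic pairwise normalisation of log2 intensities (genes x arrays).

    For every pair of arrays, a smooth curve of M = xi - xj against
    A = (xi + xj) / 2 is fitted, and half the fitted bias is removed
    from one array and added to the other, iterating over all pairs."""
    X = X.astype(float).copy()
    n = X.shape[1]
    for _ in range(n_iter):
        for i in range(n):
            for j in range(i + 1, n):
                A = (X[:, i] + X[:, j]) / 2
                M = X[:, i] - X[:, j]
                fit = np.poly1d(np.polyfit(A, M, degree))(A)
                X[:, i] -= fit / 2
                X[:, j] += fit / 2
    return X
```

    After a couple of cycles, arrays that differ only by systematic offsets are pulled onto a comparable scale, which is the behaviour the abstract describes for the heterogeneous dataset.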

  7. Microarrays in brain research: the good, the bad and the ugly.

    PubMed

    Mirnics, K

    2001-06-01

    Making sense of microarray data is a complex process, in which the interpretation of findings will depend on the overall experimental design and judgement of the investigator performing the analysis. As a result, differences in tissue harvesting, microarray types, sample labelling and data analysis procedures make post hoc sharing of microarray data a great challenge. To ensure rapid and meaningful data exchange, we need to create some order out of the existing chaos. In these ground-breaking microarray standardization and data sharing efforts, NIH agencies should take a leading role.

  8. The high-resolution regional reanalysis COSMO-REA6

    NASA Astrophysics Data System (ADS)

    Ohlwein, C.

    2016-12-01

    Reanalyses gain more and more importance as a source of meteorological information for many purposes and applications. Several global reanalysis projects (e.g., ERA, MERRA, CFSR, JMA) produce and verify these data sets to provide time series as long as possible combined with high data quality. Due to a spatial resolution down to 50-70 km and 3-hourly temporal output, they are not suitable for small-scale problems (e.g., regional climate assessment, meso-scale NWP verification, input for subsequent models such as river runoff simulations). The implementation of regional reanalyses based on a limited-area model along with a data assimilation scheme is able to generate reanalysis data sets with high spatio-temporal resolution. Within the Hans-Ertel-Centre for Weather Research (HErZ), the climate monitoring branch concentrates efforts on the assessment and analysis of regional climate in Germany and Europe. In joint cooperation with DWD (German Meteorological Service), a high-resolution reanalysis system based on the COSMO model has been developed. The regional reanalysis for Europe matches the domain of the CORDEX EURO-11 specifications, albeit at a higher spatial resolution, i.e., 0.055° (6 km) instead of 0.11° (12 km), and comprises the assimilation of observational data using the existing nudging scheme of COSMO, complemented by a special soil moisture analysis, with boundary conditions provided by ERA-Interim data. The reanalysis data set covers the past 20 years. Extensive evaluation of the reanalysis is performed using independent observations, with special emphasis on precipitation and high-impact weather situations, indicating a better representation of small-scale variability. Further, the evaluation shows an added value of the regional reanalysis with respect to the forcing ERA-Interim reanalysis and compared to a pure high-resolution dynamical downscaling approach without data assimilation.

  9. A high-resolution regional reanalysis for Europe

    NASA Astrophysics Data System (ADS)

    Ohlwein, C.

    2015-12-01

    Reanalyses gain more and more importance as a source of meteorological information for many purposes and applications. Several global reanalysis projects (e.g., ERA, MERRA, CFSR, JMA) produce and verify these data sets to provide time series as long as possible combined with high data quality. Due to a spatial resolution down to 50-70 km and 3-hourly temporal output, they are not suitable for small-scale problems (e.g., regional climate assessment, meso-scale NWP verification, input for subsequent models such as river runoff simulations). The implementation of regional reanalyses based on a limited-area model along with a data assimilation scheme is able to generate reanalysis data sets with high spatio-temporal resolution. Within the Hans-Ertel-Centre for Weather Research (HErZ), the climate monitoring branch concentrates efforts on the assessment and analysis of regional climate in Germany and Europe. In joint cooperation with DWD (German Meteorological Service), a high-resolution reanalysis system based on the COSMO model has been developed. The regional reanalysis for Europe matches the domain of the CORDEX EURO-11 specifications, albeit at a higher spatial resolution, i.e., 0.055° (6 km) instead of 0.11° (12 km), and comprises the assimilation of observational data using the existing nudging scheme of COSMO, complemented by a special soil moisture analysis, with boundary conditions provided by ERA-Interim data. The reanalysis data set covers the past 20 years. Extensive evaluation of the reanalysis is performed using independent observations, with special emphasis on precipitation and high-impact weather situations, indicating a better representation of small-scale variability. Further, the evaluation shows an added value of the regional reanalysis with respect to the forcing ERA-Interim reanalysis and compared to a pure high-resolution dynamical downscaling approach without data assimilation.

  10. Whether the decadal shift of South Asia High intensity around the late 1970s exists or not

    NASA Astrophysics Data System (ADS)

    Xue, Xu; Chen, Wen; Nath, Debashis; Zhou, Dingwen

    2015-05-01

    This study compares the decadal means of the seasonal (June-July-August (JJA)) mean geopotential heights available from the NCEP1 and ERA-40 reanalysis data in the Northern Hemisphere. The interdecadal changes in the South Asia High (SAH) intensity derived from the reanalysis data are also compared with ground-based radiosonde observations and atmospheric model outputs. The JJA mean geopotential heights in the 1980s are distinctly larger than in the 1970s in NCEP1 over most regions of the Northern Hemisphere, while no obvious difference is observed in ERA-40. The interannual variation of the SAH strength agrees closely between the two reanalysis datasets, so it is appropriate to use the reanalysis data to study the interannual variation of SAH strength after removing the interdecadal trend. However, the discrepancy in SAH intensity between NCEP1 and ERA-40 exists mainly on the interdecadal time scale. The SAH intensity in NCEP1 was close to that in ERA-40 before the late 1970s but became remarkably stronger afterwards, leading to a much larger decadal strengthening during the period 1970-1990. Based on the six radiosonde observation stations in the area of the SAH, the results indicate that the decadal strengthening of the SAH occurs around the mid-1980s. Thus, NCEP1 may overestimate the decadal shift in the SAH intensity around the late 1970s, while ERA-40 may underestimate it. Caution is therefore needed when using reanalysis data to study the decadal variability of the SAH intensity.

  11. Shrinkage regression-based methods for microarray missing value imputation.

    PubMed

    Wang, Hsiuying; Chiu, Chia-Chun; Wu, Yi-Ching; Wu, Wei-Sheng

    2013-01-01

    Missing values commonly occur in microarray data, which usually contain more than 5% missing values, with up to 90% of genes affected. Inaccurate missing value estimation reduces the power of downstream microarray data analyses. Many types of methods have been developed to estimate missing values. Among them, the regression-based methods are very popular and have been shown to perform better than the other types of methods in many testing microarray datasets. To further improve the performance of the regression-based methods, we propose shrinkage regression-based methods. Our methods take advantage of the correlation structure in the microarray data and select similar genes for the target gene by Pearson correlation coefficients. In addition, our methods incorporate the least squares principle, utilize a shrinkage estimation approach to adjust the coefficients of the regression model, and then use the new coefficients to estimate missing values. Simulation results show that the proposed methods provide more accurate missing value estimation in six testing microarray datasets than the existing regression-based methods do. Imputation of missing values is a very important aspect of microarray data analyses because most of the downstream analyses require a complete dataset. Therefore, exploring accurate and efficient methods for estimating missing values has become an essential issue. Since our proposed shrinkage regression-based methods provide accurate missing value estimation, they are competitive alternatives to the existing regression-based methods.
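    The pipeline described above (pick the genes most correlated with the target, fit a least squares regression on the observed positions, shrink the coefficients, then predict the missing entries) can be sketched as follows. The uniform `shrink` factor stands in for the paper's data-driven shrinkage estimator, and all names here are illustrative assumptions.

```python
import numpy as np

def impute_gene(target, candidates, k=10, shrink=0.9):
    """Impute missing entries (np.nan) of a 1-D expression vector.

    Selects the k rows of `candidates` most correlated with `target`
    on the observed positions, fits least squares regression with an
    intercept, scales the regression coefficients by `shrink`, and
    predicts the missing positions."""
    obs = ~np.isnan(target)
    cors = np.array([abs(np.corrcoef(target[obs], c[obs])[0, 1])
                     for c in candidates])
    top = candidates[np.argsort(cors)[::-1][:k]]
    A = np.column_stack([top[:, obs].T, np.ones(obs.sum())])
    beta, *_ = np.linalg.lstsq(A, target[obs], rcond=None)
    beta[:-1] *= shrink  # shrink the regression coefficients
    filled = target.copy()
    miss = ~obs
    filled[miss] = np.column_stack([top[:, miss].T,
                                    np.ones(miss.sum())]) @ beta
    return filled
```

    When the target is well predicted by a neighbour gene, the regression recovers the missing value almost exactly; shrinkage trades a little of that fit for robustness when the correlation structure is noisy.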

  12. Status and Preliminary Evaluation for Chinese Re-Analysis Datasets

    NASA Astrophysics Data System (ADS)

    bin, zhao; chunxiang, shi; tianbao, zhao; dong, si; jingwei, liu

    2016-04-01

    Based on the operational T639L60 spectral model, combined with the Hybrid_GSI assimilation system and using meteorological observations including radiosondes, buoys, and satellites, a set of Chinese Re-Analysis (CRA) datasets is being developed by the National Meteorological Information Center (NMIC) of the China Meteorological Administration (CMA). The datasets are run at 30 km (0.28° latitude/longitude) resolution, higher than most existing reanalysis datasets. The reanalysis is undertaken to enhance the accuracy of historical synoptic analysis and to support detailed investigation of various weather and climate systems. The reanalysis is currently at the stage of preliminary experimental analysis. One year of forecast data, from June 2013 to May 2014, has been simulated and used for synoptic and climate evaluation. We first examine the model's prediction ability with the new assimilation system and find that it yields significant improvement in the Northern and Southern Hemispheres owing to the addition of new satellite data; compared with the operational T639L60 model, upper-level prediction is improved markedly and overall prediction stability is enhanced. In the climatological analysis, compared with the ERA-40, NCEP/NCAR, and NCEP/DOE reanalyses, the results show that simulated surface temperature is slightly lower over land and higher over ocean, 850-hPa specific humidity shows a weakened anomaly, and the zonal wind anomaly is concentrated in the equatorial tropics. Meanwhile, the reanalysis dataset captures various climate indices well, such as the subtropical high index and the East Asian subtropical summer monsoon index (ESMI), especially the Indian and western North Pacific monsoon indices. Later we will further improve the assimilation system and dynamical simulation performance and produce a 40-year (1979-2018) reanalysis dataset, providing a more comprehensive basis for synoptic and climate diagnosis.

  13. Evaluation of a High-Resolution Regional Reanalysis for Europe

    NASA Astrophysics Data System (ADS)

    Ohlwein, C.; Wahl, S.; Keller, J. D.; Bollmeyer, C.

    2014-12-01

    Reanalyses gain more and more importance as a source of meteorological information for many purposes and applications. Several global reanalysis projects (e.g., ERA, MERRA, CFSR, JMA) produce and verify these data sets to provide time series as long as possible combined with high data quality. Due to a spatial resolution down to 50-70 km and 3-hourly temporal output, they are not suitable for small-scale problems (e.g., regional climate assessment, meso-scale NWP verification, input for subsequent models such as river runoff simulations). The implementation of regional reanalyses based on a limited-area model along with a data assimilation scheme is able to generate reanalysis data sets with high spatio-temporal resolution. Within the Hans-Ertel-Centre for Weather Research (HErZ), the climate monitoring branch concentrates efforts on the assessment and analysis of regional climate in Germany and Europe. In joint cooperation with DWD (German Meteorological Service), a high-resolution reanalysis system based on the COSMO model has been developed. The regional reanalysis for Europe matches the domain of the CORDEX EURO-11 specifications, albeit at a higher spatial resolution, i.e., 0.055° (6 km) instead of 0.11° (12 km), and comprises the assimilation of observational data using the existing nudging scheme of COSMO, complemented by a special soil moisture analysis, with boundary conditions provided by ERA-Interim data. The reanalysis data set covers 6 years (2007-2012) and is currently being extended to 16 years. Extensive evaluation of the reanalysis is performed using independent observations, with special emphasis on precipitation and high-impact weather situations, indicating a better representation of small-scale variability. Further, the evaluation shows an added value of the regional reanalysis with respect to the forcing ERA-Interim reanalysis and compared to a pure high-resolution dynamical downscaling approach without data assimilation.

  14. A High-resolution Reanalysis for the European CORDEX Region

    NASA Astrophysics Data System (ADS)

    Bentzien, Sabrina; Bollmeyer, Christoph; Crewell, Susanne; Friederichs, Petra; Hense, Andreas; Keller, Jan; Keune, Jessica; Kneifel, Stefan; Ohlwein, Christian; Pscheidt, Ieda; Redl, Stephanie; Steinke, Sandra

    2014-05-01

    Within the Hans-Ertel-Centre for Weather Research (HErZ), the climate monitoring branch concentrates efforts on the assessment and analysis of regional climate in Germany and Europe. In joint cooperation with DWD (German Meteorological Service), a high-resolution reanalysis system based on the COSMO model has been developed. Reanalyses gain more and more importance as a source of meteorological information for many purposes and applications. Several global reanalysis projects (e.g., ERA, MERRA, CFSR, JMA) produce and verify these data sets to provide time series as long as possible combined with high data quality. Due to a spatial resolution down to 50-70 km and 3-hourly temporal output, they are not suitable for small-scale problems (e.g., regional climate assessment, meso-scale NWP verification, input for subsequent models such as river runoff simulations). The implementation of regional reanalyses based on a limited-area model along with a data assimilation scheme is able to generate reanalysis data sets with high spatio-temporal resolution. The work presented here focuses on the regional reanalysis for Europe with a domain matching the CORDEX-EURO-11 specifications, albeit at a higher spatial resolution, i.e., 0.055° (6 km) instead of 0.11° (12 km). The COSMO reanalysis system comprises the assimilation of observational data using the existing nudging scheme of COSMO and is complemented by a special soil moisture analysis and boundary conditions given by ERA-Interim data. The reanalysis data set currently covers 6 years (2007-2012). The evaluation of the reanalyses is done using independent observations, with special emphasis on precipitation and high-impact weather situations. The development and evaluation of the COSMO-based reanalysis for the CORDEX-Euro domain can be seen as preparation for joint European activities on the development of an ensemble system of regional reanalyses for Europe.

  15. Stratospheric water vapor and ozone evaluation in reanalyses as part of the SPARC Reanalysis Intercomparison Project (S-RIP)

    NASA Astrophysics Data System (ADS)

    Davis, S. M.; Hegglin, M. I.; Fujiwara, M.; Manney, G. L.; Dragani, R.; Nash, E.; Tegtmeier, S.; Kobayashi, C.; Harada, Y.; Long, C. S.; Wargan, K.; Rosenlof, K. H.

    2017-12-01

    Reanalyses are widely used to understand atmospheric processes and past variability, and are often used to stand in as "observations" for comparisons with climate model output. Because of the central role of water vapor (WV) and ozone (O3) in climate change, it is important to understand how accurately and consistently these species are represented in existing global reanalyses. Here we present the results of WV and O3 intercomparisons that have been performed as part of the SPARC (Stratosphere-troposphere Processes and their Role in Climate) Reanalysis Intercomparison Project (S-RIP). The comparisons cover a range of timescales and evaluate both inter-reanalysis and observation-reanalysis differences. The assimilation of total column ozone (TCO) observations in newer reanalyses results in realistic representations of TCO in reanalyses except when data coverage is lacking, such as during polar night. The vertical distribution of ozone is also relatively well represented in the stratosphere in reanalyses, particularly given the relatively weak constraints on ozone vertical structure provided by most assimilated observations and the simplistic representations of ozone photochemical processes in most of the reanalysis forecast models. For times when vertically resolved observations are not assimilated, biases in the vertical distribution of ozone are found in the upper troposphere and lower stratosphere in all reanalyses. In contrast to O3, reanalysis stratospheric WV fields are not directly constrained by assimilated data. Observations of atmospheric humidity are typically used only in the troposphere, below a specified vertical level at or near the tropopause. The fidelity of reanalysis stratospheric WV products is therefore dependent on the reanalyses' representation of processes that influence stratospheric WV, such as tropical tropopause layer temperatures and methane oxidation. 
The lack of assimilated observations and known deficiencies in the representation of stratospheric transport in reanalyses result in much poorer agreement amongst observational and reanalysis estimates of stratospheric WV. Hence, stratospheric WV products from the current generation of reanalyses should generally not be used in scientific studies.

  16. Data submission and quality in microarray-based microRNA profiling

    PubMed Central

    Witwer, Kenneth W.

    2014-01-01

    Background Public sharing of scientific data has assumed greater importance in the ‘omics’ era. Transparency is necessary for confirmation and validation, and multiple examiners aid in extracting maximal value from large datasets. Accordingly, database submission and provision of the Minimum Information About a Microarray Experiment (MIAME) are required by most journals as a prerequisite for review or acceptance. Methods In this study, the level of data submission and MIAME compliance was reviewed for 127 articles that included microarray-based microRNA profiling and that were published from July, 2011 through April, 2012 in the journals that published the largest number of such articles—PLOS ONE, the Journal of Biological Chemistry, Blood, and Oncogene—along with articles from nine other journals, including Clinical Chemistry, that published smaller numbers of array-based articles. Results Overall, data submission was reported at publication for less than 40% of all articles, and almost 75% of articles were MIAME-noncompliant. On average, articles that included full data submission scored significantly higher on a quality metric than articles with limited or no data submission, and studies with adequate description of methods disproportionately included larger numbers of experimental repeats. Finally, for several articles that were not MIAME-compliant, data re-analysis revealed less than complete support for the published conclusions, in one case leading to retraction. Conclusions These findings buttress the hypothesis that reluctance to share data is associated with low study quality and suggest that most miRNA array investigations are underpowered and/or potentially compromised by a lack of appropriate reporting and data submission. PMID:23358751

  17. Data submission and quality in microarray-based microRNA profiling.

    PubMed

    Witwer, Kenneth W

    2013-02-01

    Public sharing of scientific data has assumed greater importance in the omics era. Transparency is necessary for confirmation and validation, and multiple examiners aid in extracting maximal value from large data sets. Accordingly, database submission and provision of the Minimum Information About a Microarray Experiment (MIAME) are required by most journals as a prerequisite for review or acceptance. In this study, the level of data submission and MIAME compliance was reviewed for 127 articles that included microarray-based microRNA (miRNA) profiling and were published from July 2011 through April 2012 in the journals that published the largest number of such articles--PLOS ONE, the Journal of Biological Chemistry, Blood, and Oncogene--along with articles from 9 other journals, including Clinical Chemistry, that published smaller numbers of array-based articles. Overall, data submission was reported at publication for <40% of all articles, and almost 75% of articles were MIAME noncompliant. On average, articles that included full data submission scored significantly higher on a quality metric than articles with limited or no data submission, and studies with adequate description of methods disproportionately included larger numbers of experimental repeats. Finally, for several articles that were not MIAME compliant, data reanalysis revealed less than complete support for the published conclusions, in 1 case leading to retraction. These findings buttress the hypothesis that reluctance to share data is associated with low study quality and suggest that most miRNA array investigations are underpowered and/or potentially compromised by a lack of appropriate reporting and data submission. © 2012 American Association for Clinical Chemistry.

  18. Microarray Я US: a user-friendly graphical interface to Bioconductor tools that enables accurate microarray data analysis and expedites comprehensive functional analysis of microarray results.

    PubMed

    Dai, Yilin; Guo, Ling; Li, Meng; Chen, Yi-Bu

    2012-06-08

Microarray data analysis presents a significant challenge to researchers who are unable to use the powerful Bioconductor and its numerous tools due to their lack of knowledge of the R language. Among the few existing software programs that offer a graphical user interface to Bioconductor packages, none have implemented a comprehensive strategy to address the accuracy and reliability issues of microarray data analysis that arise from the well-known probe design problems associated with many widely used microarray chips. There is also a lack of tools that would expedite the functional analysis of microarray results. We present Microarray Я US, an R-based graphical user interface that implements over a dozen popular Bioconductor packages to offer researchers a streamlined workflow for routine differential microarray expression data analysis without the need to learn R. In order to enable a more accurate analysis and interpretation of microarray data, we incorporated the latest custom probe re-definition and re-annotation for Affymetrix and Illumina chips. A versatile microarray results output utility tool was also implemented for easy and fast generation of input files for over 20 of the most widely used functional analysis software programs. Coupled with a well-designed user interface, Microarray Я US leverages cutting-edge Bioconductor packages for researchers with no knowledge of the R language. It also enables more reliable and accurate microarray data analysis and expedites downstream functional analysis of microarray results.

  19. In silico Microarray Probe Design for Diagnosis of Multiple Pathogens

    DTIC Science & Technology

    2008-10-21

    enhancements to an existing single-genome pipeline that allows for efficient design of microarray probes common to groups of target genomes. The...for tens or even hundreds of related genomes in a single run. Hybridization results with an unsequenced B. pseudomallei strain indicate that the

  20. A database for the analysis of immunity genes in Drosophila: PADMA database.

    PubMed

    Lee, Mark J; Mondal, Ariful; Small, Chiyedza; Paddibhatla, Indira; Kawaguchi, Akira; Govind, Shubha

    2011-01-01

While microarray experiments generate voluminous data, discerning trends that support an existing or alternative paradigm is challenging. To synergize hypothesis building and testing, we designed the Pathogen Associated Drosophila MicroArray (PADMA) database for easy retrieval and comparison of microarray results from immunity-related experiments (www.padmadatabase.org). PADMA also allows biologists to upload their microarray results and compare them with datasets housed within PADMA. We tested PADMA using a preliminary dataset from Ganaspis xanthopoda-infected fly larvae, and uncovered unexpected trends in gene expression, reshaping our hypothesis. Thus, the PADMA database will be a useful resource for fly researchers to evaluate, revise, and refine hypotheses.

  1. Boundary formulations for sensitivity analysis without matrix derivatives

    NASA Technical Reports Server (NTRS)

    Kane, J. H.; Guru Prasad, K.

    1993-01-01

    A new hybrid approach to continuum structural shape sensitivity analysis employing boundary element analysis (BEA) is presented. The approach uses iterative reanalysis to obviate the need to factor perturbed matrices in the determination of surface displacement and traction sensitivities via a univariate perturbation/finite difference (UPFD) step. The UPFD approach makes it possible to immediately reuse existing subroutines for computation of BEA matrix coefficients in the design sensitivity analysis process. The reanalysis technique computes economical response of univariately perturbed models without factoring perturbed matrices. The approach provides substantial computational economy without the burden of a large-scale reprogramming effort.
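The reanalysis idea in this abstract, reusing the factorization of the unperturbed matrix rather than factoring each perturbed matrix, can be sketched with a toy linear system. A minimal Python illustration (the 3x3 system, the step size, and the use of an explicit inverse in place of a retained LU factor are our simplifications for brevity, not the paper's BEA implementation):

```python
import numpy as np

def perturbed_solve(K0_inv_apply, K_pert, f, u0, tol=1e-10, max_iter=50):
    """Iterative reanalysis: solve K_pert u = f by iterative refinement,
    reusing the unperturbed operator K0^-1 (via K0_inv_apply) so that the
    perturbed matrix is never factored."""
    u = u0.copy()
    for _ in range(max_iter):
        r = f - K_pert @ u          # residual of the perturbed system
        du = K0_inv_apply(r)        # cheap correction from the old factorization
        u = u + du
        if np.linalg.norm(du) < tol * np.linalg.norm(u):
            break
    return u

# Toy 3x3 well-conditioned system standing in for a BEA coefficient matrix.
rng = np.random.default_rng(0)
K0 = np.eye(3) * 4 + rng.standard_normal((3, 3)) * 0.1
f = np.array([1.0, 2.0, 3.0])

# "Factor" K0 once (explicit inverse here for brevity; an LU factor in practice).
K0_inv = np.linalg.inv(K0)
u0 = K0_inv @ f

# Univariate perturbation of one design variable with finite-difference step h.
h = 1e-6
dK = np.zeros((3, 3)); dK[0, 0] = 1.0        # d(K)/d(p) for this toy model
u_pert = perturbed_solve(lambda r: K0_inv @ r, K0 + h * dK, f, u0)

du_dp = (u_pert - u0) / h                    # UPFD sensitivity estimate
```

For this linear toy problem the UPFD estimate can be checked against the analytic sensitivity, -K0^-1 (dK) u0, which the finite-difference step approximates to first order.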

  2. The Gulf Stream in Ocean Reanalyses: 1993-2010

    NASA Astrophysics Data System (ADS)

    Chi, L.; Wolfe, C.; Hameed, S.

    2017-12-01

In recent years, significant progress has been made in the development of high-resolution ocean reanalysis products. However, errors are likely to remain because of inadequate coverage of observations, model resolutions, physical parameterizations, etc. We compare the representation of the Gulf Stream (GS) in several widely used global reanalysis products with resolutions ranging from 1° to 1/12°. This intercomparison focuses on the Florida Current transport, the separation of the GS near Cape Hatteras, GS properties along the Oleander Line (from New Jersey to Bermuda), the GS path, and the GS north wall positions between 73°W and 55°W. A large spread exists across the reanalysis products. HYCOM and GLORYS2v4 stand out for their top performance in most metrics. Some common biases are found in all discussed products; for example, the velocity structure of the GS near the Oleander Line is too symmetric and the maximum velocity is weaker than in observations. In addition, for the annual mean values of the GS separation latitude near Cape Hatteras, the GS transport, and the net transport across the Oleander Line, fewer than half of the reanalysis products are correlated with the observations at the 95% confidence level.

  3. Diagnostic evaluation of the Community Earth System Model in simulating mineral dust emission with insight into large-scale dust storm mobilization in the Middle East and North Africa (MENA)

    NASA Astrophysics Data System (ADS)

    Parajuli, Sagar Prasad; Yang, Zong-Liang; Lawrence, David M.

    2016-06-01

Large amounts of mineral dust are injected into the atmosphere during dust storms, which are common in the Middle East and North Africa (MENA) where most of the global dust hotspots are located. In this work, we present simulations of dust emission using the Community Earth System Model Version 1.2.2 (CESM 1.2.2) and evaluate how well it captures the spatio-temporal characteristics of dust emission in the MENA region with a focus on large-scale dust storm mobilization. We explicitly focus our analysis on the model's two major input parameters that affect the vertical mass flux of dust: surface winds and the soil erodibility factor. We analyze dust emissions in simulations with both prognostic CESM winds and with CESM winds that are nudged towards ERA-Interim reanalysis values. Simulations with three existing erodibility maps and a new observation-based erodibility map are also conducted. We compare the simulated results with MODIS satellite data, MACC reanalysis data, AERONET station data, and CALIPSO 3-D aerosol profile data. The dust emission simulated by CESM, when driven by nudged reanalysis winds, compares reasonably well with observations on daily to monthly time scales despite CESM being a global General Circulation Model. However, considerable bias exists around known high dust source locations in northwest/northeast Africa and over the Arabian Peninsula, where recurring large-scale dust storms are common. The new observation-based erodibility map, which can represent anthropogenic dust sources that are not directly represented by existing erodibility maps, shows improved performance in terms of the simulated dust optical depth (DOD) and aerosol optical depth (AOD) compared to existing erodibility maps, although the performance of different erodibility maps varies by region.

  4. Assessment of upper tropospheric and stratospheric water vapor and ozone in reanalyses as part of S-RIP

    NASA Astrophysics Data System (ADS)

    Davis, Sean M.; Hegglin, Michaela I.; Fujiwara, Masatomo; Dragani, Rossana; Harada, Yayoi; Kobayashi, Chiaki; Long, Craig; Manney, Gloria L.; Nash, Eric R.; Potter, Gerald L.; Tegtmeier, Susann; Wang, Tao; Wargan, Krzysztof; Wright, Jonathon S.

    2017-10-01

Reanalysis data sets are widely used to understand atmospheric processes and past variability, and are often used to stand in as "observations" for comparisons with climate model output. Because of the central role of water vapor (WV) and ozone (O3) in climate change, it is important to understand how accurately and consistently these species are represented in existing global reanalyses. In this paper, we present the results of WV and O3 intercomparisons that have been performed as part of the SPARC (Stratosphere-troposphere Processes and their Role in Climate) Reanalysis Intercomparison Project (S-RIP). The comparisons cover a range of timescales and evaluate both inter-reanalysis and observation-reanalysis differences. We also provide a systematic documentation of the treatment of WV and O3 in current reanalyses to aid future research and guide the interpretation of differences amongst reanalysis fields. The assimilation of total column ozone (TCO) observations in newer reanalyses results in realistic representations of TCO in reanalyses except when data coverage is lacking, such as during polar night. The vertical distribution of ozone is also relatively well represented in the stratosphere in reanalyses, particularly given the relatively weak constraints on ozone vertical structure provided by most assimilated observations and the simplistic representations of ozone photochemical processes in most of the reanalysis forecast models. However, significant biases in the vertical distribution of ozone are found in the upper troposphere and lower stratosphere in all reanalyses. In contrast to O3, reanalysis estimates of stratospheric WV are not directly constrained by assimilated data. Observations of atmospheric humidity are typically used only in the troposphere, below a specified vertical level at or near the tropopause. 
The fidelity of reanalysis stratospheric WV products is therefore mainly dependent on the reanalyses' representation of the physical drivers that influence stratospheric WV, such as temperatures in the tropical tropopause layer, methane oxidation, and the stratospheric overturning circulation. The lack of assimilated observations and known deficiencies in the representation of stratospheric transport in reanalyses result in much poorer agreement amongst observational and reanalysis estimates of stratospheric WV. Hence, stratospheric WV products from the current generation of reanalyses should generally not be used in scientific studies.

  5. MVIAeval: a web tool for comprehensively evaluating the performance of a new missing value imputation algorithm.

    PubMed

    Wu, Wei-Sheng; Jhou, Meng-Jhun

    2017-01-13

Missing value imputation is important for microarray data analyses because microarray data with missing values would significantly degrade the performance of the downstream analyses. Although many microarray missing value imputation algorithms have been developed, an objective and comprehensive performance comparison framework is still lacking. To solve this problem, we previously proposed a framework which can perform a comprehensive performance comparison of different existing algorithms; the same framework can also evaluate the performance of a new algorithm. However, constructing our framework is not an easy task for interested researchers. To save researchers' time and effort, here we present an easy-to-use web tool named MVIAeval (Missing Value Imputation Algorithm evaluator) which implements our performance comparison framework. MVIAeval provides a user-friendly interface allowing users to upload the R code of their new algorithm and select (i) the test datasets among 20 benchmark microarray (time series and non-time series) datasets, (ii) the compared algorithms among 12 existing algorithms, (iii) the performance indices among three existing ones, (iv) the comprehensive performance scores from two possible choices, and (v) the number of simulation runs. The comprehensive performance comparison results are then generated and shown as both figures and tables. MVIAeval is a useful tool for researchers to easily conduct a comprehensive and objective performance evaluation of their newly developed missing value imputation algorithm for microarray data or any data which can be represented in matrix form (e.g. NGS data or proteomics data). Thus, MVIAeval will greatly expedite the progress in the research of missing value imputation algorithms.
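The simulation loop behind such a comparison framework is simple to sketch: mask random entries of a complete matrix, impute them, and score the imputations against the held-out truth. The sketch below uses NRMSE as the performance index and row-mean imputation as a stand-in "algorithm"; all names and parameter choices are illustrative, not MVIAeval's actual code:

```python
import numpy as np

def nrmse(true_vals, imputed_vals):
    """Normalized RMSE, a common index for scoring an imputation
    algorithm against the known (masked) entries."""
    true_vals = np.asarray(true_vals, float)
    imputed_vals = np.asarray(imputed_vals, float)
    return np.sqrt(np.mean((true_vals - imputed_vals) ** 2)) / np.std(true_vals)

def evaluate_imputer(complete, impute_fn, missing_rate=0.1, runs=5, seed=0):
    """Mask random entries, impute, score, and average over simulation runs."""
    rng = np.random.default_rng(seed)
    scores = []
    for _ in range(runs):
        mask = rng.random(complete.shape) < missing_rate
        with_missing = complete.copy()
        with_missing[mask] = np.nan
        imputed = impute_fn(with_missing)
        scores.append(nrmse(complete[mask], imputed[mask]))
    return float(np.mean(scores))

# Toy "algorithm": row-mean imputation on a small expression-like matrix.
def row_mean_impute(X):
    X = X.copy()
    row_means = np.nanmean(X, axis=1)
    idx = np.where(np.isnan(X))
    X[idx] = row_means[idx[0]]
    return X

data = np.random.default_rng(1).normal(size=(30, 10))
score = evaluate_imputer(data, row_mean_impute)
```

A candidate algorithm plugged in as `impute_fn` would be ranked by averaging such indices over many runs and datasets, which is the comprehensive-score idea the abstract describes.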

  6. The tropopause inversion layer in models and analyses

    NASA Astrophysics Data System (ADS)

    Birner, T.; Sankey, D.; Shepherd, T. G.

    2006-07-01

    Recent high-resolution radiosonde climatologies have revealed a tropopause inversion layer (TIL) in the extratropics: temperature strongly increases just above a sharp local cold point tropopause. Here, it is asked to what extent a TIL exists in current general circulation models (GCMs) and meteorological analyses. Only a weak hint of a TIL exists in NCEP/NCAR reanalysis data. In contrast, the Canadian Middle Atmosphere Model (CMAM), a comprehensive GCM, exhibits a TIL of realistic strength. However, in data assimilation mode CMAM exhibits a much weaker TIL, especially in the Southern Hemisphere where only coarse satellite data are available. The discrepancy between the analyses and the GCM is thus hypothesized to be mainly due to data assimilation acting to smooth the observed strong curvature in temperature around the tropopause. This is confirmed in the reanalysis where the stratification around the tropopause exhibits a strong discontinuity at the start of the satellite era.

  7. GeneXplorer: an interactive web application for microarray data visualization and analysis.

    PubMed

    Rees, Christian A; Demeter, Janos; Matese, John C; Botstein, David; Sherlock, Gavin

    2004-10-01

    When publishing large-scale microarray datasets, it is of great value to create supplemental websites where either the full data, or selected subsets corresponding to figures within the paper, can be browsed. We set out to create a CGI application containing many of the features of some of the existing standalone software for the visualization of clustered microarray data. We present GeneXplorer, a web application for interactive microarray data visualization and analysis in a web environment. GeneXplorer allows users to browse a microarray dataset in an intuitive fashion. It provides simple access to microarray data over the Internet and uses only HTML and JavaScript to display graphic and annotation information. It provides radar and zoom views of the data, allows display of the nearest neighbors to a gene expression vector based on their Pearson correlations and provides the ability to search gene annotation fields. The software is released under the permissive MIT Open Source license, and the complete documentation and the entire source code are freely available for download from CPAN http://search.cpan.org/dist/Microarray-GeneXplorer/.
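The nearest-neighbor view described above, ranking genes by Pearson correlation against a query expression vector, reduces to a short computation once each gene profile is centred and scaled. A sketch with toy data (function and variable names are ours, not GeneXplorer's source):

```python
import numpy as np

def nearest_neighbors(data, query_idx, k=3):
    """Rank genes by Pearson correlation with the query gene's
    expression vector, as in a nearest-neighbor display."""
    # Centre and scale each gene (row) so a dot product equals Pearson r.
    X = data - data.mean(axis=1, keepdims=True)
    X /= np.linalg.norm(X, axis=1, keepdims=True)
    r = X @ X[query_idx]                     # correlation of every gene with query
    order = np.argsort(-r)                   # most correlated first
    return [(int(i), float(r[i])) for i in order if i != query_idx][:k]

# Toy matrix: 4 genes x 5 conditions; gene 1 tracks gene 0, gene 3 is inverted.
expr = np.array([[1.0, 2.0, 3.0, 4.0, 5.0],
                 [1.1, 2.0, 2.9, 4.2, 5.1],
                 [5.0, 1.0, 4.0, 2.0, 3.0],
                 [5.0, 4.0, 3.0, 2.0, 1.0]])
print(nearest_neighbors(expr, query_idx=0, k=2))
```

On this toy input the co-regulated gene 1 ranks first with r close to 1, while the anti-correlated gene 3 ranks last.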

  8. CEM-designer: design of custom expression microarrays in the post-ENCODE Era.

    PubMed

    Arnold, Christian; Externbrink, Fabian; Hackermüller, Jörg; Reiche, Kristin

    2014-11-10

    Microarrays are widely used in gene expression studies, and custom expression microarrays are popular to monitor expression changes of a customer-defined set of genes. However, the complexity of transcriptomes uncovered recently make custom expression microarray design a non-trivial task. Pervasive transcription and alternative processing of transcripts generate a wealth of interweaved transcripts that requires well-considered probe design strategies and is largely neglected in existing approaches. We developed the web server CEM-Designer that facilitates microarray platform independent design of custom expression microarrays for complex transcriptomes. CEM-Designer covers (i) the collection and generation of a set of unique target sequences from different sources and (ii) the selection of a set of sensitive and specific probes that optimally represents the target sequences. Probe design itself is left to third party software to ensure that probes meet provider-specific constraints. CEM-Designer is available at http://designpipeline.bioinf.uni-leipzig.de. Copyright © 2014 Elsevier B.V. All rights reserved.

  9. A Reanalysis of the Effects of Teacher Replacement Using Value-Added Modeling

    ERIC Educational Resources Information Center

    Yeh, Stuart S.

    2013-01-01

    Background: In principle, value-added modeling (VAM) might be justified if it can be shown to be a more reliable indicator of teacher quality than existing indicators for existing low-stakes decisions that are already being made, such as the award of small merit bonuses. However, a growing number of researchers now advocate the use of VAM to…

  10. Prediction of regulatory gene pairs using dynamic time warping and gene ontology.

    PubMed

    Yang, Andy C; Hsu, Hui-Huang; Lu, Ming-Da; Tseng, Vincent S; Shih, Timothy K

    2014-01-01

    Selecting informative genes is the most important task for data analysis on microarray gene expression data. In this work, we aim at identifying regulatory gene pairs from microarray gene expression data. However, microarray data often contain multiple missing expression values. Missing value imputation is thus needed before further processing for regulatory gene pairs becomes possible. We develop a novel approach to first impute missing values in microarray time series data by combining k-Nearest Neighbour (KNN), Dynamic Time Warping (DTW) and Gene Ontology (GO). After missing values are imputed, we then perform gene regulation prediction based on our proposed DTW-GO distance measurement of gene pairs. Experimental results show that our approach is more accurate when compared with existing missing value imputation methods on real microarray data sets. Furthermore, our approach can also discover more regulatory gene pairs that are known in the literature than other methods.
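The DTW distance at the core of this approach is a standard dynamic-programming recurrence over all monotone alignments of two series. A minimal sketch (absolute-difference local cost assumed; the paper's combination of DTW with KNN and GO annotation is omitted):

```python
import numpy as np

def dtw_distance(a, b):
    """Classic dynamic time warping distance between two time series.
    DTW tolerates time shifts that plain point-by-point Euclidean
    comparison penalizes, which is the motivation for using it to
    match gene expression profiles."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match.
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return float(D[n, m])

# A time-shifted copy of a profile stays close under DTW despite the offset.
x = [0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0]
y = [0.0, 0.0, 1.0, 2.0, 3.0, 2.0, 1.0]   # same shape, delayed one step
print(dtw_distance(x, y))
```

Here the pointwise absolute difference between `x` and `y` sums to 6, while the DTW alignment absorbs the one-step delay and leaves only the unavoidable endpoint mismatch.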

  11. Chondrocyte channel transcriptomics

    PubMed Central

    Lewis, Rebecca; May, Hannah; Mobasheri, Ali; Barrett-Jolley, Richard

    2013-01-01

    To date, a range of ion channels have been identified in chondrocytes using a number of different techniques, predominantly electrophysiological and/or biomolecular; each of these has its advantages and disadvantages. Here we aim to compare and contrast the data available from biophysical and microarray experiments. This letter analyses recent transcriptomics datasets from chondrocytes, accessible from the European Bioinformatics Institute (EBI). We discuss whether such bioinformatic analysis of microarray datasets can potentially accelerate identification and discovery of ion channels in chondrocytes. The ion channels which appear most frequently across these microarray datasets are discussed, along with their possible functions. We discuss whether functional or protein data exist which support the microarray data. A microarray experiment comparing gene expression in osteoarthritis and healthy cartilage is also discussed and we verify the differential expression of 2 of these genes, namely the genes encoding large calcium-activated potassium (BK) and aquaporin channels. PMID:23995703

  12. Variance stabilization and normalization for one-color microarray data using a data-driven multiscale approach.

    PubMed

    Motakis, E S; Nason, G P; Fryzlewicz, P; Rutter, G A

    2006-10-15

    Many standard statistical techniques are effective on data that are normally distributed with constant variance. Microarray data typically violate these assumptions since they come from non-Gaussian distributions with a non-trivial mean-variance relationship. Several methods have been proposed that transform microarray data to stabilize variance and draw its distribution towards the Gaussian. Some methods, such as log or generalized log, rely on an underlying model for the data. Others, such as the spread-versus-level plot, do not. We propose an alternative data-driven multiscale approach, called the Data-Driven Haar-Fisz for microarrays (DDHFm) with replicates. DDHFm has the advantage of being 'distribution-free' in the sense that no parametric model for the underlying microarray data is required to be specified or estimated; hence, DDHFm can be applied very generally, not just to microarray data. DDHFm achieves very good variance stabilization of microarray data with replicates and produces transformed intensities that are approximately normally distributed. Simulation studies show that it performs better than other existing methods. Application of DDHFm to real one-color cDNA data validates these results. The R package of the Data-Driven Haar-Fisz transform (DDHFm) for microarrays is available in Bioconductor and CRAN.
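For intuition on what variance stabilization buys here, the generalized log mentioned in the abstract as a model-based alternative can be demonstrated on simulated two-component intensities. This shows the mean-variance problem and its removal, not DDHFm itself (which is a multiscale Haar-Fisz construction); the error-model parameters and the tuning constant `c` below are our illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a common two-component microarray error model: additive
# background noise plus multiplicative noise that grows with the signal.
mu = np.repeat(np.array([50.0, 200.0, 800.0, 3200.0]), 500)
intensity = mu * np.exp(rng.normal(0, 0.2, mu.size)) + rng.normal(0, 20, mu.size)

def glog(x, c=100.0):
    """Generalized log transform; c is a tuning constant (assumed here,
    normally estimated from the data)."""
    return np.log((x + np.sqrt(x ** 2 + c ** 2)) / 2.0)

def group_sds(values, groups):
    """Standard deviation of the values within each true-mean group."""
    return [float(np.std(values[groups == g])) for g in np.unique(groups)]

raw_sds = group_sds(intensity, mu)          # sd rises steeply with the mean
glog_sds = group_sds(glog(intensity), mu)   # roughly constant across groups
```

On the raw scale the spread grows roughly in proportion to the mean, so the group standard deviations differ by more than an order of magnitude; after the transform they are nearly equal, which is the property DDHFm achieves without assuming any parametric error model.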

  13. Development of a DNA microarray for species identification of quarantine aphids.

    PubMed

    Lee, Won Sun; Choi, Hwalran; Kang, Jinseok; Kim, Ji-Hoon; Lee, Si Hyeock; Lee, Seunghwan; Hwang, Seung Yong

    2013-12-01

Aphid pests are being brought into Korea as a result of increased crop trading. Aphids infest the growing regions of plants, and thus seriously affect plant growth. However, aphids are very small and have several sexual morphs and life stages, so it is difficult to identify species on the basis of morphological features. This problem was approached using DNA microarray technology. DNA targets of the cytochrome c oxidase subunit I gene were generated with a fluorescent dye-labelled primer and were hybridised onto a DNA microarray consisting of specific probes. After analysing the signal intensity of the specific probes, the unique patterns from the DNA microarray, consisting of 47 species-specific probes, were obtained to identify 23 aphid species. To confirm the accuracy of the developed DNA microarray, ten individual blind samples were used in blind trials, and the identifications were completely consistent with the sequencing data of all individual blind samples. A microarray has been developed to distinguish aphid species. DNA microarray technology provides a rapid, easy, cost-effective and accurate method for identifying aphid species for pest control management. © 2013 Society of Chemical Industry.

  14. CGO: utilizing and integrating gene expression microarray data in clinical research and data management.

    PubMed

    Bumm, Klaus; Zheng, Mingzhong; Bailey, Clyde; Zhan, Fenghuang; Chiriva-Internati, M; Eddlemon, Paul; Terry, Julian; Barlogie, Bart; Shaughnessy, John D

    2002-02-01

    Clinical GeneOrganizer (CGO) is a novel windows-based archiving, organization and data mining software for the integration of gene expression profiling in clinical medicine. The program implements various user-friendly tools and extracts data for further statistical analysis. This software was written for Affymetrix GeneChip *.txt files, but can also be used for any other microarray-derived data. The MS-SQL server version acts as a data mart and links microarray data with clinical parameters of any other existing database and therefore represents a valuable tool for combining gene expression analysis and clinical disease characteristics.

  15. Design and validation of MEDRYS, a Mediterranean Sea reanalysis over the period 1992-2013

    NASA Astrophysics Data System (ADS)

    Hamon, Mathieu; Beuvier, Jonathan; Somot, Samuel; Lellouche, Jean-Michel; Greiner, Eric; Jordà, Gabriel; Bouin, Marie-Noëlle; Arsouze, Thomas; Béranger, Karine; Sevault, Florence; Dubois, Clotilde; Drevillon, Marie; Drillet, Yann

    2016-04-01

The French research community in Mediterranean Sea modeling and the French operational ocean forecasting center Mercator Océan have gathered their skill and expertise in physical oceanography, ocean modeling, atmospheric forcings and data assimilation to carry out a MEDiterranean sea ReanalYsiS (MEDRYS) at high resolution for the period 1992-2013. The ocean model used is NEMOMED12, a Mediterranean configuration of NEMO with a 1/12° (~7 km) horizontal resolution and 75 vertical z levels with partial steps. At the surface, it is forced by a new atmospheric-forcing data set (ALDERA), coming from a dynamical downscaling of the ERA-Interim atmospheric reanalysis by the regional climate model ALADIN-Climate with 12 km horizontal and 3 h temporal resolutions. This configuration is used to carry out a 34-year hindcast simulation over the period 1979-2013 (NM12-FREE), which provides the initial state of the reanalysis in October 1992. MEDRYS uses the existing Mercator Océan data assimilation system SAM2 that is based on a reduced-order Kalman filter with a three-dimensional (3-D) multivariate modal decomposition of the forecast error. Altimeter data, satellite sea surface temperature (SST), and temperature and salinity vertical profiles are jointly assimilated. This paper describes the configuration we used to perform MEDRYS. We then validate the skill of the data assimilation system. It is shown that the data assimilation restores a good average temperature and salinity at intermediate layers compared to the hindcast. No particular biases are identified in the bottom layers. However, the reanalysis shows slight positive biases of 0.02 psu and 0.15 °C above 150 m depth. In the validation stage, it is also shown that the assimilation allows one to better reproduce water, heat and salt transports through the Strait of Gibraltar. Finally, the ability of the reanalysis to represent the sea surface high-frequency variability is shown.

  16. Cruella: developing a scalable tissue microarray data management system.

    PubMed

    Cowan, James D; Rimm, David L; Tuck, David P

    2006-06-01

Compared with DNA microarray technology, relatively little information is available concerning the special requirements, design influences, and implementation strategies of data systems for tissue microarray technology. These issues include the requirement to accommodate new and different data elements for each new project as well as the need to interact with pre-existing models for clinical, biological, and specimen-related data. Our objective was to design and implement a flexible, scalable tissue microarray data storage and management system that could accommodate information regarding different disease types, clinical investigators, and clinical investigation questions, all of which could potentially contribute unforeseen data types requiring dynamic integration with existing data. The unpredictability of the data elements combined with the novelty of automated analysis algorithms and controlled vocabulary standards in this area requires flexible designs and practical decisions. Our design includes a custom Java-based persistence layer to mediate and facilitate interaction with an object-relational database model and a novel database schema. User interaction is provided through a Java Servlet-based Web interface. Cruella has become an indispensable resource and is used by dozens of researchers every day. The system stores millions of experimental values covering more than 300 biological markers and more than 30 disease types. The experimental data are merged with clinical data that have been aggregated from multiple sources and are available to the researchers for management, analysis, and export. Cruella addresses many of the special considerations for managing tissue microarray experimental data and the associated clinical information. 
A metadata-driven approach provides a practical solution to many of the unique issues inherent in tissue microarray research, and allows relatively straightforward interoperability with and accommodation of new data models.
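One common way to realize such a metadata-driven design is an entity-attribute-value (EAV) schema, in which a new experimental data element becomes a metadata row rather than a schema change. A toy SQLite sketch (the table and column names are ours; the abstract does not publish Cruella's actual object-relational schema):

```python
import sqlite3

# Minimal EAV sketch: values are stored against attribute metadata, so an
# unforeseen per-project data element needs no ALTER TABLE.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE attribute (id INTEGER PRIMARY KEY, name TEXT, datatype TEXT);
CREATE TABLE spot_value (spot_id INTEGER, attribute_id INTEGER, value TEXT);
""")

def register_attribute(name, datatype):
    """New experimental data element: one metadata row, no schema change."""
    cur = con.execute("INSERT INTO attribute (name, datatype) VALUES (?, ?)",
                      (name, datatype))
    return cur.lastrowid

her2 = register_attribute("HER2_intensity", "float")   # hypothetical marker
con.execute("INSERT INTO spot_value VALUES (?, ?, ?)", (1, her2, "212.7"))

row = con.execute("""
    SELECT a.name, v.value FROM spot_value v
    JOIN attribute a ON a.id = v.attribute_id WHERE v.spot_id = 1
""").fetchone()
print(row)
```

The trade-off is the usual EAV one: maximal flexibility for heterogeneous projects in exchange for typing and query complexity pushed into the application layer, which is where a persistence layer like Cruella's earns its keep.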

  17. A Java-based tool for the design of classification microarrays.

    PubMed

    Meng, Da; Broschat, Shira L; Call, Douglas R

    2008-08-04

Classification microarrays are used for purposes such as identifying strains of bacteria and determining genetic relationships to understand the epidemiology of an infectious disease. For these cases, mixed microarrays, which are composed of DNA from more than one organism, are more effective than conventional microarrays composed of DNA from a single organism. Selection of probes is a key factor in designing successful mixed microarrays because redundant sequences are inefficient and limited representation of diversity can restrict application of the microarray. We have developed a Java-based software tool, called PLASMID, for use in selecting the minimum set of probe sequences needed to classify different groups of plasmids or bacteria. The software program was successfully applied to several different sets of data. The utility of PLASMID was illustrated using existing mixed-plasmid microarray data as well as data from a virtual mixed-genome microarray constructed from different strains of Streptococcus. Moreover, use of data from expression microarray experiments demonstrated the generality of PLASMID. In this paper we describe a new software tool for selecting a set of probes for a classification microarray. While the tool was developed for the design of mixed microarrays (and mixed-plasmid microarrays in particular), it can also be used to design expression arrays. The user can choose from several clustering methods (including hierarchical, non-hierarchical, and a model-based genetic algorithm), several probe ranking methods, and several different display methods. A novel approach is used for probe redundancy reduction, and probe selection is accomplished via stepwise discriminant analysis. Data can be entered in different formats (including Excel and comma-delimited text), and dendrogram, heat map, and scatter plot images can be saved in several different formats (including JPEG and TIFF). 
Weights generated using stepwise discriminant analysis can be stored for analysis of subsequent experimental data. Additionally, PLASMID can be used to construct virtual microarrays with genomes from public databases, which can then be used to identify an optimal set of probes.

  18. Multi-task feature selection in microarray data by binary integer programming.

    PubMed

    Lan, Liang; Vucetic, Slobodan

    2013-12-20

    A major challenge in microarray classification is that the number of features is typically orders of magnitude larger than the number of examples. In this paper, we propose a novel feature filter algorithm to select the feature subset with maximal discriminative power and minimal redundancy by solving a quadratic objective function with binary integer constraints. To improve the computational efficiency, the binary integer constraints are relaxed and a low-rank approximation to the quadratic term is applied. The proposed feature selection algorithm was extended to solve multi-task microarray classification problems. We compared the single-task version of the proposed feature selection algorithm with 9 existing feature selection methods on 4 benchmark microarray data sets. The empirical results show that the proposed method achieved the most accurate predictions overall. We also evaluated the multi-task version of the proposed algorithm on 8 multi-task microarray datasets. The multi-task feature selection algorithm resulted in significantly higher accuracy than when using the single-task feature selection methods.
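The relevance-versus-redundancy trade-off encoded in the paper's quadratic objective can be illustrated with a greedy surrogate: score each candidate feature by its class correlation minus its mean correlation with features already selected. The greedy pass and the toy data below are our simplification; the paper instead solves a relaxed binary integer program with a low-rank approximation:

```python
import numpy as np

def select_features(X, y, k=5, alpha=0.5):
    """Greedy sketch of the relevance-minus-redundancy idea: high
    |corr(feature, label)| is rewarded, |corr| with already-chosen
    features is penalized with weight alpha."""
    Xc = (X - X.mean(0)) / X.std(0)
    yc = (y - y.mean()) / y.std()
    relevance = np.abs(Xc.T @ yc) / len(y)          # |corr(feature, label)|
    corr = np.abs(np.corrcoef(Xc, rowvar=False))    # feature-feature |corr|
    chosen = [int(np.argmax(relevance))]
    while len(chosen) < k:
        redundancy = corr[:, chosen].mean(axis=1)
        score = relevance - alpha * redundancy
        score[chosen] = -np.inf                      # never re-pick a feature
        chosen.append(int(np.argmax(score)))
    return chosen

# Toy data: features 0 and 1 are informative near-duplicates, feature 2 is
# informative and independent, the rest are noise; a redundancy-aware
# selector should not spend its first two picks on both duplicates.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, 200).astype(float)
X = rng.normal(size=(200, 8))
X[:, 0] += 2 * y
X[:, 1] = X[:, 0] + rng.normal(0, 0.05, 200)   # near-duplicate of feature 0
X[:, 2] += 2 * y
print(select_features(X, y, k=3))
```

A plain relevance ranking would take features 0 and 1 back to back; the redundancy penalty makes the second pick the independent informative feature instead, which is the behavior the quadratic term buys in the paper's formulation.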

  19. A DNA microarray-based methylation-sensitive (MS)-AFLP hybridization method for genetic and epigenetic analyses.

    PubMed

    Yamamoto, F; Yamamoto, M

    2004-07-01

    We previously developed a PCR-based DNA fingerprinting technique named the Methylation Sensitive (MS)-AFLP method, which permits comparative genome-wide scanning of methylation status with a manageable number of fingerprinting experiments. The technique uses the methylation sensitive restriction enzyme NotI in the context of the existing Amplified Fragment Length Polymorphism (AFLP) method. Here we report the successful conversion of this gel electrophoresis-based DNA fingerprinting technique into a DNA microarray hybridization technique (DNA Microarray MS-AFLP). By performing a total of 30 (15 x 2 reciprocal labeling) DNA Microarray MS-AFLP hybridization experiments on genomic DNA from two breast and three prostate cancer cell lines in all pairwise combinations, and Southern hybridization experiments using more than 100 different probes, we have demonstrated that the DNA Microarray MS-AFLP is a reliable method for genetic and epigenetic analyses. No statistically significant differences were observed in the number of differences between the breast-prostate hybridization experiments and the breast-breast or prostate-prostate comparisons.

  20. Stochastic models for inferring genetic regulation from microarray gene expression data.

    PubMed

    Tian, Tianhai

    2010-03-01

    Microarray expression profiles are inherently noisy, and many different sources of variation exist in microarray experiments. It remains a significant challenge to develop stochastic models that capture the noise in microarray expression profiles, which has a profound influence on the reverse engineering of genetic regulation. Using the target genes of the tumour suppressor gene p53 as the test problem, we developed stochastic differential equation models and established the relationship between the noise strength of the stochastic models and the parameters of an error model describing the distribution of the microarray measurements. Numerical results indicate that the simulated variance from stochastic models with a stochastic degradation process can be represented by a monomial in the hybridization intensity, with the order of the monomial depending on the type of stochastic process. The developed stochastic models with multiple stochastic processes generated simulations whose variance is consistent with the prediction of the error model. This work also establishes a general method for developing stochastic models from experimental information. 2009 Elsevier Ireland Ltd. All rights reserved.
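
The kind of stochastic differential equation model described above can be sketched with an Euler-Maruyama simulation of a birth-degradation process whose noise term comes from stochastic degradation. The equation, parameter values, and function names below are illustrative assumptions, not the paper's actual model.

```python
import math
import random

def simulate(s, d, x0, T=20.0, dt=0.01, rng=None):
    """Euler-Maruyama integration of  dX = (s - d*X) dt + sqrt(d*X) dW,
    a toy expression model with synthesis rate s and a stochastic
    degradation process of rate d (the sqrt(d*X) dW term)."""
    rng = rng or random.Random(0)
    x = x0
    for _ in range(int(T / dt)):
        drift = (s - d * x) * dt
        diffusion = math.sqrt(max(d * x, 0.0) * dt) * rng.gauss(0.0, 1.0)
        x = max(x + drift + diffusion, 0.0)  # expression level stays non-negative
    return x

def sample_stationary(s, d, n=400, seed=1):
    """Draw n approximately stationary samples, starting at the mean s/d."""
    rng = random.Random(seed)
    return [simulate(s, d, x0=s / d, rng=rng) for _ in range(n)]
```

For this linear toy model the stationary mean is s/d and, by linear noise approximation, the stationary variance is about s/(2d); the simulated variance can then be compared against an error model's prediction, in the spirit of the abstract.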

  1. Analysis of microarray leukemia data using an efficient MapReduce-based K-nearest-neighbor classifier.

    PubMed

    Kumar, Mukesh; Rath, Nitish Kumar; Rath, Santanu Kumar

    2016-04-01

    Microarray-based gene expression profiling has emerged as an efficient technique for the classification, prognosis, diagnosis, and treatment of cancer. Frequent changes in the behavior of this disease generate an enormous volume of data. Microarray data satisfy both the veracity and velocity properties of big data, as they keep changing with time. Therefore, analyzing microarray datasets in a small amount of time is essential. Such datasets often contain a large number of expression values, but only a fraction of them correspond to significantly expressed genes. The precise identification of the genes of interest that are responsible for causing cancer is imperative in microarray data analysis. Most existing schemes employ a two-phase process: feature selection/extraction followed by classification. In this paper, various statistical tests based on MapReduce are proposed for selecting relevant features. After feature selection, a MapReduce-based K-nearest-neighbor (mrKNN) classifier is employed to classify microarray data. These algorithms are successfully implemented in a Hadoop framework. A comparative analysis is done on these MapReduce-based models using microarray datasets of various dimensions. From the obtained results, it is observed that these models consume much less execution time than conventional models in processing big data. Copyright © 2016 Elsevier Inc. All rights reserved.
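
The two-phase map/reduce pattern behind an mrKNN-style classifier can be sketched in plain Python: each mapper finds the k nearest training points within its own partition of the training set, and the reducer merges the partial candidate lists and takes a majority vote over the global top k. The in-memory "partitions" and function names are illustrative; the paper's implementation runs on Hadoop.

```python
import heapq
from collections import Counter

def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def mapper(partition, query, k):
    """Map phase: k nearest neighbours within one partition of the training set."""
    return heapq.nsmallest(k, ((euclidean(x, query), label) for x, label in partition))

def reducer(partial_lists, k):
    """Reduce phase: merge per-partition candidates, majority-vote on the global top k."""
    top_k = heapq.nsmallest(k, (pair for plist in partial_lists for pair in plist))
    votes = Counter(label for _, label in top_k)
    return votes.most_common(1)[0][0]

def mr_knn(partitions, query, k=3):
    return reducer([mapper(p, query, k) for p in partitions], k)
```

Because each mapper emits at most k candidates, the reducer only ever merges k times the number of partitions, which is what makes the scheme scale to large training sets.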

  2. A biomimetic algorithm for the improved detection of microarray features

    NASA Astrophysics Data System (ADS)

    Nicolau, Dan V., Jr.; Nicolau, Dan V.; Maini, Philip K.

    2007-02-01

    One of the major difficulties of microarray technology relates to the processing of large and, importantly, error-loaded images of the dots on the chip surface. Whatever the source of these errors, those introduced in the first stage of data acquisition, segmentation, are passed down to all subsequent processes, with deleterious results. As it has recently been demonstrated that biological systems have evolved mathematically efficient algorithms, this contribution tests an algorithm that mimics the bacterial-"patented" strategy for searching available space and nutrients in order to find, "zero in" on, and eventually delimit the features present on the microarray surface.

  3. Ballet as a Career

    ERIC Educational Resources Information Center

    Sutherland, David Earl

    1976-01-01

    A reorganization and reanalysis of data gathered by Ronald Charles Frederico--who interviewed 146 dancers belonging to 12 ballet companies in the U.S.--to investigate the structural features of ballet as a profession. Four possibilities exist for a more general interpretative scheme for understanding ballet: social structural, phenomenological,…

  4. RDFBuilder: a tool to automatically build RDF-based interfaces for MAGE-OM microarray data sources.

    PubMed

    Anguita, Alberto; Martin, Luis; Garcia-Remesal, Miguel; Maojo, Victor

    2013-07-01

    This paper presents RDFBuilder, a tool that enables RDF-based access to MAGE-ML-compliant microarray databases. We have developed a system that automatically transforms the MAGE-OM model and microarray data stored in the ArrayExpress database into RDF format. Additionally, the system automatically enables a SPARQL endpoint. This allows users to execute SPARQL queries for retrieving microarray data, either from specific experiments or from more than one experiment at a time. Our system optimizes response times by caching and reusing information from previous queries. In this paper, we describe our methods for achieving this transformation. We show that our approach is complementary to other existing initiatives, such as Bio2RDF, for accessing and retrieving data from the ArrayExpress database. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  5. Autoregressive-model-based missing value estimation for DNA microarray time series data.

    PubMed

    Choong, Miew Keen; Charbit, Maurice; Yan, Hong

    2009-01-01

    Missing value estimation is important in DNA microarray data analysis. A number of algorithms have been developed to solve this problem, but they have several limitations. Most existing algorithms are not able to deal with the situation where a particular time point (column) of the data is missing entirely. In this paper, we present an autoregressive-model-based missing value estimation method (ARLSimpute) that takes into account the dynamic property of microarray temporal data and the local similarity structures in the data. ARLSimpute is especially effective for the situation where a particular time point contains many missing values or where the entire time point is missing. Experimental results suggest that the proposed algorithm is an accurate missing value estimator in comparison with other imputation methods on simulated as well as real microarray time series datasets.
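
The autoregressive idea can be shown with a drastically simplified per-gene AR(1) sketch: fit x[t] = a*x[t-1] + c by least squares over the observed consecutive pairs, then predict each missing value from its predecessor. ARLSimpute itself fits higher-order AR models jointly over similar genes; this toy version (names and order are assumptions) only conveys the mechanism.

```python
def fit_ar1(series):
    """Least-squares fit of x[t] = a*x[t-1] + c over observed consecutive pairs
    (missing values are encoded as None)."""
    pairs = [(series[t - 1], series[t]) for t in range(1, len(series))
             if series[t - 1] is not None and series[t] is not None]
    if not pairs:
        return 0.0, 0.0
    n = len(pairs)
    sx = sum(p for p, _ in pairs)
    sy = sum(q for _, q in pairs)
    sxx = sum(p * p for p, _ in pairs)
    sxy = sum(p * q for p, q in pairs)
    denom = n * sxx - sx * sx
    a = (n * sxy - sx * sy) / denom if denom else 0.0
    c = (sy - a * sx) / n
    return a, c

def impute(series):
    """Fill missing values (None) with one-step AR(1) predictions."""
    a, c = fit_ar1(series)
    out = list(series)
    for t in range(1, len(out)):
        if out[t] is None and out[t - 1] is not None:
            out[t] = a * out[t - 1] + c
    return out
```

Note that a per-gene AR model naturally handles the case the abstract emphasizes, an entirely missing time point, because the prediction uses the temporal dynamics rather than other samples at the same time point.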

  6. Evolution of the MIDTAL microarray: the adaption and testing of oligonucleotide 18S and 28S rDNA probes and evaluation of subsequent microarray generations with Prymnesium spp. cultures and field samples.

    PubMed

    McCoy, Gary R; Touzet, Nicolas; Fleming, Gerard T A; Raine, Robin

    2015-07-01

    The toxic microalgal species Prymnesium parvum and Prymnesium polylepis are responsible for numerous fish kills, causing economic stress on the aquaculture industry, and, through the consumption of contaminated shellfish, can potentially impact human health. Monitoring of toxic phytoplankton is traditionally carried out by light microscopy, but molecular methods of identification and quantification are becoming more commonplace. This study documents the optimisation of the novel Microarrays for the Detection of Toxic Algae (MIDTAL) microarray from its initial stages to the final commercial version now available from Microbia Environnement (France). Existing oligonucleotide probes used in whole-cell fluorescent in situ hybridisation (FISH) for Prymnesium species, from higher-group probes to species-level probes, were adapted and tested on the first-generation microarray. The combination and interaction of the numerous other probes spotted on the chip surface, specific for a whole range of phytoplankton taxa, caused high cross-reactivity, resulting in false positives on the microarray. The probe sequences were extended for the second-generation microarray, and further adaptations of the hybridisation protocol and incubation temperatures significantly reduced false-positive readings from the first- to the second-generation chip, thereby increasing the specificity of the MIDTAL microarray. Additional refinement of the third-generation microarray protocols, with the addition of a poly-T amino linker to the 5' end of each probe, further enhanced microarray performance, but also highlighted the importance of optimising RNA labelling efficiency when testing natural seawater samples from Killary Harbour, Ireland.

  7. Content Representations in a Secondary Environmental Science Class.

    ERIC Educational Resources Information Center

    Tomanek, Debra

    The purpose of this study was to determine what representations of content existed in a secondary environmental science class and what happened to those representations during curriculum occasions. Initial data construction involved attention to what was actually going on during class sessions. Following this, a reanalysis of the data corpus with…

  8. Physical forcing of late summer chlorophyll a blooms in the oligotrophic eastern North Pacific

    NASA Astrophysics Data System (ADS)

    Toyoda, Takahiro; Okamoto, Suguru

    2017-03-01

    We investigated physical forcing of late summer chlorophyll a (chl a) blooms in the oligotrophic eastern North Pacific Ocean by using ocean reanalysis and satellite data. Relatively large chl a blooms as defined in this study occurred in August-October following sea surface temperature (SST) anomaly (SSTA) decreases, mixed layer deepening, and temperature and salinity increases at the bottom of the mixed layer. These physical conditions were apparently induced by the entrainment of subsurface water resulting from the destabilization of the surface layer caused by anomalous northward Ekman transport of subtropical waters of higher salinity. Salinity-normalized total alkalinity data provide supporting evidence for nutrient supply by the entrainment process. We next investigated the impact of including information about the entrainment on bloom identification. The results of analyses using reanalysis data and of those using only satellite data showed large SSTA decreases when the northward Ekman salinity transports were large, implying that the entrainment of subsurface water is well represented in both types of data. After surface-destabilizing conditions were established, relatively high surface chl a concentrations were observed. The use of SST information can further improve the detection of high chl a concentrations. Although the detection of high chl a concentrations would be enhanced by finer data resolution and the inclusion of biogeochemical parameters in the ocean reanalysis, our results obtained by using existing reanalysis data as well as recent satellite data are valuable for better understanding and prediction of lower trophic ecosystem variability.

  9. A high-resolution regional reanalysis for the European CORDEX region

    NASA Astrophysics Data System (ADS)

    Bollmeyer, Christoph; Keller, Jan; Ohlwein, Christian; Wahl, Sabrina

    2015-04-01

    Within the Hans-Ertel-Centre for Weather Research (HErZ), the climate monitoring branch concentrates its efforts on the assessment and analysis of regional climate in Germany and Europe. In cooperation with the DWD (German Weather Service), a high-resolution reanalysis system based on the COSMO model has been developed. Reanalyses are gaining ever more importance as a source of meteorological information for many purposes and applications. Several global reanalysis projects (e.g., ERA, MERRA, CFSR, JRA) produce and verify these data sets to provide time series that are as long as possible combined with high data quality. However, with spatial resolutions of only 50-70 km and 3-hourly temporal output, they are not suitable for small-scale problems (e.g., regional climate assessment, mesoscale NWP verification, input for subsequent models such as river runoff simulations, renewable energy applications). Regional reanalyses based on a limited-area model along with a data assimilation scheme can generate reanalysis data sets with high spatio-temporal resolution. The work presented here focuses on two regional reanalyses for Europe and Germany. The European reanalysis COSMO-REA6 matches the CORDEX EURO-11 specifications, albeit at a higher spatial resolution, i.e., 0.055° (6 km) instead of 0.11° (12 km). Nested into COSMO-REA6 is COSMO-REA2, a convective-scale reanalysis with 2 km resolution for Germany. COSMO-REA6 comprises the assimilation of observational data using the existing nudging scheme of COSMO and is complemented by a special soil moisture analysis and boundary conditions given by ERA-Interim data. COSMO-REA2 also uses the nudging scheme, complemented by latent heat nudging of radar information. The reanalysis data sets currently cover 17 years (1997-2013) for COSMO-REA6 and 4 years (2010-2013) for COSMO-REA2, with a very large set of output variables and a high temporal output frequency of hourly 3D fields and quarter-hourly 2D fields.
The evaluation of the reanalyses is done using independent observations for the most important meteorological parameters with special emphasis on precipitation and high-impact weather situations.

  10. Mobley et al. Turnover Model Reanalysis and Review of Existing Data.

    ERIC Educational Resources Information Center

    Dalessio, Anthony; And Others

    Job satisfaction has been identified as one of the most important antecedents of turnover, although it rarely accounts for more than 16% of the variance in employee withdrawal. Several data sets collected on the Mobley, Horner, and Hollingsworth (1978) model of turnover were reanalyzed with path analytic techniques. Data analyses revealed support…

  11. Reanalysis of the fragility of glycerol at very high pressures using new Tg data

    NASA Astrophysics Data System (ADS)

    Lyon, Kevin; Oliver, William

    Direct measurements of the glass transition temperature of glycerol between 1 atm and 6.7 GPa from our lab allow reanalysis of high-pressure viscosity data, which were limited to approximately 10^7 poise. Previous attempts to determine Tg(P) and fragility by extrapolating the viscosity data over many orders of magnitude led to inconclusive results. Tg(P) data constrain the value of the viscosity at the glass transition, providing for more accurate determinations of isobaric fragilities. Over most of the pressure range, a constant fragility is found, in agreement with the analysis of high-pressure dielectric data by Paluch et al. Discrepancies in the pressure dependence of the fragility of glycerol at very low pressures exist in the literature and will also be discussed.

  12. Who was the agent? The neural correlates of reanalysis processes during sentence comprehension.

    PubMed

    Hirotani, Masako; Makuuchi, Michiru; Rüschemeyer, Shirley-Ann; Friederici, Angela D

    2011-11-01

    Sentence comprehension is a complex process. Besides identifying the meaning of each word and processing the syntactic structure of a sentence, it requires the computation of thematic information, that is, information about who did what to whom. The present fMRI study investigated the neural basis for thematic reanalysis (reanalysis of the thematic roles initially assigned to noun phrases in a sentence) and its interplay with syntactic reanalysis (reanalysis of the underlying syntactic structure originally constructed for a sentence). Thematic reanalysis recruited a network consisting of Broca's area, that is, the left pars triangularis (LPT), and the left posterior superior temporal gyrus, whereas only LPT showed greater sensitivity to syntactic reanalysis. These data provide direct evidence for a functional neuroanatomical basis for two linguistically motivated reanalysis processes during sentence comprehension. Copyright © 2010 Wiley-Liss, Inc.

  13. The Impact of Socio-Economic Status on Participation and Attainment in Science

    ERIC Educational Resources Information Center

    Gorard, Stephen; See, Beng Huat

    2009-01-01

    In this paper we combine the findings from two recent studies relating to participation and attainment in school science --a re-analysis of existing official data for England and a review of wider international research evidence in the literature relevant to the UK. Although the secondary data are drawn mainly from England, the comprehensiveness…

  14. The MGED Ontology: a resource for semantics-based description of microarray experiments.

    PubMed

    Whetzel, Patricia L; Parkinson, Helen; Causton, Helen C; Fan, Liju; Fostel, Jennifer; Fragoso, Gilberto; Game, Laurence; Heiskanen, Mervi; Morrison, Norman; Rocca-Serra, Philippe; Sansone, Susanna-Assunta; Taylor, Chris; White, Joseph; Stoeckert, Christian J

    2006-04-01

    The generation of large amounts of microarray data and the need to share these data bring challenges for both data management and annotation, and highlight the need for standards. MIAME specifies the minimum information needed to describe a microarray experiment, and the Microarray Gene Expression Object Model (MAGE-OM) and the resulting MAGE-ML provide a mechanism to standardize data representation for data exchange; however, a common terminology for data annotation is needed to support these standards. Here we describe the MGED Ontology (MO) developed by the Ontology Working Group of the Microarray Gene Expression Data (MGED) Society. The MO provides terms for annotating all aspects of a microarray experiment, from the design of the experiment and array layout, through to the preparation of the biological sample and the protocols used to hybridize the RNA and analyze the data. The MO was developed to provide terms for annotating experiments in line with the MIAME guidelines, i.e. to provide the semantics to describe a microarray experiment according to the concepts specified in MIAME. The MO does not attempt to incorporate terms from existing ontologies, e.g. those that deal with anatomical parts or developmental stage terms, but provides a framework to reference terms in other ontologies and thereby facilitates the use of ontologies in microarray data annotation. The MGED Ontology version 1.2.0 is available as a file in both DAML and OWL formats at http://mged.sourceforge.net/ontologies/index.php. Release notes and annotation examples are provided. The MO is also provided via the NCICB's Enterprise Vocabulary System (http://nciterms.nci.nih.gov/NCIBrowser/Dictionary.do). Contact: Stoeckrt@pcbi.upenn.edu. Supplementary data are available at Bioinformatics online.

  15. Variations in the temperature and circulation of the atmosphere during the 11-year cycle of solar activity derived from the ERA-Interim reanalysis data

    NASA Astrophysics Data System (ADS)

    Gruzdev, A. N.

    2017-07-01

    Using the data of the ERA-Interim reanalysis, we have obtained estimates of the changes in temperature, the geopotential and its large-scale zonal harmonics, wind velocity, and potential vorticity in the troposphere and stratosphere of the Northern and Southern hemispheres during the 11-year solar cycle. The estimates were obtained using the method of multiple linear regression. Specific features of the response of these atmospheric parameters to the solar cycle have been revealed in particular regions of the atmosphere, both for the year as a whole and by season. The results of the analysis indicate a reliable statistical relationship between large-scale dynamic and thermodynamic processes in the troposphere and stratosphere and the 11-year solar cycle.

  16. MALDI-TOF mass spectrometry for quantitative gene expression analysis of acid responses in Staphylococcus aureus.

    PubMed

    Rode, Tone Mari; Berget, Ingunn; Langsrud, Solveig; Møretrø, Trond; Holck, Askild

    2009-07-01

    Microorganisms are constantly exposed to new and altered growth conditions, and respond by changing gene expression patterns. Several methods for studying gene expression exist. During the last decade, the analysis of microarrays has been one of the most common approaches applied for large scale gene expression studies. A relatively new method for gene expression analysis is MassARRAY, which combines real competitive-PCR and MALDI-TOF (matrix-assisted laser desorption/ionization time-of-flight) mass spectrometry. In contrast to microarray methods, MassARRAY technology is suitable for analysing a larger number of samples, though for a smaller set of genes. In this study we compare the results from MassARRAY with microarrays on gene expression responses of Staphylococcus aureus exposed to acid stress at pH 4.5. RNA isolated from the same stress experiments was analysed using both the MassARRAY and the microarray methods. The MassARRAY and microarray methods showed good correlation. Both MassARRAY and microarray estimated somewhat lower fold changes compared with quantitative real-time PCR (qRT-PCR). The results confirmed the up-regulation of the urease genes in acidic environments, and also indicated the importance of metal ion regulation. This study shows that the MassARRAY technology is suitable for gene expression analysis in prokaryotes, and has advantages when a set of genes is being analysed for an organism exposed to many different environmental conditions.

  17. MADGE: scalable distributed data management software for cDNA microarrays.

    PubMed

    McIndoe, Richard A; Lanzen, Aaron; Hurtz, Kimberly

    2003-01-01

    The human genome project and the development of new high-throughput technologies have created unparalleled opportunities to study the mechanism of diseases, monitor the disease progression and evaluate effective therapies. Gene expression profiling is a critical tool to accomplish these goals. The use of nucleic acid microarrays to assess the gene expression of thousands of genes simultaneously has seen phenomenal growth over the past five years. Although commercial sources of microarrays exist, investigators wanting more flexibility in the genes represented on the array will turn to in-house production. The creation and use of cDNA microarrays is a complicated process that generates an enormous amount of information. Effective data management of this information is essential to efficiently access, analyze, troubleshoot and evaluate the microarray experiments. We have developed a distributable software package designed to track and store the various pieces of data generated by a cDNA microarray facility. This includes the clone collection storage data, annotation data, workflow queues, microarray data, data repositories, sample submission information, and project/investigator information. This application was designed using a 3-tier client server model. The data access layer (1st tier) contains the relational database system tuned to support a large number of transactions. The data services layer (2nd tier) is a distributed COM server with full database transaction support. The application layer (3rd tier) is an internet based user interface that contains both client and server side code for dynamic interactions with the user. This software is freely available to academic institutions and non-profit organizations at http://www.genomics.mcg.edu/niddkbtc.

  18. Unlocking the potential of publicly available microarray data using inSilicoDb and inSilicoMerging R/Bioconductor packages.

    PubMed

    Taminau, Jonatan; Meganck, Stijn; Lazar, Cosmin; Steenhoff, David; Coletta, Alain; Molter, Colin; Duque, Robin; de Schaetzen, Virginie; Weiss Solís, David Y; Bersini, Hugues; Nowé, Ann

    2012-12-24

    With an abundance of microarray gene expression data sets available through public repositories, new possibilities lie in combining multiple existing data sets. In this new context, the analysis itself is no longer the problem; retrieving and consistently integrating all these data before delivering them to the wide variety of existing analysis tools becomes the new bottleneck. We present the newly released inSilicoMerging R/Bioconductor package which, together with the earlier released inSilicoDb R/Bioconductor package, allows consistent retrieval, integration and analysis of publicly available microarray gene expression data sets. Inside the inSilicoMerging package, a set of five visual and six quantitative validation measures is available as well. By providing (i) access to uniformly curated and preprocessed data, (ii) a collection of techniques to remove the batch effects between data sets from different sources, and (iii) several validation tools enabling the inspection of the integration process, these packages enable researchers to fully explore the potential of combining gene expression data for downstream analysis. The power of using both packages is demonstrated by programmatically retrieving and integrating gene expression studies from the InSilico DB repository [https://insilicodb.org/app/].
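
Batch mean-centering (BMC) is one of the simpler batch-effect removal techniques of the kind bundled in merging tools like inSilicoMerging. The package itself is R/Bioconductor, so the Python sketch below illustrates only the principle, not its API; the function name and data layout are assumptions.

```python
def batch_mean_center(batches):
    """Per batch and per gene, subtract the batch mean so that merged
    expression values from different sources share a common centre
    (batch mean-centering, a basic batch-effect removal step)."""
    merged = []
    for batch in batches:  # each batch: list of samples, each a gene vector
        n_genes = len(batch[0])
        means = [sum(s[g] for s in batch) / len(batch) for g in range(n_genes)]
        merged.extend([[s[g] - means[g] for g in range(n_genes)] for s in batch])
    return merged
```

After centering, each gene has zero mean within every batch, so a constant additive offset between data sets (a common batch effect) is removed before downstream analysis.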

  19. Characterization and simulation of cDNA microarray spots using a novel mathematical model

    PubMed Central

    Kim, Hye Young; Lee, Seo Eun; Kim, Min Jung; Han, Jin Il; Kim, Bo Kyung; Lee, Yong Sung; Lee, Young Seek; Kim, Jin Hyuk

    2007-01-01

    Background The quality of cDNA microarray data is crucial for expanding its application to other research areas, such as the study of gene regulatory networks. Despite the fact that a number of algorithms have been suggested to increase the accuracy of microarray gene expression data, it is necessary to obtain reliable microarray images by improving wet-lab experiments. As the first step of a cDNA microarray experiment, spotting cDNA probes is critical to determining the quality of spot images. Results We developed a governing equation of cDNA deposition during evaporation of a drop in the microarray spotting process. The governing equation included four parameters: the surface site density on the support, the extrapolated equilibrium constant for the binding of cDNA molecules with surface sites on glass slides, the macromolecular interaction factor, and the volume constant of a drop of cDNA solution. We simulated cDNA deposition from the single model equation by varying the value of the parameters. The morphology of the resulting cDNA deposit can be classified into three types: a doughnut shape, a peak shape, and a volcano shape. The spot morphology can be changed into a flat shape by varying the experimental conditions while considering the parameters of the governing equation of cDNA deposition. The four parameters were estimated by fitting the governing equation to the real microarray images. With the results of the simulation and the parameter estimation, the phenomenon of the formation of cDNA deposits in each type was investigated. Conclusion This study explains how various spot shapes can exist and suggests which parameters are to be adjusted for obtaining a good spot. This system is able to explore the cDNA microarray spotting process in a predictable, manageable and descriptive manner. 
We hope it can provide a way to predict the incidents that can occur during a real cDNA microarray experiment, and produce useful data for several research applications involving cDNA microarrays. PMID:18096047

  20. Introduction to the SPARC Reanalysis Intercomparison Project (S-RIP) and overview of the reanalysis systems

    NASA Astrophysics Data System (ADS)

    Fujiwara, Masatomo; Wright, Jonathon S.; Manney, Gloria L.; Gray, Lesley J.; Anstey, James; Birner, Thomas; Davis, Sean; Gerber, Edwin P.; Harvey, V. Lynn; Hegglin, Michaela I.; Homeyer, Cameron R.; Knox, John A.; Krüger, Kirstin; Lambert, Alyn; Long, Craig S.; Martineau, Patrick; Molod, Andrea; Monge-Sanz, Beatriz M.; Santee, Michelle L.; Tegtmeier, Susann; Chabrillat, Simon; Tan, David G. H.; Jackson, David R.; Polavarapu, Saroja; Compo, Gilbert P.; Dragani, Rossana; Ebisuzaki, Wesley; Harada, Yayoi; Kobayashi, Chiaki; McCarty, Will; Onogi, Kazutoshi; Pawson, Steven; Simmons, Adrian; Wargan, Krzysztof; Whitaker, Jeffrey S.; Zou, Cheng-Zhi

    2017-01-01

    The climate research community uses atmospheric reanalysis data sets to understand a wide range of processes and variability in the atmosphere, yet different reanalyses may give very different results for the same diagnostics. The Stratosphere-troposphere Processes And their Role in Climate (SPARC) Reanalysis Intercomparison Project (S-RIP) is a coordinated activity to compare reanalysis data sets using a variety of key diagnostics. The objectives of this project are to identify differences among reanalyses and understand their underlying causes, to provide guidance on appropriate usage of various reanalysis products in scientific studies, particularly those of relevance to SPARC, and to contribute to future improvements in the reanalysis products by establishing collaborative links between reanalysis centres and data users. The project focuses predominantly on differences among reanalyses, although studies that include operational analyses and studies comparing reanalyses with observations are also included when appropriate. The emphasis is on diagnostics of the upper troposphere, stratosphere, and lower mesosphere. This paper summarizes the motivation and goals of the S-RIP activity and extensively reviews key technical aspects of the reanalysis data sets that are the focus of this activity. The special issue The SPARC Reanalysis Intercomparison Project (S-RIP) in this journal serves to collect research with relevance to the S-RIP in preparation for the publication of the planned two (interim and full) S-RIP reports.

  1. Multiclass classification of microarray data samples with a reduced number of genes

    PubMed Central

    2011-01-01

    Background Multiclass classification of microarray data samples with a reduced number of genes is a rich and challenging problem in Bioinformatics research. The problem gets harder as the number of classes is increased. In addition, the performance of most classifiers is tightly linked to the effectiveness of mandatory gene selection methods. Critical to gene selection is the availability of estimates about the maximum number of genes that can be handled by any classification algorithm. Lack of such estimates may lead to either computationally demanding explorations of a search space with thousands of dimensions or classification models based on gene sets of unrestricted size. In the former case, unbiased but possibly overfitted classification models may arise. In the latter case, biased classification models unable to support statistically significant findings may be obtained. Results A novel bound on the maximum number of genes that can be handled by binary classifiers in binary mediated multiclass classification algorithms of microarray data samples is presented. The bound suggests that high-dimensional binary output domains might favor the existence of accurate and sparse binary mediated multiclass classifiers for microarray data samples. Conclusions A comprehensive experimental work shows that the bound is indeed useful to induce accurate and sparse multiclass classifiers for microarray data samples. PMID:21342522
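
Binary-mediated multiclass schemes of the kind the bound above applies to decompose a C-class problem into binary subproblems, for example via one-vs-rest codes. Below is a hedged, pure-Python sketch using a trivial centroid-distance binary learner; the learner, the function names, and the tie-breaking rule are illustrative assumptions, not the paper's method.

```python
def train_binary(X, y01):
    """Toy binary learner: nearest-centroid classifier on 0/1 labels."""
    pos = [x for x, t in zip(X, y01) if t == 1]
    neg = [x for x, t in zip(X, y01) if t == 0]
    cp = [sum(v) / len(pos) for v in zip(*pos)]
    cn = [sum(v) / len(neg) for v in zip(*neg)]
    def predict(x):
        dp = sum((a - b) ** 2 for a, b in zip(x, cp))
        dn = sum((a - b) ** 2 for a, b in zip(x, cn))
        return 1 if dp < dn else 0
    return predict

def train_one_vs_rest(X, y, classes):
    """One binary classifier per class: a simple binary-mediated scheme."""
    return {c: train_binary(X, [1 if t == c else 0 for t in y]) for c in classes}

def predict_multiclass(models, x):
    # Any binary classifier that fires claims the sample for its class;
    # fall back to an arbitrary class if none fires.
    votes = [c for c, m in models.items() if m(x) == 1]
    return votes[0] if votes else next(iter(models))
```

Each binary subproblem could be preceded by its own gene selection step, which is exactly where a bound on the number of genes per binary classifier becomes useful.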

  2. Microarray labeling extension values: laboratory signatures for Affymetrix GeneChips

    PubMed Central

    Lee, Yun-Shien; Chen, Chun-Houh; Tsai, Chi-Neu; Tsai, Chia-Lung; Chao, Angel; Wang, Tzu-Hao

    2009-01-01

    Interlaboratory comparison of microarray data, even when the same platform is used, imposes several challenges on scientists. RNA quality, RNA labeling efficiency, hybridization procedures and data-mining tools can all contribute variation in each laboratory. In Affymetrix GeneChips, about 11–20 different 25-mer oligonucleotides are used to measure the level of each transcript. Here, we report that ‘labeling extension values (LEVs)’, which are correlation coefficients between probe intensities and probe positions, are highly correlated with the gene expression levels (GEVs) in eukaryotic Affymetrix microarray data. By analyzing LEVs and GEVs in the publicly available 2414 CEL files of 20 Affymetrix microarray types covering 13 species, we found that correlations between LEVs and GEVs exist only in eukaryotic RNAs, not in prokaryotic ones. Surprisingly, Affymetrix results from the same specimens analyzed in different laboratories could be clearly differentiated only by LEVs, leading to the identification of ‘laboratory signatures’. In the examined dataset, GSE10797, filtering out high-LEV genes did not compromise the discovery of biological processes constructed from differentially expressed genes. In conclusion, LEVs provide a new filtering parameter for microarray analysis of gene expression, and they may improve the inter- and intralaboratory comparability of Affymetrix GeneChip data. PMID:19295132
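
As defined above, the LEV of a probe set is just the Pearson correlation between probe intensities and probe positions along the transcript. A minimal sketch (the intensities in the usage note are made up; real probe ordering and normalization details are more involved):

```python
def lev(intensities):
    """Labeling extension value: Pearson correlation between probe
    position (index along the transcript) and probe intensity."""
    n = len(intensities)
    pos = list(range(n))
    mp, mi = (n - 1) / 2, sum(intensities) / n
    cov = sum((p - mp) * (x - mi) for p, x in zip(pos, intensities))
    sp = sum((p - mp) ** 2 for p in pos) ** 0.5
    si = sum((x - mi) ** 2 for x in intensities) ** 0.5
    return cov / (sp * si) if sp and si else 0.0
```

A probe set whose intensities rise monotonically along the probe positions, e.g. `lev([1.0, 2.0, 3.0, 4.0])`, has an LEV near +1, while flat intensities give an LEV near 0; such per-probe-set values could then be thresholded to filter genes or to compare laboratories.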

  3. Discrimination of Influenza Infection (A/2009 H1N1) from Prior Exposure by Antibody Protein Microarray Analysis

    PubMed Central

    te Beest, Dennis; de Bruin, Erwin; Imholz, Sandra; Wallinga, Jacco; Teunis, Peter; Koopmans, Marion; van Boven, Michiel

    2014-01-01

    Reliable discrimination of recent influenza A infection from previous exposure using hemagglutination inhibition (HI) or virus neutralization tests is currently not feasible. This is due to low sensitivity of the tests and the interference of antibody responses generated by previous infections. Here we investigate the diagnostic characteristics of a newly developed antibody (HA1) protein microarray using data from cross-sectional serological studies carried out before and after the pandemic of 2009. The data are analysed by mixture models, providing a probabilistic classification of sera (susceptible, prior-exposed, recently infected). Estimated sensitivity and specificity for identifying A/2009 infections are low using HI (66% and 51%), and high when using A/2009 microarray data alone or together with A/1918 microarray data (96% and 95%). As a heuristic, a high A/2009 to A/1918 antibody ratio (>1.05) is indicative of recent infection, while a low ratio is indicative of a pre-existing response, even if the A/2009 titer is high. We conclude that highly sensitive and specific classification of individual sera is possible using the protein microarray, thereby enabling precise estimation of age-specific infection attack rates in the population even if sample sizes are small. PMID:25405997
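
    The ratio heuristic at the end of the abstract can be sketched directly. The 1.05 cutoff is taken from the abstract; the function itself is our toy stand-in for the paper's mixture-model classification, not the authors' code:

```python
def classify_serum(titer_2009, titer_1918, ratio_cutoff=1.05):
    """Classify a serum by the A/2009-to-A/1918 antibody ratio: a high
    ratio suggests recent A/2009 infection, a low ratio a pre-existing
    response, even when the A/2009 titer itself is high. A toy
    illustration of the heuristic, not the authors' mixture model."""
    ratio = titer_2009 / titer_1918
    return "recent infection" if ratio > ratio_cutoff else "prior exposure"

# A high A/2009 titer alone is not decisive: paired with an equally
# high A/1918 titer, the ratio points to prior exposure instead.
verdict_recent = classify_serum(160.0, 40.0)
verdict_prior = classify_serum(160.0, 160.0)
```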

  4. Knowledge-based analysis of microarrays for the discovery of transcriptional regulation relationships

    PubMed Central

    2010-01-01

    Background The large amount of high-throughput genomic data has facilitated the discovery of the regulatory relationships between transcription factors and their target genes. While early methods for discovery of transcriptional regulation relationships from microarray data often focused on the high-throughput experimental data alone, more recent approaches have explored the integration of external knowledge bases of gene interactions. Results In this work, we develop an algorithm that provides improved performance in the prediction of transcriptional regulatory relationships by supplementing the analysis of microarray data with a new method of integrating information from an existing knowledge base. Using a well-known dataset of yeast microarrays and the Yeast Proteome Database, a comprehensive collection of known information of yeast genes, we show that knowledge-based predictions demonstrate better sensitivity and specificity in inferring new transcriptional interactions than predictions from microarray data alone. We also show that comprehensive, direct and high-quality knowledge bases provide better prediction performance. Comparison of our results with ChIP-chip data and growth fitness data suggests that our predicted genome-wide regulatory pairs in yeast are reasonable candidates for follow-up biological verification. Conclusion High quality, comprehensive, and direct knowledge bases, when combined with appropriate bioinformatic algorithms, can significantly improve the discovery of gene regulatory relationships from high throughput gene expression data. PMID:20122245

  5. Knowledge-based analysis of microarrays for the discovery of transcriptional regulation relationships.

    PubMed

    Seok, Junhee; Kaushal, Amit; Davis, Ronald W; Xiao, Wenzhong

    2010-01-18

    The large amount of high-throughput genomic data has facilitated the discovery of the regulatory relationships between transcription factors and their target genes. While early methods for discovery of transcriptional regulation relationships from microarray data often focused on the high-throughput experimental data alone, more recent approaches have explored the integration of external knowledge bases of gene interactions. In this work, we develop an algorithm that provides improved performance in the prediction of transcriptional regulatory relationships by supplementing the analysis of microarray data with a new method of integrating information from an existing knowledge base. Using a well-known dataset of yeast microarrays and the Yeast Proteome Database, a comprehensive collection of known information of yeast genes, we show that knowledge-based predictions demonstrate better sensitivity and specificity in inferring new transcriptional interactions than predictions from microarray data alone. We also show that comprehensive, direct and high-quality knowledge bases provide better prediction performance. Comparison of our results with ChIP-chip data and growth fitness data suggests that our predicted genome-wide regulatory pairs in yeast are reasonable candidates for follow-up biological verification. High quality, comprehensive, and direct knowledge bases, when combined with appropriate bioinformatic algorithms, can significantly improve the discovery of gene regulatory relationships from high throughput gene expression data.

  6. Implementation of GenePattern within the Stanford Microarray Database.

    PubMed

    Hubble, Jeremy; Demeter, Janos; Jin, Heng; Mao, Maria; Nitzberg, Michael; Reddy, T B K; Wymore, Farrell; Zachariah, Zachariah K; Sherlock, Gavin; Ball, Catherine A

    2009-01-01

    Hundreds of researchers across the world use the Stanford Microarray Database (SMD; http://smd.stanford.edu/) to store, annotate, view, analyze and share microarray data. In addition to providing registered users at Stanford access to their own data, SMD also provides access to public data, and tools with which to analyze those data, to any public user anywhere in the world. Previously, the addition of new microarray data analysis tools to SMD has been limited by available engineering resources, and in addition, the existing suite of tools did not provide a simple way to design, execute and share analysis pipelines, or to document such pipelines for the purposes of publication. To address this, we have incorporated the GenePattern software package directly into SMD, providing access to many new analysis tools, as well as a plug-in architecture that allows users to directly integrate and share additional tools through SMD. In this article, we describe our implementation of the GenePattern microarray analysis software package into the SMD code base. This extension is available with the SMD source code that is fully and freely available to others under an Open Source license, enabling other groups to create a local installation of SMD with an enriched data analysis capability.

  7. Multi-Reanalysis Comparison of Variability in Analysis Increment of Column-Integrated Water Vapor Associated with Madden-Julian Oscillation

    NASA Astrophysics Data System (ADS)

    Yokoi, S.

    2014-12-01

    This study conducts a comparison of three reanalysis products (JRA-55, JRA-25, and ERA-Interim) in their representation of the Madden-Julian Oscillation (MJO), focusing on column-integrated water vapor (CWV), which is considered an essential variable for discussing MJO dynamics. Besides the analysis fields of CWV, which exhibit spatio-temporal distributions quite similar to satellite observations, the CWV tendency simulated by the forecast models and the analysis increment calculated by data assimilation are examined. For JRA-55, it is revealed that, while its forecast model is able to simulate eastward propagation of the CWV anomaly, it tends to weaken the amplitude, and the data assimilation process sustains the amplitude. The multi-reanalysis comparison of the analysis increment further reveals that this weakening bias is probably caused by excessively weak cloud-radiative feedback represented by the model. This bias in the feedback strength makes anomalous moisture supply by the vertical advection term in the CWV budget equation too insensitive to the precipitation anomaly, reducing the amplitude of the CWV anomaly. ERA-Interim shows a nearly opposite behavior; the forecast model represents excessively strong feedback and unrealistically strengthens the amplitude, while the data assimilation weakens it. These results imply the necessity of accurately representing the cloud-radiative feedback strength for short-term MJO forecasts, and may be evidence supporting the argument that this feedback is essential for the existence of the MJO. Furthermore, this study demonstrates that multi-reanalysis comparison of the analysis increment provides useful information for identifying model biases and, potentially, for estimating parameters that are difficult to estimate solely from observational data, such as gross moist stability.

  8. Tuning a climate model using nudging to reanalysis.

    NASA Astrophysics Data System (ADS)

    Cheedela, S. K.; Mapes, B. E.

    2014-12-01

    Tuning an atmospheric general circulation model involves the daunting task of adjusting non-observable parameters to shape the mean climate. These parameters arise from the need to describe unresolved flow through parametrizations. Tuning a climate model is usually done with a certain set of priorities, such as global mean temperature and net top-of-atmosphere radiation. These priorities are hard enough to reach, let alone reducing systematic biases in the models. The goal of the current study is to explore alternate ways to tune a climate model to reduce some systematic biases, to be used in synergy with existing efforts. Nudging a climate model to a known state is a poor man's inverse of the tuning process described above. Our approach involves nudging the atmospheric model to state-of-the-art reanalysis fields, thereby providing a balanced state with respect to the global mean temperature and winds. The tendencies derived from nudging are the negative of the errors from the physical parametrizations, as the errors from the dynamical core should be small. Patterns of nudging are compared with the patterns of the different physical parametrizations to decipher the causes of certain biases in relation to the tuning parameters. This approach may also help in understanding compensating errors that arise from the tuning process. ECHAM6 is a comprehensive general circulation model, also used in the recent Coupled Model Intercomparison Project (CMIP5). The approach used to tune it and the effects of the parameters that affect its mean climate are clearly documented, so it serves as a benchmark for our approach. Our planned experiments include nudging the ECHAM6 atmospheric model to the European Centre reanalysis (ERA-Interim) and the reanalysis from the National Centers for Environmental Prediction (NCEP), and deciphering which parameter choices lead to systematic biases in its simulations. Of particular interest is reducing long-standing biases in the simulation of the Asian summer monsoon.
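
    The core of the nudging diagnosis, relaxing the model toward reanalysis and reading the relaxation tendency as (minus) the model error, can be sketched in a toy scalar integration. The relaxation timescale, bias value, and variable names below are illustrative assumptions, not ECHAM6 code:

```python
import numpy as np

def nudging_tendency(x_model, x_reanalysis, tau=6.0):
    """Newtonian-relaxation (nudging) increment: relax the model state
    toward a reanalysis value with timescale tau (hours). As the
    abstract argues, the accumulated nudging tendency approximates the
    negative of the parametrization error. A generic sketch only."""
    return (x_reanalysis - x_model) / tau

# Toy integration: a model with a constant cold drift (-0.1 K/h)
# acquires a compensating warm nudging tendency that settles at the
# magnitude of the drift, exposing the bias.
dt = 1.0                       # time step (hours)
x = 280.0                      # model temperature (K)
x_ref = 285.0                  # reanalysis temperature (K)
tendencies = []
for _ in range(48):
    tend = nudging_tendency(x, x_ref)
    tendencies.append(tend)
    x += dt * (tend - 0.1)     # -0.1 K/h: the model's own bias
mean_tendency = float(np.mean(tendencies))
```

    After spin-up the nudging tendency converges to +0.1 K/h, the exact negative of the imposed model bias, which is the diagnostic idea the abstract describes.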

  9. Functional Analyses of NSF1 in Wine Yeast Using Interconnected Correlation Clustering and Molecular Analyses

    PubMed Central

    Bessonov, Kyrylo; Walkey, Christopher J.; Shelp, Barry J.; van Vuuren, Hennie J. J.; Chiu, David; van der Merwe, George

    2013-01-01

    Analyzing time-course expression data captured in microarray datasets is a complex undertaking as the vast and complex data space is represented by a relatively low number of samples as compared to thousands of available genes. Here, we developed the Interconnected Correlation Clustering (ICC) method to analyze relationships that exist among genes conditioned on the expression of a specific target gene in microarray data. Based on Correlation Clustering, the ICC method analyzes a large set of correlation values related to gene expression profiles extracted from given microarray datasets. ICC can be applied to any microarray dataset and any target gene. We applied this method to microarray data generated from wine fermentations and selected NSF1, which encodes a C2H2 zinc finger-type transcription factor, as the target gene. The validity of the method was verified by accurate identifications of the previously known functional roles of NSF1. In addition, we identified and verified potential new functions for this gene; specifically, NSF1 is a negative regulator for the expression of sulfur metabolism genes, the nuclear localization of Nsf1 protein (Nsf1p) is controlled in a sulfur-dependent manner, and the transcription of NSF1 is regulated by Met4p, an important transcriptional activator of sulfur metabolism genes. The inter-disciplinary approach adopted here highlighted the accuracy and relevancy of the ICC method in mining for novel gene functions using complex microarray datasets with a limited number of samples. PMID:24130853

  10. MASQOT: a method for cDNA microarray spot quality control

    PubMed Central

    Bylesjö, Max; Eriksson, Daniel; Sjödin, Andreas; Sjöström, Michael; Jansson, Stefan; Antti, Henrik; Trygg, Johan

    2005-01-01

    Background cDNA microarray technology has emerged as a major player in the parallel detection of biomolecules, but still suffers from fundamental technical problems. Identifying and removing unreliable data is crucial to prevent the risk of receiving illusive analysis results. Visual assessment of spot quality is still a common procedure, despite the time-consuming work of manually inspecting spots in the range of hundreds of thousands or more. Results A novel methodology for cDNA microarray spot quality control is outlined. Multivariate discriminant analysis was used to assess spot quality based on existing and novel descriptors. The presented methodology displays high reproducibility and was found superior in identifying unreliable data compared to other evaluated methodologies. Conclusion The proposed methodology for cDNA microarray spot quality control generates non-discrete values of spot quality which can be utilized as weights in subsequent analysis procedures as well as to discard spots of undesired quality using the suggested threshold values. The MASQOT approach provides a consistent assessment of spot quality and can be considered an alternative to the labor-intensive manual quality assessment process. PMID:16223442

  11. Application of the Gini correlation coefficient to infer regulatory relationships in transcriptome analysis.

    PubMed

    Ma, Chuang; Wang, Xiangfeng

    2012-09-01

    One of the computational challenges in plant systems biology is to accurately infer transcriptional regulation relationships based on correlation analyses of gene expression patterns. Although several correlation methods are applied in biology to analyze microarray data, concerns have been raised regarding the compatibility of these methods with the gene expression data profiled by high-throughput RNA transcriptome sequencing (RNA-Seq) technology. These concerns are mainly due to the fact that the distribution of read counts in RNA-Seq experiments is different from that of fluorescence intensities in microarray experiments. Therefore, a comprehensive evaluation of the existing correlation methods and, if necessary, introduction of novel methods into biology is appropriate. In this study, we compared four existing correlation methods used in microarray analysis and one novel method called the Gini correlation coefficient on previously published microarray-based and sequencing-based gene expression data in Arabidopsis (Arabidopsis thaliana) and maize (Zea mays). The comparisons were performed on more than 11,000 regulatory relationships in Arabidopsis, including 8,929 pairs of transcription factors and target genes. Our analyses pinpointed the strengths and weaknesses of each method and indicated that the Gini correlation can compensate for the shortcomings of the Pearson correlation, the Spearman correlation, the Kendall correlation, and the Tukey's biweight correlation. The Gini correlation method, together with the other four evaluated methods, was implemented as an R package named rsgcc that can be utilized as an alternative option for biologists to perform clustering analyses of gene expression patterns or transcriptional network analyses.
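
    The Gini correlation evaluated here has a compact standard definition, cov(x, rank(y)) / cov(x, rank(x)). A minimal sketch of that textbook form, assuming untied continuous data; the authors' rsgcc R package is the reference implementation, and the helper names are ours:

```python
import numpy as np

def _ranks(v):
    # ordinal ranks (ties broken by order; fine for continuous data)
    r = np.empty(len(v))
    r[np.argsort(v)] = np.arange(1, len(v) + 1)
    return r

def gini_correlation(x, y):
    """Gini correlation coefficient as evaluated in the abstract:
    cov(x, rank(y)) / cov(x, rank(x)). Note the asymmetry: raw values
    of x are paired with ranks of y, so gcc(x, y) != gcc(y, x) in
    general. A minimal sketch of the standard definition."""
    x = np.asarray(x, dtype=float)
    return float(np.cov(x, _ranks(y))[0, 1] / np.cov(x, _ranks(x))[0, 1])

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
g_up = gini_correlation(x, np.exp(x))    # monotone increasing in x
g_down = gini_correlation(x, -x ** 3)    # monotone decreasing in x
```

    Because it pairs raw values of one variable with ranks of the other, the Gini correlation is exactly +/-1 for any monotone relationship, like Spearman, while retaining some of Pearson's sensitivity to the actual expression values.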

  12. A robust two-way semi-linear model for normalization of cDNA microarray data

    PubMed Central

    Wang, Deli; Huang, Jian; Xie, Hehuang; Manzella, Liliana; Soares, Marcelo Bento

    2005-01-01

    Background Normalization is a basic step in microarray data analysis. A proper normalization procedure ensures that the intensity ratios provide meaningful measures of relative expression values. Methods We propose a robust semiparametric method in a two-way semi-linear model (TW-SLM) for normalization of cDNA microarray data. This method does not make the usual assumptions underlying some of the existing methods. For example, it does not assume that: (i) the percentage of differentially expressed genes is small; or (ii) the numbers of up- and down-regulated genes are about the same, as required in the LOWESS normalization method. We conduct simulation studies to evaluate the proposed method and use a real data set from a specially designed microarray experiment to compare the performance of the proposed method with that of the LOWESS normalization approach. Results The simulation results show that the proposed method performs better than the LOWESS normalization method in terms of mean square errors for estimated gene effects. The results of analysis of the real data set also show that the proposed method yields more consistent results between the direct and the indirect comparisons and also can detect more differentially expressed genes than the LOWESS method. Conclusions Our simulation studies and the real data example indicate that the proposed robust TW-SLM method works at least as well as the LOWESS method and works better when the underlying assumptions for the LOWESS method are not satisfied. Therefore, it is a powerful alternative to the existing normalization methods. PMID:15663789
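
    For contrast with the TW-SLM, the intensity-dependent normalization idea behind LOWESS can be sketched on the MA scale. The running-median smoother below is a crude stand-in for a proper local regression, shown for illustration only; like LOWESS, it relies on the assumption that most genes are not differentially expressed:

```python
import numpy as np

def ma_transform(red, green):
    """MA transform used in two-color normalization:
    M = log2(R/G) (relative expression), A = mean log2 intensity."""
    m = np.log2(red) - np.log2(green)
    a = 0.5 * (np.log2(red) + np.log2(green))
    return m, a

def intensity_normalize(m, a, window=101):
    """Intensity-dependent normalization in the spirit of LOWESS:
    subtract a local (here, running-median) estimate of M at each A.
    A crude stand-in for local regression, for illustration only."""
    order = np.argsort(a)
    m_sorted = m[order]
    trend = np.empty_like(m_sorted)
    half = window // 2
    for i in range(len(m_sorted)):
        lo, hi = max(0, i - half), min(len(m_sorted), i + half + 1)
        trend[i] = np.median(m_sorted[lo:hi])
    m_norm = np.empty_like(m)
    m_norm[order] = m_sorted - trend
    return m_norm

# Simulated array with a constant dye bias of 0.4 on the log2 scale:
# normalization should remove the bias entirely.
rng = np.random.default_rng(0)
green = rng.uniform(100.0, 10000.0, 1000)
red = green * 2 ** 0.4
m, a = ma_transform(red, green)
m_norm = intensity_normalize(m, a)
```

    When the no-differential-expression assumption fails, as the abstract points out, this kind of smoother absorbs real signal into the trend, which is exactly the situation the TW-SLM is designed to handle.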

  13. High-density, microsphere-based fiber optic DNA microarrays.

    PubMed

    Epstein, Jason R; Leung, Amy P K; Lee, Kyong Hoon; Walt, David R

    2003-05-01

    A high-density fiber optic DNA microarray has been developed consisting of oligonucleotide-functionalized, 3.1-microm-diameter microspheres randomly distributed on the etched face of an imaging fiber bundle. The fiber bundles comprise 6000-50000 fused optical fibers, and each fiber terminates with an etched well. The microwell array is capable of housing complementary-sized microspheres, each containing thousands of copies of a unique oligonucleotide probe sequence. The array fabrication process results in random microsphere placement. Determining the position of microspheres in the random array requires an optical encoding scheme. This array platform provides many advantages over other array formats. The microsphere-stock suspension concentration added to the etched fiber can be controlled to provide inherent sensor redundancy. Examining identical microspheres has a beneficial effect on the signal-to-noise ratio. As other sequences of interest are discovered, new microsphere sensing elements can be added to existing microsphere pools and new arrays can be fabricated incorporating the new sequences without altering the existing detection capabilities. These microarrays contain the smallest feature sizes (3 microm) of any DNA array, allowing interrogation of extremely small sample volumes. Reducing the feature size results in higher local target molecule concentrations, creating rapid and highly sensitive assays. The microsphere array platform is also flexible in its applications; research has included DNA-protein interaction profiles, microbial strain differentiation, and non-labeled target interrogation with molecular beacons. Fiber optic microsphere-based DNA microarrays have a simple fabrication protocol enabling their expansion into other applications, such as single cell-based assays.

  14. Application of the Gini Correlation Coefficient to Infer Regulatory Relationships in Transcriptome Analysis

    PubMed Central

    Ma, Chuang; Wang, Xiangfeng

    2012-01-01

    One of the computational challenges in plant systems biology is to accurately infer transcriptional regulation relationships based on correlation analyses of gene expression patterns. Although several correlation methods are applied in biology to analyze microarray data, concerns have been raised regarding the compatibility of these methods with the gene expression data profiled by high-throughput RNA transcriptome sequencing (RNA-Seq) technology. These concerns are mainly due to the fact that the distribution of read counts in RNA-Seq experiments is different from that of fluorescence intensities in microarray experiments. Therefore, a comprehensive evaluation of the existing correlation methods and, if necessary, introduction of novel methods into biology is appropriate. In this study, we compared four existing correlation methods used in microarray analysis and one novel method called the Gini correlation coefficient on previously published microarray-based and sequencing-based gene expression data in Arabidopsis (Arabidopsis thaliana) and maize (Zea mays). The comparisons were performed on more than 11,000 regulatory relationships in Arabidopsis, including 8,929 pairs of transcription factors and target genes. Our analyses pinpointed the strengths and weaknesses of each method and indicated that the Gini correlation can compensate for the shortcomings of the Pearson correlation, the Spearman correlation, the Kendall correlation, and the Tukey's biweight correlation. The Gini correlation method, together with the other four evaluated methods, was implemented as an R package named rsgcc that can be utilized as an alternative option for biologists to perform clustering analyses of gene expression patterns or transcriptional network analyses. PMID:22797655

  15. Application of Immunosignatures for Diagnosis of Valley Fever

    PubMed Central

    Navalkar, Krupa Arun; Johnston, Stephen Albert; Woodbury, Neal; Galgiani, John N.; Magee, D. Mitchell; Chicacz, Zbigniew

    2014-01-01

    Valley fever (VF) is difficult to diagnose, partly because the symptoms of VF are confounded with those of other community-acquired pneumonias. Confirmatory diagnostics detect IgM and IgG antibodies against coccidioidal antigens via immunodiffusion (ID). The false-negative rate can be as high as 50% to 70%, with 5% of symptomatic patients never showing detectable antibody levels. In this study, we tested whether the immunosignature diagnostic can resolve VF false negatives. An immunosignature is the pattern of antibody binding to random-sequence peptides on a peptide microarray. A 10,000-peptide microarray was first used to determine whether valley fever patients can be distinguished from 3 other cohorts with similar infections. After determining the VF-specific peptides, a small 96-peptide diagnostic array was created and tested. The performances of the 10,000-peptide array and the 96-peptide diagnostic array were compared to that of the ID diagnostic standard. The 10,000-peptide microarray classified the VF samples from the other 3 infections with 98% accuracy. It also classified VF false-negative patients with 100% sensitivity in a blinded test set versus 28% sensitivity for ID. The immunosignature microarray has potential for simultaneously distinguishing valley fever patients from those with other fungal or bacterial infections. The same 10,000-peptide array can diagnose VF false-negative patients with 100% sensitivity. The smaller 96-peptide diagnostic array was less specific for diagnosing false negatives. We conclude that the performance of the immunosignature diagnostic exceeds that of the existing standard, and the immunosignature can distinguish related infections and might be used in lieu of existing diagnostics. PMID:24964807

  16. Guanine Plus Cytosine Contents of the Deoxyribonucleic Acids of Some Sulfate-Reducing Bacteria: a Reassessment

    PubMed Central

    Skyring, G. W.; Jones, H. E.

    1972-01-01

    Guanine plus cytosine (GC) contents of the deoxyribonucleic acids of Desulfovibrio and Desulfotomaculum have been used as a basis for classification. Some of these data have been incorrectly calculated, resulting in errors of as much as 5% GC. This situation has been corrected by a reanalysis of existing data and by the contribution of new data. PMID:5011245
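
    The kind of calculation at issue converts a physical measurement into a %GC value. As one common example, CsCl buoyant density is converted with the classical Schildkraut-Marmur-Doty relation; this is a textbook formula shown for illustration, not the paper's own correction:

```python
def gc_from_buoyant_density(rho):
    """Percent G+C from CsCl buoyant density via the classical
    Schildkraut-Marmur-Doty relation, rho = 1.660 + 0.00098 * (%GC)
    g/cm^3. A textbook conversion of the kind whose misapplication
    can produce errors of several percent GC; not the paper's code."""
    return (rho - 1.660) / 0.00098

gc = gc_from_buoyant_density(1.709)   # a density of 1.709 g/cm^3
```

    A slope or intercept error of even a fraction of a percent in such a linear conversion propagates directly into the %GC estimate, which is consistent with the magnitude of error (up to 5% GC) the abstract reports correcting.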

  17. Recent advances in proteomic applications for schistosomiasis research: potential clinical impact.

    PubMed

    Sotillo, Javier; Doolan, Denise; Loukas, Alex

    2017-02-01

    Schistosomiasis is a neglected tropical disease affecting hundreds of millions of people worldwide. Recent advances in the field of proteomics and the development of new and highly sensitive mass spectrometers and quantitative techniques have provided new tools for advancing the molecular biology, cell biology, diagnosis and vaccine development for public health threats such as schistosomiasis. Areas covered: In this review we describe the latest advances in research that utilizes proteomics-based tools to address some of the key challenges to developing effective interventions against schistosomiasis. We also provide information about the potential of extracellular vesicles to advance the fight against this devastating disease. Expert commentary: Different proteins are already being tested as vaccines against schistosomiasis with promising results. The re-analysis of the Schistosoma spp. proteomes using new and more sensitive mass spectrometers as well as better separation approaches will help identify more vaccine targets in a rational and informed manner. In addition, the recent development of new proteome microarrays will facilitate characterisation of novel markers of infection as well as new vaccine and diagnostic candidate antigens.

  18. Daily temperature and precipitation extremes in the Baltic Sea region derived from the BaltAn65+ reanalysis

    NASA Astrophysics Data System (ADS)

    Toll, Velle; Post, Piia

    2018-04-01

    Daily 2-m temperature and precipitation extremes in the Baltic Sea region for 1965-2005 are studied based on data from the BaltAn65+ high-resolution atmospheric reanalysis. Moreover, the ability of regional reanalysis to capture extremes is analysed by comparing the reanalysis data to gridded observations. The shortcomings in the simulation of the minimum temperatures over the northern part of the region and in the simulation of the extreme precipitation over the Scandinavian mountains in the BaltAn65+ reanalysis data are detected and analysed. Temporal trends in the temperature and precipitation extremes in the Baltic Sea region, with the largest increases in temperature and precipitation in winter, are detected based on both gridded observations and the BaltAn65+ reanalysis data. However, the reanalysis is not able to capture all of the regional trends in the extremes in the observations due to the shortcomings in the simulation of the extremes.

  19. Downscaling reanalysis data to high-resolution variables above a glacier surface (Cordillera Blanca, Peru)

    NASA Astrophysics Data System (ADS)

    Hofer, Marlis; Mölg, Thomas; Marzeion, Ben; Kaser, Georg

    2010-05-01

    Recently initiated observation networks in the Cordillera Blanca provide temporally high-resolution, yet short-term atmospheric data. The aim of this study is to extend the existing time series into the past. We present an empirical-statistical downscaling (ESD) model that links 6-hourly NCEP/NCAR reanalysis data to the local target variables, measured at the tropical glacier Artesonraju (Northern Cordillera Blanca). The approach is unusual in the context of ESD for two reasons. First, the observational time series for model calibration are short (only about two years). Second, unlike most ESD studies in climate research, we focus on variables at a high temporal resolution (i.e., six-hourly values). Our target variables are two important drivers in the surface energy balance of tropical glaciers: air temperature and specific humidity. The selection of predictor fields from the reanalysis data is based on regression analyses and climatologic considerations. The ESD modelling procedure includes combined empirical orthogonal function and multiple regression analyses. Principal component screening is based on cross-validation using the Akaike Information Criterion as model selection criterion. Double cross-validation is applied for model evaluation. Potential autocorrelation in the time series is considered by defining the block length in the resampling procedure. Apart from the selection of predictor fields, the modelling procedure is automated and does not include subjective choices. We assess the ESD model sensitivity to the predictor choice by using both single- and mixed-field predictors of the variables air temperature (1000 hPa), specific humidity (1000 hPa), and zonal wind speed (500 hPa). The chosen downscaling domain ranges from 80 to 50 degrees west and from 0 to 20 degrees south. Statistical transfer functions are derived individually for different months and times of day (month/hour-models).
The forecast skill of the month/hour-models depends strongly on month and time of day, ranging from 0 to 0.8, but the mixed-field predictors generally perform better than the single-field predictors. At all time scales, the ESD model shows added value against two simple reference models: (i) the direct use of reanalysis grid point values, and (ii) mean diurnal and seasonal cycles over the calibration period. The ESD model forecast for 1960 to 2008 clearly reflects interannual variability related to the El Niño/Southern Oscillation, but is sensitive to the chosen predictor type. So far, we have not assessed the performance of NCEP/NCAR reanalysis data against other reanalysis products. The developed ESD model is computationally cheap and applicable wherever measurements are available for model calibration.
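
    Stripped of the paper's AIC-based principal component screening and double cross-validation, the core EOF-plus-multiple-regression transfer function can be sketched as follows. The synthetic data, function names, and two-PC truncation are our assumptions for illustration:

```python
import numpy as np

def esd_fit(predictor_field, target, n_pcs=3):
    """ESD transfer function in the spirit of the abstract: project the
    gridded predictor field onto its leading EOFs (via SVD), then
    regress the local target variable on the retained PC scores.
    A bare-bones sketch; PC screening and validation are omitted."""
    x = predictor_field - predictor_field.mean(axis=0)
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    pcs = u[:, :n_pcs] * s[:n_pcs]          # PC score time series
    design = np.column_stack([np.ones(len(pcs)), pcs])
    coeffs, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coeffs, vt[:n_pcs]               # regression weights, EOFs

def esd_predict(coeffs, eofs, predictor_field, train_mean):
    """Apply the fitted transfer function to (possibly new) fields."""
    pcs = (predictor_field - train_mean) @ eofs.T
    design = np.column_stack([np.ones(len(pcs)), pcs])
    return design @ coeffs

# Synthetic test: a large-scale driver with a fixed spatial footprint
# controls the local target, plus weak noise.
rng = np.random.default_rng(1)
signal = rng.standard_normal(200)                    # large-scale driver
pattern = rng.standard_normal(50)                    # spatial footprint
field = np.outer(signal, pattern) + 0.01 * rng.standard_normal((200, 50))
target = 2.0 * signal + 1.0                          # local variable
coeffs, eofs = esd_fit(field, target, n_pcs=2)
pred = esd_predict(coeffs, eofs, field, field.mean(axis=0))
```

    In the paper's setting, separate transfer functions of this form are fitted per month and time of day (the month/hour-models), with cross-validation deciding how many PCs to retain.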

  20. Potential Seasonal Predictability of Water Cycle in Observations and Reanalysis

    NASA Astrophysics Data System (ADS)

    Feng, X.; Houser, P.

    2012-12-01

    Identification of the predictability of water cycle variability is crucial for climate prediction, water resources availability, ecosystem management and hazard mitigation. An analysis that can assess the potential skill in seasonal prediction was proposed by the authors, termed analysis of covariance (ANOCOVA). This method tests whether the interannual variability of seasonal means exceeds that due to weather noise, under the null hypothesis that seasonal means are identical every year. It has the advantage of taking into account the autocorrelation structure of the daily time series while also accounting for the uncertainty of the estimated parameters in the significance test. During the past several years, multiple reanalysis datasets have become available for studying climate variability and understanding the climate system. We are motivated to compare the potential predictability of water cycle variation from different reanalysis datasets against observations using the newly proposed ANOCOVA method. The selected eight reanalyses include the National Centers for Environmental Prediction-National Center for Atmospheric Research (NCEP/NCAR) 40-year Reanalysis Project (NNRP), the National Centers for Environmental Prediction-Department of Energy (NCEP/DOE) Reanalysis Project (NDRP), the European Centre for Medium-Range Weather Forecasts (ECMWF) 40-year Reanalysis, the Japan Meteorological Agency 25-year Reanalysis Project (JRA25), the ECMWF Interim Reanalysis (ERAINT), the NCEP Climate Forecast System Reanalysis (CFSR), the National Aeronautics and Space Administration (NASA) Modern-Era Retrospective Analysis for Research and Applications (MERRA), and the National Oceanic and Atmospheric Administration-Cooperative Institute for Research in Environmental Sciences (NOAA/CIRES) 20th Century Reanalysis Version 2 (20CR).
For the key water cycle components, precipitation and evaporation, all reanalyses consistently show a high fraction of predictable variance in the tropics, low predictability over the extratropics, more potential predictability over the ocean than over land, and a stronger seasonal variation in potential predictability over land than over the ocean. Substantial differences are observed especially over the extratropical areas, where the boundary-forced signal is not as significant as in the tropics. We further evaluate the accuracy of the reanalyses in estimating seasonal predictability over several selected regions, where rain gauge measurements or land surface data assimilation products are available and accurate, to gain insight into the strengths and weaknesses of the reanalysis products.
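
    The ANOCOVA idea described above - comparing the interannual variance of seasonal means against the variance expected from autocorrelated daily weather noise - can be illustrated with a toy variance-ratio calculation. This is a simplified sketch, not the authors' full significance test; the function name and the AR(1) effective-sample-size correction are illustrative choices.

```python
import numpy as np

def potential_predictability_ratio(daily, days_per_season=90):
    """Simplified ANOCOVA-style variance ratio (illustrative only).

    daily: 1-D array of daily values covering complete seasons.
    Returns the ratio of the interannual variance of seasonal means to
    the variance expected from daily weather noise alone, crudely
    corrected for lag-1 autocorrelation.
    """
    n_years = daily.size // days_per_season
    seasons = daily[: n_years * days_per_season].reshape(n_years, days_per_season)
    seasonal_means = seasons.mean(axis=1)

    # Within-season (weather noise) variance, pooled across years
    anom = seasons - seasonal_means[:, None]
    noise_var = anom.var(ddof=1)

    # Effective sample size per season from lag-1 autocorrelation
    flat = anom.ravel()
    r1 = np.corrcoef(flat[:-1], flat[1:])[0, 1]
    n_eff = days_per_season * (1 - r1) / (1 + r1)

    expected_noise = noise_var / max(n_eff, 1.0)
    return seasonal_means.var(ddof=1) / expected_noise

rng = np.random.default_rng(0)
# Pure weather noise: the ratio should hover near 1
white = rng.standard_normal(90 * 30)
# Add a year-to-year (boundary-forced) signal: the ratio grows sharply
signal = white + np.repeat(rng.standard_normal(30) * 0.5, 90)
print(potential_predictability_ratio(white), potential_predictability_ratio(signal))
```

    A ratio well above 1 indicates that seasonal means vary more than weather noise can explain, i.e. potentially predictable variance.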

  1. Experimental design for three-color and four-color gene expression microarrays.

    PubMed

    Woo, Yong; Krueger, Winfried; Kaur, Anupinder; Churchill, Gary

    2005-06-01

Three-color microarrays, compared with two-color microarrays, can increase design efficiency and power to detect differential expression without additional samples and arrays. Furthermore, three-color microarray technology is currently available at a reasonable cost. Despite the potential advantages, clear guidelines for designing and analyzing three-color experiments do not exist. We propose a three- and a four-color cyclic design (loop) and a complementary graphical representation to help design experiments that are balanced, efficient and robust to hybridization failures. In theory, three-color loop designs are more efficient than two-color loop designs. Experiments using both two- and three-color platforms were performed in parallel and their outputs were analyzed using linear mixed model analysis in R/MAANOVA. These results demonstrate that three-color experiments using the same number of samples (and fewer arrays) will perform as efficiently as two-color experiments. The improved efficiency of the design is somewhat offset by a reduced dynamic range and increased variability in the three-color experimental system. This result suggests that, with minor technological improvements, three-color microarrays using loop designs could detect differential expression more efficiently than two-color loop designs. Availability: http://www.jax.org/staff/churchill/labsite/software. Multicolor cyclic design construction methods and examples, along with additional results of the experiment, are provided at http://www.jax.org/staff/churchill/labsite/pubs/yong.
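
    The efficiency claim above can be checked with a small linear-model calculation. The sketch below scores a design by the average variance of estimated pairwise sample differences (the effective resistance of the contrast graph). It is a simplified fixed-effects view that treats every within-array contrast as an independent measurement and ignores dye effects and the mixed-model treatment used in R/MAANOVA, so it only illustrates the comparison, not the paper's analysis.

```python
import numpy as np

def avg_pairwise_variance(arrays, n_samples):
    """Average variance (in units of sigma^2) of estimated pairwise
    sample differences for a loop design.

    arrays: list of tuples, each giving the sample indices hybridized
    on one array; every within-array pair contributes one contrast.
    """
    rows = []
    for samples in arrays:
        for a in range(len(samples)):
            for b in range(a + 1, len(samples)):
                r = np.zeros(n_samples)
                r[samples[a]] += 1.0
                r[samples[b]] -= 1.0
                rows.append(r)
    X = np.array(rows)
    cov = np.linalg.pinv(X.T @ X)  # covariance of sample-effect estimates
    tot, cnt = 0.0, 0
    for i in range(n_samples):
        for j in range(i + 1, n_samples):
            tot += cov[i, i] + cov[j, j] - 2 * cov[i, j]
            cnt += 1
    return tot / cnt

n = 6
two_color = [(i, (i + 1) % n) for i in range(n)]   # 6 arrays, 2 dyes
three_color = [(0, 1, 2), (2, 3, 4), (4, 5, 0)]    # 3 arrays, 3 dyes
print(avg_pairwise_variance(two_color, n),
      avg_pairwise_variance(three_color, n))
```

    Under these simplifying assumptions the three-array three-color loop yields a lower average variance than the six-array two-color loop for six samples, consistent with the abstract's "same samples, fewer arrays" argument.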

  2. MULTI-K: accurate classification of microarray subtypes using ensemble k-means clustering

    PubMed Central

    Kim, Eun-Youn; Kim, Seon-Young; Ashlock, Daniel; Nam, Dougu

    2009-01-01

Background Uncovering subtypes of disease from microarray samples has important clinical implications, such as survival time and the sensitivity of individual patients to specific therapies. Unsupervised clustering methods have been used to classify this type of data. However, most existing methods focus on clusters with compact shapes and do not reflect the geometric complexity of high-dimensional microarray clusters, which limits their performance. Results We present a cluster-number-based ensemble clustering algorithm, called MULTI-K, for microarray sample classification, which demonstrates remarkable accuracy. The method amalgamates multiple k-means runs by varying the number of clusters and identifies clusters that manifest the most robust co-memberships of elements. In addition to the original algorithm, we devised an entropy plot to control the separation of singletons or small clusters. MULTI-K, unlike simple k-means or other widely used methods, was able to capture clusters with complex and high-dimensional structures accurately. MULTI-K outperformed other methods, including a recently developed ensemble clustering algorithm, in tests with five simulated and eight real gene-expression data sets. Conclusion The geometric complexity of clusters should be taken into account for accurate classification of microarray data, and ensemble clustering applied to the number of clusters tackles the problem very well. The C++ code and the data sets tested are available from the authors. PMID:19698124

  3. Empirical-statistical downscaling of reanalysis data to high-resolution air temperature and specific humidity above a glacier surface (Cordillera Blanca, Peru)

    NASA Astrophysics Data System (ADS)

Hofer, Marlis; Mölg, Thomas; Marzeion, Ben; Kaser, Georg

    2010-06-01

Recently initiated observation networks in the Cordillera Blanca (Peru) provide temporally high-resolution, yet short-term, atmospheric data. The aim of this study is to extend the existing time series into the past. We present an empirical-statistical downscaling (ESD) model that links 6-hourly National Centers for Environmental Prediction (NCEP)/National Center for Atmospheric Research (NCAR) reanalysis data to air temperature and specific humidity, measured at the tropical glacier Artesonraju (northern Cordillera Blanca). The ESD modeling procedure includes combined empirical orthogonal function and multiple regression analyses and a double cross-validation scheme for model evaluation. Apart from the selection of predictor fields, the modeling procedure is automated and does not include subjective choices. We assess the ESD model sensitivity to the predictor choice using both single-field and mixed-field predictors. Statistical transfer functions are derived individually for different months and times of day. The forecast skill largely depends on month and time of day, ranging from 0 to 0.8. The mixed-field predictors perform better than the single-field predictors. The ESD model shows added value, at all time scales, against simpler reference models (e.g., the direct use of reanalysis grid point values). The ESD model forecast for 1960-2008 clearly reflects interannual variability related to the El Niño/Southern Oscillation but is sensitive to the chosen predictor type.
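
    The combined EOF and multiple-regression step at the heart of such an ESD model can be sketched in a few lines. This is an illustrative reduction: the paper's monthly and time-of-day stratification and its double cross-validation are omitted, and all names and the synthetic data below are invented.

```python
import numpy as np

def esd_fit(field, station, n_eofs=3):
    """Sketch of the EOF + multiple-regression step of an ESD model.

    field: (time, gridpoints) reanalysis predictor anomalies.
    station: (time,) local observations.
    Returns a predict(new_field) closure.
    """
    mean = field.mean(0)
    A = field - mean
    # EOFs via SVD; the principal components are the regression predictors
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    pcs = U[:, :n_eofs] * s[:n_eofs]
    G = np.column_stack([np.ones(len(pcs)), pcs])
    coef, *_ = np.linalg.lstsq(G, station, rcond=None)

    def predict(new_field):
        new_pcs = (new_field - mean) @ Vt[:n_eofs].T
        return np.column_stack([np.ones(len(new_pcs)), new_pcs]) @ coef

    return predict

# Synthetic check: a low-rank "reanalysis field" drives the station series
rng = np.random.default_rng(2)
latent = rng.standard_normal((250, 3))
patterns = rng.standard_normal((3, 50))
field = latent @ patterns + 0.1 * rng.standard_normal((250, 50))
station = latent[:, 0] + 0.1 * rng.standard_normal(250)

predict = esd_fit(field[:200], station[:200], n_eofs=3)
skill = np.corrcoef(predict(field[200:]), station[200:])[0, 1]
print(round(skill, 2))
```

    The correlation on the held-out portion plays the role of the cross-validated forecast skill reported in the abstract.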

  4. A daily Azores-Iceland North Atlantic Oscillation index back to 1850.

    PubMed

    Cropper, Thomas; Hanna, Edward; Valente, Maria Antónia; Jónsson, Trausti

    2015-07-01

We present the construction of a continuous, daily (09:00 UTC), station-based (Azores-Iceland) North Atlantic Oscillation (NAO) index back to 1871, which is extended back to 1850 with additional daily mean data. The constructed index more than doubles the length of previously existing, widely available, daily NAO time series. The index is created using entirely observational sea-level pressure (SLP) data from Iceland and 73.5% observational SLP data from the Azores - the remainder being filled in via reanalysis (Twentieth Century Reanalysis Project and European Mean Sea Level Pressure) SLP data. Icelandic data are taken from the Southwest Iceland pressure series. We construct and document a new Ponta Delgada SLP time series based on recently digitized and newly available data that extend back to 1872. The Ponta Delgada time series is created by splicing together several fractured records (from Ponta Delgada, Lajes, and Santa Maria) and filling in the major gaps (pre-1872, 1888-1905, and 1940-1941) and occasional days (145) with reanalysis data. Further homogeneity corrections are applied to the Azores record, and the daily (09:00 UTC) NAO index is then calculated. The resulting index, with its extended temporal length and daily resolution, is the first reconstruction of a daily NAO index reaching back into the 19th century and is therefore useful for researchers across multiple disciplines.
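
    A station-based daily NAO index of this kind is, in essence, a normalized Azores-minus-Iceland SLP difference. The sketch below standardizes each series against its own calendar-day climatology; this is a common convention, and the paper's exact normalization may differ.

```python
import numpy as np

def daily_nao_index(slp_azores, slp_iceland, doy):
    """Station-based NAO sketch: normalized Azores-minus-Iceland SLP.

    doy: day-of-year (1-366) for each observation; each series is
    standardized against its own calendar-day climatology.
    """
    doy = np.asarray(doy)

    def standardize(x):
        x = np.asarray(x, dtype=float)
        z = np.empty_like(x)
        for d in np.unique(doy):
            m = doy == d
            z[m] = (x[m] - x[m].mean()) / x[m].std(ddof=1)
        return z

    return standardize(slp_azores) - standardize(slp_iceland)

# Ten synthetic years of daily SLP (hPa) at the two stations
rng = np.random.default_rng(3)
doy = np.tile(np.arange(1, 366), 10)
azores = 1021.0 + 3.0 * rng.standard_normal(doy.size)
iceland = 1005.0 + 8.0 * rng.standard_normal(doy.size)
nao = daily_nao_index(azores, iceland, doy)
print(round(float(nao.std()), 2))
```

    Positive index values correspond to a strengthened Azores high / deepened Icelandic low, the classic positive NAO phase.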

  5. Global 3-D ionospheric electron density reanalysis based on multisource data assimilation

    NASA Astrophysics Data System (ADS)

    Yue, Xinan; Schreiner, William S.; Kuo, Ying-Hwa; Hunt, Douglas C.; Wang, Wenbin; Solomon, Stanley C.; Burns, Alan G.; Bilitza, Dieter; Liu, Jann-Yenq; Wan, Weixing; Wickert, Jens

    2012-09-01

We report preliminary results of a global 3-D ionospheric electron density reanalysis demonstration study during 2002-2011 based on multisource data assimilation. The monthly global ionospheric electron density reanalysis has been done by assimilating quiet-day ionospheric data into a data assimilation model constructed using the International Reference Ionosphere (IRI) 2007 model and a Kalman filter technique. These data include global navigation satellite system (GNSS) observations of ionospheric total electron content (TEC) from ground-based stations, ionospheric radio occultations by CHAMP, GRACE, COSMIC, SAC-C, Metop-A, and the TerraSAR-X satellites, and Jason-1 and 2 altimeter TEC measurements. The output of the reanalysis consists of 3-D gridded ionospheric electron densities with temporal and spatial resolutions of 1 h in universal time, 5° in latitude, 10° in longitude, and ˜30 km in altitude. The climatological features of the reanalysis results, such as solar activity dependence, seasonal variations, and the global morphology of the ionosphere, agree well with those in the empirical models and observations. The global electron content derived from the international GNSS service global ionospheric maps, the observed electron density profiles from the Poker Flat Incoherent Scatter Radar during 2007-2010, and foF2 observed by the global ionosonde network during 2002-2011 are used to validate the reanalysis method. All comparisons show that the reanalysis has smaller deviations and biases than the IRI-2007 predictions. Especially after April 2006, when the six COSMIC satellites were launched, the reanalysis shows significant improvement over the IRI predictions. The obvious overestimation of the low-latitude ionospheric F region densities by the IRI model during the 23/24 solar minimum is corrected well by the reanalysis. The potential application and improvements of the reanalysis are also discussed.
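
    The Kalman filter step underlying such a data assimilation model blends a background state (here, an IRI-like profile) with observations through a linear observation operator. Below is a minimal linear-update sketch with toy numbers standing in for real TEC geometry; it shows the mechanics, not the paper's full system.

```python
import numpy as np

def kalman_update(xb, B, y, H, R):
    """One Kalman analysis step: blend background xb with observations y.

    xb: background state (e.g. gridded electron density from IRI)
    B:  background error covariance; H: linear observation operator
    y:  observations (e.g. TEC-like integrals); R: observation error cov.
    """
    K = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)  # Kalman gain
    xa = xb + K @ (y - H @ xb)                    # analysis state
    A = (np.eye(len(xb)) - K @ H) @ B             # analysis covariance
    return xa, A

# Toy vertical profile: the background underestimates the true density
n = 20
truth = np.linspace(1.0, 2.0, n)
xb = 0.8 * truth
idx = np.arange(n)
B = 0.2 * np.exp(-np.abs(np.subtract.outer(idx, idx)) / 3.0)
H = np.ones((1, n)) / n            # one "TEC-like" vertical average
y = np.array([truth.mean()])       # a perfect integrated observation
R = np.array([[1e-4]])
xa, A = kalman_update(xb, B, y, H, R)
print(round(float(np.abs(xb - truth).mean()), 3),
      round(float(np.abs(xa - truth).mean()), 3))
```

    Even a single integrated observation pulls the whole profile toward the truth because the background covariance B spreads the innovation vertically.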

  6. Extending the reanalysis to the ionosphere based on ground and LEO based GNSS observations

    NASA Astrophysics Data System (ADS)

    Yue, X.; Schreiner, W. S.; Kuo, Y.

    2012-12-01

We report preliminary results of a global 3-D ionospheric electron density reanalysis during 2002-2011 based on multi-source data assimilation. The monthly global ionospheric electron density reanalysis has been done by assimilating quiet-day ionospheric data into a data assimilation model constructed using the International Reference Ionosphere (IRI) 2007 model and a Kalman filter technique. These data include global navigation satellite system (GNSS) observations of ionospheric total electron content (TEC) from ground-based stations, ionospheric radio occultations by CHAMP, GRACE, COSMIC, SAC-C, Metop-A, and the TerraSAR-X satellites, and Jason-1 and 2 altimeter TEC measurements. The output of the reanalysis consists of 3-D gridded ionospheric electron densities with temporal and spatial resolutions of 1 hr in universal time, 5° in latitude, 10° in longitude, and ~30 km in altitude. The climatological features of the reanalysis results, such as solar activity dependence, seasonal variations, and the global morphology of the ionosphere, agree well with those in the empirical models and observations. The global electron content (GEC) derived from the international GNSS service (IGS) global ionospheric maps (GIM), the observed electron density profiles from the Poker Flat Incoherent Scatter Radar (PFISR) during 2007-2010, and foF2 observed by the global ionosonde network during 2002-2011 are used to validate the reanalysis method. All comparisons show that the reanalysis has smaller deviations and biases than the IRI-2007 predictions. Especially after April 2006, when the six COSMIC satellites were launched, the reanalysis shows significant improvement over the IRI predictions. The obvious overestimation of the low-latitude ionospheric F-region densities by the IRI model during the 23/24 solar minimum is corrected well by the reanalysis. The potential application and improvements of the reanalysis are also discussed.

  7. Microarray missing data imputation based on a set theoretic framework and biological knowledge.

    PubMed

    Gan, Xiangchao; Liew, Alan Wee-Chung; Yan, Hong

    2006-01-01

Gene expressions measured using microarrays usually suffer from the missing value problem. However, in many data analysis methods, a complete data matrix is required. Although existing missing value imputation algorithms have shown good performance in dealing with missing values, they also have their limitations. For example, some algorithms perform well only when strong local correlation exists in the data, while others provide the best estimate when the data are dominated by global structure. In addition, these algorithms do not take any biological constraints into account in their imputation. In this paper, we propose a set theoretic framework based on projection onto convex sets (POCS) for missing data imputation. POCS allows us to incorporate different types of a priori knowledge about missing values into the estimation process. The main idea of POCS is to formulate every piece of prior knowledge into a corresponding convex set and then use a convergence-guaranteed iterative procedure to obtain a solution in the intersection of all these sets. In this work, we design several convex sets, taking into consideration the biological characteristics of the data: the first set mainly exploits the local correlation structure among genes in microarray data, while the second set captures the global correlation structure among arrays. The third set (actually a series of sets) exploits the biological phenomenon of synchronization loss in microarray experiments. Synchronization loss is a common phenomenon in cyclic systems, and we construct a series of sets based on it for our POCS imputation algorithm. Experiments show that our algorithm can achieve a significant reduction of error compared to the KNNimpute, SVDimpute and LSimpute methods.
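
    The iterate-projections idea can be conveyed with a toy alternating-projection imputer. To be clear about the hedge: the paper's sets (local gene correlation, global array correlation, synchronization loss) are genuinely convex, whereas the low-rank projection below is the simpler "hard-impute" flavor, used here only to show the mechanics of alternating between constraint sets.

```python
import numpy as np

def alternating_projection_impute(X, mask, rank=2, n_iter=100):
    """Alternating-projection imputation sketch (hard-impute flavor).

    Repeatedly projects onto (a) the set of matrices agreeing with the
    observed entries and (b) rank-`rank` approximations capturing the
    global correlation structure. mask: True where X is observed.
    """
    Y = np.where(mask, X, X[mask].mean())
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(Y, full_matrices=False)
        low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank]  # projection (b)
        Y = np.where(mask, X, low_rank)                  # projection (a)
    return Y

# Rank-2 toy "expression matrix" with about 10% missing entries
rng = np.random.default_rng(4)
true = rng.standard_normal((40, 2)) @ rng.standard_normal((2, 15))
mask = rng.random(true.shape) > 0.1
X = np.where(mask, true, np.nan)
filled = alternating_projection_impute(X, mask, rank=2)
err = float(np.abs(filled - true)[~mask].mean())
print(round(err, 3))
```

    Observed entries are preserved exactly at every iteration; only the missing entries are repeatedly re-estimated from the global structure.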

  8. Identification of Novel Tissue-Specific Genes by Analysis of Microarray Databases: A Human and Mouse Model

    PubMed Central

    Suh, Yeunsu; Davis, Michael E.; Lee, Kichoon

    2013-01-01

Understanding the tissue-specific pattern of gene expression is critical in elucidating the molecular mechanisms of tissue development, gene function, and transcriptional regulation of biological processes. Although tissue-specific gene expression information is available in several databases, follow-up strategies to integrate and use these data are limited. The objective of the current study was to identify and evaluate novel tissue-specific genes in human and mouse tissues by performing comparative microarray database analysis and semi-quantitative PCR analysis. We developed a powerful approach to predict tissue-specific genes by analyzing existing microarray data from the NCBI's Gene Expression Omnibus (GEO) public repository. We investigated and confirmed tissue-specific gene expression in the human and mouse kidney, liver, lung, heart, muscle, and adipose tissue. Applying our novel comparative microarray approach, we confirmed 10 kidney, 11 liver, 11 lung, 11 heart, 8 muscle, and 8 adipose specific genes. The accuracy of this approach was further verified by employing semi-quantitative PCR and by searching for gene function information in existing publications. Three novel tissue-specific genes were discovered by this approach, including AMDHD1 (amidohydrolase domain containing 1) in the liver, PRUNE2 (prune homolog 2) in the heart, and ACVR1C (activin A receptor, type IC) in adipose tissue. We further confirmed the tissue-specific expression of these 3 novel genes by real-time PCR. Among them, ACVR1C is adipose tissue-specific and adipocyte-specific in adipose tissue, and can be used as an adipocyte developmental marker. From GEO profiles, we predicted the processes in which AMDHD1 and PRUNE2 may participate. Our approach provides a novel way to identify new sets of tissue-specific genes and to predict functions in which they may be involved. PMID:23741331
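
    A comparative screen of this kind ultimately reduces to ranking genes by how strongly their expression in a target tissue exceeds that in all others. The max-ratio rule below is an illustration of that idea, not the study's exact criterion; the function name, the fold threshold, and the toy matrix are all invented.

```python
import numpy as np

def tissue_specific_genes(expr, tissues, target, fold=10.0):
    """Flag genes far more expressed in `target` than in any other tissue.

    expr: (genes, samples) expression matrix; tissues: sample labels.
    A gene qualifies when its mean in the target tissue exceeds `fold`
    times its highest mean among the remaining tissues.
    """
    tissues = np.asarray(tissues)
    means = {t: expr[:, tissues == t].mean(axis=1) for t in set(tissues)}
    target_mean = means.pop(target)
    other_max = np.max(np.column_stack(list(means.values())), axis=1)
    return np.where(target_mean > fold * other_max)[0]

# Toy matrix: gene 0 is liver-specific, gene 1 is ubiquitous
expr = np.array([[50.0, 60.0, 1.0, 2.0, 1.5, 0.5],
                 [20.0, 22.0, 19.0, 21.0, 20.0, 18.0]])
labels = ["liver", "liver", "heart", "heart", "kidney", "kidney"]
print(tissue_specific_genes(expr, labels, "liver"))
```

    The candidates such a rule surfaces would then be validated downstream, as the abstract describes, by semi-quantitative and real-time PCR.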

  9. High density DNA microarrays: algorithms and biomedical applications.

    PubMed

    Liu, Wei-Min

    2004-08-01

DNA microarrays are devices capable of detecting the identity and abundance of numerous DNA or RNA segments in samples. They are used for analyzing gene expression, identifying genetic markers and detecting mutations on a genomic scale. The fundamental chemical mechanism of DNA microarrays is the hybridization between probes and targets due to the hydrogen bonds of nucleotide base pairing. Since cross-hybridization is inevitable, and probes or targets may form undesirable secondary or tertiary structures, microarray data contain noise and depend on experimental conditions. It is crucial to apply proper statistical algorithms to obtain useful signals from noisy data. After the signals of a large number of probes are obtained, we need to derive biomedical information such as the existence of a transcript in a cell, the difference in expression levels of a gene across multiple samples, and the type of a genetic marker. Furthermore, after the expression levels of thousands of genes or the genotypes of thousands of single nucleotide polymorphisms are determined, it is usually important to find a small number of genes or markers that are related to a disease, individual reactions to drugs, or other phenotypes. All these applications need careful data analyses and reliable algorithms.

  10. Development and application of a DNA microarray-based yeast two-hybrid system

    PubMed Central

    Suter, Bernhard; Fontaine, Jean-Fred; Yildirimman, Reha; Raskó, Tamás; Schaefer, Martin H.; Rasche, Axel; Porras, Pablo; Vázquez-Álvarez, Blanca M.; Russ, Jenny; Rau, Kirstin; Foulle, Raphaele; Zenkner, Martina; Saar, Kathrin; Herwig, Ralf; Andrade-Navarro, Miguel A.; Wanker, Erich E.

    2013-01-01

The yeast two-hybrid (Y2H) system is the most widely applied methodology for systematic protein–protein interaction (PPI) screening and the generation of comprehensive interaction networks. We developed a novel Y2H interaction screening procedure using DNA microarrays for high-throughput quantitative PPI detection. Applying a global pooling and selection scheme to a large collection of human open reading frames, proof-of-principle Y2H interaction screens were performed for the human neurodegenerative disease proteins huntingtin and ataxin-1. Using systematic controls for unspecific Y2H results and quantitative benchmarking, we identified and scored a large number of known and novel partner proteins for both huntingtin and ataxin-1. Moreover, we show that this parallelized screening procedure and the global inspection of Y2H interaction data are uniquely suited to define specific PPI patterns and their alteration by disease-causing mutations in huntingtin and ataxin-1. This approach takes advantage of the specificity and flexibility of DNA microarrays and of the existence of solid statistical methods for the analysis of DNA microarray data, and allows a quantitative approach toward interaction screens in humans and in model organisms. PMID:23275563

  11. Apparently low reproducibility of true differential expression discoveries in microarray studies.

    PubMed

    Zhang, Min; Yao, Chen; Guo, Zheng; Zou, Jinfeng; Zhang, Lin; Xiao, Hui; Wang, Dong; Yang, Da; Gong, Xue; Zhu, Jing; Li, Yanhui; Li, Xia

    2008-09-15

Differentially expressed gene (DEG) lists detected from different microarray studies for the same disease are often highly inconsistent. Even in technical replicate tests using identical samples, DEG detection still shows very low reproducibility. It is often believed that current small microarray studies will introduce many false discoveries. Based on a statistical model, we show that even in technical replicate tests using identical samples, it is highly likely that the selected DEG lists will be very inconsistent in the presence of small measurement variations. Therefore, the apparently low reproducibility of DEG detection from current technical replicate tests does not indicate low quality of microarray technology. We also demonstrate that the heterogeneous biological variations existing in real cancer data will further reduce the overall reproducibility of DEG detection. Nevertheless, in small subsamples from both simulated and real data, the actual false discovery rate (FDR) for each DEG list tends to be low, suggesting that each separately determined list may comprise mostly true DEGs. Rather than simply counting the overlaps of the discovery lists from different studies for a complex disease, novel metrics are needed for evaluating the reproducibility of discoveries characterized by correlated molecular changes. Supplementary information: Supplementary data are available at Bioinformatics online.
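
    The abstract's central point - top-k DEG lists can overlap poorly between technical replicates even though each list has a low FDR - is easy to reproduce in simulation. All effect sizes, noise levels, and counts below are invented for illustration and are not taken from the paper's model.

```python
import numpy as np

def top_k_overlap(scores_a, scores_b, k):
    """Percentage of overlapping genes (POG) between two top-k lists."""
    top_a = set(np.argsort(scores_a)[-k:])
    top_b = set(np.argsort(scores_b)[-k:])
    return len(top_a & top_b) / k

# Many true DEGs share similar, modest effect sizes; measurement noise
# then reshuffles their ranking between two replicate experiments.
rng = np.random.default_rng(5)
n_genes, k = 10000, 500
effect = np.zeros(n_genes)
effect[:2000] = rng.uniform(0.5, 1.5, 2000)   # genes 0-1999 are true DEGs
rep1 = effect + 0.5 * rng.standard_normal(n_genes)
rep2 = effect + 0.5 * rng.standard_normal(n_genes)

pog = top_k_overlap(rep1, rep2, k)
# Yet each list is mostly true DEGs (low FDR), as the abstract argues
fdr1 = float(np.mean(np.argsort(rep1)[-k:] >= 2000))
print(round(pog, 2), round(fdr1, 2))
```

    The two lists overlap far less than 100% while each list's FDR stays small - exactly the "inconsistent yet mostly true" behavior the abstract describes.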

  12. UCLA-LANL Reanalysis Project

    NASA Astrophysics Data System (ADS)

    Shprits, Y.; Chen, Y.; Friedel, R.; Kondrashov, D.; Ni, B.; Subbotin, D.; Reeves, G.; Ghil, M.

    2009-04-01

We present the first results of the UCLA-LANL Reanalysis Project. Radiation belt relativistic electron phase space density is obtained using the data-assimilative VERB code combined with observations from the GEO, CRRES, and Akebono satellites. The reanalysis shows pronounced peaks in phase space density and pronounced dropouts of fluxes during the main phase of a storm. The results of the reanalysis are discussed and compared to simulations with the recently developed VERB 3D code.

  13. Global Eddy-Permitting Ocean Reanalyses and Simulations of the Period 1992 to Present

    NASA Astrophysics Data System (ADS)

    Parent, L.; Ferry, N.; Barnier, B.; Garric, G.; Bricaud, C.; Testut, C.-E.; Le Galloudec, O.; Lellouche, J.-M.; Greiner, E.; Drevillon, M.; Remy, E.; Moulines, J.-M.; Guinehut, S.; Cabanes, C.

    2013-09-01

We present the GLORYS2V1 global ocean and sea-ice eddy-permitting reanalysis over the altimetric era (1993-2009). This reanalysis is based on an ocean and sea-ice general circulation model at 1/4° horizontal resolution assimilating sea surface temperature, in situ profiles of temperature and salinity, and along-track sea level anomaly observations. The reanalysis has been produced along with a reference simulation, called MJM95, which allows evaluation of the benefits of the data assimilation. In the introduction, we briefly describe the GLORYS2V1 reanalysis system. In sections 2, 3 and 4, the reanalysis skill is presented. Data assimilation diagnostics reveal that the reanalysis is stable throughout the time period, with improved skill once the Argo observation network is established. GLORYS2V1 captures climate signals and trends well and describes mesoscale variability in a realistic manner.

  14. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Berg, Larry K.; Riihimaki, Laura D.; Qian, Yun

This study utilizes five commonly used reanalysis products, including the NCEP-DOE Reanalysis 2 (NCEP2), ECMWF Re-Analysis (ERA)-Interim, Japanese 25-year Reanalysis (JRA-25), Modern-Era Retrospective Analysis for Research and Applications (MERRA), and North American Regional Reanalysis (NARR), to evaluate features of the Southern Great Plains Low Level Jet (LLJ) above the Atmospheric Radiation Measurement (ARM) Climate Research Facility (ACRF) Southern Great Plains site. Two sets of radiosonde data are utilized: the six-week Midlatitude Continental Convective Clouds Experiment (MC3E), and a ten-year period spanning 2001-2010. All five reanalyses are compared to the MC3E data, while only the NARR and MERRA are compared to the ten-year data. Each reanalysis is able to represent most aspects of the composite LLJ profile, although there is a tendency for each reanalysis to overestimate the wind speed between the nose of the LLJ and 700 mb. There are large discrepancies in the number of LLJs observed and derived from the reanalyses, particularly for strong LLJs, which leads to an underestimate of the water vapor transport associated with LLJs. When the ten-year period is considered, the NARR overestimates and MERRA underestimates the total moisture transport, but both underestimate the transport associated with strong LLJs, by factors of 2.0 and 2.7 for the NARR and MERRA, respectively. During MC3E there were differences in the patterns of moisture convergence and divergence, with the MERRA having an area of moisture divergence over Oklahoma while the NARR had moisture convergence. The patterns of moisture convergence and divergence are more consistent during the ten-year period.
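
    The water vapor transport comparisons above rest on the standard vertically integrated moisture flux diagnostic, (1/g) ∫ q v dp. A minimal sketch with an idealized nocturnal-jet sounding follows; all profile shapes and constants are invented for illustration.

```python
import numpy as np

G = 9.81  # gravitational acceleration, m s^-2

def integrated_moisture_transport(q, v, p):
    """Vertically integrated meridional moisture transport, kg m^-1 s^-1.

    Computed as (1/g) * integral of q*v dp over the column via the
    trapezoid rule. q: specific humidity (kg/kg); v: meridional wind
    (m/s); p: pressure levels (Pa), ordered top of column to surface.
    """
    f = q * v
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(p)) / G)

# Idealized sounding with a low-level jet peaking near 900 hPa
p = np.linspace(30000.0, 100000.0, 15)                     # Pa
q = 0.012 * np.exp((p - p[-1]) / 25000.0)                  # moist near surface
v = 4.0 + 12.0 * np.exp(-(((p - 90000.0) / 5000.0) ** 2))  # LLJ peak
print(round(integrated_moisture_transport(q, v, p), 1))
```

    Because the jet maximum sits in the moist lower troposphere, the LLJ layer dominates the column total, which is why undercounting strong LLJs translates directly into underestimated transport.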

  15. A First Look at Surface Meteorology in the Arctic System Reanalysis

    NASA Astrophysics Data System (ADS)

    Slater, A. G.; Serreze, M. C.; Asr-Team, A.

    2010-12-01

The Arctic System Reanalysis (ASR) is a joint venture between several universities (Ohio State Uni., Uni. Colorado, Uni. Illinois UC, Uni. Alaska) and NCAR. It is a regional reanalysis that will span the period 2000-2010, possibly continuing into the future. Compared to current regional or global reanalyses it will have a spatial resolution twice that of prior efforts; the final product is expected to be an equal-area projection of 15 km grid boxes. The domain encompasses all the Arctic Ocean drainage areas. Several new reanalysis applications have been implemented, some of them Arctic-specific - for example, satellite-derived sea-ice age is translated into thickness, and MODIS surface albedo is to be ingested. A preliminary ASR run has been performed for the period June 2007 - December 2008 at a reduced resolution of 30 km. Here we make a comparison of all recent reanalysis products (NARR, MERRA, ERA-I, CFSRR) to both the ASR and observations at 350 surface stations in the Western Arctic, with a major focus on Alaska. An intercomparison of surface variables (which are perhaps the most used reanalysis data) has been undertaken, including temperature, humidity and solar radiation. Results indicate that the level of discrepancy between reanalysis data and observations is of similar magnitude to that between the reanalysis products themselves, possibly suggesting that we have reached the limit of representativeness when comparing grid boxes to point measurements.

  16. JRAero: the Japanese Reanalysis for Aerosol v1.0

    NASA Astrophysics Data System (ADS)

    Yumimoto, Keiya; Tanaka, Taichu Y.; Oshima, Naga; Maki, Takashi

    2017-09-01

A global aerosol reanalysis product named the Japanese Reanalysis for Aerosol (JRAero) was constructed by the Meteorological Research Institute (MRI) of the Japan Meteorological Agency. The reanalysis employs a global aerosol transport model developed by MRI and a two-dimensional variational data assimilation method. It assimilates maps of aerosol optical depth (AOD) from MODIS onboard the Terra and Aqua satellites every 6 h and has a TL159 horizontal resolution (approximately 1.1° × 1.1°). This paper describes the aerosol transport model, the data assimilation system, the observation data, and the setup of the reanalysis and examines its quality with AOD observations. Comparisons with the MODIS AODs used for the assimilation showed that the reanalysis agreed much better with the observations than the free run (without assimilation) of the aerosol model and corrected the under- and overestimation seen in the free run, thus confirming the accuracy of the data assimilation system. The reanalysis had a root mean square error (RMSE) of 0.05, a correlation coefficient (R) of 0.96, a mean fractional error (MFE) of 23.7 %, a mean fractional bias (MFB) of 2.8 %, and an index of agreement (IOA) of 0.98. The better agreement of the first guess, compared to the free run, indicates that aerosol fields obtained by the reanalysis can improve short-term forecasts. AOD fields from the reanalysis also agreed well with monthly averaged global AODs obtained by the Aerosol Robotic Network (AERONET) (RMSE = 0.08, R = 0.90, MFE = 28.1 %, MFB = 0.6 %, and IOA = 0.93). Site-by-site comparison showed that the reanalysis was considerably better than the free run; RMSE was less than 0.10 at 86.4 % of the 181 AERONET sites, R was greater than 0.90 at 40.7 % of the sites, and IOA was greater than 0.90 at 43.4 % of the sites. 
However, the reanalysis tended to have a negative bias at urban sites (in particular, megacities in industrializing countries) and a positive bias at mountain sites, possibly because of insufficient anthropogenic emissions data, the coarse model resolution, and the difference in representativeness between satellite and ground-based observations.
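
    The five agreement metrics quoted throughout this abstract are standard and easy to compute. The definitions below use the conventional mean-fractional forms and Willmott's index of agreement, which may differ in detail from the paper's exact conventions.

```python
import numpy as np

def aod_scores(model, obs):
    """RMSE, correlation, mean fractional error/bias (%), and Willmott's
    index of agreement between modeled and observed AOD."""
    model, obs = np.asarray(model, float), np.asarray(obs, float)
    rmse = float(np.sqrt(np.mean((model - obs) ** 2)))
    r = float(np.corrcoef(model, obs)[0, 1])
    half_sum = (model + obs) / 2.0
    mfe = float(100.0 * np.mean(np.abs(model - obs) / half_sum))
    mfb = float(100.0 * np.mean((model - obs) / half_sum))
    om = obs.mean()
    ioa = float(1.0 - np.sum((model - obs) ** 2)
                / np.sum((np.abs(model - om) + np.abs(obs - om)) ** 2))
    return {"RMSE": rmse, "R": r, "MFE": mfe, "MFB": mfb, "IOA": ioa}

# Perfect agreement gives RMSE 0, R and IOA 1, MFE and MFB 0;
# a uniform 10% overestimate gives a positive MFB
obs = np.array([0.1, 0.2, 0.4, 0.3, 0.6])
print(aod_scores(obs, obs)["IOA"], aod_scores(obs * 1.1, obs)["MFB"] > 0)
```

    MFE and MFB normalize by the mean of model and observation, which keeps the scores bounded for near-zero AOD values; IOA weights squared errors against the total deviation from the observed mean.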

  17. Evaluation of artificial time series microarray data for dynamic gene regulatory network inference.

    PubMed

    Xenitidis, P; Seimenis, I; Kakolyris, S; Adamopoulos, A

    2017-08-07

High-throughput technologies such as microarrays are widely used in the inference of gene regulatory networks (GRNs). We focused on time series data since we are interested in the dynamics of GRNs and the identification of dynamic networks. We evaluated the amount of information that exists in artificial time series microarray data and the ability of an inference process to produce accurate models based on them. We used dynamic artificial gene regulatory networks in order to create artificial microarray data. Key features that characterize microarray data, such as the time separation of directly triggered genes, the percentage of directly triggered genes and the triggering function type, were altered in order to reveal the limits that the nature of microarray data imposes on the inference process. We examined the effect of various factors on inference performance, such as network size, the presence of noise in microarray data, and network sparseness. We used a system theory approach and examined the relationship between the pole placement of the inferred system and the inference performance. We also examined the relationship between the inference performance in the time domain and the true system parameter identification. Simulation results indicated that time separation and the percentage of directly triggered genes are crucial factors. Network sparseness, the triggering function type and noise in the input data also affect inference performance. When two factors were varied simultaneously, variation of one parameter was found to significantly affect the dynamic response of the other. Crucial factors were also examined using a real GRN, and the acquired results confirmed the simulation findings with artificial data. Different initial conditions were also used as an alternative triggering approach. The relevant results confirmed that the number of datasets constitutes the most significant parameter with regard to inference performance. 
Copyright © 2017 Elsevier Ltd. All rights reserved.

  18. On the improvement of wave and storm surge hindcasts by downscaled atmospheric forcing: application to historical storms

    NASA Astrophysics Data System (ADS)

    Bresson, Émilie; Arbogast, Philippe; Aouf, Lotfi; Paradis, Denis; Kortcheva, Anna; Bogatchev, Andrey; Galabov, Vasko; Dimitrova, Marieta; Morvan, Guillaume; Ohl, Patrick; Tsenova, Boryana; Rabier, Florence

    2018-04-01

    Winds, waves and storm surges can inflict severe damage in coastal areas. In order to improve preparedness for such events, a better understanding of storm-induced coastal flooding episodes is necessary. To this end, this paper highlights the use of atmospheric downscaling techniques in order to improve wave and storm surge hindcasts. The downscaling techniques used here are based on existing European Centre for Medium-Range Weather Forecasts reanalyses (ERA-20C, ERA-40 and ERA-Interim). The results show that the 10 km resolution data forcing provided by a downscaled atmospheric model gives a better wave and surge hindcast compared to using data directly from the reanalysis. Furthermore, the analysis of the most extreme mid-latitude cyclones indicates that a four-dimensional blending approach improves the whole process, as it assimilates more small-scale processes in the initial conditions. Our approach has been successfully applied to ERA-20C (the 20th century reanalysis).

  19. An Update on the VAMOS Extremes Working Group Activities

    NASA Technical Reports Server (NTRS)

    Schubert, Siegfried; Cavalcanti, Iracema

    2011-01-01

    We review here the progress of the Variability of the American MOnsoon Systems (VAMOS) extremes working group since it was formed in February of 2010. The goals of the working group are to 1) develop an atlas of warm-season extremes over the Americas, 2) evaluate existing and planned simulations, and 3) suggest new model runs to address mechanisms and predictability of extremes. Substantial progress has been made in the development of an extremes atlas based on gridded observations and several reanalysis products, including the Modern Era Retrospective-Analysis for Research and Applications (MERRA) and the Climate Forecast System Reanalysis (CFSR). The status of the atlas, remaining issues and plans for its expansion to include model data will be discussed. This includes the possibility of adding a companion atlas based on station observations, built with the software developed under the World Climate Research Programme (WCRP) Expert Team on Climate Change Detection and Indices (ETCCDI) activity. We will also review progress on relevant research and plans for the use and validation of the atlas results.

  20. Quantifying How Observations Inform a Numerical Reanalysis of Hawaii

    NASA Astrophysics Data System (ADS)

    Powell, B. S.

    2017-11-01

    When assimilating observations into a model via state-estimation, it is possible to quantify how each observation changes the modeled estimate of a chosen oceanic metric. Using an existing 2 year reanalysis of Hawaii that includes more than 31 million observations from satellites, ships, SeaGliders, and autonomous floats, I assess which observations most improve the estimates of the transport and eddy kinetic energy. When the SeaGliders were in the water, they comprised less than 2.5% of the data, but accounted for 23% of the transport adjustment. Because the model physics constrains advanced state-estimation, the prescribed covariances are propagated in time to identify observation-model covariance. I find that observations that constrain the isopycnal tilt across the transport section provide the greatest impact in the analysis. In the case of eddy kinetic energy, observations that constrain the surface-driven upper ocean have more impact. This information can help to identify optimal sampling strategies to improve both state-estimates and forecasts.

  1. Re-analysis of public genetic data reveals a rare X-chromosomal variant associated with type 2 diabetes.

    PubMed

    Bonàs-Guarch, Sílvia; Guindo-Martínez, Marta; Miguel-Escalada, Irene; Grarup, Niels; Sebastian, David; Rodriguez-Fos, Elias; Sánchez, Friman; Planas-Fèlix, Mercè; Cortes-Sánchez, Paula; González, Santi; Timshel, Pascal; Pers, Tune H; Morgan, Claire C; Moran, Ignasi; Atla, Goutham; González, Juan R; Puiggros, Montserrat; Martí, Jonathan; Andersson, Ehm A; Díaz, Carlos; Badia, Rosa M; Udler, Miriam; Leong, Aaron; Kaur, Varindepal; Flannick, Jason; Jørgensen, Torben; Linneberg, Allan; Jørgensen, Marit E; Witte, Daniel R; Christensen, Cramer; Brandslund, Ivan; Appel, Emil V; Scott, Robert A; Luan, Jian'an; Langenberg, Claudia; Wareham, Nicholas J; Pedersen, Oluf; Zorzano, Antonio; Florez, Jose C; Hansen, Torben; Ferrer, Jorge; Mercader, Josep Maria; Torrents, David

    2018-01-22

    The reanalysis of existing GWAS data represents a powerful and cost-effective opportunity to gain insights into the genetics of complex diseases. By reanalyzing publicly available type 2 diabetes (T2D) genome-wide association study (GWAS) data for 70,127 subjects, we identify seven novel associated regions: five driven by common variants (in the LYPLAL1, NEUROG3, CAMKK2, ABO, and GIP genes), one by a low-frequency variant (EHMT2), and one driven by a rare variant in chromosome Xq23, rs146662057, associated with a twofold increased risk for T2D in males. rs146662057 is located within an active enhancer associated with the expression of the Angiotensin II Receptor type 2 gene (AGTR2), a modulator of insulin sensitivity, and exhibits allele-specific activity in muscle cells. Beyond providing insights into the genetics and pathophysiology of T2D, these results also underscore the value of reanalyzing publicly available data using novel genetic resources and analytical approaches.

  2. Evaluation of MERRAero (MERRA Aerosol Reanalysis)

    NASA Technical Reports Server (NTRS)

    Buchard, Virginie; da Silva, Arlindo; Randles, Cynthia; Colarco, Peter; Darmenov, Anton; Govindaraju, Ravi

    2016-01-01

    This presentation focuses on the MERRA Aerosol Reanalysis (MERRAero), the first aerosol reanalysis produced at the GMAO. It includes an overview of MERRAero; the evaluation of MERRAero absorption and of MERRAero surface PM2.5 will also be discussed.

  3. Optimized Probe Masking for Comparative Transcriptomics of Closely Related Species

    PubMed Central

    Poeschl, Yvonne; Delker, Carolin; Trenner, Jana; Ullrich, Kristian Karsten; Quint, Marcel; Grosse, Ivo

    2013-01-01

    Microarrays are commonly applied to study the transcriptome of specific species. However, many available microarrays are restricted to model organisms, and the design of custom microarrays for other species is often not feasible. Hence, transcriptomics approaches for non-model organisms, as well as comparative transcriptomics studies among two or more species, often make use of cost-intensive RNAseq studies or, alternatively, hybridize transcripts of a query species to a microarray of a closely related species. When analyzing these cross-species microarray expression data, differences in the transcriptome of the query species can cause problems, such as the following: (i) lower hybridization accuracy of probes due to mismatches or deletions, (ii) probes binding transcripts of multiple different genes, and (iii) probes binding transcripts of non-orthologous genes. So far, methods addressing (i) exist, but these neglect (ii) and (iii). Here, we propose an approach for comparative transcriptomics addressing problems (i) to (iii), which retains only transcript-specific probes binding transcripts of orthologous genes. We apply this approach to an Arabidopsis lyrata expression data set measured on a microarray designed for Arabidopsis thaliana, and compare it to two alternative approaches, a sequence-based approach and a genomic DNA hybridization-based approach. We investigate the number of retained probe sets, and we validate the resulting expression responses by qRT-PCR. We find that the proposed approach combines the stringency and accuracy of the sequence-based approach while allowing the expression analysis of many more genes. As an added benefit, the proposed approach requires probes to detect transcripts of orthologous genes only, which provides a superior basis for biological interpretation of the measured expression responses. PMID:24260119

  4. TIPMaP: a web server to establish transcript isoform profiles from reliable microarray probes.

    PubMed

    Chitturi, Neelima; Balagannavar, Govindkumar; Chandrashekar, Darshan S; Abinaya, Sadashivam; Srini, Vasan S; Acharya, Kshitish K

    2013-12-27

    Standard 3' Affymetrix gene expression arrays have contributed a significantly higher volume of existing gene expression data than other microarray platforms. These arrays were designed to identify differentially expressed genes, but not their alternatively spliced transcript forms. No existing resource can identify the expression patterns of specific mRNA forms from these microarray data, even though doing so is possible. We report a web server for expression profiling of alternatively spliced transcripts using microarray data sets from 31 standard 3' Affymetrix arrays for the human, mouse and rat species. The tool has been experimentally validated for mRNAs transcribed or not detected in a human disease condition (non-obstructive azoospermia, a male infertility condition). About 4000 gene expression datasets were downloaded from a public repository. 'Good probes' with complete coverage of and identity to the latest reference transcript sequences were first identified. Using them, 'transcript-specific probe clusters' were derived for each platform and used to identify the expression status of possible transcripts. The web server can lead the user to datasets corresponding to specific tissues and conditions via identifiers of the microarray studies or hybridizations, keywords, official gene symbols or reference transcript identifiers. It can identify, in the tissues and conditions of interest, about 40% of known transcripts as 'transcribed', 'not-detected' or 'differentially regulated'. Corresponding additional information for probes, genes, transcripts and proteins can be viewed too. We identified the expression of transcripts in a specific clinical condition and validated a few of these transcripts by experiments (using reverse transcription followed by polymerase chain reaction). The experimental observations agreed with the web server results far more often than they contradicted them. The tool is accessible at http://resource.ibab.ac.in/TIPMaP.
The newly developed online tool forms a reliable means for the identification of alternatively spliced transcript isoforms that may be differentially expressed in various tissues, cell types or physiological conditions. Thus, by making better use of existing data, TIPMaP avoids dependence on precious tissue samples in experiments aimed at establishing expression profiles of alternative splice forms, at least in some cases.

  5. sigReannot: an oligo-set re-annotation pipeline based on similarities with the Ensembl transcripts and Unigene clusters.

    PubMed

    Casel, Pierrot; Moreews, François; Lagarrigue, Sandrine; Klopp, Christophe

    2009-07-16

    Microarrays are a powerful technology enabling tens of thousands of genes to be monitored in a single experiment. Most microarrays now use oligo-sets. The design of the oligo-nucleotides is time consuming and error prone. Genome-wide microarray oligo-sets are designed using as large a set of transcripts as possible in order to monitor as many genes as possible. Depending on the genome sequencing state and on the assembly state, knowledge of the existing transcripts can vary greatly, and it evolves with the different genome builds and gene builds. Once the design is done, microarrays are often used for several years. The biologists working in EADGENE expressed the need for up-to-date annotation files for the oligo-sets they share, including information about the orthologous genes of model species, the Gene Ontology, the corresponding pathways and the chromosomal location. The results of SigReannot on a chicken micro-array used in the EADGENE project, compared to the initial annotations, show that 23% of the oligo-nucleotide gene annotations were not confirmed, 2% were modified and 1% were added. The interest of this up-to-date annotation procedure is demonstrated through the analysis of previously published real data. SigReannot uses the oligo-nucleotide design procedure criteria to validate the probe-gene link and the Ensembl transcripts as the reference for annotation. It therefore produces a high-quality annotation based on reference gene sets.

  6. Extending Climate Analytics as a Service to the Earth System Grid Federation Progress Report on the Reanalysis Ensemble Service

    NASA Astrophysics Data System (ADS)

    Tamkin, G.; Schnase, J. L.; Duffy, D.; Li, J.; Strong, S.; Thompson, J. H.

    2016-12-01

    We are extending climate analytics-as-a-service, including: (1) a high-performance Virtual Real-Time Analytics Testbed supporting six major reanalysis data sets using advanced technologies like Cloudera Impala-based SQL and Hadoop-based MapReduce analytics over native NetCDF files; (2) a Reanalysis Ensemble Service (RES) that offers a basic set of commonly used operations over the reanalysis collections, accessible through NASA's climate data analytics Web services and our client-side Climate Data Services Python library, CDSlib; and (3) an Open Geospatial Consortium (OGC) WPS-compliant Web service interface to CDSlib to accommodate ESGF's Web service endpoints. This presentation will report on the overall progress of this effort, with special attention to recent enhancements that have been made to the Reanalysis Ensemble Service, including the following: - A CDSlib Python library that supports full temporal, spatial, and grid-based resolution services - A new reanalysis collections reference model to enable operator design and implementation - An enhanced library of sample queries to demonstrate and develop use case scenarios - Extended operators that enable single- and multiple-reanalysis area average, vertical average, re-gridding, and trend, climatology, and anomaly computations - Full support for the MERRA-2 reanalysis and the initial integration of two additional reanalyses - A prototype Jupyter notebook-based distribution mechanism that combines CDSlib documentation with interactive use case scenarios and personalized project management - Prototyped uncertainty quantification services that combine ensemble products with comparative observational products - Convenient, one-stop shopping for commonly used data products from multiple reanalyses, including basic subsetting and arithmetic operations over the data and extractions of trends, climatologies, and anomalies - The ability to compute and visualize multiple-reanalysis intercomparisons
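    Of the operators listed in this abstract, the area average is the simplest to illustrate. A minimal sketch, assuming a regular latitude-longitude grid with cos-latitude weighting (the RES implementation itself is not described in the abstract):

```python
import numpy as np

def area_average(field, lats):
    """Area-weighted average of a 2-D (lat, lon) field.

    Grid cells shrink toward the poles on a regular lat-lon grid, so
    each latitude row is weighted by cos(latitude) before averaging.
    """
    weights = np.cos(np.deg2rad(lats))      # one weight per latitude row
    row_means = field.mean(axis=1)          # average over longitude first
    return float(np.sum(row_means * weights) / np.sum(weights))

# A uniform field must average to its constant value (up to rounding).
lats = np.linspace(-89.5, 89.5, 180)
field = np.full((180, 360), 288.0)          # e.g. a constant temperature grid
avg = area_average(field, lats)
```

    Trend, climatology, and anomaly operators build on the same idea, reducing over time instead of (or in addition to) space.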

  7. Atmospheric response to Saharan dust deduced from ECMWF reanalysis (ERA) temperature increments

    NASA Astrophysics Data System (ADS)

    Kishcha, P.; Alpert, P.; Barkan, J.; Kirchner, I.; Machenhauer, B.

    2003-09-01

    This study focuses on the atmospheric temperature response to dust deduced from a new source of data: the European Reanalysis (ERA) increments. These increments are the systematic errors of global climate models, generated in the reanalysis procedure. The model errors result not only from the lack of desert dust but also from a complex combination of many kinds of model errors. Over the Sahara desert, the lack of the dust radiative effect is believed to be a predominant model defect which should significantly affect the increments. This dust effect was examined by considering the correlation between the increments and remotely sensed dust. Comparisons were made between April temporal variations of the ERA analysis increments and the variations of the Total Ozone Mapping Spectrometer aerosol index (AI) between 1979 and 1993. A distinctive structure was identified in the distribution of correlation, composed of three nested areas with high positive correlation (>0.5), low correlation and high negative correlation (<-0.5). The innermost positive correlation area (PCA) is a large area near the center of the Sahara desert. For some local maxima inside this area the correlation even exceeds 0.8. The outermost negative correlation area (NCA) is not uniform. It consists of some areas over the eastern and western parts of North Africa with a relatively small amount of dust. Inside those areas both positive and negative high correlations exist at pressure levels ranging from 850 to 700 hPa, with the peak values near 775 hPa. Dust-forced heating (cooling) inside the PCA (NCA) is accompanied by changes in the static instability of the atmosphere above the dust layer. The reanalysis data of the European Centre for Medium-Range Weather Forecasts (ECMWF) suggest that the PCA (NCA) corresponds mainly to anticyclonic (cyclonic) flow, negative (positive) vorticity and downward (upward) airflow.
These findings are associated with the interaction between dust-forced heating/cooling and atmospheric circulation. This paper contributes to a better understanding of dust radiative processes missed in the model.

  8. Connecting Observations and Reanalysis of the MJO with Theory

    NASA Astrophysics Data System (ADS)

    Powell, S. W.

    2017-12-01

    Over the past few years, refined theories have been advanced to explain the onset and/or propagation of the Madden-Julian Oscillation over the tropical warm pool. For example, Adames and Kim (2016) built on Sobel and Maloney (2012, 2013) to describe the MJO as a dispersive moisture wave whose instability mechanism is a radiative-convective instability supported by the anvils of large mesoscale systems. Wang and Chen (2016) describe a similar frictionally coupled moisture mode that captures many basic features of the canonically observed MJO. Arnold and Randall (2015) hypothesize that the MJO might be described as self-aggregation of convection over the Indian Ocean. Fuchs and Raymond (2017) describe the MJO as a first baroclinic dispersive mode in a simplified model with a linear WISHE instability that shows decreased propagation speeds at lower wavelengths. Not all of these theories can be correct, and quite possibly none of them is fully correct. Intelligent use of observations and reanalyses of past MJO events can help guide the development of MJO theory. For example, Powell (2017) shows that in the MERRA-2 reanalysis, the MJO propagates as a convectively coupled Kelvin wave over the Western Hemisphere, then transitions abruptly into a slower-moving mode over the Indian Ocean. A complete MJO theory must account for both forms as, and when, the MJO circumnavigates. Observations (like TRMM and GPM data) and reanalysis can reveal the relative roles of cloud-scale processes and large-scale free-tropospheric horizontal advection in "pre-moistening" the troposphere at the location of MJO initiation or where subsequent propagation of an existing MJO occurs. This can, for example, help validate or refute aspects of moisture mode theory that require large-scale dynamics to moisten an area ahead of an active envelope of MJO-related convection before the MJO can propagate eastward.
Radar and satellite observations might yield some insight into whether convective self-aggregation is a real phenomenon or if upscale growth of cloud elements into mesoscale systems is actually more responsible for the apparent large-scale organization of convection in the tropics, let alone within the MJO. I will present a few such examples of how careful exploration of observations and reanalysis might help guide MJO theory during the next several years.

  9. Towards a reanalysis covering the last millennia

    NASA Astrophysics Data System (ADS)

    Goosse, H.; Dubinkina, S.

    2012-04-01

    Reanalyses, extending over several decades or longer, provide a comprehensive record of the recent variability and changes of the climate system by objectively combining observations and a numerical model. They are now considered an essential source of information on the state of the ocean and the atmosphere, used, among many other applications, to study the dynamics of the system and the interactions between its different components, and to analyze the characteristics of recent changes as well as interannual climate variability. However, in order to study processes with characteristic periods from some decades to several centuries, the period covered by the presently available reanalyses is too short. It is therefore necessary to use paleoclimatic proxy data, which provide longer time series, in order to extend the period covered by reanalyses. Those paleoclimatic data, however, are much less numerous, noisier, and have a lower spatial and temporal resolution than the data available for the reanalyses over the 20th century. In order to obtain reanalyses covering the last millennia, several steps are thus still required. It is first necessary to develop data assimilation methods adapted to this specific problem. Some data syntheses for this period are available, but a reanalysis requires a comprehensive evaluation of the quality of existing data, in all regions and for all proxies. Reanalyses are very demanding in computer time; the model selected in the procedure must thus be efficient enough but should also include the right dynamics in order to reproduce the teleconnections between areas where data are available and to extrapolate the information towards regions with no data. Finally, proxies and models do not provide the same variables, and comparing them requires a sophisticated approach, ideally implying the inclusion of forward proxy models in the data assimilation system.
Here, we propose to review the present status of the field and to identify how significant steps can be made on the way towards a reanalysis over the last millennia. This is illustrated by examples from recent studies in which data assimilation is applied over the last millennia using reconstructions at continental and nearly hemispheric scale, as well as grid-scale temperatures derived directly from the local calibration of proxies.

  10. Method and Early Results of Applying the Global Land Data Assimilation System (GLDAS) in the Third Global Reanalysis of NCEP

    NASA Astrophysics Data System (ADS)

    Meng, J.; Mitchell, K.; Wei, H.; Yang, R.; Kumar, S.; Geiger, J.; Xie, P.

    2008-05-01

    Over the past several years, the Environmental Modeling Center (EMC) of the National Centers for Environmental Prediction (NCEP) of the U.S. National Weather Service has developed a Global Land Data Assimilation System (GLDAS). For its computational infrastructure, the GLDAS applies the NASA Land Information System (LIS), developed by the Hydrological Science Branch of NASA Goddard Space Flight Center. The land model utilized in the NCEP GLDAS is the NCEP Noah Land Surface Model (Noah LSM). This presentation will 1) describe how the GLDAS component has been included in the development of NCEP's third global reanalysis (with special attention to the input sources of global precipitation), and 2) present results from the GLDAS component of pilot tests of the new NCEP global reanalysis. Unlike NCEP's past two global reanalysis projects, this new NCEP global reanalysis includes both a global land data assimilation system (GLDAS) and a global ocean data assimilation system (GODAS). The new global reanalysis will span 30 years (1979-2008) and will include a companion realtime operational component. The atmospheric, ocean, and land states of this global reanalysis will provide the initial conditions for NCEP's 3rd-generation global coupled Climate Forecast System (CFS). NCEP is now preparing to launch a 28-year seasonal reforecast project with its new CFS, to provide the reforecast foundation for operational NCEP seasonal climate forecasts using the new CFS. Together, the new global reanalysis and the companion CFS reforecasts constitute what NCEP calls the Climate Forecast System Reanalysis and Reforecast (CFSRR) project. Compared to the previous two generations of NCEP global reanalysis, the hallmark of the GLDAS component of CFSRR is the use of global analyses of observed precipitation to drive the land surface component of the reanalysis (rather than the typical reanalysis approach of using precipitation from the assimilating background atmospheric model).
Specifically, the GLDAS merges two global analyses of observed precipitation produced by the Climate Prediction Center (CPC) of NCEP, as follows: 1) a new CPC daily gauge-only land-only global precipitation analysis at 0.5-degree resolution and 2) the well-known CPC CMAP global 2.0 x 2.5 degree 5-day precipitation analysis, which utilizes satellite estimates of precipitation, as well as some gauge observations. The presentation will describe how these two analyses are merged with latitude-dependent weights that favor the gauge-only analysis in mid-latitudes and the satellite-dominated CMAP analysis in tropical latitudes. Finally, we will show some impacts of using GLDAS to initialize the land states of seasonal CFS reforecasts, versus using the previous generation of NCEP global reanalysis as the source for CFS initial land states.
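    The latitude-dependent merging described above can be sketched with a smooth weighting function. This is purely illustrative: it assumes a logistic ramp with hypothetical parameters lat0 and width, since the actual CPC/GLDAS weights are not given in the abstract.

```python
import numpy as np

def blend_precip(gauge, cmap, lats, lat0=30.0, width=10.0):
    """Blend two precipitation analyses with latitude-dependent weights.

    The weight on the gauge-only analysis ramps smoothly from ~0 in the
    tropics (where the satellite-based CMAP analysis dominates) to ~1 in
    mid-latitudes. lat0 and width are hypothetical tuning parameters.
    """
    w_gauge = 1.0 / (1.0 + np.exp(-(np.abs(lats) - lat0) / width))
    return w_gauge[:, None] * gauge + (1.0 - w_gauge[:, None]) * cmap

# Two toy analyses on a 2-row (equator, 60 deg latitude) x 4-column grid.
lats = np.array([0.0, 60.0])
gauge = np.ones((2, 4))   # gauge-only analysis: 1 everywhere
cmap = np.zeros((2, 4))   # CMAP analysis: 0 everywhere
blended = blend_precip(gauge, cmap, lats)
# Row 0 (equator) stays close to the CMAP value, row 1 close to the gauge value.
```

    In practice the resolutions also differ (0.5-degree gauge analysis versus 2.0 x 2.5-degree CMAP), so a real merge would regrid one analysis onto the other's grid before weighting.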

  11. High-Resolution Regional Reanalysis in China: Evaluation of 1 Year Period Experiments

    NASA Astrophysics Data System (ADS)

    Zhang, Qi; Pan, Yinong; Wang, Shuyu; Xu, Jianjun; Tang, Jianping

    2017-10-01

    Globally, reanalysis data sets are widely used in assessing climate change, validating numerical models, and understanding the interactions between the components of the climate system. However, due to their relatively coarse resolution, most global reanalysis data sets are not suitable for direct application at local and regional scales, given their inadequate descriptions of mesoscale systems and extreme climatic events such as mesoscale convective systems, squall lines, tropical cyclones, regional droughts, and heat waves. In this study, using the Gridpoint Statistical Interpolation data assimilation system and the Weather Research and Forecasting (WRF) mesoscale atmospheric model, we build a regional reanalysis system. This is a preliminary and first experimental attempt to construct a high-resolution reanalysis for the Chinese mainland. Four regional test bed data sets are generated for the year 2013 via three widely used methods (classical dynamical downscaling, spectral nudging, and data assimilation) and a hybrid method with data assimilation coupled with spectral nudging. Temperature at 2 m, precipitation, and upper-level atmospheric variables are evaluated by comparing against observations for the year-long tests. It can be concluded that the regional reanalyses with the assimilation and nudging methods reproduce the atmospheric variables from the surface to upper levels, and regional extreme events such as heat waves, better than classical dynamical downscaling. Compared to the ERA-Interim global reanalysis, the hybrid nudging method performs slightly better in reproducing upper-level temperature and low-level moisture over China, which improves the regional reanalysis data quality.

  12. Performance evaluation of CESM in simulating the dust cycle

    NASA Astrophysics Data System (ADS)

    Parajuli, S. P.; Yang, Z. L.; Kocurek, G.; Lawrence, D. M.

    2014-12-01

    Mineral dust in the atmosphere has implications for Earth's radiation budget, biogeochemical cycles, hydrological cycles, human health and visibility. Mineral dust is injected into the atmosphere during dust storms, when the surface winds are sufficiently strong and the land surface conditions are favorable. Dust storms are very common in specific regions of the world, including the Middle East and North Africa (MENA) region, which contains more than 50% of the global dust sources. In this work, we present simulations of the dust cycle under the framework of CESM1.2.2 and evaluate how well the model captures the spatio-temporal characteristics of dust sources, transport and deposition at the global scale, especially in dust source regions. We conducted our simulations using two existing erodibility maps (geomorphic and topographic) and a new erodibility map based on the correlation between observed wind and dust. We compare the simulated results with MODIS satellite data, MACC reanalysis data, and AERONET station data. Comparison with the MODIS satellite and MACC reanalysis data shows that all three erodibility maps generally reproduce the spatio-temporal characteristics of dust optical depth globally. However, comparison with AERONET station data shows that the simulated dust optical depth is generally overestimated for all erodibility maps. Results vary greatly by region and by the scale of the observational data. Our results also show that the simulations forced by reanalysis meteorology capture the overall dust cycle more realistically than the simulations using online meteorology.

  13. BioconductorBuntu: a Linux distribution that implements a web-based DNA microarray analysis server.

    PubMed

    Geeleher, Paul; Morris, Dermot; Hinde, John P; Golden, Aaron

    2009-06-01

    BioconductorBuntu is a custom distribution of Ubuntu Linux that automatically installs a server-side microarray processing environment, providing a user-friendly web-based GUI to many of the tools developed by the Bioconductor Project, accessible locally or across a network. The system is installed by booting off a CD image or by using a Debian package provided to upgrade an existing Ubuntu installation. In its current version, several microarray analysis pipelines are supported, including oligonucleotide and dual- or single-dye experiments, with post-processing by Gene Set Enrichment Analysis. BioconductorBuntu is designed to be extensible via server-side integration of further relevant Bioconductor modules as required, facilitated by its straightforward underlying Python-based infrastructure. BioconductorBuntu offers an ideal environment for the development of processing procedures to facilitate the analysis of next-generation sequencing datasets. BioconductorBuntu is available for download under a Creative Commons license, along with additional documentation and a tutorial, from http://bioinf.nuigalway.ie.

  14. Sensitivity of Simulated Global Ocean Carbon Flux Estimates to Forcing by Reanalysis Products

    NASA Technical Reports Server (NTRS)

    Gregg, Watson W.; Casey, Nancy W.; Rousseaux, Cecile S.

    2015-01-01

    Reanalysis products from MERRA, NCEP2, NCEP1, and ECMWF were used to force an established ocean biogeochemical model to estimate air-sea carbon fluxes (FCO2) and the partial pressure of carbon dioxide (pCO2) in the global oceans. Global air-sea carbon fluxes and pCO2 were relatively insensitive to the choice of forcing reanalysis. All global FCO2 estimates from the model forced by the four different reanalyses were within 20% of in situ estimates (MERRA and NCEP1 were within 7%), and all models exhibited statistically significant positive correlations with in situ estimates across the 12 major oceanographic basins. Global pCO2 estimates were within 1% of in situ estimates, with ECMWF the largest outlier at 0.6%. Basin correlations were similar to those for FCO2. There were, however, substantial departures among basin estimates from the different reanalysis forcings. The high latitudes and tropics had the largest ranges in estimated fluxes among the reanalyses. Regional pCO2 differences among the reanalysis forcings were muted relative to the FCO2 results. No individual reanalysis was uniformly better or worse in the major oceanographic basins. The results provide information on the characterization of uncertainty in ocean carbon models due to the choice of reanalysis forcing.

  15. Evaluating reproducibility of differential expression discoveries in microarray studies by considering correlated molecular changes.

    PubMed

    Zhang, Min; Zhang, Lin; Zou, Jinfeng; Yao, Chen; Xiao, Hui; Liu, Qing; Wang, Jing; Wang, Dong; Wang, Chenguang; Guo, Zheng

    2009-07-01

    According to current consistency metrics such as the percentage of overlapping genes (POG), lists of differentially expressed genes (DEGs) detected from different microarray studies of a complex disease are often highly inconsistent. This irreproducibility problem also exists in other high-throughput post-genomic areas such as proteomics and metabolomics. A complex disease is often characterized by many coordinated molecular changes, which should be considered when evaluating the reproducibility of discovery lists from different studies. We proposed the metrics percentage of overlapping genes-related (POGR) and normalized POGR (nPOGR) to evaluate the consistency between two DEG lists for a complex disease, considering correlated molecular changes rather than only counting gene overlaps between the lists. Based on microarray datasets of three diseases, we showed that though the POG scores for DEG lists from different studies of each disease are extremely low, the POGR and nPOGR scores can be rather high, suggesting that the apparently inconsistent DEG lists may be highly reproducible in the sense that they are actually significantly correlated. Assessing different discovery results for a disease by the POGR and nPOGR scores will thus reduce the apparent uncertainty of the microarray studies. The proposed metrics should also be applicable in many other high-throughput post-genomic areas.
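    To make the distinction concrete, here is a toy sketch of a plain overlap score alongside a POGR-like extension that also credits correlated genes. The exact published formulas and the nPOGR normalization differ and are not reproduced here; the correlation map below is a hypothetical input that would be derived from the expression data.

```python
def pog(list_a, list_b):
    """Percentage of overlapping genes: shared genes over the shorter list length."""
    a, b = set(list_a), set(list_b)
    return len(a & b) / min(len(a), len(b))

def pogr_like(list_a, list_b, correlated):
    """POGR-like score (illustrative, not the exact published formula):
    a gene in list A also counts as a hit if `correlated` maps it to any
    gene in list B. `correlated` is a hypothetical dict mapping a gene to
    the set of genes it is significantly correlated with."""
    b = set(list_b)
    hits = sum(1 for g in list_a if g in b or correlated.get(g, set()) & b)
    return hits / len(list_a)

A = ["g1", "g2", "g3", "g4"]
B = ["g3", "g5", "g6", "g7"]
corr = {"g1": {"g5"}, "g2": {"g9"}}
print(pog(A, B))            # 0.25: only g3 is shared
print(pogr_like(A, B, corr))  # 0.5: g3 is shared, and g1 correlates with g5
```

    The example shows the abstract's point in miniature: two lists with almost no direct overlap can still score highly once correlated molecular changes are counted.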

  16. Droplet Microarray Based on Superhydrophobic-Superhydrophilic Patterns for Single Cell Analysis.

    PubMed

    Jogia, Gabriella E; Tronser, Tina; Popova, Anna A; Levkin, Pavel A

    2016-12-09

    Single-cell analysis provides fundamental information on individual cell responses to different environmental cues and is of growing interest in cancer and stem cell research. However, existing methods still face challenges in performing such analysis in a high-throughput yet cost-effective manner. Here we established the Droplet Microarray (DMA) as a miniaturized screening platform for high-throughput single-cell analysis. Using the method of limiting dilution and varying cell density and seeding time, we optimized the distribution of single cells on the DMA. We established culturing conditions for single cells in individual droplets on the DMA, achieving survival of nearly 100% of single cells and a doubling time comparable to that of cells cultured in bulk using conventional methods. Our results demonstrate that the DMA is a suitable platform for single-cell analysis, with a number of advantages over existing technologies: it allows treatment, staining and spot-to-spot analysis of single cells over time using conventional analysis methods such as microscopy.

  17. Gene regulatory network identification from the yeast cell cycle based on a neuro-fuzzy system.

    PubMed

    Wang, B H; Lim, J W; Lim, J S

    2016-08-30

    Many methods exist for reconstructing gene regulatory networks (GRNs). In this paper, we propose a method based on an advanced neuro-fuzzy system for reconstructing gene regulatory networks from microarray time-series data. The approach uses a neural network with a weighted fuzzy function to model the relationships between genes; the fuzzy rules that determine the regulators of genes are greatly simplified by this method. Additionally, we propose a regulator selection procedure that extracts the exact dynamic relationships between genes using the information obtained from the weighted fuzzy function. Time-series-related features are extracted from the original data to exploit the characteristics of temporal data that are useful for accurate GRN reconstruction. The yeast cell cycle microarray dataset was used for our study. We measured the mean squared prediction error to assess the efficiency of the proposed approach and evaluated its accuracy in terms of precision, sensitivity, and F-score. The proposed method outperformed the other existing approaches.

  18. Overcoming confounded controls in the analysis of gene expression data from microarray experiments.

    PubMed

    Bhattacharya, Soumyaroop; Long, Dang; Lyons-Weiler, James

    2003-01-01

    A potential limitation of data from microarray experiments arises when improper control samples are used. In cancer research, comparing tumour expression profiles to those from normal samples is challenging due to tissue heterogeneity (mixed cell populations). A specific example exists in a published colon cancer dataset, in which tissue heterogeneity was reported among the normal samples. In this paper, we show how to overcome or avoid the problem of using normal samples that do not derive from the same tissue of origin as the tumour. We advocate an exploratory unsupervised bootstrap analysis that can reveal unexpected and undesired, but strongly supported, clusters of samples that reflect tissue differences rather than tumour-versus-normal differences. All of the algorithms used in the analysis, including the maximum difference subset algorithm, unsupervised bootstrap analysis, the pooled-variance t-test for finding differentially expressed genes and the jackknife for reducing false positives, are incorporated into our online Gene Expression Data Analyzer (http://bioinformatics.upmc.edu/GE2/GEDA.html).
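    The pooled-variance t-test mentioned in the abstract is a standard statistic; a minimal sketch for one gene's expression values across two groups (an illustration, not the GEDA implementation) might look like:

```python
import math

def pooled_t(x, y):
    """Two-sample t statistic under the equal-variance assumption:
    t = (mean_x - mean_y) / sqrt(s_p^2 * (1/n_x + 1/n_y)),
    where s_p^2 pools the two sample variances."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    vx = sum((v - mx) ** 2 for v in x) / (nx - 1)
    vy = sum((v - my) ** 2 for v in y) / (ny - 1)
    sp2 = ((nx - 1) * vx + (ny - 1) * vy) / (nx + ny - 2)
    return (mx - my) / math.sqrt(sp2 * (1 / nx + 1 / ny))
```

    In practice one would compute this per gene and rank by |t|; the jackknife step in the abstract then repeats the selection with samples left out to flag unstable genes.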

  19. Probabilistic segmentation and intensity estimation for microarray images.

    PubMed

    Gottardo, Raphael; Besag, Julian; Stephens, Matthew; Murua, Alejandro

    2006-01-01

    We describe a probabilistic approach to simultaneous image segmentation and intensity estimation for complementary DNA microarray experiments. The approach overcomes several limitations of existing methods. In particular, it (a) uses a flexible Markov random field approach to segmentation that allows for a wider range of spot shapes than existing methods, including relatively common 'doughnut-shaped' spots; (b) models the image directly as background plus hybridization intensity, and estimates the two quantities simultaneously, avoiding the common logical error that estimates of foreground may be less than those of the corresponding background if the two are estimated separately; and (c) uses a probabilistic modeling approach to simultaneously perform segmentation and intensity estimation, and to compute spot quality measures. We describe two approaches to parameter estimation: a fast algorithm, based on the expectation-maximization and the iterated conditional modes algorithms, and a fully Bayesian framework. These approaches produce comparable results, and both appear to offer some advantages over other methods. We use an HIV experiment to compare our approach to two commercial software products: Spot and Arrayvision.

  20. Evaluation and inter-comparison of modern day reanalysis datasets over Africa and the Middle East

    NASA Astrophysics Data System (ADS)

    Shukla, S.; Arsenault, K. R.; Hobbins, M.; Peters-Lidard, C. D.; Verdin, J. P.

    2015-12-01

    Reanalysis datasets are potentially very valuable for otherwise data-sparse regions such as Africa and the Middle East. They are potentially useful for long-term climate and hydrologic analyses and, given their availability in real time, they are particularly attractive for real-time hydrologic monitoring purposes (e.g. to monitor flood and drought events). Generally, in data-sparse regions, reanalysis variables such as precipitation, temperature, radiation and humidity are used in conjunction with in-situ and/or satellite-based datasets to generate long-term gridded atmospheric forcing datasets. These atmospheric forcing datasets are used to drive offline land surface models and simulate soil moisture and runoff, which are natural indicators of hydrologic conditions. Therefore, any uncertainty or bias in the reanalysis datasets contributes to uncertainties in hydrologic monitoring estimates. In this presentation, we report on a comprehensive analysis that evaluates several modern-day reanalysis products (such as NASA's MERRA-1 and -2, ECMWF's ERA-Interim and NCEP's CFS Reanalysis) over Africa and the Middle East region. We compare the precipitation and temperature from the reanalysis products with independent gridded datasets such as the GPCC, CRU, and USGS/UCSB CHIRPS precipitation datasets, and the CRU temperature dataset. The evaluations are conducted at a monthly time scale, since some of these independent datasets are only available at this temporal resolution. The evaluations range from comparison of the monthly mean climatology to interannual variability and long-term changes. Finally, we also present the results of inter-comparisons of radiation and humidity variables from the different reanalysis datasets.

  1. Evaluation of reanalysis datasets against observational soil temperature data over China

    NASA Astrophysics Data System (ADS)

    Yang, Kai; Zhang, Jingyong

    2018-01-01

    Soil temperature is a key land surface variable and a potential predictor of seasonal climate anomalies and extremes. Using observational soil temperature data in China for 1981-2005, we evaluate four reanalysis datasets, the land surface reanalysis of the European Centre for Medium-Range Weather Forecasts (ERA-Interim/Land), the second Modern-Era Retrospective analysis for Research and Applications (MERRA-2), the National Center for Environmental Prediction Climate Forecast System Reanalysis (NCEP-CFSR), and version 2 of the Global Land Data Assimilation System (GLDAS-2.0), with a focus on the 40 cm soil layer. The results show that the reanalysis data largely reproduce the spatial distributions of soil temperature in summer and winter, especially over eastern China, but generally underestimate their magnitudes. Owing to the influence of precipitation on soil temperature, the four datasets perform better in winter than in summer. The ERA-Interim/Land and GLDAS-2.0 produce spatial characteristics of the climatological mean that are similar to observations. The interannual variability of soil temperature is well reproduced by the ERA-Interim/Land dataset in summer and by the CFSR dataset in winter. The linear trend of soil temperature in summer is well captured by the reanalysis datasets. We demonstrate that soil heat fluxes in April-June and in winter are highly correlated with soil temperature in summer and winter, respectively. Different estimations of surface energy balance components can lead to different behaviors of the reanalysis products in estimating soil temperature. In addition, the reanalysis datasets largely reproduce the northwest-southeast gradient of soil temperature memory over China.
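    Evaluations of this kind typically reduce to a few summary statistics per grid cell or region. A minimal sketch of the usual trio (bias, RMSE, and correlation for interannual variability), using made-up series rather than the paper's data:

```python
import math

def bias(model, obs):
    """Mean difference, model minus observation."""
    return sum(m - o for m, o in zip(model, obs)) / len(obs)

def rmse(model, obs):
    """Root-mean-square error."""
    return math.sqrt(sum((m - o) ** 2 for m, o in zip(model, obs)) / len(obs))

def corr(model, obs):
    """Pearson correlation, used here for interannual variability."""
    n = len(obs)
    mm, mo = sum(model) / n, sum(obs) / n
    cov = sum((a - mm) * (b - mo) for a, b in zip(model, obs))
    sm = math.sqrt(sum((a - mm) ** 2 for a in model))
    so = math.sqrt(sum((b - mo) ** 2 for b in obs))
    return cov / (sm * so)

# A reanalysis that is uniformly 1 degree too cold still correlates perfectly,
# illustrating how a product can underestimate magnitudes yet reproduce
# interannual variability well (made-up numbers):
obs = [14.2, 15.1, 13.8, 14.9]
rean = [t - 1.0 for t in obs]
```

    Computing these per season, as the abstract does, separates errors in the climatological mean (bias) from errors in year-to-year behavior (correlation).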

  2. A Self-Directed Method for Cell-Type Identification and Separation of Gene Expression Microarrays

    PubMed Central

    Zuckerman, Neta S.; Noam, Yair; Goldsmith, Andrea J.; Lee, Peter P.

    2013-01-01

    Gene expression analysis is generally performed on heterogeneous tissue samples consisting of multiple cell types. Current methods developed to separate heterogeneous gene expression rely on prior knowledge of the cell-type composition and/or cell-type signatures, which are not available in most public datasets. We present a novel method to identify the cell-type composition, signatures and proportions per sample without the need for a priori information. The method was successfully tested on controlled and semi-controlled datasets and performed as accurately as current methods that do require additional information. As such, this method enables the analysis of cell-type-specific gene expression using the existing large pools of publicly available microarray datasets. PMID:23990767

  3. eQTL Mapping Using RNA-seq Data

    PubMed Central

    Hu, Yijuan

    2012-01-01

    As RNA-seq is replacing gene expression microarrays to assess genome-wide transcription abundance, gene expression Quantitative Trait Locus (eQTL) studies using RNA-seq have emerged. RNA-seq delivers two novel features that are important for eQTL studies. First, it provides information on allele-specific expression (ASE), which is not available from gene expression microarrays. Second, it generates unprecedentedly rich data to study RNA-isoform expression. In this paper, we review current methods for eQTL mapping using ASE and discuss some future directions. We also review existing works that use RNA-seq data to study RNA-isoform expression and we discuss the gaps between these works and isoform-specific eQTL mapping. PMID:23667399

  4. Where's the Beef? A Comment on Ferguson and Donnellan (2014)

    ERIC Educational Resources Information Center

    Zimmerman, Frederick J.

    2014-01-01

    To make a scientific contribution, a reanalysis must be firmly rooted in the identification of a clearly superior methodological innovation over the original research. By contrast, a reanalysis rooted in dissatisfaction with previous results will necessarily be biased and can only obscure scientific discoveries. The reanalysis published by…

  5. MERRA-2 Input Observations: Summary and Assessment

    NASA Technical Reports Server (NTRS)

    Koster, Randal D. (Editor); McCarty, Will; Coy, Lawrence; Gelaro, Ronald; Huang, Albert; Merkova, Dagmar; Smith, Edmond B.; Sienkiewicz, Meta; Wargan, Krzysztof

    2016-01-01

    The Modern-Era Retrospective Analysis for Research and Applications, Version 2 (MERRA-2) is an atmospheric reanalysis, spanning 1980 through near-realtime, that uses state-of-the-art processing of observations from the continually evolving global observing system. The effectiveness of any reanalysis is a function not only of the input observations themselves, but also of how the observations are handled in the assimilation procedure. Relevant issues to consider include, but are not limited to, data selection, data preprocessing, quality control, bias correction procedures, and blacklisting. As the assimilation algorithm and earth system models are fundamentally fixed in a reanalysis, it is often a change in the character of the observations, and their feedbacks on the system, that cause changes in the character of the reanalysis. It is therefore important to provide documentation of the observing system so that its discontinuities and transitions can be readily linked to discontinuities seen in the gridded atmospheric fields of the reanalysis. With this in mind, this document provides an exhaustive list of the input observations, the context under which they are assimilated, and an initial assessment of selected core observations fundamental to the reanalysis.

  6. Technical Report Series on Global Modeling and Data Assimilation. Volume 13; Interannual Variability and Potential Predictability in Reanalysis Products

    NASA Technical Reports Server (NTRS)

    Min, Wei; Schubert, Siegfried D.; Suarez, Max J. (Editor)

    1997-01-01

    The Data Assimilation Office (DAO) at Goddard Space Flight Center and the National Center for Environmental Prediction and National Center for Atmospheric Research (NCEP/NCAR) have produced multi-year global assimilations of historical data employing fixed analysis systems. These "reanalysis" products are ideally suited for studying short-term climatic variations. The availability of multiple reanalysis products also provides the opportunity to examine the uncertainty in the reanalysis data. The purpose of this document is to provide an updated estimate of seasonal and interannual variability based on the DAO and NCEP/NCAR reanalyses for the 15-year period 1980-1995. Intercomparisons of the seasonal means and their interannual variations are presented for a variety of prognostic and diagnostic fields. In addition, atmospheric potential predictability is re-examined employing selected DAO reanalysis variables.

  7. The NASA Reanalysis Ensemble Service - Advanced Capabilities for Integrated Reanalysis Access and Intercomparison

    NASA Astrophysics Data System (ADS)

    Tamkin, G.; Schnase, J. L.; Duffy, D.; Li, J.; Strong, S.; Thompson, J. H.

    2017-12-01

    NASA's efforts to advance climate analytics-as-a-service are making new capabilities available to the research community: (1) a full-featured Reanalysis Ensemble Service (RES) comprising monthly means data from multiple reanalysis datasets, accessible through an enhanced set of extraction, analytic, arithmetic, and intercomparison operations; the operations are made accessible through NASA's climate data analytics Web services and our client-side Climate Data Services Python library, CDSlib; (2) a cloud-based, high-performance Virtual Real-Time Analytics Testbed supporting a select set of climate variables; this near real-time capability enables advanced technologies like Spark and Hadoop-based MapReduce analytics over native NetCDF files; and (3) a WPS-compliant Web service interface to our climate data analytics service that will enable greater interoperability with next-generation systems such as ESGF. The Reanalysis Ensemble Service includes the following:
    - A new API that supports full temporal, spatial, and grid-based resolution services, with sample queries
    - A Docker-ready RES application to deploy across platforms
    - Extended capabilities that enable single- and multi-reanalysis area averages, vertical averages, re-gridding, standard deviations, and ensemble averages
    - Convenient, one-stop shopping for commonly used data products from multiple reanalyses, including basic sub-setting and arithmetic operations (e.g., avg, sum, max, min, var, count, anomaly)
    - Full support for the MERRA-2 reanalysis dataset, in addition to ECMWF ERA-Interim, NCEP CFSR, JMA JRA-55 and NOAA/ESRL 20CR…
    - A Jupyter notebook-based distribution mechanism designed for client use cases that combines CDSlib documentation with interactive scenarios and personalized project management
    - Supporting analytic services for NASA GMAO Forward Processing datasets
    - Basic uncertainty quantification services that combine heterogeneous ensemble products with comparative observational products (e.g., reanalysis, observational, visualization)
    - The ability to compute and visualize multiple reanalyses for ease of inter-comparison
    - Automated tools to retrieve and prepare data collections for analytic processing

  8. Surface Glycosylation Profiles of Urine Extracellular Vesicles

    PubMed Central

    Gerlach, Jared Q.; Krüger, Anja; Gallogly, Susan; Hanley, Shirley A.; Hogan, Marie C.; Ward, Christopher J.

    2013-01-01

    Urinary extracellular vesicles (uEVs) are released by cells throughout the nephron and contain biomolecules from their cells of origin. Although uEV-associated proteins and RNA have been studied in detail, little information exists regarding uEV glycosylation characteristics. Surface glycosylation profiling by flow cytometry and lectin microarray was applied to uEVs enriched from urine of healthy adults by ultracentrifugation and centrifugal filtration. The carbohydrate specificity of lectin microarray profiles was confirmed by competitive sugar inhibition and carbohydrate-specific enzyme hydrolysis. Glycosylation profiles of uEVs and purified Tamm Horsfall protein were compared. In both flow cytometry and lectin microarray assays, uEVs demonstrated surface binding, at low to moderate intensities, of a broad range of lectins whether prepared by ultracentrifugation or centrifugal filtration. In general, ultracentrifugation-prepared uEVs demonstrated higher lectin binding intensities than centrifugal filtration-prepared uEVs consistent with lesser amounts of co-purified non-vesicular proteins. The surface glycosylation profiles of uEVs showed little inter-individual variation and were distinct from those of Tamm Horsfall protein, which bound a limited number of lectins. In a pilot study, lectin microarray was used to compare uEVs from individuals with autosomal dominant polycystic kidney disease to those of age-matched controls. The lectin microarray profiles of polycystic kidney disease and healthy uEVs showed differences in binding intensity of 6/43 lectins. Our results reveal a complex surface glycosylation profile of uEVs that is accessible to lectin-based analysis following multiple uEV enrichment techniques, is distinct from co-purified Tamm Horsfall protein and may demonstrate disease-specific modifications. PMID:24069349

  9. Gene set analysis approaches for RNA-seq data: performance evaluation and application guideline

    PubMed Central

    Rahmatallah, Yasir; Emmert-Streib, Frank

    2016-01-01

    Transcriptome sequencing (RNA-seq) is gradually replacing microarrays for high-throughput studies of gene expression. The main challenge in analyzing expression data is not finding differentially expressed genes, but gaining insights into the biological processes underlying phenotypic differences. To interpret experimental results, gene set analysis (GSA) has become the method of choice, in particular because it incorporates pre-existing biological knowledge (in the form of functionally related gene sets) into the analysis. Here we provide a brief review of several statistically different GSA approaches (competitive and self-contained) that can be adapted from microarray practice, as well as those specifically designed for RNA-seq. We evaluate their performance (in terms of Type I error rate, power, robustness to sample size and heterogeneity, and sensitivity to different types of selection bias) on simulated and real RNA-seq data. Not surprisingly, the performance of the various GSA approaches depends only on the statistical hypothesis they test and not on whether the test was developed for microarray or RNA-seq data. Interestingly, we found that competitive methods have lower power and lower robustness to sample heterogeneity than self-contained methods, leading to poor reproducibility of results. We also found that the power of unsupervised competitive methods depends on the balance between up- and down-regulated genes in the tested gene sets. These properties of competitive methods have been overlooked before. Our evaluation provides a concise guideline for selecting the GSA approaches that perform best under particular experimental settings in the context of RNA-seq. PMID:26342128
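    The competitive/self-contained distinction comes down to what is permuted under the null. A hedged sketch of one competitive design, gene sampling (the function name and inputs are illustrative, not from the paper):

```python
import random

def competitive_pvalue(gene_scores, gene_set, n_perm=2000, seed=0):
    """Competitive test: is the set's mean score higher than that of
    random gene sets of the same size drawn from the whole array?
    (A self-contained test would instead permute sample labels and ask
    whether the set is associated with the phenotype at all.)"""
    rng = random.Random(seed)
    genes = list(gene_scores)
    k = len(gene_set)
    observed = sum(gene_scores[g] for g in gene_set) / k
    hits = sum(
        1 for _ in range(n_perm)
        if sum(gene_scores[g] for g in rng.sample(genes, k)) / k >= observed
    )
    # Add-one correction so the p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)
```

    Because the reference distribution is built from other genes on the array, the result depends on the array-wide score distribution, which is one reason competitive and self-contained tests can disagree.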

  10. Differential prioritization between relevance and redundancy in correlation-based feature selection techniques for multiclass gene expression data.

    PubMed

    Ooi, Chia Huey; Chetty, Madhu; Teng, Shyh Wei

    2006-06-23

    Due to the large number of genes in a typical microarray dataset, feature selection looks set to play an important role in reducing noise and computational cost in gene expression-based tissue classification while simultaneously improving accuracy. Surprisingly, this does not appear to be the case for all multiclass microarray datasets. The reason is that many feature selection techniques applied to microarray datasets are either rank-based, and hence do not take into account correlations between genes, or wrapper-based, which incurs high computational cost and often yields difficult-to-reproduce results. In studies where correlations between genes are considered, attempts to establish the merit of the proposed techniques are hampered by evaluation procedures that are less than meticulous, resulting in overly optimistic estimates of accuracy. We present two realistically evaluated correlation-based feature selection techniques which incorporate, in addition to the two existing criteria involved in forming a predictor set (relevance and redundancy), a third criterion called the degree of differential prioritization (DDP). The DDP functions as a parameter that strikes a balance between relevance and redundancy, giving our techniques the novel ability to differentially prioritize the optimization of relevance against redundancy (and vice versa). This ability proves useful in producing optimal classification accuracy while using reasonably small predictor set sizes for nine well-known multiclass microarray datasets. For multiclass microarray datasets, especially the GCM and NCI60 datasets, the DDP enables our filter-based techniques to produce accuracies better than those reported in previous studies which employed similarly realistic evaluation procedures.
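    The DDP idea can be sketched as a single exponent alpha that tilts a predictor-set score between relevance and antiredundancy. This is a simplified greedy illustration with made-up scores, not the authors' exact formulation; `rel` and `corr` are hypothetical inputs, with relevance and antiredundancy both assumed scaled to [0, 1]:

```python
def ddp_score(relevance, antiredundancy, alpha):
    """alpha close to 1 prioritizes relevance; close to 0, antiredundancy."""
    return (relevance ** alpha) * (antiredundancy ** (1.0 - alpha))

def greedy_select(rel, corr, k, alpha):
    """Greedy forward selection of k features.
    rel[f]: relevance of feature f to the class, scaled to [0, 1].
    corr[f][g]: absolute correlation between features f and g."""
    selected = [max(rel, key=rel.get)]       # start with the most relevant
    while len(selected) < k:
        best, best_score = None, -1.0
        for f in rel:
            if f in selected:
                continue
            # Antiredundancy: 1 minus mean correlation with features chosen so far.
            antired = 1.0 - sum(corr[f][s] for s in selected) / len(selected)
            score = ddp_score(rel[f], antired, alpha)
            if score > best_score:
                best, best_score = f, score
        selected.append(best)
    return selected
```

    With a highly relevant but redundant feature b and a moderately relevant independent feature c, alpha = 1 picks b (pure relevance) while alpha = 0.5 picks c; that shift is exactly the trade-off the DDP parameter controls.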

  11. Linking microarray reporters with protein functions.

    PubMed

    Gaj, Stan; van Erk, Arie; van Haaften, Rachel I M; Evelo, Chris T A

    2007-09-26

    The analysis of microarray experiments requires accurate and up-to-date functional annotation of the microarray reporters to optimize the interpretation of the biological processes involved. Pathway visualization tools connect gene expression data with existing biological pathways by using specific database identifiers that link reporters with elements in the pathways. This paper proposes a novel method that aims to improve microarray reporter annotation by BLASTing the original reporter sequences against a species-specific EMBL subset that was derived from, and cross-linked back to, the highly curated UniProt database. The resulting alignments were filtered using high-quality alignment criteria and further compared with the outcome of a more traditional approach, in which reporter sequences were BLASTed against EnsEMBL, followed by locating the corresponding protein (UniProt) entry for the high-quality hits. Combining the results of both methods successfully annotated >58% of all reporter sequences with UniProt IDs on two commercial array platforms, increasing the number of Incyte reporters that could be coupled to Gene Ontology terms from 32.7% to 58.3% and to a local GenMAPP pathway from 9.6% to 16.7%. For Agilent, 35.3% of all reporters are now linked to GO nodes and 7.1% to local pathways. Our methods increased the annotation quality of microarray reporter sequences and allowed us to visualize more reporters using pathway visualization tools. Even in cases where the original reporter annotation carried the correct description, the new identifiers often allowed improved pathway and Gene Ontology linking. These methods are freely available at http://www.bigcat.unimaas.nl/public/publications/Gaj_Annotation/.

  12. Northern Hemisphere winter storm track trends since 1959 derived from multiple reanalysis datasets

    NASA Astrophysics Data System (ADS)

    Chang, Edmund K. M.; Yau, Albert M. W.

    2016-09-01

    In this study, a comprehensive comparison of Northern Hemisphere winter storm track trends since 1959 derived from multiple reanalysis datasets and rawinsonde observations has been conducted. In addition, trends in terms of variance and cyclone track statistics have been compared. Previous studies, based largely on the National Center for Environmental Prediction-National Center for Atmospheric Research Reanalysis (NNR), have suggested that both the Pacific and Atlantic storm tracks intensified significantly between the 1950s and 1990s. Comparison with trends derived from rawinsonde observations suggests that the trends derived from NNR are significantly biased high, while those from the European Center for Medium Range Weather Forecasts 40-year Reanalysis and the Japanese 55-year Reanalysis are much less biased but still too high. Those from the two twentieth-century reanalysis datasets are most consistent with observations but may exhibit slight biases of opposite signs. Between 1959 and 2010, Pacific storm track activity has likely increased by 10% or more, while Atlantic storm track activity has likely increased by <10%. Our analysis suggests that trends in Pacific and Atlantic basin-wide storm track activity prior to the 1950s derived from the two twentieth-century reanalysis datasets are unlikely to be reliable due to changes in the density of surface observations. Nevertheless, these datasets may provide useful information on interannual variability, especially over the Atlantic.

  13. 76 FR 26290 - Science Advisory Board Staff Office; Notification of a Public Teleconference of the Chartered...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-05-06

    ... report entitled ``SAB Review of EPA's Reanalysis of Key Issues Related to Dioxin Toxicity and Response to... of EPA's Reanalysis of Key Issues Related to Dioxin Toxicity and Response to NAS Comments.'' The SAB... Reanalysis of Key Issues Related to Dioxin Toxicity and Response to NAS Comments.'' To conduct this review...

  14. Missing value imputation for microarray data: a comprehensive comparison study and a web tool.

    PubMed

    Chiu, Chia-Chun; Chan, Shih-Yao; Wang, Chung-Ching; Wu, Wei-Sheng

    2013-01-01

    Microarray data are usually peppered with missing values due to various reasons. However, most downstream analyses for microarray data require complete datasets. Accurate algorithms for missing value estimation are therefore needed to improve the performance of microarray data analyses. Although many algorithms have been developed, there is much debate about the selection of the optimal algorithm. Existing studies comparing the performance of different algorithms are far from comprehensive, especially in the number of benchmark datasets used, the number of algorithms compared, the rounds of simulation conducted, and the performance measures used. In this paper, we performed a comprehensive comparison using (I) thirteen datasets, (II) nine algorithms, (III) 110 independent runs of simulation, and (IV) three types of measures to evaluate the performance of each imputation algorithm fairly. First, the effects of different types of microarray datasets on the performance of each imputation algorithm were evaluated. Second, we discussed whether datasets from different species have a different impact on the performance of different algorithms. To assess the performance of each algorithm fairly, all evaluations were performed using the three types of measures. Our results indicate that the performance of an imputation algorithm depends mainly on the type of dataset, not on the species from which the samples come. In addition to the statistical measure, two other measures with biological meaning are useful for reflecting the impact of missing value imputation on downstream data analyses. Our study suggests that local-least-squares-based methods are good choices for handling missing values in most microarray datasets. In this work, we carried out a comprehensive comparison of algorithms for microarray missing value imputation. Based on such a comprehensive comparison, researchers can easily choose the optimal algorithm for their datasets. Moreover, new imputation algorithms can be compared with existing algorithms using this comparison strategy as a standard protocol. In addition, to assist researchers in dealing with missing values easily, we built a web-based, easy-to-use imputation tool, MissVIA (http://cosbi.ee.ncku.edu.tw/MissVIA), which supports many imputation algorithms. Once users upload a real microarray dataset and choose the imputation algorithms, MissVIA determines the optimal algorithm for the users' data through a series of simulations, and the imputed results can then be downloaded for downstream data analyses.
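    The evaluation protocol the authors describe (hide known entries, impute them, compare against the truth) is easy to sketch. Below is an illustrative KNN-style imputation of a single hidden cell, with `None` marking missing values; it is a toy stand-in for the local-least-squares methods the study recommends, not the MissVIA implementation:

```python
def knn_impute_cell(matrix, r, c, k=2):
    """Estimate matrix[r][c] as the mean of column c over the k rows
    closest to row r (squared Euclidean distance on co-observed columns)."""
    target = matrix[r]
    candidates = []
    for i, row in enumerate(matrix):
        if i == r or row[c] is None:
            continue  # skip the target row and rows missing column c
        dist = sum((a - b) ** 2
                   for j, (a, b) in enumerate(zip(row, target))
                   if j != c and a is not None and b is not None)
        candidates.append((dist, row[c]))
    nearest = sorted(candidates)[:k]
    return sum(v for _, v in nearest) / len(nearest)
```

    To benchmark algorithms as the paper does, one would hide a random subset of observed entries, impute them, and score the normalized RMSE between the imputed and true values, repeating over many simulation runs.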

  15. Translating standards into practice - one Semantic Web API for Gene Expression.

    PubMed

    Deus, Helena F; Prud'hommeaux, Eric; Miller, Michael; Zhao, Jun; Malone, James; Adamusiak, Tomasz; McCusker, Jim; Das, Sudeshna; Rocca Serra, Philippe; Fox, Ronan; Marshall, M Scott

    2012-08-01

    Sharing and describing experimental results unambiguously, with sufficient detail to enable replication, is a fundamental tenet of scientific research. In today's cluttered world of "-omics" sciences, data standards and the standardized use of terminologies and ontologies for biomedical informatics play an important role in reporting high-throughput experiment results in formats that can be interpreted by both researchers and analytical tools. Increasing adoption of Semantic Web and Linked Data technologies for the integration of heterogeneous and distributed health care and life sciences (HCLS) datasets has made the reuse of standards even more pressing; dynamic semantic query federation can be used for integrative bioinformatics when ontologies and identifiers are reused across data instances. We present here a methodology to integrate the results and experimental context of three different representations of microarray-based transcriptomic experiments: the Gene Expression Atlas, the W3C BioRDF task force approach to reporting Provenance of Microarray Experiments, and the HSCI blood genomics project. Our approach does not attempt to improve the expressivity of existing standards for genomics but, instead, to enable integration of existing datasets published from microarray-based transcriptomic experiments. SPARQL CONSTRUCT is used to create a posteriori mappings of concepts and properties, and linking rules that match entities based on query constraints. We discuss how our integrative approach can encourage reuse of the Experimental Factor Ontology (EFO) and the Ontology for Biomedical Investigations (OBI) for reporting the experimental context and results of gene expression studies. Copyright © 2012 Elsevier Inc. All rights reserved.

  16. Improving nuclear data accuracy of 241Am and 237Np capture cross sections

    NASA Astrophysics Data System (ADS)

    Žerovnik, Gašper; Schillebeeckx, Peter; Cano-Ott, Daniel; Jandel, Marian; Hori, Jun-ichi; Kimura, Atsushi; Rossbach, Matthias; Letourneau, Alain; Noguere, Gilles; Leconte, Pierre; Sano, Tadafumi; Kellett, Mark A.; Iwamoto, Osamu; Ignatyuk, Anatoly V.; Cabellos, Oscar; Genreith, Christoph; Harada, Hideo

    2017-09-01

    In the framework of the OECD/NEA WPEC subgroup 41, ways to improve neutron-induced capture cross sections for 241Am and 237Np are being sought. Decay data, energy-dependent cross section data, and neutron-spectrum-averaged data are important for that purpose and were investigated. New time-of-flight measurements were performed and analyzed, and considerable effort was put into the development of methods for the analysis of spectrum-averaged data and the re-analysis of existing experimental data.

  17. Analysis and High-Resolution Modeling of Tropical Cyclogenesis During the TCS-08 and TPARC Field Campaign

    DTIC Science & Technology

    2014-10-13

    synoptic and dynamic aspects of cyclogenesis, a multi-nested WRF model (with 2 km resolution in the innermost mesh) will be used to simulate both...intraseasonal and interannual variability of TC activity in the WNP. For the data assimilation task, the WRF 3DVar assimilation system will be employed...simulated using WRF. This genesis is associated with Rossby wave energy dispersion of a pre-existing TC Bills (2000). Using the reanalysis data as an

  18. Application of a substructuring technique to the problem of crack extension and closure

    NASA Technical Reports Server (NTRS)

    Armen, H., Jr.

    1974-01-01

    A substructuring technique, originally developed for the efficient reanalysis of structures, is incorporated into the methodology associated with the plastic analysis of structures. An existing finite-element computer program that accounts for elastic-plastic material behavior under cyclic loading was modified to account for changing kinematic constraint conditions - crack growth and intermittent contact of crack surfaces in two-dimensional regions. Application of the analysis is presented for a center-crack panel problem to demonstrate the efficiency and accuracy of the technique.

  19. Comparison of surface sensible and latent heat fluxes over the Tibetan Plateau from reanalysis and observations

    NASA Astrophysics Data System (ADS)

    Xie, Jin; Yu, Ye; Li, Jiang-lin; Ge, Jun; Liu, Chuan

    2018-02-01

    Surface sensible and latent heat fluxes (SH and LE) over the Tibetan Plateau (TP) have been studied since the 1950s, and especially in recent years, mainly using observational, reanalysis, and satellite data. However, the reported spatiotemporal changes are not consistent among studies. This paper focuses on the spatiotemporal variation of SH and LE over the TP from 1981 to 2013 using reanalysis data sets (ERA-Interim, JRA-55, and MERRA) and observations. Results show that the spatiotemporal changes from the three reanalysis data sets differ significantly, and the probable causes are discussed. Averaged over the whole TP, both SH and LE from MERRA are markedly higher than in the other two reanalysis data sets. ERA-Interim shows a significant downward trend in SH and JRA-55 a significant increase in LE over the 33 years, while the other data sets show no obvious changes. By comparing the heat fluxes and several climate factors from the reanalyses with observations, it is found that the differences in heat fluxes among the three reanalysis data sets are closely related to their differences in meteorological conditions as well as to their different parameterizations of the surface transfer coefficients. In general, at the inter-annual scale the heat fluxes from the three reanalyses represent the western TP better than the eastern TP. In terms of monthly variation, ERA-Interim may be more applicable to the densely vegetated eastern TP, while SH from JRA-55 and LE from MERRA are probably more representative of the sparsely vegetated middle and western TP.
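    As a minimal sketch of why the parameterization of surface transfer coefficients matters, the standard bulk aerodynamic formulas relate the turbulent fluxes to near-surface gradients; the constants and coefficient values below are assumed illustrative numbers, not those of any of the three reanalyses.

```python
# Bulk aerodynamic formulas of the kind reanalyses use to diagnose
# surface turbulent fluxes. C_H and C_E are placeholder constants here;
# real products derive them from stability and surface roughness, which
# is one source of the inter-product spread discussed in the abstract.

RHO_AIR = 1.2      # near-surface air density, kg m^-3 (assumed)
CP_AIR = 1004.0    # specific heat of air, J kg^-1 K^-1
LV = 2.5e6         # latent heat of vaporization, J kg^-1

def sensible_heat_flux(wind, t_sfc, t_air, c_h=1.5e-3):
    """SH = rho * cp * C_H * U * (Ts - Ta), in W m^-2."""
    return RHO_AIR * CP_AIR * c_h * wind * (t_sfc - t_air)

def latent_heat_flux(wind, q_sfc, q_air, c_e=1.5e-3):
    """LE = rho * Lv * C_E * U * (qs - qa), in W m^-2."""
    return RHO_AIR * LV * c_e * wind * (q_sfc - q_air)

# A 20% difference in C_H alone changes SH by 20% for identical
# meteorology, so coefficient parameterizations separate the products
# even when the input fields agree.
sh_a = sensible_heat_flux(5.0, 290.0, 285.0, c_h=1.5e-3)
sh_b = sensible_heat_flux(5.0, 290.0, 285.0, c_h=1.8e-3)
```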

  20. Structural reanalysis via a mixed method. [using Taylor series for accuracy improvement]

    NASA Technical Reports Server (NTRS)

    Noor, A. K.; Lowder, H. E.

    1975-01-01

    A study is made of the approximate structural reanalysis technique based on the use of Taylor series expansion of response variables in terms of design variables in conjunction with the mixed method. In addition, comparisons are made with two reanalysis techniques based on the displacement method. These techniques are the Taylor series expansion and the modified reduced basis. It is shown that the use of the reciprocals of the sizing variables as design variables (which is the natural choice in the mixed method) can result in a substantial improvement in the accuracy of the reanalysis technique. Numerical results are presented for a space truss structure.
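    The accuracy gain from expanding in reciprocal sizing variables can be seen in a toy single-bar example (illustrative only; the paper treats a space truss): the displacement of a bar under axial load is exactly linear in 1/A, so a first-order Taylor reanalysis in the reciprocal variable is exact, while the same expansion in A truncates.

```python
# Why Taylor-series reanalysis in reciprocals of sizing variables can be
# more accurate: for a single bar, u(A) = F*L/(E*A) is linear in
# x = 1/A, so a first-order expansion in x reproduces the reanalysis
# exactly, while the expansion in A incurs a truncation error.
# F, L, E values are illustrative.

F, L, E = 1000.0, 2.0, 200e9   # load (N), length (m), modulus (Pa)

def u_exact(area):
    return F * L / (E * area)

def u_taylor_in_A(area, area0):
    # u(A0) + u'(A0)*(A - A0), with u'(A) = -F*L/(E*A^2)
    return u_exact(area0) - F * L / (E * area0**2) * (area - area0)

def u_taylor_in_recip(area, area0):
    # u is linear in x = 1/A, so the first-order expansion is exact
    return u_exact(area0) + (F * L / E) * (1.0 / area - 1.0 / area0)

a0, a1 = 1e-4, 0.5e-4          # resize the member to half its area
err_A = abs(u_taylor_in_A(a1, a0) - u_exact(a1))
err_x = abs(u_taylor_in_recip(a1, a0) - u_exact(a1))
```

    Here `err_x` is zero to machine precision while `err_A` is a 25% relative error, mirroring the paper's observation that the reciprocal variables (the natural choice in the mixed method) substantially improve reanalysis accuracy.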

  1. Comparison of MERRA-2 and ECCO-v4 ocean surface heat fluxes: Consequences of different forcing feedbacks on ocean circulation and implications for climate data assimilation.

    NASA Astrophysics Data System (ADS)

    Strobach, E.; Molod, A.; Menemenlis, D.; Forget, G.; Hill, C. N.; Campin, J. M.; Heimbach, P.

    2017-12-01

    Forcing ocean models with reanalysis data is a common practice in ocean modeling. As part of this practice, prescribed atmospheric state variables and interactive ocean SST are used to calculate fluxes between the ocean and the atmosphere. When forcing an ocean model with reanalysis fields, errors in the reanalysis data, errors in the ocean model and errors in the forcing formulation will generate a different solution compared to other ocean reanalysis solutions (which also have their own errors). As a first step towards a consistent coupled ocean-atmosphere reanalysis, we compare surface heat fluxes from a state-of-the-art atmospheric reanalysis, the Modern-Era Retrospective analysis for Research and Applications, Version 2 (MERRA-2), to heat fluxes from a state-of-the-art oceanic reanalysis, the Estimating the Circulation and Climate of the Ocean Version 4, Release 2 (ECCO-v4). Then, we investigate the errors associated with the MITgcm ocean model in its ECCO-v4 ocean reanalysis configuration (1992-2011) when it is forced with MERRA-2 atmospheric reanalysis fields instead of with the ECCO-v4 adjoint optimized ERA-interim state variables. This is done by forcing ECCO-v4 ocean with and without feedbacks from MERRA-2 related to turbulent fluxes of heat and moisture and the outgoing long wave radiation. In addition, we introduce an intermediate forcing method that includes only the feedback from the interactive outgoing long wave radiation. The resulting ocean circulation is compared with ECCO-v4 reanalysis and in-situ observations. We show that, without feedbacks, imbalances in the energy and the hydrological cycles of MERRA-2 (which are directly related to the fact it was created without interactive ocean) result in considerable SST drifts and a large reduction in sea level. 
The bulk formulae and interactive outgoing long wave radiation, although providing air-sea feedbacks and reducing model-data misfit, strongly relax the ocean to the observed SST and may produce unwanted features such as large changes in the water budget. These features have implications for the desired forcing recipe. The results strongly and unambiguously argue for next-generation data assimilation climate studies to involve fully coupled systems.

  2. An integrated approach for identifying wrongly labelled samples when performing classification in microarray data.

    PubMed

    Leung, Yuk Yee; Chang, Chun Qi; Hung, Yeung Sam

    2012-01-01

    Using a hybrid approach for gene selection and classification is common, as the results obtained are generally better than when the two tasks are performed independently. Yet, for some microarray datasets, both the classification accuracy and the stability of the selected gene sets still leave room for improvement. This may be due to the presence of samples with wrong class labels (i.e. outliers). Outlier detection algorithms proposed so far are either not suitable for microarray data or solve only the outlier detection problem in isolation. We tackle the outlier detection problem based on the previously proposed Multiple-Filter-Multiple-Wrapper (MFMW) model, which was demonstrated to yield promising results compared to other hybrid approaches (Leung and Hung, 2010). To incorporate outlier detection and overcome limitations of the existing MFMW model, three new features are introduced in our proposed MFMW-outlier approach: 1) an unbiased external Leave-One-Out Cross-Validation framework is developed to replace the internal cross-validation of the previous MFMW model; 2) wrongly labeled samples are identified within the MFMW-outlier model; and 3) a stable set of genes is selected using an L1-norm SVM that removes any redundant genes present. Six binary-class microarray datasets were tested. Compared with outlier detection studies on the same datasets, MFMW-outlier detected all the outliers found in the original paper (for which the data was provided for analysis), and the genes selected after outlier removal were shown to have biological relevance. We also compared MFMW-outlier with PRAPIV (Zhang et al., 2006) on the same synthetic datasets; MFMW-outlier gave better average precision and recall values in three different settings. Lastly, artificially flipped microarray datasets were created by removing our detected outliers and flipping some of the remaining samples' labels.
Almost all the 'wrong' (artificially flipped) samples were detected, suggesting that MFMW-outlier was sufficiently powerful to detect outliers in high-dimensional microarray datasets.
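    The core idea of identifying wrongly labelled samples with an external leave-one-out loop can be sketched in a few lines. The classifier below is a deliberately simple nearest-centroid stand-in, not the MFMW ensemble of the paper, and the one-dimensional data are invented for illustration.

```python
# Hold each sample out, train on the rest, and flag samples whose
# held-out prediction disagrees with their recorded label; such samples
# are candidate outliers (possible label errors).

def nearest_centroid_label(x, samples):
    by_class = {}
    for value, label in samples:
        by_class.setdefault(label, []).append(value)
    centroids = {lab: sum(v) / len(v) for lab, v in by_class.items()}
    return min(centroids, key=lambda lab: abs(x - centroids[lab]))

def flag_suspects(samples):
    suspects = []
    for i, (value, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        if nearest_centroid_label(value, rest) != label:
            suspects.append(i)
    return suspects

# One-dimensional "expression" values; sample 2 carries the wrong label.
data = [(0.1, "A"), (0.2, "A"), (0.9, "A"), (1.0, "B"), (1.1, "B")]
```

    Running `flag_suspects(data)` singles out index 2, whose value sits with class B despite its A label; an external (rather than internal) cross-validation loop of this shape is what keeps the error estimate unbiased.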

  3. DFP: a Bioconductor package for fuzzy profile identification and gene reduction of microarray data

    PubMed Central

    Glez-Peña, Daniel; Álvarez, Rodrigo; Díaz, Fernando; Fdez-Riverola, Florentino

    2009-01-01

    Background Expression profiling assays done by using DNA microarray technology generate enormous data sets that are not amenable to simple analysis. The greatest challenge in maximizing the use of this huge amount of data is to develop algorithms to interpret and interconnect results from different genes under different conditions. In this context, fuzzy logic can provide a systematic and unbiased way to both (i) find biologically significant insights relating to meaningful genes, thereby removing the need for expert knowledge in preliminary steps of microarray data analyses and (ii) reduce the cost and complexity of later applied machine learning techniques being able to achieve interpretable models. Results DFP is a new Bioconductor R package that implements a method for discretizing and selecting differentially expressed genes based on the application of fuzzy logic. DFP takes advantage of fuzzy membership functions to assign linguistic labels to gene expression levels. The technique builds a reduced set of relevant genes (FP, Fuzzy Pattern) able to summarize and represent each underlying class (pathology). A last step constructs a biased set of genes (DFP, Discriminant Fuzzy Pattern) by intersecting existing fuzzy patterns in order to detect discriminative elements. In addition, the software provides new functions and visualisation tools that summarize achieved results and aid in the interpretation of differentially expressed genes from multiple microarray experiments. Conclusion DFP integrates with other packages of the Bioconductor project, uses common data structures and is accompanied by ample documentation. It has the advantage that its parameters are highly configurable, facilitating the discovery of biologically relevant connections between sets of genes belonging to different pathologies. This information makes it possible to automatically filter irrelevant genes thereby reducing the large volume of data supplied by microarray experiments. 
    Based on these contributions, GENECBR, a successful tool for cancer diagnosis using microarray datasets, has recently been released. PMID:19178723
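    The fuzzy-labelling step that DFP builds on can be sketched with triangular membership functions: each (pre-scaled) expression value receives degrees of membership in linguistic labels, and the label with the highest degree is kept. The breakpoints below are illustrative choices, not DFP's actual configurable parameters.

```python
# Assign linguistic labels (Low / Medium / High) to expression values
# scaled to roughly [0, 1], via triangular fuzzy membership functions.

def triangular(x, left, peak, right):
    """Degree of membership of x in a triangle (left, peak, right)."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

def linguistic_label(x):
    memberships = {
        "Low": triangular(x, -0.5, 0.0, 0.5),
        "Medium": triangular(x, 0.0, 0.5, 1.0),
        "High": triangular(x, 0.5, 1.0, 1.5),
    }
    return max(memberships, key=memberships.get)
```

    Discretizing every gene this way turns each class into a pattern of labels; intersecting the patterns of different classes is then what yields the discriminant elements the abstract describes.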

  4. Methodological study of affine transformations of gene expression data with proposed robust non-parametric multi-dimensional normalization method.

    PubMed

    Bengtsson, Henrik; Hössjer, Ola

    2006-03-01

    Low-level processing and normalization of microarray data are among the most important steps in microarray analysis and have a profound impact on downstream analysis. Multiple methods have been suggested to date, but it is not clear which is best. It is therefore important to study the different normalization methods in detail, and the nature of microarray data in general. A methodological study of affine models for gene expression data is carried out. The focus is on two-channel comparative studies, but the findings generalize to single- and multi-channel data. The discussion applies to spotted as well as in-situ synthesized microarray data. Existing normalization methods, such as curve-fit ("lowess") normalization, parallel and perpendicular translation normalization, and quantile normalization, as well as dye-swap normalization, are revisited in the light of the affine model, and their strengths and weaknesses are investigated in this context. As a direct result of this study, we propose a robust non-parametric multi-dimensional affine normalization method, which can be applied to any number of microarrays with any number of channels, either individually or all at once. A high-quality cDNA microarray data set with spike-in controls is used to demonstrate the power of the affine model and the proposed normalization method. We find that an affine model can explain non-linear intensity-dependent systematic effects in observed log-ratios. Affine normalization removes such artifacts for non-differentially expressed genes and ensures that symmetry between negative and positive log-ratios is obtained, which is fundamental when identifying differentially expressed genes. In addition, affine normalization makes the empirical distributions in different channels more equal, which is the purpose of quantile normalization, and may also explain why dye-swap normalization works or fails.
All methods are made available in the aroma package, which is a platform-independent package for R.
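    The affine idea can be illustrated in its simplest one-channel-pair form: model the second channel as y = a + b*x, fit (a, b) by ordinary least squares over genes assumed non-differentially expressed, and map y back onto the x scale. This is only a sketch of the model class; the paper's method is multi-dimensional and robust, not plain OLS.

```python
# Fit an affine distortion y = a + b*x between two channels and remove
# it, so that log-ratios of non-differentially expressed genes center on
# zero instead of curving with intensity.

def fit_affine(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b          # (offset a, gain b)

def affine_normalize(xs, ys):
    a, b = fit_affine(xs, ys)
    return [(y - a) / b for y in ys]  # map y back onto the x scale

# Channel y is channel x distorted by offset 100 and gain 1.5; the
# offset is what makes raw log-ratios intensity-dependent.
x = [200.0, 400.0, 800.0, 1600.0]
y = [100.0 + 1.5 * v for v in x]
```

    After `affine_normalize(x, y)` the two channels coincide, which is exactly the symmetry-restoring effect the abstract attributes to affine normalization.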

  5. DFP: a Bioconductor package for fuzzy profile identification and gene reduction of microarray data.

    PubMed

    Glez-Peña, Daniel; Alvarez, Rodrigo; Díaz, Fernando; Fdez-Riverola, Florentino

    2009-01-29

    Expression profiling assays done by using DNA microarray technology generate enormous data sets that are not amenable to simple analysis. The greatest challenge in maximizing the use of this huge amount of data is to develop algorithms to interpret and interconnect results from different genes under different conditions. In this context, fuzzy logic can provide a systematic and unbiased way to both (i) find biologically significant insights relating to meaningful genes, thereby removing the need for expert knowledge in preliminary steps of microarray data analyses and (ii) reduce the cost and complexity of later applied machine learning techniques being able to achieve interpretable models. DFP is a new Bioconductor R package that implements a method for discretizing and selecting differentially expressed genes based on the application of fuzzy logic. DFP takes advantage of fuzzy membership functions to assign linguistic labels to gene expression levels. The technique builds a reduced set of relevant genes (FP, Fuzzy Pattern) able to summarize and represent each underlying class (pathology). A last step constructs a biased set of genes (DFP, Discriminant Fuzzy Pattern) by intersecting existing fuzzy patterns in order to detect discriminative elements. In addition, the software provides new functions and visualisation tools that summarize achieved results and aid in the interpretation of differentially expressed genes from multiple microarray experiments. DFP integrates with other packages of the Bioconductor project, uses common data structures and is accompanied by ample documentation. It has the advantage that its parameters are highly configurable, facilitating the discovery of biologically relevant connections between sets of genes belonging to different pathologies. This information makes it possible to automatically filter irrelevant genes thereby reducing the large volume of data supplied by microarray experiments. 
    Based on these contributions, GENECBR, a successful tool for cancer diagnosis using microarray datasets, has recently been released.

  6. Revisiting the global surface energy budgets with maximum-entropy-production model of surface heat fluxes

    NASA Astrophysics Data System (ADS)

    Huang, Shih-Yu; Deng, Yi; Wang, Jingfeng

    2017-09-01

    The maximum-entropy-production (MEP) model of surface heat fluxes, based on contemporary non-equilibrium thermodynamics, information theory, and atmospheric turbulence theory, is used to re-estimate the global surface heat fluxes. The surface fluxes predicted by the MEP model automatically balance the surface energy budget at all time and space scales without explicit use of near-surface temperature and moisture gradients, wind speed, or surface roughness data. The new MEP-based global annual mean fluxes over the land surface, computed from surface radiation and temperature data from the National Aeronautics and Space Administration Clouds and the Earth's Radiant Energy System (NASA CERES) supplemented by surface specific humidity data from the Modern-Era Retrospective Analysis for Research and Applications (MERRA), agree closely with previous estimates. The new estimate of ocean evaporation, not using the MERRA reanalysis data as model inputs, is lower than previous estimates, while the new estimate of ocean sensible heat flux is higher than previously reported. The MEP model also produces the first global map of ocean surface heat flux that is not available from existing global reanalysis products.

  7. Enabling Research Tools for Sustained Climate Assessment

    NASA Technical Reports Server (NTRS)

    Leidner, Allison K.; Bosilovich, Michael G.; Jasinski, Michael F.; Nemani, Ramakrishna R.; Waliser, Duane Edward; Lee, Tsengdar J.

    2016-01-01

    The U.S. Global Change Research Program Sustained Assessment process benefits from long-term investments in Earth science research that enable the scientific community to conduct assessment-relevant science. To this end, NASA initiated several research programs over the past five years to support the Earth observation community in developing indicators, datasets, research products, and tools to support ongoing and future National Climate Assessments. These activities complement NASA's ongoing Earth science research programs. One aspect of the assessment portfolio funds four "enabling tools" projects at NASA research centers. Each tool leverages existing capacity within the center, but has developed tailored applications and products for National Climate Assessments. The four projects build on the capabilities of a global atmospheric reanalysis (MERRA-2), a continental U.S. land surface reanalysis (NCA-LDAS), the NASA Earth Exchange (NEX), and a Regional Climate Model Evaluation System (RCMES). Here, we provide a brief overview of each enabling tool, highlighting the ways in which it has advanced assessment science to date. We also discuss how the assessment community can access and utilize these tools for National Climate Assessments and other sustained assessment activities.

  8. Rainy Days in the New Arctic: A Comprehensive Look at Precipitation from 8 Reanalysis

    NASA Astrophysics Data System (ADS)

    Boisvert, L.; Webster, M.; Petty, A.; Markus, T.

    2017-12-01

    Precipitation in the Arctic plays an important role in the fresh water budget and is the primary control on snow accumulation on sea ice. However, Arctic precipitation from reanalyses is highly uncertain owing to differences in the atmospheric physics, the data assimilation approaches, and the sea ice concentrations used across the different products. More specifically, yearly cumulative precipitation in some regions can vary by 100-150 mm across reanalyses. This creates problems for those modeling snow depth on sea ice, specifically for use in deriving sea ice thickness from satellite altimetry. In recent years, this new Arctic has become warmer and wetter, and evaporation from the ice-free ocean has been increasing, which leads to the question: is more precipitation falling, and is more of this precipitation rain? This could pose a big problem for modeling and remote sensing applications and for studies of snow accumulation, because rain events can melt the existing snow pack, reduce surface albedo, and modify the ocean-to-atmosphere heat flux via snow densification. In this work we compare precipitation (both snow and rain) from 8 different reanalyses: MERRA, MERRA-2, NCEP-R1, NCEP-R2, ERA-Interim, ERA5, ASR, and JRA-55. We examine the annual, seasonal, and regional differences and compare with buoy data to assess discrepancies between products during observed snowfall and rainfall events. Magnitudes and frequencies of these precipitation events are evaluated, as well as the "residual drizzle" between reanalyses. Lastly, we will look at whether the frequency and magnitude of "rainy days" in the Arctic have been changing over recent decades.

  9. A comprehensive transcript index of the human genome generated using microarrays and computational approaches

    PubMed Central

    Schadt, Eric E; Edwards, Stephen W; GuhaThakurta, Debraj; Holder, Dan; Ying, Lisa; Svetnik, Vladimir; Leonardson, Amy; Hart, Kyle W; Russell, Archie; Li, Guoya; Cavet, Guy; Castle, John; McDonagh, Paul; Kan, Zhengyan; Chen, Ronghua; Kasarskis, Andrew; Margarint, Mihai; Caceres, Ramon M; Johnson, Jason M; Armour, Christopher D; Garrett-Engele, Philip W; Tsinoremas, Nicholas F; Shoemaker, Daniel D

    2004-01-01

    Background Computational and microarray-based experimental approaches were used to generate a comprehensive transcript index for the human genome. Oligonucleotide probes designed from approximately 50,000 known and predicted transcript sequences from the human genome were used to survey transcription from a diverse set of 60 tissues and cell lines using ink-jet microarrays. Further, expression activity over at least six conditions was more generally assessed using genomic tiling arrays consisting of probes tiled through a repeat-masked version of the genomic sequence making up chromosomes 20 and 22. Results The combination of microarray data with extensive genome annotations resulted in a set of 28,456 experimentally supported transcripts. This set of high-confidence transcripts represents the first experimentally driven annotation of the human genome. In addition, the results from genomic tiling suggest that a large amount of transcription exists outside of annotated regions of the genome and serves as an example of how this activity could be measured on a genome-wide scale. Conclusions These data represent one of the most comprehensive assessments of transcriptional activity in the human genome and provide an atlas of human gene expression over a unique set of gene predictions. Before the annotation of the human genome is considered complete, however, the previously unannotated transcriptional activity throughout the genome must be fully characterized. PMID:15461792

  10. Feature Genes Selection Using Supervised Locally Linear Embedding and Correlation Coefficient for Microarray Classification

    PubMed Central

    Wang, Yun; Huang, Fangzhou

    2018-01-01

    The selection of feature genes with high recognition ability from the gene expression profiles has gained great significance in biology. However, most of the existing methods have a high time complexity and poor classification performance. Motivated by this, an effective feature selection method, called supervised locally linear embedding and Spearman's rank correlation coefficient (SLLE-SC2), is proposed which is based on the concept of locally linear embedding and correlation coefficient algorithms. Supervised locally linear embedding takes into account class label information and improves the classification performance. Furthermore, Spearman's rank correlation coefficient is used to remove the coexpression genes. The experiment results obtained on four public tumor microarray datasets illustrate that our method is valid and feasible. PMID:29666661

  11. Feature Genes Selection Using Supervised Locally Linear Embedding and Correlation Coefficient for Microarray Classification.

    PubMed

    Xu, Jiucheng; Mu, Huiyu; Wang, Yun; Huang, Fangzhou

    2018-01-01

    The selection of feature genes with high recognition ability from the gene expression profiles has gained great significance in biology. However, most of the existing methods have a high time complexity and poor classification performance. Motivated by this, an effective feature selection method, called supervised locally linear embedding and Spearman's rank correlation coefficient (SLLE-SC2), is proposed which is based on the concept of locally linear embedding and correlation coefficient algorithms. Supervised locally linear embedding takes into account class label information and improves the classification performance. Furthermore, Spearman's rank correlation coefficient is used to remove the coexpression genes. The experiment results obtained on four public tumor microarray datasets illustrate that our method is valid and feasible.
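    The coexpression-removal step of SLLE-SC2 can be sketched with a Spearman rank correlation filter: greedily drop any candidate gene that is highly rank-correlated with a gene already kept. The 0.9 cutoff and the greedy order are assumed for illustration, not values from the paper, and the rank routine below ignores ties.

```python
# Spearman's rank correlation between gene profiles, used to prune
# coexpressed (rank-redundant) genes from a candidate feature set.

def ranks(values):
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, i in enumerate(order):
        r[i] = rank + 1
    return r  # no tie handling in this sketch

def spearman(a, b):
    ra, rb = ranks(a), ranks(b)
    n = len(a)
    d2 = sum((x - y) ** 2 for x, y in zip(ra, rb))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

def drop_coexpressed(genes, cutoff=0.9):
    kept = []
    for name, profile in genes:
        if all(abs(spearman(profile, kept_profile)) < cutoff
               for _, kept_profile in kept):
            kept.append((name, profile))
    return [name for name, _ in kept]

genes = [("g1", [1, 2, 3, 4, 5]),
         ("g2", [2, 3, 4, 5, 6]),   # rank-identical to g1
         ("g3", [5, 1, 4, 2, 3])]
```

    Here `g2` is dropped because its ranks duplicate `g1`'s (rho = 1.0), while `g3` (rho = -0.3 with `g1`) survives; that is the redundancy reduction the abstract attributes to the correlation step.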

  12. Using in vitro models for expression profiling studies on ethanol and drugs of abuse.

    PubMed

    Thibault, Christelle; Hassan, Sajida; Miles, Michael

    2005-03-01

    The use of expression profiling with microarrays offers great potential for studying the mechanisms of action of drugs of abuse. Studies with the intact nervous system seem likely to be most relevant to understanding the mechanisms of drug abuse-related behaviours. However, the use of expression profiling with in vitro culture models offers significant advantages for identifying details of cellular signalling actions and toxicity for drugs of abuse. This study discusses general issues of the use of microarrays and cell culture models for studies on drugs of abuse. Specific results from existing studies are also discussed, providing clear examples of relevance for in vitro studies on ethanol, nicotine, opiates, cannabinoids and hallucinogens such as LSD. In addition to providing details on signalling mechanisms relevant to the neurobiology of drugs of abuse, microarray studies on a variety of cell culture systems have also provided important information on mechanisms of cellular/organ toxicity with drugs of abuse. Efforts to integrate genomic studies on drugs of abuse with both in vivo and in vitro models offer the potential for novel mechanistic rigor and physiological relevance.

  13. A comparison of bootstrap methods and an adjusted bootstrap approach for estimating the prediction error in microarray classification.

    PubMed

    Jiang, Wenyu; Simon, Richard

    2007-12-20

    This paper first provides a critical review of some existing methods for estimating the prediction error in classifying microarray data, where the number of genes greatly exceeds the number of specimens. Special attention is given to the bootstrap-related methods. When the sample size n is small, we find that all the reviewed methods suffer from either substantial bias or variability. We introduce a repeated leave-one-out bootstrap (RLOOB) method that predicts for each specimen in the sample using bootstrap learning sets of size ln. We then propose an adjusted bootstrap (ABS) method that fits a learning curve to the RLOOB estimates calculated with different bootstrap learning set sizes. The ABS method is robust across the situations we investigate and provides a slightly conservative estimate of the prediction error. Even with small samples, it does not suffer from the large upward bias of the leave-one-out bootstrap and the 0.632+ bootstrap, nor from the large variability of leave-one-out cross-validation in microarray applications. Copyright (c) 2007 John Wiley & Sons, Ltd.
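    The repeated leave-one-out bootstrap idea can be sketched as follows: hold each specimen out, draw B bootstrap learning sets of a chosen size from the remaining specimens, and average the held-out error over the B sets. The one-dimensional nearest-neighbour classifier and the toy data are stand-ins for the paper's high-dimensional setting; only the resampling structure is the point here.

```python
import random

# Repeated leave-one-out bootstrap (RLOOB) sketch: the error for each
# held-out specimen is averaged over bootstrap learning sets drawn from
# the remaining specimens; varying set_size gives the points to which
# the adjusted bootstrap (ABS) would fit a learning curve.

def predict_1nn(x, learning_set):
    return min(learning_set, key=lambda s: abs(x - s[0]))[1]

def rloob_error(samples, set_size, n_boot=200, seed=0):
    rng = random.Random(seed)
    errors = []
    for i, (x, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        wrong = 0
        for _ in range(n_boot):
            learning = [rng.choice(rest) for _ in range(set_size)]
            wrong += predict_1nn(x, learning) != label
        errors.append(wrong / n_boot)
    return sum(errors) / len(samples)

# Two well-separated classes; residual error comes from bootstrap sets
# that happen to contain no same-class specimen.
data = [(0.1, 0), (0.2, 0), (0.3, 0), (0.8, 1), (0.9, 1), (1.0, 1)]
```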

  14. Threshold-free high-power methods for the ontological analysis of genome-wide gene-expression studies

    PubMed Central

    Nilsson, Björn; Håkansson, Petra; Johansson, Mikael; Nelander, Sven; Fioretos, Thoas

    2007-01-01

    Ontological analysis facilitates the interpretation of microarray data. Here we describe new ontological analysis methods which, unlike existing approaches, are threshold-free and statistically powerful. We perform extensive evaluations and introduce a new concept, detection spectra, to characterize methods. We show that different ontological analysis methods exhibit distinct detection spectra, and that it is critical to account for this diversity. Our results argue strongly against the continued use of existing methods, and provide directions towards an enhanced approach. PMID:17488501
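    One way to see what "threshold-free" means here: instead of intersecting an ontology category with a hard list of "significant" genes, score the category with a rank statistic over all genes. The sketch below uses a Wilcoxon/Mann-Whitney-style U statistic normalized to [0, 1] (0.5 = no enrichment); it illustrates the general idea only and is not claimed to be the paper's specific method.

```python
# Threshold-free enrichment score for a gene category: compare the ranks
# of in-category genes against all other genes, with no significance
# cutoff applied to individual genes.

def rank_enrichment(scores, in_category):
    """scores: {gene: score}; in_category: set of gene names."""
    ranked = sorted(scores, key=scores.get)          # ascending by score
    in_ranks = [r + 1 for r, g in enumerate(ranked) if g in in_category]
    n1 = len(in_ranks)
    n2 = len(ranked) - n1
    u = sum(in_ranks) - n1 * (n1 + 1) / 2            # Mann-Whitney U
    return u / (n1 * n2)   # AUC-like: 1.0 = category genes rank highest

scores = {"g1": 0.1, "g2": 0.2, "g3": 0.9, "g4": 1.5, "g5": 2.0}
```

    A category containing the top-ranked genes scores 1.0 and one containing the bottom-ranked genes scores 0.0, with no per-gene cutoff anywhere, which is the property the abstract contrasts with threshold-based approaches.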

  15. Analysis of Atmosphere-Ocean Surface Flux Feedbacks in Recent Satellite and Model Reanalysis Products

    NASA Technical Reports Server (NTRS)

    Roberts, J. Brent; Robertson, F. R.; Clayson, C. A.

    2010-01-01

    Recent investigations have examined observations in an attempt to determine when and how the ocean forces the atmosphere, and vice versa. These studies focus primarily on relationships between sea surface temperature anomalies and the turbulent and radiative surface heat fluxes. It has been found that both positive and negative feedbacks, which enhance or reduce sea surface temperature anomaly amplitudes, can be generated through changes in the surface boundary layer. Consequent changes in sea surface temperature act to change boundary layer characteristics through changes in static stability or turbulent fluxes. Previous studies over the global oceans have used coarse-resolution observational and model products such as ICOADS and the NCEP Reanalysis. This study focuses on documenting the atmosphere-ocean feedbacks that exist in recently produced higher-resolution products, namely the SeaFlux v1.0 product and the NASA Modern Era Retrospective-Analysis for Research and Applications (MERRA). It has been noted in recent studies that evidence of oceanic forcing of the atmosphere exists on smaller scales than the usually more dominant atmospheric forcing of the ocean, particularly at higher latitudes. It is expected that use of these higher-resolution products will allow a more comprehensive description of these small-scale ocean-atmosphere feedbacks. The SeaFlux intercomparisons have revealed large scatter between various surface flux climatologies. This study also investigates the uncertainty in surface flux feedbacks based on several of these recent satellite-based climatologies.

  16. ValWorkBench: an open source Java library for cluster validation, with applications to microarray data analysis.

    PubMed

    Giancarlo, R; Scaturro, D; Utro, F

    2015-02-01

    The prediction of the number of clusters in a dataset, in particular for microarrays, is a fundamental task in biological data analysis, usually performed via validation measures. Unfortunately, it has received very little attention, and in fact there is a growing need for software tools/libraries dedicated to it. Here we present ValWorkBench, a software library consisting of eleven well-known validation measures, together with novel heuristic approximations for some of them. The main objective of this paper is to provide the interested researcher with the full software documentation of an open source cluster validation platform whose main features are that it is easily extendible in a homogeneous way and that it offers software components that can be readily re-used. Consequently, the focus of the presentation is on the architecture of the library, since it provides an essential map that can be used to access the full software documentation, which is available at the supplementary material website [1]. The mentioned main features of ValWorkBench are also discussed and exemplified, with emphasis on software abstraction design and re-usability. A comparison with existing cluster validation software libraries, mainly in terms of the mentioned features, is also offered. It suggests that ValWorkBench is a much-needed contribution to the microarray software development/algorithm engineering community. For completeness, it is important to mention that previous accurate algorithmic experimental analyses of the relative merits of each of the implemented measures [19,23,25], carried out specifically on microarray data, give useful insights on the effectiveness of ValWorkBench for cluster validation to researchers in the microarray community interested in its use for the mentioned task. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
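    As an example of the kind of validation measure such a library implements, the mean silhouette width scores how compact and well separated a candidate partition is; comparing the index across partitions (or across candidate numbers of clusters) guides the choice. This is a generic one-dimensional sketch of the classic silhouette index, not code from ValWorkBench, and it is not claimed that silhouette is among the library's eleven measures.

```python
# Mean silhouette width of a partition of 1-D points: for each point,
# a = mean distance to its own cluster, b = mean distance to the nearest
# other cluster; the width (b - a) / max(a, b) is near 1 for compact,
# well-separated clusters.

def silhouette(points, labels):
    widths = []
    for i, x in enumerate(points):
        same = [abs(x - y) for j, y in enumerate(points)
                if labels[j] == labels[i] and j != i]
        a = sum(same) / len(same)
        b = min(
            sum(abs(x - y) for j, y in enumerate(points)
                if labels[j] == lab)
            / sum(1 for j in range(len(points)) if labels[j] == lab)
            for lab in set(labels) if lab != labels[i]
        )
        widths.append((b - a) / max(a, b))
    return sum(widths) / len(widths)

pts = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
good = [0, 0, 0, 1, 1, 1]    # the "true" two-cluster partition
bad = [0, 1, 0, 1, 0, 1]     # an interleaved, implausible partition
```

    The natural partition scores far higher than the interleaved one, which is exactly the comparison a validation measure is used to make.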

  17. Genotyping microarray (gene chip) for the ABCR (ABCA4) gene.

    PubMed

    Jaakson, K; Zernant, J; Külm, M; Hutchinson, A; Tonisson, N; Glavac, D; Ravnik-Glavac, M; Hawlina, M; Meltzer, M R; Caruso, R C; Testa, F; Maugeri, A; Hoyng, C B; Gouras, P; Simonelli, F; Lewis, R A; Lupski, J R; Cremers, F P M; Allikmets, R

    2003-11-01

    Genetic variation in the ABCR (ABCA4) gene has been associated with five distinct retinal phenotypes, including Stargardt disease/fundus flavimaculatus (STGD/FFM), cone-rod dystrophy (CRD), and age-related macular degeneration (AMD). Comparative genetic analyses of ABCR variation and diagnostics have been complicated by substantial allelic heterogeneity and by differences in screening methods. To overcome these limitations, we designed a genotyping microarray (gene chip) for ABCR that includes all approximately 400 disease-associated and other variants currently described, enabling simultaneous detection of all known ABCR variants. The ABCR genotyping microarray (the ABCR400 chip) was constructed by the arrayed primer extension (APEX) technology. Each sequence change in ABCR was included on the chip by synthesis and application of sequence-specific oligonucleotides. We validated the chip by screening 136 confirmed STGD patients and 96 healthy controls, each of whom we had analyzed previously by single strand conformation polymorphism (SSCP) technology and/or heteroduplex analysis. The microarray was >98% effective in determining the existing genetic variation and was comparable to direct sequencing in that it yielded many sequence changes undetected by SSCP. In STGD patient cohorts, the efficiency of the array to detect disease-associated alleles was between 54% and 78%, depending on the ethnic composition and degree of clinical and molecular characterization of a cohort. In addition, chip analysis suggested a high carrier frequency (up to 1:10) of ABCR variants in the general population. The ABCR genotyping microarray is a robust, cost-effective, and comprehensive screening tool for variation in one gene in which mutations are responsible for a substantial fraction of retinal disease. The ABCR chip is a prototype for the next generation of screening and diagnostic tools in ophthalmic genetics, bridging clinical and scientific research. 
Copyright 2003 Wiley-Liss, Inc.

  18. Linking microarray reporters with protein functions

    PubMed Central

    Gaj, Stan; van Erk, Arie; van Haaften, Rachel IM; Evelo, Chris TA

    2007-01-01

    Background The analysis of microarray experiments requires accurate and up-to-date functional annotation of the microarray reporters to optimize the interpretation of the biological processes involved. Pathway visualization tools are used to connect gene expression data with existing biological pathways by using specific database identifiers that link reporters with elements in the pathways. Results This paper proposes a novel method that aims to improve microarray reporter annotation by BLASTing the original reporter sequences against a species-specific EMBL subset that was derived from, and crosslinked back to, the highly curated UniProt database. The resulting alignments were filtered using high-quality alignment criteria and further compared with the outcome of a more traditional approach, where reporter sequences were BLASTed against EnsEMBL followed by locating the corresponding protein (UniProt) entry for the high-quality hits. Combining the results of both methods resulted in successful annotation of > 58% of all reporter sequences with UniProt IDs on two commercial array platforms, increasing the number of Incyte reporters that could be coupled to Gene Ontology terms from 32.7% to 58.3% and to a local GenMAPP pathway from 9.6% to 16.7%. For Agilent, 35.3% of the total reporters are now linked to GO nodes and 7.1% to local pathways. Conclusion Our methods increased the annotation quality of microarray reporter sequences and allowed us to visualize more reporters using pathway visualization tools. Even in cases where the original reporter annotation showed the correct description, the new identifiers often allowed improved pathway and Gene Ontology linking. These methods are freely available at http://www.bigcat.unimaas.nl/public/publications/Gaj_Annotation/. PMID:17897448
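
    As a minimal sketch of the quality-filtering step described above — the field names and thresholds below are illustrative assumptions, not the paper's actual criteria — parsed BLAST hits might be filtered on percent identity and reporter coverage like this:

```python
# Hypothetical hit records as they might be parsed from tabular BLAST
# output; identifiers, fields, and cut-offs are invented for illustration.
hits = [
    {"reporter": "R1", "uniprot": "P12345", "identity": 99.2,
     "align_len": 58, "reporter_len": 60},
    {"reporter": "R2", "uniprot": "Q67890", "identity": 91.0,
     "align_len": 40, "reporter_len": 60},
    {"reporter": "R3", "uniprot": "P11111", "identity": 98.5,
     "align_len": 55, "reporter_len": 60},
]

MIN_IDENTITY = 98.0   # percent identity threshold (assumed)
MIN_COVERAGE = 0.9    # aligned fraction of the reporter (assumed)

def high_quality(hit):
    """Keep only near-identical alignments covering most of the reporter."""
    coverage = hit["align_len"] / hit["reporter_len"]
    return hit["identity"] >= MIN_IDENTITY and coverage >= MIN_COVERAGE

# Map each reporter that passes the filter to its UniProt identifier.
annotated = {h["reporter"]: h["uniprot"] for h in hits if high_quality(h)}
```

    The resulting reporter-to-UniProt mapping is what downstream pathway and Gene Ontology tools would consume.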

  19. Analysis of population structure and genetic history of cattle breeds based on high-density SNP data

    USDA-ARS?s Scientific Manuscript database

    Advances in single nucleotide polymorphism (SNP) genotyping microarrays have facilitated a new understanding of population structure and evolutionary history for several species. Most existing studies in livestock were based on low density SNP arrays. The first wave of low density SNP studies on cat...

  20. A reanalysis dataset of the South China Sea.

    PubMed

    Zeng, Xuezhi; Peng, Shiqiu; Li, Zhijin; Qi, Yiquan; Chen, Rongyu

    2014-01-01

    Ocean reanalysis provides a temporally continuous and spatially gridded four-dimensional estimate of the ocean state for a better understanding of the ocean dynamics and its spatial/temporal variability. Here we present a 19-year (1992-2010) high-resolution ocean reanalysis dataset of the upper ocean in the South China Sea (SCS) produced from an ocean data assimilation system. A wide variety of observations, including in-situ temperature/salinity profiles, ship-measured and satellite-derived sea surface temperatures, and sea surface height anomalies from satellite altimetry, are assimilated into the outputs of an ocean general circulation model using a multi-scale incremental three-dimensional variational data assimilation scheme, yielding a daily high-resolution reanalysis dataset of the SCS. Comparisons between the reanalysis and independent observations support the reliability of the dataset. The presented dataset provides the research community of the SCS an important data source for studying the thermodynamic processes of the ocean circulation and meso-scale features in the SCS, including their spatial and temporal variability.

  1. A reanalysis dataset of the South China Sea

    PubMed Central

    Zeng, Xuezhi; Peng, Shiqiu; Li, Zhijin; Qi, Yiquan; Chen, Rongyu

    2014-01-01

    Ocean reanalysis provides a temporally continuous and spatially gridded four-dimensional estimate of the ocean state for a better understanding of the ocean dynamics and its spatial/temporal variability. Here we present a 19-year (1992–2010) high-resolution ocean reanalysis dataset of the upper ocean in the South China Sea (SCS) produced from an ocean data assimilation system. A wide variety of observations, including in-situ temperature/salinity profiles, ship-measured and satellite-derived sea surface temperatures, and sea surface height anomalies from satellite altimetry, are assimilated into the outputs of an ocean general circulation model using a multi-scale incremental three-dimensional variational data assimilation scheme, yielding a daily high-resolution reanalysis dataset of the SCS. Comparisons between the reanalysis and independent observations support the reliability of the dataset. The presented dataset provides the research community of the SCS an important data source for studying the thermodynamic processes of the ocean circulation and meso-scale features in the SCS, including their spatial and temporal variability. PMID:25977803

  2. Web-based Reanalysis Intercomparison Tools (WRIT): Comparing Reanalyses and Observational data.

    NASA Astrophysics Data System (ADS)

    Compo, G. P.; Smith, C. A.; Hooper, D. K.

    2014-12-01

    While atmospheric reanalysis datasets are widely used in climate science, many technical issues hinder comparing them to each other and to observations. The reanalysis fields are stored in diverse file architectures, data formats, and resolutions, with metadata, such as variable name and units, that also differ. Individual users have to download the fields, convert them to a common format, store them locally, change variable names, re-grid if needed, and convert units. Comparing reanalyses with observational datasets is difficult for similar reasons. Even if a dataset can be read via Open-source Project for a Network Data Access Protocol (OPeNDAP) or a similar protocol, most of this work is still needed. All of these tasks take time, effort, and money. To overcome some of the obstacles in reanalysis intercomparison, our group at the Cooperative Institute for Research in the Environmental Sciences (CIRES) at the University of Colorado and affiliated colleagues at National Oceanic and Atmospheric Administration's (NOAA's) Earth System Research Laboratory Physical Sciences Division (ESRL/PSD) have created a set of Web-based Reanalysis Intercomparison Tools (WRIT) at http://www.esrl.noaa.gov/psd/data/writ/. WRIT allows users to easily plot and compare reanalysis and observational datasets, and to test hypotheses. Currently, there are tools to plot monthly mean maps and vertical cross-sections, timeseries, and trajectories for standard pressure level and surface variables. Users can refine dates, statistics, and plotting options. Reanalysis datasets currently available include the NCEP/NCAR R1, NCEP/DOE R2, MERRA, ERA-Interim, NCEP CFSR and the 20CR. Observational datasets include those containing precipitation (e.g. GPCP), temperature (e.g. GHCNCAMS), winds (e.g. WASWinds), precipitable water (e.g. NASA NVAP), SLP (HadSLP2), and SST (NOAA ERSST). 
WRIT also facilitates the mission of the Reanalyses.org website as a convenient toolkit for studying the reanalysis datasets.

  3. Uncertainties in Decadal Model Evaluation due to the Choice of Different Reanalysis Products

    NASA Astrophysics Data System (ADS)

    Illing, Sebastian; Kadow, Christopher; Kunst, Oliver; Cubasch, Ulrich

    2014-05-01

    In recent years decadal predictions have become very popular in the climate science community. A major task is the evaluation and validation of a decadal prediction system. Therefore, hindcast experiments are performed and evaluated against observation-based or reanalysis data-sets. That is, various metrics and skill scores like the anomaly correlation or the mean squared error skill score (MSSS) are calculated to estimate potential prediction skill of the model system. Our results will mostly feature the Baseline 1 hindcast experiments from the MiKlip decadal prediction system. MiKlip (www.fona-miklip.de) is a project for medium-term climate prediction funded by the Federal Ministry of Education and Research in Germany (BMBF) and has the aim to create a model system that can provide reliable decadal forecasts on climate and weather. There are various reanalysis and observation-based products covering at least the last forty years which can be used for model evaluation, for instance the 20th Century Reanalysis from NOAA-CIRES, the Climate Forecast System Reanalysis from NCEP or the Interim Reanalysis from ECMWF. Each of them is based on different climate models and observations. We will show that the choice of the reanalysis product has a huge impact on the value of various skill metrics. In some cases this may actually lead to a change in the interpretation of the results, e.g. when one tries to compare two model versions and the anomaly correlation difference changes its sign for two different reanalysis products. We will also show first results of our studies investigating the influence and effect of this source of uncertainty for decadal model evaluation. Furthermore, we point out regions which are most affected by this uncertainty and where one has to be cautious when interpreting skill scores. In addition we introduce some strategies to overcome or at least reduce this source of uncertainty.
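
    The two skill scores named in the abstract are simple to state; this minimal Python sketch (toy numbers, with a climatological reference forecast assumed) computes the anomaly correlation and the MSSS:

```python
import math

def mse(pred, obs):
    """Mean squared error between two equal-length series."""
    return sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs)

def msss(forecast, obs, reference):
    """Mean squared error skill score: 1 - MSE(forecast)/MSE(reference).
    Positive values mean the forecast beats the reference forecast."""
    return 1.0 - mse(forecast, obs) / mse(reference, obs)

def anomaly_correlation(forecast, obs):
    """Pearson correlation between forecast and observed anomalies."""
    fm = sum(forecast) / len(forecast)
    om = sum(obs) / len(obs)
    fa = [f - fm for f in forecast]
    oa = [o - om for o in obs]
    num = sum(f * o for f, o in zip(fa, oa))
    den = math.sqrt(sum(f * f for f in fa) * sum(o * o for o in oa))
    return num / den

obs = [1.0, 2.0, 3.0, 4.0]
forecast = [1.1, 2.1, 2.9, 4.2]
climatology = [2.5] * 4            # reference: the observed mean
skill = msss(forecast, obs, climatology)
acc = anomaly_correlation(forecast, obs)
```

    The abstract's point then becomes concrete: replacing `obs` with a different reanalysis product changes both `skill` and `acc`, and can even flip the sign of a skill difference between two model versions.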

  4. Missing value imputation for microarray data: a comprehensive comparison study and a web tool

    PubMed Central

    2013-01-01

    Background Microarray data are usually peppered with missing values for various reasons. However, most of the downstream analyses for microarray data require complete datasets. Therefore, accurate algorithms for missing value estimation are needed for improving the performance of microarray data analyses. Although many algorithms have been developed, there is still much debate about selecting the optimal algorithm. The studies comparing the performance of different algorithms are still far from comprehensive, especially in the number of benchmark datasets used, the number of algorithms compared, the rounds of simulation conducted, and the performance measures used. Results In this paper, we performed a comprehensive comparison by using (I) thirteen datasets, (II) nine algorithms, (III) 110 independent runs of simulation, and (IV) three types of measures to evaluate the performance of each imputation algorithm fairly. First, the effects of different types of microarray datasets on the performance of each imputation algorithm were evaluated. Second, we discussed whether the datasets from different species have a different impact on the performance of different algorithms. To assess the performance of each algorithm fairly, all evaluations were performed using three types of measures. Our results indicate that the performance of an imputation algorithm mainly depends on the type of dataset but not on the species where the samples come from. In addition to the statistical measure, two other measures with biological meanings are useful to reflect the impact of missing value imputation on the downstream data analyses. Our study suggests that local-least-squares-based methods are good choices to handle missing values for most of the microarray datasets. Conclusions In this work, we carried out a comprehensive comparison of the algorithms for microarray missing value imputation. 
Based on such a comprehensive comparison, researchers could choose the optimal algorithm for their datasets easily. Moreover, new imputation algorithms could be compared with the existing algorithms using this comparison strategy as a standard protocol. In addition, to assist researchers in dealing with missing values easily, we built a web-based and easy-to-use imputation tool, MissVIA (http://cosbi.ee.ncku.edu.tw/MissVIA), which supports many imputation algorithms. Once users upload a real microarray dataset and choose the imputation algorithms, MissVIA will determine the optimal algorithm for the users' data through a series of simulations, and then the imputed results can be downloaded for the downstream data analyses. PMID:24565220
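
    To make the imputation task concrete — using k-nearest-neighbour averaging, a simpler relative of the local-least-squares methods the study recommends, not MissVIA's actual algorithms — a minimal sketch on a toy gene-by-sample matrix:

```python
import math

def knn_impute(matrix, k=2):
    """Fill None entries from the k most similar rows (genes), using
    Euclidean distance over the columns observed in both rows and
    averaging the neighbours' values in the missing column."""
    result = [row[:] for row in matrix]
    for i, row in enumerate(result):
        for j, v in enumerate(row):
            if v is not None:
                continue
            candidates = []
            for r, other in enumerate(matrix):
                if r == i or other[j] is None:
                    continue
                shared = [(a, b) for a, b in zip(row, other)
                          if a is not None and b is not None]
                d = math.sqrt(sum((a - b) ** 2 for a, b in shared))
                candidates.append((d, other[j]))
            nearest = sorted(candidates)[:k]          # k closest genes
            row[j] = sum(val for _, val in nearest) / len(nearest)
    return result

genes = [
    [1.0, 2.0, 3.0],
    [1.0, 2.0, None],    # missing expression value to impute
    [10.0, 2.0, 30.0],   # dissimilar gene: should be ignored for k=2
    [1.2, 2.1, 3.2],
]
filled = knn_impute(genes, k=2)
```

    Here the missing value is reconstructed from the two most similar expression profiles, illustrating why performance depends so strongly on the dataset's correlation structure.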

  5. Climatology and trend of wind power resources in China and its surrounding regions: a revisit using Climate Forecast System Reanalysis data

    Treesearch

    Lejiang Yu; Shiyuan Zhong; Xindi Bian; Warren E. Heilman

    2015-01-01

    The mean climatology, seasonal and interannual variability and trend of wind speeds at the hub height (80 m) of modern wind turbines over China and its surrounding regions are revisited using 33-year (1979–2011) wind data from the Climate Forecast System Reanalysis (CFSR) that has many improvements including higher spatial resolution over previous global reanalysis...

  6. Maintaining Atmospheric Mass and Water Balance Within Reanalysis

    NASA Technical Reports Server (NTRS)

    Takacs, Lawrence L.; Suarez, Max; Todling, Ricardo

    2015-01-01

    This report describes the modifications implemented into the Goddard Earth Observing System Version-5 (GEOS-5) Atmospheric Data Assimilation System (ADAS) to maintain global conservation of dry atmospheric mass as well as to preserve the model balance of globally integrated precipitation and surface evaporation during reanalysis. Section 1 begins with a review of these global quantities from four current reanalysis efforts. Section 2 introduces the modifications necessary to preserve these constraints within the atmospheric general circulation model (AGCM), the Gridpoint Statistical Interpolation (GSI) analysis procedure, and the Incremental Analysis Update (IAU) algorithm. Section 3 presents experiments quantifying the impact of the new procedure. Section 4 shows preliminary results from its use within the GMAO MERRA-2 Reanalysis project. Section 5 concludes with a summary.

  7. The Hadley circulation: assessing NCEP/NCAR reanalysis and sparse in-situ estimates

    NASA Astrophysics Data System (ADS)

    Waliser, D. E.; Shi, Zhixiong; Lanzante, J. R.; Oort, A. H.

    We present a comparison of the zonal mean meridional circulations derived from monthly in situ data (i.e. radiosondes and ship reports) and from the NCEP/NCAR reanalysis product. To facilitate the interpretation of the results, a third estimate of the mean meridional circulation is produced by subsampling the reanalysis at the locations where radiosonde and surface ship data are available for the in situ calculation. This third estimate, known as the subsampled estimate, is compared to the complete reanalysis estimate to assess biases in conventional, in situ estimates of the Hadley circulation associated with the sparseness of the data sources (i.e., radiosonde network). The subsampled estimate is also compared to the in situ estimate to assess the biases introduced into the reanalysis product by the numerical model, initialization process and/or indirect data sources such as satellite retrievals. The comparisons suggest that a number of qualitative differences between the in situ and reanalysis estimates are mainly associated with the sparse sampling and simplified interpolation schemes associated with in situ estimates. These differences include: (1) a southern Hadley cell that consistently extends up to 200 hPa in the reanalysis, whereas the bulk of the circulation for the in situ and subsampled estimates tends to be confined to the lower half of the troposphere, (2) more well-defined and consistent poleward limits of the Hadley cells in the reanalysis compared to the in-situ and subsampled estimates, and (3) considerably less variability in magnitude and latitudinal extent of the Ferrel cells and southern polar cell exhibited in the reanalysis estimate compared to the in situ and subsampled estimates. Quantitative comparison shows that the subsampled estimate, relative to the reanalysis estimate, produces a stronger northern Hadley cell (~20%), a weaker southern Hadley cell (~20-60%), and weaker Ferrel cells in both hemispheres. 
These differences stem from poorly measured oceanic regions, which necessitate significant interpolation over broad regions. Moreover, they help to pinpoint specific shortcomings in the present and previous in situ estimates of the Hadley circulation. Comparisons between the subsampled and in situ estimates suggest that the subsampled estimate produces a slightly stronger Hadley circulation in both hemispheres, with the relative differences in some seasons as large as 20-30%. These differences suggest that the mean meridional circulation associated with the NCEP/NCAR reanalysis is more energetic than observations suggest. Examination of ENSO-related changes to the Hadley circulation suggests that the in situ and subsampled estimates significantly overestimate the effects of ENSO on the Hadley circulation due to the reliance on sparsely distributed data. While all three estimates capture the large-scale region of low-level equatorial convergence near the dateline that occurs during El Niño, the in situ and subsampled estimates fail to effectively reproduce the large-scale areas of equatorial mass divergence to the west and east of this convergence area, leading to an overestimate of the effects of ENSO on the zonal mean circulation.

  8. Arctic sea ice in the global eddy-permitting ocean reanalysis ORAP5

    NASA Astrophysics Data System (ADS)

    Tietsche, Steffen; Balmaseda, Magdalena A.; Zuo, Hao; Mogensen, Kristian

    2017-08-01

    We discuss the state of Arctic sea ice in the global eddy-permitting ocean reanalysis Ocean ReAnalysis Pilot 5 (ORAP5). Among other innovations, ORAP5 now assimilates observations of sea ice concentration using a univariate 3DVar-FGAT scheme. We focus on the period 1993-2012 and emphasize the evaluation of model performance with respect to recent observations of sea ice thickness. We find that sea ice concentration in ORAP5 is close to assimilated observations, with root mean square analysis residuals of less than 5 % in most regions. However, larger discrepancies exist for the Labrador Sea and east of Greenland during winter owing to biases in the free-running model. Sea ice thickness is evaluated against three different observational data sets that have sufficient spatial and temporal coverage: ICESat, IceBridge and SMOSIce. Large-scale features like the gradient between the thickest ice in the Canadian Arctic and thinner ice in the Siberian Arctic are simulated well by ORAP5. However, some biases remain. Of special note is the model's tendency to accumulate too thick ice in the Beaufort Gyre. The root mean square error of ORAP5 sea ice thickness with respect to ICESat observations is 1.0 m, which is on par with the well-established PIOMAS model sea ice reconstruction. Interannual variability and trend of sea ice volume in ORAP5 also compare well with PIOMAS and ICESat estimates. We conclude that, notwithstanding a relatively simple sea ice data assimilation scheme, the overall state of Arctic sea ice in ORAP5 is in good agreement with observations and will provide useful initial conditions for predictions.
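
    The evaluation against ICESat-style thickness data reduces to error statistics over matched model/observation pairs; a minimal sketch with invented numbers (not ORAP5 or ICESat values):

```python
import math

# Toy model-vs-observation sea ice thickness pairs at matched points,
# in metres; values are illustrative only.
model_thickness = [2.0, 1.5, 3.0]
obs_thickness = [1.8, 1.7, 2.5]

def rmse(model, obs):
    """Root mean square error over matched pairs."""
    return math.sqrt(sum((m - o) ** 2 for m, o in zip(model, obs)) / len(obs))

def mean_bias(model, obs):
    """Mean model-minus-observation difference (sign shows over/underestimate)."""
    return sum(m - o for m, o in zip(model, obs)) / len(obs)

err = rmse(model_thickness, obs_thickness)
bias = mean_bias(model_thickness, obs_thickness)
```

    A positive `bias` in a region like the Beaufort Gyre would correspond to the too-thick ice accumulation the abstract describes.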

  9. Diagnosing Possible Anthropogenic Contributions to Heavy Colorado Rainfall in September 2013

    NASA Astrophysics Data System (ADS)

    Pall, Pardeep; Patricola, Christina; Wehner, Michael; Stone, Dáithí; Paciorek, Christopher; Collins, William

    2015-04-01

    Unusually heavy rainfall occurred over the Colorado Front Range during early September 2013, with record or near-record totals recorded in several locations. It was associated predominantly with a stationary large-scale weather pattern (akin to the North American Monsoon, which occurs earlier in the year) that drove a strong plume of deep moisture inland from the Gulf of Mexico against the Front Range foothills. The resulting floods across the South Platte River basin impacted several thousands of people and many homes, roads, and businesses. To diagnose possible anthropogenic contributions to the odds of such heavy rainfall, we adapt an existing event attribution paradigm of modelling an 'event that was' for September 2013 and comparing it to a modelled 'event that might have been' for that same time but for the absence of historical anthropogenic drivers of climate. Specifically, we first perform 'event that was' simulations with the regional Weather Research and Forecasting (WRF) model at 12 km resolution over North America, driven by NCEP2 re-analysis. We then re-simulate, having adjusted the re-analysis to 'event that might have been' conditions by modifying atmospheric greenhouse gas and other pollutant concentrations, temperature, humidity, and winds, as well as sea ice coverage, and sea-surface temperatures - all according to estimates from global climate model simulations. Thus our findings are highly conditional on the driving re-analysis and adjustments therein, but the setup allows us to elucidate possible mechanisms responsible for heavy Colorado rainfall in September 2013. Our model results suggest that, given an insignificant change in the pattern of large-scale driving weather, there is an increase in atmospheric water vapour under anthropogenic climate warming leading to a substantial increase in the probability of heavy rainfall occurring over the South Platte River basin in September 2013.

  10. Diagnosing Possible Anthropogenic Contributions to Heavy Colorado Rainfall in September 2013

    NASA Astrophysics Data System (ADS)

    Pall, P.; Patricola, C. M.; Wehner, M. F.; Stone, D. A.; Paciorek, C. J.; Collins, W.

    2014-12-01

    Unusually heavy rainfall occurred over the Colorado Front Range during early September 2013, with record or near-record totals recorded in several locations. It was associated predominantly with a stationary large-scale weather pattern (akin to the North American Monsoon, which occurs earlier in the year) that drove a strong plume of deep moisture inland from the Gulf of Mexico against the Front Range foothills. The resulting floods impacted several thousands of people and many homes, roads, and businesses. To diagnose possible anthropogenic contributions to the odds of such heavy rainfall, we adapt an existing event attribution paradigm of modelling a 'world that was' for September 2013 and comparing it to a modelled 'world that might have been' for that same time but for the absence of historical anthropogenic drivers of climate. Specifically, we first perform 'world that was' simulations with the regional WRF model at 12 km resolution over North America, driven by NCEP2 re-analysis. We then re-simulate, having adjusted the re-analysis to 'world that might have been conditions' by modifying atmospheric greenhouse gas and other pollutant concentrations, temperature, humidity, and winds, as well as sea ice coverage, and sea-surface temperatures - all according to estimates from global climate model simulations. Thus our findings are highly conditional on the driving re-analysis and adjustments therein, but the setup allows us to elucidate possible mechanisms responsible for heavy Colorado rainfall in September 2013. For example, preliminary analysis suggests that, given no change in the pattern of large-scale driving weather, there is an increase in atmospheric water vapour under anthropogenic climate warming leading to a substantial increase in the odds of heavy rainfall over the Front Range.

  11. Status and Plans for Reanalysis at NASA/GMAO

    NASA Technical Reports Server (NTRS)

    Gelaro, Ron

    2017-01-01

    Reanalysis plays a critical role in GMAO's goal to enhance NASA's program of Earth observations, providing vital data sets for climate research and the development of future missions. As the breadth of NASA's observations expands to include multiple components of the Earth system, so does the need to assimilate observations from currently uncoupled components of the system in a more physically consistent manner. GMAO's most recent reanalysis of the satellite era, MERRA-2, has completed the period 1980-present, and is now running as a continuing global climate analysis with two- to three-week latency. MERRA-2 assimilates meteorological and aerosol observations as a weakly coupled assimilation system as a first step toward GMAO's longer-term goal of developing an integrated Earth system analysis (IESA) capability that will couple assimilation systems for the atmosphere, ocean, land and chemistry. The GMAO strategy is to progress incrementally toward an IESA through an evolving combination of coupled systems and offline component reanalyses driven by, for example, MERRA-2 atmospheric forcing. Most recently, the GMAO has implemented a weakly coupled assimilation scheme for analyzing ocean skin temperature within the existing atmospheric analysis. The scheme uses background fields from a near-surface ocean diurnal layer model to assimilate surface-sensitive radiances plus in-situ observations along with all other observations in the atmospheric assimilation system. In addition, MERRA-2-driven simulations of the ocean (plus sea ice) and atmospheric chemistry (for the EOS period) are currently underway, as is the development of a coupled atmosphere-ocean assimilation system. This talk will describe the status of these ongoing efforts and the planned steps toward an IESA capability for climate research.

  12. Groundwater Variability in a Sandstone Catchment and Linkages with Large-scale Climatic Circulation

    NASA Astrophysics Data System (ADS)

    Hannah, D. M.; Lavers, D. A.; Bradley, C.

    2015-12-01

    Groundwater is a crucial water resource that sustains river ecosystems and provides public water supply. Furthermore, during periods of prolonged high rainfall, groundwater-dominated catchments can be subject to protracted flooding. Climate change and associated projected increases in the frequency and intensity of hydrological extremes have implications for groundwater levels. This study builds on previous research undertaken on a Chalk catchment by investigating groundwater variability in a UK sandstone catchment: the Tern in Shropshire. In contrast to the Chalk, sandstone is characterised by a more lagged response to precipitation inputs; and, as such, it is important to determine the groundwater behaviour and its links with the large-scale climatic circulation to improve process understanding of recharge, groundwater level and river flow responses to hydroclimatological drivers. Precipitation, river discharge and groundwater levels for borehole sites in the Tern basin over 1974-2010 are analysed as the target variables; and we use monthly gridded reanalysis data from the Twentieth Century Reanalysis Project (20CR). First, groundwater variability is evaluated and associations with precipitation / discharge are explored using monthly concurrent and lagged correlation analyses. Second, gridded 20CR reanalysis data are used in composite and correlation analyses to identify the regions of strongest climate-groundwater association. Results show that reasonably strong climate-groundwater connections exist in the Tern basin, with a several months lag. These lags are associated primarily with the time taken for recharge waters to percolate through to the groundwater table. The uncovered patterns improve knowledge of large-scale climate forcing of groundwater variability and may provide a basis to inform seasonal prediction of groundwater levels, which would be useful for strategic water resource planning.
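
    The lagged correlation analysis described above can be sketched as follows, using a synthetic driver/response pair in which the response simply repeats the driver three steps later (mimicking slow percolation of recharge to the water table):

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x)
                    * sum((b - my) ** 2 for b in y))
    return num / den

def lagged_corr(driver, response, lag):
    """Correlate driver[t - lag] with response[t]."""
    return pearson(driver[:len(driver) - lag], response[lag:])

# Synthetic monthly series: "groundwater" is "rainfall" delayed 3 steps.
rain = [0, 1, 2, 3, 2, 1] * 5
gw = [0, 0, 0] + rain[:-3]

# Scan candidate lags and pick the one with the strongest correlation.
best_lag = max(range(7), key=lambda lag: lagged_corr(rain, gw, lag))
```

    Applied to real precipitation and borehole series, the lag maximizing the correlation is the kind of several-month recharge delay the study reports.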

  13. Overview of the reanalysis of the Harvard Six Cities Study and American Cancer Society Study of Particulate Air Pollution and Mortality.

    PubMed

    Krewski, Daniel; Burnett, Richard T; Goldberg, Mark S; Hoover, B Kristin; Siemiatycki, Jack; Jerrett, Michael; Abrahamowicz, Michal; White, Warren H

    This article provides an overview of the Reanalysis Study of the Harvard Six Cities and the American Cancer Society (ACS) studies of particulate air pollution and mortality. The previous findings of the studies have been subject to debate. In response, a reanalysis team of Canadian and American researchers was invited to participate in an independent reanalysis project to address the concerns. Phase I of the reanalysis involved designing data audits to verify the consistency and accuracy of each study's data. Phase II of the reanalysis involved conducting a series of comprehensive analyses using alternative statistical methods. Alternative models were also used to identify covariates that may confound or modify the association between particulate air pollution and mortality, as well as to identify sensitive population subgroups. The audit demonstrated that the data in the original analyses were of high quality, as were the risk estimates reported by the original investigators. The sensitivity analysis illustrated that the mortality risk estimates reported in both studies were robust to alternative Cox model specifications. Detailed investigation of the covariate effects found a significant modifying effect of education, with the relative risk of mortality associated with fine particles increasing as education levels declined. The study team applied spatial analytic methods to the ACS data, which, across various levels of spatial autocorrelation, supported the original investigators' reported association between fine particles and mortality and also demonstrated a significant association between sulfur dioxide and mortality. Collectively, our reanalyses suggest that mortality may be attributable to more than one component of the complex mixture of ambient air pollutants for U.S. urban areas.

  14. Decadal reanalysis of biogeochemical indicators and fluxes in the North West European shelf-sea ecosystem

    NASA Astrophysics Data System (ADS)

    Ciavatta, S.; Kay, S.; Saux-Picart, S.; Butenschön, M.; Allen, J. I.

    2016-03-01

    This paper presents the first decadal reanalysis simulation of the biogeochemistry of the North West European shelf, along with a full evaluation of its skill, confidence, and value. An error-characterized satellite product for chlorophyll was assimilated into a physical-biogeochemical model of the North East Atlantic, applying a localized Ensemble Kalman filter. The results showed that the reanalysis improved the model simulation of assimilated chlorophyll in 60% of the study region. Model validation metrics showed that the reanalysis had skill in matching a large data set of in situ observations for 10 ecosystem variables. Spearman rank correlations were significant and higher than 0.7 for physical-chemical variables (temperature, salinity, and oxygen), ˜0.6 for chlorophyll and nutrients (phosphate, nitrate, and silicate), and significant, though lower in value, for partial pressure of dissolved carbon dioxide (˜0.4). The reanalysis captured the magnitude of pH and ammonia observations, but not their variability. The value of the reanalysis for assessing environmental status and variability has been exemplified in two case studies. The first shows that between 325,000 and 365,000 km2 of shelf bottom waters were vulnerable to oxygen deficiency potentially threatening bottom fishes and benthos. The second application confirmed that the shelf is a net sink of atmospheric carbon dioxide, but the total amount of uptake varies between 36 and 46 Tg C yr-1 at a 90% confidence level. These results indicate that the reanalysis output data set can inform the management of the North West European shelf ecosystem, in relation to eutrophication, fishery, and variability of the carbon cycle.
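    The Spearman rank correlations used in the skill assessment above can be reproduced without special tooling, since Spearman's rho is simply the Pearson correlation of the rank-transformed series. A minimal stdlib-only sketch with invented station values (not the study's data):

```python
from statistics import mean

def ranks(values):
    """1-based average ranks; tied values share the mean rank of their group."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1..j+1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman rho: Pearson correlation of the two rank vectors."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Hypothetical reanalysis vs. in situ values at four stations
model = [1.2, 3.4, 2.2, 5.0]
obs = [1.0, 2.5, 3.0, 4.0]
rho = spearman(model, obs)  # 0.8
```

A rank correlation like this is insensitive to monotone transformations of either series, which is why it is a robust choice for comparing model output against heterogeneous in situ observations.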

  15. Toward a chemical reanalysis in a coupled chemistry-climate model: An evaluation of MOPITT CO assimilation and its impact on tropospheric composition

    NASA Astrophysics Data System (ADS)

    Gaubert, B.; Arellano, A. F.; Barré, J.; Worden, H. M.; Emmons, L. K.; Tilmes, S.; Buchholz, R. R.; Vitt, F.; Raeder, K.; Collins, N.; Anderson, J. L.; Wiedinmyer, C.; Martinez Alonso, S.; Edwards, D. P.; Andreae, M. O.; Hannigan, J. W.; Petri, C.; Strong, K.; Jones, N.

    2016-06-01

    We examine in detail a 1 year global reanalysis of carbon monoxide (CO) that is based on joint assimilation of conventional meteorological observations and Measurement of Pollution in The Troposphere (MOPITT) multispectral CO retrievals in the Community Earth System Model (CESM). Our focus is to assess the impact on the chemical system when the CO distribution is constrained in a coupled full chemistry-climate model like CESM. To do this, we first evaluate the joint reanalysis (MOPITT Reanalysis) against four sets of independent observations and compare its performance against a reanalysis with no MOPITT assimilation (Control Run). We then investigate the CO burden and chemical response with the aid of tagged sectoral CO tracers. We estimate the total tropospheric CO burden in 2002 (from ensemble mean and spread) to be 371 ± 12% Tg for the MOPITT Reanalysis and 291 ± 9% Tg for the Control Run. Our multispecies analysis of this difference suggests that (a) direct emissions of CO and hydrocarbons are too low in the inventory used in this study and (b) chemical oxidation, transport, and deposition processes are not accurately and consistently represented in the model. Increases in CO led to a net reduction of OH and a consequently longer lifetime of CH4 (Control Run: 8.7 years versus MOPITT Reanalysis: 9.3 years). Yet at the same time, this increase led to a 5-10% enhancement of Northern Hemisphere O3 and of overall photochemical activity via HOx recycling. Such nonlinear effects further complicate attribution to uncertainties in direct emissions alone. This has implications for chemistry-climate modeling and inversion studies of longer-lived species.

  16. The laboratory-clinician team: a professional call to action to improve communication and collaboration for optimal patient care in chromosomal microarray testing.

    PubMed

    Wain, Karen E; Riggs, Erin; Hanson, Karen; Savage, Melissa; Riethmaier, Darlene; Muirhead, Andrea; Mitchell, Elyse; Packard, Bethanny Smith; Faucett, W Andrew

    2012-10-01

    The International Standards for Cytogenomic Arrays (ISCA) Consortium is a worldwide collaborative effort dedicated to optimizing patient care by improving the quality of chromosomal microarray testing. The primary effort of the ISCA Consortium has been the development of a database of copy number variants (CNVs) identified during the course of clinical microarray testing. This database is a powerful resource for clinicians, laboratories, and researchers, and can be utilized for a variety of applications, such as facilitating standardized interpretations of certain CNVs across laboratories or providing phenotypic information for counseling purposes when published data are sparse. A recognized limitation to the clinical utility of this database, however, is the quality of the clinical information available for each patient. Clinical genetic counselors are uniquely suited to facilitate the communication of this information to the laboratory by virtue of their existing clinical responsibilities, case management skills, and appreciation of the evolving nature of scientific knowledge. We intend to highlight the critical role that genetic counselors play in ensuring optimal patient care by contributing to the clinical utility of the ISCA Consortium's database, as well as to the quality of individual patient microarray reports provided by contributing laboratories. Current tools, both paper and electronic forms, created to maximize this collaboration are shared. In addition to making a professional commitment to providing complete clinical information, genetic counselors are invited to become ISCA members and to become involved in the discussions and initiatives within the Consortium.

  17. Assimilation of Ocean-Color Plankton Functional Types to Improve Marine Ecosystem Simulations

    NASA Astrophysics Data System (ADS)

    Ciavatta, S.; Brewin, R. J. W.; Skákala, J.; Polimene, L.; de Mora, L.; Artioli, Y.; Allen, J. I.

    2018-02-01

    We assimilated phytoplankton functional types (PFTs) derived from ocean color into a marine ecosystem model, to improve the simulation of biogeochemical indicators and emerging properties in a shelf sea. Error-characterized chlorophyll concentrations of four PFTs (diatoms, dinoflagellates, nanoplankton, and picoplankton), as well as total chlorophyll for comparison, were assimilated into a physical-biogeochemical model of the North East Atlantic, applying a localized Ensemble Kalman filter. The reanalysis simulations spanned the years 1998-2003. The skill of the reference and reanalysis simulations in estimating ocean color and in situ biogeochemical data was compared using robust statistics. The reanalysis outperformed both the reference and the assimilation of total chlorophyll in estimating the ocean-color PFTs (except nanoplankton), as well as the not-assimilated total chlorophyll, leading the model to better simulate the plankton community structure. Crucially, the reanalysis improved the estimates of not-assimilated in situ data of PFTs, as well as of phosphate and pCO2, impacting the simulation of the air-sea carbon flux. However, the reanalysis further increased the model overestimation of nitrate, in spite of increases in plankton nitrate uptake. The method proposed here is easily adaptable for use with other ecosystem models that simulate PFTs, e.g., for reanalysis of carbon fluxes in the global ocean and for operational forecasts of biogeochemical indicators in shelf-sea ecosystems.
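    The Ensemble Kalman filter at the heart of these reanalyses updates each model variable using covariances estimated from the ensemble itself. For a single, directly observed scalar the update collapses to a one-line gain formula; the sketch below is a stochastic (perturbed-observation) EnKF for that degenerate case, with invented numbers, and is an illustration rather than the authors' assimilation code:

```python
import random
from statistics import mean, variance

def enkf_update(ensemble, obs, obs_var, seed=0):
    """Stochastic EnKF update for one directly observed scalar variable.
    Gain K = P / (P + R); each member assimilates a perturbed observation."""
    rng = random.Random(seed)
    p = variance(ensemble)           # forecast ensemble variance P
    k = p / (p + obs_var)            # Kalman gain
    return [x + k * (obs + rng.gauss(0.0, obs_var ** 0.5) - x)
            for x in ensemble]

# Prior ensemble of a chlorophyll concentration (mg m-3) at one grid cell,
# and a satellite observation of 2.0 with error variance 0.1 (all invented)
prior = [1.0, 1.3, 0.8, 1.1, 0.9, 1.2, 1.05, 0.95]
post = enkf_update(prior, obs=2.0, obs_var=0.1)
```

In a full system the same gain is built from ensemble cross-covariances between observed and unobserved variables, and "localization" tapers those covariances with distance to suppress spurious long-range correlations from the small ensemble.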

  18. Training analysis and reanalysis in the development of the psychoanalyst.

    PubMed

    Meyer, Jon K

    2007-01-01

    A psychoanalyst faces the extraordinary demand of becoming instrumental in the psychoanalytic process. In the candidate's attempt to rise to that expectation, the first step is the training analysis. As the center-piece of psychoanalytic education, it is no ordinary analysis and bears special burdens intrinsic to its multiple functions and institutionalization. Recognizing the difficulties of both analytic education and analytic practice, Freud suggested that the analyst be periodically reanalyzed; for many, reanalysis is integral to their analytic development. Indeed, an analyst is actually never "made" but is always "in the making," developing and maturing in life and in practice. Reanalysis serves to focus elements of transference and resistance, rework defenses, facilitate more extensive regression in the service of the ego, deepen emotional integration, rework those elements of psychoanalysis itself that have been incorporated into defensive structure, and further the maturation of the analyzing instrument. If analysis is our most powerful mode of initial education, reanalysis is the most powerful form of continuing education. That remarkably little attention has been paid to reanalysis is testimony to the infantile fantasies that remain invested in our personal analyses.

  19. ELISA microarray technology as a high-throughput system for cancer biomarker validation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zangar, Richard C.; Daly, Don S.; White, Amanda M.

    A large gap currently exists between the ability to discover potential biomarkers and the ability to assess the real value of these proteins for cancer screening. One major challenge in biomarker validation is the inherent variability in biomarker levels. This variability stems from the diversity across the human population and the considerable molecular heterogeneity between individual tumors, even those that originate from a single tissue. Another major challenge with cancer screening is that most cancers are rare in the general population, meaning that the specificity of an assay must be very high if the number of false positives is not to greatly exceed the number of true positives. Because of these challenges with biomarker validation, it is necessary to analyze thousands of samples before a clear idea of the utility of a screening assay can be determined. Enzyme-linked immunosorbent assay (ELISA) microarray technology can simultaneously quantify levels of multiple proteins and has the potential to accelerate biomarker validation. In this review, we discuss current ELISA microarray technology and the enabling advances needed to achieve the reproducibility and throughput that are required to evaluate cancer biomarkers.

  20. Reproducibility-optimized test statistic for ranking genes in microarray studies.

    PubMed

    Elo, Laura L; Filén, Sanna; Lahesmaa, Riitta; Aittokallio, Tero

    2008-01-01

    A principal goal of microarray studies is to identify the genes showing differential expression under distinct conditions. In such studies, the selection of an optimal test statistic is a crucial challenge, which depends on the type and amount of data under analysis. While previous studies on simulated or spike-in datasets do not provide practical guidance on how to choose the best method for a given real dataset, we introduce an enhanced reproducibility-optimization procedure, which enables the selection of a suitable gene-ranking statistic directly from the data. In comparison with existing ranking methods, the reproducibility-optimized statistic shows consistently good performance under various simulated conditions and on an Affymetrix spike-in dataset. Further, the feasibility of the novel statistic is confirmed in a practical research setting using data from an in-house cDNA microarray study of asthma-related gene expression changes. These results suggest that the procedure facilitates the selection of an appropriate test statistic for a given dataset without relying on a priori assumptions, which may bias the findings and their interpretation. Moreover, the general reproducibility-optimization procedure is not limited to detecting differential expression but could be extended to a wide range of other applications as well.
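    The reproducibility criterion being optimized can be illustrated with a much simpler surrogate: how strongly the top-k gene lists produced by a candidate statistic agree across resampled data splits. The sketch below is that simplified flavour only (hypothetical gene scores, not the authors' reproducibility-optimized statistic):

```python
def topk_overlap(stat_a, stat_b, k):
    """Fraction of genes shared by the top-k lists of two rankings;
    stat_a/stat_b map gene ids to a ranking statistic (larger = stronger)."""
    top = lambda s: set(sorted(s, key=s.get, reverse=True)[:k])
    return len(top(stat_a) & top(stat_b)) / k

# Two hypothetical rankings of five genes from resampled data splits
run1 = {"g1": 4.2, "g2": 3.9, "g3": 0.3, "g4": 0.1, "g5": 2.8}
run2 = {"g1": 4.0, "g2": 1.0, "g3": 0.2, "g4": 0.4, "g5": 3.1}
score = topk_overlap(run1, run2, k=2)  # only g1 is common to both top-2 lists
```

Averaging such overlap scores over many bootstrap splits, and over a family of candidate test statistics, gives a data-driven way to prefer the statistic whose rankings are most stable.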

  1. A new voxel-based model for the determination of atmospheric weighted mean temperature in GPS atmospheric sounding

    NASA Astrophysics Data System (ADS)

    He, Changyong; Wu, Suqin; Wang, Xiaoming; Hu, Andong; Wang, Qianxin; Zhang, Kefei

    2017-06-01

    The Global Positioning System (GPS) is a powerful atmospheric observing system for determining precipitable water vapour (PWV). In the detection of PWV using GPS, the atmospheric weighted mean temperature (Tm) is a crucial parameter for the conversion of zenith tropospheric delay (ZTD) to PWV since the quality of PWV is affected by the accuracy of Tm. In this study, an improved voxel-based Tm model, named GWMT-D, was developed using global reanalysis data over a 4-year period from 2010 to 2013 provided by the United States National Centers for Environmental Prediction (NCEP). The performance of GWMT-D was assessed against three existing empirical Tm models - GTm-III, GWMT-IV, and GTmN - using different data sources in 2014 - the NCEP reanalysis data, surface Tm data provided by Global Geodetic Observing System and radiosonde measurements. The results show that the new GWMT-D model outperforms all the other three models with a root-mean-square error of less than 5.0 K at different altitudes over the globe. The new GWMT-D model can provide a practical alternative Tm determination method in real-time GPS-PWV remote sensing systems.
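    The role Tm plays in the delay-to-PWV conversion can be made concrete. In the widely cited Bevis-style formulation, the zenith wet delay (ZWD) is scaled to PWV by a dimensionless factor that depends only on Tm. The sketch below uses commonly quoted refractivity constants, which are assumptions for illustration and not necessarily the values used by GWMT-D:

```python
RHO_W = 1000.0  # density of liquid water, kg m-3
R_V = 461.5     # specific gas constant of water vapour, J kg-1 K-1
K2P = 16.48     # refractivity constant k2', K hPa-1 (common literature value)
K3 = 3.776e5    # refractivity constant k3, K^2 hPa-1

def conversion_factor(tm):
    """Dimensionless Pi such that PWV = Pi * ZWD.
    The 1e8 absorbs the 1e-6 refractivity scaling and the hPa -> Pa factor."""
    return 1e8 / (RHO_W * R_V * (K2P + K3 / tm))

def pwv_from_zwd(zwd_m, tm):
    """Precipitable water vapour (m) from the zenith wet delay (m)."""
    return conversion_factor(tm) * zwd_m

# At Tm ~ 275 K roughly 15% of the wet delay is precipitable water
pi_275 = conversion_factor(275.0)
```

Because the factor grows with Tm, a few-kelvin error in Tm maps directly into a percent-level error in PWV, which is why the abstract emphasizes Tm accuracy.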

  2. Evaluation of atmospheric precipitable water from reanalysis products using homogenized radiosonde observations over China

    NASA Astrophysics Data System (ADS)

    Zhao, Tianbao; Wang, Juanhuai; Dai, Aiguo

    2015-10-01

    Many multidecadal atmospheric reanalysis products are available now, but their consistency and reliability are far from perfect. In this study, atmospheric precipitable water (PW) from the National Centers for Environmental Prediction/National Center for Atmospheric Research (NCEP/NCAR), NCEP/Department of Energy (DOE), Modern Era Retrospective-Analysis for Research and Applications (MERRA), Japanese 55 year Reanalysis (JRA-55), JRA-25, ERA-Interim, ERA-40, Climate Forecast System Reanalysis (CFSR), and 20th Century Reanalysis version 2 is evaluated against homogenized radiosonde observations over China during 1979-2012 (1979-2001 for ERA-40). Results suggest that the PW biases in the reanalyses are within ˜20% for most of northern and eastern China, but the reanalyses underestimate the observed PW by 20%-40% over western China and by ˜60% over the southwestern Tibetan Plateau. The newer-generation reanalyses (e.g., JRA-25, JRA-55, CFSR, and ERA-Interim) have smaller root-mean-square errors than the older-generation ones (NCEP/NCAR, NCEP/DOE, and ERA-40). Most of the reanalyses reproduce well the observed PW climatology and interannual variations over China. However, few reanalyses capture the observed long-term PW changes, primarily because they show spurious wet biases before about 2002. This deficiency results mainly from discontinuities in the reanalysis relative humidity fields in the middle-lower troposphere, caused by the wet bias in older radiosonde records assimilated into the reanalyses. An empirical orthogonal function (EOF) analysis revealed two leading modes, with robust spatial patterns, that represent the long-term PW changes and El Niño-Southern Oscillation-related interannual variations. The reanalysis products, especially MERRA and JRA-25, roughly capture these EOF modes, which account for over 50% of the total variance. 
The results show that even during the post-1979 satellite era, discontinuities in radiosonde data can still induce large spurious long-term changes in reanalysis PW and other related fields. Thus, more efforts are needed to remove spurious changes in input data for future long-term reanalyses.
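    An EOF analysis of the kind described above is an eigendecomposition of the anomaly covariance matrix: the leading mode is the spatial pattern capturing the largest share of the variance. A stdlib-only sketch of the leading mode via power iteration, on toy data rather than the radiosonde PW fields:

```python
def leading_eof(field):
    """Leading EOF of a (time x station) field via power iteration.
    Returns (pattern, fraction_of_total_variance_explained)."""
    nt, nx = len(field), len(field[0])
    means = [sum(row[j] for row in field) / nt for j in range(nx)]
    anom = [[row[j] - means[j] for j in range(nx)] for row in field]
    # spatial covariance matrix C = A^T A / (nt - 1)
    cov = [[sum(anom[t][i] * anom[t][j] for t in range(nt)) / (nt - 1)
            for j in range(nx)] for i in range(nx)]
    v = [1.0] * nx
    for _ in range(200):  # power iteration converges to the top eigenvector
        w = [sum(cov[i][j] * v[j] for j in range(nx)) for i in range(nx)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    lam = sum(v[i] * sum(cov[i][j] * v[j] for j in range(nx))
              for i in range(nx))
    total = sum(cov[i][i] for i in range(nx))  # total variance = trace(C)
    return v, lam / total

# Toy field: station 2 is exactly twice station 1, so one mode explains all
field = [[1.0, 2.0], [2.0, 4.0], [-1.0, -2.0], [0.0, 0.0], [3.0, 6.0]]
pattern, frac = leading_eof(field)
```

In practice EOFs of large gridded fields are computed with an SVD of the anomaly matrix rather than explicit power iteration, but the decomposition being sought is the same.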

  3. Japanese 25-year reanalysis (JRA-25)

    NASA Astrophysics Data System (ADS)

    Ohkawara, Nozomu

    2006-12-01

    A long-term global atmospheric reanalysis, the Japanese 25-year Reanalysis (JRA-25), which covers 1979 to 2004, was completed using the Japan Meteorological Agency (JMA) numerical assimilation and forecast system. This is the first long-term reanalysis undertaken in Asia. JMA's latest numerical assimilation system, and observational data collected as comprehensively as possible, were used in JRA-25 to generate a consistent, high-quality reanalysis dataset to contribute to climate research and operational work. One purpose of JRA-25 is to produce a particularly high-quality analysis for the Asian region. 6-hourly data assimilation cycles were performed, producing 6-hourly atmospheric analysis and forecast fields of various physical variables. The global model used in JRA-25 has a spectral resolution of T106 (equivalent to a horizontal grid size of around 120 km) and 40 vertical layers with the top level at 0.4 hPa. For observational data, a great deal of satellite data was used in addition to conventional surface and upper-air data. Atmospheric Motion Vector (AMV) data retrieved from geostationary satellites, brightness temperature (TBB) data from the TIROS Operational Vertical Sounder (TOVS), precipitable water retrieved from microwave radiometer radiances from orbiting satellites, and some other satellite data were assimilated with a 3-dimensional variational method (3DVAR). Many advantages have been found in the JRA-25 reanalysis. Firstly, the forecast 6-hour global total precipitation in JRA-25 performs well; its distribution and amount are properly represented in both space and time. JRA-25 has the best performance among reanalyses with respect to the multiyear time series of global precipitation, with few unrealistic variations caused by degraded quality of satellite data due to volcanic eruptions. 
Secondly, JRA-25 is the first reanalysis which assimilated wind profiles surrounding tropical cyclones retrieved from historical best track information; tropical cyclones were analyzed correctly in all the global regions. Additionally, low-level cloud along the subtropical western coast of continents is forecast very accurately, and snow depth analysis is also good.

  4. SEARCH Workshop on Large-Scale Atmosphere/Cryosphere Observations

    NASA Technical Reports Server (NTRS)

    2002-01-01

    The purpose of the workshop held in Seattle during 27-29 November 2001 was to review existing land, sea ice, and atmospheric observations and the prospect for an Arctic System Reanalysis, through white papers, invited speakers, and panels. A major task for SEARCH was to determine how existing observation systems can be best used and enhanced to understand and anticipate the course of the ongoing changes in the Arctic. The primary workshop conclusion is that there is no cohesion among various Arctic disciplines and data types to form a complete observation set of Arctic change; a second workshop conclusion is that present data sets are vastly underutilized in understanding Arctic change; a third conclusion is that a distributed observing system must accommodate a wide range of spatial patterns of variability.

  5. Developing a high-resolution regional atmospheric reanalysis for Australia

    NASA Astrophysics Data System (ADS)

    White, Christopher; Fox-Hughes, Paul; Su, Chun-Hsu; Jakob, Dörte; Kociuba, Greg; Eisenberg, Nathan; Steinle, Peter; Harris, Rebecca; Corney, Stuart; Love, Peter; Remenyi, Tomas; Chladil, Mark; Bally, John; Bindoff, Nathan

    2017-04-01

    A dynamically consistent, long-term atmospheric reanalysis can be used to support high-quality assessments of environmental risk and likelihood of extreme events. Most reanalyses are presently based on coarse-scale global systems that are not suitable for regional assessments in fire risk, water and natural resources, amongst others. The Australian Bureau of Meteorology is currently working to close this gap by producing a high-resolution reanalysis over the Australian and New Zealand region to construct a sequence of atmospheric conditions at sub-hourly intervals over the past 25 years from 1990. The Australia reanalysis consists of a convective-scale analysis nested within a 12 km regional-scale reanalysis, which is bounded by a coarse-scale ERA-Interim reanalysis that provides the required boundary and initial conditions. We use an unchanging atmospheric modelling suite based on the UERRA system used at the UK Met Office and the more recent version of the Bureau of Meteorology's operational numerical prediction model used in ACCESS-R (Australian Community Climate and Earth-System Simulator-Regional system). An advanced (4-dimensional variational) data assimilation scheme is used to optimally combine model physics with multiple observations from aircraft, sondes, surface observations and satellites to create a best estimate of the state of the atmosphere over a 6-hour moving window. This analysis is in turn used to drive a higher-resolution (1.5 km) downscaling model over selected subdomains within Australia, currently eastern New South Wales and Tasmania, with the capability to support this anywhere in the Australia-New Zealand domain. The temporal resolution of the gridded analysis fields for both the regional and higher-resolution subdomains is generally one hour, with many fields such as 10 m winds and 2 m temperatures available every 10 minutes. 
The reanalysis also produces many other variables that include wind, temperature, moisture, pressure, cloud cover, precipitation, evaporation, soil water, and energy fluxes. In this presentation, we report on the implementation of the Australia regional reanalysis and results from the first stages of the project, with a focus on the Tasmanian subdomain. An initial benchmarking 1.5 km data set - referred to as the 'Initial Analysis' - has been constructed over the subdomains, consisting of regridded and harmonised analysis and short-term forecast fields from the operational ACCESS-C model using the past 5 years (2011-2015) of archived data. Evaluation of the Initial Analysis against surface observations from automatic weather stations indicates changes in model skill over time that may be attributed to changes in NWP and assimilation systems, and in model cycling frequency. Preliminary evaluations of the reanalysis across Tasmania and its inter-comparisons with the Initial Analysis and the ERA-Interim reanalysis products will be presented, including some features across the Tasmanian subdomain such as means and extremes of analysed weather variables. Finally, we describe a number of applications of the reanalysis across Tasmania of immediate interest to meteorologists, fire and landscape managers and other members of the emergency management community, including the use of the data to create post-processed fields such as soil dryness, tornadoes and fire danger indices for forest fire danger risk assessment, as well as a climatology of the Continuous Haines Index.

  6. NVAP

    Atmospheric Science Data Center

    2013-09-05

    ... of creating stable, community accepted Earth System Data Records (ESDRs) for a variety of geophysical time series. A reanalysis and ...

  7. Dynamical Downscaling of Typhoon Vera (1959) and related Storm Surge based on JRA-55 Reanalysis

    NASA Astrophysics Data System (ADS)

    Ninomiya, J.; Takemi, T.; Mori, N.; Shibutani, Y.; Kim, S.

    2015-12-01

    Typhoon Vera (1959) is a historical extreme typhoon that caused Japan's severest typhoon damage, largely through a storm surge of up to 389 cm. Vera deepened to 895 hPa offshore and made landfall at 929.2 hPa. There have been many dynamical downscaling studies of Vera, but accurate simulation has been difficult because of the limited accuracy of global reanalysis data. This study carried out a dynamical downscaling experiment for Vera using WRF, a state-of-the-art atmospheric model, forced by the recent JRA-55 reanalysis. The reproducibility of Typhoon Vera in five global reanalysis datasets was first compared. The comparison shows that, with the exception of JRA-55, the reanalysis data do not contain a strong typhoon signal, so that downscaling from conventional reanalysis data fails. Dynamical downscaling methods for storm surge have been studied extensively (e.g., choice of physical model, nudging, 4D-Var, bogus vortices, and so on). In this study, the size and resolution of the coarse domain were considered. The coarse domain size influences the typhoon route and central pressure, with a larger domain restraining the typhoon's strength. Simulations with different domain sizes show that the threshold for this restraint is whether the coarse domain fully includes the area of wind speeds above 15 m/s around the typhoon. Simulations at different resolutions show that resolution does not affect the typhoon route, while higher resolution yields a stronger simulated typhoon.

  8. Detection of Porphyromonas endodontalis, Porphyromonas gingivalis and Prevotella intermedia in primary endodontic infections in a Chinese population.

    PubMed

    Cao, H; Qi, Z; Jiang, H; Zhao, J; Liu, Z; Tang, Z

    2012-08-01

    To assess the prevalence of three black-pigmented bacterial species (Porphyromonas endodontalis, Porphyromonas gingivalis and Prevotella intermedia) using microarray technology in root canals of teeth associated with primary endodontic infections in a Chinese population. Microbial samples were taken from root canals of 80 teeth with pulp necrosis and primary endodontic infections in a Chinese population. DNA extracted from the samples was amplified by PCR with universal bacterial primers based on 16S rRNA gene sequences, and the products were hybridized with microarrays carrying the specific oligonucleotide probes. The hybridization results were screened with a confocal laser scanner. The Pearson chi-square test and the two-sided Fisher exact test, run in a statistical software package (SAS 8.02), were used to analyse whether significant associations existed between the species and symptoms, as well as between co-existing pairs of the target organisms. The 16S rRNA gene microarray detected at least one of the three test species in 76% of the study teeth. P. endodontalis, P. gingivalis and P. intermedia were found in 50%, 33% and 45%, respectively. A significant association was found for the co-presence of the pair P. endodontalis / P. gingivalis (P < 0.005). Both P. endodontalis (P < 0.05) and P. gingivalis (P < 0.005) had a statistically significant association with the presence of a sinus tract. The simultaneous presence of P. endodontalis and P. gingivalis was also associated with the presence of a sinus tract (P < 0.005) and abscess formation (P < 0.05). The three black-pigmented bacteria were prevalent in teeth with pulp necrosis and primary endodontic infections in a Chinese population. P. gingivalis and P. endodontalis were associated with the presence of a sinus tract and abscess formation. © 2012 International Endodontic Journal.
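    The two-sided Fisher exact test used for the species co-occurrence works directly from a 2x2 detection table and needs no statistical package; it sums the hypergeometric probabilities of all tables at least as extreme as the one observed. A stdlib-only sketch with invented counts (not the study's data):

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact test for the 2x2 table [[a, b], [c, d]]:
    sum the hypergeometric probabilities of every table whose probability
    does not exceed that of the observed table."""
    n = a + b + c + d
    row1, col1 = a + b, a + c
    def prob(x):  # P(X = x) under the hypergeometric null
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)
    p_obs = prob(a)
    lo, hi = max(0, row1 + col1 - n), min(row1, col1)
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs + 1e-12)

# Hypothetical co-detection counts in 80 canals: both species 30,
# only species A 10, only species B 6, neither 34 (invented numbers)
p_value = fisher_exact_two_sided(30, 10, 6, 34)
```

The exact test is preferable to the chi-square approximation when some cells of the table are small, which is common with presence/absence data from modest sample sizes.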

  9. Novel harmonic regularization approach for variable selection in Cox's proportional hazards model.

    PubMed

    Chu, Ge-Jin; Liang, Yong; Wang, Jia-Xuan

    2014-01-01

    Variable selection is an important issue in regression, and a number of variable selection methods involving nonconvex penalty functions have been proposed. In this paper, we investigate a novel harmonic regularization method, which can approximate nonconvex Lq (1/2 < q < 1) regularizations, to select key risk factors in Cox's proportional hazards model using microarray gene expression data. The harmonic regularization method can be efficiently solved using our proposed direct path seeking approach, which produces solutions that closely approximate those for the convex loss function with the nonconvex regularization. Simulation results based on artificial datasets and four real microarray gene expression datasets, including diffuse large B-cell lymphoma (DLBCL), lung cancer, and AML datasets, show that the harmonic regularization method can be more accurate for variable selection than existing Lasso-series methods.

  10. Prediction of clinical behaviour and treatment for cancers.

    PubMed

    Futschik, Matthias E; Sullivan, Mike; Reeve, Anthony; Kasabov, Nikola

    2003-01-01

    Prediction of clinical behaviour and treatment for cancers is based on the integration of clinical and pathological parameters. Recent reports have demonstrated that gene expression profiling provides a powerful new approach for determining disease outcome. If clinical and microarray data each contain independent information then it should be possible to combine these datasets to gain more accurate prognostic information. Here, we have used existing clinical information and microarray data to generate a combined prognostic model for outcome prediction for diffuse large B-cell lymphoma (DLBCL). A prediction accuracy of 87.5% was achieved. This constitutes a significant improvement compared to the previously most accurate prognostic model with an accuracy of 77.6%. The model introduced here may be generally applicable to the combination of various types of molecular and clinical data for improving medical decision support systems and individualising patient care.

  11. "Set in Stone" or "Ray of Hope": Parents' Beliefs about Cause and Prognosis after Genomic Testing of Children Diagnosed with ASD

    ERIC Educational Resources Information Center

    Reiff, Marian; Bugos, Eva; Giarelli, Ellen; Bernhardt, Barbara A.; Spinner, Nancy B.; Sankar, Pamela L.; Mulchandani, Surabhi

    2017-01-01

    Despite increasing utilization of chromosomal microarray analysis (CMA) for autism spectrum disorders (ASD), limited information exists about how results influence parents' beliefs about etiology and prognosis. We conducted in-depth interviews and surveys with 57 parents of children with ASD who received CMA results categorized as pathogenic,…

  12. An 8-Year, High-Resolution Reanalysis of Atmospheric CO2 Mixing Ratios Based on OCO-2 and GOSAT-ACOS Retrievals

    NASA Technical Reports Server (NTRS)

    Weir, B.; Chatterjee, A.; Ott, L. E.; Pawson, S.

    2017-01-01

    The NASA GMAO (Global Modeling and Assimilation Office) reanalysis blends OCO-2 (Orbiting Carbon Observatory 2) and GOSAT-ACOS (Greenhouse Gases Observing Satellite-Atmospheric Carbon Observations from Space) retrievals (top) with GEOS (Goddard Earth Observing System) model predictions (bottom) to estimate the full 3D (three-dimensional) state of CO2 every 3 hours (middle). This poster describes monthly atmospheric growth rates derived from the reanalysis and an application to aircraft data with the potential to aid bias correction.

  13. GeneMesh: a web-based microarray analysis tool for relating differentially expressed genes to MeSH terms.

    PubMed

    Jani, Saurin D; Argraves, Gary L; Barth, Jeremy L; Argraves, W Scott

    2010-04-01

    An important objective of DNA microarray-based gene expression experimentation is determining inter-relationships that exist between differentially expressed genes and biological processes, molecular functions, cellular components, signaling pathways, physiologic processes and diseases. Here we describe GeneMesh, a web-based program that facilitates analysis of DNA microarray gene expression data. GeneMesh relates genes in a query set to categories available in the Medical Subject Headings (MeSH) hierarchical index. The interface enables hypothesis-driven relational analysis to a specific MeSH subcategory (e.g., Cardiovascular System, Genetic Processes, Immune System Diseases etc.) or unbiased relational analysis to broader MeSH categories (e.g., Anatomy, Biological Sciences, Disease etc.). Genes found associated with a given MeSH category are dynamically linked to facilitate tabular and graphical depiction of Entrez Gene information, Gene Ontology information, KEGG metabolic pathway diagrams and intermolecular interaction information. Expression intensity values of groups of genes that cluster in relation to a given MeSH category, gene ontology or pathway can be displayed as heat maps of Z score-normalized values. GeneMesh operates on gene expression data derived from a number of commercial microarray platforms including Affymetrix, Agilent and Illumina. GeneMesh is a versatile web-based tool for testing and developing new hypotheses through relating genes in a query set (e.g., differentially expressed genes from a DNA microarray experiment) to descriptors making up the hierarchical structure of the National Library of Medicine controlled vocabulary thesaurus, MeSH. 
The system further enhances the discovery process by providing links between sets of genes associated with a given MeSH category to a rich set of html linked tabular and graphic information including Entrez Gene summaries, gene ontologies, intermolecular interactions, overlays of genes onto KEGG pathway diagrams and heatmaps of expression intensity values. GeneMesh is freely available online at http://proteogenomics.musc.edu/genemesh/.
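    The Z score normalization behind such heat maps is a standard per-gene rescaling; a minimal sketch (illustrative only, not GeneMesh's actual code) might look like:

```python
import math

def z_score_rows(matrix):
    """Normalize each gene's expression row to zero mean, unit variance;
    constant rows (zero variance) are mapped to all zeros."""
    normalized = []
    for row in matrix:
        mean = sum(row) / len(row)
        var = sum((x - mean) ** 2 for x in row) / len(row)
        std = math.sqrt(var)
        normalized.append([(x - mean) / std if std > 0 else 0.0 for x in row])
    return normalized

# Example: two genes across four samples; the second gene is constant
expr = [[2.0, 4.0, 6.0, 8.0],
        [1.0, 1.0, 1.0, 1.0]]
z = z_score_rows(expr)
```

Each row then has mean 0 and (population) standard deviation 1, so color scales in the heat map compare relative, not absolute, expression.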

  14. Toward a 35-Year North American Precipitation and Surface Reanalysis

    NASA Astrophysics Data System (ADS)

    Gasset, N.; Fortin, V.

    2017-12-01

    In support of the International Watersheds Initiative (IWI) of the International Joint Commission (IJC), a 35-year precipitation and surface reanalysis covering North America at 3-hour and 15-km resolution is currently being developed at the Canadian Meteorological Centre (CMC). A deterministic reforecast / dynamical downscaling approach is followed, in which a global reanalysis (ERA-Interim) provides the initial conditions for the Global Environmental Multi-scale model (GEM). The latter is coupled with precipitation and surface data assimilation systems, namely the Canadian Precipitation Analysis (CaPA) and the Canadian Land Data Assimilation System (CaLDAS). While optimized for computational efficiency in the context of a reforecast experiment, all systems used are closely related to the model versions and configurations currently run operationally at CMC, meaning they have undergone a strict and thorough validation procedure. As a proof of concept, and in order to identify the optimal set-up before producing the full 35-year reanalysis, several configurations of the approach are evaluated for the years 2010-2014 using both the standard CMC validation methodology and more dedicated scores, such as comparison against currently available products (the North American Regional Reanalysis, MERRA-Land and the newly released ERA5 reanalysis). Special attention is dedicated to the evaluation of analysed variables, i.e. precipitation, snow depth, and surface/ground temperature and moisture, over the whole domain of interest. Results from these preliminary samples are very encouraging, and the optimal set-up has been identified. The coupled approach, i.e. GEM+CaPA/CaLDAS, consistently shows clear improvements over classical reforecast and dynamical downscaling where surface observations are present. Furthermore, results are in line with or better than currently available products and the reference CMC operational approach that ran from 2012 to 2016 (GEM 3.3, 10-km resolution). 
This reanalysis will allow for bias correction of current estimates and forecasts, and will help decision makers understand and communicate how much the current forecast state of the system differs from the recent past.

  15. Steps towards a consistent Climate Forecast System Reanalysis wave hindcast (1979-2016)

    NASA Astrophysics Data System (ADS)

    Stopa, Justin E.; Ardhuin, Fabrice; Huchet, Marion; Accensi, Mickael

    2017-04-01

    Surface gravity waves are increasingly recognized as playing an important role within the climate system. Wave hindcasts and reanalysis products covering long time series (>30 years) have been instrumental in understanding and describing the wave climate of the past several decades, and have enabled a better understanding of extreme waves and inter-annual variability. Wave hindcasts have the advantage of covering the oceans at higher space-time resolution than is possible with conventional observations from satellites and buoys. Some wave reanalysis systems, like ECMWF's ERA-Interim, directly include a wave model coupled to the ocean and atmosphere; otherwise, reanalysis wind fields are used to drive a wave model that reproduces the wave field over a long time series. The ERA-Interim dataset is consistent in time but cannot adequately resolve extreme waves. On the other hand, the NCEP Climate Forecast System Reanalysis (CFSR) wind field better resolves extreme wind speeds, but suffers from discontinuous features in time due to the varying quantity and quality of the remote sensing data incorporated into the product. A consistent hindcast that resolves extreme waves therefore still eludes us, limiting our understanding of the wave climate. In this study, we systematically correct the CFSR wind field to reproduce a wave field that is homogeneous in time. To verify the homogeneity of our hindcast, we compute error metrics on a monthly basis using observations from a merged altimeter wave database that has been calibrated and quality controlled over 1985-2016. Before 1985 only a few wave observations exist, limited to a select number of wave buoys, mostly in the Northern Hemisphere. We therefore supplement the wave observations with seismic data, which respond to nonlinear wave interactions created by opposing waves of nearly equal wavenumber. Within the CFSR wave hindcast, we find both spatial and temporal discontinuities in the error metrics. 
The Southern Hemisphere often has larger wind speed biases than the Northern Hemisphere, and we propose a simple correction that reduces these features by applying a taper shaped by a half-Hanning window. The discontinuous features in time are corrected by scaling the entire wind field by factors typically ranging from 1% to 3%. Our analysis is performed on monthly time series, and we expect the monthly statistics to be well suited to climate studies.
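    The two corrections described above can be sketched generically; the function names, window length and bias values below are illustrative assumptions, not the authors' code:

```python
import math

def half_hanning_taper(n):
    """Descending half of a Hanning window: weight 1.0 at the first point,
    tapering smoothly to 0.0 at the last."""
    return [0.5 * (1.0 + math.cos(math.pi * i / (n - 1))) for i in range(n)]

def remove_discontinuity(winds, scale):
    """Correct a temporal discontinuity by scaling the wind field uniformly,
    e.g. scale = 1.02 for a +2% adjustment of one homogeneous segment."""
    return [w * scale for w in winds]

# Taper a hypothetical 2 m/s Southern Hemisphere bias from full strength
# at high latitudes down to zero toward the equator
weights = half_hanning_taper(5)
bias_correction = [2.0 * w for w in weights]

# Scale an example monthly wind series segment by +2%
scaled = remove_discontinuity([10.0, 12.0, 8.0], 1.02)
```

The half-Hanning shape avoids introducing a new discontinuity at the edge of the corrected region, since the correction decays smoothly to zero.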

  16. Projecting Wind Energy Potential Under Climate Change with Ensemble of Climate Model Simulations

    NASA Astrophysics Data System (ADS)

    Jain, A.; Shashikanth, K.; Ghosh, S.; Mukherjee, P. P.

    2013-12-01

    Recent years have witnessed increasing global concern over energy sustainability and security, triggered by a number of issues such as (though not limited to) fossil fuel depletion, energy resource geopolitics, the debate over economic efficiency versus population growth, environmental concerns and climate change. Wind energy is a renewable and sustainable form of energy in which wind turbines convert the kinetic energy of wind into electrical energy. Global warming and differential surface heating may significantly impact wind velocity and hence wind energy potential. Sustainable design of wind turbines requires understanding the impacts of climate change on wind energy potential, which we evaluate here with multiple General Circulation Models (GCMs). GCMs simulate climate variables globally under the greenhouse emission scenarios provided as Representative Concentration Pathways (RCPs). Here we use new-generation climate model outputs from the Coupled Model Intercomparison Project Phase 5 (CMIP5). We first compute the wind energy potential with reanalysis data (NCEP/NCAR) at a spatial resolution of 2.5°: at each grid point the wind speed data are fitted to a Weibull distribution, and the wind energy densities are computed from the Weibull parameters. The same methodology is then applied to the CMIP5 historical outputs (the resultant of the U-wind and V-wind components) of MRI, CMCC, BCC, CanESM and INMCM4. This is performed globally and separately for the four seasons MAM, JJA, SON and DJF. We observe that the multi-model average of wind energy density for the historic period has significant bias with respect to the reanalysis product. We therefore develop a quantile-based superensemble approach in which GCM quantiles corresponding to selected CDF values are regressed against the reanalysis data. This regression approach addresses both the bias in individual GCMs and their combination. 
With the superensemble, we observe that the historical wind energy density agrees well with the reanalysis/observed output. We apply the same approach to the future under RCP scenarios, and observe spatially and temporally varying changes in wind energy density across the globe. The underlying assumption is that the regression relationship will also hold for the future. The results highlight the need to revise the design standards of wind turbines at different locations in light of climate change, and at the same time the possible need for height modifications to existing turbines so that they produce the same energy in the future.
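    The Weibull step above follows a standard formula: for wind speeds distributed as Weibull(k, c), the mean wind power density is E = ½ ρ c³ Γ(1 + 3/k). A minimal sketch (the air density and example parameters are illustrative, not values from the study):

```python
import math

def weibull_wind_power_density(k, c, rho=1.225):
    """Mean wind power density (W/m^2) for wind speeds following a Weibull
    distribution with shape k and scale c (m/s), at air density rho (kg/m^3):
        E = 0.5 * rho * c**3 * Gamma(1 + 3/k)
    """
    return 0.5 * rho * c ** 3 * math.gamma(1.0 + 3.0 / k)

# Example: Rayleigh winds (k = 2) with scale parameter 8 m/s
density = weibull_wind_power_density(2.0, 8.0)
```

For k = 2 the distribution reduces to a Rayleigh distribution, a common default when only mean wind speed is known.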

  17. Atmospheric response to Saharan dust deduced from ECMWF reanalysis increments

    NASA Astrophysics Data System (ADS)

    Kishcha, P.; Alpert, P.; Barkan, J.; Kirchner, I.; Machenhauer, B.

    2003-04-01

    This study focuses on the atmospheric temperature response to dust deduced from a new source of data: the European Reanalysis (ERA) increments. These increments are the systematic errors of global climate models, generated in the reanalysis procedure. The model errors result not only from the lack of desert dust but also from a complex combination of many kinds of model error. Over the Sahara desert, the dust radiative effect is believed to be the predominant model defect and should significantly affect the increments. This dust effect was examined by considering the correlation between the increments and remotely sensed dust. Comparisons were made between April temporal variations of the ERA analysis increments and variations of the Total Ozone Mapping Spectrometer aerosol index (AI) between 1979 and 1993. A distinctive structure was identified in the distribution of correlation, composed of three nested areas with high positive correlation (>0.5), low correlation, and high negative correlation (<-0.5). The innermost positive correlation area (PCA) is a large area near the center of the Sahara desert; for some local maxima inside this area the correlation even exceeds 0.8. The outermost negative correlation area (NCA) is not uniform: it consists of areas over the eastern and western parts of North Africa with relatively small amounts of dust. Inside those areas, both positive and negative high correlations exist at pressure levels ranging from 850 to 700 hPa, with peak values near 775 hPa. Dust-forced heating (cooling) inside the PCA (NCA) is accompanied by changes in the static stability of the atmosphere above the dust layer. The reanalysis data of the European Centre for Medium-Range Weather Forecasts (ECMWF) suggest that the PCA (NCA) corresponds mainly to anticyclonic (cyclonic) flow, negative (positive) vorticity, and downward (upward) airflow. These facts indicate an interaction between dust-forced heating/cooling and the atmospheric circulation. 
The April correlation results are supported by analysis of the vertical distribution of dust concentration derived from the 24-hour dust prediction system at Tel Aviv University (website: http://earth.nasa.proj.ac.il/dust/current/). For other months the analysis is more complicated because of the substantial increase in humidity accompanying the northward progress of the ITCZ, and its significant impact on the increments.

  18. A novel feature extraction approach for microarray data based on multi-algorithm fusion

    PubMed Central

    Jiang, Zhu; Xu, Rong

    2015-01-01

    Feature extraction is one of the most important and effective methods for reducing dimensionality in data mining, given the emergence of high-dimensional data such as microarray gene expression data. Feature extraction for gene selection mainly serves two purposes. One is to identify certain disease-related genes. The other is to find a compact set of discriminative genes for building a pattern classifier with reduced complexity and improved generalization capability. Depending on the purpose of gene selection, two types of feature extraction algorithms, ranking-based feature extraction and set-based feature extraction, are employed in microarray gene expression data analysis. In ranking-based feature extraction, features are evaluated on an individual basis, generally without considering inter-relationships between features, whereas set-based feature extraction evaluates features based on their role in a feature set, taking dependencies between features into account. Like learning methods, feature extraction faces a problem with generalization ability, namely robustness; however, the issue of robustness is often overlooked in feature extraction. In order to improve the accuracy and robustness of feature extraction for microarray data, a novel approach based on multi-algorithm fusion is proposed. By fusing the features selected by different types of feature extraction algorithms, the proposed approach is able to improve feature extraction performance. The new approach is tested on gene expression datasets including Colon cancer, CNS, DLBCL and Leukemia data. The test results show that the performance of this algorithm is better than that of existing solutions. PMID:25780277
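    One simple way to fuse the outputs of several ranking-based extractors, sketched here purely as an illustration (not the paper's exact fusion rule), is to aggregate each gene's per-algorithm ranks:

```python
def fuse_rankings(score_lists):
    """Fuse several feature scorings (dicts mapping gene -> score, higher is
    better) by average rank; returns genes ordered best-first."""
    genes = list(score_lists[0])
    avg_rank = {}
    for g in genes:
        ranks = []
        for scores in score_lists:
            # Rank of gene g under this scoring (1 = best)
            ordered = sorted(scores, key=scores.get, reverse=True)
            ranks.append(ordered.index(g) + 1)
        avg_rank[g] = sum(ranks) / len(ranks)
    return sorted(genes, key=lambda g: avg_rank[g])

# Two hypothetical scorings (e.g. a t-statistic and an information gain)
fused = fuse_rankings([{"g1": 3.0, "g2": 2.0, "g3": 1.0},
                       {"g1": 0.1, "g2": 0.9, "g3": 0.5}])
```

Because the aggregate depends on several criteria at once, a gene that one algorithm happens to overrate is pulled back by the others, which is the robustness argument behind fusion.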

  19. A novel feature extraction approach for microarray data based on multi-algorithm fusion.

    PubMed

    Jiang, Zhu; Xu, Rong

    2015-01-01

    Feature extraction is one of the most important and effective methods for reducing dimensionality in data mining, given the emergence of high-dimensional data such as microarray gene expression data. Feature extraction for gene selection mainly serves two purposes. One is to identify certain disease-related genes. The other is to find a compact set of discriminative genes for building a pattern classifier with reduced complexity and improved generalization capability. Depending on the purpose of gene selection, two types of feature extraction algorithms, ranking-based feature extraction and set-based feature extraction, are employed in microarray gene expression data analysis. In ranking-based feature extraction, features are evaluated on an individual basis, generally without considering inter-relationships between features, whereas set-based feature extraction evaluates features based on their role in a feature set, taking dependencies between features into account. Like learning methods, feature extraction faces a problem with generalization ability, namely robustness; however, the issue of robustness is often overlooked in feature extraction. In order to improve the accuracy and robustness of feature extraction for microarray data, a novel approach based on multi-algorithm fusion is proposed. By fusing the features selected by different types of feature extraction algorithms, the proposed approach is able to improve feature extraction performance. The new approach is tested on gene expression datasets including Colon cancer, CNS, DLBCL and Leukemia data. The test results show that the performance of this algorithm is better than that of existing solutions.

  20. Transcriptome-Wide Mega-Analyses Reveal Joint Dysregulation of Immunologic Genes and Transcription Regulators in Brain and Blood in Schizophrenia

    PubMed Central

    Hess, Jonathan L.; Tylee, Daniel S.; Barve, Rahul; de Jong, Simone; Ophoff, Roel A.; Kumarasinghe, Nishantha; Tooney, Paul; Schall, Ulrich; Gardiner, Erin; Beveridge, Natalie Jane; Scott, Rodney J.; Yasawardene, Surangi; Perera, Antionette; Mendis, Jayan; Carr, Vaughan; Kelly, Brian; Cairns, Murray; Tsuang, Ming T.; Glatt, Stephen J.

    2016-01-01

    The application of microarray technology in schizophrenia research was heralded as paradigm-shifting, as it allowed for high-throughput assessment of cell and tissue function. This technology was widely adopted, initially in studies of postmortem brain tissue, and later in studies of peripheral blood. The collective body of schizophrenia microarray literature contains apparent inconsistencies between studies, with failures to replicate top hits, in part due to small sample sizes, cohort-specific effects, differences in array types, and other confounders. In an attempt to summarize existing studies of schizophrenia cases and non-related comparison subjects, we performed two mega-analyses of a combined set of microarray data from postmortem prefrontal cortices (n = 315) and from ex-vivo blood tissues (n = 578). We adjusted regression models per gene to remove non-significant covariates, providing best-estimates of transcripts dysregulated in schizophrenia. We also examined dysregulation of functionally related gene sets and gene co-expression modules, and assessed enrichment of cell types and genetic risk factors. The identities of the most significantly dysregulated genes were largely distinct for each tissue, but the findings indicated common emergent biological functions (e.g. immunity) and regulatory factors (e.g., predicted targets of transcription factors and miRNA species across tissues). Our network-based analyses converged upon similar patterns of heightened innate immune gene expression in both brain and blood in schizophrenia. We also constructed generalizable machine-learning classifiers using the blood-based microarray data. Our study provides an informative atlas for future pathophysiologic and biomarker studies of schizophrenia. PMID:27450777

  1. Transcriptome-wide mega-analyses reveal joint dysregulation of immunologic genes and transcription regulators in brain and blood in schizophrenia.

    PubMed

    Hess, Jonathan L; Tylee, Daniel S; Barve, Rahul; de Jong, Simone; Ophoff, Roel A; Kumarasinghe, Nishantha; Tooney, Paul; Schall, Ulrich; Gardiner, Erin; Beveridge, Natalie Jane; Scott, Rodney J; Yasawardene, Surangi; Perera, Antionette; Mendis, Jayan; Carr, Vaughan; Kelly, Brian; Cairns, Murray; Tsuang, Ming T; Glatt, Stephen J

    2016-10-01

    The application of microarray technology in schizophrenia research was heralded as paradigm-shifting, as it allowed for high-throughput assessment of cell and tissue function. This technology was widely adopted, initially in studies of postmortem brain tissue, and later in studies of peripheral blood. The collective body of schizophrenia microarray literature contains apparent inconsistencies between studies, with failures to replicate top hits, in part due to small sample sizes, cohort-specific effects, differences in array types, and other confounders. In an attempt to summarize existing studies of schizophrenia cases and non-related comparison subjects, we performed two mega-analyses of a combined set of microarray data from postmortem prefrontal cortices (n=315) and from ex-vivo blood tissues (n=578). We adjusted regression models per gene to remove non-significant covariates, providing best-estimates of transcripts dysregulated in schizophrenia. We also examined dysregulation of functionally related gene sets and gene co-expression modules, and assessed enrichment of cell types and genetic risk factors. The identities of the most significantly dysregulated genes were largely distinct for each tissue, but the findings indicated common emergent biological functions (e.g. immunity) and regulatory factors (e.g., predicted targets of transcription factors and miRNA species across tissues). Our network-based analyses converged upon similar patterns of heightened innate immune gene expression in both brain and blood in schizophrenia. We also constructed generalizable machine-learning classifiers using the blood-based microarray data. Our study provides an informative atlas for future pathophysiologic and biomarker studies of schizophrenia. Published by Elsevier B.V.

  2. Supervised group Lasso with applications to microarray data analysis

    PubMed Central

    Ma, Shuangge; Song, Xiao; Huang, Jian

    2007-01-01

    Background A tremendous amount of effort has been devoted to identifying genes for diagnosis and prognosis of diseases using microarray gene expression data. It has been demonstrated that gene expression data have a cluster structure, where the clusters consist of co-regulated genes that tend to have coordinated functions. However, most available statistical methods for gene selection do not take this cluster structure into consideration. Results We propose a supervised group Lasso approach that takes into account the cluster structure in gene expression data for gene selection and predictive model building. For gene expression data without biological cluster information, we first divide genes into clusters using the K-means approach and determine the optimal number of clusters using the Gap method. The supervised group Lasso consists of two steps. In the first step, we identify important genes within each cluster using the Lasso method. In the second step, we select important clusters using the group Lasso. Tuning parameters are determined using V-fold cross validation at both steps to allow for further flexibility. Prediction performance is evaluated using leave-one-out cross validation. We apply the proposed method to disease classification and survival analysis with microarray data. Conclusion We analyze four microarray data sets using the proposed approach: two cancer data sets with binary cancer occurrence as outcomes and two lymphoma data sets with survival outcomes. The results show that the proposed approach is capable of identifying a small number of influential gene clusters and important genes within those clusters, and has better prediction performance than existing methods. PMID:17316436
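    The group-selection step at the heart of the approach can be illustrated with a small proximal-gradient group Lasso on synthetic data. This is a generic sketch, not the authors' implementation: the clustering step is assumed already done (here two fixed clusters of two genes each), and the step size, penalty weight and iteration count are arbitrary illustration values.

```python
import numpy as np

def group_soft_threshold(v, t):
    """Block soft-thresholding: shrink a group's coefficient vector toward
    zero, zeroing the whole group when its norm falls below t."""
    norm = np.linalg.norm(v)
    return np.zeros_like(v) if norm <= t else (1.0 - t / norm) * v

def group_lasso(X, y, groups, lam, lr=0.1, iters=1000):
    """Proximal gradient descent for least squares with a group-lasso
    penalty. groups: list of column-index lists, one per gene cluster."""
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(iters):
        # Gradient step on the least-squares loss
        beta = beta - lr * (X.T @ (X @ beta - y) / n)
        # Proximal step: shrink each cluster's block of coefficients
        for idx in groups:
            beta[idx] = group_soft_threshold(beta[idx],
                                             lr * lam * np.sqrt(len(idx)))
    return beta

# Synthetic example: only the first cluster of "genes" drives the outcome,
# so the second cluster's coefficients should be zeroed out as a block
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 4))
y = 2.0 * X[:, 0] + 1.0 * X[:, 1]
beta = group_lasso(X, y, groups=[[0, 1], [2, 3]], lam=0.5)
```

The whole-group zeroing is what distinguishes the group Lasso from the ordinary Lasso, which shrinks coefficients individually.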

  3. Testing forward model against OCO-2 and TANSO-FTS/GOSAT observed spectra in near infrared range

    NASA Astrophysics Data System (ADS)

    Zadvornykh, Ilya V.; Gribanov, Konstantin G.

    2015-11-01

    The existing software package FIRE-ARMS (Fine InfraRed Explorer for Atmospheric Remote MeasurementS) was modified by embedding the vector radiative transfer model VLIDORT, so that the tool now covers both the thermal (TIR) and near-infrared (NIR) regions. We performed forward simulation of near-infrared spectra of outgoing radiation at the top of the atmosphere, accounting for multiple scattering in a cloudless atmosphere. Simulated spectra are compared with spectra measured by TANSO-FTS/GOSAT and OCO-2 under cloudless conditions over Western Siberia. NCEP/NCAR reanalysis data were used to complete the model atmosphere.

  4. Impact of bias-corrected reanalysis-derived lateral boundary conditions on WRF simulations

    NASA Astrophysics Data System (ADS)

    Moalafhi, Ditiro Benson; Sharma, Ashish; Evans, Jason Peter; Mehrotra, Rajeshwar; Rocheta, Eytan

    2017-08-01

    Lateral and lower boundary conditions derived from a suitable global reanalysis data set form the basis for deriving a dynamically consistent, finer-resolution downscaled product for climate and hydrological assessment studies. A problem with this, however, is that systematic biases are present in the global reanalysis data sets that form these boundaries, and these biases can be carried into the downscaled simulations, reducing their accuracy or efficacy. In this work, three Weather Research and Forecasting (WRF) model downscaling experiments are undertaken to investigate the impact of bias correcting European Centre for Medium-Range Weather Forecasts ERA-Interim (ERA-I) atmospheric temperature and relative humidity using Atmospheric Infrared Sounder (AIRS) satellite data. The downscaling is performed over a domain centered on southern Africa for the years 2003 to 2012. For each variable, bias correction is applied either to the mean alone or to both the mean and the standard deviation at each grid cell. The resultant WRF simulations of near-surface temperature and precipitation are evaluated seasonally and annually against global gridded observational data sets and compared with simulations driven by the raw ERA-I reanalysis. The study reveals inconsistencies between the impact of the bias correction prior to downscaling and the resultant model simulations after downscaling. WRF simulations with mean and standard deviation bias correction are, however, found to be marginally better than those with mean-only bias correction and those driven by the raw ERA-I reanalysis. Performance, however, differs when assessing different attributes of the downscaled field. This raises questions about the efficacy of the correction procedures adopted.
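    The two correction variants compared above amount to standard first- and second-moment rescaling at each grid cell; a minimal sketch (function and variable names are illustrative, not from the study):

```python
def correct_mean(x, model_mean, ref_mean):
    """Mean-only correction: shift the model value by the mean bias
    (model climatology minus reference climatology) at this grid cell."""
    return x - (model_mean - ref_mean)

def correct_mean_std(x, model_mean, model_std, ref_mean, ref_std):
    """Mean-and-standard-deviation correction: shift and rescale so the
    corrected values match the reference climatology's first two moments."""
    return ref_mean + (x - model_mean) * ref_std / model_std

# A model value of 12 K anomaly-space, model climatology (10, 2),
# reference (e.g. AIRS-derived) climatology (8, 1)
shifted = correct_mean(12.0, 10.0, 8.0)
rescaled = correct_mean_std(12.0, 10.0, 2.0, 8.0, 1.0)
```

The second form also damps (or inflates) the model's variability, which is why the paper evaluates it separately from the mean-only correction.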

  5. Severe Weather Environments in Atmospheric Reanalyses

    NASA Astrophysics Data System (ADS)

    King, A. T.; Kennedy, A. D.

    2017-12-01

    Atmospheric reanalyses combine historical observational data using a fixed assimilation scheme to achieve a dynamically coherent representation of the atmosphere. How well these reanalyses represent severe weather environments via proxies is poorly defined. To quantify the performance of reanalyses, a database of proximity soundings near severe storms from the Rapid Update Cycle 2 (RUC-2) model will be compared to a suite of reanalyses including the North American Regional Reanalysis (NARR), European Interim Reanalysis (ERA-Interim), Modern-Era Retrospective Analysis for Research and Applications, Version 2 (MERRA-2), Japanese 55-year Reanalysis (JRA-55), 20th Century Reanalysis (20CR), and Climate Forecast System Reanalysis (CFSR). A variety of severe weather parameters will be calculated from these soundings, including convective available potential energy (CAPE), storm relative helicity (SRH), the supercell composite parameter (SCP), and the significant tornado parameter (STP). The soundings will be processed using SHARPpy, an open-source Python module for calculating severe weather parameters. Preliminary results indicate that NARR and JRA-55 are significantly more skilled at reproducing severe weather environments than the other reanalyses; the primary difference is a significant negative bias in the thermodynamic parameters of the remaining reanalyses. To facilitate climatological studies, the scope of work will be expanded to compute these parameters for the entire domain and duration of selected reanalyses. Preliminary results from this effort will be presented and compared to observations at select locations. This dataset will be made publicly available to the larger scientific community, and details of the product will be provided.
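    Of the listed parameters, the supercell composite parameter is itself a simple combination of the others. One widely used formulation is sketched below as an illustration; SHARPpy's internal definition may differ in details such as effective-layer handling, so treat this as a hedged sketch rather than the module's code:

```python
def supercell_composite(mucape, eff_srh, eff_shear):
    """Supercell composite parameter from most-unstable CAPE (J/kg),
    effective storm relative helicity (m^2/s^2) and effective bulk
    shear (m/s). The shear term is zeroed below 10 m/s and capped at
    1.0 above 20 m/s in this common formulation."""
    if eff_shear < 10.0:
        shear_term = 0.0
    elif eff_shear > 20.0:
        shear_term = 1.0
    else:
        shear_term = eff_shear / 20.0
    return (mucape / 1000.0) * (eff_srh / 50.0) * shear_term

# A strongly sheared, high-CAPE environment
scp = supercell_composite(2000.0, 100.0, 25.0)
```

SCP values above roughly 1 are conventionally read as supportive of supercells, so biased CAPE or SRH in a reanalysis propagates directly into biased proxy climatologies.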

  6. REMO poor man's reanalysis

    NASA Astrophysics Data System (ADS)

    Ries, H.; Moseley, C.; Haensler, A.

    2012-04-01

    Reanalyses depict the state of the atmosphere as a best fit in space and time of many atmospheric observations in a physically consistent way. By essentially solving the data assimilation problem in a very accurate manner, reanalysis results can be used as a reference for model evaluation and as forcing data sets for different model applications. However, the spatial resolution of the most common and accepted reanalysis data sets (e.g. JRA-25, ERA-Interim) ranges from approximately 124 km to 80 km. This resolution is too coarse to simulate certain small-scale processes often associated with extreme events. In addition, many applications need more highly resolved forcing data (e.g. land-surface models, and tools for identifying and assessing hydrological extremes). We therefore downscaled the ERA-Interim reanalysis over the EURO-CORDEX domain for the period 1989 to 2008 to a horizontal resolution of approximately 12 km. The downscaling is performed by nudging REMO simulations to the lower and lateral boundary conditions of the reanalysis, and by re-initializing the model every 24 hours ("REMO in forecast mode"). In this study the following three questions are addressed: 1.) Does the REMO poor man's reanalysis meet the needs (accuracy, extreme value distribution) for validation and forcing? 2.) What lessons can be learned about the model used for downscaling? Since REMO is used as a pure downscaling procedure, any systematic deviations from ERA-Interim result from poor process modelling, not from predictability limitations. 3.) How much of the small-scale information generated by the downscaling model is lost through frequent initializations? A comparison to a simulation performed in climate mode will be presented.

  7. Development of the NHM-LETKF regional reanalysis system assimilating conventional observations only

    NASA Astrophysics Data System (ADS)

    Fukui, S.; Iwasaki, T.; Saito, K. K.; Seko, H.; Kunii, M.

    2016-12-01

    Long-term, high-resolution atmospheric fields are very useful for studying mesoscale responses to climate change and for analyzing extreme events. We are developing an NHM-LETKF (local ensemble transform Kalman filter with the nonhydrostatic model of the Japan Meteorological Agency (JMA)) regional reanalysis system assimilating only conventional observations that are available over roughly 60 years, such as surface observations at observatories and upper-air observations from radiosondes. The domain covers Japan and its surroundings. Before performing the long-term reanalysis, an experiment using the system was conducted for August 2014 in order to identify the strengths and problems of the regional reanalysis system. In this study, we investigated the six-hour accumulated precipitation obtained by integration from the analysis fields. The reproduced precipitation was compared with JMA's Radar/Rain-gauge Analyzed Precipitation data over the Japanese islands and with the precipitation of JRA-55, which is used for the lateral boundary conditions. The comparisons reveal an underestimation of precipitation in the regional reanalysis, which is reduced by extending the forecast time. In the regional reanalysis system, the analysis fields are derived from the ensemble mean fields, in which conflicting components among ensemble members are filtered out. It is therefore important to tune the inflation factor and lateral boundary perturbations so as not to smooth the analysis fields excessively, and to allow more time for the fields to spin up. In the extended run, the underestimation still remains, implying that it is attributable to the forecast model itself as well as to the analysis scheme.

  8. Impact of DYNAMO observations on NASA GEOS-5 reanalyses and the representation of MJO initiation

    NASA Astrophysics Data System (ADS)

    Achuthavarier, D.; Wang, H.; Schubert, S. D.; Sienkiewicz, M.

    2017-01-01

    This study examines the impact of the Dynamics of the Madden-Julian Oscillation (DYNAMO) campaign's in situ observations on NASA Goddard Earth Observing System version 5 (GEOS-5) reanalyses and the improvements thereby gained in the representation of Madden-Julian Oscillation (MJO) initiation processes. To this end, we produced a global, high-resolution (1/4° spatially) reanalysis that assimilates the level-4, quality-controlled DYNAMO upper-air soundings from about 87 stations in the equatorial Indian Ocean region, along with a companion data-denied control reanalysis. The DYNAMO reanalysis produces a more realistic vertical structure of temperature and moisture in the central tropical Indian Ocean by correcting the model biases, namely the cold and dry biases in the lower troposphere and the warm bias in the upper troposphere. The reanalysis horizontal winds are substantially improved, in that the westerly acceleration and vertical shear of the zonal wind are enhanced. The DYNAMO reanalysis shows enhanced low-level diabatic heating, moisture anomalies, and vertical velocity during MJO initiation. Due to the warmer lower troposphere, deep convection is invigorated, which is evident in the convective cloud fraction. The GEOS-5 atmospheric general circulation model (AGCM) employed in the reanalysis is overall successful in assimilating the additional DYNAMO observations, except for an erroneous model response for medium rain rates between 700 and 600 hPa, reminiscent of a bias in earlier versions of the AGCM. The moist heating profile shows a sharp decrease there due to excessive convective rain re-evaporation, which is partly offset by the temperature increment produced by the analysis.

  9. Importing MAGE-ML format microarray data into BioConductor.

    PubMed

    Durinck, Steffen; Allemeersch, Joke; Carey, Vincent J; Moreau, Yves; De Moor, Bart

    2004-12-12

    The microarray gene expression markup language (MAGE-ML) is a widely used XML (eXtensible Markup Language) standard for describing and exchanging information about microarray experiments. It can describe microarray designs, microarray experiment designs, gene expression data and data analysis results. We describe RMAGEML, a new Bioconductor package that provides a link between cDNA microarray data stored in MAGE-ML format and the Bioconductor framework for preprocessing, visualization and analysis of microarray experiments. http://www.bioconductor.org. Open Source.

  10. Can Atmospheric Reanalysis Data Sets Be Used to Reproduce Flooding Over Large Scales?

    NASA Astrophysics Data System (ADS)

    Andreadis, Konstantinos M.; Schumann, Guy J.-P.; Stampoulis, Dimitrios; Bates, Paul D.; Brakenridge, G. Robert; Kettner, Albert J.

    2017-10-01

    Floods are costly to global economies and can be exceptionally lethal. The ability to produce consistent flood hazard maps over large areas could provide a significant contribution to reducing such losses, as the lack of knowledge concerning flood risk is a major factor in the transformation of river floods into flood disasters. In order to accurately reproduce flooding in river channels and floodplains, high spatial resolution hydrodynamic models are needed. Despite being computationally expensive, recent advances have made their continental to global implementation feasible, although inputs for long-term simulations may require the use of reanalysis meteorological products, especially in data-poor regions. We employ a coupled hydrologic/hydrodynamic model cascade forced by the 20CRv2 reanalysis data set and evaluate its ability to reproduce flood inundation area and volume for Australia during the 1973-2012 period. Ensemble simulations using the reanalysis data were performed to account for uncertainty in the meteorology and compared with a validated benchmark simulation. Results show that the reanalysis ensemble captures the inundated areas and volumes relatively well, with correlations for the ensemble mean of 0.82 and 0.85 for area and volume, respectively, although the meteorological ensemble spread propagates into large uncertainty in the simulated flood characteristics.

  11. Providing Access to a Diverse Set of Global Reanalysis Dataset Collections

    NASA Astrophysics Data System (ADS)

    Schuster, D.; Worley, S. J.

    2015-12-01

    The National Center for Atmospheric Research (NCAR) Research Data Archive (RDA, http://rda.ucar.edu) provides open access to a variety of global reanalysis dataset collections to support atmospheric and related sciences research worldwide. These include products from the European Centre for Medium-Range Weather Forecasts (ECMWF), Japan Meteorological Agency (JMA), National Centers for Environmental Prediction (NCEP), National Oceanic and Atmospheric Administration (NOAA), and NCAR. All RDA-hosted reanalysis collections are freely accessible to registered users through a variety of methods. Standard access methods include traditional browser and scripted HTTP file download. Enhanced downloads are available through the Globus GridFTP "fire and forget" data transfer service, which provides an efficient, reliable, and preferred alternative to traditional HTTP-based methods. For those who favor interoperable access using compatible tools, the Unidata THREDDS Data Server provides remote access to complete reanalysis collections through virtual dataset aggregation "files". Finally, users can request data subsets and format conversions to be prepared for them through web interface form requests or web service API batch requests. This approach uses NCAR HPC and central file systems to effectively prepare products from the high-resolution and very large reanalysis archives. The presentation will include a detailed inventory of all RDA reanalysis dataset collection holdings and highlight access capabilities to these collections through use case examples.

  12. Mutual information estimation reveals global associations between stimuli and biological processes

    PubMed Central

    Suzuki, Taiji; Sugiyama, Masashi; Kanamori, Takafumi; Sese, Jun

    2009-01-01

    Background Although microarray gene expression analysis has become popular, it remains difficult to interpret the biological changes caused by stimuli or variation of conditions. Clustering of genes and associating each group with biological functions are commonly used methods. However, such methods only detect partial changes within cell processes. Herein, we propose a method for discovering global changes within a cell by associating observed conditions of gene expression with gene functions. Results To elucidate the association, we introduce a novel feature selection method called Least-Squares Mutual Information (LSMI), which computes mutual information without density estimation and can therefore detect nonlinear associations within a cell. We demonstrate the effectiveness of LSMI through comparison with existing methods. The results of the application to yeast microarray datasets reveal that non-natural stimuli affect various biological processes, whereas others have no significant relation to specific cell processes. Furthermore, we discover that biological processes can be categorized into four types according to their responses to various stimuli: DNA/RNA metabolism, gene expression, protein metabolism, and protein localization. Conclusion We proposed a novel feature selection method called LSMI, and applied LSMI to mining the association between conditions of yeast and biological processes through microarray datasets. In fact, LSMI allows us to elucidate the global organization of cellular process control. PMID:19208155
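
    The LSMI idea (fit the density ratio p(x, y)/(p(x)p(y)) with a kernel basis by least squares, then read off a squared-loss mutual information estimate) can be sketched as follows. This is a minimal illustration rather than the authors' implementation; the kernel width, regularization strength, number of basis functions, and center selection are all assumptions made for the example:

```python
# Sketch of least-squares mutual information (LSMI): approximate the density
# ratio p(x, y) / (p(x) p(y)) with Gaussian kernel basis functions centered
# at paired samples, solve the regularized least-squares system in closed
# form, and read off the squared-loss MI estimate (1/2) h' alpha - 1/2.
import math

def solve(A, b):
    """Tiny Gaussian elimination with partial pivoting for small systems."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def lsmi(xs, ys, n_basis=5, sigma=0.15, lam=0.01):
    n = len(xs)
    centers = [(xs[l * n // n_basis], ys[l * n // n_basis]) for l in range(n_basis)]

    def k(a, b):
        return math.exp(-(a - b) ** 2 / (2 * sigma ** 2))

    def phi(l, x, y):
        u, v = centers[l]
        return k(x, u) * k(y, v)

    # h_l: mean of phi_l over the paired samples (joint distribution)
    h = [sum(phi(l, xs[i], ys[i]) for i in range(n)) / n for l in range(n_basis)]
    # H_{lm}: mean of phi_l * phi_m over all cross pairs (product of marginals)
    H = [[sum(phi(l, xs[i], ys[j]) * phi(m, xs[i], ys[j])
              for i in range(n) for j in range(n)) / (n * n)
          for m in range(n_basis)] for l in range(n_basis)]
    for l in range(n_basis):
        H[l][l] += lam                      # ridge regularization
    alpha = solve(H, h)
    return 0.5 * sum(h[l] * alpha[l] for l in range(n_basis)) - 0.5

xs = [i / 20 for i in range(20)]
dependent = lsmi(xs, xs)                              # y tracks x exactly
shuffled = lsmi(xs, [((7 * i) % 20) / 20 for i in range(20)])  # same values, scrambled
print(dependent, shuffled)
```

A strongly dependent pairing should score higher than the scrambled one, since the fitted density ratio deviates from 1 along the diagonal only in the dependent case.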

  13. arrayCGHbase: an analysis platform for comparative genomic hybridization microarrays

    PubMed Central

    Menten, Björn; Pattyn, Filip; De Preter, Katleen; Robbrecht, Piet; Michels, Evi; Buysse, Karen; Mortier, Geert; De Paepe, Anne; van Vooren, Steven; Vermeesch, Joris; Moreau, Yves; De Moor, Bart; Vermeulen, Stefan; Speleman, Frank; Vandesompele, Jo

    2005-01-01

    Background The availability of the human genome sequence as well as the large number of physically accessible oligonucleotides, cDNA, and BAC clones across the entire genome has triggered and accelerated the use of several platforms for analysis of DNA copy number changes, amongst others microarray comparative genomic hybridization (arrayCGH). One of the challenges inherent to this new technology is the management and analysis of the large numbers of data points generated in each individual experiment. Results We have developed arrayCGHbase, a comprehensive analysis platform for arrayCGH experiments consisting of a MIAME (Minimal Information About a Microarray Experiment) supportive database using MySQL underlying a data mining web tool, to store, analyze, interpret, compare, and visualize arrayCGH results in a uniform and user-friendly format. Owing to its flexible design, arrayCGHbase is compatible with all existing and forthcoming arrayCGH platforms. Data can be exported in a multitude of formats, including BED files to map copy number information on the genome using the Ensembl or UCSC genome browser. Conclusion ArrayCGHbase is a web-based and platform-independent arrayCGH data analysis tool that allows users to access the analysis suite through the internet or a local intranet after installation on a private server. ArrayCGHbase is available at . PMID:15910681

  14. Shrinkage covariance matrix approach based on robust trimmed mean in gene sets detection

    NASA Astrophysics Data System (ADS)

    Karjanto, Suryaefiza; Ramli, Norazan Mohamed; Ghani, Nor Azura Md; Aripin, Rasimah; Yusop, Noorezatty Mohd

    2015-02-01

    Microarray technology involves placing an orderly arrangement of thousands of gene sequences in a grid on a suitable surface. The technology has enabled novel discoveries since its development and has attracted increasing attention among researchers. Its widespread adoption is largely due to its ability to perform simultaneous analysis of thousands of genes in a massively parallel manner in one experiment; hence, it provides valuable knowledge on gene interaction and function. A microarray data set typically consists of tens of thousands of genes (variables) from just dozens of samples due to various constraints. Therefore, the sample covariance matrix in Hotelling's T2 statistic is not positive definite and becomes singular, so it cannot be inverted. In this research, the Hotelling's T2 statistic is combined with a shrinkage approach to estimate the covariance matrix and detect significant gene sets. The shrinkage covariance matrix overcomes the singularity problem by converting the unbiased estimator into an improved biased estimator of the covariance matrix. A robust trimmed mean is integrated into the shrinkage matrix to reduce the influence of outliers and consequently increase its efficiency. The performance of the proposed method is measured using several simulation designs. The results are expected to outperform existing techniques in many tested conditions.
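
    The shrinkage idea, blending the sample covariance with a well-conditioned diagonal target so that Hotelling's T2 can still be computed when variables outnumber samples, can be sketched as follows. A fixed shrinkage intensity and ordinary means are used for brevity; the paper's method additionally robustifies the estimate with a trimmed mean, and the toy data are invented:

```python
# Sketch: Hotelling's T^2 with a shrinkage covariance estimate. Off-diagonal
# entries are shrunk toward zero (a diagonal target), which regularizes the
# matrix so it can be inverted; variances on the diagonal are kept.

def mean(v):
    return sum(v) / len(v)

def shrink_cov(samples, lam=0.5):
    """(1 - lam) * sample covariance + lam * diagonal target (variances kept)."""
    n, p = len(samples), len(samples[0])
    mu = [mean([s[j] for s in samples]) for j in range(p)]
    S = [[sum((s[a] - mu[a]) * (s[b] - mu[b]) for s in samples) / (n - 1)
          for b in range(p)] for a in range(p)]
    shrunk = [[S[a][b] if a == b else (1 - lam) * S[a][b] for b in range(p)]
              for a in range(p)]
    return shrunk, mu

def solve(A, b):
    """Tiny Gaussian elimination with partial pivoting for small systems."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def hotelling_t2(samples, mu0, lam=0.5):
    """T2 = n * d' * S_shrunk^{-1} * d, with d the deviation of the sample mean."""
    n, p = len(samples), len(samples[0])
    S, mu = shrink_cov(samples, lam)
    d = [mu[j] - mu0[j] for j in range(p)]
    y = solve(S, d)          # S^{-1} d without forming the inverse
    return n * sum(d[j] * y[j] for j in range(p))

genes = [[2.1, 0.3, 1.0],   # a 3-gene set measured in 4 samples (made up)
         [1.8, 0.1, 1.2],
         [2.4, 0.4, 0.9],
         [2.0, 0.2, 1.1]]
print(hotelling_t2(genes, [0.0, 0.0, 0.0]))
```

With genuinely high-dimensional data (p much larger than n) the raw sample covariance is singular and `solve` would fail; the shrinkage step is what keeps the system well-posed.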

  15. An Interview with Matthew P. Greving, PhD. Interview by Vicki Glaser.

    PubMed

    Greving, Matthew P

    2011-10-01

    Matthew P. Greving is Chief Scientific Officer at Nextval Inc., a company founded in early 2010 that has developed a discovery platform called MassInsight™. He received his PhD in Biochemistry from Arizona State University, and prior to that he spent nearly 7 years working as a software engineer. This experience in solving complex computational problems fueled his interest in developing technologies and algorithms related to acquisition and analysis of high-dimensional biochemical data. To address the existing problems associated with label-based microarray readouts, he began work on a technique for label-free mass spectrometry (MS) microarray readout compatible with both matrix-assisted laser desorption/ionization (MALDI) and matrix-free nanostructure initiator mass spectrometry (NIMS). This is the core of Nextval’s MassInsight technology, which utilizes picoliter noncontact deposition of high-density arrays on mass-readout substrates along with computational algorithms for high-dimensional data processing and reduction.

  16. Novel Harmonic Regularization Approach for Variable Selection in Cox's Proportional Hazards Model

    PubMed Central

    Chu, Ge-Jin; Liang, Yong; Wang, Jia-Xuan

    2014-01-01

    Variable selection is an important issue in regression, and a number of variable selection methods have been proposed involving nonconvex penalty functions. In this paper, we investigate a novel harmonic regularization method, which can approximate nonconvex Lq (1/2 < q < 1) regularizations, to select key risk factors in Cox's proportional hazards model using microarray gene expression data. The harmonic regularization method can be efficiently solved using our proposed direct path seeking approach, which can produce solutions that closely approximate those for the convex loss function and the nonconvex regularization. Simulation results based on artificial datasets and four real microarray gene expression datasets, including diffuse large B-cell lymphoma (DLBCL), lung cancer, and AML datasets, show that the harmonic regularization method can be more accurate for variable selection than existing Lasso series methods. PMID:25506389

  17. Direct multiplexed measurement of gene expression with color-coded probe pairs.

    PubMed

    Geiss, Gary K; Bumgarner, Roger E; Birditt, Brian; Dahl, Timothy; Dowidar, Naeem; Dunaway, Dwayne L; Fell, H Perry; Ferree, Sean; George, Renee D; Grogan, Tammy; James, Jeffrey J; Maysuria, Malini; Mitton, Jeffrey D; Oliveri, Paola; Osborn, Jennifer L; Peng, Tao; Ratcliffe, Amber L; Webster, Philippa J; Davidson, Eric H; Hood, Leroy; Dimitrov, Krassen

    2008-03-01

    We describe a technology, the NanoString nCounter gene expression system, which captures and counts individual mRNA transcripts. Advantages over existing platforms include direct measurement of mRNA expression levels without enzymatic reactions or bias, sensitivity coupled with high multiplex capability, and digital readout. Experiments performed on 509 human genes yielded a replicate correlation coefficient of 0.999, a detection limit between 0.1 fM and 0.5 fM, and a linear dynamic range of over 500-fold. Comparison of the NanoString nCounter gene expression system with microarrays and TaqMan PCR demonstrated that the nCounter system is more sensitive than microarrays and similar in sensitivity to real-time PCR. Finally, a comparison of transcript levels for 21 genes across seven samples measured by the nCounter system and SYBR Green real-time PCR demonstrated similar patterns of gene expression at all transcript levels.

  18. Probe classification of on-off type DNA microarray images with a nonlinear matching measure

    NASA Astrophysics Data System (ADS)

    Ryu, Munho; Kim, Jong Dae; Min, Byoung Goo; Kim, Jongwon; Kim, Y. Y.

    2006-01-01

    We propose a nonlinear matching measure, called the counting measure, for signal detection; it is defined as the number of on pixels in the spot area. It is applied to classify probes on an on-off type DNA microarray, where each probe spot is classified as hybridized or not. The counting measure also incorporates the maximum response search method, in which the expected signal is obtained by taking the maximum among the measured responses over the various positions and sizes of the spot template. The counting measure was compared to existing signal detection measures such as the normalized covariance and the median for 2390 patient samples tested on the human papillomavirus (HPV) DNA chip. The counting measure performed the best regardless of whether or not the maximum response search method was used. The experimental results showed that the counting measure combined with the positional search was the most preferable.
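
    A minimal sketch of the counting measure with the maximum-response search: binarize pixels against a threshold, then slide square spot templates of several sizes over a small search window and keep the largest count of on pixels. The image, threshold, template sizes, and classification cutoff are invented for illustration:

```python
# Counting measure sketch: count on pixels inside a spot template, searching
# over nearby positions and several template sizes for the maximum response.

def counting_measure(image, threshold, center, sizes, search=1):
    """Max count of on pixels over template positions/sizes near a spot center."""
    on = [[1 if v >= threshold else 0 for v in row] for row in image]
    h, w = len(on), len(on[0])
    cy, cx = center
    best = 0
    for half in sizes:                       # template half-widths to try
        for dy in range(-search, search + 1):
            for dx in range(-search, search + 1):
                count = sum(
                    on[y][x]
                    for y in range(max(0, cy + dy - half), min(h, cy + dy + half + 1))
                    for x in range(max(0, cx + dx - half), min(w, cx + dx + half + 1)))
                best = max(best, count)
    return best

# A hybridized spot: a bright 3x3 block; classify against a count cutoff.
img = [[0, 0, 0, 0, 0],
       [0, 9, 9, 9, 0],
       [0, 9, 9, 9, 0],
       [0, 9, 9, 9, 0],
       [0, 0, 0, 0, 0]]
score = counting_measure(img, threshold=5, center=(2, 2), sizes=(1,))
print(score, "hybridized" if score >= 5 else "not hybridized")
```

The positional search is what makes the measure robust to gridding error: even a deliberately misplaced nominal center recovers the full spot response within the search window.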

  19. The Mediterranean interannual variability in MEDRYS, a Mediterranean Sea reanalysis over 1992-2013

    NASA Astrophysics Data System (ADS)

    Beuvier, Jonathan; Hamon, Mathieu; Lellouche, Jean-Michel; Greiner, Eric; Alias, Antoinette; Arsouze, Thomas; Benkiran, Mounir; Béranger, Karine; Drillet, Yann; Sevault, Florence; Somot, Samuel

    2015-04-01

    The French research community on the Mediterranean Sea and the French operational ocean forecasting center Mercator Océan are gathering their skills and expertise in physical oceanography, ocean modelling, atmospheric forcings and data assimilation to carry out a MEDiterranean Sea ReanalYsiS (MEDRYS) at high resolution for the period 1992-2013. The reanalysis aims to provide a realistic description of the ocean state over the recent decades and will help to understand the long-term water cycle over the Mediterranean basin in terms of variability and trends, contributing to the HyMeX international program. The ocean model used is NEMOMED12 [Lebeaupin Brossier et al., 2011, Oc. Mod., 2012, Oc. Mod.; Beuvier et al., 2012a, JGR, 2012b, Mercator Newsl.], a Mediterranean configuration of NEMO [Madec and the NEMO Team, 2008], with a 1/12° (about 7 km) horizontal resolution and 75 vertical z-levels with partial steps. It is forced by 3-hourly atmospheric fluxes from an ALADIN-Climate simulation at 12 km resolution [Herrmann et al., 2011, NHESS], driven by the ERA-Interim atmospheric reanalysis. The exchanges with the Atlantic Ocean are performed through a buffer zone, with a damping of 3D theta-S and of sea level towards the ORA-S4 oceanic reanalysis [Balmaseda et al., 2012, QJRMS]. This model configuration is used to carry out a 34-year free simulation over the period 1979-2013. This free simulation provides the initial state of the reanalysis in October 1992. It is also used to compute anomalies from which the data assimilation scheme derives the required characteristic covariances of the ocean model. MEDRYS1 uses the current Mercator Océan operational data assimilation system [Lellouche et al., 2013, Oc.Sci.]. It uses a reduced-order Kalman filter with a 3D multivariate modal decomposition of the forecast error. A 3D-Var scheme corrects biases in temperature and salinity for the slowly evolving large scales.
In addition, some modifications dedicated to the Mediterranean area (more specific Post-Glacial-Rebound corrections and a new model-equivalent for the Sea Level Anomaly, for example) have been introduced. Temperature and salinity vertical profiles from the newly released CORA4 database, altimeter data, and satellite SST are jointly assimilated. Thus, the reanalysis benefits from the intensive observational field campaigns carried out during the HyMeX Special Observation Periods (SOPs) in fall 2012 and winter 2013 in the north-western Mediterranean Sea. We assess here the ability of MEDRYS1 to reproduce the general circulation and the water masses in the Mediterranean Sea. We present the misfit between the reanalysis and the assimilated observations, as well as differences between the reanalysis and its twin free simulation. We show diagnostics of the surface circulation variability, heat and salt contents, and deep water formation over the whole period of the reanalysis, with a focus on the impact of the HyMeX data during the SOPs time period.

  20. Analyzing the most frequent disease loci in targeted patient categories optimizes disease gene identification and test accuracy worldwide.

    PubMed

    Lebo, Roger V; Tonk, Vijay S

    2015-01-21

    Our genomewide studies support targeted testing of the most frequent genetic diseases by patient category: (1) pregnant patients, (2) at-risk conceptuses, (3) affected children, and (4) abnormal adults. This approach not only identifies most reported disease-causing sequences accurately, but also minimizes incorrectly identified additional disease-causing loci. Diseases were grouped in descending order of occurrence from four data sets: (1) GeneTests 534 listed population prevalences, (2) 4129 high-risk prenatal karyotypes, (3) 1265 affected patient microarrays, and (4) reanalysis of 25,452 asymptomatic patient results screened prenatally for 108 genetic diseases. These most frequent diseases are categorized by transmission: (A) autosomal recessive, (B) X-linked, (C) autosomal dominant, (D) microscopic chromosome rearrangements, (E) submicroscopic copy number changes, and (F) frequent ethnic diseases. Among affected and carrier patients worldwide, most reported mutant genes would be identified correctly according to one of four patient categories, from at-risk couples with <64 tested genes to affected adults with 314 tested loci. Three clinically reported patient series confirmed this approach. First, only 54 targeted chromosomal sites would have detected all 938 microscopically visible unbalanced karyotypes among 4129 karyotyped POC, CVS, and amniocentesis samples. Second, 37 of 48 reported aneuploid regions were found among our 1265 clinical microarrays, confirming the locations of 8 schizophrenia loci and 20 aneuploidies altering intellectual ability, while also identifying 9 of the most frequent deletion syndromes. Third, testing 15 frequent genes would have identified 124 couples with a 1 in 4 risk of a fetus with a recessive disease compared to the 127 couples identified by testing all 108 genes, while testing all mutations in 15 genes could have identified more couples.
Testing the most frequent disease-causing abnormalities in 1 of 8 reported disease loci [~1 of 84 total genes] will identify ~7 of 8 reported abnormal Caucasian newborn genotypes. This would eliminate ~8 to 10 of ~10 Caucasian newborn gene sequences selected as abnormal that are actually normal variants identified when testing all ~2500 diseases looking for the remaining 1 of 8 disease-causing genes. This approach enables more accurate testing within available laboratory and reimbursement resources.

  1. Using a Coupled Lake Model with WRF for Dynamical Downscaling

    EPA Science Inventory

    The Weather Research and Forecasting (WRF) model is used to downscale a coarse reanalysis (National Centers for Environmental Prediction–Department of Energy Atmospheric Model Intercomparison Project reanalysis, hereafter R2) as a proxy for a global climate model (GCM) to examine...

  2. Meta-analysis of gene expression patterns in animal models of prenatal alcohol exposure suggests role for protein synthesis inhibition and chromatin remodeling

    PubMed Central

    Rogic, Sanja; Wong, Albertina; Pavlidis, Paul

    2017-01-01

    Background Prenatal alcohol exposure (PAE) can result in an array of morphological, behavioural and neurobiological deficits that can range in their severity. Despite extensive research in the field and significant progress made, especially in understanding the range of possible malformations and neurobehavioral abnormalities, the molecular mechanisms of alcohol responses in development are still not well understood. There have been multiple transcriptomic studies looking at the changes in gene expression after PAE in animal models; however, there is limited apparent consensus among the reported findings. In an effort to address this issue, we performed a comprehensive re-analysis and meta-analysis of all suitable, publicly available expression data sets. Methods We assembled ten microarray data sets of gene expression after PAE in mouse and rat models consisting of samples from a total of 63 ethanol-exposed and 80 control animals. We re-analyzed each data set for differential expression and then used the results to perform meta-analyses considering all data sets together or grouping them by time or duration of exposure (pre- and post-natal, acute and chronic, respectively). We performed network and Gene Ontology enrichment analysis to further characterize the identified signatures. Results For each sub-analysis we identified signatures of differentially expressed genes that show support from multiple studies. Overall, the changes in gene expression were more extensive after acute ethanol treatment during prenatal development than in other models. Considering the analysis of all the data together, we identified a robust core signature of 104 genes down-regulated after PAE, with no up-regulated genes. Functional analysis reveals over-representation of genes involved in protein synthesis, mRNA splicing and chromatin organization.
Conclusions Our meta-analysis shows that existing studies, despite superficial dissimilarity in findings, share features that allow us to identify a common core signature set of transcriptome changes in PAE. This is an important step toward identifying the biological processes that underlie the etiology of FASD. PMID:26996386

  3. On-Chip, Amplification-Free Quantification of Nucleic Acid for Point-of-Care Diagnosis

    NASA Astrophysics Data System (ADS)

    Yen, Tony Minghung

    This dissertation demonstrates three physical device concepts to overcome limitations in point-of-care quantification of nucleic acids. Enabling sensitive, high throughput nucleic acid quantification on a chip, outside of hospital and centralized laboratory settings, is crucial for improving pathogen detection and cancer diagnosis and prognosis. Among existing platforms, microarrays have the advantages of being amplification free, low in instrument cost, and high throughput, but are generally less sensitive compared to sequencing and PCR assays. To bridge this performance gap, this dissertation presents theoretical and experimental progress to develop a platform nucleic acid quantification technology that is drastically more sensitive than current microarrays while compatible with microarray architecture. The first device concept explores on-chip nucleic acid enrichment by natural evaporation of a nucleic acid solution droplet. Using a micro-patterned super-hydrophobic black silicon array device, evaporative enrichment is coupled with a nano-liter droplet self-assembly workflow to produce a 50 aM concentration sensitivity, 6 orders of dynamic range, and rapid hybridization time at under 5 minutes. The second device concept focuses on improving target copy number sensitivity, instead of concentration sensitivity. A comprehensive microarray physical model taking into account molecular transport, electrostatic intermolecular interactions, and reaction kinetics is considered to guide device optimization. Device pattern size and target copy number are optimized based on model prediction to achieve maximal hybridization efficiency. At a 100-μm pattern size, a quantum leap in detection limit of 570 copies is achieved using a black silicon array device with a self-assembled pico-liter droplet workflow. Despite its merits, evaporative enrichment on the black silicon device suffers from the coffee-ring effect at the 100-μm pattern size, and is thus not compatible with clinical patient samples.
The third device concept utilizes an integrated optomechanical laser system and a Cytop microarray device to reverse the coffee-ring effect during evaporative enrichment at the 100-μm pattern size. This method, named "laser-induced differential evaporation", is expected to enable a 570-copy detection limit for clinical samples in the near future. While the work is ongoing as of the writing of this dissertation, a clear research plan is in place to implement this method on the microarray platform toward clinical sample testing for disease applications and future commercialization.

  4. Tropical cyclone genesis potential index over the western North Pacific simulated by CMIP5 models

    NASA Astrophysics Data System (ADS)

    Song, Yajuan; Wang, Lei; Lei, Xiaoyan; Wang, Xidong

    2015-11-01

    Tropical cyclone (TC) genesis over the western North Pacific (WNP) is analyzed using 23 CMIP5 (Coupled Model Intercomparison Project Phase 5) models and reanalysis datasets. The models are evaluated according to a TC genesis potential index (GPI). The spatial and temporal variations of the GPI are first calculated using three atmospheric reanalysis datasets (ERA-Interim, NCEP/NCAR Reanalysis-1, and NCEP/DOE Reanalysis-2). Spatial distributions of July-October-mean TC frequency based on the GPI from ERA-Interim are more consistent with observed ones derived from IBTrACS global TC data. The ERA-Interim reanalysis dataset is therefore used to examine the CMIP5 models in terms of reproducing the GPI during the period 1982-2005. Although most models possess deficiencies in reproducing the spatial distribution of the GPI, their multimodel ensemble (MME) mean shows a reasonable climatological GPI pattern characterized by a high-GPI zone along 20°N in the WNP. There was an upward trend of TC genesis frequency during 1982 to 1998, followed by a downward trend. Both MME results and reanalysis data reproduce the robust increasing trend during 1982-1998, but the models cannot simulate the downward trend after 2000. Analysis based on future projection experiments shows that the GPI exhibits no significant change in the first half of the 21st century, and then starts to decrease at the end of the 21st century under the representative concentration pathway (RCP) 2.6 scenario. Under the RCP8.5 scenario, the GPI shows an increasing trend in the vicinity of 20°N, indicating more TCs could possibly be expected over the WNP under future global warming.
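
    The abstract does not reproduce its GPI formula, so as an assumption about what was computed, the widely used Emanuel-Nolan form is sketched below; it combines low-level absolute vorticity, mid-level relative humidity, potential intensity, and vertical wind shear. Treat the exact coefficients as belonging to that published form, not necessarily to this study:

```python
# Genesis potential index in the Emanuel-Nolan form (shown as an assumption):
#   GPI = |1e5 * eta|^(3/2) * (RH/50)^3 * (Vpot/70)^3 * (1 + 0.1 * Vshear)^(-2)

def genesis_potential_index(eta, rh, v_pot, v_shear):
    """eta: 850-hPa absolute vorticity (s^-1); rh: 600-hPa relative humidity (%);
    v_pot: potential intensity (m/s); v_shear: 850-200-hPa wind shear (m/s)."""
    return (abs(1e5 * eta) ** 1.5
            * (rh / 50.0) ** 3
            * (v_pot / 70.0) ** 3
            * (1.0 + 0.1 * v_shear) ** -2)

# Illustrative WNP summer values: vorticity 5e-5 s^-1, RH 70%, Vpot 80 m/s, shear 8 m/s
print(genesis_potential_index(5e-5, 70.0, 80.0, 8.0))
```

The functional form encodes the expected physics: the index grows with vorticity, humidity, and potential intensity, and is suppressed quadratically by vertical shear.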

  5. Evaluation of representativeness of near-surface winds in station measurements, global and regional reanalysis for Germany

    NASA Astrophysics Data System (ADS)

    Kaspar, Frank; Kaiser-Weiss, Andrea K.; Heene, Vera; Borsche, Michael; Keller, Jan

    2015-04-01

    Within the preparation activities for a European COPERNICUS Climate Change Service (C3S), several ongoing research projects analyse the potential of global and regional model-based climate reanalyses for applications. A user survey in the FP7 project CORE-CLIMAX revealed that surface wind (10 m) is among the most frequently used parameters of global reanalysis products. The FP7 project UERRA (Uncertainties in Ensembles of Regional Re-Analysis) focuses on regional European reanalysis and the associated uncertainties, also from a user perspective. Especially in the field of renewable energy planning and production, there is a need for climatological information across all spatial scales, i.e., from the climatology of a certain site to the spatial scale of national or continental renewable energy production. Here, we focus on a comparison of wind measurements from Germany's meteorological service (Deutscher Wetterdienst, DWD) with global reanalyses of ECMWF and a regional reanalysis for Europe based on DWD's NWP model COSMO (performed by the Hans-Ertel-Center for Weather Research, University of Bonn). Reanalyses can provide valuable additional information on larger-scale variability, e.g. multi-annual variation over Germany. However, changes in the observing system, model errors and biases have to be carefully considered. On the other hand, the ground-based observation networks partly suffer from changes in station distribution, changes in instrumentation, measurement procedures and quality control, as well as local changes which might modify their spatial representativeness. All these effects might often be unknown or hard to characterize, although plenty of meta-data information has been recorded for the German stations. One focus of the presentation will be the added value of the regional reanalysis.

  6. Report of the 4th World Climate Research Programme International Conference on Reanalyses

    NASA Technical Reports Server (NTRS)

    Bosilovich, Michael G.; Rixen, Michel; van Oevelen, Peter; Asrar, Ghassem; Compo, Gilbert; Onogi, Kazutoshi; Simmons, Adrian; Trenberth, Kevin; Behringer, Dave; Bhuiyan, Tanvir Hossain; hide

    2012-01-01

    The 4th WCRP International Conference on Reanalyses provided an opportunity for the international community to review and discuss the observational and modelling research, as well as process studies and uncertainties associated with reanalysis of the Earth System and its components. Characterizing the uncertainty and quality of reanalyses is a task that reaches far beyond the international community of producers, and into the interdisciplinary research community, especially those using reanalysis products in their research and applications. Reanalyses have progressed greatly even in the last 5 years, and newer ideas, projects and data are coming forward. While reanalysis has typically been carried out for the individual domains of atmosphere, ocean and land, it is now moving towards coupling using Earth system models. Observations are being reprocessed, and they are providing improved quality for use in reanalysis. New applications are being investigated, and the need for climate reanalyses is as strong as ever. At the heart of it all, new investigators are exploring the possibilities for reanalysis, and developing new ideas in research and applications. Given the many centres creating reanalysis products (e.g. ocean, land and cryosphere research centres as well as NWP and atmospheric centres), and the development of new ideas (e.g. families of reanalyses), the total number of reanalyses is increasing greatly, with new and innovative diagnostics and output data. The need for reanalysis data is growing steadily, and likewise, the need for open discussion and comment on the data. The 4th Conference was convened to provide a forum for constructive discussion on the objectives, strengths and weaknesses of reanalyses, indicating potential development paths for the future.

  7. Evaluation of four global reanalysis products using in situ observations in the Amundsen Sea Embayment, Antarctica

    NASA Astrophysics Data System (ADS)

    Jones, R. W.; Renfrew, I. A.; Orr, A.; Webber, B. G. M.; Holland, D. M.; Lazzara, M. A.

    2016-06-01

    The glaciers within the Amundsen Sea Embayment (ASE), West Antarctica, are amongst the most rapidly retreating in Antarctica. Meteorological reanalysis products are widely used to help understand and simulate the processes causing this retreat. Here we provide an evaluation against observations of four of the latest global reanalysis products within the ASE region—the European Centre for Medium-Range Weather Forecasts Interim Reanalysis (ERA-I), Japanese 55-year Reanalysis (JRA-55), Climate Forecast System Reanalysis (CFSR), and Modern Era Retrospective-Analysis for Research and Applications (MERRA). The observations comprise data from four automatic weather stations (AWSs), three research vessel cruises, and a new set of 38 radiosondes, all within the period 2009-2014. All four reanalyses produce 2 m temperature fields that are colder than AWS observations, with the biases varying from approximately -1.8°C (ERA-I) to -6.8°C (MERRA). Over the Amundsen Sea, spatially averaged summertime biases are between -0.4°C (JRA-55) and -2.1°C (MERRA), with notably larger cold biases close to the continent (up to -6°C) in all reanalyses. All four reanalyses underestimate near-surface wind speed at high wind speeds (>15 m s-1) and exhibit dry biases and relatively large root-mean-square errors (RMSE) in specific humidity. A comparison to the radiosonde soundings shows that the cold, dry bias at the surface extends into the lower troposphere; here the ERA-I and CFSR reanalyses provide the most accurate profiles. The reanalyses generally contain larger temperature and humidity biases (and RMSE) when a temperature inversion is observed, and contain larger wind speed biases (~2 to 3 m s-1) when a low-level jet is observed.
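
    The station-versus-reanalysis comparison reduces to simple error statistics; a minimal sketch with made-up numbers (not the paper's data):

```python
# Bias (mean error) and RMSE between co-located reanalysis and station values.

def bias(model, obs):
    return sum(m - o for m, o in zip(model, obs)) / len(obs)

def rmse(model, obs):
    return (sum((m - o) ** 2 for m, o in zip(model, obs)) / len(obs)) ** 0.5

aws_t2m  = [-10.0, -12.5, -8.0, -15.0]   # observed 2 m temperature (deg C)
rean_t2m = [-12.0, -14.0, -11.0, -17.5]  # co-located reanalysis values

print(bias(rean_t2m, aws_t2m), rmse(rean_t2m, aws_t2m))
```

A negative bias corresponds to the cold offset reported above; RMSE is always at least as large as the absolute bias, and the gap between the two reflects the scatter of the errors around their mean.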

  8. Water Cycle Variability over the Global Oceans Estimated Using Homogenized Reanalysis Fluxes

    NASA Astrophysics Data System (ADS)

    Robertson, F. R.; Bosilovich, M. G.; Roberts, J. B.

    2017-12-01

    Establishing consistent records of the global water cycle fluxes and their variations is particularly difficult over oceans, where the density of in situ observations varies enormously with time, satellite retrievals of flux processes are sparse, and reanalyses are uncertain. The latter have the positive attribute of assimilating diverse observations to provide boundary fluxes and transports but are hindered by at least two factors: (1) the physical parameterizations are imperfect, and (2) the forcing data availability and quality vary greatly in time and thus can induce time-dependent, false signals of climate variability. Here we examine the prospects for homogenization of reanalysis records, that is, identifying and greatly minimizing non-physical signals. Our analysis focuses on the satellite era, 1980 to near present. The strategy involves three atmospheric reanalysis systems: (1) NASA's MERRA-2, (2) the newest reanalysis produced by the Japan Meteorological Agency, JRA-55, and (3) the European Centre for Medium-Range Weather Forecasts 20th Century reanalysis, ERA-20C. MERRA-2 and ERA-20C are also accompanied by 10-member AMIP integrations, and JRA-55 by a reanalysis using only conventional observations, JRA-55C. Differencing these latter integrations from the more comprehensive reanalyses helps provide a clearer picture of the impact of satellite observations by removing the effects of SST forcing. This facilitates the use of principal component analysis as a tool to identify and remove non-physical signals. We then use these homogenized evaporation (E), precipitation (P), and moisture transport fields to examine the consistency of diagnostics of thermodynamic and hydrologic scaling, especially the P-E pattern amplification, or the "wet-get-wetter, dry-get-drier" response. Prospects for further validation against new satellite turbulent flux retrievals are discussed.
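The homogenization idea described above, differencing a reanalysis from its SST-forced AMIP twin and using principal component analysis to isolate observing-system artifacts, can be illustrated with a toy numerical sketch. Everything below is synthetic and illustrative (the field sizes, the single spurious step, and removing only the leading mode are assumptions, not the authors' actual procedure):

```python
import numpy as np

# Toy monthly flux anomalies (time x grid), all synthetic: a physical,
# SST-forced signal shared by the reanalysis and its AMIP twin, plus a
# spurious step in the reanalysis when a new satellite enters the system.
rng = np.random.default_rng(5)
nt, nx = 360, 50
physical = np.outer(np.sin(np.arange(nt) / 12.0), rng.normal(size=nx))
step_pattern = rng.normal(size=nx)
step = np.outer((np.arange(nt) >= 180).astype(float), step_pattern)
amip = physical + 0.1 * rng.normal(size=(nt, nx))
rean = physical + step + 0.1 * rng.normal(size=(nt, nx))

# Differencing removes the SST-forced signal, so the leading principal
# component of (reanalysis - AMIP) captures the observing-system artifact.
diff = rean - amip
diff = diff - diff.mean(axis=0)
u, s, vt = np.linalg.svd(diff, full_matrices=False)
artifact = np.outer(u[:, 0] * s[0], vt[0])   # leading PCA mode of the difference
homogenized = rean - artifact                # remove the non-physical signal

def half_jump(field):
    # Mean absolute difference between late-half and early-half means,
    # i.e. the size of any step at the satellite-introduction date.
    return np.abs(field[180:].mean(axis=0) - field[:180].mean(axis=0)).mean()

print(half_jump(rean), half_jump(homogenized))
```

The spurious step dominates the first difference mode because the physical signal cancels in the differencing, which is the point of pairing each reanalysis with an AMIP (or conventional-observation) twin.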

  9. Precipitation climatology over India: validation with observations and reanalysis datasets and spatial trends

    NASA Astrophysics Data System (ADS)

    Kishore, P.; Jyothi, S.; Basha, Ghouse; Rao, S. V. B.; Rajeevan, M.; Velicogna, Isabella; Sutterley, Tyler C.

    2016-01-01

    Changing rainfall patterns have a significant effect on water resources and agricultural output in many countries, especially countries like India whose economies depend on rain-fed agriculture. Rainfall over India has large spatial as well as temporal variability. To understand this variability, spatial-temporal analyses of rainfall have been carried out using 107 years (1901-2007) of daily gridded India Meteorological Department (IMD) rainfall datasets. Further, the IMD precipitation data are validated against different observational and reanalysis datasets for the period 1989-2007. The Global Precipitation Climatology Project data show features similar to IMD with a high degree of agreement, whereas the Asian Precipitation-Highly-Resolved Observational Data Integration Towards Evaluation data show similar features but with large differences, especially over the northwest, the west coast, and the western Himalayas. Spatially, large deviations are observed in the interior peninsula during the monsoon season with the National Aeronautics and Space Administration Modern Era Retrospective-analysis for Research and Applications (NASA-MERRA), during the pre-monsoon with the Japanese 25-year Reanalysis (JRA-25), and during the post-monsoon with the Climate Forecast System Reanalysis (CFSR) datasets. Among the reanalysis datasets, the European Centre for Medium-Range Weather Forecasts Interim Re-Analysis (ERA-Interim) agrees best, followed by CFSR, NASA-MERRA, and JRA-25. Further, for the first time, with high-resolution and long-term IMD data, the spatial distribution of trends is estimated by applying a robust regression analysis technique to the annual and seasonal rainfall data for different regions of India. Significant positive and negative trends are noticed in the whole time series during the monsoon season. The northeast and the west coast of India show significant positive trends, while negative trends are observed over the western Himalayas and the north-central region.
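The abstract does not specify which robust regression technique is used; as a hedged illustration only, one common choice for climate trend work is the Theil-Sen estimator (the median of all pairwise slopes), sketched here on a synthetic rainfall series:

```python
import numpy as np
from scipy.stats import theilslopes

# Synthetic annual rainfall series (mm), standing in for one IMD grid cell.
rng = np.random.default_rng(0)
years = np.arange(1901, 2008)
rain = 900.0 + 1.2 * (years - years[0]) + rng.normal(0.0, 60.0, years.size)
rain[[10, 50]] += 400.0  # outlier wet years that would distort an OLS fit

# Theil-Sen estimator: median of pairwise slopes, robust to outliers.
slope, intercept, lo, hi = theilslopes(rain, years, alpha=0.95)
significant = not (lo <= 0.0 <= hi)  # trend significant if the CI excludes zero
print(f"trend = {slope:.2f} mm/yr, 95% CI [{lo:.2f}, {hi:.2f}], significant={significant}")
```

Unlike ordinary least squares, the median-of-slopes estimate is barely moved by the two injected outlier years, which is why robust estimators are preferred for long, noisy rainfall records.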

  10. Sensitivity of Statistical Downscaling Techniques to Reanalysis Choice and Implications for Regional Climate Change Scenarios

    NASA Astrophysics Data System (ADS)

    Manzanas, R., Sr.; Brands, S.; San Martin, D., Sr.; Gutiérrez, J. M., Sr.

    2014-12-01

    This work shows that local-scale climate projections obtained by means of statistical downscaling are sensitive to the choice of reanalysis used for calibration. To this aim, a Generalized Linear Model (GLM) approach is applied to downscale daily precipitation in the Philippines. First, the GLMs are trained and tested, under a cross-validation scheme, separately for two distinct reanalyses (ERA-Interim and JRA-25) for the period 1981-2000. When the observed and downscaled time series are compared, the attained performance is found to be sensitive to the reanalysis considered if variables bearing a climate change signal (temperature and/or specific humidity) are included in the predictor field. Moreover, the performance differences are shown to correspond to the disagreement found between the raw predictors from the two reanalyses. Second, the regression coefficients calibrated either with ERA-Interim or with JRA-25 are subsequently applied to the output of a Global Climate Model (MPI-ECHAM5) in order to assess the sensitivity of local-scale climate change projections (up to 2100) to reanalysis choice. In this case, the differences detected in present climate conditions are considerably amplified, leading to "delta-change" estimates differing by up to 35% (on average for the entire country) depending on the reanalysis used for calibration. Therefore, reanalysis choice contributes substantially to the uncertainty of local-scale climate change projections and, consequently, should be treated with the same care as other well-known sources of uncertainty, e.g., the choice of the GCM and/or the downscaling method. Implications of the results for the entire tropics, as well as for the Model Output Statistics downscaling approach, are also briefly discussed.
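The calibration-sensitivity mechanism can be sketched with a minimal numpy logistic GLM for the wet/dry occurrence part of such a downscaling. All data are synthetic and the single humidity-like predictor, the 0.3 offset between "reanalyses", and the station setup are illustrative assumptions, not the paper's actual configuration:

```python
import numpy as np

def fit_logistic(X, y, iters=25):
    """Logistic GLM (wet/dry occurrence part) fitted by Newton/IRLS, numpy only."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        W = p * (1 - p)
        # Newton step: beta += (X' W X)^-1 X' (y - p)
        beta += np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (y - p))
    return beta

# Toy setup: daily rain occurrence at one station driven by a humidity-like
# predictor; two "reanalyses" disagree by a small offset in that predictor.
rng = np.random.default_rng(4)
n = 3000
q_era = rng.normal(size=n)          # predictor from "reanalysis A"
q_jra = q_era + 0.3                 # same field, shifted in "reanalysis B"
p_true = 1.0 / (1.0 + np.exp(-(0.5 + 1.5 * q_era)))
wet = (rng.random(n) < p_true).astype(float)

b_era = fit_logistic(np.column_stack([np.ones(n), q_era]), wet)
b_jra = fit_logistic(np.column_stack([np.ones(n), q_jra]), wet)

# Applied to the same GCM predictor value, the two calibrations give
# different wet-day probabilities: the sensitivity the paper quantifies.
gcm_q = 1.0
p_from_era = 1.0 / (1.0 + np.exp(-(b_era @ [1.0, gcm_q])))
p_from_jra = 1.0 / (1.0 + np.exp(-(b_jra @ [1.0, gcm_q])))
print(p_from_era, p_from_jra)
```

Both calibrations fit the training data equally well (the offset is absorbed into the intercept), yet they disagree once driven by a common GCM predictor, which is exactly why predictor disagreement between reanalyses propagates into the projections.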

  11. Multidecadal Changes in the UTLS Ozone from the MERRA-2 Reanalysis and the GMI Chemistry Model

    NASA Technical Reports Server (NTRS)

    Wargan, Krzysztof; Orbe, Clara; Pawson, Steven; Ziemke, Jerald R.; Oman, Luke; Olsen, Mark; Coy, Lawrence; Knowland, Emma

    2018-01-01

    Long-term changes of ozone in the UTLS (Upper Troposphere / Lower Stratosphere) reflect the response to decreases in the stratospheric concentrations of ozone-depleting substances as well as changes in the stratospheric circulation induced by climate change. To date, studies of UTLS ozone changes and variability have relied mainly on satellite and in-situ observations as well as chemistry-climate model simulations. By comparison, the potential of reanalysis ozone data remains relatively untapped. This is despite evidence from recent studies, including detailed analyses conducted under the SPARC (Stratosphere-troposphere Processes And their Role in Climate) Reanalysis Intercomparison Project (S-RIP), demonstrating that stratospheric ozone fields from modern atmospheric reanalyses exhibit good agreement with independent data while delineating issues related to inhomogeneities in the assimilated observations. In this presentation, we will explore the possibility of inferring long-term, geographically and vertically resolved behavior of lower stratospheric (LS) ozone from NASA's MERRA-2 (Modern-Era Retrospective analysis for Research and Applications, Version 2) reanalysis after accounting for the few known discontinuities and gaps in its assimilated input data. This work builds upon previous studies that have documented excellent agreement between MERRA-2 ozone and ozonesonde observations in the LS. Of particular importance is the relatively good vertical resolution of MERRA-2, which allows precise separation of tropospheric and stratospheric ozone contents. We also compare the MERRA-2 LS ozone results with the recently completed 37-year simulation produced using the Goddard Earth Observing System in "replay" mode coupled with the GMI (Global Modeling Initiative) chemistry mechanism. Replay mode dynamically constrains the model with the MERRA-2 reanalysis winds, temperature, and pressure.
We will emphasize the areas of agreement of the reanalysis and replay and interpret differences between them in the context of our increasing understanding of model transport driven by assimilated winds.

  12. Spatiotemporal Evaluation of Reanalysis and In-situ Surface Air Temperature over Ethiopia

    NASA Astrophysics Data System (ADS)

    Tesfaye, T.

    2017-12-01

    Tewodros Woldemariam Tesfaye (Research Scholar), C.T. Dhanya (Assistant Professor), and A.K. Gosain (Professor), Department of Civil Engineering, Indian Institute of Technology Delhi, New Delhi-110016, India; e-mail: tewodros2002@gmail.com. Abstract: Water resources management and modelling studies are often constrained by the scarcity of observed data, especially of the two major variables, precipitation and temperature. Modellers hence rely on reanalysis datasets as a substitute, though their performance varies heavily depending on data availability and regional characteristics. The present study examines the ability of frequently used reanalysis datasets to capture the spatiotemporal characteristics of maximum and minimum surface temperatures over Ethiopia and highlights any biases in these datasets over the Ethiopian region. We considered the ERA-Interim, NCEP 2, MERRA, and CFSR reanalysis datasets and compared them with temperature observations from 15 synoptic stations spread over Ethiopia. In addition to the long-term averages and the annual cycle, a critical comparison of various extreme indices, such as diurnal temperature range, warm days, warm nights, cool days, cool nights, summer days, and tropical nights, is also undertaken. Our results indicate that the performance of CFSR, followed by NCEP 2, is better in capturing the majority of these aspects. ERA-Interim suffers from a large additive bias in the simulation of various aspects of minimum temperature at all the stations considered, while its performance is better for maximum temperature. The inferior performance of ERA-Interim is noted to be due solely to the difficulty in simulating minimum temperature. Key words: ERA-Interim; NCEP Reanalysis; MERRA; CFSR; diurnal temperature range; reanalysis performance.

  13. Assessing the Suitability of Historical PM(2.5) Element Measurements for Trend Analysis.

    PubMed

    Hyslop, Nicole P; Trzepla, Krystyna; White, Warren H

    2015-08-04

    The IMPROVE (Interagency Monitoring of Protected Visual Environments) network has characterized fine particulate matter composition at locations throughout the United States since 1988. A main objective of the network is to evaluate long-term trends in aerosol concentrations. Measurements inevitably advance over time, but changes in measurement technique have the potential to confound the interpretation of long-term trends. Problems of interpretation typically arise from changing biases, and changes in bias can be difficult to identify without comparison data that are consistent throughout the measurement series, which rarely exist. We created a consistent measurement series for exactly this purpose by reanalyzing the 15-year archives (1995-2009) of aerosol samples from three sites (Great Smoky Mountains National Park, Mount Rainier National Park, and Point Reyes National Seashore) as single batches using consistent analytical methods. In most cases, trend estimates based on the original and reanalysis measurements are statistically different for elements that were not measured above the detection limit consistently over the years (e.g., Na, Cl, Si, Ti, V, Mn). The original trends are more reliable for elements consistently measured above the detection limit. All but one of the 23 site-element series with detection rates >80% had statistically indistinguishable original and reanalysis trends (overlapping 95% confidence intervals).
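The overlap criterion used here (trends treated as indistinguishable when their 95% confidence intervals overlap) can be sketched as follows. The series are synthetic, and forming the intervals from the OLS slope standard error is one standard construction, assumed here rather than taken from the paper:

```python
import numpy as np
from scipy.stats import linregress, t

def slope_ci(x, y, alpha=0.95):
    """OLS trend for one site-element series with its 95% confidence interval."""
    res = linregress(x, y)
    half = t.ppf(0.5 + alpha / 2, len(x) - 2) * res.stderr
    return res.slope - half, res.slope + half

def indistinguishable(ci_a, ci_b):
    # Trends are treated as statistically indistinguishable when the
    # two confidence intervals overlap.
    return ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]

# Synthetic 15-year concentration series for one well-detected element:
# the reanalysis batch re-measures the same samples with small extra noise.
years = np.arange(1995, 2010)
rng = np.random.default_rng(3)
original = 20.0 - 0.5 * (years - 1995) + rng.normal(0.0, 1.0, years.size)
reanalysis = original + rng.normal(0.0, 0.3, years.size)

same_trend = indistinguishable(slope_ci(years, original), slope_ci(years, reanalysis))
print(same_trend)
```

For elements measured consistently above the detection limit, the re-measurement noise is small relative to the trend uncertainty, so the two intervals overlap, matching the paper's finding for high-detection-rate series.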

  14. MGDB: crossing the marker genes of a user microarray with a database of public-microarrays marker genes.

    PubMed

    Huerta, Mario; Munyi, Marc; Expósito, David; Querol, Enric; Cedano, Juan

    2014-06-15

    The number of microarrays performed by scientific teams grows exponentially. These microarray data could be useful for researchers around the world, but unfortunately they are underused. To fully exploit these data, it is necessary (i) to extract them from a repository of high-throughput gene expression data, such as Gene Expression Omnibus (GEO), and (ii) to make the data from different microarrays comparable with tools that are easy for scientists to use. We have developed these two solutions in our server, implementing a database of microarray marker genes (Marker Genes Data Base). This database contains the marker genes of all GEO microarray datasets and is updated monthly with the new microarrays from GEO. Thus, researchers can see whether the marker genes of their microarray are marker genes in other microarrays in the database, expanding the analysis of their microarray to the rest of the public microarrays. This solution helps not only to corroborate the conclusions regarding a researcher's microarray but also to identify the phenotype of different subsets of individuals under investigation, to frame the results with microarray experiments from other species, pathologies or tissues, to search for drugs that promote the transition between the studied phenotypes, to detect undesirable side effects of the treatment applied, etc. Thus, the researcher can quickly add relevant information to his/her studies from all of the previous analyses performed in other studies, as long as they have been deposited in public repositories. Marker-gene database tool: http://ibb.uab.es/mgdb © The Author 2014. Published by Oxford University Press.

  15. 2008 Microarray Research Group (MARG Survey): Sensing the State of Microarray Technology

    EPA Science Inventory

    Over the past several years, the field of microarrays has grown and evolved drastically. In its continued efforts to track this evolution and transformation, the ABRF-MARG has once again conducted a survey of international microarray facilities and individual microarray users. Th...

  16. Masculine and Feminine Stereotypes and Adjustment: A Reanalysis.

    ERIC Educational Resources Information Center

    Gilbert, Lucia A.; And Others

    1981-01-01

    Presents a reanalysis of earlier data from a study of self-perceptions of stereotypic masculine and feminine attributes, Rogerian-type conflict, and personal adjustment. Reanalyses demonstrated the advantage of distinguishing between androgynous and undifferentiated individuals rather than grouping them together into a single "balanced"…

  17. Chemistry Simulations using the MERRA-2 Reanalysis with the GMI CTM and Replay in Support of the Atmospheric Composition Community

    NASA Technical Reports Server (NTRS)

    Oman, Luke D.; Strahan, Susan E.

    2017-01-01

    Simulations using reanalysis meteorological fields have long been used to understand the causes of atmospheric composition change in the recent past. Using the new MERRA-2 reanalysis, we are conducting chemistry simulations to create products covering 1980-2016 for the atmospheric composition community. These simulations use the Global Modeling Initiative (GMI) chemical mechanism in two different models: the GMI Chemical Transport Model (CTM) and the GEOS-5 model in Replay mode. Replay mode means an integration of the GEOS-5 general circulation model that is incrementally adjusted each time step toward the MERRA-2 reanalysis. The GMI CTM is a 1 deg x 1.25 deg simulation and the MERRA-2 GMI Replay simulation uses the native MERRA-2 grid of approximately 1/2 deg horizontal resolution on the cubed sphere. A specialized set of transport diagnostics is included in both runs to better understand trace gas transport and its variability in the recent past.

  18. Northern Hemisphere climate trends in reanalysis and forecast model predictions: The 500 hPa annual means

    NASA Astrophysics Data System (ADS)

    Bordi, I.; Fraedrich, K.; Sutera, A.

    2010-06-01

    The lead-time-dependent climates of the ECMWF weather prediction model, initialized with the ERA-40 reanalysis, are analysed using 44 years of day-1 to day-10 forecasts of the Northern Hemisphere 500-hPa geopotential height fields. The study addresses the question of whether short-term tendencies have an impact on long-term trends. Comparing climate trends of ERA-40 with those of the forecasts, it appears that the forecast model rapidly loses the memory of its initial conditions, creating its own climate. All forecast trends show a high degree of consistency. The comparison suggests that: (i) only centers characterized by an upward trend remain statistically significant as the lead time increases; (ii) in midlatitudes the forecasts exhibit an upward trend larger than the one observed in the reanalysis, while in the tropics there is good agreement; and (iii) the downward trend seen in the reanalysis at high latitudes also characterizes the day-1 forecast but approaches zero with increasing lead time.

  19. Cross-evaluation of ground-based, multi-satellite and reanalysis precipitation products: Applicability of the Triple Collocation method across Mainland China

    NASA Astrophysics Data System (ADS)

    Li, Changming; Tang, Guoqiang; Hong, Yang

    2018-07-01

    Evaluating the reliability of satellite and reanalysis precipitation products is critical but challenging over ungauged or poorly gauged regions. The Triple Collocation (TC) method is a reliable approach to estimate the accuracy of any three independent inputs in the absence of truth values. This study assesses the uncertainty of three types of independent precipitation products, i.e., satellite-based, ground-based, and model reanalysis, over Mainland China using the TC method. The ground-based data set is the China Gauge-based Daily Precipitation Analysis (CGDPA). The reanalysis data set is the European Centre for Medium-Range Weather Forecasts interim reanalysis (ERA-Interim). The satellite-based products include five mainstream satellite products. The comparison and evaluation are conducted at 0.25° and daily resolutions from 2013 to 2015. First, the effectiveness of the TC method is evaluated in South China, which has a dense gauge network. The results demonstrate that the TC method is reliable because the correlation coefficient (CC) and root mean square error (RMSE) derived from TC are close to those derived from ground observations, with only 9% and 7% mean relative differences, respectively. Then, the TC method is applied in Mainland China, with special attention paid to the Tibetan Plateau (TP), known as the Earth's third pole, where there are few ground stations. Results indicate that (1) the overall performance of IMERG is better than the other satellite products over Mainland China, followed by 3B42V7, CMORPH-CRT, and PERSIANN-CDR; (2) in the TP, CGDPA shows the best overall performance over gauged grid cells; however, over ungauged regions, IMERG and ERA-Interim slightly outperform CGDPA, with similar RMSE but higher mean CC (0.63, 0.61, and 0.58, respectively). This highlights the strengths and potential of remote sensing and reanalysis data over the TP and reconfirms the inherent uncertainty of CGDPA arising from interpolation of sparse gauge data.
The study concludes that the TC method provides not only reliable cross-validation results over Mainland China but also a new perspective for comparatively assessing multi-source precipitation products, particularly over poorly gauged regions such as the TP.
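The covariance form of Triple Collocation can be sketched in a few lines. The data below are synthetic, and the simplest TC error model is assumed (additive, mutually independent errors and no multiplicative biases); the paper's implementation details may differ:

```python
import numpy as np

def triple_collocation_rmse(a, b, c):
    """Estimate each product's random-error standard deviation via covariance TC.

    Assumes the three products' errors are mutually uncorrelated and each
    product equals the unknown truth plus additive noise.
    """
    cov = np.cov(np.vstack([a, b, c]))
    # err_var_a = var(a) - cov(a,b) * cov(a,c) / cov(b,c), and cyclically.
    e_a = cov[0, 0] - cov[0, 1] * cov[0, 2] / cov[1, 2]
    e_b = cov[1, 1] - cov[0, 1] * cov[1, 2] / cov[0, 2]
    e_c = cov[2, 2] - cov[0, 2] * cov[1, 2] / cov[0, 1]
    return np.sqrt(np.maximum([e_a, e_b, e_c], 0.0))

# Synthetic daily precipitation "truth" plus independent errors, standing in
# for gauge, satellite, and reanalysis products at one grid cell.
rng = np.random.default_rng(1)
truth = rng.gamma(2.0, 4.0, 5000)
gauge = truth + rng.normal(0.0, 1.0, truth.size)
sat = truth + rng.normal(0.0, 2.0, truth.size)
rean = truth + rng.normal(0.0, 3.0, truth.size)

est = triple_collocation_rmse(gauge, sat, rean)
print(est)  # should recover roughly the injected error levels 1, 2, 3
```

No truth values enter the estimator; the error variances are recovered purely from the three pairwise covariances, which is what makes TC attractive over ungauged regions such as the TP.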

  20. How well do reanalyses represent polar lows?

    NASA Astrophysics Data System (ADS)

    Zappa, G.; Shaffrey, L.; Hodges, K.

    2013-12-01

    Polar lows are intense maritime mesocyclones that form at high latitudes during polar air outbreaks. The associated high surface winds can be an important cause of coastal damage. Polar lows also seem to play a relevant role in the climate system by modulating the oceanic surface heat fluxes. This creates strong interest in understanding whether modern reanalysis datasets are able to represent polar lows, as well as how their representation may be sensitive to model resolution. In this talk we investigate how the ERA-Interim reanalysis represents the polar lows identified by the Norwegian meteorological services and listed in the STARS (Combination of Sea Surface Temperature and AltimeteR Synergy) dataset for the period 2002-2011. The sensitivity to resolution is explored by comparing ERA-Interim to the ECMWF operational analyses (2008-2011), which have three times the horizontal resolution of ERA-Interim. We show that ERA-Interim has excellent ability to capture the observed polar low events, with up to 90% of the observed events being found in the reanalysis. However, ERA-Interim tends to have polar lows of weaker dynamical intensity, in terms of both winds and vorticity, and with less spatial structure than in the ECMWF operational analyses (see Fig. 1). Furthermore, we apply an objective feature-tracking algorithm to the 3-hourly vorticity at 850 hPa, with constraints on vorticity intensity and atmospheric static stability, to objectively identify polar lows in the ERA-Interim reanalysis. We show that for the stronger polar lows the objective climatology shows good agreement with the STARS dataset over the 2002-2011 period. This allows us to extend the polar low climatology over the whole ERA-Interim period. Differences with another reanalysis product (NCEP-CFSR) will also be discussed.
Fig. 1: Composite of the tangential wind speed at 925 hPa for 34 polar lows observed in the Norwegian Sea between 2008 and 2010, as represented by the ERA-Interim reanalysis (left) and by the ECMWF operational analysis (right). Positive values indicate cyclonic circulation. The composite is centered on the polar low vorticity maxima and is presented for a radial cap of 5 degrees of radius on the sphere (~550 km).

  1. An 11-year global gridded aerosol optical thickness reanalysis (v1.0) for atmospheric and climate sciences

    NASA Astrophysics Data System (ADS)

    Lynch, Peng; Reid, Jeffrey S.; Westphal, Douglas L.; Zhang, Jianglong; Hogan, Timothy F.; Hyer, Edward J.; Curtis, Cynthia A.; Hegg, Dean A.; Shi, Yingxi; Campbell, James R.; Rubin, Juli I.; Sessions, Walter R.; Turk, F. Joseph; Walker, Annette L.

    2016-04-01

    While standalone satellite and model aerosol products see wide utilization, there is a significant need in numerous atmospheric and climate applications for a fused product on a regular grid. Aerosol data assimilation is an operational reality at numerous centers, and like meteorological reanalyses, aerosol reanalyses will see significant use in the near future. Here we present a standardized 2003-2013 global 1° × 1° and 6-hourly modal aerosol optical thickness (AOT) reanalysis product. This data set can be applied to basic and applied Earth system science studies of significant aerosol events, aerosol impacts on numerical weather prediction, and electro-optical propagation and sensor performance, among other uses. This paper describes the science of how to develop and score an aerosol reanalysis product. This reanalysis utilizes a modified Navy Aerosol Analysis and Prediction System (NAAPS) at its core and assimilates quality controlled retrievals of AOT from the Moderate Resolution Imaging Spectroradiometer (MODIS) on Terra and Aqua and the Multi-angle Imaging SpectroRadiometer (MISR) on Terra. The aerosol source functions, including dust and smoke, were regionally tuned to obtain the best match between the model fine- and coarse-mode AOTs and the Aerosol Robotic Network (AERONET) AOTs. Other model processes, including deposition, were tuned to minimize the AOT difference between the model and satellite AOT. Aerosol wet deposition in the tropics is driven with satellite-retrieved precipitation, rather than the model field. The final reanalyzed fine- and coarse-mode AOT at 550 nm is shown to have good agreement with AERONET observations, with global mean root mean square error around 0.1 for both fine- and coarse-mode AOTs.
This paper includes a discussion of issues particular to aerosol reanalyses that make them distinct from standard meteorological reanalyses, considerations for extending such a reanalysis outside of the NASA A-Train era, and examples of how the aerosol reanalysis can be applied or fused with other model or remote sensing products. Finally, the reanalysis is evaluated in comparison with other available studies of aerosol trends, and the implications of this comparison are discussed.

  2. Development studies towards an 11-year global gridded aerosol optical thickness reanalysis for climate and applied applications

    NASA Astrophysics Data System (ADS)

    Lynch, P.; Reid, J. S.; Westphal, D. L.; Zhang, J.; Hogan, T. F.; Hyer, E. J.; Curtis, C. A.; Hegg, D. A.; Shi, Y.; Campbell, J. R.; Rubin, J. I.; Sessions, W. R.; Turk, F. J.; Walker, A. L.

    2015-12-01

    While standalone satellite and model aerosol products see wide utilization, there is a significant need in numerous climate and applied applications for a fused product on a regular grid. Aerosol data assimilation is an operational reality at numerous centers, and like meteorological reanalyses, aerosol reanalyses will see significant use in the near future. Here we present a standardized 2003-2013 global 1° × 1° and 6 hourly modal aerosol optical thickness (AOT) reanalysis product. This dataset can be applied to basic and applied earth system science studies of significant aerosol events, aerosol impacts on numerical weather prediction, and electro-optical propagation and sensor performance, among other uses. This paper describes the science of how to develop and score an aerosol reanalysis product. This reanalysis utilizes a modified Navy Aerosol Analysis and Prediction System (NAAPS) at its core and assimilates quality controlled retrievals of AOT from the Moderate Resolution Imaging Spectroradiometer (MODIS) on Terra and Aqua and the Multi-angle Imaging SpectroRadiometer (MISR) on Terra. The aerosol source functions, including dust and smoke, were regionally tuned to obtain the best match between the model fine and coarse mode AOTs and the Aerosol Robotic Network (AERONET) AOTs. Other model processes, including deposition, were tuned to minimize the AOT difference between the model and satellite AOT. Aerosol wet deposition in the tropics is driven with satellite retrieved precipitation, rather than the model field. The final reanalyzed fine and coarse mode AOT at 550 nm is shown to have good agreement with AERONET observations, with global mean root mean square error around 0.1 for both fine and coarse mode AOTs. 
This paper includes a discussion of issues particular to aerosol reanalyses that make them distinct from standard meteorological reanalyses, considerations for extending such a reanalysis outside of the NASA A-Train era, and examples of how the aerosol reanalysis can be applied or fused with other model or remote sensing products. Finally, the reanalysis is evaluated in comparison with other available studies of aerosol trends, and the implications of this comparison are discussed.

  3. THE ABRF-MARG MICROARRAY SURVEY 2004: TAKING THE PULSE OF THE MICROARRAY FIELD

    EPA Science Inventory

    Over the past several years, the field of microarrays has grown and evolved drastically. In its continued efforts to track this evolution, the ABRF-MARG has once again conducted a survey of international microarray facilities and individual microarray users. The goal of the surve...

  4. Contributions to Statistical Problems Related to Microarray Data

    ERIC Educational Resources Information Center

    Hong, Feng

    2009-01-01

    Microarray is a high-throughput technology to measure gene expression. Analysis of microarray data brings many interesting and challenging problems. This thesis consists of three studies related to microarray data. First, we propose a Bayesian model for microarray data and use Bayes Factors to identify differentially expressed genes. Second, we…

  5. Parallel, confocal, and complete spectrum imager for fluorescent detection of high-density microarray

    NASA Astrophysics Data System (ADS)

    Bogdanov, Valery L.; Boyce-Jacino, Michael

    1999-05-01

    Confined arrays of biochemical probes deposited on a solid support surface (an analytical microarray or 'chip') provide an opportunity to analyze multiple reactions simultaneously. Microarrays are increasingly used in genetics, medicine, and environmental screening as research and analytical instruments. The power of microarray technology comes from its parallelism, which grows with array miniaturization, minimization of reagent volume per reaction site, and reaction multiplexing. An optical detector of microarray signals should combine high sensitivity with high spatial and spectral resolution. Additionally, low cost and a high processing rate are needed to transfer microarray technology into biomedical practice. We designed an imager that provides confocal, complete-spectrum detection of an entire fluorescently labeled microarray in parallel. The imager uses a microlens array, a non-slit spectral decomposer, and a high-sensitivity detector (cooled CCD). Two imaging channels provide simultaneous detection of the localization, integrated intensity, and spectral intensity for each reaction site in the microarray. Dimensional matching between the microarray and the imager's optics eliminates all moving parts in the instrumentation, enabling highly informative, fast, and low-cost microarray detection. We report the theory of confocal hyperspectral imaging with a microlens array and experimental data for the implementation of the developed imager to detect fluorescently labeled microarrays with a density of approximately 10³ sites per cm².

  6. MicroRNA-integrated and network-embedded gene selection with diffusion distance.

    PubMed

    Huang, Di; Zhou, Xiaobo; Lyon, Christopher J; Hsueh, Willa A; Wong, Stephen T C

    2010-10-29

    Gene network information has been used to improve gene selection in microarray-based studies by selecting marker genes based both on their expression and on the coordinate expression of genes within their gene network under a given condition. Here we propose a new network-embedded gene selection model. In this model, we first address the limitations of microarray data. Microarray data, although widely used for gene selection, measure only mRNA abundance, which does not always reflect the ultimate gene phenotype, since it does not account for post-transcriptional effects. To overcome this important (in certain cases critical) limitation, ignored in almost all existing studies, we design a new strategy that integrates microarray data with information on microRNA, the major post-transcriptional regulatory factor. We also handle the challenges posed by the mechanisms of gene collaboration. To incorporate the biological facts that genes without direct interactions may work closely together due to signal transduction and that two genes may be functionally connected through multiple paths, we adopt the concept of diffusion distance. This concept permits us to simulate biological signal propagation and therefore to estimate the collaboration probability for all gene pairs, directly or indirectly connected, according to the multiple paths connecting them. We demonstrate, using type 2 diabetes (DM2) as an example, that the proposed strategies can enhance the identification of functional gene partners, which is the key issue in a network-embedded gene selection model. More importantly, we show that our gene selection model outperforms related ones. Genes selected by our model (1) have improved classification capability; (2) agree with biological evidence of DM2 association; and (3) are involved in many well-known DM2-associated pathways.
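The diffusion-distance idea can be illustrated on a toy gene graph. The five-gene network, the four-step diffusion time, and the unweighted form of the distance are all simplifying assumptions for illustration; the paper's model is more elaborate:

```python
import numpy as np

# Toy undirected gene-interaction graph (assumed for illustration):
# g0-g1-g2 form a chain, g3 also attaches to hub g1, and g4 attaches to g2.
A = np.array([
    [0, 1, 0, 0, 0],
    [1, 0, 1, 1, 0],
    [0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
], float)

P = A / A.sum(axis=1, keepdims=True)   # random-walk transition matrix
Pt = np.linalg.matrix_power(P, 4)      # diffuse the signal for t = 4 steps

def diffusion_distance(i, j):
    # Simplified (unweighted) diffusion distance: how differently a signal
    # starting at gene i vs gene j has spread over the network after t steps.
    return np.linalg.norm(Pt[i] - Pt[j])

# g0 and g3 both attach only to hub g1, so diffusion from either spreads
# identically: they "collaborate" despite having no direct edge, while
# g0 and g4 remain clearly separated.
print(diffusion_distance(0, 3), diffusion_distance(0, 4))
```

This captures the point made in the abstract: the distance aggregates all multi-step paths between two genes, so indirectly connected genes with similar network positions come out close.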

  7. Microarray analysis and draft genomes of two Escherichia coli O157:H7 lineage II cattle isolates FRIK966 and FRIK2000 investigating lack of Shiga toxin expression

    USDA-ARS?s Scientific Manuscript database

    The existence of two separate lineages of Escherichia coli O157:H7 has previously been reported, and research indicates that lineage I might be more pathogenic towards human hosts than lineage II. We have previously shown that lineage I expresses higher levels of Shiga toxin 2 (Stx2). To evaluate w...

  8. Molecular Profiles for Lung Cancer Pathogenesis and Detection in U.S. Veterans

    DTIC Science & Technology

    2013-10-01

    All small RNA sequencing and microarray data have been deposited in GEO under the accession number GSE48798. …histologically normal airway epithelium and tumor. We have validated this approach and the data will allow us to identify novel pathways for lung

  9. MapReduce Algorithms for Inferring Gene Regulatory Networks from Time-Series Microarray Data Using an Information-Theoretic Approach.

    PubMed

    Abduallah, Yasser; Turki, Turki; Byron, Kevin; Du, Zongxuan; Cervantes-Cervantes, Miguel; Wang, Jason T L

    2017-01-01

    Gene regulation is a series of processes that control gene expression and its extent. The connections among genes and their regulatory molecules, usually transcription factors, and a descriptive model of such connections are known as gene regulatory networks (GRNs). Elucidating GRNs is crucial to understanding the inner workings of the cell and the complexity of gene interactions. To date, numerous algorithms have been developed to infer gene regulatory networks. However, as the number of identified genes increases and the complexity of their interactions is uncovered, networks and their regulatory mechanisms become cumbersome to test. Furthermore, sifting through experimental results requires an enormous amount of computation, resulting in slow data processing. Therefore, new approaches are needed to expeditiously analyze the copious amounts of experimental data resulting from cellular GRNs. To meet this need, cloud computing is promising, as reported in the literature. Here, we propose new MapReduce algorithms for inferring gene regulatory networks on a Hadoop cluster in a cloud environment. These algorithms employ an information-theoretic approach to infer GRNs using time-series microarray data. Experimental results show that our MapReduce program is much faster than an existing tool, while achieving slightly better prediction accuracy.
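
    The information-theoretic scoring step can be sketched serially as below. The gene names, lag, and binning are illustrative assumptions, and this is plain Python rather than the paper's Hadoop implementation; in a MapReduce setting, each (regulator, target) pair would simply become one map task computing the same quantity.

```python
import numpy as np
from itertools import permutations

def mutual_information(x, y, bins=4):
    """Mutual information (in nats) between two discretized series."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y
    nz = pxy > 0                          # avoid log(0)
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

def score_edges(expr, lag=1):
    """Score every directed regulator -> target pair by time-lagged MI.

    expr : dict mapping gene name -> 1-D time series of expression.
    """
    scores = {}
    for reg, tgt in permutations(expr, 2):
        x = np.asarray(expr[reg])[:-lag]   # regulator at time t
        y = np.asarray(expr[tgt])[lag:]    # target at time t + lag
        scores[(reg, tgt)] = mutual_information(x, y)
    return scores

rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = np.roll(a, 1) + 0.1 * rng.normal(size=200)  # B follows A with lag 1
c = rng.normal(size=200)                        # unrelated gene
scores = score_edges({"A": a, "B": b, "C": c})
# The lagged dependency makes MI(A -> B) the largest score.
```

The lag makes the score directional: MI(A → B) is large while MI(B → A) stays near the chance level, which is how a time-series approach can orient edges that plain correlation cannot.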

  10. A novel harmony search-K means hybrid algorithm for clustering gene expression data

    PubMed Central

    Nazeer, KA Abdul; Sebastian, MP; Kumar, SD Madhu

    2013-01-01

    Recent progress in bioinformatics research has led to the accumulation of huge quantities of biological data at various data sources. DNA microarray technology makes it possible to simultaneously analyze large numbers of genes across different samples. Clustering of microarray data can reveal hidden gene expression patterns from large quantities of expression data, which in turn offers tremendous possibilities in functional genomics, comparative genomics, disease diagnosis and drug development. The k-means clustering algorithm is widely used for many practical applications, but the original k-means algorithm has several drawbacks: it is computationally expensive and generates locally optimal solutions that depend on the random choice of the initial centroids. Several methods have been proposed in the literature for improving the performance of the k-means algorithm. A meta-heuristic optimization algorithm named harmony search helps find near-global optimal solutions by searching the entire solution space. Low clustering accuracy of the existing algorithms limits their use in many crucial applications of the life sciences. In this paper we propose a novel Harmony Search-K means Hybrid (HSKH) algorithm for clustering gene expression data. Experimental results show that the proposed algorithm produces clusters with better accuracy in comparison with the existing algorithms. PMID:23390351

  11. A novel harmony search-K means hybrid algorithm for clustering gene expression data.

    PubMed

    Nazeer, Ka Abdul; Sebastian, Mp; Kumar, Sd Madhu

    2013-01-01

    Recent progress in bioinformatics research has led to the accumulation of huge quantities of biological data at various data sources. DNA microarray technology makes it possible to simultaneously analyze large numbers of genes across different samples. Clustering of microarray data can reveal hidden gene expression patterns from large quantities of expression data, which in turn offers tremendous possibilities in functional genomics, comparative genomics, disease diagnosis and drug development. The k-means clustering algorithm is widely used for many practical applications, but the original k-means algorithm has several drawbacks: it is computationally expensive and generates locally optimal solutions that depend on the random choice of the initial centroids. Several methods have been proposed in the literature for improving the performance of the k-means algorithm. A meta-heuristic optimization algorithm named harmony search helps find near-global optimal solutions by searching the entire solution space. Low clustering accuracy of the existing algorithms limits their use in many crucial applications of the life sciences. In this paper we propose a novel Harmony Search-K means Hybrid (HSKH) algorithm for clustering gene expression data. Experimental results show that the proposed algorithm produces clusters with better accuracy in comparison with the existing algorithms.
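
    The hybrid idea can be sketched as follows. This is a generic harmony-search-seeded k-means using the usual harmony-search components (a harmony memory and a memory-considering rate), not the authors' exact HSKH procedure: the stochastic search explores initial centroid sets globally, and standard Lloyd iterations then refine the best one.

```python
import numpy as np

def kmeans(X, centroids, iters=20):
    """Standard Lloyd iterations starting from the given centroids."""
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centroids[None]) ** 2).sum(-1), axis=1)
        for j in range(len(centroids)):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    sse = ((X - centroids[labels]) ** 2).sum()
    return centroids, labels, sse

def hs_kmeans(X, k, memory_size=10, improvisations=100, hmcr=0.9, seed=0):
    """Harmony search over candidate centroid sets, then k-means refinement."""
    rng = np.random.default_rng(seed)

    def sse(C):  # within-cluster sum of squares for a candidate centroid set
        return ((X[:, None] - C[None]) ** 2).sum(-1).min(axis=1).sum()

    # Harmony memory: each "harmony" is a set of k points drawn from the data.
    memory = [X[rng.choice(len(X), k, replace=False)].copy()
              for _ in range(memory_size)]
    costs = [sse(C) for C in memory]
    for _ in range(improvisations):
        new = np.empty((k, X.shape[1]))
        for j in range(k):
            if rng.random() < hmcr:   # take centroid j from a stored harmony
                new[j] = memory[rng.integers(memory_size)][j]
            else:                     # or improvise: a fresh random data point
                new[j] = X[rng.integers(len(X))]
        cost = sse(new)
        worst = int(np.argmax(costs))
        if cost < costs[worst]:       # replace the worst harmony in memory
            memory[worst], costs[worst] = new, cost
    best = memory[int(np.argmin(costs))].copy()
    return kmeans(X, best)

# Two well-separated groups of expression profiles:
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.5, size=(50, 2)),
               rng.normal(10, 0.5, size=(50, 2))])
centroids, labels, total_sse = hs_kmeans(X, 2)
```

Because the memory keeps several candidate initializations alive and only discards the worst, the refinement step is far less sensitive to a single unlucky random seeding than plain k-means.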

  12. A measure of the signal-to-noise ratio of microarray samples and studies using gene correlations.

    PubMed

    Venet, David; Detours, Vincent; Bersini, Hugues

    2012-01-01

    The quality of gene expression data can vary dramatically from platform to platform, study to study, and sample to sample. As reliable statistical analysis rests on reliable data, determining such quality is of the utmost importance. Quality measures to spot problematic samples exist, but they are platform-specific, and cannot be used to compare studies. As a proxy for quality, we propose a signal-to-noise ratio for microarray data, the "Signal-to-Noise Applied to Gene Expression Experiments", or SNAGEE. SNAGEE is based on the consistency of gene-gene correlations. We applied SNAGEE to a compendium of 80 large datasets on 37 platforms, for a total of 24,380 samples, and assessed the signal-to-noise ratio of studies and samples. This allowed us to discover serious issues with three studies. We show that signal-to-noise ratios of both studies and samples are linked to the statistical significance of the biological results. We also show that SNAGEE is an effective way to measure data quality for most types of gene expression studies, and that it often outperforms existing techniques. Furthermore, SNAGEE is platform-independent and does not require raw data files. The SNAGEE R package is available in BioConductor.
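
    The underlying idea, scoring a data set by how consistently it reproduces reference gene-gene correlations, can be sketched as below. This is a toy proxy on synthetic data, not the actual statistic implemented in the SNAGEE package.

```python
import numpy as np

def correlation_consistency(study, reference_corr):
    """Crude signal-to-noise proxy: agreement between a study's
    gene-gene correlations and a reference correlation matrix.

    study          : (genes, samples) expression matrix
    reference_corr : (genes, genes) correlations from a large compendium
    """
    study_corr = np.corrcoef(study)
    iu = np.triu_indices_from(study_corr, k=1)  # unique gene pairs
    # Correlation between the two sets of pairwise correlations.
    return np.corrcoef(study_corr[iu], reference_corr[iu])[0, 1]

rng = np.random.default_rng(0)
signal = rng.normal(size=(30, 40))            # "true" expression, 30 genes
ref = np.corrcoef(signal)                     # reference correlations
clean = signal + 0.2 * rng.normal(size=signal.shape)
noisy = signal + 3.0 * rng.normal(size=signal.shape)
c_clean = correlation_consistency(clean, ref)
c_noisy = correlation_consistency(noisy, ref)
# The noisier data set reproduces the reference correlations less faithfully.
```

Note that the measure only needs expression values, not raw platform-specific files, which is consistent with the platform independence claimed for SNAGEE.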

  13. Clustering of European winter storms: A multi-model perspective

    NASA Astrophysics Data System (ADS)

    Renggli, Dominik; Buettner, Annemarie; Scherb, Anke; Straub, Daniel; Zimmerli, Peter

    2016-04-01

    The storm series over Europe in 1990 (Daria, Vivian, Wiebke, Herta) and 1999 (Anatol, Lothar, Martin) are very well known. Such clusters of severe events strongly affect the seasonally accumulated damage statistics. The (re)insurance industry has quantified clustering by using distribution assumptions deduced from the historical storm activity of the last 30 to 40 years. The use of storm series simulated by climate models has only started recently. Climate model runs can potentially represent 100s to 1000s of years, allowing a more detailed quantification of clustering than the history of the last few decades. However, it is unknown how sensitive the representation of clustering is to systematic biases. Using a multi-model ensemble allows quantifying that uncertainty. This work uses CMIP5 decadal ensemble hindcasts to study clustering of European winter storms from a multi-model perspective. An objective identification algorithm extracts winter storms (September to April) in the gridded 6-hourly wind data. Since the skill of European storm predictions is very limited on the decadal scale, the different hindcast runs are interpreted as independent realizations. As a consequence, the available hindcast ensemble represents several 1000 simulated storm seasons. The seasonal clustering of winter storms is quantified using the dispersion coefficient. The benchmark for the decadal prediction models is the 20th Century Reanalysis. The decadal prediction models are able to reproduce typical features of the clustering characteristics observed in the reanalysis data. Clustering occurs in all analyzed models over the North Atlantic and European region, in particular over Great Britain and Scandinavia as well as over Iberia (i.e. the exit regions of the North Atlantic storm track). Clustering is generally weaker in the models compared to reanalysis, although the differences between different models are substantial. 
In contrast to existing studies, clustering is driven by weak and moderate events, and not by extreme storms. Thus, the decision which climate model to use to quantify clustering can have a substantial impact on the risk assessment in the (re)insurance business.
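
    One common definition of the dispersion coefficient for seasonal storm counts, assumed here since the abstract does not spell out its formula, measures overdispersion relative to a Poisson process:

```python
import numpy as np

def dispersion_coefficient(counts):
    """Dispersion of seasonal storm counts relative to a Poisson process.

    psi = var/mean - 1: psi > 0 indicates clustering (overdispersion),
    psi = 0 a purely random (Poisson) occurrence of storms.
    """
    counts = np.asarray(counts, dtype=float)
    return counts.var() / counts.mean() - 1.0

# Storm counts per winter season for two hypothetical model runs:
regular = [3, 4, 3, 4, 3, 4, 3, 4]      # little season-to-season spread
clustered = [0, 1, 9, 0, 8, 1, 0, 9]    # a few very active seasons
# The second series, dominated by a handful of active seasons like
# 1990 and 1999, yields a much larger dispersion coefficient.
```

With thousands of simulated storm seasons available from the hindcast ensemble, such a statistic can be estimated far more precisely than from the 30-40 years of observed history.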

  14. Evaluation of Vitamin D Standardization Program protocols for standardizing serum 25-hydroxyvitamin D data: a case study of the program's potential for national nutrition and health surveys12345

    PubMed Central

    Cashman, Kevin D; Kiely, Mairead; Kinsella, Michael; Durazo-Arvizu, Ramón A; Tian, Lu; Zhang, Yue; Lucey, Alice; Flynn, Albert; Gibney, Michael J; Vesper, Hubert W; Phinney, Karen W; Coates, Paul M; Picciano, Mary F; Sempos, Christopher T

    2013-01-01

    Background: The Vitamin D Standardization Program (VDSP) has developed protocols for standardizing procedures of 25-hydroxyvitamin D [25(OH)D] measurement in National Health/Nutrition Surveys to promote 25(OH)D measurements that are accurate and comparable over time, location, and laboratory procedure to improve public health practice. Objective: We applied VDSP protocols to existing ELISA-derived serum 25(OH)D data from the Irish National Adult Nutrition Survey (NANS) as a case-study survey and evaluated their effectiveness by comparison of the protocol-projected estimates with those from a reanalysis of survey serums by using liquid chromatography–tandem mass spectrometry (LC–tandem MS). Design: The VDSP reference system and protocols were applied to ELISA-based serum 25(OH)D data from the representative NANS sample (n = 1118). A reanalysis of 99 stored serums by using standardized LC–tandem MS and resulting regression equations yielded predicted standardized serum 25(OH)D values, which were then compared with LC–tandem MS reanalyzed values for all serums. Results: Year-round prevalence rates for serum 25(OH)D concentrations <30, <40, and <50 nmol/L were 6.5%, 21.9%, and 40.0%, respectively, via original ELISA measurements and 11.4%, 25.3%, and 43.7%, respectively, when VDSP protocols were applied. Differences in estimates at <30- and <40-nmol/L thresholds, but not at the <50-nmol/L threshold, were significant (P < 0.05). A reanalysis of all serums by using LC–tandem MS confirmed prevalence estimates as 11.2%, 27.2%, and 45.0%, respectively. Prevalences of serum 25(OH)D concentrations >125 nmol/L were 1.2%, 0.3%, and 0.6% by means of ELISA, VDSP protocols, and LC–tandem MS, respectively. Conclusion: VDSP protocols hold a major potential for national nutrition and health surveys in terms of the standardization of serum 25(OH)D data. PMID:23615829
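
    The standardization step can be sketched as a simple calibration regression over a re-assayed subset. The data below are hypothetical, and the VDSP protocols specify a full reference measurement system rather than this toy linear model:

```python
import numpy as np

def standardize_and_prevalence(elisa_all, elisa_sub, lcms_sub, thresholds):
    """Project standardized 25(OH)D values onto a full survey.

    A regression fitted on a re-assayed calibration subset (ELISA vs.
    LC-tandem MS) is applied to all survey values, and the prevalence
    below each threshold (nmol/L) is recomputed from the predictions.
    """
    slope, intercept = np.polyfit(elisa_sub, lcms_sub, 1)
    predicted = slope * np.asarray(elisa_all) + intercept
    return {t: float((predicted < t).mean()) for t in thresholds}

# Hypothetical survey: the ELISA reads about 15% high with added noise.
rng = np.random.default_rng(0)
true_lcms = rng.gamma(shape=4.0, scale=12.0, size=1000)       # nmol/L
elisa = 1.15 * true_lcms + rng.normal(0, 3, size=1000)
subset = rng.choice(1000, 99, replace=False)                  # stored serums
prev = standardize_and_prevalence(elisa, elisa[subset], true_lcms[subset],
                                  thresholds=(30, 40, 50))
# The standardized prevalence estimates approach those the reference
# method would give on the full survey.
```

This mirrors the study design above: only 99 serums are re-assayed, yet the regression lets the whole survey's threshold prevalences be corrected.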

  15. Quantifying the effect of varying GHG's concentration in Regional Climate Models

    NASA Astrophysics Data System (ADS)

    López-Romero, Jose Maria; Jerez, Sonia; Palacios-Peña, Laura; José Gómez-Navarro, Juan; Jiménez-Guerrero, Pedro; Montavez, Juan Pedro

    2017-04-01

    Regional Climate Models (RCMs) are driven at the boundaries by Global Circulation Models (GCM), and in the particular case of Climate Change projections, such simulations are forced by varying greenhouse gases (GHGs) concentrations. In hindcast simulations driven by reanalysis products, the climate change signal is usually introduced in the assimilation process as well. An interesting question arising in this context is whether GHGs concentrations have to be varied within the RCMs model itself, or rather they should be kept constant. Some groups keep the GHGs concentrations constant under the assumption that information about climate change signal is given throughout the boundaries; sometimes certain radiation parameterization schemes do not permit such changes. Other approaches vary these concentrations arguing that this preserves the physical coherence respect to the driving conditions for the RCM. This work aims to shed light on this topic. For this task, various regional climate simulations with the WRF model for the 1954-2004 period have been carried out for using a Euro-CORDEX compliant domain. A series of simulations with constant and variable GHGs have been performed using both, a GCM (ECHAM6-OM) and a reanalysis product (ERA-20C) data. Results indicate that there exist noticeable differences when introducing varying GHGs concentrations within the RCM domain. The differences in 2-m temperature series between the experiments with varying or constant GHGs concentration strongly depend on the atmospheric conditions, appearing a strong interannual variability. This suggests that short-term experiments are not recommended if the aim is to assess the role of varying GHGs. In addition, and consistently in both GCM and reanalysis-driven experiments, the magnitude of temperature trends, as well as the spatial pattern represented by varying GHGs experiment, are closer to the driving dataset than in experiments keeping constant the GHGs concentration. 
These results point towards the need for the inclusion of varying GHGs concentration within the RCM itself when dynamically downscaling global datasets, both in GCM and hindcast simulations.

  16. Seasonal evaluation of evapotranspiration fluxes from MODIS satellite and mesoscale model downscaled global reanalysis datasets

    NASA Astrophysics Data System (ADS)

    Srivastava, Prashant K.; Han, Dawei; Islam, Tanvir; Petropoulos, George P.; Gupta, Manika; Dai, Qiang

    2016-04-01

    Reference evapotranspiration (ETo) is an important variable in hydrological modeling, which is not always available, especially for ungauged catchments. Satellite data, such as those available from the MODerate Resolution Imaging Spectroradiometer (MODIS), and global datasets via the European Centre for Medium Range Weather Forecasts (ECMWF) reanalysis (ERA) interim and National Centers for Environmental Prediction (NCEP) reanalysis are important sources of information for ETo. This study explored the seasonal performances of MODIS (MOD16) and Weather Research and Forecasting (WRF) model downscaled global reanalysis datasets, such as ERA interim and NCEP-derived ETo, against ground-based datasets. Overall, on the basis of the statistical metrics computed, ETo derived from ERA interim and MODIS were more accurate in comparison to the estimates from NCEP for all the seasons. The pooled datasets also revealed a similar performance to the seasonal assessment with higher agreement for the ERA interim (r = 0.96, RMSE = 2.76 mm/8 days; bias = 0.24 mm/8 days), followed by MODIS (r = 0.95, RMSE = 7.66 mm/8 days; bias = -7.17 mm/8 days) and NCEP (r = 0.76, RMSE = 11.81 mm/8 days; bias = -10.20 mm/8 days). The only limitation with downscaling ERA interim reanalysis datasets using WRF is that it is time-consuming in contrast to the readily available MODIS operational product for use in mesoscale studies and practical applications.
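
    The agreement metrics quoted above (r, RMSE, and bias, in mm per 8-day composite) can be reproduced with a few lines of NumPy; the series below are hypothetical.

```python
import numpy as np

def agreement_metrics(estimated, observed):
    """Correlation, RMSE and bias of an ETo product vs. ground data,
    with all values in mm per 8-day composite period."""
    est = np.asarray(estimated, dtype=float)
    obs = np.asarray(observed, dtype=float)
    r = np.corrcoef(est, obs)[0, 1]
    rmse = np.sqrt(np.mean((est - obs) ** 2))
    bias = np.mean(est - obs)
    return r, rmse, bias

# Hypothetical 8-day ETo series (mm/8 days): one product with small
# errors, one that systematically underestimates.
obs = np.array([20.0, 25.0, 33.0, 41.0, 38.0, 30.0, 22.0, 18.0])
era_like = obs + np.array([0.3, -0.2, 0.5, 0.1, 0.4, -0.3, 0.2, 0.1])
ncep_like = 0.7 * obs - 2.0
r, rmse, bias = agreement_metrics(era_like, obs)
```

A strongly negative bias, as reported for the NCEP-derived ETo, shows up here as a mean of (estimate − observation) well below zero even when the correlation remains moderate.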

  17. Rapid iterative reanalysis for automated design

    NASA Technical Reports Server (NTRS)

    Bhatia, K. G.

    1973-01-01

    A method for iterative reanalysis in automated structural design is presented for a finite-element analysis using the direct stiffness approach. A basic feature of the method is that the generalized stiffness and inertia matrices are expressed as functions of structural design parameters, and these generalized matrices are expanded in Taylor series about the initial design. Only the linear terms are retained in the expansions. The method is approximate because it uses static condensation, modal reduction, and the linear Taylor series expansions. The exact linear representation of the expansions of the generalized matrices is also described and a basis for the present method is established. Results of applications of the present method to the recalculation of the natural frequencies of two simple platelike structural models are presented and compared with results from a commonly applied analysis procedure used as a reference. In general, the results are in good agreement. A comparison of the computer times required for the present method and the reference method indicated that the present method required substantially less time for reanalysis. Although the results presented are for relatively small-order problems, the present method will become more efficient relative to the reference method as the problem size increases. An extension of the present method to static reanalysis is described, and a basis for unifying the static and dynamic reanalysis procedures is presented.
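
    The linearized reanalysis can be illustrated on a two-degree-of-freedom spring-mass model. This is an illustrative sketch, not the report's formulation: for a stiffness that is linear in the design parameter the first-order expansion is exact, whereas for the condensed generalized matrices treated in the report it is an approximation.

```python
import numpy as np

def reanalyze_frequencies(K0, M0, dK, dM, dp):
    """First-order reanalysis of natural frequencies.

    The stiffness and inertia matrices are expanded about the initial
    design p0, retaining only the linear terms:
        K(p0 + dp) ~= K0 + sum_i dp_i * dK_i   (likewise for M),
    which avoids re-deriving the matrices for every design change.
    """
    K = K0 + sum(dpi * dKi for dpi, dKi in zip(dp, dK))
    M = M0 + sum(dpi * dMi for dpi, dMi in zip(dp, dM))
    # Solve the generalized eigenproblem K x = w^2 M x.
    evals = np.linalg.eigvals(np.linalg.solve(M, K))
    return np.sort(np.sqrt(np.abs(evals.real)))   # natural frequencies

# Two-DOF spring-mass chain; the design parameter scales spring 1.
k1, k2, m = 100.0, 150.0, 1.0
K0 = np.array([[k1 + k2, -k2], [-k2, k2]])
M0 = np.eye(2) * m
dK = [np.array([[1.0, 0.0], [0.0, 0.0]])]   # dK/dk1
dM = [np.zeros((2, 2))]
# Frequencies after increasing k1 by 10, without re-assembling K:
freqs = reanalyze_frequencies(K0, M0, dK, dM, dp=[10.0])
```

In an automated design loop, the derivative matrices are computed once at the initial design, so each candidate design change costs only a matrix update and an eigensolution of the (typically reduced) system.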

  18. Dissociating retrieval interference and reanalysis in the P600 during sentence comprehension.

    PubMed

    Tanner, Darren; Grey, Sarah; van Hell, Janet G

    2017-02-01

    We investigated the relative independence of two key processes in language comprehension, as reflected in the P600 ERP component. Numerous studies have linked the P600 to sentence- or message-level reanalysis; however, much research has shown that skilled, cue-based memory retrieval operations are also important to successful language processing. Our goal was to identify whether these cue-based retrieval operations are part of the reanalysis processes indexed by the P600. To this end, participants read sentences that were either grammatical or ungrammatical via subject-verb agreement violations, and in which there was either no possibility for retrieval interference or there was an attractor noun interfering with the computation of subject-verb agreement (e.g., "The slogan on the political poster(s) was/were …"). A stimulus onset asynchrony manipulation (fast, medium, or slow presentation rate) was designed to modulate participants' ability to engage in reanalysis processes. Results showed a reliable attraction interference effect, indexed by reduced behavioral sensitivity to ungrammaticalities and P600 amplitudes when there was an opportunity for retrieval interference, as well as an effect of presentation rate, with reduced behavioral sensitivity and smaller P600 effects at faster presentation rates. Importantly, there was no interaction between the two, suggesting that retrieval interference and sentence-level reanalysis processes indexed by the P600 can be neurocognitively distinct processes. © 2016 Society for Psychophysiological Research.

  19. Chemiluminescence microarrays in analytical chemistry: a critical review.

    PubMed

    Seidel, Michael; Niessner, Reinhard

    2014-09-01

    Multi-analyte immunoassays on microarrays and on multiplex DNA microarrays have been described for quantitative analysis of small organic molecules (e.g., antibiotics, drugs of abuse, small molecule toxins), proteins (e.g., antibodies or protein toxins), and microorganisms, viruses, and eukaryotic cells. In analytical chemistry, multi-analyte detection by use of analytical microarrays has become an innovative research topic because of the possibility of generating several sets of quantitative data for different analyte classes in a short time. Chemiluminescence (CL) microarrays are powerful tools for rapid multiplex analysis of complex matrices. A wide range of applications for CL microarrays is described in the literature dealing with analytical microarrays. The motivation for this review is to summarize the current state of CL-based analytical microarrays. Combining analysis of different compound classes on CL microarrays reduces analysis time, cost of reagents, and use of laboratory space. Applications are discussed, with examples from food safety, water safety, environmental monitoring, diagnostics, forensics, toxicology, and biosecurity. The potential and limitations of research on multiplex analysis by use of CL microarrays are discussed in this review.

  20. Analysis of High-Throughput ELISA Microarray Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    White, Amanda M.; Daly, Don S.; Zangar, Richard C.

    Our research group develops analytical methods and software for the high-throughput analysis of quantitative enzyme-linked immunosorbent assay (ELISA) microarrays. ELISA microarrays differ from DNA microarrays in several fundamental aspects and most algorithms for analysis of DNA microarray data are not applicable to ELISA microarrays. In this review, we provide an overview of the steps involved in ELISA microarray data analysis and how the statistically sound algorithms we have developed provide an integrated software suite to address the needs of each data-processing step. The algorithms discussed are available in a set of open-source software tools (http://www.pnl.gov/statistics/ProMAT).

  1. Filling the Silence: Reactivation, not Reconstruction

    PubMed Central

    Paape, Dario L. J. F.

    2016-01-01

    In a self-paced reading experiment, we investigated the processing of sluicing constructions (“sluices”) whose antecedent contained a known garden-path structure in German. Results showed decreased processing times for sluices with garden-path antecedents as well as a disadvantage for antecedents with non-canonical word order downstream from the ellipsis site. A post-hoc analysis showed the garden-path advantage also to be present in the region right before the ellipsis site. While no existing account of ellipsis processing explicitly predicted the results, we argue that they are best captured by combining a local antecedent mismatch effect with memory trace reactivation through reanalysis. PMID:26858674

  2. Effects of Deficient Reporting on Meta-Analysis: A Conceptual Framework and Reanalysis

    ERIC Educational Resources Information Center

    Orwin, Robert G.; Cordray, David S.

    1985-01-01

    Identifies three sources of reporting deficiency for meta-analytic results: quality (adequacy) of publicizing; quality of macrolevel reporting, and quality of microlevel reporting. Reanalysis of 25 reports from the Smith, Glass and Miller (1980) psychotherapy meta-analysis established two sources of misinformation, interrater reliabilities and…

  3. Differential Effects of Treatments for Chronic Depression: A Latent Growth Model Reanalysis

    ERIC Educational Resources Information Center

    Stulz, Niklaus; Thase, Michael E.; Klein, Daniel N.; Manber, Rachel; Crits-Christoph, Paul

    2010-01-01

    Objective: Psychotherapy-pharmacotherapy combinations are frequently recommended for the treatment of chronic depressive disorders. Our aim in this novel reanalysis of archival data was to identify patient subgroups on the basis of symptom trajectories and examine the clinical significance of the resultant classification on basis of differential…

  4. The Circumplex Pattern of the Life Styles Inventory: A Reanalysis.

    ERIC Educational Resources Information Center

    Levin, Joseph

    1991-01-01

    A reanalysis of the intercorrelation matrix from a principal components analysis of the Life Styles Inventory was conducted using a Canadian sample. Using nonmetric multidimensional scaling, analyses show an almost perfect circumplex pattern. Results illustrate the inadequacy of factor analytic procedures for the analysis and representation of a…

  5. NASA's EOSDIS Near Term Challenges

    NASA Technical Reports Server (NTRS)

    Behnke, Jeanne

    2018-01-01

    Given the long-term requirements, the rapid pace of information technology, and the changing expectations of the user community, the ESDIS Project has had to evolve EOSDIS continually over the past three decades. However, many challenges remain. One near-term challenge is the enormous quantity of new data that will need to be managed by EOSDIS. With the upcoming launch of the latest NASA missions, coupled with existing operational missions and field campaigns, EOSDIS can expect to handle as much as 50 petabytes of data per year. To put this in perspective, that is twice the size of the existing archive, which took over 21 years to collect. Another continuing challenge is the disparate requirements of a diverse science community. Maintaining rigorous long-term data preservation, supporting ease of discovery and access, incorporating user feedback, enabling reanalysis and reprocessing, and integrating new data sources in an agile manner continue to be the Project's objectives.

  6. Genome Expression Pathway Analysis Tool – Analysis and visualization of microarray gene expression data under genomic, proteomic and metabolic context

    PubMed Central

    Weniger, Markus; Engelmann, Julia C; Schultz, Jörg

    2007-01-01

    Background Regulation of gene expression is relevant to many areas of biology and medicine, in the study of treatments, diseases, and developmental stages. Microarrays can be used to measure the expression levels of thousands of mRNAs at the same time, allowing insight into, or comparison of, different cellular conditions. The data derived from microarray experiments are high-dimensional and often noisy, and interpretation of the results can become intricate. Although programs for the statistical analysis of microarray data exist, most of them lack an integration of analysis results with biological interpretation. Results We have developed GEPAT, the Genome Expression Pathway Analysis Tool, offering analysis of gene expression data in genomic, proteomic and metabolic context. We provide an integration of statistical methods for data import and data analysis together with a biological interpretation for subsets of probes or single probes on the chip. GEPAT imports various types of oligonucleotide and cDNA array data formats. Different normalization methods can be applied to the data, after which data annotation is performed. After import, GEPAT offers various statistical data analysis methods, such as hierarchical, k-means and PCA clustering, a linear-model-based t-test, and chromosomal profile comparison. The results of the analysis can be interpreted by enrichment of biological terms, pathway analysis or interaction networks. Different biological databases are included, to provide information for each probe on the chip. GEPAT imposes no linear workflow, but allows the usage of any subset of probes and samples as a start for a new data analysis. GEPAT relies on established data analysis packages, offers a modular approach for easy extension, and can be run on a computer grid to support a large number of users. It is freely available under the LGPL open source license for academic and commercial users at . 
Conclusion GEPAT is a modular, scalable and professional-grade software integrating analysis and interpretation of microarray gene expression data. An installation available for academic users can be found at . PMID:17543125

  7. Identification of candidate genes involved in neuroblastoma progression by combining genomic and expression microarrays with survival data.

    PubMed

    Łastowska, M; Viprey, V; Santibanez-Koref, M; Wappler, I; Peters, H; Cullinane, C; Roberts, P; Hall, A G; Tweddle, D A; Pearson, A D J; Lewis, I; Burchill, S A; Jackson, M S

    2007-11-22

    Identifying genes whose expression is consistently altered by chromosomal gains or losses is an important step in defining genes of biological relevance in a wide variety of tumour types. However, additional criteria are needed to discriminate further among the large number of candidate genes identified. This is particularly true for neuroblastoma, where multiple genomic copy number changes of proven prognostic value exist. We have used Affymetrix microarrays and a combination of fluorescent in situ hybridization and single nucleotide polymorphism (SNP) microarrays to establish expression profiles and delineate copy number alterations in 30 primary neuroblastomas. Correlation of microarray data with patient survival and analysis of expression within rodent neuroblastoma cell lines were then used to define further genes likely to be involved in the disease process. Using this approach, we identify >1000 genes within eight recurrent genomic alterations (loss of 1p, 3p, 4p, 10q and 11q, 2p gain, 17q gain, and the MYCN amplicon) whose expression is consistently altered by copy number change. Of these, 84 correlate with patient survival, with the minimal regions of 17q gain and 4p loss being significantly enriched for such genes. These include genes involved in RNA and DNA metabolism, and apoptosis. Orthologues of all but one of these genes on 17q are overexpressed in rodent neuroblastoma cell lines. A significant excess of SNPs whose copy number correlates with survival is also observed on proximal 4p in stage 4 tumours, and we find that deletion of 4p is associated with improved outcome in an extended cohort of tumours. These results define the major impact of genomic copy number alterations upon transcription within neuroblastoma, and highlight genes on distal 17q and proximal 4p for downstream analyses. 
They also suggest that integration of discriminators, such as survival and comparative gene expression, with microarray data may be useful in the identification of critical genes within regions of loss or gain in many human cancers.

  8. Cross-Study Homogeneity of Psoriasis Gene Expression in Skin across a Large Expression Range

    PubMed Central

    Kerkof, Keith; Timour, Martin; Russell, Christopher B.

    2013-01-01

    Background In psoriasis, only limited overlap between sets of genes identified as differentially expressed (psoriatic lesional [PP] vs. psoriatic non-lesional [PN] skin) was found using statistical and fold-change cut-offs. To provide a framework for utilizing prior psoriasis data sets we sought to understand the consistency of those sets. Methodology/Principal Findings Microarray expression profiling and qRT-PCR were used to characterize gene expression in PP and PN skin from psoriasis patients. cDNA (three new data sets) and cRNA hybridization (four existing data sets) data were compared using a common analysis pipeline. Agreement between data sets was assessed using varying qualitative and quantitative cut-offs to generate a DEG list in a source data set and then using other data sets to validate the list. Concordance increased from 67% across all probe sets to over 99% across more than 10,000 probe sets when statistical filters were employed. The fold-change behavior of individual genes tended to be consistent across the multiple data sets. We found that genes with <2-fold change values were quantitatively reproducible between pairs of data sets. In a subset of transcripts with a role in inflammation, changes detected by microarray were confirmed by qRT-PCR with high concordance. For transcripts with both PN and PP levels within the microarray dynamic range, microarray and qRT-PCR were quantitatively reproducible, including minimal fold-changes in IL13, TNFSF11, and TNFRSF11B and genes with >10-fold changes in either direction such as CHRM3, IL12B and IFNG. Conclusions/Significance Gene expression changes in psoriatic lesions were consistent across different studies, despite differences in patient selection, sample handling, and microarray platforms, although between-study comparisons showed stronger agreement within than between platforms. 
We could use cut-offs as low as log10(ratio) = 0.1 (fold-change = 1.26), generating larger gene lists that validate on independent data sets. The reproducibility of PP signatures across data sets suggests that different sample sets can be productively compared. PMID:23308107
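The cut-off conversion quoted above (log10(ratio) = 0.1 corresponding to a fold-change of about 1.26) is a simple exponentiation. A minimal sketch of applying such a cut-off to a gene list (function names are illustrative, not from the paper):

```python
def log10_ratio_to_fold_change(cutoff):
    """Convert a log10(ratio) cut-off into the equivalent fold-change."""
    return 10 ** cutoff

def select_deg(log10_ratios, cutoff=0.1):
    """Return indices of genes whose absolute log10(ratio) meets the cut-off."""
    return [i for i, r in enumerate(log10_ratios) if abs(r) >= cutoff]

# 10**0.1 ~= 1.2589, i.e. the fold-change = 1.26 quoted in the abstract
print(round(log10_ratio_to_fold_change(0.1), 2))  # → 1.26
```

Lowering the cut-off, as the authors note, monotonically enlarges the selected gene list.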

  9. Comparative study of classification algorithms for immunosignaturing data

    PubMed Central

    2012-01-01

    Background High-throughput technologies such as DNA, RNA, protein, antibody and peptide microarrays are often used to examine differences across drug treatments, diseases, transgenic animals, and others. Typically one trains a classification system by gathering large amounts of probe-level data, selecting informative features, and classifying test samples using a small number of features. As new microarrays are invented, classification systems that worked well for other array types may not be ideal. Expression microarrays, arguably one of the most prevalent array types, have been used for years to help develop classification algorithms. Many biological assumptions are built into classifiers that were designed for these types of data. One of the more problematic assumptions is independence, both at the probe level and again at the biological level. Probes for RNA transcripts are designed to bind single transcripts. At the biological level, many genes have dependencies across transcriptional pathways where co-regulation of transcriptional units may make many genes appear completely dependent. Thus, algorithms that perform well for gene expression data may not be suitable for other technologies with different binding characteristics. The immunosignaturing microarray is based on complex mixtures of antibodies binding to arrays of random-sequence peptides. It relies on many-to-many binding of antibodies to the random-sequence peptides. Each peptide can bind multiple antibodies and each antibody can bind multiple peptides. This technology has been shown to be highly reproducible and appears promising for diagnosing a variety of disease states. However, it is not clear which classification algorithm is optimal for analyzing this new type of data. Results We characterized several classification algorithms to analyze immunosignaturing data. 
We selected several datasets that range from easy to difficult to classify, from simple monoclonal binding to complex binding patterns in asthma patients. We then classified the biological samples using 17 different classification algorithms. Using a wide variety of assessment criteria, we found ‘Naïve Bayes’ far more useful than other widely used methods due to its simplicity, robustness, speed and accuracy. Conclusions The ‘Naïve Bayes’ algorithm appears to accommodate the complex patterns hidden within multilayered immunosignaturing microarray data due to its fundamental mathematical properties. PMID:22720696
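To illustrate why naive Bayes is attractive for such data, here is a self-contained Gaussian naive Bayes classifier on toy "peptide intensity" data. This is a sketch under the usual class-conditional independence assumption, not the authors' implementation:

```python
import math
from collections import defaultdict

def fit_gaussian_nb(X, y):
    """Estimate per-class feature means, variances, and log priors."""
    by_class = defaultdict(list)
    for xi, yi in zip(X, y):
        by_class[yi].append(xi)
    model, n = {}, len(X)
    for c, rows in by_class.items():
        cols = list(zip(*rows))
        means = [sum(col) / len(col) for col in cols]
        varis = [sum((v - m) ** 2 for v in col) / len(col) + 1e-9  # avoid zero variance
                 for col, m in zip(cols, means)]
        model[c] = (math.log(len(rows) / n), means, varis)
    return model

def predict(model, x):
    """Return the class maximising the Gaussian log-posterior."""
    def log_post(c):
        lp, means, varis = model[c]
        for v, m, s2 in zip(x, means, varis):
            lp += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
        return lp
    return max(model, key=log_post)

# Toy data: two well-separated intensity clusters
X = [[1.0, 1.1], [0.9, 1.0], [5.0, 5.2], [5.1, 4.9]]
y = ["healthy", "healthy", "disease", "disease"]
model = fit_gaussian_nb(X, y)
print(predict(model, [1.0, 1.0]))  # → healthy
```

The whole classifier fits in a few lines and has no tuning parameters, which matches the simplicity and speed argued for in the abstract.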

  10. Intra-Platform Repeatability and Inter-Platform Comparability of MicroRNA Microarray Technology

    PubMed Central

    Sato, Fumiaki; Tsuchiya, Soken; Terasawa, Kazuya; Tsujimoto, Gozoh

    2009-01-01

    Over the last decade, DNA microarray technology has made a great contribution to the life sciences. The MicroArray Quality Control (MAQC) project demonstrated how to analyze expression microarrays. Recently, microarray technology has been utilized for comprehensive microRNA expression profiling. Currently, several platforms of microRNA microarray chips are commercially available. We therefore compared the repeatability and comparability of five different microRNA microarray platforms (Agilent, Ambion, Exiqon, Invitrogen and Toray) using 309 microRNA probes, and the TaqMan microRNA system using 142 microRNA probes. This study demonstrated that microRNA microarrays have high intra-platform repeatability and good comparability to quantitative RT-PCR of microRNAs. Among the five platforms, the Agilent and Toray arrays showed relatively better performance than the others. However, the current lineup of commercially available microRNA microarray systems fails to show good inter-platform concordance, probably because of the lack of an adequate normalization method and severe divergence in the stringency of detection call criteria between platforms. This study provides basic information about the performance of, and the problems specific to, current microRNA microarray systems. PMID:19436744

  11. Living Cell Microarrays: An Overview of Concepts

    PubMed Central

    Jonczyk, Rebecca; Kurth, Tracy; Lavrentieva, Antonina; Walter, Johanna-Gabriela; Scheper, Thomas; Stahl, Frank

    2016-01-01

    Living cell microarrays are a highly efficient cellular screening system. Due to the low number of cells required per spot, cell microarrays enable the use of primary and stem cells and provide resolution close to the single-cell level. Apart from a variety of conventional static designs, microfluidic microarray systems have also been established. An alternative format is a microarray consisting of three-dimensional cell constructs ranging from cell spheroids to cells encapsulated in hydrogel. These systems provide an in vivo-like microenvironment and are preferably used for the investigation of cellular physiology, cytotoxicity, and drug screening. Thus, many different high-tech microarray platforms are currently available. Disadvantages of many systems include their high cost, the requirement of specialized equipment for their manufacture, and the poor comparability of results between different platforms. In this article, we provide an overview of static, microfluidic, and 3D cell microarrays. In addition, we describe a simple method for the printing of living cell microarrays on modified microscope glass slides using standard DNA microarray equipment available in most laboratories. Applications in research and diagnostics are discussed, e.g., the selective and sensitive detection of biomarkers. Finally, we highlight current limitations and the future prospects of living cell microarrays. PMID:27600077

  12. Assessment of Patient-Specific Surgery Effect Based on Weighted Estimation and Propensity Scoring in the Re-Analysis of the Sciatica Trial

    PubMed Central

    Mertens, Bart J. A.; Jacobs, Wilco C. H.; Brand, Ronald; Peul, Wilco C.

    2014-01-01

    We consider a re-analysis of the wait-and-see (control) arm of a recent clinical trial on sciatica. While the original randomised trial was designed to evaluate the public policy effect of a conservative wait-and-see approach versus early surgery, we investigate the impact of surgery at the individual patient level in a re-analysis of the wait-and-see group data. Both marginal structural model re-weighted estimates and propensity score adjusted analyses are presented. Results indicate that patients with a high propensity to receive surgery may benefit at 2 years from delayed disc surgery. PMID:25353633
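The marginal structural model re-weighting mentioned above rests on inverse-probability weighting: each subject is weighted by the inverse of the probability of the treatment they actually received. A minimal Horvitz-Thompson-style sketch of the weighted treatment-effect estimate (a toy illustration, not the authors' analysis):

```python
def ipw_ate(outcomes, treated, propensity):
    """Inverse-probability-weighted average treatment effect.

    outcomes:   observed outcome per subject
    treated:    1 if the subject received surgery, 0 otherwise
    propensity: estimated P(surgery | covariates) per subject
    """
    n = len(outcomes)
    treated_term = sum(y / p for y, a, p in zip(outcomes, treated, propensity) if a)
    control_term = sum(y / (1 - p) for y, a, p in zip(outcomes, treated, propensity) if not a)
    return treated_term / n - control_term / n

# With uniform propensity 0.5 and balanced groups this reduces to a mean difference
print(ipw_ate([2, 2, 1, 1], [1, 1, 0, 0], [0.5, 0.5, 0.5, 0.5]))  # → 1.0
```

In practice the propensities come from a fitted model (e.g. logistic regression on baseline covariates), and weights are often stabilised to control variance.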

  13. The NOAA-NASA CZCS Reanalysis Effort

    NASA Technical Reports Server (NTRS)

    Gregg, Watson W.; Conkright, Margarita E.; OReilly, John E.; Patt, Frederick S.; Wang, Meng-Hua; Yoder, James; Casey-McCabe, Nancy; Koblinsky, Chester J. (Technical Monitor)

    2001-01-01

    Satellite observations of global ocean chlorophyll span over two decades. However, incompatibilities between processing algorithms prevent us from quantifying natural variability. We applied a comprehensive reanalysis to the Coastal Zone Color Scanner (CZCS) archive, called the NOAA-NASA CZCS Reanalysis (NCR) Effort. NCR consisted of 1) algorithm improvement (AI), where CZCS processing algorithms were improved using modernized atmospheric correction and bio-optical algorithms, and 2) blending, where in situ data were incorporated into the CZCS AI to minimize residual errors. The results indicated major improvement over the previously available CZCS archive. Global spatial and seasonal patterns of NCR chlorophyll indicated remarkable correspondence with modern sensors, suggesting compatibility. The NCR permits quantitative analyses of interannual and interdecadal trends in global ocean chlorophyll.

  14. ELISA-BASE: An Integrated Bioinformatics Tool for Analyzing and Tracking ELISA Microarray Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    White, Amanda M.; Collett, James L.; Seurynck-Servoss, Shannon L.

    ELISA-BASE is an open-source database for capturing, organizing and analyzing protein enzyme-linked immunosorbent assay (ELISA) microarray data. ELISA-BASE is an extension of the BioArray Software Environment (BASE) database system, which was developed for DNA microarrays. In order to make BASE suitable for protein microarray experiments, we developed several plugins for importing and analyzing quantitative ELISA microarray data. Most notably, our Protein Microarray Analysis Tool (ProMAT) for processing quantitative ELISA data is now available as a plugin to the database.

  15. Thermodynamically optimal whole-genome tiling microarray design and validation.

    PubMed

    Cho, Hyejin; Chou, Hui-Hsien

    2016-06-13

    Microarrays are an efficient apparatus for interrogating the whole transcriptome of a species. A microarray can be designed according to annotated gene sets, but the resulting microarrays cannot be used to identify novel transcripts, and this design method is not applicable to unannotated species. Alternatively, a whole-genome tiling microarray can be designed using only genomic sequences without gene annotations, and it can be used to detect novel RNA transcripts as well as known genes. The difficulty with tiling microarray design lies in the tradeoff between probe specificity and coverage of the genome. Sequence comparison methods based on BLAST or similar software are commonly employed in microarray design, but they cannot precisely determine the subtle thermodynamic competition between probe targets and partially matched probe nontargets during hybridization. Using the whole-genome thermodynamic analysis software PICKY to design tiling microarrays, we can achieve the maximum whole-genome coverage allowable under the thermodynamic constraints of each target genome. The resulting tiling microarrays are thermodynamically optimal in the sense that all selected probes share the same melting temperature separation range between their targets and closest nontargets, and no additional probes can be added without violating the specificity of the microarray to the target genome. This new design method was used to create two whole-genome tiling microarrays for Escherichia coli MG1655 and Agrobacterium tumefaciens C58, and the experimental results validated the design.
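The "melting temperature separation" idea above can be illustrated with a toy probe filter. PICKY itself uses full nearest-neighbor thermodynamics; this sketch substitutes the much cruder Wallace rule (2 °C per A/T base, 4 °C per G/C base) purely for illustration, and the separation threshold is an assumed parameter:

```python
def wallace_tm(seq):
    """Rough melting temperature via the Wallace rule: 2(A+T) + 4(G+C).

    Only a toy proxy; real probe design (e.g. PICKY) uses
    nearest-neighbor thermodynamic models instead.
    """
    seq = seq.upper()
    return 2 * (seq.count("A") + seq.count("T")) + 4 * (seq.count("G") + seq.count("C"))

def passes_separation(target_tm, closest_nontarget_tm, min_sep=10):
    """Keep a probe only if its target Tm exceeds the Tm of its closest
    nontarget by at least min_sep degrees (the separation-range idea)."""
    return target_tm - closest_nontarget_tm >= min_sep

print(wallace_tm("ATGC"))  # → 12
```

Under this scheme, adding more probes eventually forces some probe's target/nontarget separation below the threshold, which is exactly the stopping condition the abstract describes.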

  16. SYMBIOmatics: synergies in Medical Informatics and Bioinformatics--exploring current scientific literature for emerging topics.

    PubMed

    Rebholz-Schuhman, Dietrich; Cameron, Graham; Clark, Dominic; van Mulligen, Erik; Coatrieux, Jean-Louis; Del Hoyo Barbolla, Eva; Martin-Sanchez, Fernando; Milanesi, Luciano; Porro, Ivan; Beltrame, Francesco; Tollis, Ioannis; Van der Lei, Johan

    2007-03-08

    The SYMBIOmatics Specific Support Action (SSA) is "an information gathering and dissemination activity" that seeks "to identify synergies between the bioinformatics and the medical informatics" domains to improve collaborative progress between both domains (ref. to http://www.symbiomatics.org). As part of the project, experts in both research fields will be identified and approached through a survey. To provide input to the survey, the scientific literature was analysed to extract topics relevant to both medical informatics and bioinformatics. This paper presents results of a systematic analysis of the scientific literature from medical informatics research and bioinformatics research. In the analysis, pairs of words (bigrams) from the leading bioinformatics and medical informatics journals were used as indications of existing and emerging technologies and topics over the periods 2000-2005 ("recent") and 1990-1990 ("past"). We identified emerging topics that were equally important to bioinformatics and medical informatics in recent years, such as microarray experiments, ontologies, open source, text mining and support vector machines. Emerging topics that evolved only in bioinformatics were systems biology, protein interaction networks and statistical methods for microarray analyses, whereas emerging topics in medical informatics were grid technology and tissue microarrays. We conclude that although both fields have their own specific domains of interest, they share common technological developments that tend to be initiated by new developments in biotechnology and computer science.
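Extracting bigrams from titles or abstracts, as the analysis above describes, is straightforward to sketch (the tokenisation here is a simplification; the actual study's preprocessing is not specified in the abstract):

```python
import re
from collections import Counter

def bigram_counts(text):
    """Count adjacent word pairs (bigrams) in a lowercased text."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(zip(words, words[1:]))

c = bigram_counts("Support vector machines and support vector regression")
print(c[("support", "vector")])  # → 2
```

Comparing such counts between a "recent" and a "past" corpus is one simple way to flag emerging topics.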

  17. SYMBIOmatics: Synergies in Medical Informatics and Bioinformatics – exploring current scientific literature for emerging topics

    PubMed Central

    Rebholz-Schuhman, Dietrich; Cameron, Graham; Clark, Dominic; van Mulligen, Erik; Coatrieux, Jean-Louis; Del Hoyo Barbolla, Eva; Martin-Sanchez, Fernando; Milanesi, Luciano; Porro, Ivan; Beltrame, Francesco; Tollis, Ioannis; Van der Lei, Johan

    2007-01-01

    Background The SYMBIOmatics Specific Support Action (SSA) is "an information gathering and dissemination activity" that seeks "to identify synergies between the bioinformatics and the medical informatics" domains to improve collaborative progress between both domains. As part of the project, experts in both research fields will be identified and approached through a survey. To provide input to the survey, the scientific literature was analysed to extract topics relevant to both medical informatics and bioinformatics. Results This paper presents results of a systematic analysis of the scientific literature from medical informatics research and bioinformatics research. In the analysis, pairs of words (bigrams) from the leading bioinformatics and medical informatics journals were used as indications of existing and emerging technologies and topics over the periods 2000–2005 ("recent") and 1990–1990 ("past"). We identified emerging topics that were equally important to bioinformatics and medical informatics in recent years, such as microarray experiments, ontologies, open source, text mining and support vector machines. Emerging topics that evolved only in bioinformatics were systems biology, protein interaction networks and statistical methods for microarray analyses, whereas emerging topics in medical informatics were grid technology and tissue microarrays. Conclusion We conclude that although both fields have their own specific domains of interest, they share common technological developments that tend to be initiated by new developments in biotechnology and computer science. PMID:17430562

  18. Molecular Genetics Information System (MOLGENIS): alternatives in developing local experimental genomics databases.

    PubMed

    Swertz, Morris A; De Brock, E O; Van Hijum, Sacha A F T; De Jong, Anne; Buist, Girbe; Baerends, Richard J S; Kok, Jan; Kuipers, Oscar P; Jansen, Ritsert C

    2004-09-01

    Genomic research laboratories need adequate infrastructure to support management of their data production and research workflow. But what makes infrastructure adequate? A lack of appropriate criteria makes any decision on buying or developing a system difficult. Here, we report on the decision process for the case of a molecular genetics group establishing a microarray laboratory. Five typical requirements for experimental genomics database systems were identified: (i) the ability to evolve with the fast-developing genomics field; (ii) a suitable data model to deal with local diversity; (iii) suitable storage of data files in the system; (iv) easy exchange with other software; and (v) low maintenance costs. The computer scientists and the researchers of the local microarray laboratory considered alternative solutions for these five requirements and chose the following options: (i) use of automatic code generation; (ii) a customized data model based on standards; (iii) storage of datasets as black boxes instead of decomposing them into database tables; (iv) loose linking to other programs for improved flexibility; and (v) a low-maintenance web-based user interface. Our team evaluated existing microarray databases and then decided to build a new system, the Molecular Genetics Information System (MOLGENIS), implemented using code generation within three months. This case can provide valuable insights and lessons to both software developers and user communities embarking on large-scale genomic projects. http://www.molgenis.nl

  19. Classification of mislabelled microarrays using robust sparse logistic regression.

    PubMed

    Bootkrajang, Jakramate; Kabán, Ata

    2013-04-01

    Previous studies reported that labelling errors are not uncommon in microarray datasets. In such cases, the training set may become misleading, and the ability of classifiers to make reliable inferences from the data is compromised. Yet, few methods are currently available in the bioinformatics literature to deal with this problem. The few existing methods focus on data cleansing alone, without reference to classification, and their performance crucially depends on some tuning parameters. In this article, we develop a new method to detect mislabelled arrays simultaneously with learning a sparse logistic regression classifier. Our method may be seen as a label-noise robust extension of the well-known and successful Bayesian logistic regression classifier. To account for possible mislabelling, we formulate a label-flipping process as part of the classifier. The regularization parameter is automatically set using Bayesian regularization, which not only saves the computation time that cross-validation would take, but also eliminates any unwanted effects of label noise when setting the regularization parameter. Extensive experiments with both synthetic data and real microarray datasets demonstrate that our approach is able to counter the bad effects of labelling errors in terms of predictive performance; it is effective at identifying marker genes, and it simultaneously detects mislabelled arrays with high accuracy. The code is available from http://cs.bham.ac.uk/∼jxb008. Supplementary data are available at Bioinformatics online.
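The label-flipping process described above can be made concrete: the probability of the observed (possibly wrong) label mixes the clean logistic probabilities through a flip matrix. A sketch under the simplifying assumption of known flip probabilities (the paper learns them jointly with a Bayesian-regularized sparse classifier; gamma and the weights here are illustrative):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def noisy_label_prob(w, x, gamma):
    """P(observed label = 1 | x) under a label-flipping model.

    gamma[j][k] = P(observed label k | true label j).
    """
    p1 = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))  # P(true label = 1 | x)
    return (1 - p1) * gamma[0][1] + p1 * gamma[1][1]

identity = [[1.0, 0.0], [0.0, 1.0]]       # no label noise
symmetric = [[0.8, 0.2], [0.2, 0.8]]      # 20% flip probability either way
```

With the identity flip matrix this reduces to ordinary logistic regression, while any genuine flip probability pulls the predicted probabilities toward 0.5, which is what makes the likelihood robust to a few mislabelled arrays.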

  20. Bayesian inference with historical data-based informative priors improves detection of differentially expressed genes

    PubMed Central

    Li, Ben; Sun, Zhaonan; He, Qing; Zhu, Yu; Qin, Zhaohui S.

    2016-01-01

    Motivation: Modern high-throughput biotechnologies such as microarrays are capable of producing a massive amount of information for each sample. However, in a typical high-throughput experiment only a limited number of samples is assayed, giving rise to the classical ‘large p, small n’ problem. On the other hand, the rapid propagation of these high-throughput technologies has resulted in a substantial collection of data, often carried out on the same platform and using the same protocol. It is highly desirable to utilize existing data when performing analysis and inference on a new dataset. Results: Utilizing existing data can be carried out in a straightforward fashion under the Bayesian framework, in which the repository of historical data can be exploited to build informative priors for use in new data analyses. In this work, using microarray data, we investigate the feasibility and effectiveness of deriving informative priors from historical data and using them in the problem of detecting differentially expressed genes. Through simulation and real data analysis, we show that the proposed strategy significantly outperforms existing methods, including the popular and state-of-the-art Bayesian hierarchical model-based approaches. Our work illustrates the feasibility and benefits of exploiting the increasingly available genomics big data in statistical inference and presents a promising practical strategy for dealing with the ‘large p, small n’ problem. Availability and implementation: Our method is implemented in the R package IPBT, which is freely available from https://github.com/benliemory/IPBT. Contact: yuzhu@purdue.edu; zhaohui.qin@emory.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26519502
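One common way historical data enters such an analysis is as a prior that shrinks noisy per-gene variance estimates from the small new study toward a stable historical estimate. A toy conjugate-style sketch (this is an illustration of the general idea, not the IPBT method; `n0`, the prior's pseudo-sample size, is an assumed parameter):

```python
def shrunken_variance(s2_new, n_new, s2_hist, n0):
    """Pool a new study's per-gene sample variance (s2_new, from n_new
    samples) with a historical estimate (s2_hist), weighting the prior
    as n0 pseudo-observations."""
    return (n0 * s2_hist + (n_new - 1) * s2_new) / (n0 + n_new - 1)

# With 5 new samples and 4 pseudo-observations of prior weight,
# a noisy estimate of 4.0 is pulled toward the historical value 1.0
print(shrunken_variance(4.0, 5, 1.0, 4))  # → 2.5
```

Stabilised variances of this kind feed directly into moderated test statistics for differential expression, which is where the ‘large p, small n’ gain comes from.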

  1. Smart Climatology System

    DTIC Science & Technology

    2010-09-24

    Excerpt (table-of-contents and abstract fragments): 2.1 Downscaling/Reanalysis Data; 2.2 Downscaling of… Comparison of resolutions of maximum significant wave heights for La Niña (>= 8 ft, >= 6 ft). …numerical weather prediction systems. The usage of satellite data, for example, is markedly different than the past practice. This played havoc with…

  2. Reanalysis of Clause Boundaries in Japanese as a Constraint-Driven Process.

    ERIC Educational Resources Information Center

    Miyamoto, Edson T.

    2003-01-01

    Reports on two experiments focusing on clause boundaries in Japanese which suggest that the minimal change restriction is unnecessary to characterize reanalysis. Proposes that the data and previous observations are more naturally explained by a constraint-driven model in which revisions are performed only when required by parsing constraints.…

  3. Does Nutrition Cause Intelligence? A Reanalysis of the Cali Experiment.

    ERIC Educational Resources Information Center

    Bejar, Isaac I.

    1981-01-01

    Recent literature suggests a causal link between malnutrition and impaired cognitive development. A selective literature review indicates that the presence or absence of such a link cannot be established. A reanalysis of an experiment indicated after four years of treatment there was no association between cognitive and nutritional status.…

  4. Cognitive Development Among Retardates: Reanalysis of Inhelder's Data.

    ERIC Educational Resources Information Center

    Jordan, Valerie Barnes

    A reanalysis of B. Inhelder's (1968) data concerning cognitive development among retardates was performed by selecting from the original 159 subjects a sample of 104 educable mentally retarded Ss (7-19 years old) who were diagnosed as fixated or nonfixated at three of the cognitive stages postulated by Jean Piaget. The results indicated that among…

  5. The Value of Reanalysis: TV Viewing and Attention Problems

    ERIC Educational Resources Information Center

    Foster, E. Michael; Watkins, Stephanie

    2010-01-01

    Using data from the National Longitudinal Survey of Youth (N = 1,159), this study reexamines the link between maternal reports of television viewing at ages 1 and 3 and attention problems at age 7. This work represents a reanalysis and extension of recent research suggesting young children's television viewing causes subsequent attention problems.…

  6. Assessment of Uncertainties in Recent Changes in the South American Low-Level Jet Based on Various Reanalyses

    NASA Astrophysics Data System (ADS)

    Montini, T.; Jones, C.

    2017-12-01

    The South American low-level jet (SALLJ) is one of the key components of the South American Monsoon System. The SALLJ transports large amounts of moisture to the subtropics, influencing the development of deep convection and heavy precipitation over southeastern South America. Previous studies have analyzed the jet using reanalysis data due to the lack of available upper-air observations over this region. The purpose of the current study is to quantify uncertainties in the climatology, variability, and changes in the SALLJ based on various reanalyses for the period 1979-2015. This is important because there are significant differences among reanalysis datasets due to variations in their data quality control, data assimilation systems, and model physics. The datasets used in this analysis are: (1) Climate Forecast System Reanalysis, (2) ERA-Interim, (3) the Japanese 55-year reanalysis, (4) the Second Modern Era Retrospective-analysis for Research and Applications (MERRA-2). Finally, significant changes in the SALLJ are discussed in relation to substantial warming over South America in recent decades and changes in the monsoon.

  7. Surface Mass Balance of the Greenland Ice Sheet Derived from Paleoclimate Reanalysis

    NASA Astrophysics Data System (ADS)

    Badgeley, J.; Steig, E. J.; Hakim, G. J.; Anderson, J.; Tardif, R.

    2017-12-01

    Modeling past ice-sheet behavior requires independent knowledge of past surface mass balance. Though models provide useful insight into ice-sheet response to climate forcing, if past climate is unknown, then ascertaining the rate and extent of past ice-sheet change is limited to geological and geophysical constraints. We use a novel data-assimilation framework developed under the Last Millennium Reanalysis Project (Hakim et al., 2016) to reconstruct past climate over ice sheets with the intent of creating an independent surface mass balance record for paleo ice-sheet modeling. Paleoclimate data assimilation combines the physics of climate models and the time series evidence of proxy records in an offline, ensemble-based approach. This framework allows for the assimilation of numerous proxy records and archive types while maintaining spatial consistency with known climate dynamics and physics captured by the models. In our reconstruction, we use the Community Climate System Model version 4, CMIP5 last millennium simulation (Taylor et al., 2012; Landrum et al., 2013) and a nearly complete database of ice core oxygen isotope records to reconstruct Holocene surface temperature and precipitation over the Greenland Ice Sheet on a decadal timescale. By applying a seasonality to this reconstruction (from the TraCE-21ka simulation; Liu et al., 2009), our reanalysis can be used in seasonally-based surface mass balance models. Here we discuss the methods behind our reanalysis and the performance of our reconstruction through prediction of unassimilated proxy records and comparison to paleoclimate reconstructions and reanalysis products.
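The offline ensemble update at the heart of such paleoclimate data assimilation is a Kalman-style blend of a model-prior ensemble with a proxy observation. A scalar-observation sketch (illustrative only; the Last Millennium Reanalysis framework uses a more elaborate ensemble square-root filter, and all names here are generic):

```python
def enkf_update(X, y_obs, r, h):
    """Update an ensemble X (list of state vectors) with one scalar
    observation y_obs of error variance r, via observation operator h."""
    n = len(X)
    hx = [h(x) for x in X]                      # observation estimates
    hbar = sum(hx) / n
    xbar = [sum(col) / n for col in zip(*X)]    # ensemble mean state
    var_h = sum((v - hbar) ** 2 for v in hx) / (n - 1)
    cov_xh = [sum((x[k] - xbar[k]) * (hx[i] - hbar)
                  for i, x in enumerate(X)) / (n - 1)
              for k in range(len(xbar))]
    gain = [c / (var_h + r) for c in cov_xh]    # Kalman gain
    return [[x[k] + gain[k] * (y_obs - hx[i]) for k in range(len(x))]
            for i, x in enumerate(X)]

# A two-member, one-variable ensemble pulled toward a precise observation
posterior = enkf_update([[0.0], [2.0]], 2.0, 0.001, lambda s: s[0])
```

Because the update is offline, the same prior ensemble can be reused at every assimilation step, which is what makes assimilating thousands of proxy records tractable.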

  8. Reanalysis of the Indian summer monsoon: four dimensional data assimilation of AIRS retrievals in a regional data assimilation and modeling framework

    NASA Astrophysics Data System (ADS)

    Attada, Raju; Parekh, Anant; Chowdary, J. S.; Gnanaseelan, C.

    2018-04-01

    This work is the first attempt to produce a multi-year downscaled regional reanalysis of the Indian summer monsoon (ISM) using the National Centers for Environmental Prediction (NCEP) operational analyses and Atmospheric Infrared Sounder (AIRS) version 5 temperature and moisture retrievals in a regional model. Reanalysis of nine monsoon seasons (2003-2011) are produced in two parallel setups. The first set of experiments simply downscale the original NCEP operational analyses, whilst the second one assimilates the AIRS temperature and moisture profiles. The results show better representation of the key monsoon features such as low level jet, tropical easterly jet, subtropical westerly jet, monsoon trough and the spatial pattern of precipitation when AIRS profiles are assimilated (compared to those without AIRS data assimilation). The distribution of temperature, moisture and meridional gradients of dynamical and thermodynamical fields over the monsoon region are better represented in the reanalysis that assimilates AIRS profiles. The change induced by AIRS data on the moist and thermodynamic conditions results in more realistic rendering of the vertical shear associated with the monsoon, which in turn leads to a proper moisture transport and the moist convective feedback. This feedback benefits the representation of the regional monsoon characteristics, the monsoon dynamics and the moist convective processes on the seasonal time scale. This study emphasizes the use of AIRS soundings for downscaling of ISM representation in a regional reanalysis.

  9. The Mars Analysis Correction Data Assimilation (MACDA): A reference atmospheric reanalysis

    NASA Astrophysics Data System (ADS)

    Montabone, Luca; Read, Peter; Lewis, Stephen; Steele, Liam; Holmes, James; Valeanu, Alexandru

    2016-07-01

    The Mars Analysis Correction Data Assimilation (MACDA) dataset version 1.0 contains the reanalysis of fundamental atmospheric and surface variables for the planet Mars covering a period of about three Martian years (late MY 24 to early MY 27). It was produced by assimilating retrieved thermal profiles and column dust optical depths from NASA's Mars Global Surveyor/Thermal Emission Spectrometer (MGS/TES) into a Mars global climate model (MGCM) using the Analysis Correction scheme developed at the UK Meteorological Office. The MACDA v1.0 reanalysis is publicly available, and the NetCDF files can be downloaded from the archive at the Centre for Environmental Data Analysis/British Atmospheric Data Centre (CEDA/BADC). The variables included in the dataset can be visualised using an ad hoc graphical user interface (the "MACDA Plotter") at the following URL: http://macdap.physics.ox.ac.uk/ MACDA is an ongoing collaborative project, and work is currently underway to produce version 2.0 of the Mars atmospheric reanalysis. One of the key improvements is the extension of the reanalysis period to nine Martian years (MY 24 through MY 32), with the assimilation of NASA's Mars Reconnaissance Orbiter/Mars Climate Sounder (MRO/MCS) retrievals of thermal and dust opacity profiles. MACDA 2.0 will also be based on an improved version of the underlying MGCM and an updated scheme to fully assimilate (radiatively active) tracers, such as dust and water ice.

  10. Trends in solar radiation in NCEP/NCAR database and measurements in northeastern Brazil

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Silva, Vicente de Paulo Rodrigues da; Silva, Roberta Araujo e; Cavalcanti, Enilson Palmeira

    2010-10-15

    The database from the National Center for Environmental Prediction/National Center for Atmospheric Research (NCEP/NCAR) re-analysis project, available for the period from 1948 to 2009, was used to obtain long-term solar radiation for northeastern Brazil. Measurements of global solar radiation (Rs) from data collection platforms (DCP) for four climatic zones of northeastern Brazil were compared to the re-analysis data. Applying cluster analysis to Rs from the database, homogeneous sub-regions in northeastern Brazil were determined. Long time series of Rs and sunshine duration measurements for two sites, Petrolina (09°09'S, 40°22'W) and Juazeiro (09°24'S, 40°26'W), exceeding 30 years, were analyzed. In order to exclude the decadal variations that are linked to the Pacific Decadal Oscillation, high-frequency cycles in the solar radiation and sunshine duration time series were eliminated using a 14-year moving average, and the Mann-Kendall test was employed to assess the long-term variability of re-analysis and measured solar radiation. This study provides an overview of the decrease in solar radiation over a large area, which can be attributed to the global dimming effect. The global solar radiation obtained from the NCEP/NCAR re-analysis data overestimates that obtained from DCP measurements by 1.6% to 18.6%. Results show that there is a notable symmetry between Rs from the re-analysis data and sunshine duration measurements. (author)
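The two-step trend analysis described above (a long moving average to remove high-frequency cycles, then the Mann-Kendall trend test) can be sketched as follows; the Mann-Kendall S statistic below omits the variance/significance step of the full test:

```python
def moving_average(x, window):
    """Simple centered-window moving average (here: trailing windows)."""
    return [sum(x[i:i + window]) / window for i in range(len(x) - window + 1)]

def mann_kendall_s(x):
    """Mann-Kendall S statistic: the sum of signs of all pairwise
    differences x[j] - x[i] for j > i. Positive S suggests an upward
    trend, negative S a downward trend."""
    s, n = 0, len(x)
    for i in range(n - 1):
        for j in range(i + 1, n):
            s += (x[j] > x[i]) - (x[j] < x[i])
    return s

print(mann_kendall_s([1, 2, 3, 4, 5]))  # → 10 (monotonic increase)
```

In the study's setup the moving average would use a 14-year window, and S would be standardised against its null variance to obtain a significance level.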

  11. Met Éireann high resolution reanalysis for Ireland

    NASA Astrophysics Data System (ADS)

    Gleeson, Emily; Whelan, Eoin; Hanley, John

    2017-03-01

    The Irish Meteorological Service, Met Éireann, has carried out a 35-year very high resolution (2.5 km horizontal grid) regional climate reanalysis for Ireland using the ALADIN-HIRLAM numerical weather prediction system. This article provides an overview of the reanalysis, called MÉRA, as well as a preliminary analysis of surface parameters including screen-level temperature, 10 m wind speeds, mean sea-level pressure (MSLP), soil temperatures, soil moisture and 24 h rainfall accumulations. The quality of the 3-D variational data assimilation used in the reanalysis is also assessed. Preliminary analysis shows that it takes almost 12 months to spin up the deep soil in terms of moisture, justifying the choice of running year-long spin-up periods. Overall, the model performed consistently over the time period. Small biases were found in screen-level temperatures (less than -0.5 °C), MSLP (within 0.5 hPa) and 10 m wind speed (up to 0.5 m s-1). Soil temperatures are well represented by the model. 24 h accumulations of precipitation generally exhibit a small positive bias of ~1 mm per day and negative biases over mountains due to a mismatch between the model orography and the geography of the region. MÉRA outperforms the ERA-Interim reanalysis, particularly in terms of standard deviations in screen-level temperatures and surface winds. This dataset is the first of its kind for Ireland and will be made publicly available during spring 2017.

  12. Large differences in reanalyses of diabatic heating in the tropical upper troposphere and lower stratosphere

    NASA Astrophysics Data System (ADS)

    Wright, J. S.; Fueglistaler, S.

    2013-09-01

    We present the time mean heat budgets of the tropical upper troposphere (UT) and lower stratosphere (LS) as simulated by five reanalysis models: the Modern-Era Retrospective Analysis for Research and Applications (MERRA), European Reanalysis (ERA-Interim), Climate Forecast System Reanalysis (CFSR), Japanese 25-yr Reanalysis and Japan Meteorological Agency Climate Data Assimilation System (JRA-25/JCDAS), and National Centers for Environmental Prediction/National Center for Atmospheric Research (NCEP/NCAR) Reanalysis 1. The simulated diabatic heat budget in the tropical UTLS differs significantly from model to model, with substantial implications for representations of transport and mixing. Large differences are apparent both in the net heat budget and in all comparable individual components, including latent heating, heating due to radiative transfer, and heating due to parameterised vertical mixing. We describe and discuss the most pronounced differences. Discrepancies in latent heating reflect continuing difficulties in representing moist convection in models. Although these discrepancies may be expected, their magnitude is still disturbing. We pay particular attention to discrepancies in radiative heating (which may be surprising given the strength of observational constraints on temperature and tropospheric water vapour) and discrepancies in heating due to turbulent mixing (which have received comparatively little attention). The largest differences in radiative heating in the tropical UTLS are attributable to differences in cloud radiative heating, but important systematic differences are present even in the absence of clouds. Local maxima in heating and cooling due to parameterised turbulent mixing occur in the vicinity of the tropical tropopause.

  13. Extending Climate Analytics-as-a-Service to the Earth System Grid Federation

    NASA Astrophysics Data System (ADS)

    Tamkin, G.; Schnase, J. L.; Duffy, D.; McInerney, M.; Nadeau, D.; Li, J.; Strong, S.; Thompson, J. H.

    2015-12-01

    We are building three extensions to prior-funded work on climate analytics-as-a-service that will benefit the Earth System Grid Federation (ESGF) as it addresses the Big Data challenges of future climate research: (1) We are creating a cloud-based, high-performance Virtual Real-Time Analytics Testbed supporting a select set of climate variables from six major reanalysis data sets. This near real-time capability will enable advanced technologies like the Cloudera Impala-based Structured Query Language (SQL) query capabilities and Hadoop-based MapReduce analytics over native NetCDF files while providing a platform for community experimentation with emerging analytic technologies. (2) We are building a full-featured Reanalysis Ensemble Service comprising monthly means data from six reanalysis data sets. The service will provide a basic set of commonly used operations over the reanalysis collections. The operations will be made accessible through NASA's climate data analytics Web services and our client-side Climate Data Services (CDS) API. (3) We are establishing an Open Geospatial Consortium (OGC) WPS-compliant Web service interface to our climate data analytics service that will enable greater interoperability with next-generation ESGF capabilities. The CDS API will be extended to accommodate the new WPS Web service endpoints as well as ESGF's Web service endpoints. These activities address some of the most important technical challenges for server-side analytics and support the research community's requirements for improved interoperability and improved access to reanalysis data.

  14. Recent Reanalysis Activities at ECMWF: Results from ERA-20C and Plans for ERA5

    NASA Astrophysics Data System (ADS)

    Dragani, R.; Hersbach, H.; Poli, P.; Pebeuy, C.; Hirahara, S.; Simmons, A.; Dee, D.

    2015-12-01

    This presentation will provide an overview of the most recent reanalysis activities performed at the European Centre for Medium-Range Weather Forecasts (ECMWF). A pilot reanalysis of the 20th century (ERA-20C) has recently been completed. Funded through the European FP7 collaborative project ERA-CLIM, ERA-20C is part of a suite of experiments that also includes a model-only integration (ERA-20CM) and a land-surface reanalysis (ERA-20CL). Its data assimilation system is constrained by surface observations only, obtained from ISPD (3.2.6) and ICOADS (2.5.1). Surface boundary conditions are provided by the Hadley Centre (HadISST2.1.0.0) and radiative forcing follows CMIP5 recommended data sets. First-guess uncertainty estimates are based on a 10-member Ensemble of Data Assimilations (the ERA-20C ensemble), run prior to ERA-20C using ten SST and sea-ice realizations from the Hadley Centre. In November 2014, the European Commission entrusted ECMWF to run on its behalf the Copernicus Climate Change Service (C3S), which aims to produce quality-assured information about the past, current and future states of the climate at both European and global scales. Reanalysis will be one of the main components of the C3S portfolio, and the first one to be produced is a global modern-era reanalysis (ERA5) covering the period from 1979 onwards. Based on a recent version of the ECMWF data assimilation system, ERA5 will replace the widely used ERA-Interim dataset. This new production will benefit from a much improved model, and from better characterized and exploited observations, compared to its predecessor. The first part of the presentation will focus on the ERA-20C production, provide an overview of its main characteristics and discuss some of the key results from its assessment. The second part of the talk will give an overview of ERA5 and briefly discuss some of its challenges.

  15. Assessment of Precipitation Trends over Europe by Comparing ERA-20C with a New Homogenized Observational GPCC Dataset

    NASA Astrophysics Data System (ADS)

    Rustemeier, E.; Ziese, M.; Meyer-Christoffer, A.; Finger, P.; Schneider, U.; Becker, A.

    2015-12-01

    Reliable data is essential for robust climate analysis. The ERA-20C reanalysis was developed during the ERA-CLIM and ERA-CLIM2 projects, which focus on multi-decadal reanalyses of the global climate system. To ensure data quality and provide end users with information about uncertainties in these products, the 4th work package of ERA-CLIM2 deals with the quality assessment of the products, including quality control and error estimation. In doing so, the monthly totals of the ERA-20C reanalysis are compared to two corresponding Global Precipitation Climatology Centre (GPCC) products: the Full Data Reanalysis Version 7 and the new HOMogenized PRecipitation Analysis of European in-situ data (HOMPRA Europe). The ERA-20C reanalysis was produced with ECMWF's IFS version Cy38r1 at a spatial resolution of about 125 km. It covers the time period 1900 to 2010. Only surface observations are assimilated, namely marine winds and pressure. This allows the comparison with independent, non-assimilated data. The GPCC Full Data Reanalysis Version 7 comprises monthly land-surface precipitation from approximately 75,000 rain gauges covering the time period 1901-2013. For this paper, the version with 1° resolution is utilized. For trend analysis, a monthly European subset of the ERA-20C reanalysis is investigated spanning the years 1951-2005. The European subset is compared to a new homogenized GPCC data set, HOMPRA Europe, based on a collective of 5,373 homogenized monthly rain-gauge time series carefully chosen from the GPCC archive of precipitation data. For the spatial and temporal evaluation of ERA-20C, global scores on monthly, seasonal and annual time scales are calculated. These include contingency table scores and correlation, along with spatial scores such as the fractional skill score. Unsurprisingly, the regions with the strongest deviations are those with data scarcity, mountainous regions with their windward and lee effects, and monsoon regions. 
They all exhibit strong biases throughout their series, and severe shifts in the means. The new HOMPRA Europe data set is useful in particular for trend analysis. Therefore it is compared to a monthly European subset of the ERA-20C reanalysis for the same period, i.e. the years 1951-2005, to study the ERA-20C capability in reproducing observed trends across Europe.
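
    The fractional skill score mentioned above compares neighbourhood fractions of threshold exceedance rather than point values, so a slightly displaced rain feature still earns partial credit. A minimal Python sketch, following the standard Roberts and Lean (2008) formulation rather than the authors' own code, is:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def fss(model, obs, threshold, window):
    """Fractional skill score: 1 = perfect skill, values near 0 = no skill.
    Fields are converted to neighbourhood fractions of threshold exceedance
    before the mean-squared error is computed."""
    m_frac = uniform_filter((model >= threshold).astype(float), size=window, mode="constant")
    o_frac = uniform_filter((obs >= threshold).astype(float), size=window, mode="constant")
    mse = np.mean((m_frac - o_frac) ** 2)
    mse_ref = np.mean(m_frac ** 2) + np.mean(o_frac ** 2)
    return 1.0 - mse / mse_ref if mse_ref > 0 else np.nan

# Identical fields score exactly 1; displacing the rain feature degrades the score
field = np.zeros((50, 50))
field[20:30, 20:30] = 10.0            # a 10x10 "rain" feature
shifted = np.roll(field, 5, axis=1)   # the same feature displaced 5 cells

print(fss(field, field, 1.0, 9))      # 1.0
print(fss(shifted, field, 1.0, 9))    # between 0 and 1: partial overlap credit
```

    Varying `window` shows the characteristic FSS behaviour: skill increases with neighbourhood size until the displacement error is absorbed.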

  16. The Mars Analysis Correction Data Assimilation (MACDA): A reference atmospheric reanalysis

    NASA Astrophysics Data System (ADS)

    Montabone, Luca; Lewis, Stephen R.; Steele, Liam J.; Holmes, James; Read, Peter L.; Valeanu, Alexandru; Smith, Michael D.; Kass, David; Kleinboehl, Armin; LMD Team, MGS/TES Team, MRO/MCS Team

    2016-10-01

    The Mars Analysis Correction Data Assimilation (MACDA) dataset version 1.0 contains the reanalysis of fundamental atmospheric and surface variables for the planet Mars covering a period of about three Martian years (late MY 24 to early MY 27). This four-dimensional dataset has been produced by data assimilation of retrieved thermal profiles and column dust optical depths from NASA's Mars Global Surveyor/Thermal Emission Spectrometer (MGS/TES), which have been assimilated into a Mars global climate model (MGCM) using the Analysis Correction scheme developed at the UK Meteorological Office. The MACDA v1.0 reanalysis is publicly available, and the NetCDF files can be downloaded from the archive at the Centre for Environmental Data Analysis/British Atmospheric Data Centre (CEDA/BADC). The variables included in the dataset can be visualised using an ad hoc graphical user interface (the "MACDA Plotter") located at the following URL: http://macdap.physics.ox.ac.uk/ The first paper about MACDA reanalysis of TES retrievals appeared in 2006, although the acronym MACDA was not yet used at that time. Ten years later, MACDA v1.0 has been used by several researchers worldwide and has contributed to the advancement of knowledge about the Martian atmosphere in critical areas such as the radiative impact of water ice clouds, the solsticial pause in baroclinic wave activity, and the climatology and dynamics of polar vortices, to cite only a few. It is therefore timely to review the scientific results obtained by using this Mars reference atmospheric reanalysis, in order to understand what priorities the user community should focus on in the next decade. MACDA is an ongoing collaborative project, and work funded by the NASA MDAP Programme is currently under way to produce version 2.0 of the Mars atmospheric reanalysis. One of the key improvements is the extension of the reanalysis period to nine Martian years (MY 24 through MY 32), with the assimilation of NASA's Mars Reconnaissance Orbiter/Mars Climate Sounder (MRO/MCS) retrievals of thermal and dust opacity profiles. MACDA 2.0 will also be based on an improved version of the underlying MGCM and an updated scheme to fully assimilate (radiatively active) tracers, such as dust.

  17. Evaluation of reanalysis near-surface winds over northern Africa in Boreal summer

    NASA Astrophysics Data System (ADS)

    Engelstaedter, Sebastian; Washington, Richard

    2014-05-01

    The emission of dust from desert surfaces depends on the combined effects of surface properties such as surface roughness, soil moisture, soil texture and particle size (erodibility) and wind speed (erosivity). In order for dust cycle models to realistically simulate dust emissions for the right reasons, it is essential that the factors controlling erosivity and erodibility are represented correctly. There has been a focus on improving dust emission schemes or input fields of soil distribution and texture, even though it has been shown that the use of wind fields from different reanalysis datasets to drive the same model can result in significant differences in the dust emissions. Here we evaluate the representation of near-surface wind speed from three different reanalysis datasets (ERA-Interim, CFSR and MERRA) over the North African domain. Reanalysis 10 m wind speeds are compared with observations from SYNOP and METAR reports available from the UK Meteorological Office Integrated Data Archive System (MIDAS) Land and Marine Surface Stations Dataset. We compare 6-hourly observations of 10 m wind speed between 1 January 1989 and 31 December 2009 from more than 500 surface stations with the corresponding reanalysis values. A station-data-based mean wind speed climatology for North Africa is presented. Overall, the representation of 10 m winds is relatively poor in all three reanalysis datasets, with stations in the northern parts of the Sahara still being better simulated (correlation coefficients ~ 0.5) than stations in the Sahel (correlation coefficients < 0.3), which points to the reanalyses not being able to realistically capture the dynamical systems of the Sahel. All three reanalyses have a systematic bias towards overestimating wind speeds below 3-4 m/s and underestimating wind speeds above 4 m/s. This bias becomes larger with increasing wind speed but is independent of the time of day. 
    For instance, 14 m/s observed wind speeds are underestimated on average by 6 m/s in the ERA-Interim reanalysis. Given the cubic relationship between wind speed and dust emission, this large underestimation is expected to significantly impact the simulation of dust emissions. A negative relationship between observed and ERA-Interim wind speed is found for winds above 14 m/s, indicating that high-wind-speed-generating processes are not well (if at all) represented in the model.
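
    The cubic sensitivity can be made concrete with a small worked example. The simple u**3 scaling below is the usual leading-order approximation for dust flux and deliberately ignores the threshold friction velocity used in real emission schemes, so it is an illustration of the argument, not an emission model.

```python
# A 6 m/s underestimate at 14 m/s observed wind, as reported for ERA-Interim,
# propagates cubically into the emitted dust flux (flux ~ u**3 above threshold).
def relative_flux(u_model, u_obs):
    """Fraction of the 'true' dust flux recovered when the model wind
    is u_model but the observed wind is u_obs, assuming flux ~ u**3."""
    return (u_model / u_obs) ** 3

u_obs, u_model = 14.0, 14.0 - 6.0
print(relative_flux(u_model, u_obs))  # ~0.19: the model recovers under 20% of the flux
```

    A wind error that looks moderate in linear terms thus removes the bulk of the strongest emission events, which dominate the dust budget.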

  18. cDNA microarray analysis of esophageal cancer: discoveries and prospects.

    PubMed

    Shimada, Yutaka; Sato, Fumiaki; Shimizu, Kazuharu; Tsujimoto, Gozoh; Tsukada, Kazuhiro

    2009-07-01

    Recent progress in molecular biology has revealed many genetic and epigenetic alterations that are involved in the development and progression of esophageal cancer. Microarray analysis has also revealed several genetic networks that are involved in esophageal cancer. However, clinical application of microarray techniques and use of microarray data have not yet occurred. In this review, we focus on the recent developments and problems with microarray analysis of esophageal cancer.

  19. Manufacturing of microarrays.

    PubMed

    Petersen, David W; Kawasaki, Ernest S

    2007-01-01

    DNA microarray technology has become a powerful tool in the arsenal of the molecular biologist. Capitalizing on high precision robotics and the wealth of DNA sequences annotated from the genomes of a large number of organisms, the manufacture of microarrays is now possible for the average academic laboratory with the funds and motivation. Microarray production requires attention to both biological and physical resources, including DNA libraries, robotics, and qualified personnel. While the fabrication of microarrays is a very labor-intensive process, production of quality microarrays individually tailored on a project-by-project basis will help researchers shed light on future scientific questions.

  20. The Longhorn Array Database (LAD): An Open-Source, MIAME compliant implementation of the Stanford Microarray Database (SMD)

    PubMed Central

    Killion, Patrick J; Sherlock, Gavin; Iyer, Vishwanath R

    2003-01-01

    Background The power of microarray analysis can be realized only if data is systematically archived and linked to biological annotations as well as analysis algorithms. Description The Longhorn Array Database (LAD) is a MIAME compliant microarray database that operates on PostgreSQL and Linux. It is a fully open source version of the Stanford Microarray Database (SMD), one of the largest microarray databases. LAD is available at . Conclusions Our development of LAD provides a simple, free, open, reliable and proven solution for storage and analysis of two-color microarray data. PMID:12930545

  1. Development and evaluation of a high-resolution reanalysis of the East Australian Current region using the Regional Ocean Modelling System (ROMS 3.4) and Incremental Strong-Constraint 4-Dimensional Variational (IS4D-Var) data assimilation

    NASA Astrophysics Data System (ADS)

    Kerry, Colette; Powell, Brian; Roughan, Moninya; Oke, Peter

    2016-10-01

    As with other Western Boundary Currents globally, the East Australian Current (EAC) is highly variable making it a challenge to model and predict. For the EAC region, we combine a high-resolution state-of-the-art numerical ocean model with a variety of traditional and newly available observations using an advanced variational data assimilation scheme. The numerical model is configured using the Regional Ocean Modelling System (ROMS 3.4) and takes boundary forcing from the BlueLink ReANalysis (BRAN3). For the data assimilation, we use an Incremental Strong-Constraint 4-Dimensional Variational (IS4D-Var) scheme, which uses the model dynamics to perturb the initial conditions, atmospheric forcing, and boundary conditions, such that the modelled ocean state better fits and is in balance with the observations. This paper describes the data assimilative model configuration that achieves a significant reduction of the difference between the modelled solution and the observations to give a dynamically consistent "best estimate" of the ocean state over a 2-year period. The reanalysis is shown to represent both assimilated and non-assimilated observations well. It achieves mean spatially averaged root mean squared (rms) residuals with the observations of 7.6 cm for sea surface height (SSH) and 0.4 °C for sea surface temperature (SST) over the assimilation period. The time-mean rms residual for subsurface temperature measured by Argo floats is a maximum of 0.9 °C between water depths of 100 and 300 m and smaller throughout the rest of the water column. Velocities at several offshore and continental shelf moorings are well represented in the reanalysis with complex correlations between 0.8 and 1 for all observations in the upper 500 m. Surface radial velocities from a high-frequency radar array are assimilated and the reanalysis provides surface velocity estimates with complex correlations with observed velocities of 0.8-1 across the radar footprint. 
A comparison with independent (non-assimilated) shipboard conductivity temperature depth (CTD) cast observations shows a marked improvement in the representation of the subsurface ocean in the reanalysis, with the rms residual in potential density reduced to about half of the residual with the free-running model in the upper eddy-influenced part of the water column. This shows that information is successfully propagated from observed variables to unobserved regions as the assimilation system uses the model dynamics to adjust the model state estimate. This is the first study to generate a reanalysis of the region at such a high resolution, making use of an unprecedented observational data set and using an assimilation method that uses the time-evolving model physics to adjust the model in a dynamically consistent way. As such, the reanalysis potentially represents a marked improvement in our ability to capture important circulation dynamics in the EAC. The reanalysis is being used to study EAC dynamics, observation impact in state-estimation, and as forcing for a variety of downscaling studies.
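
    The complex correlations quoted above treat each (u, v) velocity pair as a single complex number, so one coefficient captures both speed and direction agreement. A minimal sketch of the standard Kundu-style estimator, applied to synthetic rotating currents rather than the reanalysis output, is:

```python
import numpy as np

def complex_correlation(u1, v1, u2, v2):
    """Complex correlation between two velocity time series.
    Velocities are encoded as w = u + iv; |r| measures overall agreement
    and the phase of r gives the mean angular offset between the series."""
    w1 = u1 + 1j * v1
    w2 = u2 + 1j * v2
    num = np.mean(w1 * np.conj(w2))
    den = np.sqrt(np.mean(np.abs(w1) ** 2) * np.mean(np.abs(w2) ** 2))
    return num / den

# A rotating current and a noisy copy of it
rng = np.random.default_rng(1)
t = np.linspace(0, 20, 500)
u, v = np.cos(t), np.sin(t)
u2 = u + rng.normal(0, 0.1, t.size)
v2 = v + rng.normal(0, 0.1, t.size)

r = complex_correlation(u, v, u2, v2)
print(abs(r))  # close to 1 for well-matched currents
```

    Complex correlations of 0.8-1, as reported for the reanalysis, correspond to velocity series that agree closely in both magnitude and rotation.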

  2. [Microarray CGH: principle and use for constitutional disorders].

    PubMed

    Sanlaville, D; Lapierre, J M; Coquin, A; Turleau, C; Vermeesch, J; Colleaux, L; Borck, G; Vekemans, M; Aurias, A; Romana, S P

    2005-10-01

    Chip technology has made it possible to miniaturize processes so that many chemical reactions can be carried out in a single step on the same device. The application of this technology to molecular cytogenetics has led to the development of the microarray-based comparative genomic hybridization (array CGH) technique. Using this technique, it is possible to detect very small genetic imbalances anywhere in the genome. Its usefulness has been well documented in cancer and, more recently, in constitutional disorders. In particular, it has been used to detect interstitial and subtelomeric submicroscopic imbalances, to characterize their size at the molecular level, and to define the breakpoints of translocations. The challenge today is to transfer this technology to laboratory medicine. Nevertheless, the technology remains expensive, and the existence of numerous sequence polymorphisms makes its interpretation difficult. Finally, it is unlikely to make karyotyping obsolete, as it cannot detect balanced rearrangements, which after meiotic segregation might result in genome imbalance in the progeny.

  3. Cross-platform normalization of microarray and RNA-seq data for machine learning applications

    PubMed Central

    Thompson, Jeffrey A.; Tan, Jie

    2016-01-01

    Large, publicly available gene expression datasets are often analyzed with the aid of machine learning algorithms. Although RNA-seq is increasingly the technology of choice, a wealth of expression data already exist in the form of microarray data. If machine learning models built from legacy data can be applied to RNA-seq data, larger, more diverse training datasets can be created and validation can be performed on newly generated data. We developed Training Distribution Matching (TDM), which transforms RNA-seq data for use with models constructed from legacy platforms. We evaluated TDM, as well as quantile normalization, nonparanormal transformation, and a simple log2 transformation, on both simulated and biological datasets of gene expression. Our evaluation included both supervised and unsupervised machine learning approaches. We found that TDM exhibited consistently strong performance across settings and that quantile normalization also performed well in many circumstances. We also provide a TDM package for the R programming language. PMID:26844019
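
    Quantile normalization, one of the baselines evaluated above, can be sketched in a few lines: each sample's sorted values are replaced by the across-sample mean at the same rank, forcing all samples onto an identical distribution. This is a generic implementation on a toy genes-by-samples matrix, not the authors' TDM package.

```python
import numpy as np

def quantile_normalize(matrix):
    """Quantile-normalize the columns (samples) of a genes-x-samples matrix.
    After normalization every column contains the same set of values,
    assigned according to each column's original ranks."""
    ranks = np.argsort(np.argsort(matrix, axis=0), axis=0)  # rank of each entry per column
    mean_sorted = np.sort(matrix, axis=0).mean(axis=1)      # mean value at each rank
    return mean_sorted[ranks]

# Two 'samples' with different scales end up with identical distributions
data = np.array([[5.0, 4.0],
                 [2.0, 1.0],
                 [3.0, 4.5],
                 [4.0, 2.0]])
normed = quantile_normalize(data)
print(normed)
```

    Methods like TDM go further than this, matching the target distribution of the training platform rather than an across-sample mean, which is what makes models trained on microarray data transferable to RNA-seq.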

  4. Investigating the Genome Diversity of B. cereus and Evolutionary Aspects of B. anthracis Emergence

    PubMed Central

    Papazisi, Leka; Rasko, David A.; Ratnayake, Shashikala; Bock, Geoff R.; Remortel, Brian G.; Appalla, Lakshmi; Liu, Jia; Dracheva, Tatiana; Braisted, John C.; Shallom, Shamira; Jarrahi, Benham; Snesrud, Erik; Ahn, Susie; Sun, Qiang; Rilstone, Jenifer; Økstad, Ole Andreas; Kolstø, Anne-Brit; Fleischmann, Robert D.; Peterson, Scott N.

    2011-01-01

    Here we report the use of a multi-genome DNA microarray to investigate the genome diversity of Bacillus cereus group members and elucidate the events associated with the emergence of B. anthracis, the causative agent of anthrax, a lethal zoonotic disease. We initially performed directed genome sequencing of seven diverse B. cereus strains to identify novel sequences encoded in those genomes. The novel genes identified, combined with those publicly available, allowed the design of a “species” DNA microarray. Comparative genomic hybridization analyses of 41 strains indicate that substantial heterogeneity exists with respect to the genes comprising functional role categories. While the acquisition of the plasmid-encoded pathogenicity island (pXO1) and capsule genes (pXO2) represents a crucial landmark dictating the emergence of B. anthracis, the evolution of this species and its close relatives was associated with an overall shift in the fraction of genes devoted to energy metabolism, cellular processes, transport, and virulence. PMID:21447378

  5. Fast gene ontology based clustering for microarray experiments.

    PubMed

    Ovaska, Kristian; Laakso, Marko; Hautaniemi, Sampsa

    2008-11-21

    Analysis of a microarray experiment often results in a list of hundreds of disease-associated genes. In order to suggest common biological processes and functions for these genes, Gene Ontology (GO) annotations with statistical testing are widely used. However, these analyses can produce a very large number of significantly altered biological processes. Thus, it is often challenging to interpret GO results and identify novel testable biological hypotheses. We present fast software for advanced gene annotation using semantic similarity for Gene Ontology terms combined with clustering and heat map visualisation. The methodology allows rapid identification of genes sharing the same Gene Ontology cluster. Our R-based open-source semantic similarity package has a speed advantage of over 2000-fold compared to existing implementations. From the resulting hierarchical clustering dendrogram, genes sharing a GO term can be identified, and their differences in gene expression patterns can be seen from the heat map. These methods facilitate advanced annotation of genes resulting from data analysis.
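
    The clustering step described above can be illustrated generically: given a pairwise semantic-similarity matrix over genes (the values below are hypothetical, not from the paper's package), hierarchical clustering groups genes with similar GO annotation profiles. This sketch uses SciPy rather than the authors' R implementation.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

# Hypothetical pairwise GO semantic similarities for five genes
# (1.0 = identical annotation profiles); illustrative values only.
sim = np.array([
    [1.0, 0.9, 0.8, 0.1, 0.2],
    [0.9, 1.0, 0.85, 0.15, 0.1],
    [0.8, 0.85, 1.0, 0.2, 0.15],
    [0.1, 0.15, 0.2, 1.0, 0.9],
    [0.2, 0.1, 0.15, 0.9, 1.0],
])

dist = 1.0 - sim                                 # turn similarity into a distance
np.fill_diagonal(dist, 0.0)
z = linkage(squareform(dist), method="average")  # average-linkage hierarchical clustering
labels = fcluster(z, t=2, criterion="maxclust")  # cut the dendrogram into two clusters
print(labels)  # the first three genes and the last two fall into separate clusters
```

    The dendrogram encoded in `z` is what a heat map visualisation would be ordered by; cutting it at different heights yields coarser or finer GO clusters.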

  6. Comparative Analysis of CNV Calling Algorithms: Literature Survey and a Case Study Using Bovine High-Density SNP Data.

    PubMed

    Xu, Lingyang; Hou, Yali; Bickhart, Derek M; Song, Jiuzhou; Liu, George E

    2013-06-25

    Copy number variations (CNVs) are gains and losses of genomic sequence between two individuals of a species when compared to a reference genome. The data from single nucleotide polymorphism (SNP) microarrays are now routinely used for genotyping, but they also can be utilized for copy number detection. Substantial progress has been made in array design and CNV calling algorithms and at least 10 comparison studies in humans have been published to assess them. In this review, we first survey the literature on existing microarray platforms and CNV calling algorithms. We then examine a number of CNV calling tools to evaluate their impacts using bovine high-density SNP data. Large incongruities in the results from different CNV calling tools highlight the need for standardizing array data collection, quality assessment and experimental validation. Only after careful experimental design and rigorous data filtering can the impacts of CNVs on both normal phenotypic variability and disease susceptibility be fully revealed.

  7. A framework for list representation, enabling list stabilization through incorporation of gene exchangeabilities.

    PubMed

    Soneson, Charlotte; Fontes, Magnus

    2012-01-01

    Analysis of multivariate data sets from, for example, microarray studies frequently results in lists of genes which are associated with some response of interest. The biological interpretation is often complicated by the statistical instability of the obtained gene lists, which may partly be due to the functional redundancy among genes, implying that multiple genes can play exchangeable roles in the cell. In this paper, we use the concept of exchangeability of random variables to model this functional redundancy and thereby account for the instability. We present a flexible framework to incorporate the exchangeability into the representation of lists. The proposed framework supports straightforward comparison between any 2 lists. It can also be used to generate new more stable gene rankings incorporating more information from the experimental data. Using 2 microarray data sets, we show that the proposed method provides more robust gene rankings than existing methods with respect to sampling variations, without compromising the biological significance of the rankings.

  8. EPA's Reanalysis of Key Issues Related to Dioxin Toxicity and Response to NAS Comments (Volume 1) (Interagency Science Discussion Draft)

    EPA Science Inventory

    EPA is releasing the draft report, EPA's Reanalysis of Key Issues Related to Dioxin Toxicity and Response to NAS Comments (Volume 1), that was distributed to Federal agencies and White House Offices for comment during the Science Discussion step.

  On the Extraction of Components and the Applicability of the Factor Model.

    ERIC Educational Resources Information Center

    Dziuban, Charles D.; Harris, Chester W.

    A reanalysis of Shaycroft's matrix of intercorrelations of 10 test variables plus 4 random variables is discussed. Three different procedures were used in the reanalysis: (1) Image Component Analysis, (2) Uniqueness Rescaling Factor Analysis, and (3) Alpha Factor Analysis. The results of these analyses are presented in tables. It is concluded from…

  9. EPA's Reanalysis of Key Issues Related to Dioxin Toxicity and Response to NAS Comments (External Review Draft)

    EPA Science Inventory

  1. Predictability of Subsurface Temperature and the AMOC

    NASA Astrophysics Data System (ADS)

    Chang, Y.; Schubert, S. D.

    2013-12-01

    The GEOS-5 coupled model is extensively used for experimental decadal climate prediction. Understanding the limits of decadal ocean predictability is critical for making progress in these efforts. Using this model, we study the initial-value predictability of subsurface temperature, the variability of the Atlantic meridional overturning circulation (AMOC) and its impacts on the global climate. Our approach is to utilize the idealized data assimilation technology developed at the GMAO. The 'replay' technique allows us to assess, for example, the impact of surface wind stresses and/or precipitation on the ocean in a very well controlled environment. By running the coupled model in replay mode we can constrain the model using any existing reanalysis data set. We replay the model, constraining (nudging) it to the MERRA reanalysis in various fields from 1948-2012. The atmospheric fields u, v, T, q and ps are adjusted towards the 6-hourly analyzed fields. The simulated AMOC variability is studied with a 400-year-long segment of a replay integration. The 84 cases of 10-year hindcasts are initialized from 4 different replay cycles. Here, the variability and predictability are examined further with a measure that quantifies how much the subsurface temperature and AMOC variability have been influenced by atmospheric forcing and by ocean internal variability. The simulated impact of the AMOC on the multi-decadal variability of the SST, sea surface height (SSH) and sea ice extent is also studied.

  2. The role of observational reference data for climate downscaling: Insights from the VALUE COST Action

    NASA Astrophysics Data System (ADS)

    Kotlarski, Sven; Gutiérrez, José M.; Boberg, Fredrik; Bosshard, Thomas; Cardoso, Rita M.; Herrera, Sixto; Maraun, Douglas; Mezghani, Abdelkader; Pagé, Christian; Räty, Olle; Stepanek, Petr; Soares, Pedro M. M.; Szabo, Peter

    2016-04-01

    VALUE is an open European network to validate and compare downscaling methods for climate change research (http://www.value-cost.eu). A key deliverable of VALUE is the development of a systematic validation framework to enable the assessment and comparison of downscaling methods. Such assessments can be expected to depend crucially on the existence of accurate and reliable observational reference data. In dynamical downscaling, observational data can influence model development itself and, later on, model evaluation, parameter calibration and added-value assessment. In empirical-statistical downscaling, observations serve as predictand data and directly influence model calibration, with corresponding effects on downscaled climate change projections. We here present a comprehensive assessment of the influence of uncertainties in observational reference data and of scale-related issues on several of the above-mentioned aspects. First, temperature and precipitation characteristics as simulated by a set of reanalysis-driven EURO-CORDEX RCM experiments are validated against three different gridded reference data products, namely (1) the E-OBS dataset, (2) the recently developed EURO4M-MESAN regional re-analysis, and (3) several national high-resolution and quality-controlled gridded datasets that recently became available. The analysis reveals a considerable influence of the choice of reference data on the evaluation results, especially for precipitation. It is also illustrated how differences between the reference data sets influence the ranking of RCMs according to a comprehensive set of performance measures.

  3. Making new genetic diagnoses with old data: iterative reanalysis and reporting from genome-wide data in 1,133 families with developmental disorders.

    PubMed

    Wright, Caroline F; McRae, Jeremy F; Clayton, Stephen; Gallone, Giuseppe; Aitken, Stuart; FitzGerald, Tomas W; Jones, Philip; Prigmore, Elena; Rajan, Diana; Lord, Jenny; Sifrim, Alejandro; Kelsell, Rosemary; Parker, Michael J; Barrett, Jeffrey C; Hurles, Matthew E; FitzPatrick, David R; Firth, Helen V

    2018-01-11

    Purpose: Given the rapid pace of discovery in rare disease genomics, it is likely that improvements in diagnostic yield can be made by systematically reanalyzing previously generated genomic sequence data in light of new knowledge. Methods: We tested this hypothesis in the United Kingdom-wide Deciphering Developmental Disorders study, where in 2014 we reported a diagnostic yield of 27% through whole-exome sequencing of 1,133 children with severe developmental disorders and their parents. We reanalyzed existing data using improved variant calling methodologies, novel variant detection algorithms, updated variant annotation, evidence-based filtering strategies, and newly discovered disease-associated genes. Results: We are now able to diagnose an additional 182 individuals, taking our overall diagnostic yield to 454/1,133 (40%), and another 43 (4%) have a finding of uncertain clinical significance. The majority of these new diagnoses are due to novel developmental disorder-associated genes discovered since our original publication. Conclusion: This study highlights the importance of coupling large-scale research with clinical practice, and of discussing the possibility of iterative reanalysis and recontact with patients and health professionals at an early stage. We estimate that implementing parent-offspring whole-exome sequencing as a first-line diagnostic test for developmental disorders would diagnose >50% of patients. Genetics in Medicine advance online publication, 11 January 2018; doi:10.1038/gim.2017.246.
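
    As a quick check of the yield arithmetic in this abstract, a minimal sketch (the figures are taken directly from the text; only the script itself is illustrative):

```python
# Figures from the abstract; this sketch merely re-derives the percentages.
cohort = 1133
diagnosed_total = 454   # cumulative diagnoses after reanalysis (initially 27%)
uncertain = 43          # findings of uncertain clinical significance

yield_pct = 100 * diagnosed_total / cohort
uncertain_pct = 100 * uncertain / cohort
print(f"{yield_pct:.0f}% diagnosed, {uncertain_pct:.0f}% uncertain")  # 40%, 4%
```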

  4. Influence of reanalysis datasets on dynamically downscaling the recent past

    NASA Astrophysics Data System (ADS)

    Moalafhi, Ditiro B.; Evans, Jason P.; Sharma, Ashish

    2017-08-01

    Multiple reanalysis datasets currently exist that can provide boundary conditions for dynamic downscaling and simulating local hydro-climatic processes at finer spatial and temporal resolutions. Previous work has suggested two reanalysis alternatives that provide the best lateral boundary conditions for downscaling over southern Africa. This study dynamically downscales these reanalyses (ERA-I and MERRA) over southern Africa to a high-resolution (10 km) grid using the WRF model. Simulations cover the period 1981-2010. Multiple observation datasets were used for both surface temperature and precipitation to account for observational uncertainty when assessing results. Generally, temperature is simulated quite well, except over the Namibian coastal plain, where the simulations show anomalously warm temperatures related to the failure to propagate the influence of the cold Benguela current inland. Precipitation tends to be overestimated in high-altitude areas and most of southern Mozambique. This could be attributed to challenges in handling complex topography and capturing large-scale circulation patterns. While MERRA-driven WRF exhibits slightly less bias in temperature, especially for La Nina years, ERA-I-driven simulations are on average superior in terms of RMSE. When considering multiple variables and metrics, ERA-I is found to produce the best simulation of the climate over the domain. The influence of the regional model appears to be large enough to overcome the small difference in relative errors present in the lateral boundary conditions derived from these two reanalyses.
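
    The bias-versus-RMSE distinction drawn in this comparison can be sketched as follows. The fields here are synthetic stand-ins, not WRF or observational data; the point is only that RMSE penalizes scatter on top of any systematic offset:

```python
import numpy as np

rng = np.random.default_rng(0)
obs = rng.normal(20.0, 5.0, size=(50, 50))         # stand-in "observed" field
sim = obs + 1.5 + rng.normal(0.0, 1.0, obs.shape)  # simulation with a warm bias

bias = np.mean(sim - obs)                   # mean error: sign indicates warm/cold
rmse = np.sqrt(np.mean((sim - obs) ** 2))   # penalizes random scatter as well
print(f"bias = {bias:.2f}, RMSE = {rmse:.2f}")
```

    A model can therefore have the smaller bias yet the larger RMSE, which is why the abstract can report MERRA-driven WRF as less biased while ERA-I-driven runs are superior in RMSE.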

  5. The processing of phonological, orthographical, and lexical information of Chinese characters in sentence contexts: an ERP study.

    PubMed

    Liu, Baolin; Jin, Zhixing; Qing, Zhao; Wang, Zhongning

    2011-02-04

    In the current work, we aimed to study the processing of phonological, orthographical, and lexical information of Chinese characters in sentence contexts, as well as to provide further evidence for psychological models. In the experiment, we designed sentences with expected, homophonic, orthographically similar, synonymous, and control characters as endings, respectively. The results indicated that P200 might be related to the early extraction of phonological information. Moreover, it might also represent immediate semantic and orthographic lexical access. This suggested that there might be a dual-route in cognitive processing, where the direct access route and the phonologically mediated access route both exist and interact with each other. The increased N400 under the control condition suggested that both phonological and orthographical information would influence semantic integration in Chinese sentence comprehension. The two positive peaks of the late positive shift might represent the semantic monitoring, and orthographical retrieval and reanalysis processing, respectively. Under the orthographically similar condition, orthographical retrieval and reanalysis processing was more difficult in comparison with the other conditions, which suggested that there might be direct access from orthography to semantic representation in cognitive processing. In conclusion, it was shown that the direct access hypothesis or the dual-route hypothesis could better explain cognitive processing in the brain. Copyright © 2010 Elsevier B.V. All rights reserved.

  6. Global open data management in metabolomics.

    PubMed

    Haug, Kenneth; Salek, Reza M; Steinbeck, Christoph

    2017-02-01

    Chemical Biology employs chemical synthesis, analytical chemistry and other tools to study biological systems. Recent advances in molecular biology, such as next-generation sequencing (NGS), have led to unprecedented insights into the evolution of organisms' biochemical repertoires. Because of the specific data-sharing culture in genomics, genomes from all kingdoms of life become readily available for further analysis by other researchers. While the genome expresses the potential of an organism to adapt to external influences, the metabolome presents a molecular phenotype that allows us to assess, in a dynamic way, the external influences under which an organism exists and develops. Steady advancements in instrumentation towards high-throughput and high-resolution methods have led to a revival of analytical chemistry methods for the measurement and analysis of the metabolome of organisms. This steady growth of metabolomics as a field is leading to a similar accumulation of big data across laboratories worldwide as can be observed in all of the other omics areas. This calls for the development of methods and technologies for handling such large datasets, for efficiently distributing them and for enabling re-analysis. Here we describe the recently emerging ecosystem of global open-access databases and data exchange efforts between them, as well as the foundations and obstacles that enable or prevent data sharing and reanalysis. Copyright © 2017 The Authors. Published by Elsevier Ltd. All rights reserved.

  7. How effective is good domestic kitchen hygiene at reducing diarrhoeal disease in developed countries? A systematic review and reanalysis of the UK IID study

    PubMed Central

    Stenberg, Anna; Macdonald, Clare; Hunter, Paul R

    2008-01-01

    Background To assess whether domestic kitchen hygiene is an important contributor to the development of diarrhoea in the developed world. Methods Electronic searches were carried out in October 2006 in EMBASE, MEDLINE, Web of Knowledge, Cochrane central register of clinical trials and CINAHL. All publications, irrespective of study design, assessing food hygiene practices with an outcome measure of diarrhoea were included in the review. All included studies underwent data extraction and the data were subsequently analysed. The analysis was conducted by qualitative synthesis of the results. Given the substantial heterogeneity in study design and outcome measures, meta-analysis was not done. In addition, the existing dataset of the UK IID study was reanalysed to investigate possible associations between self-reported diarrhoea and variables indicative of poor domestic kitchen hygiene. Results Some 14 studies were finally included in subsequent analyses. Of the 14 studies included in this systematic review, 11 were case-control studies, 2 cross-sectional surveys, and 1 RCT. Very few studies identified any significant association with good environmental kitchen hygiene. Although some of the variables in the reanalysis of the UK IID study were statistically significant, no obvious trend was seen. Conclusion The balance of the available evidence does not support the hypothesis that poor domestic kitchen hygiene practices are important risk factors for diarrhoeal disease in developed countries. PMID:18294383

  8. How effective is good domestic kitchen hygiene at reducing diarrhoeal disease in developed countries? A systematic review and reanalysis of the UK IID study.

    PubMed

    Stenberg, Anna; Macdonald, Clare; Hunter, Paul R

    2008-02-22

    To assess whether domestic kitchen hygiene is an important contributor to the development of diarrhoea in the developed world. Electronic searches were carried out in October 2006 in EMBASE, MEDLINE, Web of Knowledge, Cochrane central register of clinical trials and CINAHL. All publications, irrespective of study design, assessing food hygiene practices with an outcome measure of diarrhoea were included in the review. All included studies underwent data extraction and the data were subsequently analysed. The analysis was conducted by qualitative synthesis of the results. Given the substantial heterogeneity in study design and outcome measures, meta-analysis was not done. In addition, the existing dataset of the UK IID study was reanalysed to investigate possible associations between self-reported diarrhoea and variables indicative of poor domestic kitchen hygiene. Some 14 studies were finally included in subsequent analyses. Of the 14 studies included in this systematic review, 11 were case-control studies, 2 cross-sectional surveys, and 1 RCT. Very few studies identified any significant association with good environmental kitchen hygiene. Although some of the variables in the reanalysis of the UK IID study were statistically significant, no obvious trend was seen. The balance of the available evidence does not support the hypothesis that poor domestic kitchen hygiene practices are important risk factors for diarrhoeal disease in developed countries.

  9. The Microarray Revolution: Perspectives from Educators

    ERIC Educational Resources Information Center

    Brewster, Jay L.; Beason, K. Beth; Eckdahl, Todd T.; Evans, Irene M.

    2004-01-01

    In recent years, microarray analysis has become a key experimental tool, enabling the analysis of genome-wide patterns of gene expression. This review approaches the microarray revolution with a focus upon four topics: 1) the early development of this technology and its application to cancer diagnostics; 2) a primer of microarray research,…

  10. Evaluating the fidelity of CMIP5 models in producing large-scale meteorological patterns over the Northwestern United States

    NASA Astrophysics Data System (ADS)

    Lintner, B. R.; Loikith, P. C.; Pike, M.; Aragon, C.

    2017-12-01

    Climate change information is increasingly required at impact-relevant scales. However, most state-of-the-art climate models are not of sufficiently high spatial resolution to resolve features explicitly at such scales. This challenge is particularly acute in regions of complex topography, such as the Pacific Northwest of the United States. To address this scale mismatch problem, we consider large-scale meteorological patterns (LSMPs), which can be resolved by climate models and are associated with the occurrence of local-scale climate and climate extremes. In prior work, using self-organizing maps (SOMs), we computed LSMPs over the northwestern United States (NWUS) from daily reanalysis circulation fields and further related these to the occurrence of observed extreme temperatures and precipitation: SOMs were used to group LSMPs into 12 nodes or clusters spanning the continuum of synoptic variability over the region. Here this observational foundation is utilized as an evaluation target for a suite of global climate models from the Fifth Phase of the Coupled Model Intercomparison Project (CMIP5). Evaluation is performed in two primary ways. First, daily model circulation fields are assigned to one of the 12 reanalysis nodes based on minimization of the mean square error. From this, a bulk model skill score is computed measuring the similarity between the model and reanalysis nodes. Next, SOMs are applied directly to the model output and compared to the nodes obtained from reanalysis. Results reveal that many of the models have LSMPs analogous to the reanalysis, suggesting that the models reasonably capture observed daily synoptic states.
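
    The node-assignment step described above (matching each daily model field to the reanalysis SOM node with minimum mean square error) can be sketched as follows. Array shapes and data are illustrative stand-ins, not the study's fields:

```python
import numpy as np

rng = np.random.default_rng(1)
nodes = rng.normal(size=(12, 40, 60))        # 12 reanalysis SOM node patterns
model_days = rng.normal(size=(365, 40, 60))  # one year of daily model fields

def assign_node(field, nodes):
    """Index of the SOM node with minimum MSE relative to `field`."""
    mse = np.mean((nodes - field) ** 2, axis=(1, 2))
    return int(np.argmin(mse))

assignments = np.array([assign_node(day, nodes) for day in model_days])
freq = np.bincount(assignments, minlength=12) / len(assignments)
print(freq)  # node-occupation frequencies, comparable to reanalysis counts
```

    Comparing the model's node-occupation frequencies against the reanalysis frequencies is one way to build the bulk skill score the abstract mentions.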

  11. A North American regional reanalysis climatology of the Haines Index

    Treesearch

    Wei Lu; Joseph J. (Jay) Charney; Sharon Zhong; Xindi Bian; Shuhua Liu

    2011-01-01

    A warm-season (May through October) Haines Index climatology is derived using 32-km regional reanalysis temperature and humidity data from 1980 to 2007. We compute lapse rates, dewpoint depressions, Haines Index factors A and B, and values for each of the low-, mid- and high-elevation variants of the Haines Index. Statistical techniques are used to investigate the...
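
    The per-gridpoint computation behind such a climatology can be sketched for the mid-elevation variant of the Haines Index (factor A from the 850-700 hPa lapse, factor B from the 850 hPa dewpoint depression). The category thresholds below follow the commonly cited formulation and are an assumption here; verify against the operational definition before real use:

```python
def haines_mid(t850, t700, td850):
    """Mid-level Haines Index from 850/700 hPa temperatures (deg C).

    Thresholds are the commonly cited mid-level ones (assumed here,
    not taken from the paper).
    """
    lapse = t850 - t700        # stability term (factor A)
    depression = t850 - td850  # moisture term, dewpoint depression (factor B)

    a = 1 if lapse < 6 else (2 if lapse < 11 else 3)
    b = 1 if depression < 6 else (2 if depression < 13 else 3)
    return a + b               # 2 (low) to 6 (high fire-weather potential)

print(haines_mid(t850=22.0, t700=10.0, td850=6.0))  # dry, unstable: 6
```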

  12. Recent progress in making protein microarray through BioLP

    NASA Astrophysics Data System (ADS)

    Yang, Rusong; Wei, Lian; Feng, Ying; Li, Xiujian; Zhou, Quan

    2017-02-01

    Biological laser printing (BioLP) is a promising biomaterial printing technique. It has the advantage of high resolution, high bioactivity, high printing frequency and small transported liquid amount. In this paper, a set of BioLP device is design and made, and protein microarrays are printed by this device. It's found that both laser intensity and fluid layer thickness have an influence on the microarrays acquired. Besides, two kinds of the fluid layer coating methods are compared, and the results show that blade coating method is better than well-coating method in BioLP. A microarray of 0.76pL protein microarray and a "NUDT" patterned microarray are printed to testify the printing ability of BioLP.

  13. Intercomparison of the Gulf Stream in ocean reanalyses: 1993-2010

    NASA Astrophysics Data System (ADS)

    Chi, Lequan; Wolfe, Christopher L. P.; Hameed, Sultan

    2018-05-01

    In recent years, significant progress has been made in the development of high-resolution ocean reanalysis products. This paper compares aspects of the Gulf Stream (GS) from the Florida Straits to south of the Grand Banks, particularly Florida Strait transport, separation of the GS near Cape Hatteras, GS properties along the Oleander Line (from New Jersey to Bermuda), GS path, and the GS north wall positions, in 13 widely used global reanalysis products of various resolutions, including two unconstrained products. A large spread across reanalysis products is found. HYCOM and GLORYS2v4 stand out for their superior performance by most metrics. Some common biases are found in all discussed models; for example, the velocity structure of the GS near the Oleander Line is too symmetrical and the maximum velocity is too weak compared with observations. Less than half of the reanalysis products show significant correlations (at the 95% confidence level) with observations for the GS separation latitude at Cape Hatteras, the GS transport, and net transport across the Oleander Line. The cross-stream velocity structure is further discussed by a theoretical model idealizing the GS as a smoothed PV front.

  14. Synoptic Storms in the North Atlantic in the Atmospheric Reanalysis and Scatterometer-Based Wind Products

    NASA Astrophysics Data System (ADS)

    Dukhovskoy, D. S.; Bourassa, M. A.

    2016-12-01

    The study compares and analyses the characteristics of synoptic storms in the Subpolar North Atlantic over the time period from 2000 through 2009 derived from reanalysis data sets and scatterometer-based gridded wind products. The analysis is performed for ocean 10-m winds derived from the following wind data sets: NCEP/DOE AMIP-II reanalysis (NCEPR2), NCAR/CFSR, Arctic System Reanalysis (ASR) version 1, Cross-Calibrated Multi-Platform (CCMP) wind product versions 1.1 and the recently released version 2.0 prepared by Remote Sensing Systems, and QuikSCAT. A cyclone tracking algorithm employed in this study for storm identification is based on average vorticity fields derived from the wind data. The study discusses storm characteristics such as storm counts, trajectories, intensity, integrated kinetic energy, and spatial scale. Interannual variability of these characteristics in the data sets is compared. The analyses demonstrate general agreement among the wind data products on the characteristics of the storms, their spatial distribution and trajectories. On average, the NCEPR2 storms are more energetic, mostly due to large spatial scales and stronger winds. There is noticeable interannual variability in the storm characteristics, yet no obvious trend in storms is observed in the data sets.
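
    The vorticity field underlying such storm identification can be sketched as follows: relative vorticity zeta = dv/dx - du/dy from gridded winds, with a vorticity maximum marking a candidate (Northern Hemisphere) cyclone centre. The winds here are an idealized Gaussian vortex and the grid spacing is an assumed constant, not reanalysis data:

```python
import numpy as np

dx = dy = 25_000.0                  # assumed uniform grid spacing (metres)
ny, nx, sigma = 80, 100, 10.0
y, x = np.mgrid[0:ny, 0:nx].astype(float)
g = np.exp(-((x - 50) ** 2 + (y - 40) ** 2) / (2 * sigma ** 2))
u, v = -(y - 40) * g, (x - 50) * g  # idealized cyclonic vortex at (40, 50)

# relative vorticity via centred finite differences
zeta = np.gradient(v, dx, axis=1) - np.gradient(u, dy, axis=0)
iy, ix = np.unravel_index(np.argmax(zeta), zeta.shape)
print(iy, ix)  # vorticity maximum recovers the vortex centre near (40, 50)
```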

  15. Reanalysis of Water, Land Use, and Production Data for Assessing China's Agricultural Resources

    NASA Astrophysics Data System (ADS)

    Smith, T.; Pan, J.; McLaughlin, D.

    2016-12-01

    Quantitative data about water availability, crop evapotranspiration (ET), agricultural land use, and production are needed at high temporal and spatial resolutions to develop sustainable water and agricultural plans and policies. However, large-scale high-resolution measured data can be susceptible to errors, physically inconsistent, or incomplete. Reanalysis provides a way to develop improved, physically consistent estimates of both measured and hidden variables. The reanalysis approach described here uses a least-squares technique constrained by water balances and crop water requirements to assimilate many possibly redundant data sources, yielding estimates of water, land use, and food production variables that are physically consistent while minimizing differences from measured data. As an example, this methodology is applied in China, where food demand is expected to increase but land and water resources could constrain further increases in food production. Hydrologic fluxes, crop ET, agricultural land use, yields, and food production are characterized at 0.5° by 0.5° resolution for a nominal year around 2000 for 22 different crop groups. The reanalysis approach provides useful information for resource management and policy, both in China and around the world.
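
    The minimal-adjustment idea behind such constrained least-squares reanalysis can be illustrated with a single water-balance constraint: adjust measured terms as little as possible so that P - ET - R - dS = 0 holds exactly. The numbers are invented, and the study's formulation is far richer (many constraints and weighted data):

```python
import numpy as np

d = np.array([100.0, 55.0, 30.0, 10.0])  # measured P, ET, R, dS (mm)
a = np.array([1.0, -1.0, -1.0, -1.0])    # balance constraint: a @ x = 0

residual = a @ d                         # 100 - 95 = 5 mm imbalance
x = d - a * residual / (a @ a)           # minimum-norm correction (closed form)
print(x, a @ x)                          # adjusted terms; residual is now ~0
```

    This is the closed-form solution of min ||x - d||^2 subject to a @ x = 0; each term moves by the same 1.25 mm (with sign), the smallest total adjustment that restores balance.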

  16. Assessment of Bias in the National Mosaic and Multi-Sensor QPE (NMQ/Q2) Reanalysis Radar-Only Estimate

    NASA Astrophysics Data System (ADS)

    Nelson, B. R.; Prat, O. P.; Stevens, S. E.; Seo, D. J.; Zhang, J.; Howard, K.

    2014-12-01

    The processing of radar-only precipitation via the reanalysis from the National Mosaic and Multi-Sensor QPE (NMQ/Q2) based on the WSR-88D Next-generation Radar (NEXRAD) network over Continental United States (CONUS) is nearly completed for the period covering from 2001 to 2012. Reanalysis data are available at 1-km and 5-minute resolution. An important step in generating the best possible precipitation data is to assess the bias in the radar-only product. In this work, we use data from a combination of rain gauge networks to assess the bias in the NMQ reanalysis. Rain gauge networks such as the Hydrometeorological Automated Data System (HADS), the Automated Surface Observing Systems (ASOS), the Climate Reference Network (CRN), and the Global Historical Climatology Network Daily (GHCN-D) are combined for use in the assessment. These rain gauge networks vary in spatial density and temporal resolution. The challenge hence is to optimally utilize them to assess the bias at the finest resolution possible. For initial assessment, we propose to subset the CONUS data in climatologically representative domains, and perform bias assessment using information in the Q2 dataset on precipitation type and phase.
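
    A minimal sketch of the gauge-based bias assessment: compare radar-only precipitation estimates against collocated rain-gauge accumulations. The values are synthetic; the NMQ/Q2 assessment uses dense multi-network gauge data and climatological subsetting:

```python
import numpy as np

gauge = np.array([12.0, 0.0, 5.5, 20.1, 3.2, 8.8])  # gauge accumulations (mm)
radar = np.array([10.1, 0.2, 4.9, 17.5, 3.0, 7.6])  # matched radar estimates

mult_bias = radar.sum() / gauge.sum()  # < 1 means radar underestimates overall
mean_error = np.mean(radar - gauge)    # signed per-site mean error
print(f"multiplicative bias = {mult_bias:.2f}, mean error = {mean_error:.2f} mm")
```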

  17. Micro-Analyzer: automatic preprocessing of Affymetrix microarray data.

    PubMed

    Guzzi, Pietro Hiram; Cannataro, Mario

    2013-08-01

    A current trend in genomics is the investigation of the cell mechanism using different technologies, in order to explain the relationship among genes, molecular processes and diseases. For instance, the combined use of gene-expression arrays and genomic arrays has been demonstrated as an effective instrument in clinical practice. Consequently, in a single experiment different kinds of microarrays may be used, resulting in the production of different types of binary data (images and textual raw data). The analysis of microarray data requires an initial preprocessing phase that makes raw data suitable for use on existing analysis platforms, such as the TIGR M4 (TM4) Suite. An additional challenge to be faced by emerging data analysis platforms is the ability to treat in a combined way those different microarray formats coupled with clinical data. In fact, the resulting integrated data may include both numerical and symbolic data (e.g. gene expression and SNPs regarding molecular data), as well as temporal data (e.g. the response to a drug, time to progression and survival rate) regarding clinical data. Raw data preprocessing is a crucial step in analysis but is often performed in a manual and error-prone way using different software tools. Thus novel, platform-independent, and possibly open source tools enabling the semi-automatic preprocessing and annotation of different microarray data are needed. The paper presents Micro-Analyzer (Microarray Analyzer), a cross-platform tool for the automatic normalization, summarization and annotation of Affymetrix gene expression and SNP binary data. It represents the evolution of the μ-CS tool, extending the preprocessing to SNP arrays that were not allowed in μ-CS. The Micro-Analyzer is provided as a Java standalone tool and enables users to read, preprocess and analyse binary microarray data (gene expression and SNPs) by invoking the TM4 platform. It avoids: (i) the manual invocation of external tools (e.g. the Affymetrix Power Tools), (ii) the manual loading of preprocessing libraries, and (iii) the management of intermediate files, such as results and metadata. Micro-Analyzer users can directly manage Affymetrix binary data without worrying about locating and invoking the proper preprocessing tools and chip-specific libraries. Moreover, users of the Micro-Analyzer tool can load the preprocessed data directly into the well-known TM4 platform, extending in such a way also the TM4 capabilities. Consequently, Micro-Analyzer offers the following advantages: (i) it reduces possible errors in the preprocessing and further analysis phases, e.g. due to the incorrect choice of parameters or due to the use of old libraries, (ii) it enables the combined and centralized pre-processing of different arrays, (iii) it may enhance the quality of further analysis by storing the workflow, i.e. information about the preprocessing steps, and (iv) finally, Micro-Analyzer is freely available as a standalone application at the project web site http://sourceforge.net/projects/microanalyzer/. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  18. The MicroArray Quality Control (MAQC)-II study of common practices for the development and validation of microarray-based predictive models

    EPA Science Inventory

    The second phase of the MicroArray Quality Control (MAQC-II) project evaluated common practices for developing and validating microarray-based models aimed at predicting toxicological and clinical endpoints. Thirty-six teams developed classifiers for 13 endpoints - some easy, som...

  19. Flow-pattern Guided Fabrication of High-density Barcode Antibody Microarray

    PubMed Central

    Ramirez, Lisa S.; Wang, Jun

    2016-01-01

    Antibody microarray as a well-developed technology is currently challenged by a few other established or emerging high-throughput technologies. In this report, we renovate the antibody microarray technology by using a novel approach for manufacturing and by introducing new features. The fabrication of our high-density antibody microarray is accomplished through perpendicularly oriented flow-patterning of single stranded DNAs and subsequent conversion mediated by DNA-antibody conjugates. This protocol outlines the critical steps in flow-patterning DNA, producing and purifying DNA-antibody conjugates, and assessing the quality of the fabricated microarray. The uniformity and sensitivity are comparable with conventional microarrays, while our microarray fabrication does not require the assistance of an array printer and can be performed in most research laboratories. The other major advantage is that the size of our microarray units is 10 times smaller than that of printed arrays, offering the unique capability of analyzing functional proteins from single cells when interfacing with generic microchip designs. This barcode technology can be widely employed in biomarker detection, cell signaling studies, tissue engineering, and a variety of clinical applications. PMID:26780370

  20. Microarray platform for omics analysis

    NASA Astrophysics Data System (ADS)

    Mecklenburg, Michael; Xie, Bin

    2001-09-01

    Microarray technology has revolutionized genetic analysis. However, limitations in genome analysis have led to renewed interest in establishing 'omic' strategies. As we enter the post-genomic era, new microarray technologies are needed to address these new classes of 'omic' targets, such as proteins, as well as lipids and carbohydrates. We have developed a microarray platform that combines self-assembling monolayers with the biotin-streptavidin system to provide a robust, versatile immobilization scheme. A hydrophobic film is patterned on the surface, creating an array of tension wells that eliminates evaporation effects, thereby reducing the shear stress to which biomolecules are exposed during immobilization. The streptavidin linker layer makes it possible to adapt and/or develop microarray-based assays using virtually any class of biomolecules, including carbohydrates, peptides, antibodies and receptors, as well as the more traditional DNA-based arrays. Our microarray technology is designed to furnish seamless compatibility across the various 'omic' platforms by providing a common blueprint for fabricating and analyzing arrays. The prototype microarray uses a microscope slide footprint patterned with 2 by 96 flat wells. Data on the microarray platform will be presented.

  1. On-Chip Synthesis of Protein Microarrays from DNA Microarrays Via Coupled In Vitro Transcription and Translation for Surface Plasmon Resonance Imaging Biosensor Applications

    PubMed Central

    Seefeld, Ting H.; Halpern, Aaron R.; Corn, Robert M.

    2012-01-01

    Protein microarrays are fabricated from double-stranded DNA (dsDNA) microarrays by a one-step, multiplexed enzymatic synthesis in an on-chip microfluidic format and then employed for antibody biosensing measurements with surface plasmon resonance imaging (SPRI). A microarray of dsDNA elements (denoted as generator elements) that encode either a His-tagged green fluorescent protein (GFP) or a His-tagged luciferase protein is utilized to create multiple copies of messenger RNA (mRNA) in a surface RNA polymerase reaction; the mRNA transcripts are then translated into proteins by cell-free protein synthesis in a microfluidic format. The His-tagged proteins diffuse to adjacent Cu(II)-NTA microarray elements (denoted as detector elements) and are specifically adsorbed. The net result is the on-chip, cell-free synthesis of a protein microarray that can be used immediately for SPRI protein biosensing. The dual element format greatly reduces any interference from the nonspecific adsorption of enzyme or proteins. SPRI measurements for the detection of the antibodies anti-GFP and anti-luciferase were used to verify the formation of the protein microarray. This convenient on-chip protein microarray fabrication method can be implemented for multiplexed SPRI biosensing measurements in both clinical and research applications. PMID:22793370

  2. Fully Automated Complementary DNA Microarray Segmentation using a Novel Fuzzy-based Algorithm.

    PubMed

    Saberkari, Hamidreza; Bahrami, Sheyda; Shamsi, Mousa; Amoshahy, Mohammad Javad; Ghavifekr, Habib Badri; Sedaaghi, Mohammad Hossein

    2015-01-01

    DNA microarray is a powerful approach to study simultaneously the expression of thousands of genes in a single experiment. The average value of the fluorescent intensity can be calculated in a microarray experiment. The calculated intensity values are very close in amount to the levels of expression of a particular gene. However, determining the appropriate position of every spot in microarray images is a main challenge, which leads to the accurate classification of normal and abnormal (cancer) cells. In this paper, first a preprocessing approach is performed to eliminate the noise and artifacts present in microarray cells using the nonlinear anisotropic diffusion filtering method. Then, the coordinate center of each spot is positioned utilizing mathematical morphology operations. Finally, the position of each spot is exactly determined through applying a novel hybrid model based on principal component analysis and the spatial fuzzy c-means clustering (SFCM) algorithm. Using a Gaussian kernel in the SFCM algorithm leads to improved quality in complementary DNA microarray segmentation. The performance of the proposed algorithm has been evaluated on real microarray images available in the Stanford Microarray Database. Results illustrate that the accuracy of microarray cell segmentation with the proposed algorithm reaches 100% and 98% for noiseless and noisy cells, respectively.
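
    The clustering step named above can be illustrated with a plain fuzzy c-means (FCM) sketch on 1-D pixel intensities, separating spot foreground from background. This is only the core of the paper's method, with no PCA step, spatial term, or Gaussian kernel:

```python
import numpy as np

def fcm(data, c=2, m=2.0, iters=100, seed=0):
    """Basic fuzzy c-means on a 1-D array; returns centers and memberships."""
    rng = np.random.default_rng(seed)
    u = rng.random((c, data.size))
    u /= u.sum(axis=0)                        # memberships sum to 1 per pixel
    for _ in range(iters):
        um = u ** m
        centers = um @ data / um.sum(axis=1)  # fuzzily weighted cluster means
        dist = np.abs(data[None, :] - centers[:, None]) + 1e-12
        u = 1.0 / (dist ** (2 / (m - 1)))     # standard FCM membership update
        u /= u.sum(axis=0)
    return centers, u

# synthetic spot image flattened to 1-D: dim background plus bright spot pixels
pixels = np.concatenate([np.full(200, 10.0), np.full(50, 200.0)])
pixels += np.random.default_rng(1).normal(0, 3, pixels.size)
centers, u = fcm(pixels)
print(np.sort(centers))  # approx background (~10) and spot (~200) intensities
```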

  3. Development and application of a fluorescence protein microarray for detecting serum alpha-fetoprotein in patients with hepatocellular carcinoma.

    PubMed

    Zhang, Aiying; Yin, Chengzeng; Wang, Zhenshun; Zhang, Yonghong; Zhao, Yuanshun; Li, Ang; Sun, Huanqin; Lin, Dongdong; Li, Ning

    2016-12-01

    Objective To develop a simple, effective, time-saving and low-cost fluorescence protein microarray method for detecting serum alpha-fetoprotein (AFP) in patients with hepatocellular carcinoma (HCC). Method Non-contact piezoelectric print techniques were applied to fluorescence protein microarray to reduce the cost of prey antibody. Serum samples from patients with HCC and healthy control subjects were collected and evaluated for the presence of AFP using a novel fluorescence protein microarray. To validate the fluorescence protein microarray, serum samples were tested for AFP using an enzyme-linked immunosorbent assay (ELISA). Results A total of 110 serum samples from patients with HCC (n = 65) and healthy control subjects (n = 45) were analysed. When the AFP cut-off value was set at 20 ng/ml, the fluorescence protein microarray had a sensitivity of 91.67% and a specificity of 93.24% for detecting serum AFP. Serum AFP quantified via fluorescence protein microarray had a similar diagnostic performance compared with ELISA in distinguishing patients with HCC from healthy control subjects (area under receiver operating characteristic curve: 0.906 for fluorescence protein microarray; 0.880 for ELISA). Conclusion A fluorescence protein microarray method was developed for detecting serum AFP in patients with HCC.
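
    The cut-off evaluation reported above can be sketched as follows: sensitivity and specificity of a 20 ng/ml AFP threshold. The AFP values and labels here are synthetic, not the study's data:

```python
import numpy as np

afp = np.array([150.0, 3.0, 45.0, 18.0, 220.0, 9.0, 30.0, 12.0])  # ng/ml
is_hcc = np.array([1, 0, 1, 1, 1, 0, 1, 0], dtype=bool)           # true labels

positive = afp >= 20.0  # test positive at the 20 ng/ml cut-off
sensitivity = (positive & is_hcc).sum() / is_hcc.sum()       # TP / (TP + FN)
specificity = (~positive & ~is_hcc).sum() / (~is_hcc).sum()  # TN / (TN + FP)
print(f"sensitivity = {sensitivity:.2f}, specificity = {specificity:.2f}")
```

    Sweeping the cut-off and plotting sensitivity against 1 - specificity yields the ROC curve whose area the abstract reports (0.906 for the microarray).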

  4. Development and application of a fluorescence protein microarray for detecting serum alpha-fetoprotein in patients with hepatocellular carcinoma

    PubMed Central

    Zhang, Aiying; Yin, Chengzeng; Wang, Zhenshun; Zhang, Yonghong; Zhao, Yuanshun; Li, Ang; Sun, Huanqin; Lin, Dongdong

    2016-01-01

    Objective To develop a simple, effective, time-saving and low-cost fluorescence protein microarray method for detecting serum alpha-fetoprotein (AFP) in patients with hepatocellular carcinoma (HCC). Method Non-contact piezoelectric print techniques were applied to fluorescence protein microarray to reduce the cost of prey antibody. Serum samples from patients with HCC and healthy control subjects were collected and evaluated for the presence of AFP using a novel fluorescence protein microarray. To validate the fluorescence protein microarray, serum samples were tested for AFP using an enzyme-linked immunosorbent assay (ELISA). Results A total of 110 serum samples from patients with HCC (n = 65) and healthy control subjects (n = 45) were analysed. When the AFP cut-off value was set at 20 ng/ml, the fluorescence protein microarray had a sensitivity of 91.67% and a specificity of 93.24% for detecting serum AFP. Serum AFP quantified via fluorescence protein microarray had a similar diagnostic performance compared with ELISA in distinguishing patients with HCC from healthy control subjects (area under receiver operating characteristic curve: 0.906 for fluorescence protein microarray; 0.880 for ELISA). Conclusion A fluorescence protein microarray method was developed for detecting serum AFP in patients with HCC. PMID:27885040

  5. Genotyping microarray: Mutation screening in Spanish families with autosomal dominant retinitis pigmentosa

    PubMed Central

    García-Hoyos, María; Cortón, Marta; Ávila-Fernández, Almudena; Riveiro-Álvarez, Rosa; Giménez, Ascensión; Hernan, Inma; Carballo, Miguel; Ayuso, Carmen

    2012-01-01

    Purpose Presently, 22 genes have been described in association with autosomal dominant retinitis pigmentosa (adRP); however, they explain only 50% of all cases, making genetic diagnosis of this disease difficult and costly. The aim of this study was to evaluate a specific genotyping microarray for its application to the molecular diagnosis of adRP in Spanish patients. Methods We analyzed 139 unrelated Spanish families with adRP. Samples were studied by using a genotyping microarray (adRP). All mutations found were further confirmed with automatic sequencing. Rhodopsin (RHO) sequencing was performed in all samples that were negative on the genotyping microarray. Results The adRP genotyping microarray detected the mutation associated with the disease in 20 of the 139 families with adRP. As in other populations, RHO was found to be the most frequently mutated gene in these families (7.9% of the microarray genotyped families). The rates of false positives (microarray results not confirmed with sequencing) and false negatives (mutations in RHO detected with sequencing but not with the genotyping microarray) were established, and high levels of analytical sensitivity (95%) and specificity (100%) were found. Diagnostic accuracy was 15.1%. Conclusions The adRP genotyping microarray is a quick, cost-efficient first step in the molecular diagnosis of Spanish patients with adRP. PMID:22736939

  6. Ensemble-Based Assimilation of Aerosol Observations in GEOS-5

    NASA Technical Reports Server (NTRS)

    Buchard, V.; Da Silva, A.

    2016-01-01

    MERRA-2 is the latest aerosol reanalysis produced at NASA's Global Modeling and Assimilation Office (GMAO), covering 1979 to the present. This reanalysis is based on a version of the GEOS-5 model radiatively coupled to GOCART aerosols and includes assimilation of bias-corrected Aerosol Optical Depth (AOD) from AVHRR over ocean, the MODIS sensors on both the Terra and Aqua satellites, MISR over bright surfaces and AERONET data. In order to assimilate lidar profiles of aerosols, we are updating the aerosol component of our assimilation system to an Ensemble Kalman Filter (EnKF) type of scheme using ensembles generated routinely by the meteorological assimilation. Following the work performed with NASA's first aerosol reanalysis (MERRAero), we first validate the vertical structure of MERRA-2 aerosol assimilated fields using CALIOP data over regions of particular interest during 2008.

  7. Shape reanalysis and sensitivities utilizing preconditioned iterative boundary solvers

    NASA Technical Reports Server (NTRS)

    Guru Prasad, K.; Kane, J. H.

    1992-01-01

    The computational advantages associated with the utilization of preconditioned iterative equation solvers are quantified for the reanalysis of perturbed shapes using continuum structural boundary element analysis (BEA). Both single- and multi-zone three-dimensional problems are examined. Significant reductions in computer time are obtained by making use of previously computed solution vectors and preconditioners in subsequent analyses. The effectiveness of this technique is demonstrated for the computation of shape response sensitivities required in shape optimization. Computer times and accuracies achieved using the preconditioned iterative solvers are compared with those obtained via direct solvers and implicit differentiation of the boundary integral equations. It is concluded that this approach, employing preconditioned iterative equation solvers in reanalysis and sensitivity analysis, can be competitive with, if not superior to, approaches involving direct solvers.
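    The reuse strategy can be sketched generically with preconditioned conjugate gradients: build the preconditioner once for the baseline system, then reuse it together with the previous solution as a warm start for a perturbed system. This is a generic dense-matrix illustration, not the paper's boundary element formulation:

```python
import numpy as np

def pcg(A, b, M_inv, x0=None, tol=1e-10, max_iter=200):
    """Preconditioned conjugate gradients for SPD A; M_inv approximates inv(A)."""
    x = np.zeros_like(b) if x0 is None else x0.copy()
    r = b - A @ x
    z = M_inv @ r
    p = z.copy()
    for _ in range(max_iter):
        if np.linalg.norm(r) < tol:
            break
        Ap = A @ p
        alpha = (r @ z) / (p @ Ap)
        x += alpha * p
        r_new = r - alpha * Ap
        z_new = M_inv @ r_new
        beta = (r_new @ z_new) / (r @ z)
        p = z_new + beta * p
        r, z = r_new, z_new
    return x

# Baseline SPD system and a small "shape perturbation" of it
rng = np.random.default_rng(1)
B = rng.random((20, 20))
A = B @ B.T + 20 * np.eye(20)
b = rng.random(20)
M_inv = np.diag(1.0 / np.diag(A))          # Jacobi preconditioner, built once
x_base = pcg(A, b, M_inv)
A_pert = A + 0.01 * np.eye(20)             # perturbed system
x_pert = pcg(A_pert, b, M_inv, x0=x_base)  # reuse preconditioner + warm start
```

    Because the perturbed operator stays close to the baseline, the stale preconditioner and the warm start both remain effective, which is the source of the time savings the abstract reports.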

  8. MicroArray Facility: a laboratory information management system with extended support for Nylon based technologies.

    PubMed

    Honoré, Paul; Granjeaud, Samuel; Tagett, Rebecca; Deraco, Stéphane; Beaudoing, Emmanuel; Rougemont, Jacques; Debono, Stéphane; Hingamp, Pascal

    2006-09-20

    High throughput gene expression profiling (GEP) is becoming a routine technique in life science laboratories. With experimental designs that repeatedly span thousands of genes and hundreds of samples, relying on a dedicated database infrastructure is no longer an option. GEP technology is a fast moving target, with new approaches constantly broadening the field diversity. This technology heterogeneity, compounded by the informatics complexity of GEP databases, means that software developments have so far focused on mainstream techniques, leaving less typical yet established techniques such as Nylon microarrays at best partially supported. MAF (MicroArray Facility) is the laboratory database system we have developed for managing the design, production and hybridization of spotted microarrays. Although it can support the widely used glass microarrays and oligo-chips, MAF was designed with the specific idiosyncrasies of Nylon based microarrays in mind. Notably, single-channel radioactive probes, microarray stripping and reuse, vector control hybridizations and spike-in controls are all natively supported by the software suite. MicroArray Facility is MIAME supportive and dynamically provides feedback on missing annotations to help users estimate effective MIAME compliance. Genomic data such as clone identifiers and gene symbols are also directly annotated by MAF software using standard public resources. The MAGE-ML data format is implemented for full data export. Journalized database operations (audit tracking), data anonymization, material traceability and user/project level confidentiality policies are also managed by MAF. MicroArray Facility is a complete data management system for microarray producers and end-users. Particular care has been devoted to adequately model Nylon based microarrays. The MAF system, developed and implemented in both private and academic environments, has proved a robust solution for shared facilities and industry service providers alike.

  9. MicroArray Facility: a laboratory information management system with extended support for Nylon based technologies

    PubMed Central

    Honoré, Paul; Granjeaud, Samuel; Tagett, Rebecca; Deraco, Stéphane; Beaudoing, Emmanuel; Rougemont, Jacques; Debono, Stéphane; Hingamp, Pascal

    2006-01-01

    Background High throughput gene expression profiling (GEP) is becoming a routine technique in life science laboratories. With experimental designs that repeatedly span thousands of genes and hundreds of samples, relying on a dedicated database infrastructure is no longer an option. GEP technology is a fast moving target, with new approaches constantly broadening the field diversity. This technology heterogeneity, compounded by the informatics complexity of GEP databases, means that software developments have so far focused on mainstream techniques, leaving less typical yet established techniques such as Nylon microarrays at best partially supported. Results MAF (MicroArray Facility) is the laboratory database system we have developed for managing the design, production and hybridization of spotted microarrays. Although it can support the widely used glass microarrays and oligo-chips, MAF was designed with the specific idiosyncrasies of Nylon based microarrays in mind. Notably single channel radioactive probes, microarray stripping and reuse, vector control hybridizations and spike-in controls are all natively supported by the software suite. MicroArray Facility is MIAME supportive and dynamically provides feedback on missing annotations to help users estimate effective MIAME compliance. Genomic data such as clone identifiers and gene symbols are also directly annotated by MAF software using standard public resources. The MAGE-ML data format is implemented for full data export. Journalized database operations (audit tracking), data anonymization, material traceability and user/project level confidentiality policies are also managed by MAF. Conclusion MicroArray Facility is a complete data management system for microarray producers and end-users. Particular care has been devoted to adequately model Nylon based microarrays. 
The MAF system, developed and implemented in both private and academic environments, has proved a robust solution for shared facilities and industry service providers alike. PMID:16987406

  10. In-depth investigation of archival and prospectively collected samples reveals no evidence for XMRV infection in prostate cancer.

    PubMed

    Lee, Deanna; Das Gupta, Jaydip; Gaughan, Christina; Steffen, Imke; Tang, Ning; Luk, Ka-Cheung; Qiu, Xiaoxing; Urisman, Anatoly; Fischer, Nicole; Molinaro, Ross; Broz, Miranda; Schochetman, Gerald; Klein, Eric A; Ganem, Don; Derisi, Joseph L; Simmons, Graham; Hackett, John; Silverman, Robert H; Chiu, Charles Y

    2012-01-01

    XMRV, or xenotropic murine leukemia virus (MLV)-related virus, is a novel gammaretrovirus originally identified in studies that analyzed tissue from prostate cancer patients in 2006 and blood from patients with chronic fatigue syndrome (CFS) in 2009. However, a large number of subsequent studies failed to confirm a link between XMRV infection and CFS or prostate cancer. On the contrary, recent evidence indicates that XMRV is a contaminant originating from the recombination of two mouse endogenous retroviruses during passaging of a prostate tumor xenograft (CWR22) in mice, generating laboratory-derived cell lines that are XMRV-infected. To confirm or refute an association between XMRV and prostate cancer, we analyzed prostate cancer tissues and plasma from a prospectively collected cohort of 39 patients as well as archival RNA and prostate tissue from the original 2006 study. Despite comprehensive microarray, PCR, FISH, and serological testing, XMRV was not detected in any of the newly collected samples or in archival tissue, although archival RNA remained XMRV-positive. Notably, archival VP62 prostate tissue, from which the prototype XMRV strain was derived, tested negative for XMRV on re-analysis. Analysis of viral genomic and human mitochondrial sequences revealed that all previously characterized XMRV strains are identical and that the archival RNA had been contaminated by an XMRV-infected laboratory cell line. These findings reveal no association between XMRV and prostate cancer, and underscore the conclusion that XMRV is not a naturally acquired human infection.

  11. Combined image and genomic analysis of high-grade serous ovarian cancer reveals PTEN loss as a common driver event and prognostic classifier.

    PubMed

    Martins, Filipe C; Santiago, Ines de; Trinh, Anne; Xian, Jian; Guo, Anne; Sayal, Karen; Jimenez-Linan, Mercedes; Deen, Suha; Driver, Kristy; Mack, Marie; Aslop, Jennifer; Pharoah, Paul D; Markowetz, Florian; Brenton, James D

    2014-12-17

    TP53 and BRCA1/2 mutations are the main drivers in high-grade serous ovarian carcinoma (HGSOC). We hypothesise that combining tissue phenotypes from image analysis of tumour sections with genomic profiles could reveal other significant driver events. Automatic estimates of stromal content combined with genomic analysis of TCGA HGSOC tumours show that stroma strongly biases estimates of PTEN expression. Tumour-specific PTEN expression was tested in two independent cohorts using tissue microarrays containing 521 cases of HGSOC. PTEN loss or downregulation occurred in 77% of the first cohort by immunofluorescence and 52% of the validation group by immunohistochemistry, and is associated with worse survival in a multivariate Cox-regression model adjusted for study site, age, stage and grade. Reanalysis of TCGA data shows that hemizygous loss of PTEN is common (36%) and expression of PTEN and expression of androgen receptor are positively associated. Low androgen receptor expression was associated with reduced survival in data from TCGA and immunohistochemical analysis of the first cohort. PTEN loss is a common event in HGSOC and defines a subgroup with significantly worse prognosis, suggesting the rational use of drugs to target PI3K and androgen receptor pathways for HGSOC. This work shows that integrative approaches combining tissue phenotypes from images with genomic analysis can resolve confounding effects of tissue heterogeneity and should be used to identify new drivers in other cancers.

  12. The tissue microarray data exchange specification: Extending TMA DES to provide flexible scoring and incorporate virtual slides

    PubMed Central

    Wright, Alexander; Lyttleton, Oliver; Lewis, Paul; Quirke, Philip; Treanor, Darren

    2011-01-01

    Background: Tissue MicroArrays (TMAs) are a high throughput technology for rapid analysis of protein expression across hundreds of patient samples. Often, data relating to TMAs is specific to the clinical trial or experiment it is being used for, and not interoperable. The Tissue Microarray Data Exchange Specification (TMA DES) is a set of eXtensible Markup Language (XML)-based protocols for storing and sharing digitized Tissue Microarray data. XML data are enclosed by named tags which serve as identifiers. These tag names can be Common Data Elements (CDEs), which have a predefined meaning or semantics. By using this specification in a laboratory setting with increasing demands for digital pathology integration, we found that the data structure lacked the ability to cope with digital slide imaging in respect to web-enabled digital pathology systems and advanced scoring techniques. Materials and Methods: By employing user centric design, and observing behavior in relation to TMA scoring and associated data, the TMA DES format was extended to accommodate the current limitations. This was done with specific focus on developing a generic tool for handling any given scoring system, and utilizing data for multiple observations and observers. Results: DTDs were created to validate the extensions of the TMA DES protocol, and a test set of data containing scores for 6,708 TMA core images was generated. The XML was then read into an image processing algorithm to utilize the digital pathology data extensions, and scoring results were easily stored alongside the existing multiple pathologist scores. Conclusions: By extending the TMA DES format to include digital pathology data and customizable scoring systems for TMAs, the new system facilitates the collaboration between pathologists and organizations, and can be used in automatic or manual data analysis. This allows complying systems to effectively communicate complex and varied scoring data. PMID:21572508

  13. Bayesian inference with historical data-based informative priors improves detection of differentially expressed genes.

    PubMed

    Li, Ben; Sun, Zhaonan; He, Qing; Zhu, Yu; Qin, Zhaohui S

    2016-03-01

    Modern high-throughput biotechnologies such as microarrays are capable of producing a massive amount of information for each sample. However, in a typical high-throughput experiment, only a limited number of samples are assayed, leading to the classical 'large p, small n' problem. On the other hand, rapid propagation of these high-throughput technologies has resulted in a substantial collection of data, often carried out on the same platform and using the same protocol. It is highly desirable to utilize the existing data when performing analysis and inference on a new dataset. Utilizing existing data can be carried out in a straightforward fashion under the Bayesian framework, in which the repository of historical data can be exploited to build informative priors for use in new data analysis. In this work, using microarray data, we investigate the feasibility and effectiveness of deriving informative priors from historical data and using them in the problem of detecting differentially expressed genes. Through simulation and real data analysis, we show that the proposed strategy significantly outperforms existing methods, including the popular state-of-the-art Bayesian hierarchical model-based approaches. Our work illustrates the feasibility and benefits of exploiting the increasingly available genomics big data in statistical inference and presents a promising practical strategy for dealing with the 'large p, small n' problem. Our method is implemented in the R package IPBT, which is freely available from https://github.com/benliemory/IPBT. Supplementary data are available at Bioinformatics online.
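    The underlying idea, shrinking estimates from a small new experiment toward a prior fitted from historical data, can be sketched with a normal-normal conjugate update. This is an illustrative simplification with made-up numbers, not the IPBT model itself:

```python
import numpy as np

def posterior_mean(prior_mu, prior_var, obs, obs_var):
    """Normal-normal conjugate update: precision-weighted average of
    the historical prior mean and the new sample mean."""
    w_prior = 1.0 / prior_var            # precision of the informative prior
    w_obs = len(obs) / obs_var           # precision of the new-data mean
    return (w_prior * prior_mu + w_obs * np.mean(obs)) / (w_prior + w_obs)

# Historical datasets supply an informative prior for one gene
historical = np.array([5.1, 4.9, 5.0, 5.2, 4.8])   # past log-expression values
prior_mu, prior_var = historical.mean(), historical.var(ddof=1)

# A new, small experiment ("large p, small n"): only 2 replicates
new_obs = np.array([6.0, 5.8])
post = posterior_mean(prior_mu, prior_var, new_obs, obs_var=0.04)
```

    With only two replicates the raw sample mean is noisy; the posterior mean sits between the historical prior and the new data, which is what stabilizes differential-expression calls in the 'large p, small n' setting.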

  14. Microbial forensics: fiber optic microarray subtyping of Bacillus anthracis

    NASA Astrophysics Data System (ADS)

    Shepard, Jason R. E.

    2009-05-01

    The past decade has seen increased development and subsequent adoption of rapid molecular techniques involving DNA analysis for detection of pathogenic microorganisms, also termed microbial forensics. The continued accumulation of microbial sequence information in genomic databases now better positions the field of high-throughput DNA analysis to proceed in a more manageable fashion. As technology continues to develop, there is great potential to build on these databases, enabling more rapid, cost-effective analyses. This wealth of genetic information, along with new technologies, has the potential to better address some of the current problems and solve the key issues involved in DNA analysis of pathogenic microorganisms. To this end, a high-density fiber-optic microarray has been employed, housing numerous DNA sequences simultaneously for detection of various pathogenic microorganisms, including Bacillus anthracis, among others. Each organism is analyzed with multiple sequences and can be sub-typed against other closely related organisms. For public health labs, real-time PCR methods have been developed as an initial preliminary screen, but culture and growth are still considered the gold standard. Technologies employing higher throughput than these standard methods are better suited to capitalize on the potential offered by the sequence information. Microarray analyses are one such format positioned to exploit this potential, and our array platform is reusable, allowing repetitive tests on a single array, providing an increase in throughput and a decrease in cost, along with certainty of detection down to the individual strain level.

  15. Genomic resources for songbird research and their use in characterizing gene expression during brain development

    PubMed Central

    Li, XiaoChing; Wang, Xiu-Jie; Tannenhauser, Jonathan; Podell, Sheila; Mukherjee, Piali; Hertel, Moritz; Biane, Jeremy; Masuda, Shoko; Nottebohm, Fernando; Gaasterland, Terry

    2007-01-01

    Vocal learning and neuronal replacement have been studied extensively in songbirds, but until recently, few molecular and genomic tools for songbird research existed. Here we describe new molecular/genomic resources developed in our laboratory. We made cDNA libraries from zebra finch (Taeniopygia guttata) brains at different developmental stages. A total of 11,000 cDNA clones from these libraries, representing 5,866 unique gene transcripts, were randomly picked and sequenced from the 3′ ends. A web-based database was established for clone tracking, sequence analysis, and functional annotations. Our cDNA libraries were not normalized. Sequencing ESTs without normalization produced many developmental stage-specific sequences, yielding insights into patterns of gene expression at different stages of brain development. In particular, the cDNA library made from brains at posthatching day 30–50, corresponding to the period of rapid song system development and song learning, has the most diverse and richest set of genes expressed. We also identified five microRNAs whose sequences are highly conserved between zebra finch and other species. We printed cDNA microarrays and profiled gene expression in the high vocal center of both adult male zebra finches and canaries (Serinus canaria). Genes differentially expressed in the high vocal center were identified from the microarray hybridization results. Selected genes were validated by in situ hybridization. Networks among the regulated genes were also identified. These resources provide songbird biologists with tools for genome annotation, comparative genomics, and microarray gene expression analysis. PMID:17426146

  16. Comparison of Two Global Ocean Reanalyses, NRL Global Ocean Forecast System (GOFS) and U. Maryland Simple Ocean Data Assimilation (SODA)

    NASA Astrophysics Data System (ADS)

    Richman, J. G.; Shriver, J. F.; Metzger, E. J.; Hogan, P. J.; Smedstad, O. M.

    2017-12-01

    The Oceanography Division of the Naval Research Laboratory recently completed a 23-year (1993-2015) coupled ocean-sea ice reanalysis forced by NCEP CFS reanalysis fluxes. The reanalysis uses the Global Ocean Forecast System (GOFS) framework of the HYbrid Coordinate Ocean Model (HYCOM) and the Los Alamos Community Ice CodE (CICE) and the Navy Coupled Ocean Data Assimilation 3D Var system (NCODA). The ocean model has 41 layers and an equatorial resolution of 0.08° (8.8 km) on a tri-polar grid, with the sea ice model on the same grid, which reduces to 3.5 km at the North Pole. Sea surface temperature (SST), sea surface height (SSH) and temperature-salinity profile data are assimilated into the ocean every day. The SSH anomalies are converted into synthetic profiles of temperature and salinity prior to assimilation. Incremental analysis updating of geostrophically balanced increments is performed over a 6-hour insertion window. Sea ice concentration is assimilated into the sea ice model every day. Following the lead of the Ocean Reanalysis Intercomparison Project (ORA-IP), the monthly mean upper ocean heat and salt content from the surface to 300 m, 700 m and 1500 m, the mixed layer depth, the depth of the 20°C isotherm, the steric sea surface height and the Atlantic Meridional Overturning Circulation for the GOFS reanalysis and the Simple Ocean Data Assimilation (SODA 3.3.1) eddy-permitting reanalysis have been compared on a global uniform 0.5° grid. The differences between the two ocean reanalyses in heat and salt content increase with increasing integration depth. Globally, GOFS tends to be colder than SODA at all depths. Warming trends are observed at all depths over the 23-year period. The correlation of the upper ocean heat content is significant above 700 m. Prior to 2004, differences in the data assimilated led to larger biases. The GOFS reanalysis assimilates SSH as profile data, while SODA does not. 
Large differences are found in the Western Boundary Currents, Southern Ocean and equatorial regions. In the Indian Ocean, the Equatorial Counter Current extends too far to the east and the subsurface flow in the thermocline is too weak in GOFS. The 20°C isotherm is biased 2 m shallow in SODA compared with GOFS, but the monthly anomalies in its depth are highly correlated.
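    An EnKF analysis step of the kind used in such assimilation systems can be sketched in a few lines. This is a textbook stochastic EnKF update for a scalar observation, purely illustrative of the scheme class rather than any specific reanalysis implementation:

```python
import numpy as np

def enkf_update(X, y, H, r, rng):
    """Stochastic EnKF step. X is an (n_state, n_ens) forecast ensemble,
    y a scalar observation, H an (n_state,) linear observation operator,
    r the observation error variance."""
    n = X.shape[1]
    Xm = X - X.mean(axis=1, keepdims=True)       # ensemble anomalies
    HX = H @ X                                   # predicted observations (n_ens,)
    HXm = HX - HX.mean()
    P_HT = Xm @ HXm / (n - 1)                    # state-obs cross-covariance
    S = HXm @ HXm / (n - 1) + r                  # innovation variance
    K = P_HT / S                                 # Kalman gain (n_state,)
    y_pert = y + rng.normal(0.0, np.sqrt(r), n)  # perturbed observations
    return X + np.outer(K, y_pert - HX)          # analysis ensemble

rng = np.random.default_rng(0)
X = rng.normal(1.0, 0.5, size=(3, 200))   # 200-member ensemble, 3 state variables
y = 2.0                                   # accurate observation of state variable 0
H = np.array([1.0, 0.0, 0.0])
Xa = enkf_update(X, y, H, 0.01, rng)
```

    After the update the observed variable's ensemble mean is pulled strongly toward the observation, while unobserved, uncorrelated variables are left nearly unchanged; in a real system the ensemble covariances spread the increment to correlated fields instead.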

  17. Reanalysis of Tyrannosaurus rex Mass Spectra.

    PubMed

    Bern, Marshall; Phinney, Brett S; Goldberg, David

    2009-09-01

    Asara et al. reported the detection of collagen peptides in a 68-million-year-old Tyrannosaurus rex bone by shotgun proteomics. This finding has been called into question as a possible statistical artifact. We reanalyze Asara et al.'s tandem mass spectra using a different search engine and different statistical tools. Our reanalysis shows a sample containing common laboratory contaminants, soil bacteria, and bird-like hemoglobin and collagen.

  18. More from the Water Jars: A Reanalysis of Problem-Solving Performance among Gifted and Nongifted Children.

    ERIC Educational Resources Information Center

    Shore, Bruce M.; And Others

    1994-01-01

    Reanalysis of the data from a 1984 study on making and breaking problem-solving mental sets with 50 children found that gifted subjects who failed to initially form the set made the most errors of any group and were least likely to recognize their own errors. Results suggest that motivational reasons may underlie this inferior performance by some…

  19. Processes of Change in Acceptance and Commitment Therapy and Cognitive Therapy for Depression: A Mediation Reanalysis of Zettle and Rains

    ERIC Educational Resources Information Center

    Zettle, Robert D.; Rains, Jeanetta C.; Hayes, Steven C.

    2011-01-01

    Several articles have recently questioned the distinction between acceptance and commitment therapy (ACT) and traditional cognitive therapy (CT). This study presents a reanalysis of data from Zettle and Rains that compared 12 weeks of group CT with group ACT. For theoretical reasons, Zettle and Rains also included a modified form of CT that did…

  20. Comparison of present global reanalysis datasets in the context of a statistical downscaling method for precipitation prediction

    NASA Astrophysics Data System (ADS)

    Horton, Pascal; Weingartner, Rolf; Brönnimann, Stefan

    2017-04-01

    The analogue method is a statistical downscaling method for precipitation prediction. It uses similarity in terms of synoptic-scale predictors with situations in the past in order to provide a probabilistic prediction for the day of interest. It has been used for decades in the context of weather or flood forecasting, and is more recently also applied to climate studies, whether for reconstruction of past weather conditions or for future climate impact studies. In order to evaluate the relationship between synoptic-scale predictors and the local weather variable of interest, e.g. precipitation, reanalysis datasets are necessary. Nowadays, the number of available reanalysis datasets is increasing. These are generated by different atmospheric models with different assimilation techniques and offer various spatial and temporal resolutions. A major difference between these datasets is also the length of the archive they provide. While some datasets start at the beginning of the satellite era (1980) and assimilate these data, others aim at homogeneity over a longer period (e.g. the 20th century) and only assimilate conventional observations. The context of the application of analogue methods might drive the choice of an appropriate dataset, for example when the archive length is a leading criterion. However, in many studies, a reanalysis dataset is subjectively chosen, according to the user's preferences or the ease of access. The impact of this choice on the results of the downscaling procedure is rarely considered, and no comprehensive comparison has been undertaken so far. In order to fill this gap and to advise on the choice of appropriate datasets, nine different global reanalysis datasets were compared in seven distinct versions of analogue methods, over 300 precipitation stations in Switzerland. Significant differences in terms of prediction performance were identified. 
Although the impact of the reanalysis dataset on the skill score varies according to the chosen predictor, be it atmospheric circulation or thermodynamic variables, some hierarchy between the datasets is often preserved. This work can thus help in choosing an appropriate dataset for the analogue method, and raises awareness of the consequences of using a particular dataset.
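    At its core, the analogue method ranks archived days by the similarity of their synoptic predictors and takes the precipitation observed on the closest days as an empirical predictive distribution. A minimal sketch with made-up predictor values (real applications use gridded reanalysis fields and more elaborate similarity criteria):

```python
import numpy as np

def analogue_forecast(target, past_predictors, past_precip, k=3):
    """Return precipitation values of the k past days most similar to target."""
    dists = np.linalg.norm(past_predictors - target, axis=1)  # predictor-space distance
    idx = np.argsort(dists)[:k]                               # k best analogues
    return past_precip[idx]           # empirical predictive distribution

# Toy archive: a 2-variable predictor (e.g. circulation indices) per past day
past_pred = np.array([[0.0, 0.0], [0.1, 0.1], [5.0, 5.0], [5.1, 4.9], [0.2, 0.0]])
past_pr   = np.array([1.0, 2.0, 20.0, 25.0, 1.5])   # observed precipitation (mm)
target    = np.array([0.05, 0.05])                  # predictors for the target day
analogues = analogue_forecast(target, past_pred, past_pr)
```

    Because the method only ever resamples observed precipitation from the archive, the quality of the predictor fields, and hence of the chosen reanalysis, directly controls which days are selected as analogues.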

  1. Reanalysis comparisons of upper tropospheric-lower stratospheric jets and multiple tropopauses

    NASA Astrophysics Data System (ADS)

    Manney, Gloria L.; Hegglin, Michaela I.; Lawrence, Zachary D.; Wargan, Krzysztof; Millán, Luis F.; Schwartz, Michael J.; Santee, Michelle L.; Lambert, Alyn; Pawson, Steven; Knosp, Brian W.; Fuller, Ryan A.; Daffer, William H.

    2017-09-01

    The representation of upper tropospheric-lower stratospheric (UTLS) jet and tropopause characteristics is compared in five modern high-resolution reanalyses for 1980 through 2014. Climatologies of upper tropospheric jet, subvortex jet (the lowermost part of the stratospheric vortex), and multiple tropopause frequency distributions in MERRA (Modern-Era Retrospective analysis for Research and Applications), ERA-I (ERA-Interim; the European Centre for Medium-Range Weather Forecasts, ECMWF, interim reanalysis), JRA-55 (the Japanese 55-year Reanalysis), and CFSR (the Climate Forecast System Reanalysis) are compared with those in MERRA-2. Differences between alternate products from individual reanalysis systems are assessed; in particular, a comparison of CFSR data on model and pressure levels highlights the importance of vertical grid spacing. Most of the differences in distributions of UTLS jets and multiple tropopauses are consistent with the differences in assimilation model grids and resolution - for example, ERA-I (with the coarsest native horizontal resolution) typically shows a significant low bias in upper tropospheric jets with respect to MERRA-2, and JRA-55 a more modest one, while CFSR (with the finest native horizontal resolution) shows a high bias with respect to MERRA-2 in both upper tropospheric jets and multiple tropopauses. Vertical temperature structure and grid spacing are especially important for multiple tropopause characterizations. Substantial differences between MERRA and MERRA-2 are seen in mid- to high-latitude Southern Hemisphere (SH) winter upper tropospheric jets and multiple tropopauses as well as in the upper tropospheric jets associated with tropical circulations during the solstice seasons; some of the largest differences from the other reanalyses are seen at the same times and places. 
Very good qualitative agreement among the reanalyses is seen between the large-scale climatological features in UTLS jet and multiple tropopause distributions. Quantitative differences may, however, have important consequences for transport and variability studies. Our results highlight the importance of considering reanalyses differences in UTLS studies, especially in relation to resolution and model grids; this is particularly critical when using high-resolution reanalyses as an observational reference for evaluating global chemistry-climate models.

  2. Comparative analysis of atmosphere temperature variability for Northern Eurasia based on the Reanalysis and in-situ observed data

    NASA Astrophysics Data System (ADS)

    Shulgina, T.; Genina, E.; Gordov, E.; Nikitchuk, K.

    2009-04-01

    At present, numerous data archives containing meteorological observations as well as climate model output are available to Earth science specialists. Methods of mathematical statistics are widely used for their processing and analysis, and in many cases represent the only way to quantitatively assess meteorological and climatic information. A unified set of analysis methods allows us to compare climatic characteristics calculated from different datasets, enabling a more detailed analysis of climate dynamics at both regional and global levels. The report presents the results of a comparative analysis of atmospheric temperature behavior over Northern Eurasia for the period 1979 to 2004, based on the NCEP/NCAR Reanalysis, NCEP/DOE Reanalysis AMIP-II, JMA/CRIEPI JRA-25 Reanalysis, and ECMWF ERA-40 Reanalysis data, together with observations from meteorological stations of the former Soviet Union. Statistical processing of the temperature data included homogeneity analysis of time series of WMO-approved climate indices, such as "Number of frost days", "Number of summer days", "Number of icing days" and "Number of tropical nights", by means of parametric statistical methods (Fisher and Student tests). This allowed a comprehensive study of the spatio-temporal features of atmospheric temperature. Analysis of temperature dynamics revealed inhomogeneity of the data over long observation intervals. In particular, analysis for the period 1979-2004 showed a significant increase in the number of frost and icing days of approximately 1 day every 2 years, and a decrease in the number of summer days of roughly 1 day every 2 years. It should also be noted that the growing-period mean temperature increased by 1.5-2 °C over the period considered.
The use of different reanalysis datasets in conjunction with in-situ observations allowed comparison of climate index values calculated from different sources, which improves the reliability of the results obtained. Partial support of SB RAS Basic Research Program 4.5.2 (Project 2) is acknowledged.
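The WMO-approved indices named above are simple threshold counts on daily temperature series. A minimal sketch of their standard definitions (assuming daily minimum/maximum temperatures in °C; the function names are ours):

```python
# Threshold-count climate indices, assuming daily minimum (tmin)
# and maximum (tmax) temperature series in degrees Celsius.

def frost_days(tmin):        # days with daily minimum below 0 C
    return sum(1 for t in tmin if t < 0.0)

def summer_days(tmax):       # days with daily maximum above 25 C
    return sum(1 for t in tmax if t > 25.0)

def icing_days(tmax):        # days with daily maximum below 0 C
    return sum(1 for t in tmax if t < 0.0)

def tropical_nights(tmin):   # days with daily minimum above 20 C
    return sum(1 for t in tmin if t > 20.0)
```

Computing the same counts from station data and from each reanalysis, as done in the report, lets the index trends be compared dataset by dataset.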

  3. High Resolution Nature Runs and the Big Data Challenge

    NASA Technical Reports Server (NTRS)

    Webster, W. Phillip; Duffy, Daniel Q.

    2015-01-01

    NASA's Global Modeling and Assimilation Office at Goddard Space Flight Center is undertaking a series of very computationally intensive Nature Runs and a downscaled reanalysis. The nature runs use GEOS-5 as an Atmospheric General Circulation Model (AGCM), while the reanalysis uses GEOS-5 in data assimilation mode. This paper will present computational challenges from three runs, two of which are AGCM nature runs and one a downscaled reanalysis using the full DAS. The nature runs will be completed at two surface grid resolutions, 7 and 3 kilometers, with 72 vertical levels. The 7 km run spanned 2 years (2005-2006) and produced 4 PB of data, while the 3 km run will span one year and generate 4 PB of data. The downscaled reanalysis (MERRA-II, Modern-Era Retrospective analysis for Research and Applications) will cover 15 years and generate 1 PB of data. In our efforts to address the big data challenges of climate science, we are moving toward a notion of Climate Analytics-as-a-Service (CAaaS), a specialization of the concept of business-process-as-a-service that is an evolving extension of IaaS, PaaS, and SaaS enabled by cloud computing. In this presentation, we will describe two projects that demonstrate this shift. MERRA Analytic Services (MERRA/AS) is an example of cloud-enabled CAaaS. MERRA/AS enables MapReduce analytics over the MERRA reanalysis data collection by bringing together high-performance computing, scalable data management, and a domain-specific climate data services API. NASA's High-Performance Science Cloud (HPSC) is an example of the type of compute-storage fabric required to support CAaaS. The HPSC comprises a high-speed InfiniBand network, high-performance file systems and object storage, and virtual system environments tailored to data-intensive science applications.
These technologies are providing a new tier in the data and analytic services stack that helps connect earthbound, enterprise-level data and computational resources to new customers and new mobility-driven applications and modes of work. In our experience, CAaaS lowers the barriers and risk to organizational change, fosters innovation and experimentation, and provides the agility required to meet our customers' increasing and changing needs.

  4. Comparison of trends and abrupt changes of the South Asia high from 1979 to 2014 in reanalysis and radiosonde datasets

    NASA Astrophysics Data System (ADS)

    Shi, Chunhua; Huang, Ying; Guo, Dong; Zhou, Shunwu; Hu, Kaixi; Liu, Yu

    2018-05-01

    The South Asian High (SAH) has an important influence on atmospheric circulation and the Asian climate in summer. However, current comparative analyses of the SAH are mostly between reanalysis datasets and there is a lack of comparison with sounding data. We therefore compared the climatology, trends and abrupt changes of the SAH in the Japanese 55-year Reanalysis (JRA-55) dataset, the National Centers for Environmental Prediction Climate Forecast System Reanalysis (NCEP-CFSR) dataset, the European Centre for Medium-Range Weather Forecasts Interim Reanalysis (ERA-Interim) dataset and radiosonde data from China, using linear trend analysis and a sliding t-test. The trends in geopotential height in the control area of the SAH were positive in the JRA-55, NCEP-CFSR and ERA-Interim datasets, but negative in the radiosonde data over the time period 1979-2014. The negative trends for the SAH were significant at the 90% confidence level in the radiosonde data from May to September. The positive trends in the NCEP-CFSR dataset were significant at the 90% confidence level in May, July, August and September, but the positive trends in the JRA-55 and ERA-Interim datasets were only significant at the 90% confidence level in September. The reasons for the differences in the trends of the SAH between the radiosonde data and the three reanalysis datasets over 1979-2014 were updates to the sounding systems, changes in instrumentation and improvements in the radiation correction method around the year 2000. We therefore analyzed the trends in the two time periods 1979-2000 and 2001-2014 separately. From 1979 to 2000, the negative SAH trends in the radiosonde data mainly agreed with the negative trends in the NCEP-CFSR dataset, but contrasted with the positive trends in the JRA-55 and ERA-Interim datasets. In 2001-2014, however, the trends in the SAH were positive in all four datasets and most of the trends in the radiosonde and NCEP-CFSR datasets were significant.
It is therefore better to use the NCEP-CFSR dataset than the JRA-55 and ERA-Interim datasets when discussing trends in the SAH.
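The sliding t-test used above for abrupt-change detection compares the means of two adjacent windows at each candidate change point. A minimal sketch (the window length and function name are illustrative, not taken from the paper):

```python
import math

def sliding_t(series, window):
    """Student's t statistic comparing the `window` points before and
    after each candidate change point (equal-size two-sample test with
    pooled variance). Returns a list of (index, t) pairs; a large |t|
    flags a candidate abrupt change at that index."""
    t_stats = []
    for i in range(window, len(series) - window + 1):
        a = series[i - window:i]          # window before the candidate point
        b = series[i:i + window]          # window after the candidate point
        ma, mb = sum(a) / window, sum(b) / window
        va = sum((x - ma) ** 2 for x in a) / (window - 1)
        vb = sum((x - mb) ** 2 for x in b) / (window - 1)
        # pooled standard error: sqrt(sp^2) * sqrt(2/n) with sp^2 = (va+vb)/2
        se = math.sqrt((va + vb) / 2) * math.sqrt(2 / window)
        t_stats.append((i, (mb - ma) / se if se > 0 else float("inf")))
    return t_stats
```

In practice the resulting t values are compared against the critical value for 2*window - 2 degrees of freedom at the chosen confidence level.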

  5. Multisource Estimation of Long-term Global Terrestrial Surface Radiation

    NASA Astrophysics Data System (ADS)

    Peng, L.; Sheffield, J.

    2017-12-01

    Land surface net radiation is the essential energy source at the earth's surface. It determines the surface energy budget and its partitioning, drives the hydrological cycle by providing available energy, and offers heat, light, and energy for biological processes. Individual components of net radiation have changed historically due to natural and anthropogenic climate change and land use change. Decadal variations in radiation, such as global dimming or brightening, have important implications for the hydrological and carbon cycles. In order to assess the trends and variability of net radiation and evapotranspiration, there is a need for accurate estimates of long-term terrestrial surface radiation. While substantial progress has been made in measuring the top-of-atmosphere energy budget, large discrepancies exist among ground observations, satellite retrievals, and reanalysis fields of surface radiation, due to the sparseness of observational networks, the difficulty of measuring from space, and uncertainty in algorithm parameters. To overcome the weaknesses of single-source datasets, we propose a multi-source merging approach that fully utilizes and combines multiple datasets of individual radiation components, as they are complementary in space and time. First, we conduct diagnostic analysis of multiple satellite and reanalysis datasets based on in-situ measurements such as the Global Energy Balance Archive (GEBA), existing validation studies, and other information such as network density and consistency with other meteorological variables. Then, we calculate the optimal weighted average of multiple datasets by minimizing the variance of the error between in-situ measurements and other observations. Finally, we quantify the uncertainties in the estimates of surface net radiation and employ physical constraints based on the surface energy balance to reduce these uncertainties.
The final dataset is evaluated in terms of the long-term variability and its attribution to changes in individual components. The goal of this study is to provide a merged observational benchmark for large-scale diagnostic analyses, remote sensing and land surface modeling.
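Minimizing the variance of the merged error, as described above, leads (for independent, unbiased estimates) to inverse-error-variance weighting. A minimal sketch of that weighting rule (our own illustration, not the authors' code):

```python
def merge_estimates(values, error_vars):
    """Inverse-error-variance weighted average of independent, unbiased
    estimates of the same quantity. This choice of weights minimizes
    the variance of the merged error; also returns that variance."""
    weights = [1.0 / v for v in error_vars]
    total = sum(weights)
    merged = sum(w * x for w, x in zip(weights, values)) / total
    merged_var = 1.0 / total   # always <= min(error_vars)
    return merged, merged_var
```

The merged variance is never larger than that of the best single input, which is the formal sense in which multi-source merging improves on any single dataset.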

  6. DNA Microarray for Rapid Detection and Identification of Food and Water Borne Bacteria: From Dry to Wet Lab.

    PubMed

    Ranjbar, Reza; Behzadi, Payam; Najafi, Ali; Roudi, Raheleh

    2017-01-01

    A rapid, accurate, flexible and reliable diagnostic method may significantly decrease the costs of diagnosis and treatment. Designing an appropriate microarray chip reduces noise and probable biases in the final result. The aim of this study was to design and construct a DNA microarray chip for rapid detection and identification of 10 important bacterial agents. In the present survey, 10 unique genomic regions relating to 10 pathogenic bacterial agents, including Escherichia coli (E. coli), Shigella boydii, Sh. dysenteriae, Sh. flexneri, Sh. sonnei, Salmonella typhi, S. typhimurium, Brucella sp., Legionella pneumophila, and Vibrio cholerae, were selected for designing specific long-oligo microarray probes. To this end, the in-silico work, comprising use of the NCBI RefSeq database, the PanSeq and Gview servers, AlleleID 7.7 and Oligo Analyzer 3.1, was carried out. The in-vitro part of the study comprised the stages of robotic microarray chip probe spotting, bacterial DNA extraction, DNA labeling, hybridization and microarray chip scanning. In the wet lab section, tools and apparatus such as Nexterion® Slide E, the Qarray mini spotter, a NimbleGen kit, TrayMix™ S4, and Innoscan 710 were used. A DNA microarray chip including 10 long-oligo microarray probes was designed and constructed for detection and identification of the 10 pathogenic bacteria. The DNA microarray chip was capable of identifying all 10 bacterial agents tested simultaneously. A professional bioinformatician is needed as probe designer to create appropriate multifunctional microarray probes and increase the accuracy of the outcomes.

  7. Comparison of gene expression microarray data with count-based RNA measurements informs microarray interpretation.

    PubMed

    Richard, Arianne C; Lyons, Paul A; Peters, James E; Biasci, Daniele; Flint, Shaun M; Lee, James C; McKinney, Eoin F; Siegel, Richard M; Smith, Kenneth G C

    2014-08-04

    Although numerous investigations have compared gene expression microarray platforms, preprocessing methods and batch correction algorithms using constructed spike-in or dilution datasets, there remains a paucity of studies examining the properties of microarray data using diverse biological samples. Most microarray experiments seek to identify subtle differences between samples with variable background noise, a scenario poorly represented by constructed datasets. Thus, microarray users lack important information regarding the complexities introduced in real-world experimental settings. The recent development of a multiplexed, digital technology for nucleic acid measurement enables counting of individual RNA molecules without amplification and, for the first time, permits such a study. Using a set of human leukocyte subset RNA samples, we compared previously acquired microarray expression values with RNA molecule counts determined by the nCounter Analysis System (NanoString Technologies) in selected genes. We found that gene measurements across samples correlated well between the two platforms, particularly for high-variance genes, while genes deemed unexpressed by the nCounter generally had both low expression and low variance on the microarray. Confirming previous findings from spike-in and dilution datasets, this "gold-standard" comparison demonstrated signal compression that varied dramatically by expression level and, to a lesser extent, by dataset. Most importantly, examination of three different cell types revealed that noise levels differed across tissues. Microarray measurements generally correlate with relative RNA molecule counts within optimal ranges but suffer from expression-dependent accuracy bias and precision that varies across datasets. We urge microarray users to consider expression-level effects in signal interpretation and to evaluate noise properties in each dataset independently.
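The cross-platform comparison described above reduces, per gene, to correlating microarray expression values with nCounter molecule counts across samples. A minimal Pearson-correlation sketch (illustrative only; in practice counts are typically log-transformed before correlating):

```python
import math

def pearson(x, y):
    """Pearson correlation between paired per-sample measurements,
    e.g. microarray intensities vs. molecule counts for one gene."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)
```

As the abstract notes, such correlations are strongest for high-variance genes; for genes near the detection floor, noise dominates and the correlation degrades.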

  8. The Development of Protein Microarrays and Their Applications in DNA-Protein and Protein-Protein Interaction Analyses of Arabidopsis Transcription Factors

    PubMed Central

    Gong, Wei; He, Kun; Covington, Mike; Dinesh-Kumar, S. P.; Snyder, Michael; Harmer, Stacey L.; Zhu, Yu-Xian; Deng, Xing Wang

    2009-01-01

    We used our collection of Arabidopsis transcription factor (TF) ORFeome clones to construct protein microarrays containing as many as 802 TF proteins. These protein microarrays were used for both protein-DNA and protein-protein interaction analyses. For protein-DNA interaction studies, we examined AP2/ERF family TFs and their cognate cis-elements. By careful comparison of the DNA-binding specificity of 13 TFs on the protein microarray with previous non-microarray data, we showed that protein microarrays provide an efficient and high-throughput tool for genome-wide analysis of TF-DNA interactions. This microarray protein-DNA interaction analysis allowed us to derive a comprehensive view of the DNA-binding profiles of AP2/ERF family proteins in Arabidopsis. It also revealed four TFs that bound the EE (evening element) and had the expected phased gene expression under clock regulation, thus providing a basis for further functional analysis of their roles in clock regulation of gene expression. We also developed procedures for detecting protein interactions using this TF protein microarray and discovered four novel partners that interact with HY5, which were validated by yeast two-hybrid assays. Thus, plant TF protein microarrays offer an attractive high-throughput alternative to traditional techniques for TF functional characterization on a global scale. PMID:19802365

  9. Plastic Polymers for Efficient DNA Microarray Hybridization: Application to Microbiological Diagnostics▿

    PubMed Central

    Zhao, Zhengshan; Peytavi, Régis; Diaz-Quijada, Gerardo A.; Picard, Francois J.; Huletsky, Ann; Leblanc, Éric; Frenette, Johanne; Boivin, Guy; Veres, Teodor; Dumoulin, Michel M.; Bergeron, Michel G.

    2008-01-01

    Fabrication of microarray devices using traditional glass slides is not easily adaptable to integration into microfluidic systems. There is thus a need for the development of polymeric materials showing a high hybridization signal-to-background ratio, enabling sensitive detection of microbial pathogens. We have developed such plastic supports suitable for highly sensitive DNA microarray hybridizations. The proof of concept of this microarray technology was done through the detection of four human respiratory viruses that were amplified and labeled with a fluorescent dye via a sensitive reverse transcriptase PCR (RT-PCR) assay. The performance of the microarray hybridization with plastic supports made of PMMA [poly(methylmethacrylate)]-VSUVT or Zeonor 1060R was compared to that with high-quality glass slide microarrays by using both passive and microfluidic hybridization systems. Specific hybridization signal-to-background ratios comparable to that obtained with high-quality commercial glass slides were achieved with both polymeric substrates. Microarray hybridizations demonstrated an analytical sensitivity equivalent to approximately 100 viral genome copies per RT-PCR, which is at least 100-fold higher than the sensitivities of previously reported DNA hybridizations on plastic supports. Testing of these plastic polymers using a microfluidic microarray hybridization platform also showed results that were comparable to those with glass supports. In conclusion, PMMA-VSUVT and Zeonor 1060R are both suitable for highly sensitive microarray hybridizations. PMID:18784318

  10. Development and application of a microarray meter tool to optimize microarray experiments

    PubMed Central

    Rouse, Richard JD; Field, Katrine; Lapira, Jennifer; Lee, Allen; Wick, Ivan; Eckhardt, Colleen; Bhasker, C Ramana; Soverchia, Laura; Hardiman, Gary

    2008-01-01

    Background Successful microarray experimentation requires a complex interplay between the slide chemistry, the printing pins, the nucleic acid probes and targets, and the hybridization milieu. Optimization of these parameters and a careful evaluation of emerging slide chemistries are a prerequisite to any large scale array fabrication effort. We have developed a 'microarray meter' tool which assesses the inherent variations associated with microarray measurement prior to embarking on large scale projects. Findings The microarray meter consists of nucleic acid targets (reference and dynamic range control) and probe components. Different plate designs containing identical probe material were formulated to accommodate different robotic and pin designs. We examined the variability in probe quality and quantity (as judged by the amount of DNA printed and remaining post-hybridization) using three robots equipped with capillary printing pins. Discussion The generation of microarray data with minimal variation requires consistent quality control of the (DNA microarray) manufacturing and experimental processes. Spot reproducibility is a measure primarily of the variations associated with printing. The microarray meter assesses array quality by measuring the DNA content for every feature. It provides a post-hybridization analysis of array quality by scoring probe performance using three metrics, a) a measure of variability in the signal intensities, b) a measure of the signal dynamic range and c) a measure of variability of the spot morphologies. PMID:18710498
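Two of the three scoring metrics described above can be sketched as simple statistics on per-spot signal intensities (a hypothetical illustration, not the authors' implementation; the spot-morphology metric would additionally require image data):

```python
import math

def signal_cv(intensities):
    """Metric (a): coefficient of variation of spot signal intensities,
    a measure of variability across replicate spots."""
    n = len(intensities)
    mean = sum(intensities) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in intensities) / (n - 1))
    return sd / mean

def dynamic_range(intensities):
    """Metric (b): log2 ratio of the strongest to the weakest signal
    across the dynamic range control spots."""
    return math.log2(max(intensities) / min(intensities))
```

A low CV on the reference spots together with a wide dynamic range on the control spots would indicate a well-behaved print run.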

  11. Is Recent Warming Unprecedented in the Common Era? Insights from PAGES2k data and the Last Millennium Reanalysis

    NASA Astrophysics Data System (ADS)

    Erb, M. P.; Emile-Geay, J.; McKay, N.; Hakim, G. J.; Steig, E. J.; Anchukaitis, K. J.

    2017-12-01

    Paleoclimate observations provide a critical context for 20th century warming by putting recent climate change into a longer-term perspective. Previous work (e.g. IPCC AR3-5) has claimed that recent decades are exceptional in the context of past centuries, though these statements are usually accompanied by large uncertainties and little spatial detail. Here we leverage a recent multiproxy compilation (PAGES2k Consortium, 2017) to revisit this long-standing question. We do so via two complementary approaches. The first approach compares multi-decadal averages and trends in PAGES2k proxy records, which include trees, corals, ice cores, and more. Numerous proxy records reveal that late 20th century values are extreme compared to the remainder of the recorded period, although considerable variability exists in the signals preserved in individual records. The second approach uses the same PAGES2k data blended with climate model output to produce an optimal analysis: the Last Millennium Reanalysis (LMR; Hakim et al., 2016). Unlike proxy data, LMR is spatially-complete and explicitly models uncertainty in proxy records, resulting in objective error estimates. The LMR results show that for nearly every region of the world, late 20th century temperatures exceed temperatures in previous multi-decadal periods during the Common Era, and 20th century warming rates exceed rates in previous centuries. An uncertainty with the present analyses concerns the interpretation of proxy records. PAGES2k included only records that are primarily sensitive to temperature, but many proxies may be influenced by secondary non-temperature effects. Additionally, the issue of seasonality is important as, for example, many temperature-sensitive tree ring chronologies in the Northern Hemisphere respond to summer or growing season temperature rather than annual-means. These uncertainties will be further explored. References Hakim, G. 
J., et al., 2016: The last millennium climate reanalysis project: Framework and first results. Journal of Geophysical Research: Atmospheres, 121(12), 6745-6764. http://doi.org/10.1002/2016JD024751 PAGES2k Consortium, 2017: A global multiproxy database for temperature reconstructions of the Common Era. Scientific Data, 1-33. http://doi.org/10.1038/sdata.2017.88

  12. A WPS Based Architecture for Climate Data Analytic Services (CDAS) at NASA

    NASA Astrophysics Data System (ADS)

    Maxwell, T. P.; McInerney, M.; Duffy, D.; Carriere, L.; Potter, G. L.; Doutriaux, C.

    2015-12-01

    Faced with unprecedented growth in the Big Data domain of climate science, NASA has developed the Climate Data Analytic Services (CDAS) framework. This framework enables scientists to execute trusted and tested analysis operations in a high-performance environment close to the massive data stores at NASA. The data is accessed in standard (NetCDF, HDF, etc.) formats in a POSIX file system and processed using trusted climate data analysis tools (ESMF, CDAT, NCO, etc.). The framework is structured as a set of interacting modules, allowing maximal flexibility in deployment choices. The current set of module managers includes: Staging Manager: runs the computation locally on the WPS server or remotely using tools such as Celery or SLURM. Compute Engine Manager: runs the computation serially or distributed over nodes using a parallelization framework such as Celery or Spark. Decomposition Manager: manages strategies for distributing the data over nodes. Data Manager: handles the import of domain data from long-term storage and manages the in-memory and disk-based caching architectures. Kernel Manager: a kernel is an encapsulated computational unit which executes a processor's compute task; each kernel is implemented in Python exploiting existing analysis packages (e.g. CDAT) and is compatible with all CDAS compute engines and decompositions. CDAS services are accessed via a WPS API being developed in collaboration with the ESGF Compute Working Team to support server-side analytics for ESGF. The API can be executed using direct web service calls, a Python script or application, or a JavaScript-based web application. Client packages in Python or JavaScript contain everything needed to make CDAS requests. The CDAS architecture brings together the tools, data storage, and high-performance computing required for timely analysis of large-scale data sets, where the data resides, to ultimately produce societal benefits.
It is currently deployed at NASA in support of the Collaborative REAnalysis Technical Environment (CREATE) project, which centralizes numerous global reanalysis datasets onto a single advanced data analytics platform. This service permits decision makers to investigate climate changes around the globe, inspect model trends, compare multiple reanalysis datasets, and explore variability.

  13. A Web-Based Multi-Database System Supporting Distributed Collaborative Management and Sharing of Microarray Experiment Information

    PubMed Central

    Burgarella, Sarah; Cattaneo, Dario; Masseroli, Marco

    2006-01-01

    We developed MicroGen, a multi-database Web based system for managing all the information characterizing spotted microarray experiments. It supports information gathering and storing according to the Minimum Information About Microarray Experiments (MIAME) standard. It also allows easy sharing of information and data among all multidisciplinary actors involved in spotted microarray experiments. PMID:17238488

  14. Polysaccharide Microarray Technology for the Detection of Burkholderia Pseudomallei and Burkholderia Mallei Antibodies

    DTIC Science & Technology

    2006-04-27

    A polysaccharide microarray platform was prepared by immobilizing Burkholderia pseudomallei and Burkholderia mallei polysaccharides. This polysaccharide array was tested with success for detecting B. pseudomallei and B. mallei serum (human and animal) antibodies. The advantages of this microarray... Keywords: polysaccharide microarrays; Burkholderia pseudomallei; Burkholderia mallei; glanders; melioidosis.

  15. Microarray-integrated optoelectrofluidic immunoassay system

    PubMed Central

    Han, Dongsik

    2016-01-01

    A microarray-based analytical platform has been utilized as a powerful tool in biological assay fields. However, an analyte depletion problem due to the slow mass transport based on molecular diffusion causes low reaction efficiency, resulting in a limitation for practical applications. This paper presents a novel method to improve the efficiency of microarray-based immunoassay via an optically induced electrokinetic phenomenon by integrating an optoelectrofluidic device with a conventional glass slide-based microarray format. A sample droplet was loaded between the microarray slide and the optoelectrofluidic device on which a photoconductive layer was deposited. Under the application of an AC voltage, optically induced AC electroosmotic flows caused by a microarray-patterned light actively enhanced the mass transport of target molecules at the multiple assay spots of the microarray simultaneously, which reduced tedious reaction time from more than 30 min to 10 min. Based on this enhancing effect, a heterogeneous immunoassay with a tiny volume of sample (5 μl) was successfully performed in the microarray-integrated optoelectrofluidic system using immunoglobulin G (IgG) and anti-IgG, resulting in improved efficiency compared to the static environment. Furthermore, the application of multiplex assays was also demonstrated by multiple protein detection. PMID:27190571

  16. Microarray-integrated optoelectrofluidic immunoassay system.

    PubMed

    Han, Dongsik; Park, Je-Kyun

    2016-05-01

    A microarray-based analytical platform has been utilized as a powerful tool in biological assay fields. However, an analyte depletion problem due to the slow mass transport based on molecular diffusion causes low reaction efficiency, resulting in a limitation for practical applications. This paper presents a novel method to improve the efficiency of microarray-based immunoassay via an optically induced electrokinetic phenomenon by integrating an optoelectrofluidic device with a conventional glass slide-based microarray format. A sample droplet was loaded between the microarray slide and the optoelectrofluidic device on which a photoconductive layer was deposited. Under the application of an AC voltage, optically induced AC electroosmotic flows caused by a microarray-patterned light actively enhanced the mass transport of target molecules at the multiple assay spots of the microarray simultaneously, which reduced tedious reaction time from more than 30 min to 10 min. Based on this enhancing effect, a heterogeneous immunoassay with a tiny volume of sample (5 μl) was successfully performed in the microarray-integrated optoelectrofluidic system using immunoglobulin G (IgG) and anti-IgG, resulting in improved efficiency compared to the static environment. Furthermore, the application of multiplex assays was also demonstrated by multiple protein detection.

  17. Advances in cell-free protein array methods.

    PubMed

    Yu, Xiaobo; Petritis, Brianne; Duan, Hu; Xu, Danke; LaBaer, Joshua

    2018-01-01

    Cell-free protein microarrays represent a special form of protein microarray which display proteins made fresh at the time of the experiment, avoiding storage and denaturation. They have been used increasingly in basic and translational research over the past decade to study protein-protein interactions, the pathogen-host relationship, post-translational modifications, and antibody biomarkers of different human diseases. Their role in the first blood-based diagnostic test for early stage breast cancer highlights their value in managing human health. Cell-free protein microarrays will continue to evolve to become widespread tools for research and clinical management. Areas covered: We review the advantages and disadvantages of different cell-free protein arrays, with an emphasis on the methods that have been studied in the last five years. We also discuss the applications of each microarray method. Expert commentary: Given the growing roles and impact of cell-free protein microarrays in research and medicine, we discuss: 1) the current technical and practical limitations of cell-free protein microarrays; 2) the biomarker discovery and verification pipeline using protein microarrays; and 3) how cell-free protein microarrays will advance over the next five years, both in their technology and applications.

  18. On the impact of using downscaled reanalysis data instead of direct measurements for modeling the mass balance of a tropical glacier (Cordillera Blanca, Peru)

    NASA Astrophysics Data System (ADS)

    Galos, Stephan; Hofer, Marlis; Marzeion, Ben; Mölg, Thomas; Großhauser, Martin

    2013-04-01

    Due to their setting, tropical glaciers are sensitive indicators of mid-tropospheric meteorological variability and climate change. Furthermore, these glaciers are of particular interest because they respond faster to climatic changes than glaciers located in mid- or high latitudes. As long-term direct meteorological measurements in such remote environments are scarce, reanalysis data (e.g. ERA-Interim) provide a highly valuable source of information. Reanalysis datasets (i) enable a temporal extension of data records gained by direct measurements and (ii) provide information from regions where direct measurements are not available. In order to properly derive the physical exchange processes between glaciers and atmosphere from reanalysis data, downscaling procedures are required. In the present study we investigate whether downscaled atmospheric variables (air temperature and relative humidity) from a reanalysis dataset can be used as input for a physically based, high-resolution energy and mass balance model. We apply a well-validated empirical-statistical downscaling model, fed with ERA-Interim data, to an automated weather station (AWS) on the surface of Glaciar Artesonraju (8.96° S, 77.63° W). The downscaled data are then used to replace measured air temperature and relative humidity in the input for the energy and mass balance model, which was calibrated using ablation data from stakes and a sonic ranger. In order to test the sensitivity of the modeled mass balance to the downscaled data, the results are compared to a reference model run driven solely with AWS data as model input. We finally discuss the results and present future perspectives for further developing this method.
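In its simplest form, an empirical-statistical downscaling model of the kind applied here is a regression transfer function from the reanalysis predictor to the AWS observation. A minimal least-squares sketch (illustrative only; the actual downscaling model and its predictors are more elaborate):

```python
def fit_downscaling(predictor, observed):
    """Ordinary least-squares fit of observed = a + b * predictor,
    e.g. AWS air temperature regressed on the collocated reanalysis
    grid-point temperature over a calibration period.
    Returns (intercept a, slope b)."""
    n = len(predictor)
    mx = sum(predictor) / n
    my = sum(observed) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(predictor, observed))
    sxx = sum((x - mx) ** 2 for x in predictor)
    b = sxy / sxx
    return my - b * mx, b

def downscale(value, intercept, slope):
    """Apply the trained transfer function to a new reanalysis value."""
    return intercept + slope * value
```

The transfer function is trained against AWS data over a calibration period and then applied to the reanalysis series, which is what allows the downscaled values to stand in for the missing measurements in the mass balance model input.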

  19. Characterisation of Special Sensor Microwave Water Vapor Profiler (SSM/T-2) radiances using radiative transfer simulations from global atmospheric reanalyses

    NASA Astrophysics Data System (ADS)

    Kobayashi, Shinya; Poli, Paul; John, Viju O.

    2017-02-01

    The near-global and all-sky coverage of satellite observations from microwave humidity sounders operating in the 183 GHz band complements radiosonde and aircraft observations and satellite infrared clear-sky observations. The Special Sensor Microwave Water Vapor Profiler (SSM/T-2) of the Defense Meteorological Satellite Program began operations in late 1991. It has been followed by several other microwave humidity sounders, continuing today. However, expertise and accrued knowledge regarding the SSM/T-2 data record are limited because it has remained underused for climate applications and reanalyses. In this study, SSM/T-2 radiances are characterised using several global atmospheric reanalyses. The European Centre for Medium-Range Weather Forecasts (ECMWF) Interim Reanalysis (ERA-Interim), the first ECMWF reanalysis of the 20th century (ERA-20C), and the Japanese 55-year Reanalysis (JRA-55) are projected into SSM/T-2 radiance space using a fast radiative transfer model. The present study confirms earlier indications that the polarisation state of the SSM/T-2 antenna is horizontal (not vertical) in the limit of nadir viewing. The study also formulates several recommendations to improve the use of the SSM/T-2 measurement data in future fundamental climate data records or reanalyses. The recommendations are (1) to correct geolocation errors, especially for DMSP 14; (2) to blacklist poor-quality data identified in the paper; (3) to correct for inter-satellite biases, estimated here to be on the order of 1 K, by applying an inter-satellite recalibration or, for reanalysis, an automated (e.g., variational) bias correction; and (4) to improve precipitating-cloud filtering or, for reanalysis, to consider an all-sky assimilation scheme where radiative transfer simulations account for the scattering effect of hydrometeors.

  20. Mid-latitude storm track variability and its influence on atmospheric composition

    NASA Astrophysics Data System (ADS)

    Knowland, K. E.; Doherty, R. M.; Hodges, K.

    2013-12-01

    Using the storm-tracking algorithm TRACK (Hodges, 1994, 1995, 1999), we have studied the behaviour of storm tracks in the North Atlantic basin, using 850-hPa relative vorticity from the ERA-Interim Re-analysis (Dee et al., 2011). We have correlated surface ozone measurements at rural coastal sites in Europe with the storm track data to explore the role mid-latitude cyclones and their transport of pollutants play in determining surface air quality in Western Europe. To further investigate this relationship, we have used the Monitoring Atmospheric Composition and Climate (MACC) Re-analysis dataset (Inness et al., 2013) in TRACK. The MACC Re-analysis is a 10-year dataset which couples a chemistry transport model (MOZART-3; Stein 2009, 2012) to an extended version of the European Centre for Medium-Range Weather Forecasts' (ECMWF) Integrated Forecast System (IFS). Storm tracks in the MACC Re-analysis compare well to those derived from the ERA-Interim Re-analysis for the same 10-year period, as both are based on versions of the ECMWF IFS. We also compare surface ozone values from MACC to the surface ozone measurements previously studied. Using TRACK, we follow ozone (O3) and carbon monoxide (CO) through the life cycle of storms from North America to Western Europe. Along the storm tracks, we examine the distribution of CO and O3 within 6 degrees of the center of each storm and vertically at different pressure levels in the troposphere. We hope to better understand the mechanisms by which pollution is vented from the boundary layer to the free troposphere, as well as the transport of pollutants to rural areas. Our hope is to give policy makers more detailed information on how climate variability associated with storm tracks over 1979-2013 may affect air quality in the Northeast USA and Western Europe.

  1. CREATE-IP and CREATE-V: Data and Services Update

    NASA Astrophysics Data System (ADS)

    Carriere, L.; Potter, G. L.; Hertz, J.; Peters, J.; Maxwell, T. P.; Strong, S.; Shute, J.; Shen, Y.; Duffy, D.

    2017-12-01

    The NASA Center for Climate Simulation (NCCS) at the Goddard Space Flight Center and the Earth System Grid Federation (ESGF) are working together to build a uniform environment for the comparative study and use of a group of reanalysis datasets of particular importance to the research community. This effort is called the Collaborative REAnalysis Technical Environment (CREATE) and it contains two components: the CREATE-Intercomparison Project (CREATE-IP) and CREATE-V. This year's efforts included generating and publishing an atmospheric reanalysis ensemble mean and spread and improving the analytics available through CREATE-V. Related activities included adding access to subsets of the reanalysis data through ArcGIS and expanding the visualization tool to GMAO forecast data. This poster will present the access mechanisms for these data and use cases, including example Jupyter Notebook code. The reanalysis ensemble was generated using two methods: the first used standard Python tools for regridding, extracting levels, and creating the ensemble mean and spread on a virtual server in the NCCS environment; the second used a new analytics software suite, the Earth Data Analytics Services (EDAS), coupled with a high-performance Data Analytics and Storage System (DASS) developed at the NCCS. Results were compared to validate the EDAS methodologies, and the results, including time to process, will be presented. The ensemble includes selected 6-hourly and monthly variables, regridded to 1.25 degrees, with 24 common levels used for the 3D variables. Use cases for the new data and services will be presented, including the use of EDAS for the backend analytics on CREATE-V, the use of the GMAO forecast aerosol and cloud data in CREATE-V, and the ability to connect CREATE-V data to NCCS ArcGIS services.
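
    The ensemble mean and spread product lends itself to a very short sketch. The three member fields below are random stand-ins for the regridded reanalyses; the actual CREATE-IP pipeline operates on 6-hourly and monthly variables on a 1.25-degree grid:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # Three toy reanalysis temperature fields on a common (here 4 x 5) grid,
    # standing in for the regridded members of the CREATE-IP ensemble.
    members = np.stack([15.0 + rng.normal(0.0, 0.5, (4, 5)) for _ in range(3)])

    # The two published products: ensemble mean and ensemble spread.
    ens_mean = members.mean(axis=0)
    ens_spread = members.std(axis=0, ddof=1)   # sample std across members
    ```

    The only nontrivial prerequisite, as the abstract notes, is regridding all members to the common grid and level set before the member axis can be reduced.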

  2. Improving Global Net Surface Heat Flux with Ocean Reanalysis

    NASA Astrophysics Data System (ADS)

    Carton, J.; Chepurin, G. A.; Chen, L.; Grodsky, S.

    2017-12-01

    This project addresses the current level of uncertainty in surface heat flux estimates. Time-mean surface heat flux estimates provided by atmospheric reanalyses differ by 10-30 W/m2. They are generally unbalanced globally, and have been shown by ocean simulation studies to be incompatible with ocean temperature and velocity measurements. Here a method is presented (1) to identify the spatial and temporal structure of the underlying errors and (2) to reduce them by exploiting hydrographic observations and the analysis increments produced by an ocean reanalysis using sequential data assimilation. The method is applied to fluxes computed from daily state variables obtained from three widely used reanalyses: MERRA2, ERA-Interim, and JRA-55, during the eight-year period 2007-2014. For each of these, seasonal heat flux errors/corrections are obtained. In a second set of experiments the heat fluxes are corrected and the ocean reanalysis experiments are repeated. This second round of experiments shows that the time-mean error in the corrected fluxes is reduced to within ±5 W/m2 over the interior subtropical and midlatitude oceans, with the most significant changes occurring over the Southern Ocean. The global heat flux imbalance of each reanalysis is reduced to within a few W/m2 with this single correction. Encouragingly, the corrected forms of the three sets of fluxes are also shown to converge. In the final discussion we present experiments beginning with a modified form of the ERA-Int reanalysis, produced by the DAKKAR program, in which state variables have been individually corrected based on independent measurements. Finally, we discuss the separation of flux error from model error.
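
    The link between analysis increments and flux error can be made concrete with a back-of-the-envelope conversion: if assimilation must repeatedly warm the model mixed layer, the forcing is missing roughly the equivalent surface heat flux. All numbers below are invented for illustration; the study's method resolves the full spatial and seasonal structure of the correction:

    ```python
    # Convert a mean per-cycle temperature increment into an equivalent
    # surface heat flux correction (a rough scalar sketch, invented values).
    rho, cp = 1025.0, 3990.0     # seawater density (kg/m3), heat capacity (J/kg/K)
    h = 50.0                     # assumed mixed-layer depth (m), illustrative
    dt = 5 * 86400.0             # assimilation cycle length (s), illustrative
    mean_increment = 0.02        # mean temperature increment per cycle (K), invented

    # Equivalent time-mean heat flux correction (W/m2) to add to the forcing.
    flux_correction = rho * cp * h * mean_increment / dt
    ```

    With these numbers the correction is about 9.5 W/m2, showing how increments of only a few hundredths of a kelvin per cycle already correspond to flux errors of the order of the inter-reanalysis discrepancies cited above.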

  3. Sensitivity studies of high-resolution RegCM3 simulations of precipitation over the European Alps: the effect of lateral boundary conditions and domain size

    NASA Astrophysics Data System (ADS)

    Nadeem, Imran; Formayer, Herbert

    2016-11-01

    A suite of high-resolution (10 km) simulations was performed with the International Centre for Theoretical Physics (ICTP) Regional Climate Model (RegCM3) to study the effect of various lateral boundary conditions (LBCs), domain size, and intermediate domains on simulated precipitation over the Great Alpine Region. The boundary conditions used were the ECMWF ERA-Interim Reanalysis with a grid spacing of 0.75∘, the ECMWF ERA-40 Reanalysis with grid spacings of 1.125∘ and 2.5∘, and finally the 2.5∘ NCEP/DOE AMIP-II Reanalysis. The model was run in one-way nesting mode with direct nesting of the high-resolution RCM (horizontal grid spacing Δx = 10 km) within the driving reanalysis, with one intermediate-resolution nest (Δx = 30 km) between the high-resolution RCM and the reanalysis forcings, and also with two intermediate-resolution nests (Δx = 90 km and Δx = 30 km) for simulations forced with LBCs of 2.5∘ resolution. Additionally, the impact of domain size was investigated. The results of the multiple simulations were evaluated using different analysis techniques, e.g., the Taylor diagram and a newly defined statistical parameter, the Skill-Score, for the evaluation of daily precipitation simulated by the model. It was found that domain size has the largest impact on the results, while different resolutions and versions of the LBCs, e.g., 1.125∘ ERA-40 and 0.75∘ ERA-Interim, do not produce significantly different results. It was also noticed that direct nesting with a reasonable domain size seems to be the most adequate method for reproducing precipitation over complex terrain, while introducing intermediate-resolution nests seems to deteriorate the results.

  4. Comparison of the ocean surface vector winds from atmospheric reanalysis and scatterometer-based wind products over the Nordic Seas and the northern North Atlantic and their application for ocean modeling

    NASA Astrophysics Data System (ADS)

    Dukhovskoy, Dmitry S.; Bourassa, Mark A.; Petersen, Gudrún Nína; Steffen, John

    2017-03-01

    Ocean surface vector wind fields from reanalysis data sets and scatterometer-derived gridded products are analyzed over the Nordic Seas and the northern North Atlantic for the time period from 2000 to 2009. The data sets include the National Centers for Environmental Prediction Reanalysis 2 (NCEPR2), the Climate Forecast System Reanalysis (CFSR), the Arctic System Reanalysis (ASR), the Cross-Calibrated Multiplatform (CCMP) wind product version 1.1 and the recently released version 2.0, and QuikSCAT. The goal of the study is to assess discrepancies across the wind vector fields in the data sets and demonstrate possible implications of these differences for ocean modeling. Large-scale and mesoscale characteristics of winds are compared at interannual, seasonal, and synoptic timescales. A cyclone tracking methodology is developed and applied to the wind fields to compare cyclone characteristics across the data sets. Additionally, the winds are evaluated against observations collected from meteorological buoys deployed in the Iceland and Irminger Seas. The agreement among the wind fields is better for longer time and larger spatial scales. The discrepancies are clearly apparent at synoptic timescales and mesoscales. CCMP, ASR, and CFSR show the closest overall agreement with each other. Substantial biases are found in the NCEPR2 winds. Numerical sensitivity experiments are conducted with a coupled ice-ocean model forced by the different wind fields. The experiments demonstrate differences in the net surface heat fluxes during storms. In the experiment forced by NCEPR2 winds, there are discrepancies in the large-scale wind-driven ocean dynamics compared to the other experiments.

  5. Evaluation of person-level heterogeneity of treatment effects in published multiperson N-of-1 studies: systematic review and reanalysis.

    PubMed

    Raman, Gowri; Balk, Ethan M; Lai, Lana; Shi, Jennifer; Chan, Jeffrey; Lutz, Jennifer S; Dubois, Robert W; Kravitz, Richard L; Kent, David M

    2018-05-26

    Individual patients with the same condition may respond differently to similar treatments. Our aim is to summarise the reporting of person-level heterogeneity of treatment effects (HTE) in multiperson N-of-1 studies and to examine the evidence for person-level HTE through reanalysis. Systematic review and reanalysis of multiperson N-of-1 studies. Medline, Cochrane Controlled Trials, EMBASE, Web of Science and review of references through August 2017 for N-of-1 studies published in English. N-of-1 studies of pharmacological interventions with at least two subjects. Citation screening and data extractions were performed in duplicate. We performed statistical reanalysis testing for person-level HTE on all studies presenting person-level data. We identified 62 multiperson N-of-1 studies with at least two subjects. Statistical tests examining HTE were described in only 13 (21%), of which only two (3%) tested person-level HTE. Only 25 studies (40%) provided person-level data sufficient to reanalyse person-level HTE. Reanalysis using a fixed effect linear model identified statistically significant person-level HTE in 8 of the 13 studies (62%) reporting person-level treatment effects and in 8 of the 14 studies (57%) reporting person-level outcomes. Our analysis suggests that person-level HTE is common and often substantial. Reviewed studies had incomplete information on person-level treatment effects and their variation. Improved assessment and reporting of person-level treatment effects in multiperson N-of-1 studies are needed.
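
    A fixed-effect test for person-level HTE of the kind described can be sketched as an F-test of the subject-by-treatment interaction. The data below are simulated with deliberately heterogeneous effects; this is a generic illustration of the approach, not the authors' exact model:

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)

    # Toy multiperson N-of-1 trial: 6 subjects, 8 crossover periods each,
    # with person-specific treatment effects built in (the HTE to detect).
    n_subj, n_per = 6, 8
    subj = np.repeat(np.arange(n_subj), n_per)
    treat = np.tile(np.tile([0, 1], n_per // 2), n_subj)
    effects = np.linspace(0.0, 4.0, n_subj)          # person-level effects
    y = 10 + effects[subj] * treat + rng.normal(0, 0.5, n_subj * n_per)

    def rss(X, y):
        """Residual sum of squares of an ordinary least-squares fit."""
        beta = np.linalg.lstsq(X, y, rcond=None)[0]
        return float(((y - X @ beta) ** 2).sum())

    S = np.eye(n_subj)[subj]                         # subject dummies
    X0 = np.column_stack([S, treat])                 # common treatment effect
    X1 = np.column_stack([S, S * treat[:, None]])    # person-specific effects

    df1 = n_subj - 1                                 # extra parameters in X1
    df2 = len(y) - X1.shape[1]
    F = ((rss(X0, y) - rss(X1, y)) / df1) / (rss(X1, y) / df2)
    p = stats.f.sf(F, df1, df2)                      # small p => person-level HTE
    ```

    The nested comparison asks whether allowing each person their own treatment effect fits significantly better than a single shared effect, which is the operational meaning of person-level HTE here.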

  6. Structure of the tropical lower stratosphere as revealed by three reanalysis data sets

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pawson, S.; Fiorino, M.

    1996-05-01

    While the skill of climate simulation models has advanced over the last decade, mainly through improvements in modeling, further progress will depend on the availability and the quality of comprehensive validation data sets covering long time periods. A new source of such validation data is atmospheric "reanalysis", where a fixed, state-of-the-art global atmospheric model/data assimilation system is run through archived and recovered observations to produce a consistent set of atmospheric analyses. Although reanalysis will be free of non-physical variability caused by changes in the models and/or the assimilation procedure, it is necessary to assess its quality. A region for stringent testing of the quality of reanalysis is the tropical lower stratosphere. This portion of the atmosphere is sparse in observations but displays the prominent quasi-biennial oscillation (QBO) and an annual cycle, neither of which is fully understood, but which are likely coupled dynamically. We first consider the performance of three reanalyses, from NCEP/NCAR, NASA and ECMWF, against rawinsonde data in depicting the QBO and then examine the structure of the tropical lower stratosphere in NCEP and ECMWF data sets in detail. While the annual cycle and the QBO in wind and temperature are quite successfully represented, the mean meridional circulations in NCEP and ECMWF data sets contain unusual features which may be due to the assimilation process rather than being physically based. Further, the models capture the long-term temperature fluctuations associated with volcanic eruptions, even though the physical mechanisms are not included, thus implying that the model does not mask prominent stratospheric signals in the observational data. We conclude that reanalysis offers a unique opportunity to better understand the dynamics of the QBO and can be applied to climate model validation.

  7. WCRP Task Team for the Intercomparison of Reanalyses (TIRA): Motivation and Progress

    NASA Technical Reports Server (NTRS)

    Bosilovich, Michael

    2017-01-01

    Reanalyses have proven to be an important resource for weather and climate related research, as well as for societal applications at large. Several centers have emerged to produce new atmospheric reanalyses in various forms every few years. In addition, the land and ocean communities are producing disciplinary uncoupled reanalyses. Current research and development in reanalysis is directed at (1) extending the length of the reanalyzed period and (2) the use of coupled Earth system models for climate reanalysis. While the WCRP's involvement in the reanalysis community through its Data Advisory Council (WDAC) has been substantial, for example in organizing international conferences on reanalyses, a central team of reanalysis expertise is not in place in the WCRP structure. The differences among reanalyses and their inherent uncertainties are some of the most important questions for both users and developers of reanalyses. Therefore, a collaborative effort to systematically assess and intercompare reanalyses would be a logical progression that fills the needs of the community and contributes to the WCRP mission. The primary charge to the TIRA is to develop a reanalysis intercomparison project plan that will attain the following objectives: (1) to foster understanding and estimation of uncertainties in reanalysis data by intercomparison and other means; (2) to communicate new developments and best practices among the reanalysis-producing centers; (3) to enhance the understanding of data and assimilation issues and their impact on uncertainties, leading to improved reanalyses for climate assessment; and (4) to communicate the strengths and weaknesses of reanalyses, their fitness for purpose, and best practices in the use of reanalysis datasets by the scientific community. This presentation outlines the need for a task team on reanalyses and their intercomparison, the objectives of the team, and progress thus far.

  8. Spring snow albedo feedback over northern Eurasia: Comparing in situ measurements with reanalysis products

    NASA Astrophysics Data System (ADS)

    Wegmann, Martin; Dutra, Emanuel; Jacobi, Hans-Werner; Zolina, Olga

    2018-06-01

    This study uses daily observations and modern reanalyses in order to evaluate reanalysis products over northern Eurasia with regard to the spring snow albedo feedback (SAF) during the period from 2000 to 2013. We used the state-of-the-art reanalyses ERA-Interim/Land and the Modern-Era Retrospective Analysis for Research and Applications version 2 (MERRA-2), as well as an experimental set-up of ERA-Interim/Land with prescribed short grass as land cover, to enhance the comparability with the station data while underlining the caveats of comparing in situ observations with gridded data. Snow depth statistics derived from daily station data are well reproduced in all three reanalyses. However, day-to-day albedo variability is notably higher at the stations than in any reanalysis product. The ERA-Interim grass set-up shows improved performance when representing albedo variability and generates comparable estimates for the snow albedo in spring. We find that modern reanalyses show a physically consistent representation of SAF, with realistic spatial patterns and area-averaged sensitivity estimates. However, station-based SAF values are significantly higher than in the reanalyses, which is mostly driven by the stronger contrast between snow and snow-free albedo. Switching to grass-only vegetation in ERA-Interim/Land increases the SAF values up to the level of the station-based estimates. We found no significant trend in the examined 14-year time series of SAF, but interannual changes of about 0.5% K-1 in both station-based and reanalysis estimates were derived. This interannual variability is primarily dominated by the variability in the snowmelt sensitivity, which is correctly captured in the reanalysis products. Although modern reanalyses perform well for snow variables, efforts should be made to improve the representation of dynamic albedo changes.

  9. Carbohydrate Microarray Technology Applied to High-Throughput Mapping of Plant Cell Wall Glycans Using Comprehensive Microarray Polymer Profiling (CoMPP).

    PubMed

    Kračun, Stjepan Krešimir; Fangel, Jonatan Ulrik; Rydahl, Maja Gro; Pedersen, Henriette Lodberg; Vidal-Melgosa, Silvia; Willats, William George Tycho

    2017-01-01

    Cell walls are an important feature of plant cells and a major component of the plant glycome. They have both structural and physiological functions and are critical for plant growth and development. The diversity and complexity of these structures demand advanced high-throughput techniques to answer questions about their structure, functions and roles in both fundamental and applied scientific fields. Microarray technology provides both the high-throughput and the feasibility aspects required to meet that demand. In this chapter, some of the most recent microarray-based techniques relating to plant cell walls are described together with an overview of related contemporary techniques applied to carbohydrate microarrays and their general potential in glycoscience. A detailed experimental procedure for high-throughput mapping of plant cell wall glycans using the comprehensive microarray polymer profiling (CoMPP) technique is included in the chapter and provides a good example of both the robust and high-throughput nature of microarrays as well as their applicability to plant glycomics.

  10. Interim report on updated microarray probes for the LLNL Burkholderia pseudomallei SNP array

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gardner, S; Jaing, C

    2012-03-27

    The overall goal of this project is to forensically characterize 100 unknown Burkholderia isolates in the US-Australia collaboration. We will identify genome-wide single nucleotide polymorphisms (SNPs) from B. pseudomallei and near neighbor species including B. mallei, B. thailandensis and B. oklahomensis. We will design microarray probes to detect these SNP markers and analyze 100 Burkholderia genomic DNAs extracted from environmental, clinical and near neighbor isolates from Australian collaborators on the Burkholderia SNP microarray. We will analyze the microarray genotyping results to characterize the genetic diversity of these new isolates and triage the samples for whole genome sequencing. In this interim report, we describe the SNP analysis and the microarray probe design for the Burkholderia SNP microarray.

  11. Reanalysis of the 1893 heat wave in France through offline data assimilation in a downscaled ensemble meteorological reconstruction

    NASA Astrophysics Data System (ADS)

    Devers, Alexandre; Vidal, Jean-Philippe; Lauvernet, Claire; Graff, Benjamin

    2017-04-01

    The knowledge of historical French weather has recently been improved through the development of the SCOPE (Spatially COherent Probabilistic Extended) Climate reconstruction, a probabilistic high-resolution daily reconstruction of precipitation and temperature covering the period 1871-2012 and based on the statistical downscaling of the Twentieth Century Reanalysis (Caillouet et al., 2016). However, historical surface observations - even though rather scarce and sparse - do exist from at least the beginning of the period considered, and this information does not currently feed SCOPE Climate reconstructions. The goal of this study is therefore to assimilate these historical observations into SCOPE Climate reconstructions in order to build a 150-year meteorological reanalysis over France. This study considers "offline" data assimilation methods - Kalman filtering methods like the Ensemble Square Root Filter - that have successfully been used in recent paleoclimate studies, i.e. at much larger temporal and spatial scales (see e.g. Bhend et al., 2012). These methods are here applied for reconstructing the 8-24 August 1893 heat wave in France, using all available daily temperature observations from that period. Temperatures reached that summer were indeed compared at the time to those of Senegal (Garnier, 2012). Results show a spatially coherent view of the heat wave at the national scale as well as a reduced uncertainty compared to initial meteorological reconstructions, thus demonstrating the added value of data assimilation. In order to assess the performance of assimilation methods in a more recent context, these methods are also used to reconstruct the well-known 3-14 August 2003 heat wave by using (1) all available stations, and (2) the same station density as in August 1893, the rest of the observations being saved for validation. 
This analysis allows the comparison of two heat waves that occurred 100 years apart in France, with different associated uncertainties, in terms of dynamics and intensity. References: Bhend, J., Franke, J., Folini, D., Wild, M., and Brönnimann, S.: An ensemble-based approach to climate reconstructions, Clim. Past, 8, 963-976, doi:10.5194/cp-8-963-2012, 2012. Caillouet, L., Vidal, J.-P., Sauquet, E., and Graff, B.: Probabilistic precipitation and temperature downscaling of the Twentieth Century Reanalysis over France, Clim. Past, 12, 635-662, doi:10.5194/cp-12-635-2016, 2016. Garnier, E.: Sécheresses et canicules avant le Global Warming - 1500-1950, in: Canicules et froids extrêmes. L'Événement climatique et ses représentations (II) Histoire, littérature, peinture (Berchtold, J., Le Roy Ladurie, E., Sermain, J.-P., and Vasak, A., Eds.), 297-325, Hermann, 2012.
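
A single-observation Ensemble Square Root Filter update of the kind referenced above fits in a few lines of numpy. The ensemble, station value, and error variance below are invented toy numbers; the study applies this to the full SCOPE ensemble and the historical station network:

```python
import numpy as np

rng = np.random.default_rng(2)

# Prior ensemble of daily temperatures on a toy 5-point grid (20 members),
# standing in for the SCOPE reconstruction members.
n_ens, n_grid = 20, 5
ens = 20.0 + rng.normal(0.0, 2.0, (n_ens, n_grid))
obs_idx, y_obs, r = 2, 17.0, 0.5 ** 2          # station at point 2, obs error var

x_mean = ens.mean(axis=0)
X = ens - x_mean                               # ensemble perturbations
hx = X[:, obs_idx]                             # perturbations in observation space
phT = X.T @ hx / (n_ens - 1)                   # cov(state, obs), i.e. P H^T
hphT = hx @ hx / (n_ens - 1)                   # prior variance at the station

K = phT / (hphT + r)                           # Kalman gain
alpha = 1.0 / (1.0 + np.sqrt(r / (hphT + r)))  # square-root reduction factor

x_mean_a = x_mean + K * (y_obs - x_mean[obs_idx])   # update the ensemble mean
X_a = X - alpha * np.outer(hx, K)                   # update the perturbations
ens_a = x_mean_a + X_a                              # posterior ensemble
```

The update pulls the mean toward the station value and shrinks the ensemble spread there, which is exactly the reduced-uncertainty effect reported for the 1893 reconstruction.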

  12. Reanalysis of Tyrannosaurus rex Mass Spectra

    PubMed Central

    Bern, Marshall; Phinney, Brett S.; Goldberg, David

    2009-01-01

    Asara et al. reported the detection of collagen peptides in a 68-million-year-old T. rex bone by shotgun proteomics. This finding has been called into question as a possible statistical artifact. We reanalyze Asara et al.'s tandem mass spectra using a different search engine and different statistical tools. Our reanalysis shows a sample containing common laboratory contaminants, soil bacteria, and bird-like hemoglobin and collagen. PMID:19603827

  13. MERRA-2: File Specification

    NASA Technical Reports Server (NTRS)

    Bosilovich, M. G.; Lucchesi, R.; Suarez, M.

    2015-01-01

    The second Modern-Era Retrospective analysis for Research and Applications (MERRA-2) is a NASA atmospheric reanalysis that begins in 1980. It replaces the original MERRA reanalysis (Rienecker et al., 2011) using an upgraded version of the Goddard Earth Observing System Model, Version 5 (GEOS-5) data assimilation system. The file collections for MERRA-2 are described in detail in this document, including some important changes from those of the MERRA dataset (Lucchesi, 2012).

  14. Temporal and spatial variability of wind resources in the United States as derived from the Climate Forecast System Reanalysis

    Treesearch

    Lejiang Yu; Shiyuan Zhong; Xindi Bian; Warren E. Heilman

    2015-01-01

    This study examines the spatial and temporal variability of wind speed at 80m above ground (the average hub height of most modern wind turbines) in the contiguous United States using Climate Forecast System Reanalysis (CFSR) data from 1979 to 2011. The mean 80-m wind exhibits strong seasonality and large spatial variability, with higher (lower) wind speeds in the...

  15. Electrocortical correlations between pairs of isolated people: A reanalysis

    PubMed Central

    Radin, Dean

    2017-01-01

    A previously reported experiment collected electrocortical data recorded simultaneously in pairs of people separated by distance. Reanalysis of those data confirmed the presence of a time-synchronous, statistically significant correlation in brain electrical activity of these distant “sender-receiver” pairs. Given the sensory shielding employed in the original experiment to avoid mundane explanations for such a correlation, this outcome is suggestive of an anomalous intersubjective connection. PMID:28713556

  16. 10 CFR 50.46 - Acceptance criteria for emergency core cooling systems for light-water nuclear power reactors.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... schedule for providing a reanalysis or taking other action as may be needed to show compliance with § 50.46... schedule for providing a reanalysis or taking other action as may be needed to show compliance with § 50.46... generated from the chemical reaction of the cladding with water or steam shall not exceed 0.01 times the...

  17. A Method for Snow Reanalysis: The Sierra Nevada (USA) Example

    NASA Technical Reports Server (NTRS)

    Girotto, Manuela; Margulis, Steven; Cortes, Gonzalo; Durand, Michael

    2017-01-01

    This work presents a state-of-the-art methodology for constructing a snow water equivalent (SWE) reanalysis. The method comprises two main components: (1) a coupled land surface model and snow depletion curve model, which is used to generate an ensemble of predictions of SWE and snow cover area for a given set of (uncertain) inputs, and (2) a reanalysis step, which updates the estimation variables to be consistent with the satellite-observed depletion of the fractional snow cover time series. This method was applied over the Sierra Nevada (USA) based on the assimilation of remotely sensed fractional snow-covered area data from the Landsat 5-8 record (1985-2016). The verified dataset (based on a comparison with over 9000 station-years of in situ data) exhibited mean and root-mean-square errors of less than 3 and 13 cm, respectively, and correlation greater than 0.95 compared with in situ SWE observations. The fully Bayesian method, the daily 90-meter resolution, the 31-year temporal extent, and the accuracy provide a unique dataset for investigating snow processes. This presentation illustrates how the reanalysis dataset was used to provide a basic accounting of the stored snowpack water in the Sierra Nevada over the last 31 years and ultimately improve real-time streamflow predictions.
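
    The reanalysis step, reweighting an ensemble of model predictions by how well each member matches the observed snow-cover depletion, can be illustrated with a toy Bayesian particle-weighting update. The depletion curve and all numbers below are invented; the actual method assimilates the full Landsat fSCA time series:

    ```python
    import numpy as np

    rng = np.random.default_rng(3)

    # Ensemble of peak SWE predictions (m) from perturbed inputs (toy prior).
    n_ens = 100
    peak_swe = rng.uniform(0.1, 1.0, n_ens)

    # Crude stand-in for the snow depletion curve: fractional snow-covered
    # area (fSCA) increases with SWE and saturates at 1.
    fsca_model = np.clip(peak_swe / 0.8, 0.0, 1.0)

    # One satellite fSCA observation with an assumed Gaussian error.
    fsca_obs, obs_err = 0.9, 0.1

    # Bayesian reweighting: likelihood weights, normalised to sum to one.
    w = np.exp(-0.5 * ((fsca_model - fsca_obs) / obs_err) ** 2)
    w /= w.sum()

    prior_swe = peak_swe.mean()           # prior estimate
    posterior_swe = np.sum(w * peak_swe)  # reanalysis estimate after the update
    ```

    Members whose modeled snow cover disagrees with the satellite observation are down-weighted, so the posterior SWE estimate is conditioned on the observed depletion even though SWE itself is never directly observed.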

  18. Sensitivity analysis and approximation methods for general eigenvalue problems

    NASA Technical Reports Server (NTRS)

    Murthy, D. V.; Haftka, R. T.

    1986-01-01

    Optimization of dynamic systems involving complex non-Hermitian matrices is often computationally expensive. Major contributors to the computational expense are the sensitivity analysis and the reanalysis of a modified design. The present work seeks to alleviate this computational burden by identifying efficient sensitivity analysis and approximate reanalysis methods. For the algebraic eigenvalue problem involving non-Hermitian matrices, algorithms for sensitivity analysis and approximate reanalysis are classified, compared and evaluated for efficiency and accuracy. Proper eigenvector normalization is discussed. An improved method for calculating derivatives of eigenvectors is proposed, based on a more rational normalization condition and taking advantage of matrix sparsity. Important numerical aspects of this method are also discussed. To alleviate the cost of reanalysis, various approximation methods for eigenvalues are proposed and evaluated. Linear and quadratic approximations are based directly on the Taylor series. Several approximation methods are developed based on the generalized Rayleigh quotient for the eigenvalue problem. Approximation methods based on the trace theorem give high accuracy without requiring any derivatives. Operation counts for the computation of the approximations are given. General recommendations are made for the selection of an appropriate approximation technique as a function of the matrix size, the number of design variables, the number of eigenvalues of interest, and the number of design points at which approximations are sought.
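
    The Rayleigh-quotient idea behind such approximate reanalysis is easy to demonstrate for a symmetric matrix. The abstract's setting is the harder non-Hermitian case with the generalized Rayleigh quotient, so treat this strictly as an illustrative special case with invented matrices:

    ```python
    import numpy as np

    rng = np.random.default_rng(4)

    # Baseline matrix (symmetric for simplicity) and a small design change.
    n = 6
    A = rng.normal(size=(n, n)); A = A + A.T
    dA = 0.01 * rng.normal(size=(n, n)); dA = dA + dA.T

    w, V = np.linalg.eigh(A)
    v0 = V[:, 0]                                # eigenvector of the *old* design

    # Approximate reanalysis: Rayleigh quotient of the old eigenvector with
    # the new matrix; its error is second order in the perturbation.
    approx = v0 @ (A + dA) @ v0 / (v0 @ v0)

    # Exact reanalysis for comparison (the expensive step being avoided).
    exact = np.linalg.eigh(A + dA)[0][0]
    ```

    The approximation costs one matrix-vector product per design point instead of a full eigensolution, which is the trade-off the abstract's operation counts quantify.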

  19. The critical period hypothesis in second language acquisition: a statistical critique and a reanalysis.

    PubMed

    Vanhove, Jan

    2013-01-01

    In second language acquisition research, the critical period hypothesis (cph) holds that the function between learners' age and their susceptibility to second language input is non-linear. This paper revisits the indistinctness found in the literature with regard to this hypothesis's scope and predictions. Even when its scope is clearly delineated and its predictions are spelt out, however, empirical studies-with few exceptions-use analytical (statistical) tools that are irrelevant with respect to the predictions made. This paper discusses statistical fallacies common in cph research and illustrates an alternative analytical method (piecewise regression) by means of a reanalysis of two datasets from a 2010 paper purporting to have found cross-linguistic evidence in favour of the cph. This reanalysis reveals that the specific age patterns predicted by the cph are not cross-linguistically robust. Applying the principle of parsimony, it is concluded that age patterns in second language acquisition are not governed by a critical period. To conclude, this paper highlights the role of confirmation bias in the scientific enterprise and appeals to second language acquisition researchers to reanalyse their old datasets using the methods discussed in this paper. The data and R commands that were used for the reanalysis are provided as supplementary materials.
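    Piecewise regression of the kind advocated here can be sketched by profiling a candidate breakpoint over a grid of ages and keeping the best-fitting one. This is an illustrative sketch on simulated data, not the paper's supplementary R code; the ages, scores, and breakpoint location are invented.

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated age-of-acquisition vs. ultimate attainment: scores decline up to
# a breakpoint at age 18 and are flat afterwards (the pattern cph predicts).
age = rng.uniform(1, 50, 300)
true_bp = 18.0
score = 100 - 2.0 * np.minimum(age, true_bp) + rng.normal(0, 3, age.size)

def fit_piecewise(x, y, bp):
    """Least-squares fit of y ~ b0 + b1*x + b2*max(0, x - bp)."""
    X = np.column_stack([np.ones_like(x), x, np.maximum(0, x - bp)])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return beta, float(resid @ resid)

# Profile the breakpoint over a grid and keep the one minimizing the RSS.
grid = np.arange(5, 40, 0.5)
rss = [fit_piecewise(age, score, bp)[1] for bp in grid]
best_bp = grid[int(np.argmin(rss))]
beta, _ = fit_piecewise(age, score, best_bp)

print(f"estimated breakpoint: {best_bp:.1f} (true {true_bp})")
print(f"slope before: {beta[1]:.2f}, slope after: {beta[1] + beta[2]:.2f}")
```

    A discontinuity in slope at the estimated breakpoint (here, a negative slope before and a near-zero slope after) is the kind of non-linearity the hypothesis predicts and that a single straight-line fit cannot test.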

  20. Impact of uncertainty in surface forcing on the new SODA 3 global reanalysis

    NASA Astrophysics Data System (ADS)

    Carton, J.; Chepurin, G. A.; Chen, L.

    2016-02-01

    An updated version of the Simple Ocean Data Assimilation reanalysis (SODA 3) has been constructed based on GFDL MOM ocean and sea ice numerics, with improved resolution and other changes. A series of three 30+ year long global ocean reanalysis experiments (1980-2014) has been carried out which differ only in the choice of specified daily surface heat, momentum, and freshwater forcing: MERRA2, ERA-Int, and ERA-20. The first two forcing data sets make extensive use of satellite observations, while the third only uses surface observations. The differences among the resulting SODA reanalysis experiments allow us to explore a major source of error in ocean reanalyses: the uncertainty introduced by errors in the surface forcing. The modest differences among the experiments tend to be concentrated at higher latitudes, where the MERRA2-SODA has a somewhat cooler (1 C), saltier (1 psu) surface leading to lower (10 cm) sea level. Cooler conditions affect the upper 300 m heat content at high latitude (although MERRA2-SODA HC300 is higher in the subtropics). RMS differences are small except for surface salinity at high latitude (1 psu). The implications for such issues as thermosteric sea level, the overturning circulation, and the rise of global heat storage will be discussed.

  1. Cloud-Enabled Climate Analytics-as-a-Service using Reanalysis data: A case study.

    NASA Astrophysics Data System (ADS)

    Nadeau, D.; Duffy, D.; Schnase, J. L.; McInerney, M.; Tamkin, G.; Potter, G. L.; Thompson, J. H.

    2014-12-01

    The NASA Center for Climate Simulation (NCCS) maintains advanced data capabilities and facilities that allow researchers to access the enormous volume of data generated by weather and climate models. The NASA Climate Model Data Service (CDS) and the NCCS are merging their efforts to provide Climate Analytics-as-a-Service for the comparative study of the major reanalysis projects: ECMWF ERA-Interim, NASA/GMAO MERRA, NOAA/NCEP CFSR, NOAA/ESRL 20CR, JMA JRA25, and JRA55. These reanalyses have been repackaged to netCDF4 file format following the CMIP5 Climate and Forecast (CF) metadata convention prior to being sequenced into the Hadoop Distributed File System (HDFS). A small set of operations that represent a common starting point in many analysis workflows was then created: min, max, sum, count, variance and average. In this example, reanalysis data exploration was performed with the use of Hadoop MapReduce, and accessibility was achieved using the Climate Data Service (CDS) application programming interface (API) created at the NCCS. This API provides a uniform treatment of large amounts of data. In this case study, we have limited our exploration to two variables, temperature and precipitation, using three operations (min, max and avg) and 30 years of reanalysis data for three regions of the world: global, polar, subtropical.
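    The map/reduce pattern behind such operations can be sketched in plain Python: each "mapper" emits an associative partial statistic for one data block, and the "reducer" merges partials in any order. The temperature values and block layout below are hypothetical; a real Hadoop job would distribute these same two steps across HDFS splits.

```python
from functools import reduce

# Each "mapper" emits a partial statistic for one data block; the "reducer"
# merges partials associatively, mirroring the Hadoop MapReduce pattern.
def map_block(values):
    return {"min": min(values), "max": max(values),
            "sum": sum(values), "count": len(values)}

def reduce_stats(a, b):
    return {"min": min(a["min"], b["min"]), "max": max(a["max"], b["max"]),
            "sum": a["sum"] + b["sum"], "count": a["count"] + b["count"]}

# Hypothetical per-region temperature blocks (one per file split), in kelvin.
blocks = [[280.1, 281.4, 279.9], [275.2, 276.8], [290.0, 288.5, 289.1, 287.9]]

partials = [map_block(b) for b in blocks]          # map phase
stats = reduce(reduce_stats, partials)             # reduce phase
stats["avg"] = stats["sum"] / stats["count"]

print(stats)
```

    The average is derived from the mergeable sum and count rather than reduced directly, since a mean of means is not associative.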

  2. Sensitivity of a numerical wave model on wind re-analysis datasets

    NASA Astrophysics Data System (ADS)

    Lavidas, George; Venugopal, Vengatesan; Friedrich, Daniel

    2017-03-01

    Wind is the dominant process for wave generation. Detailed evaluation of metocean conditions strengthens our understanding of issues concerning potential offshore applications. However, the scarcity of buoys and the high cost of monitoring systems pose a barrier to properly defining offshore conditions. Through the use of numerical wave models, metocean conditions can be hindcasted and forecasted, providing reliable characterisations. This study reports the sensitivity of a numerical wave model to wind inputs for the Scottish region. Two re-analysis wind datasets with different spatio-temporal characteristics are used, the ERA-Interim Re-Analysis and the CFSR-NCEP Re-Analysis dataset. Different wind products alter results, affecting the accuracy obtained. The scope of this study is to assess different available wind databases and provide information concerning the most appropriate wind dataset for the specific region, based on temporal, spatial and geographic terms for wave modelling and offshore applications. Both wind input datasets delivered results from the numerical wave model with good correlation. Wave results from the 1-h dataset have higher peaks and lower biases, at the expense of a higher scatter index. On the other hand, the 6-h dataset has lower scatter but higher biases. The study shows how the wind dataset affects numerical wave modelling performance, and that depending on location and study needs, different wind inputs should be considered.
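    The error statistics used to compare such model runs (bias, RMSE, scatter index) can be sketched as follows. The wave heights are invented, and the scatter-index definition used here (centred RMSE normalised by the observed mean) is one common convention, not necessarily the exact one used in this study.

```python
import numpy as np

# Hypothetical significant wave height (m): buoy observations vs. model runs
# forced by two different wind reanalyses.
obs = np.array([1.2, 2.5, 3.1, 0.8, 1.9, 2.2])
model_a = np.array([1.4, 2.6, 3.5, 0.9, 2.1, 2.4])   # e.g. 1-h forcing
model_b = np.array([1.1, 2.2, 2.8, 0.7, 1.7, 2.0])   # e.g. 6-h forcing

def metrics(mod, obs):
    bias = np.mean(mod - obs)                        # systematic offset
    rmse = np.sqrt(np.mean((mod - obs) ** 2))        # total error
    # Scatter index: RMSE after removing the mean bias, relative to obs mean.
    si = np.sqrt(np.mean(((mod - mod.mean()) - (obs - obs.mean())) ** 2)) / obs.mean()
    return {"bias": bias, "rmse": rmse, "scatter_index": si}

for name, mod in [("run A", model_a), ("run B", model_b)]:
    m = metrics(mod, obs)
    print(name, {k: round(v, 3) for k, v in m.items()})
```

    Separating bias from scatter matters here because, as the abstract notes, one forcing can win on bias while losing on scatter index.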

  3. A genome-wide 20 K citrus microarray for gene expression analysis

    PubMed Central

    Martinez-Godoy, M Angeles; Mauri, Nuria; Juarez, Jose; Marques, M Carmen; Santiago, Julia; Forment, Javier; Gadea, Jose

    2008-01-01

    Background Understanding of genetic elements that contribute to key aspects of citrus biology will impact future improvements in this economically important crop. Global gene expression analysis demands microarray platforms with a high genome coverage. In recent years, genome-wide EST collections have been generated in citrus, opening the possibility to create new tools for functional genomics in this crop plant. Results We have designed and constructed a publicly available genome-wide cDNA microarray that includes 21,081 putative unigenes of citrus. As a functional companion to the microarray, a web-browsable database [1] was created and populated with information about the unigenes represented in the microarray, including cDNA libraries, isolated clones, raw and processed nucleotide and protein sequences, and results of all the structural and functional annotation of the unigenes, such as general description, BLAST hits, putative Arabidopsis orthologs, microsatellites, putative SNPs, GO classification and PFAM domains. We have performed a Gene Ontology comparison with the full set of Arabidopsis proteins to estimate the genome coverage of the microarray. We have also performed microarray hybridizations to check its usability. Conclusion This new cDNA microarray replaces the first 7K microarray generated two years ago and allows gene expression analysis at a more global scale. We have followed a rational design to minimize cross-hybridization while maintaining its utility for different citrus species. Furthermore, we also provide access to a website with full structural and functional annotation of the unigenes represented in the microarray, along with the ability to use this site to directly perform gene expression analysis using standard tools at different publicly available servers.
Finally, we show that this microarray offers a good representation of the citrus genome and demonstrate the usefulness of this genomic tool for global studies in citrus by using it to catalogue genes expressed in citrus globular embryos. PMID:18598343

  4. An evaluation of two-channel ChIP-on-chip and DNA methylation microarray normalization strategies

    PubMed Central

    2012-01-01

    Background The combination of chromatin immunoprecipitation with two-channel microarray technology enables genome-wide mapping of binding sites of DNA-interacting proteins (ChIP-on-chip) or sites with methylated CpG di-nucleotides (DNA methylation microarray). These powerful tools are the gateway to understanding gene transcription regulation. Since the goals of such studies, the sample preparation procedures, the microarray content and the study design all differ from those of transcriptomics microarrays, the data pre-processing strategies traditionally applied to transcriptomics microarrays may not be appropriate. Particularly, the main challenge of the normalization of "regulation microarrays" is (i) to make the data of individual microarrays quantitatively comparable and (ii) to keep the signals of the enriched probes, representing DNA sequences from the precipitate, as distinguishable as possible from the signals of the un-enriched probes, representing DNA sequences largely absent from the precipitate. Results We compare several widely used normalization approaches (VSN, LOWESS, quantile, T-quantile, Tukey's biweight scaling, Peng's method) applied to a selection of regulation microarray datasets, ranging from DNA methylation to transcription factor binding and histone modification studies. Through comparison of the data distributions of control probes and gene promoter probes before and after normalization, and assessment of the power to identify known enriched genomic regions after normalization, we demonstrate that there are clear differences in performance between normalization procedures. Conclusion T-quantile normalization applied separately on the channels and Tukey's biweight scaling outperform other methods in terms of the conservation of enriched and un-enriched signal separation, as well as in identification of genomic regions known to be enriched. T-quantile normalization is preferable as it additionally improves comparability between microarrays.
In contrast, popular normalization approaches like quantile, LOWESS, Peng's method and VSN normalization alter the data distributions of regulation microarrays to such an extent that using these approaches will impact the reliability of the downstream analysis substantially. PMID:22276688
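    Plain quantile normalization, one of the approaches the authors caution against for regulation microarrays, is simple to sketch: it forces every array to share a single intensity distribution, which is precisely what can erase a genuine enrichment signal. The data below are simulated; the implementation is a minimal textbook version, not any specific package's.

```python
import numpy as np

def quantile_normalize(X):
    """Force every column (array) to share the same intensity distribution:
    rank each column, then replace each value by the mean across columns at
    that rank."""
    ranks = np.argsort(np.argsort(X, axis=0), axis=0)
    mean_by_rank = np.sort(X, axis=0).mean(axis=1)
    return mean_by_rank[ranks]

rng = np.random.default_rng(3)
# Three hypothetical microarrays sharing a signal but with different
# scales and offsets.
base = rng.lognormal(0, 1, size=(1000, 1))
X = base * np.array([1.0, 1.5, 0.7]) + np.array([0.0, 0.2, -0.1])

Xn = quantile_normalize(X)

# After normalization, all columns contain identical sorted values.
print(np.allclose(np.sort(Xn[:, 0]), np.sort(Xn[:, 1])))
```

    T-quantile normalization, which the authors recommend, applies this same operation separately within groups (e.g. per channel) rather than across all arrays at once, preserving systematic between-group differences.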

  5. Advantages of RNA-seq compared to RNA microarrays for transcriptome profiling of anterior cruciate ligament tears.

    PubMed

    Rai, Muhammad Farooq; Tycksen, Eric D; Sandell, Linda J; Brophy, Robert H

    2018-01-01

    Microarrays and RNA-seq are at the forefront of high-throughput transcriptome analyses. Since these methodologies are based on different principles, there are concerns about the concordance of data between the two techniques. The concordance of RNA-seq and microarrays for genome-wide analysis of differential gene expression has not been rigorously assessed in clinically derived ligament tissues. To demonstrate the concordance between RNA-seq and microarrays and to assess potential benefits of RNA-seq over microarrays, we assessed differences in transcript expression in anterior cruciate ligament (ACL) tissues based on time-from-injury. ACL remnants were collected from patients with an ACL tear at the time of ACL reconstruction. RNA prepared from torn ACL remnants was subjected to Agilent microarrays (N = 24) and RNA-seq (N = 8). The correlation of biological replicates in RNA-seq and microarray data was similar (0.98 vs. 0.97), demonstrating that each platform has high internal reproducibility. Correlations between the RNA-seq data and the individual microarrays were low, but correlations between the RNA-seq values and the geometric mean of the microarray values were moderate. The cross-platform concordance for differentially expressed transcripts or enriched pathways was linearly correlated (r = 0.64). RNA-seq was superior in detecting low-abundance transcripts and differentiating biologically critical isoforms. Additional independent validation of transcript expression was undertaken using microfluidic PCR for selected genes. PCR data showed 100% concordance (in expression pattern) with RNA-seq and microarray data. These findings demonstrate that RNA-seq has advantages over microarrays for transcriptome profiling of ligament tissues when available and affordable. Furthermore, these findings are likely transferable to other musculoskeletal tissues where tissue collection is challenging and cells are in low abundance. © 2017 Orthopaedic Research Society.
Published by Wiley Periodicals, Inc. J Orthop Res 36:484-497, 2018.
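    The reported pattern (low correlation between RNA-seq and individual arrays, higher correlation against an average over arrays) can be illustrated with simulated fold-changes. All numbers below are invented and only mimic the qualitative behaviour: averaging replicate arrays suppresses platform noise while the shared biological signal remains.

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical log2 fold-changes for 500 transcripts measured on both
# platforms: a shared biological signal plus platform-specific noise.
signal = rng.normal(0, 1.0, 500)
lfc_rnaseq = signal + rng.normal(0, 0.5, 500)
lfc_array = 0.7 * signal + rng.normal(0, 0.5, 500)   # arrays compress range

r = np.corrcoef(lfc_rnaseq, lfc_array)[0, 1]
print(f"correlation with a single array:  r = {r:.2f}")

# Averaging several simulated replicate arrays boosts agreement, mirroring
# the geometric-mean observation in the abstract.
arrays = np.stack([0.7 * signal + rng.normal(0, 0.5, 500) for _ in range(6)])
r_mean = np.corrcoef(lfc_rnaseq, arrays.mean(axis=0))[0, 1]
print(f"correlation with averaged arrays: r = {r_mean:.2f}")
```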

  6. The effect of column purification on cDNA indirect labelling for microarrays

    PubMed Central

    Molas, M Lia; Kiss, John Z

    2007-01-01

    Background The reproducibility of microarray results depends upon the performance of standardized procedures. Since the introduction of microarray technology for the analysis of global gene expression, reproducibility of results among different laboratories has been a major problem. Two of the main contributors to this variability are the use of different microarray platforms and different laboratory practices. In this paper, we address the latter question in terms of how variation in one of the steps of a labelling procedure affects the cDNA product prior to microarray hybridization. Results We used a standard procedure to label cDNA for microarray hybridization and employed different types of column chromatography for cDNA purification. After purifying labelled cDNA, we used the Agilent 2100 Bioanalyzer and agarose gel electrophoresis to assess the quality of the labelled cDNA before its hybridization onto a microarray platform. There were major differences in the cDNA profile (i.e. cDNA fragment lengths and abundance) as a result of using four different columns for purification. In addition, different columns have different efficiencies to remove rRNA contamination. This study indicates that the appropriate column to use in this type of protocol has to be experimentally determined. Finally, we present new evidence establishing the importance of testing the method of purification used during an indirect labelling procedure. Our results confirm the importance of assessing the quality of the sample in the labelling procedure prior to hybridization onto a microarray platform. Conclusion Standardization of column purification systems to be used in labelling procedures will improve the reproducibility of microarray results among different laboratories. In addition, implementation of a quality control check point of the labelled samples prior to microarray hybridization will prevent hybridizing a poor-quality sample to expensive microarrays. PMID:17597522

  7. The effect of column purification on cDNA indirect labelling for microarrays.

    PubMed

    Molas, M Lia; Kiss, John Z

    2007-06-27

    The reproducibility of microarray results depends upon the performance of standardized procedures. Since the introduction of microarray technology for the analysis of global gene expression, reproducibility of results among different laboratories has been a major problem. Two of the main contributors to this variability are the use of different microarray platforms and different laboratory practices. In this paper, we address the latter question in terms of how variation in one of the steps of a labelling procedure affects the cDNA product prior to microarray hybridization. We used a standard procedure to label cDNA for microarray hybridization and employed different types of column chromatography for cDNA purification. After purifying labelled cDNA, we used the Agilent 2100 Bioanalyzer and agarose gel electrophoresis to assess the quality of the labelled cDNA before its hybridization onto a microarray platform. There were major differences in the cDNA profile (i.e. cDNA fragment lengths and abundance) as a result of using four different columns for purification. In addition, different columns have different efficiencies to remove rRNA contamination. This study indicates that the appropriate column to use in this type of protocol has to be experimentally determined. Finally, we present new evidence establishing the importance of testing the method of purification used during an indirect labelling procedure. Our results confirm the importance of assessing the quality of the sample in the labelling procedure prior to hybridization onto a microarray platform. Standardization of column purification systems to be used in labelling procedures will improve the reproducibility of microarray results among different laboratories. In addition, implementation of a quality control check point of the labelled samples prior to microarray hybridization will prevent hybridizing a poor-quality sample to expensive microarrays.

  8. The Glycan Microarray Story from Construction to Applications.

    PubMed

    Hyun, Ji Young; Pai, Jaeyoung; Shin, Injae

    2017-04-18

    Not only are glycan-mediated binding processes in cells and organisms essential for a wide range of physiological processes, but they are also implicated in various pathological processes. As a result, elucidation of glycan-associated biomolecular interactions and their consequences is of great importance in basic biological research and biomedical applications. In 2002, we and others were the first to utilize glycan microarrays in efforts aimed at the rapid analysis of glycan-associated recognition events. Because they contain a number of glycans immobilized in a dense and orderly manner on a solid surface, glycan microarrays enable multiple parallel analyses of glycan-protein binding events while utilizing only small amounts of glycan samples. Therefore, this microarray technology has become a leading-edge tool in studies aimed at elucidating roles played by glycans and glycan binding proteins in biological systems. In this Account, we summarize our efforts on the construction of glycan microarrays and their applications in studies of glycan-associated interactions. Immobilization strategies of functionalized and unmodified glycans on derivatized glass surfaces are described. Although others have developed immobilization techniques, our efforts have focused on improving the efficiencies and operational simplicity of microarray construction. The microarray-based technology has been most extensively used for rapid analysis of the glycan binding properties of proteins. In addition, glycan microarrays have been employed to determine glycan-protein interactions quantitatively, detect pathogens, and rapidly assess substrate specificities of carbohydrate-processing enzymes. More recently, the microarrays have been employed to identify functional glycans that elicit cell surface lectin-mediated cellular responses.
Owing to these efforts, it is now possible to use glycan microarrays to expand the understanding of roles played by glycans and glycan binding proteins in biological systems.

  9. Two-Dimensional VO2 Mesoporous Microarrays for High-Performance Supercapacitor

    NASA Astrophysics Data System (ADS)

    Fan, Yuqi; Ouyang, Delong; Li, Bao-Wen; Dang, Feng; Ren, Zongming

    2018-05-01

    Two-dimensional (2D) mesoporous VO2 microarrays have been prepared using an organic-inorganic liquid interface. The units of the microarrays consist of needle-like VO2 particles with a mesoporous structure, in which crack-like pores with a pore size of about 2 nm and a depth of 20-100 nm are distributed on the particle surface. The liquid interface acts as a template for the formation of the 2D microarrays, as identified from kinetic observations. Due to the mesoporous structure of the units and the high conductivity of the microarray, such 2D VO2 microarrays exhibit a high specific capacitance of 265 F/g at 1 A/g and excellent rate capability (182 F/g at 10 A/g) and cycling stability, suggesting the role of this unique microstructure in improving the electrochemical performance.

  10. Plant-pathogen interactions: what microarray tells about it?

    PubMed

    Lodha, T D; Basak, J

    2012-01-01

    Plant defense responses are mediated by elementary regulatory proteins that affect expression of thousands of genes. Over the last decade, microarray technology has played a key role in deciphering the underlying networks of gene regulation in plants that lead to a wide variety of defence responses. Microarray technology is an important tool for quantifying and profiling the expression of thousands of genes simultaneously, with two main aims: (1) gene discovery and (2) global expression profiling. Several microarray technologies are currently in use; most include a glass slide platform with spotted cDNA or oligonucleotides. To date, microarray technology has been used in the identification of regulatory genes and end-point defence genes, and to understand the signal transduction processes underlying disease resistance and its intimate links to other physiological pathways. Microarray technology can be used for in-depth, simultaneous profiling of host/pathogen genes as the disease progresses from infection to resistance/susceptibility at different developmental stages of the host, which can be done in different environments, for a clearer understanding of the processes involved. A thorough knowledge of plant disease resistance, using a successful combination of microarray and other high-throughput techniques, as well as biochemical, genetic, and cell biological experiments, is needed for practical application to secure and stabilize the yield of many crop plants. This review starts with a brief introduction to microarray technology, followed by the basics of plant-pathogen interaction, the use of DNA microarrays over the last decade to unravel the mysteries of plant-pathogen interaction, and ends with the future prospects of this technology.

  11. Confident difference criterion: a new Bayesian differentially expressed gene selection algorithm with applications.

    PubMed

    Yu, Fang; Chen, Ming-Hui; Kuo, Lynn; Talbott, Heather; Davis, John S

    2015-08-07

    Recently, Bayesian methods have become more popular for analyzing high-dimensional gene expression data, as they allow us to borrow information across different genes and provide powerful estimators for evaluating gene expression levels. It is crucial to develop a simple but efficient gene selection algorithm for detecting differentially expressed (DE) genes based on the Bayesian estimators. In this paper, by extending the two-criterion idea of Chen et al. (Chen M-H, Ibrahim JG, Chi Y-Y. A new class of mixture models for differential gene expression in DNA microarray data. J Stat Plan Inference. 2008;138:387-404), we propose two new gene selection algorithms for general Bayesian models and name these new methods the confident difference criterion methods. One is based on the standardized differences between two mean expression values among genes; the other adds the differences between two variances to it. The proposed confident difference criterion methods first evaluate the posterior probability of a gene having different gene expressions between competitive samples and then declare a gene to be DE if the posterior probability is large. The theoretical connection between the proposed first method based on the means and the Bayes factor approach proposed by Yu et al. (Yu F, Chen M-H, Kuo L. Detecting differentially expressed genes using calibrated Bayes factors. Statistica Sinica. 2008;18:783-802) is established under the normal-normal model with equal variances between two samples. The empirical performance of the proposed methods is examined and compared to those of several existing methods via several simulations. The results from these simulation studies show that the proposed confident difference criterion methods outperform the existing methods when comparing gene expressions across different conditions for both microarray studies and sequence-based high-throughput studies. A real dataset is used to further demonstrate the proposed methodology.
In the real data application, the confident difference criterion methods successfully identified more clinically important DE genes than the other methods. The confident difference criterion method proposed in this paper provides a new efficient approach for both microarray studies and sequence-based high-throughput studies to identify differentially expressed genes.
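    The mean-based criterion can be sketched as a posterior probability that the standardized difference between two group means exceeds a threshold, with a gene declared DE when that probability is large. This is a simplified stand-in (flat-prior normal model, Monte Carlo draws), not the authors' full hierarchical model; the thresholds and data below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)

def confident_difference(y1, y2, threshold=1.0, prob_cut=0.95, draws=4000):
    """Posterior probability that the standardized difference of the two
    group means exceeds `threshold`. Under a flat-prior normal model each
    posterior mean is a scaled t; declare DE when the probability is
    above `prob_cut`."""
    def post_mean(y):
        n, m, s = len(y), np.mean(y), np.std(y, ddof=1)
        return m + s / np.sqrt(n) * rng.standard_t(n - 1, size=draws)
    pooled_sd = np.sqrt((np.var(y1, ddof=1) + np.var(y2, ddof=1)) / 2)
    diff = np.abs(post_mean(y1) - post_mean(y2)) / pooled_sd
    prob = float(np.mean(diff > threshold))
    return prob, prob > prob_cut

# A clearly shifted gene vs. a near-null gene (log2 scale, 8 replicates each).
genes = {
    "shifted": (rng.normal(5.0, 0.5, 8), rng.normal(7.0, 0.5, 8)),
    "null": (rng.normal(5.0, 0.5, 8), rng.normal(5.1, 0.5, 8)),
}
results = {name: confident_difference(a, b) for name, (a, b) in genes.items()}
for name, (p, is_de) in results.items():
    print(f"{name}: P(|diff|/sd > 1) = {p:.3f}, DE = {is_de}")
```

    Thresholding a posterior probability, rather than a point estimate of the fold change, is what lets the criterion account for gene-specific variability.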

  12. geneCBR: a translational tool for multiple-microarray analysis and integrative information retrieval for aiding diagnosis in cancer research.

    PubMed

    Glez-Peña, Daniel; Díaz, Fernando; Hernández, Jesús M; Corchado, Juan M; Fdez-Riverola, Florentino

    2009-06-18

    Bioinformatics and medical informatics are two research fields that serve the needs of different but related communities. Both domains share the common goal of providing new algorithms, methods and technological solutions to biomedical research, and contributing to the treatment and cure of diseases. Although different microarray techniques have been successfully used to investigate useful information for cancer diagnosis at the gene expression level, the true integration of existing methods into day-to-day clinical practice is still a long way off. Within this context, case-based reasoning emerges as a suitable paradigm especially intended for the development of biomedical informatics applications and decision support systems, given the support and collaboration involved in such a translational development. With the goals of removing barriers against multi-disciplinary collaboration and facilitating the dissemination and transfer of knowledge to real practice, case-based reasoning systems have the potential to be applied to translational research mainly because their computational reasoning paradigm is similar to the way clinicians gather, analyze and process information in their own practice of clinical medicine. In addressing the issue of bridging the existing gap between biomedical researchers and clinicians who work in the domain of cancer diagnosis, prognosis and treatment, we have developed and made accessible a common interactive framework. Our geneCBR system implements a freely available software tool that allows the use of combined techniques that can be applied to gene selection, clustering, knowledge extraction and prediction for aiding diagnosis in cancer research. For biomedical researchers, geneCBR expert mode offers a core workbench for designing and testing new techniques and experiments.
For pathologists or oncologists, geneCBR diagnostic mode implements an effective and reliable system that can diagnose cancer subtypes based on the analysis of microarray data using a CBR architecture. For programmers, geneCBR programming mode includes an advanced editing module for run-time modification of previously coded techniques. geneCBR is a new translational tool that can effectively support the integrative work of programmers, biomedical researchers and clinicians working together in a common framework. The code is freely available under the GPL license and can be obtained at http://www.genecbr.org.

  13. Clustering-based spot segmentation of cDNA microarray images.

    PubMed

    Uslan, Volkan; Bucak, Ihsan Ömür

    2010-01-01

    Microarrays are utilized because they provide useful information about thousands of gene expressions simultaneously. In this study, the segmentation step of microarray image processing has been implemented. Clustering-based methods, fuzzy c-means and k-means, have been applied in the segmentation step, which separates the spots from the background. The experiments show that fuzzy c-means segmented the spots of the microarray image more accurately than k-means.
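    The clustering step can be sketched with a plain two-cluster k-means on the pixel intensities of a synthetic spot patch (the fuzzy c-means variant would additionally assign soft membership weights rather than hard labels). Patch size, intensities, and noise level below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(5)

# Synthetic 16x16 spot patch: bright circular spot on a dim background.
yy, xx = np.mgrid[0:16, 0:16]
spot_mask = (yy - 8) ** 2 + (xx - 8) ** 2 <= 16
patch = np.where(spot_mask, 200.0, 40.0) + rng.normal(0, 10, (16, 16))

def kmeans_2class(values, iters=20):
    """Plain 2-cluster k-means on pixel intensities (a 1-D feature)."""
    centers = np.array([values.min(), values.max()], dtype=float)
    for _ in range(iters):
        labels = np.abs(values[:, None] - centers).argmin(axis=1)
        for k in (0, 1):
            if np.any(labels == k):
                centers[k] = values[labels == k].mean()
    return labels, centers

labels, centers = kmeans_2class(patch.ravel())
seg = labels.reshape(patch.shape).astype(bool)   # True = foreground spot

accuracy = np.mean(seg == spot_mask)
print(f"pixel agreement with ground truth: {accuracy:.2%}")
```

    With clean synthetic data both methods perform well; the advantage of fuzzy c-means reported in the abstract shows up mainly on noisy boundary pixels, where soft memberships are more forgiving than hard assignment.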

  14. A perspective on microarrays: current applications, pitfalls, and potential uses

    PubMed Central

    Jaluria, Pratik; Konstantopoulos, Konstantinos; Betenbaugh, Michael; Shiloach, Joseph

    2007-01-01

    With advances in robotics, computational capabilities, and the fabrication of high-quality glass slides coinciding with increased genomic information being available on public databases, microarray technology is increasingly being used in laboratories around the world. In fact, fields as varied as toxicology, evolutionary biology, drug development and production, disease characterization, diagnostics development, cellular physiology and stress responses, and forensics have benefited from its use. However, for many researchers not familiar with microarrays, current articles and reviews often address neither the fundamental principles behind the technology nor the proper design of experiments. Although microarray technology is conceptually relatively simple, its practice does require careful planning and a detailed understanding of the limitations inherently present. Without these considerations, it can be exceedingly difficult to ascertain valuable information from microarray data. Therefore, this text aims to outline key features in microarray technology, paying particular attention to current applications as outlined in recent publications, experimental design, statistical methods, and potential uses. Furthermore, this review is not meant to be comprehensive, but rather substantive, highlighting important concepts and detailing steps necessary to conduct and interpret microarray experiments. Collectively, the information included in this text will highlight the versatility of microarray technology and provide a glimpse of what the future may hold. PMID:17254338

  15. A Platform for Combined DNA and Protein Microarrays Based on Total Internal Reflection Fluorescence

    PubMed Central

    Asanov, Alexander; Zepeda, Angélica; Vaca, Luis

    2012-01-01

    We have developed a novel microarray technology based on total internal reflection fluorescence (TIRF) in combination with DNA and protein bioassays immobilized at the TIRF surface. Unlike conventional microarrays that exhibit reduced signal-to-background ratio, require several stages of incubation, rinsing and stringency control, and measure only end-point results, our TIRF microarray technology provides several orders of magnitude better signal-to-background ratio, performs analysis rapidly in one step, and measures the entire course of association and dissociation kinetics between target DNA and protein molecules and the bioassays. In many practical cases detection of only DNA or protein markers alone does not provide the necessary accuracy for diagnosing a disease or detecting a pathogen. Here we describe TIRF microarrays that detect DNA and protein markers simultaneously, which reduces the probabilities of false responses. Supersensitive and multiplexed TIRF DNA and protein microarray technology may provide a platform for accurate diagnosis or enhanced research studies. Our TIRF microarray system can be mounted on upright or inverted microscopes or interfaced directly with CCD cameras equipped with a single objective, facilitating the development of portable devices. As proof-of-concept we applied TIRF microarrays for detecting molecular markers from Bacillus anthracis, the pathogen responsible for anthrax. PMID:22438738

  16. Validation of MIMGO: a method to identify differentially expressed GO terms in a microarray dataset

    PubMed Central

    2012-01-01

    Background We previously proposed an algorithm for the identification of GO terms that commonly annotate genes whose expression is upregulated or downregulated in some microarray data compared with other microarray data. We call these “differentially expressed GO terms” and have named the algorithm “matrix-assisted identification method of differentially expressed GO terms” (MIMGO). MIMGO can also identify microarray data in which genes annotated with a differentially expressed GO term are upregulated or downregulated. However, MIMGO has not yet been validated on a real microarray dataset using all available GO terms. Findings We combined Gene Set Enrichment Analysis (GSEA) with MIMGO to identify differentially expressed GO terms in a yeast cell cycle microarray dataset. GSEA followed by MIMGO (GSEA + MIMGO) correctly identified (p < 0.05) microarray data in which genes annotated with differentially expressed GO terms are upregulated. We found that GSEA + MIMGO was slightly less effective than, or comparable to, GSEA (Pearson), a method that uses Pearson’s correlation as a metric, at detecting true differentially expressed GO terms. However, unlike other methods including GSEA (Pearson), GSEA + MIMGO can comprehensively identify the microarray data in which genes annotated with a differentially expressed GO term are upregulated or downregulated. Conclusions MIMGO is a reliable method to identify differentially expressed GO terms comprehensively. PMID:23232071
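    The statistical core of GSEA-style detection of enriched GO terms can be illustrated with a one-sided hypergeometric test. The sketch below is generic and hypothetical (the function name and gene counts are invented); it is not the MIMGO algorithm itself, which additionally tracks the direction of regulation across multiple microarray datasets:

```python
from scipy.stats import hypergeom

def go_term_enrichment(n_genome, n_annotated, n_selected, n_overlap):
    """P-value that a selected gene list shares at least `n_overlap`
    genes with a GO term annotating `n_annotated` of `n_genome` genes."""
    # Survival function at n_overlap - 1 gives P(X >= n_overlap).
    return hypergeom.sf(n_overlap - 1, n_genome, n_annotated, n_selected)

# Hypothetical numbers: 6000-gene genome, 100 genes carry the term,
# 200 genes differentially expressed, 12 of them carry the term.
p = go_term_enrichment(6000, 100, 200, 12)  # expected overlap is only ~3.3
```

    A term would then be called differentially expressed when such a p-value survives multiple-testing correction across all GO terms tested.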

  17. Microintaglio Printing for Soft Lithography-Based in Situ Microarrays

    PubMed Central

    Biyani, Manish; Ichiki, Takanori

    2015-01-01

    Advances in lithographic approaches to fabricating bio-microarrays have been extensively explored over the last two decades. However, the need for pattern flexibility, a high density, a high resolution, affordability and on-demand fabrication is promoting the development of unconventional routes for microarray fabrication. This review highlights the development and uses of a new molecular lithography approach, called “microintaglio printing technology”, for large-scale bio-microarray fabrication using a microreactor array (µRA)-based chip consisting of uniformly-arranged, femtoliter-size µRA molds. In this method, a single-molecule-amplified DNA microarray pattern is self-assembled onto a µRA mold and subsequently converted into a messenger RNA or protein microarray pattern by simultaneously producing and transferring (immobilizing) a messenger RNA or a protein from a µRA mold to a glass surface. Microintaglio printing allows the self-assembly and patterning of in situ-synthesized biomolecules into high-density (kilo-giga-density), ordered arrays on a chip surface with µm-order precision. This holistic aim, which is difficult to achieve using conventional printing and microarray approaches, is expected to revolutionize and reshape proteomics. This review is not written comprehensively, but rather substantively, highlighting the versatility of microintaglio printing for developing a prerequisite platform for microarray technology for the postgenomic era. PMID:27600226

  18. Connecting medieval megadroughts and surface climate in the Last Millennium Reanalysis

    NASA Astrophysics Data System (ADS)

    Erb, M. P.; Emile-Geay, J.; Anderson, D. M.; Hakim, G. J.; Horlick, K. A.; Noone, D.; Perkins, W. A.; Steig, E. J.; Tardif, R.

    2016-12-01

    The North American Drought Atlas shows severe, long-lasting droughts during the Medieval Climate Anomaly. Because drought frequency and severity over the coming century is an area of vital interest, better understanding the causes of these historic droughts is crucial. A variety of research has suggested that a La Niña state was important for producing medieval megadroughts [1], and other work has indicated the potential roles of the Atlantic Multidecadal Oscillation [2] and internal atmospheric variability [3]. Correlations between drought and large-scale climate patterns also exist in the instrumental record [4], but understanding these relationships is far from complete. To investigate these relationships further, a data assimilation approach is employed. Proxy records - including tree rings, corals, and ice cores - are used to constrain climate states over the Common Era. By using general circulation model (GCM) output to quantify the covariances in the climate system, climate can be constrained not just at proxy sites but for all covarying locations and climate fields. Multiple GCMs will be employed to offset the limitations of imperfect model physics. This "Last Millennium Reanalysis" will be used to quantify relationships between North American medieval megadroughts and sea surface temperature patterns in the Atlantic and Pacific. 1. Cook, E. R., et al., Earth-Sci. Rev. 81, 93 (2007). 2. Oglesby, R., et al., Global Planet. Change 84-85, 56 (2012). 3. Stevenson, S., et al., J. Climate 28, 1865 (2015). 4. Cook, B. I., et al., J. Climate 27, 383 (2014).

  19. Reconstructing the 20th century high-resolution climate of the southeastern United States

    NASA Astrophysics Data System (ADS)

    Dinapoli, Steven M.; Misra, Vasubandhu

    2012-10-01

    We dynamically downscale the 20th Century Reanalysis (20CR) to a 10-km grid resolution from 1901 to 2008 over the southeastern United States and the Gulf of Mexico using the Regional Spectral Model. The downscaled data set, which we call the Florida Climate Institute-Florida State University Land-Atmosphere Reanalysis for the Southeastern United States at 10-km resolution (FLAReS1.0), will facilitate the study of the effects of low-frequency climate variability and major historical climate events on local hydrology and agriculture. To determine the suitability of the FLAReS1.0 downscaled data set for any subsequent applied climate studies, we compare the annual, seasonal, and diurnal variability of temperature and precipitation in the model to various observation data sets. In addition, we examine the model's depiction of several meteorological phenomena that affect the climate of the region, including extreme cold waves, summer sea breezes and associated convective activity, tropical cyclone landfalls, and midlatitude frontal systems. Our results show that temperature and precipitation variability are well-represented by FLAReS1.0 on most time scales, although systematic biases do exist in the data. FLAReS1.0 accurately portrays some of the major weather phenomena in the region, but the severity of extreme weather events is generally underestimated. The high resolution of FLAReS1.0 makes it more suitable for local climate studies than the coarser 20CR.

  20. Regional reanalysis without local data: Exploiting the downscaling paradigm

    NASA Astrophysics Data System (ADS)

    von Storch, Hans; Feser, Frauke; Geyer, Beate; Klehmet, Katharina; Li, Delei; Rockel, Burkhardt; Schubert-Frisius, Martina; Tim, Nele; Zorita, Eduardo

    2017-08-01

    This paper demonstrates two important aspects of regional dynamical downscaling of multidecadal atmospheric reanalysis. First, that in this way skillful regional descriptions of multidecadal climate variability may be constructed in regions with little or no local data. Second, that the concept of large-scale constraining allows global downscaling, so that global reanalyses may be completed by additions of consistent detail in all regions of the world. Global reanalyses suffer from inhomogeneities. However, their large-scale components are mostly homogeneous; therefore, the concept of downscaling may be applied to homogeneously complement the large-scale state of the reanalyses with regional detail—wherever the condition of homogeneity of the description of large scales is fulfilled. Technically, this can be done by dynamical downscaling using a regional or global climate model whose large scales are constrained by spectral nudging. This approach has been developed and tested for the region of Europe, and a skillful representation of regional weather risks—in particular marine risks—was identified. We have run this system in regions with reduced or absent local data coverage, such as Central Siberia, the Bohai and Yellow Seas, Southwestern Africa, and the South Atlantic. Also, a global simulation was computed, which adds regional features to prescribed global dynamics. Our cases demonstrate that spatially detailed reconstructions of the climate state and its change in the recent three to six decades add useful supplementary information to existing observational data for midlatitude and subtropical regions of the world.
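    The essence of large-scale constraining can be shown in a one-dimensional toy example (all names and numbers below are invented for illustration; in practice spectral nudging acts on the model state inside a running regional or global model, not as a post-hoc filter):

```python
import numpy as np

rng = np.random.default_rng(2)

# "Driving" large-scale state and a "regional" state that shares the
# large scales but adds small-scale detail and noise.
n = 256
x = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
driving = np.sin(2 * x)
regional = np.sin(2 * x) + 0.5 * np.sin(40 * x) + 0.3 * rng.normal(size=n)

def spectral_nudge(state, target, k_max, alpha=0.5):
    """Relax wavenumbers k <= k_max of `state` toward `target`,
    leaving the smaller scales (higher wavenumbers) untouched."""
    fs, ft = np.fft.rfft(state), np.fft.rfft(target)
    fs[: k_max + 1] += alpha * (ft[: k_max + 1] - fs[: k_max + 1])
    return np.fft.irfft(fs, n)

nudged = regional.copy()
for _ in range(20):  # repeated nudging, as at successive time steps
    nudged = spectral_nudge(nudged, driving, k_max=5)
```

    After the loop, the low-wavenumber modes of `nudged` match the driving field while the k = 40 detail survives, which is the property that makes the homogeneous large scales of a reanalysis usable even where its local detail is unreliable.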

  1. Global hotspots of river erosion under global warming

    NASA Astrophysics Data System (ADS)

    Plink-Bjorklund, P.; Reichler, T.

    2017-12-01

    Extreme precipitation plays a significant role for river hydrology, flood hazards and landscape response. For example, the September 2013 rainstorm in the Colorado Front Range evacuated the equivalent of hundreds to thousands of years of hillslope weathering products. Although promoted by steep topography, the Colorado event is clearly linked to rainfall intensity, since most of the 1100 debris flows occurred within the highest rainfall contour. Additional evidence for a strong link between extreme precipitation and river erosion comes from the sedimentary record, and especially from that of past greenhouse climates. The existence of such a link suggests that information about global rainfall patterns can be used to define regions of increased erosion potential. However, the question arises what rainfall criteria to use and how well the method works. A related question is how ongoing climate change and the corresponding shifts in rainfall might impact the results. Here, we use atmospheric reanalysis and output from a climate model to identify regions that are particularly susceptible to landscape change in response to extreme precipitation. In order to define the regions, we combine several hydroclimatological and geomorphological criteria into a single index of erosion potential. We show that for current climate, our criteria applied to atmospheric reanalysis or to climate model data successfully localize known areas of increased erosion potential, such as the Colorado region. We then apply our criteria to climate model data for future climate to document how the location, extent, and intensity of erosion hotspots are likely to change under global warming.

  2. Observed Trend in Surface Wind Speed Over the Conterminous USA and CMIP5 Simulations

    NASA Technical Reports Server (NTRS)

    Hashimoto, Hirofumi; Nemani, Ramakrishna R.

    2016-01-01

    There has been no spatially interpolated surface wind map, even over the conterminous USA, owing to the difficulty of interpolating the wind field. As a result, reanalysis data have often been used to analyze the statistics of spatial patterns in surface wind speed. Unfortunately, no consistent trend in the wind field was found among the available reanalysis products, and this has obstructed further analysis or projection of the spatial pattern of wind speed. In this study, we developed a methodology to interpolate the observed wind speed data at weather stations using a random forest algorithm, and produced 1-km daily climate variables over the conterminous USA from 1979 to 2015. Validation against AmeriFlux daily data yielded an R2 of 0.59. Existing studies have found a negative trend over the eastern US, and our study showed the same result. However, our new dataset also revealed a significant increasing trend over the southwestern US, especially from April to June, representing a change or seasonal shift in the North American Monsoon. Global analysis of CMIP5 data projected a decreasing trend in the midlatitudes and an increasing trend in tropical land regions. Most likely because of the low resolution of the GCMs, the CMIP5 data failed to simulate the increasing trend in the southwestern US, even though a poleward shift of the anticyclone favoring the North American Monsoon was qualitatively predicted.
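    A minimal sketch of station-based interpolation with a random forest, on synthetic data (the predictors, the toy wind model, and the split below are invented for illustration and are not the authors' 1-km pipeline):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)

# 500 synthetic "stations": latitude, longitude, elevation (m).
X = rng.uniform([25.0, -125.0, 0.0], [49.0, -67.0, 3000.0], size=(500, 3))
# Toy wind speed: smooth function of latitude and elevation plus noise.
y = 4.0 + 0.05 * (X[:, 0] - 25.0) + 0.001 * X[:, 2] + rng.normal(0.0, 0.3, 500)

train, test = slice(0, 400), slice(400, 500)
rf = RandomForestRegressor(n_estimators=100, random_state=0)
rf.fit(X[train], y[train])

# Held-out skill on the toy data (the authors' R2 = 0.59 is against
# real AmeriFlux observations, not reproduced here).
r2 = r2_score(y[test], rf.predict(X[test]))
```

    In a real application the feature set would likely be richer (neighboring-station observations, terrain descriptors), with the fitted forest then evaluated over the target 1-km grid.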

  3. The Chandra Source Catalog: X-ray Aperture Photometry

    NASA Astrophysics Data System (ADS)

    Kashyap, Vinay; Primini, F. A.; Glotfelty, K. J.; Anderson, C. S.; Bonaventura, N. R.; Chen, J. C.; Davis, J. E.; Doe, S. M.; Evans, I. N.; Evans, J. D.; Fabbiano, G.; Galle, E. C.; Gibbs, D. G., II; Grier, J. D.; Hain, R.; Hall, D. M.; Harbo, P. N.; He, X.; Houck, J. C.; Karovska, M.; Lauer, J.; McCollough, M. L.; McDowell, J. C.; Miller, J. B.; Mitschang, A. W.; Morgan, D. L.; Nichols, J. S.; Nowak, M. A.; Plummer, D. A.; Refsdal, B. L.; Rots, A. H.; Siemiginowska, A. L.; Sundheim, B. A.; Tibbetts, M. S.; van Stone, D. W.; Winkelman, S. L.; Zografou, P.

    2009-09-01

    The Chandra Source Catalog (CSC) represents a reanalysis of the entire set of ACIS and HRC imaging observations over the 9-year Chandra mission. We describe here the method by which fluxes are measured for detected sources. Source detection is carried out on a uniform basis, using the CIAO tool wavdetect. Source fluxes are estimated post facto using a Bayesian method that accounts for background, spatial resolution effects, and contamination from nearby sources. We use gamma-function prior distributions, which can be either non-informative or, in cases where previous observations of the same source exist, strongly informative. The current implementation is, however, limited to non-informative priors. The resulting posterior probability density functions allow us to report the flux and a robust credible range on it.
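    The conjugate gamma-Poisson step behind such flux estimates can be sketched as follows. This simplified version ignores the background, PSF, and contamination terms that the CSC method models, and the function name and counts are invented:

```python
from scipy.stats import gamma

def posterior_rate(counts, exposure, alpha=1.0, beta=0.0):
    """Posterior for a Poisson rate given `counts` events in `exposure`,
    under a conjugate Gamma(alpha, beta) prior. alpha=1, beta=0 is a
    non-informative choice; an informative prior could encode previous
    observations of the same source."""
    return gamma(a=alpha + counts, scale=1.0 / (beta + exposure))

post = posterior_rate(counts=42, exposure=10.0)  # e.g. 42 counts in 10 ks
lo, hi = post.ppf([0.05, 0.95])  # 90% credible range on the rate
```

    The posterior mean is (alpha + counts) / (beta + exposure), and the interval [lo, hi] plays the role of the robust credible range reported per source.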

  4. First Monte Carlo Global Analysis of Nucleon Transversity with Lattice QCD Constraints

    DOE PAGES

    Lin, Huey-Wen; Melnitchouk, Wally; Prokudin, Alexei; ...

    2018-04-11

    We report on the first global QCD analysis of the quark transversity distributions in the nucleon from semi-inclusive deep-inelastic scattering (SIDIS), using a new Monte Carlo method based on nested sampling and constraints on the isovector tensor charge $g_T$ from lattice QCD. A simultaneous fit to the available SIDIS Collins asymmetry data is compatible with $g_T$ values extracted from a comprehensive reanalysis of existing lattice simulations, in contrast to previous analyses, which found significantly smaller $g_T$ values. The contributions to the nucleon tensor charge from $u$ and $d$ quarks are found to be $\delta u = 0.3(2)$ and $\delta d = -0.7(2)$ at a scale $Q^2 = 2$ GeV$^2$.

  5. First Monte Carlo Global Analysis of Nucleon Transversity with Lattice QCD Constraints

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lin, Huey-Wen; Melnitchouk, Wally; Prokudin, Alexei

    We report on the first global QCD analysis of the quark transversity distributions in the nucleon from semi-inclusive deep-inelastic scattering (SIDIS), using a new Monte Carlo method based on nested sampling and constraints on the isovector tensor charge $g_T$ from lattice QCD. A simultaneous fit to the available SIDIS Collins asymmetry data is compatible with $g_T$ values extracted from a comprehensive reanalysis of existing lattice simulations, in contrast to previous analyses, which found significantly smaller $g_T$ values. The contributions to the nucleon tensor charge from $u$ and $d$ quarks are found to be $\delta u = 0.3(2)$ and $\delta d = -0.7(2)$ at a scale $Q^2 = 2$ GeV$^2$.

  6. The death spiral: predicting death in Drosophila cohorts.

    PubMed

    Mueller, Laurence D; Shahrestani, Parvin; Rauser, Casandra L; Rose, Michael R

    2016-11-01

    Drosophila research has identified a new feature of aging that has been called the death spiral. The death spiral is a period prior to death during which there is a decline in life-history characters, such as fecundity, as well as physiological characters. First, we review the data from the Drosophila and medfly literature that suggest the existence of death spirals. Second, we re-analyze five cases with such data from four laboratories using a generalized statistical framework, a re-analysis that strengthens the case for the salience of the death spiral phenomenon. Third, we raise the issue of whether death spirals need to be taken into account in the analysis of functional characters over age, in aging research with model species as well as with human data.

  7. New perspectives on an old problem: The bending of light in Yang-Mills gravity

    NASA Astrophysics Data System (ADS)

    Cottrell, Kazuo Ota; Hsu, Jong-Ping

    Yang-Mills gravity with electromagnetism predicts, in the geometric optics limit, a value for the deflection of light by the sun which agrees closely with the reanalysis of Eddington's 1919 optical measurements done in 1979. Einstein's General Theory of Relativity, on the other hand, agrees very closely with measurements of the deflection of electromagnetic waves made in the range of radio frequencies. Since both General Relativity and Yang-Mills gravity with electromagnetism in the geometric optics limit make predictions for the optical region which fall within experimental uncertainty, it becomes important to consider the possibility of the existence of a frequency dependence in the measurement results for the deflection of light, in order to determine which theory more closely describes nature...

  8. First Monte Carlo Global Analysis of Nucleon Transversity with Lattice QCD Constraints.

    PubMed

    Lin, H-W; Melnitchouk, W; Prokudin, A; Sato, N; Shows, H

    2018-04-13

    We report on the first global QCD analysis of the quark transversity distributions in the nucleon from semi-inclusive deep-inelastic scattering (SIDIS), using a new Monte Carlo method based on nested sampling and constraints on the isovector tensor charge g_{T} from lattice QCD. A simultaneous fit to the available SIDIS Collins asymmetry data is compatible with g_{T} values extracted from a comprehensive reanalysis of existing lattice simulations, in contrast to previous analyses, which found significantly smaller g_{T} values. The contributions to the nucleon tensor charge from u and d quarks are found to be δu=0.3(2) and δd=-0.7(2) at a scale Q^{2}=2  GeV^{2}.

  9. First Monte Carlo Global Analysis of Nucleon Transversity with Lattice QCD Constraints

    NASA Astrophysics Data System (ADS)

    Lin, H.-W.; Melnitchouk, W.; Prokudin, A.; Sato, N.; Shows, H.; Jefferson Lab Angular Momentum JAM Collaboration

    2018-04-01

    We report on the first global QCD analysis of the quark transversity distributions in the nucleon from semi-inclusive deep-inelastic scattering (SIDIS), using a new Monte Carlo method based on nested sampling and constraints on the isovector tensor charge gT from lattice QCD. A simultaneous fit to the available SIDIS Collins asymmetry data is compatible with gT values extracted from a comprehensive reanalysis of existing lattice simulations, in contrast to previous analyses, which found significantly smaller gT values. The contributions to the nucleon tensor charge from u and d quarks are found to be δ u =0.3 (2 ) and δ d =-0.7 (2 ) at a scale Q2=2 GeV2.

  10. Visualizing the Heterogeneity of Effects in the Analysis of Associations of Multiple Myeloma with Glyphosate Use. Comments on Sorahan, T. Multiple Myeloma and Glyphosate Use: A Re-Analysis of US Agricultural Health Study (AHS) Data. Int. J. Environ. Res. Public Health 2015, 12, 1548-1559.

    PubMed

    Burstyn, Igor; De Roos, Anneclaire J

    2016-12-22

    We address a methodological issue of the evaluation of the difference in effects in epidemiological studies that may arise, for example, from stratum-specific analyses or differences in analytical decisions during data analysis. We propose a new simulation-based method to quantify the plausible extent of such heterogeneity, rather than testing a hypothesis about its existence. We examine the contribution of the method to the debate surrounding risk of multiple myeloma and glyphosate use and propose that its application contributes to a more balanced weighting of evidence.

  11. Visualizing the Heterogeneity of Effects in the Analysis of Associations of Multiple Myeloma with Glyphosate Use. Comments on Sorahan, T. Multiple Myeloma and Glyphosate Use: A Re-Analysis of US Agricultural Health Study (AHS) Data. Int. J. Environ. Res. Public Health 2015, 12, 1548–1559

    PubMed Central

    Burstyn, Igor; De Roos, Anneclaire J.

    2016-01-01

    We address a methodological issue of the evaluation of the difference in effects in epidemiological studies that may arise, for example, from stratum-specific analyses or differences in analytical decisions during data analysis. We propose a new simulation-based method to quantify the plausible extent of such heterogeneity, rather than testing a hypothesis about its existence. We examine the contribution of the method to the debate surrounding risk of multiple myeloma and glyphosate use and propose that its application contributes to a more balanced weighting of evidence. PMID:28025514

  12. ROBNCA: robust network component analysis for recovering transcription factor activities.

    PubMed

    Noor, Amina; Ahmad, Aitzaz; Serpedin, Erchin; Nounou, Mohamed; Nounou, Hazem

    2013-10-01

    Network component analysis (NCA) is an efficient method of reconstructing the transcription factor activity (TFA), which makes use of the gene expression data and prior information available about transcription factor (TF)-gene regulations. Most of the contemporary algorithms either exhibit the drawback of inconsistency and poor reliability, or suffer from prohibitive computational complexity. In addition, the existing algorithms do not possess the ability to counteract the presence of outliers in the microarray data. Hence, robust and computationally efficient algorithms are needed to enable practical applications. We propose ROBust Network Component Analysis (ROBNCA), a novel iterative algorithm that explicitly models the possible outliers in the microarray data. An attractive feature of the ROBNCA algorithm is the derivation of a closed-form solution for estimating the connectivity matrix, which was not available in prior contributions. The ROBNCA algorithm is compared with FastNCA and the non-iterative NCA (NI-NCA). ROBNCA estimates the TF activity profiles as well as the TF-gene control strength matrix with a much higher degree of accuracy than FastNCA and NI-NCA, irrespective of varying noise, correlation and/or amount of outliers in the case of synthetic data. The ROBNCA algorithm is also tested on Saccharomyces cerevisiae data and Escherichia coli data, and it is observed to outperform the existing algorithms. The run time of the ROBNCA algorithm is comparable with that of FastNCA, and is hundreds of times faster than NI-NCA. The ROBNCA software is available at http://people.tamu.edu/~amina/ROBNCA
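    The decomposition at the heart of any NCA variant, fitting an expression matrix E ≈ AS where the connectivity matrix A has a known sparsity pattern, can be sketched with plain alternating least squares on toy data. This illustrates only the constrained factorization, not ROBNCA's explicit outlier term or its closed-form connectivity update:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy problem: 8 genes, 3 TFs, 20 samples. Z marks which TF may
# regulate which gene (the prior connectivity information).
Z = (rng.random((8, 3)) < 0.6).astype(float)
A_true = Z * rng.normal(size=(8, 3))
S_true = rng.normal(size=(3, 20))
E = A_true @ S_true + 0.01 * rng.normal(size=(8, 20))

# Alternating least squares under the support constraint on A.
A = Z * rng.normal(size=(8, 3))
for _ in range(50):
    # Update TF activities given the connectivity matrix.
    S, *_ = np.linalg.lstsq(A, E, rcond=None)
    # Update each gene's allowed connectivity entries given S.
    for i in range(8):
        idx = np.flatnonzero(Z[i])
        if idx.size:
            A[i, idx], *_ = np.linalg.lstsq(S[idx].T, E[i], rcond=None)

residual = np.linalg.norm(E - A @ S) / np.linalg.norm(E)  # relative fit error
```

    A robust variant would replace the squared loss with one that down-weights outlying entries of E, which is the gap ROBNCA addresses.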

  13. An Introduction to MAMA (Meta-Analysis of MicroArray data) System.

    PubMed

    Zhang, Zhe; Fenstermacher, David

    2005-01-01

    Analyzing microarray data across multiple experiments has been proven advantageous. To support this kind of analysis, we are developing a software system called MAMA (Meta-Analysis of MicroArray data). MAMA utilizes a client-server architecture with a relational database on the server-side for the storage of microarray datasets collected from various resources. The client-side is an application running on the end user's computer that allows the user to manipulate microarray data and analytical results locally. MAMA implementation will integrate several analytical methods, including meta-analysis within an open-source framework offering other developers the flexibility to plug in additional statistical algorithms.

  14. Methods to study the Legionella transcriptome in vitro and in vivo.

    PubMed

    Faucher, Sebastien P; Shuman, Howard A

    2013-01-01

    The study of transcriptome responses can provide insight into the regulatory pathways and genetic factors that contribute to a specific phenotype. For bacterial pathogens, it can identify putative new virulence systems and shed light on the mechanisms underlying the regulation of virulence factors. Microarrays have been previously used to study gene regulation in Legionella pneumophila. In the past few years a sharp reduction of the costs associated with microarray experiments together with the availability of relatively inexpensive custom-designed commercial microarrays has made microarray technology an accessible tool for the majority of researchers. Here we describe the methodologies to conduct microarray experiments from in vitro and in vivo samples.

  15. Fluorescence-based bioassays for the detection and evaluation of food materials.

    PubMed

    Nishi, Kentaro; Isobe, Shin-Ichiro; Zhu, Yun; Kiyama, Ryoiti

    2015-10-13

    We summarize here the recent progress in fluorescence-based bioassays for the detection and evaluation of food materials by focusing on fluorescent dyes used in bioassays and applications of these assays for food safety, quality and efficacy. Fluorescent dyes have been used in various bioassays, such as biosensing, cell assay, energy transfer-based assay, probing, protein/immunological assay and microarray/biochip assay. Among the arrays used in microarray/biochip assay, fluorescence-based microarrays/biochips, such as antibody/protein microarrays, bead/suspension arrays, capillary/sensor arrays, DNA microarrays/polymerase chain reaction (PCR)-based arrays, glycan/lectin arrays, immunoassay/enzyme-linked immunosorbent assay (ELISA)-based arrays, microfluidic chips and tissue arrays, have been developed and used for the assessment of allergy/poisoning/toxicity, contamination and efficacy/mechanism, and quality control/safety. DNA microarray assays have been used widely for food safety and quality as well as searches for active components. DNA microarray-based gene expression profiling may be useful for such purposes due to its advantages in the evaluation of pathway-based intracellular signaling in response to food materials.

  16. Fluorescence-Based Bioassays for the Detection and Evaluation of Food Materials

    PubMed Central

    Nishi, Kentaro; Isobe, Shin-Ichiro; Zhu, Yun; Kiyama, Ryoiti

    2015-01-01

    We summarize here the recent progress in fluorescence-based bioassays for the detection and evaluation of food materials by focusing on fluorescent dyes used in bioassays and applications of these assays for food safety, quality and efficacy. Fluorescent dyes have been used in various bioassays, such as biosensing, cell assay, energy transfer-based assay, probing, protein/immunological assay and microarray/biochip assay. Among the arrays used in microarray/biochip assay, fluorescence-based microarrays/biochips, such as antibody/protein microarrays, bead/suspension arrays, capillary/sensor arrays, DNA microarrays/polymerase chain reaction (PCR)-based arrays, glycan/lectin arrays, immunoassay/enzyme-linked immunosorbent assay (ELISA)-based arrays, microfluidic chips and tissue arrays, have been developed and used for the assessment of allergy/poisoning/toxicity, contamination and efficacy/mechanism, and quality control/safety. DNA microarray assays have been used widely for food safety and quality as well as searches for active components. DNA microarray-based gene expression profiling may be useful for such purposes due to its advantages in the evaluation of pathway-based intracellular signaling in response to food materials. PMID:26473869

  17. Microarray Data Processing Techniques for Genome-Scale Network Inference from Large Public Repositories.

    PubMed

    Chockalingam, Sriram; Aluru, Maneesha; Aluru, Srinivas

    2016-09-19

    Pre-processing of microarray data is a well-studied problem. Furthermore, all popular platforms come with their own recommended best practices for differential analysis of genes. However, for genome-scale network inference using microarray data collected from large public repositories, these methods filter out a considerable number of genes. This is primarily due to the effects of aggregating a diverse array of experiments with different technical and biological scenarios. Here we introduce a pre-processing pipeline suitable for inferring genome-scale gene networks from large microarray datasets. We show that partitioning of the available microarray datasets according to biological relevance into tissue- and process-specific categories significantly extends the limits of downstream network construction. We demonstrate the effectiveness of our pre-processing pipeline by inferring genome-scale networks for the model plant Arabidopsis thaliana using two different construction methods and a collection of 11,760 Affymetrix ATH1 microarray chips. Our pre-processing pipeline and the datasets used in this paper are made available at http://alurulab.cc.gatech.edu/microarray-pp.
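    One reason context-specific partitioning helps can be seen in a toy Simpson's-paradox example (synthetic data, invented numbers): a regulatory edge that is strong within each tissue can cancel out when samples are pooled across tissues:

```python
import numpy as np

rng = np.random.default_rng(3)

# Two "tissues", 100 samples each. Gene a drives gene b positively in
# tissue 1 and negatively in tissue 2.
a1, a2 = rng.normal(size=100), rng.normal(size=100)
b1 = a1 + 0.3 * rng.normal(size=100)
b2 = -a2 + 0.3 * rng.normal(size=100)

# Pooled correlation hides the edge; per-tissue correlations recover it.
pooled_r = np.corrcoef(np.r_[a1, a2], np.r_[b1, b2])[0, 1]
tissue1_r = np.corrcoef(a1, b1)[0, 1]
tissue2_r = np.corrcoef(a2, b2)[0, 1]
```

    Partitioning a repository by tissue or process before network inference avoids exactly this kind of signal cancellation, at the cost of fewer samples per partition.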

  18. A metadata-aware application for remote scoring and exchange of tissue microarray images

    PubMed Central

    2013-01-01

    Background The use of tissue microarrays (TMA) and advances in digital scanning microscopy have enabled the collection of thousands of tissue images. There is a need for software tools to annotate, query and share this data amongst researchers in different physical locations. Results We have developed an open source web-based application for remote scoring of TMA images, which exploits the value of Microsoft Silverlight Deep Zoom to provide an intuitive interface for zooming and panning around digital images. We use and extend existing XML-based standards to ensure that the data collected can be archived and that our system is interoperable with other standards-compliant systems. Conclusion The application has been used for multi-centre scoring of TMA slides composed of tissues from several Phase III breast cancer trials and ten different studies participating in the International Breast Cancer Association Consortium (BCAC). The system has enabled researchers to simultaneously score large collections of TMA and export the standardised data to integrate with pathological and clinical outcome data, thereby facilitating biomarker discovery. PMID:23635078

  19. Plug-and-actuate on demand: multimodal individual addressability of microarray plates using modular hybrid acoustic wave technology.

    PubMed

    Rezk, Amgad R; Ramesan, Shwathy; Yeo, Leslie Y

    2018-01-30

    The microarray titre plate remains a fundamental workhorse in genomic, proteomic and cellomic analyses that underpin the drug discovery process. Nevertheless, liquid handling technologies for sample dispensing, processing and transfer have not progressed significantly beyond conventional robotic micropipetting techniques, which are not only at their fundamental sample size limit, but are also prone to mechanical failure and contamination. This is because alternative technologies to date suffer from a number of constraints, mainly their limitation to carry out only a single liquid operation such as dispensing or mixing at a given time, and their inability to address individual wells, particularly at high throughput. Here, we demonstrate the possibility for true sequential or simultaneous single- and multi-well addressability in a 96-well plate using a reconfigurable modular platform from which MHz-order hybrid surface and bulk acoustic waves can be coupled to drive a variety of microfluidic modes including mixing, sample preconcentration and droplet jetting/ejection in individual or multiple wells on demand, thus constituting a highly versatile yet simple setup capable of improving the functionality of existing laboratory protocols and processes.

  20. On the existence of the σ(600): its physical implications and related problems

    NASA Astrophysics Data System (ADS)

    Ishida, Shin

    1998-05-01

    We re-analyze the I=0 ππ scattering phase shift δ00 through a new method of S-matrix parametrization (IA; interfering amplitude method), and show a result strongly suggesting the existence of the σ particle, the long-sought chiral partner of the π meson. Furthermore, through phenomenological analyses of typical production processes of the 2π system, the pp central collision and the J/Ψ→ωππ decay, applying an intuitive formula given as a sum of Breit-Wigner amplitudes (VMW; variant mass and width method), further evidence for the existence of the σ is given. The validity of the methods used in the above analyses is investigated, using a simple field-theoretical model, from the general viewpoint of unitarity and the applicability of the final-state interaction (FSI) theorem, especially in relation to the "universality" argument. It is shown that the IA and VMW are obtained as the physical-state representations of scattering and production amplitudes, respectively. The VMW is shown to be an effective method for obtaining resonance properties from production processes, which generally involve unknown strong phases. The conventional analyses based on "universality" seem powerless for this purpose.

  1. DOE Office of Scientific and Technical Information (OSTI.GOV)

    H. Zhang, P. Titus, P. Rogoff, A. Zolfaghari, D. Mangra, M. Smith

    The National Spherical Torus Experiment (NSTX) is a low-aspect-ratio, spherical torus (ST) configuration device located at Princeton Plasma Physics Laboratory (PPPL). This device is presently being upgraded to enhance its physics capabilities by doubling the TF field to 1 Tesla and increasing the plasma current to 2 mega-amperes. The upgrades include a replacement of the centerstack and the addition of a second neutral beam. The upgrade analyses have two missions. The first is to support the design of new components, principally the centerstack; the second is to qualify existing NSTX components for higher loads, which will increase by a factor of four. Cost efficiency was a design goal for new equipment qualification and for reanalysis of the existing components. Showing that older components can sustain the increased loads has been a challenging effort in which designs had to be developed that would limit loading on weaker components and minimize the extent of modifications needed. Two areas representing this effort have been chosen for description in more detail: first, analysis of the current distribution in the new TF inner legs, and, second, analysis of the out-of-plane support of the existing TF outer legs.

  2. Reanalysis information for eigenvalues derived from a differential equation analysis formulation. [for shell of revolution buckling

    NASA Technical Reports Server (NTRS)

    Thornton, W. A.; Majumder, D. K.

    1974-01-01

    The investigation reported demonstrates that, in the case considered, perturbation methods can be used in a straightforward manner to obtain reanalysis information. A perturbation formula for the buckling loads of a general shell of revolution is derived. The accuracy of the obtained relations and their range of application are studied with the aid of a specific example involving a particular stiffened shell of revolution.
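    The underlying idea can be sketched numerically with the generic first-order perturbation of a generalized eigenproblem, (K - λ Kg)φ = 0. The matrices below are random symmetric positive-definite stand-ins, not the shell-of-revolution operators derived in the paper; they serve only to show how one eigensolution of a baseline design yields a cheap estimate for a perturbed design.

    ```python
    import numpy as np
    from scipy.linalg import eigh

    rng = np.random.default_rng(0)

    def spd(n):
        # Random symmetric positive-definite stand-in matrix.
        a = rng.standard_normal((n, n))
        return a @ a.T + n * np.eye(n)

    n = 6
    K = spd(n)                         # baseline "stiffness"
    Kg = spd(n)                        # baseline "geometric stiffness"
    lam, phi = eigh(K, Kg)             # solves K phi = lam * Kg phi
    lam0, phi0 = lam[0], phi[:, 0]     # lowest buckling load and its mode

    dK = 0.01 * spd(n)                 # small design change in K
    # First-order perturbation: d(lam) = phi^T dK phi / (phi^T Kg phi)
    dlam = phi0 @ dK @ phi0 / (phi0 @ Kg @ phi0)

    lam_exact = eigh(K + dK, Kg, eigvals_only=True)[0]
    print(abs(lam0 + dlam - lam_exact) / lam_exact)  # small relative error
    ```

    The point of such a formula for reanalysis is that `dlam` needs only the already-computed baseline eigenpair, no new eigensolve.
    
    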

  3. Hail frequency estimation across Europe based on a combination of overshooting top detections and the ERA-INTERIM reanalysis

    NASA Astrophysics Data System (ADS)

    Punge, H. J.; Bedka, K. M.; Kunz, M.; Reinbold, A.

    2017-12-01

    This article presents a hail frequency estimation based on the detection of cold overshooting cloud tops (OTs) from the Meteosat Second Generation (MSG) operational weather satellites, in combination with a hail-specific filter derived from the ERA-INTERIM reanalysis. This filter has been designed based on the atmospheric properties in the vicinity of hail reports registered in the European Severe Weather Database (ESWD). These include Convective Available Potential Energy (CAPE), 0-6-km bulk wind shear and freezing level height, evaluated at the nearest time step and interpolated from the reanalysis grid to the location of the hail report. Regions highly exposed to hail events include Northern Italy, followed by South-Eastern Austria and Eastern Spain. Pronounced hail frequency is also found in large parts of Eastern Europe, around the Alps, the Czech Republic, Southern Germany, Southern and Eastern France, and in the Iberian and Apennine mountain ranges.
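    As an illustration of how such a reanalysis-based environment filter operates, the sketch below keeps a detection only when simple environmental proxies clear fixed thresholds. The threshold values are invented for the example; they are not the ESWD-calibrated criteria of the study.

    ```python
    # Hypothetical hail-environment filter: a satellite OT detection is
    # retained only if reanalysis proxies exceed illustrative thresholds.
    def hail_filter(cape_j_kg, shear_0_6km_m_s, freezing_level_m):
        return (cape_j_kg >= 500.0            # assumed CAPE threshold
                and shear_0_6km_m_s >= 5.0    # assumed 0-6 km shear threshold
                and freezing_level_m >= 2000.0)  # assumed freezing level

    print(hail_filter(1200.0, 12.0, 3400.0))  # True: hail-favorable environment
    print(hail_filter(150.0, 12.0, 3400.0))   # False: insufficient CAPE
    ```
    
    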

  4. Active Thermochemical Tables: The Adiabatic Ionization Energy of Hydrogen Peroxide.

    PubMed

    Changala, P Bryan; Nguyen, T Lam; Baraban, Joshua H; Ellison, G Barney; Stanton, John F; Bross, David H; Ruscic, Branko

    2017-11-22

    The adiabatic ionization energy of hydrogen peroxide (HOOH) is investigated, both by means of theoretical calculations and theoretically assisted reanalysis of previous experimental data. Values obtained by three different approaches: 10.638 ± 0.012 eV (purely theoretical determination), 10.649 ± 0.005 eV (reanalysis of photoelectron spectrum), and 10.645 ± 0.010 eV (reanalysis of photoionization spectrum) are in excellent mutual agreement. Further refinement of the latter two values to account for asymmetry of the rotational profile of the photoionization origin band leads to a reduction of 0.007 ± 0.006 eV, which tends to bring them into even closer alignment with the purely theoretical value. Detailed analysis of this fundamental quantity by the Active Thermochemical Tables approach, using the present results and extant literature, gives a final estimate of 10.641 ± 0.006 eV.

  5. A Diagnosis of Rainfall over South America during the 1997/98 El Niño Event. Part I: Validation of NCEP-NCAR Reanalysis Rainfall Data.

    NASA Astrophysics Data System (ADS)

    Brahmananda Rao, V.; Santo, Clóvis E.; Franchito, Sergio H.

    2002-03-01

    A comparison between the National Centers for Environmental Prediction-National Center for Atmospheric Research (NCEP-NCAR) reanalysis rainfall data and the Agência Nacional de Energia Elétrica (ANEEL) rain gauge data over Brazil is made. It is found that over northeast Brazil, NCEP-NCAR rainfall is overestimated, but over south and southeast Brazil the correlation between the two datasets is highly significant, showing the utility of the NCEP-NCAR rainfall data. Over other parts of Brazil the validity of the NCEP-NCAR rainfall data is questionable. A detailed comparison between NCEP-NCAR rainfall data over northwest South America and rain gauge data showed that the NCEP-NCAR rainfall data are useful despite important differences between the characteristics of the two data sources. The NCEP-NCAR reanalysis seems to have difficulty in correctly reproducing the strength and orientation of the South Atlantic convergence zone.

  6. Active Thermochemical Tables: The Adiabatic Ionization Energy of Hydrogen Peroxide

    DOE PAGES

    Changala, P. Bryan; Nguyen, T. Lam; Baraban, Joshua H.; ...

    2017-09-07

    The adiabatic ionization energy of hydrogen peroxide (HOOH) is investigated, both by means of theoretical calculations and theoretically-assisted reanalysis of previous experimental data. Values obtained by three different approaches: 10.638 ± 0.012 eV (purely theoretical determination), 10.649 ± 0.005 eV (reanalysis of photoelectron spectrum) and 10.645 ± 0.010 eV (reanalysis of photoionization spectrum) are in excellent mutual agreement. Further refinement of the latter two values to account for asymmetry of the rotational profile of the photoionization origin band leads to a reduction of 0.007 ± 0.006 eV, which tends to bring them into even closer alignment with the purely theoretical value. As a result, detailed analysis of this fundamental quantity by the Active Thermochemical Tables (ATcT) approach, using the present results and extant literature, gives a final estimate of 10.641 ± 0.006 eV.

  7. Final Technical Report for Collaborative Research: Developing and Implementing Ocean-Atmosphere Reanalyses for Climate Applications (OARCA)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Compo, Gilbert P

    As an important step toward a coupled data assimilation system for generating reanalysis fields needed to assess climate model projections, the Ocean Atmosphere Coupled Reanalysis for Climate Applications (OARCA) project assesses and improves the longest reanalyses currently available of the atmosphere and ocean: the 20th Century Reanalysis Project (20CR) and the Simple Ocean Data Assimilation with sparse observational input (SODAsi) system, respectively. In this project, we make off-line but coordinated improvements in the 20CR and SODAsi datasets, with improvements in one feeding into improvements of the other through an iterative generation of new versions. These datasets now span from the 19th to 21st centuries. We then study the extreme weather and variability from days to decades of the resulting datasets. A total of 24 publications have been produced in this project.

  8. Approximate techniques of structural reanalysis

    NASA Technical Reports Server (NTRS)

    Noor, A. K.; Lowder, H. E.

    1974-01-01

    A study is made of two approximate techniques for structural reanalysis. These include Taylor series expansions for response variables in terms of design variables and the reduced-basis method. In addition, modifications to these techniques are proposed to overcome some of their major drawbacks. The modifications include a rational approach to the selection of the reduced-basis vectors and the use of Taylor series approximation in an iterative process. For the reduced basis a normalized set of vectors is chosen which consists of the original analyzed design and the first-order sensitivity analysis vectors. The use of the Taylor series approximation as a first (initial) estimate in an iterative process, can lead to significant improvements in accuracy, even with one iteration cycle. Therefore, the range of applicability of the reanalysis technique can be extended. Numerical examples are presented which demonstrate the gain in accuracy obtained by using the proposed modification techniques, for a wide range of variations in the design variables.
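    The two ingredients above can be sketched numerically: a first-order Taylor estimate of the modified response, then one refinement cycle that reuses the (already factorized) baseline operator. The matrices and load below are random stand-ins for a stiffness system, not taken from the paper.

    ```python
    import numpy as np

    rng = np.random.default_rng(1)
    n = 5
    A = rng.standard_normal((n, n))
    K0 = A @ A.T + n * np.eye(n)       # baseline stiffness
    f = rng.standard_normal(n)         # load vector
    u0 = np.linalg.solve(K0, f)        # baseline response

    dK = 0.1 * (A @ A.T) / n           # design modification
    K1 = K0 + dK                       # modified stiffness

    # First-order Taylor estimate of the modified response:
    # differentiating K u = f gives du = -K0^{-1} dK u0
    u_taylor = u0 - np.linalg.solve(K0, dK @ u0)

    # One iteration cycle using the baseline operator as preconditioner,
    # started from the Taylor estimate:
    u_iter = u_taylor + np.linalg.solve(K0, f - K1 @ u_taylor)

    u_exact = np.linalg.solve(K1, f)
    err_taylor = np.linalg.norm(u_taylor - u_exact)
    err_iter = np.linalg.norm(u_iter - u_exact)
    print(err_iter < err_taylor)  # the iteration sharpens the Taylor estimate
    ```

    This mirrors the paper's observation that even one iteration cycle started from the Taylor approximation can improve accuracy markedly, extending the usable range of design-variable changes.
    
    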

  9. An analysis of simulated and observed storm characteristics

    NASA Astrophysics Data System (ADS)

    Benestad, R. E.

    2010-09-01

    A calculus-based cyclone identification (CCI) method has been applied to the most recent re-analysis (ERAINT) from the European Centre for Medium-range Weather Forecasts and to results from regional climate model (RCM) simulations. The storm frequency for events with central pressure below a threshold value of 960-990 hPa was examined, and the gradient wind from the simulated storm systems was compared with corresponding estimates from the re-analysis. The analysis also yielded estimates of the spatial extent of the storm systems, which were included in the regional climate model cyclone evaluation. A comparison is presented between a number of RCMs and the ERAINT re-analysis in terms of their description of the gradient winds, number of cyclones, and spatial extent. Furthermore, a comparison between geostrophic winds estimated through triangles of interpolated or station measurements of SLP is presented. Wind still represents one of the more challenging variables to model realistically.

  10. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Changala, P. Bryan; Nguyen, T. Lam; Baraban, Joshua H.

    The adiabatic ionization energy of hydrogen peroxide (HOOH) is investigated, both by means of theoretical calculations and theoretically-assisted reanalysis of previous experimental data. Values obtained by three different approaches: 10.638 ± 0.012 eV (purely theoretical determination), 10.649 ± 0.005 eV (reanalysis of photoelectron spectrum) and 10.645 ± 0.010 eV (reanalysis of photoionization spectrum) are in excellent mutual agreement. Further refinement of the latter two values to account for asymmetry of the rotational profile of the photoionization origin band leads to a reduction of 0.007 ± 0.006 eV, which tends to bring them into even closer alignment with the purely theoretical value. As a result, detailed analysis of this fundamental quantity by the Active Thermochemical Tables (ATcT) approach, using the present results and extant literature, gives a final estimate of 10.641 ± 0.006 eV.

  11. Diagnosing causes of cloud parameterization deficiencies using ARM measurements over SGP site

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wu, W.; Liu, Y.; Betts, A. K.

    2010-03-15

    Decade-long continuous surface-based measurements at the Southern Great Plains (SGP) site collected by the US Department of Energy's Atmospheric Radiation Measurement (ARM) Climate Research Facility are first used to evaluate the three major reanalyses (i.e., ERA-Interim, NCEP/NCAR Reanalysis I and NCEP/DOE Reanalysis II) and to identify model biases in simulating surface shortwave cloud forcing and total cloud fraction. The results show large systematic low biases in the modeled surface shortwave cloud forcing and cloud fraction from all three reanalysis datasets. We then focus on diagnosing the causes of these model biases using the Active Remote Sensing of Clouds (ARSCL) products (e.g., vertical distribution of cloud fraction, cloud-base and cloud-top heights, and cloud optical depth) and meteorological measurements (temperature, humidity and stability). Efforts are made to couple cloud properties with boundary processes in the diagnosis.

  12. Global Climatology of the Coastal Low-Level Wind Jets using different Reanalysis

    NASA Astrophysics Data System (ADS)

    Lima, Daniela C. A.; Soares, Pedro M. M.; Semedo, Alvaro; Cardoso, Rita M.

    2016-04-01

    Coastal Low-Level Jets (henceforth referred to as "coastal jets" or simply as CLLJ) are low-tropospheric mesoscale wind features, with wind speed maxima confined to the marine atmospheric boundary layer (MABL), typically below 1 km. Coastal jets occur on the eastern flank of the semi-permanent subtropical mid-latitude high-pressure systems, along equatorward eastern boundary currents, due to large-scale synoptic forcing. The large-scale synoptic forcing behind CLLJ occurrences is a high-pressure system over the ocean and a thermal low inland. This results in coast-parallel winds that are the consequence of geostrophic adjustment. CLLJ are found along the California (California-Oregon) and Canary (Iberia and Northeastern Africa) currents in the Northern Hemisphere, and along the Peru-Humboldt (Peru-Chile), Benguela (Namibia) and Western Australia (West Australia) currents in the Southern Hemisphere. In the Arabian Sea (Oman CLLJ), the interaction between the high pressure over the Indian Ocean in summer (the summer Indian monsoon) and the Somali (also known as Findlater) Jet forces a coastal jet off the southeast coast of Oman. Coastal jets play an important role in the regional climates of the mid-latitude western continental regions. The decrease of sea surface temperature (SST) along the coast due to upwelling lowers evaporation over the ocean, and the coast-parallel winds prevent the advection of marine air inshore. The feedback processes between the CLLJ and upwelling play a crucial role in the regional climate, namely by promoting aridity, since the parallel flow prevents the intrusion of moisture inland, and by increasing fish stocks through the transport of nutrient-rich cold water from the bottom. 
In this study, the global coastal low-level wind jets are identified and characterized using an ensemble of three reanalyses: the ECMWF Interim Reanalysis (ERA-Interim), the Japanese 55-year Reanalysis (JRA-55) and the NCEP Climate Forecast System Reanalysis (NCEP CFSR). The CLLJ detection method proposed by Ranjha et al. (2013) was applied to the reanalysis data: the criteria are applied sequentially to wind-speed and temperature vertical profiles to detect the location and frequency of CLLJ. The CLLJs' spatio-temporal features and the seasonal synoptic configuration associated with the presence of coastal jets are studied for the period 1979-2008 using the ensemble. The present study allows us to investigate thoroughly the occurrence and main properties of the global coastal low-level jets from a new perspective, and to assess the uncertainties in the representation of these jets by the available reanalyses. Publication supported by project FCT UID/GEO/50019/2013 - Instituto Dom Luiz.
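    The profile-based detection idea can be sketched as follows. The height limit and fall-off ratio are illustrative assumptions, not the published Ranjha et al. (2013) criteria, and the temperature-profile test that the study also applies is omitted here.

    ```python
    import numpy as np

    def detect_cllj(z_m, wind_m_s, max_height_m=1000.0, falloff=0.8):
        # Flag a jet when the profile's wind maximum sits below max_height_m
        # and the wind above it drops to at most `falloff` of that maximum.
        # Both thresholds are assumptions for this sketch.
        i = int(np.argmax(wind_m_s))
        if z_m[i] > max_height_m or i + 1 >= len(wind_m_s):
            return False
        return bool(wind_m_s[i + 1:].min() <= falloff * wind_m_s[i])

    z = np.array([100.0, 300.0, 600.0, 1200.0, 2500.0])  # heights (m)
    jet = np.array([8.0, 14.0, 16.0, 10.0, 7.0])         # maximum at 600 m
    nojet = np.array([6.0, 8.0, 10.0, 12.0, 14.0])       # wind grows with height
    print(detect_cllj(z, jet), detect_cllj(z, nojet))    # True False
    ```
    
    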

  13. Microfluidic microarray systems and methods thereof

    DOEpatents

    West, Jay A. A. [Castro Valley, CA; Hukari, Kyle W [San Ramon, CA; Hux, Gary A [Tracy, CA

    2009-04-28

    Disclosed are systems that include a manifold in fluid communication with a microfluidic chip having a microarray, an illuminator, and a detector in optical communication with the microarray. Methods for using these systems for biological detection are also disclosed.

  14. cDNA Microarray Screening in Food Safety

    PubMed Central

    ROY, SASHWATI; SEN, CHANDAN K

    2009-01-01

    The cDNA microarray technology and related bioinformatics tools present a wide range of novel application opportunities. The technology may be productively applied to address food safety. In this mini-review article, we present an update highlighting late-breaking discoveries that demonstrate the vitality of cDNA microarray technology as a tool to analyze food safety with reference to microbial pathogens and genetically modified foods. In order to bring microarray technology to mainstream food safety, it is important to develop robust user-friendly tools that may be applied in a field setting. In addition, there needs to be a standardized process for regulatory agencies to interpret and act upon microarray-based data. The cDNA microarray approach is an emergent technology in diagnostics. Its value lies in being able to provide complementary molecular insight when employed in addition to traditional tests for food safety, as part of a more comprehensive battery of tests. PMID:16466843

  15. Functional comparison of microarray data across multiple platforms using the method of percentage of overlapping functions.

    PubMed

    Li, Zhiguang; Kwekel, Joshua C; Chen, Tao

    2012-01-01

    Functional comparison across microarray platforms is used to assess the comparability or similarity of the biological relevance associated with the gene expression data generated by multiple microarray platforms. Comparisons at the functional level are very important considering that the ultimate purpose of microarray technology is to determine the biological meaning behind the gene expression changes under a specific condition, not just to generate a list of genes. Herein, we present a method named percentage of overlapping functions (POF) and illustrate how it is used to perform the functional comparison of microarray data generated across multiple platforms. This method facilitates the determination of functional differences or similarities in microarray data generated from multiple array platforms across all the functions that are presented on these platforms. This method can also be used to compare the functional differences or similarities between experiments, projects, or laboratories.
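    One minimal reading of the POF idea can be sketched as below, assuming the percentage is taken over the union of functions represented on the two platforms; the paper's exact denominator may differ.

    ```python
    def percentage_of_overlapping_functions(functions_a, functions_b):
        # Share of annotated functions common to both platforms, relative
        # to all functions represented on either platform (an illustrative
        # reading of the POF metric, not the authors' exact definition).
        a, b = set(functions_a), set(functions_b)
        union = a | b
        return 100.0 * len(a & b) / len(union) if union else 0.0

    pof = percentage_of_overlapping_functions(
        ["apoptosis", "cell cycle", "DNA repair"],
        ["apoptosis", "DNA repair", "lipid metabolism"],
    )
    print(pof)  # 50.0: 2 shared functions out of 4 distinct functions
    ```
    
    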

  16. ArrayNinja: An Open Source Platform for Unified Planning and Analysis of Microarray Experiments.

    PubMed

    Dickson, B M; Cornett, E M; Ramjan, Z; Rothbart, S B

    2016-01-01

    Microarray-based proteomic platforms have emerged as valuable tools for studying various aspects of protein function, particularly in the field of chromatin biochemistry. Microarray technology itself is largely unrestricted in regard to printable material and platform design, and efficient multidimensional optimization of assay parameters requires fluidity in the design and analysis of custom print layouts. This motivates the need for streamlined software infrastructure that facilitates the combined planning and analysis of custom microarray experiments. To this end, we have developed ArrayNinja as a portable, open source, and interactive application that unifies the planning and visualization of microarray experiments and provides maximum flexibility to end users. Array experiments can be planned, stored to a private database, and merged with the imaged results for a level of data interaction and centralization that is not currently attainable with available microarray informatics tools. © 2016 Elsevier Inc. All rights reserved.

  17. Emerging Use of Gene Expression Microarrays in Plant Physiology

    DOE PAGES

    Wullschleger, Stan D.; Difazio, Stephen P.

    2003-01-01

    Microarrays have become an important technology for the global analysis of gene expression in humans, animals, plants, and microbes. Implemented in the context of a well-designed experiment, cDNA and oligonucleotide arrays can provide high-throughput, simultaneous analysis of transcript abundance for hundreds, if not thousands, of genes. However, despite widespread acceptance, the use of microarrays as a tool to better understand processes of interest to the plant physiologist is still being explored. To help illustrate current uses of microarrays in the plant sciences, several case studies that we believe demonstrate the emerging application of gene expression arrays in plant physiology were selected from among the many posters and presentations at the 2003 Plant and Animal Genome XI Conference. Based on this survey, microarrays are being used to assess gene expression in plants exposed to the experimental manipulation of air temperature, soil water content and aluminium concentration in the root zone. Analysis often includes characterizing transcript profiles for multiple post-treatment sampling periods and categorizing genes with common patterns of response using hierarchical clustering techniques. In addition, microarrays are also providing insights into developmental changes in gene expression associated with fibre and root elongation in cotton and maize, respectively. Technical and analytical limitations of microarrays are discussed and projects attempting to advance areas of microarray design and data analysis are highlighted. Finally, although much work remains, we conclude that microarrays are a valuable tool for the plant physiologist interested in the characterization and identification of individual genes and gene families with potential application in the fields of agriculture, horticulture and forestry.

  18. Profiling In Situ Microbial Community Structure with an Amplification Microarray

    PubMed Central

    Knickerbocker, Christopher; Bryant, Lexi; Golova, Julia; Wiles, Cory; Williams, Kenneth H.; Peacock, Aaron D.; Long, Philip E.

    2013-01-01

    The objectives of this study were to unify amplification, labeling, and microarray hybridization chemistries within a single, closed microfluidic chamber (an amplification microarray) and verify technology performance on a series of groundwater samples from an in situ field experiment designed to compare U(VI) mobility under conditions of various alkalinities (as HCO3−) during stimulated microbial activity accompanying acetate amendment. Analytical limits of detection were between 2 and 200 cell equivalents of purified DNA. Amplification microarray signatures were well correlated with 16S rRNA-targeted quantitative PCR results and hybridization microarray signatures. The succession of the microbial community was evident with and consistent between the two microarray platforms. Amplification microarray analysis of acetate-treated groundwater showed elevated levels of iron-reducing bacteria (Flexibacter, Geobacter, Rhodoferax, and Shewanella) relative to the average background profile, as expected. Identical molecular signatures were evident in the transect treated with acetate plus NaHCO3, but at much lower signal intensities and with a much more rapid decline (to nondetection). Azoarcus, Thaurea, and Methylobacterium were responsive in the acetate-only transect but not in the presence of bicarbonate. Observed differences in microbial community composition or response to bicarbonate amendment likely had an effect on measured rates of U reduction, with higher rates probable in the part of the field experiment that was amended with bicarbonate. The simplification in microarray-based work flow is a significant technological advance toward entirely closed-amplicon microarray-based tests and is generally extensible to any number of environmental monitoring applications. PMID:23160129

  19. PRACTICAL STRATEGIES FOR PROCESSING AND ANALYZING SPOTTED OLIGONUCLEOTIDE MICROARRAY DATA

    EPA Science Inventory

    Thoughtful data analysis is as important as experimental design, biological sample quality, and appropriate experimental procedures for making microarrays a useful supplement to traditional toxicology. In the present study, spotted oligonucleotide microarrays were used to profile...

  20. DNA Microarray-based Ecotoxicological Biomarker Discovery in a Small Fish Model Species

    EPA Science Inventory

    This paper addresses several issues critical to use of zebrafish oligonucleotide microarrays for computational toxicology research on endocrine disrupting chemicals using small fish models, and more generally, the use of microarrays in aquatic toxicology.

  1. IMPROVING THE RELIABILITY OF MICROARRAYS FOR TOXICOLOGY RESEARCH: A COLLABORATIVE APPROACH

    EPA Science Inventory

    Microarray-based gene expression profiling is a critical tool to identify molecular biomarkers of specific chemical stressors. Although current microarray technologies have progressed from their infancy, biological and technical repeatability and reliability are often still limit...

  2. Direct labeling of serum proteins by fluorescent dye for antibody microarray.

    PubMed

    Klimushina, M V; Gumanova, N G; Metelskaya, V A

    2017-05-06

    Analysis of serum proteome by antibody microarray is used to identify novel biomarkers and to study signaling pathways including protein phosphorylation and protein-protein interactions. Labeling of serum proteins is important for optimal performance of the antibody microarray. Proper choice of fluorescent label and optimal concentration of protein loaded on the microarray ensure good quality of imaging that can be reliably scanned and processed by the software. We have optimized direct serum protein labeling using fluorescent dye Arrayit Green 540 (Arrayit Corporation, USA) for antibody microarray. Optimized procedure produces high quality images that can be readily scanned and used for statistical analysis of protein composition of the serum. Copyright © 2017 Elsevier Inc. All rights reserved.

  3. A python module to normalize microarray data by the quantile adjustment method.

    PubMed

    Baber, Ibrahima; Tamby, Jean Philippe; Manoukis, Nicholas C; Sangaré, Djibril; Doumbia, Seydou; Traoré, Sekou F; Maiga, Mohamed S; Dembélé, Doulaye

    2011-06-01

    Microarray technology is widely used for gene expression research targeting the development of new drug treatments. In the case of a two-color microarray, the process starts with labeling DNA samples with fluorescent markers (cyanine 635 or Cy5 and cyanine 532 or Cy3), then mixing and hybridizing them on a chemically treated glass printed with probes, or fragments of genes. The level of hybridization between a strand of labeled DNA and a probe present on the array is measured by scanning the fluorescence of spots in order to quantify the expression based on the quality and number of pixels for each spot. The intensity data generated from these scans are subject to errors due to differences in fluorescence efficiency between Cy5 and Cy3, as well as variation in human handling and quality of the sample. Consequently, data have to be normalized to correct for variations which are not related to the biological phenomena under investigation. Among many existing normalization procedures, we have implemented the quantile adjustment method using the python computer language, and produced a module which can be run via an HTML dynamic form. This module is composed of different functions for data files reading, intensity and ratio computations and visualization. The current version of the HTML form allows the user to visualize the data before and after normalization. It also gives the option to subtract background noise before normalizing the data. The output results of this module are in agreement with the results of other normalization tools. Published by Elsevier B.V.
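    The quantile adjustment step itself is compact. The following is a generic sketch (genes in rows, arrays in columns), not the interface of the module described here: each array's sorted intensities are replaced by the mean of all arrays' sorted intensities, so every array ends up with an identical distribution.

    ```python
    import numpy as np

    def quantile_normalize(x):
        # x: intensity matrix, shape (n_genes, n_arrays).
        order = np.argsort(x, axis=0)        # sort order within each array
        ranks = np.argsort(order, axis=0)    # rank of each value within its array
        # Reference distribution: mean across arrays of the sorted intensities.
        mean_sorted = np.sort(x, axis=0).mean(axis=1)
        return mean_sorted[ranks]            # map each value to its rank's mean

    data = np.array([[5.0, 4.0, 3.0],
                     [2.0, 1.0, 4.0],
                     [3.0, 4.0, 6.0],
                     [4.0, 2.0, 8.0]])
    norm = quantile_normalize(data)
    print(norm)  # every column now has the same set of values
    ```

    Note that ties are broken arbitrarily by `argsort` here; production implementations typically average the reference values over tied ranks.
    
    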

  4. Transcriptomic Analysis Using Olive Varieties and Breeding Progenies Identifies Candidate Genes Involved in Plant Architecture.

    PubMed

    González-Plaza, Juan J; Ortiz-Martín, Inmaculada; Muñoz-Mérida, Antonio; García-López, Carmen; Sánchez-Sevilla, José F; Luque, Francisco; Trelles, Oswaldo; Bejarano, Eduardo R; De La Rosa, Raúl; Valpuesta, Victoriano; Beuzón, Carmen R

    2016-01-01

    Plant architecture is a critical trait in fruit crops that can significantly influence yield, pruning, planting density and harvesting. Little is known about how plant architecture is genetically determined in olive, where most of the existing varieties are traditional, with an architecture poorly suited for modern growing and harvesting systems. In the present study, we have carried out microarray analysis of meristematic tissue to compare expression profiles of olive varieties displaying differences in architecture, as well as seedlings from their cross pooled on the basis of shared architecture-related phenotypes. The microarray used, previously developed by our group, has already been applied to identify candidate genes involved in regulating the juvenile-to-adult transition in the shoot apex of seedlings. Varieties with distinct architecture phenotypes and individuals from segregating progenies displaying opposite architecture features were used to link phenotype to expression. Here, we identify 2252 differentially expressed genes (DEGs) associated with differences in plant architecture. Microarray results were validated by quantitative RT-PCR carried out on genes with functional annotation likely related to plant architecture. Twelve of these genes were further analyzed in individual seedlings of the corresponding pool. We also examined Arabidopsis mutants in putative orthologs of these targeted candidate genes, finding altered architecture for most of them. This supports functional conservation between species and the potential biological relevance of the candidate genes identified. This study is the first to identify genes associated with plant architecture in olive, and the results obtained could be of great help in future programs aimed at selecting phenotypes adapted to modern cultivation practices in this species.

  5. Application of Protein Microarrays for Multiplexed Detection of Antibodies to Tumor Antigens in Breast Cancer

    PubMed Central

    Anderson, Karen S.; Ramachandran, Niroshan; Wong, Jessica; Raphael, Jacob V.; Hainsworth, Eugenie; Demirkan, Gokhan; Cramer, Daniel; Aronzon, Diana; Hodi, F. Stephen; Harris, Lyndsay; Logvinenko, Tanya; LaBaer, Joshua

    2012-01-01

    There is strong preclinical evidence that cancer, including breast cancer, undergoes immune surveillance. This continual monitoring, by both the innate and the adaptive immune systems, recognizes changes in protein expression, mutation, folding, glycosylation, and degradation. Local immune responses to tumor antigens are amplified in draining lymph nodes, and then enter the systemic circulation. The antibody response to tumor antigens, such as the p53 protein, is robust, stable, and easily detected in serum; such antibodies may exist in greater concentrations than their cognate antigens and are potentially highly specific biomarkers for cancer. However, antibodies have limited sensitivities as single analytes, and differences in protein purification and assay characteristics have limited their clinical application. For example, p53 autoantibodies in the sera are highly specific for cancer patients, but are only detected in the sera of 10-20% of patients with breast cancer. Detection of p53 autoantibodies depends on tumor burden and p53 mutation status, decreases rapidly with effective therapy, but is relatively independent of breast cancer subtype. Although antibodies to hundreds of other tumor antigens have been identified in the sera of breast cancer patients, very little is known about the specificity and clinical impact of the antibody immune repertoire to breast cancer. Recent advances in proteomic technologies have the potential for rapid identification of immune response signatures for breast cancer diagnosis and monitoring. We have adapted programmable protein microarrays for the specific detection of autoantibodies in breast cancer. Here, we present the first demonstration of the application of programmable protein microarray ELISAs for the rapid identification of breast cancer autoantibodies. PMID:18311903

  6. Design and verification of a pangenome microarray oligonucleotide probe set for Dehalococcoides spp.

    PubMed

    Hug, Laura A; Salehi, Maryam; Nuin, Paulo; Tillier, Elisabeth R; Edwards, Elizabeth A

    2011-08-01

    Dehalococcoides spp. are an industrially relevant group of Chloroflexi bacteria capable of reductively dechlorinating contaminants in groundwater environments. Existing Dehalococcoides genomes revealed a high level of sequence identity within this group, including 98 to 100% 16S rRNA sequence identity between strains with diverse substrate specificities. Common molecular techniques for identification of microbial populations are often not applicable for distinguishing Dehalococcoides strains. Here we describe an oligonucleotide microarray probe set designed based on clustered Dehalococcoides genes from five different sources (strain DET195, CBDB1, BAV1, and VS genomes and the KB-1 metagenome). This "pangenome" probe set provides coverage of core Dehalococcoides genes as well as strain-specific genes while optimizing the potential for hybridization to closely related, previously unknown Dehalococcoides strains. The pangenome probe set was compared to probe sets designed independently for each of the five Dehalococcoides strains. The pangenome probe set demonstrated better predictability and higher detection of Dehalococcoides genes than strain-specific probe sets on nontarget strains with <99% average nucleotide identity. An in silico analysis of the expected probe hybridization against the recently released Dehalococcoides strain GT genome and additional KB-1 metagenome sequence data indicated that the pangenome probe set performs more robustly than the combined strain-specific probe sets in the detection of genes not included in the original design. The pangenome probe set represents a highly specific, universal tool for the detection and characterization of Dehalococcoides from contaminated sites. It has the potential to become a common platform for Dehalococcoides-focused research, allowing meaningful comparisons between microarray experiments regardless of the strain examined.

  7. In-depth characterization of breast cancer tumor-promoting cell transcriptome by RNA sequencing and microarrays

    PubMed Central

    Soldà, Giulia; Merlino, Giuseppe; Fina, Emanuela; Brini, Elena; Moles, Anna; Cappelletti, Vera; Daidone, Maria Grazia

    2016-01-01

    Numerous studies have reported the existence of tumor-promoting cells (TPC) with self-renewal potential and a relevant role in drug resistance. However, the pathways and modifications involved in the maintenance of such tumor subpopulations are still only partially understood. Sequencing-based approaches offer the opportunity for a detailed study of TPC, including their transcriptome modulation. Using microarray and RNA sequencing approaches, we compared the transcriptional profiles of parental MCF7 breast cancer cells with MCF7-derived TPC (i.e., MCFS). Data were explored using different bioinformatic approaches, and major findings were experimentally validated. The different analytical pipelines (Lifescope and Cufflinks based) yielded similar although not identical results. RNA sequencing data partially overlapped with the microarray results and displayed a higher dynamic range, although overall the two approaches concordantly predicted pathway modifications. Several biological functions were altered in TPC, ranging from production of inflammatory cytokines (i.e., IL-8 and MCP-1) to proliferation and response to steroid hormones. More than 300 non-coding RNAs were defined as differentially expressed, and 2,471 potential splicing events were identified. A consensus signature of genes up-regulated in TPC was derived and was found to be significantly associated with insensitivity to fulvestrant in a public breast cancer patient dataset. Overall, we obtained a detailed portrait of the transcriptome of a breast cancer TPC line, highlighted the role of non-coding RNAs and differential splicing, and identified a gene signature with potential as a context-specific biomarker in patients receiving endocrine treatment. PMID:26556871

  8. Transfection microarray and the applications.

    PubMed

    Miyake, Masato; Yoshikawa, Tomohiro; Fujita, Satoshi; Miyake, Jun

    2009-05-01

    Microarray transfection has been extensively studied for high-throughput functional analysis of mammalian cells. However, control of efficiency and reproducibility remain critical issues for practical use. By using solid-phase transfection accelerators and a nano-scaffold, we provide a highly efficient and reproducible microarray-transfection device, the "transfection microarray". The device could be applied to the limited numbers of available primary cells and stem cells, not only for large-scale functional analysis but also for reporter-based time-lapse cellular event analysis.

  9. A Human Lectin Microarray for Sperm Surface Glycosylation Analysis *

    PubMed Central

    Sun, Yangyang; Cheng, Li; Gu, Yihua; Xin, Aijie; Wu, Bin; Zhou, Shumin; Guo, Shujuan; Liu, Yin; Diao, Hua; Shi, Huijuan; Wang, Guangyu; Tao, Sheng-ce

    2016-01-01

    Glycosylation is one of the most abundant and functionally important protein post-translational modifications. As such, technology for efficient glycosylation analysis is in high demand. Lectin microarrays are a powerful tool for such investigations and have been successfully applied in a variety of glycobiological studies. However, most of the current lectin microarrays are primarily constructed from plant lectins, which are not well suited for studies of human glycosylation because of the extreme complexity of human glycans. Herein, we constructed a human lectin microarray with 60 human lectin and lectin-like proteins. All of the lectins and lectin-like proteins were purified from yeast, and most showed binding to human glycans. To demonstrate the applicability of the human lectin microarray, human sperm were probed on the microarray, and strong binding was observed for several lectins, including galectin-1, 7, 8, GalNAc-T6, and ERGIC-53 (LMAN1). These bindings were validated by flow cytometry and fluorescence immunostaining. Further, mass spectrometry analysis showed that galectin-1 binds several membrane-associated proteins, including heat shock protein 90. Finally, functional assays showed that binding of galectin-8 could significantly enhance the acrosome reaction in human sperm. To our knowledge, this is the first construction of a human lectin microarray, and we anticipate it will find wide use for a range of human or mammalian studies, alone or in combination with plant lectin microarrays. PMID:27364157

  10. Performance and quality assessment of the global ocean eddy-permitting physical reanalysis GLORYS2V4.

    NASA Astrophysics Data System (ADS)

    Garric, Gilles; Parent, Laurent; Greiner, Eric; Drévillon, Marie; Hamon, Mathieu; Lellouche, Jean-Michel; Régnier, Charly; Desportes, Charles; Le Galloudec, Olivier; Bricaud, Clement; Drillet, Yann; Hernandez, Fabrice; Le Traon, Pierre-Yves

    2017-04-01

    The purpose of this presentation is to give an overview of the recent upgrade of GLORYS2 (version 4, GLORYS2V4 hereafter), the latest ocean reanalysis produced at Mercator Ocean that covers the altimetry era (1993-2015) in the framework of the Copernicus Marine Environment Monitoring Service (CMEMS; http://marine.copernicus.eu/). The reanalysis is run at eddy-permitting resolution (¼° horizontal resolution and 75 vertical levels) with the NEMO model and driven at the surface by the ERA-Interim reanalysis from ECMWF (European Centre for Medium-Range Weather Forecasts). The reanalysis system uses a multi-data and multivariate reduced-order Kalman filter based on the singular extended evolutive Kalman (SEEK) filter formulation, together with a 3D-VAR large-scale bias correction. The assimilated observations are along-track satellite altimetry, sea surface temperature, sea ice concentration, and in-situ profiles of temperature and salinity. With respect to the previous version (GLORYS2V3), GLORYS2V4 contains a number of improvements, in particular: a) new initial temperature and salinity conditions derived from the EN4 database with a better mass equilibrium with altimetry; b) the use of the updated delayed-mode CORA in-situ observations from CMEMS; c) a new hybrid Mean Dynamic Topography (MDT) for the assimilation scheme, referenced over the 1993-2013 period; d) a better observation operator for altimetry observations in the data assimilation scheme; e) a correction of large-scale ERA-Interim atmospheric surface (precipitation and radiative) fluxes, as in GLORYS2V3 but toward a new satellite data set; and f) an update of the climatological runoff database using the latest version of the Dai (2009) data set for the global ocean, together with a better accounting of freshwater fluxes from polar ice sheet glaciers.
The presentation will show that the new reanalysis outperforms the previous version in many aspects such as biases and root mean squared error and, especially in representing the variability of global heat and salt content and associated steric sea level in the last two decades. The dataset is available in NetCDF format and GLORYS2V4 best analysis products are distributed onto the CMEMS data portal.

  11. Evaluation of ACCMIP ozone simulations and ozonesonde sampling biases using a satellite-based multi-constituent chemical reanalysis

    NASA Astrophysics Data System (ADS)

    Miyazaki, Kazuyuki; Bowman, Kevin

    2017-07-01

    The Atmospheric Chemistry Climate Model Intercomparison Project (ACCMIP) ensemble ozone simulations for the present day (from the 2000s decadal simulations) are evaluated against a state-of-the-art multi-constituent atmospheric chemical reanalysis that ingests multiple satellite data sets, including the Tropospheric Emission Spectrometer (TES), the Microwave Limb Sounder (MLS), the Ozone Monitoring Instrument (OMI), and the Measurement of Pollution in the Troposphere (MOPITT), for 2005-2009. Validation of the chemical reanalysis against global ozonesondes shows good agreement throughout the free troposphere and lower stratosphere for both seasonal and year-to-year variations, with an annual mean bias of less than 0.9 ppb in the middle and upper troposphere in the tropics and mid-latitudes. The reanalysis provides comprehensive spatiotemporal evaluation of chemistry-model performance that complements direct ozonesonde comparisons, which are shown to suffer from significant sampling bias. The reanalysis reveals that the ACCMIP ensemble mean overestimates ozone in the northern extratropics by 6-11 ppb while underestimating it by up to 18 ppb in the southern tropics over the Atlantic in the lower troposphere. Most models underestimate the spatial variability of the annual mean lower tropospheric concentrations in the extratropics of both hemispheres by up to 70%. The ensemble mean also overestimates the seasonal amplitude by 25-70% in the northern extratropics and overestimates the inter-hemispheric gradient by about 30% in the lower and middle troposphere. Part of the discrepancy can be attributed to comparing the 5-year reanalysis with the decadal model simulations. However, these differences are less evident with the current sonde network. To estimate ozonesonde sampling biases, we computed model bias separately for global coverage and for the ozonesonde network.
The ozonesonde sampling bias in the evaluated model bias for the seasonal mean concentration relative to global coverage is 40-50 % over the western Pacific and east Indian Ocean and reaches 110 % over the equatorial Americas and up to 80 % for the global tropics. In contrast, the ozonesonde sampling bias is typically smaller than 30 % for the Arctic regions in the lower and middle troposphere. These systematic biases have implications for ozone radiative forcing and the response of chemistry to climate that can be further quantified as the satellite observational record extends to multiple decades.
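The sampling-bias estimate described above, comparing model bias over the full grid with bias computed only at sonde locations, can be sketched as follows (a minimal illustration only, not the study's code; the flattened grid arrays and the station mask are hypothetical inputs):

```python
def mean_bias(model, ref, mask=None):
    """Mean model-minus-reference bias, optionally restricted to the
    grid cells flagged in `mask` (e.g. cells containing a sonde station)."""
    idx = range(len(model)) if mask is None else [i for i, m in enumerate(mask) if m]
    return sum(model[i] - ref[i] for i in idx) / len(idx)

def sampling_bias_pct(model, ref, network_mask):
    """Relative difference between the network-sampled bias and the
    full-coverage bias, in percent of the full-coverage bias."""
    full = mean_bias(model, ref)
    net = mean_bias(model, ref, network_mask)
    return 100.0 * (net - full) / abs(full)
```

On a toy 4-cell grid where the model is uniformly high but the network only samples the less-biased cells, the network-based bias understates the true (global) bias, which is exactly the effect quantified in the abstract.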

  12. Sensitivity of Crop Gross Primary Production Simulations to In-situ and Reanalysis Meteorological Data

    NASA Astrophysics Data System (ADS)

    Jin, C.; Xiao, X.; Wagle, P.

    2014-12-01

    Accurate estimation of crop Gross Primary Production (GPP) is important for food security and the terrestrial carbon cycle. Numerous publications have reported the potential of satellite-based Production Efficiency Models (PEMs) to estimate GPP driven by in-situ climate data. Simulations of the PEMs often require surface reanalysis climate data as inputs, for example, the North America Regional Reanalysis (NARR) datasets. These reanalysis datasets show certain biases relative to the in-situ climate datasets. Thus, sensitivity analysis of the PEMs to the climate inputs is needed before their application at the regional scale. This study used the satellite-based Vegetation Photosynthesis Model (VPM), which is driven by solar radiation (R), air temperature (T), and satellite-based vegetation indices, to quantify the causes and degree of uncertainty in crop GPP estimates due to different meteorological inputs at the 8-day interval (in-situ AmeriFlux data and NARR surface reanalysis data). The NARR radiation and temperature (RNARR, TNARR) explained over 95% of the variability in the in-situ RAF and TAF measured at AmeriFlux sites. The bias of TNARR was relatively small. However, RNARR had a systematic positive bias of ~3.5 MJ m-2 day-1 relative to RAF. A simple adjustment based on the spatial statistics between RNARR and RAF produced relatively accurate radiation data for all crop site-years, reducing RMSE from 4 to 1.7 MJ m-2 day-1. The VPM-based GPP estimates with the three climate datasets (in-situ, and NARR before and after adjustment: GPPVPM,AF, GPPVPM,NARR, and GPPVPM,adjNARR) showed good agreement with the seasonal dynamics of crop GPP derived from the flux towers (GPPAF). GPPVPM,AF differed from GPPAF by 2% for maize and by -8% to -12% for soybean at the 8-day interval. The positive bias of RNARR resulted in an overestimation of GPPVPM,NARR for both the maize and soybean systems. However, GPPVPM,adjNARR significantly reduced the uncertainty of the maize GPP from 25% to 2%.
The results from this study reveal that errors in the NARR surface reanalysis data introduce significant uncertainties into PEM-based GPP estimates. Therefore, it is important to develop more accurate radiation datasets for estimating the gross and net primary production of terrestrial ecosystems at the regional and global scales.
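A "simple adjustment" of reanalysis radiation against tower measurements, as described above, can be illustrated with a least-squares sketch (an assumed linear form; the abstract does not specify the exact adjustment used in the study):

```python
def rmse(a, b):
    """Root-mean-square error between two equal-length series."""
    return (sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)) ** 0.5

def linear_adjust(r_narr, r_af):
    """Fit r_af ~ c0 + c1 * r_narr by ordinary least squares on
    site-years with tower data, then apply the fit to correct the
    reanalysis radiation."""
    n = len(r_narr)
    mx = sum(r_narr) / n
    my = sum(r_af) / n
    sxx = sum((x - mx) ** 2 for x in r_narr)
    c1 = sum((x - mx) * (y - my) for x, y in zip(r_narr, r_af)) / sxx
    c0 = my - c1 * mx
    return [c0 + c1 * x for x in r_narr]
```

With a synthetic series carrying the ~3.5 MJ m-2 day-1 positive offset reported above, the fitted adjustment removes the bias and drives the RMSE toward zero.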

  13. EMAAS: An extensible grid-based Rich Internet Application for microarray data analysis and management

    PubMed Central

    Barton, G; Abbott, J; Chiba, N; Huang, DW; Huang, Y; Krznaric, M; Mack-Smith, J; Saleem, A; Sherman, BT; Tiwari, B; Tomlinson, C; Aitman, T; Darlington, J; Game, L; Sternberg, MJE; Butcher, SA

    2008-01-01

    Background Microarray experimentation requires the application of complex analysis methods as well as the use of non-trivial computer technologies to manage the resultant large data sets. This, together with the proliferation of tools and techniques for microarray data analysis, makes it very challenging for a laboratory scientist to keep up-to-date with the latest developments in this field. Our aim was to develop a distributed e-support system for microarray data analysis and management. Results EMAAS (Extensible MicroArray Analysis System) is a multi-user rich internet application (RIA) providing simple, robust access to up-to-date resources for microarray data storage and analysis, combined with integrated tools to optimise real time user support and training. The system leverages the power of distributed computing to perform microarray analyses, and provides seamless access to resources located at various remote facilities. The EMAAS framework allows users to import microarray data from several sources to an underlying database, to pre-process, quality assess and analyse the data, to perform functional analyses, and to track data analysis steps, all through a single easy to use web portal. This interface offers distance support to users both in the form of video tutorials and via live screen feeds using the web conferencing tool EVO. A number of analysis packages, including R-Bioconductor and Affymetrix Power Tools have been integrated on the server side and are available programmatically through the Postgres-PLR library or on grid compute clusters. Integrated distributed resources include the functional annotation tool DAVID, GeneCards and the microarray data repositories GEO, CELSIUS and MiMiR. EMAAS currently supports analysis of Affymetrix 3' and Exon expression arrays, and the system is extensible to cater for other microarray and transcriptomic platforms. 
Conclusion EMAAS enables users to track and perform microarray data management and analysis tasks through a single easy-to-use web application. The system architecture is flexible and scalable to allow new array types, analysis algorithms and tools to be added with relative ease and to cope with large increases in data volume. PMID:19032776

  14. THE MAQC PROJECT: ESTABLISHING QC METRICS AND THRESHOLDS FOR MICROARRAY QUALITY CONTROL

    EPA Science Inventory

    Microarrays represent a core technology in pharmacogenomics and toxicogenomics; however, before this technology can successfully and reliably be applied in clinical practice and regulatory decision-making, standards and quality measures need to be developed. The Microarray Qualit...

  15. Validation and uncertainty analysis for monthly and extreme precipitation in the ERA-20C reanalysis based on the WZN in-situ measurements

    NASA Astrophysics Data System (ADS)

    Rustemeier, Elke; Ziese, Markus; Raykova, Kristin; Meyer-Christoffer, Anja; Schneider, Udo; Finger, Peter; Becker, Andreas

    2017-04-01

    The proper representation of precipitation, in particular extreme precipitation, in global reanalyses is still challenging. This paper focuses on the potential of the ERA-20C centennial reanalysis to reproduce precipitation events. The global ERA-20C Reanalysis has been developed within the project ERA-CLIM and its successor ERA-CLIM2, with the aim of a multi-decadal reanalysis of the global climate system. One of the objectives of ERA-CLIM2 is to provide useful information about the uncertainty of the various parameters. Since precipitation is a prognostic variable, it allows for independent validation against in-situ measurements. For this purpose, the Global Precipitation Climatology Centre (GPCC), operated by the DWD, has compared the ERA-20C Reanalysis with the GPCC observational products "Full Data Monthly Version 7" (FDM-V7) and "Full Data Daily Version 1" (FDD-V1). ERA-20C is based on the ECMWF prediction model IFS version Cy38r1 with a spatial resolution of approximately 125 km and covers the 111 years from 1900 to 2010. The GPCC FDM-V7 raster data product, on the other hand, includes the global land-surface in-situ measurements between 1901 and 2013 (Schneider et al., 2014), and the FDD-V1 raster data product covers daily precipitation from 1988 to 2013 at daily resolution. The most suitable resolution of 1° was used to validate ERA-20C. For the spatial and temporal validation of the ERA-20C Reanalysis, global temporal scores were calculated on monthly, seasonal, and annual time scales. These include, e.g., monthly contingency table scores, correlation, and the climate change indices (ETCCDI) for precipitation used to determine extreme values and their temporal change (Peterson et al., 2001, Appendix A). Not surprisingly, the regions with the strongest differences are also those with data scarcity, mountain regions with their windward and leeward effects, and monsoon areas. These all show strong systematic differences and breaks within the time series. 
Differences between ERA-20C and FDD-V1 based on the ETCCDI diagnoses were detected particularly in regions with large precipitation totals, especially in Africa in the ITCZ area and in Indonesia. The overall comparison reveals geo-spatially heterogeneous results, with areas of similar precipitation characteristics, but also areas that still remain challenging for the reanalysis' fidelity in representing the FDM-V7 and FDD-V1 based diagnostics. The results provide good guidance on where improvements in future IFS model versions should be most effective. Peterson, T., Folland, C., Gruza, G., Hogg, W., Mokssit, A. and Plummer, N. (2001): Report on the activities of the working group on climate change detection and related rapporteurs. Geneva: World Meteorological Organization. Poli, P., H. Hersbach, D. Tan, D. Dee, J.-N. Thépaut, A. Simmons, C. Peubey, P. Laloyaux, T. Komori, P. Berrisford, R. Dragani, Y. Trémolet, E. Hólm, M. Bonavita, L. Isaksen and M. Fisher (2013): The data assimilation system and initial performance evaluation of the ECMWF pilot reanalysis of the 20th-century assimilating surface observations only (ERA-20C), ERA Report Series 14, http://www.ecmwf.int/publications/library/do/references/show?id=90833. Schneider, Udo, Andreas Becker, Peter Finger, Anja Meyer-Christoffer, Bruno Rudolf and Markus Ziese (2015): GPCC Full Data Reanalysis Version 7.0 at 1.0°: Monthly Land-Surface Precipitation from Rain-Gauges built on GTS-based and Historic Data. DOI: 10.5676/DWD_GPCC/FD_M_V7_100
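The monthly contingency table scores mentioned above can be illustrated with a small sketch (hypothetical data; the event threshold, in mm, is chosen arbitrarily and is not a value from the study):

```python
def contingency_scores(forecast, observed, threshold=1.0):
    """Count hits, misses, and false alarms for precipitation events
    at or above `threshold`, then form three standard scores:
    probability of detection (POD), false alarm ratio (FAR), and
    frequency bias (forecast event count / observed event count)."""
    hits = misses = false_alarms = 0
    for f, o in zip(forecast, observed):
        f_event, o_event = f >= threshold, o >= threshold
        if f_event and o_event:
            hits += 1
        elif o_event:
            misses += 1
        elif f_event:
            false_alarms += 1
    pod = hits / (hits + misses)
    far = false_alarms / (hits + false_alarms)
    freq_bias = (hits + false_alarms) / (hits + misses)
    return pod, far, freq_bias
```

A frequency bias of 1.0 means the reanalysis produces precipitation events as often as observed, even if not at the right times; POD and FAR then separate correct detections from spurious events.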

  16. Assessing differential expression in two-color microarrays: a resampling-based empirical Bayes approach.

    PubMed

    Li, Dongmei; Le Pape, Marc A; Parikh, Nisha I; Chen, Will X; Dye, Timothy D

    2013-01-01

    Microarrays are widely used for examining differential gene expression, identifying single nucleotide polymorphisms, and detecting methylation loci. Multiple testing methods in microarray data analysis aim at controlling both Type I and Type II error rates; however, real microarray data do not always fit their distribution assumptions. Smyth's ubiquitous parametric method, for example, inadequately accommodates violations of normality assumptions, resulting in inflated Type I error rates. The Significance Analysis of Microarrays, another widely used microarray data analysis method, is based on a permutation test and is robust to non-normally distributed data; however, the Significance Analysis of Microarrays method fold change criteria are problematic, and can critically alter the conclusion of a study, as a result of compositional changes of the control data set in the analysis. We propose a novel approach, combining resampling with empirical Bayes methods: the Resampling-based empirical Bayes Methods. This approach not only reduces false discovery rates for non-normally distributed microarray data, but it is also impervious to fold change threshold since no control data set selection is needed. Through simulation studies, sensitivities, specificities, total rejections, and false discovery rates are compared across the Smyth's parametric method, the Significance Analysis of Microarrays, and the Resampling-based empirical Bayes Methods. Differences in false discovery rates controls between each approach are illustrated through a preterm delivery methylation study. The results show that the Resampling-based empirical Bayes Methods offer significantly higher specificity and lower false discovery rates compared to Smyth's parametric method when data are not normally distributed. 
The Resampling-based empirical Bayes Methods also offer higher statistical power than the Significance Analysis of Microarrays method when the proportion of significantly differentially expressed genes is large, for both normally and non-normally distributed data. Finally, the Resampling-based empirical Bayes Methods are generalizable to next-generation sequencing RNA-seq data analysis.
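The core idea of combining empirical Bayes variance shrinkage with resampling, rather than relying on a normality assumption, can be sketched for a single gene as follows (a minimal illustration, not the authors' implementation; the prior variance `s2_prior` and prior degrees of freedom `d0` are assumed to have been estimated elsewhere, e.g. from all genes on the array):

```python
import random
import statistics

def moderated_t(group_a, group_b, s2_prior, d0):
    """Moderated t-statistic: the gene-wise pooled variance is shrunk
    toward a prior variance (the empirical Bayes step)."""
    na, nb = len(group_a), len(group_b)
    ma, mb = statistics.fmean(group_a), statistics.fmean(group_b)
    df = na + nb - 2
    ss = sum((x - ma) ** 2 for x in group_a) + sum((x - mb) ** 2 for x in group_b)
    s2 = ss / df
    # shrink the sample variance toward the prior, weighted by d0
    s2_mod = (d0 * s2_prior + df * s2) / (d0 + df)
    se = (s2_mod * (1.0 / na + 1.0 / nb)) ** 0.5
    return (ma - mb) / se

def resampling_p_value(group_a, group_b, s2_prior, d0, n_resamples=2000, seed=0):
    """Two-sided p-value from resampling group labels, avoiding any
    distributional assumption on the expression values."""
    rng = random.Random(seed)
    observed = abs(moderated_t(group_a, group_b, s2_prior, d0))
    pooled = list(group_a) + list(group_b)
    na = len(group_a)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        t = moderated_t(pooled[:na], pooled[na:], s2_prior, d0)
        if abs(t) >= observed:
            hits += 1
    return (hits + 1) / (n_resamples + 1)
```

In a full analysis this would be applied per gene, with the resulting p-values fed into a false discovery rate procedure; the sketch omits that multiplicity step.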

  17. Well, what about intraspecific variation? Taxonomic and phylogenetic characters in the genus Synoeca de Saussure (Hymenoptera, Vespidae).

    PubMed

    Carpenter, James M; Andena, Sergio R; Noll, Fernando B; Wenzel, John W

    2013-01-01

    Cely and Sarmiento (2011) took issue with the cladistic analysis of relationships among species of the genus Synoeca by Andena et al. (2009a), and presented a reanalysis. They claimed that intraspecific variation in the genus is meaningful, and proper consideration yields a conclusion different from that of Andena et al. Both their critique and reanalysis are vitiated by numerous errors, as is shown in the present paper.

  18. Assessing Hydrological and Energy Budgets in Amazonia through Regional Downscaling, and Comparisons with Global Reanalysis Products

    NASA Astrophysics Data System (ADS)

    Nunes, A.; Ivanov, V. Y.

    2014-12-01

    Although current global reanalyses provide reasonably accurate large-scale features of the atmosphere, systematic errors are still found in the hydrological and energy budgets of such products. In the tropics, precipitation is particularly challenging to model, and modeling is further hampered by the scarcity of hydrometeorological datasets in the region. With the goal of producing downscaled analyses that are appropriate for climate assessment at regional scales, a regional spectral model was run using a combination of precipitation assimilation and scale-selective bias correction. The latter is similar to the spectral nudging technique, which prevents the departure of the regional model's internal states from the large-scale forcing. The target area in this study is the Amazon region, where large errors are detected in reanalysis precipitation. To generate the downscaled analysis, the regional climate model used the NCEP/DOE R2 global reanalysis as the initial and lateral boundary conditions, and assimilated NOAA's Climate Prediction Center (CPC) MORPHed precipitation (CMORPH), available at 0.25-degree resolution every 3 hours. The regional model's precipitation was successfully brought closer to the observations, in comparison to the NCEP global reanalysis products, as a result of the impact of the precipitation assimilation scheme on the cumulus-convection parameterization and of the improved boundary forcing achieved through a new version of the scale-selective bias correction. Water and energy budget terms were also evaluated against global reanalyses and other datasets.
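A scale-selective correction in the spirit of spectral nudging can be caricatured in one dimension as follows (an illustrative sketch only: a moving average stands in for the spectral low-pass used in practice, and `alpha` is a hypothetical relaxation coefficient, not a parameter from the study):

```python
def low_pass(field, window=5):
    """Crude large-scale component via a centered moving average,
    a stand-in for the spectral truncation used in real implementations."""
    n = len(field)
    half = window // 2
    out = []
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        out.append(sum(field[lo:hi]) / (hi - lo))
    return out

def scale_selective_nudge(regional, driving, alpha=0.5, window=5):
    """Relax the regional field's large-scale component toward that of
    the driving reanalysis by a fraction `alpha`, leaving the small
    scales (the added value of the regional model) untouched."""
    rl = low_pass(regional, window)
    dl = low_pass(driving, window)
    return [r + alpha * (d - l) for r, l, d in zip(regional, rl, dl)]
```

When the regional and driving fields already agree, the correction is zero; when the driving field differs only at large scales, the nudge moves the regional field toward it without altering its fine-scale structure.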

  19. A generalized multivariate regression model for modelling ocean wave heights

    NASA Astrophysics Data System (ADS)

    Wang, X. L.; Feng, Y.; Swail, V. R.

    2012-04-01

    In this study, a generalized multivariate linear regression model is developed to represent the relationship between 6-hourly ocean significant wave heights (Hs) and the corresponding 6-hourly mean sea level pressure (MSLP) fields. The model is calibrated using the ERA-Interim reanalysis of Hs and MSLP fields for 1981-2000, and is validated using the ERA-Interim reanalysis for 2001-2010 and the ERA40 reanalysis of Hs and MSLP for 1958-2001. The performance of the fitted model is evaluated in terms of the Peirce skill score, the frequency bias index, and the correlation skill score. Because wave heights are not normally distributed, they are subjected to a data-adaptive Box-Cox transformation before being used in the model fitting. Also, since 6-hourly data are being modelled, lag-1 autocorrelation must be, and is, accounted for. The models with and without the Box-Cox transformation, and with and without accounting for autocorrelation, are inter-compared in terms of their prediction skills. The fitted MSLP-Hs relationship is then used to reconstruct the historical wave height climate from the 6-hourly MSLP fields taken from the Twentieth Century Reanalysis (20CR, Compo et al. 2011), and to project possible future wave height climates using CMIP5 model simulations of MSLP fields. The reconstructed and projected wave heights, both seasonal means and maxima, are subjected to a trend analysis that allows for non-linear (polynomial) trends.
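The two model ingredients named above, the Box-Cox transformation and the lag-1 autocorrelation of 6-hourly series, can be illustrated with a small sketch (not the paper's code; the data-adaptive selection of the transformation parameter lambda is omitted and lambda is passed in directly):

```python
import math

def box_cox(y, lam):
    """Box-Cox transform: (y**lam - 1) / lam, or log(y) when lam == 0.
    Requires strictly positive values, as wave heights are."""
    if lam == 0:
        return [math.log(v) for v in y]
    return [(v ** lam - 1.0) / lam for v in y]

def inv_box_cox(z, lam):
    """Inverse transform, used to map model output back to wave heights."""
    if lam == 0:
        return [math.exp(v) for v in z]
    return [(lam * v + 1.0) ** (1.0 / lam) for v in z]

def lag1_autocorr(x):
    """Lag-1 autocorrelation, the serial dependence a 6-hourly
    regression model must account for."""
    n = len(x)
    m = sum(x) / n
    num = sum((x[i] - m) * (x[i + 1] - m) for i in range(n - 1))
    den = sum((v - m) ** 2 for v in x)
    return num / den
```

In the full model the regression is fitted in the transformed space with an autocorrelated error term; this sketch only exposes the two building blocks.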

  20. Report of an Expert Panel on the reanalysis by Séralini et al. of a 90-day study conducted by Monsanto in support of the safety of a genetically modified corn variety (MON 863).

    PubMed

    Doull, J; Gaylor, D; Greim, H A; Lovell, D P; Lynch, B; Munro, I C

    2007-11-01

    MON 863, a genetically engineered corn variety that contains the gene for modified Bacillus thuringiensis Cry3Bb1 protein to protect against corn rootworm, was tested in a 90-day toxicity study as part of the process to gain regulatory approval. This study was reanalyzed by Séralini et al. who contended that the study showed possible hepatorenal effects of MON 863. An Expert Panel was convened to assess the original study results as analyzed by the Monsanto Company and the reanalysis conducted by Séralini et al. The Expert Panel concludes that the Séralini et al. reanalysis provided no evidence to indicate that MON 863 was associated with adverse effects in the 90-day rat study. In each case, statistical findings reported by both Monsanto and Séralini et al. were considered to be unrelated to treatment or of no biological or clinical importance because they failed to demonstrate a dose-response relationship, reproducibility over time, association with other relevant changes (e.g., histopathology), occurrence in both sexes, difference outside the normal range of variation, or biological plausibility with respect to cause-and-effect. The Séralini et al. reanalysis does not advance any new scientific data to indicate that MON 863 caused adverse effects in the 90-day rat study.

  1. Analysis of the precipitation and streamflow extremes in Northern Italy using high resolution reanalysis dataset Express-Hydro

    NASA Astrophysics Data System (ADS)

    Silvestro, Francesco; Parodi, Antonio; Campo, Lorenzo

    2017-04-01

    The characterization of hydrometeorological extremes, both in terms of rainfall and streamflow, in a given region plays a key role in the environmental monitoring provided by flood alert services. In recent years, meteorological simulations (both near-real-time and historical reanalyses) have become available at increasing spatial and temporal resolutions, making possible long-period hydrological reanalyses in which the meteorological dataset is used as input to distributed hydrological models. In this work, a very high resolution meteorological reanalysis dataset, namely Express-Hydro (CIMA, ISAC-CNR, GAUSS Special Project PR45DE), was employed as input to the hydrological model Continuum in order to produce long time series of streamflows for the Liguria territory, located in the northern part of Italy. The original dataset covers the whole European territory over the 1979-2008 period, at 4 km spatial resolution and 3-hour time resolution. Analyses comparing the rainfall estimated by the dataset with the observations (available from the local raingauge network) were carried out, and a bias correction was also performed in order to better match the observed climatology. An extreme-value analysis was eventually carried out on the streamflow time series obtained from the simulations, by comparing them with the results of the same hydrological model fed with the observed rainfall time series. The results of the analysis are shown and discussed.
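The abstract does not specify which bias-correction method was used; one common choice for matching an observed climatology is empirical quantile mapping, sketched here under the simplifying assumption that the model and observed climatology samples have equal length:

```python
from bisect import bisect_left

def quantile_map(value, model_sorted, obs_sorted):
    """Empirical quantile mapping: locate the value's rank in the
    sorted model climatology, then read off the observed value at the
    same rank. Both climatologies must have the same length."""
    n = len(model_sorted)
    rank = bisect_left(model_sorted, value)
    return obs_sorted[min(rank, n - 1)]

def correct_series(series, model_clim, obs_clim):
    """Apply the quantile mapping to every value of a rainfall series."""
    ms = sorted(model_clim)
    os_ = sorted(obs_clim)
    return [quantile_map(v, ms, os_) for v in series]
```

A production implementation would interpolate between quantiles and handle wet-day frequency separately; this sketch only conveys the rank-matching idea.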

  2. Arctic atmospheric preconditioning: do not rule out shortwave radiation just yet

    NASA Astrophysics Data System (ADS)

    Sedlar, J.

    2017-12-01

    Springtime atmospheric preconditioning of Arctic sea ice for enhanced or buffered sea ice melt during the subsequent melt year has received considerable research focus in recent years. A general consensus points to enhanced poleward atmospheric transport of moisture and heat during spring, effectively increasing the emission of longwave radiation to the surface. Studies have essentially ruled out the role of shortwave radiation as an effective preconditioning mechanism because of the relatively weak incident solar radiation and high surface albedo from sea ice and snow during spring. These conclusions, however, are derived primarily from atmospheric reanalysis data, which may not always represent an accurate depiction of the Arctic climate system. Here, observations of top of atmosphere radiation from state of the art satellite sensors are examined and compared with reanalysis and climate model data to examine the differences in the spring radiative budget over the Arctic Ocean for years with extreme low/high ice extent at the end of the ice melt season (September). Distinct biases are observed between satellite-based measurements and reanalysis/models, particularly for the amount of shortwave radiation trapped (warming effect) within the Arctic climate system during spring months. A connection between the differences in reanalysis/model surface albedo representation and the albedo observed by satellite is discussed. These results suggest that shortwave radiation should not be overlooked as a significant contributing mechanism to springtime Arctic atmospheric preconditioning.

  3. Evaluating concentration estimation errors in ELISA microarray experiments

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Daly, Don S.; White, Amanda M.; Varnum, Susan M.

    Enzyme-linked immunosorbent assay (ELISA) is a standard immunoassay to predict a protein concentration in a sample. Deploying ELISA in a microarray format permits simultaneous prediction of the concentrations of numerous proteins in a small sample. These predictions, however, are uncertain due to processing error and biological variability. Evaluating prediction error is critical to interpreting biological significance and improving the ELISA microarray process. Evaluating prediction error must be automated to realize a reliable high-throughput ELISA microarray system. Methods: In this paper, we present a statistical method based on propagation of error to evaluate prediction errors in the ELISA microarray process. Although propagation of error is central to this method, it is effective only when comparable data are available. Therefore, we briefly discuss the roles of experimental design, data screening, normalization and statistical diagnostics when evaluating ELISA microarray prediction errors. We use an ELISA microarray investigation of breast cancer biomarkers to illustrate the evaluation of prediction errors. The illustration begins with a description of the design and resulting data, followed by a brief discussion of data screening and normalization. In our illustration, we fit a standard curve to the screened and normalized data, review the modeling diagnostics, and apply propagation of error.
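    The standard-curve inversion with propagation of error can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a simple linear standard curve in log10 concentration (real ELISA curves are often four-parameter logistic), treats the fitted coefficients as fixed, and propagates only the signal error via the first-order delta method. All function names and numbers are hypothetical.

```python
import numpy as np

def fit_standard_curve(log_conc, signal):
    """Least-squares fit of: signal = a + b * log10(concentration)."""
    b, a = np.polyfit(log_conc, signal, 1)  # returns (slope, intercept)
    return a, b

def predict_with_error(signal, sigma_signal, a, b):
    """Invert the standard curve and propagate the signal error
    (first-order delta method, coefficients a and b treated as fixed)."""
    log_x = (signal - a) / b
    x = 10.0 ** log_x
    # d(x)/d(signal) = x * ln(10) / b
    sigma_x = abs(x * np.log(10.0) / b) * sigma_signal
    return x, sigma_x

# Toy calibration: signal rises linearly with log10 concentration (a=100, b=200)
log_conc = np.array([0.0, 1.0, 2.0, 3.0])
signal = np.array([100.0, 300.0, 500.0, 700.0])
a, b = fit_standard_curve(log_conc, signal)
conc, sigma_conc = predict_with_error(400.0, 10.0, a, b)
```

    The same delta-method pattern extends to propagating uncertainty in the fitted coefficients themselves, using the covariance matrix from the curve fit.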

  4. The Importance of Normalization on Large and Heterogeneous Microarray Datasets

    EPA Science Inventory

    DNA microarray technology is a powerful functional genomics tool increasingly used for investigating global gene expression in environmental studies. Microarrays can also be used in identifying biological networks, as they give insight on the complex gene-to-gene interactions, ne...

  5. SIMULATION AND VISUALIZATION OF FLOW PATTERN IN MICROARRAYS FOR LIQUID PHASE OLIGONUCLEOTIDE AND PEPTIDE SYNTHESIS

    PubMed Central

    O-Charoen, Sirimon; Srivannavit, Onnop; Gulari, Erdogan

    2008-01-01

    Microfluidic microarrays have been developed for economical and rapid parallel synthesis of oligonucleotide and peptide libraries. For a synthesis system to be reproducible and uniform, it is crucial to have uniform reagent delivery throughout the system. Computational fluid dynamics (CFD) is used to model and simulate the microfluidic microarrays to study geometrical effects on flow patterns. With proper design of the geometry, flow uniformity could be obtained in every microreactor in the microarrays. PMID:17480053

  6. The application of DNA microarrays in gene expression analysis.

    PubMed

    van Hal, N L; Vorst, O; van Houwelingen, A M; Kok, E J; Peijnenburg, A; Aharoni, A; van Tunen, A J; Keijer, J

    2000-03-31

    DNA microarray technology is a new and powerful technology that will substantially increase the speed of molecular biological research. This paper gives a survey of DNA microarray technology and its use in gene expression studies. The technical aspects and their potential improvements are discussed. These comprise array manufacturing and design, array hybridisation, scanning, and data handling. Furthermore, it is discussed how DNA microarrays can be applied to the safety, functionality and health effects of food, as well as to gene discovery and pathway engineering in plants.

  7. Sandwich ELISA Microarrays: Generating Reliable and Reproducible Assays for High-Throughput Screens

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gonzalez, Rachel M.; Varnum, Susan M.; Zangar, Richard C.

    The sandwich ELISA microarray is a powerful screening tool in biomarker discovery and validation due to its ability to simultaneously probe for multiple proteins in a miniaturized assay. The technical challenges of generating and processing the arrays are numerous. However, careful attention to possible pitfalls in the development of your antibody microarray assay can overcome these challenges. In this chapter, we describe in detail the steps that are involved in generating a reliable and reproducible sandwich ELISA microarray assay.

  8. Comparison of RNA-seq and microarray-based models for clinical endpoint prediction.

    PubMed

    Zhang, Wenqian; Yu, Ying; Hertwig, Falk; Thierry-Mieg, Jean; Zhang, Wenwei; Thierry-Mieg, Danielle; Wang, Jian; Furlanello, Cesare; Devanarayan, Viswanath; Cheng, Jie; Deng, Youping; Hero, Barbara; Hong, Huixiao; Jia, Meiwen; Li, Li; Lin, Simon M; Nikolsky, Yuri; Oberthuer, André; Qing, Tao; Su, Zhenqiang; Volland, Ruth; Wang, Charles; Wang, May D; Ai, Junmei; Albanese, Davide; Asgharzadeh, Shahab; Avigad, Smadar; Bao, Wenjun; Bessarabova, Marina; Brilliant, Murray H; Brors, Benedikt; Chierici, Marco; Chu, Tzu-Ming; Zhang, Jibin; Grundy, Richard G; He, Min Max; Hebbring, Scott; Kaufman, Howard L; Lababidi, Samir; Lancashire, Lee J; Li, Yan; Lu, Xin X; Luo, Heng; Ma, Xiwen; Ning, Baitang; Noguera, Rosa; Peifer, Martin; Phan, John H; Roels, Frederik; Rosswog, Carolina; Shao, Susan; Shen, Jie; Theissen, Jessica; Tonini, Gian Paolo; Vandesompele, Jo; Wu, Po-Yen; Xiao, Wenzhong; Xu, Joshua; Xu, Weihong; Xuan, Jiekun; Yang, Yong; Ye, Zhan; Dong, Zirui; Zhang, Ke K; Yin, Ye; Zhao, Chen; Zheng, Yuanting; Wolfinger, Russell D; Shi, Tieliu; Malkas, Linda H; Berthold, Frank; Wang, Jun; Tong, Weida; Shi, Leming; Peng, Zhiyu; Fischer, Matthias

    2015-06-25

    Gene expression profiling is being widely applied in cancer research to identify biomarkers for clinical endpoint prediction. Since RNA-seq provides a powerful tool for transcriptome-based applications beyond the limitations of microarrays, we sought to systematically evaluate the performance of RNA-seq-based and microarray-based classifiers in this MAQC-III/SEQC study for clinical endpoint prediction using neuroblastoma as a model. We generate gene expression profiles from 498 primary neuroblastomas using both RNA-seq and 44 k microarrays. Characterization of the neuroblastoma transcriptome by RNA-seq reveals that more than 48,000 genes and 200,000 transcripts are being expressed in this malignancy. We also find that RNA-seq provides much more detailed information on specific transcript expression patterns in clinico-genetic neuroblastoma subgroups than microarrays. To systematically compare the power of RNA-seq and microarray-based models in predicting clinical endpoints, we divide the cohort randomly into training and validation sets and develop 360 predictive models on six clinical endpoints of varying predictability. Evaluation of factors potentially affecting model performances reveals that prediction accuracies are most strongly influenced by the nature of the clinical endpoint, whereas technological platforms (RNA-seq vs. microarrays), RNA-seq data analysis pipelines, and feature levels (gene vs. transcript vs. exon-junction level) do not significantly affect performances of the models. We demonstrate that RNA-seq outperforms microarrays in determining the transcriptomic characteristics of cancer, while RNA-seq and microarray-based models perform similarly in clinical endpoint prediction. Our findings may be valuable to guide future studies on the development of gene expression-based predictive models and their implementation in clinical practice.

  9. The efficacy of microarray screening for autosomal recessive retinitis pigmentosa in routine clinical practice

    PubMed Central

    van Huet, Ramon A. C.; Pierrache, Laurence H.M.; Meester-Smoor, Magda A.; Klaver, Caroline C.W.; van den Born, L. Ingeborgh; Hoyng, Carel B.; de Wijs, Ilse J.; Collin, Rob W. J.; Hoefsloot, Lies H.

    2015-01-01

    Purpose To determine the efficacy of multiple versions of a commercially available arrayed primer extension (APEX) microarray chip for autosomal recessive retinitis pigmentosa (arRP). Methods We included 250 probands suspected of arRP who were genetically analyzed with the APEX microarray between January 2008 and November 2013. The mode of inheritance had to be autosomal recessive according to the pedigree (including isolated cases). If the microarray identified a heterozygous mutation, we performed Sanger sequencing of exons and exon–intron boundaries of that specific gene. The efficacy of this microarray chip with the additional Sanger sequencing approach was determined by the percentage of patients that received a molecular diagnosis. We also collected data from genetic tests other than the APEX analysis for arRP to provide a detailed description of the molecular diagnoses in our study cohort. Results The APEX microarray chip for arRP identified the molecular diagnosis in 21 (8.5%) of the patients in our cohort. Additional Sanger sequencing yielded a second mutation in 17 patients (6.8%), thereby establishing the molecular diagnosis. In total, 38 patients (15.2%) received a molecular diagnosis after analysis using the microarray and additional Sanger sequencing approach. Further genetic analyses after a negative result of the arRP microarray (n = 107) resulted in a molecular diagnosis of arRP (n = 23), autosomal dominant RP (n = 5), X-linked RP (n = 2), and choroideremia (n = 1). Conclusions The efficacy of the commercially available APEX microarray chips for arRP appears to be low, most likely caused by the limitations of this technique and the genetic and allelic heterogeneity of RP. Diagnostic yields up to 40% have been reported for next-generation sequencing (NGS) techniques that, as expected, thereby outperform targeted APEX analysis. PMID:25999674

  10. "Set in Stone" or "Ray of Hope": Parents' Beliefs About Cause and Prognosis After Genomic Testing of Children Diagnosed with ASD.

    PubMed

    Reiff, Marian; Bugos, Eva; Giarelli, Ellen; Bernhardt, Barbara A; Spinner, Nancy B; Sankar, Pamela L; Mulchandani, Surabhi

    2017-05-01

    Despite increasing utilization of chromosomal microarray analysis (CMA) for autism spectrum disorders (ASD), limited information exists about how results influence parents' beliefs about etiology and prognosis. We conducted in-depth interviews and surveys with 57 parents of children with ASD who received CMA results categorized as pathogenic, negative or variant of uncertain significance. Parents tended to incorporate their child's CMA results within their existing beliefs about the etiology of ASD, regardless of CMA result. However, parents' expectations for the future tended to differ depending on results; those who received genetic confirmation for their children's ASD expressed a sense of concreteness, acceptance and permanence of the condition. Some parents expressed hope for future biomedical treatments as a result of genetic research.

  11. Best practices for hybridization design in two-colour microarray analysis.

    PubMed

    Knapen, Dries; Vergauwen, Lucia; Laukens, Kris; Blust, Ronny

    2009-07-01

    Two-colour microarrays are a popular platform of choice in gene expression studies. Because two different samples are hybridized on a single microarray, and several microarrays are usually needed in a given experiment, there are many possible ways to combine samples on different microarrays. The actual combination employed is commonly referred to as the 'hybridization design'. Different types of hybridization designs have been developed, all aimed at optimizing the experimental setup for the detection of differentially expressed genes while coping with technical noise. Here, we first provide an overview of the different classes of hybridization designs, discussing their advantages and limitations, and then we illustrate the current trends in the use of different hybridization design types in contemporary research.

  12. Accuracy and precision of polar lower stratospheric temperatures from reanalyses evaluated from A-Train CALIOP and MLS, COSMIC GPS RO, and the equilibrium thermodynamics of supercooled ternary solutions and ice clouds

    NASA Astrophysics Data System (ADS)

    Lambert, Alyn; Santee, Michelle L.

    2018-02-01

    We investigate the accuracy and precision of polar lower stratospheric temperatures (100-10 hPa during 2008-2013) reported in several contemporary reanalysis datasets comprising two versions of the Modern-Era Retrospective analysis for Research and Applications (MERRA and MERRA-2), the Japanese 55-year Reanalysis (JRA-55), the European Centre for Medium-Range Weather Forecasts (ECMWF) interim reanalysis (ERA-I), and the National Oceanic and Atmospheric Administration (NOAA) National Centers for Environmental Prediction (NCEP) Climate Forecast System Reanalysis (NCEP-CFSR). We also include the Goddard Earth Observing System model version 5.9.1 near-real-time analysis (GEOS-5.9.1). Comparisons of these datasets are made with respect to retrieved temperatures from the Aura Microwave Limb Sounder (MLS), Constellation Observing System for Meteorology, Ionosphere and Climate (COSMIC) Global Positioning System (GPS) radio occultation (RO) temperatures, and independent absolute temperature references defined by the equilibrium thermodynamics of supercooled ternary solutions (STSs) and ice clouds. Cloud-Aerosol Lidar with Orthogonal Polarization (CALIOP) observations of polar stratospheric clouds are used to determine the cloud particle types within the Aura MLS geometric field of view. The thermodynamic calculations for STS and the ice frost point use the colocated MLS gas-phase measurements of HNO3 and H2O. The estimated bias and precision for the STS temperature reference, over the 68 to 21 hPa pressure range, are 0.6-1.5 and 0.3-0.6 K, respectively; for the ice temperature reference, they are 0.4 and 0.3 K, respectively. These uncertainties are smaller than those estimated for the retrieved MLS temperatures and also comparable to GPS RO uncertainties (bias < 0.2 K, precision > 0.7 K) in the same pressure range. 
We examine a case study of the time-varying temperature structure associated with layered ice clouds formed by orographic gravity waves forced by flow over the Palmer Peninsula and compare how the wave amplitudes are reproduced by each reanalysis dataset. We find that the spatial and temporal distribution of temperatures below the ice frost point, and hence the potential to form ice polar stratospheric clouds (PSCs) in model studies driven by the reanalyses, varies significantly because of the underlying differences in the representation of mountain wave activity. High-accuracy COSMIC temperatures are used as a common reference to intercompare the reanalysis temperatures. Over the 68-21 hPa pressure range, the biases of the reanalyses with respect to COSMIC temperatures for both polar regions fall within the narrow range of -0.6 K to +0.5 K. GEOS-5.9.1, MERRA, MERRA-2, and JRA-55 have predominantly cold biases, whereas ERA-I has a predominantly warm bias. NCEP-CFSR has a warm bias in the Arctic but becomes substantially colder in the Antarctic. Reanalysis temperatures are also compared with the PSC reference temperatures. Over the 68-21 hPa pressure range, the reanalysis temperature biases are in the range -1.6 to -0.3 K with standard deviations ˜ 0.6 K for the CALIOP STS reference, and in the range -0.9 to +0.1 K with standard deviations ˜ 0.7 K for the CALIOP ice reference. Comparisons of MLS temperatures with the PSC reference temperatures reveal vertical oscillations in the MLS temperatures and a significant low bias in MLS temperatures of up to 3 K.

  13. Experimental Approaches to Microarray Analysis of Tumor Samples

    ERIC Educational Resources Information Center

    Furge, Laura Lowe; Winter, Michael B.; Meyers, Jacob I.; Furge, Kyle A.

    2008-01-01

    Comprehensive measurement of gene expression using high-density nucleic acid arrays (i.e. microarrays) has become an important tool for investigating the molecular differences in clinical and research samples. Consequently, inclusion of discussion in biochemistry, molecular biology, or other appropriate courses of microarray technologies has…

  14. Challenges of microarray applications for microbial detection and gene expression profiling in food

    USDA-ARS?s Scientific Manuscript database

    Microarray technology represents one of the latest advances in molecular biology. The diverse types of microarrays have been applied to clinical and environmental microbiology, microbial ecology, and in human, veterinary, and plant diagnostics. Since multiple genes can be analyzed simultaneously, ...

  15. Multiplex cDNA quantification method that facilitates the standardization of gene expression data

    PubMed Central

    Gotoh, Osamu; Murakami, Yasufumi; Suyama, Akira

    2011-01-01

    Microarray-based gene expression measurement is one of the major methods for transcriptome analysis. However, current microarray data are substantially affected by microarray platforms and RNA references because the microarray method provides only the relative amounts of gene expression levels. Therefore, valid comparisons of microarray data require standardized platforms, internal and/or external controls and complicated normalizations. These requirements impose limitations on the extensive comparison of gene expression data. Here, we report an effective approach to removing these limitations by measuring the absolute amounts of gene expression levels on common DNA microarrays. We have developed a multiplex cDNA quantification method called GEP-DEAN (Gene expression profiling by DCN-encoding-based analysis). The method was validated by using chemically synthesized DNA strands of known quantities and cDNA samples prepared from mouse liver, demonstrating that the absolute amounts of cDNA strands were successfully measured with a sensitivity of 18 zmol in a highly multiplexed manner in 7 h. PMID:21415008

  16. Spot detection and image segmentation in DNA microarray data.

    PubMed

    Qin, Li; Rueda, Luis; Ali, Adnan; Ngom, Alioune

    2005-01-01

    Following the invention of microarrays in 1994, the development and applications of this technology have grown exponentially. The numerous applications of microarray technology include clinical diagnosis and treatment, drug design and discovery, tumour detection, and environmental health research. One of the key issues in the experimental approaches utilising microarrays is to extract quantitative information from the spots, which represent genes in a given experiment. For this process, the initial stages are important and they influence future steps in the analysis. Identifying the spots and separating the background from the foreground is a fundamental problem in DNA microarray data analysis. In this review, we present an overview of state-of-the-art methods for microarray image segmentation. We discuss the foundations of the circle-shaped approach, adaptive shape segmentation, histogram-based methods and the recently introduced clustering-based techniques. We analytically show that clustering-based techniques are equivalent to the one-dimensional, standard k-means clustering algorithm that utilises the Euclidean distance.
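    The equivalence noted in the review's last sentence can be made concrete with a two-class, one-dimensional k-means on pixel intensities, separating spot foreground from background. A minimal sketch on synthetic pixel data; it is not the implementation of any of the reviewed segmentation methods.

```python
import numpy as np

def kmeans_1d(intensities, iters=50):
    """Two-class 1-D k-means on pixel intensities: cluster 0 = background,
    cluster 1 = foreground (spot). Euclidean distance in 1-D is |x - c|."""
    centers = np.array([float(intensities.min()), float(intensities.max())])
    for _ in range(iters):
        # Assign each pixel to its nearest center
        labels = np.abs(intensities[:, None] - centers).argmin(axis=1)
        # Recompute each center as the mean of its assigned pixels
        new = np.array([intensities[labels == k].mean() for k in (0, 1)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels, centers

# Toy spot: dim background pixels around 10, bright foreground pixels around 200
rng = np.random.default_rng(1)
pixels = np.concatenate([rng.normal(10, 2, 500), rng.normal(200, 10, 100)])
labels, centers = kmeans_1d(pixels)
```

    Initializing the two centers at the intensity extremes keeps both clusters non-empty; production implementations would also handle empty clusters and saturated pixels.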

  17. Caryoscope: An Open Source Java application for viewing microarray data in a genomic context

    PubMed Central

    Awad, Ihab AB; Rees, Christian A; Hernandez-Boussard, Tina; Ball, Catherine A; Sherlock, Gavin

    2004-01-01

    Background Microarray-based comparative genome hybridization experiments generate data that can be mapped onto the genome. These data are interpreted more easily when represented graphically in a genomic context. Results We have developed Caryoscope, which is an open source Java application for visualizing microarray data from array comparative genome hybridization experiments in a genomic context. Caryoscope can read General Feature Format files (GFF files), as well as comma- and tab-delimited files, that define the genomic positions of the microarray reporters for which data are obtained. The microarray data can be browsed using an interactive, zoomable interface, which helps users identify regions of chromosomal deletion or amplification. The graphical representation of the data can be exported in a number of graphic formats, including publication-quality formats such as PostScript. Conclusion Caryoscope is a useful tool that can aid in the visualization, exploration and interpretation of microarray data in a genomic context. PMID:15488149

  18. Evaluation of a Field-Portable DNA Microarray Platform and Nucleic Acid Amplification Strategies for the Detection of Arboviruses, Arthropods, and Bloodmeals

    PubMed Central

    Grubaugh, Nathan D.; Petz, Lawrence N.; Melanson, Vanessa R.; McMenamy, Scott S.; Turell, Michael J.; Long, Lewis S.; Pisarcik, Sarah E.; Kengluecha, Ampornpan; Jaichapor, Boonsong; O'Guinn, Monica L.; Lee, John S.

    2013-01-01

    Highly multiplexed assays, such as microarrays, can benefit arbovirus surveillance by allowing researchers to screen for hundreds of targets at once. We evaluated amplification strategies and the practicality of a portable DNA microarray platform to analyze virus-infected mosquitoes. The prototype microarray design used here targeted the non-structural protein 5, ribosomal RNA, and cytochrome b genes for the detection of flaviviruses, mosquitoes, and bloodmeals, respectively. We identified 13 of 14 flaviviruses from virus inoculated mosquitoes and cultured cells. Additionally, we differentiated between four mosquito genera and eight whole blood samples. The microarray platform was field evaluated in Thailand and successfully identified flaviviruses (Culex flavivirus, dengue-3, and Japanese encephalitis viruses), differentiated between mosquito genera (Aedes, Armigeres, Culex, and Mansonia), and detected mammalian bloodmeals (human and dog). We showed that the microarray platform and amplification strategies described here can be used to discern specific information on a wide variety of viruses and their vectors. PMID:23249687

  19. A molecular beacon microarray based on a quantum dot label for detecting single nucleotide polymorphisms.

    PubMed

    Guo, Qingsheng; Bai, Zhixiong; Liu, Yuqian; Sun, Qingjiang

    2016-03-15

    In this work, we report the application of streptavidin-coated quantum dot (strAV-QD) in molecular beacon (MB) microarray assays by using the strAV-QD to label the immobilized MB, avoiding target labeling and meanwhile obviating the use of amplification. The MBs are stem-loop structured oligodeoxynucleotides, modified with a thiol and a biotin at two terminals of the stem. With the strAV-QD labeling an "opened" MB rather than a "closed" MB via streptavidin-biotin reaction, a sensitive and specific detection of label-free target DNA sequence is demonstrated by the MB microarray, with a signal-to-background ratio of 8. The immobilized MBs can be perfectly regenerated, allowing the reuse of the microarray. The MB microarray also is able to detect single nucleotide polymorphisms, exhibiting genotype-dependent fluorescence signals. It is demonstrated that the MB microarray can perform as a 4-to-2 encoder, compressing the genotype information into two outputs.
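    The 4-to-2 encoder behavior mentioned at the end of the abstract can be sketched in code: exactly one of four input lines is active, and its index is compressed into two output bits. This is only a logical illustration of the encoder truth table; the paper's actual mapping of genotype fluorescence signals to input lines is not reproduced here.

```python
def encode_4_to_2(lines):
    """Classic 4-to-2 binary encoder: exactly one of four input lines is
    active; its index is emitted as two output bits (msb, lsb)."""
    assert sum(lines) == 1, "encoder expects a one-hot input"
    idx = lines.index(1)
    return (idx >> 1) & 1, idx & 1

# Each of the four one-hot input patterns compresses to a 2-bit code
codes = {i: encode_4_to_2([int(j == i) for j in range(4)]) for i in range(4)}
# codes == {0: (0, 0), 1: (0, 1), 2: (1, 0), 3: (1, 1)}
```

    Four distinguishable genotype signals thus need only two readout channels, which is the compression the assay demonstrates.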

  20. MIGS-GPU: Microarray Image Gridding and Segmentation on the GPU.

    PubMed

    Katsigiannis, Stamos; Zacharia, Eleni; Maroulis, Dimitris

    2017-05-01

    Complementary DNA (cDNA) microarray is a powerful tool for simultaneously studying the expression level of thousands of genes. Nevertheless, the analysis of microarray images remains an arduous and challenging task due to the poor quality of the images that often suffer from noise, artifacts, and uneven background. In this study, the MIGS-GPU [Microarray Image Gridding and Segmentation on Graphics Processing Unit (GPU)] software for gridding and segmenting microarray images is presented. MIGS-GPU's computations are performed on the GPU by means of the compute unified device architecture (CUDA) in order to achieve fast performance and increase the utilization of available system resources. Evaluation on both real and synthetic cDNA microarray images showed that MIGS-GPU provides better performance than state-of-the-art alternatives, while the proposed GPU implementation achieves significantly lower computational times compared to the respective CPU approaches. Consequently, MIGS-GPU can be an advantageous and useful tool for biomedical laboratories, offering a user-friendly interface that requires minimum input in order to run.
