Sample records for large-scale expression analysis

  1. A Review of Feature Extraction Software for Microarray Gene Expression Data

    PubMed Central

    Tan, Ching Siang; Ting, Wai Soon; Mohamad, Mohd Saberi; Chan, Weng Howe; Deris, Safaai; Ali Shah, Zuraini

    2014-01-01

    When gene expression data are too large to be processed, they are transformed into a reduced representation set of genes. Transforming large-scale gene expression data into a set of genes is called feature extraction. If the genes extracted are carefully chosen, this gene set can extract the relevant information from the large-scale gene expression data, allowing further analysis by using this reduced representation instead of the full size data. In this paper, we review numerous software applications that can be used for feature extraction. The software reviewed is mainly for Principal Component Analysis (PCA), Independent Component Analysis (ICA), Partial Least Squares (PLS), and Local Linear Embedding (LLE). A summary and sources of the software are provided in the last section for each feature extraction method. PMID:25250315
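
    As an illustration of the first method named above, the following is a minimal sketch of PCA-based feature extraction using NumPy's singular value decomposition. It is not taken from any of the reviewed software packages; the function name `pca_reduce`, the matrix shape, and the random data are assumptions made for the example.

```python
import numpy as np

def pca_reduce(expr, n_components=2):
    """Project samples (rows) onto the top principal components."""
    # Center each gene (column) so components describe variance, not the mean.
    centered = expr - expr.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing singular value.
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    # Reduced representation: one row per sample, one column per component.
    scores = centered @ vt[:n_components].T
    # Fraction of total variance carried by each component.
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, explained[:n_components]

# Hypothetical input: 20 samples x 500 genes of random values.
rng = np.random.default_rng(0)
expr = rng.normal(size=(20, 500))
scores, explained = pca_reduce(expr, n_components=3)
```

    Here the 500-gene profile of each sample is replaced by three component scores, which is the kind of reduced representation the abstract describes.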

  2. Transcriptional analysis of product-concentration driven changes in cellular programs of recombinant Clostridium acetobutylicum strains.

    PubMed

    Tummala, Seshu B; Junne, Stefan G; Paredes, Carlos J; Papoutsakis, Eleftherios T

    2003-12-30

    Antisense RNA (asRNA) downregulation alters protein expression without changing the regulation of gene expression. Downregulation of primary metabolic enzymes possibly combined with overexpression of other metabolic enzymes may result in profound changes in product formation, and this may alter the large-scale transcriptional program of the cells. DNA-array based large-scale transcriptional analysis has the potential to elucidate factors that control cellular fluxes even in the absence of proteome data. These themes are explored in the study of large-scale transcriptional analysis programs and the in vivo primary-metabolism fluxes of several related recombinant C. acetobutylicum strains: C. acetobutylicum ATCC 824(pSOS95del) (plasmid control; produces high levels of butanol and acetone), 824(pCTFB1AS) (expresses antisense RNA against CoA transferase (ctfb1-asRNA); produces very low levels of butanol and acetone), and 824(pAADB1) (expresses ctfb1-asRNA and the alcohol-aldehyde dehydrogenase gene (aad); produces high alcohol and low acetone levels). DNA-array based transcriptional analysis revealed that the large changes in product concentrations (and notably butanol concentration) due to ctfb1-asRNA expression alone and in combination with aad overexpression resulted in dramatic changes of the cellular transcriptome. Cluster analysis and gene expression patterns of established and putative operons involved in stress response, motility, sporulation, and fatty-acid biosynthesis indicate that these simple genetic changes dramatically alter the cellular programs of C. acetobutylicum. Comparison of gene expression and flux analysis data may point to possible flux-controlling steps and suggest unknown regulatory mechanisms. Copyright 2003 Wiley Periodicals, Inc.

  3. bigSCale: an analytical framework for big-scale single-cell data.

    PubMed

    Iacono, Giovanni; Mereu, Elisabetta; Guillaumet-Adkins, Amy; Corominas, Roser; Cuscó, Ivon; Rodríguez-Esteban, Gustavo; Gut, Marta; Pérez-Jurado, Luis Alberto; Gut, Ivo; Heyn, Holger

    2018-06-01

    Single-cell RNA sequencing (scRNA-seq) has significantly deepened our insights into complex tissues, with the latest techniques capable of processing tens of thousands of cells simultaneously. Analyzing increasing numbers of cells, however, generates extremely large data sets, extending processing time and challenging computing resources. Current scRNA-seq analysis tools are not designed to interrogate large data sets and often lack sensitivity to identify marker genes. With bigSCale, we provide a scalable analytical framework to analyze millions of cells, which addresses the challenges associated with large data sets. To handle the noise and sparsity of scRNA-seq data, bigSCale uses large sample sizes to estimate an accurate numerical model of noise. The framework further includes modules for differential expression analysis, cell clustering, and marker identification. A directed convolution strategy allows processing of extremely large data sets, while preserving transcript information from individual cells. We evaluated the performance of bigSCale using both a biological model of aberrant gene expression in patient-derived neuronal progenitor cells and simulated data sets, which underlines the speed and accuracy in differential expression analysis. To test its applicability for large data sets, we applied bigSCale to assess 1.3 million cells from the mouse developing forebrain. Its directed down-sampling strategy accumulates information from single cells into index cell transcriptomes, thereby defining cellular clusters with improved resolution. Accordingly, index cell clusters identified rare populations, such as reelin (Reln)-positive Cajal-Retzius neurons, for which we report previously unrecognized heterogeneity associated with distinct differentiation stages, spatial organization, and cellular function. Together, bigSCale presents a solution to address future challenges of large single-cell data sets.
© 2018 Iacono et al.; Published by Cold Spring Harbor Laboratory Press.

  4. Large-scale transcriptome analysis reveals Arabidopsis metabolic pathways are frequently influenced by different pathogens.

    PubMed

    Jiang, Zhenhong; He, Fei; Zhang, Ziding

    2017-07-01

    Through large-scale transcriptional data analyses, we highlighted the importance of plant metabolism in plant immunity and identified 26 metabolic pathways that were frequently influenced by the infection of 14 different pathogens. Reprogramming of plant metabolism is a common phenomenon in plant defense responses. Currently, a large number of transcriptional profiles of infected tissues in Arabidopsis (Arabidopsis thaliana) have been deposited in public databases, which provides a great opportunity to understand the expression patterns of metabolic pathways during plant defense responses at the systems level. Here, we performed a large-scale transcriptome analysis based on 135 previously published expression samples, including 14 different pathogens, to explore the expression pattern of Arabidopsis metabolic pathways. Overall, metabolic genes are significantly changed in expression during plant defense responses. Upregulated metabolic genes are enriched on defense responses, and downregulated genes are enriched on photosynthesis, fatty acid and lipid metabolic processes. Gene set enrichment analysis (GSEA) identifies 26 frequently differentially expressed metabolic pathways (FreDE_Paths) that are differentially expressed in more than 60% of infected samples. These pathways are involved in the generation of energy, fatty acid and lipid metabolism as well as secondary metabolite biosynthesis. Clustering analysis based on the expression levels of these 26 metabolic pathways clearly distinguishes infected and control samples, further suggesting the importance of these metabolic pathways in plant defense responses. By comparing with FreDE_Paths from abiotic stresses, we find that the expression patterns of 26 FreDE_Paths from biotic stresses are more consistent across different infected samples. By investigating the expression correlation between transcriptional factors (TFs) and FreDE_Paths, we identify several notable relationships. 
Collectively, the current study will deepen our understanding of plant metabolism in plant immunity and provide new insights into disease-resistant crop improvement.

  5. Gene Expression Browser: Large-Scale and Cross-Experiment Microarray Data Management, Search & Visualization

    USDA-ARS's Scientific Manuscript database

    The amount of microarray gene expression data in public repositories has been increasing exponentially for the last couple of decades. High-throughput microarray data integration and analysis has become a critical step in exploring the large amount of expression data for biological discovery. Howeve...

  6. paraGSEA: a scalable approach for large-scale gene expression profiling

    PubMed Central

    Peng, Shaoliang; Yang, Shunyun

    2017-01-01

    More studies have been conducted using gene expression similarity to identify functional connections among genes, diseases and drugs. Gene Set Enrichment Analysis (GSEA) is a powerful analytical method for interpreting gene expression data. However, due to its enormous computational overhead in the significance-level estimation step and the multiple hypothesis testing step, its computational scalability and efficiency are poor on large-scale datasets. We proposed paraGSEA for efficient large-scale transcriptome data analysis. By optimization, the overall time complexity of paraGSEA is reduced from O(mn) to O(m+n), where m is the length of the gene sets and n is the length of the gene expression profiles, which contributes a more than 100-fold increase in performance compared with other popular GSEA implementations such as GSEA-P, SAM-GS and GSEA2. By further parallelization, a near-linear speed-up is gained on both workstations and clusters in an efficient manner with high scalability and performance on large-scale datasets. The analysis time of the whole LINCS phase I dataset (GSE92742) was reduced to nearly half an hour on a 1000-node cluster on Tianhe-2, or within 120 hours on a 96-core workstation. The source code of paraGSEA is licensed under the GPLv3 and available at http://github.com/ysycloud/paraGSEA. PMID:28973463
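
    To make the O(m+n) figure concrete, here is a minimal sketch of the standard unweighted GSEA running-sum enrichment score computed with a hash set and a single pass over the ranked profile. It is not paraGSEA's code, and whether paraGSEA reaches its bound in this way is an assumption; the function name and the toy gene labels are hypothetical.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score (Kolmogorov-Smirnov style)."""
    members = set(gene_set)              # O(m) to build, O(1) membership tests
    n = len(ranked_genes)
    hits = sum(1 for g in ranked_genes if g in members)
    if hits == 0 or hits == n:
        return 0.0                       # score is undefined; report no enrichment
    up = 1.0 / hits                      # step taken when a gene is in the set
    down = 1.0 / (n - hits)              # step taken when it is not
    running = 0.0
    best = 0.0
    for g in ranked_genes:               # single O(n) pass over the profile
        running += up if g in members else -down
        if abs(running) > abs(best):     # keep the largest deviation from zero
            best = running
    return best

# Hypothetical ranked profile, most up-regulated gene first.
ranked = ["g1", "g2", "g3", "g4", "g5", "g6"]
es_top = enrichment_score(ranked, {"g1", "g2"})
es_bottom = enrichment_score(ranked, {"g5", "g6"})
```

    A set concentrated at the top of the ranking yields a positive score and one concentrated at the bottom a negative score. Testing membership by scanning the gene set for every ranked gene would instead cost O(mn).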

  7. Large-Scale Analysis of Network Bistability for Human Cancers

    PubMed Central

    Shiraishi, Tetsuya; Matsuyama, Shinako; Kitano, Hiroaki

    2010-01-01

    Protein–protein interaction and gene regulatory networks are likely to be locked in a state corresponding to a disease by the behavior of one or more bistable circuits exhibiting switch-like behavior. Sets of genes could be over-expressed or repressed when anomalies due to disease appear, and the circuits responsible for this over- or under-expression might persist for as long as the disease state continues. This paper shows how a large-scale analysis of network bistability for various human cancers can identify genes that can potentially serve as drug targets or diagnosis biomarkers. PMID:20628618

  8. Development of multitissue microfluidic dynamic array for assessing changes in gene expression associated with channel catfish appetite, growth, metabolism, and intestinal health

    USDA-ARS's Scientific Manuscript database

    Large-scale, gene expression methods allow for high throughput analysis of physiological pathways at a fraction of the cost of individual gene expression analysis. Systems, such as the Fluidigm quantitative PCR array described here, can provide powerful assessments of the effects of diet, environme...

  9. Stormbow: A Cloud-Based Tool for Reads Mapping and Expression Quantification in Large-Scale RNA-Seq Studies

    PubMed Central

    Zhao, Shanrong; Prenger, Kurt; Smith, Lance

    2013-01-01

    RNA-Seq is becoming a promising replacement to microarrays in transcriptome profiling and differential gene expression study. Technical improvements have decreased sequencing costs and, as a result, the size and number of RNA-Seq datasets have increased rapidly. However, the increasing volume of data from large-scale RNA-Seq studies poses a practical challenge for data analysis in a local environment. To meet this challenge, we developed Stormbow, a cloud-based software package, to process large volumes of RNA-Seq data in parallel. The performance of Stormbow has been tested by practically applying it to analyse 178 RNA-Seq samples in the cloud. In our test, it took 6 to 8 hours to process an RNA-Seq sample with 100 million reads, and the average cost was $3.50 per sample. Utilizing Amazon Web Services as the infrastructure for Stormbow allows us to easily scale up to handle large datasets with on-demand computational resources. Stormbow is a scalable, cost effective, and open-source based tool for large-scale RNA-Seq data analysis. Stormbow can be freely downloaded and can be used out of box to process Illumina RNA-Seq datasets. PMID:25937948

  10. Stormbow: A Cloud-Based Tool for Reads Mapping and Expression Quantification in Large-Scale RNA-Seq Studies.

    PubMed

    Zhao, Shanrong; Prenger, Kurt; Smith, Lance

    2013-01-01

    RNA-Seq is becoming a promising replacement to microarrays in transcriptome profiling and differential gene expression study. Technical improvements have decreased sequencing costs and, as a result, the size and number of RNA-Seq datasets have increased rapidly. However, the increasing volume of data from large-scale RNA-Seq studies poses a practical challenge for data analysis in a local environment. To meet this challenge, we developed Stormbow, a cloud-based software package, to process large volumes of RNA-Seq data in parallel. The performance of Stormbow has been tested by practically applying it to analyse 178 RNA-Seq samples in the cloud. In our test, it took 6 to 8 hours to process an RNA-Seq sample with 100 million reads, and the average cost was $3.50 per sample. Utilizing Amazon Web Services as the infrastructure for Stormbow allows us to easily scale up to handle large datasets with on-demand computational resources. Stormbow is a scalable, cost effective, and open-source based tool for large-scale RNA-Seq data analysis. Stormbow can be freely downloaded and can be used out of box to process Illumina RNA-Seq datasets.

  11. In silico identification and comparative analysis of differentially expressed genes in human and mouse tissues

    PubMed Central

    Pao, Sheng-Ying; Lin, Win-Li; Hwang, Ming-Jing

    2006-01-01

    Background Screening for differentially expressed genes on the genomic scale and comparative analysis of the expression profiles of orthologous genes between species to study gene function and regulation are becoming increasingly feasible. Expressed sequence tags (ESTs) are an excellent source of data for such studies using bioinformatic approaches because of the rich libraries and tremendous amount of data now available in the public domain. However, any large-scale EST-based bioinformatics analysis must deal with the heterogeneous, and often ambiguous, tissue and organ terms used to describe EST libraries. Results To deal with the issue of tissue source, in this work, we carefully screened and organized more than 8 million human and mouse ESTs into 157 human and 108 mouse tissue/organ categories, to which we applied an established statistical test using different thresholds of the p value to identify genes differentially expressed in different tissues. Further analysis of the tissue distribution and level of expression of human and mouse orthologous genes showed that tissue-specific orthologs tended to have more similar expression patterns than those lacking significant tissue specificity. On the other hand, a number of orthologs were found to have significant disparity in their expression profiles, hinting at novel functions, divergent regulation, or new ortholog relationships. Conclusion Comprehensive statistics on the tissue-specific expression of human and mouse genes were obtained in this very large-scale, EST-based analysis. These statistical results have been organized into a database, freely accessible at our website, for easy searching of human and mouse tissue-specific genes and for investigating gene expression profiles in the context of comparative genomics.
Comparative analysis showed that, although highly tissue-specific genes tend to exhibit similar expression profiles in human and mouse, there are significant exceptions, indicating that orthologous genes, while sharing basic genomic properties, could result in distinct phenotypes. PMID:16626500

  12. Large-Scale Cognitive GWAS Meta-Analysis Reveals Tissue-Specific Neural Expression and Potential Nootropic Drug Targets.

    PubMed

    Lam, Max; Trampush, Joey W; Yu, Jin; Knowles, Emma; Davies, Gail; Liewald, David C; Starr, John M; Djurovic, Srdjan; Melle, Ingrid; Sundet, Kjetil; Christoforou, Andrea; Reinvang, Ivar; DeRosse, Pamela; Lundervold, Astri J; Steen, Vidar M; Espeseth, Thomas; Räikkönen, Katri; Widen, Elisabeth; Palotie, Aarno; Eriksson, Johan G; Giegling, Ina; Konte, Bettina; Roussos, Panos; Giakoumaki, Stella; Burdick, Katherine E; Payton, Antony; Ollier, William; Chiba-Falek, Ornit; Attix, Deborah K; Need, Anna C; Cirulli, Elizabeth T; Voineskos, Aristotle N; Stefanis, Nikos C; Avramopoulos, Dimitrios; Hatzimanolis, Alex; Arking, Dan E; Smyrnis, Nikolaos; Bilder, Robert M; Freimer, Nelson A; Cannon, Tyrone D; London, Edythe; Poldrack, Russell A; Sabb, Fred W; Congdon, Eliza; Conley, Emily Drabant; Scult, Matthew A; Dickinson, Dwight; Straub, Richard E; Donohoe, Gary; Morris, Derek; Corvin, Aiden; Gill, Michael; Hariri, Ahmad R; Weinberger, Daniel R; Pendleton, Neil; Bitsios, Panos; Rujescu, Dan; Lahti, Jari; Le Hellard, Stephanie; Keller, Matthew C; Andreassen, Ole A; Deary, Ian J; Glahn, David C; Malhotra, Anil K; Lencz, Todd

    2017-11-28

    Here, we present a large (n = 107,207) genome-wide association study (GWAS) of general cognitive ability ("g"), further enhanced by combining results with a large-scale GWAS of educational attainment. We identified 70 independent genomic loci associated with general cognitive ability. Results showed significant enrichment for genes causing Mendelian disorders with an intellectual disability phenotype. Competitive pathway analysis implicated the biological processes of neurogenesis and synaptic regulation, as well as the gene targets of two pharmacologic agents: cinnarizine, a T-type calcium channel blocker, and LY97241, a potassium channel inhibitor. Transcriptome-wide and epigenome-wide analysis revealed that the implicated loci were enriched for genes expressed across all brain regions (most strongly in the cerebellum). Enrichment was exclusive to genes expressed in neurons but not oligodendrocytes or astrocytes. Finally, we report genetic correlations between cognitive ability and disparate phenotypes including psychiatric disorders, several autoimmune disorders, longevity, and maternal age at first birth. Copyright © 2017 The Author(s). Published by Elsevier Inc. All rights reserved.

  13. Multiscale Embedded Gene Co-expression Network Analysis

    PubMed Central

    Song, Won-Min; Zhang, Bin

    2015-01-01

    Gene co-expression network analysis has been shown effective in identifying functional co-expressed gene modules associated with complex human diseases. However, existing techniques to construct co-expression networks require some critical prior information such as predefined number of clusters, numerical thresholds for defining co-expression/interaction, or do not naturally reproduce the hallmarks of complex systems such as the scale-free degree distribution of small-worldness. Previously, a graph filtering technique called Planar Maximally Filtered Graph (PMFG) has been applied to many real-world data sets such as financial stock prices and gene expression to extract meaningful and relevant interactions. However, PMFG is not suitable for large-scale genomic data due to several drawbacks, such as the high computation complexity O(|V|^3), the presence of false-positives due to the maximal planarity constraint, and the inadequacy of the clustering framework. Here, we developed a new co-expression network analysis framework called Multiscale Embedded Gene Co-expression Network Analysis (MEGENA) by: i) introducing quality control of co-expression similarities, ii) parallelizing embedded network construction, and iii) developing a novel clustering technique to identify multi-scale clustering structures in Planar Filtered Networks (PFNs). We applied MEGENA to a series of simulated data and the gene expression data in breast carcinoma and lung adenocarcinoma from The Cancer Genome Atlas (TCGA). MEGENA showed improved performance over well-established clustering methods and co-expression network construction approaches. MEGENA revealed not only meaningful multi-scale organizations of co-expressed gene clusters but also novel targets in breast carcinoma and lung adenocarcinoma. PMID:26618778

  14. Multiscale Embedded Gene Co-expression Network Analysis.

    PubMed

    Song, Won-Min; Zhang, Bin

    2015-11-01

    Gene co-expression network analysis has been shown effective in identifying functional co-expressed gene modules associated with complex human diseases. However, existing techniques to construct co-expression networks require some critical prior information such as predefined number of clusters, numerical thresholds for defining co-expression/interaction, or do not naturally reproduce the hallmarks of complex systems such as the scale-free degree distribution of small-worldness. Previously, a graph filtering technique called Planar Maximally Filtered Graph (PMFG) has been applied to many real-world data sets such as financial stock prices and gene expression to extract meaningful and relevant interactions. However, PMFG is not suitable for large-scale genomic data due to several drawbacks, such as the high computation complexity O(|V|^3), the presence of false-positives due to the maximal planarity constraint, and the inadequacy of the clustering framework. Here, we developed a new co-expression network analysis framework called Multiscale Embedded Gene Co-expression Network Analysis (MEGENA) by: i) introducing quality control of co-expression similarities, ii) parallelizing embedded network construction, and iii) developing a novel clustering technique to identify multi-scale clustering structures in Planar Filtered Networks (PFNs). We applied MEGENA to a series of simulated data and the gene expression data in breast carcinoma and lung adenocarcinoma from The Cancer Genome Atlas (TCGA). MEGENA showed improved performance over well-established clustering methods and co-expression network construction approaches. MEGENA revealed not only meaningful multi-scale organizations of co-expressed gene clusters but also novel targets in breast carcinoma and lung adenocarcinoma.

  15. Gene Expression Analysis: Teaching Students to Do 30,000 Experiments at Once with Microarray

    ERIC Educational Resources Information Center

    Carvalho, Felicia I.; Johns, Christopher; Gillespie, Marc E.

    2012-01-01

    Genome scale experiments routinely produce large data sets that require computational analysis, yet there are few student-based labs that illustrate the design and execution of these experiments. In order for students to understand and participate in the genomic world, teaching labs must be available where students generate and analyze large data…

  16. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Seljak, Uroš, E-mail: useljak@berkeley.edu

    On large scales a nonlinear transformation of matter density field can be viewed as a biased tracer of the density field itself. A nonlinear transformation also modifies the redshift space distortions in the same limit, giving rise to a velocity bias. In models with primordial nongaussianity a nonlinear transformation generates a scale dependent bias on large scales. We derive analytic expressions for the large scale bias, the velocity bias and the redshift space distortion (RSD) parameter β, as well as the scale dependent bias from primordial nongaussianity for a general nonlinear transformation. These biases can be expressed entirely in terms of the one point distribution function (PDF) of the final field and the parameters of the transformation. The analysis shows that one can view the large scale bias different from unity and primordial nongaussianity bias as a consequence of converting higher order correlations in density into 2-point correlations of its nonlinear transform. Our analysis allows one to devise nonlinear transformations with nearly arbitrary bias properties, which can be used to increase the signal in the large scale clustering limit. We apply the results to the ionizing equilibrium model of Lyman-α forest, in which Lyman-α flux F is related to the density perturbation δ via a nonlinear transformation. Velocity bias can be expressed as an average over the Lyman-α flux PDF. At z = 2.4 we predict the velocity bias of −0.1, compared to the observed value of −0.13±0.03. Bias and primordial nongaussianity bias depend on the parameters of the transformation. Measurements of bias can thus be used to constrain these parameters, and for reasonable values of the ionizing background intensity we can match the predictions to observations. 
Matching to the observed values we predict the ratio of primordial nongaussianity bias to bias to have the opposite sign and lower magnitude than the corresponding values for the highly biased galaxies, but this depends on the model parameters and can also vanish or change the sign.

  17. A large-scale analysis of sex differences in facial expressions

    PubMed Central

    Kodra, Evan; el Kaliouby, Rana; LaFrance, Marianne

    2017-01-01

    There exists a stereotype that women are more expressive than men; however, research has almost exclusively focused on a single facial behavior, smiling. A large-scale study examines whether women are consistently more expressive than men or whether the effects are dependent on the emotion expressed. Studies of gender differences in expressivity have been somewhat restricted to data collected in lab settings or which required labor-intensive manual coding. In the present study, we analyze gender differences in facial behaviors as over 2,000 viewers watch a set of video advertisements in their home environments. The facial responses were recorded using participants’ own webcams. Using a new automated facial coding technology we coded facial activity. We find that women are not universally more expressive across all facial actions. Nor are they more expressive in all positive valence actions and less expressive in all negative valence actions. It appears that generally women express actions more frequently than men, and in particular express more positive valence actions. However, expressiveness is not greater in women for all negative valence actions and is dependent on the discrete emotional state. PMID:28422963

  18. Pair-barcode high-throughput sequencing for large-scale multiplexed sample analysis

    PubMed Central

    2012-01-01

    Background Multiplexing becomes the major limitation of next-generation sequencing (NGS) in application to low complexity samples. Physical space segregation allows limited multiplexing, while the existing barcode approach only permits simultaneous analysis of up to several dozen samples. Results Here we introduce pair-barcode sequencing (PBS), an economic and flexible barcoding technique that permits parallel analysis of large-scale multiplexed samples. In two pilot runs using a SOLiD sequencer (Applied Biosystems Inc.), 32 independent pair-barcoded miRNA libraries were simultaneously discovered by the combination of 4 unique forward barcodes and 8 unique reverse barcodes. Over 174,000,000 reads were generated and about 64% of them were assigned to both of the barcodes. After mapping all reads to pre-miRNAs in miRBase, different miRNA expression patterns were captured from the two clinical groups. The strong correlation using different barcode pairs and the high consistency of miRNA expression in two independent runs demonstrate that the PBS approach is valid. Conclusions By employing the PBS approach in NGS, large-scale multiplexed pooled samples could be practically analyzed in parallel so that high-throughput sequencing economically meets the requirements of samples with low sequencing-throughput demand. PMID:22276739
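
    The combinatorial idea in this abstract (4 forward barcodes × 8 reverse barcodes = 32 libraries) can be sketched as a lookup keyed on the barcode pair. The barcode sequences, library numbering, and function name below are hypothetical and are not taken from the paper.

```python
from itertools import product

# Hypothetical barcode sequences; the real PBS barcodes are not given here.
forward = ["ACGT", "CATG", "GTCA", "TGAC"]                  # 4 forward barcodes
reverse = ["AACC", "AAGG", "CCAA", "CCTT",
           "GGAA", "GGTT", "TTCC", "TTGG"]                  # 8 reverse barcodes

# Every (forward, reverse) combination identifies one library: 4 * 8 = 32.
pair_to_library = {
    pair: number
    for number, pair in enumerate(product(forward, reverse), start=1)
}

def assign_library(fwd_tag, rev_tag):
    """Return the library number for a read, or None if a barcode is unknown."""
    return pair_to_library.get((fwd_tag, rev_tag))

n_libraries = len(pair_to_library)
```

    A read is assigned only when both of its barcodes are recognized, which matches the abstract's description of reads being assigned to both barcodes.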

  19. Pair-barcode high-throughput sequencing for large-scale multiplexed sample analysis.

    PubMed

    Tu, Jing; Ge, Qinyu; Wang, Shengqin; Wang, Lei; Sun, Beili; Yang, Qi; Bai, Yunfei; Lu, Zuhong

    2012-01-25

    Multiplexing becomes the major limitation of next-generation sequencing (NGS) in application to low complexity samples. Physical space segregation allows limited multiplexing, while the existing barcode approach only permits simultaneous analysis of up to several dozen samples. Here we introduce pair-barcode sequencing (PBS), an economic and flexible barcoding technique that permits parallel analysis of large-scale multiplexed samples. In two pilot runs using a SOLiD sequencer (Applied Biosystems Inc.), 32 independent pair-barcoded miRNA libraries were simultaneously discovered by the combination of 4 unique forward barcodes and 8 unique reverse barcodes. Over 174,000,000 reads were generated and about 64% of them were assigned to both of the barcodes. After mapping all reads to pre-miRNAs in miRBase, different miRNA expression patterns were captured from the two clinical groups. The strong correlation using different barcode pairs and the high consistency of miRNA expression in two independent runs demonstrate that the PBS approach is valid. By employing the PBS approach in NGS, large-scale multiplexed pooled samples could be practically analyzed in parallel so that high-throughput sequencing economically meets the requirements of samples with low sequencing-throughput demand.

  20. Analysis of blood-based gene expression in idiopathic Parkinson disease.

    PubMed

    Shamir, Ron; Klein, Christine; Amar, David; Vollstedt, Eva-Juliane; Bonin, Michael; Usenovic, Marija; Wong, Yvette C; Maver, Ales; Poths, Sven; Safer, Hershel; Corvol, Jean-Christophe; Lesage, Suzanne; Lavi, Ofer; Deuschl, Günther; Kuhlenbaeumer, Gregor; Pawlack, Heike; Ulitsky, Igor; Kasten, Meike; Riess, Olaf; Brice, Alexis; Peterlin, Borut; Krainc, Dimitri

    2017-10-17

    To examine whether gene expression analysis of a large-scale Parkinson disease (PD) patient cohort produces a robust blood-based PD gene signature compared to previous studies that have used relatively small cohorts (≤220 samples). Whole-blood gene expression profiles were collected from a total of 523 individuals. After preprocessing, the data contained 486 gene profiles (n = 205 PD, n = 233 controls, n = 48 other neurodegenerative diseases) that were partitioned into training, validation, and independent test cohorts to identify and validate a gene signature. Batch-effect reduction and cross-validation were performed to ensure signature reliability. Finally, functional and pathway enrichment analyses were applied to the signature to identify PD-associated gene networks. A gene signature of 100 probes that mapped to 87 genes, corresponding to 64 upregulated and 23 downregulated genes differentiating between patients with idiopathic PD and controls, was identified with the training cohort and successfully replicated in both an independent validation cohort (area under the curve [AUC] = 0.79, p = 7.13E-6) and a subsequent independent test cohort (AUC = 0.74, p = 4.2E-4). Network analysis of the signature revealed gene enrichment in pathways, including metabolism, oxidation, and ubiquitination/proteasomal activity, and misregulation of mitochondria-localized genes, including downregulation of COX4I1, ATP5A1, and VDAC3. We present a large-scale study of PD gene expression profiling. This work identifies a reliable blood-based PD signature and highlights the importance of large-scale patient cohorts in developing potential PD biomarkers. © 2017 American Academy of Neurology.
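
    For readers unfamiliar with the AUC values quoted above, the following is a minimal sketch of how an area under the ROC curve can be computed from classifier scores as the fraction of patient/control pairs ranked correctly, with ties counted as half. The scores are invented for illustration, and this is not the study's pipeline.

```python
def auc(case_scores, control_scores):
    """Probability that a random case scores higher than a random control."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0          # correctly ordered pair
            elif case == control:
                wins += 0.5          # tie counts as half
    return wins / (len(case_scores) * len(control_scores))

# Hypothetical signature scores for three patients and three controls.
pd_scores = [0.9, 0.8, 0.4]
ctrl_scores = [0.7, 0.3, 0.2]
value = auc(pd_scores, ctrl_scores)
```

    An AUC of 0.5 corresponds to chance-level separation and 1.0 to perfect separation, so the reported 0.79 and 0.74 indicate moderate discrimination between patients and controls.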

  21. Identification of tissue-specific, abiotic stress-responsive gene expression patterns in wine grape (Vitis vinifera L.) based on curation and mining of large-scale EST data sets

    PubMed Central

    2011-01-01

    Background: Abiotic stresses, such as water deficit and soil salinity, result in changes in physiology, nutrient use, and vegetative growth in vines, and ultimately, yield and flavor in berries of wine grape, Vitis vinifera L. Large-scale expressed sequence tags (ESTs) were generated, curated, and analyzed to identify major genetic determinants responsible for stress-adaptive responses. Although roots serve as the first site of perception and/or injury for many types of abiotic stress, EST sequencing in root tissues of wine grape exposed to abiotic stresses has been extremely limited to date. To overcome this limitation, large-scale EST sequencing was conducted from root tissues exposed to multiple abiotic stresses. Results: A total of 62,236 expressed sequence tags (ESTs) were generated from leaf, berry, and root tissues from vines subjected to abiotic stresses and compared with 32,286 ESTs sequenced from 20 public cDNA libraries. Curation to correct annotation errors, clustering and assembly of the berry and leaf ESTs with currently available V. vinifera full-length transcripts and ESTs yielded a total of 13,278 unique sequences, with 2302 singletons and 10,976 mapped to V. vinifera gene models. Of these, 739 transcripts were found to have significant differential expression in stressed leaves and berries including 250 genes not described previously as being abiotic stress responsive. In a second analysis of 16,452 ESTs from a normalized root cDNA library derived from roots exposed to multiple, short-term, abiotic stresses, 135 genes with root-enriched expression patterns were identified on the basis of their EST abundance in roots relative to other tissues. Conclusions: The large-scale analysis of relative EST frequency counts among a diverse collection of 23 different cDNA libraries from leaf, berry, and root tissues of wine grape exposed to a variety of abiotic stress conditions revealed distinct, tissue-specific expression patterns, previously unrecognized stress-induced genes, and many novel genes with root-enriched mRNA expression for improving our understanding of root biology and manipulation of rootstock traits in wine grape. mRNA abundance estimates based on EST library-enriched expression patterns showed only modest correlations between microarray and quantitative, real-time reverse transcription-polymerase chain reaction (qRT-PCR) methods highlighting the need for deep-sequencing expression profiling methods. PMID:21592389

  2. Genome-wide computational prediction and analysis of core promoter elements across plant monocots and dicots

    USDA-ARS's Scientific Manuscript database

    Transcription initiation, essential to gene expression regulation, involves recruitment of basal transcription factors to the core promoter elements (CPEs). The distribution of currently known CPEs across plant genomes is largely unknown. This is the first large scale genome-wide report on the compu...

  3. A Normalization-Free and Nonparametric Method Sharpens Large-Scale Transcriptome Analysis and Reveals Common Gene Alteration Patterns in Cancers.

    PubMed

    Li, Qi-Gang; He, Yong-Han; Wu, Huan; Yang, Cui-Ping; Pu, Shao-Yan; Fan, Song-Qing; Jiang, Li-Ping; Shen, Qiu-Shuo; Wang, Xiao-Xiong; Chen, Xiao-Qiong; Yu, Qin; Li, Ying; Sun, Chang; Wang, Xiangting; Zhou, Jumin; Li, Hai-Peng; Chen, Yong-Bin; Kong, Qing-Peng

    2017-01-01

    Heterogeneity in transcriptional data hampers the identification of differentially expressed genes (DEGs) and understanding of cancer, essentially because current methods rely on cross-sample normalization and/or distribution assumptions, both of which are sensitive to heterogeneous values. Here, we developed a new method, Cross-Value Association Analysis (CVAA), which overcomes the limitation and is more robust to heterogeneous data than the other methods. Applying CVAA to a more complex pan-cancer dataset containing 5,540 transcriptomes discovered numerous new DEGs and many previously rarely explored pathways/processes; some of them were validated, both in vitro and in vivo, to be crucial in tumorigenesis, e.g., alcohol metabolism (ADH1B), chromosome remodeling (NCAPH) and complement system (Adipsin). Together, we present a sharper tool to navigate large-scale expression data and gain new mechanistic insights into tumorigenesis.

  4. Architectural Visualization of C/C++ Source Code for Program Comprehension

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Panas, T; Epperly, T W; Quinlan, D

    2006-09-01

    Structural and behavioral visualization of large-scale legacy systems to aid program comprehension is still a major challenge. The challenge is even greater when applications are implemented in flexible and expressive languages such as C and C++. In this paper, we consider visualization of static and dynamic aspects of large-scale scientific C/C++ applications. For our investigation, we reuse and integrate specialized analysis and visualization tools. Furthermore, we present a novel layout algorithm that permits a compressive architectural view of a large-scale software system. Our layout is unique in that it allows traditional program visualizations, i.e., graph structures, to be seen in relation to the application's file structure.

  5. An integrated approach to reconstructing genome-scale transcriptional regulatory networks

    DOE PAGES

    Imam, Saheed; Noguera, Daniel R.; Donohue, Timothy J.; ...

    2015-02-27

    Transcriptional regulatory networks (TRNs) program cells to dynamically alter their gene expression in response to changing internal or environmental conditions. In this study, we develop a novel workflow for generating large-scale TRN models that integrates comparative genomics data, global gene expression analyses, and intrinsic properties of transcription factors (TFs). An assessment of this workflow using benchmark datasets for the well-studied γ-proteobacterium Escherichia coli showed that it outperforms expression-based inference approaches, having a significantly larger area under the precision-recall curve. Further analysis indicated that this integrated workflow captures different aspects of the E. coli TRN than expression-based approaches, potentially making them highly complementary. We leveraged this new workflow and observations to build a large-scale TRN model for the α-Proteobacterium Rhodobacter sphaeroides that comprises 120 gene clusters, 1211 genes (including 93 TFs), 1858 predicted protein-DNA interactions and 76 DNA binding motifs. We found that ~67% of the predicted gene clusters in this TRN are enriched for functions ranging from photosynthesis or central carbon metabolism to environmental stress responses. We also found that members of many of the predicted gene clusters were consistent with prior knowledge in R. sphaeroides and/or other bacteria. Experimental validation of predictions from this R. sphaeroides TRN model showed that high precision and recall was also obtained for TFs involved in photosynthesis (PpsR), carbon metabolism (RSP_0489) and iron homeostasis (RSP_3341). In addition, this integrative approach enabled generation of TRNs with increased information content relative to R. sphaeroides TRN models built via other approaches. We also show how this approach can be used to simultaneously produce TRN models for each related organism used in the comparative genomics analysis. Our results highlight the advantages of integrating comparative genomics of closely related organisms with gene expression data to assemble large-scale TRN models with high-quality predictions.

  6. Statistical Model to Analyze Quantitative Proteomics Data Obtained by 18O/16O Labeling and Linear Ion Trap Mass Spectrometry

    PubMed Central

    Jorge, Inmaculada; Navarro, Pedro; Martínez-Acedo, Pablo; Núñez, Estefanía; Serrano, Horacio; Alfranca, Arántzazu; Redondo, Juan Miguel; Vázquez, Jesús

    2009-01-01

    Statistical models for the analysis of protein expression changes by stable isotope labeling are still poorly developed, particularly for data obtained by 16O/18O labeling. Besides, large-scale test experiments to validate the null hypothesis are lacking. Although the study of mechanisms underlying biological actions promoted by vascular endothelial growth factor (VEGF) on endothelial cells is of considerable interest, quantitative proteomics studies on this subject are scarce and have been performed after exposing cells to the factor for long periods of time. In this work we present the largest quantitative proteomics study to date on the short term effects of VEGF on human umbilical vein endothelial cells by 18O/16O labeling. Current statistical models based on normality and variance homogeneity were found unsuitable to describe the null hypothesis in a large scale test experiment performed on these cells, producing false expression changes. A random effects model was developed including four different sources of variance at the spectrum-fitting, scan, peptide, and protein levels. With the new model the number of outliers at scan and peptide levels was negligible in three large scale experiments, and only one false protein expression change was observed in the test experiment among more than 1000 proteins. The new model allowed the detection of significant protein expression changes upon VEGF stimulation for 4 and 8 h. The consistency of the changes observed at 4 h was confirmed by a replica at a smaller scale and further validated by Western blot analysis of some proteins. Most of the observed changes have not been described previously and are consistent with a pattern of protein expression that dynamically changes over time following the evolution of the angiogenic response. With this statistical model the 18O labeling approach emerges as a very promising and robust alternative to perform quantitative proteomics studies at a depth of several thousand proteins. PMID:19181660

  7. Analysis of the fluctuations of the tumour/host interface

    NASA Astrophysics Data System (ADS)

    Milotti, Edoardo; Vyshemirsky, Vladislav; Stella, Sabrina; Dogo, Federico; Chignola, Roberto

    2017-11-01

    In a recent analysis of metabolic scaling in solid tumours we found a scaling law that interpolates between the power laws μ ∝ V and μ ∝ V^(2/3), where μ is the metabolic rate expressed as the glucose absorption rate and V is the tumour volume. The scaling law fits both in vitro and in vivo data quite well; however, we also observed marked fluctuations that are associated with the specific biological properties of individual tumours. Here we analyse these fluctuations, in an attempt to find the population-wide distribution of an important parameter (A) which expresses the total extent of the interface between the solid tumour and the non-cancerous environment. Heuristic considerations suggest that the values of the A parameter follow a lognormal distribution, and, allowing for the large uncertainties of the experimental data, our statistical analysis confirms this.

  8. Cloud-Scale Genomic Signals Processing for Robust Large-Scale Cancer Genomic Microarray Data Analysis.

    PubMed

    Harvey, Benjamin Simeon; Ji, Soo-Yeon

    2017-01-01

    As microarray data available to scientists continues to increase in size and complexity, it has become overwhelmingly important to find multiple ways to bring forth oncological inference to the bioinformatics community through the analysis of large-scale cancer genomic (LSCG) DNA and mRNA microarray data that is useful to scientists. Though there have been many attempts to elucidate the issue of bringing forth biological interpretation by means of wavelet preprocessing and classification, there has not been a research effort that focuses on a cloud-scale distributed parallel (CSDP) separable 1-D wavelet decomposition technique for denoising through differential expression thresholding and classification of LSCG microarray data. This research presents a novel methodology that utilizes a CSDP separable 1-D method for wavelet-based transformation in order to initialize a threshold which will retain significantly expressed genes through the denoising process for robust classification of cancer patients. Additionally, the overall study was implemented and encompassed within CSDP environment. The utilization of cloud computing and wavelet-based thresholding for denoising was used for the classification of samples within the Global Cancer Map, Cancer Cell Line Encyclopedia, and The Cancer Genome Atlas. The results proved that separable 1-D parallel distributed wavelet denoising in the cloud and differential expression thresholding increased the computational performance and enabled the generation of higher quality LSCG microarray datasets, which led to more accurate classification results.
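    The denoising idea named in this abstract, a 1-D wavelet transform followed by thresholding of the detail coefficients, can be illustrated on a single machine. The sketch below assumes a one-level Haar wavelet and soft thresholding; the abstract does not state which wavelet, threshold rule, or number of levels the authors used, and the cloud-scale distribution is not modelled here. All function names are hypothetical.

```python
import math


def haar_forward(signal):
    """One level of the orthonormal Haar transform (even-length input)."""
    root2 = math.sqrt(2.0)
    pairs = range(0, len(signal), 2)
    approx = [(signal[i] + signal[i + 1]) / root2 for i in pairs]
    detail = [(signal[i] - signal[i + 1]) / root2 for i in pairs]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_forward, interleaving the reconstructed samples."""
    root2 = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / root2, (a - d) / root2])
    return out


def soft_threshold(coeffs, t):
    """Shrink each coefficient toward zero by t; small ones become zero."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]


def denoise(signal, t):
    """Threshold only the detail coefficients, then reconstruct."""
    approx, detail = haar_forward(signal)
    return haar_inverse(approx, soft_threshold(detail, t))


# Hypothetical expression values along one array row.
print(denoise([5.0, 5.2, 1.0, 9.0], 0.5))
```

    With a threshold of 0.5, the small difference between the first two values is treated as noise and removed, while the large difference between the last two is kept but shrunk.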

  9. Analysis of large-scale gene expression data.

    PubMed

    Sherlock, G

    2000-04-01

    The advent of cDNA and oligonucleotide microarray technologies has led to a paradigm shift in biological investigation, such that the bottleneck in research is shifting from data generation to data analysis. Hierarchical clustering, divisive clustering, self-organizing maps and k-means clustering have all been recently used to make sense of this mass of data.
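    Of the clustering methods listed in this abstract, k-means is the simplest to sketch. The example below is a plain-Python illustration of Lloyd's algorithm on a toy matrix of expression profiles; the data, the function names, and the choice of starting centroids are assumptions made for illustration and do not come from the review.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two expression profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(profiles, centroids, iterations=10):
    """Lloyd's algorithm; `centroids` are caller-chosen starting centers."""
    centroids = [list(c) for c in centroids]
    labels = [0] * len(profiles)
    for _ in range(iterations):
        # Assignment step: each profile joins its nearest centroid.
        labels = [
            min(range(len(centroids)),
                key=lambda j: squared_distance(p, centroids[j]))
            for p in profiles
        ]
        # Update step: each centroid moves to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, lab in zip(profiles, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members)
                                for col in zip(*members)]
    return labels, centroids


# Hypothetical log-ratio profiles for four genes across three conditions.
profiles = [
    [2.0, 2.1, 1.9],
    [1.8, 2.2, 2.0],
    [-1.0, -1.2, -0.9],
    [-1.1, -0.8, -1.0],
]
labels, centers = kmeans(profiles, centroids=[profiles[0], profiles[2]])
print(labels)  # [0, 0, 1, 1]
```

    In practice the number of clusters and the starting centroids strongly affect the result, so analyses of real microarray data typically repeat the procedure with several initializations.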

  10. Integrating genome-wide association studies and gene expression data highlights dysregulated multiple sclerosis risk pathways.

    PubMed

    Liu, Guiyou; Zhang, Fang; Jiang, Yongshuai; Hu, Yang; Gong, Zhongying; Liu, Shoufeng; Chen, Xiuju; Jiang, Qinghua; Hao, Junwei

    2017-02-01

    Much effort has been expended on identifying the genetic determinants of multiple sclerosis (MS). Existing large-scale genome-wide association study (GWAS) datasets provide strong support for using pathway and network-based analysis methods to investigate the mechanisms underlying MS. However, no shared genetic pathways have been identified to date. We hypothesize that shared genetic pathways may indeed exist in different MS-GWAS datasets. Here, we report results from a three-stage analysis of GWAS and expression datasets. In stage 1, we conducted multiple pathway analyses of two MS-GWAS datasets. In stage 2, we performed a candidate pathway analysis of the large-scale MS-GWAS dataset. In stage 3, we performed a pathway analysis using the dysregulated MS gene list from seven human MS case-control expression datasets. In stage 1, we identified 15 shared pathways. In stage 2, we successfully replicated 14 of these 15 significant pathways. In stage 3, we found that dysregulated MS genes were significantly enriched in 10 of 15 MS risk pathways identified in stages 1 and 2. We report shared genetic pathways in different MS-GWAS datasets and highlight some new MS risk pathways. Our findings provide new insights on the genetic determinants of MS.

  11. Large-scale gene expression profiling data for the model moss Physcomitrella patens aid understanding of developmental progression, culture and stress conditions.

    PubMed

    Hiss, Manuel; Laule, Oliver; Meskauskiene, Rasa M; Arif, Muhammad A; Decker, Eva L; Erxleben, Anika; Frank, Wolfgang; Hanke, Sebastian T; Lang, Daniel; Martin, Anja; Neu, Christina; Reski, Ralf; Richardt, Sandra; Schallenberg-Rüdinger, Mareike; Szövényi, Peter; Tiko, Theodhor; Wiedemann, Gertrud; Wolf, Luise; Zimmermann, Philip; Rensing, Stefan A

    2014-08-01

    The moss Physcomitrella patens is an important model organism for studying plant evolution, development, physiology and biotechnology. Here we have generated microarray gene expression data covering the principal developmental stages, culture forms and some environmental/stress conditions. Example analyses of developmental stages and growth conditions as well as abiotic stress treatments demonstrate that (i) growth stage is dominant over culture conditions, (ii) liquid culture is not stressful for the plant, (iii) low pH might aid protoplastation by reduced expression of cell wall structure genes, (iv) largely the same gene pool mediates response to dehydration and rehydration, and (v) AP2/EREBP transcription factors play important roles in stress response reactions. With regard to the AP2 gene family, phylogenetic analysis and comparison with Arabidopsis thaliana shows commonalities as well as uniquely expressed family members under drought, light perturbations and protoplastation. Gene expression profiles for P. patens are available for the scientific community via the easy-to-use tool at https://www.genevestigator.com. By providing large-scale expression profiles, the usability of this model organism is further enhanced, for example by enabling selection of control genes for quantitative real-time PCR. Now, gene expression levels across a broad range of conditions can be accessed online for P. patens. © 2014 The Authors The Plant Journal © 2014 John Wiley & Sons Ltd.

  12. Transcriptome profiles link environmental variation and physiological response of Mytilus californianus between Pacific tides

    PubMed Central

    Place, Sean P.; Menge, Bruce A.; Hofmann, Gretchen E.

    2011-01-01

    Summary The marine intertidal zone is characterized by large variation in temperature, pH, dissolved oxygen and the supply of nutrients and food on seasonal and daily time scales. These oceanic fluctuations drive ecological processes such as recruitment, competition and consumer-prey interactions largely via physiological mechanisms. Thus, to understand coastal ecosystem dynamics and responses to climate change, it is crucial to understand these mechanisms. Here we utilize transcriptome analysis of the physiological response of the mussel Mytilus californianus at different spatial scales to gain insight into these mechanisms. We used mussels inhabiting different vertical locations within Strawberry Hill on Cape Perpetua, OR and Boiler Bay on Cape Foulweather, OR to study inter- and intra-site variation of gene expression. The results highlight two distinct gene expression signatures related to the cycling of metabolic activity and perturbations to cellular homeostasis. Intermediate spatial scales show a strong influence of oceanographic differences in food and stress environments between sites separated by ~65 km. Together, these new insights into environmental control of gene expression may allow understanding of important physiological drivers within and across populations. PMID:22563136

  13. Optimal consistency in microRNA expression analysis using reference-gene-based normalization.

    PubMed

    Wang, Xi; Gardiner, Erin J; Cairns, Murray J

    2015-05-01

    Normalization of high-throughput molecular expression profiles secures differential expression analysis between samples of different phenotypes or biological conditions, and facilitates comparison between experimental batches. While the same general principles apply to microRNA (miRNA) normalization, there is mounting evidence that global shifts in their expression patterns occur in specific circumstances, which pose a challenge for normalizing miRNA expression data. As an alternative to global normalization, which has the propensity to flatten large trends, normalization against constitutively expressed reference genes presents an advantage through their relative independence. Here we investigated the performance of reference-gene-based (RGB) normalization for differential miRNA expression analysis of microarray expression data, and compared the results with other normalization methods, including: quantile, variance stabilization, robust spline, simple scaling, rank invariant, and Loess regression. The comparative analyses were executed using miRNA expression in tissue samples derived from subjects with schizophrenia and non-psychiatric controls. We proposed a consistency criterion for evaluating methods by examining the overlapping of differentially expressed miRNAs detected using different partitions of the whole data. Based on this criterion, we found that RGB normalization generally outperformed global normalization methods. Thus we recommend the application of RGB normalization for miRNA expression data sets, and believe that this will yield a more consistent and useful readout of differentially expressed miRNAs, particularly in biological conditions characterized by large shifts in miRNA expression.
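    The contrast drawn in this abstract between global normalization and normalization against reference genes can be shown with a toy example. The sketch below assumes log2-scale signals and implements reference-gene normalization as subtraction of the mean reference signal, with simple mean centering standing in for a global method; the abstract does not give the authors' exact procedure, and the feature names and values are hypothetical.

```python
def reference_normalize(sample, reference_ids):
    """Subtract the mean log2 signal of the reference genes from every feature."""
    offset = sum(sample[g] for g in reference_ids) / len(reference_ids)
    return {g: v - offset for g, v in sample.items()}


def global_mean_normalize(sample):
    """Subtract the mean log2 signal of all features (a simple global method)."""
    offset = sum(sample.values()) / len(sample)
    return {g: v - offset for g, v in sample.items()}


# Hypothetical log2 signals: in sample_b both miRNAs are shifted up by 2,
# while the reference genes are unchanged.
refs = ["ref1", "ref2"]
sample_a = {"ref1": 10.0, "ref2": 12.0, "miR-x": 8.0, "miR-y": 6.0}
sample_b = {"ref1": 10.0, "ref2": 12.0, "miR-x": 10.0, "miR-y": 8.0}

rgb_shift = (reference_normalize(sample_b, refs)["miR-x"]
             - reference_normalize(sample_a, refs)["miR-x"])
global_shift = (global_mean_normalize(sample_b)["miR-x"]
                - global_mean_normalize(sample_a)["miR-x"])
print(rgb_shift, global_shift)  # 2.0 1.0
```

    In this toy case the reference-based offset preserves the full two-unit shift, whereas mean centering absorbs half of it, which is the flattening effect the abstract attributes to global normalization.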

  14. Potential large scale production of meningococcal vaccines by stable overexpression of fHbp in the rice seeds.

    PubMed

    Ma, Jian; Wang, Yunpeng; Xu, Nuo; Jin, Libo; Liu, Jia; Xing, Shaochen; Li, Xiaokun

    2018-06-25

    Factor H binding protein (fHbp) is the most promising vaccine candidate against serogroup B of Neisseria meningitidis, which is a major cause of morbidity and mortality in children. In order to facilitate large scale production of a commercial vaccine, we previously used transgenic Arabidopsis thaliana, but plant-derived fHbp is still far from being a commercial vaccine because of low biomass production. Herein, we present an alternative route for the production of recombinant fHbp from the seeds of transgenic rice. The OsrfHbp gene encoding recombinant fHbp fused protein was introduced into the genome of rice via Agrobacterium-mediated transformation. Both the stable integration and the transcription of the foreign OsrfHbp were confirmed, by Southern blotting and RT-PCR analysis respectively. Further, the expression of fHbp protein was measured by immunoblotting analysis and quantified by ELISA. The results indicated that fHbp was successfully expressed and the highest yield of fHbp was 0.52 ± 0.03% of TSP in the transgenic rice seeds. The purified fHbp protein showed good antigenicity and immunogenicity in the animal model. The results of this experiment offer a novel approach for large-scale production of plant-derived commercial vaccine fHbp. Copyright © 2018. Published by Elsevier Inc.

  15. Analysis of host response to bacterial infection using error model based gene expression microarray experiments

    PubMed Central

    Stekel, Dov J.; Sarti, Donatella; Trevino, Victor; Zhang, Lihong; Salmon, Mike; Buckley, Chris D.; Stevens, Mark; Pallen, Mark J.; Penn, Charles; Falciani, Francesco

    2005-01-01

    A key step in the analysis of microarray data is the selection of genes that are differentially expressed. Ideally, such experiments should be properly replicated in order to infer both technical and biological variability, and the data should be subjected to rigorous hypothesis tests to identify the differentially expressed genes. However, in microarray experiments involving the analysis of very large numbers of biological samples, replication is not always practical. Therefore, there is a need for a method to select differentially expressed genes in a rational way from insufficiently replicated data. In this paper, we describe a simple method that uses bootstrapping to generate an error model from a replicated pilot study that can be used to identify differentially expressed genes in subsequent large-scale studies on the same platform, but in which there may be no replicated arrays. The method builds a stratified error model that includes array-to-array variability, feature-to-feature variability and the dependence of error on signal intensity. We apply this model to the characterization of the host response in a model of bacterial infection of human intestinal epithelial cells. We demonstrate the effectiveness of error model based microarray experiments and propose this as a general strategy for a microarray-based screening of large collections of biological samples. PMID:15800204
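    The strategy described in this abstract, learning an error model from a replicated pilot study and applying it to unreplicated arrays, can be sketched in a reduced form. The example below assumes that the pilot yields pairs of (mean intensity, replicate difference) and bootstraps a 95% threshold within each intensity bin. It models only the dependence of error on signal intensity, not the array-to-array or feature-to-feature components the authors mention, and every name and number is hypothetical.

```python
import random


def bin_index(intensity, bin_edges):
    """Index of the intensity stratum that a measurement falls into."""
    return sum(intensity >= edge for edge in bin_edges)


def bootstrap_threshold(null_diffs, quantile=0.95, n_boot=200, seed=0):
    """Mean bootstrap estimate of the |difference| quantile under no change."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        resample = sorted(abs(rng.choice(null_diffs)) for _ in null_diffs)
        index = min(int(quantile * len(resample)), len(resample) - 1)
        estimates.append(resample[index])
    return sum(estimates) / len(estimates)


def build_error_model(pilot, bin_edges):
    """One threshold per intensity bin from (intensity, difference) pairs."""
    bins = [[] for _ in range(len(bin_edges) + 1)]
    for intensity, diff in pilot:
        bins[bin_index(intensity, bin_edges)].append(diff)
    return [bootstrap_threshold(b) if b else float("inf") for b in bins]


def is_differential(intensity, log_ratio, thresholds, bin_edges):
    """Flag a feature whose change exceeds the noise expected at its intensity."""
    return abs(log_ratio) > thresholds[bin_index(intensity, bin_edges)]


# Hypothetical pilot data: replicate differences are noisier at low intensity.
pilot = [(6.0, 0.8), (6.5, -1.0), (7.0, 0.9), (6.2, -0.7),
         (11.0, 0.1), (12.0, -0.2), (11.5, 0.15), (12.5, -0.1)]
edges = [9.0]
thresholds = build_error_model(pilot, edges)
print(is_differential(12.0, 0.5, thresholds, edges))  # True
print(is_differential(6.0, 0.5, thresholds, edges))   # False
```

    The same log ratio of 0.5 is called differential at high intensity but not at low intensity, which illustrates why an intensity-stratified model is preferable to a single fold-change cutoff.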

  16. Genome-wide screening and identification of long noncoding RNAs and their interaction with protein coding RNAs in bladder urothelial cell carcinoma.

    PubMed

    Wang, Longxin; Fu, Dian; Qiu, Yongbin; Xing, Xiaoxiao; Xu, Feng; Han, Conghui; Xu, Xiaofeng; Wei, Zhifeng; Zhang, Zhengyu; Ge, Jingping; Cheng, Wen; Xie, Hai-Long

    2014-07-10

    To understand lncRNA expression profiles and their potential functions in bladder cancer, we investigated lncRNA and coding RNA expression in human bladder cancer and normal bladder tissues. Bioinformatic analysis revealed thousands of significantly differentially expressed lncRNAs and coding mRNAs in bladder cancer relative to normal bladder tissue. Co-expression analysis revealed that 50% of lncRNAs and coding RNAs were expressed in the same direction. A subset of lncRNAs might be involved in mTOR signaling, p53 signaling, and cancer pathways. Our study provides a large-scale view of co-expression between lncRNAs and coding RNAs in bladder cancer cells and lays a biological basis for further investigation. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.

  17. APPLICATION OF CDNA MICROARRAY TECHNOLOGY TO IN VITRO TOXICOLOGY AND THE SELECTION OF GENES FOR A REAL TIME RT-PCR-BASED SCREEN FOR OXIDATIVE STRESS IN HEP-G2 CELLS

    EPA Science Inventory

    Large-scale analysis of gene expression using cDNA microarrays promises the
    rapid detection of the mode of toxicity for drugs and other chemicals. cDNA
    microarrays were used to examine chemically-induced alterations of gene
    expression in HepG2 cells exposed to oxidative ...

  18. Parallel human genome analysis: microarray-based expression monitoring of 1000 genes.

    PubMed Central

    Schena, M; Shalon, D; Heller, R; Chai, A; Brown, P O; Davis, R W

    1996-01-01

    Microarrays containing 1046 human cDNAs of unknown sequence were printed on glass with high-speed robotics. These 1.0-cm² DNA "chips" were used to quantitatively monitor differential expression of the cognate human genes using a highly sensitive two-color hybridization assay. Array elements that displayed differential expression patterns under given experimental conditions were characterized by sequencing. The identification of known and novel heat shock and phorbol ester-regulated genes in human T cells demonstrates the sensitivity of the assay. Parallel gene analysis with microarrays provides a rapid and efficient method for large-scale human gene discovery. PMID:8855227

  19. Tomato functional genomics database (TFGD): a comprehensive collection and analysis package for tomato functional genomics

    USDA-ARS's Scientific Manuscript database

    Tomato Functional Genomics Database (TFGD; http://ted.bti.cornell.edu) provides a comprehensive systems biology resource to store, mine, analyze, visualize and integrate large-scale tomato functional genomics datasets. The database is expanded from the previously described Tomato Expression Database...

  20. Multivariate Pattern Classification of Facial Expressions Based on Large-Scale Functional Connectivity.

    PubMed

    Liang, Yin; Liu, Baolin; Li, Xianglin; Wang, Peiyuan

    2018-01-01

    It is an important question how human beings achieve efficient recognition of others' facial expressions in cognitive neuroscience, and it has been identified that specific cortical regions show preferential activation to facial expressions in previous studies. However, the potential contributions of the connectivity patterns in the processing of facial expressions remained unclear. The present functional magnetic resonance imaging (fMRI) study explored whether facial expressions could be decoded from the functional connectivity (FC) patterns using multivariate pattern analysis combined with machine learning algorithms (fcMVPA). We employed a block design experiment and collected neural activities while participants viewed facial expressions of six basic emotions (anger, disgust, fear, joy, sadness, and surprise). Both static and dynamic expression stimuli were included in our study. A behavioral experiment after scanning confirmed the validity of the facial stimuli presented during the fMRI experiment with classification accuracies and emotional intensities. We obtained whole-brain FC patterns for each facial expression and found that both static and dynamic facial expressions could be successfully decoded from the FC patterns. Moreover, we identified the expression-discriminative networks for the static and dynamic facial expressions, which span beyond the conventional face-selective areas. Overall, these results reveal that large-scale FC patterns may also contain rich expression information to accurately decode facial expressions, suggesting a novel mechanism, which includes general interactions between distributed brain regions, and that contributes to the human facial expression recognition.

  1. Multivariate Pattern Classification of Facial Expressions Based on Large-Scale Functional Connectivity

    PubMed Central

    Liang, Yin; Liu, Baolin; Li, Xianglin; Wang, Peiyuan

    2018-01-01

    It is an important question how human beings achieve efficient recognition of others’ facial expressions in cognitive neuroscience, and it has been identified that specific cortical regions show preferential activation to facial expressions in previous studies. However, the potential contributions of the connectivity patterns in the processing of facial expressions remained unclear. The present functional magnetic resonance imaging (fMRI) study explored whether facial expressions could be decoded from the functional connectivity (FC) patterns using multivariate pattern analysis combined with machine learning algorithms (fcMVPA). We employed a block design experiment and collected neural activities while participants viewed facial expressions of six basic emotions (anger, disgust, fear, joy, sadness, and surprise). Both static and dynamic expression stimuli were included in our study. A behavioral experiment after scanning confirmed the validity of the facial stimuli presented during the fMRI experiment with classification accuracies and emotional intensities. We obtained whole-brain FC patterns for each facial expression and found that both static and dynamic facial expressions could be successfully decoded from the FC patterns. Moreover, we identified the expression-discriminative networks for the static and dynamic facial expressions, which span beyond the conventional face-selective areas. Overall, these results reveal that large-scale FC patterns may also contain rich expression information to accurately decode facial expressions, suggesting a novel mechanism, which includes general interactions between distributed brain regions, and that contributes to the human facial expression recognition. PMID:29615882

  2. Genome-scale approaches to the epigenetics of common human disease

    PubMed Central

    2011-01-01

    Traditionally, the pathology of human disease has been focused on microscopic examination of affected tissues, chemical and biochemical analysis of biopsy samples, other available samples of convenience, such as blood, and noninvasive or invasive imaging of varying complexity, in order to classify disease and illuminate its mechanistic basis. The molecular age has complemented this armamentarium with gene expression arrays and selective analysis of individual genes. However, we are entering a new era of epigenomic profiling, i.e., genome-scale analysis of cell-heritable nonsequence genetic change, such as DNA methylation. The epigenome offers access to stable measurements of cellular state and to biobanked material for large-scale epidemiological studies. Some of these genome-scale technologies are beginning to be applied to create the new field of epigenetic epidemiology. PMID:19844740

  3. Comparative modular analysis of gene expression in vertebrate organs.

    PubMed

    Piasecka, Barbara; Kutalik, Zoltán; Roux, Julien; Bergmann, Sven; Robinson-Rechavi, Marc

    2012-03-29

    The degree of conservation of gene expression between homologous organs largely remains an open question. Several recent studies reported some evidence in favor of such conservation. Most studies compute organs' similarity across all orthologous genes, whereas the expression levels of many genes are not informative about organ specificity. Here, we use a modularization algorithm to overcome this limitation through the identification of inter-species co-modules of organs and genes. We identify such co-modules using mouse and human microarray expression data. They are functionally coherent both in terms of genes and of organs from both organisms. We show that a large proportion of genes belonging to the same co-module are orthologous between mouse and human. Moreover, their zebrafish orthologs also tend to be expressed in the corresponding homologous organs. Notable exceptions to the general pattern of conservation are the testis and the olfactory bulb. Interestingly, some co-modules consist of single organs, while others combine several functionally related organs. For instance, amygdala, cerebral cortex, hypothalamus and spinal cord form a clearly discernible unit of expression, both in mouse and human. Our study provides a new framework for comparative analysis which will be applicable also to other sets of large-scale phenotypic data collected across different species.

  4. Dating and functional characterization of duplicated genes in the apple (Malus domestica Borkh.) by analyzing EST data.

    PubMed

    Sanzol, Javier

    2010-05-14

    Gene duplication is central to genome evolution. In plants, genes can be duplicated through small-scale events and large-scale duplications often involving polyploidy. The apple belongs to the subtribe Pyrinae (Rosaceae), a diverse lineage that originated via allopolyploidization. Both small-scale duplications and polyploidy may have been important mechanisms shaping the genome of this species. This study evaluates the gene duplication and polyploidy history of the apple by characterizing duplicated genes in this species using EST data. Overall, 68% of the apple genes were clustered into families with a mean copy-number of 4.6. Analysis of the age distribution of gene duplications supported a continuous mode of small-scale duplications, plus two episodes of large-scale duplicates of vastly different ages. The youngest was consistent with the polyploid origin of the Pyrinae 37-48 MYBP, whereas the older may be related to gamma-triplication, an ancient hexapolyploidization previously characterized in the four sequenced eurosid genomes and basal to the eurosid-asterid divergence. Duplicated genes were studied for functional diversification with an emphasis on young paralogs, those that originated during or after the formation of the Pyrinae lineage. Unequal assignment of single-copy genes and gene families to Gene Ontology categories suggested functional bias in the pattern of gene retention of paralogs. Young paralogs related to signal transduction, metabolism, and energy pathways have been preferentially retained. Non-random retention of duplicated genes seems to have mediated the expansion of gene families, some of which may have substantially increased their members after the origin of the Pyrinae. The joint analysis of over-duplicated functional categories and phylogenies allowed evaluation of the role of both polyploidy and small-scale duplications during this process. Finally, gene expression analysis indicated that 82% of duplicated genes, including 80% of young paralogs, showed uncorrelated expression profiles, suggesting extensive subfunctionalization and a role of gene duplication in the acquisition of novel patterns of gene expression. This study reports a genome-wide analysis of the mode of gene duplication in the apple, and provides evidence for its role in genome functional diversification by characterising three major processes: selective retention of paralogs, amplification of gene families, and changes in gene expression.

  5. Principles of gene microarray data analysis.

    PubMed

    Mocellin, Simone; Rossi, Carlo Riccardo

    2007-01-01

    The development of several gene expression profiling methods, such as comparative genomic hybridization (CGH), differential display, serial analysis of gene expression (SAGE), and gene microarray, together with the sequencing of the human genome, has provided an opportunity to monitor and investigate the complex cascade of molecular events leading to tumor development and progression. The availability of such large amounts of information has shifted the attention of scientists towards a nonreductionist approach to biological phenomena. High throughput technologies can be used to follow changing patterns of gene expression over time. Among them, gene microarray has become prominent because it is easier to use, does not require large-scale DNA sequencing, and allows for the parallel quantification of thousands of genes from multiple samples. Gene microarray technology is rapidly spreading worldwide and has the potential to drastically change the therapeutic approach to patients affected with tumor. Therefore, it is of paramount importance for both researchers and clinicians to know the principles underlying the analysis of the huge amount of data generated with microarray technology.

  6. Cloud-scale genomic signals processing classification analysis for gene expression microarray data.

    PubMed

    Harvey, Benjamin; Soo-Yeon Ji

    2014-01-01

    As microarray data available to scientists continues to increase in size and complexity, it has become overwhelmingly important to find multiple ways to bring inference through analysis of DNA/mRNA sequence data that is useful to scientists. Though there have been many attempts to elucidate the issue of bringing forth biological inference by means of wavelet preprocessing and classification, there has not been a research effort that focuses on a cloud-scale classification analysis of microarray data using Wavelet thresholding in a Cloud environment to identify significantly expressed features. This paper proposes a novel methodology that uses Wavelet based Denoising to initialize a threshold for determination of significantly expressed genes for classification. Additionally, this research was implemented and encompassed within a cloud-based distributed processing environment. Cloud computing and Wavelet thresholding were used for the classification of 14 tumor classes from the Global Cancer Map (GCM). The results proved to be more accurate than using a predefined p-value for differential expression classification. This novel methodology analyzed Wavelet based threshold features of gene expression in a Cloud environment, furthermore classifying the expression of samples by analyzing gene patterns, which inform us of biological processes. Moreover, it enables researchers to face the present and forthcoming challenges that may arise in the analysis of data in functional genomics of large microarray datasets.
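
    The wavelet thresholding idea mentioned in this abstract can be illustrated with a minimal sketch. It assumes a single-level orthonormal Haar transform and soft thresholding with a user-supplied threshold t; the function names are hypothetical, and this is a generic illustration of wavelet denoising, not the pipeline or threshold rule used by the authors.

```python
import math


def haar_step(x):
    """One level of the orthonormal Haar transform (len(x) must be even)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail


def inverse_haar_step(approx, detail):
    """Invert haar_step, interleaving the reconstructed pairs."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out


def soft_threshold(coeffs, t):
    """Shrink each coefficient toward zero by t; small ones become zero."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]


def denoise(x, t):
    """Threshold only the detail coefficients, then reconstruct (t is an assumed input)."""
    approx, detail = haar_step(x)
    return inverse_haar_step(approx, soft_threshold(detail, t))
```

    With t = 0 the signal is reconstructed unchanged; with a large t each pair of neighbouring values collapses to its mean.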

  7. Streaming fragment assignment for real-time analysis of sequencing experiments

    PubMed Central

    Roberts, Adam; Pachter, Lior

    2013-01-01

    We present eXpress, a software package for highly efficient probabilistic assignment of ambiguously mapping sequenced fragments. eXpress uses a streaming algorithm with linear run time and constant memory use. It can determine abundances of sequenced molecules in real time, and can be applied to ChIP-seq, metagenomics and other large-scale sequencing data. We demonstrate its use on RNA-seq data, showing greater efficiency than other quantification methods. PMID:23160280
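
    As a rough illustration of what a one-pass (streaming) assignment of ambiguously mapping fragments can look like, the sketch below keeps a running soft count per target and splits each incoming fragment among its compatible targets in proportion to the current counts. It is a toy under stated assumptions (a uniform pseudo-count prior, no bias or error model, hypothetical function name) and is not the eXpress implementation.

```python
from collections import defaultdict


def stream_assign(fragments, prior=1.0):
    """Assign ambiguously mapping fragments to targets in a single pass.

    fragments: iterable of lists; each list holds the names of the
    targets one fragment is compatible with.
    Returns the estimated (fractional) fragment count per target.
    """
    # Running soft counts, initialised lazily with an assumed pseudo-count.
    counts = defaultdict(lambda: prior)
    for targets in fragments:
        # Weight each compatible target by its current abundance estimate.
        weights = [counts[t] for t in targets]
        total = sum(weights)
        # Distribute one fragment's worth of mass across the targets.
        for t, w in zip(targets, weights):
            counts[t] += w / total
    # Remove the pseudo-count so the values sum to the number of fragments.
    return {t: c - prior for t, c in counts.items()}
```

    Each fragment is processed once and then discarded, so run time grows linearly with the number of fragments while memory depends only on the number of targets.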

  8. SLIDE - a web-based tool for interactive visualization of large-scale -omics data.

    PubMed

    Ghosh, Soumita; Datta, Abhik; Tan, Kaisen; Choi, Hyungwon

    2018-06-28

    Data visualization is often regarded as a post hoc step for verifying statistically significant results in the analysis of high-throughput data sets. This common practice leaves a large amount of raw data behind, from which more information can be extracted. However, existing solutions do not provide capabilities to explore large-scale raw datasets using biologically sensible queries, nor do they allow user interaction based real-time customization of graphics. To address these drawbacks, we have designed an open-source, web-based tool called Systems-Level Interactive Data Exploration, or SLIDE to visualize large-scale -omics data interactively. SLIDE's interface makes it easier for scientists to explore quantitative expression data in multiple resolutions in a single screen. SLIDE is publicly available under BSD license both as an online version as well as a stand-alone version at https://github.com/soumitag/SLIDE. Supplementary Information are available at Bioinformatics online.

  9. Predictive model for inflammation grades of chronic hepatitis B: Large-scale analysis of clinical parameters and gene expressions.

    PubMed

    Zhou, Weichen; Ma, Yanyun; Zhang, Jun; Hu, Jingyi; Zhang, Menghan; Wang, Yi; Li, Yi; Wu, Lijun; Pan, Yida; Zhang, Yitong; Zhang, Xiaonan; Zhang, Xinxin; Zhang, Zhanqing; Zhang, Jiming; Li, Hai; Lu, Lungen; Jin, Li; Wang, Jiucun; Yuan, Zhenghong; Liu, Jie

    2017-11-01

    Liver biopsy is the gold standard to assess pathological features (e.g., inflammation grades) for hepatitis B virus-infected patients although it is invasive and traumatic; meanwhile, several gene profiles of chronic hepatitis B (CHB) have been separately described in relatively small hepatitis B virus (HBV)-infected samples. We aimed to analyse correlations among inflammation grades, gene expressions and clinical parameters (serum alanine amino transaminase, aspartate amino transaminase and HBV-DNA) in large-scale CHB samples and to predict inflammation grades by using clinical parameters and/or gene expressions. We analysed gene expressions with three clinical parameters in 122 CHB samples by an improved regression model. Principal component analysis and machine-learning methods including Random Forest, K-nearest neighbour and support vector machine were used for analysis and further diagnosis models. Six normal samples were used to validate the predictive model. Significant genes related to clinical parameters were found to be enriched in the immune system, interferon-stimulated, regulation of cytokine production, anti-apoptosis, etc. A panel of these genes with clinical parameters can effectively predict binary classifications of inflammation grade (area under the ROC curve [AUC]: 0.88, 95% confidence interval [CI]: 0.77-0.93), validated by normal samples. A panel with only clinical parameters was also valuable (AUC: 0.78, 95% CI: 0.65-0.86), indicating that a liquid biopsy method for detecting the pathology of CHB is possible. This is the first study to systematically elucidate the relationships among gene expressions, clinical parameters and pathological inflammation grades in CHB, and to build models predicting inflammation grades by gene expressions and/or clinical parameters as well. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
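
    The AUC values quoted in this abstract summarise how well a score separates two classes. As a minimal sketch (the function name and the toy data are assumptions, not taken from the study), the AUC of a binary classifier can be computed as the fraction of positive/negative pairs in which the positive sample receives the higher score:

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: predicted scores, higher meaning "more likely positive".
    labels: 1 for positive samples, 0 for negative samples.
    Tied pairs count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

    This pairwise form is quadratic in the number of samples, which is acceptable for cohorts of a few hundred; rank-based formulas give the same value faster.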

  10. Large-scale atlas of microarray data reveals biological landscape of gene expression in Arabidopsis

    USDA-ARS's Scientific Manuscript database

    Transcriptome datasets from thousands of samples of the model plant Arabidopsis thaliana have been collectively generated by multiple individual labs. Although integration and meta-analysis of these samples has become routine in the plant research community, it is often hampered by the lack of metad...

  11. Sequence analysis reveals genomic factors affecting EST-SSR primer performance and polymorphism

    USDA-ARS's Scientific Manuscript database

    Search for simple sequence repeat (SSR) motifs and design of flanking primers in expressed sequence tag (EST) sequences can be easily done at a large scale using bioinformatics programs. However, failed amplification and/or detection, along with lack of polymorphism, is often seen among randomly sel...

  12. An efficient procedure for the expression and purification of HIV-1 protease from inclusion bodies.

    PubMed

    Nguyen, Hong-Loan Thi; Nguyen, Thuy Thi; Vu, Quy Thi; Le, Hang Thi; Pham, Yen; Trinh, Phuong Le; Bui, Thuan Phuong; Phan, Tuan-Nghia

    2015-12-01

    Several studies have focused on HIV-1 protease for developing drugs for treating AIDS. Recombinant HIV-1 protease is used to screen new drugs from synthetic compounds or natural substances. However, large-scale expression and purification of this enzyme is difficult mainly because of its low expression and solubility. In this study, we constructed 9 recombinant plasmids containing a sequence encoding HIV-1 protease along with different fusion tags and examined the expression of the enzyme from these plasmids. Of the 9 plasmids, pET32a(+) plasmid containing the HIV-1 protease-encoding sequence along with sequences encoding an autocleavage site GTVSFNF at the N-terminus and TEV plus 6× His tag at the C-terminus showed the highest expression of the enzyme and was selected for further analysis. The recombinant protein was isolated from inclusion bodies by using 2 tandem Q- and Ni-Sepharose columns. SDS-PAGE of the obtained HIV-1 protease produced a single band of approximately 13 kDa. The enzyme was recovered efficiently (4 mg protein/L of cell culture) and had high specific activity of 1190 nmol min(-1) mg(-1) at an optimal pH of 4.7 and optimal temperature of 37 °C. This procedure for expressing and purifying HIV-1 protease is now being scaled up to produce the enzyme on a large scale for its application. Copyright © 2015 Elsevier Inc. All rights reserved.

  13. Computerized image analysis for quantitative neuronal phenotyping in zebrafish.

    PubMed

    Liu, Tianming; Lu, Jianfeng; Wang, Ye; Campbell, William A; Huang, Ling; Zhu, Jinmin; Xia, Weiming; Wong, Stephen T C

    2006-06-15

    An integrated microscope image analysis pipeline is developed for automatic analysis and quantification of phenotypes in zebrafish with altered expression of Alzheimer's disease (AD)-linked genes. We hypothesize that a slight impairment of neuronal integrity in a large number of zebrafish carrying the mutant genotype can be detected through the computerized image analysis method. Key functionalities of our zebrafish image processing pipeline include quantification of neuron loss in zebrafish embryos due to knockdown of AD-linked genes, automatic detection of defective somites, and quantitative measurement of gene expression levels in zebrafish with altered expression of AD-linked genes or treatment with a chemical compound. These quantitative measurements enable the archival of analyzed results and relevant meta-data. The structured database is organized for statistical analysis and data modeling to better understand neuronal integrity and phenotypic changes of zebrafish under different perturbations. Our results show that the computerized analysis is comparable to manual counting with equivalent accuracy and improved efficacy and consistency. Development of such an automated data analysis pipeline represents a significant step forward to achieve accurate and reproducible quantification of neuronal phenotypes in large scale or high-throughput zebrafish imaging studies.

  14. Systems biology of embryonic development: Prospects for a complete understanding of the Caenorhabditis elegans embryo.

    PubMed

    Murray, John Isaac

    2018-05-01

    The convergence of developmental biology and modern genomics tools brings the potential for a comprehensive understanding of developmental systems. This is especially true for the Caenorhabditis elegans embryo because its small size, invariant developmental lineage, and powerful genetic and genomic tools provide the prospect of a cellular resolution understanding of messenger RNA (mRNA) expression and regulation across the organism. We describe here how a systems biology framework might allow large-scale determination of the embryonic regulatory relationships encoded in the C. elegans genome. This framework consists of two broad steps: (a) defining the "parts list"-all genes expressed in all cells at each time during development and (b) iterative steps of computational modeling and refinement of these models by experimental perturbation. Substantial progress has been made towards defining the parts list through imaging methods such as large-scale green fluorescent protein (GFP) reporter analysis. Imaging results are now being augmented by high-resolution transcriptome methods such as single-cell RNA sequencing, and it is likely the complete expression patterns of all genes across the embryo will be known within the next few years. In contrast, the modeling and perturbation experiments performed so far have focused largely on individual cell types or genes, and improved methods will be needed to expand them to the full genome and organism. This emerging comprehensive map of embryonic expression and regulatory function will provide a powerful resource for developmental biologists, and would also allow scientists to ask questions not accessible without a comprehensive picture. This article is categorized under: Invertebrate Organogenesis > Worms Technologies > Analysis of the Transcriptome Gene Expression and Transcriptional Hierarchies > Gene Networks and Genomics. © 2018 Wiley Periodicals, Inc.

  15. An integrated bioinformatics approach to improve two-color microarray quality-control: impact on biological conclusions.

    PubMed

    van Haaften, Rachel I M; Luceri, Cristina; van Erk, Arie; Evelo, Chris T A

    2009-06-01

    Omics technology used for large-scale measurements of gene expression is rapidly evolving. This work pointed out the need for extensive bioinformatics analyses for array quality assessment before and after gene expression clustering and pathway analysis. A study focused on the effect of red wine polyphenols on rat colon mucosa was used to test the impact of quality control and normalisation steps on the biological conclusions. The integration of data visualization, pathway analysis and clustering revealed an artifact problem that was solved with an adapted normalisation. We propose a possible point-to-point standard analysis procedure, based on a combination of clustering and data visualization for the analysis of microarray data.

  16. Comparisons between Arabidopsis thaliana and Drosophila melanogaster in relation to Coding and Noncoding Sequence Length and Gene Expression

    PubMed Central

    Caldwell, Rachel; Lin, Yan-Xia; Zhang, Ren

    2015-01-01

    There is a continuing interest in the analysis of gene architecture and gene expression to determine the relationship that may exist. Advances in high-quality sequencing technologies and large-scale resource datasets have increased the understanding of relationships and cross-referencing of expression data to the large genome data. Although a negative correlation between expression level and gene (especially transcript) length has been generally accepted, there have been some conflicting results arising from the literature concerning the impacts of different regions of genes, and the underlying reason is not well understood. The research aims to apply quantile regression techniques for statistical analysis of coding and noncoding sequence length and gene expression data in the plant, Arabidopsis thaliana, and fruit fly, Drosophila melanogaster, to determine if a relationship exists and if there is any variation or similarities between these species. The quantile regression analysis found that the coding sequence length and gene expression correlations varied, and similarities emerged for the noncoding sequence length (5′ and 3′ UTRs) between animal and plant species. In conclusion, the information described in this study provides the basis for further exploration into gene regulation with regard to coding and noncoding sequence length. PMID:26114098
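
    Quantile regression, the technique named in this abstract, fits a chosen quantile rather than the mean by minimising the check (pinball) loss. The sketch below shows that loss and, by brute force over the observed values, that the best constant predictor is the corresponding sample quantile. The function names and data are illustrative assumptions; the study itself fits regression models rather than constants.

```python
def pinball_loss(y, pred, tau):
    """Mean check (pinball) loss for quantile level tau in (0, 1)."""
    total = 0.0
    for yi, pi in zip(y, pred):
        r = yi - pi
        # Under-predictions are weighted by tau, over-predictions by 1 - tau.
        total += tau * r if r >= 0 else (tau - 1.0) * r
    return total / len(y)


def best_constant(y, tau):
    """Observed value that minimises the pinball loss as a constant predictor."""
    candidates = sorted(set(y))
    return min(candidates, key=lambda c: pinball_loss(y, [c] * len(y), tau))
```

    For tau = 0.5 the minimiser is the median, which is why quantile regression is less sensitive to extreme expression values than least squares.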

  17. DEXTER: Disease-Expression Relation Extraction from Text.

    PubMed

    Gupta, Samir; Dingerdissen, Hayley; Ross, Karen E; Hu, Yu; Wu, Cathy H; Mazumder, Raja; Vijay-Shanker, K

    2018-01-01

    Gene expression levels affect biological processes and play a key role in many diseases. Characterizing expression profiles is useful for clinical research, and diagnostics and prognostics of diseases. There are currently several high-quality databases that capture gene expression information, obtained mostly from large-scale studies, such as microarray and next-generation sequencing technologies, in the context of disease. The scientific literature is another rich source of information on gene expression-disease relationships that not only have been captured from large-scale studies but have also been observed in thousands of small-scale studies. Expression information obtained from literature through manual curation can extend expression databases. While many of the existing databases include information from literature, they are limited by the time-consuming nature of manual curation and have difficulty keeping up with the explosion of publications in the biomedical field. In this work, we describe an automated text-mining tool, Disease-Expression Relation Extraction from Text (DEXTER) to extract information from literature on gene and microRNA expression in the context of disease. One of the motivations in developing DEXTER was to extend the BioXpress database, a cancer-focused gene expression database that includes data derived from large-scale experiments and manual curation of publications. The literature-based portion of BioXpress lags behind significantly compared to expression information obtained from large-scale studies and can benefit from our text-mined results. We have conducted two different evaluations to measure the accuracy of our text-mining tool and achieved average F-scores of 88.51 and 81.81% for the two evaluations, respectively. Also, to demonstrate the ability to extract rich expression information in different disease-related scenarios, we used DEXTER to extract information on differential expression information for 2024 genes in lung cancer, 115 glycosyltransferases in 62 cancers and 826 microRNA in 171 cancers. All extractions using DEXTER are integrated in the literature-based portion of BioXpress. Database URL: http://biotm.cis.udel.edu/DEXTER.
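
    The F-scores reported in this abstract combine precision and recall into one number. A minimal sketch of the standard F-beta formula follows; the abstract does not state which beta was used, so the default beta = 1 (the F1 score) is an assumption, as are the function name and example counts.

```python
def f_score(tp, fp, fn, beta=1.0):
    """F-beta score from true-positive, false-positive and false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    # Weighted harmonic mean of precision and recall.
    return (1 + b2) * precision * recall / (b2 * precision + recall)
```

    For example, 8 correct extractions with 2 spurious and 2 missed ones give precision and recall of 0.8 each, and therefore an F1 of 0.8.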

  18. Genetic Approaches to Study Meiosis and Meiosis-Specific Gene Expression in Saccharomyces cerevisiae.

    PubMed

    Kassir, Yona; Stuart, David T

    2017-01-01

    The budding yeast Saccharomyces cerevisiae has a long history as a model organism for studies of meiosis and the cell cycle. The popularity of this yeast as a model is in large part due to the variety of genetic and cytological approaches that can be effectively performed with the cells. Cultures of the cells can be induced to synchronously progress through meiosis and sporulation allowing large-scale gene expression and biochemical studies to be performed. Additionally, the spore tetrads resulting from meiosis make it possible to characterize the haploid products of meiosis allowing investigation of meiotic recombination and chromosome segregation. Here we describe genetic methods for analyzing the progression of S. cerevisiae through meiosis and sporulation with an emphasis on strategies for the genetic analysis of regulators of meiosis-specific genes.

  19. Cell-free protein synthesis: applications in proteomics and biotechnology.

    PubMed

    He, Mingyue

    2008-01-01

    Protein production is one of the key steps in biotechnology and functional proteomics. Expression of proteins in heterologous hosts (such as in E. coli) is generally lengthy and costly. Cell-free protein synthesis is thus emerging as an attractive alternative. In addition to the simplicity and speed for protein production, cell-free expression allows generation of functional proteins that are difficult to produce by in vivo systems. Recent exploitation of cell-free systems enables novel development of technologies for rapid discovery of proteins with desirable properties from very large libraries. This article reviews recent developments in cell-free systems and their application in large-scale protein analysis.

  20. Concordant integrative gene set enrichment analysis of multiple large-scale two-sample expression data sets.

    PubMed

    Lai, Yinglei; Zhang, Fanni; Nayak, Tapan K; Modarres, Reza; Lee, Norman H; McCaffrey, Timothy A

    2014-01-01

    Gene set enrichment analysis (GSEA) is an important approach to the analysis of coordinate expression changes at a pathway level. Although many statistical and computational methods have been proposed for GSEA, the issue of a concordant integrative GSEA of multiple expression data sets has not been well addressed. Among different related data sets collected for the same or similar study purposes, it is important to identify pathways or gene sets with concordant enrichment. We categorize the underlying true states of differential expression into three representative categories: no change, positive change and negative change. Due to data noise, what we observe from experiments may not indicate the underlying truth. Although these categories are not observed in practice, they can be considered in a mixture model framework. Then, we define the mathematical concept of concordant gene set enrichment and calculate its related probability based on a three-component multivariate normal mixture model. The related false discovery rate can be calculated and used to rank different gene sets. We used three published lung cancer microarray gene expression data sets to illustrate our proposed method. One analysis based on the first two data sets was conducted to compare our result with a previously published result based on a GSEA conducted separately for each individual data set. This comparison illustrates the advantage of our proposed concordant integrative gene set enrichment analysis. Then, with a relatively new and larger pathway collection, we used our method to conduct an integrative analysis of the first two data sets and also all three data sets. Both results showed that many gene sets could be identified with low false discovery rates. A consistency between both results was also observed. A further exploration based on the KEGG cancer pathway collection showed that a majority of these pathways could be identified by our proposed method. This study illustrates that we can improve detection power and discovery consistency through a concordant integrative analysis of multiple large-scale two-sample gene expression data sets.

  1. Transcriptional analysis of the Arabidopsis ovule by massively parallel signature sequencing

    PubMed Central

    Sánchez-León, Nidia; Arteaga-Vázquez, Mario; Alvarez-Mejía, César; Mendiola-Soto, Javier; Durán-Figueroa, Noé; Rodríguez-Leal, Daniel; Rodríguez-Arévalo, Isaac; García-Campayo, Vicenta; García-Aguilar, Marcelina; Olmedo-Monfil, Vianey; Arteaga-Sánchez, Mario; Martínez de la Vega, Octavio; Nobuta, Kan; Vemaraju, Kalyan; Meyers, Blake C.; Vielle-Calzada, Jean-Philippe

    2012-01-01

    The life cycle of flowering plants alternates between a predominant sporophytic (diploid) and an ephemeral gametophytic (haploid) generation that only occurs in reproductive organs. In Arabidopsis thaliana, the female gametophyte is deeply embedded within the ovule, complicating the study of the genetic and molecular interactions involved in the sporophytic to gametophytic transition. Massively parallel signature sequencing (MPSS) was used to conduct a quantitative large-scale transcriptional analysis of the fully differentiated Arabidopsis ovule prior to fertilization. The expression of 9775 genes was quantified in wild-type ovules, additionally detecting >2200 new transcripts mapping to antisense or intergenic regions. A quantitative comparison of global expression in wild-type and sporocyteless (spl) individuals resulted in 1301 genes showing 25-fold reduced or null activity in ovules lacking a female gametophyte, including those encoding 92 signalling proteins, 75 transcription factors, and 72 RNA-binding proteins not reported in previous studies based on microarray profiling. A combination of independent genetic and molecular strategies confirmed the differential expression of 28 of them, showing that they are either preferentially active in the female gametophyte, or dependent on the presence of a female gametophyte to be expressed in sporophytic cells of the ovule. Among 18 genes encoding pentatricopeptide-repeat proteins (PPRs) that show transcriptional activity in wild-type but not spl ovules, CIHUATEOTL (At4g38150) is specifically expressed in the female gametophyte and necessary for female gametogenesis. These results expand the nature of the transcriptional universe present in the ovule of Arabidopsis, and offer a large-scale quantitative reference of global expression for future genomic and developmental studies. PMID:22442422

  2. Transcriptional analysis of the Arabidopsis ovule by massively parallel signature sequencing.

    PubMed

    Sánchez-León, Nidia; Arteaga-Vázquez, Mario; Alvarez-Mejía, César; Mendiola-Soto, Javier; Durán-Figueroa, Noé; Rodríguez-Leal, Daniel; Rodríguez-Arévalo, Isaac; García-Campayo, Vicenta; García-Aguilar, Marcelina; Olmedo-Monfil, Vianey; Arteaga-Sánchez, Mario; de la Vega, Octavio Martínez; Nobuta, Kan; Vemaraju, Kalyan; Meyers, Blake C; Vielle-Calzada, Jean-Philippe

    2012-06-01

    The life cycle of flowering plants alternates between a predominant sporophytic (diploid) and an ephemeral gametophytic (haploid) generation that only occurs in reproductive organs. In Arabidopsis thaliana, the female gametophyte is deeply embedded within the ovule, complicating the study of the genetic and molecular interactions involved in the sporophytic to gametophytic transition. Massively parallel signature sequencing (MPSS) was used to conduct a quantitative large-scale transcriptional analysis of the fully differentiated Arabidopsis ovule prior to fertilization. The expression of 9775 genes was quantified in wild-type ovules, additionally detecting >2200 new transcripts mapping to antisense or intergenic regions. A quantitative comparison of global expression in wild-type and sporocyteless (spl) individuals resulted in 1301 genes showing 25-fold reduced or null activity in ovules lacking a female gametophyte, including those encoding 92 signalling proteins, 75 transcription factors, and 72 RNA-binding proteins not reported in previous studies based on microarray profiling. A combination of independent genetic and molecular strategies confirmed the differential expression of 28 of them, showing that they are either preferentially active in the female gametophyte, or dependent on the presence of a female gametophyte to be expressed in sporophytic cells of the ovule. Among 18 genes encoding pentatricopeptide-repeat proteins (PPRs) that show transcriptional activity in wild-type but not spl ovules, CIHUATEOTL (At4g38150) is specifically expressed in the female gametophyte and necessary for female gametogenesis. These results expand the nature of the transcriptional universe present in the ovule of Arabidopsis, and offer a large-scale quantitative reference of global expression for future genomic and developmental studies.

  3. GECKO: a complete large-scale gene expression analysis platform.

    PubMed

    Theilhaber, Joachim; Ulyanov, Anatoly; Malanthara, Anish; Cole, Jack; Xu, Dapeng; Nahf, Robert; Heuer, Michael; Brockel, Christoph; Bushnell, Steven

    2004-12-10

    Gecko (Gene Expression: Computation and Knowledge Organization) is a complete, high-capacity centralized gene expression analysis system, developed in response to the needs of a distributed user community. Based on a client-server architecture, with a centralized repository of typically many tens of thousands of Affymetrix scans, Gecko includes automatic processing pipelines for uploading data from remote sites, a database, a computational engine implementing approximately 50 different analysis tools, and a client application. Among available analysis tools are clustering methods, principal component analysis, supervised classification including feature selection and cross-validation, multi-factorial ANOVA, statistical contrast calculations, and various post-processing tools for extracting data at given error rates or significance levels. On account of its open architecture, Gecko also allows for the integration of new algorithms. The Gecko framework is very general: non-Affymetrix and non-gene expression data can be analyzed as well. A unique feature of the Gecko architecture is the concept of the Analysis Tree (actually, a directed acyclic graph), in which all successive results in ongoing analyses are saved. This approach has proven invaluable in allowing a large (approximately 100 users) and distributed community to share results, and to repeatedly return over a span of years to older and potentially very complex analyses of gene expression data. The Gecko system is being made publicly available as free software at http://sourceforge.net/projects/geckoe. In totality or in parts, the Gecko framework should prove useful to users and system developers with a broad range of analysis needs.
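
    The Analysis Tree described in this abstract is a directed acyclic graph in which every intermediate result is kept. One way such a structure might be represented is sketched below; the class name, fields and example step names are hypothetical and are not taken from the Gecko code base.

```python
class AnalysisNode:
    """One saved result in an analysis DAG, linked to the results it was derived from."""

    def __init__(self, name, result, parents=()):
        self.name = name
        self.result = result
        self.parents = tuple(parents)

    def lineage(self):
        """Names of all ancestors plus this node, with parents listed before children."""
        seen, order = set(), []

        def visit(node):
            if id(node) in seen:
                return  # a shared ancestor is reported only once
            seen.add(id(node))
            for parent in node.parents:
                visit(parent)
            order.append(node.name)

        visit(self)
        return order


# Hypothetical example: two analyses share one normalised data set.
raw = AnalysisNode("raw", result=[5.0, 7.0])
norm = AnalysisNode("normalized", result=[0.4, 0.6], parents=[raw])
clusters = AnalysisNode("clusters", result={"c1": [0]}, parents=[norm])
pca = AnalysisNode("pca", result=[[1.0]], parents=[norm])
report = AnalysisNode("report", result="summary", parents=[clusters, pca])
```

    Because each node keeps references to its inputs, an old result can be traced back to the exact steps that produced it, which is the property the abstract highlights for long-running shared analyses.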

  4. Identification and Characterization of Genomic Amplifications in Ovarian Serous Carcinoma

    DTIC Science & Technology

    2009-07-01

    oncogenes, Rsf1 and Notch3, which were up-regulated in both genomic DNA and transcript levels in ovarian cancer. In a large-scale FISH analysis, Rsf1...associated with worse disease outcome, suggesting that Rsf1 could be potentially used as a prognostic marker in the future (Appendix #1). For the...over-expressed in a recurrent carcinoma. Although the follow-up study in a larger-scale sample size did not demonstrate clear amplification in NAC1

  5. RNA sequencing: current and prospective uses in metabolic research.

    PubMed

    Vikman, Petter; Fadista, Joao; Oskolkov, Nikolay

    2014-10-01

    Previous global RNA analysis was restricted to known transcripts in species with a defined transcriptome. Next generation sequencing has transformed transcriptomics by making it possible to analyse expressed genes with an exon level resolution from any tissue in any species without any a priori knowledge of which genes are being expressed, their splice patterns or their nucleotide sequence. In addition, RNA sequencing is a more sensitive technique than microarrays, with a larger dynamic range, and it also allows for investigation of imprinting and allele-specific expression. This can be done for a cost that is able to compete with that of a microarray, making RNA sequencing a technique available to most researchers. Therefore RNA sequencing has recently become the state of the art with regard to large-scale RNA investigations and has to a large extent replaced microarrays. The only drawback is the large amount of data produced, which together with the complexity of the data can make a researcher spend far more time on analysis than on performing the actual experiment. © 2014 Society for Endocrinology.

  6. A genome scale metabolic network for rice and accompanying analysis of tryptophan, auxin and serotonin biosynthesis regulation under biotic stress

    USDA-ARS's Scientific Manuscript database

    Functional annotations of large plant genome projects mostly provide information on gene function and gene families based on the presence of protein domains and gene homology, but not necessarily in association with gene expression or metabolic and regulatory networks. These additional annotations a...

  7. Large-scale analysis of antisense transcription in wheat using the Affymetrix GeneChip Wheat Genome Array

    USDA-ARS's Scientific Manuscript database

    Natural antisense transcripts (NATs) are transcripts of the opposite DNA strand to the sense-strand either at the same locus (cis-encoded) or a different locus (trans-encoded). They can affect gene expression at multiple stages including transcription, RNA processing and transport, and translation....

  8. Crowdsourcing scoring of immunohistochemistry images: Evaluating Performance of the Crowd and an Automated Computational Method

    NASA Astrophysics Data System (ADS)

    Irshad, Humayun; Oh, Eun-Yeong; Schmolze, Daniel; Quintana, Liza M.; Collins, Laura; Tamimi, Rulla M.; Beck, Andrew H.

    2017-02-01

    The assessment of protein expression in immunohistochemistry (IHC) images provides important diagnostic, prognostic and predictive information for guiding cancer diagnosis and therapy. Manual scoring of IHC images represents a logistical challenge, as the process is labor intensive and time consuming. Since the last decade, computational methods have been developed to enable the application of quantitative methods for the analysis and interpretation of protein expression in IHC images. These methods have not yet replaced manual scoring for the assessment of IHC in the majority of diagnostic laboratories and in many large-scale research studies. An alternative approach is crowdsourcing the quantification of IHC images to an undefined crowd. The aim of this study is to quantify IHC images for labeling of ER status with two different crowdsourcing approaches, image-labeling and nuclei-labeling, and compare their performance with automated methods. Crowdsourcing-derived scores obtained greater concordance with the pathologist interpretations for both image-labeling and nuclei-labeling tasks (83% and 87%), as compared to the pathologist concordance achieved by the automated method (81%) on 5,338 TMA images from 1,853 breast cancer patients. This analysis shows that crowdsourcing the scoring of protein expression in IHC images is a promising new approach for large-scale cancer molecular pathology studies.

  9. ExprAlign - the identification of ESTs in non-model species by alignment of cDNA microarray expression profiles

    PubMed Central

    2009-01-01

    Background Sequence identification of ESTs from non-model species offers distinct challenges particularly when these species have duplicated genomes and when they are phylogenetically distant from sequenced model organisms. For the common carp, an environmental model of aquacultural interest, large numbers of ESTs remained unidentified using BLAST sequence alignment. We have used the expression profiles from large-scale microarray experiments to suggest gene identities. Results Expression profiles from ~700 cDNA microarrays describing responses of 7 major tissues to multiple environmental stressors were used to define a co-expression landscape. This was based on the Pearson's correlation coefficient relating each gene with all other genes, from which a network description provided clusters of highly correlated genes as 'mountains'. We show that these contain genes with known identities and genes with unknown identities, and that the correlation constitutes evidence of identity in the latter. This procedure has suggested identities for 522 of 2701 unknown carp EST sequences. We also discriminate several common carp genes and gene isoforms that were not discriminated by BLAST sequence alignment alone. Precision in identification was substantially improved by use of data from multiple tissues and treatments. Conclusion The detailed analysis of co-expression landscapes is a sensitive technique for suggesting an identity for the large number of BLAST unidentified cDNAs generated in EST projects. It is capable of detecting even subtle changes in expression profiles, and thereby of distinguishing genes with a common BLAST identity into different identities. It benefits from the use of multiple treatments or contrasts, and from the large-scale microarray data. PMID:19939286
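
    The core idea above, using correlation between expression profiles as evidence of identity, can be sketched in a few lines. This is a deliberate simplification and not the ExprAlign procedure itself: the abstract describes a network of correlation clusters, whereas the sketch below only pairs each unknown gene with its single best-correlated known gene. The gene names, profiles, and the 0.9 threshold are made-up assumptions.

```python
from math import sqrt

def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length profiles.
    Assumes neither profile is constant (non-zero variance)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def suggest_identities(profiles, known, threshold=0.9):
    """Pair each unannotated gene with its best-correlated annotated gene.
    `known` maps gene id -> annotation; the threshold is an arbitrary choice."""
    suggestions = {}
    for gene, profile in profiles.items():
        if gene in known:
            continue
        best = max(known, key=lambda k: pearson(profile, profiles[k]))
        r = pearson(profile, profiles[best])
        if r >= threshold:
            suggestions[gene] = (known[best], r)
    return suggestions

# Toy data: expression of three clones across four hypothetical conditions.
profiles = {
    "k1": [1.0, 2.0, 3.0, 4.0],
    "k2": [4.0, 3.0, 2.0, 1.0],
    "u1": [2.0, 4.0, 6.0, 8.5],   # unannotated clone
}
known = {"k1": "hsp70", "k2": "actin"}
suggestions = suggest_identities(profiles, known)
```

    Here the unannotated clone `u1` tracks `k1` closely, so it inherits the tentative label of `k1`. With real data, many arrays across tissues and treatments would be needed before such a correlation carries much weight, as the abstract itself notes.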

  10. The opportunities and challenges of large-scale molecular approaches to songbird neurobiology

    PubMed Central

    Mello, C.V.; Clayton, D.F.

    2014-01-01

    High-throughput methods for analyzing genome structure and function are having a large impact in songbird neurobiology. Methods include genome sequencing and annotation, comparative genomics, DNA microarrays and transcriptomics, and the development of a brain atlas of gene expression. Key emerging findings include the identification of complex transcriptional programs active during singing, the robust brain expression of non-coding RNAs, evidence of profound variations in gene expression across brain regions, and the identification of molecular specializations within song production and learning circuits. Current challenges include the statistical analysis of large datasets, effective genome curation, the efficient localization of gene expression changes to specific neuronal circuits and cells, and the dissection of behavioral and environmental factors that influence brain gene expression. The field requires efficient methods for comparisons with organisms like chicken, which offer important anatomical, functional and behavioral contrasts. As sequencing costs plummet, opportunities emerge for comparative approaches that may help reveal evolutionary transitions contributing to vocal learning, social behavior and other properties that make songbirds such compelling research subjects. PMID:25280907

  11. Directed module detection in a large-scale expression compendium.

    PubMed

    Fu, Qiang; Lemmens, Karen; Sanchez-Rodriguez, Aminael; Thijs, Inge M; Meysman, Pieter; Sun, Hong; Fierro, Ana Carolina; Engelen, Kristof; Marchal, Kathleen

    2012-01-01

    Public online microarray databases contain tremendous amounts of expression data. Mining these data sources can provide a wealth of information on the underlying transcriptional networks. In this chapter, we illustrate how the web services COLOMBOS and DISTILLER can be used to identify condition-dependent coexpression modules by exploring compendia of public expression data. COLOMBOS is designed for user-specified query-driven analysis, whereas DISTILLER generates a global regulatory network overview. The user is guided through both web services by means of a case study in which condition-dependent coexpression modules comprising a gene of interest (i.e., "directed") are identified.

  12. Combining classifiers to predict gene function in Arabidopsis thaliana using large-scale gene expression measurements.

    PubMed

    Lan, Hui; Carson, Rachel; Provart, Nicholas J; Bonner, Anthony J

    2007-09-21

    Arabidopsis thaliana is the model species of current plant genomic research with a genome size of 125 Mb and approximately 28,000 genes. The function of half of these genes is currently unknown. The purpose of this study is to infer gene function in Arabidopsis using machine-learning algorithms applied to large-scale gene expression data sets, with the goal of identifying genes that are potentially involved in plant response to abiotic stress. Using in house and publicly available data, we assembled a large set of gene expression measurements for A. thaliana. Using those genes of known function, we first evaluated and compared the ability of basic machine-learning algorithms to predict which genes respond to stress. Predictive accuracy was measured using ROC50 and precision curves derived through cross validation. To improve accuracy, we developed a method for combining these classifiers using a weighted-voting scheme. The combined classifier was then trained on genes of known function and applied to genes of unknown function, identifying genes that potentially respond to stress. Visual evidence corroborating the predictions was obtained using electronic Northern analysis. Three of the predicted genes were chosen for biological validation. Gene knockout experiments confirmed that all three are involved in a variety of stress responses. The biological analysis of one of these genes (At1g16850) is presented here, where it is shown to be necessary for the normal response to temperature and NaCl. Supervised learning methods applied to large-scale gene expression measurements can be used to predict gene function. However, the ability of basic learning methods to predict stress response varies widely and depends heavily on how much dimensionality reduction is used. 
Our method of combining classifiers can improve the accuracy of such predictions - in this case, predictions of genes involved in stress response in plants - and it effectively chooses the appropriate amount of dimensionality reduction automatically. The method provides a useful means of identifying genes in A. thaliana that potentially respond to stress, and we expect it would be useful in other organisms and for other gene functions.
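
    The weighted-voting step can be sketched as follows. The abstract does not say how the weights were chosen, so the sketch simply assumes one weight per classifier (for example, a cross-validated accuracy); the classifier names and numbers are invented for illustration.

```python
def weighted_vote(scores, weights):
    """Combine per-classifier scores for one gene into a single score.

    scores  -- classifier name -> predicted probability that the gene responds to stress
    weights -- classifier name -> non-negative weight (assumed here to reflect
               each classifier's cross-validated accuracy)
    """
    total = sum(weights[name] for name in scores)
    return sum(weights[name] * score for name, score in scores.items()) / total

# Hypothetical outputs of three base classifiers for one gene of unknown function.
scores = {"svm": 0.9, "knn": 0.4, "naive_bayes": 0.6}
weights = {"svm": 0.8, "knn": 0.6, "naive_bayes": 0.6}
combined = weighted_vote(scores, weights)   # (0.72 + 0.24 + 0.36) / 2.0
```

    A gene would then be flagged as a candidate when the combined score passes a cutoff chosen on held-out data; the cutoff is left out here because the source gives none.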

  13. Large scale systematic proteomic quantification from non-metastatic to metastatic colorectal cancer

    NASA Astrophysics Data System (ADS)

    Yin, Xuefei; Zhang, Yang; Guo, Shaowen; Jin, Hong; Wang, Wenhai; Yang, Pengyuan

    2015-07-01

    A large-scale systematic proteomic quantification of formalin-fixed, paraffin-embedded (FFPE) colorectal cancer tissues from stage I to stage IIIC was performed. With the label-free method, 1017 proteins were identified, of which 338 showed quantitative changes, while with the iTRAQ method 341 of 6294 proteins were quantified with significant expression changes. According to gene ontology (GO) annotation and ingenuity pathway analysis (IPA), expression of proteins related to migration increased and expression of those related to binding and adhesion decreased during colorectal cancer development. We focused on integrin alpha 5 (ITA5) of the integrin family, which was consistent with the metastasis-related pathway. The expression level of ITA5 decreased in metastatic tissues, and the result was further verified by Western blotting. Two other cell migration-related proteins, vitronectin (VTN) and actin-related protein (ARP3), were also shown to be up-regulated by both mass spectrometry (MS)-based quantification and Western blotting. To date, our results represent one of the largest datasets in colorectal cancer proteomics research. Our strategy reveals a disease-driven omics pattern for metastatic colorectal cancer.

  14. Purification and properties of insulin receptor ectodomain from large-scale mammalian cell culture.

    PubMed

    Cosgrove, L; Lovrecz, G O; Verkuylen, A; Cavaleri, L; Black, L A; Bentley, J D; Howlett, G J; Gray, P P; Ward, C W; McKern, N M

    1995-12-01

    Ectodomain of the exon 11+ form of the human insulin receptor (hIR) was expressed in the mammalian cell secretion vector pEE6.HCMV-GS, containing the glutamine synthetase gene. Following transfection of the hIR ectodomain gene into Chinese hamster ovary (CHO-K1) cells, clones were isolated by selecting for glutamine synthetase expression with methionine sulphoximine. The expression levels of ectodomain were subsequently increased by gene amplification. Production was scaled up using a 40-liter airlift fermenter in which the transfected CHO-K1 cells were cultured on microcarrier beads, initially in medium containing 10% fetal calf serum (FCS). By continuous perfusion of serum-free medium into the bioreactor, cell viability was maintained during reduction of FCS, which enabled soluble hIR ectodomain to be harvested for at least 22 days. Harvests were concentrated 20-fold by anion-exchange chromatography. Optimal recovery of ectodomain from early harvests containing large quantities of serum proteins was achieved by insulin-affinity chromatography, whereas in later harvests purification was achieved by multistep chromatography. Analysis of the purified hIR ectodomain showed that it had a molecular weight by sedimentation equilibrium analysis of 269,500. Amino-terminal amino acid sequence analysis showed that the ectodomain was correctly processed to alpha and beta chains and that glycosylation characteristics were similar to those of native hIR. The integrity of the ectodomain was demonstrated by the recognition of conformation-dependent anti-hIR antibodies and by its binding of insulin (Kd approximately 2 x 10(-9) M). These results demonstrate the successful production and purification of hIR ectodomain by processes amenable to scale-up and in a form appropriate for structure/function studies of the ligand-binding domain of the receptor.

  15. Identification of novel diagnostic biomarkers for thyroid carcinoma.

    PubMed

    Wang, Xiliang; Zhang, Qing; Cai, Zhiming; Dai, Yifan; Mou, Lisha

    2017-12-19

    Thyroid carcinoma (THCA) is the most common endocrine malignancy worldwide. Unfortunately, only a limited number of large-scale analyses have been performed to identify biomarkers for THCA. Here, we conducted a meta-analysis using 505 THCA patients and 59 normal controls from The Cancer Genome Atlas. After identifying differentially expressed long non-coding RNAs (lncRNA) and protein coding genes (PCG), we found large differences in various lncRNA-PCG co-expressed pairs in THCA. A dysregulation network with scale-free topology was constructed. Four molecules (LA16c-380H5.2, RP11-203J24.8, MLF1 and SDC4) could potentially serve as diagnostic biomarkers of THCA with high sensitivity and specificity. We further present a diagnostic panel with expression cutoff values. Our results demonstrate the potential application of those four molecules as novel independent biomarkers for THCA diagnosis.

  16. Statistical Analysis of Big Data on Pharmacogenomics

    PubMed Central

    Fan, Jianqing; Liu, Han

    2013-01-01

    This paper discusses statistical methods for estimating complex correlation structure from large pharmacogenomic datasets. We selectively review several prominent statistical methods for estimating large covariance matrices for understanding correlation structure, inverse covariance matrices for network modeling, large-scale simultaneous tests for selecting significantly differentially expressed genes and proteins and genetic markers for complex diseases, and high dimensional variable selection for identifying important molecules for understanding molecular mechanisms in pharmacogenomics. Their applications to gene network estimation and biomarker selection are used to illustrate the methodological power. Several new challenges of Big data analysis, including complex data distribution, missing data, measurement error, spurious correlation, endogeneity, and the need for robust statistical methods, are also discussed. PMID:23602905

  17. Bi-Force: large-scale bicluster editing and its application to gene expression data biclustering

    PubMed Central

    Sun, Peng; Speicher, Nora K.; Röttger, Richard; Guo, Jiong; Baumbach, Jan

    2014-01-01

    The explosion of biological data has dramatically reformed today's biological research. The need to integrate and analyze high-dimensional biological data on a large scale is driving the development of novel bioinformatics approaches. Biclustering, also known as ‘simultaneous clustering’ or ‘co-clustering’, has been successfully utilized to discover local patterns in gene expression data and similar biomedical data types. Here, we contribute a new heuristic: ‘Bi-Force’. It is based on the weighted bicluster editing model, to perform biclustering on arbitrary sets of biological entities, given any kind of pairwise similarities. We first evaluated the power of Bi-Force to solve dedicated bicluster editing problems by comparing Bi-Force with two existing algorithms in the BiCluE software package. We then followed a biclustering evaluation protocol in a recent review paper from Eren et al. (2013) (A comparative analysis of biclustering algorithms for gene expression data. Brief. Bioinform., 14:279–292.) and compared Bi-Force against eight existing tools: FABIA, QUBIC, Cheng and Church, Plaid, BiMax, Spectral, xMOTIFs and ISA. To this end, a suite of synthetic datasets as well as nine large gene expression datasets from Gene Expression Omnibus were analyzed. All resulting biclusters were subsequently investigated by Gene Ontology enrichment analysis to evaluate their biological relevance. The distinct theoretical foundation of Bi-Force (bicluster editing) is more powerful than strict biclustering. We thus outperformed existing tools with Bi-Force at least when following the evaluation protocols from Eren et al. Bi-Force is implemented in Java and integrated into the open source software package of BiCluE. The software as well as all used datasets are publicly available at http://biclue.mpi-inf.mpg.de. PMID:24682815

  18. From protein-protein interactions to protein co-expression networks: a new perspective to evaluate large-scale proteomic data.

    PubMed

    Vella, Danila; Zoppis, Italo; Mauri, Giancarlo; Mauri, Pierluigi; Di Silvestre, Dario

    2017-12-01

    The reductionist approach of dissecting biological systems into their constituents has been successful in the first stage of the molecular biology to elucidate the chemical basis of several biological processes. This knowledge helped biologists to understand the complexity of the biological systems evidencing that most biological functions do not arise from individual molecules; thus, realizing that the emergent properties of the biological systems cannot be explained or be predicted by investigating individual molecules without taking into consideration their relations. Thanks to the improvement of the current -omics technologies and the increasing understanding of the molecular relationships, even more studies are evaluating the biological systems through approaches based on graph theory. Genomic and proteomic data are often combined with protein-protein interaction (PPI) networks whose structure is routinely analyzed by algorithms and tools to characterize hubs/bottlenecks and topological, functional, and disease modules. On the other hand, co-expression networks represent a complementary procedure that give the opportunity to evaluate at system level including organisms that lack information on PPIs. Based on these premises, we introduce the reader to the PPI and to the co-expression networks, including aspects of reconstruction and analysis. In particular, the new idea to evaluate large-scale proteomic data by means of co-expression networks will be discussed presenting some examples of application. Their use to infer biological knowledge will be shown, and a special attention will be devoted to the topological and module analysis.

  19. From Coexpression to Coregulation: An Approach to Inferring Transcriptional Regulation Among Gene Classes from Large-Scale Expression Data

    NASA Technical Reports Server (NTRS)

    Mjolsness, Eric; Castano, Rebecca; Mann, Tobias; Wold, Barbara

    2000-01-01

    We provide preliminary evidence that existing algorithms for inferring small-scale gene regulation networks from gene expression data can be adapted to large-scale gene expression data coming from hybridization microarrays. The essential steps are (1) clustering many genes by their expression time-course data into a minimal set of clusters of co-expressed genes, (2) theoretically modeling the various conditions under which the time-courses are measured using a continuous-time analog recurrent neural network for the cluster mean time-courses, (3) fitting such a regulatory model to the cluster mean time courses by simulated annealing with weight decay, and (4) analysing several such fits for commonalities in the circuit parameter sets including the connection matrices. This procedure can be used to assess the adequacy of existing and future gene expression time-course data sets for determining transcriptional regulatory relationships such as coregulation.
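
    Step (2) above refers to a continuous-time analog recurrent neural network for the cluster mean time-courses. The abstract does not give the equations, so the sketch below assumes one commonly used form, tau * dv_i/dt = g(sum_j T_ij * v_j + h_i) - lambda * v_i with a logistic g, and integrates it with a plain Euler step. The parameter names, values, and the choice of integrator are assumptions for illustration, and the fitting of step (3) is not shown.

```python
from math import exp

def sigmoid(u):
    """Logistic activation, used here as the assumed saturating function g."""
    return 1.0 / (1.0 + exp(-u))

def simulate(T, h, v0, tau=1.0, lam=1.0, dt=0.01, steps=2000):
    """Euler-integrate tau * dv_i/dt = g(sum_j T[i][j] * v[j] + h[i]) - lam * v[i].

    T  -- connection matrix between clusters (list of rows)
    h  -- per-cluster bias terms
    v0 -- initial cluster mean expression levels
    Returns the list of states, one tuple per time step (including the start).
    """
    v = list(v0)
    trajectory = [tuple(v)]
    n = len(v)
    for _ in range(steps):
        u = [sum(T[i][j] * v[j] for j in range(n)) + h[i] for i in range(n)]
        v = [v[i] + (dt / tau) * (sigmoid(u[i]) - lam * v[i]) for i in range(n)]
        trajectory.append(tuple(v))
    return trajectory

# With no connections and zero bias, each cluster relaxes to g(0) / lam = 0.5.
trajectory = simulate(T=[[0.0, 0.0], [0.0, 0.0]], h=[0.0, 0.0], v0=[0.0, 1.0])
final_state = trajectory[-1]
```

    Fitting would then search over `T`, `h`, `tau` and `lam` to make simulated trajectories match the measured cluster means; the abstract names simulated annealing with weight decay for that search.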

  20. Molecular phenotype of zebrafish ovarian follicle by serial analysis of gene expression and proteomic profiling, and comparison with the transcriptomes of other animals

    PubMed Central

    Knoll-Gellida, Anja; André, Michèle; Gattegno, Tamar; Forgue, Jean; Admon, Arie; Babin, Patrick J

    2006-01-01

    Background The ability of an oocyte to develop into a viable embryo depends on the accumulation of specific maternal information and molecules, such as RNAs and proteins. A serial analysis of gene expression (SAGE) was carried out in parallel with proteomic analysis on fully-grown ovarian follicles from zebrafish (Danio rerio). The data obtained were compared with ovary/follicle/egg molecular phenotypes of other animals, published or available in public sequence databases. Results Sequencing of 27,486 SAGE tags identified 11,399 different ones, including 3,329 tags with an occurrence superior to one. Fifty-eight genes were expressed at over 0.15% of the total population and represented 17.34% of the mRNA population identified. The three most expressed transcripts were a rhamnose-binding lectin, beta-actin 2, and a transcribed locus similar to the H2B histone family. Comparison with the large-scale expressed sequence tags sequencing approach revealed highly expressed transcripts that were not previously known to be expressed at high levels in fish ovaries, like the short-sized polarized metallothionein 2 transcript. A higher sensitivity for the detection of transcripts with a characterized maternal genetic contribution was also demonstrated compared to large-scale sequencing of cDNA libraries. Ferritin heavy polypeptide 1, heat shock protein 90-beta, lactate dehydrogenase B4, beta-actin isoforms, tubulin beta 2, ATP synthase subunit 9, together with 40 S ribosomal protein S27a, were common highly-expressed transcripts of vertebrate ovary/unfertilized egg. Comparison of transcriptome and proteome data revealed that transcript levels provide little predictive value with respect to the extent of protein abundance. All the proteins identified by proteomic analysis of fully-grown zebrafish follicles had at least one transcript counterpart, with two exceptions: eosinophil chemotactic cytokine and nothepsin. 
Conclusion This study provides a complete sequence data set of maternal mRNA stored in zebrafish germ cells at the end of oogenesis. This catalogue contains highly-expressed transcripts that are part of a vertebrate ovarian expressed gene signature. Comparison of transcriptome and proteome data identified downregulated transcripts or proteins potentially incorporated in the oocyte by endocytosis. The molecular phenotype described provides groundwork for future experimental approaches aimed at identifying functionally important stored maternal transcripts and proteins involved in oogenesis and early stages of embryo development. PMID:16526958

  1. Large Scale Bacterial Colony Screening of Diversified FRET Biosensors

    PubMed Central

    Litzlbauer, Julia; Schifferer, Martina; Ng, David; Fabritius, Arne; Thestrup, Thomas; Griesbeck, Oliver

    2015-01-01

    Biosensors based on Förster Resonance Energy Transfer (FRET) between fluorescent protein mutants have started to revolutionize physiology and biochemistry. However, many types of FRET biosensors show relatively small FRET changes, making measurements with these probes challenging when used under sub-optimal experimental conditions. Thus, a major effort in the field currently lies in designing new optimization strategies for these types of sensors. Here we describe procedures for optimizing FRET changes by large scale screening of mutant biosensor libraries in bacterial colonies. We describe optimization of biosensor expression, permeabilization of bacteria, software tools for analysis, and screening conditions. The procedures reported here may help in improving FRET changes in multiple suitable classes of biosensors. PMID:26061878

  2. A regulation probability model-based meta-analysis of multiple transcriptomics data sets for cancer biomarker identification.

    PubMed

    Xie, Xin-Ping; Xie, Yu-Feng; Wang, Hong-Qiang

    2017-08-23

    Large-scale accumulation of omics data poses a pressing challenge of integrative analysis of multiple data sets in bioinformatics. An open question of such integrative analysis is how to pinpoint consistent but subtle gene activity patterns across studies. Study heterogeneity needs to be addressed carefully for this goal. This paper proposes a regulation probability model-based meta-analysis, jGRP, for identifying differentially expressed genes (DEGs). The method integrates multiple transcriptomics data sets in a gene regulatory space instead of in a gene expression space, which makes it easy to capture and manage data heterogeneity across studies from different laboratories or platforms. Specifically, we transform gene expression profiles into a united gene regulation profile across studies by mathematically defining two gene regulation events between two conditions and estimating their occurring probabilities in a sample. Finally, a novel differential expression statistic is established based on the gene regulation profiles, realizing accurate and flexible identification of DEGs in gene regulation space. We evaluated the proposed method on simulation data and real-world cancer datasets and showed the effectiveness and efficiency of jGRP in identifying DEGs in the context of meta-analysis. Data heterogeneity largely influences the performance of meta-analysis for DEG identification. Existing meta-analysis methods were found to exhibit very different degrees of sensitivity to study heterogeneity. The proposed method, jGRP, can be a standalone tool due to its united framework and controllable way to deal with study heterogeneity.

  3. A powerful nonparametric method for detecting differentially co-expressed genes: distance correlation screening and edge-count test.

    PubMed

    Zhang, Qingyang

    2018-05-16

    Differential co-expression analysis, as a complement of differential expression analysis, offers significant insights into the changes in molecular mechanism of different phenotypes. A prevailing approach to detecting differentially co-expressed genes is to compare Pearson's correlation coefficients in two phenotypes. However, due to the limitations of Pearson's correlation measure, this approach lacks the power to detect nonlinear changes in gene co-expression which is common in gene regulatory networks. In this work, a new nonparametric procedure is proposed to search differentially co-expressed gene pairs in different phenotypes from large-scale data. Our computational pipeline consisted of two main steps, a screening step and a testing step. The screening step is to reduce the search space by filtering out all the independent gene pairs using distance correlation measure. In the testing step, we compare the gene co-expression patterns in different phenotypes by a recently developed edge-count test. Both steps are distribution-free and targeting nonlinear relations. We illustrate the promise of the new approach by analyzing the Cancer Genome Atlas data and the METABRIC data for breast cancer subtypes. Compared with some existing methods, the new method is more powerful in detecting nonlinear type of differential co-expressions. The distance correlation screening can greatly improve computational efficiency, facilitating its application to large data sets.
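
    The screening step relies on distance correlation, which, unlike Pearson's coefficient, can detect nonlinear dependence. A minimal pure-Python sketch of the sample distance correlation for two univariate profiles is given below, following the standard double-centering definition. It illustrates the measure only; it is not the author's pipeline, and the edge-count test of the second step is not shown. The toy data are invented.

```python
from math import sqrt

def _centered_distances(x):
    """Pairwise |x_i - x_j| matrix, double-centered (row, column and grand means removed)."""
    n = len(x)
    d = [[abs(a - b) for b in x] for a in x]
    row_means = [sum(row) / n for row in d]
    grand_mean = sum(row_means) / n
    # The distance matrix is symmetric, so column means equal row means.
    return [[d[i][j] - row_means[i] - row_means[j] + grand_mean for j in range(n)]
            for i in range(n)]

def distance_correlation(x, y):
    """Sample distance correlation of two equal-length numeric sequences (0 to 1)."""
    n = len(x)
    A, B = _centered_distances(x), _centered_distances(y)
    dcov2 = sum(A[i][j] * B[i][j] for i in range(n) for j in range(n)) / n ** 2
    dvar_x = sum(a * a for row in A for a in row) / n ** 2
    dvar_y = sum(b * b for row in B for b in row) / n ** 2
    if dvar_x == 0 or dvar_y == 0:
        return 0.0          # a constant profile carries no dependence information
    return sqrt(max(dcov2, 0.0) / sqrt(dvar_x * dvar_y))

# A purely quadratic relation: Pearson's r is exactly 0 here, distance correlation is not.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [v * v for v in x]
dcor_xy = distance_correlation(x, y)
```

    Gene pairs whose distance correlation is near zero in every phenotype would be dropped before the more expensive testing step. Note that this direct implementation is quadratic in the number of samples per pair.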

  4. Thymidylate synthase (TS) gene expression in primary lung cancer patients: a large-scale study in Japanese population.

    PubMed

    Tanaka, F; Wada, H; Fukui, Y; Fukushima, M

    2011-08-01

    Previous small-sized studies showed lower thymidylate synthase (TS) expression in adenocarcinoma of the lung, which may explain higher antitumor activity of TS-inhibiting agents such as pemetrexed. To quantitatively measure TS gene expression in a large-scale Japanese population (n = 2621) with primary lung cancer, laser-captured microdissected sections were cut from primary tumors, surrounding normal lung tissues and involved nodes. TS gene expression level in primary tumor was significantly higher than that in normal lung tissue (mean TS/β-actin, 3.4 and 1.0, respectively; P < 0.01), and TS gene expression level was further higher in involved node (mean TS/β-actin, 7.7; P < 0.01). Analyses of TS gene expression levels in primary tumor according to histologic cell type revealed that small-cell carcinoma showed highest TS expression (mean TS/β-actin, 13.8) and that squamous cell carcinoma showed higher TS expression as compared with adenocarcinoma (mean TS/β-actin, 4.3 and 2.3, respectively; P < 0.01); TS gene expression was significantly increased along with a decrease in the grade of tumor cell differentiation. There was no significant difference in TS gene expression according to any other patient characteristics including tumor progression. Lower TS expression in adenocarcinoma of the lung was confirmed in a large-scale study.

  5. Engineering of Baeyer-Villiger monooxygenase-based Escherichia coli biocatalyst for large scale biotransformation of ricinoleic acid into (Z)-11-(heptanoyloxy)undec-9-enoic acid

    PubMed Central

    Seo, Joo-Hyun; Kim, Hwan-Hee; Jeon, Eun-Yeong; Song, Young-Ha; Shin, Chul-Soo; Park, Jin-Byung

    2016-01-01

    Baeyer-Villiger monooxygenases (BVMOs) are able to catalyze regiospecific Baeyer-Villiger oxygenation of a variety of cyclic and linear ketones to generate the corresponding lactones and esters, respectively. However, the enzymes are usually difficult to express in a functional form in microbial cells and are rather unstable under process conditions hindering their large-scale applications. Thereby, we investigated engineering of the BVMO from Pseudomonas putida KT2440 and the gene expression system to improve its activity and stability for large-scale biotransformation of ricinoleic acid (1) into the ester (i.e., (Z)-11-(heptanoyloxy)undec-9-enoic acid) (3), which can be hydrolyzed into 11-hydroxyundec-9-enoic acid (5) (i.e., a precursor of polyamide-11) and n-heptanoic acid (4). The polyionic tag-based fusion engineering of the BVMO and the use of a synthetic promoter for constitutive enzyme expression allowed the recombinant Escherichia coli expressing the BVMO and the secondary alcohol dehydrogenase of Micrococcus luteus to produce the ester (3) to 85 mM (26.6 g/L) within 5 h. The 5 L scale biotransformation process was then successfully scaled up to a 70 L bioreactor; 3 was produced to over 70 mM (21.9 g/L) in the culture medium 6 h after biotransformation. This study demonstrated that the BVMO-based whole-cell reactions can be applied for large-scale biotransformations. PMID:27311560

  6. Techno-economic analysis of horseradish peroxidase production using a transient expression system in Nicotiana benthamiana.

    PubMed

    Walwyn, David Richard; Huddy, Suzanne M; Rybicki, Edward P

    2015-01-01

    Despite the advantages of plant-based transient expression systems relative to microbial or mammalian cell systems, the commercial production of recombinant proteins using plants has not yet been achieved to any significant extent. One of the challenges has been the lack of published data on the costs of manufacture for products other than biopharmaceuticals. In this study, we report on the techno-economic analysis of the production of a standard commercial enzyme, namely, horseradish peroxidase (HRP), using a transient expression system in Nicotiana benthamiana. Based on the proven plant yield of 240 mg HRP/kg biomass, a biomass productivity of 15 kg biomass/m²/year and a process yield of 54% (mg HRP product/mg HRP in biomass), it is apparent that HRP can be manufactured economically via transient expression in plants in a large-scale facility (>5 kg HRP/year). At this level, the process is competitive versus the existing technology (extraction of the enzyme from horseradish), and the product is of comparable or improved activity, containing only the preferred isoenzyme C. Production scale, protein yield and biomass productivity are found to be the most important determinants of overall viability.
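
    The three figures quoted above can be combined into a rough area estimate. The sketch below is only arithmetic on the numbers in the abstract; the function names are illustrative, and it assumes the yields multiply directly with no further losses, which the abstract does not state.

```python
# Figures quoted in the abstract above.
PLANT_YIELD_MG_PER_KG = 240.0    # mg HRP per kg biomass
BIOMASS_KG_PER_M2_YEAR = 15.0    # kg biomass per m^2 per year
PROCESS_YIELD = 0.54             # mg HRP product per mg HRP in biomass
TARGET_KG_PER_YEAR = 5.0         # lower bound of a "large-scale" facility


def hrp_output_mg_per_m2_year(plant_yield, biomass_productivity, process_yield):
    """Purified HRP per square metre of growing area per year, in mg."""
    return plant_yield * biomass_productivity * process_yield


def area_needed_m2(target_kg_per_year, output_mg_per_m2_year):
    """Growing area needed to reach the annual target (1 kg = 1e6 mg)."""
    return target_kg_per_year * 1_000_000 / output_mg_per_m2_year


per_m2 = hrp_output_mg_per_m2_year(
    PLANT_YIELD_MG_PER_KG, BIOMASS_KG_PER_M2_YEAR, PROCESS_YIELD
)
area = area_needed_m2(TARGET_KG_PER_YEAR, per_m2)
print(f"{per_m2:.0f} mg HRP per m^2 per year; about {area:.0f} m^2 for 5 kg/year")
```

    Under these assumptions the output is a little under 2 g of purified HRP per square metre per year, so the 5 kg/year threshold corresponds to roughly 2,600 m² of growing area.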

  7. Functional regression method for whole genome eQTL epistasis analysis with sequencing data.

    PubMed

    Xu, Kelin; Jin, Li; Xiong, Momiao

    2017-05-18

    Epistasis plays an essential role in understanding the regulation mechanisms and is an essential component of the genetic architecture of the gene expressions. However, interaction analysis of gene expressions remains fundamentally unexplored due to great computational challenges and data availability. Due to variation in splicing, transcription start sites, polyadenylation sites, post-transcriptional RNA editing across the entire gene, and transcription rates of the cells, RNA-seq measurements generate large expression variability and collectively create the observed position level read count curves. A single number for measuring gene expression which is widely used for microarray measured gene expression analysis is highly unlikely to sufficiently account for large expression variation across the gene. Simultaneously analyzing epistatic architecture using the RNA-seq and whole genome sequencing (WGS) data poses enormous challenges. We develop a nonlinear functional regression model (FRGM) with functional responses where the position-level read counts within a gene are taken as a function of genomic position, and functional predictors where genotype profiles are viewed as a function of genomic position, for epistasis analysis with RNA-seq data. Instead of testing the interaction of all possible pairwise SNPs, the FRGM takes a gene as a basic unit for epistasis analysis, which tests for the interaction of all possible pairs of genes and uses all the information that can be accessed to collectively test interaction between all possible pairs of SNPs within two genome regions. By large-scale simulations, we demonstrate that the proposed FRGM for epistasis analysis can achieve the correct type 1 error and has higher power to detect the interactions between genes than the existing methods. The proposed methods are applied to the RNA-seq and WGS data from the 1000 Genomes Project. The numbers of pairs of significantly interacting genes after Bonferroni correction identified using FRGM, RPKM and DESeq were 16,2361, 260 and 51, respectively, from the 350 European samples. The proposed FRGM for epistasis analysis of RNA-seq can capture isoform and position-level information and will have a broad application. Both simulations and real data analysis highlight the potential for the FRGM to be a good choice for epistasis analysis with sequencing data.

  8. Authentic Research Experience and “Big Data” Analysis in the Classroom: Maize Response to Abiotic Stress

    PubMed Central

    Makarevitch, Irina; Frechette, Cameo; Wiatros, Natalia

    2015-01-01

    Integration of inquiry-based approaches into curriculum is transforming the way science is taught and studied in undergraduate classrooms. Incorporating quantitative reasoning and mathematical skills into authentic biology undergraduate research projects has been shown to benefit students in developing various skills necessary for future scientists and to attract students to science, technology, engineering, and mathematics disciplines. While large-scale data analysis has become an essential part of modern biological research, students have few opportunities to engage in analysis of large biological data sets. RNA-seq analysis, a tool that allows precise measurement of the level of gene expression for all genes in a genome, revolutionized molecular biology and provides ample opportunities for engaging students in authentic research. We developed, implemented, and assessed a series of authentic research laboratory exercises incorporating a large data RNA-seq analysis into an introductory undergraduate classroom. Our laboratory series is focused on analyzing gene expression changes in response to abiotic stress in maize seedlings; however, it could be easily adapted to the analysis of any other biological system with available RNA-seq data. Objective and subjective assessment of student learning demonstrated gains in understanding important biological concepts and in skills related to the process of science. PMID:26163561

  9. Interdisciplinary Team Science in Cell Biology.

    PubMed

    Horwitz, Rick

    2016-11-01

    The cell is complex. With its multitude of components, spatial-temporal character, and gene expression diversity, it is challenging to comprehend the cell as an integrated system and to develop models that predict its behaviors. I suggest an approach to address this issue, involving system level data analysis, large scale team science, and philanthropy. Copyright © 2016 Elsevier Ltd. All rights reserved.

  10. VLSI Microsystem for Rapid Bioinformatic Pattern Recognition

    NASA Technical Reports Server (NTRS)

    Fang, Wai-Chi; Lue, Jaw-Chyng

    2009-01-01

    A system comprising very-large-scale integrated (VLSI) circuits is being developed as a means of bioinformatics-oriented analysis and recognition of patterns of fluorescence generated in a microarray in an advanced, highly miniaturized, portable genetic-expression-assay instrument. Such an instrument implements an on-chip combination of polymerase chain reactions and electrochemical transduction for amplification and detection of deoxyribonucleic acid (DNA).

  11. Identification of novel diagnostic biomarkers for thyroid carcinoma

    PubMed Central

    Wang, Xiliang; Zhang, Qing; Cai, Zhiming; Dai, Yifan; Mou, Lisha

    2017-01-01

    Thyroid carcinoma (THCA) is the most universal endocrine malignancy worldwide. Unfortunately, a limited number of large-scale analyses have been performed to identify biomarkers for THCA. Here, we conducted a meta-analysis using 505 THCA patients and 59 normal controls from The Cancer Genome Atlas. After identifying differentially expressed long non-coding RNA (lncRNA) and protein coding genes (PCG), we found vast difference in various lncRNA-PCG co-expressed pairs in THCA. A dysregulation network with scale-free topology was constructed. Four molecules (LA16c-380H5.2, RP11-203J24.8, MLF1 and SDC4) could potentially serve as diagnostic biomarkers of THCA with high sensitivity and specificity. We further represent a diagnostic panel with expression cutoff values. Our results demonstrate the potential application of those four molecules as novel independent biomarkers for THCA diagnosis. PMID:29340074

  12. A combinatorial code for pattern formation in Drosophila oogenesis.

    PubMed

    Yakoby, Nir; Bristow, Christopher A; Gong, Danielle; Schafer, Xenia; Lembong, Jessica; Zartman, Jeremiah J; Halfon, Marc S; Schüpbach, Trudi; Shvartsman, Stanislav Y

    2008-11-01

    Two-dimensional patterning of the follicular epithelium in Drosophila oogenesis is required for the formation of three-dimensional eggshell structures. Our analysis of a large number of published gene expression patterns in the follicle cells suggests that they follow a simple combinatorial code based on six spatial building blocks and the operations of union, difference, intersection, and addition. The building blocks are related to the distribution of inductive signals, provided by the highly conserved epidermal growth factor receptor and bone morphogenetic protein signaling pathways. We demonstrate the validity of the code by testing it against a set of patterns obtained in a large-scale transcriptional profiling experiment. Using the proposed code, we distinguish 36 distinct patterns for 81 genes expressed in the follicular epithelium and characterize their joint dynamics over four stages of oogenesis. The proposed combinatorial framework allows systematic analysis of the diversity and dynamics of two-dimensional transcriptional patterns and guides future studies of gene regulation.
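
    The idea of composing patterns from a few spatial primitives can be sketched with ordinary set operations. The three blocks below are hypothetical stand-ins on a toy strip of ten cell positions; the six real building blocks and their shapes are defined in the paper, not here. Addition is modelled, as an assumption, by summing expression levels where blocks overlap.

```python
from collections import Counter

# Hypothetical building blocks on a toy 1-D strip of 10 cell positions.
DORSAL = frozenset(range(0, 6))     # positions 0..5
MIDLINE = frozenset(range(2, 4))    # positions 2..3
ANTERIOR = frozenset(range(4, 10))  # positions 4..9

# Three of the four operations map directly onto set algebra.
union = DORSAL | ANTERIOR           # expressed in either domain
difference = DORSAL - MIDLINE       # dorsal domain with the midline removed
intersection = DORSAL & ANTERIOR    # expressed only where both overlap


def add_patterns(*blocks):
    """Addition: overlapping blocks sum their expression levels."""
    levels = Counter()
    for block in blocks:
        levels.update(block)        # each position gains one unit per block
    return dict(levels)


summed = add_patterns(DORSAL, ANTERIOR)
print(sorted(difference), sorted(intersection), summed[4])
```

    Union and addition differ only where blocks overlap: union records presence, while addition records a higher level in the overlap.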

  13. Reliable and efficient solution of genome-scale models of Metabolism and macromolecular Expression

    DOE PAGES

    Ma, Ding; Yang, Laurence; Fleming, Ronan M. T.; ...

    2017-01-18

    Currently, Constraint-Based Reconstruction and Analysis (COBRA) is the only methodology that permits integrated modeling of Metabolism and macromolecular Expression (ME) at genome-scale. Linear optimization computes steady-state flux solutions to ME models, but flux values are spread over many orders of magnitude. Data values also have greatly varying magnitudes. Furthermore, standard double-precision solvers may return inaccurate solutions or report that no solution exists. Exact simplex solvers based on rational arithmetic require a near-optimal warm start to be practical on large problems (current ME models have 70,000 constraints and variables and will grow larger). We also developed a quadruple-precision version of our linear and nonlinear optimizer MINOS, and a solution procedure (DQQ) involving Double and Quad MINOS that achieves reliability and efficiency for ME models and other challenging problems tested here. DQQ will enable extensive use of large linear and nonlinear models in systems biology and other applications involving multiscale data.
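
    Why values spread over many orders of magnitude trouble double precision can be seen in a three-term sum. The sketch below is not MINOS or DQQ. It uses Python's `fractions` module for exact rational arithmetic and, as an assumption, a 34-digit `decimal` context as a rough stand-in for quadruple precision (IEEE binary128 carries about 34 significant decimal digits).

```python
from decimal import Decimal, localcontext
from fractions import Fraction

# Toy "fluxes" spanning 16 orders of magnitude; the true sum is exactly 1.
FLUXES = [1e16, 1.0, -1e16]


def naive_float_sum(values):
    """Left-to-right accumulation in IEEE double precision."""
    total = 0.0
    for v in values:
        total += v          # 1e16 + 1.0 rounds back to 1e16 in double
    return total


def exact_sum(values):
    """Exact rational arithmetic: every double converts to a Fraction exactly."""
    return sum((Fraction(v) for v in values), Fraction(0))


def wide_decimal_sum(values, digits=34):
    """Accumulate with about 34 significant digits (stand-in for quad)."""
    with localcontext() as ctx:
        ctx.prec = digits
        total = Decimal(0)
        for v in values:
            total += Decimal(v)
        return total


print(naive_float_sum(FLUXES), exact_sum(FLUXES), wide_decimal_sum(FLUXES))
```

    The double-precision loop loses the small term entirely, while both wider representations recover it. An explicit loop is used rather than the built-in `sum`, because recent Python versions apply compensated summation to floats and would hide the effect.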

  14. Reliable and efficient solution of genome-scale models of Metabolism and macromolecular Expression

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ma, Ding; Yang, Laurence; Fleming, Ronan M. T.

    Currently, Constraint-Based Reconstruction and Analysis (COBRA) is the only methodology that permits integrated modeling of Metabolism and macromolecular Expression (ME) at genome-scale. Linear optimization computes steady-state flux solutions to ME models, but flux values are spread over many orders of magnitude. Data values also have greatly varying magnitudes. Furthermore, standard double-precision solvers may return inaccurate solutions or report that no solution exists. Exact simplex solvers based on rational arithmetic require a near-optimal warm start to be practical on large problems (current ME models have 70,000 constraints and variables and will grow larger). We also developed a quadruple-precision version of our linear and nonlinear optimizer MINOS, and a solution procedure (DQQ) involving Double and Quad MINOS that achieves reliability and efficiency for ME models and other challenging problems tested here. DQQ will enable extensive use of large linear and nonlinear models in systems biology and other applications involving multiscale data.

  15. Coalescence computations for large samples drawn from populations of time-varying sizes

    PubMed Central

    Polanski, Andrzej; Szczesna, Agnieszka; Garbulowski, Mateusz; Kimmel, Marek

    2017-01-01

    We present new results concerning probability distributions of times in the coalescence tree and expected allele frequencies for coalescent with large sample size. The obtained results are based on computational methodologies, which involve combining coalescence time scale changes with techniques of integral transformations and using analytical formulae for infinite products. We show applications of the proposed methodologies for computing probability distributions of times in the coalescence tree and their limits, for evaluation of accuracy of approximate expressions for times in the coalescence tree and expected allele frequencies, and for analysis of large human mitochondrial DNA dataset. PMID:28170404

  16. Efficient production of human acidic fibroblast growth factor in pea (Pisum sativum L.) plants by agroinfection of germinated seeds

    PubMed Central

    2011-01-01

    Background: For efficient and large-scale production of recombinant proteins in plants, transient expression by agroinfection has a number of advantages over stable transformation. Simple manipulation, rapid analysis and high expression efficiency are possible. In pea, Pisum sativum, a Virus Induced Gene Silencing System using the pea early browning virus has been converted into an efficient agroinfection system by converting the two RNA genomes of the virus into binary expression vectors for Agrobacterium transformation. Results: By vacuum infiltration (0.08 MPa, 1 min) of germinating pea seeds with 2-3 cm roots with Agrobacteria carrying the binary vectors, expression of the gene for Green Fluorescent Protein as marker and the gene for the human acidic fibroblast growth factor (aFGF) was obtained in 80% of the infiltrated developing seedlings. Maximal production of the recombinant proteins was achieved 12-15 days after infiltration. Conclusions: Compared to the leaf injection method, vacuum infiltration of germinated seeds is highly efficient, allowing large-scale production of plants transiently expressing recombinant proteins. The production cycle of plants for harvesting the recombinant protein was shortened from 30 days for leaf injection to 15 days by applying vacuum infiltration. The synthesized aFGF was purified by heparin-affinity chromatography and its mitogenic activity on NIH 3T3 cells confirmed to be similar to a commercial product. PMID:21548923

  17. Reverse engineering and analysis of large genome-scale gene networks

    PubMed Central

    Aluru, Maneesha; Zola, Jaroslaw; Nettleton, Dan; Aluru, Srinivas

    2013-01-01

    Reverse engineering the whole-genome networks of complex multicellular organisms remains a challenge. While simpler models easily scale to large numbers of genes and gene expression datasets, more accurate models are compute-intensive, limiting their scale of applicability. To enable fast and accurate reconstruction of large networks, we developed Tool for Inferring Network of Genes (TINGe), a parallel mutual information (MI)-based program. The novel features of our approach include: (i) B-spline-based formulation for linear-time computation of MI, (ii) a novel algorithm for direct permutation testing and (iii) development of parallel algorithms to reduce run-time and facilitate construction of large networks. We assess the quality of our method by comparison with ARACNe (Algorithm for the Reconstruction of Accurate Cellular Networks) and GeneNet and demonstrate its unique capability by reverse engineering the whole-genome network of Arabidopsis thaliana from 3137 Affymetrix ATH1 GeneChips in just 9 min on a 1024-core cluster. We further report on the development of a new software tool, Gene Network Analyzer (GeNA), for extracting context-specific subnetworks from a given set of seed genes. Using TINGe and GeNA, we performed analysis of 241 Arabidopsis AraCyc 8.0 pathways, and the results are made available through the web. PMID:23042249
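
    The general shape of an MI-based edge test can be sketched in a few lines. This is not TINGe's algorithm: it swaps in a plain equal-width histogram estimator for the B-spline formulation and a serial shuffle loop for the parallel permutation scheme. All names, the bin count and the toy expression profiles are illustrative assumptions.

```python
import math
import random
from collections import Counter


def discretize(values, bins):
    """Map each value to an equal-width bin index in 0..bins-1."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0   # guard against constant profiles
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(x, y, bins=4):
    """Histogram estimate of MI between two expression profiles, in nats."""
    bx, by = discretize(x, bins), discretize(y, bins)
    n = len(bx)
    px, py, pxy = Counter(bx), Counter(by), Counter(zip(bx, by))
    return sum(
        (c / n) * math.log(c * n / (px[i] * py[j])) for (i, j), c in pxy.items()
    )


def permutation_p_value(x, y, n_perm=200, bins=4, seed=0):
    """Fraction of shuffles whose MI is at least the observed MI."""
    rng = random.Random(seed)
    observed = mutual_information(x, y, bins)
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)         # break any dependence on x
        if mutual_information(x, shuffled, bins) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # add-one so p is never exactly zero


# Toy profiles across 40 arrays: gene_b is a linear function of gene_a.
gene_a = [float(i) for i in range(40)]
gene_b = [2.0 * v + 1.0 for v in gene_a]
mi = mutual_information(gene_a, gene_b)
p = permutation_p_value(gene_a, gene_b)
print(f"MI = {mi:.3f} nats, permutation p = {p:.4f}")
```

    An edge would be kept when the permutation p-value falls below a chosen threshold. At whole-genome scale the number of gene pairs makes this naive loop impractical, which is the cost that the linear-time and parallel algorithms in the abstract are aimed at.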

  18. Design of a large-scale femtoliter droplet array for single-cell analysis of drug-tolerant and drug-resistant bacteria.

    PubMed

    Iino, Ryota; Matsumoto, Yoshimi; Nishino, Kunihiko; Yamaguchi, Akihito; Noji, Hiroyuki

    2013-01-01

    Single-cell analysis is a powerful method to assess the heterogeneity among individual cells, enabling the identification of very rare cells with properties that differ from those of the majority. In this Methods Article, we describe the use of a large-scale femtoliter droplet array to enclose, isolate, and analyze individual bacterial cells. As a first example, we describe the single-cell detection of drug-tolerant persisters of Pseudomonas aeruginosa treated with the antibiotic carbenicillin. As a second example, this method was applied to the single-cell evaluation of drug efflux activity, which causes acquired antibiotic resistance of bacteria. The MexAB-OprM multidrug efflux pump system from Pseudomonas aeruginosa was expressed in Escherichia coli, and the effect of the inhibitor D13-9001 was assessed at the single-cell level.

  19. Evolution of Synonymous Codon Usage in Neurospora tetrasperma and Neurospora discreta

    PubMed Central

    Whittle, C. A.; Sun, Y.; Johannesson, H.

    2011-01-01

    Neurospora comprises a primary model system for the study of fungal genetics and biology. In spite of this, little is known about genome evolution in Neurospora. For example, the evolution of synonymous codon usage is largely unknown in this genus. In the present investigation, we conducted a comprehensive analysis of synonymous codon usage and its relationship to gene expression and gene length (GL) in Neurospora tetrasperma and Neurospora discreta. For our analysis, we examined codon usage among 2,079 genes per organism and assessed gene expression using large-scale expressed sequence tag (EST) data sets (279,323 and 453,559 ESTs for N. tetrasperma and N. discreta, respectively). Data on relative synonymous codon usage revealed 24 codons (and two putative codons) that are more frequently used in genes with high than with low expression and thus were defined as optimal codons. Although codon-usage bias was highly correlated with gene expression, it was independent of selectively neutral base composition (introns), thus demonstrating that translational selection drives synonymous codon usage in these genomes. We also report that GL (coding sequences [CDS]) was inversely associated with optimal codon usage at each gene expression level, with highly expressed short genes having the greatest frequency of optimal codons. Optimal codon frequency was moderately higher in N. tetrasperma than in N. discreta, which might be due to variation in selective pressures and/or mating systems. PMID:21402862
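
    Relative synonymous codon usage (RSCU) is conventionally the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below applies that standard definition to a made-up coding sequence. Only two amino-acid families are tabulated and the function names are illustrative; it is not the authors' pipeline.

```python
from collections import Counter

# Two synonymous families only; a full table would cover every amino acid.
SYNONYMOUS = {
    "Lys": ["AAA", "AAG"],
    "Ala": ["GCT", "GCC", "GCA", "GCG"],
}


def count_codons(cds):
    """Count in-frame codons, ignoring any trailing partial codon."""
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


def rscu(counts):
    """RSCU per codon: observed count / mean count within its family."""
    values = {}
    for codons in SYNONYMOUS.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue                      # amino acid absent from this gene set
        expected = total / len(codons)    # equal-usage expectation
        for c in codons:
            values[c] = counts[c] / expected
    return values


# Made-up sequence: AAG x3, AAA x1, GCC x2, GCT x1, GCA x1.
toy_cds = "AAGAAGAAGAAAGCCGCCGCTGCA"
toy_rscu = rscu(count_codons(toy_cds))
print(toy_rscu)
```

    Computing RSCU separately for high- and low-expression gene sets and comparing the two values per codon is one way to flag candidates for optimal codons, in the sense used in the abstract.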

  20. CPTAC researchers report first large-scale integrated proteomic and genomic analysis of a human cancer | Office of Cancer Clinical Proteomics Research

    Cancer.gov

    Investigators from the National Cancer Institute's Clinical Proteomic Tumor Analysis Consortium (CPTAC), who comprehensively analyzed 95 human colorectal tumor samples, have determined how gene alterations identified in previous analyses of the same samples are expressed at the protein level. The integration of proteomic and genomic data, or proteogenomics, provides a more comprehensive view of the biological features that drive cancer than genomic analysis alone and may help identify the most important targets for cancer detection and intervention.

  1. How cosmic microwave background correlations at large angles relate to mass autocorrelations in space

    NASA Technical Reports Server (NTRS)

    Blumenthal, George R.; Johnston, Kathryn V.

    1994-01-01

    The Sachs-Wolfe effect is known to produce large angular scale fluctuations in the cosmic microwave background radiation (CMBR) due to gravitational potential fluctuations. We show how the angular correlation function of the CMBR can be expressed explicitly in terms of the mass autocorrelation function $\xi(r)$ in the universe. We derive analytic expressions for the angular correlation function and its multipole moments in terms of integrals over $\xi(r)$ or its second moment, $J_3(r)$, which does not need to satisfy the sort of integral constraint that $\xi(r)$ must. We derive similar expressions for bulk flow velocity in terms of $\xi$ and $J_3$. One interesting result that emerges directly from this analysis is that, for all angles $\theta$, there is a substantial contribution to the correlation function from a wide range of distance $r$ and that the radial shape of this contribution does not vary greatly with angle.

  2. Microarray analysis identifies candidate genes for key roles in coral development

    PubMed Central

    Grasso, Lauretta C; Maindonald, John; Rudd, Stephen; Hayward, David C; Saint, Robert; Miller, David J; Ball, Eldon E

    2008-01-01

    Background: Anthozoan cnidarians are amongst the simplest animals at the tissue level of organization, but are surprisingly complex and vertebrate-like in terms of gene repertoire. As major components of tropical reef ecosystems, the stony corals are anthozoans of particular ecological significance. To better understand the molecular bases of both cnidarian development in general and coral-specific processes such as skeletogenesis and symbiont acquisition, microarray analysis was carried out through the period of early development – when skeletogenesis is initiated, and symbionts are first acquired. Results: Of 5081 unique peptide coding genes, 1084 were differentially expressed (P ≤ 0.05) in comparisons between four different stages of coral development, spanning key developmental transitions. Genes of likely relevance to the processes of settlement, metamorphosis, calcification and interaction with symbionts were characterised further and their spatial expression patterns investigated using whole-mount in situ hybridization. Conclusion: This study is the first large-scale investigation of developmental gene expression for any cnidarian, and has provided candidate genes for key roles in many aspects of coral biology, including calcification, metamorphosis and symbiont uptake. One surprising finding is that some of these genes have clear counterparts in higher animals but are not present in the closely-related sea anemone Nematostella. Secondly, coral-specific processes (i.e. traits which distinguish corals from their close relatives) may be analogous to similar processes in distantly related organisms. This first large-scale application of microarray analysis demonstrates the potential of this approach for investigating many aspects of coral biology, including the effects of stress and disease. PMID:19014561

  3. Exploring root symbiotic programs in the model legume Medicago truncatula using EST analysis.

    PubMed

    Journet, Etienne-Pascal; van Tuinen, Diederik; Gouzy, Jérome; Crespeau, Hervé; Carreau, Véronique; Farmer, Mary-Jo; Niebel, Andreas; Schiex, Thomas; Jaillon, Olivier; Chatagnier, Odile; Godiard, Laurence; Micheli, Fabienne; Kahn, Daniel; Gianinazzi-Pearson, Vivienne; Gamas, Pascal

    2002-12-15

    We report on a large-scale expressed sequence tag (EST) sequencing and analysis program aimed at characterizing the sets of genes expressed in roots of the model legume Medicago truncatula during interactions with either of two microsymbionts, the nitrogen-fixing bacterium Sinorhizobium meliloti or the arbuscular mycorrhizal fungus Glomus intraradices. We have designed specific tools for in silico analysis of EST data, in relation to chimeric cDNA detection, EST clustering, encoded protein prediction, and detection of differential expression. Our 21 473 5'- and 3'-ESTs could be grouped into 6359 EST clusters, corresponding to distinct virtual genes, along with 52 498 other M. truncatula ESTs available in the dbEST (NCBI) database that were recruited in the process. These clusters were manually annotated, using a specifically developed annotation interface. Analysis of EST cluster distribution in various M. truncatula cDNA libraries, supported by a refined R test to evaluate statistical significance and by 'electronic northern' representation, enabled us to identify a large number of novel genes predicted to be up- or down-regulated during either symbiotic root interaction. These in silico analyses provide a first global view of the genetic programs for root symbioses in M. truncatula. A searchable database has been built and can be accessed through a public interface.

  4. Large-scale expansion of Wharton's jelly-derived mesenchymal stem cells on gelatin microbeads, with retention of self-renewal and multipotency characteristics and the capacity for enhancing skin wound healing.

    PubMed

    Zhao, Guifang; Liu, Feilin; Lan, Shaowei; Li, Pengdong; Wang, Li; Kou, Junna; Qi, Xiaojuan; Fan, Ruirui; Hao, Deshun; Wu, Chunling; Bai, Tingting; Li, Yulin; Liu, Jin Yu

    2015-03-19

    Successful stem cell therapy relies on large-scale generation of stem cells and their maintenance in a proliferative multipotent state. This study aimed to establish a three-dimensional culture system for large-scale generation of hWJ-MSC and investigated the self-renewal activity, genomic stability and multi-lineage differentiation potential of such hWJ-MSC in enhancing skin wound healing. hWJ-MSC were seeded on gelatin microbeads and cultured in spinning bottles (3D). Cell proliferation, karyotype analysis, surface marker expression, multipotent differentiation (adipogenic, chondrogenic, and osteogenic potentials), and expression of core transcription factors (OCT4, SOX2, NANOG, and C-MYC), as well as their efficacy in accelerating skin wound healing, were investigated and compared with those of hWJ-MSC derived from plate cultures (2D), using in vivo and in vitro experiments. hWJ-MSC attached to and proliferated on gelatin microbeads in 3D cultures reaching a maximum of 1.1-1.30×10^7 cells on 0.5 g of microbeads by days 8-14; in contrast, hWJ-MSC derived from 2D cultures reached a maximum of 6.5-11.5×10^5 cells per well in a 24-well plate by days 6-10. hWJ-MSC derived by 3D culture incorporated significantly more EdU (P<0.05) and had a significantly higher proliferation index (P<0.05) than those derived from 2D culture. Immunofluorescence staining, real-time PCR, flow cytometry analysis, and multipotency assays showed that hWJ-MSC derived from 3D culture retained MSC surface markers and multipotency potential similar to 2D culture-derived cells. 3D culture-derived hWJ-MSC also retained the expression of core transcription factors at levels comparable to their 2D culture counterparts. Direct injection of hWJ-MSC derived from 3D or 2D cultures into animals exhibited similar efficacy in enhancing skin wound healing. Thus, hWJ-MSC can be expanded markedly on gelatin microbeads, while retaining MSC surface marker expression, multipotent differentiation potential, and expression of core transcription factors. These cells also efficiently enhanced skin wound healing in vivo, in a manner comparable to that of hWJ-MSC obtained from 2D culture.

  5. Evolution and expression analysis of the grape (Vitis vinifera L.) WRKY gene family.

    PubMed

    Guo, Chunlei; Guo, Rongrong; Xu, Xiaozhao; Gao, Min; Li, Xiaoqin; Song, Junyang; Zheng, Yi; Wang, Xiping

    2014-04-01

    WRKY proteins comprise a large family of transcription factors that play important roles in plant defence regulatory networks, including responses to various biotic and abiotic stresses. To date, no large-scale study of WRKY genes has been undertaken in grape (Vitis vinifera L.). In this study, a total of 59 putative grape WRKY genes (VvWRKY) were identified and renamed on the basis of their respective chromosome distribution. A multiple sequence alignment analysis using all predicted grape WRKY gene coding sequences, together with those from Arabidopsis thaliana and tomato (Solanum lycopersicum), indicated that the 59 VvWRKY genes can be classified into three main groups (I-III). An evaluation of the duplication events suggested that several WRKY genes arose before the divergence of the grape and Arabidopsis lineages. Moreover, expression profiles derived from semiquantitative PCR and real-time quantitative PCR analyses showed distinct expression patterns in various tissues and in response to different treatments. Four VvWRKY genes showed a significantly higher expression in roots or leaves, 55 responded to varying degrees to at least one abiotic stress treatment, and the expression of 38 was altered following powdery mildew (Erysiphe necator) infection. Most VvWRKY genes were downregulated in response to abscisic acid or salicylic acid treatments, while the expression of a subset was upregulated by methyl jasmonate or ethylene treatments.

  6. Evolution and expression analysis of the grape (Vitis vinifera L.) WRKY gene family

    PubMed Central

    Guo, Chunlei; Guo, Rongrong; Wang, Xiping

    2014-01-01

    WRKY proteins comprise a large family of transcription factors that play important roles in plant defence regulatory networks, including responses to various biotic and abiotic stresses. To date, no large-scale study of WRKY genes has been undertaken in grape (Vitis vinifera L.). In this study, a total of 59 putative grape WRKY genes (VvWRKY) were identified and renamed on the basis of their respective chromosome distribution. A multiple sequence alignment analysis using all predicted grape WRKY gene coding sequences, together with those from Arabidopsis thaliana and tomato (Solanum lycopersicum), indicated that the 59 VvWRKY genes can be classified into three main groups (I–III). An evaluation of the duplication events suggested that several WRKY genes arose before the divergence of the grape and Arabidopsis lineages. Moreover, expression profiles derived from semiquantitative PCR and real-time quantitative PCR analyses showed distinct expression patterns in various tissues and in response to different treatments. Four VvWRKY genes showed a significantly higher expression in roots or leaves, 55 responded to varying degrees to at least one abiotic stress treatment, and the expression of 38 was altered following powdery mildew (Erysiphe necator) infection. Most VvWRKY genes were downregulated in response to abscisic acid or salicylic acid treatments, while the expression of a subset was upregulated by methyl jasmonate or ethylene treatments. PMID:24510937

  7. Large-scale protein-protein interaction analysis in Arabidopsis mesophyll protoplasts by split firefly luciferase complementation.

    PubMed

    Li, Jian-Feng; Bush, Jenifer; Xiong, Yan; Li, Lei; McCormack, Matthew

    2011-01-01

    Protein-protein interactions (PPIs) constitute the regulatory network that coordinates diverse cellular functions. There are growing needs in plant research for creating protein interaction maps behind complex cellular processes and at a systems biology level. However, only a few approaches have been successfully used for large-scale surveys of PPIs in plants, each having advantages and disadvantages. Here we present split firefly luciferase complementation (SFLC) as a highly sensitive and noninvasive technique for in planta PPI investigation. In this assay, the separate halves of a firefly luciferase can come into close proximity and transiently restore its catalytic activity only when their fusion partners, namely the two proteins of interest, interact with each other. The assay became quantitative and amenable to high throughput when the Arabidopsis mesophyll protoplast system and a microplate luminometer were employed for protein expression and luciferase measurement, respectively. Using the SFLC assay, we could monitor the dynamics of rapamycin-induced and ascomycin-disrupted interaction between Arabidopsis FRB and human FKBP proteins in a near real-time manner. As a proof of concept for large-scale PPI survey, we further applied the SFLC assay to testing 132 binary PPIs among 8 auxin response factors (ARFs) and 12 Aux/IAA proteins from Arabidopsis. Our results demonstrated that the SFLC assay is ideal for in vivo quantitative PPI analysis in plant cells and is particularly powerful for large-scale binary PPI screens.

  8. Integrated analysis of long non-coding RNAs in human gastric cancer: An in silico study.

    PubMed

    Han, Weiwei; Zhang, Zhenyu; He, Bangshun; Xu, Yijun; Zhang, Jun; Cao, Weijun

    2017-01-01

    Accumulating evidence highlights the important role of long non-coding RNAs (lncRNAs) in a large number of biological processes. However, the knowledge of genome scale expression of lncRNAs and their potential biological function in gastric cancer is still lacking. Using RNA-seq data from 420 gastric cancer patients in The Cancer Genome Atlas (TCGA), we identified 1,294 lncRNAs differentially expressed in gastric cancer compared with adjacent normal tissues. We also found 247 lncRNAs differentially expressed between intestinal subtype and diffuse subtype. Survival analysis revealed 33 lncRNAs independently associated with patient overall survival, of which 6 lncRNAs were validated in the internal validation set. There were 181 differentially expressed lncRNAs located in the recurrent somatic copy number alterations (SCNAs) regions and their correlations between copy number and RNA expression level were also analyzed. In addition, we inferred the function of lncRNAs by construction of a co-expression network for mRNAs and lncRNAs. Together, this study presented an integrative analysis of lncRNAs in gastric cancer and provided a valuable resource for further functional research of lncRNAs in gastric cancer.

  9. Grid-Enabled Quantitative Analysis of Breast Cancer

    DTIC Science & Technology

    2010-10-01

    large-scale, multi-modality computerized image analysis. The central hypothesis of this research is that large-scale image analysis for breast cancer...research, we designed a pilot study utilizing large scale parallel Grid computing harnessing nationwide infrastructure for medical image analysis. Also

  10. The Reliability of the OWLS Written Expression Scale with ESL Kindergarten Students

    ERIC Educational Resources Information Center

    Harrison, Gina L.; Ogle, Keira C.; Keilty, Megan

    2011-01-01

    A reliability analysis was conducted on the Written Expression Scale from the Oral and Written Language Scales, (OWLS, Carrow-Woolfolk, 1996), with 68 ESL and 56 non-ESL kindergarten students. Interrater and internal consistency estimates for the Written Expression Scale were examined separately for each language group. Despite lower oral English…

  11. Insights into the noncoding RNome of nitrogen-fixing endosymbiotic α-proteobacteria.

    PubMed

    Jiménez-Zurdo, José I; Valverde, Claudio; Becker, Anke

    2013-02-01

    Symbiotic chronic infection of legumes by rhizobia involves transition of invading bacteria from a free-living environment in soil to an intracellular state as differentiated nitrogen-fixing bacteroids within the nodules elicited in the host plant. The adaptive flexibility demanded by this complex lifestyle is likely facilitated by the large set of regulatory proteins encoded by rhizobial genomes. However, proteins are not the only relevant players in the regulation of gene expression in bacteria. Large-scale high-throughput analysis of prokaryotic genomes is evidencing the expression of an unexpected plethora of small untranslated transcripts (sRNAs) with housekeeping or regulatory roles. sRNAs mostly act in response to environmental cues as post-transcriptional regulators of gene expression through protein-assisted base-pairing interactions with target mRNAs. Riboregulation contributes to fine-tune a wide range of bacterial processes which, in intracellular animal pathogens, largely compromise virulence traits. Here, we summarize the incipient knowledge about the noncoding RNome structure of nitrogen-fixing endosymbiotic bacteria as inferred from genome-wide searches for sRNA genes in the alfalfa partner Sinorhizobium meliloti and further comparative genomics analysis. The biology of relevant S. meliloti RNA chaperones (e.g., Hfq) is also reviewed as a first global indicator of the impact of riboregulation in the establishment of the symbiotic interaction.

  12. High-Throughput Screening Using iPSC-Derived Neuronal Progenitors to Identify Compounds Counteracting Epigenetic Gene Silencing in Fragile X Syndrome.

    PubMed

    Kaufmann, Markus; Schuffenhauer, Ansgar; Fruh, Isabelle; Klein, Jessica; Thiemeyer, Anke; Rigo, Pierre; Gomez-Mancilla, Baltazar; Heidinger-Millot, Valerie; Bouwmeester, Tewis; Schopfer, Ulrich; Mueller, Matthias; Fodor, Barna D; Cobos-Correa, Amanda

    2015-10-01

    Fragile X syndrome (FXS) is the most common form of inherited mental retardation, and it is caused in most cases by epigenetic silencing of the Fmr1 gene. Today, no specific therapy exists for FXS, and current treatments are only directed to improve behavioral symptoms. Neuronal progenitors derived from FXS patient induced pluripotent stem cells (iPSCs) represent a unique model to study the disease and develop assays for large-scale drug discovery screens since they conserve the Fmr1 gene silenced within the disease context. We have established a high-content imaging assay to run a large-scale phenotypic screen aimed to identify compounds that reactivate the silenced Fmr1 gene. A set of 50,000 compounds was tested, including modulators of several epigenetic targets. We describe an integrated drug discovery model comprising iPSC generation, culture scale-up, and quality control and screening with a very sensitive high-content imaging assay assisted by single-cell image analysis and multiparametric data analysis based on machine learning algorithms. The screening identified several compounds that induced a weak expression of fragile X mental retardation protein (FMRP) and thus sets the basis for further large-scale screens to find candidate drugs or targets tackling the underlying mechanism of FXS with potential for therapeutic intervention. © 2015 Society for Laboratory Automation and Screening.

  13. Laminar and dorsoventral molecular organization of the medial entorhinal cortex revealed by large-scale anatomical analysis of gene expression.

    PubMed

    Ramsden, Helen L; Sürmeli, Gülşen; McDonagh, Steven G; Nolan, Matthew F

    2015-01-01

    Neural circuits in the medial entorhinal cortex (MEC) encode an animal's position and orientation in space. Within the MEC spatial representations, including grid and directional firing fields, have a laminar and dorsoventral organization that corresponds to a similar topography of neuronal connectivity and cellular properties. Yet, in part due to the challenges of integrating anatomical data at the resolution of cortical layers and borders, we know little about the molecular components underlying this organization. To address this we develop a new computational pipeline for high-throughput analysis and comparison of in situ hybridization (ISH) images at laminar resolution. We apply this pipeline to ISH data for over 16,000 genes in the Allen Brain Atlas and validate our analysis with RNA sequencing of MEC tissue from adult mice. We find that differential gene expression delineates the borders of the MEC with neighboring brain structures and reveals its laminar and dorsoventral organization. We propose a new molecular basis for distinguishing the deep layers of the MEC and show that their similarity to corresponding layers of neocortex is greater than that of superficial layers. Our analysis identifies ion channel-, cell adhesion- and synapse-related genes as candidates for functional differentiation of MEC layers and for encoding of spatial information at different scales along the dorsoventral axis of the MEC. We also reveal laminar organization of genes related to disease pathology and suggest that a high metabolic demand predisposes layer II to neurodegenerative pathology. In principle, our computational pipeline can be applied to high-throughput analysis of many forms of neuroanatomical data. Our results support the hypothesis that differences in gene expression contribute to functional specialization of superficial layers of the MEC and dorsoventral organization of the scale of spatial representations.

  14. Laminar and Dorsoventral Molecular Organization of the Medial Entorhinal Cortex Revealed by Large-scale Anatomical Analysis of Gene Expression

    PubMed Central

    Ramsden, Helen L.; Sürmeli, Gülşen; McDonagh, Steven G.; Nolan, Matthew F.

    2015-01-01

    Neural circuits in the medial entorhinal cortex (MEC) encode an animal’s position and orientation in space. Within the MEC spatial representations, including grid and directional firing fields, have a laminar and dorsoventral organization that corresponds to a similar topography of neuronal connectivity and cellular properties. Yet, in part due to the challenges of integrating anatomical data at the resolution of cortical layers and borders, we know little about the molecular components underlying this organization. To address this we develop a new computational pipeline for high-throughput analysis and comparison of in situ hybridization (ISH) images at laminar resolution. We apply this pipeline to ISH data for over 16,000 genes in the Allen Brain Atlas and validate our analysis with RNA sequencing of MEC tissue from adult mice. We find that differential gene expression delineates the borders of the MEC with neighboring brain structures and reveals its laminar and dorsoventral organization. We propose a new molecular basis for distinguishing the deep layers of the MEC and show that their similarity to corresponding layers of neocortex is greater than that of superficial layers. Our analysis identifies ion channel-, cell adhesion- and synapse-related genes as candidates for functional differentiation of MEC layers and for encoding of spatial information at different scales along the dorsoventral axis of the MEC. We also reveal laminar organization of genes related to disease pathology and suggest that a high metabolic demand predisposes layer II to neurodegenerative pathology. In principle, our computational pipeline can be applied to high-throughput analysis of many forms of neuroanatomical data. Our results support the hypothesis that differences in gene expression contribute to functional specialization of superficial layers of the MEC and dorsoventral organization of the scale of spatial representations. PMID:25615592

  15. Preparation of highly multiplexed small RNA sequencing libraries.

    PubMed

    Persson, Helena; Søkilde, Rolf; Pirona, Anna Chiara; Rovira, Carlos

    2017-08-01

    MicroRNAs (miRNAs) are ~22-nucleotide-long small non-coding RNAs that regulate the expression of protein-coding genes by base pairing to partially complementary target sites, preferentially located in the 3′ untranslated region (UTR) of target mRNAs. The expression and function of miRNAs have been extensively studied in human disease, as well as the possibility of using these molecules as biomarkers for prognostication and treatment guidance. To identify and validate miRNAs as biomarkers, their expression must be screened in large collections of patient samples. Here, we develop a scalable protocol for the rapid and economical preparation of a large number of small RNA sequencing libraries using dual indexing for multiplexing. Combined with the use of off-the-shelf reagents, more samples can be sequenced simultaneously on large-scale sequencing platforms at a considerably lower cost per sample. Sample preparation is simplified by pooling libraries prior to gel purification, which allows for the selection of a narrow size range while minimizing sample variation. A comparison with publicly available data from benchmarking of miRNA analysis platforms showed that this method captures absolute and differential expression as effectively as commercially available alternatives.

  16. What Sort of Girl Wants to Study Physics after the Age of 16? Findings from a Large-Scale UK Survey

    ERIC Educational Resources Information Center

    Mujtaba, Tamjid; Reiss, Michael J.

    2013-01-01

    This paper investigates the characteristics of 15-year-old girls who express an intention to study physics post-16. This paper unpacks issues around within-girl group differences and similarities between boys and girls in survey responses about physics. The analysis is based on the year 10 (age 15 years) responses of 5,034 students from 137 UK…

  17. Developmental transcriptional profiling reveals key insights into Triticeae reproductive development.

    PubMed

    Tran, Frances; Penniket, Carolyn; Patel, Rohan V; Provart, Nicholas J; Laroche, André; Rowland, Owen; Robert, Laurian S

    2013-06-01

    Despite their importance, there remains a paucity of large-scale gene expression-based studies of reproductive development in species belonging to the Triticeae. As a first step to address this deficiency, a gene expression atlas of triticale reproductive development was generated using the 55K Affymetrix GeneChip(®) wheat genome array. The global transcriptional profiles of the anther/pollen, ovary and stigma were analyzed at concurrent developmental stages, and co-expressed as well as preferentially expressed genes were identified. Data analysis revealed both novel and conserved regulatory factors underlying Triticeae floral development and function. This comprehensive resource rests upon detailed gene annotations, and the expression profiles are readily accessible via a web browser. © 2013 Her Majesty the Queen in Right of Canada as represented by the Minister of Agriculture and Agri-Food Canada.

  18. Screening and large-scale expression of membrane proteins in mammalian cells for structural studies

    PubMed Central

    Goehring, April; Lee, Chia-Hsueh; Wang, Kevin H.; Michel, Jennifer Carlisle; Claxton, Derek P.; Baconguis, Isabelle; Althoff, Thorsten; Fischer, Suzanne; Garcia, K. Christopher; Gouaux, Eric

    2014-01-01

    Structural, biochemical and biophysical studies of eukaryotic membrane proteins are often hampered by difficulties in over-expression of the candidate molecule. Baculovirus transduction of mammalian cells (BacMam), although a powerful method to heterologously express membrane proteins, can be cumbersome for screening and expression of multiple constructs. We therefore developed plasmid Eric Gouaux (pEG) BacMam, a vector optimized for use in screening assays, as well as for efficient production of baculovirus and robust expression of the target protein. In this protocol we show how to use small-scale transient transfection and fluorescence-detection, size-exclusion chromatography (FSEC) experiments using a GFP-His8 tagged candidate protein to screen for monodispersity and expression level. Once promising candidates are identified, we describe how to generate baculovirus, transduce HEK293S GnTI− (N-acetylglucosaminyltransferase I-negative) cells in suspension culture, and over-express the candidate protein. We have used these methods to prepare pure samples of chicken acid-sensing ion channel 1a (cASIC1) and Caenorhabditis elegans glutamate-gated chloride channel (GluCl), for X-ray crystallography, demonstrating how to rapidly and efficiently screen hundreds of constructs and accomplish large-scale expression in 4-6 weeks. PMID:25299155

  19. Bi-Force: large-scale bicluster editing and its application to gene expression data biclustering.

    PubMed

    Sun, Peng; Speicher, Nora K; Röttger, Richard; Guo, Jiong; Baumbach, Jan

    2014-05-01

    The explosion of the biological data has dramatically reformed today's biological research. The need to integrate and analyze high-dimensional biological data on a large scale is driving the development of novel bioinformatics approaches. Biclustering, also known as 'simultaneous clustering' or 'co-clustering', has been successfully utilized to discover local patterns in gene expression data and similar biomedical data types. Here, we contribute a new heuristic: 'Bi-Force'. It is based on the weighted bicluster editing model, to perform biclustering on arbitrary sets of biological entities, given any kind of pairwise similarities. We first evaluated the power of Bi-Force to solve dedicated bicluster editing problems by comparing Bi-Force with two existing algorithms in the BiCluE software package. We then followed a biclustering evaluation protocol in a recent review paper from Eren et al. (2013) (A comparative analysis of biclustering algorithms for gene expression data. Brief. Bioinform., 14:279-292.) and compared Bi-Force against eight existing tools: FABIA, QUBIC, Cheng and Church, Plaid, BiMax, Spectral, xMOTIFs and ISA. To this end, a suite of synthetic datasets as well as nine large gene expression datasets from Gene Expression Omnibus were analyzed. All resulting biclusters were subsequently investigated by Gene Ontology enrichment analysis to evaluate their biological relevance. The distinct theoretical foundation of Bi-Force (bicluster editing) is more powerful than strict biclustering. We thus outperformed existing tools with Bi-Force at least when following the evaluation protocols from Eren et al. Bi-Force is implemented in Java and integrated into the open source software package of BiCluE. The software as well as all used datasets are publicly available at http://biclue.mpi-inf.mpg.de. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  20. Scale and time dependence of serial correlations in word-length time series of written texts

    NASA Astrophysics Data System (ADS)

    Rodriguez, E.; Aguilar-Cornejo, M.; Femat, R.; Alvarez-Ramirez, J.

    2014-11-01

    This work considered the quantitative analysis of large written texts. To this end, the text was converted into a time series by taking the sequence of word lengths. The detrended fluctuation analysis (DFA) was used for characterizing long-range serial correlations of the time series. To this end, the DFA was implemented within a rolling window framework for estimating the variations of correlation strength, quantified in terms of the scaling exponent, along the text. Also, a filtering derivative was used to compute the dependence of the scaling exponent relative to the scale. The analysis was applied to three famous English-written literary narrations; namely, Alice in Wonderland (by Lewis Carroll), Dracula (by Bram Stoker) and Sense and Sensibility (by Jane Austen). The results showed that high correlations appear for scales of about 50-200 words, suggesting that at these scales the text contains its strongest coherence. The scaling exponent was not constant along the text, showing important variations with apparent cyclical behavior. An interesting coincidence between the scaling exponent variations and changes in narrative units (e.g., chapters) was found. This suggests that the scaling exponent obtained from the DFA is able to detect changes in narration structure as expressed by the usage of words of different lengths.
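
    The procedure summarized in this record (word-length series, then DFA) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names word_length_series and dfa_exponent are hypothetical, words are split on whitespace only, detrending is linear (order 1), and the rolling-window and filtering-derivative steps mentioned in the abstract are omitted.

```python
import numpy as np

def word_length_series(text):
    """Convert a text into the sequence of its word lengths (whitespace split)."""
    return np.array([len(word) for word in text.split()], dtype=float)

def dfa_exponent(series, scales):
    """Estimate the DFA scaling exponent using linear detrending (assumed order 1)."""
    series = np.asarray(series, dtype=float)
    # Integrated, mean-centred series (the "profile").
    profile = np.cumsum(series - series.mean())
    fluctuations = []
    for n in scales:
        n_windows = len(profile) // n
        # Non-overlapping windows of length n; leftover samples are dropped.
        segments = profile[: n_windows * n].reshape(n_windows, n)
        x = np.arange(n)
        # Fit one straight line per window in a single call.
        coeffs = np.polyfit(x, segments.T, 1)  # shape (2, n_windows)
        trends = np.outer(coeffs[0], x) + coeffs[1][:, None]
        residuals = segments - trends
        # Root-mean-square fluctuation at this scale.
        fluctuations.append(np.sqrt(np.mean(residuals ** 2)))
    # The slope of log F(n) against log n is the scaling exponent.
    slope, _ = np.polyfit(np.log(scales), np.log(fluctuations), 1)
    return slope
```

    A rolling-window variant, as the abstract describes, could call dfa_exponent on successive slices of the word-length series; uncorrelated series are expected to give an exponent near 0.5, and larger values indicate long-range correlations.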

  1. Temporal Expression-based Analysis of Metabolism

    PubMed Central

    Segrè, Daniel

    2012-01-01

    Metabolic flux is frequently rerouted through cellular metabolism in response to dynamic changes in the intra- and extra-cellular environment. Capturing the mechanisms underlying these metabolic transitions in quantitative and predictive models is a prominent challenge in systems biology. Progress in this regard has been made by integrating high-throughput gene expression data into genome-scale stoichiometric models of metabolism. Here, we extend previous approaches to perform a Temporal Expression-based Analysis of Metabolism (TEAM). We apply TEAM to understanding the complex metabolic dynamics of the respiratorily versatile bacterium Shewanella oneidensis grown under aerobic, lactate-limited conditions. TEAM predicts temporal metabolic flux distributions using time-series gene expression data. Increased predictive power is achieved by supplementing these data with a large reference compendium of gene expression, which allows us to take into account the unique character of the distribution of expression of each individual gene. We further propose a straightforward method for studying the sensitivity of TEAM to changes in its fundamental free threshold parameter θ, and reveal that discrete zones of distinct metabolic behavior arise as this parameter is changed. By comparing the qualitative characteristics of these zones to additional experimental data, we are able to constrain the range of θ to a small, well-defined interval. In parallel, the sensitivity analysis reveals the inherently difficult nature of dynamic metabolic flux modeling: small errors early in the simulation propagate to relatively large changes later in the simulation. We expect that handling such “history-dependent” sensitivities will be a major challenge in the future development of dynamic metabolic-modeling techniques. PMID:23209390

  2. DISRUPTION OF LARGE-SCALE NEURAL NETWORKS IN NON-FLUENT/AGRAMMATIC VARIANT PRIMARY PROGRESSIVE APHASIA ASSOCIATED WITH FRONTOTEMPORAL DEGENERATION PATHOLOGY

    PubMed Central

    Grossman, Murray; Powers, John; Ash, Sherry; McMillan, Corey; Burkholder, Lisa; Irwin, David; Trojanowski, John Q.

    2012-01-01

    Non-fluent/agrammatic primary progressive aphasia (naPPA) is a progressive neurodegenerative condition most prominently associated with slowed, effortful speech. A clinical imaging marker of naPPA is disease centered in the left inferior frontal lobe. We used multimodal imaging to assess large-scale neural networks underlying effortful expression in 15 patients with sporadic naPPA due to frontotemporal lobar degeneration (FTLD) spectrum pathology. Effortful speech in these patients is related in part to impaired grammatical processing, and to phonologic speech errors. Gray matter (GM) imaging shows frontal and anterior-superior temporal atrophy, most prominently in the left hemisphere. Diffusion tensor imaging reveals reduced fractional anisotropy in several white matter (WM) tracts mediating projections between left frontal and other GM regions. Regression analyses suggest disruption of three large-scale GM-WM neural networks in naPPA that support fluent, grammatical expression. These findings emphasize the role of large-scale neural networks in language, and demonstrate associated language deficits in naPPA. PMID:23218686

  3. TLM-Quant: an open-source pipeline for visualization and quantification of gene expression heterogeneity in growing microbial cells.

    PubMed

    Piersma, Sjouke; Denham, Emma L; Drulhe, Samuel; Tonk, Rudi H J; Schwikowski, Benno; van Dijl, Jan Maarten

    2013-01-01

    Gene expression heterogeneity is a key driver for microbial adaptation to fluctuating environmental conditions, cell differentiation and the evolution of species. This phenomenon has therefore enormous implications, not only for life in general, but also for biotechnological applications where unwanted subpopulations of non-producing cells can emerge in large-scale fermentations. Only time-lapse fluorescence microscopy allows real-time measurements of gene expression heterogeneity. A major limitation in the analysis of time-lapse microscopy data is the lack of fast, cost-effective, open, simple and adaptable protocols. Here we describe TLM-Quant, a semi-automatic pipeline for the analysis of time-lapse fluorescence microscopy data that enables the user to visualize and quantify gene expression heterogeneity. Importantly, our pipeline builds on the open-source packages ImageJ and R. To validate TLM-Quant, we selected three possible scenarios, namely homogeneous expression, highly 'noisy' heterogeneous expression, and bistable heterogeneous expression in the Gram-positive bacterium Bacillus subtilis. This bacterium is both a paradigm for systems-level studies on gene expression and a highly appreciated biotechnological 'cell factory'. We conclude that the temporal resolution of such analyses with TLM-Quant is only limited by the numbers of recorded images.

  4. Large-scale gene function analysis with the PANTHER classification system.

    PubMed

    Mi, Huaiyu; Muruganujan, Anushya; Casagrande, John T; Thomas, Paul D

    2013-08-01

    The PANTHER (protein annotation through evolutionary relationship) classification system (http://www.pantherdb.org/) is a comprehensive system that combines gene function, ontology, pathways and statistical analysis tools that enable biologists to analyze large-scale, genome-wide data from sequencing, proteomics or gene expression experiments. The system is built with 82 complete genomes organized into gene families and subfamilies, and their evolutionary relationships are captured in phylogenetic trees, multiple sequence alignments and statistical models (hidden Markov models or HMMs). Genes are classified according to their function in several different ways: families and subfamilies are annotated with ontology terms (Gene Ontology (GO) and PANTHER protein class), and sequences are assigned to PANTHER pathways. The PANTHER website includes a suite of tools that enable users to browse and query gene functions, and to analyze large-scale experimental data with a number of statistical tests. It is widely used by bench scientists, bioinformaticians, computer scientists and systems biologists. In the 2013 release of PANTHER (v.8.0), in addition to an update of the data content, we redesigned the website interface to improve both user experience and the system's analytical capability. This protocol provides a detailed description of how to analyze genome-wide experimental data with the PANTHER classification system.

  5. MALDI-TOF mass spectrometry for quantitative gene expression analysis of acid responses in Staphylococcus aureus.

    PubMed

    Rode, Tone Mari; Berget, Ingunn; Langsrud, Solveig; Møretrø, Trond; Holck, Askild

    2009-07-01

    Microorganisms are constantly exposed to new and altered growth conditions, and respond by changing gene expression patterns. Several methods for studying gene expression exist. During the last decade, the analysis of microarrays has been one of the most common approaches applied for large scale gene expression studies. A relatively new method for gene expression analysis is MassARRAY, which combines real competitive-PCR and MALDI-TOF (matrix-assisted laser desorption/ionization time-of-flight) mass spectrometry. In contrast to microarray methods, MassARRAY technology is suitable for analysing a larger number of samples, though for a smaller set of genes. In this study we compare the results from MassARRAY with microarrays on gene expression responses of Staphylococcus aureus exposed to acid stress at pH 4.5. RNA isolated from the same stress experiments was analysed using both the MassARRAY and the microarray methods. The MassARRAY and microarray methods showed good correlation. Both MassARRAY and microarray estimated somewhat lower fold changes compared with quantitative real-time PCR (qRT-PCR). The results confirmed the up-regulation of the urease genes in acidic environments, and also indicated the importance of metal ion regulation. This study shows that the MassARRAY technology is suitable for gene expression analysis in prokaryotes, and has advantages when a set of genes is being analysed for an organism exposed to many different environmental conditions.

  6. Gene coexpression measures in large heterogeneous samples using count statistics.

    PubMed

    Wang, Y X Rachel; Waterman, Michael S; Huang, Haiyan

    2014-11-18

    With the advent of high-throughput technologies making large-scale gene expression data readily available, developing appropriate computational tools to process these data and distill insights into systems biology has been an important part of the "big data" challenge. Gene coexpression is one of the earliest techniques developed that is still widely in use for functional annotation, pathway analysis, and, most importantly, the reconstruction of gene regulatory networks, based on gene expression data. However, most coexpression measures do not specifically account for local features in expression profiles. For example, it is very likely that the patterns of gene association may change or only exist in a subset of the samples, especially when the samples are pooled from a range of experiments. We propose two new gene coexpression statistics based on counting local patterns of gene expression ranks to take into account the potentially diverse nature of gene interactions. In particular, one of our statistics is designed for time-course data with local dependence structures, such as time series coupled over a subregion of the time domain. We provide asymptotic analysis of their distributions and power, and evaluate their performance against a wide range of existing coexpression measures on simulated and real data. Our new statistics are fast to compute, robust against outliers, and show comparable and often better general performance.
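
    The idea of counting local rank patterns can be illustrated with a deliberately simplified sketch. It is not the statistic defined by the authors: the function name local_rank_match_count, the use of consecutive sliding windows, the window length k, and the treatment of reversed orderings as matches are all assumptions made for illustration, and ties are not handled.

```python
import numpy as np

def local_rank_match_count(x, y, k=3):
    """Count sliding windows of length k where x and y share a rank pattern.

    A window counts as a match when the within-window ranks of the two
    profiles are identical or exactly reversed (hypothetical criterion).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    count = 0
    for start in range(len(x) - k + 1):
        # argsort of argsort yields 0-based ranks within the window.
        rank_x = np.argsort(np.argsort(x[start:start + k]))
        rank_y = np.argsort(np.argsort(y[start:start + k]))
        same = np.array_equal(rank_x, rank_y)
        reversed_pattern = np.array_equal(rank_x, k - 1 - rank_y)
        if same or reversed_pattern:
            count += 1
    return count
```

    Because only ranks within short windows are compared, a single extreme value affects at most k windows, which is one way to see why rank-based counts of this kind tend to be robust against outliers.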

  7. The effects of gender, ethnicity, and a close relationship theme on perceptions of persons introducing a condom.

    PubMed

    Castaneda, D M; Collins, B E

    1998-09-01

    Perceptions of persons who introduce condoms in an ongoing sexual interaction, and the effects of gender and ethnicity on these perceptions, were explored in a study involving 243 students at a large, urban university in the western US. 133 of these students identified themselves as Mexican American; the remaining students indicated they were White. A vignette methodology was used to elicit perceptions of condom introducers on six scales (Nice, Exciting, Sexually Attractive, Promiscuous, Good Relationship Partner, Unpersonable/Personable). Data were analyzed in a 2 (gender of participant) x 2 (gender of condom introducer) x 3 (low acculturated Mexican American, high acculturated Mexican American, White) x 2 (presence/absence of close relationship theme) analysis of variance and covariance. In terms of the Nice Scale, women rated condom introducers significantly higher than men, female condom introducers were rated significantly higher than male introducers, and condom introducers who expressed a care and responsibility theme while introducing a condom were rated significantly higher than those who expressed no theme. On the Exciting Scale, women condom introducers were rated significantly higher than men. Condom introducers who expressed a care and responsibility theme were rated significantly higher than those who expressed no theme on the Good Relationship Partner scale. Men rated the female condom introducer significantly higher than women on the Promiscuous scale. Low acculturated Mexicans rated the female condom introducer significantly higher than the male introducer on the Promiscuous scale and rated the condom introducer significantly higher than Whites on the Sexually Attractive scale. These findings attest that many often contradictory interpersonal gender- and ethnicity-related perceptions operate in sexual encounters.

  8. Development of a gene synthesis platform for the efficient large scale production of small genes encoding animal toxins.

    PubMed

    Sequeira, Ana Filipa; Brás, Joana L A; Guerreiro, Catarina I P D; Vincentelli, Renaud; Fontes, Carlos M G A

    2016-12-01

    Gene synthesis is becoming an important tool in many fields of recombinant DNA technology, including recombinant protein production. De novo gene synthesis is quickly replacing the classical cloning and mutagenesis procedures and allows generating nucleic acids for which no template is available. In addition, when coupled with efficient gene design algorithms that optimize codon usage, it leads to high levels of recombinant protein expression. Here, we describe the development of an optimized gene synthesis platform that was applied to the large scale production of small genes encoding venom peptides. This improved gene synthesis method uses a PCR-based protocol to assemble synthetic DNA from pools of overlapping oligonucleotides and was developed to synthesise multiple genes simultaneously. This technology incorporates an accurate, automated and cost-effective ligation-independent cloning step to directly integrate the synthetic genes into an effective Escherichia coli expression vector. The robustness of this technology to generate large libraries of dozens to thousands of synthetic nucleic acids was demonstrated through the parallel and simultaneous synthesis of 96 genes encoding animal toxins. An automated platform was developed for the large-scale synthesis of small genes encoding eukaryotic toxins. Large scale recombinant expression of synthetic genes encoding eukaryotic toxins will allow exploring the extraordinary potency and pharmacological diversity of animal venoms, an increasingly valuable but unexplored source of lead molecules for drug discovery.

  9. Sex genes for genomic analysis in human brain: internal controls for comparison of probe level data extraction.

    PubMed Central

    Galfalvy, Hanga C; Erraji-Benchekroun, Loubna; Smyrniotopoulos, Peggy; Pavlidis, Paul; Ellis, Steven P; Mann, J John; Sibille, Etienne; Arango, Victoria

    2003-01-01

    Background: Genomic studies of complex tissues pose unique analytical challenges for assessment of data quality, performance of statistical methods used for data extraction, and detection of differentially expressed genes. Ideally, to assess the accuracy of gene expression analysis methods, one needs a set of genes which are known to be differentially expressed in the samples and which can be used as a "gold standard". We introduce the idea of using sex-chromosome genes as an alternative to spiked-in control genes or simulations for assessment of microarray data and analysis methods. Results: Expression of sex-chromosome genes was used as a true internal biological control to compare alternate probe-level data extraction algorithms (Microarray Suite 5.0 [MAS5.0], Model Based Expression Index [MBEI] and Robust Multi-array Average [RMA]), to assess microarray data quality and to establish some statistical guidelines for analyzing large-scale gene expression. These approaches were implemented on a large new dataset of human brain samples. RMA-generated gene expression values were markedly less variable and more reliable than MAS5.0 and MBEI-derived values. A statistical technique controlling the false discovery rate was applied to adjust for multiple testing, as an alternative to the Bonferroni method, and showed no evidence of false negative results. Fourteen probesets, representing nine Y- and two X-chromosome linked genes, displayed significant sex differences in brain prefrontal cortex gene expression. Conclusion: In this study, we have demonstrated the use of sex genes as true biological internal controls for genomic analysis of complex tissues, and suggested analytical guidelines for testing alternate oligonucleotide microarray data extraction protocols and for adjusting multiple statistical analysis of differentially expressed genes.
Our results also provided evidence for sex differences in gene expression in the brain prefrontal cortex, supporting the notion of a putative direct role of sex-chromosome genes in differentiation and maintenance of sexual dimorphism of the central nervous system. Importantly, these analytical approaches are applicable to all microarray studies that include male and female human or animal subjects. PMID:12962547
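
    The contrast between false discovery rate control and the Bonferroni method mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the Benjamini-Hochberg step-up procedure, which the abstract does not name, and the p-values are invented for illustration; the function names are hypothetical.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject a hypothesis only if its p-value is at most alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure (assumed here, not named in the abstract).

    Returns a list of booleans in the input order: True where the
    hypothesis is rejected while controlling the FDR at level alpha.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    # Reject every hypothesis whose rank is at most max_k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            rejected[idx] = True
    return rejected


# Illustrative p-values only; they do not come from the study.
example_p = [0.6, 0.001, 0.038, 0.012, 0.035]
bh_calls = benjamini_hochberg(example_p)
bonf_calls = bonferroni(example_p)
```

    With these made-up values the step-up rule rejects four hypotheses while Bonferroni rejects one, which is the kind of gain in sensitivity that motivates FDR control in large-scale expression studies.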

  10. Sex genes for genomic analysis in human brain: internal controls for comparison of probe level data extraction.

    PubMed

    Galfalvy, Hanga C; Erraji-Benchekroun, Loubna; Smyrniotopoulos, Peggy; Pavlidis, Paul; Ellis, Steven P; Mann, J John; Sibille, Etienne; Arango, Victoria

    2003-09-08

    Genomic studies of complex tissues pose unique analytical challenges for assessment of data quality, performance of statistical methods used for data extraction, and detection of differentially expressed genes. Ideally, to assess the accuracy of gene expression analysis methods, one needs a set of genes which are known to be differentially expressed in the samples and which can be used as a "gold standard". We introduce the idea of using sex-chromosome genes as an alternative to spiked-in control genes or simulations for assessment of microarray data and analysis methods. Expression of sex-chromosome genes was used as a true internal biological control to compare alternate probe-level data extraction algorithms (Microarray Suite 5.0 [MAS5.0], Model Based Expression Index [MBEI] and Robust Multi-array Average [RMA]), to assess microarray data quality and to establish some statistical guidelines for analyzing large-scale gene expression. These approaches were implemented on a large new dataset of human brain samples. RMA-generated gene expression values were markedly less variable and more reliable than MAS5.0 and MBEI-derived values. A statistical technique controlling the false discovery rate was applied to adjust for multiple testing, as an alternative to the Bonferroni method, and showed no evidence of false negative results. Fourteen probesets, representing nine Y- and two X-chromosome linked genes, displayed significant sex differences in brain prefrontal cortex gene expression. In this study, we have demonstrated the use of sex genes as true biological internal controls for genomic analysis of complex tissues, and suggested analytical guidelines for testing alternate oligonucleotide microarray data extraction protocols and for adjusting multiple statistical analysis of differentially expressed genes.
Our results also provided evidence for sex differences in gene expression in the brain prefrontal cortex, supporting the notion of a putative direct role of sex-chromosome genes in differentiation and maintenance of sexual dimorphism of the central nervous system. Importantly, these analytical approaches are applicable to all microarray studies that include male and female human or animal subjects.

  11. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Li, Jun-Hao; Liu, Shun; Zheng, Ling-Ling

    Long non-coding RNAs (lncRNAs) are emerging as important regulatory molecules in developmental, physiological, and pathological processes. However, the precise mechanism and functions of most lncRNAs remain largely unknown. Recent advances in high-throughput sequencing of immunoprecipitated RNAs after cross-linking (CLIP-Seq) provide powerful ways to identify biologically relevant protein–lncRNA interactions. In this study, by analyzing millions of RNA-binding protein (RBP) binding sites from 117 CLIP-Seq datasets generated by 50 independent studies, we identified 22,735 RBP–lncRNA regulatory relationships. We found that one single lncRNA will generally be bound and regulated by one or multiple RBPs, the combination of which may coordinately regulate gene expression. We also revealed the expression correlation of these interaction networks by mining expression profiles of over 6000 normal and tumor samples from 14 cancer types. Our combined analysis of CLIP-Seq data and genome-wide association studies data discovered hundreds of disease-related single nucleotide polymorphisms residing in the RBP binding sites of lncRNAs. Finally, we developed interactive web implementations to provide visualization, analysis, and downloading of the aforementioned large-scale datasets. Our study represented an important step in identification and analysis of RBP–lncRNA interactions and showed that these interactions may play crucial roles in cancer and genetic diseases.

  12. Hepatic gene expression patterns following trauma-hemorrhage: effect of posttreatment with estrogen.

    PubMed

    Yu, Huang-Ping; Pang, See-Tong; Chaudry, Irshad H

    2013-01-01

    The aim of this study was to examine the role of estrogen on hepatic gene expression profiles at an early time point following trauma-hemorrhage in rats. Groups of injured and sham controls receiving estrogen or vehicle were killed 2 h after injury and resuscitation, and liver tissue was harvested. Complementary RNA was synthesized from each RNA sample and hybridized to microarrays. A large number of genes were differentially expressed at the 2-h time point in injured animals with or without estrogen treatment. The upregulation or downregulation of a cohort of 14 of these genes was validated by reverse transcription-polymerase chain reaction. This large-scale microarray analysis shows that at the 2-h time point, there is marked alteration in hepatic gene expression following trauma-hemorrhage. However, estrogen treatment attenuated these changes in injured animals. Pathway analysis demonstrated predominant changes in the expression of genes involved in metabolism, immunity, and apoptosis. Upregulation of low-density lipoprotein receptor, protein phosphatase 1, regulatory subunit 3C, ring-finger protein 11, pyroglutamyl-peptidase I, bactericidal/permeability-increasing protein, integrin, αD, BCL2-like 11, leukemia inhibitory factor receptor, ATPase, Cu transporting, α polypeptide, and Mk1 protein was found in estrogen-treated trauma-hemorrhaged animals. Thus, estrogen produces hepatoprotection following trauma-hemorrhage likely via antiapoptosis and improving/restoring metabolism and immunity pathways.

  13. Integrative approaches for large-scale transcriptome-wide association studies

    PubMed Central

    Gusev, Alexander; Ko, Arthur; Shi, Huwenbo; Bhatia, Gaurav; Chung, Wonil; Penninx, Brenda W J H; Jansen, Rick; de Geus, Eco JC; Boomsma, Dorret I; Wright, Fred A; Sullivan, Patrick F; Nikkola, Elina; Alvarez, Marcus; Civelek, Mete; Lusis, Aldons J.; Lehtimäki, Terho; Raitoharju, Emma; Kähönen, Mika; Seppälä, Ilkka; Raitakari, Olli T.; Kuusisto, Johanna; Laakso, Markku; Price, Alkes L.; Pajukanta, Päivi; Pasaniuc, Bogdan

    2016-01-01

    Many genetic variants influence complex traits by modulating gene expression, thus altering the abundance levels of one or multiple proteins. Here, we introduce a powerful strategy that integrates gene expression measurements with summary association statistics from large-scale genome-wide association studies (GWAS) to identify genes whose cis-regulated expression is associated to complex traits. We leverage expression imputation to perform a transcriptome wide association scan (TWAS) to identify significant expression-trait associations. We applied our approaches to expression data from blood and adipose tissue measured in ~3,000 individuals overall. We imputed gene expression into GWAS data from over 900,000 phenotype measurements to identify 69 novel genes significantly associated to obesity-related traits (BMI, lipids, and height). Many of the novel genes are associated with relevant phenotypes in the Hybrid Mouse Diversity Panel. Our results showcase the power of integrating genotype, gene expression and phenotype to gain insights into the genetic basis of complex traits. PMID:26854917

  14. Mechanism of Arachidonic Acid Accumulation during Aging in Mortierella alpina: A Large-Scale Label-Free Comparative Proteomics Study.

    PubMed

    Yu, Yadong; Li, Tao; Wu, Na; Ren, Lujing; Jiang, Ling; Ji, Xiaojun; Huang, He

    2016-11-30

    Arachidonic acid (ARA) is an important polyunsaturated fatty acid having various beneficial physiological effects on the human body. The aging of Mortierella alpina has long been known to significantly improve ARA yield, but the exact mechanism is still elusive. Herein, multiple approaches including large-scale label-free comparative proteomics were employed to systematically investigate the mechanism mentioned above. Upon ultrastructural observation, abnormal mitochondria were found to aggregate around shrunken lipid droplets. Proteomics analysis revealed a total of 171 proteins with significant alterations of expression during aging. Pathway analysis suggested that reactive oxygen species (ROS) were accumulated and stimulated the activation of the malate/pyruvate cycle and isocitrate dehydrogenase, which might provide additional NADPH for ARA synthesis. EC 4.2.1.17-hydratase might be a key player in ARA accumulation during aging. These findings provide a valuable resource for efforts to further improve the ARA content in the oil produced by aging M. alpina.

  15. Techniques for Large-Scale Bacterial Genome Manipulation and Characterization of the Mutants with Respect to In Silico Metabolic Reconstructions.

    PubMed

    diCenzo, George C; Finan, Turlough M

    2018-01-01

    The rate at which all genes within a bacterial genome can be identified far exceeds the ability to characterize these genes. To assist in associating genes with cellular functions, a large-scale bacterial genome deletion approach can be employed to rapidly screen tens to thousands of genes for desired phenotypes. Here, we provide a detailed protocol for the generation of deletions of large segments of bacterial genomes that relies on the activity of a site-specific recombinase. In this procedure, two recombinase recognition target sequences are introduced into known positions of a bacterial genome through single cross-over plasmid integration. Subsequent expression of the site-specific recombinase mediates recombination between the two target sequences, resulting in the excision of the intervening region and its loss from the genome. We further illustrate how this deletion system can be readily adapted to function as a large-scale in vivo cloning procedure, in which the region excised from the genome is captured as a replicative plasmid. We next provide a procedure for the metabolic analysis of bacterial large-scale genome deletion mutants using the Biolog Phenotype MicroArray™ system. Finally, a pipeline is described, and a sample Matlab script is provided, for the integration of the obtained data with a draft metabolic reconstruction for the refinement of the reactions and gene-protein-reaction relationships in a metabolic reconstruction.
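
    The excision step described in the abstract above can be pictured with a toy string model. This is only a sketch under simplifying assumptions: the genome is treated as a linear string, the two recognition sites are assumed to be direct repeats, and the site text `LOX` is a placeholder rather than a real target sequence. The function name is hypothetical and is not part of the published protocol.

```python
def excise_between_sites(genome, site):
    """Model recombination between the first two copies of `site`.

    Returns (remaining_genome, excised_region). One copy of the site
    stays in the genome and one copy ends up on the excised region,
    which is what allows the excised region to be captured as a plasmid.
    """
    first = genome.find(site)
    if first == -1:
        raise ValueError("recognition site not found")
    second = genome.find(site, first + len(site))
    if second == -1:
        raise ValueError("a second recognition site is required")
    # Keep everything up to and including the first site, then skip
    # ahead to the sequence that follows the second site.
    remaining = genome[: first + len(site)] + genome[second + len(site):]
    # The excised region carries the intervening sequence plus one site.
    excised = genome[first + len(site): second + len(site)]
    return remaining, excised


# Placeholder sequences, for illustration only.
toy_genome = "AAAA" + "LOX" + "CCCC" + "LOX" + "GGGG"
remaining, excised = excise_between_sites(toy_genome, "LOX")
```

    In this toy model the lengths of the two products add up to the length of the starting genome, reflecting that recombination rearranges sequence without creating or destroying it.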

  16. Geometric quantification of features in large flow fields.

    PubMed

    Kendall, Wesley; Huang, Jian; Peterka, Tom

    2012-01-01

    Interactive exploration of flow features in large-scale 3D unsteady-flow data is one of the most challenging visualization problems today. To comprehensively explore the complex feature spaces in these datasets, a proposed system employs a scalable framework for investigating a multitude of characteristics from traced field lines. This capability supports the examination of various neighborhood-based geometric attributes in concert with other scalar quantities. Such an analysis wasn't previously possible because of the large computational overhead and I/O requirements. The system integrates visual analytics methods by letting users procedurally and interactively describe and extract high-level flow features. An exploration of various phenomena in a large global ocean-modeling simulation demonstrates the approach's generality and expressiveness as well as its efficacy.

  17. The Human EST Ontology Explorer: a tissue-oriented visualization system for ontologies distribution in human EST collections.

    PubMed

    Merelli, Ivan; Caprera, Andrea; Stella, Alessandra; Del Corvo, Marcello; Milanesi, Luciano; Lazzari, Barbara

    2009-10-15

    The NCBI dbEST currently contains more than eight million human Expressed Sequence Tags (ESTs). This wide collection represents an important source of information for gene expression studies, provided it can be inspected according to biologically relevant criteria. EST data can be browsed using different dedicated web resources, which allow users to investigate library-specific gene expression levels and to make comparisons among libraries, highlighting significant differences in gene expression. Nonetheless, no tool is available to examine distributions of quantitative EST collections in Gene Ontology (GO) categories, nor to retrieve information concerning library-dependent EST involvement in metabolic pathways. In this work we present the Human EST Ontology Explorer (HEOE) http://www.itb.cnr.it/ptp/human_est_explorer, a web facility for comparison of expression levels among libraries from several healthy and diseased tissues. The HEOE provides library-dependent statistics on the distribution of sequences in the GO Directed Acyclic Graph (DAG) that can be browsed at each GO hierarchical level. The tool is based on large-scale BLAST annotation of EST sequences. Due to the huge number of input sequences, this BLAST analysis was performed with the aid of grid computing technology, which is particularly suitable for data-parallel tasks. Relying on the achieved annotation, library-specific distributions of ESTs in the GO Graph were inferred. A pathway-based search interface was also implemented, for a quick evaluation of the representation of libraries in metabolic pathways. EST processing steps were integrated in a semi-automatic procedure that relies on Perl scripts and stores results in a MySQL database. A PHP-based web interface offers the possibility to simultaneously visualize, retrieve and compare data from the different libraries. Statistically significant differences in GO categories among user selected libraries can also be computed.
The HEOE provides an alternative and complementary way to inspect EST expression levels with respect to approaches currently offered by other resources. Furthermore, BLAST computation on the whole human EST dataset was a suitable test of grid scalability in the context of large-scale bioinformatics analysis. The HEOE currently comprises sequence analysis from 70 non-normalized libraries, representing a comprehensive overview on healthy and unhealthy tissues. As the analysis procedure can be easily applied to other libraries, the number of represented tissues is intended to increase.

  18. CellLineNavigator: a workbench for cancer cell line analysis

    PubMed Central

    Krupp, Markus; Itzel, Timo; Maass, Thorsten; Hildebrandt, Andreas; Galle, Peter R.; Teufel, Andreas

    2013-01-01

    The CellLineNavigator database, freely available at http://www.medicalgenomics.org/celllinenavigator, is a web-based workbench for large scale comparisons of a large collection of diverse cell lines. It aims to support experimental design in the fields of genomics, systems biology and translational biomedical research. Currently, this compendium holds genome wide expression profiles of 317 different cancer cell lines, categorized into 57 different pathological states and 28 individual tissues. To enlarge the scope of CellLineNavigator, the database was furthermore closely linked to commonly used bioinformatics databases and knowledge repositories. To ensure easy data access and search ability, a simple data and an intuitive querying interface were implemented. It allows the user to explore and filter gene expression, focusing on pathological or physiological conditions. For a more complex search, the advanced query interface may be used to query for (i) differentially expressed genes; (ii) pathological or physiological conditions; or (iii) gene names or functional attributes, such as Kyoto Encyclopaedia of Genes and Genomes pathway maps. These queries may also be combined. Finally, CellLineNavigator allows additional advanced analysis of differentially regulated genes by a direct link to the Database for Annotation, Visualization and Integrated Discovery (DAVID) Bioinformatics Resources. PMID:23118487

  19. WHAM!: a web-based visualization suite for user-defined analysis of metagenomic shotgun sequencing data.

    PubMed

    Devlin, Joseph C; Battaglia, Thomas; Blaser, Martin J; Ruggles, Kelly V

    2018-06-25

    Exploration of large data sets, such as shotgun metagenomic sequence or expression data, by biomedical experts and medical professionals remains a major bottleneck in the scientific discovery process. Although tools for this purpose exist for 16S ribosomal RNA sequencing analysis, there is a growing but still insufficient number of user-friendly interactive visualization workflows for easy data exploration and figure generation. The development of such platforms is necessary to accelerate and streamline microbiome laboratory research. We developed the Workflow Hub for Automated Metagenomic Exploration (WHAM!) as a web-based interactive tool capable of user-directed data visualization and statistical analysis of annotated shotgun metagenomic and metatranscriptomic data sets. WHAM! includes exploratory and hypothesis-based gene and taxa search modules for visualizing differences in microbial taxa and gene family expression across experimental groups, and for creating publication quality figures without the need for command line interface or in-house bioinformatics. WHAM! is an interactive and customizable tool for downstream metagenomic and metatranscriptomic analysis providing a user-friendly interface allowing for easy data exploration by microbiome and ecological experts to facilitate discovery in multi-dimensional and large-scale data sets.

  20. [Isolation and function of genes regulating aphB expression in Vibrio cholerae].

    PubMed

    Chen, Haili; Zhu, Zhaoqin; Zhong, Zengtao; Zhu, Jun; Kan, Biao

    2012-02-04

    We identified genes that regulate the expression of aphB, the gene encoding a key virulence regulator in Vibrio cholerae O1 El Tor C6706(-). We constructed a transposon library in a V. cholerae C6706 strain containing P(aphB)-luxCDABE and P(aphB)-lacZ transcriptional reporter plasmids. Using a chemiluminescence imager system, we rapidly detected aphB promoter expression levels at a large scale. We then identified the transposon insertion sites by arbitrary PCR and sequencing analysis. From approximately 40,000 transposon insertion mutants, we obtained two candidate mutants, T1 and T2, which displayed reduced aphB expression. Sequencing analysis shows that Tn inserted in the vc1585 reading frame in the T1 mutant and at the end of the coding sequence of vc1602 in the T2 mutant. By using a genetic screen, we identified two potential genes that may be involved in regulating the expression of the key virulence regulator AphB. This study lays groundwork for further investigation to fully understand V. cholerae virulence gene regulatory cascades.

  1. Analysis of musical expression in audio signals

    NASA Astrophysics Data System (ADS)

    Dixon, Simon

    2003-01-01

    In western art music, composers communicate their work to performers via a standard notation which specifies the musical pitches and relative timings of notes. This notation may also include some higher level information such as variations in the dynamics, tempo and timing. Famous performers are characterised by their expressive interpretation, the ability to convey structural and emotive information within the given framework. The majority of work on audio content analysis focusses on retrieving score-level information; this paper reports on the extraction of parameters describing the performance, a task which requires a much higher degree of accuracy. Two systems are presented: BeatRoot, an off-line beat tracking system which finds the times of musical beats and tracks changes in tempo throughout a performance, and the Performance Worm, a system which provides a real-time visualisation of the two most important expressive dimensions, tempo and dynamics. Both of these systems are being used to process data for a large-scale study of musical expression in classical and romantic piano performance, which uses artificial intelligence (machine learning) techniques to discover fundamental patterns or principles governing expressive performance.

  2. High-resolution face verification using pore-scale facial features.

    PubMed

    Li, Dong; Zhou, Huiling; Lam, Kin-Man

    2015-08-01

    Face recognition methods, which usually represent face images using holistic or local facial features, rely heavily on alignment. Their performances also suffer a severe degradation under variations in expressions or poses, especially when there is one gallery per subject only. With the easy access to high-resolution (HR) face images nowadays, some HR face databases have recently been developed. However, few studies have tackled the use of HR information for face recognition or verification. In this paper, we propose a pose-invariant face-verification method, which is robust to alignment errors, using the HR information based on pore-scale facial features. A new keypoint descriptor, namely pore-Principal Component Analysis (PCA)-Scale Invariant Feature Transform (PPCASIFT), adapted from PCA-SIFT, is devised for the extraction of a compact set of distinctive pore-scale facial features. Having matched the pore-scale features of two-face regions, an effective robust-fitting scheme is proposed for the face-verification task. Experiments show that, with one frontal-view gallery only per subject, our proposed method outperforms a number of standard verification methods, and can achieve excellent accuracy even when the faces are under large variations in expression and pose.

  3. Discrete domains of gene expression in germinal layers distinguish the development of gyrencephaly

    PubMed Central

    de Juan Romero, Camino; Bruder, Carl; Tomasello, Ugo; Sanz-Anquela, José Miguel; Borrell, Víctor

    2015-01-01

    Gyrencephalic species develop folds in the cerebral cortex in a stereotypic manner, but the genetic mechanisms underlying this patterning process are unknown. We present a large-scale transcriptomic analysis of individual germinal layers in the developing cortex of the gyrencephalic ferret, comparing between regions prospective of fold and fissure. We find unique transcriptional signatures in each germinal compartment, where thousands of genes are differentially expressed between regions, including ∼80% of genes mutated in human cortical malformations. These regional differences emerge from the existence of discrete domains of gene expression, which occur at multiple locations across the developing cortex of ferret and human, but not the lissencephalic mouse. Complex expression patterns emerge late during development and map the eventual location of folds or fissures. Protomaps of gene expression within germinal layers may contribute to define cortical folds or functional areas, but our findings demonstrate that they distinguish the development of gyrencephalic cortices. PMID:25916825

  4. Assembly of 500,000 inter-specific catfish expressed sequence tags and large scale gene-associated marker development for whole genome association studies

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Catfish Genome Consortium; Wang, Shaolin; Peatman, Eric

    2010-03-23

    Background: Through the Community Sequencing Program, a catfish EST sequencing project was carried out through a collaboration between the catfish research community and the Department of Energy's Joint Genome Institute. Prior to this project, only a limited EST resource from catfish was available for the purpose of SNP identification. Results: A total of 438,321 quality ESTs were generated from 8 channel catfish (Ictalurus punctatus) and 4 blue catfish (Ictalurus furcatus) libraries, bringing the number of catfish ESTs to nearly 500,000. Assembly of all catfish ESTs resulted in 45,306 contigs and 66,272 singletons. Over 35 percent of the unique sequences had significant similarities to known genes, allowing the identification of 14,776 unique genes in catfish. Over 300,000 putative SNPs have been identified, of which approximately 48,000 are high-quality SNPs identified from contigs with at least four sequences and the minor allele presence of at least two sequences in the contig. The EST resource should be valuable for identification of microsatellites, genome annotation, large-scale expression analysis, and comparative genome analysis. Conclusions: This project generated a large EST resource for catfish that captured the majority of the catfish transcriptome. The parallel analysis of ESTs from two closely related Ictalurid catfishes should also provide powerful means for the evaluation of ancient and recent gene duplications, and for the development of high-density microarrays in catfish. The inter- and intra-specific SNPs identified from all catfish EST dataset assembly will greatly benefit the catfish introgression breeding program and whole genome association studies.
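
    The high-quality SNP criterion stated in the abstract above (contigs with at least four sequences, and the minor allele present in at least two of them) can be written as a small filter. This is a minimal sketch: it assumes the input is the list of bases observed at one contig position, one per aligned EST, and it treats the second most common base as the minor allele. The function name and the example columns are hypothetical.

```python
from collections import Counter


def is_high_quality_snp(bases, min_sequences=4, min_minor=2):
    """Apply the two thresholds from the abstract to one alignment column.

    bases: the base called in each EST covering this contig position.
    """
    counts = Counter(bases)
    # Too few sequences, or no variation at all: not a candidate.
    if len(bases) < min_sequences or len(counts) < 2:
        return False
    # The second most frequent base is taken as the minor allele.
    minor_count = counts.most_common(2)[1][1]
    return minor_count >= min_minor


# Made-up alignment columns, for illustration only.
columns = {
    "two_and_two": list("AAGG"),       # 4 sequences, minor allele seen twice
    "singleton_minor": list("AAAG"),   # minor allele seen only once
    "shallow": list("AGG"),            # only 3 sequences in the contig
}
calls = {name: is_high_quality_snp(col) for name, col in columns.items()}
```

    Requiring the minor allele in at least two sequences guards against calling a SNP from a single sequencing error, which is the usual rationale for this kind of threshold.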

  5. Brucella proteomes--a review.

    PubMed

    DelVecchio, Vito G; Wagner, Mary Ann; Eschenbrenner, Michel; Horn, Troy A; Kraycer, Jo Ann; Estock, Frank; Elzer, Phil; Mujer, Cesar V

    2002-12-20

    The proteomes of selected Brucella spp. have been extensively analyzed by utilizing current proteomic technology involving 2-DE and MALDI-MS. In Brucella melitensis, more than 500 proteins were identified. The rapid and large-scale identification of proteins in this organism was accomplished by using the annotated B. melitensis genome which is now available in the GenBank. Coupled with new and powerful tools for data analysis, differentially expressed proteins were identified and categorized into several classes. A global overview of protein expression patterns emerged, thereby facilitating the simultaneous analysis of different metabolic pathways in B. melitensis. Such a global characterization would not have been possible by using time consuming and traditional biochemical approaches. The era of post-genomic technology offers new and exciting opportunities to understand the complete biology of different Brucella species.

  6. How to normalize metatranscriptomic count data for differential expression analysis.

    PubMed

    Klingenberg, Heiner; Meinicke, Peter

    2017-01-01

    Differential expression analysis on the basis of RNA-Seq count data has become a standard tool in transcriptomics. Several studies have shown that prior normalization of the data is crucial for a reliable detection of transcriptional differences. Until now it has not been clear whether and how the transcriptomic approach can be used for differential expression analysis in metatranscriptomics. We propose a model for differential expression in metatranscriptomics that explicitly accounts for variations in the taxonomic composition of transcripts across different samples. As a main consequence the correct normalization of metatranscriptomic count data under this model requires the taxonomic separation of the data into organism-specific bins. Then the taxon-specific scaling of organism profiles yields a valid normalization and allows us to recombine the scaled profiles into a metatranscriptomic count matrix. This matrix can then be analyzed with statistical tools for transcriptomic count data. For taxon-specific scaling and recombination of scaled counts we provide a simple R script. When applying transcriptomic tools for differential expression analysis directly to metatranscriptomic data with an organism-independent (global) scaling of counts the resulting differences may be difficult to interpret. The differences may correspond to changing functional profiles of the contributing organisms but may also result from a variation of taxonomic abundances. Taxon-specific scaling eliminates this variation and therefore the resulting differences actually reflect a different behavior of organisms under changing conditions. In simulation studies we show that the divergence between results from global and taxon-specific scaling can be drastic. In particular, the variation of organism abundances can imply a considerable increase of significant differences with global scaling. 
Also, on real metatranscriptomic data, the predictions from taxon-specific and global scaling can differ widely. Our studies indicate that in real data applications performed with global scaling it might be impossible to distinguish between differential expression in terms of transcriptomic changes and differential composition in terms of changing taxonomic proportions. As in transcriptomics, a proper normalization of count data is also essential for differential expression analysis in metatranscriptomics. Our model implies a taxon-specific scaling of counts for normalization of the data. The application of taxon-specific scaling consequently removes taxonomic composition variations from functional profiles and therefore provides a clear interpretation of the observed functional differences.
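
    The taxon-specific scaling idea in the abstract above can be sketched as follows. This is not the authors' R script: it is an illustrative Python version that assumes simple total-count scaling within each taxon (each sample's profile for a taxon is rescaled to the mean total of that taxon across samples) before the scaled profiles are summed back into one matrix. The data layout, function name, and counts are all hypothetical.

```python
def taxon_specific_scaling(counts):
    """Scale counts within each taxon, then recombine across taxa.

    counts[taxon][sample][feature] holds raw counts.
    Returns combined[sample][feature], the sum of the scaled profiles.
    """
    combined = {}
    for taxon, samples in counts.items():
        totals = {s: sum(features.values()) for s, features in samples.items()}
        nonzero = [t for t in totals.values() if t > 0]
        if not nonzero:
            continue  # taxon absent everywhere: nothing to add
        # Assumed reference depth: mean total of this taxon over samples.
        target = sum(nonzero) / len(nonzero)
        for sample, features in samples.items():
            row = combined.setdefault(sample, {})
            if totals[sample] == 0:
                continue  # taxon absent in this sample
            factor = target / totals[sample]
            for feature, count in features.items():
                row[feature] = row.get(feature, 0.0) + count * factor
    return combined


# Toy data: taxon A is three times more abundant in s2 than in s1,
# but its relative expression of g1 and g2 is unchanged.
raw = {
    "taxonA": {"s1": {"g1": 10, "g2": 10}, "s2": {"g1": 30, "g2": 30}},
    "taxonB": {"s1": {"g1": 5, "g2": 15}, "s2": {"g1": 5, "g2": 15}},
}
scaled = taxon_specific_scaling(raw)
```

    In this toy case the two samples come out identical after scaling, because the only difference between them was the abundance of taxon A, not what any organism was expressing. A global scaling of the summed counts would instead report g1 and g2 as shifted between samples.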

  7. Increasing the yield of middle silk gland expression system through transgenic knock-down of endogenous sericin-1.

    PubMed

    Ma, Sanyuan; Xia, Xiaojuan; Li, Yufeng; Sun, Le; Liu, Yue; Liu, Yuanyuan; Wang, Xiaogang; Shi, Run; Chang, Jiasong; Zhao, Ping; Xia, Qingyou

    2017-08-01

    Various genetically modified bioreactor systems have been developed to meet the increasing demand for recombinant proteins. The silk gland of Bombyx mori holds great potential to be a cost-effective bioreactor for commercial-scale production of recombinant proteins. However, the actual yields of proteins obtained from the current silk gland expression systems are too low for the proteins to be dissolved and purified on a large scale. Here, we propose that reducing endogenous sericin proteins would increase the expression yield of foreign proteins. Using transgenic RNA interference, we successfully reduced the expression of BmSer1 to 50%. A total of 26 transgenic lines expressing Discosoma sp. red fluorescent protein (DsRed) in the middle silk gland (MSG) under the control of the BmSer1 promoter were established to analyze the expression of the recombinant protein. qRT-PCR and western blotting showed that in BmSer1 knock-down lines, the expression of DsRed had significantly increased at both the mRNA and protein levels. We also analyzed the DsRed/BmSer1 distribution in the cocoon and the effect of DsRed protein accumulation on the silk fiber formation process. This study describes not only a novel method to enhance recombinant protein expression in the MSG bioreactor, but also a strategy to optimize other bioreactor systems.

  8. Integrated analysis of numerous heterogeneous gene expression profiles for detecting robust disease-specific biomarkers and proposing drug targets.

    PubMed

    Amar, David; Hait, Tom; Izraeli, Shai; Shamir, Ron

    2015-09-18

    Genome-wide expression profiling has revolutionized biomedical research; vast amounts of expression data from numerous studies of many diseases are now available. Making the best use of this resource in order to better understand disease processes and treatment remains an open challenge. In particular, disease biomarkers detected in case-control studies suffer from low reliability and are only weakly reproducible. Here, we present a systematic integrative analysis methodology to overcome these shortcomings. We assembled and manually curated more than 14,000 expression profiles spanning 48 diseases and 18 expression platforms. We show that when studying a particular disease, judicious utilization of profiles from other diseases and information on disease hierarchy improves classification quality, avoids overoptimistic evaluation of that quality, and enhances disease-specific biomarker discovery. This approach yielded specific biomarkers for 24 of the analyzed diseases. We demonstrate how to combine these biomarkers with large-scale interaction, mutation and drug target data, forming a highly valuable disease summary that suggests novel directions in disease understanding and drug repurposing. Our analysis also estimates the number of samples required to reach a desired level of biomarker stability. This methodology can greatly improve the exploitation of the mountain of expression profiles for better disease analysis. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  9. N-point statistics of large-scale structure in the Zel'dovich approximation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tassev, Svetlin, E-mail: tassev@astro.princeton.edu

    2014-06-01

    Motivated by the results presented in a companion paper, here we give a simple analytical expression for the matter n-point functions in the Zel'dovich approximation (ZA) both in real and in redshift space (including the angular case). We present numerical results for the 2-dimensional redshift-space correlation function, as well as for the equilateral configuration for the real-space 3-point function. We compare those to the tree-level results. Our analysis is easily extendable to include Lagrangian bias, as well as higher-order perturbative corrections to the ZA. The results should be especially useful for modelling probes of large-scale structure in the linear regime, such as the Baryon Acoustic Oscillations. We make the numerical code used in this paper freely available.

  10. Gene expression studies of developing bovine longissimus muscle from two different beef cattle breeds

    PubMed Central

    Lehnert, Sigrid A; Reverter, Antonio; Byrne, Keren A; Wang, Yonghong; Nattrass, Greg S; Hudson, Nicholas J; Greenwood, Paul L

    2007-01-01

    Background: The muscle fiber number and fiber composition of muscle are largely determined during prenatal development. In order to discover genes that are involved in determining adult muscle phenotypes, we studied the gene expression profile of developing fetal bovine longissimus muscle from animals with two different genetic backgrounds using a bovine cDNA microarray. Fetal longissimus muscle was sampled at 4 stages of myogenesis and muscle maturation: primary myogenesis (d 60), secondary myogenesis (d 135), as well as beginning (d 195) and final stages (birth) of functional differentiation of muscle fibers. All fetuses and newborns (total n = 24) were from Hereford dams and crossed with either Wagyu (high intramuscular fat) or Piedmontese (GDF8 mutant) sires, genotypes that vary markedly in muscle and compositional characteristics later in postnatal life. Results: We obtained expression profiles of three individuals for each time point and genotype to allow comparisons across time and between sire breeds. Quantitative reverse transcription-PCR analysis of RNA from developing longissimus muscle was able to validate the differential expression patterns observed for a selection of differentially expressed genes, with one exception. We detected large-scale changes in temporal gene expression between the four developmental stages in genes coding for extracellular matrix and for muscle fiber structural and metabolic proteins. FSTL1 and IGFBP5 were two genes implicated in growth and differentiation that showed developmentally regulated expression levels in fetal muscle. An abundantly expressed gene with no functional annotation was found to be developmentally regulated in the same manner as muscle structural proteins. We also observed differences in gene expression profiles between the two different sire breeds. Wagyu-sired calves showed higher expression of fatty acid binding protein 5 (FABP5) RNA at birth. 
The developing longissimus muscle of fetuses carrying the Piedmontese mutation shows an emphasis on glycolytic muscle biochemistry and a large-scale up-regulation of the translational machinery at birth. We also document evidence for timing differences in differentiation events between the two breeds. Conclusion: Taken together, these findings provide a detailed description of molecular events accompanying skeletal muscle differentiation in the bovine, as well as gene expression differences that may underpin the phenotype differences between the two breeds. In addition, this study has highlighted a non-coding RNA, which is abundantly expressed and developmentally regulated in bovine fetal muscle. PMID:17697390

  11. A regulatory toolbox of MiniPromoters to drive selective expression in the brain.

    PubMed

    Portales-Casamar, Elodie; Swanson, Douglas J; Liu, Li; de Leeuw, Charles N; Banks, Kathleen G; Ho Sui, Shannan J; Fulton, Debra L; Ali, Johar; Amirabbasi, Mahsa; Arenillas, David J; Babyak, Nazar; Black, Sonia F; Bonaguro, Russell J; Brauer, Erich; Candido, Tara R; Castellarin, Mauro; Chen, Jing; Chen, Ying; Cheng, Jason C Y; Chopra, Vik; Docking, T Roderick; Dreolini, Lisa; D'Souza, Cletus A; Flynn, Erin K; Glenn, Randy; Hatakka, Kristi; Hearty, Taryn G; Imanian, Behzad; Jiang, Steven; Khorasan-zadeh, Shadi; Komljenovic, Ivana; Laprise, Stéphanie; Liao, Nancy Y; Lim, Jonathan S; Lithwick, Stuart; Liu, Flora; Liu, Jun; Lu, Meifen; McConechy, Melissa; McLeod, Andrea J; Milisavljevic, Marko; Mis, Jacek; O'Connor, Katie; Palma, Betty; Palmquist, Diana L; Schmouth, Jean-François; Swanson, Magdalena I; Tam, Bonny; Ticoll, Amy; Turner, Jenna L; Varhol, Richard; Vermeulen, Jenny; Watkins, Russell F; Wilson, Gary; Wong, Bibiana K Y; Wong, Siaw H; Wong, Tony Y T; Yang, George S; Ypsilanti, Athena R; Jones, Steven J M; Holt, Robert A; Goldowitz, Daniel; Wasserman, Wyeth W; Simpson, Elizabeth M

    2010-09-21

    The Pleiades Promoter Project integrates genomewide bioinformatics with large-scale knockin mouse production and histological examination of expression patterns to develop MiniPromoters and related tools designed to study and treat the brain by directed gene expression. Genes with brain expression patterns of interest are subjected to bioinformatic analysis to delineate candidate regulatory regions, which are then incorporated into a panel of compact human MiniPromoters to drive expression to brain regions and cell types of interest. Using single-copy, homologous-recombination "knockins" in embryonic stem cells, each MiniPromoter reporter is integrated immediately 5' of the Hprt locus in the mouse genome. MiniPromoter expression profiles are characterized in differentiation assays of the transgenic cells or in mouse brains following transgenic mouse production. Histological examination of adult brains, eyes, and spinal cords for reporter gene activity is coupled to costaining with cell-type-specific markers to define expression. The publicly available Pleiades MiniPromoter Project is a key resource to facilitate research on brain development and therapies.

  12. Prognostic value of programmed cell death ligand 1 expression in patients with head and neck cancer: A systematic review and meta-analysis.

    PubMed

    Li, Ji; Wang, Ping; Xu, Youliang

    2017-01-01

    Programmed cell death ligand 1 (PD-L1) expression was reported to be correlated with poor prognosis in various cancers. However, the relationship between PD-L1 expression and the survival of patients with head and neck cancer (HNC) remains inconclusive. In the present study, we aimed to clarify the prognostic value of PD-L1 in HNC patients using meta-analysis techniques. A comprehensive database search was conducted in PubMed, EMBASE, Web of Science and the Cochrane Library from inception to August 2016. Studies meeting the inclusion criteria were included. The methodological quality of included studies was assessed by the Newcastle-Ottawa quality assessment scale. Hazard ratios (HRs) with their corresponding 95% confidence intervals (CIs) were pooled by STATA 11.0 for the outcomes of overall survival (OS) and disease-free survival (DFS). A total of 17 studies with 2,869 HNC patients were included in the meta-analysis. The results of the meta-analysis showed that there was no significant correlation between PD-L1 expression and OS (HR, 1.23; 95% CI, 0.99-1.53; P = 0.065) or DFS (HR, 1.42; 95% CI, 1.00-2.03; P = 0.052) of HNC patients. However, the subgroup analysis suggested that positive expression of PD-L1 was associated with poor OS (HR, 1.38; 95% CI, 1.12-1.70; P = 0.003) and DFS (HR, 1.99; 95% CI, 1.59-2.48; P = 0.001) in HNC patients from Asian countries/regions. The subgroup analysis also showed that the correlations between PD-L1 and prognosis varied among different subtypes of HNC. When performing sensitivity analyses, we found that the results of the meta-analyses were not robust. The meta-analysis indicated that positive expression of PD-L1 could serve as a good predictor for poor prognosis of Asian patients with HNC. However, the findings still need to be confirmed by large-scale, prospective studies.
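
    As a rough illustration of how hazard ratios with 95% CIs can be pooled, the following Python sketch uses generic inverse-variance (fixed-effect) weighting on the log scale. The abstract states only that STATA 11.0 was used, so the choice of a fixed-effect model, the function name, and the input numbers are assumptions and do not reproduce the authors' analysis.

```python
import math


def pool_hazard_ratios(studies, z=1.96):
    """Fixed-effect inverse-variance pooling of hazard ratios.

    studies: iterable of (hr, ci_low, ci_high) tuples with 95% CIs.
    The standard error of log(HR) is recovered from the CI width,
    assuming the interval is symmetric on the log scale.
    Returns (pooled_hr, pooled_ci_low, pooled_ci_high).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for hr, ci_low, ci_high in studies:
        log_hr = math.log(hr)
        se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
        weight = 1.0 / se ** 2          # inverse-variance weight
        weighted_sum += weight * log_hr
        weight_total += weight
    pooled_log = weighted_sum / weight_total
    pooled_se = math.sqrt(1.0 / weight_total)
    return (math.exp(pooled_log),
            math.exp(pooled_log - z * pooled_se),
            math.exp(pooled_log + z * pooled_se))


if __name__ == "__main__":
    # Hypothetical study-level estimates, for illustration only.
    example = [(1.5, 1.0, 2.25), (1.2, 0.9, 1.6), (1.8, 1.1, 2.95)]
    print(pool_hazard_ratios(example))
```

    A random-effects model would add a between-study variance term to each weight; that extension is omitted here to keep the sketch short.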

  13. Screening and large-scale expression of membrane proteins in mammalian cells for structural studies.

    PubMed

    Goehring, April; Lee, Chia-Hsueh; Wang, Kevin H; Michel, Jennifer Carlisle; Claxton, Derek P; Baconguis, Isabelle; Althoff, Thorsten; Fischer, Suzanne; Garcia, K Christopher; Gouaux, Eric

    2014-11-01

    Structural, biochemical and biophysical studies of eukaryotic membrane proteins are often hampered by difficulties in overexpression of the candidate molecule. Baculovirus transduction of mammalian cells (BacMam), although a powerful method to heterologously express membrane proteins, can be cumbersome for screening and expression of multiple constructs. We therefore developed plasmid Eric Gouaux (pEG) BacMam, a vector optimized for use in screening assays, as well as for efficient production of baculovirus and robust expression of the target protein. In this protocol, we show how to use small-scale transient transfection and fluorescence-detection size-exclusion chromatography (FSEC) experiments using a GFP-His8-tagged candidate protein to screen for monodispersity and expression level. Once promising candidates are identified, we describe how to generate baculovirus, transduce HEK293S GnTI(-) (N-acetylglucosaminyltransferase I-negative) cells in suspension culture and overexpress the candidate protein. We have used these methods to prepare pure samples of chicken acid-sensing ion channel 1a (cASIC1) and Caenorhabditis elegans glutamate-gated chloride channel (GluCl) for X-ray crystallography, demonstrating how to rapidly and efficiently screen hundreds of constructs and accomplish large-scale expression in 4-6 weeks.

  14. Channel correlation and BER performance analysis of coherent optical communication systems with receive diversity over moderate-to-strong non-Kolmogorov turbulence.

    PubMed

    Fu, Yulong; Ma, Jing; Tan, Liying; Yu, Siyuan; Lu, Gaoyuan

    2018-04-10

    In this paper, new expressions of the channel-correlation coefficient and its components (the large- and small-scale channel-correlation coefficients) for a plane wave are derived for a horizontal link in moderate-to-strong non-Kolmogorov turbulence using a generalized effective atmospheric spectrum which includes finite-turbulence inner and outer scales and high-wave-number "bump". The closed-form expression of the average bit error rate (BER) of the coherent free-space optical communication system is derived using the derived channel-correlation coefficients and an α-μ distribution to approximate the sum of the square root of arbitrarily correlated Gamma-Gamma random variables. Analytical results are provided to investigate the channel correlation and evaluate the average BER performance. The validity of the proposed approximation is illustrated by Monte Carlo simulations. This work will help with further investigation of the fading correlation in spatial diversity systems.

  15. Fully synchronous solutions and the synchronization phase transition for the finite-N Kuramoto model

    NASA Astrophysics Data System (ADS)

    Bronski, Jared C.; DeVille, Lee; Jip Park, Moon

    2012-09-01

    We present a detailed analysis of the stability of phase-locked solutions to the Kuramoto system of oscillators. We derive an analytical expression counting the dimension of the unstable manifold associated to a given stationary solution. From this we are able to derive a number of consequences, including analytic expressions for the first and last frequency vectors to phase-lock, upper and lower bounds on the probability that a randomly chosen frequency vector will phase-lock, and very sharp results on the large N limit of this model. One of the surprises in this calculation is that for frequencies that are Gaussian distributed, the correct scaling for full synchrony is not the one commonly studied in the literature; rather, there is a logarithmic correction to the scaling which is related to the extremal value statistics of the random frequency vector.
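
    For readers unfamiliar with the model, a minimal numerical sketch of the finite-N Kuramoto system, dθ_i/dt = ω_i + (K/N) Σ_j sin(θ_j − θ_i), is given below in Python. It only simulates the dynamics and measures synchrony; it does not implement the stability analysis of the paper. The function names, the forward-Euler integrator, and all parameter values are illustrative assumptions.

```python
import numpy as np


def simulate_kuramoto(omega, coupling, theta0, dt=0.01, steps=5000):
    """Integrate the all-to-all Kuramoto model with forward Euler.

    Uses the mean-field identity
        (1/N) * sum_j sin(theta_j - theta_i) = r * sin(psi - theta_i),
    where r * exp(i * psi) is the mean of exp(i * theta_j).
    """
    theta = np.array(theta0, dtype=float)
    omega = np.asarray(omega, dtype=float)
    for _ in range(steps):
        z = np.exp(1j * theta).mean()   # complex order parameter
        theta += dt * (omega + coupling * np.abs(z) * np.sin(np.angle(z) - theta))
    return theta


def order_parameter(theta):
    """Return r in [0, 1]; r = 1 means all phases coincide."""
    return float(np.abs(np.exp(1j * np.asarray(theta)).mean()))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 50
    omega = rng.normal(0.0, 1.0, n)            # Gaussian natural frequencies
    theta0 = rng.uniform(-np.pi, np.pi, n)
    for k in (0.5, 2.0, 8.0):
        final = simulate_kuramoto(omega, k, theta0)
        print(k, order_parameter(final))
```

    Sweeping the coupling K as in the usage example gives a quick empirical view of the transition toward phase locking that the abstract treats analytically.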

  16. Introduction to bioinformatics.

    PubMed

    Can, Tolga

    2014-01-01

    Bioinformatics is an interdisciplinary field mainly involving molecular biology and genetics, computer science, mathematics, and statistics. Data-intensive, large-scale biological problems are addressed from a computational point of view. The most common problems are modeling biological processes at the molecular level and making inferences from collected data. A bioinformatics solution usually involves the following steps: (1) collect statistics from biological data; (2) build a computational model; (3) solve a computational modeling problem; (4) test and evaluate a computational algorithm. This chapter gives a brief introduction to bioinformatics by first providing an introduction to biological terminology and then discussing some classical bioinformatics problems organized by the types of data sources. Sequence analysis is the analysis of DNA and protein sequences for clues regarding function and includes subproblems such as identification of homologs, multiple sequence alignment, searching sequence patterns, and evolutionary analyses. Protein structures are three-dimensional data and the associated problems are structure prediction (secondary and tertiary), analysis of protein structures for clues regarding function, and structural alignment. Gene expression data are usually represented as matrices, and analysis of microarray data mostly involves statistical analysis, classification, and clustering approaches. Biological networks such as gene regulatory networks, metabolic pathways, and protein-protein interaction networks are usually modeled as graphs, and graph-theoretic approaches are used to solve associated problems such as construction and analysis of large-scale networks.

  17. Grid-Enabled Quantitative Analysis of Breast Cancer

    DTIC Science & Technology

    2009-10-01

    large-scale, multi-modality computerized image analysis. The central hypothesis of this research is that large-scale image analysis for breast cancer...pilot study to utilize large scale parallel Grid computing to harness the nationwide cluster infrastructure for optimization of medical image ... analysis parameters. Additionally, we investigated the use of cutting edge data analysis/mining techniques as applied to Ultrasound, FFDM, and DCE-MRI Breast

  18. How much does a tokamak reactor cost?

    NASA Astrophysics Data System (ADS)

    Freidberg, J.; Cerfon, A.; Ballinger, S.; Barber, J.; Dogra, A.; McCarthy, W.; Milanese, L.; Mouratidis, T.; Redman, W.; Sandberg, A.; Segal, D.; Simpson, R.; Sorensen, C.; Zhou, M.

    2017-10-01

    The cost of a fusion reactor is of critical importance to its ultimate acceptability as a commercial source of electricity. While there are general rules of thumb for scaling both overnight cost and levelized cost of electricity the corresponding relations are not very accurate or universally agreed upon. We have carried out a series of scaling studies of tokamak reactor costs based on reasonably sophisticated plasma and engineering models. The analysis is largely analytic, requiring only a simple numerical code, thus allowing a very large number of designs. Importantly, the studies are aimed at plasma physicists rather than fusion engineers. The goals are to assess the pros and cons of steady state burning plasma experiments and reactors. One specific set of results discusses the benefits of higher magnetic fields, now possible because of the recent development of high T rare earth superconductors (REBCO); with this goal in mind, we calculate quantitative expressions, including both scaling and multiplicative constants, for cost and major radius as a function of central magnetic field.

  19. Automation of large scale transient protein expression in mammalian cells

    PubMed Central

    Zhao, Yuguang; Bishop, Benjamin; Clay, Jordan E.; Lu, Weixian; Jones, Margaret; Daenke, Susan; Siebold, Christian; Stuart, David I.; Yvonne Jones, E.; Radu Aricescu, A.

    2011-01-01

    Traditional mammalian expression systems rely on the time-consuming generation of stable cell lines; this is difficult to accommodate within a modern structural biology pipeline. Transient transfections are a fast, cost-effective solution, but require skilled cell culture scientists, making man-power a limiting factor in a setting where numerous samples are processed in parallel. Here we report a strategy employing a customised CompacT SelecT cell culture robot allowing the large-scale expression of multiple protein constructs in a transient format. Successful protocols have been designed for automated transient transfection of human embryonic kidney (HEK) 293T and 293S GnTI− cells in various flask formats. Protein yields obtained by this method were similar to those produced manually, with the added benefit of reproducibility, regardless of user. Automation of cell maintenance and transient transfection allows the expression of high quality recombinant protein in a completely sterile environment with limited support from a cell culture scientist. The reduction in human input has the added benefit of enabling continuous cell maintenance and protein production, features of particular importance to structural biology laboratories, which typically use large quantities of pure recombinant proteins, and often require rapid characterisation of a series of modified constructs. This automated method for large scale transient transfection is now offered as a Europe-wide service via the P-cube initiative. PMID:21571074

  20. State of the Art Methodology for the Design and Analysis of Future Large Scale Evaluations: A Selective Examination.

    ERIC Educational Resources Information Center

    Burstein, Leigh

    Two specific methods of analysis in large-scale evaluations are considered: structural equation modeling and selection modeling/analysis of non-equivalent control group designs. Their utility in large-scale educational program evaluation is discussed. The examination of these methodological developments indicates how people (evaluators,…

  1. Modeling of Hurricane Impacts

    DTIC Science & Technology

    2007-12-21

    2.4 Implementation of non-uniform gridsize The numerical method has been extended to allow non-uniform gridsizes in x and y direction, though the...and the vertical excursion of the swash motion A is expressed as 0.125 / 0 inaA sT g h π = . Figure 3 and 4 compare the XBeach results with the...A. Van Gent, A. J. H. M. Reniers, and D. J. R. Walstra (2008), Analysis of dune erosion processes in large scale flume experiments, submitted to

  2. TLM-Quant: An Open-Source Pipeline for Visualization and Quantification of Gene Expression Heterogeneity in Growing Microbial Cells

    PubMed Central

    Piersma, Sjouke; Denham, Emma L.; Drulhe, Samuel; Tonk, Rudi H. J.; Schwikowski, Benno; van Dijl, Jan Maarten

    2013-01-01

    Gene expression heterogeneity is a key driver for microbial adaptation to fluctuating environmental conditions, cell differentiation and the evolution of species. This phenomenon has therefore enormous implications, not only for life in general, but also for biotechnological applications where unwanted subpopulations of non-producing cells can emerge in large-scale fermentations. Only time-lapse fluorescence microscopy allows real-time measurements of gene expression heterogeneity. A major limitation in the analysis of time-lapse microscopy data is the lack of fast, cost-effective, open, simple and adaptable protocols. Here we describe TLM-Quant, a semi-automatic pipeline for the analysis of time-lapse fluorescence microscopy data that enables the user to visualize and quantify gene expression heterogeneity. Importantly, our pipeline builds on the open-source packages ImageJ and R. To validate TLM-Quant, we selected three possible scenarios, namely homogeneous expression, highly ‘noisy’ heterogeneous expression, and bistable heterogeneous expression in the Gram-positive bacterium Bacillus subtilis. This bacterium is both a paradigm for systems-level studies on gene expression and a highly appreciated biotechnological ‘cell factory’. We conclude that the temporal resolution of such analyses with TLM-Quant is only limited by the numbers of recorded images. PMID:23874729

  3. The Center for Optimized Structural Studies (COSS) platform for automation in cloning, expression, and purification of single proteins and protein-protein complexes.

    PubMed

    Mlynek, Georg; Lehner, Anita; Neuhold, Jana; Leeb, Sarah; Kostan, Julius; Charnagalov, Alexej; Stolt-Bergner, Peggy; Djinović-Carugo, Kristina; Pinotsis, Nikos

    2014-06-01

    Expression in Escherichia coli represents the simplest and most cost-effective means for the production of recombinant proteins. This is a routine task in structural biology and biochemistry where milligrams of the target protein are required in high purity and monodispersity. To achieve these criteria, the user often needs to screen several constructs in different expression and purification conditions in parallel. We describe a pipeline, implemented in the Center for Optimized Structural Studies, that enables the systematic screening of expression and purification conditions for recombinant proteins and relies on a series of logical decisions. We first use bioinformatics tools to design a series of protein fragments, which we clone in parallel, and subsequently screen in small scale for optimal expression and purification conditions. Based on a scoring system that assesses soluble expression, we then select the top-ranking targets for large-scale purification. In the establishment of our pipeline, emphasis was put on streamlining the processes such that they can be easily, but not necessarily, automated. In a typical run of about 2 weeks, we are able to prepare and perform small-scale expression screens for 20-100 different constructs followed by large-scale purification of at least 4-6 proteins. The major advantage of our approach is its flexibility, which allows for easy adoption, either partially or entirely, by any average hypothesis-driven laboratory in a manual or robot-assisted manner.

  4. Immunological metagene signatures derived from immunogenic cancer cell death associate with improved survival of patients with lung, breast or ovarian malignancies: A large-scale meta-analysis

    PubMed Central

    Garg, Abhishek D.; De Ruysscher, Dirk; Agostinis, Patrizia

    2016-01-01

    The emerging role of the cancer cell-immune cell interface in shaping tumorigenesis/anticancer immunotherapy has increased the need to identify prognostic biomarkers. Hence, our primary aim was to identify the immunogenic cell death (ICD)-derived metagene signatures in breast, lung and ovarian cancer that associate with improved patient survival. To this end, we analyzed the prognostic impact of differential gene-expression of 33 pre-clinically-validated ICD-parameters through a large-scale meta-analysis involving 3,983 patients (‘discovery’ dataset) across lung (1,432), breast (1,115) and ovarian (1,436) malignancies. The main results were also substantiated in ‘validation’ datasets consisting of 818 patients of the same cancer-types (i.e. 285 breast/274 lung/259 ovarian). The ICD-associated parameters exhibited a highly-clustered and largely cancer type-specific prognostic impact. Interestingly, we delineated ICD-derived consensus-metagene signatures that exhibited a positive prognostic impact that was either cancer type-independent or specific. Importantly, most of these ICD-derived consensus-metagenes acted as attractor-metagenes and thereby ‘attracted’ highly co-expressing sets of genes or convergent-metagenes. These convergent-metagenes also exhibited positive prognostic impact in respective cancer types. Remarkably, we found that the cancer type-independent consensus-metagene acted as an ‘attractor’ for cancer-specific convergent-metagenes. This reaffirms that the immunological prognostic landscape of cancer tends to segregate between cancer-independent and cancer-type specific gene signatures. Moreover, this prognostic landscape was largely dominated by the classical T cell activity/infiltration/function-related biomarkers. Interestingly, each cancer type tended to associate with biomarkers representing a specific T cell activity or function rather than pan-T cell biomarkers. 
Thus, our analysis confirms that ICD can serve as a platform for discovery of novel prognostic metagenes. PMID:27057433

  5. Multi-tissue analysis of co-expression networks by higher-order generalized singular value decomposition identifies functionally coherent transcriptional modules.

    PubMed

    Xiao, Xiaolin; Moreno-Moral, Aida; Rotival, Maxime; Bottolo, Leonardo; Petretto, Enrico

    2014-01-01

    Recent high-throughput efforts such as ENCODE have generated a large body of genome-scale transcriptional data in multiple conditions (e.g., cell-types and disease states). Leveraging these data is especially important for network-based approaches to human disease, for instance to identify coherent transcriptional modules (subnetworks) that can inform functional disease mechanisms and pathological pathways. Yet, genome-scale network analysis across conditions is significantly hampered by the paucity of robust and computationally-efficient methods. Building on the Higher-Order Generalized Singular Value Decomposition, we introduce a new algorithmic approach for efficient, parameter-free and reproducible identification of network-modules simultaneously across multiple conditions. Our method can accommodate weighted (and unweighted) networks of any size and can similarly use co-expression or raw gene expression input data, without hinging upon the definition and stability of the correlation used to assess gene co-expression. In simulation studies, we demonstrated distinctive advantages of our method over existing methods: it accurately recovered both common and condition-specific network-modules without requiring the ad-hoc input parameters needed by other approaches. We applied our method to genome-scale and multi-tissue transcriptomic datasets from rats (microarray-based) and humans (mRNA-sequencing-based) and identified several common and tissue-specific subnetworks with functional significance, which were not detected by other methods. In humans, we recapitulated the crosstalk between cell-cycle progression and cell-extracellular matrix interaction processes in ventricular zones during neocortex expansion and, further, uncovered previously unappreciated pathways related to the development of later cognitive functions in the cortical plate of the developing brain. 
Analyses of seven rat tissues identified a multi-tissue subnetwork of co-expressed heat shock protein (Hsp) and cardiomyopathy genes (Bag3, Cryab, Kras, Emd, Plec), which was significantly replicated using separate failing heart and liver gene expression datasets in humans, thus revealing a conserved functional role for Hsp genes in cardiovascular disease.

  6. Portraying the Expression Landscapes of B-Cell Lymphoma-Intuitive Detection of Outlier Samples and of Molecular Subtypes

    PubMed Central

    Hopp, Lydia; Lembcke, Kathrin; Binder, Hans; Wirth, Henry

    2013-01-01

    We present an analytic framework based on Self-Organizing Map (SOM) machine learning to study large-scale patient data sets. The potency of the approach is demonstrated in a case study using gene expression data of more than 200 mature aggressive B-cell lymphoma patients. The method portrays each sample with individual resolution, characterizes the subtypes, disentangles the expression patterns into distinct modules, extracts their functional context using enrichment techniques and enables investigation of the similarity relations between the samples. The method also allows detection and correction of outliers caused by contamination. Based on our analysis, we propose a refined classification of B-cell lymphoma into four molecular subtypes, which are characterized by differential functional and clinical characteristics. PMID:24833231

  7. Workflow based framework for life science informatics.

    PubMed

    Tiwari, Abhishek; Sekhar, Arvind K T

    2007-10-01

    Workflow technology is a generic mechanism to integrate diverse types of available resources (databases, servers, software applications and different services) which facilitate knowledge exchange within traditionally divergent fields such as molecular biology, clinical research, computational science, physics, chemistry and statistics. Researchers can easily incorporate and access diverse, distributed tools and data to develop their own research protocols for scientific analysis. Application of workflow technology has been reported in areas like drug discovery, genomics, large-scale gene expression analysis, proteomics, and systems biology. In this article, we discuss the existing workflow systems and the trends in applications of workflow-based systems.

  8. Development of 5123 Intron-Length Polymorphic Markers for Large-Scale Genotyping Applications in Foxtail Millet

    PubMed Central

    Muthamilarasan, Mehanathan; Venkata Suresh, B.; Pandey, Garima; Kumari, Kajal; Parida, Swarup Kumar; Prasad, Manoj

    2014-01-01

    Generating genomic resources in terms of molecular markers is imperative in molecular breeding for crop improvement. Though large-scale development and application of microsatellite markers have been reported in the model crop foxtail millet, no such large-scale study has been conducted for intron-length polymorphic (ILP) markers. Considering this, we developed 5123 ILP markers, of which 4049 were physically mapped onto 9 chromosomes of foxtail millet. BLAST analysis of 5123 expressed sequence tags (ESTs) suggested the function for ∼71.5% of the ESTs and grouped them into 5 different functional categories. About 440 selected primer pairs representing the foxtail millet genome and the different functional groups showed a high level of cross-genera amplification, at an average of ∼85% in eight millets and five non-millet species. The efficacy of the ILP markers for distinguishing foxtail millet accessions is demonstrated by observed heterozygosity (0.20) and Nei's average gene diversity (0.22). In silico comparative mapping of physically mapped ILP markers demonstrated a substantial percentage of sequence-based orthology and syntenic relationship between foxtail millet chromosomes and sorghum (∼50%), maize (∼46%), rice (∼21%) and Brachypodium (∼21%) chromosomes. Hence, for the first time, we developed large-scale ILP markers in foxtail millet and demonstrated their utility in germplasm characterization, transferability, phylogenetics and comparative mapping studies in millets and bioenergy grass species. PMID:24086082

  9. Network-directed cis-mediator analysis of normal prostate tissue expression profiles reveals downstream regulatory associations of prostate cancer susceptibility loci.

    PubMed

    Larson, Nicholas B; McDonnell, Shannon K; Fogarty, Zach; Larson, Melissa C; Cheville, John; Riska, Shaun; Baheti, Saurabh; Weber, Alexandra M; Nair, Asha A; Wang, Liang; O'Brien, Daniel; Davila, Jaime; Schaid, Daniel J; Thibodeau, Stephen N

    2017-10-17

    Large-scale genome-wide association studies have identified multiple single-nucleotide polymorphisms associated with risk of prostate cancer. Many of these genetic variants are presumed to be regulatory in nature; however, follow-up expression quantitative trait loci (eQTL) association studies have to date been restricted largely to cis-acting associations due to study limitations. While trans-eQTL scans suffer from high testing dimensionality, recent evidence indicates most trans-eQTL associations are mediated by cis-regulated genes, such as transcription factors. Leveraging a data-driven gene co-expression network, we conducted a comprehensive cis-mediator analysis using RNA-Seq data from 471 normal prostate tissue samples to identify downstream regulatory associations of previously identified prostate cancer risk variants. We discovered multiple trans-eQTL associations that were significantly mediated by cis-regulated transcripts, four of which involved risk locus 17q12, proximal transcription factor HNF1B, and target trans-genes with known HNF response elements (MIA2, SRC, SEMA6A, KIF12). We additionally identified evidence of cis-acting down-regulation of MSMB via rs10993994 corresponding to reduced co-expression of NDRG1. The majority of these cis-mediator relationships demonstrated trans-eQTL replicability in 87 prostate tissue samples from the Gene-Tissue Expression Project. These findings provide further biological context to known risk loci and outline new hypotheses for investigation into the etiology of prostate cancer.

  10. Orthogonal control of expression mean and variance by epigenetic features at different genomic loci

    DOE PAGES

    Dey, Siddharth S.; Foley, Jonathan E.; Limsirichai, Prajit; ...

    2015-05-05

    While gene expression noise has been shown to drive dramatic phenotypic variations, the molecular basis for this variability in mammalian systems is not well understood. Gene expression has been shown to be regulated by promoter architecture and the associated chromatin environment. However, the exact contribution of these two factors in regulating expression noise has not been explored. Using a dual-reporter lentiviral model system, we deconvolved the influence of the promoter sequence to systematically study the contribution of the chromatin environment at different genomic locations in regulating expression noise. By integrating a large-scale analysis to quantify mRNA levels by smFISH and protein levels by flow cytometry in single cells, we found that mean expression and noise are uncorrelated across genomic locations. Furthermore, we showed that this independence could be explained by the orthogonal control of mean expression by the transcript burst size and noise by the burst frequency. Finally, we showed that genomic locations displaying higher expression noise are associated with more repressed chromatin, thereby indicating the contribution of the chromatin environment in regulating expression noise.

  11. Resilient protein co-expression network in male orbitofrontal cortex layer 2/3 during human aging.

    PubMed

    Pabba, Mohan; Scifo, Enzo; Kapadia, Fenika; Nikolova, Yuliya S; Ma, Tianzhou; Mechawar, Naguib; Tseng, George C; Sibille, Etienne

    2017-10-01

    The orbitofrontal cortex (OFC) is vulnerable to normal and pathologic aging. Currently, layer resolution large-scale proteomic studies describing "normal" age-related alterations at OFC are not available. Here, we performed a large-scale exploratory high-throughput mass spectrometry-based protein analysis on OFC layer 2/3 from 15 "young" (15-43 years) and 18 "old" (62-88 years) human male subjects. We detected 4193 proteins and identified 127 differentially expressed (DE) proteins (p-value ≤0.05; effect size >20%), including 65 up- and 62 downregulated proteins (e.g., GFAP, CALB1). Using a previously described categorization of biological aging based on somatic tissues, that is, peripheral "hallmarks of aging," and considering overlap in protein function, we show the highest representation of altered cell-cell communication (54%), deregulated nutrient sensing (39%), and loss of proteostasis (35%) in the set of OFC layer 2/3 DE proteins. DE proteins also showed a significant association with several neurologic disorders; for example, Alzheimer's disease and schizophrenia. Notably, despite age-related changes in individual protein levels, protein co-expression modules were remarkably conserved across age groups, suggesting robust functional homeostasis. Collectively, these results provide biological insight into aging and associated homeostatic mechanisms that maintain normal brain function with advancing age. Copyright © 2017 Elsevier Inc. All rights reserved.

  12. Demonstration-Scale High-Cell-Density Fermentation of Pichia pastoris.

    PubMed

    Liu, Wan-Cang; Zhu, Ping

    2018-01-01

    Pichia pastoris has been one of the most successful heterologous overexpression systems in generating proteins for large-scale production through high-cell-density fermentation. However, optimizing conditions of the large-scale high-cell-density fermentation for biochemistry and industrialization is usually a laborious and time-consuming process. Furthermore, it is often difficult to produce authentic proteins in large quantities, which is a major obstacle for functional and structural analysis and industrial application. For these reasons, we have developed a protocol for efficient demonstration-scale high-cell-density fermentation of P. pastoris, which employs a new methanol-feeding strategy (the biomass-stat strategy) and a strategy of increased air pressure instead of pure oxygen supplement. The protocol includes three typical stages: glycerol batch fermentation (initial culture phase), glycerol fed-batch fermentation (biomass accumulation phase), and methanol fed-batch fermentation (induction phase). It allows direct online monitoring of fermentation conditions, including broth pH, temperature, DO, anti-foam generation, and feeding of glycerol and methanol. Using this protocol, production of the recombinant β-xylosidase of Lentinula edodes origin in 1000-L scale fermentation can be up to ~900 mg/L or 9.4 mg/g cells (dry cell weight, intracellular expression), with the specific production rate and average specific production of 0.1 mg/g/h and 0.081 mg/g/h, respectively. The methodology described in this protocol can be easily transferred to other systems and is suitable for scaling up production of a large number of proteins used for either scientific studies or commercial purposes.

  13. Transcriptomic analysis of grain amaranth (Amaranthus hypochondriacus) using 454 pyrosequencing: comparison with A. tuberculatus, expression profiling in stems and in response to biotic and abiotic stress

    PubMed Central

    2011-01-01

    Background: Amaranthus hypochondriacus, a grain amaranth, is a C4 plant noted for its ability to tolerate stressful conditions and produce highly nutritious seeds. These possess an optimal amino acid balance and constitute a rich source of health-promoting peptides. Although several recent studies, mostly involving subtractive hybridization strategies, have contributed to increase the relatively low number of grain amaranth expressed sequence tags (ESTs), transcriptomic information of this species remains limited, particularly regarding tissue-specific and biotic stress-related genes. Thus, a large-scale transcriptome analysis was performed to generate stem- and (a)biotic stress-responsive gene expression profiles in grain amaranth. Results: A total of 2,700,168 raw reads were obtained from six 454 pyrosequencing runs, which were assembled into 21,207 high quality sequences (20,408 isotigs + 799 contigs). The average sequence length was 1,064 bp and 930 bp for isotigs and contigs, respectively. Only 5,113 singletons were recovered after quality control. Contigs/isotigs were further incorporated into 15,667 isogroups. All unique sequences were queried against the nr, TAIR, UniRef100, UniRef50 and Amaranthaceae EST databases for annotation. Functional GO annotation was performed with all contigs/isotigs that produced significant hits with the TAIR database. Only 8,260 sequences were found to be homologous when the transcriptomes of A. tuberculatus and A. hypochondriacus were compared, most of which were associated with basic house-keeping processes. Digital expression analysis identified 1,971 differentially expressed genes in response to at least one of four stress treatments tested. These included several multiple-stress-inducible genes that could represent potential candidates for use in the engineering of stress-resistant plants. The transcriptomic data generated from pigmented stems shared similarity with findings reported in developing stems of Arabidopsis and black cottonwood (Populus trichocarpa). Conclusions: This study represents the first large-scale transcriptomic analysis of A. hypochondriacus, considered to be a highly nutritious and stress-tolerant crop. Numerous genes were found to be induced in response to (a)biotic stress, many of which could further the understanding of the mechanisms that contribute to multiple stress-resistance in plants, a trait that has potential biotechnological applications in agriculture. PMID:21752295

  14. Symposium on Parallel Computational Methods for Large-scale Structural Analysis and Design, 2nd, Norfolk, VA, US

    NASA Technical Reports Server (NTRS)

    Storaasli, Olaf O. (Editor); Housner, Jerrold M. (Editor)

    1993-01-01

    Computing speed is leaping forward by several orders of magnitude each decade. Engineers and scientists gathered at a NASA Langley symposium to discuss these exciting trends as they apply to parallel computational methods for large-scale structural analysis and design. Among the topics discussed were: large-scale static analysis; dynamic, transient, and thermal analysis; domain decomposition (substructuring); and nonlinear and numerical methods.

  15. Gene expression of Caenorhabditis elegans neurons carries information on their synaptic connectivity.

    PubMed

    Kaufman, Alon; Dror, Gideon; Meilijson, Isaac; Ruppin, Eytan

    2006-12-08

    The claim that genetic properties of neurons significantly influence their synaptic network structure is a common notion in neuroscience. The nematode Caenorhabditis elegans provides an exciting opportunity to approach this question in a large-scale quantitative manner. Its synaptic connectivity network has been identified, and, combined with cellular studies, we currently have characteristic connectivity and gene expression signatures for most of its neurons. By using two complementary analysis assays we show that the expression signature of a neuron carries significant information about its synaptic connectivity signature, and identify a list of putative genes predicting neural connectivity. The current study rigorously quantifies the relation between gene expression and synaptic connectivity signatures in the C. elegans nervous system and identifies subsets of neurons where this relation is highly marked. The results presented and the genes identified provide a promising starting point for further, more detailed computational and experimental investigations.

  16. Machine Learning–Based Differential Network Analysis: A Study of Stress-Responsive Transcriptomes in Arabidopsis

    PubMed Central

    Ma, Chuang; Xin, Mingming; Feldmann, Kenneth A.; Wang, Xiangfeng

    2014-01-01

    Machine learning (ML) is an intelligent data mining technique that builds a prediction model based on the learning of prior knowledge to recognize patterns in large-scale data sets. We present an ML-based methodology for transcriptome analysis via comparison of gene coexpression networks, implemented as an R package called machine learning–based differential network analysis (mlDNA) and apply this method to reanalyze a set of abiotic stress expression data in Arabidopsis thaliana. The mlDNA first used a ML-based filtering process to remove nonexpressed, constitutively expressed, or non-stress-responsive “noninformative” genes prior to network construction, through learning the patterns of 32 expression characteristics of known stress-related genes. The retained “informative” genes were subsequently analyzed by ML-based network comparison to predict candidate stress-related genes showing expression and network differences between control and stress networks, based on 33 network topological characteristics. Comparative evaluation of the network-centric and gene-centric analytic methods showed that mlDNA substantially outperformed traditional statistical testing–based differential expression analysis at identifying stress-related genes, with markedly improved prediction accuracy. To experimentally validate the mlDNA predictions, we selected 89 candidates out of the 1784 predicted salt stress–related genes with available SALK T-DNA mutagenesis lines for phenotypic screening and identified two previously unreported genes, mutants of which showed salt-sensitive phenotypes. PMID:24520154

  17. Normalization of RNA-seq data using factor analysis of control genes or samples

    PubMed Central

    Risso, Davide; Ngai, John; Speed, Terence P.; Dudoit, Sandrine

    2015-01-01

    Normalization of RNA-seq data has proven essential to ensure accurate inference of expression levels. Here we show that usual normalization approaches mostly account for sequencing depth and fail to correct for library preparation and other more-complex unwanted effects. We evaluate the performance of the External RNA Control Consortium (ERCC) spike-in controls and investigate the possibility of using them directly for normalization. We show that the spike-ins are not reliable enough to be used in standard global-scaling or regression-based normalization procedures. We propose a normalization strategy, remove unwanted variation (RUV), that adjusts for nuisance technical effects by performing factor analysis on suitable sets of control genes (e.g., ERCC spike-ins) or samples (e.g., replicate libraries). Our approach leads to more-accurate estimates of expression fold-changes and tests of differential expression compared to state-of-the-art normalization methods. In particular, RUV promises to be valuable for large collaborative projects involving multiple labs, technicians, and/or platforms. PMID:25150836

  18. DTWscore: differential expression and cell clustering analysis for time-series single-cell RNA-seq data.

    PubMed

    Wang, Zhuo; Jin, Shuilin; Liu, Guiyou; Zhang, Xiurui; Wang, Nan; Wu, Deliang; Hu, Yang; Zhang, Chiping; Jiang, Qinghua; Xu, Li; Wang, Yadong

    2017-05-23

    The development of single-cell RNA sequencing has enabled profound discoveries in biology, ranging from the dissection of the composition of complex tissues to the identification of novel cell types and dynamics in some specialized cellular environments. However, the large-scale generation of single-cell RNA-seq (scRNA-seq) data collected at multiple time points remains a challenge for the effective measurement of gene expression patterns in transcriptome analysis. We present an algorithm based on the Dynamic Time Warping score (DTWscore) combined with time-series data that enables the detection of gene expression changes across scRNA-seq samples and the recovery of potential cell types from complex mixtures of multiple cell types. The DTWscore successfully classifies cells of different types with the most highly variable genes from time-series scRNA-seq data. The study was confined to methods that are implemented and available within the R framework. Sample datasets and R packages are available at https://github.com/xiaoxiaoxier/DTWscore .
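
    The abstract above builds on dynamic time warping (DTW). As a rough illustration only, the following Python sketch computes the textbook DTW distance between two expression time series. It is not the DTWscore algorithm itself, which is distributed as an R package; the function name `dtw_distance` and the toy series are assumptions made for this example.

```python
def dtw_distance(a, b):
    """Textbook dynamic time warping distance between two numeric sequences.

    Illustrative only: this is generic DTW, not the DTWscore statistic.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = step + min(cost[i - 1][j],      # advance in a only
                                    cost[i][j - 1],      # advance in b only
                                    cost[i - 1][j - 1])  # advance in both
    return cost[n][m]


# Hypothetical expression profiles of one gene in two groups of cells.
profile_a = [0.0, 1.0, 2.0]
profile_b = [0.0, 0.0, 1.0, 2.0]   # same shape, delayed by one time point
print(dtw_distance(profile_a, profile_b))
```

    Because DTW allows one series to be stretched against the other, a profile that is merely delayed scores as close to its original, whereas a point-by-point comparison would penalize the shift.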

  19. In silico Pathway Activation Network Decomposition Analysis (iPANDA) as a method for biomarker development.

    PubMed

    Ozerov, Ivan V; Lezhnina, Ksenia V; Izumchenko, Evgeny; Artemov, Artem V; Medintsev, Sergey; Vanhaelen, Quentin; Aliper, Alexander; Vijg, Jan; Osipov, Andreyan N; Labat, Ivan; West, Michael D; Buzdin, Anton; Cantor, Charles R; Nikolsky, Yuri; Borisov, Nikolay; Irincheeva, Irina; Khokhlovich, Edward; Sidransky, David; Camargo, Miguel Luiz; Zhavoronkov, Alex

    2016-11-16

    Signalling pathway activation analysis is a powerful approach for extracting biologically relevant features from large-scale transcriptomic and proteomic data. However, modern pathway-based methods often fail to provide stable pathway signatures of a specific phenotype or reliable disease biomarkers. In the present study, we introduce the in silico Pathway Activation Network Decomposition Analysis (iPANDA) as a scalable robust method for biomarker identification using gene expression data. The iPANDA method combines precalculated gene coexpression data with gene importance factors based on the degree of differential gene expression and pathway topology decomposition for obtaining pathway activation scores. Using Microarray Analysis Quality Control (MAQC) data sets and pretreatment data on Taxol-based neoadjuvant breast cancer therapy from multiple sources, we demonstrate that iPANDA provides significant noise reduction in transcriptomic data and identifies highly robust sets of biologically relevant pathway signatures. We successfully apply iPANDA for stratifying breast cancer patients according to their sensitivity to neoadjuvant therapy.

  20. In silico Pathway Activation Network Decomposition Analysis (iPANDA) as a method for biomarker development

    PubMed Central

    Ozerov, Ivan V.; Lezhnina, Ksenia V.; Izumchenko, Evgeny; Artemov, Artem V.; Medintsev, Sergey; Vanhaelen, Quentin; Aliper, Alexander; Vijg, Jan; Osipov, Andreyan N.; Labat, Ivan; West, Michael D.; Buzdin, Anton; Cantor, Charles R.; Nikolsky, Yuri; Borisov, Nikolay; Irincheeva, Irina; Khokhlovich, Edward; Sidransky, David; Camargo, Miguel Luiz; Zhavoronkov, Alex

    2016-01-01

    Signalling pathway activation analysis is a powerful approach for extracting biologically relevant features from large-scale transcriptomic and proteomic data. However, modern pathway-based methods often fail to provide stable pathway signatures of a specific phenotype or reliable disease biomarkers. In the present study, we introduce the in silico Pathway Activation Network Decomposition Analysis (iPANDA) as a scalable robust method for biomarker identification using gene expression data. The iPANDA method combines precalculated gene coexpression data with gene importance factors based on the degree of differential gene expression and pathway topology decomposition for obtaining pathway activation scores. Using Microarray Analysis Quality Control (MAQC) data sets and pretreatment data on Taxol-based neoadjuvant breast cancer therapy from multiple sources, we demonstrate that iPANDA provides significant noise reduction in transcriptomic data and identifies highly robust sets of biologically relevant pathway signatures. We successfully apply iPANDA for stratifying breast cancer patients according to their sensitivity to neoadjuvant therapy. PMID:27848968

  1. Partial least squares based identification of Duchenne muscular dystrophy specific genes.

    PubMed

    An, Hui-bo; Zheng, Hua-cheng; Zhang, Li; Ma, Lin; Liu, Zheng-yan

    2013-11-01

    Large-scale parallel gene expression analysis has made it easier to investigate the underlying mechanisms of Duchenne muscular dystrophy (DMD). Previous studies typically implemented variance/regression analysis, which is fundamentally flawed when unaccounted-for sources of variability exist in the arrays. Here we aim to identify genes that contribute to the pathology of DMD using partial least squares (PLS) based analysis. We carried out PLS-based analysis with two datasets downloaded from the Gene Expression Omnibus (GEO) database to identify genes contributing to the pathology of DMD. In addition to the genes related to inflammation, muscle regeneration and extracellular matrix (ECM) modeling, we found some genes with high fold change that have not been identified by previous studies, such as SRPX, GPNMB, SAT1, and LYZ. In addition, downregulation of the fatty acid metabolism pathway was found, which may be related to the progressive muscle wasting process. Our results provide a better understanding of the downstream mechanisms of DMD.

  2. Batch effects in single-cell RNA-sequencing data are corrected by matching mutual nearest neighbors.

    PubMed

    Haghverdi, Laleh; Lun, Aaron T L; Morgan, Michael D; Marioni, John C

    2018-06-01

    Large-scale single-cell RNA sequencing (scRNA-seq) data sets that are produced in different laboratories and at different times contain batch effects that may compromise the integration and interpretation of the data. Existing scRNA-seq analysis methods incorrectly assume that the composition of cell populations is either known or identical across batches. We present a strategy for batch correction based on the detection of mutual nearest neighbors (MNNs) in the high-dimensional expression space. Our approach does not rely on predefined or equal population compositions across batches; instead, it requires only that a subset of the population be shared between batches. We demonstrate the superiority of our approach compared with existing methods by using both simulated and real scRNA-seq data sets. Using multiple droplet-based scRNA-seq data sets, we demonstrate that our MNN batch-effect-correction method can be scaled to large numbers of cells.
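
    The core idea in the abstract above, pairing cells that are each other's nearest neighbors across batches, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it only detects mutual nearest neighbor pairs and omits the subsequent computation of batch-correction vectors. The function names and the toy coordinates are hypothetical.

```python
def squared_distance(x, y):
    """Squared Euclidean distance between two expression vectors."""
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y))


def k_nearest(query, reference, k):
    """Indices of the k cells in `reference` closest to `query`."""
    order = sorted(range(len(reference)),
                   key=lambda j: squared_distance(query, reference[j]))
    return set(order[:k])


def mutual_nearest_neighbors(batch_a, batch_b, k=1):
    """Return (i, j) pairs where cell i of batch A and cell j of batch B
    each appear among the other's k nearest neighbors in the other batch."""
    nn_a_to_b = [k_nearest(cell, batch_b, k) for cell in batch_a]
    nn_b_to_a = [k_nearest(cell, batch_a, k) for cell in batch_b]
    return [(i, j)
            for i in range(len(batch_a))
            for j in sorted(nn_a_to_b[i])
            if i in nn_b_to_a[j]]


# Toy data: batch B is batch A shifted by a small "batch effect",
# and the third cell type of batch A is absent from batch B.
batch_a = [[0.0, 0.0], [10.0, 10.0], [50.0, 50.0]]
batch_b = [[0.5, 0.5], [10.5, 10.5]]
print(mutual_nearest_neighbors(batch_a, batch_b, k=1))
```

    In this toy case the cell at (50, 50) has a nearest neighbor in batch B but is not that neighbor's nearest cell in return, so it is left unpaired. This mirrors the abstract's point that only a subset of the population needs to be shared between batches.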

  3. Impact of Chromosome 4p- Syndrome on Communication and Expressive Language Skills: A Preliminary Investigation

    ERIC Educational Resources Information Center

    Marshall, Althea T.

    2010-01-01

    Purpose: The purpose of this investigation was to examine the impact of Chromosome 4p- syndrome on the communication and expressive language phenotype of a large cross-cultural population of children, adolescents, and adults. Method: A large-scale survey study was conducted and a descriptive research design was used to analyze quantitative and…

  4. 2D-Difference Gel Electrophoretic Proteomic Analysis of a Cell Culture Model of Alveolar Rhabdomyosarcoma

    PubMed Central

    Pressey, Joseph G.; Pressey, Christine S.; Robinson, Gloria; Herring, Richie; Wilson, Landon; Kelly, David R.; Kim, Helen

    2011-01-01

    To evaluate the consequences of expression of the protein encoded by PAX3-FOXO1 (P3F) in the pediatric malignancy alveolar rhabdomyosarcoma (A-RMS), we developed and evaluated a genetically defined in vitro model of A-RMS tumorigenesis. The expression of P3F in cooperation with simian virus 40 (SV40) Large-T (LT) antigen in murine C3H10T1/2 fibroblasts led to robust malignant transformation. Using 2 dimensional difference gel electrophoresis (2D-DIGE) we compared proteomes from lysates from cells that express P3F + LT versus from cells that express LT alone. Analysis of 2D gel spot patterns by DeCyder™ image analysis software indicated 93 spots that were different in abundance. Peptide mass fingerprint analysis of the 93 spots by matrix assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS) analysis identified 37 non-redundant proteins. 2D DIGE analysis of cell culture media conditioned by cells transduced by P3F + LT versus by LT alone found 29 spots in the P3F + LT cells leading to the identification of 11 non-redundant proteins. A substantial number of proteins with potential roles in tumorigenesis and myogenesis were detected, most of which have not been identified in previous wide-scale expression studies of RMS experimental models or tumors. We validated the 2D gel image analysis findings by western blot analysis and immunohistochemistry (IHC). Thus, the 2D DIGE proteomics methodology described here provided an important discovery approach to the study of RMS biology and complements the findings of previous mRNA expression studies. PMID:21110518

  5. 2D-difference gel electrophoretic proteomic analysis of a cell culture model of alveolar rhabdomyosarcoma.

    PubMed

    Pressey, Joseph G; Pressey, Christine S; Robinson, Gloria; Herring, Richie; Wilson, Landon; Kelly, David R; Kim, Helen

    2011-02-04

    To evaluate the consequences of expression of the protein encoded by PAX3-FOXO1 (P3F) in the pediatric malignancy alveolar rhabdomyosarcoma (A-RMS), we developed and evaluated a genetically defined in vitro model of A-RMS tumorigenesis. The expression of P3F in cooperation with simian virus 40 (SV40) Large-T (LT) antigen in murine C3H10T1/2 fibroblasts led to robust malignant transformation. Using 2-dimensional-difference gel electrophoresis (2D-DIGE), we compared proteomes from lysates from cells that express P3F + LT versus from cells that express LT alone. Analysis of 2D gel spot patterns by DeCyder image analysis software indicated 93 spots that were different in abundance. Peptide mass fingerprint analysis of the 93 spots by matrix assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS) analysis identified 37 nonredundant proteins. 2D-DIGE analysis of cell culture media conditioned by cells transduced by P3F + LT versus by LT alone found 29 spots in the P3F + LT cells leading to the identification of 11 nonredundant proteins. A substantial number of proteins with potential roles in tumorigenesis and myogenesis were detected, most of which have not been identified in previous wide-scale expression studies of RMS experimental models or tumors. We validated the 2D gel image analysis findings by Western blot analysis and immunohistochemistry (IHC). Thus, the 2D-DIGE proteomics methodology described here provided an important discovery approach to the study of RMS biology and complements the findings of previous mRNA expression studies.

  6. Voting contagion: Modeling and analysis of a century of U.S. presidential elections

    PubMed Central

    de Aguiar, Marcus A. M.

    2017-01-01

    Social influence plays an important role in human behavior and decisions. Sources of influence can be divided as external, which are independent of social context, or as originating from peers, such as family and friends. An important question is how to disentangle the social contagion by peers from external influences. While a variety of experimental and observational studies provided insight into this problem, identifying the extent of contagion based on large-scale observational data with an unknown network structure remains largely unexplored. By bridging the gap between the large-scale complex systems perspective of collective human dynamics and the detailed approach of social sciences, we present a parsimonious model of social influence, and apply it to a central topic in political science—elections and voting behavior. We provide an analytical expression of the county vote-share distribution, which is in excellent agreement with almost a century of observed U.S. presidential election data. Analyzing the social influence topography over this period reveals an abrupt phase transition from low to high levels of social contagion, and robust differences among regions. These results suggest that social contagion effects are becoming more instrumental in shaping large-scale collective political behavior, with implications on democratic electoral processes and policies. PMID:28542409

  7. Large-scale generation of human iPSC-derived neural stem cells/early neural progenitor cells and their neuronal differentiation.

    PubMed

    D'Aiuto, Leonardo; Zhi, Yun; Kumar Das, Dhanjit; Wilcox, Madeleine R; Johnson, Jon W; McClain, Lora; MacDonald, Matthew L; Di Maio, Roberto; Schurdak, Mark E; Piazza, Paolo; Viggiano, Luigi; Sweet, Robert; Kinchington, Paul R; Bhattacharjee, Ayantika G; Yolken, Robert; Nimgaonkar, Vishwajit L

    2014-01-01

    Induced pluripotent stem cell (iPSC)-based technologies offer an unprecedented opportunity to perform high-throughput screening of novel drugs for neurological and neurodegenerative diseases. Such screenings require a robust and scalable method for generating large numbers of mature, differentiated neuronal cells. Currently available methods based on differentiation of embryoid bodies (EBs) or directed differentiation of adherent culture systems are either expensive or are not scalable. We developed a protocol for large-scale generation of neuronal stem cells (NSCs)/early neural progenitor cells (eNPCs) and their differentiation into neurons. Our scalable protocol allows robust and cost-effective generation of NSCs/eNPCs from iPSCs. Following culture in neurobasal medium supplemented with B27 and BDNF, NSCs/eNPCs differentiate predominantly into vesicular glutamate transporter 1 (VGLUT1) positive neurons. Targeted mass spectrometry analysis demonstrates that iPSC-derived neurons express ligand-gated channels and other synaptic proteins and whole-cell patch-clamp experiments indicate that these channels are functional. The robust and cost-effective differentiation protocol described here for large-scale generation of NSCs/eNPCs and their differentiation into neurons paves the way for automated high-throughput screening of drugs for neurological and neurodegenerative diseases.

  8. Dichlorvos Exposure Results in Large Scale Disruption of Energy Metabolism in the Liver of the Zebra Fish, Danio Rerio

    DTIC Science & Technology

    2015-10-24

    zebrafish reference genome sequence and its relationship to the human genome. Nature. 2013;496(7446):498–503. 21. Linney E, Upchurch L, Donerly S. Zebrafish...To obtain a broader understanding of the effects of dichlorvos on liver metabolism, we performed a genome-wide analysis of gene expression in the ...condition) for whole genome transcript analysis, and fixed another set of fish for histological evaluation (n = 5/condition). We determined the target

  9. Achieving online consent to participation in large-scale gene-environment studies: a tangible destination.

    PubMed

    Wood, Fiona; Kowalczuk, Jenny; Elwyn, Glyn; Mitchell, Clive; Gallacher, John

    2011-08-01

    Population based genetics studies are dependent on large numbers of individuals in the pursuit of small effect sizes. Recruiting and consenting a large number of participants is both costly and time consuming. We explored whether an online consent process for large-scale genetics studies is acceptable for prospective participants using an example online genetics study. We conducted semi-structured interviews with 42 members of the public stratified by age group, gender and newspaper readership (a measure of social status). Respondents were asked to use a website designed to recruit for a large-scale genetic study. After using the website a semi-structured interview was conducted to explore opinions and any issues they would have. Responses were analysed using thematic content analysis. The majority of respondents said they would take part in the research (32/42). Those who said they would decline to participate saw fewer benefits from the research, wanted more information and expressed a greater number of concerns about the study. Younger respondents had concerns over time commitment. Middle aged respondents were concerned about privacy and security. Older respondents were more altruistic in their motivation to participate. Common themes included trust in the authenticity of the website, security of personal data, curiosity about their own genetic profile, operational concerns and a desire for more information about the research. Online consent to large-scale genetic studies is likely to be acceptable to the public. The online consent process must establish trust quickly and effectively by asserting authenticity and credentials, and provide access to a range of information to suit different information preferences.

  10. Global maps of the magnetic thickness and magnetization of the Earth's lithosphere

    NASA Astrophysics Data System (ADS)

    Vervelidou, Foteini; Thébault, Erwan

    2015-10-01

    We have constructed global maps of the large-scale magnetic thickness and magnetization of Earth's lithosphere. Deriving such large-scale maps based on lithospheric magnetic field measurements faces the challenge of the masking effect of the core field. In this study, the maps were obtained through analyses in the spectral domain by means of a new regional spatial power spectrum based on the Revised Spherical Cap Harmonic Analysis (R-SCHA) formalism. A series of regional spectral analyses were conducted covering the entire Earth. The R-SCHA surface power spectrum for each region was estimated using the NGDC-720 spherical harmonic (SH) model of the lithospheric magnetic field, which is based on satellite, aeromagnetic, and marine measurements. These observational regional spectra were fitted to a recently proposed statistical expression of the power spectrum of Earth's lithospheric magnetic field, whose free parameters include the thickness and magnetization of the magnetic sources. The resulting global magnetic thickness map is compared to other crustal and magnetic thickness maps based upon different geophysical data. We conclude that the large-scale magnetic thickness of the lithosphere is on average confined to a layer that does not exceed the Moho.

  11. Transcriptome Analysis of the Differentially Expressed Genes in the Male and Female Shrub Willows (Salix suchowensis)

    PubMed Central

    Liu, Jingjing; Yin, Tongming; Ye, Ning; Chen, Yingnan; Yin, Tingting; Liu, Min; Hassani, Danial

    2013-01-01

    Background The dioecious system is relatively rare in plants. Shrub willow is an annual flowering dioecious woody plant, and possesses many characteristics that make it a good model for tracking the missing pieces of sex determination evolution. To gain a global view of the genes differentially expressed in male and female shrub willows and to develop a database for further studies, we performed large-scale transcriptome sequencing of flower buds collected separately from the two sexes. Results In total, 1,201,931 high-quality reads were obtained, with an average length of 389 bp and a total length of 467.96 Mb. The ESTs were assembled into 29,048 contigs and 132,709 singletons. These unigenes were functionally annotated by comparing their sequences against protein and functional domain databases, and were assigned Gene Ontology (GO) terms. A biochemical pathway database containing 291 predicted pathways was also created based on the annotations of the unigenes. Digital expression analysis identified 806 genes differentially expressed between the male and female flower buds. Of these, 33 are located on the incipient sex chromosome of Salicaceae, and 12 of those might, on empirical grounds, be involved in plant sex determination. These genes merit special attention in future studies. Conclusions In this study, a large number of EST sequences were generated from the flower buds of a male and a female shrub willow. We also report the genes differentially expressed between the flowers of the two sexes. This work provides valuable information and sequence resources for uncovering the sex-determining genes and for future functional genomics analysis of Salicaceae spp. PMID:23560075
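    The read and assembly figures quoted in the Results above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the variable names are ours, and it assumes that "Mb" means 10^6 bases and that the unigene set is simply the contigs plus the singletons, neither of which the abstract states explicitly.

```python
# Figures quoted in the abstract (Salix suchowensis flower-bud transcriptome).
total_reads = 1_201_931
total_length_bp = 467.96e6  # assumption: 1 Mb = 1e6 bases
contigs = 29_048
singletons = 132_709

# Average read length implied by the two totals; the abstract reports 389 bp.
mean_read_length = total_length_bp / total_reads

# Assumption: unigenes = contigs + singletons.
unigenes = contigs + singletons

print(f"mean read length: {mean_read_length:.1f} bp")
print(f"unigenes (contigs + singletons): {unigenes:,}")
```

    Under these assumptions the implied mean read length rounds to the reported 389 bp, which suggests the three read statistics are mutually consistent.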

  12. Gene expression profiling of human mesenchymal stem cells derived from bone marrow during expansion and osteoblast differentiation.

    PubMed

    Kulterer, Birgit; Friedl, Gerald; Jandrositz, Anita; Sanchez-Cabo, Fatima; Prokesch, Andreas; Paar, Christine; Scheideler, Marcel; Windhager, Reinhard; Preisegger, Karl-Heinz; Trajanoski, Zlatko

    2007-03-12

    Human mesenchymal stem cells (MSC) with the capacity to differentiate into osteoblasts provide potential for the development of novel treatment strategies, such as improved healing of large bone defects. However, their low frequency in bone marrow necessitates ex vivo expansion for further clinical application. In this study we asked if MSC are developing in an aberrant or unwanted way during ex vivo long-term cultivation and if artificial cultivation conditions exert any influence on their stem cell maintenance. To address this question we first developed human oligonucleotide microarrays with 30,000 elements and then performed large-scale expression profiling of long-term expanded MSC and MSC during differentiation into osteoblasts. The results showed that MSC did not alter their osteogenic differentiation capacity, surface marker profile, or expression profiles during expansion. Microarray analysis of MSC during osteogenic differentiation identified three candidate genes for further examination and functional analysis: ID4, CRYAB, and SORT1. Additionally, we were able to reconstruct the three developmental phases during osteoblast differentiation (proliferation, matrix maturation, and mineralization) and to illustrate the activation of the SMAD signaling pathways by TGF-beta2 and BMPs. With a variety of assays we could show that MSC represent a cell population which can be expanded for therapeutic applications.

  13. Genomic analysis of expressed sequence tags in American black bear Ursus americanus

    PubMed Central

    2010-01-01

    Background Species of the bear family (Ursidae) are important organisms for research in molecular evolution, comparative physiology and conservation biology, but relatively little genetic sequence information is available for this group. Here we report the development and analyses of the first large scale Expressed Sequence Tag (EST) resource for the American black bear (Ursus americanus). Results Comprehensive analyses of molecular functions, alternative splicing, and tissue-specific expression of 38,757 black bear EST sequences were conducted using the dog genome as a reference. We identified 18 genes, involved in functions such as lipid catabolism, cell cycle, and vesicle-mediated transport, that are showing rapid evolution in the bear lineage. Three genes, Phospholamban (PLN), cysteine glycine-rich protein 3 (CSRP3) and Troponin I type 3 (TNNI3), are related to heart contraction, and defects in these genes in humans lead to heart disease. Two genes, biphenyl hydrolase-like (BPHL) and CSRP3, contain positively selected sites in bear. Global analysis of evolution rates of hibernation-related genes in bear showed that they are largely conserved and slowly evolving genes, rather than novel and fast-evolving genes. Conclusion We provide a genomic resource for an important mammalian organism and our study sheds new light on the possible functions and evolution of bear genes. PMID:20338065

  14. Genomic analysis of expressed sequence tags in American black bear Ursus americanus.

    PubMed

    Zhao, Sen; Shao, Chunxuan; Goropashnaya, Anna V; Stewart, Nathan C; Xu, Yichi; Tøien, Øivind; Barnes, Brian M; Fedorov, Vadim B; Yan, Jun

    2010-03-26

    Species of the bear family (Ursidae) are important organisms for research in molecular evolution, comparative physiology and conservation biology, but relatively little genetic sequence information is available for this group. Here we report the development and analyses of the first large scale Expressed Sequence Tag (EST) resource for the American black bear (Ursus americanus). Comprehensive analyses of molecular functions, alternative splicing, and tissue-specific expression of 38,757 black bear EST sequences were conducted using the dog genome as a reference. We identified 18 genes, involved in functions such as lipid catabolism, cell cycle, and vesicle-mediated transport, that are showing rapid evolution in the bear lineage. Three genes, Phospholamban (PLN), cysteine glycine-rich protein 3 (CSRP3) and Troponin I type 3 (TNNI3), are related to heart contraction, and defects in these genes in humans lead to heart disease. Two genes, biphenyl hydrolase-like (BPHL) and CSRP3, contain positively selected sites in bear. Global analysis of evolution rates of hibernation-related genes in bear showed that they are largely conserved and slowly evolving genes, rather than novel and fast-evolving genes. We provide a genomic resource for an important mammalian organism and our study sheds new light on the possible functions and evolution of bear genes.

  15. Authentic Research Experience and "Big Data" Analysis in the Classroom: Maize Response to Abiotic Stress.

    PubMed

    Makarevitch, Irina; Frechette, Cameo; Wiatros, Natalia

    2015-01-01

    Integration of inquiry-based approaches into curriculum is transforming the way science is taught and studied in undergraduate classrooms. Incorporating quantitative reasoning and mathematical skills into authentic biology undergraduate research projects has been shown to benefit students in developing various skills necessary for future scientists and to attract students to science, technology, engineering, and mathematics disciplines. While large-scale data analysis became an essential part of modern biological research, students have few opportunities to engage in analysis of large biological data sets. RNA-seq analysis, a tool that allows precise measurement of the level of gene expression for all genes in a genome, revolutionized molecular biology and provides ample opportunities for engaging students in authentic research. We developed, implemented, and assessed a series of authentic research laboratory exercises incorporating a large data RNA-seq analysis into an introductory undergraduate classroom. Our laboratory series is focused on analyzing gene expression changes in response to abiotic stress in maize seedlings; however, it could be easily adapted to the analysis of any other biological system with available RNA-seq data. Objective and subjective assessment of student learning demonstrated gains in understanding important biological concepts and in skills related to the process of science. © 2015 I. Makarevitch et al. CBE—Life Sciences Education © 2015 The American Society for Cell Biology. This article is distributed by The American Society for Cell Biology under license from the author(s). It is available to the public under an Attribution–Noncommercial–Share Alike 3.0 Unported Creative Commons License (http://creativecommons.org/licenses/by-nc-sa/3.0).

  16. Large scale aggregate microarray analysis reveals three distinct molecular subclasses of human preeclampsia.

    PubMed

    Leavey, Katherine; Bainbridge, Shannon A; Cox, Brian J

    2015-01-01

    Preeclampsia (PE) is a life-threatening hypertensive pathology of pregnancy affecting 3-5% of all pregnancies. To date, PE has no cure, early detection markers, or effective treatments short of the removal of what is thought to be the causative organ, the placenta, which may necessitate a preterm delivery. Additionally, numerous small placental microarray studies attempting to identify "PE-specific" genes have yielded inconsistent results. We therefore hypothesize that preeclampsia is a multifactorial disease encompassing several pathology subclasses, and that large cohort placental gene expression analysis will reveal these groups. To address our hypothesis, we utilized known bioinformatic methods to aggregate 7 microarray data sets across multiple platforms in order to generate a large data set of 173 patient samples, including 77 with preeclampsia. Unsupervised clustering of these patient samples revealed three distinct molecular subclasses of PE. This included a "canonical" PE subclass demonstrating elevated expression of known PE markers and genes associated with poor oxygenation and increased secretion, as well as two other subclasses potentially representing a poor maternal response to pregnancy and an immunological presentation of preeclampsia. Our analysis sheds new light on the heterogeneity of PE patients, and offers up additional avenues for future investigation. Hopefully, our subclassification of preeclampsia based on molecular diversity will finally lead to the development of robust diagnostics and patient-based treatments for this disorder.

  17. Hi-C Chromatin Interaction Networks Predict Co-expression in the Mouse Cortex

    PubMed Central

    Hulsman, Marc; Lelieveldt, Boudewijn P. F.; de Ridder, Jeroen; Reinders, Marcel

    2015-01-01

    The three dimensional conformation of the genome in the cell nucleus influences important biological processes such as gene expression regulation. Recent studies have shown a strong correlation between chromatin interactions and gene co-expression. However, predicting gene co-expression from frequent long-range chromatin interactions remains challenging. We address this by characterizing the topology of the cortical chromatin interaction network using scale-aware topological measures. We demonstrate that based on these characterizations it is possible to accurately predict spatial co-expression between genes in the mouse cortex. Consistent with previous findings, we find that the chromatin interaction profile of a gene-pair is a good predictor of their spatial co-expression. However, the accuracy of the prediction can be substantially improved when chromatin interactions are described using scale-aware topological measures of the multi-resolution chromatin interaction network. We conclude that, for co-expression prediction, it is necessary to take into account different levels of chromatin interactions ranging from direct interaction between genes (i.e. small-scale) to chromatin compartment interactions (i.e. large-scale). PMID:25965262

  18. Development of a targeted transgenesis strategy in highly differentiated cells: a powerful tool for functional genomic analysis.

    PubMed

    Puttini, Stefania; Ouvrard-Pascaud, Antoine; Palais, Gael; Beggah, Ahmed T; Gascard, Philippe; Cohen-Tannoudji, Michel; Babinet, Charles; Blot-Chabaud, Marcel; Jaisser, Frederic

    2005-03-16

    Functional genomic analysis is a challenging step in the so-called post-genomic field. Identification of potential targets using large-scale gene expression analysis requires functional validation to identify those that are physiologically relevant. Genetically modified cell models are often used for this purpose allowing up- or down-expression of selected targets in a well-defined and if possible highly differentiated cell type. However, the generation of such models remains time-consuming and expensive. In order to alleviate this step, we developed a strategy aimed at the rapid and efficient generation of genetically modified cell lines with conditional, inducible expression of various target genes. Efficient knock-in of various constructs, called targeted transgenesis, in a locus selected for its permissibility to the tet inducible system, was obtained through the stimulation of site-specific homologous recombination by the meganuclease I-SceI. Our results demonstrate that targeted transgenesis in a reference inducible locus greatly facilitated the functional analysis of the selected recombinant cells. The efficient screening strategy we have designed makes possible automation of the transfection and selection steps. Furthermore, this strategy could be applied to a variety of highly differentiated cells.

  19. Ingestion of bacterially expressed double-stranded RNA inhibits gene expression in planarians.

    PubMed

    Newmark, Phillip A; Reddien, Peter W; Cebrià, Francesc; Sánchez Alvarado, Alejandro

    2003-09-30

    Freshwater planarian flatworms are capable of regenerating complete organisms from tiny fragments of their bodies; the basis for this regenerative prowess is an experimentally accessible stem cell population that is present in the adult planarian. The study of these organisms, classic experimental models for investigating metazoan regeneration, has been revitalized by the application of modern molecular biological approaches. The identification of thousands of unique planarian ESTs, coupled with large-scale whole-mount in situ hybridization screens, and the ability to inhibit planarian gene expression through double-stranded RNA-mediated genetic interference, provide a wealth of tools for studying the molecular mechanisms that regulate tissue regeneration and stem cell biology in these organisms. Here we show that, as in Caenorhabditis elegans, ingestion of bacterially expressed double-stranded RNA can inhibit gene expression in planarians. This inhibition persists throughout the process of regeneration, allowing phenotypes with disrupted regenerative patterning to be identified. These results pave the way for large-scale screens for genes involved in regenerative processes.

  20. Advancing biopharmaceutical process development by system-level data analysis and integration of omics data.

    PubMed

    Schaub, Jochen; Clemens, Christoph; Kaufmann, Hitto; Schulz, Torsten W

    2012-01-01

    Development of efficient bioprocesses is essential for cost-effective manufacturing of recombinant therapeutic proteins. To achieve further process improvement and process rationalization comprehensive data analysis of both process data and phenotypic cell-level data is essential. Here, we present a framework for advanced bioprocess data analysis consisting of multivariate data analysis (MVDA), metabolic flux analysis (MFA), and pathway analysis for mapping of large-scale gene expression data sets. This data analysis platform was applied in a process development project with an IgG-producing Chinese hamster ovary (CHO) cell line in which the maximal product titer could be increased from about 5 to 8 g/L. Principal component analysis (PCA), k-means clustering, and partial least-squares (PLS) models were applied to analyze the macroscopic bioprocess data. MFA and gene expression analysis revealed intracellular information on the characteristics of high-performance cell cultivations. By MVDA, for example, correlations between several essential amino acids and the product concentration were observed. Also, a grouping into rather cell specific productivity-driven and process control-driven processes could be unraveled. By MFA, phenotypic characteristics in glycolysis, glutaminolysis, pentose phosphate pathway, citrate cycle, coupling of amino acid metabolism to citrate cycle, and in the energy yield could be identified. By gene expression analysis 247 deregulated metabolic genes were identified which are involved, inter alia, in amino acid metabolism, transport, and protein synthesis.

  1. Inducing mutations through γ-irradiation in seeds of Mucuna pruriens for developing high L-DOPA-yielding genotypes.

    PubMed

    Singh, Susheel Kumar; Yadav, Deepti; Lal, Raj Kishori; Gupta, Madan M; Dhawan, Sunita Singh

    2017-04-01

    To develop elite genotypes in Mucuna pruriens (L.) DC with high L-DOPA (L-3,4-dihydroxyphenylalanine) yields, with non-itching characteristics and better adaptability by applying γ-irradiation. Molecular and chemical analysis was performed for screening based on specific characteristics desired for developing suitable genotypes. The mutant populations developed were analyzed for L-DOPA % in seeds through TLC (thin layer chromatography), and the results obtained were validated with HPLC (high performance liquid chromatography). DNA (deoxyribonucleic acid) was isolated from the leaf at the initial stage and used for DNA polymorphism analysis. RNA (ribonucleic acid) was isolated from the leaf during maturity and used for expression analysis. The selected mutant T-I-7 showed 5.7% L-DOPA content compared to 3.18% in the parent CIM-Ajar. The total polymorphism obtained was 57% with the molecular marker analysis. The gene expression analysis showed higher fold change expression of the dopa decarboxylase gene (DDC) in control compared to selected mutants (T-I-7, T-II-23, T-IV-9, T-VI-1). DNA polymorphism was used for efficient screening of mutants at an early stage. TLC was found suitable for the large-scale comparative chemical analysis of L-DOPA. The expression profile of DDC clearly demonstrated the higher yields of L-DOPA in selected mutants developed by γ-irradiation in the seeds of the control.

  2. Highly specific gene silencing in a monocot species by artificial microRNAs derived from chimeric miRNA precursors

    DOE PAGES

    Carbonell, Alberto; Fahlgren, Noah; Mitchell, Skyler; ...

    2015-05-20

    Artificial microRNAs (amiRNAs) are used for selective gene silencing in plants. However, current methods to produce amiRNA constructs for silencing transcripts in monocot species are not suitable for simple, cost-effective and large-scale synthesis. Here, a series of expression vectors based on Oryza sativa MIR390 (OsMIR390) precursor was developed for high-throughput cloning and high expression of amiRNAs in monocots. Four different amiRNA sequences designed to target specifically endogenous genes and expressed from OsMIR390-based vectors were validated in transgenic Brachypodium distachyon plants. Surprisingly, amiRNAs accumulated to higher levels and were processed more accurately when expressed from chimeric OsMIR390-based precursors that include distal stem-loop sequences from Arabidopsis thaliana MIR390a (AtMIR390a). In all cases, transgenic plants displayed the predicted phenotypes induced by target gene repression, and accumulated high levels of amiRNAs and low levels of the corresponding target transcripts. Genome-wide transcriptome profiling combined with 5'-RLM-RACE analysis in transgenic plants confirmed that amiRNAs were highly specific. Significance Statement: A series of amiRNA vectors based on Oryza sativa MIR390 (OsMIR390) precursor were developed for simple, cost-effective and large-scale synthesis of amiRNA constructs to silence genes in monocots. Unexpectedly, amiRNAs produced from chimeric OsMIR390-based precursors including Arabidopsis thaliana MIR390a distal stem-loop sequences accumulated elevated levels of highly effective and specific amiRNAs in transgenic Brachypodium distachyon plants.

  3. Comparison of gene expression changes induced by biguanides in db/db mice liver.

    PubMed

    Heishi, Masayuki; Hayashi, Koji; Ichihara, Junji; Ishikawa, Hironori; Kawamura, Takao; Kanaoka, Masaharu; Taiji, Mutsuo; Kimura, Toru

    2008-08-01

    Large-scale clinical studies have shown the biguanide drug metformin, widely used for type 2 diabetes, to be very safe. By contrast, another biguanide, phenformin, has been withdrawn from major markets because of a high incidence of serious adverse effects. The difference in mode of action between the two biguanides remains unclear. To gain insight into the different modes of action of the two drugs, we performed global gene expression profiling using the livers of obese diabetic db/db mice after a single administration of phenformin or metformin at levels sufficient to cause a significant reduction in blood glucose level. Metformin induced modest expression changes, including G6pc in the liver as previously reported. By contrast, phenformin caused changes in expression level of many additional genes. We used a knowledge-based bioinformatic analysis to study the effects of phenformin. Differentially expressed genes identified in this study constitute a large gene network, which may be related to cell death, inflammation or wound response. Our results suggest that the two biguanides show a similar hypoglycemic effect in db/db mice, but phenformin induces a greater stress on the liver even a short time after a single administration. These findings provide a novel insight into the cause of the relatively high occurrence of serious adverse effects after phenformin treatment.

  4. Engineered human skin substitutes undergo large-scale genomic reprogramming and normal skin-like maturation after transplantation to athymic mice.

    PubMed

    Klingenberg, Jennifer M; McFarland, Kevin L; Friedman, Aaron J; Boyce, Steven T; Aronow, Bruce J; Supp, Dorothy M

    2010-02-01

    Bioengineered skin substitutes can facilitate wound closure in severely burned patients, but deficiencies limit their outcomes compared with native skin autografts. To identify gene programs associated with their in vivo capabilities and limitations, we extended previous gene expression profile analyses to now compare engineered skin after in vivo grafting with both in vitro maturation and normal human skin. Cultured skin substitutes were grafted on full-thickness wounds in athymic mice, and biopsy samples for microarray analyses were collected at multiple in vitro and in vivo time points. Over 10,000 transcripts exhibited large-scale expression pattern differences during in vitro and in vivo maturation. Using hierarchical clustering, 11 different expression profile clusters were partitioned on the basis of differential sample type and temporal stage-specific activation or repression. Analyses show that the wound environment exerts a massive influence on gene expression in skin substitutes. For example, in vivo-healed skin substitutes gained the expression of many native skin-expressed genes, including those associated with epidermal barrier and multiple categories of cell-cell and cell-basement membrane adhesion. In contrast, immunological, trichogenic, and endothelial gene programs were largely lacking. These analyses suggest important areas for guiding further improvement of engineered skin for both increased homology with native skin and enhanced wound healing.

  5. TOXICOGENOMICS AND HUMAN DISEASE RISK ASSESSMENT

    EPA Science Inventory


    Toxicogenomics and Human Disease Risk Assessment.

    Complete sequencing of human and other genomes, availability of large-scale gene expression arrays with ever-increasing numbers of genes displayed, and steady improvements in protein expression technology can hav...

  6. Gram-scale production of a basidiomycetous laccase in Aspergillus niger.

    PubMed

    Mekmouche, Yasmina; Zhou, Simeng; Cusano, Angela M; Record, Eric; Lomascolo, Anne; Robert, Viviane; Simaan, A Jalila; Rousselot-Pailley, Pierre; Ullah, Sana; Chaspoul, Florence; Tron, Thierry

    2014-01-01

    We report on the expression in Aspergillus niger of a laccase gene we used to produce variants in Saccharomyces cerevisiae. Grams of recombinant enzyme can be easily obtained. This highlights the potential of combining this generic laccase sequence with the yeast and fungal expression systems for large-scale production of variants. Copyright © 2013 The Society for Biotechnology, Japan. Published by Elsevier B.V. All rights reserved.

  7. Titer improvement of iso-migrastatin in selected heterologous Streptomyces hosts and related analysis of mRNA expression by quantitative RT–PCR

    PubMed Central

    Yang, Dong; Zhu, Xiangcheng; Wu, Xueyun; Feng, Zhiyang; Huang, Lei; Shen, Ben; Xu, Zhinan

    2011-01-01

    iso-Migrastatin (iso-MGS) has been actively pursued recently as an outstanding candidate of antimetastasis agents. Having characterized the iso-MGS biosynthetic gene cluster from its native producer Streptomyces platensis NRRL 18993, we have recently succeeded in producing iso-MGS in five selected heterologous Streptomyces hosts, albeit the low titers failed to meet expectations and cast doubt on the utility of this novel technique for large-scale production. To further explore and capitalize on the production capacity of these hosts, a thorough investigation of these five engineered strains with three fermentation media for iso-MGS production was undertaken. Streptomyces albus J1074 and Streptomyces lividans K4-114 were found to be preferred heterologous hosts, and subsequent analysis of carbon and nitrogen sources revealed that sucrose and yeast extract were ideal for iso-MGS production. After the initial optimization, the titers of iso-MGS in all five hosts were considerably improved by 3–18-fold in the optimized R2YE medium. Furthermore, the iso-MGS titer of S. albus J1074 (pBS11001) was significantly improved to 186.7 mg/L by a hybrid medium strategy. Addition of NaHCO3 to the latter finally afforded an optimized iso-MGS titer of 213.8 mg/L, about 5-fold higher than the originally reported system. With S. albus J1074 (pBS11001) as a model host, the expression of iso-MGS gene cluster in four different media was systematically studied via the quantitative RT–PCR technology. The resultant comparison revealed the correlation of gene expression and iso-MGS production for the first time; synchronous expression of the whole gene cluster was crucial for optimal iso-MGS production. These results reveal new insights into the iso-MGS biosynthetic machinery in heterologous hosts and provide the primary data to realize large-scale production of iso-MGS for further preclinical studies. PMID:21132287

  8. Gene expression analysis of flax seed development

    PubMed Central

    2011-01-01

    Background Flax, Linum usitatissimum L., is an important crop whose seed oil and stem fiber have multiple industrial applications. Flax seeds are also well-known for their nutritional attributes, viz., omega-3 fatty acids in the oil and lignans and mucilage from the seed coat. In spite of the importance of this crop, there are few molecular resources that can be utilized toward improving seed traits. Here, we describe flax embryo and seed development and generation of comprehensive genomic resources for the flax seed. Results We describe a large-scale generation and analysis of expressed sequences in various tissues. Collectively, the 13 libraries we have used provide a broad representation of genes active in developing embryos (globular, heart, torpedo, cotyledon and mature stages), seed coats (globular and torpedo stages), and endosperm (pooled globular to torpedo stages), and of genes expressed in flowers, etiolated seedlings, leaves, and stem tissue. A total of 261,272 expressed sequence tags (EST) (GenBank accessions LIBEST_026995 to LIBEST_027011) were generated. These EST libraries included transcription factor genes that are typically expressed at low levels, indicating that the depth is adequate for in silico expression analysis. Assembly of the ESTs resulted in 30,640 unigenes, and 82% of these could be identified on the basis of homology to known and hypothetical genes from other plants. When compared with fully sequenced plant genomes, the flax unigenes resembled poplar and castor bean more than grape, sorghum, rice or Arabidopsis. Nearly one-fifth of these (5,152) had no homologs in sequences reported for any organism, suggesting that this category represents genes that are likely unique to flax. Digital analyses revealed gene expression dynamics for the biosynthesis of a number of important seed constituents during seed development. Conclusions We have developed a foundational database of expressed sequences and collection of plasmid clones that comprise even low-expressed genes such as those encoding transcription factors. This has allowed us to delineate the spatio-temporal aspects of gene expression underlying the biosynthesis of a number of important seed constituents in flax. Flax belongs to a taxonomic group of diverse plants and the large sequence database will allow for evolutionary studies as well. PMID:21529361

  9. Activity-based protein profiling for biochemical pathway discovery in cancer

    PubMed Central

    Nomura, Daniel K.; Dix, Melissa M.; Cravatt, Benjamin F.

    2011-01-01

    Large-scale profiling methods have uncovered numerous gene and protein expression changes that correlate with tumorigenesis. However, determining the relevance of these expression changes and which biochemical pathways they affect has been hindered by our incomplete understanding of the proteome and its myriad functions and modes of regulation. Activity-based profiling platforms enable both the discovery of cancer-relevant enzymes and selective pharmacological probes to perturb and characterize these proteins in tumour cells. When integrated with other large-scale profiling methods, activity-based proteomics can provide insight into the metabolic and signalling pathways that support cancer pathogenesis and illuminate new strategies for disease diagnosis and treatment. PMID:20703252

  10. A regulatory toolbox of MiniPromoters to drive selective expression in the brain

    PubMed Central

    Portales-Casamar, Elodie; Swanson, Douglas J.; Liu, Li; de Leeuw, Charles N.; Banks, Kathleen G.; Ho Sui, Shannan J.; Fulton, Debra L.; Ali, Johar; Amirabbasi, Mahsa; Arenillas, David J.; Babyak, Nazar; Black, Sonia F.; Bonaguro, Russell J.; Brauer, Erich; Candido, Tara R.; Castellarin, Mauro; Chen, Jing; Chen, Ying; Cheng, Jason C. Y.; Chopra, Vik; Docking, T. Roderick; Dreolini, Lisa; D'Souza, Cletus A.; Flynn, Erin K.; Glenn, Randy; Hatakka, Kristi; Hearty, Taryn G.; Imanian, Behzad; Jiang, Steven; Khorasan-zadeh, Shadi; Komljenovic, Ivana; Laprise, Stéphanie; Liao, Nancy Y.; Lim, Jonathan S.; Lithwick, Stuart; Liu, Flora; Liu, Jun; Lu, Meifen; McConechy, Melissa; McLeod, Andrea J.; Milisavljevic, Marko; Mis, Jacek; O'Connor, Katie; Palma, Betty; Palmquist, Diana L.; Schmouth, Jean-François; Swanson, Magdalena I.; Tam, Bonny; Ticoll, Amy; Turner, Jenna L.; Varhol, Richard; Vermeulen, Jenny; Watkins, Russell F.; Wilson, Gary; Wong, Bibiana K. Y.; Wong, Siaw H.; Wong, Tony Y. T.; Yang, George S.; Ypsilanti, Athena R.; Jones, Steven J. M.; Holt, Robert A.; Goldowitz, Daniel; Wasserman, Wyeth W.; Simpson, Elizabeth M.

    2010-01-01

    The Pleiades Promoter Project integrates genomewide bioinformatics with large-scale knockin mouse production and histological examination of expression patterns to develop MiniPromoters and related tools designed to study and treat the brain by directed gene expression. Genes with brain expression patterns of interest are subjected to bioinformatic analysis to delineate candidate regulatory regions, which are then incorporated into a panel of compact human MiniPromoters to drive expression to brain regions and cell types of interest. Using single-copy, homologous-recombination “knockins” in embryonic stem cells, each MiniPromoter reporter is integrated immediately 5′ of the Hprt locus in the mouse genome. MiniPromoter expression profiles are characterized in differentiation assays of the transgenic cells or in mouse brains following transgenic mouse production. Histological examination of adult brains, eyes, and spinal cords for reporter gene activity is coupled to costaining with cell-type–specific markers to define expression. The publicly available Pleiades MiniPromoter Project is a key resource to facilitate research on brain development and therapies. PMID:20807748

  11. Annealed Scaling for a Charged Polymer

    NASA Astrophysics Data System (ADS)

    Caravenna, F.; den Hollander, F.; Pétrélis, N.; Poisat, J.

    2016-03-01

    This paper studies an undirected polymer chain living on the one-dimensional integer lattice and carrying i.i.d. random charges. Each self-intersection of the polymer chain contributes to the interaction Hamiltonian an energy that is equal to the product of the charges of the two monomers that meet. The joint probability distribution for the polymer chain and the charges is given by the Gibbs distribution associated with the interaction Hamiltonian. The focus is on the annealed free energy per monomer in the limit as the length of the polymer chain tends to infinity. We derive a spectral representation for the free energy and use this to prove that there is a critical curve in the parameter plane of charge bias versus inverse temperature separating a ballistic phase from a subballistic phase. We show that the phase transition is first order. We prove large deviation principles for the laws of the empirical speed and the empirical charge, and derive a spectral representation for the associated rate functions. Interestingly, in both phases both rate functions exhibit flat pieces, which correspond to an inhomogeneous strategy for the polymer to realise a large deviation. The large deviation principles in turn lead to laws of large numbers and central limit theorems. We identify the scaling behaviour of the critical curve for small and for large charge bias. In addition, we identify the scaling behaviour of the free energy for small charge bias and small inverse temperature. Both are linked to an associated Sturm-Liouville eigenvalue problem. A key tool in our analysis is the Ray-Knight formula for the local times of the one-dimensional simple random walk. This formula is exploited to derive a closed form expression for the generating function of the annealed partition function, and for several related quantities. This expression in turn serves as the starting point for the derivation of the spectral representation for the free energy, and for the scaling theorems. 
    What happens for the quenched free energy per monomer remains open. We state two modest results and raise a few questions.
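
The model described in this abstract can be written out compactly. The block below is a sketch only: the symbols (S for the walk path, ω for the charges, β for the inverse temperature, Z for the annealed partition function) and the sign convention in the exponent are assumptions made here for illustration, since the abstract does not fix any notation. The dependence on the charge bias enters through the law of ω and is suppressed.

```latex
% Assumed notation: S = (S_i) is the path of a simple random walk on Z,
% omega = (omega_i) are the i.i.d. charges, beta is the inverse temperature.
\[
  H_n^{\omega}(S) \;=\; \sum_{1 \le i < j \le n} \omega_i\,\omega_j\,\mathbf{1}\{S_i = S_j\}
\]
% Joint (annealed) Gibbs distribution of path and charges,
% with P(S) and P(omega) the reference laws of the walk and the charges.
\[
  \mathbb{P}_n^{\beta}(S,\omega) \;=\; \frac{1}{Z_n^{\beta}}\,
  e^{-\beta H_n^{\omega}(S)}\, P(S)\, \mathbb{P}(\omega),
  \qquad
  F(\beta) \;=\; \lim_{n\to\infty} \frac{1}{n}\,\log Z_n^{\beta}
\]
```

Each pair of monomers that share a lattice site contributes the product of their charges, so like charges that meet raise the energy and opposite charges lower it.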

  12. The large-scale investigation of gene expression in Leymus chinensis stigmas provides a valuable resource for understanding the mechanisms of poaceae self-incompatibility.

    PubMed

    Zhou, Qingyuan; Jia, Junting; Huang, Xing; Yan, Xueqing; Cheng, Liqin; Chen, Shuangyan; Li, Xiaoxia; Peng, Xianjun; Liu, Gongshe

    2014-05-26

    Many Poaceae species show a gametophytic self-incompatibility (GSI) system, which is controlled by at least two independent and multiallelic loci, S and Z. To date, the gene products for S and Z remain unknown. The stigmas of SI grasses discriminate between pollen grains that land on their surface, supporting compatible pollen tube growth and penetration into the stigma while recognizing incompatible pollen and inhibiting pollination. Leymus chinensis (Trin.) Tzvel. (sheepgrass) is a Poaceae SI species. A comprehensive analysis of the sheepgrass stigma transcriptome may provide valuable information for understanding the mechanism of pollen-stigma interactions and grass SI. The transcript abundance profiles of mature stigmas, mature ovaries and leaves were examined using high-throughput next generation sequencing technology. A comparative transcriptomic analysis of these tissues identified 1,025 specifically or preferentially expressed genes in sheepgrass stigmas. These genes contained a significant proportion of genes predicted to function in cell-cell communication and signal transduction. We identified 111 putative transcription factor (TF) genes; the most abundant groups were MYB, C2H2, C3H, FAR1 and MADS. Comparative analysis of the sheepgrass, rice and Arabidopsis stigma-specific or preferential datasets showed broad similarities and some differences in the proportion of genes in the Gene Ontology (GO) functional categories. Potential SI candidate genes identified in other grasses were also detected in the sheepgrass stigma-specific or preferential dataset. Quantitative real-time PCR experiments validated the expression pattern of stigma preferential genes including homologous grass SI candidate genes. This study represents the first large-scale investigation of gene expression in the stigmas of an SI grass species. We uncovered many notable genes that are potentially involved in pollen-stigma interactions and SI mechanisms, including genes encoding receptor-like protein kinases (RLK), CBL (calcineurin B-like proteins) interacting protein kinases, calcium-dependent protein kinase, expansins, pectinesterase, peroxidases and various transcription factors. The availability of a pool of stigma-specific or preferential genes for L. chinensis offers an opportunity to elucidate the mechanisms of SI in Poaceae.

  13. Genome-scale gene expression characteristics define the follicular initiation and developmental rules during folliculogenesis.

    PubMed

    Shi, Kerong; He, Feng; Yuan, Xuefeng; Zhao, Yaofeng; Deng, Xuemei; Hu, Xiaoxiang; Li, Ning

    2013-08-01

    The ovarian follicle supplies a unique dynamic system for gametes that ensures the propagation of the species. During folliculogenesis, the vast majority of the germ cells are lost or inactivated because of ovarian follicle atresia, resulting in diminished reproductive potency and potential infertility. Understanding the underlying molecular mechanism of folliculogenesis rules is essential. Primordial (P), preantral (M), and large antral (L) porcine follicles were used to reveal their genome-wide gene expression profiles. Results indicate that primordial follicles (P) possess a different gene expression pattern compared to growing follicles (M and L). The 5,548 differentially expressed genes display a similar expression mode in M and L, with a correlation coefficient of 0.892. The number of regulated (both up and down) genes in M is greater than that in L. Also, their fold changes in M (2- to 364-fold) are much larger than in L (2- to 75-fold). Differentially expressed gene groups with different regulation patterns in certain follicular stages are identified and presumed to be closely related to follicular developmental rules. Interestingly, functional annotation analysis revealed that these gene groups feature distinct biological processes or molecular functions. Moreover, representative candidate genes from these gene groups have had their RNA or protein expression within follicles confirmed. Our study emphasized genome-scale gene expression characteristics, which provide novel entry points for understanding the folliculogenesis rules on the molecular level, such as follicular initiation, atresia, and dominance. Transcriptional regulatory circuitries in certain follicular stages are expected to be found among the identified differentially expressed gene groups.
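
The abstract summarizes the similarity of the M and L expression modes with a single correlation coefficient. It does not say which coefficient was used or on what scale, so the sketch below assumes a Pearson correlation computed over per-gene log2 fold changes; the function name and the toy values are hypothetical and are not data from the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares around the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical log2 fold changes (versus primordial follicles) for five genes.
log2fc_m = [3.1, -2.0, 5.5, 1.2, -4.3]
log2fc_l = [2.4, -1.1, 4.0, 1.5, -3.0]
print(round(pearson(log2fc_m, log2fc_l), 3))
```

A value close to 1 would mean that genes up- or down-regulated in one stage tend to move in the same direction, by proportionally similar amounts, in the other.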

  14. CoryneRegNet 4.0 – A reference database for corynebacterial gene regulatory networks

    PubMed Central

    Baumbach, Jan

    2007-01-01

    Background Detailed information on DNA-binding transcription factors (the key players in the regulation of gene expression) and on transcriptional regulatory interactions of microorganisms deduced from literature-derived knowledge, computer predictions and global DNA microarray hybridization experiments, has opened the way for the genome-wide analysis of transcriptional regulatory networks. The large-scale reconstruction of these networks allows the in silico analysis of cell behavior in response to changing environmental conditions. We previously published CoryneRegNet, an ontology-based data warehouse of corynebacterial transcription factors and regulatory networks. Initially, it was designed to provide methods for the analysis and visualization of the gene regulatory network of Corynebacterium glutamicum. Results Now we introduce CoryneRegNet release 4.0, which integrates data on the gene regulatory networks of 4 corynebacteria, 2 mycobacteria and the model organism Escherichia coli K12. Like the previous versions, CoryneRegNet provides a web-based user interface to access the database content, to allow various queries, and to support the reconstruction, analysis and visualization of regulatory networks at different hierarchical levels. In this article, we present the further improved database content of CoryneRegNet along with novel analysis features. The network visualization feature GraphVis now allows inter-species comparisons of reconstructed gene regulatory networks and the projection of gene expression levels onto those networks. Therefore, we added stimulon data directly into the database, but also provide Web Service access to the DNA microarray analysis platform EMMA. Additionally, CoryneRegNet now provides a SOAP-based Web Service server, which can easily be consumed by other bioinformatics software systems. Stimulons (imported from the database, or uploaded by the user) can be analyzed in the context of known transcriptional regulatory networks to predict putative contradictions or further gene regulatory interactions. Furthermore, it integrates protein clusters by means of heuristically solving the weighted graph cluster editing problem. In addition, it provides Web Service based access to up-to-date gene annotation data from GenDB. Conclusion The release 4.0 of CoryneRegNet is a comprehensive system for the integrated analysis of procaryotic gene regulatory networks. It is a versatile systems biology platform to support the efficient and large-scale analysis of transcriptional regulation of gene expression in microorganisms. It is publicly available at . PMID:17986320

  15. The Rationality/Emotional Defensiveness Scale--II. Convergent and discriminant correlational analysis in males and females with and without cancer.

    PubMed

    Swan, G E; Carmelli, D; Dame, A; Rosenman, R H; Spielberger, C D

    1992-05-01

    The psychological correlates of the Rationality/Emotional Defensiveness Scale and its two subscales were examined in 1236 males and 863 females from the Western Collaborative Group Study. An additional 157 males and 164 females with some form of cancer other than of the skin were also included in this analysis. Characteristics measured included self-reported emotional control, anger expression, trait personality, depressive and neurotic symptomatology, Type A behavior, hostility, and social desirability. Results indicate that the Rationality/Emotional Defensiveness Scale is most strongly related to the suppression and control of emotions, especially anger. Scores on this scale also tend to be associated with less Type A behavior and hostility and with more social conformity. Analysis of the component subscale suggests that Antiemotionality, i.e. the extent to which an individual uses reason and logic to avoid interpersonally related emotions, is most strongly marked by the control of anger, while Rationality, i.e. the extent to which an individual uses reason and logic as a general approach to coping with the environment, is related to the control of anxiety and a higher level of trait curiosity. The psychological interpretation of the scale appears to be largely invariant across gender, unaffected by residualization of the total scale score for its association with Social Desirability, and, except for a few minor instances, unrelated to the diagnosis of cancer.

  16. Pathway analysis from lists of microRNAs: common pitfalls and alternative strategy

    PubMed Central

    Godard, Patrice; van Eyll, Jonathan

    2015-01-01

    MicroRNAs (miRNAs) are involved in the regulation of gene expression at a post-transcriptional level. As such, monitoring miRNA expression has been increasingly used to assess their role in regulatory mechanisms of biological processes. In large scale studies, once miRNAs of interest have been identified, the target genes they regulate are often inferred using algorithms or databases. A pathway analysis is then often performed in order to generate hypotheses about the relevant biological functions controlled by the miRNA signature. Here we show that the method widely used in scientific literature to identify these pathways is biased and leads to inaccurate results. In addition to describing the bias and its origin we present an alternative strategy to identify potential biological functions specifically impacted by a miRNA signature. More generally, our study exemplifies the crucial need of relevant negative controls when developing, and using, bioinformatics methods. PMID:25800743
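
The abstract does not name the pathway test it criticizes. For orientation only, the sketch below shows a hypergeometric over-representation test, which is a common way to score the overlap between a gene list and a pathway; treating it as the method in question is an assumption made here, and every number and name in the example is hypothetical.

```python
from math import comb


def hypergeom_upper_tail(universe, pathway, selected, overlap):
    """P(X >= overlap) when `selected` genes are drawn without replacement
    from `universe` genes, `pathway` of which belong to the pathway."""
    if not (0 <= pathway <= universe and 0 <= selected <= universe):
        raise ValueError("pathway and selected must lie within the universe")
    max_overlap = min(pathway, selected)
    total = comb(universe, selected)
    # Sum the hypergeometric probabilities for every overlap >= the observed one.
    favourable = sum(
        comb(pathway, k) * comb(universe - pathway, selected - k)
        for k in range(overlap, max_overlap + 1)
    )
    return favourable / total


# Hypothetical case: 20,000 annotated genes, a 100-gene pathway,
# 500 predicted miRNA targets, 10 of which fall in the pathway.
p_value = hypergeom_upper_tail(20_000, 100, 500, 10)
print(f"{p_value:.3g}")
```

In the spirit of the negative controls the abstract calls for, one option is to run the same calculation on the predicted targets of randomly chosen miRNAs and check whether the same pathways also come out as significant.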

  17. Evolution of Sulfobacillus thermosulfidooxidans secreting alginate during bioleaching of chalcopyrite concentrate.

    PubMed

    Yu, R-L; Liu, A; Liu, Y; Yu, Z; Peng, T; Wu, X; Shen, L; Liu, Y; Li, J; Liu, X; Qiu, G; Chen, M; Zeng, W

    2017-06-01

    The aim was to explore the distribution pattern of alginate on the chalcopyrite concentrate surface during bioleaching. The evolution of Sulfobacillus thermosulfidooxidans secreting alginate during bioleaching of chalcopyrite concentrate was investigated through gas chromatography coupled with mass spectrometry (GC-MS) and confocal laser scanning microscopy (CLSM), and the critical synthetic genes (algA, algC, algD) of alginate were analysed by real-time polymerase chain reaction (RT-PCR). The GC-MS results indicated that a small amount of alginate formed on the mineral surface at the early stage, increased substantially to its maximum at the intermediate stage, and then remained stable at the end stage. The CLSM analysis of chalcopyrite slices showed the same trend in alginate content on the mineral surface. Furthermore, the RT-PCR results showed that during the early stage of bioleaching, the algA, algC and algD genes were all overexpressed. However, at the final stage, algD gene expression decreased substantially, whereas algA and algC expression decreased only slightly. This expression pattern was attributed to the fact that the algA and algC genes are involved in several biosynthesis reactions, whereas the algD gene participates only in alginate biosynthesis and was therefore considered the key gene controlling alginate synthesis. The content of alginate on the mineral surface increased substantially at the beginning of bioleaching, and remained stable at the end of bioleaching due to the restriction of algD gene expression. Our findings provide valuable information to explore the relationship between alginate formation and bioleaching of chalcopyrite. © 2017 The Society for Applied Microbiology.

  18. Complex Genetics of Behavior: BXDs in the Automated Home-Cage.

    PubMed

    Loos, Maarten; Verhage, Matthijs; Spijker, Sabine; Smit, August B

    2017-01-01

    This chapter describes a use case for the genetic dissection and automated analysis of complex behavioral traits using the genetically diverse panel of BXD mouse recombinant inbred strains. Strains of the BXD resource differ widely in terms of gene and protein expression in the brain, as well as in their behavioral repertoire. A large mouse resource opens the possibility for gene finding studies underlying distinct behavioral phenotypes; however, such a resource poses a challenge in behavioral phenotyping. To address the specifics of large-scale screening we describe: (1) how to assess mouse behavior systematically in addressing a large genetic cohort, (2) how to dissect automation-derived longitudinal mouse behavior into quantitative parameters, and (3) how to map these quantitative traits to the genome, deriving loci underlying aspects of behavior.

  19. Large-scale production of foot-and-mouth disease virus (serotype Asia1) VLP vaccine in Escherichia coli and protection potency evaluation in cattle.

    PubMed

    Xiao, Yan; Chen, Hong-Ying; Wang, Yuzhou; Yin, Bo; Lv, Chaochao; Mo, Xiaobing; Yan, He; Xuan, Yajie; Huang, Yuxin; Pang, Wenqiang; Li, Xiangdong; Yuan, Y Adam; Tian, Kegong

    2016-07-02

    Foot-and-mouth disease (FMD) is an acute, highly contagious disease that infects cloven-hoofed animals. Vaccination is an effective means of preventing and controlling FMD. Compared to conventional inactivated FMDV vaccines, the format of FMDV virus-like particles (VLPs) as a non-replicating particulate vaccine candidate is a promising alternative. In this study, we have developed a co-expression system in E. coli, which drove the expression of FMDV capsid proteins (VP0, VP1, and VP3) in tandem by a single plasmid. The co-expressed FMDV capsid proteins (VP0, VP1, and VP3) were produced at large scale by fermentation at the 10 L scale, and the chromatographically purified capsid proteins auto-assembled into VLPs in vitro. Cattle vaccinated with a single dose of the subunit vaccine, comprising in vitro assembled FMDV VLP and adjuvant, developed an FMDV-specific antibody response (ELISA antibodies and neutralizing antibodies) that persisted for 6 months. Moreover, cattle vaccinated with the subunit vaccine showed high protection potency, with the 50 % bovine protective dose (PD50) reaching 11.75 PD50 per dose. Our data strongly suggest that in vitro assembled recombinant FMDV VLPs produced from E. coli could function as a potent FMDV vaccine candidate against FMDV Asia1 infection. Furthermore, the robust protein expression and purification approaches described here could lead to the development of industrial level large-scale production of E. coli-based VLPs against FMDV infections with different serotypes.

  20. Comparative transcriptome analysis of lufenuron-resistant and susceptible strains of Spodoptera frugiperda (Lepidoptera: Noctuidae).

    PubMed

    do Nascimento, Antonio Rogério Bezerra; Fresia, Pablo; Cônsoli, Fernando Luis; Omoto, Celso

    2015-11-21

    The evolution of insecticide resistance in Spodoptera frugiperda (Lepidoptera: Noctuidae) has resulted in large economic losses and disturbances to the environment and agroecosystems. Resistance to lufenuron, a chitin biosynthesis inhibitor insecticide, was recently documented in Brazilian populations of S. frugiperda. Thus, we utilized large-scale cDNA sequencing (RNA-Seq analysis) to compare the pattern of gene expression between lufenuron-resistant (LUF-R) and susceptible (LUF-S) S. frugiperda larvae in an attempt to identify the molecular basis behind the resistance mechanism(s) of S. frugiperda to this insecticide. A transcriptome was assembled using approximately 19.6 million 100 bp-long single-end reads, which generated 18,506 transcripts with an N50 of 996 bp. A search against the NCBI non-redundant database generated 51.1% (9,457) functionally annotated transcripts. A large portion of the alignments were homologous to insects, with the majority (45%) being similar to sequences of Bombyx mori (Lepidoptera: Bombycidae). Moreover, 10% of the alignments were similar to sequences of various species of Spodoptera (Lepidoptera: Noctuidae), with 3% of them being similar to sequences of S. frugiperda. A comparative analysis of the gene expression between LUF-R and LUF-S S. frugiperda larvae identified 940 differentially expressed transcripts (p ≤ 0.05, t-test; fold change ≥ 4). Six of them were associated with cuticle metabolism. Of those, four were overexpressed in LUF-R larvae. The machinery involved with the detoxification process was represented by 35 differentially expressed transcripts; 24 of them belonging to P450 monooxygenases, four to glutathione-S-transferases, six to carboxylases and one to sulfotransferases. RNA-Seq analysis was validated for a number of selected candidate transcripts by using quantitative real time PCR (qPCR). The gene expression profile of LUF-R larvae of S. frugiperda differs from that of LUF-S larvae. In general, gene expression is much higher in resistant larvae when compared to the susceptible ones, particularly for those genes involved with pathways for xenobiotic detoxification, mainly represented by P450 monooxygenase transcripts. Our data indicate that enzymes involved with the detoxification process, and mostly the P450s, are one of the resistance mechanisms employed by the LUF-R S. frugiperda larvae against lufenuron.
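
The abstract reports an N50 of 996 bp for the assembled transcripts. As a reminder of what that statistic measures, the sketch below computes N50 under its usual definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled bases. The function name and the toy lengths are hypothetical, and the abstract does not say which tool produced its figure.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp)."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    # Walk from the longest sequence down until half of all bases are covered.
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical transcript lengths, not taken from the study.
toy_lengths = [2000, 1500, 996, 800, 400, 300]
print(n50(toy_lengths))  # prints 1500
```

Here the total is 5,996 bp, so the two longest transcripts (3,500 bp together) already exceed the 2,998 bp halfway mark, giving an N50 of 1,500 bp.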

  1. High-Throughput Analysis of Age-Dependent Protein Changes in Layer II/III of the Human Orbitofrontal Cortex

    NASA Astrophysics Data System (ADS)

    Kapadia, Fenika

    Studies on the orbitofrontal cortex (OFC) during normal aging have shown a decline in cognitive functions, a loss of spines/synapses in layer III and gene expression changes related to neural communication. Biological changes during the course of normal aging are summarized into 9 hallmarks based on aging in peripheral tissue. Whether these hallmarks apply to non-dividing brain tissue is not known. Therefore, we opted to perform large-scale proteomic profiling of the OFC layer II/III during normal aging from 15 young and 18 old male subjects. MaxQuant was utilized for label-free quantification, and statistical analysis by the Random Intercept Model (RIM) identified 118 differentially expressed (DE) age-related proteins. Altered neural communication was the most represented hallmark of aging (54% of DE proteins), highlighting the importance of communication in the brain. Functional analysis showed enrichment in GABA/glutamate signaling and pro-inflammatory responses. The former may contribute to alterations in excitation/inhibition, leading to cognitive decline during aging.

  2. DEIVA: a web application for interactive visual analysis of differential gene expression profiles.

    PubMed

    Harshbarger, Jayson; Kratz, Anton; Carninci, Piero

    2017-01-07

    Differential gene expression (DGE) analysis is a technique to identify statistically significant differences in RNA abundance for genes or arbitrary features between different biological states. The result of a DGE test is typically further analyzed using statistical software, spreadsheets or custom ad hoc algorithms. We identified a need for a web-based system to share DGE statistical test results, and locate and identify genes in DGE statistical test results with a very low barrier of entry. We have developed DEIVA, a free and open source, browser-based single page application (SPA) with a strong emphasis on being user friendly that enables locating and identifying single or multiple genes in an immediate, interactive, and intuitive manner. By design, DEIVA scales with very large numbers of users and datasets. Compared to existing software, DEIVA offers a unique combination of design decisions that enable inspection and analysis of DGE statistical test results with an emphasis on ease of use.

  3. Transcriptomic Analysis of Paeonia delavayi Wild Population Flowers to Identify Differentially Expressed Genes Involved in Purple-Red and Yellow Petal Pigmentation

    PubMed Central

    Wang, Yan; Li, Kui; Zheng, Baoqiang; Miao, Kun

    2015-01-01

    Tree peony (Paeonia suffruticosa Andrews) is a very famous traditional ornamental plant in China. P. delavayi is a species endemic to Southwest China that has aroused great interest from researchers as a precious genetic resource for flower color breeding. However, the current understanding of the molecular mechanisms of flower pigmentation in this plant is limited, hindering the genetic engineering of novel flower color in tree peonies. In this study, we conducted a large-scale transcriptome analysis based on Illumina HiSeq sequencing of cDNA libraries generated from yellow and purple-red P. delavayi petals. A total of 90,202 unigenes were obtained by de novo assembly, with an average length of 721 nt. Using Blastx, 44,811 unigenes (49.68%) were found to have significant similarity to accessions in the NR, NT, and Swiss-Prot databases. We also examined COG, GO and KEGG annotations to better understand the functions of these unigenes. Further analysis of the two digital transcriptomes revealed that 6,855 unigenes were differentially expressed between yellow and purple-red flower petals, with 3,430 up-regulated and 3,425 down-regulated. According to the RNA-Seq data and qRT-PCR analysis, we proposed that four up-regulated key structural genes, including F3H, DFR, ANS and 3GT, might play an important role in purple-red petal pigmentation, while high co-expression of THC2'GT, CHI and FNS II ensures the accumulation of pigments contributing to the yellow color. We also found 50 differentially expressed transcription factors that might be involved in flavonoid biosynthesis. This study is the first to report genetic information for P. delavayi. The large number of gene sequences produced by transcriptome sequencing and the candidate genes identified using pathway mapping and expression profiles will provide a valuable resource for future association studies aimed at better understanding the molecular mechanisms underlying flower pigmentation in tree peonies. 
    PMID:26267644

  4. Metadata Analysis of Phanerochaete chrysosporium Gene Expression Data Identified Common CAZymes Encoding Gene Expression Profiles Involved in Cellulose and Hemicellulose Degradation.

    PubMed

    Kameshwar, Ayyappa Kumar Sista; Qin, Wensheng

    2017-01-01

    The popular wood-degrading white rot fungus Phanerochaete chrysosporium has been studied extensively in the literature for its lignin-degrading mechanisms, but much less for its cellulose- and hemicellulose-degrading abilities. This study delineates cellulose and hemicellulose degrading mechanisms through large-scale metadata analysis of P. chrysosporium gene expression data (retrieved from NCBI GEO) to understand the common expression patterns of differentially expressed genes when cultured on different growth substrates. Genes encoding glycoside hydrolase classes commonly expressed during breakdown of cellulose (GH-5, 6, 7, 9, 44, 45, 48) and hemicellulose (GH-2, 8, 10, 11, 26, 30, 43, 47) were found to be highly expressed among varied growth conditions, including simple customized and complex natural plant biomass growth media. Genes encoding carbohydrate esterase class enzymes CE (1, 4, 8, 9, 15, 16), polysaccharide lyase class enzymes PL-8 and PL-14, and glycosyl transferase classes GT (1, 2, 4, 8, 15, 20, 35, 39, 48) were differentially expressed in natural plant biomass growth media. Based on these results, P. chrysosporium grown on natural plant biomass substrates was found to express lignin- and hemicellulose-degrading enzymes more than cellulolytic enzymes, except GH-61 (LPMO) class enzymes, in early stages. It was observed that the fate of the P. chrysosporium transcriptome is significantly affected by the wood substrate provided. We believe the gene expression findings in this study play a crucial role in developing genetically efficient microbes with effective cellulose and hemicellulose degradation abilities.

  5. A Modified ABCDE Model of Flowering in Orchids Based on Gene Expression Profiling Studies of the Moth Orchid Phalaenopsis aphrodite

    PubMed Central

    Lee, Ann-Ying; Chen, Chun-Yi; Chang, Yao-Chien Alex; Chao, Ya-Ting; Shih, Ming-Che

    2013-01-01

    Previously we developed genomic resources for orchids, including transcriptomic analyses using next-generation sequencing techniques and construction of a web-based orchid genomic database. Here, we report a modified molecular model of flower development in the Orchidaceae based on functional analysis of gene expression profiles in Phalaenopsis aphrodite (a moth orchid) that revealed novel roles for the transcription factors involved in floral organ pattern formation. Phalaenopsis orchid floral organ-specific genes were identified by microarray analysis. Several critical transcription factors, including AP3, PI, AP1 and AGL6, displayed distinct spatial distribution patterns. Phylogenetic analysis of orchid MADS box genes was conducted to infer the evolutionary relationship among floral organ-specific genes. The results suggest that duplication of MADS box genes in orchids may have resulted in their gaining novel functions during evolution. Based on these analyses, a modified model of orchid flowering was proposed. Comparison of the expression profiles of flowers of a peloric mutant and wild-type Phalaenopsis orchid further identified genes associated with lip morphology and peloric effects. Large-scale investigation of gene expression profiles revealed that homeotic genes from the ABCDE model of flower development classes A and B in the Phalaenopsis orchid have novel functions due to evolutionary diversification, and display differential expression patterns. PMID:24265826

  6. Evidence for Alteration of Gene Regulatory Networks through MicroRNAs of the HIV-infected brain: novel analysis of retrospective cases.

    PubMed

    Tatro, Erick T; Scott, Erick R; Nguyen, Timothy B; Salaria, Shahid; Banerjee, Sugato; Moore, David J; Masliah, Eliezer; Achim, Cristian L; Everall, Ian P

    2010-04-26

    HIV infection disturbs the central nervous system (CNS) through inflammation and glial activation. Evidence suggests roles for microRNA (miRNA) in host defense and neuronal homeostasis, though little is known about miRNAs' role in HIV CNS infection. MiRNAs are non-coding RNAs that regulate gene translation through post-transcriptional mechanisms. Messenger-RNA profiling alone is insufficient to elucidate the dynamic dance of molecular expression of the genome. We sought to clarify RNA alterations in the frontal cortex (FC) of HIV-infected individuals and those concurrently infected and diagnosed with major depressive disorder (MDD). This report is the first published study of large-scale miRNA profiling from human HIV-infected FC. The goals of this study were to: 1. Identify changes in miRNA expression that occurred in the frontal cortex (FC) of HIV individuals, 2. Determine whether miRNA expression profiles of the FC could differentiate HIV from HIV/MDD, and 3. Adapt a method to meaningfully integrate gene expression data and miRNA expression data in clinical samples. We isolated RNA from the FC (n = 3) of three separate groups (uninfected controls, HIV, and HIV/MDD) and then pooled the RNA within each group for use in large-scale miRNA profiling. RNA from HIV and HIV/MDD patients (n = 4 per group) was also used for non-pooled mRNA analysis on Affymetrix U133 Plus 2.0 arrays. We then utilized a method for integrating the two datasets in a Target Bias Analysis. We found miRNAs of three types: A) Those with many dysregulated mRNA targets of less stringent statistical significance, B) Fewer dysregulated target-genes of highly stringent statistical significance, and C) unclear bias. In HIV/MDD, more miRNAs were downregulated than in HIV alone. Specific miRNA families at targeted chromosomal loci were dysregulated. The dysregulated miRNAs clustered on Chromosomes 14, 17, 19, and X.
A small subset of dysregulated genes had many 3' untranslated region (3'UTR) target-sites for dysregulated miRNAs. We provide evidence that certain miRNAs serve as key elements in gene regulatory networks in HIV-infected FC and may be implicated in neurobehavioral disorder. Finally, our data indicate that some genes may serve as hubs of miRNA activity.

  7. Linking Automated Data Analysis and Visualization with Applications in Developmental Biology and High-Energy Physics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ruebel, Oliver

    2009-11-20

    Knowledge discovery from large and complex collections of today's scientific datasets is a challenging task. With the ability to measure and simulate more processes at increasingly finer spatial and temporal scales, the increasing number of data dimensions and data objects is presenting tremendous challenges for data analysis and effective data exploration methods and tools. Researchers are overwhelmed with data and standard tools are often insufficient to enable effective data analysis and knowledge discovery. The main objective of this thesis is to provide important new capabilities to accelerate scientific knowledge discovery from large, complex, and multivariate scientific data. The research covered in this thesis addresses these scientific challenges using a combination of scientific visualization, information visualization, automated data analysis, and other enabling technologies, such as efficient data management. The effectiveness of the proposed analysis methods is demonstrated via applications in two distinct scientific research fields, namely developmental biology and high-energy physics. Advances in microscopy, image analysis, and embryo registration enable for the first time measurement of gene expression at cellular resolution for entire organisms. Analysis of high-dimensional spatial gene expression datasets is a challenging task. By integrating data clustering and visualization, analysis of complex, time-varying, spatial gene expression patterns and their formation becomes possible. The analysis framework MATLAB and the visualization have been integrated, making advanced analysis tools accessible to biologists and enabling bioinformatic researchers to directly integrate their analysis with the visualization. Laser wakefield particle accelerators (LWFAs) promise to be a new compact source of high-energy particles and radiation, with wide applications ranging from medicine to physics.
To gain insight into the complex physical processes of particle acceleration, physicists model LWFAs computationally. The datasets produced by LWFA simulations are (i) extremely large, (ii) of varying spatial and temporal resolution, (iii) heterogeneous, and (iv) high-dimensional, making analysis and knowledge discovery from complex LWFA simulation data a challenging task. To address these challenges this thesis describes the integration of the visualization system VisIt and the state-of-the-art index/query system FastBit, enabling interactive visual exploration of extremely large three-dimensional particle datasets. Researchers are especially interested in beams of high-energy particles formed during the course of a simulation. This thesis describes novel methods for automatic detection and analysis of particle beams enabling a more accurate and efficient data analysis process. By integrating these automated analysis methods with visualization, this research enables more accurate, efficient, and effective analysis of LWFA simulation data than previously possible.

  8. Module discovery by exhaustive search for densely connected, co-expressed regions in biomolecular interaction networks.

    PubMed

    Colak, Recep; Moser, Flavia; Chu, Jeffrey Shih-Chieh; Schönhuth, Alexander; Chen, Nansheng; Ester, Martin

    2010-10-25

    Computational prediction of functionally related groups of genes (functional modules) from large-scale data is an important issue in computational biology. Gene expression experiments and interaction networks are well studied large-scale data sources, available for many not yet exhaustively annotated organisms. It has been well established, when analyzing these two data sources jointly, modules are often reflected by highly interconnected (dense) regions in the interaction networks whose participating genes are co-expressed. However, the tractability of the problem had remained unclear and methods by which to exhaustively search for such constellations had not been presented. We provide an algorithmic framework, referred to as Densely Connected Biclustering (DECOB), by which the aforementioned search problem becomes tractable. To benchmark the predictive power inherent to the approach, we computed all co-expressed, dense regions in physical protein and genetic interaction networks from human and yeast. An automatized filtering procedure reduces our output which results in smaller collections of modules, comparable to state-of-the-art approaches. Our results performed favorably in a fair benchmarking competition which adheres to standard criteria. We demonstrate the usefulness of an exhaustive module search, by using the unreduced output to more quickly perform GO term related function prediction tasks. We point out the advantages of our exhaustive output by predicting functional relationships using two examples. We demonstrate that the computation of all densely connected and co-expressed regions in interaction networks is an approach to module discovery of considerable value. 
Beyond confirming the well settled hypothesis that such co-expressed, densely connected interaction network regions reflect functional modules, we open up novel computational ways to comprehensively analyze the modular organization of an organism based on prevalent and largely available large-scale datasets. Software and data sets are available at http://www.sfu.ca/~ester/software/DECOB.zip.

  9. Gene expression profiling of single cells on large-scale oligonucleotide arrays

    PubMed Central

    Hartmann, Claudia H.; Klein, Christoph A.

    2006-01-01

    Over the last decade, important insights into the regulation of cellular responses to various stimuli were gained by global gene expression analyses of cell populations. More recently, specific cell functions and underlying regulatory networks of rare cells isolated from their natural environment moved to the center of attention. However, low cell numbers still hinder gene expression profiling of rare ex vivo material in biomedical research. Therefore, we developed a robust method for gene expression profiling of single cells on high-density oligonucleotide arrays with excellent coverage of low abundance transcripts. The protocol was extensively tested with freshly isolated single cells of very low mRNA content including single epithelial, mature and immature dendritic cells and hematopoietic stem cells. Quantitative PCR confirmed that the PCR-based global amplification method did not change the relative ratios of transcript abundance and unsupervised hierarchical cluster analysis revealed that the histogenetic origin of an individual cell is correctly reflected by the gene expression profile. Moreover, the gene expression data from dendritic cells demonstrate that cellular differentiation and pathway activation can be monitored in individual cells. PMID:17071717

  10. Cell-free translational screening of an expression sequence tag library of Clonorchis sinensis for novel antigen discovery.

    PubMed

    Kasi, Devi; Catherine, Christy; Lee, Seung-Won; Lee, Kyung-Ho; Kim, Yu Jung; Ro Lee, Myeong; Ju, Jung Won; Kim, Dong-Myung

    2017-05-01

    The rapidly evolving cloning and sequencing technologies have enabled understanding of genomic structure of parasite genomes, opening up new ways of combatting parasite-related diseases. To make the most of the exponentially accumulating genomic data, however, it is crucial to analyze the proteins encoded by these genomic sequences. In this study, we adopted an engineered cell-free protein synthesis system for large-scale expression screening of an expression sequence tag (EST) library of Clonorchis sinensis to identify potential antigens that can be used for diagnosis and treatment of clonorchiasis. To allow high-throughput expression and identification of individual genes comprising the library, a cell-free synthesis reaction was designed such that both the template DNA and the expressed proteins were co-immobilized on the same microbeads, leading to microbead-based linkage of the genotype and phenotype. This reaction configuration allowed streamlined expression, recovery, and analysis of proteins. This approach enabled us to identify 21 antigenic proteins. © 2017 American Institute of Chemical Engineers Biotechnol. Prog., 33:832-837, 2017.

  11. Diversity analysis in Cannabis sativa based on large-scale development of expressed sequence tag-derived simple sequence repeat markers.

    PubMed

    Gao, Chunsheng; Xin, Pengfei; Cheng, Chaohua; Tang, Qing; Chen, Ping; Wang, Changbiao; Zang, Gonggu; Zhao, Lining

    2014-01-01

    Cannabis sativa L. is an important economic plant for the production of food, fiber, oils, and intoxicants. However, lack of sufficient simple sequence repeat (SSR) markers has limited the development of cannabis genetic research. Here, large-scale development of expressed sequence tag simple sequence repeat (EST-SSR) markers was performed to obtain more informative genetic markers, and to assess genetic diversity in cannabis (Cannabis sativa L.). Based on the cannabis transcriptome, 4,577 SSRs were identified from 3,624 ESTs. From there, a total of 3,442 complementary primer pairs were designed as SSR markers. Among these markers, trinucleotide repeat motifs (50.99%) were the most abundant, followed by hexanucleotide (25.13%), dinucleotide (16.34%), tetranucleotide (3.8%), and pentanucleotide (3.74%) repeat motifs, respectively. The AAG/CTT trinucleotide repeat (17.96%) was the most abundant motif detected in the SSRs. One hundred and seventeen EST-SSR markers were randomly selected to evaluate primer quality in 24 cannabis varieties. Among these 117 markers, 108 (92.31%) were successfully amplified and 87 (74.36%) were polymorphic. Forty-five polymorphic primer pairs were selected to evaluate genetic diversity and relatedness among the 115 cannabis genotypes. The results showed that 115 varieties could be divided into 4 groups primarily based on geography: Northern China, Europe, Central China, and Southern China. Moreover, the coefficient of similarity when comparing cannabis from Northern China with the European group cannabis was higher than that when comparing with cannabis from the other two groups, owing to a similar climate. This study outlines the first large-scale development of SSR markers for cannabis. These data may serve as a foundation for the development of genetic linkage, quantitative trait loci mapping, and marker-assisted breeding of cannabis.

  12. Diversity Analysis in Cannabis sativa Based on Large-Scale Development of Expressed Sequence Tag-Derived Simple Sequence Repeat Markers

    PubMed Central

    Cheng, Chaohua; Tang, Qing; Chen, Ping; Wang, Changbiao; Zang, Gonggu; Zhao, Lining

    2014-01-01

    Cannabis sativa L. is an important economic plant for the production of food, fiber, oils, and intoxicants. However, lack of sufficient simple sequence repeat (SSR) markers has limited the development of cannabis genetic research. Here, large-scale development of expressed sequence tag simple sequence repeat (EST-SSR) markers was performed to obtain more informative genetic markers, and to assess genetic diversity in cannabis (Cannabis sativa L.). Based on the cannabis transcriptome, 4,577 SSRs were identified from 3,624 ESTs. From there, a total of 3,442 complementary primer pairs were designed as SSR markers. Among these markers, trinucleotide repeat motifs (50.99%) were the most abundant, followed by hexanucleotide (25.13%), dinucleotide (16.34%), tetranucleotide (3.8%), and pentanucleotide (3.74%) repeat motifs, respectively. The AAG/CTT trinucleotide repeat (17.96%) was the most abundant motif detected in the SSRs. One hundred and seventeen EST-SSR markers were randomly selected to evaluate primer quality in 24 cannabis varieties. Among these 117 markers, 108 (92.31%) were successfully amplified and 87 (74.36%) were polymorphic. Forty-five polymorphic primer pairs were selected to evaluate genetic diversity and relatedness among the 115 cannabis genotypes. The results showed that 115 varieties could be divided into 4 groups primarily based on geography: Northern China, Europe, Central China, and Southern China. Moreover, the coefficient of similarity when comparing cannabis from Northern China with the European group cannabis was higher than that when comparing with cannabis from the other two groups, owing to a similar climate. This study outlines the first large-scale development of SSR markers for cannabis. These data may serve as a foundation for the development of genetic linkage, quantitative trait loci mapping, and marker-assisted breeding of cannabis. PMID:25329551
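
    The SSR mining step described in the abstract above can be illustrated with a minimal sketch. The abstract does not state which tool or repeat thresholds the authors used, so the regular-expression approach, the `find_ssrs` helper, the `min_repeats` threshold, and the toy sequence are all assumptions made for illustration. The last lines only re-derive the percentages quoted in the abstract from its own counts.

```python
import re

# Motif lengths considered; names follow the classes listed in the abstract.
MOTIF_NAMES = {2: "dinucleotide", 3: "trinucleotide", 4: "tetranucleotide",
               5: "pentanucleotide", 6: "hexanucleotide"}


def find_ssrs(sequence, min_repeats=4):
    """Return sorted (start, motif, repeat_count) tuples for perfect tandem repeats.

    min_repeats is an illustrative threshold, not the study's criterion.
    """
    sequence = sequence.upper()
    hits = []
    for motif_len in sorted(MOTIF_NAMES):
        # A motif of motif_len bases followed by at least min_repeats - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_repeats - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "ATAT").
            if any(motif == motif[:k] * (motif_len // k)
                   for k in range(1, motif_len) if motif_len % k == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return sorted(hits)


def percent(part, whole):
    """Percentage rounded to two decimals, as reported in the abstract."""
    return round(100.0 * part / whole, 2)


# Toy EST fragment (hypothetical): five AAG repeats and five AC repeats.
est = "GGCT" + "AAG" * 5 + "TTG" + "AC" * 5 + "TGG"
ssr_hits = find_ssrs(est)

# Counts taken from the abstract: 117 markers tested, 108 amplified, 87 polymorphic.
amplified_pct = percent(108, 117)
polymorphic_pct = percent(87, 117)
motif_share_total = round(50.99 + 25.13 + 16.34 + 3.8 + 3.74, 2)
```

    On the toy fragment the sketch reports one AAG repeat and one AC repeat, and the recomputed percentages agree with the 92.31% and 74.36% given in the abstract. The five motif-class shares sum to 100%.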

  13. An Analysis of Large-Scale Writing Assessments in Canada (Grades 5-8)

    ERIC Educational Resources Information Center

    Peterson, Shelley Stagg; McClay, Jill; Main, Kristin

    2011-01-01

    This paper reports on an analysis of large-scale assessments of Grades 5-8 students' writing across 10 provinces and 2 territories in Canada. Theory, classroom practice, and the contributions and constraints of large-scale writing assessment are brought together with a focus on Grades 5-8 writing in order to provide both a broad view of…

  14. Mayday - integrative analytics for expression data

    PubMed Central

    2010-01-01

    Background: DNA microarrays have become the standard method for large-scale analyses of gene expression and epigenomics. The increasing complexity and inherent noisiness of the generated data make visual data exploration ever more important. Fast deployment of new methods as well as a combination of predefined, easy-to-apply methods with programmer's access to the data are important requirements for any analysis framework. Mayday is an open source platform with emphasis on visual data exploration and analysis. Many built-in methods for clustering, machine learning and classification are provided for dissecting complex datasets. Plugins can easily be written to extend Mayday's functionality in a large number of ways. As a Java program, Mayday is platform-independent and can be used as a Java WebStart application without any installation. Mayday can import data from several file formats, and database connectivity is included for efficient data organization. Numerous interactive visualization tools, including box plots, profile plots, principal component plots and a heatmap, are available; they can be enhanced with metadata and exported as publication-quality vector files. Results: We have rewritten large parts of Mayday's core to make it more efficient and ready for future developments. Among the large number of new plugins are an automated processing framework, dynamic filtering, new and efficient clustering methods, a machine learning module and database connectivity. Extensive manual data analysis can be done using an inbuilt R terminal and an integrated SQL querying interface. Our visualization framework has become more powerful; new plot types have been added and existing plots improved. Conclusions: We present a major extension of Mayday, a very versatile open-source framework for efficient microarray data analysis designed for biologists and bioinformaticians. Most everyday tasks are already covered.
The large number of available plugins as well as the extension possibilities using compiled plugins and ad-hoc scripting allow for the rapid adaptation of Mayday to very specialized data exploration as well. Mayday is available at http://microarray-analysis.org. PMID:20214778

  15. Characterization of the dynamics of the atmosphere of Venus with Doppler velocimetry

    NASA Astrophysics Data System (ADS)

    Machado, Pedro Miguel Borges do Canto Mota

    The study of Venus' atmosphere is currently growing as a theme of major interest in the astrophysics community. The most significant aspect of the general circulation of the atmosphere of Venus is its retrograde super-rotation. A complete characterization of this dynamical phenomenon is crucial for understanding its driving mechanisms. This work participates in the international effort to characterize the atmospheric dynamics of this planet in coordination with orbiter missions, in particular with Venus Express. The objectives of this study are to investigate the nature of the processes governing the super-rotation of the atmosphere of Venus using ground-based observations, thereby complementing measurements by orbiter instruments. This thesis analyzes observations of Venus made with two different instruments and Doppler velocimetry techniques. The data analysis technique allowed an unambiguous characterization of the zonal wind latitudinal profile and its temporal variability, as well as an investigation of the signature of large-scale planetary waves and their role in the maintenance of the zonal super-rotation, and the results suggest that detection and investigation of large-scale planetary waves can be carried out with this technique. These studies complement the independent observations of the European space mission Venus Express, in particular as regards the study of atmospheric super-rotation, meridional flow and its variability. (Abstract shortened by ProQuest.)

  16. Combining Flux Balance and Energy Balance Analysis for Large-Scale Metabolic Network: Biochemical Circuit Theory for Analysis of Large-Scale Metabolic Networks

    NASA Technical Reports Server (NTRS)

    Beard, Daniel A.; Liang, Shou-Dan; Qian, Hong; Biegel, Bryan (Technical Monitor)

    2001-01-01

    Predicting the behavior of large-scale biochemical metabolic networks represents one of the greatest challenges of bioinformatics and computational biology. Approaches, such as flux balance analysis (FBA), that account for the known stoichiometry of the reaction network while avoiding implementation of detailed reaction kinetics are perhaps the most promising tools for the analysis of large complex networks. As a step towards building a complete theory of biochemical circuit analysis, we introduce energy balance analysis (EBA), which complements the FBA approach by introducing fundamental constraints based on the first and second laws of thermodynamics. Fluxes obtained with EBA are thermodynamically feasible and provide valuable insight into the activation and suppression of biochemical pathways.
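
    The two kinds of constraint mentioned in the abstract above can be made concrete on a toy network. The sketch below is illustrative only: the three-reaction pathway, the function names, and the numbers are hypothetical. The second-law check (a nonzero flux must run down its chemical-potential difference) is an assumed simplification, not the formulation in the report.

```python
# Toy linear pathway (hypothetical): uptake -> A -> B -> secretion.
# Rows are metabolites (A, B); columns are reactions (uptake, conversion, secretion).
STOICHIOMETRY = [
    [1, -1, 0],   # A: produced by uptake, consumed by conversion
    [0, 1, -1],   # B: produced by conversion, consumed by secretion
]


def net_production(stoichiometry, fluxes):
    """Net production rate of each metabolite, i.e. the product S * v."""
    return [sum(coeff * flux for coeff, flux in zip(row, fluxes))
            for row in stoichiometry]


def is_steady_state(stoichiometry, fluxes, tol=1e-9):
    """Flux-balance constraint: every metabolite is produced as fast as it is consumed."""
    return all(abs(rate) <= tol for rate in net_production(stoichiometry, fluxes))


def is_thermodynamically_consistent(fluxes, potential_changes):
    """Assumed second-law check: each nonzero flux v has v * delta_mu < 0."""
    return all(flux == 0 or flux * delta_mu < 0
               for flux, delta_mu in zip(fluxes, potential_changes))


balanced_fluxes = [2.0, 2.0, 2.0]
unbalanced_fluxes = [2.0, 1.0, 1.0]

balanced_ok = is_steady_state(STOICHIOMETRY, balanced_fluxes)
unbalanced_ok = is_steady_state(STOICHIOMETRY, unbalanced_fluxes)
downhill_ok = is_thermodynamically_consistent(balanced_fluxes, [-1.0, -0.5, -2.0])
uphill_ok = is_thermodynamically_consistent(balanced_fluxes, [-1.0, 0.5, -2.0])
```

    A flux vector can satisfy mass balance and still be rejected by the thermodynamic check, which is the sense in which the second set of constraints narrows the first.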

  17. Characterization and Enhanced Processing of Soluble, Oligomeric gp140 Envelope Glycoproteins Derived from Human Immunodeficiency Virus Type-1 Primary Isolates

    DTIC Science & Technology

    2001-05-01

    isolates could retain gp120 in an oligomer. A large scale purification scheme was developed using lentil lectin affinity and size exclusion...34 e. Western blot analysis……………………………………………… 35 f. Large scale protein expression and purification…………………... 35 g. Metabolic labeling, size...isolate HIV-1 Env………... 60 c. Large scale antigen preparation and analysis……………………… 67 d. Cleaved, soluble crosslinked primary isolate Env binds

  18. Expression of Steroid Receptors in Ameloblasts during Amelogenesis in Rat Incisors.

    PubMed

    Houari, Sophia; Loiodice, Sophia; Jedeon, Katia; Berdal, Ariane; Babajko, Sylvie

    2016-01-01

    Endocrine disrupting chemicals (EDCs) play a part in the modern burst of diseases and interfere with the steroid hormone axis. Bisphenol A (BPA), one of the most active and widely used EDCs, affects ameloblast functions, leading to an enamel hypomineralization pattern similar to that of Molar Incisor Hypomineralization (MIH). In order to explore the molecular pathways stimulated by BPA during amelogenesis, we thoroughly investigated the receptors known to directly or indirectly mediate the effects of BPA. The expression patterns of high affinity BPA receptors (ERRγ, GPR30), of ketosteroid receptors (ERs, AR, PGR, GR, MR), of the retinoid receptor RXRα, and PPARγ were established using RT-qPCR analysis of RNAs extracted from microdissected enamel organ of adult rats. Their expression was dependent on the stage of ameloblast differentiation, except that of ERβ and PPARγ which remained undetectable. An additional large scale microarray analysis revealed three main groups of receptors according to their level of expression in maturation-stage ameloblasts. The expression level of RXRα was the highest, similar to the vitamin D receptor (VDR), whereas the others were 13 to 612-fold lower, with AR and GR being intermediate. Immunofluorescent analysis of VDR, ERα and AR confirmed their presence mainly in maturation-stage ameloblasts. These data provide further evidence that ameloblasts express a specific combination of hormonal receptors depending on their developmental stage. This study represents the first step toward understanding dental endocrinology as well as some of the effects of EDCs on the pathophysiology of amelogenesis.

  19. Expression of Steroid Receptors in Ameloblasts during Amelogenesis in Rat Incisors

    PubMed Central

    Houari, Sophia; Loiodice, Sophia; Jedeon, Katia; Berdal, Ariane; Babajko, Sylvie

    2016-01-01

    Endocrine disrupting chemicals (EDCs) play a part in the modern burst of diseases and interfere with the steroid hormone axis. Bisphenol A (BPA), one of the most active and widely used EDCs, affects ameloblast functions, leading to an enamel hypomineralization pattern similar to that of Molar Incisor Hypomineralization (MIH). In order to explore the molecular pathways stimulated by BPA during amelogenesis, we thoroughly investigated the receptors known to directly or indirectly mediate the effects of BPA. The expression patterns of high affinity BPA receptors (ERRγ, GPR30), of ketosteroid receptors (ERs, AR, PGR, GR, MR), of the retinoid receptor RXRα, and PPARγ were established using RT-qPCR analysis of RNAs extracted from microdissected enamel organ of adult rats. Their expression was dependent on the stage of ameloblast differentiation, except that of ERβ and PPARγ which remained undetectable. An additional large scale microarray analysis revealed three main groups of receptors according to their level of expression in maturation-stage ameloblasts. The expression level of RXRα was the highest, similar to the vitamin D receptor (VDR), whereas the others were 13 to 612-fold lower, with AR and GR being intermediate. Immunofluorescent analysis of VDR, ERα and AR confirmed their presence mainly in maturation-stage ameloblasts. These data provide further evidence that ameloblasts express a specific combination of hormonal receptors depending on their developmental stage. This study represents the first step toward understanding dental endocrinology as well as some of the effects of EDCs on the pathophysiology of amelogenesis. PMID:27853434

  20. ProteinInferencer: Confident protein identification and multiple experiment comparison for large scale proteomics projects.

    PubMed

    Zhang, Yaoyang; Xu, Tao; Shan, Bing; Hart, Jonathan; Aslanian, Aaron; Han, Xuemei; Zong, Nobel; Li, Haomin; Choi, Howard; Wang, Dong; Acharya, Lipi; Du, Lisa; Vogt, Peter K; Ping, Peipei; Yates, John R

    2015-11-03

    Shotgun proteomics generates valuable information from large-scale and target protein characterizations, including protein expression, protein quantification, protein post-translational modifications (PTMs), protein localization, and protein-protein interactions. Typically, peptides derived from proteolytic digestion, rather than intact proteins, are analyzed by mass spectrometers because peptides are more readily separated, ionized and fragmented. The amino acid sequences of peptides can be interpreted by matching the observed tandem mass spectra to theoretical spectra derived from a protein sequence database. Identified peptides serve as surrogates for their proteins and are often used to establish what proteins were present in the original mixture and to quantify protein abundance. Two major issues exist for assigning peptides to their originating protein. The first issue is maintaining a desired false discovery rate (FDR) when comparing or combining multiple large datasets generated by shotgun analysis and the second issue is properly assigning peptides to proteins when homologous proteins are present in the database. Herein we demonstrate a new computational tool, ProteinInferencer, which can be used for protein inference with both small- or large-scale data sets to produce a well-controlled protein FDR. In addition, ProteinInferencer introduces confidence scoring for individual proteins, which makes protein identifications evaluable. This article is part of a Special Issue entitled: Computational Proteomics. Copyright © 2015. Published by Elsevier B.V.

  1. Information science team

    NASA Technical Reports Server (NTRS)

    Billingsley, F.

    1982-01-01

    Concerns are expressed about the data handling aspects of system design and about enabling technology for data handling and data analysis. The status, contributing factors, critical issues, and recommendations for investigations are listed for data handling, rectification and registration, and information extraction. Potential supports to individual P.I., research tasks, systematic data system design, and to system operation. The need for an airborne spectrometer class instrument for fundamental research in high spectral and spatial resolution is indicated. Geographic information system formatting and labelling techniques, very large scale integration, and methods for providing multitype data sets must also be developed.

  2. Evaluation of RNAi and CRISPR technologies by large-scale gene expression profiling in the Connectivity Map.

    PubMed

    Smith, Ian; Greenside, Peyton G; Natoli, Ted; Lahr, David L; Wadden, David; Tirosh, Itay; Narayan, Rajiv; Root, David E; Golub, Todd R; Subramanian, Aravind; Doench, John G

    2017-11-01

    The application of RNA interference (RNAi) to mammalian cells has provided the means to perform phenotypic screens to determine the functions of genes. Although RNAi has revolutionized loss-of-function genetic experiments, it has been difficult to systematically assess the prevalence and consequences of off-target effects. The Connectivity Map (CMAP) represents an unprecedented resource to study the gene expression consequences of expressing short hairpin RNAs (shRNAs). Analysis of signatures for over 13,000 shRNAs applied in 9 cell lines revealed that microRNA (miRNA)-like off-target effects of RNAi are far stronger and more pervasive than generally appreciated. We show that mitigating off-target effects is feasible in these datasets via computational methodologies to produce a consensus gene signature (CGS). In addition, we compared RNAi technology to clustered regularly interspaced short palindromic repeat (CRISPR)-based knockout by analysis of 373 single guide RNAs (sgRNAs) in 6 cell lines and show that the on-target efficacies are comparable, but CRISPR technology is far less susceptible to systematic off-target effects. These results will help guide the proper use and analysis of loss-of-function reagents for the determination of gene function.

  3. [Validation of the Spanish version of the Frankfurt Emotion Work Scales].

    PubMed

    Ortiz Bonnín, Silvia; Navarro Guzmán, Capilla; García Buades, Esther; Ramis Palmer, Carmen; Manassero Mas, M Antonia

    2012-05-01

    This study presents the validity and reliability analysis of a questionnaire that assesses emotion work in the service sector. Emotion work is a term introduced by Hochschild (1983) and it refers to the expression of organizationally desirable emotions to influence the interactions with clients at work. The results show a 6-factor structure: Requirement to display Positive, Negative and Neutral Emotions, Sensitivity Requirements, Interaction Control and Emotional Dissonance. The analysis of the sub-scale scores reveals that the most frequently expressed emotions are positive, whereas negative emotions are expressed less frequently.

  4. Advanced Connectivity Analysis (ACA): a Large Scale Functional Connectivity Data Mining Environment.

    PubMed

    Chen, Rong; Nixon, Erika; Herskovits, Edward

    2016-04-01

    Using resting-state functional magnetic resonance imaging (rs-fMRI) to study functional connectivity is of great importance to understand normal development and function as well as a host of neurological and psychiatric disorders. Seed-based analysis is one of the most widely used rs-fMRI analysis methods. Here we describe a freely available large scale functional connectivity data mining software package called Advanced Connectivity Analysis (ACA). ACA enables large-scale seed-based analysis and brain-behavior analysis. It can seamlessly examine a large number of seed regions with minimal user input. ACA has a brain-behavior analysis component to delineate associations among imaging biomarkers and one or more behavioral variables. We demonstrate applications of ACA to rs-fMRI data sets from a study of autism.
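
    Seed-based analysis, as named in the abstract above, is commonly computed as the correlation between a seed region's time series and every other region's time series. The sketch below illustrates that general idea only and is not ACA's implementation; the function names, region labels, and the five-point time series are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def seed_connectivity(seed_series, region_series):
    """Correlate the seed time series with each named region's time series."""
    return {name: pearson(seed_series, series)
            for name, series in region_series.items()}


# Hypothetical signals sampled at five time points.
seed = [1.0, 2.0, 3.0, 4.0, 5.0]
regions = {
    "region_a": [2.0, 4.0, 6.0, 8.0, 10.0],  # rises with the seed
    "region_b": [5.0, 4.0, 3.0, 2.0, 1.0],   # falls as the seed rises
    "region_c": [1.0, 3.0, 2.0, 3.0, 1.0],   # no linear relation to the seed
}
connectivity = seed_connectivity(seed, regions)
```

    Repeating `seed_connectivity` over many seeds gives one connectivity map per seed, which is the kind of batch computation the abstract describes as requiring minimal user input.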

  5. Negative Symptom Dimensions of the Positive and Negative Syndrome Scale Across Geographical Regions

    PubMed Central

    Liharska, Lora; Harvey, Philip D.; Atkins, Alexandra; Ulshen, Daniel; Keefe, Richard S.E.

    2017-01-01

    Objective: Recognizing the discrete dimensions that underlie negative symptoms in schizophrenia and how these dimensions are understood across localities might result in better understanding and treatment of these symptoms. To this end, the objectives of this study were to 1) identify the Positive and Negative Syndrome Scale negative symptom dimensions of expressive deficits and experiential deficits and 2) analyze performance on these dimensions over 15 geographical regions to determine whether the items defining them manifest similar reliability across these regions. Design: Data were obtained for the baseline Positive and Negative Syndrome Scale visits of 6,889 subjects across 15 geographical regions. Using confirmatory factor analysis, we examined whether a two-factor negative symptom structure that is found in schizophrenia (experiential deficits and expressive deficits) would be replicated in our sample, and using differential item functioning, we tested the degree to which specific items from each negative symptom subfactor performed across geographical regions in comparison with the United States. Results: The two-factor negative symptom solution was replicated in this sample. Most geographical regions showed moderate-to-large differential item functioning for Positive and Negative Syndrome Scale expressive deficit items, especially N3 Poor Rapport, as compared with Positive and Negative Syndrome Scale experiential deficit items, showing that these items might be interpreted or scored differently in different regions. Across countries, except for India, the differential item functioning values did not favor raters in the United States. Conclusion: These results suggest that the Positive and Negative Syndrome Scale negative symptom factor can be better represented by a two-factor model than by a single-factor model. 
    Additionally, the results show significant differences in responses to items representing the Positive and Negative Syndrome Scale expressive factors, but not the experiential factors, across regions. This could be due to a lack of equivalence between the original and translated versions, cultural differences with the interpretation of items, dissimilarities in rater training, or diversity in the understanding of scoring anchors. Knowing which items are challenging for raters across regions can help to guide Positive and Negative Syndrome Scale training and improve the results of international clinical trials aimed at negative symptoms. PMID:29410935

  6. Using the Saccharomyces Genome Database (SGD) for analysis of genomic information

    PubMed Central

    Skrzypek, Marek S.; Hirschman, Jodi

    2011-01-01

    Analysis of genomic data requires access to software tools that place the sequence-derived information in the context of biology. The Saccharomyces Genome Database (SGD) integrates functional information about budding yeast genes and their products with a set of analysis tools that facilitate exploring their biological details. This unit describes how the various types of functional data available at SGD can be searched, retrieved, and analyzed. Starting with the guided tour of the SGD Home page and Locus Summary page, this unit highlights how to retrieve data using YeastMine, how to visualize genomic information with GBrowse, how to explore gene expression patterns with SPELL, and how to use Gene Ontology tools to characterize large-scale datasets. PMID:21901739

  7. Production of Self-Purifying Proteins in a Variety of Expression Hosts with Focus on Organophosphorus Hydrolase

    DTIC Science & Technology

    2012-08-17

    cell-density fermentation at laboratory scale, and have provided evidence of their effectiveness. Our most recent work has been on the optimization...of the fermentation process itself, as well as a more biochemical optimization of the expression system. Overall, the ARO support on this project...large scale in high-density fermentation in microbial hosts, which is a critical gap in its appeal. The overall goals of our first renewal proposal

  8. Optimization and analysis of large chemical kinetic mechanisms using the solution mapping method - Combustion of methane

    NASA Technical Reports Server (NTRS)

    Frenklach, Michael; Wang, Hai; Rabinowitz, Martin J.

    1992-01-01

    A method of systematic optimization, solution mapping, as applied to a large-scale dynamic model is presented. The basis of the technique is parameterization of model responses in terms of model parameters by simple algebraic expressions. These expressions are obtained by computer experiments arranged in a factorial design. The developed parameterized responses are then used in a joint multiparameter multidata-set optimization. A brief review of the mathematical background of the technique is given. The concept of active parameters is discussed. The technique is applied to determine an optimum set of parameters for a methane combustion mechanism. Five independent responses - comprising ignition delay times, pre-ignition methyl radical concentration profiles, and laminar premixed flame velocities - were optimized with respect to thirteen reaction rate parameters. The numerical predictions of the optimized model are compared to those computed with several recent literature mechanisms. The utility of the solution mapping technique in situations where the optimum is not unique is also demonstrated.
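
    The parameterization step described above can be illustrated with a minimal sketch. It assumes a two-level full factorial design in coded units and a polynomial surrogate with linear and pairwise interaction terms; the abstract only says "simple algebraic expressions", so the polynomial form, the function names, and `toy_model` (a hypothetical stand-in for an expensive kinetic simulation) are all assumptions made for illustration.

```python
import itertools
import numpy as np

def factorial_design(n_params):
    """Two-level full factorial design in coded units (-1, +1)."""
    return np.array(list(itertools.product([-1.0, 1.0], repeat=n_params)))

def design_matrix(x):
    """Columns: intercept, linear terms, then pairwise interaction terms."""
    x = np.atleast_2d(x)
    n = x.shape[1]
    cols = [np.ones(len(x))]
    cols += [x[:, i] for i in range(n)]
    cols += [x[:, i] * x[:, j] for i, j in itertools.combinations(range(n), 2)]
    return np.column_stack(cols)

def fit_response_surface(model, n_params):
    """Run the model at every design point and fit the surrogate by least squares."""
    x = factorial_design(n_params)
    y = np.array([model(row) for row in x])
    coeffs, *_ = np.linalg.lstsq(design_matrix(x), y, rcond=None)
    return coeffs

def predict(coeffs, x):
    """Evaluate the fitted algebraic expression instead of the full model."""
    return design_matrix(x) @ coeffs

def toy_model(p):
    # Hypothetical response; a real application would call the simulation here.
    return 2.0 + 0.5 * p[0] - 1.5 * p[1] + 0.25 * p[0] * p[2]

coeffs = fit_response_surface(toy_model, 3)
```

    Once fitted, the cheap surrogate returned by `predict` can stand in for the full model inside an optimizer, which is the role the parameterized responses play in the joint optimization described in the abstract.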

  9. Evaluation of Bias-Variance Trade-Off for Commonly Used Post-Summarizing Normalization Procedures in Large-Scale Gene Expression Studies

    PubMed Central

    Qiu, Xing; Hu, Rui; Wu, Zhixin

    2014-01-01

    Normalization procedures are widely used in high-throughput genomic data analyses to remove various technological noise and variations. They are known to have profound impact on the subsequent gene differential expression analysis. Although there has been some research in evaluating different normalization procedures, few attempts have been made to systematically evaluate the gene detection performances of normalization procedures from the bias-variance trade-off point of view, especially with strong gene differentiation effects and large sample size. In this paper, we conduct a thorough study to evaluate the effects of normalization procedures combined with several commonly used statistical tests and MTPs under different configurations of effect size and sample size. We conduct theoretical evaluation based on a random effect model, as well as simulation and biological data analyses to verify the results. Based on our findings, we provide some practical guidance for selecting a suitable normalization procedure under different scenarios. PMID:24941114

  10. Large-scale time-lapse microscopy of Oct4 expression in human embryonic stem cell colonies.

    PubMed

    Bhadriraju, Kiran; Halter, Michael; Amelot, Julien; Bajcsy, Peter; Chalfoun, Joe; Vandecreme, Antoine; Mallon, Barbara S; Park, Kye-Yoon; Sista, Subhash; Elliott, John T; Plant, Anne L

    2016-07-01

    Identification and quantification of the characteristics of stem cell preparations is critical for understanding stem cell biology and for the development and manufacturing of stem cell based therapies. We have developed image analysis and visualization software that allows effective use of time-lapse microscopy to provide spatial and dynamic information from large numbers of human embryonic stem cell colonies. To achieve statistically relevant sampling, we examined >680 colonies from 3 different preparations of cells over 5 days each, generating a total experimental dataset of 0.9 terabyte (TB). The 0.5 Giga-pixel images at each time point were represented by multi-resolution pyramids and visualized using the Deep Zoom JavaScript library extended to support viewing Giga-pixel images over time and extracting data on individual colonies. We present a methodology that enables quantification of variations in nominally-identical preparations and between colonies, correlation of colony characteristics with Oct4 expression, and identification of rare events. Copyright © 2016. Published by Elsevier B.V.

  11. Comment on 'Large-Scale Cognitive GWAS Meta-Analysis Reveals Tissue-Specific Neural Expression and Potential Nootropic Drug Targets' by Lam et al.

    PubMed

    Hill, W David

    2018-04-01

    Intelligence and educational attainment are strongly genetically correlated. This relationship can be exploited by Multi-Trait Analysis of GWAS (MTAG) to add power to Genome-wide Association Studies (GWAS) of intelligence. MTAG allows the user to meta-analyze GWASs of different phenotypes, based on their genetic correlations, to identify associations specific to the trait of choice. An MTAG analysis using GWAS data sets on intelligence and education was conducted by Lam et al. (2017). Lam et al. (2017) reported 70 loci that they described as 'trait specific' to intelligence. This article examines whether the analysis conducted by Lam et al. (2017) has resulted in genetic information about a phenotype that is more similar to education than intelligence.

  12. Manganese peroxidase from the white-rot fungus Phanerochaete chrysosporium is enzymatically active and accumulates to high levels in transgenic maize seed.

    PubMed

    Clough, Richard C; Pappu, Kameshwari; Thompson, Kevin; Beifuss, Katherine; Lane, Jeff; Delaney, Donna E; Harkey, Robin; Drees, Carol; Howard, John A; Hood, Elizabeth E

    2006-01-01

    Manganese peroxidase (MnP) has been implicated in lignin degradation and thus has potential applications in pulp and paper bleaching, enzymatic remediation and the textile industry. Transgenic plants are an emerging protein expression platform that offers many advantages over traditional systems, in particular their potential for large-scale industrial enzyme production. Several plant expression vectors were created to evaluate the accumulation of MnP from the wood-rot fungus Phanerochaete chrysosporium in maize seed. We showed that cell wall targeting yielded full-length MnP, whereas cytoplasmic localization resulted in multiple truncated peroxidase polypeptides as detected by immunoblot analysis. In addition, the use of a seed-preferred promoter dramatically increased the expression levels and reduced the negative effects on plant health. Multiple independent transgenic lines were backcrossed with elite inbred corn lines for several generations with the maintenance of high-level expression, indicating genetic stability of the transgene.

  13. A method for feature selection of APT samples based on entropy

    NASA Astrophysics Data System (ADS)

    Du, Zhenyu; Li, Yihong; Hu, Jinsong

    2018-05-01

    Through in-depth study of known APT attack events, this paper proposes a feature selection method for APT samples and a logic expression generation algorithm, IOCG (Indicator of Compromise Generate). The algorithm can automatically generate machine-readable IOCs (Indicators of Compromise), addressing the limitations of existing IOCs: their logical relationships are fixed, the number of logical items does not change, they are large in scale, and an expression cannot be generated from a sample. At the same time, it can reduce the time spent processing redundant and useless APT samples, improve the sharing rate of information analysis, and support an active response to complex and volatile APT attack situations. The samples were divided into an experimental set and a training set, and the algorithm was then used to generate the logical expression of the training set with the IOC_ Aware plug-in. The contrast expression itself was different from the detection result. The experimental results show that the algorithm is effective and can improve the detection effect.

  14. Transcriptional profiling of CD31(+) cells isolated from murine embryonic stem cells.

    PubMed

    Mariappan, Devi; Winkler, Johannes; Chen, Shuhua; Schulz, Herbert; Hescheler, Jürgen; Sachinidis, Agapios

    2009-02-01

    Identification of genes involved in endothelial differentiation is of great interest for the understanding of the cellular and molecular mechanisms involved in the development of new blood vessels. Mouse embryonic stem (mES) cells serve as a potential source of endothelial cells for transcriptomic analysis. We isolated endothelial cells from 8-day-old embryoid bodies by immuno-magnetic separation using platelet endothelial cell adhesion molecule-1 (also known as CD31) expressed on both early and mature endothelial cells. CD31(+) cells exhibit endothelial-like behavior by being able to incorporate DiI-labeled acetylated low-density lipoprotein as well as form tubular structures on matrigel. Quantitative and semi-quantitative PCR analysis further demonstrated the increased expression of endothelial transcripts. To ascertain the specific transcriptomic identity of the CD31(+) cells, large-scale microarray analysis was carried out. Comparative bioinformatic analysis reveals an enrichment of the gene ontology categories angiogenesis, blood vessel morphogenesis, vasculogenesis and blood coagulation in the CD31(+) cell population. Based on the transcriptomic signatures of the CD31(+) cells, we conclude that this ES cell-derived population contains endothelial-like cells expressing a mesodermal marker BMP2 and possesses an angiogenic potential. The transcriptomic characterization of CD31(+) cells enables an in vitro functional genomic model to identify genes required for angiogenesis.

  15. Elucidation of the effect of brain cortex tetrapeptide Cortagen on gene expression in mouse heart by microarray.

    PubMed

    Anisimov, Sergey V; Khavinson, Vladimir Kh; Anisimov, Vladimir N

    2004-01-01

    Aging is associated with significant alterations in gene expression in numerous organs and tissues. Anti-aging therapy with peptide bioregulators holds much promise for the correction of age-associated changes, making a screening for their molecular targets in tissues an important question of modern gerontology. The synthetic tetrapeptide Cortagen (Ala-Glu-Asp-Pro) was obtained by directed synthesis based on amino acid analysis of natural brain cortex peptide preparation Cortexin. In humans, Cortagen demonstrated a pronounced therapeutic effect upon the structural and functional posttraumatic recovery of peripheral nerve tissue. Importantly, other effects were also observed in cardiovascular and cerebrovascular parameters. Based on these latter observations, we hypothesized that an acute course of Cortagen treatment, large-scale transcriptome analysis, and identification of transcripts with altered expression in heart would facilitate our understanding of the mechanisms responsible for this peptide's biological effects. The expression of 15,247 transcripts in the heart of female 6-month-old CBA mice receiving injections of Cortagen for 5 consecutive days was therefore studied by cDNA microarrays. Comparative analysis of cDNA microarray hybridisation with heart samples from control and experimental group revealed 234 clones (1.53% of the total number of clones) with significant changes of expression that matched 110 known genes belonging to various functional categories. Maximum up- and down-regulation was +5.42 and -2.86, respectively. Intercomparison of changes in cardiac expression profile induced by synthetic peptides (Cortagen, Vilon, Epitalon) and pineal peptide hormone melatonin revealed both common and specific effects of Cortagen upon gene expression in heart.

  16. Novel numerical techniques for magma dynamics

    NASA Astrophysics Data System (ADS)

    Rhebergen, S.; Katz, R. F.; Wathen, A.; Alisic, L.; Rudge, J. F.; Wells, G.

    2013-12-01

    We discuss the development of finite element techniques and solvers for magma dynamics computations. These are implemented within the FEniCS framework. This approach allows for user-friendly, expressive, high-level code development, but also provides access to powerful, scalable numerical solvers and a large family of finite element discretisations. With the recent addition of dolfin-adjoint, FEniCS supports automated adjoint and tangent-linear models, enabling the rapid development of Generalised Stability Analysis. The ability to easily scale codes to three dimensions with large meshes, and/or to apply intricate adjoint calculations means that efficiency of the numerical algorithms is vital. We therefore describe our development and analysis of preconditioners designed specifically for finite element discretizations of equations governing magma dynamics. The preconditioners are based on Elman-Silvester-Wathen methods for the Stokes equation, and we extend these to flows with compaction. Our simulations are validated by comparison of results with laboratory experiments on partially molten aggregates.

  17. Improved ethanol production from cheese whey, whey powder, and sugar beet molasses by "Vitreoscilla hemoglobin expressing" Escherichia coli.

    PubMed

    Akbas, Meltem Yesilcimen; Sar, Taner; Ozcelik, Busra

    2014-01-01

    This work investigated the improvement of ethanol production by engineered ethanologenic Escherichia coli to express the hemoglobin from the bacterium Vitreoscilla (VHb). Ethanologenic E. coli strain FBR5 and FBR5 transformed with the VHb gene in two constructs (strains TS3 and TS4) were grown in cheese whey (CW) medium at small and large scales, at both high and low aeration, or with whey powder (WP) or sugar beet molasses hydrolysate (SBMH) media at large scale and low aeration. Culture pH, cell growth, VHb levels, and ethanol production were evaluated after 48 h. VHb expression in TS3 and TS4 enhanced their ethanol production in CW (21-419%), in WP (17-362%), or in SBMH (48-118%) media. This work extends the findings that "VHb technology" may be useful for improving the production of ethanol from waste and byproducts of various sources.

  18. Removing Batch Effects from Longitudinal Gene Expression - Quantile Normalization Plus ComBat as Best Approach for Microarray Transcriptome Data

    PubMed Central

    Müller, Christian; Schillert, Arne; Röthemeier, Caroline; Trégouët, David-Alexandre; Proust, Carole; Binder, Harald; Pfeiffer, Norbert; Beutel, Manfred; Lackner, Karl J.; Schnabel, Renate B.; Tiret, Laurence; Wild, Philipp S.; Blankenberg, Stefan

    2016-01-01

    Technical variation plays an important role in microarray-based gene expression studies, and batch effects explain a large proportion of this noise. It is therefore mandatory to eliminate technical variation while maintaining biological variability. Several strategies have been proposed for the removal of batch effects, although they have not been evaluated in large-scale longitudinal gene expression data. In this study, we aimed at identifying a suitable method for batch effect removal in a large study of microarray-based longitudinal gene expression. Monocytic gene expression was measured in 1092 participants of the Gutenberg Health Study at baseline and 5-year follow up. Replicates of selected samples were measured at both time points to identify technical variability. Deming regression, Passing-Bablok regression, linear mixed models, non-linear models as well as ReplicateRUV and ComBat were applied to eliminate batch effects between replicates. In a second step, quantile normalization prior to batch effect correction was performed for each method. Technical variation between batches was evaluated by principal component analysis. Associations between body mass index and transcriptomes were calculated before and after batch removal. Results from association analyses were compared to evaluate maintenance of biological variability. Quantile normalization, separately performed in each batch, combined with ComBat successfully reduced batch effects and maintained biological variability. ReplicateRUV performed perfectly in the replicate data subset of the study, but failed when applied to all samples. All other methods did not substantially reduce batch effects in the replicate data subset. Quantile normalization plus ComBat appears to be a valuable approach for batch correction in longitudinal gene expression data. PMID:27272489
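
    The per-batch quantile normalization step mentioned above can be sketched as follows. This is a minimal illustration, assuming a genes x samples matrix and one batch label per sample; the function names are hypothetical, ties are not averaged, and the ComBat step that the study applies afterwards is not shown.

```python
import numpy as np

def quantile_normalize(x):
    """Give every sample (column) of a genes x samples matrix the same distribution."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, axis=0)                    # sort order within each sample
    sorted_x = np.take_along_axis(x, order, axis=0)  # each column sorted ascending
    reference = sorted_x.mean(axis=1)                # mean value at each rank
    ranks = np.argsort(order, axis=0)                # rank of each gene in its sample
    return reference[ranks]                          # replace values by rank means

def normalize_per_batch(x, batches):
    """Quantile-normalize the samples of each batch separately."""
    x = np.asarray(x, dtype=float)
    batches = np.asarray(batches)
    out = np.empty_like(x)
    for b in np.unique(batches):
        cols = batches == b                          # samples belonging to batch b
        out[:, cols] = quantile_normalize(x[:, cols])
    return out
```

    After this step, all samples within a batch share one empirical distribution while each gene keeps its rank within its sample; differences between batches remain and would be left to the subsequent batch-correction method.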

  19. Static Analysis of Large-Scale Multibody System Using Joint Coordinates and Spatial Algebra Operator

    PubMed Central

    Omar, Mohamed A.

    2014-01-01

    Initial transient oscillations inhibited in the dynamic simulations responses of multibody systems can lead to inaccurate results, unrealistic load prediction, or simulation failure. These transients could result from incompatible initial conditions, initial constraints violation, and inadequate kinematic assembly. Performing static equilibrium analysis before the dynamic simulation can eliminate these transients and lead to stable simulation. Most existing multibody formulations determine the static equilibrium position by minimizing the system potential energy. This paper presents a new general purpose approach for solving the static equilibrium in large-scale articulated multibody. The proposed approach introduces an energy drainage mechanism based on Baumgarte constraint stabilization approach to determine the static equilibrium position. The spatial algebra operator is used to express the kinematic and dynamic equations of the closed-loop multibody system. The proposed multibody system formulation utilizes the joint coordinates and modal elastic coordinates as the system generalized coordinates. The recursive nonlinear equations of motion are formulated using the Cartesian coordinates and the joint coordinates to form an augmented set of differential algebraic equations. Then the system connectivity matrix is derived from the system topological relations and used to project the Cartesian quantities into the joint subspace leading to a minimum set of differential equations. PMID:25045732
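
    For orientation, the textbook form of Baumgarte constraint stabilization is given below. The abstract does not state the equations used in the paper, so the symbols here (constraint vector \Phi, generalized coordinates q, gains \alpha and \beta) are assumptions taken from the standard formulation, not from the source.

```latex
% Holonomic constraints on the generalized coordinates q:
\Phi(q, t) = 0
% Baumgarte stabilization replaces the acceleration-level constraint
% \ddot{\Phi} = 0 with a damped second-order error equation:
\ddot{\Phi} + 2\alpha\,\dot{\Phi} + \beta^{2}\,\Phi = 0, \qquad \alpha > 0,\ \beta > 0
```

    With positive gains, any constraint violation decays over time instead of drifting, which is the damping behaviour that an energy drainage mechanism of the kind described in the abstract would presumably exploit.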

  20. The peripheral blood proteome signature of idiopathic pulmonary fibrosis is distinct from normal and is associated with novel immunological processes.

    PubMed

    O'Dwyer, David N; Norman, Katy C; Xia, Meng; Huang, Yong; Gurczynski, Stephen J; Ashley, Shanna L; White, Eric S; Flaherty, Kevin R; Martinez, Fernando J; Murray, Susan; Noth, Imre; Arnold, Kelly B; Moore, Bethany B

    2017-04-25

    Idiopathic pulmonary fibrosis (IPF) is a progressive and fatal interstitial pneumonia. The disease pathophysiology is poorly understood and the etiology remains unclear. Recent advances have generated new therapies and improved knowledge of the natural history of IPF. These gains have been brokered by advances in technology and improved insight into the role of various genes in mediating disease, but gene expression and protein levels do not always correlate. Thus, in this paper we apply a novel large scale high throughput aptamer approach to identify more than 1100 proteins in the peripheral blood of well-characterized IPF patients and normal volunteers. We use systems biology approaches to identify a unique IPF proteome signature and give insight into biological processes driving IPF. We found IPF plasma to be altered and enriched for proteins involved in defense response, wound healing and protein phosphorylation when compared to normal human plasma. Analysis also revealed a minimal protein signature that differentiated IPF patients from normal controls, which may allow for accurate diagnosis of IPF based on easily-accessible peripheral blood. This report introduces large scale unbiased protein discovery analysis to IPF and describes distinct biological processes that further inform disease biology.

  1. Static analysis of large-scale multibody system using joint coordinates and spatial algebra operator.

    PubMed

    Omar, Mohamed A

    2014-01-01

    Initial transient oscillations inhibited in the dynamic simulations responses of multibody systems can lead to inaccurate results, unrealistic load prediction, or simulation failure. These transients could result from incompatible initial conditions, initial constraints violation, and inadequate kinematic assembly. Performing static equilibrium analysis before the dynamic simulation can eliminate these transients and lead to stable simulation. Most existing multibody formulations determine the static equilibrium position by minimizing the system potential energy. This paper presents a new general purpose approach for solving the static equilibrium in large-scale articulated multibody. The proposed approach introduces an energy drainage mechanism based on Baumgarte constraint stabilization approach to determine the static equilibrium position. The spatial algebra operator is used to express the kinematic and dynamic equations of the closed-loop multibody system. The proposed multibody system formulation utilizes the joint coordinates and modal elastic coordinates as the system generalized coordinates. The recursive nonlinear equations of motion are formulated using the Cartesian coordinates and the joint coordinates to form an augmented set of differential algebraic equations. Then the system connectivity matrix is derived from the system topological relations and used to project the Cartesian quantities into the joint subspace leading to a minimum set of differential equations.

  2. Genomic analysis of regulatory network dynamics reveals large topological changes

    NASA Astrophysics Data System (ADS)

    Luscombe, Nicholas M.; Madan Babu, M.; Yu, Haiyuan; Snyder, Michael; Teichmann, Sarah A.; Gerstein, Mark

    2004-09-01

    Network analysis has been applied widely, providing a unifying language to describe disparate systems ranging from social interactions to power grids. It has recently been used in molecular biology, but so far the resulting networks have only been analysed statically. Here we present the dynamics of a biological network on a genomic scale, by integrating transcriptional regulatory information and gene-expression data for multiple conditions in Saccharomyces cerevisiae. We develop an approach for the statistical analysis of network dynamics, called SANDY, combining well-known global topological measures, local motifs and newly derived statistics. We uncover large changes in underlying network architecture that are unexpected given current viewpoints and random simulations. In response to diverse stimuli, transcription factors alter their interactions to varying degrees, thereby rewiring the network. A few transcription factors serve as permanent hubs, but most act transiently only during certain conditions. By studying sub-network structures, we show that environmental responses facilitate fast signal propagation (for example, with short regulatory cascades), whereas the cell cycle and sporulation direct temporal progression through multiple stages (for example, with highly inter-connected transcription factors). Indeed, to drive the latter processes forward, phase-specific transcription factors inter-regulate serially, and ubiquitously active transcription factors layer above them in a two-tiered hierarchy. We anticipate that many of the concepts presented here, particularly the large-scale topological changes and hub transience, will apply to other biological networks, including complex sub-systems in higher eukaryotes.

  3. Protein-Fragment Complementation Assays for Large-Scale Analysis, Functional Dissection, and Spatiotemporal Dynamic Studies of Protein-Protein Interactions in Living Cells.

    PubMed

    Michnick, Stephen W; Landry, Christian R; Levy, Emmanuel D; Diss, Guillaume; Ear, Po Hien; Kowarzyk, Jacqueline; Malleshaiah, Mohan K; Messier, Vincent; Tchekanda, Emmanuelle

    2016-11-01

    Protein-fragment complementation assays (PCAs) comprise a family of assays that can be used to study protein-protein interactions (PPIs), conformation changes, and protein complex dimensions. We developed PCAs to provide simple and direct methods for the study of PPIs in any living cell, subcellular compartments or membranes, multicellular organisms, or in vitro. Because they are complete assays, requiring no cell-specific components other than reporter fragments, they can be applied in any context. PCAs provide a general strategy for the detection of proteins expressed at endogenous levels within appropriate subcellular compartments and with normal posttranslational modifications, in virtually any cell type or organism under any conditions. Here we introduce a number of applications of PCAs in budding yeast, Saccharomyces cerevisiae. These applications represent the full range of PPI characteristics that might be studied, from simple detection on a large scale to visualization of spatiotemporal dynamics. © 2016 Cold Spring Harbor Laboratory Press.

  4. Expression profiles of urbilaterian genes uniquely shared between honey bee and vertebrates

    PubMed Central

    Matsui, Toshiaki; Yamamoto, Toshiyuki; Wyder, Stefan; Zdobnov, Evgeny M; Kadowaki, Tatsuhiko

    2009-01-01

    Background: Large-scale comparison of metazoan genomes has revealed that a significant fraction of genes of the last common ancestor of Bilateria (Urbilateria) is lost in each animal lineage. This event could be one of the underlying mechanisms involved in generating metazoan diversity. However, the present functions of these ancient genes have not been addressed extensively. To understand the functions and evolutionary mechanisms of such ancient Urbilaterian genes, we carried out comprehensive expression profile analysis of genes shared between vertebrates and honey bees but not with the other sequenced ecdysozoan genomes (honey bee-vertebrate specific, HVS genes) as a model. Results: We identified 30 honey bee and 55 mouse HVS genes. Many HVS genes exhibited tissue-selective expression patterns; intriguingly, the expression of 60% of honey bee HVS genes was found to be brain enriched, and 24% of mouse HVS genes were highly expressed in either or both the brain and testis. Moreover, a minimum of 38% of mouse HVS genes demonstrated neuron-enriched expression patterns, and 62% of them exhibited expression in selective brain areas, particularly the forebrain and cerebellum. Furthermore, gene ontology (GO) analysis of HVS genes predicted that 35% of genes are associated with DNA transcription and RNA processing. Conclusion: These results suggest that HVS genes include genes that are biased towards expression in the brain and gonads. They also demonstrate that at least some of Urbilaterian genes retained in the specific animal lineage may be selectively maintained to support the species-specific phenotypes. PMID:19138430

  5. Expression profiles of urbilaterian genes uniquely shared between honey bee and vertebrates.

    PubMed

    Matsui, Toshiaki; Yamamoto, Toshiyuki; Wyder, Stefan; Zdobnov, Evgeny M; Kadowaki, Tatsuhiko

    2009-01-12

    Large-scale comparison of metazoan genomes has revealed that a significant fraction of genes of the last common ancestor of Bilateria (Urbilateria) is lost in each animal lineage. This event could be one of the underlying mechanisms involved in generating metazoan diversity. However, the present functions of these ancient genes have not been addressed extensively. To understand the functions and evolutionary mechanisms of such ancient Urbilaterian genes, we carried out comprehensive expression profile analysis of genes shared between vertebrates and honey bees but not with the other sequenced ecdysozoan genomes (honey bee-vertebrate specific, HVS genes) as a model. We identified 30 honey bee and 55 mouse HVS genes. Many HVS genes exhibited tissue-selective expression patterns; intriguingly, the expression of 60% of honey bee HVS genes was found to be brain enriched, and 24% of mouse HVS genes were highly expressed in either or both the brain and testis. Moreover, a minimum of 38% of mouse HVS genes demonstrated neuron-enriched expression patterns, and 62% of them exhibited expression in selective brain areas, particularly the forebrain and cerebellum. Furthermore, gene ontology (GO) analysis of HVS genes predicted that 35% of genes are associated with DNA transcription and RNA processing. These results suggest that HVS genes include genes that are biased towards expression in the brain and gonads. They also demonstrate that at least some of Urbilaterian genes retained in the specific animal lineage may be selectively maintained to support the species-specific phenotypes.

  6. Single cell Hi-C reveals cell-to-cell variability in chromosome structure

    PubMed Central

    Schoenfelder, Stefan; Yaffe, Eitan; Dean, Wendy; Laue, Ernest D.; Tanay, Amos; Fraser, Peter

    2013-01-01

    Large-scale chromosome structure and spatial nuclear arrangement have been linked to control of gene expression and DNA replication and repair. Genomic techniques based on chromosome conformation capture assess contacts for millions of loci simultaneously, but do so by averaging chromosome conformations from millions of nuclei. Here we introduce single cell Hi-C, combined with genome-wide statistical analysis and structural modeling of single copy X chromosomes, to show that individual chromosomes maintain domain organisation at the megabase scale, but show variable cell-to-cell chromosome territory structures at larger scales. Despite this structural stochasticity, localisation of active gene domains to boundaries of territories is a hallmark of chromosomal conformation. Single cell Hi-C data bridge current gaps between genomics and microscopy studies of chromosomes, demonstrating how modular organisation underlies dynamic chromosome structure, and how this structure is probabilistically linked with genome activity patterns. PMID:24067610

  7. From crater functions to partial differential equations: a new approach to ion bombardment induced nonequilibrium pattern formation.

    PubMed

    Norris, Scott A; Brenner, Michael P; Aziz, Michael J

    2009-06-03

    We develop a methodology for deriving continuum partial differential equations for the evolution of large-scale surface morphology directly from molecular dynamics simulations of the craters formed from individual ion impacts. Our formalism relies on the separation between the length scale of ion impact and the characteristic scale of pattern formation, and expresses the surface evolution in terms of the moments of the crater function. We demonstrate that the formalism reproduces the classical Bradley-Harper results, as well as ballistic atomic drift, under the appropriate simplifying assumptions. Given an actual set of converged molecular dynamics moments and their derivatives with respect to the incidence angle, our approach can be applied directly to predict the presence and absence of surface morphological instabilities. This analysis represents the first work systematically connecting molecular dynamics simulations of ion bombardment to partial differential equations that govern topographic pattern-forming instabilities.

  8. Comparative performance of different scale-down simulators of substrate gradients in Penicillium chrysogenum cultures: the need of a biological systems response analysis.

    PubMed

    Wang, Guan; Zhao, Junfei; Haringa, Cees; Tang, Wenjun; Xia, Jianye; Chu, Ju; Zhuang, Yingping; Zhang, Siliang; Deshmukh, Amit T; van Gulik, Walter; Heijnen, Joseph J; Noorman, Henk J

    2018-05-01

    In a 54 m³ large-scale penicillin fermentor, the cells experience substrate gradient cycles at the timescale of the global mixing time, about 20-40 s. Here, we used an intermittent feeding regime (IFR) and a two-compartment reactor (TCR) to mimic these substrate gradients at laboratory-scale continuous cultures. The IFR was applied to simulate substrate dynamics experienced by the cells at full scale at timescales of tens of seconds to minutes (30 s, 3 min and 6 min), while the TCR was designed to simulate substrate gradients at an applied mean residence time (τc) of 6 min. A biological systems analysis of the response of an industrial high-yielding P. chrysogenum strain has been performed in these continuous cultures. Compared to an undisturbed continuous feeding regime in a single reactor, the penicillin productivity (qPenG) was reduced in all scale-down simulators. The dynamic metabolomics data indicated that in the IFRs, the cells accumulated high levels of the central metabolites during the feast phase to actively cope with external substrate deprivation during the famine phase. In contrast, in the TCR system, the storage pool (e.g. mannitol and arabitol) constituted a large contribution of carbon supply in the non-feed compartment. Further, transcript analysis revealed that all scale-down simulators gave different expression levels of the glucose/hexose transporter genes and the penicillin gene clusters. The results showed that qPenG did not correlate well with exposure to the substrate regimes (excess, limitation and starvation), but there was a clear inverse relation between qPenG and the intracellular glucose level. © 2018 The Authors. Microbial Biotechnology published by John Wiley & Sons Ltd and Society for Applied Microbiology.

  9. Identification of reference genes for quantitative expression analysis using large-scale RNA-seq data of Arabidopsis thaliana and model crop plants.

    PubMed

    Kudo, Toru; Sasaki, Yohei; Terashima, Shin; Matsuda-Imai, Noriko; Takano, Tomoyuki; Saito, Misa; Kanno, Maasa; Ozaki, Soichi; Suwabe, Keita; Suzuki, Go; Watanabe, Masao; Matsuoka, Makoto; Takayama, Seiji; Yano, Kentaro

    2016-10-13

    In quantitative gene expression analysis, normalization using a reference gene as an internal control is frequently performed for appropriate interpretation of the results. Efforts have been devoted to exploring superior novel reference genes using microarray transcriptomic data and to evaluating commonly used reference genes by targeting analysis. However, because the number of specifically detectable genes is totally dependent on probe design in the microarray analysis, exploration using microarray data may miss some of the best choices for the reference genes. Recently emerging RNA sequencing (RNA-seq) provides an ideal resource for comprehensive exploration of reference genes since this method is capable of detecting all expressed genes, in principle including even unknown genes. We report the results of a comprehensive exploration of reference genes using public RNA-seq data from plants such as Arabidopsis thaliana (Arabidopsis), Glycine max (soybean), Solanum lycopersicum (tomato) and Oryza sativa (rice). To select reference genes suitable for the broadest experimental conditions possible, candidates were surveyed by the following four steps: (1) evaluation of the basal expression level of each gene in each experiment; (2) evaluation of the expression stability of each gene in each experiment; (3) evaluation of the expression stability of each gene across the experiments; and (4) selection of top-ranked genes, after ranking according to the number of experiments in which the gene was expressed stably. Employing this procedure, 13, 10, 12 and 21 top candidates for reference genes were proposed in Arabidopsis, soybean, tomato and rice, respectively. Microarray expression data confirmed that the expression of the proposed reference genes under broad experimental conditions was more stable than that of commonly used reference genes. 
    These novel reference genes will be useful for analyzing gene expression profiles across experiments carried out under various experimental conditions.
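
    The four-step survey described in this abstract can be sketched in code. The following is a minimal illustration only, not the authors' pipeline: the thresholds, the use of the coefficient of variation as the stability measure, and the function names are all assumptions, since the abstract does not state the cut-offs or statistics actually used.

```python
from statistics import mean, pstdev

# Hypothetical thresholds; the abstract does not give the real cut-offs.
MIN_MEAN_EXPRESSION = 10.0  # step 1: basal expression floor
MAX_CV = 0.2                # step 2: within-experiment stability


def is_stable(values, min_mean=MIN_MEAN_EXPRESSION, max_cv=MAX_CV):
    """Steps 1-2: expressed above a floor, with a low coefficient of variation."""
    m = mean(values)
    if m < min_mean:
        return False
    return pstdev(values) / m <= max_cv


def rank_reference_candidates(experiments, max_cross_cv=0.3, top_n=3):
    """Steps 3-4 for `experiments`: {experiment: {gene: [expression per sample]}}."""
    stable_counts = {}
    experiment_means = {}
    for genes in experiments.values():
        for gene, values in genes.items():
            stable_counts[gene] = stable_counts.get(gene, 0) + int(is_stable(values))
            experiment_means.setdefault(gene, []).append(mean(values))

    candidates = []
    for gene, means in experiment_means.items():
        if stable_counts[gene] == 0:
            continue  # never stably expressed in any experiment
        # Step 3: stability of the per-experiment means across experiments.
        if pstdev(means) / mean(means) <= max_cross_cv:
            candidates.append((gene, stable_counts[gene]))

    # Step 4: rank by the number of experiments with stable expression.
    candidates.sort(key=lambda item: (-item[1], item[0]))
    return candidates[:top_n]
```

    Counting stable experiments per gene, rather than pooling all samples, keeps a single noisy experiment from masking a gene that is stable everywhere else, which matches the ranking criterion named in step (4).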

  10. [The French adaptation of the STAXI-2, C.D. Spielberger's State-trait anger expression inventory].

    PubMed

    Borteyrou, X; Bruchon-Schweitzer, M; Spielberger, C D

    2008-06-01

    The assessment of anger has received increasing attention because of growing evidence that anger and hostility are related to heart disease. Research on anger assessment has also been stimulated by the development of psychometric measures for evaluating different aspects of anger. First, we review the major self-report scales used to assess anger and hostility. The scales appeared to have been constructed without explicit definition of anger and there is little differentiation between the experience and expression of anger. The factor-derived STAXI-2 is a 57-item measure of the expression of anger, and is comprised of the state-trait anger scale [Spielberger CD, Jacobs G, Russell JS, Crane RS. Assessment of anger: the state-trait anger scale. In: Butcher JN, Spielberger CD, editors. Advances in personality assessment, 2. Hillsdale, NJ: Erlbaum; 1983] and the anger expression scale (AX; Spielberger et al., 1985). The state anger scale (SAS) includes three subscales: feeling angry, feeling like expressing anger verbally, and feeling like expressing anger physically. The trait anger scale (TAS) consists of two subscales: angry temperament and angry reaction. The AX deals with the direction of both anger expression and anger control, resulting in four revised AX subscales: anger expression/out (verbal and physical, aggressive behavior directed toward other persons or objects), and anger expression/in (anger suppression), anger control/out (attempts to monitor and prevent the outward expression of anger) and anger control/in (active attempts to calm down and reduce angry feelings). The aim of this work was to examine the factor structure and the psychometric properties of the French adaptation of STAXI-2. A sample of 1085 French subjects, 546 female and 539 male, between 18 and 70 years old participated in the study.
    The 57 items of the three original subscales (SAS, TAS, and AX scale) were analyzed separately by sex and by subscale, using exploratory factor analyses (principal axis analysis, followed by promax rotations). For the first part of the questionnaire (SAS), factor analysis suggested the presence of three factors with eigenvalues >1.0, but the factor structure obtained for males and females differed and was difficult to interpret. Moreover, the explained variance of Factors 2 and 3 was low. Velicer's MAP criterion and the scree test established that a one-factor solution was more relevant. Confirmatory factor analysis suggested that the three-factor solution was acceptable, but the unifactorial solution fit the data better. For the second part of the questionnaire (TAS), factor analysis was conducted following the same procedure, and two factors were extracted. The explained variance of Factor 2 was very low. Velicer's MAP criterion and the scree test suggested that the one-factor solution was more relevant. Moreover, the fit indices of the original two-factor structure were not satisfactory. Finally, the analyses of the 32 items of anger expression and control yielded four factors with eigenvalues >1.0. All items loaded higher than 0.38 on the corresponding factor and lower than 0.30 on the other factors. The factor structure of the AX scale was fairly robust, both for males and females. Internal consistency and test-retest reliability of the subscales were acceptable except for the SAS. The correlations of the six subscales with four criterion variables (Buss Durkee hostility inventory, Cook and Medley Ho scale, NEO PI-R Ho scale and Courtauld emotions control scale) were in the expected direction, establishing their convergent validity. In summary, the analysis reported in this study checked the factor structure of the STAXI-2 translated into French.
    The state anger dimension was also essentially confirmed, but no distinction was found between the three components: feeling angry, feeling like expressing anger verbally, and feeling like expressing anger physically. Moreover, the distinction between angry temperament and angry reaction was not confirmed because of gender differences, but we established a robust and valid trait anger factor. Finally, we confirmed the factor structure of the original anger expression scale without gender differences. Some practical and theoretical perspectives for the use of the French adaptation of the STAXI-2 are suggested.

  11. Course 10: Three Lectures on Biological Networks

    NASA Astrophysics Data System (ADS)

    Magnasco, M. O.

    1 Enzymatic networks. Proofreading knots: How DNA topoisomerases disentangle DNA 1.1 Length scales and energy scales 1.2 DNA topology 1.3 Topoisomerases 1.4 Knots and supercoils 1.5 Topological equilibrium 1.6 Can topoisomerases recognize topology? 1.7 Proposal: Kinetic proofreading 1.8 How to do it twice 1.9 The care and proofreading of knots 1.10 Suppression of supercoils 1.11 Problems and outlook 1.12 Disquisition 2 Gene expression networks. Methods for analysis of DNA chip experiments 2.1 The regulation of gene expression 2.2 Gene expression arrays 2.3 Analysis of array data 2.4 Some simplifying assumptions 2.5 Probeset analysis 2.6 Discussion 3 Neural and gene expression networks: Song-induced gene expression in the canary brain 3.1 The study of songbirds 3.2 Canary song 3.3 ZENK 3.4 The blush 3.5 Histological analysis 3.6 Natural vs. artificial 3.7 The Blush II: gAP 3.8 Meditation

  12. Differentiating unipolar and bipolar depression by alterations in large-scale brain networks.

    PubMed

    Goya-Maldonado, Roberto; Brodmann, Katja; Keil, Maria; Trost, Sarah; Dechent, Peter; Gruber, Oliver

    2016-02-01

    Misdiagnosing bipolar depression can lead to very deleterious consequences of mistreatment. Although depressive symptoms may be similarly expressed in unipolar and bipolar disorder, changes in specific brain networks could be very distinct, being therefore informative markers for the differential diagnosis. We aimed to characterize specific alterations in candidate large-scale networks (frontoparietal, cingulo-opercular, and default mode) in symptomatic unipolar and bipolar patients using resting state fMRI, a cognitively undemanding paradigm ideal to investigate patients. Networks were selected after independent component analysis, compared across 40 patients acutely depressed (20 unipolar, 20 bipolar), and 20 controls well-matched for age, gender, and education levels, and alterations were correlated to clinical parameters. Despite comparable symptoms, patient groups were robustly differentiated by large-scale network alterations. Differences were driven in bipolar patients by increased functional connectivity in the frontoparietal network, a central executive and externally-oriented network. Conversely, unipolar patients presented increased functional connectivity in the default mode network, an introspective and self-referential network, as well as reduced connectivity of the cingulo-opercular network to default mode regions, a network involved in detecting the need to switch between internally and externally oriented demands. These findings were mostly unaffected by current medication, comorbidity, and structural changes. Moreover, network alterations in unipolar patients were significantly correlated to the number of depressive episodes. Unipolar and bipolar groups displaying similar symptomatology could be clearly distinguished by characteristic changes in large-scale networks, encouraging further investigation of network fingerprints for clinical use. Hum Brain Mapp 37:808-818, 2016. © 2015 Wiley Periodicals, Inc.

  13. Expression Atlas: gene and protein expression across multiple studies and organisms

    PubMed Central

    Tang, Y Amy; Bazant, Wojciech; Burke, Melissa; Fuentes, Alfonso Muñoz-Pomer; George, Nancy; Koskinen, Satu; Mohammed, Suhaib; Geniza, Matthew; Preece, Justin; Jarnuczak, Andrew F; Huber, Wolfgang; Stegle, Oliver; Brazma, Alvis; Petryszak, Robert

    2018-01-01

    Expression Atlas (http://www.ebi.ac.uk/gxa) is an added value database that provides information about gene and protein expression in different species and contexts, such as tissue, developmental stage, disease or cell type. The available public and controlled access data sets from different sources are curated and re-analysed using standardized, open source pipelines and made available for queries, download and visualization. As of August 2017, Expression Atlas holds data from 3,126 studies across 33 different species, including 731 from plants. Data from large-scale RNA sequencing studies including Blueprint, PCAWG, ENCODE, GTEx and HipSci can be visualized next to each other. In Expression Atlas, users can query genes or gene-sets of interest and explore their expression across or within species, tissues, developmental stages in a constitutive or differential context, representing the effects of diseases, conditions or experimental interventions. All processed data matrices are available for direct download in tab-delimited format or as R-data. In addition to the web interface, data sets can now be searched and downloaded through the Expression Atlas R package. Novel features and visualizations include the on-the-fly analysis of gene set overlaps and the option to view gene co-expression in experiments investigating constitutive gene expression across tissues or other conditions. PMID:29165655

  14. Highly efficient mesophyll protoplast isolation and PEG-mediated transient gene expression for rapid and large-scale gene characterization in cassava (Manihot esculenta Crantz).

    PubMed

    Wu, Jun-Zheng; Liu, Qin; Geng, Xiao-Shan; Li, Kai-Mian; Luo, Li-Juan; Liu, Jin-Ping

    2017-03-14

    Cassava (Manihot esculenta Crantz) is a major crop extensively cultivated in the tropics as both an important source of calories and a promising source for biofuel production. Although stable gene expression has been used for transgenic breeding and gene function study, a quick, easy and large-scale transformation platform has been urgently needed for gene functional characterization, especially after the cassava full genome was sequenced. Fully expanded leaves from in vitro plantlets of Manihot esculenta were used to optimize the concentrations of cellulase R-10 and macerozyme R-10 for obtaining protoplasts with the highest yield and viability. Then, the optimum conditions (PEG4000 concentration and transfection time) were determined for cassava protoplast transient gene expression. In addition, the reliability of the established protocol was confirmed for subcellular protein localization. In this work we optimized the main influencing factors and developed an efficient mesophyll protoplast isolation and PEG-mediated transient gene expression system for cassava. The suitable enzyme digestion system was established with the combination of 1.6% cellulase R-10 and 0.8% macerozyme R-10 for 16 h of digestion in the dark at 25 °C, resulting in a high yield (4.4 × 10⁷ protoplasts/g FW) and vitality (92.6%) of mesophyll protoplasts. The maximum transfection efficiency (70.8%) was obtained with the incubation of the protoplasts/vector DNA mixture with 25% PEG4000 for 10 min. We validated the applicability of the system for studying the subcellular localization of MeSTP7 (an H+/monosaccharide cotransporter) with our transient expression protocol and a heterologous Arabidopsis transient gene expression system. We optimized the main influencing factors and developed an efficient mesophyll protoplast isolation and transient gene expression system for cassava, which will facilitate large-scale characterization of genes and pathways in cassava.

  15. RNA sequencing demonstrates large-scale temporal dysregulation of gene expression in stimulated macrophages derived from MHC-defined chicken haplotypes.

    PubMed

    Irizarry, Kristopher J L; Downs, Eileen; Bryden, Randall; Clark, Jory; Griggs, Lisa; Kopulos, Renee; Boettger, Cynthia M; Carr, Thomas J; Keeler, Calvin L; Collisson, Ellen; Drechsler, Yvonne

    2017-01-01

    Discovering genetic biomarkers associated with disease resistance and enhanced immunity is critical to developing advanced strategies for controlling viral and bacterial infections in different species. Macrophages, important cells of innate immunity, are directly involved in cellular interactions with pathogens, the release of cytokines activating other immune cells and antigen presentation to cells of the adaptive immune response. IFNγ is a potent activator of macrophages and increased production has been associated with disease resistance in several species. This study characterizes the molecular basis for dramatically different nitric oxide production and immune function between the B2 and the B19 haplotype chicken macrophages. A large-scale RNA sequencing approach was employed to sequence the RNA of purified macrophages from each haplotype group (B2 vs. B19) during differentiation and after stimulation. Our results demonstrate that a large number of genes exhibit divergent expression between B2 and B19 haplotype cells both prior to and after stimulation. These differences in gene expression appear to be regulated by complex epigenetic mechanisms that need further investigation.

  16. Efficient scheme for parametric fitting of data in arbitrary dimensions.

    PubMed

    Pang, Ning-Ning; Tzeng, Wen-Jer; Kao, Hisen-Ching

    2008-07-01

    We propose an efficient scheme for parametric fitting expressed in terms of the Legendre polynomials. For continuous systems, our scheme is exact and the derived explicit expression is very helpful for further analytical studies. For discrete systems, our scheme is almost as accurate as the method of singular value decomposition. Through a few numerical examples, we show that our algorithm costs much less CPU time and memory space than the method of singular value decomposition. Thus, our algorithm is very suitable for a large amount of data fitting. In addition, the proposed scheme can also be used to extract the global structure of fluctuating systems. We then derive the exact relation between the correlation function and the detrended variance function of fluctuating systems in arbitrary dimensions and give a general scaling analysis.
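
    The abstract does not reproduce the authors' explicit expressions, so the sketch below shows only the textbook idea that any Legendre-based fit rests on: because the Legendre polynomials P_n are orthogonal on [-1, 1], each coefficient of a continuous function f can be obtained independently as a_n = (2n + 1)/2 · ∫ f(x) P_n(x) dx, with no matrix decomposition. The function names and the midpoint quadrature are assumptions made for illustration, not the published scheme.

```python
def legendre_values(x, degree):
    """Return [P_0(x), ..., P_degree(x)] via Bonnet's recurrence."""
    values = [1.0, x]
    for n in range(1, degree):
        values.append(((2 * n + 1) * x * values[n] - n * values[n - 1]) / (n + 1))
    return values[: degree + 1]


def legendre_coefficients(f, degree, n_points=2000):
    """Approximate a_n = (2n+1)/2 * integral of f(x) * P_n(x) over [-1, 1].

    The integral is estimated with a simple midpoint rule (an illustrative
    choice; any quadrature on [-1, 1] would do).
    """
    h = 2.0 / n_points
    sums = [0.0] * (degree + 1)
    for i in range(n_points):
        x = -1.0 + (i + 0.5) * h
        fx = f(x)
        for n, p in enumerate(legendre_values(x, degree)):
            sums[n] += fx * p * h
    return [(2 * n + 1) / 2.0 * s for n, s in enumerate(sums)]


def evaluate_series(coeffs, x):
    """Evaluate sum_n a_n * P_n(x) for the fitted coefficients."""
    basis = legendre_values(x, len(coeffs) - 1)
    return sum(a * p for a, p in zip(coeffs, basis))
```

    For example, `legendre_coefficients(lambda x: x * x, 2)` recovers approximately (1/3, 0, 2/3), since x² = (1/3)P_0 + (2/3)P_2. Each coefficient costs one pass over the data, which is the kind of saving over singular value decomposition that the abstract refers to.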

  17. Expression and purification of ELP-intein-tagged target proteins in high cell density E. coli fermentation.

    PubMed

    Fong, Baley A; Wood, David W

    2010-10-19

    Elastin-like polypeptides (ELPs) are useful tools that can be used to non-chromatographically purify proteins. When paired with self-cleaving inteins, they can be used as economical self-cleaving purification tags. However, ELPs and ELP-tagged target proteins have been traditionally expressed using highly enriched media in shake flask cultures, which are generally not amenable to scale-up. In this work, we describe the high cell-density expression of self-cleaving ELP-tagged targets in a supplemented minimal medium at a 2.5 liter fermentation scale, with increased yields and purity compared to traditional shake flask cultures. This demonstration of ELP expression in supplemented minimal media is juxtaposed to previous expression of ELP tags in extract-based rich media. We also describe several sets of fed-batch conditions and their impact on ELP expression and growth medium cost. By using fed batch E. coli fermentation at high cell density, ELP-intein-tagged proteins can be expressed and purified at high yield with low cost. Further, media components and fermentation design can significantly affect the overall process cost, particularly at large scale. This work thus demonstrates an important advance in the scale-up of self-cleaving ELP tag-mediated processes.

  18. Expression and purification of ELP-intein-tagged target proteins in high cell density E. coli fermentation

    PubMed Central

    2010-01-01

    Background Elastin-like polypeptides (ELPs) are useful tools that can be used to non-chromatographically purify proteins. When paired with self-cleaving inteins, they can be used as economical self-cleaving purification tags. However, ELPs and ELP-tagged target proteins have been traditionally expressed using highly enriched media in shake flask cultures, which are generally not amenable to scale-up. Results In this work, we describe the high cell-density expression of self-cleaving ELP-tagged targets in a supplemented minimal medium at a 2.5 liter fermentation scale, with increased yields and purity compared to traditional shake flask cultures. This demonstration of ELP expression in supplemented minimal media is juxtaposed to previous expression of ELP tags in extract-based rich media. We also describe several sets of fed-batch conditions and their impact on ELP expression and growth medium cost. Conclusions By using fed batch E. coli fermentation at high cell density, ELP-intein-tagged proteins can be expressed and purified at high yield with low cost. Further, media components and fermentation design can significantly affect the overall process cost, particularly at large scale. This work thus demonstrates an important advance in the scale-up of self-cleaving ELP tag-mediated processes. PMID:20959011

  19. A SAGE based approach to human glomerular endothelium: defining the transcriptome, finding a novel molecule and highlighting endothelial diversity.

    PubMed

    Sengoelge, Guerkan; Winnicki, Wolfgang; Kupczok, Anne; von Haeseler, Arndt; Schuster, Michael; Pfaller, Walter; Jennings, Paul; Weltermann, Ansgar; Blake, Sophia; Sunder-Plassmann, Gere

    2014-08-27

    Large scale transcript analysis of human glomerular microvascular endothelial cells (HGMEC) has never been accomplished. We designed this study to define the transcriptome of HGMEC and facilitate a better characterization of these endothelial cells with unique features. Serial analysis of gene expression (SAGE) was used for its unbiased approach to quantitative acquisition of transcripts. We generated a HGMEC SAGE library consisting of 68,987 transcript tags. Then taking advantage of large public databases and advanced bioinformatics we compared the HGMEC SAGE library with a SAGE library of non-cultured ex vivo human glomeruli (44,334 tags) which contained endothelial cells. The 823 tags common to both which would have the potential to be expressed in vivo were subsequently checked against 822,008 tags from 16 non-glomerular endothelial SAGE libraries. This resulted in 268 transcript tags differentially overexpressed in HGMEC compared to non-glomerular endothelia. These tags were filtered using a set of criteria: never before shown in kidney or any type of endothelial cell, absent in all nephron regions except the glomerulus, more highly expressed than statistically expected in HGMEC. Neurogranin, a direct target of thyroid hormone action which had been thought to be brain specific and never shown in endothelial cells before, fulfilled these criteria. Its expression in glomerular endothelium in vitro and in vivo was then verified by real-time-PCR, sequencing and immunohistochemistry. Our results represent an extensive molecular characterization of HGMEC beyond a mere database, underline the endothelial heterogeneity, and propose neurogranin as a potential link in the kidney-thyroid axis.
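
    The library comparison described here (tags shared by the cultured and ex vivo libraries, then checked against pooled non-glomerular libraries) can be illustrated with set operations on tag counts. This is a hedged sketch only: the abstract does not specify the statistical test, so the fold-change cut-off `min_fold`, the tags-per-million normalisation, and all function names are assumptions.

```python
def tags_per_million(library):
    """Normalise raw tag counts to tags per million; empty libraries give {}."""
    total = sum(library.values())
    if total == 0:
        return {}
    return {tag: count * 1_000_000 / total for tag, count in library.items()}


def enriched_shared_tags(cultured, ex_vivo, reference_libraries, min_fold=2.0):
    """Return tags present in both libraries and over-represented in `cultured`.

    `cultured` and `ex_vivo` map tag sequence -> raw count;
    `reference_libraries` is a list of such mappings that are pooled first.
    `min_fold` is a hypothetical cut-off, not the criterion used in the study.
    """
    # Tags common to both libraries are the ones likely to be expressed in vivo.
    shared = set(cultured) & set(ex_vivo)

    # Pool the reference (non-glomerular) libraries into one count table.
    pooled = {}
    for library in reference_libraries:
        for tag, count in library.items():
            pooled[tag] = pooled.get(tag, 0) + count

    cultured_tpm = tags_per_million(cultured)
    pooled_tpm = tags_per_million(pooled)

    enriched = []
    for tag in sorted(shared):
        reference = pooled_tpm.get(tag, 0.0)
        # Keep tags absent from the references or exceeding the fold cut-off.
        if reference == 0.0 or cultured_tpm[tag] / reference >= min_fold:
            enriched.append(tag)
    return enriched
```

    Normalising before comparing matters because the libraries differ greatly in size (68,987 tags versus 822,008 pooled reference tags in the study), so raw counts are not directly comparable.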

  20. Large-scale production of functional human lysozyme from marker-free transgenic cloned cows.

    PubMed

    Lu, Dan; Liu, Shen; Ding, Fangrong; Wang, Haiping; Li, Jing; Li, Ling; Dai, Yunping; Li, Ning

    2016-03-10

    Human lysozyme is an important natural non-specific immune protein that is highly expressed in breast milk and participates in the immune response of infants against bacterial and viral infections. Considering the medicinal value and market demand for human lysozyme, an animal model for large-scale production of recombinant human lysozyme (rhLZ) is needed. In this study, we generated transgenic cloned cows with the marker-free vector pBAC-hLF-hLZ, which was shown to efficiently express rhLZ in cow milk. Seven transgenic cloned cows, identified by polymerase chain reaction, Southern blot, and western blot analyses, produced rhLZ in milk at concentrations of up to 3149.19 ± 24.80 mg/L. The purified rhLZ had a molecular weight and enzymatic activity similar to those of wild-type human lysozyme and possessed the same C-terminal and N-terminal amino acid sequences. Preliminary data on the milk yield and milk composition of a naturally lactating transgenic cloned cow (0906) were also examined. These results provide a solid foundation for the large-scale production of rhLZ in the future.

  1. A transcriptional dynamic network during Arabidopsis thaliana pollen development.

    PubMed

    Wang, Jigang; Qiu, Xiaojie; Li, Yuhua; Deng, Youping; Shi, Tieliu

    2011-01-01

    To understand transcriptional regulatory networks (TRNs), especially the coordinated dynamic regulation between transcription factors (TFs) and their corresponding target genes during development, computational approaches would represent significant advances in the genome-wide expression analysis. The major challenges for the experiments include monitoring the time-specific TFs' activities and identifying the dynamic regulatory relationships between TFs and their target genes, both of which are currently not yet available at the large scale. However, various methods have been proposed to computationally estimate those activities and regulations. During the past decade, significant progress has been made towards understanding pollen development at each developmental stage at the molecular level, yet the regulatory mechanisms that control the dynamic pollen development processes remain largely unknown. Here, we adopt Network Component Analysis (NCA) to identify TF activities over the time course, and infer their regulatory relationships based on the coexpression of TFs and their target genes during pollen development. We carried out meta-analysis by integrating several sets of gene expression data related to Arabidopsis thaliana pollen development (stages range from UNM, BCP, TCP, HP to 0.5 hr pollen tube and 4 hr pollen tube). We constructed a regulatory network, including 19 TFs, 101 target genes and 319 regulatory interactions. The computationally estimated TF activities were well correlated to their coordinated genes' expressions during the development process. We clustered the expression of their target genes in the context of regulatory influences, and inferred new regulatory relationships between those TFs and their target genes, such as transcription factor WRKY34, which was identified as specifically expressed in pollen and as regulating several new target genes.
    Our findings facilitate the interpretation of the expression patterns with greater biological relevance, since the clusters corresponding to the activity of a specific TF or a combination of TFs suggest the coordinated regulation of TFs to their target genes. Through integrating different resources, we constructed a dynamic regulatory network of Arabidopsis thaliana during pollen development with gene coexpression and NCA. The network illustrated the relationships between the TFs' activities and their target genes' expression, as well as the interactions between TFs, which provide new insight into the molecular mechanisms that control pollen development.

  2. [Transcriptome among Mexicans: a large scale methodology to analyze the genetic expression profile of simultaneous samples in muscle, adipose tissue and lymphocytes obtained from the same individual].

    PubMed

    Bastarrachea, Raúl A; López-Alvarenga, Juan Carlos; Kent, Jack W; Laviada-Molina, Hugo A; Cerda-Flores, Ricardo M; Calderón-Garcidueñas, Ana Laura; Torres-Salazar, Amada; Torres-Salazar, Amanda; Nava-González, Edna J; Solis-Pérez, Elizabeth; Gallegos-Cabrales, Esther C; Cole, Shelley A; Comuzzie, Anthony G

    2008-01-01

    We describe the methodology used to analyze multiple transcripts using microarray techniques in simultaneous biopsies of muscle, adipose tissue and lymphocytes obtained from the same individual as part of the standard protocol of the Genetics of Metabolic Diseases in Mexico: GEMM Family Study. We recruited 4 healthy male subjects with BMI 20-41, who signed an informed consent letter. Subjects participated in a clinical examination that included anthropometric and body composition measurements, muscle biopsies (vastus lateralis), subcutaneous fat biopsies and a blood draw. All samples provided sufficient amplified RNA for microarray analysis. Total RNA was extracted from the biopsy samples and amplified for analysis. Of the 48,687 transcript targets queried, 39.4% were detectable in at least one of the studied tissues. Leptin was not detectable in lymphocytes, weakly expressed in muscle, but overexpressed and highly correlated with BMI in subcutaneous fat. Another example was GLUT4, which was detectable only in muscle and not correlated with BMI. Expression level concordance was 0.7 (p < 0.001) for the three tissues studied. We demonstrated the feasibility of carrying out simultaneous analysis of gene expression in multiple tissues, concordance of genetic expression in different tissues, and obtained confidence that this method corroborates the expected biological relationships among LEP and GLUT4. The GEMM study will provide a broad and valuable overview on metabolic diseases, including obesity and type 2 diabetes.

  3. Transcriptome analysis of carbohydrate metabolism during bulblet formation and development in Lilium davidii var. unicolor.

    PubMed

    Li, XueYan; Wang, ChunXia; Cheng, JinYun; Zhang, Jing; da Silva, Jaime A Teixeira; Liu, XiaoYu; Duan, Xin; Li, TianLai; Sun, HongMei

    2014-12-19

    The formation and development of bulblets are crucial to the Lilium genus since these processes are closely related to carbohydrate metabolism, especially to starch and sucrose metabolism. However, little is known about the transcriptional regulation of both processes. To gain insight into carbohydrate-related genes involved in bulblet formation and development, we conducted comparative transcriptome profiling of Lilium davidii var. unicolor bulblets at 0 d, 15 d (bulblets emerged) and 35 d (bulblets formed a basic shape with three or four scales) after scale propagation. Analysis of the transcriptome revealed that a total of 52,901 unigenes with an average sequence size of 630 bp were generated. Based on Clusters of Orthologous Groups (COG) analysis, 8% of the sequences were attributed to carbohydrate transport and metabolism. The results of KEGG pathway enrichment analysis showed that starch and sucrose metabolism constituted the predominant pathway among the three library pairs. The starch content in mother scales and bulblets decreased and increased, respectively, with almost the same trend as sucrose content. Gene expression analysis of the key enzymes in starch and sucrose metabolism suggested that sucrose synthase (SuSy) and invertase (INV), mainly hydrolyzing sucrose, presented higher gene expression in mother scales and bulblets at stages of bulblet appearance and enlargement, while sucrose phosphate synthase (SPS) showed higher expression in bulblets at morphogenesis. The enzymes involved in the starch synthetic direction such as ADPG pyrophosphorylase (AGPase), soluble starch synthase (SSS), starch branching enzyme (SBE) and granule-bound starch synthase (GBSS) showed a decreasing trend in mother scales and higher gene expression in bulblets at bulblet appearance and enlargement stages while the enzyme in the cleavage direction, starch de-branching enzyme (SDBE), showed higher gene expression in mother scales than in bulblets. 
An extensive transcriptome analysis of three bulblet development stages contributes considerable novel information to our understanding of carbohydrate metabolism-related genes in Lilium at the transcriptional level, and demonstrates the fundamentality of carbohydrate metabolism in bulblet emergence and development at the molecular level. This could facilitate further investigation into the molecular mechanisms underlying these processes in lily and other related species.

  4. Imprint of non-linear effects on HI intensity mapping on large scales

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Umeh, Obinna

    Intensity mapping of the HI brightness temperature provides a unique way of tracing large-scale structures of the Universe up to the largest possible scales. This is achieved by using low angular resolution radio telescopes to detect the emission line from cosmic neutral hydrogen in the post-reionization Universe. We use general relativistic perturbation theory techniques to derive for the first time the full expression for the HI brightness temperature up to third order in perturbation theory without making any plane-parallel approximation. We use this result and the renormalization prescription for biased tracers to study the impact of nonlinear effects on the power spectrum of HI brightness temperature both in real and redshift space. We show how mode coupling at nonlinear order due to nonlinear bias parameters and redshift space distortion terms modulates the power spectrum on large scales. The large-scale modulation may be understood to be due to the effective bias parameter and effective shot noise.

  5. Imprint of non-linear effects on HI intensity mapping on large scales

    NASA Astrophysics Data System (ADS)

    Umeh, Obinna

    2017-06-01

    Intensity mapping of the HI brightness temperature provides a unique way of tracing large-scale structures of the Universe up to the largest possible scales. This is achieved by using low angular resolution radio telescopes to detect the emission line from cosmic neutral hydrogen in the post-reionization Universe. We use general relativistic perturbation theory techniques to derive for the first time the full expression for the HI brightness temperature up to third order in perturbation theory without making any plane-parallel approximation. We use this result and the renormalization prescription for biased tracers to study the impact of nonlinear effects on the power spectrum of HI brightness temperature both in real and redshift space. We show how mode coupling at nonlinear order due to nonlinear bias parameters and redshift space distortion terms modulates the power spectrum on large scales. The large-scale modulation may be understood to be due to the effective bias parameter and effective shot noise.

  6. Venus Monitoring Camera (VMC/VEx) 1 micron emissivity and Magellan microwave properties of crater-related radar-dark parabolas and other terrains

    NASA Astrophysics Data System (ADS)

    Basilevsky, A. T.; Shalygina, O. S.; Bondarenko, N. V.; Shalygin, E. V.; Markiewicz, W. J.

    2017-09-01

    The aim of this work is a comparative study of several typical radar-dark parabolas, the neighboring plains and some other geologic units seen in the study areas, which include craters Adivar, Bassi, Bathsheba, du Chatelet and Sitwell, at two depth scales: the upper several meters of the study object, available through the Magellan-based microwave (at 12.6 cm wavelength) properties (microwave emissivity, Fresnel reflectivity, large-scale surface roughness, and radar cross-section), and the upper hundreds of microns of the object, characterized by the 1 micron emissivity resulting from the analysis of the near infra-red (NIR) irradiation of the night side of the Venusian surface measured by the Venus Monitoring Camera (VMC) on board Venus Express (VEx).

  7. General statistics of stochastic process of gene expression in eukaryotic cells.

    PubMed Central

    Kuznetsov, V A; Knott, G D; Bonner, R F

    2002-01-01

    Thousands of genes are expressed at such very low levels (≤1 copy per cell) that global gene expression analysis of rarer transcripts remains problematic. Ambiguity in identification of rarer transcripts creates considerable uncertainty in fundamental questions such as the total number of genes expressed in an organism and the biological significance of rarer transcripts. Knowing the distribution of the true number of genes expressed at each level and the corresponding gene expression level probability function (GELPF) could help resolve these uncertainties. We found that all observed large-scale gene expression data sets in yeast, mouse, and human cells follow a Pareto-like distribution model skewed by many low-abundance transcripts. A novel stochastic model of the gene expression process predicts the universality of the GELPF both across different cell types within a multicellular organism and across different organisms. This model allows us to predict the frequency distribution of all gene expression levels within a single cell and to estimate the number of expressed genes in a single cell and in a population of cells. A random "basal" transcription mechanism for protein-coding genes in all or almost all eukaryotic cell types is predicted. This fundamental mechanism might enhance the expression of rarely expressed genes and, thus, provide a basic level of phenotypic diversity, adaptability, and random monoallelic expression in cell populations. PMID:12136033

  8. Prognostic Value of microRNA-224 in Various Cancers: A Meta-analysis.

    PubMed

    Zhang, Yue; Guo, Cong-Cong; Guan, Dong-Hui; Yang, Chuan-Hua; Jiang, Yue-Hua

    2017-07-01

    In previous studies, microRNA-224 (miR-224) has frequently been investigated and found to be of vital significance to the prognosis of patients with various cancers. However, its prognostic value has not been estimated systematically. Herein, we performed a meta-analysis to assess its potential predictive value in a variety of human tumors. Eligible studies published up to March 1, 2017 were identified through online searches of PubMed, EMBASE, Web of Science and the Cochrane Database of Systematic Reviews. Overall survival (OS), disease-free survival (DFS) or progression-free survival (PFS) data for various cancers were extracted and calculated, if available. Pooled hazard ratios (HR) and 95% confidence intervals (CI) were calculated using Stata version 13.0 (StataCorp, College Station, Texas, USA). In total, 22 eligible studies with 3000 patients were included in the current meta-analysis. High miR-224 expression was significantly associated with poor OS in tissue (HR = 1.43, 95% CI = 1.00-2.03). In multivariate analysis, high miR-224 expression was more significantly associated with OS in tissue (HR = 2.81, 95% CI = 1.91-4.13). Likewise, there were significant associations between tissue miR-224 expression and prognosis in colorectal cancer (CRC), diffuse large B-cell lymphoma (DLBCL) and gastric cancer (GC) patients (p < 0.05). Nevertheless, there were no significant associations between high tissue miR-224 expression and DFS (HR = 2.15, 95% CI = 0.97-4.79) or PFS (HR = 0.92, 95% CI = 0.53-1.59). On the basis of the present studies, tissue miR-224 has significant prognostic value in various cancers, especially in CRC, DLBCL and GC. Due to the complicated pathogenesis of cancers, more large-scale, standardized studies are required. Copyright © 2017 IMSS. Published by Elsevier Inc. All rights reserved.
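    The hazard ratios and confidence intervals quoted in this abstract can be sanity-checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' Stata workflow: it assumes each 95% CI is symmetric on the log scale (the usual convention for hazard ratios), and the function names are hypothetical. It recovers the implied standard error of log(HR), checks that the geometric mean of the bounds matches the reported HR, and flags whether an interval excludes 1.

```python
import math

def log_hr_se(ci_low, ci_high, z=1.96):
    """Standard error of log(HR) implied by a CI that is symmetric on the log scale."""
    return (math.log(ci_high) - math.log(ci_low)) / (2 * z)

def implied_point_estimate(ci_low, ci_high):
    """Geometric mean of the bounds; equals the HR if the CI is log-symmetric."""
    return math.sqrt(ci_low * ci_high)

def ci_excludes_one(ci_low, ci_high):
    """True when the interval lies strictly above or strictly below HR = 1."""
    return ci_low > 1.0 or ci_high < 1.0

# Values copied from the abstract: (HR, lower bound, upper bound).
reported = {
    "OS (tissue)": (1.43, 1.00, 2.03),
    "OS (tissue, multivariate)": (2.81, 1.91, 4.13),
    "DFS": (2.15, 0.97, 4.79),
    "PFS": (0.92, 0.53, 1.59),
}

for label, (hr, low, high) in reported.items():
    print(label, round(log_hr_se(low, high), 3), ci_excludes_one(low, high))
```

    Under this strict check the tissue OS interval (1.00-2.03) sits exactly on the boundary after rounding, which is consistent with the much stronger multivariate estimate being the more convincing result.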

  9. mRNA-Seq Analysis of the Pseudoperonospora cubensis Transcriptome During Cucumber (Cucumis sativus L.) Infection

    PubMed Central

    Hamilton, John P.; Vaillancourt, Brieanne; Buell, C. Robin; Day, Brad

    2012-01-01

    Pseudoperonospora cubensis, an oomycete, is the causal agent of cucurbit downy mildew, and is responsible for significant losses on cucurbit crops worldwide. While other oomycete plant pathogens have been extensively studied at the molecular level, Ps. cubensis and the molecular basis of its interaction with cucurbit hosts have not been well examined. Here, we present the first large-scale global gene expression analysis of Ps. cubensis infection of a susceptible Cucumis sativus cultivar, ‘Vlaspik’, and identification of genes with putative roles in infection, growth, and pathogenicity. Using high throughput whole transcriptome sequencing, we captured differential expression of 2383 Ps. cubensis genes in sporangia and at 1, 2, 3, 4, 6, and 8 days post-inoculation (dpi). Additionally, comparison of Ps. cubensis expression profiles with expression profiles from an infection time course of the oomycete pathogen Phytophthora infestans on Solanum tuberosum revealed similarities in expression patterns of 1,576–6,806 orthologous genes, suggesting a substantial degree of overlap in molecular events in virulence between the biotrophic Ps. cubensis and the hemi-biotrophic P. infestans. Co-expression analyses identified distinct modules of Ps. cubensis genes that were representative of early, intermediate, and late infection stages. Collectively, these expression data have advanced our understanding of key molecular and genetic events in the virulence of Ps. cubensis and thus provide a foundation for identifying mechanism(s) by which to engineer or effect resistance in the host. PMID:22545137

  10. Computing and Applying Atomic Regulons to Understand Gene Expression and Regulation

    PubMed Central

    Faria, José P.; Davis, James J.; Edirisinghe, Janaka N.; Taylor, Ronald C.; Weisenhorn, Pamela; Olson, Robert D.; Stevens, Rick L.; Rocha, Miguel; Rocha, Isabel; Best, Aaron A.; DeJongh, Matthew; Tintle, Nathan L.; Parrello, Bruce; Overbeek, Ross; Henry, Christopher S.

    2016-01-01

    Understanding gene function and regulation is essential for the interpretation, prediction, and ultimate design of cell responses to changes in the environment. An important step toward meeting the challenge of understanding gene function and regulation is the identification of sets of genes that are always co-expressed. These gene sets, Atomic Regulons (ARs), represent fundamental units of function within a cell and could be used to associate genes of unknown function with cellular processes and to enable rational genetic engineering of cellular systems. Here, we describe an approach for inferring ARs that leverages large-scale expression data sets, gene context, and functional relationships among genes. We computed ARs for Escherichia coli based on 907 gene expression experiments and compared our results with gene clusters produced by two prevalent data-driven methods: Hierarchical clustering and k-means clustering. We compared ARs and purely data-driven gene clusters to the curated set of regulatory interactions for E. coli found in RegulonDB, showing that ARs are more consistent with gold standard regulons than are data-driven gene clusters. We further examined the consistency of ARs and data-driven gene clusters in the context of gene interactions predicted by Context Likelihood of Relatedness (CLR) analysis, finding that the ARs show better agreement with CLR predicted interactions. We determined the impact of increasing amounts of expression data on AR construction and find that while more data improve ARs, it is not necessary to use the full set of gene expression experiments available for E. coli to produce high quality ARs. In order to explore the conservation of co-regulated gene sets across different organisms, we computed ARs for Shewanella oneidensis, Pseudomonas aeruginosa, Thermus thermophilus, and Staphylococcus aureus, each of which represents increasing degrees of phylogenetic distance from E. coli. 
Comparison of the organism-specific ARs showed that the consistency of AR gene membership correlates with phylogenetic distance, but there is clear variability in the regulatory networks of closely related organisms. As large scale expression data sets become increasingly common for model and non-model organisms, comparative analyses of atomic regulons will provide valuable insights into fundamental regulatory modules used across the bacterial domain. PMID:27933038

  11. A high resolution atlas of gene expression in the domestic sheep (Ovis aries)

    PubMed Central

    Farquhar, Iseabail L.; Young, Rachel; Lefevre, Lucas; Pridans, Clare; Tsang, Hiu G.; Afrasiabi, Cyrus; Watson, Mick; Whitelaw, C. Bruce; Freeman, Tom C.; Archibald, Alan L.; Hume, David A.

    2017-01-01

    Sheep are a key source of meat, milk and fibre for the global livestock sector, and an important biomedical model. Global analysis of gene expression across multiple tissues has aided genome annotation and supported functional annotation of mammalian genes. We present a large-scale RNA-Seq dataset representing all the major organ systems from adult sheep and from several juvenile, neonatal and prenatal developmental time points. The Ovis aries reference genome (Oar v3.1) includes 27,504 genes (20,921 protein coding), of which 25,350 (19,921 protein coding) had detectable expression in at least one tissue in the sheep gene expression atlas dataset. Network-based cluster analysis of this dataset grouped genes according to their expression pattern. The principle of ‘guilt by association’ was used to infer the function of uncharacterised genes from their co-expression with genes of known function. We describe the overall transcriptional signatures present in the sheep gene expression atlas and assign those signatures, where possible, to specific cell populations or pathways. The findings are related to innate immunity by focusing on clusters with an immune signature, and to the advantages of cross-breeding by examining the patterns of genes exhibiting the greatest expression differences between purebred and crossbred animals. This high-resolution gene expression atlas for sheep is, to our knowledge, the largest transcriptomic dataset from any livestock species to date. It provides a resource to improve the annotation of the current reference genome for sheep, presenting a model transcriptome for ruminants and insight into gene, cell and tissue function at multiple developmental stages. PMID:28915238

  12. A high resolution atlas of gene expression in the domestic sheep (Ovis aries).

    PubMed

    Clark, Emily L; Bush, Stephen J; McCulloch, Mary E B; Farquhar, Iseabail L; Young, Rachel; Lefevre, Lucas; Pridans, Clare; Tsang, Hiu G; Wu, Chunlei; Afrasiabi, Cyrus; Watson, Mick; Whitelaw, C Bruce; Freeman, Tom C; Summers, Kim M; Archibald, Alan L; Hume, David A

    2017-09-01

    Sheep are a key source of meat, milk and fibre for the global livestock sector, and an important biomedical model. Global analysis of gene expression across multiple tissues has aided genome annotation and supported functional annotation of mammalian genes. We present a large-scale RNA-Seq dataset representing all the major organ systems from adult sheep and from several juvenile, neonatal and prenatal developmental time points. The Ovis aries reference genome (Oar v3.1) includes 27,504 genes (20,921 protein coding), of which 25,350 (19,921 protein coding) had detectable expression in at least one tissue in the sheep gene expression atlas dataset. Network-based cluster analysis of this dataset grouped genes according to their expression pattern. The principle of 'guilt by association' was used to infer the function of uncharacterised genes from their co-expression with genes of known function. We describe the overall transcriptional signatures present in the sheep gene expression atlas and assign those signatures, where possible, to specific cell populations or pathways. The findings are related to innate immunity by focusing on clusters with an immune signature, and to the advantages of cross-breeding by examining the patterns of genes exhibiting the greatest expression differences between purebred and crossbred animals. This high-resolution gene expression atlas for sheep is, to our knowledge, the largest transcriptomic dataset from any livestock species to date. It provides a resource to improve the annotation of the current reference genome for sheep, presenting a model transcriptome for ruminants and insight into gene, cell and tissue function at multiple developmental stages.
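    The 'guilt by association' principle mentioned in this abstract can be illustrated with a small sketch. This is not the atlas's network-based clustering method; it is a simplified stand-in that assigns an uncharacterised gene the annotation of its most strongly correlated annotated gene. The gene names, expression values, function names and the `min_r` threshold are all hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("profile has zero variance")
    return cov / (sx * sy)

def infer_function(target_profile, annotated, min_r=0.9):
    """Return (gene, function, r) for the best co-expressed annotated gene,
    or None when no correlation reaches min_r (an arbitrary cut-off)."""
    best = None
    for gene, (profile, function) in annotated.items():
        r = pearson_r(target_profile, profile)
        if best is None or r > best[2]:
            best = (gene, function, r)
    if best is not None and best[2] >= min_r:
        return best
    return None

# Hypothetical toy data: expression in four tissues.
annotated = {
    "GENE_A": ([10.0, 1.0, 1.0, 8.0], "immune response"),
    "GENE_B": ([1.0, 9.0, 7.0, 1.0], "muscle contraction"),
}
unknown = [20.0, 2.5, 2.0, 15.0]
result = infer_function(unknown, annotated)
print(result)
```

    A real atlas would cluster thousands of genes in a correlation network rather than take a single best match, but the inference step has the same shape: shared expression pattern first, borrowed annotation second.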

  13. Unique differentiation profile of mouse embryonic stem cells in rotary and stirred tank bioreactors.

    PubMed

    Fridley, Krista M; Fernandez, Irina; Li, Mon-Tzu Alice; Kettlewell, Robert B; Roy, Krishnendu

    2010-11-01

    Embryonic stem (ES)-cell-derived lineage-specific stem cells, for example, hematopoietic stem cells, could provide a potentially unlimited source for transplantable cells, especially for cell-based therapies. However, reproducible methods must be developed to maximize and scale up ES cell differentiation to produce clinically relevant numbers of therapeutic cells. Bioreactor-based dynamic culture conditions are amenable to large-scale cell production, but few studies have evaluated how various bioreactor types and culture parameters influence ES cell differentiation, especially hematopoiesis. Our results indicate that cell seeding density and bioreactor speed significantly affect embryoid body formation and subsequent generation of hematopoietic stem and progenitor cells in both stirred tank (spinner flask) and rotary microgravity (Synthecon™) type bioreactors. In general, high percentages of hematopoietic stem and progenitor cells were generated in both bioreactors, especially at high cell densities. In addition, Synthecon bioreactors produced more sca-1(+) progenitors and spinner flasks generated more c-Kit(+) progenitors, demonstrating their unique differentiation profiles. cDNA microarray analysis of genes involved in pluripotency, germ layer formation, and hematopoietic differentiation showed that on day 7 of differentiation, embryoid bodies from both bioreactors consisted of all three germ layers of embryonic development. However, unique gene expression profiles were observed in the two bioreactors; for example, expression of specific hematopoietic genes was significantly more upregulated in the Synthecon cultures than in spinner flasks. We conclude that bioreactor type and culture parameters can be used to control ES cell differentiation, enhance unique progenitor cell populations, and provide means for large-scale production of transplantable therapeutic cells.

  14. Survival analysis for a large scale forest health issue: Missouri oak decline

    Treesearch

    C.W. Woodall; P.L. Grambsch; W. Thomas; W.K. Moser

    2005-01-01

    Survival analysis methodologies provide novel approaches for forest mortality analysis that may aid in detecting, monitoring, and mitigating large-scale forest health issues. This study examined survival analysis for evaluating a regional forest health issue - Missouri oak decline. With a statewide Missouri forest inventory, log-rank tests of the effects of...

  15. Large-scale bioinformatic analysis of the regulation of the disease resistance NBS gene family by microRNAs in Poaceae.

    PubMed

    Habachi-Houimli, Yosra; Khalfallah, Yosra; Makni, Hanem; Makni, Mohamed; Bouktila, Dhia

    2016-01-01

    In the present study, we have screened 71, 713, 525, 119 and 241 mature miRNA variants from Hordeum vulgare, Oryza sativa, Brachypodium distachyon, Triticum aestivum, and Sorghum bicolor, respectively, and classified them with respect to their conservation status and expression levels. These Poaceae non-redundant miRNA species (1,669) were distributed over a total of 625 MIR families, among which only 54 were conserved across two or more plant species, confirming the relatively recent evolutionary differentiation of miRNAs in grasses. On the other hand, we have used 257 H. vulgare, 286 T. aestivum, 119 B. distachyon, 269 O. sativa, and 139 S. bicolor NBS domains, which were either mined directly from the annotated proteomes, or predicted from whole genome sequence assemblies. The hybridization potential between miRNAs and their putative NBS gene targets was analyzed, revealing that at least 454 NBS genes from all five Poaceae were potentially regulated by 265 distinct miRNA species, most of them expressed in leaves and predominantly co-expressed in additional tissues. Based on gene ontology, we could assign these probable miRNA target genes to 16 functional groups: three conferring resistance to bacteria (Rpm1, Xa1 and Rps2) and 13 conferring resistance to fungi (Rpp8,13, Rp3, Tsn1, Lr10, Rps1-k-1, Pm3, Rpg5, and MLA1,6,10,12,13). The results of the present analysis provide a large-scale platform for a better understanding of biological control strategies of disease resistance genes in Poaceae, and will serve as an important starting point for improving crop disease resistance by means of transgenic lines with artificial miRNAs. Copyright © 2016 Académie des sciences. Published by Elsevier SAS. All rights reserved.

  16. Comprehensive evaluation of AmpliSeq transcriptome, a novel targeted whole transcriptome RNA sequencing methodology for global gene expression analysis.

    PubMed

    Li, Wenli; Turner, Amy; Aggarwal, Praful; Matter, Andrea; Storvick, Erin; Arnett, Donna K; Broeckel, Ulrich

    2015-12-16

    Whole transcriptome sequencing (RNA-seq) represents a powerful approach for whole transcriptome gene expression analysis. However, RNA-seq carries a few limitations, e.g., the requirement of a significant amount of input RNA and complications caused by non-specific mapping of short reads. The Ion AmpliSeq Transcriptome Human Gene Expression Kit (AmpliSeq) was recently introduced by Life Technologies as a whole-transcriptome, targeted gene quantification kit to overcome these limitations of RNA-seq. To assess the performance of this new methodology, we performed a comprehensive comparison of AmpliSeq with RNA-seq using two well-established next-generation sequencing platforms (Illumina HiSeq and Ion Torrent Proton). We analyzed standard reference RNA samples and RNA samples obtained from human induced pluripotent stem cell derived cardiomyocytes (hiPSC-CMs). Using published data from two standard RNA reference samples, we observed a strong concordance of log2 fold change for all genes when comparing AmpliSeq to Illumina HiSeq (Pearson's r = 0.92) and Ion Torrent Proton (Pearson's r = 0.92). We used ROC, the Matthews correlation coefficient and RMSD to determine the overall performance characteristics. All three statistical methods show AmpliSeq to be a highly accurate method for differential gene expression analysis. Additionally, for genes with high abundance, AmpliSeq outperforms the two RNA-seq methods. When analyzing four closely related hiPSC-CM lines, we show that both AmpliSeq and RNA-seq capture similar global gene expression patterns consistent with known sources of variation. Our study indicates that AmpliSeq excels in the limiting areas of RNA-seq for gene expression quantification analysis. Thus, AmpliSeq stands as a very sensitive and cost-effective approach for very large scale gene expression analysis and mRNA marker screening with high accuracy.
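    The platform concordance reported in this abstract is a Pearson correlation between per-gene log2 fold changes measured on two platforms. The sketch below shows that calculation in plain Python. It is an illustration only: the counts are invented toy values, the pseudocount of 1 is an assumption to avoid division by zero, and the study's actual normalization pipeline is not described in the abstract.

```python
import math

def log2_fold_change(sample_a, sample_b, pseudocount=1.0):
    """Per-gene log2(B/A), with a pseudocount (an assumption) to avoid division by zero."""
    return [math.log2((b + pseudocount) / (a + pseudocount))
            for a, b in zip(sample_a, sample_b)]

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical counts for five genes in two reference samples, per platform.
platform_1 = {"ref_a": [100, 20, 5, 400, 60], "ref_b": [50, 80, 5, 390, 240]}
platform_2 = {"ref_a": [90, 25, 3, 410, 55], "ref_b": [40, 70, 6, 420, 200]}

lfc_1 = log2_fold_change(platform_1["ref_a"], platform_1["ref_b"])
lfc_2 = log2_fold_change(platform_2["ref_a"], platform_2["ref_b"])
concordance = pearson_r(lfc_1, lfc_2)
print(round(concordance, 3))
```

    Comparing fold changes rather than raw counts is the useful design choice here: platform-specific scale differences cancel within each platform before the two are correlated.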

  17. Proteomic analysis of the renal effects of simulated occupational jet fuel exposure.

    PubMed

    Witzmann, F A; Bauer, M D; Fieno, A M; Grant, R A; Keough, T W; Lacey, M P; Sun, Y; Witten, M L; Young, R S

    2000-03-01

    We analyzed protein expression in the cytosolic fraction prepared from whole kidneys in male Swiss-Webster mice exposed 1 h/day for five days to aerosolized JP-8 jet fuel at a concentration of 1000 mg/m³, simulating military occupational exposure. Kidney cytosol samples were solubilized and separated via large-scale, high-resolution two-dimensional electrophoresis (2-DE) and gel patterns scanned, digitized and processed for statistical analysis. Significant changes in soluble kidney proteins resulted from jet fuel exposure. Several of the altered proteins were identified by peptide mass fingerprinting and related to ultrastructural abnormalities, altered protein processing, metabolic effects, and paradoxical stress protein/detoxification system responses. These results demonstrate a significant but comparatively moderate JP-8 effect on protein expression in the kidney and provide novel molecular evidence of JP-8 nephrotoxicity. Human risk is suggested by these data but conclusive assessment awaits a noninvasive search for biomarkers in JP-8 exposed humans.

  18. From genes to genomes: a new paradigm for studying fungal pathogenesis in Magnaporthe oryzae.

    PubMed

    Xu, Jin-Rong; Zhao, Xinhua; Dean, Ralph A

    2007-01-01

    Magnaporthe oryzae is the most destructive fungal pathogen of rice worldwide and, because of its amenability to classical and molecular genetic manipulation, the availability of a genome sequence, and other resources, it has emerged as a leading model system to study host-pathogen interactions. This chapter reviews recent progress toward elucidation of the molecular basis of infection-related morphogenesis, host penetration, invasive growth, and host-pathogen interactions. Related information on genome analysis and genomic studies of plant infection processes is summarized under specific topics where appropriate. Particular emphasis is placed on the role of MAP kinase and cAMP signal transduction pathways and unique features in the genome such as repetitive sequences and expanded gene families. Emerging developments in functional genome analysis through large-scale insertional mutagenesis and gene expression profiling are detailed. The chapter concludes with new prospects in the area of systems biology, such as protein expression profiling, and highlights the remaining crucial information needed to fully appreciate host-pathogen interactions.

  19. Stable isotope dimethyl labelling for quantitative proteomics and beyond

    PubMed Central

    Hsu, Jue-Liang; Chen, Shu-Hui

    2016-01-01

    Stable-isotope reductive dimethylation, a cost-effective, simple, robust, reliable and easy-to-multiplex labelling method, is widely applied to quantitative proteomics using liquid chromatography-mass spectrometry. This review focuses on biological applications of stable-isotope dimethyl labelling for large-scale comparative analysis of protein expression and post-translational modifications based on the unique properties of the labelling chemistry. Some other applications of the labelling method for sample preparation and mass spectrometry-based protein identification and characterization are also summarized. This article is part of the themed issue ‘Quantitative mass spectrometry’. PMID:27644970

  20. CPM Is a Useful Cell Surface Marker to Isolate Expandable Bi-Potential Liver Progenitor Cells Derived from Human iPS Cells.

    PubMed

    Kido, Taketomo; Koui, Yuta; Suzuki, Kaori; Kobayashi, Ayaka; Miura, Yasushi; Chern, Edward Y; Tanaka, Minoru; Miyajima, Atsushi

    2015-10-13

    To develop a culture system for large-scale production of mature hepatocytes, liver progenitor cells (LPCs) with a high proliferation potential would be advantageous. We have found that carboxypeptidase M (CPM) is highly expressed in embryonic LPCs, hepatoblasts, while its expression is decreased along with hepatic maturation. Consistently, CPM expression was transiently induced during hepatic specification from human-induced pluripotent stem cells (hiPSCs). CPM(+) cells isolated from differentiated hiPSCs at the immature hepatocyte stage proliferated extensively in vitro and expressed a set of genes that were typical of hepatoblasts. Moreover, the CPM(+) cells exhibited a mature hepatocyte phenotype after induction of hepatic maturation and also underwent cholangiocytic differentiation in a three-dimensional culture system. These results indicated that hiPSC-derived CPM(+) cells share the characteristics of LPCs, with the potential to proliferate and differentiate bi-directionally. Thus, CPM is a useful marker for isolating hiPSC-derived LPCs, which allows development of a large-scale culture system for producing hepatocytes and cholangiocytes. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.

  1. CPM Is a Useful Cell Surface Marker to Isolate Expandable Bi-Potential Liver Progenitor Cells Derived from Human iPS Cells

    PubMed Central

    Kido, Taketomo; Koui, Yuta; Suzuki, Kaori; Kobayashi, Ayaka; Miura, Yasushi; Chern, Edward Y.; Tanaka, Minoru; Miyajima, Atsushi

    2015-01-01

    To develop a culture system for large-scale production of mature hepatocytes, liver progenitor cells (LPCs) with a high proliferation potential would be advantageous. We have found that carboxypeptidase M (CPM) is highly expressed in embryonic LPCs, hepatoblasts, while its expression is decreased along with hepatic maturation. Consistently, CPM expression was transiently induced during hepatic specification from human-induced pluripotent stem cells (hiPSCs). CPM+ cells isolated from differentiated hiPSCs at the immature hepatocyte stage proliferated extensively in vitro and expressed a set of genes that were typical of hepatoblasts. Moreover, the CPM+ cells exhibited a mature hepatocyte phenotype after induction of hepatic maturation and also underwent cholangiocytic differentiation in a three-dimensional culture system. These results indicated that hiPSC-derived CPM+ cells share the characteristics of LPCs, with the potential to proliferate and differentiate bi-directionally. Thus, CPM is a useful marker for isolating hiPSC-derived LPCs, which allows development of a large-scale culture system for producing hepatocytes and cholangiocytes. PMID:26365514

  2. The Plant Genome Integrative Explorer Resource: PlantGenIE.org.

    PubMed

    Sundell, David; Mannapperuma, Chanaka; Netotea, Sergiu; Delhomme, Nicolas; Lin, Yao-Cheng; Sjödin, Andreas; Van de Peer, Yves; Jansson, Stefan; Hvidsten, Torgeir R; Street, Nathaniel R

    2015-12-01

    Accessing and exploring large-scale genomics data sets remains a significant challenge to researchers without specialist bioinformatics training. We present the integrated PlantGenIE.org platform for exploration of Populus, conifer and Arabidopsis genomics data, which includes expression networks and associated visualization tools. Standard features of a model organism database are provided, including genome browsers, gene list annotation, Blast homology searches and gene information pages. Community annotation updating is supported via integration of WebApollo. We have produced an RNA-sequencing (RNA-Seq) expression atlas for Populus tremula and have integrated these data within the expression tools. An updated version of the ComPlEx resource for performing comparative plant expression analyses of gene coexpression network conservation between species has also been integrated. The PlantGenIE.org platform provides intuitive access to large-scale and genome-wide genomics data from model forest tree species, facilitating both community contributions to annotation improvement and tools supporting use of the included data resources to inform biological insight. © 2015 The Authors. New Phytologist © 2015 New Phytologist Trust.

  3. High level expression of Acidothermus cellulolyticus β-1, 4-endoglucanase in transgenic rice enhances the hydrolysis of its straw by cultured cow gastric fluid

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chou, Hong L.; Dai, Ziyu; Hsieh, Chia W.

    Large-scale production of effective cellulose hydrolytic enzymes is the key to the bioconversion of agricultural residues to ethanol. The goal of this study was to develop a rice plant as a bioreactor for the large-scale production of cellulose hydrolytic enzymes via genetic transformation, and to simultaneously improve rice straw as an efficient biomass feedstock for conversion of cellulose to glucose. In this study, the cellulose hydrolytic enzyme β-1,4-endoglucanase (E1) from the thermophilic bacterium Acidothermus cellulolyticus was overexpressed in rice through Agrobacterium-mediated transformation. The expression of the bacterial gene in rice was driven by the constitutive Mac promoter, a hybrid promoter of Ti plasmid mannopine synthetase promoter and cauliflower mosaic virus 35S promoter enhancer, with the signal peptide of tobacco pathogenesis-related protein for targeting the protein to the apoplastic compartment for storage. A total of 52 transgenic rice plants from six independent lines expressing the bacterial enzyme were obtained, which expressed the gene at high levels with a normal phenotype. The specific activities of E1 in the leaves of the highest expressing transgenic rice lines were about 20-fold higher than those of various transgenic plants obtained in previous studies, and the protein amounts accounted for up to 6.1% of the total leaf soluble protein. Zymogram and temperature-dependent activity analyses demonstrated the thermostability of the enzyme and its substrate specificity against cellulose, and a simple heat treatment can be used to purify the protein. In addition, hydrolysis of transgenic rice straw with cultured cow gastric fluid yielded almost twice as many reducing sugars as wild-type straw. Taken together, these data suggest that transgenic rice can effectively serve as a bioreactor for large-scale production of active, thermostable cellulose hydrolytic enzymes. As a feedstock, direct expression of large amounts of cellulases in transgenic rice may also facilitate saccharification of cellulose in rice straw and significantly reduce the costs for hydrolytic enzymes.

  4. Large-scale modelling of the divergent spectrin repeats in nesprins: giant modular proteins.

    PubMed

    Autore, Flavia; Pfuhl, Mark; Quan, Xueping; Williams, Aisling; Roberts, Roland G; Shanahan, Catherine M; Fraternali, Franca

    2013-01-01

    Nesprin-1 and nesprin-2 are nuclear envelope (NE) proteins characterized by a common structure of an SR (spectrin repeat) rod domain and a C-terminal transmembrane KASH [Klarsicht-ANC-Syne-homology] domain and display N-terminal actin-binding CH (calponin homology) domains. Mutations in these proteins have been described in Emery-Dreifuss muscular dystrophy and attributed to disruptions of interactions at the NE with nesprins binding partners, lamin A/C and emerin. Evolutionary analysis of the rod domains of the nesprins has shown that they are almost entirely composed of unbroken SR-like structures. We present a bioinformatical approach to accurate definition of the boundaries of each SR by comparison with canonical SR structures, allowing for a large-scale homology modelling of the 74 nesprin-1 and 56 nesprin-2 SRs. The exposed and evolutionary conserved residues identify important pbs for protein-protein interactions that can guide tailored binding experiments. Most importantly, the bioinformatics analyses and the 3D models have been central to the design of selected constructs for protein expression. 1D NMR and CD spectra have been performed of the expressed SRs, showing a folded, stable, high content α-helical structure, typical of SRs. Molecular Dynamics simulations have been performed to study the structural and elastic properties of consecutive SRs, revealing insights in the mechanical properties adopted by these modules in the cell.

  5. Atypical language laterality is associated with large-scale disruption of network integration in children with intractable focal epilepsy.

    PubMed

    Ibrahim, George M; Morgan, Benjamin R; Doesburg, Sam M; Taylor, Margot J; Pang, Elizabeth W; Donner, Elizabeth; Go, Cristina Y; Rutka, James T; Snead, O Carter

    2015-04-01

    Epilepsy is associated with disruption of integration in distributed networks, together with altered localization for functions such as expressive language. The relation between atypical network connectivity and altered localization is unknown. In the current study we tested whether atypical expressive language laterality was associated with the alteration of large-scale network integration in children with medically-intractable localization-related epilepsy (LRE). Twenty-three right-handed children (age range 8-17) with medically-intractable LRE performed a verb generation task in fMRI. Language network activation was identified and the Laterality index (LI) was calculated within the pars triangularis and pars opercularis. Resting-state data from the same cohort were subjected to independent component analysis. Dual regression was used to identify associations between resting-state integration and LI values. Higher positive values of the LI, indicating typical language localization were associated with stronger functional integration of various networks including the default mode network (DMN). The normally symmetric resting-state networks showed a pattern of lateralized connectivity mirroring that of language function. The association between atypical language localization and network integration implies a widespread disruption of neural network development. These findings may inform the interpretation of localization studies by providing novel insights into reorganization of neural networks in epilepsy. Copyright © 2015 Elsevier Ltd. All rights reserved.
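    The abstract above does not give the formula behind its laterality index. A minimal sketch, assuming the commonly used convention LI = (L - R) / (L + R), where L and R are activation measures (for example, suprathreshold voxel counts) in the left and right regions of interest, is shown below. The function name and the input values are hypothetical.

```python
def laterality_index(left: float, right: float) -> float:
    """Return (L - R) / (L + R), assuming this common LI convention.

    The result ranges from -1 (fully right-lateralized) to +1 (fully
    left-lateralized); positive values correspond to typical left-hemisphere
    language dominance under this convention.
    """
    total = left + right
    if total == 0:
        raise ValueError("no activation in either hemisphere")
    return (left - right) / total


# Hypothetical voxel counts for the left and right regions of interest.
example_li = laterality_index(left=120, right=40)
```

    Under this convention, the hypothetical counts above give a positive index, which would be read as left-lateralized activation.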

  6. Analysis on the Critical Rainfall Value For Predicting Large Scale Landslides Caused by Heavy Rainfall In Taiwan.

    NASA Astrophysics Data System (ADS)

    Tsai, Kuang-Jung; Chiang, Jie-Lun; Lee, Ming-Hsi; Chen, Yie-Ruey

    2017-04-01

    Affiliations: Department of Land Management and Development, Chang Jung Christian University, Tainan, Taiwan (Tsai, Chen); Department of Soil and Water Conservation, National Pingtung University of Science and Technology, Pingtung, Taiwan (Chiang, Lee). ABSTRACT Typhoon Morakot brought more than 2,900 mm of accumulated rainfall over three consecutive days in August 2009. This heavy rainfall event induced very serious landslides and sediment-related disasters. A satellite image analysis project conducted by the Soil and Water Conservation Bureau after the Morakot event identified more than 10,904 landslide sites with a total sliding area of 18,113 ha. All severe sediment-related disaster areas in southern Taiwan during periods of extremely heavy rainfall were also characterized by disaster type, scale, topography, major bedrock formations and geologic structures. The characteristics and mechanisms of large-scale landslides were compiled from field investigations integrated with GPS/GIS/RS techniques. To decrease the risk of large-scale landslides on slope land, a slope land conservation strategy and a critical rainfall database should be set up and put into practice as soon as possible. Meanwhile, establishing a critical rainfall value for predicting large-scale landslides induced by heavy rainfall has become an important issue of serious concern to the government and the people of Taiwan. This research addresses the mechanism of large-scale landslides, rainfall frequency analysis, sediment budget estimation and river hydraulic analysis under the extreme climate change conditions of the past 10 years. It is hoped that the results of this research can be used in a warning system for predicting large-scale landslides in southern Taiwan. Keywords: heavy rainfall, large scale, landslides, critical rainfall value

  7. Anger Expression Types and Interpersonal Problems in Nurses.

    PubMed

    Han, Aekyung; Won, Jongsoon; Kim, Oksoo; Lee, Sang E

    2015-06-01

    The purpose of this study was to investigate the anger expression types in nurses and to analyze the differences between the anger expression types and interpersonal problems. The data were collected from 149 nurses working in general hospitals with 300 beds or more in Seoul or Gyeonggi province, Korea. For anger expression type, the anger expression scale from the Korean State-Trait Anger Expression Inventory was used. For interpersonal problems, the short form of the Korean Inventory of Interpersonal Problems Circumplex Scales was used. Data were analyzed using descriptive statistics, cluster analysis, multivariate analysis of variance, and Duncan's multiple comparisons test. Three anger expression types in nurses were found: low-anger expression, anger-in, and anger-in/control type. From the results of multivariate analysis of variance, there were significant differences between anger expression types and interpersonal problems (Wilks lambda F = 3.52, p < .001). Additionally, anger-in/control type was found to have the most difficulty with interpersonal problems by Duncan's post hoc test (p < .050). Based on this research, the development of an anger expression intervention program for nurses is recommended to establish the means of expressing suppressed emotions, which would help nurses experience fewer interpersonal problems. Copyright © 2015. Published by Elsevier B.V.

  8. A comprehensive analysis on preservation patterns of gene co-expression networks during Alzheimer's disease progression.

    PubMed

    Ray, Sumanta; Hossain, Sk Md Mosaddek; Khatun, Lutfunnesa; Mukhopadhyay, Anirban

    2017-12-20

    Alzheimer's disease (AD) is a chronic neurodegenerative disorder of the brain that involves large-scale transcriptomic variation. The disease does not affect every region of the brain at the same time; instead, it progresses slowly, involving somewhat sequential interaction with different regions. Analysis of the expression patterns of genes in the different brain regions affected in AD can contribute to a better understanding of AD pathogenesis and shed light on the early characterization of the disease. Here, we propose a framework to identify perturbation and preservation characteristics of gene expression patterns across six distinct regions of the brain ("EC", "HIP", "PC", "MTG", "SFG", and "VCX") affected in AD. Co-expression modules were discovered by considering a pair of regions at a time and were then analyzed for their preservation and perturbation characteristics. Different module preservation statistics and a rank aggregation mechanism were adopted to detect changes in expression patterns across brain regions. Gene ontology (GO) and pathway-based analyses were also carried out to determine the biological meaning of preserved and perturbed modules. In this article, we have extensively studied the preservation patterns of co-expressed modules in six distinct brain regions affected in AD. Some modules emerged as the most preserved, while others were detected as perturbed between a pair of brain regions. Further investigation of the topological properties of preserved and non-preserved modules reveals a substantial association between the "betweenness centrality" and "degree" of the involved genes. Our findings may provide a deeper understanding of the preservation characteristics of gene expression patterns in discrete brain regions affected by AD.

  9. Divergent evolution of arrested development in the dauer stage of Caenorhabditis elegans and the infective stage of Heterodera glycines

    PubMed Central

    Elling, Axel A; Mitreva, Makedonka; Recknor, Justin; Gai, Xiaowu; Martin, John; Maier, Thomas R; McDermott, Jeffrey P; Hewezi, Tarek; McK Bird, David; Davis, Eric L; Hussey, Richard S; Nettleton, Dan; McCarter, James P; Baum, Thomas J

    2007-01-01

    Background The soybean cyst nematode Heterodera glycines is the most important parasite in soybean production worldwide. A comprehensive analysis of large-scale gene expression changes throughout the development of plant-parasitic nematodes has been lacking to date. Results We report an extensive genomic analysis of H. glycines, beginning with the generation of 20,100 expressed sequence tags (ESTs). In-depth analysis of these ESTs plus approximately 1,900 previously published sequences predicted 6,860 unique H. glycines genes and allowed a classification by function using InterProScan. Expression profiling of all 6,860 genes throughout the H. glycines life cycle was undertaken using the Affymetrix Soybean Genome Array GeneChip. Our data sets and results represent a comprehensive resource for molecular studies of H. glycines. Demonstrating the power of this resource, we were able to address whether arrested development in the Caenorhabditis elegans dauer larva and the H. glycines infective second-stage juvenile (J2) exhibits shared gene expression profiles. We determined that the gene expression profiles associated with the C. elegans dauer pathway are not uniformly conserved in H. glycines and that the expression profiles of genes for metabolic enzymes of C. elegans dauer larvae and H. glycines infective J2 are dissimilar. Conclusion Our results indicate that hallmark gene expression patterns and metabolism features are not shared in the developmentally arrested life stages of C. elegans and H. glycines, suggesting that developmental arrest in these two nematode species has undergone more divergent evolution than previously thought and pointing to the need for detailed genomic analyses of individual parasite species. PMID:17919324

  10. The Effect of Gestational Age on Angiogenic Gene Expression in the Rat Placenta

    PubMed Central

    Vaswani, Kanchan; Hum, Melissa Wen-Ching; Chan, Hsiu-Wen; Ryan, Jennifer; Wood-Bradley, Ryan J.; Nitert, Marloes Dekker; Mitchell, Murray D.; Armitage, James A.; Rice, Gregory E.

    2013-01-01

    The placenta plays a central role in determining the outcome of pregnancy. It undergoes changes during gestation as the fetus develops and as demands for energy substrate transfer and gas exchange increase. The molecular mechanisms that coordinate these changes have yet to be fully elucidated. The study performed a large-scale screen of the transcriptome of the rat placenta throughout mid-late gestation (E14.25–E20) with emphasis on characterizing gestational age associated changes in the expression of genes involved in angiogenic pathways. Sprague Dawley dams were sacrificed at E14.25, E15.25, E17.25 and E20 (n = 6 per group) and RNA was isolated from one placenta per dam. Changes in placental gene expression were identified using Illumina Rat Ref-12 Expression BeadChip Microarrays. Differentially expressed genes (>2-fold change, <1% false discovery rate, FDR) were functionally categorised by gene ontology pathway analysis. A subset of differentially expressed genes identified by microarrays was confirmed using Real-Time qPCR. The expression of thirty-one genes involved in the angiogenic pathway was shown to change over time, using microarray analysis (22 genes displayed increased and 9 genes decreased expression). Five genes (4 up-regulated: Cd36, Mmp14, Rhob and Angpt4; and 1 down-regulated: Foxm1) involved in angiogenesis and blood vessel morphogenesis were subjected to further validation. qPCR confirmed late gestational increased expression of Cd36, Mmp14, Rhob and Angpt4 and a decrease in expression of Foxm1 before labour onset (P<0.0001). The observed acute, pre-labour changes in the expression of the 31 genes during gestation warrant further investigation to elucidate their role in pregnancy. PMID:24391823
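    The selection rule quoted in the abstract above (>2-fold change, <1% FDR) can be sketched as follows. The abstract does not say which FDR procedure was used, so the Benjamini-Hochberg adjustment is swapped in here as a common choice. Fold changes are assumed to be expression ratios, so a two-fold decrease is a ratio below 0.5. All function names, gene labels and numbers are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_de_genes(genes, fold_changes, pvalues, min_fold=2.0, max_fdr=0.01):
    """Keep genes changed more than min_fold in either direction at FDR < max_fdr."""
    qvalues = benjamini_hochberg(pvalues)
    return [
        gene
        for gene, ratio, q in zip(genes, fold_changes, qvalues)
        if (ratio > min_fold or ratio < 1.0 / min_fold) and q < max_fdr
    ]


# Hypothetical input: gene labels, expression ratios and raw p-values.
genes = ["gene_a", "gene_b", "gene_c", "gene_d"]
ratios = [3.0, 0.4, 1.5, 2.5]
raw_p = [0.001, 0.002, 0.0001, 0.5]
selected = select_de_genes(genes, ratios, raw_p)
```

    In this toy input, gene_c fails the fold-change threshold and gene_d fails the FDR threshold, so only the first two labels are kept.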

  11. Shifting Interests: Changes in the Lexical Semantics of ED-MEDIA

    ERIC Educational Resources Information Center

    Wild, Fridolin; Valentine, Chris; Scott, Peter

    2010-01-01

    Large research networks naturally form complex communities with overlapping but not identical expertise. To map the distribution of professional competence in the field of "technology-enhanced learning", the lexical semantics expressed in research articles published in a representative, large-scale conference (ED-MEDIA) can be investigated and changes…

  12. Genomic analysis of hepatoblastoma identifies distinct molecular and prognostic subgroups.

    PubMed

    Sumazin, Pavel; Chen, Yidong; Treviño, Lisa R; Sarabia, Stephen F; Hampton, Oliver A; Patel, Kayuri; Mistretta, Toni-Ann; Zorman, Barry; Thompson, Patrick; Heczey, Andras; Comerford, Sarah; Wheeler, David A; Chintagumpala, Murali; Meyers, Rebecka; Rakheja, Dinesh; Finegold, Milton J; Tomlinson, Gail; Parsons, D Williams; López-Terrada, Dolores

    2017-01-01

    Despite being the most common liver cancer in children, hepatoblastoma (HB) is a rare neoplasm. Consequently, few pretreatment tumors have been molecularly profiled, and there are no validated prognostic or therapeutic biomarkers for HB patients. We report on the first large-scale effort to profile pretreatment HBs at diagnosis. Our analysis of 88 clinically annotated HBs revealed three risk-stratifying molecular subtypes that are characterized by differential activation of hepatic progenitor cell markers and metabolic pathways: high-risk tumors were characterized by up-regulated nuclear factor, erythroid 2-like 2 activity; high lin-28 homolog B, high mobility group AT-hook 2, spalt-like transcription factor 4, and alpha-fetoprotein expression; and high coordinated expression of oncofetal proteins and stem-cell markers, while low-risk tumors had low lin-28 homolog B and lethal-7 expression and high hepatic nuclear factor 1 alpha activity. Analysis of immunohistochemical assays using antibodies targeting these genes in a prospective study of 35 HBs suggested that these candidate biomarkers have the potential to improve risk stratification and guide treatment decisions for HB patients at diagnosis; our results pave the way for clinical collaborative studies to validate candidate biomarkers and test their potential to improve outcome for HB patients. (Hepatology 2017;65:104-121). © 2016 by the American Association for the Study of Liver Diseases.

  13. Dual Anterograde and Retrograde Viral Tracing of Reciprocal Connectivity.

    PubMed

    Haberl, Matthias G; Ginger, Melanie; Frick, Andreas

    2017-01-01

    Current large-scale approaches in neuroscience aim to unravel the complete connectivity map of specific neuronal circuits, or even the entire brain. This emerging research discipline has been termed connectomics. Recombinant glycoprotein-deleted rabies virus (RABV ∆G) has become an important tool for the investigation of neuronal connectivity in the brains of a variety of species. Neuronal infection with even a single RABV ∆G particle results in high-level transgene expression, revealing the fine-detailed morphology of all neuronal features (including dendritic spines, axonal processes, and boutons) on a brain-wide scale. This labeling is eminently suitable for subsequent post-hoc morphological analysis, such as semiautomated reconstruction in 3D. Here we describe the use of a recently developed anterograde RABV ∆G variant together with a retrograde RABV ∆G for the investigation of projections both to, and from, a particular brain region. In addition to the automated reconstruction of a dendritic tree, we also give as an example the volume measurements of axonal boutons following RABV ∆G-mediated fluorescent marker expression. In conclusion, RABV ∆G variants expressing a combination of markers and/or tools for stimulating/monitoring neuronal activity, used together with genetic or behavioral animal models, promise important insights into the structure-function relationship of neural circuits.

  14. Rationality, emotional expression and control: psychometric characteristics of a questionnaire for research in psycho-oncology.

    PubMed

    Bleiker, E M; van der Ploeg, H M; Hendriks, J H; Leer, J W; Kleijn, W C

    1993-12-01

    In some studies rationality, anti-emotionality and the control of (negative) emotions were found to be psychological risk factors for cancer. In the present study instruments were developed in order to cross-validate the role of the 'rationality/anti-emotionality (RAE)'-concept and the 'emotional expression and control (EEC)'-concept. The psychometric characteristics of a RAE-scale and EEC-scales were investigated in 4302 healthy women attending a breast cancer screening programme in The Netherlands. Principal components analysis revealed three factors for the RAE-scale: (1) Rationality; (2) Emotionality; and (3) Understanding. The EEC-scales consist of three factors that indicate: (1) expression of emotions to oneself; (2) expression of emotions towards others; and (3) control of emotions. These RAE and EEC scales can be of importance in psycho-oncological research, especially when: (1) the more refined subscales are used; and (2) age of the subjects is taken into account.

  15. Integrated network analysis identifies fight-club nodes as a class of hubs encompassing key putative switch genes that induce major transcriptome reprogramming during grapevine development.

    PubMed

    Palumbo, Maria Concetta; Zenoni, Sara; Fasoli, Marianna; Massonnet, Mélanie; Farina, Lorenzo; Castiglione, Filippo; Pezzotti, Mario; Paci, Paola

    2014-12-01

    We developed an approach that integrates different network-based methods to analyze the correlation network arising from large-scale gene expression data. By studying grapevine (Vitis vinifera) and tomato (Solanum lycopersicum) gene expression atlases and a grapevine berry transcriptomic data set during the transition from immature to mature growth, we identified a category named "fight-club hubs" characterized by a marked negative correlation with the expression profiles of neighboring genes in the network. A special subset named "switch genes" was identified, with the additional property of many significant negative correlations outside their own group in the network. Switch genes are involved in multiple processes and include transcription factors that may be considered master regulators of the previously reported transcriptome remodeling that marks the developmental shift from immature to mature growth. All switch genes, expressed at low levels in vegetative/green tissues, showed a significant increase in mature/woody organs, suggesting a potential regulatory role during the developmental transition. Finally, our analysis of tomato gene expression data sets showed that wild-type switch genes are downregulated in ripening-deficient mutants. The identification of known master regulators of tomato fruit maturation suggests our method is suitable for the detection of key regulators of organ development in different fleshy fruit crops. © 2014 American Society of Plant Biologists. All rights reserved.
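    The "fight-club hub" criterion described above (a hub whose expression profile is negatively correlated with those of its network neighbors) can be illustrated with a small sketch. It assumes the criterion is summarized as the mean Pearson correlation between a gene and its neighbors; the abstract does not spell out the exact statistic or thresholds, and the profiles and names below are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def average_neighbor_correlation(profiles, neighbors, gene):
    """Mean Pearson correlation between a gene and each of its neighbors."""
    values = [pearson(profiles[gene], profiles[n]) for n in neighbors[gene]]
    return sum(values) / len(values)


# Hypothetical expression profiles across four samples.
profiles = {
    "hub": [1.0, 2.0, 3.0, 4.0],
    "n1": [4.0, 3.0, 2.0, 1.0],
    "n2": [8.0, 6.0, 5.0, 1.0],
}
# Hypothetical adjacency list of the correlation network.
neighbors = {"hub": ["n1", "n2"]}

hub_score = average_neighbor_correlation(profiles, neighbors, "hub")
is_fight_club_candidate = hub_score < 0
```

    A negative average, as in this toy network, marks the gene as a candidate under the assumed criterion; a real analysis would also need a hub definition and significance thresholds, which are not given in the abstract.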

  16. Integrated Network Analysis Identifies Fight-Club Nodes as a Class of Hubs Encompassing Key Putative Switch Genes That Induce Major Transcriptome Reprogramming during Grapevine Development

    PubMed Central

    Palumbo, Maria Concetta; Zenoni, Sara; Fasoli, Marianna; Massonnet, Mélanie; Farina, Lorenzo; Castiglione, Filippo; Pezzotti, Mario; Paci, Paola

    2014-01-01

    We developed an approach that integrates different network-based methods to analyze the correlation network arising from large-scale gene expression data. By studying grapevine (Vitis vinifera) and tomato (Solanum lycopersicum) gene expression atlases and a grapevine berry transcriptomic data set during the transition from immature to mature growth, we identified a category named “fight-club hubs” characterized by a marked negative correlation with the expression profiles of neighboring genes in the network. A special subset named “switch genes” was identified, with the additional property of many significant negative correlations outside their own group in the network. Switch genes are involved in multiple processes and include transcription factors that may be considered master regulators of the previously reported transcriptome remodeling that marks the developmental shift from immature to mature growth. All switch genes, expressed at low levels in vegetative/green tissues, showed a significant increase in mature/woody organs, suggesting a potential regulatory role during the developmental transition. Finally, our analysis of tomato gene expression data sets showed that wild-type switch genes are downregulated in ripening-deficient mutants. The identification of known master regulators of tomato fruit maturation suggests our method is suitable for the detection of key regulators of organ development in different fleshy fruit crops. PMID:25490918

  17. Global map of physical interactions among differentially expressed genes in multiple sclerosis relapses and remissions.

    PubMed

    Tuller, Tamir; Atar, Shimshi; Ruppin, Eytan; Gurevich, Michael; Achiron, Anat

    2011-09-15

    Multiple sclerosis (MS) is a central nervous system autoimmune inflammatory T-cell-mediated disease with a relapsing-remitting course in the majority of patients. In this study, we performed a high-resolution systems biology analysis of gene expression and physical interactions in MS relapse and remission. To this end, we integrated 164 large-scale measurements of gene expression in peripheral blood mononuclear cells of MS patients in relapse or remission and healthy subjects, with large-scale information about the physical interactions between these genes obtained from public databases. These data were analyzed with a variety of computational methods. We find that there is a clear and significant global network-level signal that is related to the changes in gene expression of MS patients in comparison to healthy subjects. However, despite the clear differences in the clinical symptoms of MS patients in relapse versus remission, the network-level signal is weaker when comparing patients in these two stages of the disease. This result suggests that most of the genes have relatively similar expression levels in the two stages of the disease. In accordance with previous studies, we found that the pathways related to regulation of cell death, chemotaxis and inflammatory response are differentially expressed in the disease in comparison to healthy subjects, while pathways related to cell adhesion, cell migration and cell-cell signaling are activated in relapse in comparison to remission. However, the current study includes a detailed report of the exact set of genes involved in these pathways and the interactions between them. For example, we found that the genes TP53 and IL1 are 'network-hubs' that interact with many of the differentially expressed genes in MS patients versus healthy subjects, and the epidermal growth factor receptor is a 'network-hub' in the case of MS patients with relapse versus remission. The statistical approaches employed in this study enabled us to report new sets of genes that, according to their gene expression and physical interactions, are predicted to be differentially expressed in MS versus healthy subjects, and in MS patients in relapse versus remission. Some of these genes may be useful biomarkers for diagnosing MS and predicting relapses in MS patients.

  18. Honeycomb: Visual Analysis of Large Scale Social Networks

    NASA Astrophysics Data System (ADS)

    van Ham, Frank; Schulz, Hans-Jörg; Dimicco, Joan M.

    The rise in the use of social network sites allows us to collect large amounts of user-reported data on social structures, and analysis of this data could provide useful insights for many of the social sciences. This analysis is typically the domain of Social Network Analysis, and visualization of these structures often proves invaluable in understanding them. However, currently available visual analysis tools are not very well suited to handle the massive scale of this network data, and often resort to displaying small ego networks or heavily abstracted networks. In this paper, we present Honeycomb, a visualization tool that is able to deal with much larger scale data (with millions of connections), which we illustrate by using a large-scale corporate social networking site as an example. Additionally, we introduce a new probability-based network metric to guide users to potentially interesting or anomalous patterns and discuss lessons learned during design and implementation.

  19. Predicting chromatin architecture from models of polymer physics.

    PubMed

    Bianco, Simona; Chiariello, Andrea M; Annunziatella, Carlo; Esposito, Andrea; Nicodemi, Mario

    2017-03-01

    We review the picture of chromatin large-scale 3D organization emerging from the analysis of Hi-C data and polymer modeling. In higher mammals, Hi-C contact maps reveal a complex higher-order organization, extending from the sub-Mb to chromosomal scales, hierarchically folded in a structure of domains-within-domains (metaTADs). The domain folding hierarchy is partially conserved throughout differentiation, and deeply correlated to epigenomic features. Rearrangements in the metaTAD topology relate to gene expression modifications: in particular, in neuronal differentiation models, topologically associated domains (TADs) tend to have coherent expression changes within architecturally conserved metaTAD niches. To identify the nature of architectural domains and their molecular determinants within a principled approach, we discuss models based on polymer physics. We show that basic concepts of interacting polymer physics explain chromatin spatial organization across chromosomal scales and cell types. The 3D structure of genomic loci can be derived with high accuracy and its molecular determinants identified by crossing information with epigenomic databases. In particular, we illustrate the case of the Sox9 locus, linked to human congenital disorders. The model's in-silico predictions of the effects of genomic rearrangements are confirmed by available 5C data. This can help establish new diagnostic tools for diseases linked to chromatin mis-folding, such as congenital disorders and cancer.

  20. Zebrafish Whole-Adult-Organism Chemogenomics for Large-Scale Predictive and Discovery Chemical Biology

    PubMed Central

    Lam, Siew Hong; Mathavan, Sinnakarupan; Tong, Yan; Li, Haixia; Karuturi, R. Krishna Murthy; Wu, Yilian; Vega, Vinsensius B.; Liu, Edison T.; Gong, Zhiyuan

    2008-01-01

    The ability to perform large-scale, expression-based chemogenomics on whole adult organisms, as in invertebrate models (worm and fly), is highly desirable for a vertebrate model, but its feasibility and potential have not been demonstrated. We performed expression-based chemogenomics on the whole adult organism of a vertebrate model, the zebrafish, and demonstrated its potential for large-scale predictive and discovery chemical biology. Focusing on two classes of compounds with wide implications to human health, polycyclic (halogenated) aromatic hydrocarbons [P(H)AHs] and estrogenic compounds (ECs), we generated robust prediction models that can discriminate compounds of the same class from those of different classes in two large independent experiments. The robust expression signatures led to the identification of biomarkers for potent aryl hydrocarbon receptor (AHR) and estrogen receptor (ER) agonists, respectively, and were validated in multiple targeted tissues. Knowledge-based data mining of human homologs of zebrafish genes revealed highly conserved chemical-induced biological responses/effects, health risks, and novel biological insights associated with AHR and ER that could be inferred to humans. Thus, our study presents an effective, high-throughput strategy of capturing molecular snapshots of chemical-induced biological states of a whole adult vertebrate that provides information on biomarkers of effects, deregulated signaling pathways, and possible affected biological functions, perturbed physiological systems, and increased health risks. These findings place zebrafish in a strategic position to bridge the wide gap between cell-based and rodent models in chemogenomics research and applications, especially in preclinical drug discovery and toxicology. PMID:18618001

  1. Integrative analysis of RUNX1 downstream pathways and target genes

    PubMed Central

    Michaud, Joëlle; Simpson, Ken M; Escher, Robert; Buchet-Poyau, Karine; Beissbarth, Tim; Carmichael, Catherine; Ritchie, Matthew E; Schütz, Frédéric; Cannon, Ping; Liu, Marjorie; Shen, Xiaofeng; Ito, Yoshiaki; Raskind, Wendy H; Horwitz, Marshall S; Osato, Motomi; Turner, David R; Speed, Terence P; Kavallaris, Maria; Smyth, Gordon K; Scott, Hamish S

    2008-01-01

    Background: The RUNX1 transcription factor gene is frequently mutated in sporadic myeloid and lymphoid leukemia through translocation, point mutation or amplification. It is also responsible for a familial platelet disorder with predisposition to acute myeloid leukemia (FPD-AML). The disruption of the largely unknown biological pathways controlled by RUNX1 is likely to be responsible for the development of leukemia. We have used multiple microarray platforms and bioinformatic techniques to help identify these biological pathways to aid in the understanding of why RUNX1 mutations lead to leukemia. Results: Here we report genes regulated either directly or indirectly by RUNX1 based on the study of gene expression profiles generated from 3 different human and mouse platforms. The platforms used were global gene expression profiling of: 1) cell lines with RUNX1 mutations from FPD-AML patients, 2) over-expression of RUNX1 and CBFβ, and 3) Runx1 knockout mouse embryos using either cDNA or Affymetrix microarrays. We observe that our datasets (lists of differentially expressed genes) significantly correlate with published microarray data from sporadic AML patients with mutations in either RUNX1 or its cofactor, CBFβ. A number of biological processes were identified among the differentially expressed genes and functional assays suggest that heterozygous RUNX1 point mutations in patients with FPD-AML impair cell proliferation, microtubule dynamics and possibly genetic stability. In addition, analysis of the regulatory regions of the differentially expressed genes has for the first time systematically identified numerous potential novel RUNX1 target genes. Conclusion: This work is the first large-scale study attempting to identify the genetic networks regulated by RUNX1, a master regulator in the development of the hematopoietic system and leukemia. The biological pathways and target genes controlled by RUNX1 will have considerable importance in disease progression in both familial and sporadic leukemia as well as therapeutic implications. PMID:18671852

  2. Negative Symptom Dimensions of the Positive and Negative Syndrome Scale Across Geographical Regions: Implications for Social, Linguistic, and Cultural Consistency.

    PubMed

    Khan, Anzalee; Liharska, Lora; Harvey, Philip D; Atkins, Alexandra; Ulshen, Daniel; Keefe, Richard S E

    2017-12-01

    Objective: Recognizing the discrete dimensions that underlie negative symptoms in schizophrenia and how these dimensions are understood across localities might result in better understanding and treatment of these symptoms. To this end, the objectives of this study were to 1) identify the Positive and Negative Syndrome Scale negative symptom dimensions of expressive deficits and experiential deficits and 2) analyze performance on these dimensions over 15 geographical regions to determine whether the items defining them manifest similar reliability across these regions. Design: Data were obtained for the baseline Positive and Negative Syndrome Scale visits of 6,889 subjects across 15 geographical regions. Using confirmatory factor analysis, we examined whether a two-factor negative symptom structure that is found in schizophrenia (experiential deficits and expressive deficits) would be replicated in our sample, and using differential item functioning, we tested the degree to which specific items from each negative symptom subfactor performed across geographical regions in comparison with the United States. Results: The two-factor negative symptom solution was replicated in this sample. Most geographical regions showed moderate-to-large differential item functioning for Positive and Negative Syndrome Scale expressive deficit items, especially N3 Poor Rapport, as compared with Positive and Negative Syndrome Scale experiential deficit items, showing that these items might be interpreted or scored differently in different regions. Across countries, except for India, the differential item functioning values did not favor raters in the United States. Conclusion: These results suggest that the Positive and Negative Syndrome Scale negative symptom factor can be better represented by a two-factor model than by a single-factor model. Additionally, the results show significant differences in responses to items representing the Positive and Negative Syndrome Scale expressive factors, but not the experiential factors, across regions. This could be due to a lack of equivalence between the original and translated versions, cultural differences with the interpretation of items, dissimilarities in rater training, or diversity in the understanding of scoring anchors. Knowing which items are challenging for raters across regions can help to guide Positive and Negative Syndrome Scale training and improve the results of international clinical trials aimed at negative symptoms.

  3. The Spielberger Anger Expression Scale: some psychometric data.

    PubMed

    Knight, R G; Chisholm, B J; Paulin, J M; Waal-Manning, H J

    1988-09-01

    Some general population norms for the Spielberger, Johnson et al. (1984) Anger Expression Scale (AX) are reported for a sample of over 1000 adults tested in a general health survey of a New Zealand community. Factor analysis confirmed the independence of the Anger/In and Anger/Out subscales, and the measure was found to have satisfactory levels of reliability.

  4. Large-scale purification and characterization of recombinant human stem cell factor in Escherichia coli.

    PubMed

    Chen, Liang-Hua; Cai, Feng; Zhang, Dan-Ju; Zhang, Li; Zhu, Peng; Gao, Shun

    2017-07-01

    The pharmacological importance of recombinant human stem cell factor (rhSCF) has increased the demand to establish effective and large-scale production and purification processes. A good source of bioactive recombinant protein with capability of being scaled-up without losing activity has always been a challenge. The objectives of the study were the rapid and efficient pilot-scale expression and purification of rhSCF. The gene encoding stem cell factor (SCF) was cloned into pBV220 and transformed into Escherichia coli. The recombinant SCF was expressed and isolated using a procedure consisting of isolation of inclusion bodies (IBs), denaturation, and refolding followed by chromatographic steps toward purification. The yield of rhSCF reached 835.6 g/20 L, and the expression levels of rhSCF were about 33.9% of the total E. coli protein content. rhSCF was purified by isolation of IBs, denaturation, and refolding, followed by SP-Sepharose chromatography, Source 30 reversed-phase chromatography, and Q-Sepharose chromatography. This procedure was developed to isolate 5.5 g of rhSCF (99.5% purity) with specific activity at 0.96 × 10^6 IU/mg, endotoxin levels of pyrogen at 1.0 EU/mg, and bacterial DNA at 10 ng/mg. Pilot-scale fermentations and purifications were set up for the production of rhSCF that can be upscaled for industry. © 2016 International Union of Biochemistry and Molecular Biology, Inc.

  5. Analysis and Functional Annotation of an Expressed Sequence Tag Collection for Tropical Crop Sugarcane

    PubMed Central

    Vettore, André L.; da Silva, Felipe R.; Kemper, Edson L.; Souza, Glaucia M.; da Silva, Aline M.; Ferro, Maria Inês T.; Henrique-Silva, Flavio; Giglioti, Éder A.; Lemos, Manoel V.F.; Coutinho, Luiz L.; Nobrega, Marina P.; Carrer, Helaine; França, Suzelei C.; Bacci, Maurício; Goldman, Maria Helena S.; Gomes, Suely L.; Nunes, Luiz R.; Camargo, Luis E.A.; Siqueira, Walter J.; Van Sluys, Marie-Anne; Thiemann, Otavio H.; Kuramae, Eiko E.; Santelli, Roberto V.; Marino, Celso L.; Targon, Maria L.P.N.; Ferro, Jesus A.; Silveira, Henrique C.S.; Marini, Danyelle C.; Lemos, Eliana G.M.; Monteiro-Vitorello, Claudia B.; Tambor, José H.M.; Carraro, Dirce M.; Roberto, Patrícia G.; Martins, Vanderlei G.; Goldman, Gustavo H.; de Oliveira, Regina C.; Truffi, Daniela; Colombo, Carlos A.; Rossi, Magdalena; de Araujo, Paula G.; Sculaccio, Susana A.; Angella, Aline; Lima, Marleide M.A.; de Rosa, Vicente E.; Siviero, Fábio; Coscrato, Virginia E.; Machado, Marcos A.; Grivet, Laurent; Di Mauro, Sonia M.Z.; Nobrega, Francisco G.; Menck, Carlos F.M.; Braga, Marilia D.V.; Telles, Guilherme P.; Cara, Frank A.A.; Pedrosa, Guilherme; Meidanis, João; Arruda, Paulo

    2003-01-01

    To contribute to our understanding of the genome complexity of sugarcane, we undertook a large-scale expressed sequence tag (EST) program. More than 260,000 cDNA clones were partially sequenced from 26 standard cDNA libraries generated from different sugarcane tissues. After the processing of the sequences, 237,954 high-quality ESTs were identified. These ESTs were assembled into 43,141 putative transcripts. Of the assembled sequences, 35.6% presented no matches with existing sequences in public databases. A global analysis of the whole SUCEST data set indicated that 14,409 assembled sequences (33% of the total) contained at least one cDNA clone with a full-length insert. Annotation of the 43,141 assembled sequences associated almost 50% of the putative identified sugarcane genes with protein metabolism, cellular communication/signal transduction, bioenergetics, and stress responses. Inspection of the translated assembled sequences for conserved protein domains revealed 40,821 amino acid sequences with 1415 Pfam domains. Reassembling the consensus sequences of the 43,141 transcripts revealed a 22% redundancy in the first assembling. This indicated that possibly 33,620 unique genes had been identified and indicated that >90% of the sugarcane expressed genes were tagged. PMID:14613979
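
    The figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated above; the variable and function names are illustrative and are not taken from the SUCEST project.

```python
# Figures quoted in the abstract above; names are illustrative only.
assembled_transcripts = 43_141
estimated_unique_genes = 33_620
full_length_sequences = 14_409
no_match_fraction = 0.356


def redundancy_fraction(assembled: int, unique: int) -> float:
    """Fraction of assembled sequences that collapse when reassembled."""
    return 1.0 - unique / assembled


redundancy = redundancy_fraction(assembled_transcripts, estimated_unique_genes)
full_length_share = full_length_sequences / assembled_transcripts
sequences_without_match = round(no_match_fraction * assembled_transcripts)

print(f"redundancy in first assembly: {redundancy:.1%}")
print(f"share with a full-length insert: {full_length_share:.1%}")
print(f"sequences with no database match: about {sequences_without_match}")
```

    The computed redundancy and full-length share agree with the rounded 22% and 33% reported in the abstract.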

  6. A large scale analysis of cDNA in Arabidopsis thaliana: generation of 12,028 non-redundant expressed sequence tags from normalized and size-selected cDNA libraries.

    PubMed

    Asamizu, E; Nakamura, Y; Sato, S; Tabata, S

    2000-06-30

    For comprehensive analysis of genes expressed in the model dicotyledonous plant, Arabidopsis thaliana, expressed sequence tags (ESTs) were accumulated. Normalized and size-selected cDNA libraries were constructed from aboveground organs, flower buds, roots, green siliques and liquid-cultured seedlings, respectively, and a total of 14,026 5'-end ESTs and 39,207 3'-end ESTs were obtained. The 3'-end ESTs could be clustered into 12,028 non-redundant groups. Similarity search of the non-redundant ESTs against the public non-redundant protein database indicated that 4816 groups show similarity to genes of known function, 1864 to hypothetical genes, and the remaining 5348 are novel sequences. Gene coverage by the non-redundant ESTs was analyzed using the annotated genomic sequences of approximately 10 Mb on chromosomes 3 and 5. A total of 923 regions were hit by at least one EST, among which only 499 regions were hit by the ESTs deposited in the public database. The result indicates that the EST source generated in this project complements the EST data in the public database and facilitates new gene discovery.

  7. Analysis of the Nicotiana tabacum Stigma/Style Transcriptome Reveals Gene Expression Differences between Wet and Dry Stigma Species

    PubMed Central

    Quiapim, Andréa C.; Brito, Michael S.; Bernardes, Luciano A.S.; daSilva, Idalete; Malavazi, Iran; DePaoli, Henrique C.; Molfetta-Machado, Jeanne B.; Giuliatti, Silvana; Goldman, Gustavo H.; Goldman, Maria Helena S.

    2009-01-01

    The success of plant reproduction depends on pollen-pistil interactions occurring at the stigma/style. These interactions vary depending on the stigma type: wet or dry. Tobacco (Nicotiana tabacum) represents a model of wet stigma, and its stigmas/styles express genes to accomplish the appropriate functions. For a large-scale study of gene expression during tobacco pistil development and preparation for pollination, we generated 11,216 high-quality expressed sequence tags (ESTs) from stigmas/styles and created the TOBEST database. These ESTs were assembled in 6,177 clusters, from which 52.1% are pistil transcripts/genes of unknown function. The 21 clusters with the highest number of ESTs (putative higher expression levels) correspond to genes associated with defense mechanisms or pollen-pistil interactions. The database analysis unraveled tobacco sequences homologous to the Arabidopsis (Arabidopsis thaliana) genes involved in specifying pistil identity or determining normal pistil morphology and function. Additionally, 782 independent clusters were examined by macroarray, revealing 46 stigma/style preferentially expressed genes. Real-time reverse transcription-polymerase chain reaction experiments validated the pistil-preferential expression for nine out of 10 genes tested. A search for these 46 genes in the Arabidopsis pistil data sets demonstrated that only 11 sequences, with putative equivalent molecular functions, are expressed in this dry stigma species. The reverse search for the Arabidopsis pistil genes in the TOBEST exposed a partial overlap between these dry and wet stigma transcriptomes. The TOBEST represents the most extensive survey of gene expression in the stigmas/styles of wet stigma plants, and our results indicate that wet and dry stigmas/styles express common as well as distinct genes in preparation for the pollination process. PMID:19052150

  8. Characterization and improvement of RNA-Seq precision in quantitative transcript expression profiling.

    PubMed

    Łabaj, Paweł P; Leparc, Germán G; Linggi, Bryan E; Markillie, Lye Meng; Wiley, H Steven; Kreil, David P

    2011-07-01

    Measurement precision determines the power of any analysis to reliably identify significant signals, such as in screens for differential expression, independent of whether the experimental design incorporates replicates or not. With the compilation of large-scale RNA-Seq datasets with technical replicate samples, however, we can now, for the first time, perform a systematic analysis of the precision of expression level estimates from massively parallel sequencing technology. This then allows considerations for its improvement by computational or experimental means. We report on a comprehensive study of target identification and measurement precision, including their dependence on transcript expression levels, read depth and other parameters. In particular, an impressive recall of 84% of the estimated true transcript population could be achieved with 331 million 50 bp reads, with diminishing returns from longer read lengths and even smaller gains from increased sequencing depths. Most of the measurement power (75%) is spent on only 7% of the known transcriptome, however, making less strongly expressed transcripts harder to measure. Consequently, <30% of all transcripts could be quantified reliably with a relative error <20%. Based on established tools, we then introduce a new approach for mapping and analysing sequencing reads that yields substantially improved performance in gene expression profiling, increasing the number of transcripts that can reliably be quantified to over 40%. Extrapolations to higher sequencing depths highlight the need for efficient complementary steps. In the discussion we outline possible experimental and computational strategies for further improvements in quantification precision. Contact: rnaseq10@boku.ac.at
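
    The abstract does not say how relative error is computed. As a rough illustration only, the sketch below uses the coefficient of variation across technical replicates as a stand-in and applies the 20% cut-off mentioned above. The transcript names and read counts are invented for the example and are not data from the study.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (a simple precision proxy)."""
    m = mean(values)
    return float("inf") if m == 0 else stdev(values) / m


# Hypothetical read counts for three transcripts in three technical replicates.
replicate_counts = {
    "tx_high": [10500, 10230, 10410],
    "tx_mid": [240, 255, 231],
    "tx_low": [3, 9, 1],
}

cv = {name: coefficient_of_variation(counts) for name, counts in replicate_counts.items()}

# Transcripts whose replicate-to-replicate variation stays under the 20% cut-off.
reliable = [name for name, value in cv.items() if value < 0.20]

for name, value in cv.items():
    print(f"{name}: CV = {value:.2f}")
print("reliably quantified:", reliable)
```

    In this toy data the weakly expressed transcript fails the cut-off, which mirrors the abstract's point that less strongly expressed transcripts are harder to measure.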

  9. Using scale and feather traits for module construction provides a functional approach to chicken epidermal development.

    PubMed

    Bao, Weier; Greenwold, Matthew J; Sawyer, Roger H

    2017-11-01

    Gene co-expression network analysis has been a research method widely used in systematically exploring gene function and interaction. Using the Weighted Gene Co-expression Network Analysis (WGCNA) approach to construct a gene co-expression network using data from a customized 44K microarray transcriptome of chicken epidermal embryogenesis, we have identified two distinct modules that are highly correlated with scale or feather development traits. Signaling pathways related to feather development were enriched in the traditional KEGG pathway analysis and functional terms relating specifically to embryonic epidermal development were also enriched in the Gene Ontology analysis. Significant enrichment annotations were discovered from customized enrichment tools such as Modular Single-Set Enrichment Test (MSET) and Medical Subject Headings (MeSH). Hub genes in both trait-correlated modules showed strong specific functional enrichment toward epidermal development. Also, regulatory elements, such as transcription factors and miRNAs, were targeted in the significant enrichment result. This work highlights the advantage of this methodology for functional prediction of genes not previously associated with scale- and feather trait-related modules.
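
    The core idea behind weighted co-expression networks can be sketched briefly: pairwise correlations between gene profiles are raised to a soft-threshold power to give edge weights, and genes with high summed weight are hub candidates. The sketch below is a simplified illustration in plain Python, not the WGCNA package. It omits topological overlap and module detection by clustering, and the toy data, the power of 6, and all function names are assumptions made for the example.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equally long profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def soft_threshold_adjacency(profiles, beta=6.0):
    """Unsigned weighted adjacency: |correlation| ** beta, zero on the diagonal."""
    n = len(profiles)
    adjacency = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            weight = abs(pearson(profiles[i], profiles[j])) ** beta
            adjacency[i][j] = adjacency[j][i] = weight
    return adjacency


# Toy data: genes 0-2 follow a shared driver (a "module"); genes 3-5 are noise.
random.seed(0)
n_samples = 30
driver = [random.gauss(0, 1) for _ in range(n_samples)]
module_genes = [[d + 0.2 * random.gauss(0, 1) for d in driver] for _ in range(3)]
noise_genes = [[random.gauss(0, 1) for _ in range(n_samples)] for _ in range(3)]
profiles = module_genes + noise_genes

adjacency = soft_threshold_adjacency(profiles)
connectivity = [sum(row) for row in adjacency]  # summed edge weight per gene
hub = max(range(len(connectivity)), key=connectivity.__getitem__)
print("connectivity:", [round(k, 3) for k in connectivity])
print("hub candidate: gene", hub)
```

    Raising correlations to a power suppresses weak, likely spurious links while keeping strong ones, so the co-regulated genes stand out by connectivity without a hard cut-off.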

  10. Toward server-side, high performance climate change data analytics in the Earth System Grid Federation (ESGF) eco-system

    NASA Astrophysics Data System (ADS)

    Fiore, Sandro; Williams, Dean; Aloisio, Giovanni

    2016-04-01

    In many scientific domains such as climate, data is often n-dimensional and requires tools that support specialized data types and primitives to be properly stored, accessed, analysed and visualized. Moreover, new challenges arise in large-scale scenarios and eco-systems where petabytes (PB) of data can be available and data can be distributed and/or replicated (e.g., the Earth System Grid Federation (ESGF) serving the Coupled Model Intercomparison Project, Phase 5 (CMIP5) experiment, providing access to 2.5PB of data for the Intergovernmental Panel on Climate Change (IPCC) Fifth Assessment Report (AR5)). Most of the tools currently available for scientific data analysis in the climate domain fail at large scale since they: (1) are desktop based and need the data locally; (2) are sequential, so do not benefit from available multicore/parallel machines; (3) do not provide declarative languages to express scientific data analysis tasks; (4) are domain-specific, which ties their adoption to a specific domain; and (5) do not provide workflow support to enable the definition of complex "experiments". The Ophidia project aims at facing most of the challenges highlighted above by providing a big data analytics framework for eScience. Ophidia provides declarative, server-side, and parallel data analysis, jointly with an internal storage model able to efficiently deal with multidimensional data and a hierarchical data organization to manage large data volumes ("datacubes"). The project relies on a strong background of high performance database management and OLAP systems to manage large scientific data sets. It also provides native workflow management support, to define processing chains and workflows with tens to hundreds of data analytics operators to build real scientific use cases. With regard to interoperability aspects, the talk will present the contribution provided both to the RDA Working Group on Array Databases, and the Earth System Grid Federation (ESGF) Compute Working Team. Also highlighted will be the results of large scale climate model intercomparison data analysis experiments, for example: (1) defined in the context of the EU H2020 INDIGO-DataCloud project; (2) implemented in a real geographically distributed environment involving CMCC (Italy) and LLNL (US) sites; (3) exploiting Ophidia as server-side, parallel analytics engine; and (4) applied on real CMIP5 data sets available through ESGF.

  11. Field-aligned currents' scale analysis performed with the Swarm constellation

    NASA Astrophysics Data System (ADS)

    Lühr, Hermann; Park, Jaeheung; Gjerloev, Jesper W.; Rauberg, Jan; Michaelis, Ingo; Merayo, Jose M. G.; Brauer, Peter

    2015-01-01

    We present a statistical study of the temporal- and spatial-scale characteristics of different field-aligned current (FAC) types derived with the Swarm satellite formation. We divide FACs into two classes: small-scale, up to some 10 km, which are carried predominantly by kinetic Alfvén waves, and large-scale FACs with sizes of more than 150 km. For determining temporal variability we consider measurements at the same point, the orbital crossovers near the poles, but at different times. From correlation analysis we obtain a persistent period of small-scale FACs of order 10 s, while large-scale FACs can be regarded stationary for more than 60 s. For the first time we investigate the longitudinal scales. Large-scale FACs are different on dayside and nightside. On the nightside the longitudinal extension is on average 4 times the latitudinal width, while on the dayside, particularly in the cusp region, latitudinal and longitudinal scales are comparable.

  12. Secondary Analysis of Large-Scale Assessment Data: An Alternative to Variable-Centred Analysis

    ERIC Educational Resources Information Center

    Chow, Kui Foon; Kennedy, Kerry John

    2014-01-01

    International large-scale assessments are now part of the educational landscape in many countries and often feed into major policy decisions. Yet, such assessments also provide data sets for secondary analysis that can address key issues of concern to educators and policymakers alike. Traditionally, such secondary analyses have been based on a…

  13. The connection-set algebra--a novel formalism for the representation of connectivity structure in neuronal network models.

    PubMed

    Djurfeldt, Mikael

    2012-07-01

    The connection-set algebra (CSA) is a novel and general formalism for the description of connectivity in neuronal network models, from small-scale to large-scale structure. The algebra provides operators to form more complex sets of connections from simpler ones and also provides parameterization of such sets. CSA is expressive enough to describe a wide range of connection patterns, including multiple types of random and/or geometrically dependent connectivity, and can serve as a concise notation for network structure in scientific writing. CSA implementations allow for scalable and efficient representation of connectivity in parallel neuronal network simulators and could even allow for avoiding explicit representation of connections in computer memory. The expressiveness of CSA makes prototyping of network structure easy. A C++ version of the algebra has been implemented and used in a large-scale neuronal network simulation (Djurfeldt et al., IBM J Res Dev 52(1/2):31-42, 2008b) and an implementation in Python has been publicly released.
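
    The idea of building complex connection sets from simpler ones can be illustrated with ordinary Python sets of (source, target) pairs. This is a toy sketch over small finite populations, not the released CSA implementation or its API. The function names, the population size, and the connection probability are all assumptions chosen for the example.

```python
import random
from itertools import product


def all_to_all(sources, targets):
    """Every source connected to every target."""
    return set(product(sources, targets))


def one_to_one(sources, targets):
    """Each index connected only to the same index in the target population."""
    target_set = set(targets)
    return {(i, i) for i in sources if i in target_set}


def random_mask(sources, targets, p, seed=0):
    """Keep each candidate pair independently with probability p."""
    rng = random.Random(seed)
    return {pair for pair in product(sources, targets) if rng.random() < p}


population = range(4)

# Set difference: all-to-all connectivity without self-connections.
no_self = all_to_all(population, population) - one_to_one(population, population)

# Set intersection: random sparse connectivity that also excludes self-connections.
sparse = no_self & random_mask(population, population, p=0.5)

print(len(no_self), "connections without self-loops")
print(sorted(sparse))
```

    Explicit sets like these grow with the square of the population size, which is why the abstract's remark about avoiding explicit representation of connections in memory matters at large scale.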

  14. Equivalent Electromagnetic Constants for Microwave Application to Composite Materials for the Multi-Scale Problem

    PubMed Central

    Fujisaki, Keisuke; Ikeda, Tomoyuki

    2013-01-01

    To connect different scale models in the multi-scale problem of microwave use, equivalent material constants were researched numerically by a three-dimensional electromagnetic field, taking into account eddy current and displacement current. A volume averaged method and a standing wave method were used to introduce the equivalent material constants; water particles and aluminum particles are used as composite materials. Consumed electrical power is used for the evaluation. Water particles have the same equivalent material constants for both methods; the same electrical power is obtained for both the precise model (micro-model) and the homogeneous model (macro-model). However, aluminum particles have dissimilar equivalent material constants for both methods; different electric power is obtained for both models. The varying electromagnetic phenomena are derived from the expression of eddy current. For small electrical conductivity such as water, the macro-current which flows in the macro-model and the micro-current which flows in the micro-model express the same electromagnetic phenomena. However, for large electrical conductivity such as aluminum, the macro-current and micro-current express different electromagnetic phenomena. The eddy current which is observed in the micro-model is not expressed by the macro-model. Therefore, the equivalent material constant derived from the volume averaged method and the standing wave method is applicable to water with a small electrical conductivity, although not applicable to aluminum with a large electrical conductivity. PMID:28788395

  15. A Java program for LRE-based real-time qPCR that enables large-scale absolute quantification.

    PubMed

    Rutledge, Robert G

    2011-03-02

    Linear regression of efficiency (LRE) introduced a new paradigm for real-time qPCR that enables large-scale absolute quantification by eliminating the need for standard curves. Developed through the application of sigmoidal mathematics to SYBR Green I-based assays, target quantity is derived directly from fluorescence readings within the central region of an amplification profile. However, a major challenge of implementing LRE quantification is the labor intensive nature of the analysis. Utilizing the extensive resources that are available for developing Java-based software, the LRE Analyzer was written using the NetBeans IDE, and is built on top of the modular architecture and windowing system provided by the NetBeans Platform. This fully featured desktop application determines the number of target molecules within a sample with little or no intervention by the user, in addition to providing extensive database capabilities. MS Excel is used to import data, allowing LRE quantification to be conducted with any real-time PCR instrument that provides access to the raw fluorescence readings. An extensive help set also provides an in-depth introduction to LRE, in addition to guidelines on how to implement LRE quantification. The LRE Analyzer provides the automated analysis and data storage capabilities required by large-scale qPCR projects wanting to exploit the many advantages of absolute quantification. Foremost is the universal perspective afforded by absolute quantification, which among other attributes, provides the ability to directly compare quantitative data produced by different assays and/or instruments. Furthermore, absolute quantification has important implications for gene expression profiling in that it provides the foundation for comparing transcript quantities produced by any gene with any other gene, within and between samples.

  16. A Java Program for LRE-Based Real-Time qPCR that Enables Large-Scale Absolute Quantification

    PubMed Central

    Rutledge, Robert G.

    2011-01-01

    Background: Linear regression of efficiency (LRE) introduced a new paradigm for real-time qPCR that enables large-scale absolute quantification by eliminating the need for standard curves. Developed through the application of sigmoidal mathematics to SYBR Green I-based assays, target quantity is derived directly from fluorescence readings within the central region of an amplification profile. However, a major challenge of implementing LRE quantification is the labor intensive nature of the analysis. Findings: Utilizing the extensive resources that are available for developing Java-based software, the LRE Analyzer was written using the NetBeans IDE, and is built on top of the modular architecture and windowing system provided by the NetBeans Platform. This fully featured desktop application determines the number of target molecules within a sample with little or no intervention by the user, in addition to providing extensive database capabilities. MS Excel is used to import data, allowing LRE quantification to be conducted with any real-time PCR instrument that provides access to the raw fluorescence readings. An extensive help set also provides an in-depth introduction to LRE, in addition to guidelines on how to implement LRE quantification. Conclusions: The LRE Analyzer provides the automated analysis and data storage capabilities required by large-scale qPCR projects wanting to exploit the many advantages of absolute quantification. Foremost is the universal perspective afforded by absolute quantification, which among other attributes, provides the ability to directly compare quantitative data produced by different assays and/or instruments. Furthermore, absolute quantification has important implications for gene expression profiling in that it provides the foundation for comparing transcript quantities produced by any gene with any other gene, within and between samples. PMID:21407812

  17. Systematic Analysis of Zn2Cys6 Transcription Factors Required for Development and Pathogenicity by High-Throughput Gene Knockout in the Rice Blast Fungus

    PubMed Central

    Huang, Pengyun; Lin, Fucheng

    2014-01-01

    Because of the great challenges and workload involved in deleting genes on a large scale, the functions of most genes in pathogenic fungi are still unclear. In this study, we developed a high-throughput gene knockout system using a novel yeast-Escherichia-Agrobacterium shuttle vector, pKO1B, in the rice blast fungus Magnaporthe oryzae. Using this method, we deleted 104 fungal-specific Zn2Cys6 transcription factor (TF) genes in M. oryzae. We then analyzed the phenotypes of these mutants with regard to growth, asexual and infection-related development, pathogenesis, and 9 abiotic stresses. The resulting data provide new insights into how this rice pathogen of global significance regulates important traits in the infection cycle through Zn2Cys6 TF genes. A large variation in biological functions of Zn2Cys6 TF genes was observed under the conditions tested. Sixty-one of 104 Zn2Cys6 TF genes were found to be required for fungal development. In-depth analysis of TF genes revealed that TF genes involved in pathogenicity frequently tend to function in multiple development stages, and disclosed many highly conserved but unidentified functional TF genes of importance in the fungal kingdom. We further found that the virulence-required TF genes GPF1 and CNF2 have similar regulation mechanisms in the gene expression involved in pathogenicity. These experimental validations clearly demonstrated the value of a high-throughput gene knockout system in understanding the biological functions of genes on a genome scale in fungi, and provided a solid foundation for elucidating the gene expression network that regulates the development and pathogenicity of M. oryzae. PMID:25299517

  18. Large-scale gene-centric analysis identifies novel variants for coronary artery disease.

    PubMed

    2011-09-01

    Coronary artery disease (CAD) has a significant genetic contribution that is incompletely characterized. To complement genome-wide association (GWA) studies, we conducted a large and systematic candidate gene study of CAD susceptibility, including analysis of many uncommon and functional variants. We examined 49,094 genetic variants in ∼2,100 genes of cardiovascular relevance, using a customised gene array in 15,596 CAD cases and 34,992 controls (11,202 cases and 30,733 controls of European descent; 4,394 cases and 4,259 controls of South Asian origin). We attempted to replicate putative novel associations in an additional 17,121 CAD cases and 40,473 controls. Potential mechanisms through which the novel variants could affect CAD risk were explored through association tests with vascular risk factors and gene expression. We confirmed associations of several previously known CAD susceptibility loci (eg, 9p21.3: $p<10^{-33}$; LPA: $p<10^{-19}$; 1p13.3: $p<10^{-17}$) as well as three recently discovered loci (COL4A1/COL4A2, ZC3HC1, CYP17A1: $p<5\times10^{-7}$). However, we found essentially null results for most previously suggested CAD candidate genes. In our replication study of 24 promising common variants, we identified novel associations of variants in or near LIPA, IL5, TRIB1, and ABCG5/ABCG8, with per-allele odds ratios for CAD risk with each of the novel variants ranging from 1.06-1.09. Associations with variants at LIPA, TRIB1, and ABCG5/ABCG8 were supported by gene expression data or effects on lipid levels. Apart from the previously reported variants in LPA, none of the other ∼4,500 low frequency and functional variants showed a strong effect. Associations in South Asians did not differ appreciably from those in Europeans, except for 9p21.3 (per-allele odds ratio: 1.14 versus 1.27 respectively; P for heterogeneity = 0.003). This large-scale gene-centric analysis has identified several novel genes for CAD that relate to diverse biochemical and cellular functions and clarified the literature with regard to many previously suggested genes.

  19. Large-scale retrieval for medical image analytics: A comprehensive review.

    PubMed

    Li, Zhongyu; Zhang, Xiaofan; Müller, Henning; Zhang, Shaoting

    2018-01-01

    Over the past decades, medical image analytics has been greatly facilitated by the explosion of digital imaging techniques, where huge amounts of medical images were produced with ever-increasing quality and diversity. However, conventional methods for analyzing medical images have achieved limited success, as they are not capable of handling the huge amount of image data. In this paper, we review state-of-the-art approaches for large-scale medical image analysis, which are mainly based on recent advances in computer vision, machine learning and information retrieval. Specifically, we first present the general pipeline of large-scale retrieval and summarize the challenges and opportunities of medical image analytics on a large scale. Then, we provide a comprehensive review of algorithms and techniques relevant to major processes in the pipeline, including feature representation, feature indexing, searching, etc. On the basis of existing work, we introduce the evaluation protocols and multiple applications of large-scale medical image retrieval, with a variety of exploratory and diagnostic scenarios. Finally, we discuss future directions of large-scale retrieval, which can further improve the performance of medical image analysis. Copyright © 2017 Elsevier B.V. All rights reserved.

  20. Fast and Scalable Gaussian Process Modeling with Applications to Astronomical Time Series

    NASA Astrophysics Data System (ADS)

    Foreman-Mackey, Daniel; Agol, Eric; Ambikasaran, Sivaram; Angus, Ruth

    2017-12-01

    The growing field of large-scale time domain astronomy requires methods for probabilistic data analysis that are computationally tractable, even with large data sets. Gaussian processes (GPs) are a popular class of models used for this purpose, but since the computational cost scales, in general, as the cube of the number of data points, their application has been limited to small data sets. In this paper, we present a novel method for GP modeling in one dimension where the computational requirements scale linearly with the size of the data set. We demonstrate the method by applying it to simulated and real astronomical time series data sets. These demonstrations are examples of probabilistic inference of stellar rotation periods, asteroseismic oscillation spectra, and transiting planet parameters. The method exploits structure in the problem when the covariance function is expressed as a mixture of complex exponentials, without requiring evenly spaced observations or uniform noise. This form of covariance arises naturally when the process is a mixture of stochastically driven damped harmonic oscillators (providing a physical motivation for and interpretation of this choice), but we also demonstrate that it can be a useful effective model in some other cases. We present a mathematical description of the method and compare it to existing scalable GP methods. The method is fast and interpretable, with a range of potential applications within astronomical data analysis and beyond. We provide well-tested and documented open-source implementations of this method in C++, Python, and Julia.
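
    To illustrate how an exponential covariance admits linear-time GP evaluation, here is a minimal sketch restricted to the simplest case: a single real exponential term k(tau) = a * exp(-c * |tau|) plus white noise of variance sigma2. It uses a one-term semiseparable Cholesky recursion; the function name and parameters are hypothetical, and this is not the authors' released implementation, which handles general mixtures of complex exponentials.

```python
import math


def gp_loglike_exponential(t, y, a, c, sigma2):
    """O(N) Gaussian log-likelihood for a zero-mean GP with kernel
    k(tau) = a * exp(-c * |tau|) plus white noise of variance sigma2.
    The times t must be sorted in increasing order (spacing may be uneven)."""
    n = len(t)
    diag = a + sigma2                 # diagonal entry of the covariance matrix
    d = diag                          # D_1: first pivot of the L D L^T factor
    w = 1.0 / d                       # W_1
    z = y[0]                          # z_1: first element of L^{-1} y
    s = 0.0                           # running scalar carrying the factor's history
    f = 0.0                           # running scalar carrying the solve's history
    loglike = -0.5 * (z * z / d + math.log(d))
    for i in range(1, n):
        phi = math.exp(-c * (t[i] - t[i - 1]))
        # Propagate with the previous D, W and z before overwriting them.
        s = phi * phi * (s + d * w * w)
        f = phi * (f + w * z)
        d = diag - a * a * s
        w = (1.0 - a * s) / d
        z = y[i] - a * f
        loglike -= 0.5 * (z * z / d + math.log(d))
    return loglike - 0.5 * n * math.log(2.0 * math.pi)


# Hypothetical, unevenly sampled time series.
times = [0.0, 0.3, 1.1, 1.5, 2.7]
values = [0.2, -0.1, 0.4, 0.3, -0.5]
ll = gp_loglike_exponential(times, values, a=1.0, c=0.5, sigma2=0.05)
```

    Each data point costs a fixed number of operations, so the total cost grows linearly with N, in contrast to the cubic cost of factorizing the dense covariance matrix.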

  1. Hidden among the crowd: differential DNA methylation-expression correlations in cancer occur at important oncogenic pathways.

    PubMed

    Mosquera Orgueira, Adrián

    2015-01-01

    DNA methylation is a frequent epigenetic mechanism that participates in transcriptional repression. Variations in DNA methylation with respect to gene expression are constant, and, for unknown reasons, some genes with highly methylated promoters are sometimes overexpressed. In this study we have analyzed the expression and methylation patterns of thousands of genes in five groups of cancer and normal tissue samples in order to determine local and genome-wide differences. We observed significant changes in global methylation-expression correlation in all the neoplasms, which suggests that differential correlation events are frequent in cancer. A focused analysis in the breast cancer cohort identified 1662 genes whose correlation varies significantly between normal and cancerous breast, but whose DNA methylation and gene expression patterns do not change substantially. These genes were enriched in cancer-related pathways and repressive chromatin features across various model cell lines, such as PRC2 binding and H3K27me3 marks. Substantial changes in methylation-expression correlation indicate that these genes are subject to epigenetic remodeling, where the differential activity of other factors breaks the expected relationship between both variables. Our findings suggest a complex regulatory landscape where a redistribution of local and large-scale chromatin repressive domains at differentially correlated genes (DCGs) creates epigenetic hotspots that modulate cancer-specific gene expression.

  2. The analysis of soil cores polluted with certain metals using the Box-Cox transformation.

    PubMed

    Meloun, Milan; Sánka, Milan; Nemec, Pavel; Krítková, Sona; Kupka, Karel

    2005-09-01

    To define the soil properties for a given area or country including the level of pollution, soil survey and inventory programs are essential tools. Soil data transformations enable the expression of the original data on a new scale, more suitable for data analysis. In the computer-aided interactive analysis of large data files of soil characteristics containing outliers, the diagnostic plots of the exploratory data analysis (EDA) often find that the sample distribution is systematically skewed or reject sample homogeneity. Under such circumstances the original data should be transformed. The Box-Cox transformation improves sample symmetry and stabilizes spread. The logarithmic plot of a profile likelihood function enables the optimum transformation parameter to be found. Here, a proposed procedure for data transformation in univariate data analysis is illustrated on a determination of cadmium content in the plough zone of agricultural soils. A typical soil pollution survey concerns the determination of the elements Be (16 544 values available), Cd (40 317 values), Co (22 176 values), Cr (40 318 values), Hg (32 344 values), Ni (34 989 values), Pb (40 344 values), V (20 373 values) and Zn (36 123 values) in large samples.
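
    The transformation and the profile-likelihood search for its parameter can be sketched as follows. This is a minimal illustration using the standard Box-Cox formulas with a simple grid search; the function names, the grid, and the cadmium values are made up for the example and are not taken from the survey.

```python
import math


def box_cox(x, lam):
    """Box-Cox transform of positive values: log(x) at lam == 0,
    (x**lam - 1) / lam otherwise."""
    if lam == 0:
        return [math.log(v) for v in x]
    return [(v ** lam - 1.0) / lam for v in x]


def profile_loglike(x, lam):
    """Profile log-likelihood of lam (up to an additive constant):
    -n/2 * log(variance of transformed data) + (lam - 1) * sum(log x)."""
    y = box_cox(x, lam)
    n = len(y)
    mean = sum(y) / n
    var = sum((v - mean) ** 2 for v in y) / n
    return -0.5 * n * math.log(var) + (lam - 1.0) * sum(math.log(v) for v in x)


def best_lambda(x, grid):
    """Return the grid value that maximises the profile log-likelihood."""
    return max(grid, key=lambda lam: profile_loglike(x, lam))


grid = [i / 10 for i in range(-20, 21)]                  # lambda from -2.0 to 2.0
cd = [0.12, 0.15, 0.18, 0.22, 0.30, 0.41, 0.65, 1.10]    # hypothetical Cd values, mg/kg
lam_hat = best_lambda(cd, grid)
cd_transformed = box_cox(cd, lam_hat)
```

    Plotting `profile_loglike` over the grid gives the likelihood curve whose maximum marks the optimum transformation parameter.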

  3. Gene expression signature of cerebellar hypoplasia in a mouse model of Down syndrome during postnatal development

    PubMed Central

    Laffaire, Julien; Rivals, Isabelle; Dauphinot, Luce; Pasteau, Fabien; Wehrle, Rosine; Larrat, Benoit; Vitalis, Tania; Moldrich, Randal X; Rossier, Jean; Sinkus, Ralph; Herault, Yann; Dusart, Isabelle; Potier, Marie-Claude

    2009-01-01

    Background Down syndrome is a chromosomal disorder caused by the presence of three copies of chromosome 21. The mechanisms by which this aneuploidy produces the complex and variable phenotype observed in people with Down syndrome are still under discussion. Recent studies have demonstrated an increased transcript level of the three-copy genes with some dosage compensation or amplification for a subset of them. The impact of this gene dosage effect on the whole transcriptome is still debated and longitudinal studies assessing the variability among samples, tissues and developmental stages are needed. Results We thus designed a large scale gene expression study in mice (the Ts1Cje Down syndrome mouse model) in which we could measure the effects of trisomy 21 on a large number of samples (74 in total) in a tissue that is affected in Down syndrome (the cerebellum) and where we could quantify the defect during postnatal development in order to correlate gene expression changes to the phenotype observed. Statistical analysis of microarray data revealed a major gene dosage effect: for the three-copy genes as well as for a 2 Mb segment from mouse chromosome 12 that we show for the first time as being deleted in the Ts1Cje mice. This gene dosage effect impacts moderately on the expression of euploid genes (2.4 to 7.5% differentially expressed). Only 13 genes were significantly dysregulated in Ts1Cje mice at all four postnatal development stages studied from birth to 10 days after birth, and among them are 6 three-copy genes. The decrease in granule cell proliferation demonstrated in newborn Ts1Cje cerebellum was correlated with a major gene dosage effect on the transcriptome in dissected cerebellar external granule cell layer. Conclusion High throughput gene expression analysis in the cerebellum of a large number of samples of Ts1Cje and euploid mice has revealed a prevailing gene dosage effect on triplicated genes. 
Moreover using an enriched cell population that is thought responsible for the cerebellar hypoplasia in Down syndrome, a global destabilization of gene expression was not detected. Altogether these results strongly suggest that the three-copy genes are directly responsible for the phenotype present in cerebellum. We provide here a short list of candidate genes. PMID:19331679

  4. Development and analysis of prognostic equations for mesoscale kinetic energy and mesoscale (subgrid scale) fluxes for large-scale atmospheric models

    NASA Technical Reports Server (NTRS)

    Avissar, Roni; Chen, Fei

    1993-01-01

    Mesoscale circulation processes generated by landscape discontinuities (e.g., sea breezes) are not represented in large-scale atmospheric models (e.g., general circulation models), which have an inappropriate grid-scale resolution. With the assumption that atmospheric variables can be separated into large scale, mesoscale, and turbulent scale, a set of prognostic equations applicable in large-scale atmospheric models for momentum, temperature, moisture, and any other gaseous or aerosol material, which includes both mesoscale and turbulent fluxes is developed. Prognostic equations are also developed for these mesoscale fluxes, which indicate a closure problem and, therefore, require a parameterization. For this purpose, the mean mesoscale kinetic energy (MKE) per unit of mass is used, defined as $\tilde{E} = 0.5\,\langle u_i'^2 \rangle$, where $u_i'$ represents the three Cartesian components of a mesoscale circulation (the angle brackets denote the grid-scale, horizontal averaging operator in the large-scale model, and a tilde indicates a corresponding large-scale mean value). A prognostic equation is developed for $\tilde{E}$, and an analysis of the different terms of this equation indicates that the mesoscale vertical heat flux, the mesoscale pressure correlation, and the interaction between turbulence and mesoscale perturbations are the major terms that affect the time tendency of $\tilde{E}$. A state-of-the-art mesoscale atmospheric model is used to investigate the relationship between MKE, landscape discontinuities (as characterized by the spatial distribution of heat fluxes at the earth's surface), and mesoscale sensible and latent heat fluxes in the atmosphere. MKE is compared with turbulence kinetic energy to illustrate the importance of mesoscale processes as compared to turbulent processes. This analysis emphasizes the potential use of MKE to bridge between landscape discontinuities and mesoscale fluxes and, therefore, to parameterize mesoscale fluxes generated by such subgrid-scale landscape discontinuities in large-scale atmospheric models.
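
    As a small worked illustration of the MKE definition, the sketch below computes half the horizontally averaged squared mesoscale perturbation, summed over the three velocity components. It assumes the index i is summed over, that the turbulent part has already been filtered out, and that the mesoscale perturbation is the deviation from the horizontal mean over one large-scale grid cell; the function name and sample values are hypothetical.

```python
def mesoscale_kinetic_energy(u, v, w):
    """Mean mesoscale kinetic energy per unit mass for one large-scale grid cell.

    u, v, w are lists of the three velocity components (m/s) sampled at the
    mesoscale grid points inside the cell, at a single vertical level.
    """
    def mean(values):
        return sum(values) / len(values)

    mke = 0.0
    for component in (u, v, w):
        large_scale_mean = mean(component)
        # Mesoscale perturbation: deviation from the horizontal average.
        squared_perturbations = [(x - large_scale_mean) ** 2 for x in component]
        mke += 0.5 * mean(squared_perturbations)
    return mke


# Hypothetical sea-breeze-like sample: onshore flow varies across the cell.
u = [1.0, 3.0, 5.0, 3.0]
v = [0.5, 0.5, 0.5, 0.5]
w = [0.1, -0.1, 0.1, -0.1]
e_tilde = mesoscale_kinetic_energy(u, v, w)   # units: m^2 / s^2
```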

  5. Molecular Structure-Based Large-Scale Prediction of Chemical-Induced Gene Expression Changes.

    PubMed

    Liu, Ruifeng; AbdulHameed, Mohamed Diwan M; Wallqvist, Anders

    2017-09-25

    The quantitative structure-activity relationship (QSAR) approach has been used to model a wide range of chemical-induced biological responses. However, it had not been utilized to model chemical-induced genomewide gene expression changes until very recently, owing to the complexity of training and evaluating a very large number of models. To address this issue, we examined the performance of a variable nearest neighbor (v-NN) method that uses information on near neighbors conforming to the principle that similar structures have similar activities. Using a data set of gene expression signatures of 13 150 compounds derived from cell-based measurements in the NIH Library of Integrated Network-based Cellular Signatures program, we were able to make predictions for 62% of the compounds in a 10-fold cross validation test, with a correlation coefficient of 0.61 between the predicted and experimentally derived signatures, a reproducibility rivaling that of high-throughput gene expression measurements. To evaluate the utility of the predicted gene expression signatures, we compared the predicted and experimentally derived signatures in their ability to identify drugs known to cause specific liver, kidney, and heart injuries. Overall, the predicted and experimentally derived signatures had similar receiver operating characteristics, whose areas under the curve ranged from 0.71 to 0.77 and 0.70 to 0.73, respectively, across the three organ injury models. However, detailed analyses of enrichment curves indicate that signatures predicted from multiple near neighbors outperformed those derived from experiments, suggesting that averaging information from near neighbors may help improve the signal from gene expression measurements. Our results demonstrate that the v-NN method can serve as a practical approach for modeling large-scale, genomewide, chemical-induced, gene expression changes.
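
    The abstract does not give the v-NN formula, so the following is only a sketch of one plausible form: a distance-weighted average over all training compounds within a cutoff, returning no prediction when no neighbor is close enough (which would explain why predictions cover only part of the data set). The Tanimoto distance on fingerprint bit sets, the Gaussian weighting, and the parameters `d0` and `h` are assumptions for illustration.

```python
import math


def tanimoto_distance(a, b):
    """1 - |a & b| / |a | b| for two sets of fingerprint bit indices."""
    union = len(a | b)
    return 1.0 - len(a & b) / union if union else 0.0


def vnn_predict(query, training, d0=0.6, h=0.3):
    """Distance-weighted average over all neighbours within distance d0.

    training is a list of (fingerprint_set, value) pairs; d0 and h are
    hypothetical cutoff and smoothing parameters. Returns None when no
    training compound is close enough to support a prediction.
    """
    numerator = denominator = 0.0
    for fingerprint, value in training:
        d = tanimoto_distance(query, fingerprint)
        if d <= d0:
            weight = math.exp(-((d / h) ** 2))
            numerator += weight * value
            denominator += weight
    return numerator / denominator if denominator > 0 else None


# Hypothetical fingerprints and expression changes for a single gene.
training = [({1, 2, 3}, 1.0), ({1, 2, 4}, 3.0), ({7, 8, 9}, 100.0)]
near = vnn_predict({1, 2}, training)      # two equally distant neighbours
far = vnn_predict({20, 21}, training)     # nothing within d0
```

    A full expression signature would repeat the same weighted average for every gene, reusing the neighbor weights computed once per query compound.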

  6. Genome-scale analysis identifies GJB2 and ERO1LB as prognosis markers in patients with pancreatic cancer.

    PubMed

    Zhu, Tao; Gao, Yuan-Feng; Chen, Yi-Xin; Wang, Zhi-Bin; Yin, Ji-Ye; Mao, Xiao-Yuan; Li, Xi; Zhang, Wei; Zhou, Hong-Hao; Liu, Zhao-Qian

    2017-03-28

    Pancreatic cancer is a complex and heterogeneous disease with the etiology largely unknown. The deadly nature of pancreatic cancer, with an extremely low 5-year survival rate, renders urgent a better understanding of the molecular events underlying it. The aim of this study is to investigate the gene expression module of pancreatic adenocarcinoma and to identify differentially expressed genes (DEGs) with prognostic potentials. Transcriptome microarray data of five GEO datasets (GSE15471, GSE16515, GSE18670, GSE32676, GSE71989), including 117 primary tumor samples and 73 normal pancreatic tissue samples, were utilized to identify DEGs. The five sets of DEGs had an overlapping subset consisting of 98 genes (90 up-regulated and 8 down-regulated), which were probably common to pancreatic cancer. Gene ontology (GO) analysis of the 98 DEGs showed that cell cycle and cell adhesion were the major enriched processes, and extracellular matrix (ECM)-receptor interaction and p53 signaling pathway were the most enriched pathways according to Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis. Elevated expression of gap junction protein beta 2 (GJB2) and reduced endoplasmic reticulum oxidoreductase 1-like beta (ERO1LB) expression were validated in an independent cohort. Kaplan-Meier survival analysis revealed that GJB2 and ERO1LB levels were significantly associated with the overall survival of pancreatic cancer patients. GJB2 and ERO1LB are implicated in pancreatic cancer progression and can be used to predict patient survival. Therapeutic strategies targeting GJB2 and facilitating ERO1LB expression may deserve evaluation to improve prognosis of pancreatic cancer patients.
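
    For readers unfamiliar with the Kaplan-Meier analysis mentioned above, the product-limit estimator can be sketched as below. This is a generic textbook implementation, not the authors' pipeline; the function name and the follow-up data are hypothetical. In a study like this one, the estimator would be run separately for, say, high- and low-expression groups and the resulting curves compared.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times:  follow-up time for each patient
    events: 1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each time with a death.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Hypothetical follow-up (months); the second patient is censored.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```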

  7. Transcriptomic Analysis Reveals Mechanisms of Sterile and Fertile Flower Differentiation and Development in Viburnum macrocephalum f. keteleeri

    PubMed Central

    Lu, Zhaogeng; Xu, Jing; Li, Weixing; Zhang, Li; Cui, Jiawen; He, Qingsong; Wang, Li; Jin, Biao

    2017-01-01

    Sterile and fertile flowers are an important evolutionary developmental (evo-devo) phenotype in angiosperm flowers, playing important roles in pollinator attraction and sexual reproductive success. However, the gene regulatory mechanisms underlying fertile and sterile flower differentiation and development remain largely unknown. Viburnum macrocephalum f. keteleeri, which possesses fertile and sterile flowers in a single inflorescence, is a useful candidate species for investigating the regulatory networks in differentiation and development. We developed a de novo-assembled flower reference transcriptome. Using RNA sequencing (RNA-seq), we compared the expression patterns of fertile and sterile flowers isolated from the same inflorescence over its rapid developmental stages. The flower reference transcriptome consisted of 105,683 non-redundant transcripts, of which 5,675 transcripts showed significant differential expression between fertile and sterile flowers. Combined with morphological and cytological changes between fertile and sterile flowers, we identified expression changes of many genes potentially involved in reproductive processes, phytohormone signaling, and cell proliferation and expansion using RNA-seq and qRT-PCR. In particular, many transcription factors (TFs), including MADS-box family members and ABCDE-class genes, were identified, and expression changes in TFs involved in multiple functions were analyzed and highlighted to determine their roles in regulating fertile and sterile flower differentiation and development. Our large-scale transcriptional analysis of fertile and sterile flowers revealed the dynamics of transcriptional networks and potentially key components in regulating differentiation and development of fertile and sterile flowers in Viburnum macrocephalum f. keteleeri. 
Our data provide a useful resource for Viburnum transcriptional research and offer insights into gene regulation of differentiation of diverse evo-devo processes in flowers. PMID:28298915

  8. A small-scale turbulence model

    NASA Technical Reports Server (NTRS)

    Lundgren, T. S.

    1992-01-01

    A model for the small-scale structure of turbulence is reformulated in such a way that it may be conveniently computed. The model is an ensemble of randomly oriented structured two-dimensional vortices stretched by an axially symmetric strain flow. The energy spectrum of the resulting flow may be expressed as a time integral involving only the enstrophy spectrum of the time evolving two-dimensional cross section flow, which may be obtained numerically. Examples are given in which a $k^{-5/3}$ spectrum is obtained by this method without using large wave number asymptotic analysis. The $k^{-5/3}$ inertial range spectrum is shown to be related to the existence of a self-similar enstrophy preserving range in the two-dimensional enstrophy spectrum. The results are insensitive to time dependence of the strain-rate, including even intermittent on-or-off strains.

  9. Tools for Large-Scale Mobile Malware Analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bierma, Michael

    Analyzing mobile applications for malicious behavior is an important area of research, and is made difficult, in part, by the increasingly large number of applications available for the major operating systems. There are currently over 1.2 million apps available in both the Google Play and Apple App stores (the respective official marketplaces for the Android and iOS operating systems)[1, 2]. Our research provides two large-scale analysis tools to aid in the detection and analysis of mobile malware. The first tool we present, Andlantis, is a scalable dynamic analysis system capable of processing over 3000 Android applications per hour. Traditionally, Android dynamic analysis techniques have been relatively limited in scale due to the computational resources required to emulate the full Android system to achieve accurate execution. Andlantis is the most scalable Android dynamic analysis framework to date, and is able to collect valuable forensic data, which helps reverse-engineers and malware researchers identify and understand anomalous application behavior. We discuss the results of running 1261 malware samples through the system, and provide examples of malware analysis performed with the resulting data. While techniques exist to perform static analysis on a large number of applications, large-scale analysis of iOS applications has been relatively small scale due to the closed nature of the iOS ecosystem, and the difficulty of acquiring applications for analysis. The second tool we present, iClone, addresses the challenges associated with iOS research in order to detect application clones within a dataset of over 20,000 iOS applications.

  10. HiQuant: Rapid Postquantification Analysis of Large-Scale MS-Generated Proteomics Data.

    PubMed

    Bryan, Kenneth; Jarboui, Mohamed-Ali; Raso, Cinzia; Bernal-Llinares, Manuel; McCann, Brendan; Rauch, Jens; Boldt, Karsten; Lynn, David J

    2016-06-03

    Recent advances in mass-spectrometry-based proteomics are now facilitating ambitious large-scale investigations of the spatial and temporal dynamics of the proteome; however, the increasing size and complexity of these data sets is overwhelming current downstream computational methods, specifically those that support the postquantification analysis pipeline. Here we present HiQuant, a novel application that enables the design and execution of a postquantification workflow, including common data-processing steps, such as assay normalization and grouping, and experimental replicate quality control and statistical analysis. HiQuant also enables the interpretation of results generated from large-scale data sets by supporting interactive heatmap analysis and also the direct export to Cytoscape and Gephi, two leading network analysis platforms. HiQuant may be run via a user-friendly graphical interface and also supports complete one-touch automation via a command-line mode. We evaluate HiQuant's performance by analyzing a large-scale, complex interactome mapping data set and demonstrate a 200-fold improvement in the execution time over current methods. We also demonstrate HiQuant's general utility by analyzing proteome-wide quantification data generated from both a large-scale public tyrosine kinase siRNA knock-down study and an in-house investigation into the temporal dynamics of the KSR1 and KSR2 interactomes. Download HiQuant, sample data sets, and supporting documentation at http://hiquant.primesdb.eu .

  11. Global Analysis of Transcriptome Responses and Gene Expression Profiles to Cold Stress of Jatropha curcas L.

    PubMed Central

    Wang, Haibo; Zou, Zhurong; Wang, Shasha; Gong, Ming

    2013-01-01

    Background Jatropha curcas L., also called the Physic nut, is an oil-rich shrub with multiple uses, including biodiesel production, and is currently exploited as a renewable energy resource in many countries. Nevertheless, because of its origin from the tropical MidAmerican zone, J. curcas confers an inherent but undesirable characteristic (low cold resistance) that may seriously restrict its large-scale popularization. This adaptive flaw can be genetically improved by elucidating the mechanisms underlying plant tolerance to cold temperatures. The newly developed Illumina Hiseq™ 2000 RNA-seq and Digital Gene Expression (DGE) are deep high-throughput approaches for gene expression analysis at the transcriptome level, using which we carefully investigated the gene expression profiles in response to cold stress to gain insight into the molecular mechanisms of cold response in J. curcas. Results In total, 45,251 unigenes were obtained by assembly of clean data generated by RNA-seq analysis of the J. curcas transcriptome. A total of 33,363 and 912 complete or partial coding sequences (CDSs) were determined by protein database alignments and ESTScan prediction, respectively. Among these unigenes, more than 41.52% were involved in approximately 128 known metabolic or signaling pathways, and 4,185 were possibly associated with cold resistance. DGE analysis was used to assess the changes in gene expression when exposed to cold condition (12°C) for 12, 24, and 48 h. The results showed that 3,178 genes were significantly upregulated and 1,244 were downregulated under cold stress. These genes were then functionally annotated based on the transcriptome data from RNA-seq analysis. Conclusions This study provides a global view of transcriptome response and gene expression profiling of J. curcas in response to cold stress. 
The results can help improve our current understanding of the mechanisms underlying plant cold resistance and favor the screening of crucial genes for genetically enhancing cold resistance in J. curcas. PMID:24349370

  12. Global analysis of transcriptome responses and gene expression profiles to cold stress of Jatropha curcas L.

    PubMed

    Wang, Haibo; Zou, Zhurong; Wang, Shasha; Gong, Ming

    2013-01-01

    Jatropha curcas L., also called the Physic nut, is an oil-rich shrub with multiple uses, including biodiesel production, and is currently exploited as a renewable energy resource in many countries. Nevertheless, because of its origin from the tropical MidAmerican zone, J. curcas confers an inherent but undesirable characteristic (low cold resistance) that may seriously restrict its large-scale popularization. This adaptive flaw can be genetically improved by elucidating the mechanisms underlying plant tolerance to cold temperatures. The newly developed Illumina Hiseq™ 2000 RNA-seq and Digital Gene Expression (DGE) are deep high-throughput approaches for gene expression analysis at the transcriptome level, using which we carefully investigated the gene expression profiles in response to cold stress to gain insight into the molecular mechanisms of cold response in J. curcas. In total, 45,251 unigenes were obtained by assembly of clean data generated by RNA-seq analysis of the J. curcas transcriptome. A total of 33,363 and 912 complete or partial coding sequences (CDSs) were determined by protein database alignments and ESTScan prediction, respectively. Among these unigenes, more than 41.52% were involved in approximately 128 known metabolic or signaling pathways, and 4,185 were possibly associated with cold resistance. DGE analysis was used to assess the changes in gene expression when exposed to cold condition (12°C) for 12, 24, and 48 h. The results showed that 3,178 genes were significantly upregulated and 1,244 were downregulated under cold stress. These genes were then functionally annotated based on the transcriptome data from RNA-seq analysis. This study provides a global view of transcriptome response and gene expression profiling of J. curcas in response to cold stress. 
The results can help improve our current understanding of the mechanisms underlying plant cold resistance and favor the screening of crucial genes for genetically enhancing cold resistance in J. curcas.

  13. Ascertaining Validity in the Abstract Realm of PMESII Simulation Models: An Analysis of the Peace Support Operations Model (PSOM)

    DTIC Science & Technology

    2009-06-01

    simulation is the campaign-level Peace Support Operations Model (PSOM). This thesis provides a quantitative analysis of PSOM. The results are based ... multiple potential outcomes, further development and analysis is required before the model is used for large scale analysis.

  14. Large scale atmospheric waves in the Venus mesosphere as seen by the VeRa Radio Science instrument on Venus Express

    NASA Astrophysics Data System (ADS)

    Tellmann, S.; Häusler, B.; Hinson, D. P.; Tyler, G. L.; Andert, T. P.; Bird, M. K.; Imamura, T.; Pätzold, M.; Remus, S.

    2014-04-01

    Atmospheric waves on almost all spatial scales have been observed in the Venus atmosphere in various atmospheric regions. They play a crucial role in the redistribution of energy, momentum, and atmospheric constituents and are thought to be involved in the development and maintenance of the atmospheric superrotation.

  15. A large-scale perspective on stress-induced alterations in resting-state networks

    NASA Astrophysics Data System (ADS)

    Maron-Katz, Adi; Vaisvaser, Sharon; Lin, Tamar; Hendler, Talma; Shamir, Ron

    2016-02-01

    Stress is known to induce large-scale neural modulations. However, its neural effect once the stressor is removed and how it relates to subjective experience are not fully understood. Here we used a statistically sound data-driven approach to investigate alterations in large-scale resting-state functional connectivity (rsFC) induced by acute social stress. We compared rsfMRI profiles of 57 healthy male subjects before and after stress induction. Using a parcellation-based univariate statistical analysis, we identified a large-scale rsFC change, involving 490 parcel-pairs. Aiming to characterize this change, we employed statistical enrichment analysis, identifying anatomic structures that were significantly interconnected by these pairs. This analysis revealed strengthening of thalamo-cortical connectivity and weakening of cross-hemispheral parieto-temporal connectivity. These alterations were further found to be associated with change in subjective stress reports. Integrating report-based information on stress sustainment 20 minutes post induction revealed a single significant rsFC change between the right amygdala and the precuneus, which inversely correlated with the level of subjective recovery. Our study demonstrates the value of enrichment analysis for exploring large-scale network reorganization patterns, and provides new insight into stress-induced neural modulations and their relation to subjective experience.

  16. Virus-Induced Gene Silencing Using Tobacco Rattle Virus as a Tool to Study the Interaction between Nicotiana attenuata and Rhizophagus irregularis.

    PubMed

    Groten, Karin; Pahari, Nabin T; Xu, Shuqing; Miloradovic van Doorn, Maja; Baldwin, Ian T

    2015-01-01

    Most land plants live in a symbiotic association with arbuscular mycorrhizal fungi (AMF) that belong to the phylum Glomeromycota. Although a number of plant genes involved in the plant-AMF interactions have been identified by analyzing mutants, the ability to rapidly manipulate gene expression to study the potential functions of new candidate genes remains unrealized. We analyzed changes in gene expression of wild tobacco roots (Nicotiana attenuata) after infection with mycorrhizal fungi (Rhizophagus irregularis) by serial analysis of gene expression (SuperSAGE) combined with next generation sequencing, and established a virus-induced gene-silencing protocol to study the function of candidate genes in the interaction. From 92,434 SuperSAGE Tag sequences, 32,808 (35%) matched with our in-house Nicotiana attenuata transcriptome database and 3,698 (4%) matched to Rhizophagus genes. In total, 11,194 Tags showed a significant change in expression (p<0.05, >2-fold change) after infection. When comparing the functions of highly up-regulated annotated Tags in this study with those of two previous large-scale gene expression studies, 18 gene functions were found to be up-regulated in all three studies mainly playing roles related to phytohormone metabolism, catabolism and defense. To validate the function of identified candidate genes, we used the technique of virus-induced gene silencing (VIGS) to silence the expression of three putative N. attenuata genes: germin-like protein, indole-3-acetic acid-amido synthetase GH3.9 and, as a proof-of-principle, calcium and calmodulin-dependent protein kinase (CCaMK). The silencing of the three plant genes in roots was successful, but only CCaMK silencing had a significant effect on the interaction with R. irregularis. Interestingly, when a highly activated inoculum was used for plant inoculation, the effect of CCaMK silencing on fungal colonization was masked, probably due to trans-complementation. 
    This study demonstrates that large-scale gene expression studies across different species reveal the induction of a core set of genes with similar functions. However, additional factors seem to influence the overall pattern of gene expression, resulting in high variability among independent studies with different hosts. We conclude that VIGS is a powerful tool with which to investigate the function of genes involved in plant-AMF interactions but that inoculum strength can strongly influence the outcome of the interaction.
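
    The percentages quoted in the abstract above can be checked directly from the reported Tag counts. The short sketch below does only that arithmetic; the variable and function names are illustrative and do not come from the study.

```python
# Tag counts as reported in the abstract above.
total_tags = 92_434
reported = {
    "N. attenuata transcriptome": 32_808,
    "Rhizophagus genes": 3_698,
    "significantly changed": 11_194,
}


def percent_of_total(count: int, total: int) -> float:
    """Return count as a percentage of total."""
    return 100.0 * count / total


# Share of all SuperSAGE Tags that falls into each reported category.
shares = {name: percent_of_total(n, total_tags) for name, n in reported.items()}

for name, share in shares.items():
    print(f"{name}: {share:.1f}%")
```

    The first two shares round to the 35% and 4% given in the abstract; the share of Tags with a significant change is not stated there as a percentage and is derived here from the counts alone.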

  17. Gene targeting by TALEN-induced homologous recombination in goats directs production of β-lactoglobulin-free, high-human lactoferrin milk

    PubMed Central

    Cui, Chenchen; Song, Yujie; Liu, Jun; Ge, Hengtao; Li, Qian; Huang, Hui; Hu, Linyong; Zhu, Hongmei; Jin, Yaping; Zhang, Yong

    2015-01-01

    β-Lactoglobulin (BLG) is a major goat’s milk allergen that is absent in human milk. Engineered endonucleases, including transcription activator-like effector nucleases (TALENs) and zinc-finger nucleases, enable targeted genetic modification in livestock. In this study, TALEN-mediated gene knockout followed by gene knock-in were used to generate BLG knockout goats as mammary gland bioreactors for large-scale production of human lactoferrin (hLF). We introduced precise genetic modifications in the goat genome at frequencies of approximately 13.6% and 6.09% for the first and second sequential targeting, respectively, by using targeting vectors that underwent TALEN-induced homologous recombination (HR). Analysis of milk from the cloned goats revealed large-scale hLF expression or/and decreased BLG levels in milk from heterozygous goats as well as the absence of BLG in milk from homozygous goats. Furthermore, the TALEN-mediated targeting events in somatic cells can be transmitted through the germline after SCNT. Our result suggests that gene targeting via TALEN-induced HR may expedite the production of genetically engineered livestock for agriculture and biomedicine. PMID:25994151

  18. Coexistence trend contingent to Mediterranean oaks with different leaf habits.

    PubMed

    Di Paola, Arianna; Paquette, Alain; Trabucco, Antonio; Mereu, Simone; Valentini, Riccardo; Paparella, Francesco

    2017-05-01

    In a previous work we developed a mathematical model to explain the co-occurrence of evergreen and deciduous oak groups in the Mediterranean region, regarded as one of the distinctive features of Mediterranean biodiversity. The mathematical analysis showed that a stabilizing mechanism resulting from niche difference (i.e. different water use and water stress tolerance) between groups allows their coexistence at intermediate values of suitable soil water content. A simple formal derivation of the model expresses this hypothesis in a testable form linked uniquely to the actual evapotranspiration of the forest community. In the present work we ascertain whether this simplified conclusion possesses some degree of explanatory power by comparing available data on oak distributions and remotely sensed evapotranspiration (MODIS product) in a large-scale survey embracing the western Mediterranean area. Our findings confirmed the basic assumptions of the model at large scale, but also revealed asymmetric responses to water use and water stress tolerance between evergreen and deciduous oaks that should be taken into account to increase the understanding of species interactions and, ultimately, improve the modeling capacity to explain co-occurrence.

  19. Gene targeting by TALEN-induced homologous recombination in goats directs production of β-lactoglobulin-free, high-human lactoferrin milk.

    PubMed

    Cui, Chenchen; Song, Yujie; Liu, Jun; Ge, Hengtao; Li, Qian; Huang, Hui; Hu, Linyong; Zhu, Hongmei; Jin, Yaping; Zhang, Yong

    2015-05-21

    β-Lactoglobulin (BLG) is a major goat's milk allergen that is absent in human milk. Engineered endonucleases, including transcription activator-like effector nucleases (TALENs) and zinc-finger nucleases, enable targeted genetic modification in livestock. In this study, TALEN-mediated gene knockout followed by gene knock-in were used to generate BLG knockout goats as mammary gland bioreactors for large-scale production of human lactoferrin (hLF). We introduced precise genetic modifications in the goat genome at frequencies of approximately 13.6% and 6.09% for the first and second sequential targeting, respectively, by using targeting vectors that underwent TALEN-induced homologous recombination (HR). Analysis of milk from the cloned goats revealed large-scale hLF expression or/and decreased BLG levels in milk from heterozygous goats as well as the absence of BLG in milk from homozygous goats. Furthermore, the TALEN-mediated targeting events in somatic cells can be transmitted through the germline after SCNT. Our result suggests that gene targeting via TALEN-induced HR may expedite the production of genetically engineered livestock for agriculture and biomedicine.

  20. Microbial forensics: predicting phenotypic characteristics and environmental conditions from large-scale gene expression profiles.

    PubMed

    Kim, Minseung; Zorraquino, Violeta; Tagkopoulos, Ilias

    2015-03-01

    A tantalizing question in cellular physiology is whether the cellular state and environmental conditions can be inferred by the expression signature of an organism. To investigate this relationship, we created an extensive normalized gene expression compendium for the bacterium Escherichia coli that was further enriched with meta-information through an iterative learning procedure. We then constructed an ensemble method to predict environmental and cellular state, including strain, growth phase, medium, oxygen level, antibiotic and carbon source presence. Results show that gene expression is an excellent predictor of environmental structure, with multi-class ensemble models achieving balanced accuracy from 70.0% (±3.5%) to 98.3% (±2.3%) for the various characteristics. Interestingly, this performance can be significantly boosted when environmental and strain characteristics are simultaneously considered, as a composite classifier that captures the inter-dependencies of three characteristics (medium, phase and strain) achieved 10.6% (±1.0%) higher performance than any individual model. Contrary to expectations, only 59% of the top informative genes were also identified as differentially expressed under the respective conditions. Functional analysis of the respective genetic signatures implicates a wide spectrum of Gene Ontology terms and KEGG pathways with condition-specific information content, including iron transport, transferases, and enterobactin synthesis. Further experimental phenotypic-to-genotypic mapping that we conducted for knock-out mutants argues for the information content of top-ranked genes. This work demonstrates the degree to which genome-scale transcriptional information can be predictive of latent, heterogeneous and seemingly disparate phenotypic and environmental characteristics, with far-reaching applications.
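
    The abstract above reports balanced accuracy for multi-class models but does not define it. A minimal sketch follows, assuming the common definition (the unweighted mean of per-class recall); the labels are hypothetical growth-phase calls made up for illustration, not data from the study, and the function is not the authors' implementation.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of per-class recall over the classes present in y_true."""
    hits = defaultdict(int)    # correct predictions per true class
    totals = defaultdict(int)  # number of samples per true class
    for truth, guess in zip(y_true, y_pred):
        totals[truth] += 1
        if truth == guess:
            hits[truth] += 1
    recalls = [hits[label] / totals[label] for label in totals]
    return sum(recalls) / len(recalls)


# Hypothetical labels (assumption): four exponential-phase and two
# stationary-phase samples, with one exponential sample misclassified.
y_true = ["exp", "exp", "exp", "exp", "stat", "stat"]
y_pred = ["exp", "exp", "exp", "stat", "stat", "stat"]

score = balanced_accuracy(y_true, y_pred)
plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"balanced accuracy = {score:.3f}, plain accuracy = {plain:.3f}")
```

    Because each class contributes equally regardless of how many samples it has, this measure is less flattering than plain accuracy when a compendium is dominated by a few conditions, which is one plausible reason to report it for unevenly populated classes.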

  1. Medium-throughput processing of whole mount in situ hybridisation experiments into gene expression domains.

    PubMed

    Crombach, Anton; Cicin-Sain, Damjan; Wotton, Karl R; Jaeger, Johannes

    2012-01-01

    Understanding the function and evolution of developmental regulatory networks requires the characterisation and quantification of spatio-temporal gene expression patterns across a range of systems and species. However, most high-throughput methods to measure the dynamics of gene expression do not preserve the detailed spatial information needed in this context. For this reason, quantification methods based on image bioinformatics have become increasingly important over the past few years. Most available approaches in this field either focus on the detailed and accurate quantification of a small set of gene expression patterns, or attempt high-throughput analysis of spatial expression through binary pattern extraction and large-scale analysis of the resulting datasets. Here we present a robust, "medium-throughput" pipeline to process in situ hybridisation patterns from embryos of different species of flies. It bridges the gap between high-resolution, and high-throughput image processing methods, enabling us to quantify graded expression patterns along the antero-posterior axis of the embryo in an efficient and straightforward manner. Our method is based on a robust enzymatic (colorimetric) in situ hybridisation protocol and rapid data acquisition through wide-field microscopy. Data processing consists of image segmentation, profile extraction, and determination of expression domain boundary positions using a spline approximation. It results in sets of measured boundaries sorted by gene and developmental time point, which are analysed in terms of expression variability or spatio-temporal dynamics. Our method yields integrated time series of spatial gene expression, which can be used to reverse-engineer developmental gene regulatory networks across species. It is easily adaptable to other processes and species, enabling the in silico reconstitution of gene regulatory networks in a wide range of developmental contexts.

  2. Defining the Human Macula Transcriptome and Candidate Retinal Disease Genes Using EyeSAGE

    PubMed Central

    Rickman, Catherine Bowes; Ebright, Jessica N.; Zavodni, Zachary J.; Yu, Ling; Wang, Tianyuan; Daiger, Stephen P.; Wistow, Graeme; Boon, Kathy; Hauser, Michael A.

    2009-01-01

    Purpose: To develop large-scale, high-throughput annotation of the human macula transcriptome and to identify and prioritize candidate genes for inherited retinal dystrophies, based on ocular-expression profiles using serial analysis of gene expression (SAGE). Methods: Two human retina and two retinal pigment epithelium (RPE)/choroid SAGE libraries made from matched macula or midperipheral retina and adjacent RPE/choroid of morphologically normal 28- to 66-year-old donors and a human central retina longSAGE library made from 41- to 66-year-old donors were generated. Their transcription profiles were entered into a relational database, EyeSAGE, including microarray expression profiles of retina and publicly available normal human tissue SAGE libraries. EyeSAGE was used to identify retina- and RPE-specific and -associated genes, and candidate genes for retina and RPE disease loci. Differential and/or cell-type specific expression was validated by quantitative and single-cell RT-PCR. Results: Cone photoreceptor-associated gene expression was elevated in the macula transcription profiles. Analysis of the longSAGE retina tags enhanced tag-to-gene mapping and revealed alternatively spliced genes. Analysis of candidate gene expression tables for the identified Bardet-Biedl syndrome disease gene (BBS5) in the BBS5 disease region table yielded BBS5 as the top candidate. Compelling candidates for inherited retina diseases were identified. Conclusions: The EyeSAGE database, combining three different gene-profiling platforms including the authors’ multidonor-derived retina/RPE SAGE libraries and existing single-donor retina/RPE libraries, is a powerful resource for definition of the retina and RPE transcriptomes. It can be used to identify retina-specific genes, including alternatively spliced transcripts and to prioritize candidate genes within mapped retinal disease regions. PMID:16723438

  3. Defining the human macula transcriptome and candidate retinal disease genes using EyeSAGE.

    PubMed

    Bowes Rickman, Catherine; Ebright, Jessica N; Zavodni, Zachary J; Yu, Ling; Wang, Tianyuan; Daiger, Stephen P; Wistow, Graeme; Boon, Kathy; Hauser, Michael A

    2006-06-01

    To develop large-scale, high-throughput annotation of the human macula transcriptome and to identify and prioritize candidate genes for inherited retinal dystrophies, based on ocular-expression profiles using serial analysis of gene expression (SAGE). Two human retina and two retinal pigment epithelium (RPE)/choroid SAGE libraries made from matched macula or midperipheral retina and adjacent RPE/choroid of morphologically normal 28- to 66-year-old donors and a human central retina longSAGE library made from 41- to 66-year-old donors were generated. Their transcription profiles were entered into a relational database, EyeSAGE, including microarray expression profiles of retina and publicly available normal human tissue SAGE libraries. EyeSAGE was used to identify retina- and RPE-specific and -associated genes, and candidate genes for retina and RPE disease loci. Differential and/or cell-type specific expression was validated by quantitative and single-cell RT-PCR. Cone photoreceptor-associated gene expression was elevated in the macula transcription profiles. Analysis of the longSAGE retina tags enhanced tag-to-gene mapping and revealed alternatively spliced genes. Analysis of candidate gene expression tables for the identified Bardet-Biedl syndrome disease gene (BBS5) in the BBS5 disease region table yielded BBS5 as the top candidate. Compelling candidates for inherited retina diseases were identified. The EyeSAGE database, combining three different gene-profiling platforms including the authors' multidonor-derived retina/RPE SAGE libraries and existing single-donor retina/RPE libraries, is a powerful resource for definition of the retina and RPE transcriptomes. It can be used to identify retina-specific genes, including alternatively spliced transcripts and to prioritize candidate genes within mapped retinal disease regions.

  4. Disturbance Frequency Determines Morphology and Community Development in Multi-Species Biofilm at the Landscape Scale

    PubMed Central

    Milferstedt, Kim; Santa-Catalina, Gaëlle; Godon, Jean-Jacques; Escudié, Renaud; Bernet, Nicolas

    2013-01-01

    Many natural and engineered biofilm systems periodically face disturbances. Here we present how the recovery time of a biofilm between disturbances (expressed as disturbance frequency) shapes the development of morphology and community structure in a multi-species biofilm at the landscape scale. It was hypothesized that a high disturbance frequency favors the development of a stable adapted biofilm system while a low disturbance frequency promotes a dynamic biofilm response. Biofilms were grown in laboratory-scale reactors over a period of 55-70 days and exposed to the biocide monochloramine at two frequencies: daily or weekly pulse injections. One untreated reactor served as control. Biofilm morphology and community structure were followed on comparably large biofilm areas at the landscape scale using automated image analysis (spatial gray level dependence matrices) and community fingerprinting (single-strand conformation polymorphisms). We demonstrated that a weekly disturbed biofilm developed a resilient morphology and community structure. Immediately after the disturbance, the biofilm simplified but recovered its initial complex morphology and community structure between two biocide pulses. In the daily treated reactor, one organism largely dominated a morphologically simple and stable biofilm. Disturbances primarily affected the abundance distribution of already present bacterial taxa but did not promote growth of previously undetected organisms. Our work indicates that disturbances can be used as lever to engineer biofilms by maintaining a biofilm between two developmental states. PMID:24303024

  5. WImpiBLAST: web interface for mpiBLAST to help biologists perform large-scale annotation using high performance computing.

    PubMed

    Sharma, Parichit; Mantri, Shrikant S

    2014-01-01

    The function of a newly sequenced gene can be discovered by determining its sequence homology with known proteins. BLAST is the most extensively used sequence analysis program for sequence similarity search in large databases of sequences. With the advent of next generation sequencing technologies it has now become possible to study genes and their expression at a genome-wide scale through RNA-seq and metagenome sequencing experiments. Functional annotation of all the genes is done by sequence similarity search against multiple protein databases. This annotation task is computationally very intensive and can take days to obtain complete results. The program mpiBLAST, an open-source parallelization of BLAST that achieves superlinear speedup, can be used to accelerate large-scale annotation by using supercomputers and high performance computing (HPC) clusters. Although many parallel bioinformatics applications using the Message Passing Interface (MPI) are available in the public domain, researchers are reluctant to use them due to lack of expertise in the Linux command line and relevant programming experience. With these limitations, it becomes difficult for biologists to use mpiBLAST for accelerating annotation. No web interface is available in the open-source domain for mpiBLAST. We have developed WImpiBLAST, a user-friendly open-source web interface for parallel BLAST searches. It is implemented in Struts 1.3 using a Java backbone and runs atop the open-source Apache Tomcat Server. WImpiBLAST supports script creation and job submission features and also provides a robust job management interface for system administrators. It combines script creation and modification features with job monitoring and management through the Torque resource manager on a Linux-based HPC cluster. Use case information highlights the acceleration of annotation analysis achieved by using WImpiBLAST. 
    Here, we describe the WImpiBLAST web interface features and architecture, explain design decisions, describe workflows and provide a detailed analysis.

  6. WImpiBLAST: Web Interface for mpiBLAST to Help Biologists Perform Large-Scale Annotation Using High Performance Computing

    PubMed Central

    Sharma, Parichit; Mantri, Shrikant S.

    2014-01-01

    The function of a newly sequenced gene can be discovered by determining its sequence homology with known proteins. BLAST is the most extensively used sequence analysis program for sequence similarity search in large databases of sequences. With the advent of next generation sequencing technologies it has now become possible to study genes and their expression at a genome-wide scale through RNA-seq and metagenome sequencing experiments. Functional annotation of all the genes is done by sequence similarity search against multiple protein databases. This annotation task is computationally very intensive and can take days to obtain complete results. The program mpiBLAST, an open-source parallelization of BLAST that achieves superlinear speedup, can be used to accelerate large-scale annotation by using supercomputers and high performance computing (HPC) clusters. Although many parallel bioinformatics applications using the Message Passing Interface (MPI) are available in the public domain, researchers are reluctant to use them due to lack of expertise in the Linux command line and relevant programming experience. With these limitations, it becomes difficult for biologists to use mpiBLAST for accelerating annotation. No web interface is available in the open-source domain for mpiBLAST. We have developed WImpiBLAST, a user-friendly open-source web interface for parallel BLAST searches. It is implemented in Struts 1.3 using a Java backbone and runs atop the open-source Apache Tomcat Server. WImpiBLAST supports script creation and job submission features and also provides a robust job management interface for system administrators. It combines script creation and modification features with job monitoring and management through the Torque resource manager on a Linux-based HPC cluster. Use case information highlights the acceleration of annotation analysis achieved by using WImpiBLAST. 
    Here, we describe the WImpiBLAST web interface features and architecture, explain design decisions, describe workflows and provide a detailed analysis. PMID:24979410

  7. The influence of cosmic rays on the stability and large-scale dynamics of the interstellar medium

    NASA Astrophysics Data System (ADS)

    Kuznetsov, V. D.

    1986-06-01

    The diffusion-convection formulation is used to study the influence of galactic cosmic rays on the stability and dynamics of the interstellar medium which is supposedly kept in equilibrium by the gravitational field of stars. It is shown that the influence of cosmic rays on the growth rate of MHD instability depends largely on a dimensionless parameter expressing the ratio of the characteristic acoustic time scale to the cosmic-ray diffusion time. If this parameter is small, the cosmic rays will decelerate the build-up of instabilities, thereby stabilizing the system; in contrast, if the parameter is large, the system will be destabilized.

  8. Understanding Business Interests in International Large-Scale Student Assessments: A Media Analysis of "The Economist," "Financial Times," and "Wall Street Journal"

    ERIC Educational Resources Information Center

    Steiner-Khamsi, Gita; Appleton, Margaret; Vellani, Shezleen

    2018-01-01

    The media analysis is situated in the larger body of studies that explore the varied reasons why different policy actors advocate for international large-scale student assessments (ILSAs) and adds to the research on the fast advance of the global education industry. The analysis of "The Economist," "Financial Times," and…

  9. Post-16 Physics and Chemistry Uptake: Combining Large-Scale Secondary Analysis with In-Depth Qualitative Methods

    ERIC Educational Resources Information Center

    Hampden-Thompson, Gillian; Lubben, Fred; Bennett, Judith

    2011-01-01

    Quantitative secondary analysis of large-scale data can be combined with in-depth qualitative methods. In this paper, we discuss the role of this combined methods approach in examining the uptake of physics and chemistry in post compulsory schooling for students in England. The secondary data analysis of the National Pupil Database (NPD) served…

  10. Sensitivity analysis for large-scale problems

    NASA Technical Reports Server (NTRS)

    Noor, Ahmed K.; Whitworth, Sandra L.

    1987-01-01

    The development of efficient techniques for calculating sensitivity derivatives is studied. The objective is to present a computational procedure for calculating sensitivity derivatives as part of performing structural reanalysis for large-scale problems. The scope is limited to framed type structures. Both linear static analysis and free-vibration eigenvalue problems are considered.

  11. Blocking monocyte transmigration in in vitro system by a human antibody scFv anti-CD99. Efficient large scale purification from periplasmic inclusion bodies in E. coli expression system.

    PubMed

    Moricoli, Diego; Muller, William Anthony; Carbonella, Damiano Cosimo; Balducci, Maria Cristina; Dominici, Sabrina; Watson, Richard; Fiori, Valentina; Weber, Evan; Cianfriglia, Maurizio; Scotlandi, Katia; Magnani, Mauro

    2014-06-01

    Migration of leukocytes into sites of inflammation involves several steps mediated by various families of adhesion molecules. CD99 plays a significant role in transendothelial migration (TEM) of leukocytes. Inhibition of TEM by specific monoclonal antibody (mAb) can provide a potent therapeutic approach to treating inflammatory conditions. However, the therapeutic utilization of whole IgG can lead to an inappropriate activation of Fc receptor-expressing cells, inducing serious adverse side effects due to cytokine release. In this regard, specific recombinant antibodies in single chain variable fragment (scFv) format, originating from a phage library, may offer a solution by affecting TEM function in a safe clinical context. However, this consideration requires large scale production of functional scFv antibodies and the absence of toxic reagents utilized for the solubilization and refolding steps of inclusion bodies, which may discourage industrial application of these antibody fragments. In order to apply the scFv anti-CD99 named C7A in a clinical setting, we herein describe an efficient and large scale production of the antibody fragments expressed in E. coli as periplasmic insoluble protein, avoiding the gel filtration chromatography approach and laborious refolding steps pre- and post-purification. Using differential salt elution, which is a simple, reproducible and effective procedure, we are able to separate scFv in monomer format from aggregates. The purified scFv antibody C7A exhibits inhibitory activity comparable to an antagonistic conventional mAb, thus providing an excellent agent for blocking CD99 signaling. This protocol can be useful for the successful purification of other monomeric scFvs which are expressed as periplasmic inclusion bodies in bacterial systems. Copyright © 2014 Elsevier B.V. All rights reserved.

  12. Multi-scale chromatin state annotation using a hierarchical hidden Markov model

    NASA Astrophysics Data System (ADS)

    Marco, Eugenio; Meuleman, Wouter; Huang, Jialiang; Glass, Kimberly; Pinello, Luca; Wang, Jianrong; Kellis, Manolis; Yuan, Guo-Cheng

    2017-04-01

    Chromatin-state analysis is widely applied in the studies of development and diseases. However, existing methods operate at a single length scale, and therefore cannot distinguish large domains from isolated elements of the same type. To overcome this limitation, we present a hierarchical hidden Markov model, diHMM, to systematically annotate chromatin states at multiple length scales. We apply diHMM to analyse a public ChIP-seq data set. diHMM not only accurately captures nucleosome-level information, but identifies domain-level states that vary in nucleosome-level state composition, spatial distribution and functionality. The domain-level states recapitulate known patterns such as super-enhancers, bivalent promoters and Polycomb repressed regions, and identify additional patterns whose biological functions are not yet characterized. By integrating chromatin-state information with gene expression and Hi-C data, we identify context-dependent functions of nucleosome-level states. Thus, diHMM provides a powerful tool for investigating the role of higher-order chromatin structure in gene regulation.

  13. Scalable Parameter Estimation for Genome-Scale Biochemical Reaction Networks

    PubMed Central

    Kaltenbacher, Barbara; Hasenauer, Jan

    2017-01-01

    Mechanistic mathematical modeling of biochemical reaction networks using ordinary differential equation (ODE) models has improved our understanding of small- and medium-scale biological processes. While the same should in principle hold for large- and genome-scale processes, the computational methods for the analysis of ODE models which describe hundreds or thousands of biochemical species and reactions are missing so far. While individual simulations are feasible, the inference of the model parameters from experimental data is computationally too intensive. In this manuscript, we evaluate adjoint sensitivity analysis for parameter estimation in large scale biochemical reaction networks. We present the approach for time-discrete measurement and compare it to state-of-the-art methods used in systems and computational biology. Our comparison reveals a significantly improved computational efficiency and a superior scalability of adjoint sensitivity analysis. The computational complexity is effectively independent of the number of parameters, enabling the analysis of large- and genome-scale models. Our study of a comprehensive kinetic model of ErbB signaling shows that parameter estimation using adjoint sensitivity analysis requires a fraction of the computation time of established methods. The proposed method will facilitate mechanistic modeling of genome-scale cellular processes, as required in the age of omics. PMID:28114351

  14. An Integrated Analysis of MicroRNA and mRNA Expression Profiles to Identify RNA Expression Signatures in Lambskin Hair Follicles in Hu Sheep

    PubMed Central

    Lv, Xiaoyang; Sun, Wei; Yin, Jinfeng; Ni, Rong; Su, Rui; Wang, Qingzeng; Gao, Wen; Bao, Jianjun; Yu, Jiarui; Wang, Lihong; Chen, Ling

    2016-01-01

    Wave patterns in lambskin hair follicles are an important factor determining the quality of sheep’s wool. Hair follicles in lambskin from Hu sheep, a breed unique to China, have 3 types of waves, designated as large, medium, and small. The quality of wool from small wave follicles is excellent, while the quality of large waves is considered poor. Because no molecular and biological studies on hair follicles of these sheep have been conducted to date, the molecular mechanisms underlying the formation of different wave patterns are currently unknown. The aim of this article was to screen the candidate microRNAs (miRNA) and genes for the development of hair follicles in Hu sheep. Two-day-old Hu lambs were selected from full-sib individuals that showed large, medium, and small waves. Integrated analysis of microRNA and mRNA expression profiles employed high-throughput sequencing technology. Approximately 13, 24, and 18 differentially expressed miRNAs were found between small and large waves, small and medium waves, and medium and large waves, respectively. A total of 54, 190, and 81 differentially expressed genes were found between small and large waves, small and medium waves, and medium and large waves, respectively, by RNA sequencing (RNA-seq) analysis. Differentially expressed genes were classified using gene ontology and pathway analyses. They were found to be mainly involved in cell differentiation, proliferation, apoptosis, growth, immune response, and ion transport, and were associated with MAPK and the Notch signaling pathway. Reverse transcription-polymerase chain reaction (RT-PCR) analyses of differentially-expressed miRNA and genes were consistent with sequencing results. Integrated analysis of miRNA and mRNA expression indicated that, compared to small waves, large waves included 4 downregulated miRNAs that had regulatory effects on 8 upregulated genes and 3 upregulated miRNAs, which in turn influenced 13 downregulated genes. Compared to small waves, medium waves included 13 downregulated miRNAs that had regulatory effects on 64 upregulated genes and 4 upregulated miRNAs, which in turn had regulatory effects on 22 downregulated genes. Compared to medium waves, large waves consisted of 13 upregulated miRNAs that had regulatory effects on 48 downregulated genes. These differentially expressed miRNAs and genes may play a significant role in forming different patterns, and provide evidence for the molecular mechanisms underlying the formation of hair follicles of varying patterns. PMID:27404636

  15. Quantitative analysis of voids in percolating structures in two-dimensional N-body simulations

    NASA Technical Reports Server (NTRS)

    Harrington, Patrick M.; Melott, Adrian L.; Shandarin, Sergei F.

    1993-01-01

    We present in this paper a quantitative method for defining void size in large-scale structure based on percolation threshold density. Beginning with two-dimensional gravitational clustering simulations smoothed to the threshold of nonlinearity, we perform percolation analysis to determine the large scale structure. The resulting objective definition of voids has a natural scaling property, is topologically interesting, and can be applied immediately to redshift surveys.
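
    The abstract does not spell out the algorithm, so the following is only a minimal sketch of generic site percolation on a gridded two-dimensional density field, not the authors' procedure. It assumes a non-periodic grid, 4-connectivity, and left-to-right spanning as the percolation criterion; the function names and the toy grid are hypothetical.

```python
from collections import deque

NEIGHBORS = ((1, 0), (-1, 0), (0, 1), (0, -1))  # 4-connectivity (an assumption)


def spans(grid, threshold):
    """True if cells with density >= threshold link the left and right edges."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    queue = deque()
    # Seed the search with every overdense cell on the left edge.
    for r in range(rows):
        if grid[r][0] >= threshold:
            seen[r][0] = True
            queue.append((r, 0))
    while queue:
        r, c = queue.popleft()
        if c == cols - 1:
            return True
        for dr, dc in NEIGHBORS:
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and not seen[nr][nc] and grid[nr][nc] >= threshold):
                seen[nr][nc] = True
                queue.append((nr, nc))
    return False


def percolation_threshold(grid):
    """Highest density level at which the overdense cells span the grid."""
    levels = sorted({v for row in grid for v in row}, reverse=True)
    for level in levels:
        if spans(grid, level):
            return level
    return None  # not reached for a non-empty grid


def void_sizes(grid, threshold):
    """Cell counts of connected regions with density < threshold, ascending."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    sizes = []
    for r0 in range(rows):
        for c0 in range(cols):
            if seen[r0][c0] or grid[r0][c0] >= threshold:
                continue
            seen[r0][c0] = True
            queue, size = deque([(r0, c0)]), 0
            while queue:
                r, c = queue.popleft()
                size += 1
                for dr, dc in NEIGHBORS:
                    nr, nc = r + dr, c + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and not seen[nr][nc] and grid[nr][nc] < threshold):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            sizes.append(size)
    return sorted(sizes)


# Hypothetical toy density field: a dense band crosses the middle row.
toy = [
    [5, 1, 1, 1],
    [5, 4, 3, 6],
    [1, 1, 1, 1],
]
level = percolation_threshold(toy)
print(level, void_sizes(toy, level))
```

    In this toy field the overdense cells first span the grid at density level 3, which leaves two underdense regions of 3 and 4 cells. A real analysis would start from the smoothed simulation output and might use periodic boundaries.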

  16. Preparing Laboratory and Real-World EEG Data for Large-Scale Analysis: A Containerized Approach

    PubMed Central

    Bigdely-Shamlo, Nima; Makeig, Scott; Robbins, Kay A.

    2016-01-01

    Large-scale analysis of EEG and other physiological measures promises new insights into brain processes and more accurate and robust brain–computer interface models. However, the absence of standardized vocabularies for annotating events in a machine understandable manner, the welter of collection-specific data organizations, the difficulty in moving data across processing platforms, and the unavailability of agreed-upon standards for preprocessing have prevented large-scale analyses of EEG. Here we describe a “containerized” approach and freely available tools we have developed to facilitate the process of annotating, packaging, and preprocessing EEG data collections to enable data sharing, archiving, large-scale machine learning/data mining and (meta-)analysis. The EEG Study Schema (ESS) comprises three data “Levels,” each with its own XML-document schema and file/folder convention, plus a standardized (PREP) pipeline to move raw (Data Level 1) data to a basic preprocessed state (Data Level 2) suitable for application of a large class of EEG analysis methods. Researchers can ship a study as a single unit and operate on its data using a standardized interface. ESS does not require a central database and provides all the metadata necessary to execute a wide variety of EEG processing pipelines. The primary focus of ESS is automated in-depth analysis and meta-analysis of EEG studies. However, ESS can also encapsulate meta-information for other modalities, such as eye tracking, that are increasingly used in both laboratory and real-world neuroimaging. ESS schema and tools are freely available at www.eegstudy.org and a central catalog of over 850 GB of existing data in ESS format is available at studycatalog.org. These tools and resources are part of a larger effort to enable data sharing at sufficient scale for researchers to engage in truly large-scale EEG analysis and data mining (BigEEG.org). PMID:27014048

  17. Application of multivariate analysis and mass transfer principles for refinement of a 3-L bioreactor scale-down model--when shake flasks mimic 15,000-L bioreactors better.

    PubMed

    Ahuja, Sanjeev; Jain, Shilpa; Ram, Kripa

    2015-01-01

    Characterization of manufacturing processes is key to understanding the effects of process parameters on process performance and product quality. These studies are generally conducted using small-scale model systems. Because of the importance of the results derived from these studies, the small-scale model should be predictive of large scale. Typically, small-scale bioreactors, which are considered superior to shake flasks in simulating large-scale bioreactors, are used as the scale-down models for characterizing mammalian cell culture processes. In this article, we describe a case study where a cell culture unit operation in bioreactors using one-sided pH control and their satellites (small-scale runs conducted using the same post-inoculation cultures and nutrient feeds) in 3-L bioreactors and shake flasks indicated that shake flasks mimicked the large-scale performance better than 3-L bioreactors. We detail here how multivariate analysis was used to make the pertinent assessment and to generate the hypothesis for refining the existing 3-L scale-down model. Relevant statistical techniques such as principal component analysis, partial least squares, orthogonal partial least squares, and discriminant analysis were used to identify the outliers and to determine the discriminatory variables responsible for performance differences at different scales. The resulting analysis, in combination with mass transfer principles, led to the hypothesis that observed similarities between 15,000-L and shake flask runs, and differences between 15,000-L and 3-L runs, were due to pCO2 and pH values. This hypothesis was confirmed by changing the aeration strategy at 3-L scale. By reducing the initial sparge rate in the 3-L bioreactor, process performance and product quality data moved closer to those of large scale. © 2015 American Institute of Chemical Engineers.

  18. MMPI-2 validity, clinical and content scales, and the Fake Bad Scale for personal injury litigants claiming idiopathic environmental intolerance.

    PubMed

    Staudenmayer, Herman; Phillips, Scott

    2007-01-01

    Idiopathic environmental intolerance (IEI) is a descriptor for nonspecific complaints that are attributed to environmental exposure. The Minnesota Multiphasic Personality Inventory 2 (MMPI-2) was administered to 50 female and 20 male personal injury litigants alleging IEI. The validity scales indicated no overreporting of psychopathology. Half of the cases had elevated scores on validity scales suggesting defensiveness, and a large number had elevations on Fake Bad Scale (FBS) suggesting overreporting of unauthenticated symptoms. The average T-score profile for females was defined by the two-point code type 3-1 (Hysteria-Hypochondriasis), and the average T-score profile for males was defined by the three-point code type 3-1-2 (Hysteria, Hypochondriasis-Depression). On the content scales, Health Concerns (HEA) scale was significantly elevated. Idiopathic environmental intolerance litigants (a) are more defensive about expressing psychopathology, (b) express distress through somatization, (c) use a self-serving misrepresentation of exaggerated health concerns, and (d) may exaggerate unauthenticated symptoms suggesting malingering.

  19. Correlated motion of protein subdomains and large-scale conformational flexibility of RecA protein filament

    NASA Astrophysics Data System (ADS)

    Yu, Garmay; A, Shvetsov; D, Karelov; D, Lebedev; A, Radulescu; M, Petukhov; V, Isaev-Ivanov

    2012-02-01

    Based on X-ray crystallographic data available at the Protein Data Bank, we have built molecular dynamics (MD) models of homologous recombinases RecA from E. coli and D. radiodurans. The functional form of the RecA enzyme, which is known to be a long helical filament, was approximated by a trimer simulated in a periodic water box. The MD trajectories were analyzed in terms of large-scale conformational motions that could be detectable by neutron and X-ray scattering techniques. The analysis revealed that large-scale RecA monomer dynamics can be described in terms of relative motions of 7 subdomains. Motion of the C-terminal domain was the major contributor to the overall dynamics of the protein. Principal component analysis (PCA) of the MD trajectories in the atom coordinate space showed that rotation of the C-domain is correlated with the conformational changes in the central domain and the N-terminal domain, which forms the monomer-monomer interface. Thus, even though the C-terminal domain is relatively far from the interface, its orientation is correlated with large-scale filament conformation. PCA of the trajectories in the main chain dihedral angle coordinate space implies the co-existence of several different large-scale conformations of the modeled trimer. In order to clarify the relationship of independent domain orientation with large-scale filament conformation, we have performed analysis of independent domain motion and its implications on the filament geometry.

  20. Topology of large-scale structure. IV - Topology in two dimensions

    NASA Technical Reports Server (NTRS)

    Melott, Adrian L.; Cohen, Alexander P.; Hamilton, Andrew J. S.; Gott, J. Richard, III; Weinberg, David H.

    1989-01-01

    In a recent series of papers, an algorithm was developed for quantitatively measuring the topology of the large-scale structure of the universe and this algorithm was applied to numerical models and to three-dimensional observational data sets. In this paper, it is shown that topological information can be derived from a two-dimensional cross section of a density field, and analytic expressions are given for a Gaussian random field. The application of a two-dimensional numerical algorithm for measuring topology to cross sections of three-dimensional models is demonstrated.

  1. Automated image alignment for 2D gel electrophoresis in a high-throughput proteomics pipeline.

    PubMed

    Dowsey, Andrew W; Dunn, Michael J; Yang, Guang-Zhong

    2008-04-01

    The quest for high-throughput proteomics has revealed a number of challenges in recent years. Whilst substantial improvements in automated protein separation with liquid chromatography and mass spectrometry (LC/MS), aka 'shotgun' proteomics, have been achieved, large-scale open initiatives such as the Human Proteome Organization (HUPO) Brain Proteome Project have shown that maximal proteome coverage is only possible when LC/MS is complemented by 2D gel electrophoresis (2-DE) studies. Moreover, both separation methods require automated alignment and differential analysis to relieve the bioinformatics bottleneck and so make high-throughput protein biomarker discovery a reality. The purpose of this article is to describe a fully automatic image alignment framework for the integration of 2-DE into a high-throughput differential expression proteomics pipeline. The proposed method is based on robust automated image normalization (RAIN) to circumvent the drawbacks of traditional approaches. These use symbolic representation at the very early stages of the analysis, which introduces persistent errors due to inaccuracies in modelling and alignment. In RAIN, a third-order volume-invariant B-spline model is incorporated into a multi-resolution schema to correct for geometric and expression inhomogeneity at multiple scales. The normalized images can then be compared directly in the image domain for quantitative differential analysis. Through evaluation against an existing state-of-the-art method on real and synthetically warped 2D gels, the proposed analysis framework demonstrates substantial improvements in matching accuracy and differential sensitivity. High-throughput analysis is established through an accelerated GPGPU (general purpose computation on graphics cards) implementation. Supplementary material, software and images used in the validation are available at http://www.proteomegrid.org/rain/.

  2. Human Disease-Drug Network Based on Genomic Expression Profiles

    PubMed Central

    Hu, Guanghui; Agarwal, Pankaj

    2009-01-01

    Background Drug repositioning offers the possibility of faster development times and reduced risks in drug discovery. With the rapid development of high-throughput technologies and ever-increasing accumulation of whole genome-level datasets, an increasing number of diseases and drugs can be comprehensively characterized by the changes they induce in gene expression, protein, metabolites and phenotypes. Methodology/Principal Findings We performed a systematic, large-scale analysis of genomic expression profiles of human diseases and drugs to create a disease-drug network. A network of 170,027 significant interactions was extracted from the ∼24.5 million comparisons between ∼7,000 publicly available transcriptomic profiles. The network includes 645 disease-disease, 5,008 disease-drug, and 164,374 drug-drug relationships. At least 60% of the disease-disease pairs were in the same disease area as determined by the Medical Subject Headings (MeSH) disease classification tree. The remaining can drive a molecular level nosology by discovering relationships between seemingly unrelated diseases, such as a connection between bipolar disorder and hereditary spastic paraplegia, and a connection between actinic keratosis and cancer. Among the 5,008 disease-drug links, connections with negative scores suggest new indications for existing drugs, such as the use of some antimalaria drugs for Crohn's disease, and a variety of existing drugs for Huntington's disease; while the positive scoring connections can aid in drug side effect identification, such as tamoxifen's undesired carcinogenic property. From the ∼37K drug-drug relationships, we discover relationships that aid in target and pathway deconvolution, such as 1) KCNMA1 as a potential molecular target of lobeline, and 2) both apoptotic DNA fragmentation and G2/M DNA damage checkpoint regulation as potential pathway targets of daunorubicin. 
Conclusions/Significance We have automatically generated thousands of disease and drug expression profiles using GEO datasets, and constructed a large scale disease-drug network for effective and efficient drug repositioning as well as drug target/pathway identification. PMID:19657382
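
    The counts quoted in the abstract can be cross-checked with a few lines of arithmetic. This is a sketch under one labeled assumption that the abstract does not state: that the ∼24.5 million comparisons are all unordered pairs among ∼7,000 profiles. The variable names are illustrative only.

```python
from math import comb

# Assumption: every unordered pair of the ~7,000 profiles was compared once.
profiles = 7000
comparisons = comb(profiles, 2)

# Relationship counts as quoted in the abstract.
edges = {
    "disease-disease": 645,
    "disease-drug": 5008,
    "drug-drug": 164374,
}
total_edges = sum(edges.values())

# Share of all comparisons that were retained as significant interactions.
fraction_significant = total_edges / comparisons

print(f"pairwise comparisons: {comparisons:,}")
print(f"significant interactions: {total_edges:,}")
print(f"fraction significant: {fraction_significant:.2%}")
```

    Under that assumption the pair count comes to about 24.5 million and the three relationship types sum to the 170,027 interactions reported, so well under 1% of comparisons enter the network.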

  3. High-efficiency Agrobacterium-mediated transformation of Norway spruce (Picea abies) and loblolly pine (Pinus taeda)

    NASA Technical Reports Server (NTRS)

    Wenck, A. R.; Quinn, M.; Whetten, R. W.; Pullman, G.; Sederoff, R.; Brown, C. S. (Principal Investigator)

    1999-01-01

    Agrobacterium-mediated gene transfer is the method of choice for many plant biotechnology laboratories; however, large-scale use of this organism in conifer transformation has been limited by difficult propagation of explant material, selection efficiencies and low transformation frequency. We have analyzed co-cultivation conditions and different disarmed strains of Agrobacterium to improve transformation. Additional copies of virulence genes were added to three common disarmed strains. These extra virulence genes included either a constitutively active virG or extra copies of virG and virB, both from pTiBo542. In experiments with Norway spruce, we increased transformation efficiencies 1000-fold from initial experiments where little or no transient expression was detected. Over 100 transformed lines expressing the marker gene beta-glucuronidase (GUS) were generated from rapidly dividing embryogenic suspension-cultured cells co-cultivated with Agrobacterium. GUS activity was used to monitor transient expression and to further test lines selected on kanamycin-containing medium. In loblolly pine, transient expression increased 10-fold utilizing modified Agrobacterium strains. Agrobacterium-mediated gene transfer is a useful technique for large-scale generation of transgenic Norway spruce and may prove useful for other conifer species.

  4. Explaining human uniqueness: genome interactions with environment, behaviour and culture.

    PubMed

    Varki, Ajit; Geschwind, Daniel H; Eichler, Evan E

    2008-10-01

    What makes us human? Specialists in each discipline respond through the lens of their own expertise. In fact, 'anthropogeny' (explaining the origin of humans) requires a transdisciplinary approach that eschews such barriers. Here we take a genomic and genetic perspective towards molecular variation, explore systems analysis of gene expression and discuss an organ-systems approach. Rejecting any 'genes versus environment' dichotomy, we then consider genome interactions with environment, behaviour and culture, finally speculating that aspects of human uniqueness arose because of a primate evolutionary trend towards increasing and irreversible dependence on learned behaviours and culture - perhaps relaxing allowable thresholds for large-scale genomic diversity.

  5. Explaining human uniqueness: genome interactions with environment, behaviour and culture

    PubMed Central

    Varki, Ajit; Geschwind, Daniel H.; Eichler, Evan E.

    2009-01-01

    What makes us human? Specialists in each discipline respond through the lens of their own expertise. In fact, ‘anthropogeny’ (explaining the origin of humans) requires a transdisciplinary approach that eschews such barriers. Here we take a genomic and genetic perspective towards molecular variation, explore systems analysis of gene expression and discuss an organ-systems approach. Rejecting any ‘genes versus environment’ dichotomy, we then consider genome interactions with environment, behaviour and culture, finally speculating that aspects of human uniqueness arose because of a primate evolutionary trend towards increasing and irreversible dependence on learned behaviours and culture — perhaps relaxing allowable thresholds for large-scale genomic diversity. PMID:18802414

  6. Can global hydrological models reproduce large scale river flood regimes?

    NASA Astrophysics Data System (ADS)

    Eisner, Stephanie; Flörke, Martina

    2013-04-01

    River flooding remains one of the most severe natural hazards. On the one hand, major flood events pose a serious threat to human well-being, causing deaths and considerable economic damage. On the other hand, the periodic occurrence of flood pulses is crucial to maintain the functioning of riverine floodplains and wetlands, and to preserve the ecosystem services the latter provide. In many regions, river floods reveal a distinct seasonality, i.e. they occur at a particular time during the year. This seasonality is related to regionally dominant flood generating processes which can be expressed in river flood types. While in data-rich regions (esp. Europe and North America) the analysis of flood regimes can be based on observed river discharge time series, this data is sparse or lacking in many other regions of the world. This gap of knowledge can be filled by global modeling approaches. However, to date most global modeling studies have focused on mean annual or monthly water availability and their change over time while simulating discharge extremes, both floods and droughts, still remains a challenge for large scale hydrological models. This study will explore the ability of the global hydrological model WaterGAP3 to simulate the large scale patterns of river flood regimes, represented by seasonal pattern and the dominant flood type. WaterGAP3 simulates the global terrestrial water balance on a 5 arc minute spatial grid (excluding Greenland and Antarctica) at a daily time step. The model accounts for human interference on river flow, i.e. water abstraction for various purposes, e.g. irrigation, and flow regulation by large dams and reservoirs. Our analysis will provide insight in the general ability of global hydrological models to reproduce river flood regimes and thus will promote the creation of a global map of river flood regimes to provide a spatially inclusive and comprehensive picture. 
Understanding present-day flood regimes can support both flood risk analysis and the assessment of potential regional impacts of climate change on river flooding.

  7. Linear static structural and vibration analysis on high-performance computers

    NASA Technical Reports Server (NTRS)

    Baddourah, M. A.; Storaasli, O. O.; Bostic, S. W.

    1993-01-01

    Parallel computers offer the opportunity to significantly reduce the computation time necessary to analyze large-scale aerospace structures. This paper presents algorithms developed for and implemented on massively-parallel computers, hereafter referred to as Scalable High-Performance Computers (SHPC), for the most computationally intensive tasks involved in structural analysis, namely, generation and assembly of system matrices, solution of systems of equations and calculation of the eigenvalues and eigenvectors. Results on SHPC are presented for large-scale structural problems (i.e. models for High-Speed Civil Transport). The goal of this research is to develop a new, efficient technique which extends structural analysis to SHPC and makes large-scale structural analyses tractable.

  8. Review of Dynamic Modeling and Simulation of Large Scale Belt Conveyor System

    NASA Astrophysics Data System (ADS)

    He, Qing; Li, Hong

    The belt conveyor is one of the most important devices for transporting bulk-solid material over long distances. Dynamic analysis is key to deciding whether a design is technically sound, safe and reliable in operation, and economically feasible. Studying dynamic properties is therefore important for improving efficiency and productivity and for guaranteeing safe, reliable, and stable conveyor operation. This review discusses dynamic research on large-scale belt conveyors and its applications, and analyzes the main research topics and the state of the art. Future work is expected to focus on dynamic analysis, modeling, and simulation of the main components and the whole system, and on nonlinear modeling, simulation, and vibration analysis of large-scale conveyor systems.

  9. Integration of a constraint-based metabolic model of Brassica napus developing seeds with 13C-metabolic flux analysis

    PubMed Central

    Hay, Jordan O.; Shi, Hai; Heinzel, Nicolas; Hebbelmann, Inga; Rolletschek, Hardy; Schwender, Jorg

    2014-01-01

    The use of large-scale or genome-scale metabolic reconstructions for modeling and simulation of plant metabolism and integration of those models with large-scale omics and experimental flux data is becoming increasingly important in plant metabolic research. Here we report an updated version of bna572, a bottom-up reconstruction of oilseed rape (Brassica napus L.; Brassicaceae) developing seeds with emphasis on representation of biomass-component biosynthesis. New features include additional seed-relevant pathways for isoprenoid, sterol, phenylpropanoid, flavonoid, and choline biosynthesis. Being now based on standardized data formats and procedures for model reconstruction, bna572+ is available as a COBRA-compliant Systems Biology Markup Language (SBML) model and conforms to the Minimum Information Requested in the Annotation of Biochemical Models (MIRIAM) standards for annotation of external data resources. Bna572+ contains 966 genes, 671 reactions, and 666 metabolites distributed among 11 subcellular compartments. It is referenced to the Arabidopsis thaliana genome, with gene-protein-reaction (GPR) associations resolving subcellular localization. Detailed mass and charge balancing and confidence scoring were applied to all reactions. Using B. napus seed specific transcriptome data, expression was verified for 78% of bna572+ genes and 97% of reactions. Alongside bna572+ we also present a revised carbon centric model for 13C-Metabolic Flux Analysis (13C-MFA) with all its reactions being referenced to bna572+ based on linear projections. By integration of flux ratio constraints obtained from 13C-MFA and by elimination of infinite flux bounds around thermodynamically infeasible loops based on COBRA loopless methods, we demonstrate improvements in predictive power of Flux Variability Analysis (FVA). Using this combined approach we characterize the difference in metabolic flux of developing seeds of two B. napus genotypes contrasting in starch and oil content. 
PMID:25566296

  10. A fast boosting-based screening method for large-scale association study in complex traits with genetic heterogeneity.

    PubMed

    Wang, Lu-Yong; Fasulo, D

    2006-01-01

    Genome-wide association studies of complex diseases generate massive amounts of single nucleotide polymorphism (SNP) data. Univariate statistical tests (e.g., the Fisher exact test) have been used to single out non-associated SNPs. However, disease-susceptible SNPs may have little marginal effect in the population and are unlikely to be retained after univariate tests. Also, model-based methods are impractical for large-scale datasets. Moreover, genetic heterogeneity makes it harder for traditional methods to identify the genetic causes of diseases. A more recent random forest method provides a more robust approach for screening SNPs on the scale of thousands. However, for larger-scale data, e.g., Affymetrix Human Mapping 100K GeneChip data, a faster method is required to screen SNPs in whole-genome, large-scale association analysis with genetic heterogeneity. We propose a boosting-based method for rapid screening in large-scale analysis of complex traits in the presence of genetic heterogeneity. It provides a relatively fast and fairly good tool for screening and limiting the candidate SNPs for further, more complex computational modeling tasks.
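
    For orientation, the univariate baseline that the abstract contrasts with its boosting method can be sketched as follows. This is not the authors' boosting procedure; it is a plain two-sided Fisher exact test applied SNP by SNP, with hypothetical function names, a hypothetical 0.05 cutoff, and made-up allele counts.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k in the top-left cell, margins fixed.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are no more likely than the observed one.
    return sum(p for p in (prob(k) for k in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))


def screen_snps(tables, alpha=0.05):
    """Keep SNPs whose case/control allele table gives p < alpha."""
    return [name for name, counts in tables.items()
            if fisher_exact_two_sided(*counts) < alpha]


# Hypothetical counts: (case allele A, case allele B, control A, control B).
tables = {
    "snp_strong": (10, 0, 0, 10),
    "snp_null": (3, 1, 1, 3),
}
print(screen_snps(tables))
```

    A filter of this kind looks at each SNP in isolation, which is why, as the abstract notes, SNPs with little marginal effect tend to be dropped.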

  11. Large-scale identification of differentially expressed genes during pupa development reveals solute carrier gene is essential for pupal pigmentation in Chilo suppressalis.

    PubMed

    Sun, Yang; Huang, Shuijin; Wang, Shuping; Guo, Dianhao; Ge, Chang; Xiao, Huamei; Jie, Wencai; Yang, Qiupu; Teng, Xiaolu; Li, Fei

    2017-04-01

    Insects undergo metamorphosis, involving an abrupt change in body structure through cell growth and differentiation. The rice striped stem borer (SSB), Chilo suppressalis, is one of the most destructive rice pests. However, little is known about the regulatory mechanisms of metamorphic development in this notorious insect pest. Here, we studied the expression of 22,197 SSB genes at seven time points during pupa development with a customized microarray, identifying 622 differentially expressed genes (DEG) during pupa development. Gene ontology (GO) analysis of these DEGs indicated that the genes related to substance metabolism were highly expressed in the early pupa and participate in the physiological processes of larval tissue disintegration at these stages. In comparison, highly expressed genes in the late pupal stages were mainly associated with substance biosynthesis, consistent with adult organ formation at these stages. There were 27 solute carrier (SLC) genes that were highly expressed during pupa development. We knocked down SLC22A3 at the prepupal stage, demonstrating that silencing SLC22A3 induced a deficiency in pupa stiffness and pigmentation. The RNAi-treated individuals had white and soft pupae, suggesting that this gene has an essential role in pupal development. Copyright © 2016 Elsevier Ltd. All rights reserved.

  12. Imprinted gene expression in fetal growth and development.

    PubMed

    Lambertini, L; Marsit, C J; Sharma, P; Maccani, M; Ma, Y; Hu, J; Chen, J

    2012-06-01

    Experimental studies showed that genomic imprinting is fundamental in fetoplacental development by timely regulating the expression of the imprinted genes to oversee a set of events determining placenta implantation, growth and embryogenesis. We examined the expression profile of 22 imprinted genes which have been linked to pregnancy abnormalities that may ultimately influence childhood development. The study was conducted in a subset of 106 placenta samples, overrepresented with small and large for gestational age cases, from the Rhode Island Child Health Study. We investigated associations between imprinted gene expression and three fetal development parameters: newborn head circumference, birth weight, and size for gestational age. Results from our investigation show that the maternally imprinted/paternally expressed gene ZNF331 inversely associates with each parameter to drive smaller fetal size, while paternally imprinted/maternally expressed gene SLC22A18 directly associates with the newborn head circumference promoting growth. Multidimensional Scaling analysis revealed two clusters within the 22 imprinted genes which are independently associated with fetoplacental development. Our data suggest that cluster 1 genes work by assuring cell growth and tissue development, while cluster 2 genes act by coordinating these processes. Results from this epidemiologic study offer solid support for the key role of imprinting in fetoplacental development. Copyright © 2012 Elsevier Ltd. All rights reserved.

  13. Tissue microarray immunohistochemical detection of brachyury is not a prognostic indicator in chordoma.

    PubMed

    Zhang, Linlin; Guo, Shang; Schwab, Joseph H; Nielsen, G Petur; Choy, Edwin; Ye, Shunan; Zhang, Zhan; Mankin, Henry; Hornicek, Francis J; Duan, Zhenfeng

    2013-01-01

    Brachyury is a marker for notochord-derived tissues and neoplasms, such as chordoma. However, the prognostic relevance of brachyury expression in chordoma is still unknown. The improvement of tissue microarray technology has provided the opportunity to perform analyses of tumor tissues on a large scale in a uniform and consistent manner. This study was designed with the use of tissue microarray to determine the expression of brachyury. Brachyury expression in chordoma tissues from 78 chordoma patients was analyzed by immunohistochemical staining of tissue microarray. The clinicopathologic parameters, including gender, age, location of tumor and metastatic status were evaluated. Fifty-nine of 78 (75.64%) tumors showed nuclear staining for brachyury, and among them, 29 tumors (49.15%) showed 1+ (<30% positive cells) staining, 15 tumors (25.42%) had 2+ (31% to 60% positive cells) staining, and 15 tumors (25.42%) demonstrated 3+ (61% to 100% positive cells) staining. Brachyury nuclear staining was detected more frequently in sacral chordomas than in chordomas of the mobile spine. However, there was no significant relationship between brachyury expression and other clinical variables. By Kaplan-Meier analysis, brachyury expression failed to produce any significant relationship with the overall survival rate. In conclusion, brachyury expression is not a prognostic indicator in chordoma.
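
    The staining percentages in the abstract follow directly from the reported counts. The short check below recomputes them, using only the numbers quoted above; the variable names are illustrative.

```python
total_tumors = 78
positive_tumors = 59
grade_counts = {"1+": 29, "2+": 15, "3+": 15}  # counts among positive tumors

# The three staining grades should account for every positive tumor.
grades_total = sum(grade_counts.values())

# Positive tumors as a share of all tumors.
pct_positive = round(100 * positive_tumors / total_tumors, 2)

# Each grade as a share of the positive tumors only.
pct_by_grade = {
    grade: round(100 * count / positive_tumors, 2)
    for grade, count in grade_counts.items()
}

print(grades_total, pct_positive, pct_by_grade)
```

    Note that the grade percentages are taken over the 59 positive tumors, not over all 78 samples.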

  14. Reduction of adenovirus E1A mRNA by RNAi results in enhanced recombinant protein expression in transiently transfected HEK293 cells.

    PubMed

    Hacker, David L; Bertschinger, Martin; Baldi, Lucia; Wurm, Florian M

    2004-10-27

    Human embryonic kidney 293 (HEK293) cells, a widely used host for large-scale transient expression of recombinant proteins, are transformed with the adenovirus E1A and E1B genes. Because the E1A proteins function as transcriptional activators or repressors, they may have a positive or negative effect on transient transgene expression in this cell line. Suspension cultures of HEK293 EBNA (HEK293E) cells were co-transfected with a reporter plasmid expressing the GFP gene and a plasmid expressing a short hairpin RNA (shRNA) targeting the E1A mRNAs for degradation by RNA interference (RNAi). The presence of the shRNA in HEK293E cells reduced the steady state level of E1A mRNA up to 75% and increased transient GFP expression from either the elongation factor-1alpha (EF-1alpha) promoter or the human cytomegalovirus (HCMV) immediate early promoter up to twofold. E1A mRNA depletion also resulted in a twofold increase in transient expression of a recombinant IgG in both small- and large-scale suspension cultures when the IgG light and heavy chain genes were controlled by the EF-1alpha promoter. Finally, transient IgG expression was enhanced 2.5-fold when the anti-E1A shRNA was expressed from the same vector as the IgG light chain gene. These results demonstrated that E1A has a negative effect on transient gene expression in HEK293E cells, and they established that RNAi can be used to enhance recombinant protein expression in mammalian cells.

  15. Integrative analysis of the Caenorhabditis elegans genome by the modENCODE project.

    PubMed

    Gerstein, Mark B; Lu, Zhi John; Van Nostrand, Eric L; Cheng, Chao; Arshinoff, Bradley I; Liu, Tao; Yip, Kevin Y; Robilotto, Rebecca; Rechtsteiner, Andreas; Ikegami, Kohta; Alves, Pedro; Chateigner, Aurelien; Perry, Marc; Morris, Mitzi; Auerbach, Raymond K; Feng, Xin; Leng, Jing; Vielle, Anne; Niu, Wei; Rhrissorrakrai, Kahn; Agarwal, Ashish; Alexander, Roger P; Barber, Galt; Brdlik, Cathleen M; Brennan, Jennifer; Brouillet, Jeremy Jean; Carr, Adrian; Cheung, Ming-Sin; Clawson, Hiram; Contrino, Sergio; Dannenberg, Luke O; Dernburg, Abby F; Desai, Arshad; Dick, Lindsay; Dosé, Andréa C; Du, Jiang; Egelhofer, Thea; Ercan, Sevinc; Euskirchen, Ghia; Ewing, Brent; Feingold, Elise A; Gassmann, Reto; Good, Peter J; Green, Phil; Gullier, Francois; Gutwein, Michelle; Guyer, Mark S; Habegger, Lukas; Han, Ting; Henikoff, Jorja G; Henz, Stefan R; Hinrichs, Angie; Holster, Heather; Hyman, Tony; Iniguez, A Leo; Janette, Judith; Jensen, Morten; Kato, Masaomi; Kent, W James; Kephart, Ellen; Khivansara, Vishal; Khurana, Ekta; Kim, John K; Kolasinska-Zwierz, Paulina; Lai, Eric C; Latorre, Isabel; Leahey, Amber; Lewis, Suzanna; Lloyd, Paul; Lochovsky, Lucas; Lowdon, Rebecca F; Lubling, Yaniv; Lyne, Rachel; MacCoss, Michael; Mackowiak, Sebastian D; Mangone, Marco; McKay, Sheldon; Mecenas, Desirea; Merrihew, Gennifer; Miller, David M; Muroyama, Andrew; Murray, John I; Ooi, Siew-Loon; Pham, Hoang; Phippen, Taryn; Preston, Elicia A; Rajewsky, Nikolaus; Rätsch, Gunnar; Rosenbaum, Heidi; Rozowsky, Joel; Rutherford, Kim; Ruzanov, Peter; Sarov, Mihail; Sasidharan, Rajkumar; Sboner, Andrea; Scheid, Paul; Segal, Eran; Shin, Hyunjin; Shou, Chong; Slack, Frank J; Slightam, Cindie; Smith, Richard; Spencer, William C; Stinson, E O; Taing, Scott; Takasaki, Teruaki; Vafeados, Dionne; Voronina, Ksenia; Wang, Guilin; Washington, Nicole L; Whittle, Christina M; Wu, Beijing; Yan, Koon-Kiu; Zeller, Georg; Zha, Zheng; Zhong, Mei; Zhou, Xingliang; Ahringer, Julie; Strome, Susan; Gunsalus, Kristin C; 
    Micklem, Gos; Liu, X Shirley; Reinke, Valerie; Kim, Stuart K; Hillier, LaDeana W; Henikoff, Steven; Piano, Fabio; Snyder, Michael; Stein, Lincoln; Lieb, Jason D; Waterston, Robert H

    2010-12-24

    We systematically generated large-scale data sets to improve genome annotation for the nematode Caenorhabditis elegans, a key model organism. These data sets include transcriptome profiling across a developmental time course, genome-wide identification of transcription factor-binding sites, and maps of chromatin organization. From this, we created more complete and accurate gene models, including alternative splice forms and candidate noncoding RNAs. We constructed hierarchical networks of transcription factor-binding and microRNA interactions and discovered chromosomal locations bound by an unusually large number of transcription factors. Different patterns of chromatin composition and histone modification were revealed between chromosome arms and centers, with similarly prominent differences between autosomes and the X chromosome. Integrating data types, we built statistical models relating chromatin, transcription factor binding, and gene expression. Overall, our analyses ascribed putative functions to most of the conserved genome.

  16. Research highlights: microfluidics meets big data.

    PubMed

    Tseng, Peter; Weaver, Westbrook M; Masaeli, Mahdokht; Owsley, Keegan; Di Carlo, Dino

    2014-03-07

    In this issue we highlight a collection of recent work in which microfluidic parallelization and automation have been employed to address the increasing need for large amounts of quantitative data concerning cellular function--from correlating microRNA levels to protein expression, increasing the throughput and reducing the noise when studying protein dynamics in single-cells, and understanding how signal dynamics encodes information. The painstaking dissection of cellular pathways one protein at a time appears to be coming to an end, leading to more rapid discoveries which will inevitably translate to better cellular control--in producing useful gene products and treating disease at the individual cell level. From these studies it is also clear that development of large scale mutant or fusion libraries, automation of microscopy, image analysis, and data extraction will be key components as microfluidics contributes its strengths to aid systems biology moving forward.

  17. Renormalization-group theory of plasma microturbulence

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Carati, D.; Chriaa, K.; Balescu, R.

    1994-08-01

    The dynamical renormalization-group methods are applied to the gyrokinetic equation describing drift-wave turbulence in plasmas. As in both magnetohydrodynamic and neutral turbulence, small-scale fluctuations appear to act as effective dissipative processes on large-scale phenomena. A linear renormalized gyrokinetic equation is derived. No artificial forcing is introduced into the equations and all the renormalized corrections are expressed in terms of the fluctuating electric potential. The link with the quasilinear limit and the direct interaction approximation is investigated. Simple analytical expressions for the anomalous transport coefficients are derived by using the linear renormalized gyrokinetic equation. Examples show that both quasilinear and Bohm scalings can be recovered depending on the spectral amplitude of the electric potential fluctuations.

  18. Higher Education Teachers' Descriptions of Their Own Learning: A Large-Scale Study of Finnish Universities of Applied Sciences

    ERIC Educational Resources Information Center

    Töytäri, Aija; Piirainen, Arja; Tynjälä, Päivi; Vanhanen-Nuutinen, Liisa; Mäki, Kimmo; Ilves, Vesa

    2016-01-01

    In this large-scale study, higher education teachers' descriptions of their own learning were examined with qualitative analysis involving application of principles of phenomenographic research. This study is unique: it is unusual to use large-scale data in qualitative studies. The data were collected through an e-mail survey sent to 5960 teachers…

  19. Automated deep-phenotyping of the vertebrate brain

    PubMed Central

    Allalou, Amin; Wu, Yuelong; Ghannad-Rezaie, Mostafa; Eimon, Peter M; Yanik, Mehmet Fatih

    2017-01-01

    Here, we describe an automated platform suitable for large-scale deep-phenotyping of zebrafish mutant lines, which uses optical projection tomography to rapidly image brain-specific gene expression patterns in 3D at cellular resolution. Registration algorithms and correlation analysis are then used to compare 3D expression patterns, to automatically detect all statistically significant alterations in mutants, and to map them onto a brain atlas. Automated deep-phenotyping of a mutation in the master transcriptional regulator fezf2 not only detects all known phenotypes but also uncovers important novel neural deficits that were overlooked in previous studies. In the telencephalon, we show for the first time that fezf2 mutant zebrafish have significant patterning deficits, particularly in glutamatergic populations. Our findings reveal unexpected parallels between fezf2 function in zebrafish and mice, where mutations cause deficits in glutamatergic neurons of the telencephalon-derived neocortex. DOI: http://dx.doi.org/10.7554/eLife.23379.001 PMID:28406399

  20. Formal methods for modeling and analysis of hybrid systems

    NASA Technical Reports Server (NTRS)

    Tiwari, Ashish (Inventor); Lincoln, Patrick D. (Inventor)

    2009-01-01

    A technique based on the use of a quantifier elimination decision procedure for real closed fields and simple theorem proving to construct a series of successively finer qualitative abstractions of hybrid automata is taught. The resulting abstractions are always discrete transition systems which can then be used by any traditional analysis tool. The constructed abstractions are conservative and can be used to establish safety properties of the original system. The technique works on linear and non-linear polynomial hybrid systems: the guards on discrete transitions and the continuous flows in all modes can be specified using arbitrary polynomial expressions over the continuous variables. An exemplar tool in the SAL environment built over the theorem prover PVS is detailed. The technique scales well to large and complex hybrid systems.

  1. Production of recombinant antigens and antibodies in Nicotiana benthamiana using 'magnifection' technology: GMP-compliant facilities for small- and large-scale manufacturing.

    PubMed

    Klimyuk, Victor; Pogue, Gregory; Herz, Stefan; Butler, John; Haydon, Hugh

    2014-01-01

    This review describes the adaptation of the plant virus-based transient expression system, magnICON®, for the at-scale manufacturing of pharmaceutical proteins. The system utilizes so-called "deconstructed" viral vectors that rely on Agrobacterium-mediated systemic delivery into the plant cells for recombinant protein production. The system is also suitable for production of hetero-oligomeric proteins like immunoglobulins. By taking advantage of well-established R&D tools for optimizing the expression of the protein of interest using this system, product concepts can reach the manufacturing stage in highly competitive time periods. At the manufacturing stage, the system offers many remarkable features including rapid production cycles, high product yield, virtually unlimited scale-up potential, and flexibility for different manufacturing schemes. The magnICON system has been successfully adapted to very different logistical manufacturing formats: (1) speedy production of multiple small batches of individualized pharmaceutical proteins (e.g. antigens comprising individualized vaccines to treat non-Hodgkin's lymphoma patients) and (2) large-scale production of other pharmaceutical proteins such as therapeutic antibodies. General descriptions of the prototype GMP-compliant manufacturing processes and facilities for the product formats that are in preclinical and clinical testing are provided.

  2. A Gene Co-Expression Network in Whole Blood of Schizophrenia Patients Is Independent of Antipsychotic-Use and Enriched for Brain-Expressed Genes

    PubMed Central

    de Jong, Simone; Boks, Marco P. M.; Fuller, Tova F.; Strengman, Eric; Janson, Esther; de Kovel, Carolien G. F.; Ori, Anil P. S.; Vi, Nancy; Mulder, Flip; Blom, Jan Dirk; Glenthøj, Birte; Schubart, Chris D.; Cahn, Wiepke; Kahn, René S.; Horvath, Steve; Ophoff, Roel A.

    2012-01-01

    Despite large-scale genome-wide association studies (GWAS), the underlying genes for schizophrenia are largely unknown. Additional approaches are therefore required to identify the genetic background of this disorder. Here we report findings from a large gene expression study in peripheral blood of schizophrenia patients and controls. We applied a systems biology approach to genome-wide expression data from whole blood of 92 medicated and 29 antipsychotic-free schizophrenia patients and 118 healthy controls. We show that gene expression profiling in whole blood can identify twelve large gene co-expression modules associated with schizophrenia. Several of these disease related modules are likely to reflect expression changes due to antipsychotic medication. However, two of the disease modules could be replicated in an independent second data set involving antipsychotic-free patients and controls. One of these robustly defined disease modules is significantly enriched with brain-expressed genes and with genetic variants that were implicated in a GWAS study, which could imply a causal role in schizophrenia etiology. The most highly connected intramodular hub gene in this module (ABCF1), is located in, and regulated by the major histocompatibility (MHC) complex, which is intriguing in light of the fact that common allelic variants from the MHC region have been implicated in schizophrenia. This suggests that the MHC increases schizophrenia susceptibility via altered gene expression of regulatory genes in this network. PMID:22761806
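
    The notions of a co-expression module and a "highly connected intramodular hub gene" can be made concrete with a small sketch. The abstract does not state how the network was built, so the code below assumes a generic weighted correlation network in which the adjacency between two genes is the absolute Pearson correlation raised to a power. The function names and the power `beta=6` are illustrative assumptions, and all genes passed in are treated as belonging to a single module.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def connectivity(expr, beta=6):
    """Sum of |correlation|**beta to all other genes, per gene.

    expr maps a gene name to its expression values across samples.
    beta is an assumed soft-threshold power, not a value from the abstract.
    """
    genes = list(expr)
    k = {g: 0.0 for g in genes}
    for i, gi in enumerate(genes):
        for gj in genes[i + 1:]:
            a = abs(pearson(expr[gi], expr[gj])) ** beta
            k[gi] += a
            k[gj] += a
    return k


def hub_gene(expr, beta=6):
    """Gene with the highest connectivity among the genes supplied."""
    k = connectivity(expr, beta)
    return max(k, key=k.get)
```

    Calling `connectivity` on the genes of one module and ranking the result gives a simple ordering of candidate hub genes; raising the correlation to a power suppresses weak correlations relative to strong ones.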

  3. Methods to increase reproducibility in differential gene expression via meta-analysis

    PubMed Central

    Sweeney, Timothy E.; Haynes, Winston A.; Vallania, Francesco; Ioannidis, John P.; Khatri, Purvesh

    2017-01-01

    Findings from clinical and biological studies are often not reproducible when tested in independent cohorts. Due to the testing of a large number of hypotheses and relatively small sample sizes, results from whole-genome expression studies in particular are often not reproducible. Compared to single-study analysis, gene expression meta-analysis can improve reproducibility by integrating data from multiple studies. However, there are multiple choices in designing and carrying out a meta-analysis. Yet, clear guidelines on best practices are scarce. Here, we hypothesized that studying subsets of very large meta-analyses would allow for systematic identification of best practices to improve reproducibility. We therefore constructed three very large gene expression meta-analyses from clinical samples, and then examined meta-analyses of subsets of the datasets (all combinations of datasets with up to N/2 samples and K/2 datasets) compared to a ‘silver standard’ of differentially expressed genes found in the entire cohort. We tested three random-effects meta-analysis models using this procedure. We showed relatively greater reproducibility with more-stringent effect size thresholds with relaxed significance thresholds; relatively lower reproducibility when imposing extraneous constraints on residual heterogeneity; and an underestimation of actual false positive rate by Benjamini–Hochberg correction. In addition, multivariate regression showed that the accuracy of a meta-analysis increased significantly with more included datasets even when controlling for sample size. PMID:27634930
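
    As a rough illustration of the two ingredients discussed above, the sketch below pools per-study effect sizes for one gene with a random-effects model and applies Benjamini-Hochberg adjustment across genes. The abstract does not name the three random-effects models that were tested; the DerSimonian-Laird estimator is used here only because it is a common choice, and the function names are hypothetical.

```python
import math


def dersimonian_laird(effects, variances):
    """Pool study-level effect sizes for one gene.

    effects: per-study effect sizes; variances: their sampling variances.
    Returns (pooled effect, standard error, between-study variance tau^2).
    """
    k = len(effects)
    w = [1.0 / v for v in variances]              # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_star = [1.0 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

    A gene would then be called differentially expressed when the absolute pooled effect exceeds an effect-size threshold and the adjusted p-value falls below a significance threshold, which are the two thresholds the abstract reports trading off against each other.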

  4. Temporal transcriptome profiling reveals expression partitioning of homeologous genes contributing to heat and drought acclimation in wheat (Triticum aestivum L.).

    PubMed

    Liu, Zhenshan; Xin, Mingming; Qin, Jinxia; Peng, Huiru; Ni, Zhongfu; Yao, Yingyin; Sun, Qixin

    2015-06-20

    Hexaploid wheat (Triticum aestivum) is a globally important crop. Heat, drought and their combination dramatically reduce wheat yield and quality, but the molecular mechanisms underlying wheat tolerance to extreme environments, especially stress combination, are largely unknown. As an allohexaploid, wheat consists of three closely related subgenomes (A, B, and D), and was reported to show improved tolerance to stress conditions compared to tetraploid. But so far very little is known about how wheat coordinates the expression of homeologous genes to cope with various environmental constraints on the whole-genome level. To explore the transcriptional response of wheat to the individual and combined stress, we performed high-throughput transcriptome sequencing of seedlings under normal conditions and of seedlings subjected to drought stress (DS), heat stress (HS) and their combination (HD) for 1 h and 6 h, and present the global reprogramming of gene expression in response to these three stresses. Gene Ontology (GO) enrichment analysis of DS, HS and HD responsive genes revealed overlapping and complex functional pathways among the three stresses. Moreover, 4,375 wheat transcription factors were identified on a whole-genome scale based on the released scaffold information by IWGSC, and 1,328 were responsive to stress treatments. Then, the regulatory network analysis of HSFs and DREBs indicated that they were both involved in the regulation of DS, HS and HD response and pointed to cross-talk between heat and drought stress. Finally, approximately 68.4% of homeologous genes were found to exhibit expression partitioning in response to DS, HS or HD, which was further confirmed by using quantitative RT-PCR and Nullisomic-Tetrasomic lines. A large proportion of wheat homeologs exhibited expression partitioning under normal and abiotic stresses, which possibly contributes to the wide adaptability and distribution of hexaploid wheat in response to various environmental constraints.

  5. Cross-disease transcriptomics: Unique IL-17A signaling in psoriasis lesions and an autoimmune PBMC signature

    PubMed Central

    Sarkar, Mrinal K.; Liang, Yun; Xing, Xianying; Gudjonsson, Johann E.

    2016-01-01

    Transcriptome studies of psoriasis have identified robust changes in mRNA expression through large-scale analysis of patient cohorts. These studies, however, have analyzed all mRNA changes in aggregate, without distinguishing between disease-specific and non-specific differentially expressed genes (DEGs). In this study, RNA-seq meta-analysis was used to identify (1) psoriasis-specific DEGs altered in few diseases besides psoriasis and (2) non-specific DEGs similarly altered in many other skin conditions. We show that few cutaneous DEGs are psoriasis-specific and that the two DEG classes differ in their cell type and cytokine associations. Psoriasis-specific DEGs are expressed by keratinocytes and induced by IL-17A, whereas non-specific DEGs are expressed by inflammatory cells and induced by IFN-gamma and TNF. PBMC-derived DEGs were more psoriasis-specific than cutaneous DEGs. Nonetheless, PBMC DEGs associated with MHC class I and NK cells were commonly downregulated in psoriasis and other autoimmune diseases (e.g., multiple sclerosis, sarcoidosis and juvenile rheumatoid arthritis). These findings demonstrate “cross-disease” transcriptomics as an approach to gain insights into the cutaneous and non-cutaneous psoriasis transcriptomes. This highlighted unique contributions of IL-17A to the cytokine network and uncovered a blood-based gene signature that links psoriasis to other diseases of autoimmunity. PMID:27206706

  6. Dynamics of a vertical-flow windrow vermicomposting system.

    PubMed

    Hanc, Ales; Castkova, Tereza; Kuzel, Stanislav; Cajthaml, Tomas

    2017-11-01

    Large-scale vermicomposting under outdoor conditions may differ from small-scale procedures in the laboratory. The present study evaluated changes in selected properties of a large-scale vertical-flow windrow vermicomposting system with continuous feeding with household biowaste. The windrow profile was divided into five layers of differing thickness and age after more than 12 months of vermicomposting. The top layer (0-30 cm, age <3 months) was characterised by partially decomposed organic matter with a high pH value and an elevated carbon/nitrogen (C/N) ratio. The earthworm biomass was 15 g kg⁻¹ with a population density of 125 earthworms per kilogram predominantly found in clusters. The greatest amount of fungi (3.5 µg g⁻¹ dw) and bacteria (62 µg g⁻¹ dw) (expressed as phospholipid fatty acid analysis) was found in this layer. Thus, the top layer could be used for an additional cycle of windrow vermicomposting and for the preparation of aqueous extracts to protect plants against diseases. The lower layers (graduated by 30 cm and by 3 months of age) were mature as reflected by the low content of ammonia nitrogen, ratio of ammonia to nitrate nitrogen and dissolved organic carbon, and high ion-exchange capacity and its ratio to carbon. These layers were characterised by elevated values for electrical conductivity, total content of nutrients, available magnesium content, and a relatively large bacterial/fungal ratio. On the basis of the observed properties, the bottom layers were predetermined as effective fertilisers.

  7. Identification of human circadian genes based on time course gene expression profiles by using a deep learning method.

    PubMed

    Cui, Peng; Zhong, Tingyan; Wang, Zhuo; Wang, Tao; Zhao, Hongyu; Liu, Chenglin; Lu, Hui

    2018-06-01

    Circadian genes are expressed periodically with an approximately 24-h period, and the identification and study of these genes can provide a deep understanding of the circadian control that plays significant roles in human health. Although many circadian gene identification algorithms have been developed, large numbers of false positives and low coverage are still major problems in this field. In this study we constructed a novel computational framework for circadian gene identification using deep neural networks (DNN) - a deep learning algorithm which can represent the raw form of data patterns without imposing assumptions on the expression distribution. Firstly, we transformed time-course gene expression data into categorical-state data to denote the changing trend of gene expression. Two distinct expression patterns emerged after clustering of the state data for circadian genes from our manually created learning dataset. DNN was then applied to discriminate the aperiodic genes and the two subtypes of periodic genes. In order to assess the performance of DNN, four commonly used machine learning methods including k-nearest neighbors, logistic regression, naïve Bayes, and support vector machines were used for comparison. The results show that the DNN model achieves the best balanced precision and recall. Next, we conducted large scale circadian gene detection using the trained DNN model for the remaining transcription profiles. By comparison with JTK_CYCLE and a study performed by Möller-Levet et al. (doi: https://doi.org/10.1073/pnas.1217154110), we identified 1132 novel periodic genes. Through the functional analysis of these novel circadian genes, we found that the GTPase superfamily exhibits distinct circadian expression patterns and may provide a molecular switch of circadian control of the functioning of the immune system in human blood. Our study provides novel insights into both the circadian gene identification field and the study of complex circadian-driven biological control. This article is part of a Special Issue entitled: Accelerating Precision Medicine through Genetic and Genomic Big Data Analysis edited by Yudong Cai & Tao Huang. Copyright © 2017. Published by Elsevier B.V.
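
    The step of turning a time course into categorical-state data can be sketched as follows. The abstract does not give the encoding that was actually used, so this is a minimal guess at one plausible scheme: each step between consecutive time points is labelled up, down or flat. The function name and the tolerance `tol` are assumptions made for illustration.

```python
def to_states(series, tol=0.1):
    """Encode the trend between consecutive time points.

    Returns a list one element shorter than series, with
    1 for an increase larger than tol, -1 for a decrease larger
    than tol, and 0 otherwise. tol is an assumed noise tolerance.
    """
    states = []
    for prev, curr in zip(series, series[1:]):
        delta = curr - prev
        if delta > tol:
            states.append(1)
        elif delta < -tol:
            states.append(-1)
        else:
            states.append(0)
    return states
```

    State vectors of this kind discard absolute expression levels and keep only the direction of change, which is what allows genes with different amplitudes to be clustered or classified by the shape of their trend.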

  8. De novo sequencing and characterization of floral transcriptome in two species of buckwheat (Fagopyrum)

    PubMed Central

    2011-01-01

    Background: Transcriptome sequencing data has become an integral component of modern genetics, genomics and evolutionary biology. However, despite advances in the technologies of DNA sequencing, such data are lacking for many groups of living organisms, in particular, many plant taxa. We present here the results of transcriptome sequencing for two closely related plant species. These species, Fagopyrum esculentum and F. tataricum, belong to the order Caryophyllales - a large group of flowering plants with uncertain evolutionary relationships. F. esculentum (common buckwheat) is also an important food crop. Despite these practical and evolutionary considerations Fagopyrum species have not been the subject of large-scale sequencing projects. Results: Normalized cDNA corresponding to genes expressed in flowers and inflorescences of F. esculentum and F. tataricum was sequenced using the 454 pyrosequencing technology. This resulted in 267 thousand (F. esculentum) and 229 thousand (F. tataricum) reads with an average length of 341-349 nucleotides. De novo assembly of the reads produced about 25 thousand contigs for each species, with 7.5-8.2× coverage. Comparative analysis of the two transcriptomes demonstrated their overall similarity but also revealed genes that are presumably differentially expressed. Among them are retrotransposon genes and genes involved in sugar biosynthesis and metabolism. Thirteen single-copy genes were used for phylogenetic analysis; the resulting trees are largely consistent with those inferred from multigenic plastid datasets. The sister relationship of the Caryophyllales and asterids has now gained high support from nuclear gene sequences. Conclusions: 454 transcriptome sequencing and de novo assembly were performed for two congeneric flowering plant species, F. esculentum and F. tataricum. As a result, a large set of cDNA sequences that represent orthologs of known plant genes as well as potential new genes was generated. PMID:21232141

  9. Astronomical algorithms for automated analysis of tissue protein expression in breast cancer

    PubMed Central

    Ali, H R; Irwin, M; Morris, L; Dawson, S-J; Blows, F M; Provenzano, E; Mahler-Araujo, B; Pharoah, P D; Walton, N A; Brenton, J D; Caldas, C

    2013-01-01

    Background: High-throughput evaluation of tissue biomarkers in oncology has been greatly accelerated by the widespread use of tissue microarrays (TMAs) and immunohistochemistry. Although TMAs have the potential to facilitate protein expression profiling on a scale to rival experiments of tumour transcriptomes, the bottleneck and imprecision of manually scoring TMAs has impeded progress. Methods: We report image analysis algorithms adapted from astronomy for the precise automated analysis of IHC in all subcellular compartments. The power of this technique is demonstrated using over 2000 breast tumours and comparing quantitative automated scores against manual assessment by pathologists. Results: All continuous automated scores showed good correlation with their corresponding ordinal manual scores. For oestrogen receptor (ER), the correlation was 0.82, P<0.0001, for BCL2 0.72, P<0.0001 and for HER2 0.62, P<0.0001. Automated scores showed excellent concordance with manual scores for the unsupervised assignment of cases to ‘positive’ or ‘negative’ categories with agreement rates of up to 96%. Conclusion: The adaptation of astronomical algorithms coupled with their application to large annotated study cohorts constitutes a powerful tool for the realisation of the enormous potential of digital pathology. PMID:23329232

  10. Genome-wide association meta-analysis in 269,867 individuals identifies new genetic and functional links to intelligence.

    PubMed

    Savage, Jeanne E; Jansen, Philip R; Stringer, Sven; Watanabe, Kyoko; Bryois, Julien; de Leeuw, Christiaan A; Nagel, Mats; Awasthi, Swapnil; Barr, Peter B; Coleman, Jonathan R I; Grasby, Katrina L; Hammerschlag, Anke R; Kaminski, Jakob A; Karlsson, Robert; Krapohl, Eva; Lam, Max; Nygaard, Marianne; Reynolds, Chandra A; Trampush, Joey W; Young, Hannah; Zabaneh, Delilah; Hägg, Sara; Hansell, Narelle K; Karlsson, Ida K; Linnarsson, Sten; Montgomery, Grant W; Muñoz-Manchado, Ana B; Quinlan, Erin B; Schumann, Gunter; Skene, Nathan G; Webb, Bradley T; White, Tonya; Arking, Dan E; Avramopoulos, Dimitrios; Bilder, Robert M; Bitsios, Panos; Burdick, Katherine E; Cannon, Tyrone D; Chiba-Falek, Ornit; Christoforou, Andrea; Cirulli, Elizabeth T; Congdon, Eliza; Corvin, Aiden; Davies, Gail; Deary, Ian J; DeRosse, Pamela; Dickinson, Dwight; Djurovic, Srdjan; Donohoe, Gary; Conley, Emily Drabant; Eriksson, Johan G; Espeseth, Thomas; Freimer, Nelson A; Giakoumaki, Stella; Giegling, Ina; Gill, Michael; Glahn, David C; Hariri, Ahmad R; Hatzimanolis, Alex; Keller, Matthew C; Knowles, Emma; Koltai, Deborah; Konte, Bettina; Lahti, Jari; Le Hellard, Stephanie; Lencz, Todd; Liewald, David C; London, Edythe; Lundervold, Astri J; Malhotra, Anil K; Melle, Ingrid; Morris, Derek; Need, Anna C; Ollier, William; Palotie, Aarno; Payton, Antony; Pendleton, Neil; Poldrack, Russell A; Räikkönen, Katri; Reinvang, Ivar; Roussos, Panos; Rujescu, Dan; Sabb, Fred W; Scult, Matthew A; Smeland, Olav B; Smyrnis, Nikolaos; Starr, John M; Steen, Vidar M; Stefanis, Nikos C; Straub, Richard E; Sundet, Kjetil; Tiemeier, Henning; Voineskos, Aristotle N; Weinberger, Daniel R; Widen, Elisabeth; Yu, Jin; Abecasis, Goncalo; Andreassen, Ole A; Breen, Gerome; Christiansen, Lene; Debrabant, Birgit; Dick, Danielle M; Heinz, Andreas; Hjerling-Leffler, Jens; Ikram, M Arfan; Kendler, Kenneth S; Martin, Nicholas G; Medland, Sarah E; Pedersen, Nancy L; Plomin, Robert; Polderman, Tinca J C; Ripke, Stephan; van der Sluis, 
    Sophie; Sullivan, Patrick F; Vrieze, Scott I; Wright, Margaret J; Posthuma, Danielle

    2018-06-25

    Intelligence is highly heritable¹ and a major determinant of human health and well-being². Recent genome-wide meta-analyses have identified 24 genomic loci linked to variation in intelligence³⁻⁷, but much about its genetic underpinnings remains to be discovered. Here, we present a large-scale genetic association study of intelligence (n = 269,867), identifying 205 associated genomic loci (190 new) and 1,016 genes (939 new) via positional mapping, expression quantitative trait locus (eQTL) mapping, chromatin interaction mapping, and gene-based association analysis. We find enrichment of genetic effects in conserved and coding regions and associations with 146 nonsynonymous exonic variants. Associated genes are strongly expressed in the brain, specifically in striatal medium spiny neurons and hippocampal pyramidal neurons. Gene set analyses implicate pathways related to nervous system development and synaptic structure. We confirm previous strong genetic correlations with multiple health-related outcomes, and Mendelian randomization analysis results suggest protective effects of intelligence for Alzheimer's disease and ADHD and bidirectional causation with pleiotropic effects for schizophrenia. These results are a major step forward in understanding the neurobiology of cognitive function as well as genetically related neurological and psychiatric disorders.

  11. Characterization of receptor of activated C kinase 1 (RACK1) and functional analysis during larval metamorphosis of the oyster Crassostrea angulata.

    PubMed

    Yang, Bingye; Pu, Fei; Qin, Ji; You, Weiwei; Ke, Caihuan

    2014-03-10

    During a large-scale screen of the larval transcriptome library of the Portuguese oyster, Crassostrea angulata, the oyster gene RACK, which encodes a receptor of activated protein kinase C protein, was isolated and characterized. The cDNA is 1,148 bp long and has a predicted open reading frame encoding 317 aa. The predicted protein shows high sequence identity to many RACK proteins of different organisms including molluscs, fish, amphibians and mammals, suggesting that it is conserved during evolution. The structural analysis of the Ca-RACK1 genomic sequence implies that the Ca-RACK1 gene has seven exons and six introns, extending approximately 6.5 kb in length. It is expressed ubiquitously in many oyster tissues as detected by RT-PCR analysis. Ca-RACK1 mRNA expression was markedly increased at larval metamorphosis, and was further increased along with Ca-RACK1 protein synthesis during epinephrine-induced metamorphosis. These results indicate that Ca-RACK1 plays an important role in tissue differentiation and/or in cell growth during larval metamorphosis in the oyster, C. angulata. Copyright © 2013 Elsevier B.V. All rights reserved.

  12. Modeling gene expression measurement error: a quasi-likelihood approach

    PubMed Central

    Strimmer, Korbinian

    2003-01-01

    Background: Using suitable error models for gene expression measurements is essential in the statistical analysis of microarray data. However, the true probabilistic model underlying gene expression intensity readings is generally not known. Instead, in currently used approaches some simple parametric model is assumed (usually a transformed normal distribution) or the empirical distribution is estimated. However, both these strategies may not be optimal for gene expression data, as the non-parametric approach ignores known structural information whereas the fully parametric models run the risk of misspecification. A further related problem is the choice of a suitable scale for the model (e.g. observed vs. log-scale). Results: Here a simple semi-parametric model for gene expression measurement error is presented. In this approach inference is based on an approximate likelihood function (the extended quasi-likelihood). Only partial knowledge about the unknown true distribution is required to construct this function. In case of gene expression this information is available in the form of the postulated (e.g. quadratic) variance structure of the data. As the quasi-likelihood behaves (almost) like a proper likelihood, it allows for the estimation of calibration and variance parameters, and it is also straightforward to obtain corresponding approximate confidence intervals. Unlike most other frameworks, it also allows analysis on any preferred scale, i.e. both on the original linear scale as well as on a transformed scale. It can also be employed in regression approaches to model systematic (e.g. array or dye) effects. Conclusions: The quasi-likelihood framework provides a simple and versatile approach to analyze gene expression data that does not make any strong distributional assumptions about the underlying error model. For several simulated as well as real data sets it provides a better fit to the data than competing models. In an example it also improved the power of tests to identify differential expression. PMID:12659637

  13. Design and analysis of large-scale biological rhythm studies: a comparison of algorithms for detecting periodic signals in biological data

    PubMed Central

    Deckard, Anastasia; Anafi, Ron C.; Hogenesch, John B.; Haase, Steven B.; Harer, John

    2013-01-01

    Motivation: To discover and study periodic processes in biological systems, we sought to identify periodic patterns in their gene expression data. We surveyed a large number of available methods for identifying periodicity in time series data and chose representatives of different mathematical perspectives that performed well on both synthetic data and biological data. Synthetic data were used to evaluate how each algorithm responds to different curve shapes, periods, phase shifts, noise levels and sampling rates. The biological datasets we tested represent a variety of periodic processes from different organisms, including the cell cycle and metabolic cycle in Saccharomyces cerevisiae, circadian rhythms in Mus musculus and the root clock in Arabidopsis thaliana. Results: From these results, we discovered that each algorithm had different strengths. Based on our findings, we make recommendations for selecting and applying these methods depending on the nature of the data and the periodic patterns of interest. Additionally, these results can also be used to inform the design of large-scale biological rhythm experiments so that the resulting data can be used with these algorithms to detect periodic signals more effectively. Contact: anastasia.deckard@duke.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:24058056
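
    As a rough illustration of what detecting a periodic signal in an expression time series involves, the sketch below uses a plain discrete Fourier periodogram. This is an assumption made for illustration and is not claimed to be one of the algorithms the authors compared; the function name, the 2-hour sampling interval and the synthetic 24-hour rhythm are all hypothetical.

```python
import numpy as np

def dominant_period(values, dt):
    """Return (period, power_fraction) of the strongest non-zero frequency.

    values: evenly sampled measurements; dt: sampling interval.
    """
    x = np.asarray(values, dtype=float)
    x = x - x.mean()                        # remove the constant offset
    power = np.abs(np.fft.rfft(x)) ** 2     # periodogram
    freqs = np.fft.rfftfreq(x.size, d=dt)
    power, freqs = power[1:], freqs[1:]     # drop the zero-frequency bin
    k = int(np.argmax(power))
    return 1.0 / freqs[k], power[k] / power.sum()

# Synthetic example (hypothetical data): a 24 h rhythm with noise,
# sampled every 2 h for 96 h.
rng = np.random.default_rng(0)
t = np.arange(0, 96, 2.0)
signal = np.cos(2 * np.pi * t / 24.0) + 0.3 * rng.normal(size=t.size)
period, fraction = dominant_period(signal, dt=2.0)
```

    A periodogram of this kind only resolves periods that fit the sampling grid, which is one reason the sampling rate and the number of cycles covered matter when designing rhythm experiments.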

  14. Multiplex titration RT-PCR: rapid determination of gene expression patterns for a large number of genes

    NASA Technical Reports Server (NTRS)

    Nebenfuhr, A.; Lomax, T. L.

    1998-01-01

    We have developed an improved method for determination of gene expression levels with RT-PCR. The procedure is rapid and does not require extensive optimization or densitometric analysis. Since the detection of individual transcripts is PCR-based, small amounts of tissue samples are sufficient for the analysis of expression patterns in large gene families. Using this method, we were able to rapidly screen nine members of the Aux/IAA family of auxin-responsive genes and identify those genes which vary in message abundance in a tissue- and light-specific manner. While not offering the accuracy of conventional semi-quantitative or competitive RT-PCR, our method allows quick screening of large numbers of genes in a wide range of RNA samples with just a thermal cycler and standard gel analysis equipment.

  15. ARHGAP18 is a novel gene under positive natural selection that influences HbF levels in β-thalassaemia.

    PubMed

    He, Yunyan; Luo, Jianming; Chen, Yang; Zhou, Xiaoheng; Yu, Shanjuan; Jin, Ling; Xiao, Xuan; Jia, Siyuan; Liu, Qiang

    2018-02-01

    Foetal haemoglobin (HbF) plays a dominant role in ameliorating the morbidity and mortality of β-thalassaemia. A better understanding of the loci and genes involved in HbF expression would be beneficial for the treatment of β-thalassaemia major. However, the genes associated with HbF expression remain largely unknown. In this study, we first explored large-scale data sets and examined the human genome for evidence of positive natural selection to screen out single nucleotide polymorphisms (SNPs). A genetic analysis of HbF levels was conducted in a Chinese cohort of patients with β-thalassaemia to confirm the bioinformatics results. A total of 1141 subjects with β-thalassaemia were recruited. The results showed that the SNP rs11759328 in the ARHGAP18 gene was significantly associated with HbF levels (P = 5.1 × 10⁻⁴). ARHGAP18 belongs to the RhoGAP family and controls angiogenesis, cellular morphology and motility. Second, after determining that ARHGAP18 was highly expressed in the human K562 cell line, we used lentiviral-mediated small interfering RNA to knock down ARHGAP18 expression and subsequently assessed cell proliferation and apoptosis using cell proliferation assays and flow cytometry, respectively. ARHGAP18 downregulation in K562 cells significantly increased HBG1/2 expression and apoptosis, but proliferation was not significantly affected in vitro. Our data suggest that ARHGAP18, which was located by the SNP rs11759328 via positive selection, plays a potential role in regulating HbF expression in β-thalassaemia and may be a promising therapeutic target. Knockout studies of ARHGAP18 warrant further investigation into its aetiology in HbF.

  16. Effect of helicity on the correlation time of large scales in turbulent flows

    NASA Astrophysics Data System (ADS)

    Cameron, Alexandre; Alexakis, Alexandros; Brachet, Marc-Étienne

    2017-11-01

    Solutions of the forced Navier-Stokes equation have been conjectured to thermalize at scales larger than the forcing scale, similar to an absolute equilibrium obtained for the spectrally truncated Euler equation. Using direct numeric simulations of Taylor-Green flows and general-periodic helical flows, we present results on the probability density function, energy spectrum, autocorrelation function, and correlation time that compare the two systems. In the case of highly helical flows, we derive an analytic expression describing the correlation time for the absolute equilibrium of helical flows that is different from the $E^{-1/2}k^{-1}$ scaling law of weakly helical flows. This model predicts a new helicity-based scaling law for the correlation time as $\tau(k) \sim H^{-1/2}k^{-1/2}$. This scaling law is verified in simulations of the truncated Euler equation. In simulations of the Navier-Stokes equations the large-scale modes of forced Taylor-Green symmetric flows (with zero total helicity and large separation of scales) follow the same properties as absolute equilibrium including a $\tau(k) \sim E^{-1/2}k^{-1}$ scaling for the correlation time. General-periodic helical flows also show similarities between the two systems; however, the largest scales of the forced flows deviate from the absolute equilibrium solutions.

  17. Gene expression metadata analysis reveals molecular mechanisms employed by Phanerochaete chrysosporium during lignin degradation and detoxification of plant extractives.

    PubMed

    Kameshwar, Ayyappa Kumar Sista; Qin, Wensheng

    2017-10-01

    Lignin, the most complex and abundant biopolymer on the earth's surface, attains its stability from intricate polyphenolic units and non-phenolic bonds, making it difficult to depolymerize or separate from other units of biomass. Its unusual lignin-degrading ability and the availability of an annotated genome make Phanerochaete chrysosporium ideal for studying lignin-degrading mechanisms. Decoding and understanding the molecular mechanisms underlying the process of lignin degradation will significantly aid the progressing biofuel industries and lead to the production of commercially vital platform chemicals. In this study, we have performed a large-scale metadata analysis to understand the common gene expression patterns of P. chrysosporium during lignin degradation. Gene expression datasets were retrieved from the NCBI GEO database and analyzed using GEO2R and Bioconductor packages. Commonly expressed statistically significant genes among different datasets were further considered to understand their involvement in lignin degradation and detoxification mechanisms. We have observed three sets of enzymes commonly expressed during ligninolytic conditions which were later classified into primary ligninolytic, aromatic compound-degrading and other necessary enzymes. Similarly, we have observed three sets of genes coding for detoxification and stress-responsive, phase I and phase II metabolic enzymes. Results obtained in this study indicate the coordinated action of enzymes involved in lignin depolymerization and detoxification-stress responses under ligninolytic conditions. We have developed a tentative network of genes and enzymes involved in lignin degradation and detoxification mechanisms by P. chrysosporium based on the literature and results obtained in this study. However, the ambiguity arising from the higher expression of several uncharacterized proteins necessitates further proteomic studies in P. chrysosporium.

  18. Inferring causal genomic alterations in breast cancer using gene expression data

    PubMed Central

    2011-01-01

    Background One of the primary objectives in cancer research is to identify causal genomic alterations, such as somatic copy number variation (CNV) and somatic mutations, during tumor development. Many valuable studies lack genomic data to detect CNV; therefore, methods that are able to infer CNVs from gene expression data would help maximize the value of these studies. Results We developed a framework for identifying recurrent regions of CNV and distinguishing the cancer driver genes from the passenger genes in the regions. By inferring CNV regions across many datasets we were able to identify 109 recurrent amplified/deleted CNV regions. Many of these regions are enriched for genes involved in many important processes associated with tumorigenesis and cancer progression. Genes in these recurrent CNV regions were then examined in the context of gene regulatory networks to prioritize putative cancer driver genes. The cancer driver genes uncovered by the framework include not only well-known oncogenes but also a number of novel cancer susceptibility genes validated via siRNA experiments. Conclusions To our knowledge, this is the first effort to systematically identify and validate drivers for expression-based CNV regions in breast cancer. The framework, in which wavelet analysis of expression-based copy number alteration is coupled with gene regulatory network analysis, provides a blueprint for leveraging genomic data to identify key regulatory components and gene targets. This integrative approach can be applied to many other large-scale gene expression studies and other novel types of cancer data such as next-generation sequencing-based expression (RNA-Seq) as well as CNV data. PMID:21806811

  19. Statistical analysis of mesoscale rainfall: Dependence of a random cascade generator on large-scale forcing

    NASA Technical Reports Server (NTRS)

    Over, Thomas M.; Gupta, Vijay K.

    1994-01-01

    Under the theory of independent and identically distributed random cascades, the probability distribution of the cascade generator determines the spatial and the ensemble properties of spatial rainfall. Three sets of radar-derived rainfall data in space and time are analyzed to estimate the probability distribution of the generator. A detailed comparison between instantaneous scans of spatial rainfall and simulated cascades using the scaling properties of the marginal moments is carried out. This comparison highlights important similarities and differences between the data and the random cascade theory. Differences are quantified and measured for the three datasets. Evidence is presented to show that the scaling properties of the rainfall can be captured to the first order by a random cascade with a single parameter. The dependence of this parameter on forcing by the large-scale meteorological conditions, as measured by the large-scale spatial average rain rate, is investigated for these three datasets. The data show that this dependence can be captured by a one-to-one function. Since the large-scale average rain rate can be diagnosed from the large-scale dynamics, this relationship demonstrates an important linkage between the large-scale atmospheric dynamics and the statistical cascade theory of mesoscale rainfall. Potential application of this research to parameterization of runoff from the land surface and regional flood frequency analysis is briefly discussed, and open problems for further research are presented.
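
    To make the idea of a single-parameter random cascade concrete, here is a minimal sketch of a discrete multiplicative cascade on a square grid. The intermittency ("beta model") generator used here, the 2 x 2 branching and the parameter values are assumptions chosen for illustration; the abstract does not specify the generator's form, and the function name is hypothetical.

```python
import numpy as np

def beta_cascade(levels, beta, rng):
    """Simulate a 2-D discrete random cascade on a 2**levels square grid."""
    b = 4                                # each cell splits into 2 x 2 subcells
    p_wet = b ** (-beta)                 # chance that a subcell keeps its rain
    field = np.ones((1, 1))
    for _ in range(levels):
        field = np.kron(field, np.ones((2, 2)))          # refine the grid
        wet = rng.random(field.shape) < p_wet
        # Generator W is 1/p_wet with probability p_wet, else 0, so E[W] = 1.
        field = field * np.where(wet, 1.0 / p_wet, 0.0)
    return field

rng = np.random.default_rng(3)
uniform = beta_cascade(levels=6, beta=0.0, rng=rng)   # no intermittency
patchy = beta_cascade(levels=6, beta=0.3, rng=rng)    # intermittent field
wet_fraction = float((patchy > 0).mean())
```

    In this sketch the single parameter beta controls how much of the domain stays rainy: beta = 0 gives a uniform field, and larger values give sparser, more intense rain cells.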

  20. Scaling of the Urban Water Footprint: An Analysis of 65 Mid- to Large-Sized U.S. Metropolitan Areas

    NASA Astrophysics Data System (ADS)

    Mahjabin, T.; Garcia, S.; Grady, C.; Mejia, A.

    2017-12-01

    Scaling laws have been shown to be relevant to a range of disciplines including biology, ecology, hydrology, and physics, among others. Recently, scaling was shown to be important for understanding and characterizing cities. For instance, it was found that urban infrastructure (water supply pipes and electrical wires) tends to scale sublinearly with city population, implying that large cities are more efficient. In this study, we explore the scaling of the water footprint of cities. The water footprint is a measure of water appropriation that considers both the direct and indirect (virtual) water use of a consumer or producer. Here we compute the water footprint of 65 mid- to large-sized U.S. metropolitan areas, accounting for direct and indirect water uses associated with agricultural and industrial commodities, and residential and commercial water uses. We find that the urban water footprint, computed as the sum of the water footprint of consumption and production, exhibits sublinear scaling with an exponent of 0.89. This suggests the possibility of large cities being more water-efficient than small ones. To further assess this result, we conduct additional analysis by accounting for international flows, and the effects of green water and city boundary definition on the scaling. The analysis confirms the scaling and provides additional insight about its interpretation.
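
    The sublinear exponent can be read as follows: if the footprint W scales with population N as W = a N^β with β = 0.89, doubling the population multiplies the footprint by 2^0.89 ≈ 1.85, so the per-capita footprint falls by roughly 7%. A minimal sketch of how such an exponent might be estimated by log-log regression is given below; the synthetic city sizes, the prefactor and the noise level are hypothetical and are not the study's data.

```python
import numpy as np

def fit_scaling_exponent(population, footprint):
    """Least-squares fit of log(footprint) = log(a) + beta * log(population)."""
    beta, log_a = np.polyfit(np.log(population), np.log(footprint), deg=1)
    return beta, np.exp(log_a)

# Hypothetical data: 65 city sizes with a known exponent plus scatter.
rng = np.random.default_rng(1)
population = np.logspace(5.5, 7.3, 65)
footprint = 2.0 * population ** 0.89 * np.exp(0.05 * rng.normal(size=65))

beta, a = fit_scaling_exponent(population, footprint)
per_capita_ratio = 2.0 ** (beta - 1.0)   # per-capita change when N doubles
```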

  1. Disentangling Detoxification: Gene Expression Analysis of Feeding Mountain Pine Beetle Illuminates Molecular-Level Host Chemical Defense Detoxification Mechanisms

    PubMed Central

    Robert, Jeanne A.; Pitt, Caitlin; Bonnett, Tiffany R.; Yuen, Macaire M. S.; Keeling, Christopher I.; Bohlmann, Jörg; Huber, Dezene P. W.

    2013-01-01

    The mountain pine beetle, Dendroctonus ponderosae, is a native species of bark beetle (Coleoptera: Curculionidae) that caused unprecedented damage to the pine forests of British Columbia and other parts of western North America and is currently expanding its range into the boreal forests of central and eastern Canada and the USA. We conducted a large-scale gene expression analysis (RNA-seq) of mountain pine beetle male and female adults either starved or fed in male-female pairs for 24 hours on lodgepole pine host tree tissues. Our aim was to uncover transcripts involved in coniferophagous mountain pine beetle detoxification systems during early host colonization. Transcripts of members from several gene families significantly increased in insects fed on host tissue including: cytochromes P450, glucosyl transferases and glutathione S-transferases, esterases, and one ABC transporter. Other significantly increasing transcripts with potential roles in detoxification of host defenses included alcohol dehydrogenases and a group of unexpected transcripts whose products may play an, as yet, undiscovered role in host colonization by mountain pine beetle. PMID:24223726

  2. Disentangling detoxification: gene expression analysis of feeding mountain pine beetle illuminates molecular-level host chemical defense detoxification mechanisms.

    PubMed

    Robert, Jeanne A; Pitt, Caitlin; Bonnett, Tiffany R; Yuen, Macaire M S; Keeling, Christopher I; Bohlmann, Jörg; Huber, Dezene P W

    2013-01-01

    The mountain pine beetle, Dendroctonus ponderosae, is a native species of bark beetle (Coleoptera: Curculionidae) that caused unprecedented damage to the pine forests of British Columbia and other parts of western North America and is currently expanding its range into the boreal forests of central and eastern Canada and the USA. We conducted a large-scale gene expression analysis (RNA-seq) of mountain pine beetle male and female adults either starved or fed in male-female pairs for 24 hours on lodgepole pine host tree tissues. Our aim was to uncover transcripts involved in coniferophagous mountain pine beetle detoxification systems during early host colonization. Transcripts of members from several gene families significantly increased in insects fed on host tissue including: cytochromes P450, glucosyl transferases and glutathione S-transferases, esterases, and one ABC transporter. Other significantly increasing transcripts with potential roles in detoxification of host defenses included alcohol dehydrogenases and a group of unexpected transcripts whose products may play an, as yet, undiscovered role in host colonization by mountain pine beetle.

  3. Quantitative image analysis of cellular heterogeneity in breast tumors complements genomic profiling.

    PubMed

    Yuan, Yinyin; Failmezger, Henrik; Rueda, Oscar M; Ali, H Raza; Gräf, Stefan; Chin, Suet-Feung; Schwarz, Roland F; Curtis, Christina; Dunning, Mark J; Bardwell, Helen; Johnson, Nicola; Doyle, Sarah; Turashvili, Gulisa; Provenzano, Elena; Aparicio, Sam; Caldas, Carlos; Markowetz, Florian

    2012-10-24

    Solid tumors are heterogeneous tissues composed of a mixture of cancer and normal cells, which complicates the interpretation of their molecular profiles. Furthermore, tissue architecture is generally not reflected in molecular assays, rendering this rich information underused. To address these challenges, we developed a computational approach based on standard hematoxylin and eosin-stained tissue sections and demonstrated its power in a discovery and validation cohort of 323 and 241 breast tumors, respectively. To deconvolute cellular heterogeneity and detect subtle genomic aberrations, we introduced an algorithm based on tumor cellularity to increase the comparability of copy number profiles between samples. We next devised a predictor for survival in estrogen receptor-negative breast cancer that integrated both image-based and gene expression analyses and significantly outperformed classifiers that use single data types, such as microarray expression signatures. Image processing also allowed us to describe and validate an independent prognostic factor based on quantitative analysis of spatial patterns between stromal cells, which are not detectable by molecular assays. Our quantitative, image-based method could benefit any large-scale cancer study by refining and complementing molecular assays of tumor samples.

  4. Perceiving Facial and Vocal Expressions of Emotion in Individuals with Williams Syndrome

    ERIC Educational Resources Information Center

    Plesa-Skwerer, Daniela; Faja, Susan; Schofield, Casey; Verbalis, Alyssa; Tager-Flusberg, Helen

    2006-01-01

    People with Williams syndrome are extremely sociable, empathic, and expressive in communication. Some researchers suggest they may be especially sensitive to perceiving emotional expressions. We administered the Faces and Paralanguage subtests of the Diagnostic Analysis of Nonverbal Accuracy Scale (DANVA2), a standardized measure of emotion…

  5. Proteomic analysis in type 2 diabetes patients before and after a very low calorie diet reveals potential disease state and intervention specific biomarkers.

    PubMed

    Sleddering, Maria A; Markvoort, Albert J; Dharuri, Harish K; Jeyakar, Skhandhan; Snel, Marieke; Juhasz, Peter; Lynch, Moira; Hines, Wade; Li, Xiaohong; Jazet, Ingrid M; Adourian, Aram; Hilbers, Peter A J; Smit, Johannes W A; Van Dijk, Ko Willems

    2014-01-01

    Very low calorie diets (VLCD) with and without exercise programs lead to major metabolic improvements in obese type 2 diabetes patients. The mechanisms underlying these improvements have so far not been elucidated fully. To further investigate the mechanisms of a VLCD with or without exercise and to uncover possible biomarkers associated with these interventions, blood samples were collected from 27 obese type 2 diabetes patients before and after a 16-week VLCD (Modifast ∼ 450 kcal/day). Thirteen of these patients followed an exercise program in addition to the VLCD. Plasma was obtained from 27 lean and 27 obese controls as well. Proteomic analysis was performed using mass spectrometry (MS) and targeted multiple reaction monitoring (MRM) and a large scale isobaric tags for relative and absolute quantitation (iTRAQ) approach. After the 16-week VLCD, there was a significant decrease in body weight and HbA1c in all patients, without differences between the two intervention groups. Targeted MRM analysis revealed differences in several proteins, which could be divided into diabetes-associated (fibrinogen, transthyretin), obesity-associated (complement C3), and diet-associated markers (apolipoproteins, especially apolipoprotein A-IV). To further investigate the effects of exercise, large scale iTRAQ analysis was performed. However, no proteins were found showing an exercise effect. Thus, in this study, specific proteins were found to be differentially expressed in type 2 diabetes patients versus controls and before and after a VLCD. These proteins are potential disease state and intervention specific biomarkers. Controlled-Trials.com ISRCTN76920690.

  6. Proteomic Analysis in Type 2 Diabetes Patients before and after a Very Low Calorie Diet Reveals Potential Disease State and Intervention Specific Biomarkers

    PubMed Central

    Dharuri, Harish K.; Jeyakar, Skhandhan; Snel, Marieke; Juhasz, Peter; Lynch, Moira; Hines, Wade; Li, Xiaohong; Jazet, Ingrid M.; Adourian, Aram; Hilbers, Peter A. J.; Smit, Johannes W. A.; Van Dijk, Ko Willems

    2014-01-01

    Very low calorie diets (VLCD) with and without exercise programs lead to major metabolic improvements in obese type 2 diabetes patients. The mechanisms underlying these improvements have so far not been elucidated fully. To further investigate the mechanisms of a VLCD with or without exercise and to uncover possible biomarkers associated with these interventions, blood samples were collected from 27 obese type 2 diabetes patients before and after a 16-week VLCD (Modifast ∼450 kcal/day). Thirteen of these patients followed an exercise program in addition to the VLCD. Plasma was obtained from 27 lean and 27 obese controls as well. Proteomic analysis was performed using mass spectrometry (MS) and targeted multiple reaction monitoring (MRM) and a large scale isobaric tags for relative and absolute quantitation (iTRAQ) approach. After the 16-week VLCD, there was a significant decrease in body weight and HbA1c in all patients, without differences between the two intervention groups. Targeted MRM analysis revealed differences in several proteins, which could be divided into diabetes-associated (fibrinogen, transthyretin), obesity-associated (complement C3), and diet-associated markers (apolipoproteins, especially apolipoprotein A-IV). To further investigate the effects of exercise, large scale iTRAQ analysis was performed. However, no proteins were found showing an exercise effect. Thus, in this study, specific proteins were found to be differentially expressed in type 2 diabetes patients versus controls and before and after a VLCD. These proteins are potential disease state and intervention specific biomarkers. Trial Registration Controlled-Trials.com ISRCTN76920690 PMID:25415563

  7. Large-scale integrative network-based analysis identifies common pathways disrupted by copy number alterations across cancers

    PubMed Central

    2013-01-01

    Background Many large-scale studies analyzed high-throughput genomic data to identify altered pathways essential to the development and progression of specific types of cancer. However, no previous study has been extended to provide a comprehensive analysis of pathways disrupted by copy number alterations across different human cancers. Towards this goal, we propose a network-based method to integrate copy number alteration data with human protein-protein interaction networks and pathway databases to identify pathways that are commonly disrupted in many different types of cancer. Results We applied our approach to a data set of 2,172 cancer patients across 16 different types of cancers, and discovered a set of commonly disrupted pathways, which are likely essential for tumor formation in the majority of the cancers. We also identified pathways that are only disrupted in specific cancer types, providing molecular markers for different human cancers. Analysis with independent microarray gene expression datasets confirms that the commonly disrupted pathways can be used to identify patient subgroups with significantly different survival outcomes. We also provide a network view of disrupted pathways to explain how copy number alterations affect pathways that regulate cell growth, cycle, and differentiation for tumorigenesis. Conclusions In this work, we demonstrated that the network-based integrative analysis can help to identify pathways disrupted by copy number alterations across 16 types of human cancers, which are not readily identifiable by conventional overrepresentation-based and other pathway-based methods. All the results and source code are available at http://compbio.cs.umn.edu/NetPathID/. PMID:23822816

  8. TOXICOGENOMICS DRUG DISCOVERY AND THE PATHOLOGIST

    EPA Science Inventory

    Toxicogenomics, drug discovery, and pathologist.

    The field of toxicogenomics, which currently focuses on the application of large-scale differential gene expression (DGE) data to toxicology, is starting to influence drug discovery and development in the pharmaceutical indu...

  9. The Challenges of Recombinant Endostatin in Clinical Application: Focus on the Different Expression Systems and Molecular Bioengineering

    PubMed Central

    Mohajeri, Abbas; Sanaei, Sarvin; Kiafar, Farhad; Fattahi, Amir; Khalili, Majid; Zarghami, Nosratollah

    2017-01-01

    Angiogenesis plays an essential role in the rapid growth and metastasis of tumors. Inhibition of angiogenesis is a putative strategy for cancer therapy. Endostatin (Es) is an attractive anti-angiogenesis protein, but its clinical application faces challenges including a short half-life, instability in serum, and the requirement for high dosage. Large-scale production of recombinant endostatin (rEs) is therefore necessary. The production of rEs is difficult because of its structural properties and is costly. This review therefore focuses on the different expression systems involved in rEs production, including mammalian, baculovirus, yeast, and Escherichia coli (E. coli) expression systems. Evaluation of the results from the different expression systems shows that none of them can be considered generally superior to the others. However, considering the advantages and disadvantages of the E. coli expression system compared with other systems, together with the molecular properties of Es, the E. coli system can be a preferred choice for expressing Es on a large scale. The molecular bioengineering and sustained-release formulations that improve its stability and bioactivity are also discussed. Point mutation (P125A) of Es, addition of an RGD moiety or an additional zinc-binding site to the N-terminus of Es, fusion of Es to anti-HER2 IgG or the heavy chain of IgG, and loading of endostar in PLGA and PEG-PLGA nanoparticles and gold nanoshell particles are effective bioengineering methods for overcoming the clinical challenges of endostatin. PMID:28507934

  10. Transcriptome sequencing and annotation of the halophytic microalga Dunaliella salina

    PubMed Central

    Hong, Ling; Liu, Jun-li; Midoun, Samira Z.; Miller, Philip C.

    2017-01-01

    The unicellular green alga Dunaliella salina is well adapted to salt stress and contains compounds (including β-carotene and vitamins) with potential commercial value. A large transcriptome database of D. salina during the adjustment, exponential and stationary growth phases was generated using a high throughput sequencing platform. We characterized the metabolic processes in D. salina with a focus on valuable metabolites, with the aim of manipulating D. salina to achieve greater economic value in large-scale production through a bioengineering strategy. Gene expression profiles under salt stress verified using quantitative polymerase chain reaction (qPCR) implied that salt can regulate the expression of key genes. This study generated a substantial fraction of D. salina transcriptional sequences for the entire growth cycle, providing a basis for the discovery of novel genes. This first full-scale transcriptome study of D. salina establishes a foundation for further comparative genomic studies. PMID:28990374

  11. Large-scale structural optimization

    NASA Technical Reports Server (NTRS)

    Sobieszczanski-Sobieski, J.

    1983-01-01

    Problems encountered by aerospace designers in attempting to optimize whole aircraft are discussed, along with possible solutions. Large-scale optimization, as opposed to component-by-component optimization, is hindered by computational costs, software inflexibility, concentration on a single, rather than trade-off, design methodology and the incompatibility of large-scale optimization with single-program, single-computer methods. The software problem can be approached by placing the full analysis outside of the optimization loop. Full analysis is then performed only periodically. Problem-dependent software can be removed from the generic code using a systems programming technique, and can then embody the definitions of design variables, objective function and design constraints. Trade-off algorithms can be used at the design points to obtain quantitative answers. Finally, decomposing the large-scale problem into independent subproblems allows systematic optimization of the problems by an organization of people and machines.

  12. Bayesian hierarchical model for large-scale covariance matrix estimation.

    PubMed

    Zhu, Dongxiao; Hero, Alfred O

    2007-12-01

    Many bioinformatics problems implicitly depend on estimating a large-scale covariance matrix. The traditional approaches tend to give rise to high variance and low accuracy due to "overfitting." We cast the large-scale covariance matrix estimation problem into the Bayesian hierarchical model framework, and introduce dependency between covariance parameters. We demonstrate the advantages of our approaches over the traditional approaches using simulations and OMICS data analysis.
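
    The overfitting problem is easy to see when there are far more variables than samples: the sample covariance matrix is then singular. The sketch below illustrates this with a simple linear shrinkage toward the diagonal. Note that this is a stand-in technique chosen for brevity, not the Bayesian hierarchical model the authors propose; the function name, the fixed shrinkage weight and the random data are hypothetical.

```python
import numpy as np

def shrink_covariance(data, weight):
    """Blend the sample covariance with its own diagonal.

    data: samples x variables; weight: shrinkage intensity in [0, 1].
    """
    sample = np.cov(data, rowvar=False)
    target = np.diag(np.diag(sample))      # keep variances, drop covariances
    return (1.0 - weight) * sample + weight * target

# Hypothetical data: 10 samples of 50 variables (far fewer samples than variables).
rng = np.random.default_rng(2)
data = rng.normal(size=(10, 50))
sample = np.cov(data, rowvar=False)
shrunk = shrink_covariance(data, weight=0.3)

sample_rank = np.linalg.matrix_rank(sample)        # below 50: singular
min_eigenvalue = np.linalg.eigvalsh(shrunk).min()  # positive after shrinkage
```

    In practice the shrinkage weight would be estimated from the data rather than fixed by hand.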

  13. Children's Understanding of Large-Scale Mapping Tasks: An Analysis of Talk, Drawings, and Gesture

    ERIC Educational Resources Information Center

    Kotsopoulos, Donna; Cordy, Michelle; Langemeyer, Melanie

    2015-01-01

    This research examined how children represent motion in large-scale mapping tasks that we referred to as "motion maps". The underlying mathematical content was transformational geometry. In total, 19 children, 8- to 10-year-old, created motion maps and captured their motion maps with accompanying verbal description digitally. Analysis of…

  14. Explore the Usefulness of Person-Fit Analysis on Large-Scale Assessment

    ERIC Educational Resources Information Center

    Cui, Ying; Mousavi, Amin

    2015-01-01

    The current study applied the person-fit statistic, l[subscript z], to data from a Canadian provincial achievement test to explore the usefulness of conducting person-fit analysis on large-scale assessments. Item parameter estimates were compared before and after the misfitting student responses, as identified by l[subscript z], were removed. The…

  15. Whole genome co-expression analysis of soybean cytochrome P450 genes identifies nodulation-specific P450 monooxygenases

    PubMed Central

    2010-01-01

    Background Cytochrome P450 monooxygenases (P450s) catalyze oxidation of various substrates using oxygen and NAD(P)H. Plant P450s are involved in the biosynthesis of primary and secondary metabolites performing diverse biological functions. The recent availability of the soybean genome sequence allows us to identify and analyze soybean putative P450s at a genome scale. Co-expression analysis using an available soybean microarray and Illumina sequencing data provides clues for functional annotation of these enzymes. This approach is based on the assumption that genes that have similar expression patterns across a set of conditions may have a functional relationship. Results We have identified a total number of 332 full-length P450 genes and 378 pseudogenes from the soybean genome. From the full-length sequences, 195 genes belong to A-type, which could be further divided into 20 families. The remaining 137 genes belong to non-A type P450s and are classified into 28 families. A total of 178 probe sets were found to correspond to P450 genes on the Affymetrix soybean array. Out of these probe sets, 108 represented single genes. Using the 28 publicly available microarray libraries that contain organ-specific information, some tissue-specific P450s were identified. Similarly, stress responsive soybean P450s were retrieved from 99 microarray soybean libraries. We also utilized Illumina transcriptome sequencing technology to analyze the expressions of all 332 soybean P450 genes. This dataset contains total RNAs isolated from nodules, roots, root tips, leaves, flowers, green pods, apical meristem, mock-inoculated and Bradyrhizobium japonicum-infected root hair cells. The tissue-specific expression patterns of these P450 genes were analyzed and the expression of a representative set of genes were confirmed by qRT-PCR. We performed the co-expression analysis on many of the 108 P450 genes on the Affymetrix arrays. 
First we confirmed that CYP93C5 (an isoflavone synthase gene) is co-expressed with several genes encoding isoflavonoid-related metabolic enzymes. We then focused on nodulation-induced P450s and found that CYP728H1 was co-expressed with the genes involved in phenylpropanoid metabolism. Similarly, CYP736A34 was highly co-expressed with lipoxygenase, lectin and CYP83D1, all of which are involved in root and nodule development. Conclusions The genome scale analysis of P450s in soybean reveals many unique features of these important enzymes in this crop although the functions of most of them are largely unknown. Gene co-expression analysis proves to be a useful tool to infer the function of uncharacterized genes. Our work presented here could provide important leads toward functional genomics studies of soybean P450s and their regulatory network through the integration of reverse genetics, biochemistry, and metabolic profiling tools. The identification of nodule-specific P450s and their further exploitation may help us to better understand the intriguing process of soybean and rhizobium interaction. PMID:21062474
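
    The co-expression idea in the record above (genes with similar expression patterns across conditions may be functionally related) can be illustrated with a small sketch. The abstract does not say which similarity measure the authors used; Pearson correlation is assumed here, and the gene names, profiles, and the 0.9 threshold are hypothetical toy values.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mx, my = sum(x) / n, sum(y) / n
    dx = [a - mx for a in x]
    dy = [b - my for b in y]
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant profile has no defined correlation")
    return sum(a * b for a, b in zip(dx, dy)) / sqrt(sxx * syy)


def coexpressed_with(bait, profiles, threshold=0.9):
    """List (gene, r) pairs whose correlation with the bait gene is >= threshold."""
    ref = profiles[bait]
    hits = [(g, pearson(ref, p)) for g, p in profiles.items() if g != bait]
    # Strongest correlations first.
    return sorted((h for h in hits if h[1] >= threshold), key=lambda h: -h[1])


# Hypothetical expression values across four conditions (not from the study).
profiles = {
    "gene_A": [1.0, 2.0, 3.0, 4.0],
    "gene_B": [2.0, 4.1, 6.0, 8.2],   # rises with gene_A
    "gene_C": [4.0, 3.0, 2.0, 1.0],   # falls as gene_A rises
}

for gene, r in coexpressed_with("gene_A", profiles):
    print(f"{gene}\t{r:.3f}")
```

    In this toy set only gene_B passes the threshold; gene_C is perfectly anti-correlated and is excluded because the sketch keeps positive correlations only.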

  16. Protection Conferred by recombinant Yersinia pestis Antigens Produced by a Rapid and Highly Scalable Plant Expression System

    DTIC Science & Technology

    2006-01-24

    translational fusions with dsRED (lanes 8), and cytosol-targeted GFP (lanes 9). RbcL, large subunit of Rubisco. ...analysis of F1-V expression with SDS-PAGE-Coomassie staining was difficult because the chimeric protein comigrates with the large subunit of Rubisco, a...contaminated by the Rubisco large subunit, which is very similar in size to F1-V. Analysis of Purified Plant-Produced Antigens. Western blots were

  17. Large-Scale Aerosol Modeling and Analysis

    DTIC Science & Technology

    2009-09-30

    Modeling of Burning Emissions (FLAMBE) project, and other related parameters. Our plans to embed NAAPS inside NOGAPS may need to be put on hold...AOD, FLAMBE and FAROP at FNMOC are supported by 6.4 funding from PMW-120 for “Large-scale Atmospheric Models”, “Small-scale Atmospheric Models

  18. HipMCL: a high-performance parallel implementation of the Markov clustering algorithm for large-scale networks

    PubMed Central

    Azad, Ariful; Ouzounis, Christos A; Kyrpides, Nikos C; Buluç, Aydin

    2018-01-01

    Biological networks capture structural or functional properties of relevant entities such as molecules, proteins or genes. Characteristic examples are gene expression networks or protein–protein interaction networks, which hold information about functional affinities or structural similarities. Such networks have been expanding in size due to increasing scale and abundance of biological data. While various clustering algorithms have been proposed to find highly connected regions, Markov Clustering (MCL) has been one of the most successful approaches to cluster sequence similarity or expression networks. Despite its popularity, MCL’s scalability to cluster large datasets still remains a bottleneck due to high running times and memory demands. Here, we present High-performance MCL (HipMCL), a parallel implementation of the original MCL algorithm that can run on distributed-memory computers. We show that HipMCL can efficiently utilize 2000 compute nodes and cluster a network of ∼70 million nodes with ∼68 billion edges in ∼2.4 h. By exploiting distributed-memory environments, HipMCL clusters large-scale networks several orders of magnitude faster than MCL and enables clustering of even bigger networks. HipMCL is based on MPI and OpenMP and is freely available under a modified BSD license. PMID:29315405
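
    For readers unfamiliar with Markov Clustering, the following is a minimal single-machine sketch of the basic MCL loop (expansion by matrix squaring, then inflation by element-wise powering and column renormalization). It is not HipMCL and does none of the distributed-memory work the abstract describes; the function names, the dense list-of-lists matrix, the inflation value of 2.0, and the toy graph are all assumptions made for illustration.

```python
def normalize_columns(m):
    """Scale each column to sum to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def expand(m):
    """Expansion step: square the matrix (two-step random walks)."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inflate(m, r):
    """Inflation step: raise entries to the power r, then renormalize columns."""
    return normalize_columns([[v ** r for v in row] for row in m])


def mcl(adj, inflation=2.0, iterations=50, eps=1e-6):
    """Cluster a small undirected graph given as a dense adjacency matrix."""
    n = len(adj)
    # Add self-loops, as is customary before starting the iteration.
    m = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(iterations):
        nxt = inflate(expand(m), inflation)
        delta = max(abs(nxt[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = nxt
        if delta < eps:
            break
    # Each non-empty row lists the nodes attracted to that row's node.
    clusters = set()
    for row in m:
        members = frozenset(j for j, v in enumerate(row) if v > 1e-3)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


# Toy graph (hypothetical): two separate triangles, nodes 0-2 and 3-5.
two_triangles = [[1.0 if i != j and i // 3 == j // 3 else 0.0
                  for j in range(6)] for i in range(6)]
print(mcl(two_triangles))
```

    Dense matrices are used only to keep the sketch short; the memory and running-time bottleneck mentioned in the abstract comes from the expansion step on large sparse matrices.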

  19. HipMCL: a high-performance parallel implementation of the Markov clustering algorithm for large-scale networks

    DOE PAGES

    Azad, Ariful; Pavlopoulos, Georgios A.; Ouzounis, Christos A.; ...

    2018-01-05

    Biological networks capture structural or functional properties of relevant entities such as molecules, proteins or genes. Characteristic examples are gene expression networks or protein–protein interaction networks, which hold information about functional affinities or structural similarities. Such networks have been expanding in size due to increasing scale and abundance of biological data. While various clustering algorithms have been proposed to find highly connected regions, Markov Clustering (MCL) has been one of the most successful approaches to cluster sequence similarity or expression networks. Despite its popularity, MCL’s scalability to cluster large datasets still remains a bottleneck due to high running times and memory demands. In this paper, we present High-performance MCL (HipMCL), a parallel implementation of the original MCL algorithm that can run on distributed-memory computers. We show that HipMCL can efficiently utilize 2000 compute nodes and cluster a network of ~70 million nodes with ~68 billion edges in ~2.4 h. By exploiting distributed-memory environments, HipMCL clusters large-scale networks several orders of magnitude faster than MCL and enables clustering of even bigger networks. Finally, HipMCL is based on MPI and OpenMP and is freely available under a modified BSD license.

  20. HipMCL: a high-performance parallel implementation of the Markov clustering algorithm for large-scale networks

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Azad, Ariful; Pavlopoulos, Georgios A.; Ouzounis, Christos A.

    Biological networks capture structural or functional properties of relevant entities such as molecules, proteins or genes. Characteristic examples are gene expression networks or protein–protein interaction networks, which hold information about functional affinities or structural similarities. Such networks have been expanding in size due to increasing scale and abundance of biological data. While various clustering algorithms have been proposed to find highly connected regions, Markov Clustering (MCL) has been one of the most successful approaches to cluster sequence similarity or expression networks. Despite its popularity, MCL’s scalability to cluster large datasets still remains a bottleneck due to high running times and memory demands. In this paper, we present High-performance MCL (HipMCL), a parallel implementation of the original MCL algorithm that can run on distributed-memory computers. We show that HipMCL can efficiently utilize 2000 compute nodes and cluster a network of ~70 million nodes with ~68 billion edges in ~2.4 h. By exploiting distributed-memory environments, HipMCL clusters large-scale networks several orders of magnitude faster than MCL and enables clustering of even bigger networks. Finally, HipMCL is based on MPI and OpenMP and is freely available under a modified BSD license.

  1. Packed Bed Bioreactor for the Isolation and Expansion of Placental-Derived Mesenchymal Stromal Cells

    PubMed Central

    Osiecki, Michael J.; Michl, Thomas D.; Kul Babur, Betul; Kabiri, Mahboubeh; Atkinson, Kerry; Lott, William B.; Griesser, Hans J.; Doran, Michael R.

    2015-01-01

    Large numbers of Mesenchymal stem/stromal cells (MSCs) are required for clinically relevant doses to treat a number of diseases. To economically manufacture these MSCs, an automated bioreactor system will be required. Herein we describe the development of a scalable closed-system, packed bed bioreactor suitable for large-scale MSC expansion. The packed bed was formed from fused polystyrene pellets that were air plasma treated to endow them with a surface chemistry similar to traditional tissue culture plastic. The packed bed was encased within a gas permeable shell to decouple the medium nutrient supply and gas exchange. This enabled a significant reduction in medium flow rates, thus reducing shear and even facilitating single pass medium exchange. The system was optimised in a small-scale bioreactor format (160 cm²) with murine-derived green fluorescent protein-expressing MSCs, and then scaled-up to a 2800 cm² format. We demonstrated that placental-derived MSCs could be isolated directly within the bioreactor and subsequently expanded. Our results demonstrate that the closed system large-scale packed bed bioreactor is an effective and scalable tool for large-scale isolation and expansion of MSCs. PMID:26660475

  2. Packed Bed Bioreactor for the Isolation and Expansion of Placental-Derived Mesenchymal Stromal Cells.

    PubMed

    Osiecki, Michael J; Michl, Thomas D; Kul Babur, Betul; Kabiri, Mahboubeh; Atkinson, Kerry; Lott, William B; Griesser, Hans J; Doran, Michael R

    2015-01-01

    Large numbers of Mesenchymal stem/stromal cells (MSCs) are required for clinically relevant doses to treat a number of diseases. To economically manufacture these MSCs, an automated bioreactor system will be required. Herein we describe the development of a scalable closed-system, packed bed bioreactor suitable for large-scale MSC expansion. The packed bed was formed from fused polystyrene pellets that were air plasma treated to endow them with a surface chemistry similar to traditional tissue culture plastic. The packed bed was encased within a gas permeable shell to decouple the medium nutrient supply and gas exchange. This enabled a significant reduction in medium flow rates, thus reducing shear and even facilitating single pass medium exchange. The system was optimised in a small-scale bioreactor format (160 cm²) with murine-derived green fluorescent protein-expressing MSCs, and then scaled-up to a 2800 cm² format. We demonstrated that placental-derived MSCs could be isolated directly within the bioreactor and subsequently expanded. Our results demonstrate that the closed system large-scale packed bed bioreactor is an effective and scalable tool for large-scale isolation and expansion of MSCs.

  3. Observation of Spontaneous Expressive Language (OSEL): a new measure for spontaneous and expressive language of children with autism spectrum disorders and other communication disorders.

    PubMed

    Kim, So Hyun; Junker, Dörte; Lord, Catherine

    2014-12-01

    A new language measure, the Observation of Spontaneous Expressive Language (OSEL), is intended to document spontaneous use of syntax, pragmatics, and semantics in 2-12-year-old children with Autism Spectrum Disorder (ASD) and other communication disorders with expressive language levels comparable to typical 2-5 year olds. Because the purpose of the OSEL is to provide developmental norms for use of language, the first step involves assessment of the scale's feasibility, validity, and reliability using a sample of 180 2-5 year-old typically developing children. Pilot data from the OSEL shows strong internal consistency, high reliabilities and validity. Once replicated with a large population-based sample and in special populations, the scale should be helpful in designing appropriate interventions for children with ASD and other communication disorders.

  4. Human cells: new platform for recombinant therapeutic protein production.

    PubMed

    Swiech, Kamilla; Picanço-Castro, Virgínia; Covas, Dimas Tadeu

    2012-07-01

    The demand for recombinant therapeutic proteins is significantly increasing. There is a constant need to improve the existing expression systems, and also developing novel approaches to face the therapeutic proteins demands. Human cell lines have emerged as a new and powerful alternative for the production of human therapeutic proteins because this expression system is expected to produce recombinant proteins with post translation modifications more similar to their natural counterpart and reduce the potential immunogenic reactions against nonhuman epitopes. Currently, little information about the cultivation of human cells for the production of biopharmaceuticals is available. These cells have shown efficient production in laboratory scale and represent an important tool for the pharmaceutical industry. This review presents the cell lines available for large-scale recombinant proteins production and evaluates critically the advantages of this expression system in comparison with other expression systems for recombinant therapeutic protein production. Copyright © 2012 Elsevier Inc. All rights reserved.

  5. Interactive Scripting for Analysis and Visualization of Arbitrarily Large, Disparately Located Climate Data Ensembles Using a Progressive Runtime Server

    NASA Astrophysics Data System (ADS)

    Christensen, C.; Summa, B.; Scorzelli, G.; Lee, J. W.; Venkat, A.; Bremer, P. T.; Pascucci, V.

    2017-12-01

    Massive datasets are becoming more common due to increasingly detailed simulations and higher resolution acquisition devices. Yet accessing and processing these huge data collections for scientific analysis is still a significant challenge. Solutions that rely on extensive data transfers are increasingly untenable and often impossible due to lack of sufficient storage at the client side as well as insufficient bandwidth to conduct such large transfers, that in some cases could entail petabytes of data. Large-scale remote computing resources can be useful, but utilizing such systems typically entails some form of offline batch processing with long delays, data replications, and substantial cost for any mistakes. Both types of workflows can severely limit the flexible exploration and rapid evaluation of new hypotheses that are crucial to the scientific process and thereby impede scientific discovery. In order to facilitate interactivity in both analysis and visualization of these massive data ensembles, we introduce a dynamic runtime system suitable for progressive computation and interactive visualization of arbitrarily large, disparately located spatiotemporal datasets. Our system includes an embedded domain-specific language (EDSL) that allows users to express a wide range of data analysis operations in a simple and abstract manner. The underlying runtime system transparently resolves issues such as remote data access and resampling while at the same time maintaining interactivity through progressive and interruptible processing. Computations involving large amounts of data can be performed remotely in an incremental fashion that dramatically reduces data movement, while the client receives updates progressively thereby remaining robust to fluctuating network latency or limited bandwidth. This system facilitates interactive, incremental analysis and visualization of massive remote datasets up to petabytes in size. 
Our system is now available for general use in the community through both Docker and Anaconda.

  6. Measuring disorganized speech in schizophrenia: automated analysis explains variance in cognitive deficits beyond clinician-rated scales.

    PubMed

    Minor, K S; Willits, J A; Marggraf, M P; Jones, M N; Lysaker, P H

    2018-04-25

    Conveying information cohesively is an essential element of communication that is disrupted in schizophrenia. These disruptions are typically expressed through disorganized symptoms, which have been linked to neurocognitive, social cognitive, and metacognitive deficits. Automated analysis can objectively assess disorganization within sentences, between sentences, and across paragraphs by comparing explicit communication to a large text corpus. Little work in schizophrenia has tested: (1) links between disorganized symptoms measured via automated analysis and neurocognition, social cognition, or metacognition; and (2) if automated analysis explains incremental variance in cognitive processes beyond clinician-rated scales. Disorganization was measured in schizophrenia (n = 81) with Coh-Metrix 3.0, an automated program that calculates basic and complex language indices. Trained staff also assessed neurocognition, social cognition, metacognition, and clinician-rated disorganization. Findings showed that all three cognitive processes were significantly associated with at least one automated index of disorganization. When automated analysis was compared with a clinician-rated scale, it accounted for significant variance in neurocognition and metacognition beyond the clinician-rated measure. When combined, these two methods explained 28-31% of the variance in neurocognition, social cognition, and metacognition. This study illustrated how automated analysis can highlight the specific role of disorganization in neurocognition, social cognition, and metacognition. Generally, those with poor cognition also displayed more disorganization in their speech, making it difficult for listeners to process essential information needed to tie the speaker's ideas together. Our findings showcase how implementing a mixed-methods approach in schizophrenia can explain substantial variance in cognitive processes.

  7. Systems Perturbation Analysis of a Large-Scale Signal Transduction Model Reveals Potentially Influential Candidates for Cancer Therapeutics

    PubMed Central

    Puniya, Bhanwar Lal; Allen, Laura; Hochfelder, Colleen; Majumder, Mahbubul; Helikar, Tomáš

    2016-01-01

    Dysregulation in signal transduction pathways can lead to a variety of complex disorders, including cancer. Computational approaches such as network analysis are important tools to understand system dynamics as well as to identify critical components that could be further explored as therapeutic targets. Here, we performed perturbation analysis of a large-scale signal transduction model in extracellular environments that stimulate cell death, growth, motility, and quiescence. Each of the model’s components was perturbed under both loss-of-function and gain-of-function mutations. Using 1,300 simulations under both types of perturbations across various extracellular conditions, we identified the most and least influential components based on the magnitude of their influence on the rest of the system. Based on the premise that the most influential components might serve as better drug targets, we characterized them for biological functions, housekeeping genes, essential genes, and druggable proteins. The most influential components under all environmental conditions were enriched with several biological processes. The inositol pathway was found as most influential under inactivating perturbations, whereas the kinase and small lung cancer pathways were identified as the most influential under activating perturbations. The most influential components were enriched with essential genes and druggable proteins. Moreover, known cancer drug targets were also classified in influential components based on the affected components in the network. Additionally, the systemic perturbation analysis of the model revealed a network motif of most influential components which affect each other. Furthermore, our analysis predicted novel combinations of cancer drug targets with various effects on other most influential components. 
We found that the combinatorial perturbation consisting of PI3K inactivation and overactivation of IP3R1 can lead to increased activity levels of apoptosis-related components and tumor-suppressor genes, suggesting that this combinatorial perturbation may lead to a better target for decreasing cell proliferation and inducing apoptosis. Finally, our approach shows a potential to identify and prioritize therapeutic targets through systemic perturbation analysis of large-scale computational models of signal transduction. Although some components of the presented computational results have been validated against independent gene expression data sets, more laboratory experiments are warranted to more comprehensively validate the presented results. PMID:26904540

  8. Gene-expression analysis of cold-stress response in the sexually transmitted protist Trichomonas vaginalis.

    PubMed

    Fang, Yi-Kai; Huang, Kuo-Yang; Huang, Po-Jung; Lin, Rose; Chao, Mei; Tang, Petrus

    2015-12-01

    Trichomonas vaginalis is the etiologic agent of trichomoniasis, the most common nonviral sexually transmitted disease in the world. This infection affects millions of individuals worldwide annually. Although direct sexual contact is the most common mode of transmission, increasing evidence indicates that T. vaginalis can survive in the external environment and can be transmitted by contaminated utensils. We found that the growth of T. vaginalis under cold conditions is greatly inhibited, but recovers after placing these stressed cells at the normal cultivation temperature of 37 °C. However, the mechanisms by which T. vaginalis regulates this adaptive process are unclear. An expressed sequence tag (EST) database generated from a complementary DNA library of T. vaginalis messenger RNAs expressed under cold-culture conditions (4 °C, TvC) was compared with a previously published normal-cultured EST library (37 °C, TvE) to assess the cold-stress responses of T. vaginalis. A total of 9780 clones were sequenced from the TvC library and were mapped to 2934 genes in the T. vaginalis genome. A total of 1254 genes were expressed in both the TvE and TvC libraries, and 1680 genes were only found in the TvC library. A functional analysis showed that cold temperature has effects on many cellular mechanisms, including increased H2O2 tolerance, activation of the ubiquitin-proteasome system, induction of iron-sulfur cluster assembly, and reduced energy metabolism and enzyme expression. The current study is the first large-scale transcriptomic analysis in cold-stressed T. vaginalis and the results enhance our understanding of this important protist. Copyright © 2014. Published by Elsevier B.V.

  9. Hidden among the crowd: differential DNA methylation-expression correlations in cancer occur at important oncogenic pathways

    PubMed Central

    Mosquera Orgueira, Adrián

    2015-01-01

    DNA methylation is a frequent epigenetic mechanism that participates in transcriptional repression. Variations in DNA methylation with respect to gene expression are constant, and, for unknown reasons, some genes with highly methylated promoters are sometimes overexpressed. In this study we have analyzed the expression and methylation patterns of thousands of genes in five groups of cancer and normal tissue samples in order to determine local and genome-wide differences. We observed significant changes in global methylation-expression correlation in all the neoplasms, which suggests that differential correlation events are frequent in cancer. A focused analysis in the breast cancer cohort identified 1662 genes whose correlation varies significantly between normal and cancerous breast, but whose DNA methylation and gene expression patterns do not change substantially. These genes were enriched in cancer-related pathways and repressive chromatin features across various model cell lines, such as PRC2 binding and H3K27me3 marks. Substantial changes in methylation-expression correlation indicate that these genes are subject to epigenetic remodeling, where the differential activity of other factors break the expected relationship between both variables. Our findings suggest a complex regulatory landscape where a redistribution of local and large-scale chromatin repressive domains at differentially correlated genes (DCGs) creates epigenetic hotspots that modulate cancer-specific gene expression. PMID:26029238

  10. Reconstruction of the genome-scale co-expression network for the Hippo signaling pathway in colorectal cancer.

    PubMed

    Dehghanian, Fariba; Hojati, Zohreh; Hosseinkhan, Nazanin; Mousavian, Zaynab; Masoudi-Nejad, Ali

    2018-05-26

    The Hippo signaling pathway (HSP) has been identified as an essential and complex signaling pathway for tumor suppression that coordinates proliferation, differentiation, cell death, cell growth and stemness. In the present study, we conducted a genome-scale co-expression analysis to reconstruct the HSP in colorectal cancer (CRC). Five key modules were detected through network clustering, and a detailed discussion of two modules containing respectively 18 and 13 over and down-regulated members of HSP was provided. Our results suggest new potential regulatory factors in the HSP. The detected modules also suggest novel genes contributing to CRC. Moreover, differential expression analysis confirmed the differential expression pattern of HSP members and new suggested regulatory factors between tumor and normal samples. These findings can further reveal the importance of HSP in CRC. Copyright © 2018 Elsevier Ltd. All rights reserved.

  11. The Use of Weighted Graphs for Large-Scale Genome Analysis

    PubMed Central

    Zhou, Fang; Toivonen, Hannu; King, Ross D.

    2014-01-01

    There is an acute need for better tools to extract knowledge from the growing flood of sequence data. For example, thousands of complete genomes have been sequenced, and their metabolic networks inferred. Such data should enable a better understanding of evolution. However, most existing network analysis methods are based on pair-wise comparisons, and these do not scale to thousands of genomes. Here we propose the use of weighted graphs as a data structure to enable large-scale phylogenetic analysis of networks. We have developed three types of weighted graph for enzymes: taxonomic (these summarize phylogenetic importance), isoenzymatic (these summarize enzymatic variety/redundancy), and sequence-similarity (these summarize sequence conservation); and we applied these types of weighted graph to survey prokaryotic metabolism. To demonstrate the utility of this approach we have compared and contrasted the large-scale evolution of metabolism in Archaea and Eubacteria. Our results provide evidence for limits to the contingency of evolution. PMID:24619061

  12. Piggy: a rapid, large-scale pan-genome analysis tool for intergenic regions in bacteria.

    PubMed

    Thorpe, Harry A; Bayliss, Sion C; Sheppard, Samuel K; Feil, Edward J

    2018-04-01

    The concept of the "pan-genome," which refers to the total complement of genes within a given sample or species, is well established in bacterial genomics. Rapid and scalable pipelines are available for managing and interpreting pan-genomes from large batches of annotated assemblies. However, despite overwhelming evidence that variation in intergenic regions in bacteria can directly influence phenotypes, most current approaches for analyzing pan-genomes focus exclusively on protein-coding sequences. To address this we present Piggy, a novel pipeline that emulates Roary except that it is based only on intergenic regions. A key utility provided by Piggy is the detection of highly divergent ("switched") intergenic regions (IGRs) upstream of genes. We demonstrate the use of Piggy on large datasets of clinically important lineages of Staphylococcus aureus and Escherichia coli. For S. aureus, we show that highly divergent (switched) IGRs are associated with differences in gene expression and we establish a multilocus reference database of IGR alleles (igMLST; implemented in BIGSdb).

  13. Identification and expression analysis of the apple (Malus × domestica) basic helix-loop-helix transcription factor family.

    PubMed

    Yang, Jinhua; Gao, Min; Huang, Li; Wang, Yaqiong; van Nocker, Steve; Wan, Ran; Guo, Chunlei; Wang, Xiping; Gao, Hua

    2017-02-09

    Basic helix-loop-helix (bHLH) proteins, which are characterized by a conserved bHLH domain, comprise one of the largest families of transcription factors in both plants and animals, and have been shown to have a wide range of biological functions. However, there have been very few studies of bHLH proteins from perennial tree species. We describe here the identification and characterization of 175 bHLH transcription factors from apple (Malus × domestica). Phylogenetic analysis of apple bHLH (MdbHLH) genes and their Arabidopsis thaliana (Arabidopsis) orthologs indicated that they can be classified into 23 subgroups. Moreover, integrated synteny analysis suggested that the large-scale expansion of the bHLH transcription factor family occurred before the divergence of apple and Arabidopsis. An analysis of the exon/intron structure and protein domains was conducted to suggest their functional roles. Finally, we observed that MdbHLH subgroup III and IV genes displayed diverse expression profiles in various organs, as well as in response to abiotic stresses and various hormone treatments. Taken together, these data provide new information regarding the composition and diversity of the apple bHLH transcription factor family that will provide a platform for future targeted functional characterization.

  14. Proteomic profile of the Bradysia odoriphaga in response to the microbial secondary metabolite benzothiazole.

    PubMed

    Zhao, Yunhe; Cui, Kaidi; Xu, Chunmei; Wang, Qiuhong; Wang, Yao; Zhang, Zhengqun; Liu, Feng; Mu, Wei

    2016-11-24

    Benzothiazole, a microbial secondary metabolite, has been demonstrated to possess fumigant activity against Sclerotinia sclerotiorum, Ditylenchus destructor and Bradysia odoriphaga. However, to facilitate the development of novel microbial pesticides, the mode of action of benzothiazole needs to be elucidated. Here, we employed iTRAQ-based quantitative proteomics analysis to investigate the effects of benzothiazole on the proteomic expression of B. odoriphaga. In response to benzothiazole, 92 of 863 identified proteins in B. odoriphaga exhibited altered levels of expression, among which 14 proteins were related to the action mechanism of benzothiazole, 11 proteins were involved in stress responses, and 67 proteins were associated with the adaptation of B. odoriphaga to benzothiazole. Further bioinformatics analysis indicated that the reduction in energy metabolism, inhibition of the detoxification process and interference with DNA and RNA synthesis were potentially associated with the mode of action of benzothiazole. The myosin heavy chain, succinyl-CoA synthetase and Ca2+-transporting ATPase proteins may be related to the stress response. Increased expression of proteins involved in carbohydrate metabolism, energy production and conversion pathways was responsible for the adaptive response of B. odoriphaga. The results of this study provide novel insight into the molecular mechanisms of benzothiazole at a large-scale translation level and will facilitate the elucidation of the mechanism of action of benzothiazole.

  15. Lifetime evaluation of large format CMOS mixed signal infrared devices

    NASA Astrophysics Data System (ADS)

    Linder, A.; Glines, Eddie

    2015-09-01

    New large scale foundry processes continue to produce reliable products. These new large scale devices continue to use industry best practice to screen for failure mechanisms and validate their long lifetime. The Failure-in-Time analysis in conjunction with foundry qualification information can be used to evaluate large format device lifetimes. This analysis is a helpful tool when zero failure life tests are typical. The reliability of the device is estimated by applying the failure rate to the use conditions. JEDEC publications continue to be the industry accepted methods.
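
    The abstract above does not give its formulas, so the following is only a sketch of one common way a Failure-in-Time (FIT) upper bound is estimated when a life test ends with zero failures: a chi-square bound, which for zero failures reduces to -ln(1 - CL) divided by the equivalent device-hours, with an Arrhenius acceleration factor converting stress hours to use-condition hours. The function names, the 60% confidence level, the 0.7 eV activation energy, and the temperatures are assumptions for illustration, not values from the paper.

```python
from math import exp, log

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def arrhenius_af(ea_ev, t_use_c, t_stress_c):
    """Arrhenius acceleration factor between stress and use temperatures (deg C)."""
    t_use = t_use_c + 273.15
    t_stress = t_stress_c + 273.15
    return exp(ea_ev / BOLTZMANN_EV_PER_K * (1.0 / t_use - 1.0 / t_stress))


def fit_zero_failures(devices, hours, accel_factor=1.0, confidence=0.60):
    """Upper-bound failure rate in FIT (failures per 1e9 device-hours), zero failures.

    With zero observed failures the chi-square bound chi2(CL, 2) / 2
    simplifies to -ln(1 - CL).
    """
    equivalent_device_hours = devices * hours * accel_factor
    return -log(1.0 - confidence) / equivalent_device_hours * 1e9


# Hypothetical test: 77 devices, 1000 h at 125 C, use condition 55 C, Ea = 0.7 eV.
af = arrhenius_af(0.7, 55.0, 125.0)
print(f"acceleration factor: {af:.1f}")
print(f"FIT upper bound:     {fit_zero_failures(77, 1000, af):.1f}")
```

    The bound shrinks as device-hours or the acceleration factor grow, which is why zero-failure tests still yield a usable lifetime estimate.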

  16. Nonlinear modulation of the HI power spectrum on ultra-large scales. I

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Umeh, Obinna; Maartens, Roy; Santos, Mario

    2016-03-01

    Intensity mapping of the neutral hydrogen brightness temperature promises to provide a three-dimensional view of the universe on very large scales. Nonlinear effects are typically thought to alter only the small-scale power, but we show how they may bias the extraction of cosmological information contained in the power spectrum on ultra-large scales. For linear perturbations to remain valid on large scales, we need to renormalize perturbations at higher order. In the case of intensity mapping, the second-order contribution to clustering from weak lensing dominates the nonlinear contribution at high redshift. Renormalization modifies the mean brightness temperature and therefore the evolution bias. It also introduces a term that mimics white noise. These effects may influence forecasting analysis on ultra-large scales.

  17. The expression and recognition of emotions in the voice across five nations: A lens model analysis based on acoustic features.

    PubMed

    Laukka, Petri; Elfenbein, Hillary Anger; Thingujam, Nutankumar S; Rockstuhl, Thomas; Iraki, Frederick K; Chui, Wanda; Althoff, Jean

    2016-11-01

    This study extends previous work on emotion communication across cultures with a large-scale investigation of the physical expression cues in vocal tone. In doing so, it provides the first direct test of a key proposition of dialect theory, namely that greater accuracy of detecting emotions from one's own cultural group, known as in-group advantage, results from a match between culturally specific schemas in emotional expression style and culturally specific schemas in emotion recognition. Study 1 used stimuli from 100 professional actors from five English-speaking nations vocally conveying 11 emotional states (anger, contempt, fear, happiness, interest, lust, neutral, pride, relief, sadness, and shame) using standard-content sentences. Detailed acoustic analyses showed many similarities across groups, and yet also systematic group differences. This provides evidence for cultural accents in expressive style at the level of acoustic cues. In Study 2, listeners evaluated these expressions in a 5 × 5 design balanced across groups. Cross-cultural accuracy was greater than expected by chance. However, there was also in-group advantage, which varied across emotions. A lens model analysis of fundamental acoustic properties examined patterns in emotional expression and perception within and across groups. Acoustic cues were used relatively similarly across groups both to produce and judge emotions, and yet there were also subtle cultural differences. Speakers appear to have a culturally nuanced schema for enacting vocal tones via acoustic cues, and perceivers have a culturally nuanced schema in judging them. Consistent with dialect theory's prediction, in-group judgments showed a greater match between these schemas used for emotional expression and perception. (PsycINFO Database Record (c) 2016 APA, all rights reserved).

  18. Modeling High Temperature Deformation Behavior of Large-Scaled Mg-Al-Zn Magnesium Alloy Fabricated by Semi-continuous Casting

    NASA Astrophysics Data System (ADS)

    Li, Jianping; Xia, Xiangsheng

    2015-09-01

    In order to improve the understanding of the hot deformation and dynamic recrystallization (DRX) behaviors of large-scaled AZ80 magnesium alloy fabricated by semi-continuous casting, compression tests were carried out in the temperature range from 250 to 400 °C and strain rate range from 0.001 to 0.1 s⁻¹ on a Gleeble 1500 thermo-mechanical machine. The effects of the temperature and strain rate on the hot deformation behavior have been expressed by means of the conventional hyperbolic sine equation, and the influence of the strain has been incorporated in the equation by considering its effect on different material constants for large-scaled AZ80 magnesium alloy. In addition, the DRX behavior has been discussed. The results show that the deformation temperature and strain rate exerted remarkable influences on the flow stress. A constitutive equation of large-scaled AZ80 magnesium alloy for hot deformation at the steady-state stage (ɛ = 0.5) was obtained. The true stress-true strain curves predicted by the extracted model were in good agreement with the experimental results, thereby confirming the validity of the developed constitutive relation. The DRX kinetic model of large-scaled AZ80 magnesium alloy was established as $X_d = 1 - \exp[-0.95((\varepsilon - \varepsilon_c)/\varepsilon^*)^{2.4904}]$. The rate of DRX increases with increasing deformation temperature, and high temperature is beneficial for achieving complete DRX in the large-scaled AZ80 magnesium alloy.
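
The DRX kinetic model quoted in this abstract can be evaluated directly. The sketch below assumes the expression has the usual Avrami-type form, with the recrystallized fraction taken as zero below the critical strain. The coefficient 0.95 and exponent 2.4904 come from the abstract, while the function name and the sample values for the critical strain and the characteristic strain ɛ* (whose exact definition is not given in the record) are hypothetical.

```python
import math


def drx_fraction(strain, critical_strain, characteristic_strain,
                 k=0.95, n=2.4904):
    """Dynamically recrystallized volume fraction X_d.

    X_d = 1 - exp(-k * ((strain - critical_strain) / characteristic_strain) ** n)
    Below the critical strain no DRX is assumed to have started.
    """
    if strain <= critical_strain:
        return 0.0
    reduced = (strain - critical_strain) / characteristic_strain
    return 1.0 - math.exp(-k * reduced ** n)


if __name__ == "__main__":
    # Hypothetical strains for illustration only.
    for eps in (0.05, 0.10, 0.20, 0.40, 0.80):
        print(eps, drx_fraction(eps, critical_strain=0.08,
                                characteristic_strain=0.25))
```

The fraction rises monotonically from 0 toward 1 as strain increases, which matches the sigmoidal behavior expected of this kind of kinetic model.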

  19. Diversification of Root Hair Development Genes in Vascular Plants.

    PubMed

    Huang, Ling; Shi, Xinhui; Wang, Wenjia; Ryu, Kook Hui; Schiefelbein, John

    2017-07-01

    The molecular genetic program for root hair development has been studied intensively in Arabidopsis (Arabidopsis thaliana). To understand the extent to which this program might operate in other plants, we conducted a large-scale comparative analysis of root hair development genes from diverse vascular plants, including eudicots, monocots, and a lycophyte. Combining phylogenetics and transcriptomics, we discovered conservation of a core set of root hair genes across all vascular plants, which may derive from an ancient program for unidirectional cell growth coopted for root hair development during vascular plant evolution. Interestingly, we also discovered preferential diversification in the structure and expression of root hair development genes, relative to other root hair- and root-expressed genes, among these species. These differences enabled the definition of sets of genes and gene functions that were acquired or lost in specific lineages during vascular plant evolution. In particular, we found substantial divergence in the structure and expression of genes used for root hair patterning, suggesting that the Arabidopsis transcriptional regulatory mechanism is not shared by other species. To our knowledge, this study provides the first comprehensive view of gene expression in a single plant cell type across multiple species. © 2017 American Society of Plant Biologists. All Rights Reserved.

  20. Diversification of Root Hair Development Genes in Vascular Plants

    PubMed Central

    Shi, Xinhui; Wang, Wenjia; Ryu, Kook Hui

    2017-01-01

    The molecular genetic program for root hair development has been studied intensively in Arabidopsis (Arabidopsis thaliana). To understand the extent to which this program might operate in other plants, we conducted a large-scale comparative analysis of root hair development genes from diverse vascular plants, including eudicots, monocots, and a lycophyte. Combining phylogenetics and transcriptomics, we discovered conservation of a core set of root hair genes across all vascular plants, which may derive from an ancient program for unidirectional cell growth coopted for root hair development during vascular plant evolution. Interestingly, we also discovered preferential diversification in the structure and expression of root hair development genes, relative to other root hair- and root-expressed genes, among these species. These differences enabled the definition of sets of genes and gene functions that were acquired or lost in specific lineages during vascular plant evolution. In particular, we found substantial divergence in the structure and expression of genes used for root hair patterning, suggesting that the Arabidopsis transcriptional regulatory mechanism is not shared by other species. To our knowledge, this study provides the first comprehensive view of gene expression in a single plant cell type across multiple species. PMID:28487476

  1. Nutritional and reproductive signaling revealed by comparative gene expression analysis in Chrysopa pallens (Rambur) at different nutritional statuses

    PubMed Central

    Han, Benfeng; Zhang, Shen; Zeng, Fanrong; Mao, Jianjun

    2017-01-01

    Background The green lacewing, Chrysopa pallens Rambur, is one of the most important natural predators because of its extensive spectrum of prey and wide distribution. However, little is known about the nutritional and reproductive physiology of this species. Results By cDNA amplification and Illumina short-read sequencing, we analyzed transcriptomes of C. pallens adult females under starved and fed conditions. In total, 71236 unigenes were obtained with an average length of 833 bp. Four vitellogenins, three insulin-like peptides and two insulin receptors were annotated. Comparison of gene expression profiles suggested that a total of 1501 genes were differentially expressed between the two nutritional statuses. KEGG orthology classification showed that these differentially expressed genes (DEGs) were mapped to 241 pathways. The top four pathways were ribosome, protein processing in endoplasmic reticulum, biosynthesis of amino acids and carbon metabolism, indicating a distinct difference in nutritional and reproductive signaling between the two feeding conditions. Conclusions Our study yielded large-scale molecular information relevant to C. pallens nutritional and reproductive signaling, which will contribute to mass rearing and commercial use of this predaceous insect species. PMID:28683101

  2. Nutritional and reproductive signaling revealed by comparative gene expression analysis in Chrysopa pallens (Rambur) at different nutritional statuses.

    PubMed

    Han, Benfeng; Zhang, Shen; Zeng, Fanrong; Mao, Jianjun

    2017-01-01

    The green lacewing, Chrysopa pallens Rambur, is one of the most important natural predators because of its extensive spectrum of prey and wide distribution. However, little is known about the nutritional and reproductive physiology of this species. By cDNA amplification and Illumina short-read sequencing, we analyzed transcriptomes of C. pallens adult females under starved and fed conditions. In total, 71236 unigenes were obtained with an average length of 833 bp. Four vitellogenins, three insulin-like peptides and two insulin receptors were annotated. Comparison of gene expression profiles suggested that a total of 1501 genes were differentially expressed between the two nutritional statuses. KEGG orthology classification showed that these differentially expressed genes (DEGs) were mapped to 241 pathways. The top four pathways were ribosome, protein processing in endoplasmic reticulum, biosynthesis of amino acids and carbon metabolism, indicating a distinct difference in nutritional and reproductive signaling between the two feeding conditions. Our study yielded large-scale molecular information relevant to C. pallens nutritional and reproductive signaling, which will contribute to mass rearing and commercial use of this predaceous insect species.

  3. An Assertiveness Inventory for Adults

    ERIC Educational Resources Information Center

    Gay, Melvin L.; And Others

    1975-01-01

    The Adult Self-Expression Scale is a 48-item, self-report measure of assertiveness designed for use with adults in general. Scale was found to have high test-retest reliability and moderate-to-high construct validity, as established by correlations with Adjective Check List scales and by a discriminant analysis procedure. (Author)

  4. An elm EST database for identifying leaf beetle egg-induced defense genes

    PubMed Central

    2012-01-01

    Background Plants can defend themselves against herbivorous insects prior to the onset of larval feeding by responding to the eggs laid on their leaves. In the European field elm (Ulmus minor), egg laying by the elm leaf beetle (Xanthogaleruca luteola) activates the emission of volatiles that attract specialised egg parasitoids, which in turn kill the eggs. Little is known about the transcriptional changes that insect eggs trigger in plants and how such indirect defense mechanisms are orchestrated in the context of other biological processes. Results Here we present the first large-scale study of egg-induced changes in the transcriptional profile of a tree. Five cDNA libraries were generated from leaves of (i) untreated control elms, and elms treated with (ii) egg laying and feeding by elm leaf beetles, (iii) feeding, (iv) artificial transfer of egg clutches, and (v) methyl jasmonate. A total of 361,196 expressed sequence tags (ESTs) were identified which clustered into 52,823 unique transcripts (Unitrans) and were stored in a database with a public web interface. Among the analyzed Unitrans, 73% could be annotated by homology to known genes in the UniProt (Plant) database, particularly to those from Vitis, Ricinus, Populus and Arabidopsis. Comparative in silico analysis among the different treatments revealed differences in Gene Ontology term abundances. Defense- and stress-related gene transcripts were present in high abundance in leaves after herbivore egg laying, but transcripts involved in photosynthesis showed decreased abundance. Many pathogen-related genes and genes involved in phytohormone signaling were expressed, indicative of jasmonic acid biosynthesis and activation of jasmonic acid responsive genes. Cross-comparisons between different libraries based on expression profiles allowed the identification of genes with a potential relevance in egg-induced defenses, as well as other biological processes, including signal transduction, transport and primary metabolism. Conclusion Here we present a dataset for a large-scale study of the mechanisms of plant defense against insect eggs in a co-evolved, natural ecological plant–insect system. The EST database analysis provided here is a first step in elucidating the transcriptional responses of elm to elm leaf beetle infestation, and adds further to our knowledge on insect egg-induced transcriptomic changes in plants. The sequences identified in our comparative analysis give many hints about novel defense mechanisms directed towards eggs. PMID:22702658

  5. An elm EST database for identifying leaf beetle egg-induced defense genes.

    PubMed

    Büchel, Kerstin; McDowell, Eric; Nelson, Will; Descour, Anne; Gershenzon, Jonathan; Hilker, Monika; Soderlund, Carol; Gang, David R; Fenning, Trevor; Meiners, Torsten

    2012-06-15

    Plants can defend themselves against herbivorous insects prior to the onset of larval feeding by responding to the eggs laid on their leaves. In the European field elm (Ulmus minor), egg laying by the elm leaf beetle (Xanthogaleruca luteola) activates the emission of volatiles that attract specialised egg parasitoids, which in turn kill the eggs. Little is known about the transcriptional changes that insect eggs trigger in plants and how such indirect defense mechanisms are orchestrated in the context of other biological processes. Here we present the first large-scale study of egg-induced changes in the transcriptional profile of a tree. Five cDNA libraries were generated from leaves of (i) untreated control elms, and elms treated with (ii) egg laying and feeding by elm leaf beetles, (iii) feeding, (iv) artificial transfer of egg clutches, and (v) methyl jasmonate. A total of 361,196 expressed sequence tags (ESTs) were identified which clustered into 52,823 unique transcripts (Unitrans) and were stored in a database with a public web interface. Among the analyzed Unitrans, 73% could be annotated by homology to known genes in the UniProt (Plant) database, particularly to those from Vitis, Ricinus, Populus and Arabidopsis. Comparative in silico analysis among the different treatments revealed differences in Gene Ontology term abundances. Defense- and stress-related gene transcripts were present in high abundance in leaves after herbivore egg laying, but transcripts involved in photosynthesis showed decreased abundance. Many pathogen-related genes and genes involved in phytohormone signaling were expressed, indicative of jasmonic acid biosynthesis and activation of jasmonic acid responsive genes. Cross-comparisons between different libraries based on expression profiles allowed the identification of genes with a potential relevance in egg-induced defenses, as well as other biological processes, including signal transduction, transport and primary metabolism. Here we present a dataset for a large-scale study of the mechanisms of plant defense against insect eggs in a co-evolved, natural ecological plant-insect system. The EST database analysis provided here is a first step in elucidating the transcriptional responses of elm to elm leaf beetle infestation, and adds further to our knowledge on insect egg-induced transcriptomic changes in plants. The sequences identified in our comparative analysis give many hints about novel defense mechanisms directed towards eggs.

  6. Optimisation of insect cell growth in deep-well blocks: development of a high-throughput insect cell expression screen.

    PubMed

    Bahia, Daljit; Cheung, Robert; Buchs, Mirjam; Geisse, Sabine; Hunt, Ian

    2005-01-01

    This report describes a method to culture insect cells in 24 deep-well blocks for the routine small-scale optimisation of baculovirus-mediated protein expression experiments. Miniaturisation of this process provides the necessary reduction in terms of resource allocation, reagents, and labour to allow extensive and rapid optimisation of expression conditions, with the concomitant reduction in lead-time before commencement of large-scale bioreactor experiments. This therefore greatly simplifies the optimisation process and allows the use of liquid handling robotics in much of the initial optimisation stages of the process, thereby greatly increasing the throughput of the laboratory. We present several examples of the use of deep-well block expression studies in the optimisation of therapeutically relevant protein targets. We also discuss how the enhanced throughput offered by this approach can be adapted to robotic handling systems and the implications this has on the capacity to conduct multi-parallel protein expression studies.

  7. Applications of Proteomic Technologies to Toxicology

    EPA Science Inventory

    Proteomics is the large-scale study of gene expression at the protein level. This cutting edge technology has been extensively applied to toxicology research recently. The up-to-date development of proteomics has presented the toxicology community with an unprecedented opportunit...

  8. A summary and analysis of the low-speed longitudinal characteristics of swept wings at high Reynolds number

    NASA Technical Reports Server (NTRS)

    Furlong, G Chester; Mchugh, James G

    1957-01-01

    An analysis of the longitudinal characteristics of swept wings which is based on available large-scale low-speed data and supplemented with low-scale data when feasible is presented. The emphasis has been placed on the differentiation of the characteristics by a differentiation between the basic flow phenomenon involved. Insofar as possible all large-scale data available as of August 15, 1951 have been summarized in tabular form for ready reference.

  9. D-Light on promoters: a client-server system for the analysis and visualization of cis-regulatory elements

    PubMed Central

    2013-01-01

    Background The binding of transcription factors to DNA plays an essential role in the regulation of gene expression. Numerous experiments elucidated binding sequences which subsequently have been used to derive statistical models for predicting potential transcription factor binding sites (TFBS). The rapidly increasing number of genome sequence data requires sophisticated computational approaches to manage and query experimental and predicted TFBS data in the context of other epigenetic factors and across different organisms. Results We have developed D-Light, a novel client-server software package to store and query large amounts of TFBS data for any number of genomes. Users can add small-scale data to the server database and query them in a large scale, genome-wide promoter context. The client is implemented in Java and provides simple graphical user interfaces and data visualization. Here we also performed a statistical analysis showing what a user can expect for certain parameter settings and we illustrate the usage of D-Light with the help of a microarray data set. Conclusions D-Light is an easy to use software tool to integrate, store and query annotation data for promoters. A public D-Light server, the client and server software for local installation and the source code under GNU GPL license are available at http://biwww.che.sbg.ac.at/dlight. PMID:23617301
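
The abstract mentions statistical models derived from known binding sequences for predicting potential TFBS. One common model of that kind is a log-odds position weight matrix (PWM). The following is a minimal sketch of building and scanning with such a matrix; it is not D-Light's own implementation, and the function names, pseudocount, uniform background, example sites and threshold are all illustrative assumptions.

```python
import math


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM from equal-length binding sites (A/C/G/T only)."""
    width = len(sites[0])
    pwm = []
    for i in range(width):
        # Start every base at the pseudocount so unseen bases get a finite score.
        counts = {base: pseudocount for base in "ACGT"}
        for site in sites:
            counts[site[i]] += 1
        total = sum(counts.values())
        pwm.append({base: math.log2(counts[base] / total / background)
                    for base in "ACGT"})
    return pwm


def scan(sequence, pwm, threshold):
    """Return (start, score) for every window scoring at or above threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(column[base] for column, base in zip(pwm, window))
        if score >= threshold:
            hits.append((start, score))
    return hits


if __name__ == "__main__":
    # Hypothetical training sites and promoter fragment.
    known_sites = ["TATAAT", "TATAAT", "TATGAT", "TACAAT"]
    matrix = build_pwm(known_sites)
    for start, score in scan("GGGTATAATCCC", matrix, threshold=5.0):
        print(start, round(score, 3))
```

The threshold trades sensitivity against the number of spurious matches, which is the kind of parameter choice the abstract's statistical analysis is meant to inform.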

  10. Human, vector and parasite Hsp90 proteins: A comparative bioinformatics analysis.

    PubMed

    Faya, Ngonidzashe; Penkler, David L; Tastan Bishop, Özlem

    2015-01-01

    The treatment of protozoan parasitic diseases is challenging, and thus identification and analysis of new drug targets is important. Parasites survive within host organisms, and some need intermediate hosts to complete their life cycle. Changing host environment puts stress on parasites, and often adaptation is accompanied by the expression of large amounts of heat shock proteins (Hsps). Among Hsps, Hsp90 proteins play an important role in stress environments. Yet, there has been little computational research on Hsp90 proteins to analyze them comparatively as potential parasitic drug targets. Here, an attempt was made to gain detailed insights into the differences between host, vector and parasitic Hsp90 proteins by large-scale bioinformatics analysis. A total of 104 Hsp90 sequences were divided into three groups based on their cellular localizations; namely cytosolic, mitochondrial and endoplasmic reticulum (ER). Further, the parasitic proteins were divided according to the type of parasite (protozoa, helminth and ectoparasite). Primary sequence analysis, phylogenetic tree calculations, motif analysis and physicochemical properties of Hsp90 proteins suggested that despite the overall structural conservation of these proteins, parasitic Hsp90 proteins have unique features which differentiate them from human ones, thus encouraging the idea that protozoan Hsp90 proteins should be further analyzed as potential drug targets.

  11. Animal component-free Agrobacterium tumefaciens cultivation media for better GMP-compliance increases biomass yield and pharmaceutical protein expression in Nicotiana benthamiana.

    PubMed

    Houdelet, Marcel; Galinski, Anna; Holland, Tanja; Wenzel, Kathrin; Schillberg, Stefan; Buyel, Johannes Felix

    2017-04-01

    Transient expression systems allow the rapid production of recombinant proteins in plants. Such systems can be scaled up to several hundred kilograms of biomass, making them suitable for the production of pharmaceutical proteins required at short notice, such as emergency vaccines. However, large-scale transient expression requires the production of recombinant Agrobacterium tumefaciens strains with the capacity for efficient gene transfer to plant cells. The complex media often used for the cultivation of this species typically include animal-derived ingredients that can contain human pathogens, thus conflicting with the requirements of good manufacturing practice (GMP). We replaced all the animal-derived components in yeast extract broth (YEB) cultivation medium with soybean peptone, and then used a design-of-experiments approach to optimize the medium composition, increasing the biomass yield while maintaining high levels of transient expression in subsequent infiltration experiments. The resulting plant peptone Agrobacterium medium (PAM) achieved a two-fold increase in OD600 compared to YEB medium during a 4-L batch fermentation lasting 18 h. Furthermore, the yields of the monoclonal antibody 2G12 and the fluorescent protein DsRed were maintained when the cells were cultivated in PAM rather than YEB. We have thus demonstrated a simple, efficient and scalable method for medium optimization that reduces process time and costs. The final optimized medium for the cultivation of A. tumefaciens completely lacks animal-derived components, thus facilitating the GMP-compliant large-scale transient expression of recombinant proteins in plants. © 2017 The Authors. Biotechnology Journal published by WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  12. Modification of the Creator recombination system for proteomics applications--improved expression by addition of splice sites.

    PubMed

    Colwill, Karen; Wells, Clark D; Elder, Kelly; Goudreault, Marilyn; Hersi, Kadija; Kulkarni, Sarang; Hardy, W Rod; Pawson, Tony; Morin, Gregg B

    2006-03-06

    Recombinational systems have been developed to rapidly shuttle Open Reading Frames (ORFs) into multiple expression vectors in order to analyze the large number of cDNAs available in the post-genomic era. In the Creator system, an ORF introduced into a donor vector can be transferred with Cre recombinase to a library of acceptor vectors optimized for different applications. Usability of the Creator system is impacted by the ability to easily manipulate DNA, the number of acceptor vectors for downstream applications, and the level of protein expression from Creator vectors. To date, we have developed over 20 novel acceptor vectors that employ a variety of promoters and epitope tags commonly employed for proteomics applications and gene function analysis. We also made several enhancements to the donor vectors including addition of different multiple cloning sites to allow shuttling from pre-existing vectors and introduction of the lacZ alpha reporter gene to allow for selection. Importantly, in order to ameliorate any effects on protein expression of the loxP site between a 5' tag and ORF, we introduced a splicing event into our expression vectors. The message produced from the resulting 'Creator Splice' vector undergoes splicing in mammalian systems to remove the loxP site. Upon analysis of our Creator Splice constructs, we discovered that protein expression levels were also significantly increased. The development of new donor and acceptor vectors has increased versatility during the cloning process and made this system compatible with a wider variety of downstream applications. The modifications introduced in our Creator Splice system were designed to remove extraneous sequences due to recombination but also aided in downstream analysis by increasing protein expression levels. As a result, we can now employ epitope tags that are detected less efficiently and reduce our assay scale to allow for higher throughput. 
The Creator Splice system appears to be an extremely useful tool for proteomics.

  13. Modification of the Creator recombination system for proteomics applications – improved expression by addition of splice sites

    PubMed Central

    Colwill, Karen; Wells, Clark D; Elder, Kelly; Goudreault, Marilyn; Hersi, Kadija; Kulkarni, Sarang; Hardy, W Rod; Pawson, Tony; Morin, Gregg B

    2006-01-01

    Background Recombinational systems have been developed to rapidly shuttle Open Reading Frames (ORFs) into multiple expression vectors in order to analyze the large number of cDNAs available in the post-genomic era. In the Creator system, an ORF introduced into a donor vector can be transferred with Cre recombinase to a library of acceptor vectors optimized for different applications. Usability of the Creator system is impacted by the ability to easily manipulate DNA, the number of acceptor vectors for downstream applications, and the level of protein expression from Creator vectors. Results To date, we have developed over 20 novel acceptor vectors that employ a variety of promoters and epitope tags commonly employed for proteomics applications and gene function analysis. We also made several enhancements to the donor vectors including addition of different multiple cloning sites to allow shuttling from pre-existing vectors and introduction of the lacZ alpha reporter gene to allow for selection. Importantly, in order to ameliorate any effects on protein expression of the loxP site between a 5' tag and ORF, we introduced a splicing event into our expression vectors. The message produced from the resulting 'Creator Splice' vector undergoes splicing in mammalian systems to remove the loxP site. Upon analysis of our Creator Splice constructs, we discovered that protein expression levels were also significantly increased. Conclusion The development of new donor and acceptor vectors has increased versatility during the cloning process and made this system compatible with a wider variety of downstream applications. The modifications introduced in our Creator Splice system were designed to remove extraneous sequences due to recombination but also aided in downstream analysis by increasing protein expression levels. As a result, we can now employ epitope tags that are detected less efficiently and reduce our assay scale to allow for higher throughput. 
The Creator Splice system appears to be an extremely useful tool for proteomics. PMID:16519801

  14. Clinical implications of genomic profiles in metastatic breast cancer with a focus on TP53 and PIK3CA, the most frequently mutated genes.

    PubMed

    Kim, Ji-Yeon; Lee, Eunjin; Park, Kyunghee; Park, Woong-Yang; Jung, Hae Hyun; Ahn, Jin Seok; Im, Young-Hyuck; Park, Yeon Hee

    2017-04-25

    Breast cancer (BC) has been genetically profiled through large-scale genome analyses. However, the role and clinical implications of genetic alterations in metastatic BC (MBC) have not been evaluated. Therefore, we conducted whole-exome sequencing (WES) and RNA-Seq of 37 MBC samples and targeted deep sequencing of another 29 MBCs. We evaluated somatic mutations from WES and targeted sequencing, and assessed gene expression and performed pathway analysis from RNA-Seq. In this analysis, PIK3CA was the most commonly mutated gene in estrogen receptor (ER)-positive BC, while in ER-negative BC, TP53 was the most commonly mutated gene (p = 0.018 and p < 0.001, respectively). TP53 stopgain/loss and frameshift mutations were related to low expression of TP53, whereas nonsynonymous mutations were related to high expression. The impact of TP53 mutation on clinical outcome varied with regard to ER status. In ER-positive BCs, wild type TP53 had a better prognosis than mutated TP53 (median overall survival (OS) (wild type vs. mutated): 88.5 ± 54.4 vs. 32.6 ± 10.7 (months), p = 0.002). In contrast, mutated TP53 had a protective effect in ER-negative BCs (median OS: 0.10 vs. 32.6 ± 8.2, p = 0.026). However, PIK3CA mutation did not affect patient survival. In gene expression analysis, CALM1, a potential regulator of AKT, was highly expressed in PIK3CA-mutated BCs. In conclusion, mutation of TP53 was associated with expression status and affected clinical outcome according to ER status in MBC. Although mutation of PIK3CA was not related to survival in this study, mutation of PIK3CA altered the expression of other genes and pathways including CALM1 and may be a potential predictive marker of PI3K inhibitor effectiveness.

  15. Biological classification with RNA-Seq data: Can alternatively spliced transcript expression enhance machine learning classifier?

    PubMed

    Johnson, Nathan T; Dhroso, Andi; Hughes, Katelyn J; Korkin, Dmitry

    2018-06-25

    The extent to which genes are expressed in the cell can be simplistically defined as a function of one or more factors of the environment, lifestyle, and genetics. RNA sequencing (RNA-Seq) is becoming a prevalent approach to quantify gene expression, and is expected to yield better insights into a number of biological and biomedical questions than DNA microarrays. Most importantly, RNA-Seq allows expression to be quantified at the gene and alternative splicing isoform levels. However, leveraging RNA-Seq data requires development of new data mining and analytics methods. Supervised machine learning methods are commonly used approaches for biological data analysis, and have recently gained attention for their applications to RNA-Seq data. In this work, we assess the utility of supervised learning methods trained on RNA-Seq data for a diverse range of biological classification tasks. We hypothesize that the isoform-level expression data is more informative for biological classification tasks than the gene-level expression data. Our large-scale assessment is done through utilizing multiple datasets, organisms, lab groups, and RNA-Seq analysis pipelines. Overall, we performed and assessed 61 biological classification problems that leverage three independent RNA-Seq datasets and include over 2,000 samples that come from multiple organisms, lab groups, and RNA-Seq analyses. These 61 problems include predictions of the tissue type, sex, or age of the sample, healthy or cancerous phenotypes, and the pathological tumor stage for the samples from the cancerous tissue. For each classification problem, the performance of three normalization techniques and six machine learning classifiers was explored. We find that for every single classification problem, the isoform-based classifiers outperform or are comparable with gene-expression-based methods. The top-performing supervised learning techniques reached a near-perfect classification accuracy, demonstrating the utility of supervised learning for RNA-Seq-based data analysis. Published by Cold Spring Harbor Laboratory Press for the RNA Society.
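
    The hypothesis in this abstract, that isoform-level expression can carry class information that gene-level totals lose, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not name its six classifiers, so a nearest-centroid classifier with leave-one-out evaluation stands in for them, and the isoform names, expression values, and labels are hypothetical toy data rather than anything from the study.

```python
from collections import defaultdict


def gene_level(sample, isoform_to_gene):
    """Collapse isoform-level expression to gene level by summing isoforms."""
    totals = defaultdict(float)
    for isoform, value in sample.items():
        totals[isoform_to_gene[isoform]] += value
    return dict(totals)


def loo_accuracy(samples, labels):
    """Leave-one-out accuracy of a nearest-centroid classifier.

    Ties between equally distant centroids go to the alphabetically
    first label, so the result is deterministic.
    """
    features = sorted(samples[0])
    correct = 0
    for i, (held_out, truth) in enumerate(zip(samples, labels)):
        sums, counts = {}, defaultdict(int)
        for j, (sample, label) in enumerate(zip(samples, labels)):
            if j == i:
                continue  # the held-out sample is excluded from training
            vec = sums.setdefault(label, [0.0] * len(features))
            for k, name in enumerate(features):
                vec[k] += sample[name]
            counts[label] += 1

        def squared_distance(label):
            return sum(
                (held_out[name] - sums[label][k] / counts[label]) ** 2
                for k, name in enumerate(features)
            )

        predicted = min(sorted(sums), key=squared_distance)
        correct += predicted == truth
    return correct / len(samples)


# Hypothetical toy data: one gene with two isoforms that switch between
# classes while the gene-level total stays constant.
isoform_to_gene = {"G1.a": "G1", "G1.b": "G1"}
isoform_samples = [
    {"G1.a": 8.0, "G1.b": 2.0},
    {"G1.a": 7.0, "G1.b": 3.0},
    {"G1.a": 2.0, "G1.b": 8.0},
    {"G1.a": 3.0, "G1.b": 7.0},
]
labels = ["A", "A", "B", "B"]
gene_samples = [gene_level(s, isoform_to_gene) for s in isoform_samples]

print("isoform-level accuracy:", loo_accuracy(isoform_samples, labels))
print("gene-level accuracy:", loo_accuracy(gene_samples, labels))
```

    In this constructed case the isoform switch is invisible at gene level, so the gene-level classifier can do no better than its tie-break. Real data would rarely be this clean, which is consistent with the abstract's more cautious wording of "outperform or are comparable".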

  16. The effects of intermittency on statistical characteristics of turbulence and scale similarity of breakdown coefficients

    NASA Astrophysics Data System (ADS)

    Novikov, E. A.

    1990-05-01

    The influence of intermittency on turbulent diffusion is expressed in terms of the statistics of the dissipation field. The high-order moments of relative diffusion are obtained by using the concept of scale similarity of the breakdown coefficients (bdc). The method of bdc is useful for obtaining new models and general results, which then can be expressed in terms of multifractals. In particular, the concavity and other properties of spectral codimension are proved. Special attention is paid to the logarithmically periodic modulations. The parametrization of small-scale intermittent turbulence, which can be used for large-eddy simulation, is presented. The effect of molecular viscosity is taken into account in the spirit of the renorm group, but without spectral series, ɛ expansion, and fictitious random forces.

  17. Scaling up to address data science challenges

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wendelberger, Joanne R.

    Statistics and Data Science provide a variety of perspectives and technical approaches for exploring and understanding Big Data. Partnerships between scientists from different fields such as statistics, machine learning, computer science, and applied mathematics can lead to innovative approaches for addressing problems involving increasingly large amounts of data in a rigorous and effective manner that takes advantage of advances in computing. Here, this article will explore various challenges in Data Science and will highlight statistical approaches that can facilitate analysis of large-scale data including sampling and data reduction methods, techniques for effective analysis and visualization of large-scale simulations, and algorithms and procedures for efficient processing.

  18. Scaling up to address data science challenges

    DOE PAGES

    Wendelberger, Joanne R.

    2017-04-27

    Statistics and Data Science provide a variety of perspectives and technical approaches for exploring and understanding Big Data. Partnerships between scientists from different fields such as statistics, machine learning, computer science, and applied mathematics can lead to innovative approaches for addressing problems involving increasingly large amounts of data in a rigorous and effective manner that takes advantage of advances in computing. Here, this article will explore various challenges in Data Science and will highlight statistical approaches that can facilitate analysis of large-scale data including sampling and data reduction methods, techniques for effective analysis and visualization of large-scale simulations, and algorithms and procedures for efficient processing.

  19. Three-dimensional constrained variational analysis: Approach and application to analysis of atmospheric diabatic heating and derivative fields during an ARM SGP intensive observational period

    NASA Astrophysics Data System (ADS)

    Tang, Shuaiqi; Zhang, Minghua

    2015-08-01

    Atmospheric vertical velocities and advective tendencies are essential large-scale forcing data to drive single-column models (SCMs), cloud-resolving models (CRMs), and large-eddy simulations (LESs). However, they cannot be directly measured from field measurements or easily calculated with great accuracy. In the Atmospheric Radiation Measurement Program (ARM), a constrained variational algorithm (1-D constrained variational analysis (1DCVA)) has been used to derive large-scale forcing data over a sounding network domain with the aid of flux measurements at the surface and top of the atmosphere (TOA). The 1DCVA algorithm is now extended into three dimensions (3DCVA) along with other improvements to calculate gridded large-scale forcing data, diabatic heating sources (Q1), and moisture sinks (Q2). Results are presented for a midlatitude cyclone case study on 3 March 2000 at the ARM Southern Great Plains site. These results are used to evaluate the diabatic heating fields in the available products such as Rapid Update Cycle, ERA-Interim, National Centers for Environmental Prediction Climate Forecast System Reanalysis, Modern-Era Retrospective Analysis for Research and Applications, Japanese 55-year Reanalysis, and North American Regional Reanalysis. We show that although the analysis/reanalysis generally captures the atmospheric state of the cyclone, their biases in the derivative terms (Q1 and Q2) at regional scale of a few hundred kilometers are large and all analyses/reanalyses tend to underestimate the subgrid-scale upward transport of moist static energy in the lower troposphere. The 3DCVA-gridded large-scale forcing data are physically consistent with the spatial distribution of surface and TOA measurements of radiation, precipitation, latent and sensible heat fluxes, and clouds that are better suited to force SCMs, CRMs, and LESs. Possible applications of the 3DCVA are discussed.

  20. Meta-analysis methods for combining multiple expression profiles: comparisons, statistical characterization and an application guideline

    PubMed Central

    2013-01-01

    Background: As high-throughput genomic technologies become accurate and affordable, an increasing number of data sets have been accumulated in the public domain and genomic information integration and meta-analysis have become routine in biomedical research. In this paper, we focus on microarray meta-analysis, where multiple microarray studies with relevant biological hypotheses are combined in order to improve candidate marker detection. Many methods have been developed and applied in the literature, but their performance and properties have only been minimally investigated. There is currently no clear conclusion or guideline as to the proper choice of a meta-analysis method given an application; the decision essentially requires both statistical and biological considerations. Results: We performed 12 microarray meta-analysis methods for combining multiple simulated expression profiles, and such methods can be categorized for different hypothesis setting purposes: (1) HS_A: DE genes with non-zero effect sizes in all studies, (2) HS_B: DE genes with non-zero effect sizes in one or more studies and (3) HS_r: DE gene with non-zero effect in "majority" of studies. We then performed a comprehensive comparative analysis through six large-scale real applications using four quantitative statistical evaluation criteria: detection capability, biological association, stability and robustness. We elucidated hypothesis settings behind the methods and further apply multi-dimensional scaling (MDS) and an entropy measure to characterize the meta-analysis methods and data structure, respectively. Conclusions: The aggregated results from the simulation study categorized the 12 methods into three hypothesis settings (HS_A, HS_B, and HS_r). Evaluation in real data and results from MDS and entropy analyses provided an insightful and practical guideline to the choice of the most suitable method in a given application. All source files for simulation and real data are available on the author’s publication website. PMID:24359104

  1. Meta-analysis methods for combining multiple expression profiles: comparisons, statistical characterization and an application guideline.

    PubMed

    Chang, Lun-Ching; Lin, Hui-Min; Sibille, Etienne; Tseng, George C

    2013-12-21

    As high-throughput genomic technologies become accurate and affordable, an increasing number of data sets have been accumulated in the public domain and genomic information integration and meta-analysis have become routine in biomedical research. In this paper, we focus on microarray meta-analysis, where multiple microarray studies with relevant biological hypotheses are combined in order to improve candidate marker detection. Many methods have been developed and applied in the literature, but their performance and properties have only been minimally investigated. There is currently no clear conclusion or guideline as to the proper choice of a meta-analysis method given an application; the decision essentially requires both statistical and biological considerations. We performed 12 microarray meta-analysis methods for combining multiple simulated expression profiles, and such methods can be categorized for different hypothesis setting purposes: (1) HS(A): DE genes with non-zero effect sizes in all studies, (2) HS(B): DE genes with non-zero effect sizes in one or more studies and (3) HS(r): DE gene with non-zero effect in "majority" of studies. We then performed a comprehensive comparative analysis through six large-scale real applications using four quantitative statistical evaluation criteria: detection capability, biological association, stability and robustness. We elucidated hypothesis settings behind the methods and further apply multi-dimensional scaling (MDS) and an entropy measure to characterize the meta-analysis methods and data structure, respectively. The aggregated results from the simulation study categorized the 12 methods into three hypothesis settings (HS(A), HS(B), and HS(r)). Evaluation in real data and results from MDS and entropy analyses provided an insightful and practical guideline to the choice of the most suitable method in a given application. 
All source files for simulation and real data are available on the author's publication website.
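
    The difference between the "all studies" and "one or more studies" hypothesis settings can be made concrete with two classical p-value combination rules. This is a minimal sketch, not the paper's code: the abstract does not list its 12 methods, so the choice of Fisher's combined statistic (commonly associated with the "one or more studies" setting) and the maximum p-value rule (commonly associated with the "all studies" setting) is an assumption made here, as are the function names and the example p-values. Both rules assume independent studies.

```python
import math


def chi2_sf_even_df(x, df):
    """Survival function of a chi-square variable with even df.

    For df = 2k the closed form is exp(-x/2) * sum_{i<k} (x/2)^i / i!.
    """
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def fisher_combined_p(pvalues):
    """Fisher's method: -2 * sum(ln p) is chi-square with 2k df under the null."""
    if any(not 0.0 < p <= 1.0 for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")
    statistic = -2.0 * sum(math.log(p) for p in pvalues)
    return chi2_sf_even_df(statistic, 2 * len(pvalues))


def max_p_combined(pvalues):
    """Maximum p-value rule: the largest of k uniform p-values has CDF x**k."""
    return max(pvalues) ** len(pvalues)


# Hypothetical gene that is strongly significant in one study only.
pvalues = [0.01, 0.9]
print("Fisher:", fisher_combined_p(pvalues))
print("maxP:  ", max_p_combined(pvalues))
```

    For a gene that is significant in only one of the two studies, Fisher's method yields a much smaller combined p-value than the maximum p-value rule, which is the behaviour the two hypothesis settings are meant to separate.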

  2. ReSeqTools: an integrated toolkit for large-scale next-generation sequencing based resequencing analysis.

    PubMed

    He, W; Zhao, S; Liu, X; Dong, S; Lv, J; Liu, D; Wang, J; Meng, Z

    2013-12-04

    Large-scale next-generation sequencing (NGS)-based resequencing detects sequence variations, constructs evolutionary histories, and identifies phenotype-related genotypes. However, NGS-based resequencing studies generate extraordinarily large amounts of data, making computations difficult. Effective use and analysis of these data for NGS-based resequencing studies remains a difficult task for individual researchers. Here, we introduce ReSeqTools, a full-featured toolkit for NGS (Illumina sequencing)-based resequencing analysis, which processes raw data, interprets mapping results, and identifies and annotates sequence variations. ReSeqTools provides abundant scalable functions for routine resequencing analysis in different modules to facilitate customization of the analysis pipeline. ReSeqTools is designed to use compressed data files as input or output to save storage space and facilitates faster and more computationally efficient large-scale resequencing studies in a user-friendly manner. It offers abundant practical functions and generates useful statistics during the analysis pipeline, which significantly simplifies resequencing analysis. Its integrated algorithms and abundant sub-functions provide a solid foundation for special demands in resequencing projects. Users can combine these functions to construct their own pipelines for other purposes.

  3. SQDFT: Spectral Quadrature method for large-scale parallel O(N) Kohn-Sham calculations at high temperature

    NASA Astrophysics Data System (ADS)

    Suryanarayana, Phanish; Pratapa, Phanisri P.; Sharma, Abhiraj; Pask, John E.

    2018-03-01

    We present SQDFT: a large-scale parallel implementation of the Spectral Quadrature (SQ) method for O(N) Kohn-Sham Density Functional Theory (DFT) calculations at high temperature. Specifically, we develop an efficient and scalable finite-difference implementation of the infinite-cell Clenshaw-Curtis SQ approach, in which results for the infinite crystal are obtained by expressing quantities of interest as bilinear forms or sums of bilinear forms, that are then approximated by spatially localized Clenshaw-Curtis quadrature rules. We demonstrate the accuracy of SQDFT by showing systematic convergence of energies and atomic forces with respect to SQ parameters to reference diagonalization results, and convergence with discretization to established planewave results, for both metallic and insulating systems. We further demonstrate that SQDFT achieves excellent strong and weak parallel scaling on computer systems consisting of tens of thousands of processors, with near perfect O(N) scaling with system size and wall times as low as a few seconds per self-consistent field iteration. Finally, we verify the accuracy of SQDFT in large-scale quantum molecular dynamics simulations of aluminum at high temperature.

  4. A topology visualization early warning distribution algorithm for large-scale network security incidents.

    PubMed

    He, Hui; Fan, Guotao; Ye, Jianwei; Zhang, Weizhe

    2013-01-01

    It is of great significance to research the early warning system for large-scale network security incidents. It can improve the network system's emergency response capabilities, alleviate the cyber attacks' damage, and strengthen the system's counterattack ability. A comprehensive early warning system is presented in this paper, which combines active measurement and anomaly detection. The key visualization algorithm and technology of the system are mainly discussed. The large-scale network system's plane visualization is realized using a divide-and-conquer approach. First, the topology of the large-scale network is divided into some small-scale networks by the MLkP/CR algorithm. Second, the sub graph plane visualization algorithm is applied to each small-scale network. Finally, the small-scale networks' topologies are combined into a topology based on the automatic distribution algorithm of force analysis. As the algorithm transforms the large-scale network topology plane visualization problem into a series of small-scale network topology plane visualization and distribution problems, it has higher parallelism and is able to handle the display of ultra-large-scale network topology.

  5. Multilevel Latent Class Analysis for Large-Scale Educational Assessment Data: Exploring the Relation between the Curriculum and Students' Mathematical Strategies

    ERIC Educational Resources Information Center

    Fagginger Auer, Marije F.; Hickendorff, Marian; Van Putten, Cornelis M.; Béguin, Anton A.; Heiser, Willem J.

    2016-01-01

    A first application of multilevel latent class analysis (MLCA) to educational large-scale assessment data is demonstrated. This statistical technique addresses several of the challenges that assessment data offers. Importantly, MLCA allows modeling of the often ignored teacher effects and of the joint influence of teacher and student variables.…

  6. Modification of the parallel scattering mean free path of cosmic rays in the presence of adiabatic focusing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    He, H.-Q.; Schlickeiser, R., E-mail: hqhe@mail.iggcas.ac.cn, E-mail: rsch@tp4.rub.de

    The cosmic ray mean free path in a large-scale nonuniform guide magnetic field with superposed magnetostatic turbulence is calculated to clarify some conflicting results in the literature. A new, exact integro-differential equation for the cosmic-ray anisotropy is derived from the Fokker-Planck transport equation. A perturbation analysis of this integro-differential equation leads to an analytical expression for the cosmic ray anisotropy and the focused transport equation for the isotropic part of the cosmic ray distribution function. The derived parallel spatial diffusion coefficient and the associated cosmic ray mean free path include the effect of adiabatic focusing and reduce to the standard forms in the limit of a uniform guide magnetic field. For the illustrative case of isotropic pitch angle scattering, the derived mean free path agrees with the earlier expressions of Beeck and Wibberenz, Bieber and Burger, Kota, and Litvinenko, but disagrees with the result of Shalchi. The disagreement with the expression of Shalchi is particularly strong in the limit of strong adiabatic focusing.

  7. Genome-Level Longitudinal Expression of Signaling Pathways and Gene Networks in Pediatric Septic Shock

    PubMed Central

    Shanley, Thomas P; Cvijanovich, Natalie; Lin, Richard; Allen, Geoffrey L; Thomas, Neal J; Doctor, Allan; Kalyanaraman, Meena; Tofil, Nancy M; Penfil, Scott; Monaco, Marie; Odoms, Kelli; Barnes, Michael; Sakthivel, Bhuvaneswari; Aronow, Bruce J; Wong, Hector R

    2007-01-01

    We have conducted longitudinal studies focused on the expression profiles of signaling pathways and gene networks in children with septic shock. Genome-level expression profiles were generated from whole blood-derived RNA of children with septic shock (n = 30) corresponding to day one and day three of septic shock, respectively. Based on sequential statistical and expression filters, day one and day three of septic shock were characterized by differential regulation of 2,142 and 2,504 gene probes, respectively, relative to controls (n = 15). Venn analysis demonstrated 239 unique genes in the day one dataset, 598 unique genes in the day three dataset, and 1,906 genes common to both datasets. Functional analyses demonstrated time-dependent, differential regulation of genes involved in multiple signaling pathways and gene networks primarily related to immunity and inflammation. Notably, multiple and distinct gene networks involving T cell- and MHC antigen-related biology were persistently downregulated on both day one and day three. Further analyses demonstrated large scale, persistent downregulation of genes corresponding to functional annotations related to zinc homeostasis. These data represent the largest reported cohort of patients with septic shock subjected to longitudinal genome-level expression profiling. The data further advance our genome-level understanding of pediatric septic shock and support novel hypotheses. PMID:17932561
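
    The Venn analysis described in this abstract amounts to set differences and an intersection over two gene lists. The following is a minimal sketch of that bookkeeping; the function name and the gene identifiers are hypothetical placeholders, not data from the study.

```python
def venn_counts(day_one_genes, day_three_genes):
    """Count genes unique to each list and genes common to both."""
    first, second = set(day_one_genes), set(day_three_genes)
    return {
        "day_one_only": len(first - second),
        "day_three_only": len(second - first),
        "common": len(first & second),
    }


# Toy identifiers (hypothetical, not from the study).
counts = venn_counts(["g1", "g2", "g3"], ["g2", "g3", "g4", "g5"])
print(counts)  # {'day_one_only': 1, 'day_three_only': 2, 'common': 2}
```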

  8. LINCS Canvas Browser: interactive web app to query, browse and interrogate LINCS L1000 gene expression signatures.

    PubMed

    Duan, Qiaonan; Flynn, Corey; Niepel, Mario; Hafner, Marc; Muhlich, Jeremy L; Fernandez, Nicolas F; Rouillard, Andrew D; Tan, Christopher M; Chen, Edward Y; Golub, Todd R; Sorger, Peter K; Subramanian, Aravind; Ma'ayan, Avi

    2014-07-01

    For the Library of Integrated Network-based Cellular Signatures (LINCS) project many gene expression signatures using the L1000 technology have been produced. The L1000 technology is a cost-effective method to profile gene expression in large scale. LINCS Canvas Browser (LCB) is an interactive HTML5 web-based software application that facilitates querying, browsing and interrogating many of the currently available LINCS L1000 data. LCB implements two compacted layered canvases, one to visualize clustered L1000 expression data, and the other to display enrichment analysis results using 30 different gene set libraries. Clicking on an experimental condition highlights gene-sets enriched for the differentially expressed genes from the selected experiment. A search interface allows users to input gene lists and query them against over 100 000 conditions to find the top matching experiments. The tool integrates many resources for an unprecedented potential for new discoveries in systems biology and systems pharmacology. The LCB application is available at http://www.maayanlab.net/LINCS/LCB. Customized versions will be made part of the http://lincscloud.org and http://lincs.hms.harvard.edu websites. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  9. Tools for understanding landscapes: combining large-scale surveys to characterize change. Chapter 9.

    Treesearch

    W. Keith Moser; Janine Bolliger; Don C. Bragg; Mark H. Hansen; Mark A. Hatfield; Timothy A. Nigh; Lisa A. Schulte

    2008-01-01

    All landscapes change continuously. Since change is perceived and interpreted through measures of scale, any quantitative analysis of landscapes must identify and describe the spatiotemporal mosaics shaped by large-scale structures and processes. This process is controlled by core influences, or "drivers," that shape the change and affect the outcome...

  10. Evolution of a Cellular Immune Response in Drosophila: A Phenotypic and Genomic Comparative Analysis

    PubMed Central

    Salazar-Jaramillo, Laura; Paspati, Angeliki; van de Zande, Louis; Vermeulen, Cornelis Joseph; Schwander, Tanja; Wertheim, Bregje

    2014-01-01

    Understanding the genomic basis of evolutionary adaptation requires insight into the molecular basis underlying phenotypic variation. However, even changes in molecular pathways associated with extreme variation, gains and losses of specific phenotypes, remain largely uncharacterized. Here, we investigate the large interspecific differences in the ability to survive infection by parasitoids across 11 Drosophila species and identify genomic changes associated with gains and losses of parasitoid resistance. We show that a cellular immune defense, encapsulation, and the production of a specialized blood cell, lamellocytes, are restricted to a sublineage of Drosophila, but that encapsulation is absent in one species of this sublineage, Drosophila sechellia. Our comparative analyses of hemopoiesis pathway genes and of genes differentially expressed during the encapsulation response revealed that hemopoiesis-associated genes are highly conserved and present in all species independently of their resistance. In contrast, 11 genes that are differentially expressed during the response to parasitoids are novel genes, specific to the Drosophila sublineage capable of lamellocyte-mediated encapsulation. These novel genes, which are predominantly expressed in hemocytes, arose via duplications, whereby five of them also showed signatures of positive selection, as expected if they were recruited for new functions. Three of these novel genes further showed large-scale and presumably loss-of-function sequence changes in D. sechellia, consistent with the loss of resistance in this species. In combination, these convergent lines of evidence suggest that co-option of duplicated genes in existing pathways and subsequent neofunctionalization are likely to have contributed to the evolution of the lamellocyte-mediated encapsulation in Drosophila. PMID:24443439

  11. From Ambiguities to Insights: Query-based Comparisons of High-Dimensional Data

    NASA Astrophysics Data System (ADS)

    Kowalski, Jeanne; Talbot, Conover; Tsai, Hua L.; Prasad, Nijaguna; Umbricht, Christopher; Zeiger, Martha A.

    2007-11-01

    Genomic technologies will revolutionize drug discovery and development; that much is universally agreed upon. The high dimension of data from such technologies has challenged available data analytic methods; that much is apparent. To date, large-scale data repositories have not been utilized in ways that permit their wealth of information to be efficiently processed for knowledge, presumably due in large part to inadequate analytical tools to address numerous comparisons of high-dimensional data. In candidate gene discovery, expression comparisons are often made between two features (e.g., cancerous versus normal), such that the enumeration of outcomes is manageable. With multiple features, the setting becomes more complex, in terms of comparing expression levels of tens of thousands of transcripts across hundreds of features. In this case, the number of outcomes, while enumerable, rapidly becomes large and unmanageable, and scientific inquiries become more abstract, such as "which one of these (compounds, stimuli, etc.) is not like the others?" We develop analytical tools that promote more extensive, efficient, and rigorous utilization of the public data resources generated by the massive support of genomic studies. Our work innovates by enabling access to such metadata with logically formulated scientific inquiries that define, compare and integrate query-comparison pair relations for analysis. We demonstrate our computational tool's potential to address an outstanding biomedical informatics issue of identifying reliable molecular markers in thyroid cancer. Our proposed query-based comparison (QBC) facilitates access to and efficient utilization of metadata through logically formed inquiries expressed as query-based comparisons by organizing and comparing results from biotechnologies to address applications in biomedicine.

  12. Evolution of a cellular immune response in Drosophila: a phenotypic and genomic comparative analysis.

    PubMed

    Salazar-Jaramillo, Laura; Paspati, Angeliki; van de Zande, Louis; Vermeulen, Cornelis Joseph; Schwander, Tanja; Wertheim, Bregje

    2014-02-01

    Understanding the genomic basis of evolutionary adaptation requires insight into the molecular basis underlying phenotypic variation. However, even changes in molecular pathways associated with extreme variation, gains and losses of specific phenotypes, remain largely uncharacterized. Here, we investigate the large interspecific differences in the ability to survive infection by parasitoids across 11 Drosophila species and identify genomic changes associated with gains and losses of parasitoid resistance. We show that a cellular immune defense, encapsulation, and the production of a specialized blood cell, lamellocytes, are restricted to a sublineage of Drosophila, but that encapsulation is absent in one species of this sublineage, Drosophila sechellia. Our comparative analyses of hemopoiesis pathway genes and of genes differentially expressed during the encapsulation response revealed that hemopoiesis-associated genes are highly conserved and present in all species independently of their resistance. In contrast, 11 genes that are differentially expressed during the response to parasitoids are novel genes, specific to the Drosophila sublineage capable of lamellocyte-mediated encapsulation. These novel genes, which are predominantly expressed in hemocytes, arose via duplications, whereby five of them also showed signatures of positive selection, as expected if they were recruited for new functions. Three of these novel genes further showed large-scale and presumably loss-of-function sequence changes in D. sechellia, consistent with the loss of resistance in this species. In combination, these convergent lines of evidence suggest that co-option of duplicated genes in existing pathways and subsequent neofunctionalization are likely to have contributed to the evolution of the lamellocyte-mediated encapsulation in Drosophila.

  13. Biochemical Diversification through Foreign Gene Expression in Bdelloid Rotifers

    PubMed Central

    Eyres, Isobel; Wang-Koh, Yuan; Lubzens, Esther; Barraclough, Timothy G.; Micklem, Gos; Tunnacliffe, Alan

    2012-01-01

    Bdelloid rotifers are microinvertebrates with unique characteristics: they have survived tens of millions of years without sexual reproduction; they withstand extreme desiccation by undergoing anhydrobiosis; and they tolerate very high levels of ionizing radiation. Recent evidence suggests that subtelomeric regions of the bdelloid genome contain sequences originating from other organisms by horizontal gene transfer (HGT), of which some are known to be transcribed. However, the extent to which foreign gene expression plays a role in bdelloid physiology is unknown. We address this in the first large scale analysis of the transcriptome of the bdelloid Adineta ricciae: cDNA libraries from hydrated and desiccated bdelloids were subjected to massively parallel sequencing and assembled transcripts compared against the UniProtKB database by blastx to identify their putative products. Of ∼29,000 matched transcripts, ∼10% were inferred from blastx matches to be horizontally acquired, mainly from eubacteria but also from fungi, protists, and algae. After allowing for possible sources of error, the rate of HGT is at least 8%–9%, a level significantly higher than other invertebrates. We verified their foreign nature by phylogenetic analysis and by demonstrating linkage of foreign genes with metazoan genes in the bdelloid genome. Approximately 80% of horizontally acquired genes expressed in bdelloids code for enzymes, and these represent 39% of enzymes in identified pathways. Many enzymes encoded by foreign genes enhance biochemistry in bdelloids compared to other metazoans, for example, by potentiating toxin degradation or generation of antioxidants and key metabolites. They also supplement, and occasionally potentially replace, existing metazoan functions. Bdelloid rotifers therefore express horizontally acquired genes on a scale unprecedented in animals, and foreign genes make a profound contribution to their metabolism. 
This represents a potential mechanism for ancient asexuals to adapt rapidly to changing environments and thereby persist over long evolutionary time periods in the absence of sex. PMID:23166508

  14. Normalization of oxygen and hydrogen isotope data

    USGS Publications Warehouse

    Coplen, T.B.

    1988-01-01

    To resolve confusion due to expression of isotopic data from different laboratories on non-corresponding scales, oxygen isotope analyses of all substances can be expressed relative to VSMOW or VPDB (Vienna Peedee belemnite) on scales normalized such that the δ18O of SLAP is -55.5‰ relative to VSMOW. H3+ contribution in hydrogen isotope ratio analysis can be easily determined using two gaseous reference samples that differ greatly in deuterium content. © 1988.
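
    The normalization described here can be read as a linear stretch of a laboratory's measured scale so that SLAP lands on its assigned oxygen isotope value of -55.5 per mil relative to VSMOW. The sketch below assumes the laboratory has already expressed both the sample and its own SLAP measurement relative to its VSMOW measurement; the function name and the example readings are hypothetical.

```python
# Assigned delta-18-O of SLAP relative to VSMOW, in per mil (from the abstract).
SLAP_D18O_ASSIGNED = -55.5


def normalize_d18o(delta_sample_measured, delta_slap_measured):
    """Rescale a measured delta-18-O value onto the normalized VSMOW-SLAP scale.

    Both inputs are in per mil relative to the laboratory's own VSMOW
    measurement. VSMOW stays at 0 and SLAP is mapped to -55.5.
    """
    if delta_slap_measured == 0:
        raise ValueError("measured SLAP value must be non-zero")
    return delta_sample_measured * SLAP_D18O_ASSIGNED / delta_slap_measured


# Hypothetical laboratory whose scale is slightly compressed:
# it measures SLAP at -55.0 per mil instead of -55.5.
print(normalize_d18o(-27.5, -55.0))  # -27.75
```

    A two-point stretch like this leaves values at the VSMOW end of the scale unchanged and applies the largest correction near SLAP.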

  15. Topological Properties of Some Integrated Circuits for Very Large Scale Integration Chip Designs

    NASA Astrophysics Data System (ADS)

    Swanson, S.; Lanzerotti, M.; Vernizzi, G.; Kujawski, J.; Weatherwax, A.

    2015-03-01

    This talk presents topological properties of integrated circuits for Very Large Scale Integration chip designs. These circuits can be implemented in very large scale integrated circuits, such as those in high performance microprocessors. Prior work considered basic combinational logic functions and produced a mathematical framework based on algebraic topology for integrated circuits composed of logic gates. Prior work also produced an historically-equivalent interpretation of Mr. E. F. Rent's work for today's complex circuitry in modern high performance microprocessors, where a heuristic linear relationship was observed between the number of connections and number of logic gates. This talk will examine topological properties and connectivity of more complex functionally-equivalent integrated circuits. The views expressed in this article are those of the author and do not reflect the official policy or position of the United States Air Force, Department of Defense or the U.S. Government.

  16. Constraints on the Origin of Cosmic Rays above 10^18 eV from Large-scale Anisotropy Searches in Data of the Pierre Auger Observatory

    NASA Astrophysics Data System (ADS)

    Pierre Auger Collaboration; Abreu, P.; Aglietta, M.; Ahlers, M.; Ahn, E. J.; Albuquerque, I. F. M.; Allard, D.; Allekotte, I.; Allen, J.; Allison, P.; Almela, A.; Alvarez Castillo, J.; Alvarez-Muñiz, J.; Alves Batista, R.; Ambrosio, M.; Aminaei, A.; Anchordoqui, L.; Andringa, S.; Antiči'c, T.; Aramo, C.; Arganda, E.; Arqueros, F.; Asorey, H.; Assis, P.; Aublin, J.; Ave, M.; Avenier, M.; Avila, G.; Badescu, A. M.; Balzer, M.; Barber, K. B.; Barbosa, A. F.; Bardenet, R.; Barroso, S. L. C.; Baughman, B.; Bäuml, J.; Baus, C.; Beatty, J. J.; Becker, K. H.; Bellétoile, A.; Bellido, J. A.; BenZvi, S.; Berat, C.; Bertou, X.; Biermann, P. L.; Billoir, P.; Blanco, F.; Blanco, M.; Bleve, C.; Blümer, H.; Boháčová, M.; Boncioli, D.; Bonifazi, C.; Bonino, R.; Borodai, N.; Brack, J.; Brancus, I.; Brogueira, P.; Brown, W. C.; Bruijn, R.; Buchholz, P.; Bueno, A.; Buroker, L.; Burton, R. E.; Caballero-Mora, K. S.; Caccianiga, B.; Caramete, L.; Caruso, R.; Castellina, A.; Catalano, O.; Cataldi, G.; Cazon, L.; Cester, R.; Chauvin, J.; Cheng, S. H.; Chiavassa, A.; Chinellato, J. A.; Chirinos Diaz, J.; Chudoba, J.; Cilmo, M.; Clay, R. W.; Cocciolo, G.; Collica, L.; Coluccia, M. R.; Conceição, R.; Contreras, F.; Cook, H.; Cooper, M. J.; Coppens, J.; Cordier, A.; Coutu, S.; Covault, C. E.; Creusot, A.; Criss, A.; Cronin, J.; Curutiu, A.; Dagoret-Campagne, S.; Dallier, R.; Daniel, B.; Dasso, S.; Daumiller, K.; Dawson, B. R.; de Almeida, R. M.; De Domenico, M.; De Donato, C.; de Jong, S. J.; De La Vega, G.; de Mello Junior, W. J. M.; de Mello Neto, J. R. T.; De Mitri, I.; de Souza, V.; de Vries, K. D.; del Peral, L.; del Río, M.; Deligny, O.; Dembinski, H.; Dhital, N.; Di Giulio, C.; Díaz Castro, M. L.; Diep, P. N.; Diogo, F.; Dobrigkeit, C.; Docters, W.; D'Olivo, J. C.; Dong, P. N.; Dorofeev, A.; dos Anjos, J. C.; Dova, M. T.; D'Urso, D.; Dutan, I.; Ebr, J.; Engel, R.; Erdmann, M.; Escobar, C. 
O.; Espadanal, J.; Etchegoyen, A.; Facal San Luis, P.; Falcke, H.; Fang, K.; Farrar, G.; Fauth, A. C.; Fazzini, N.; Ferguson, A. P.; Fick, B.; Figueira, J. M.; Filevich, A.; Filipčič, A.; Fliescher, S.; Fracchiolla, C. E.; Fraenkel, E. D.; Fratu, O.; Fröhlich, U.; Fuchs, B.; Gaior, R.; Gamarra, R. F.; Gambetta, S.; García, B.; Garcia Roca, S. T.; Garcia-Gamez, D.; Garcia-Pinto, D.; Garilli, G.; Gascon Bravo, A.; Gemmeke, H.; Ghia, P. L.; Giller, M.; Gitto, J.; Glass, H.; Gold, M. S.; Golup, G.; Gomez Albarracin, F.; Gómez Berisso, M.; Gómez Vitale, P. F.; Gonçalves, P.; Gonzalez, J. G.; Gookin, B.; Gorgi, A.; Gouffon, P.; Grashorn, E.; Grebe, S.; Griffith, N.; Grillo, A. F.; Guardincerri, Y.; Guarino, F.; Guedes, G. P.; Hansen, P.; Harari, D.; Harrison, T. A.; Harton, J. L.; Haungs, A.; Hebbeker, T.; Heck, D.; Herve, A. E.; Hill, G. C.; Hojvat, C.; Hollon, N.; Holmes, V. C.; Homola, P.; Hörandel, J. R.; Horvath, P.; Hrabovský, M.; Huber, D.; Huege, T.; Insolia, A.; Ionita, F.; Italiano, A.; Jansen, S.; Jarne, C.; Jiraskova, S.; Josebachuili, M.; Kadija, K.; Kampert, K. H.; Karhan, P.; Kasper, P.; Katkov, I.; Kégl, B.; Keilhauer, B.; Keivani, A.; Kelley, J. L.; Kemp, E.; Kieckhafer, R. M.; Klages, H. O.; Kleifges, M.; Kleinfeller, J.; Knapp, J.; Koang, D.-H.; Kotera, K.; Krohm, N.; Krömer, O.; Kruppke-Hansen, D.; Kuempel, D.; Kulbartz, J. K.; Kunka, N.; La Rosa, G.; Lachaud, C.; LaHurd, D.; Latronico, L.; Lauer, R.; Lautridou, P.; Le Coz, S.; Leão, M. S. A. B.; Lebrun, D.; Lebrun, P.; Leigui de Oliveira, M. A.; Letessier-Selvon, A.; Lhenry-Yvon, I.; Link, K.; López, R.; Lopez Agüera, A.; Louedec, K.; Lozano Bahilo, J.; Lu, L.; Lucero, A.; Ludwig, M.; Lyberis, H.; Maccarone, M. C.; Macolino, C.; Maldera, S.; Maller, J.; Mandat, D.; Mantsch, P.; Mariazzi, A. G.; Marin, J.; Marin, V.; Maris, I. C.; Marquez Falcon, H. R.; Marsella, G.; Martello, D.; Martin, L.; Martinez, H.; Martínez Bravo, O.; Martraire, D.; Masías Meza, J. J.; Mathes, H. J.; Matthews, J.; Matthews, J. 
A. J.; Matthiae, G.; Maurel, D.; Maurizio, D.; Mazur, P. O.; Medina-Tanco, G.; Melissas, M.; Melo, D.; Menichetti, E.; Menshikov, A.; Mertsch, P.; Messina, S.; Meurer, C.; Meyhandan, R.; Mi'canovi'c, S.; Micheletti, M. I.; Minaya, I. A.; Miramonti, L.; Molina-Bueno, L.; Mollerach, S.; Monasor, M.; Monnier Ragaigne, D.; Montanet, F.; Morales, B.; Morello, C.; Moreno, E.; Moreno, J. C.; Mostafá, M.; Moura, C. A.; Muller, M. A.; Müller, G.; Münchmeyer, M.; Mussa, R.; Navarra, G.; Navarro, J. L.; Navas, S.; Necesal, P.; Nellen, L.; Nelles, A.; Neuser, J.; Nhung, P. T.; Niechciol, M.; Niemietz, L.; Nierstenhoefer, N.; Nitz, D.; Nosek, D.; Nožka, L.; Oehlschläger, J.; Olinto, A.; Ortiz, M.; Pacheco, N.; Pakk Selmi-Dei, D.; Palatka, M.; Pallotta, J.; Palmieri, N.; Parente, G.; Parizot, E.; Parra, A.; Pastor, S.; Paul, T.; Pech, M.; Peķala, J.; Pelayo, R.; Pepe, I. M.; Perrone, L.; Pesce, R.; Petermann, E.; Petrera, S.; Petrolini, A.; Petrov, Y.; Pfendner, C.; Piegaia, R.; Pierog, T.; Pieroni, P.; Pimenta, M.; Pirronello, V.; Platino, M.; Plum, M.; Ponce, V. H.; Pontz, M.; Porcelli, A.; Privitera, P.; Prouza, M.; Quel, E. J.; Querchfeld, S.; Rautenberg, J.; Ravel, O.; Ravignani, D.; Revenu, B.; Ridky, J.; Riggi, S.; Risse, M.; Ristori, P.; Rivera, H.; Rizi, V.; Roberts, J.; Rodrigues de Carvalho, W.; Rodriguez, G.; Rodriguez Cabo, I.; Rodriguez Martino, J.; Rodriguez Rojo, J.; Rodríguez-Frías, M. D.; Ros, G.; Rosado, J.; Rossler, T.; Roth, M.; Rouillé-d'Orfeuil, B.; Roulet, E.; Rovero, A. C.; Rühle, C.; Saftoiu, A.; Salamida, F.; Salazar, H.; Salesa Greus, F.; Salina, G.; Sánchez, F.; Santo, C. E.; Santos, E.; Santos, E. M.; Sarazin, F.; Sarkar, B.; Sarkar, S.; Sato, R.; Scharf, N.; Scherini, V.; Schieler, H.; Schiffer, P.; Schmidt, A.; Scholten, O.; Schoorlemmer, H.; Schovancova, J.; Schovánek, P.; Schröder, F.; Schuster, D.; Sciutto, S. J.; Scuderi, M.; Segreto, A.; Settimo, M.; Shadkam, A.; Shellard, R. C.; Sidelnik, I.; Sigl, G.; Silva Lopez, H. 
H.; Sima, O.; 'Smiałkowski, A.; Šmída, R.; Snow, G. R.; Sommers, P.; Sorokin, J.; Spinka, H.; Squartini, R.; Srivastava, Y. N.; Stanic, S.; Stapleton, J.; Stasielak, J.; Stephan, M.; Stutz, A.; Suarez, F.; Suomijärvi, T.; Supanitsky, A. D.; Šuša, T.; Sutherland, M. S.; Swain, J.; Szadkowski, Z.; Szuba, M.; Tapia, A.; Tartare, M.; Taşcău, O.; Tcaciuc, R.; Thao, N. T.; Thomas, D.; Tiffenberg, J.; Timmermans, C.; Tkaczyk, W.; Todero Peixoto, C. J.; Toma, G.; Tomankova, L.; Tomé, B.; Tonachini, A.; Torralba Elipe, G.; Travnicek, P.; Tridapalli, D. B.; Tristram, G.; Trovato, E.; Tueros, M.; Ulrich, R.; Unger, M.; Urban, M.; Valdés Galicia, J. F.; Valiño, I.; Valore, L.; van Aar, G.; van den Berg, A. M.; van Velzen, S.; van Vliet, A.; Varela, E.; Vargas Cárdenas, B.; Vázquez, J. R.; Vázquez, R. A.; Veberič, D.; Verzi, V.; Vicha, J.; Videla, M.; Villaseñor, L.; Wahlberg, H.; Wahrlich, P.; Wainberg, O.; Walz, D.; Watson, A. A.; Weber, M.; Weidenhaupt, K.; Weindl, A.; Werner, F.; Westerhoff, S.; Whelan, B. J.; Widom, A.; Wieczorek, G.; Wiencke, L.; Wilczyńska, B.; Wilczyński, H.; Will, M.; Williams, C.; Winchen, T.; Wommer, M.; Wundheiler, B.; Yamamoto, T.; Yapici, T.; Younk, P.; Yuan, G.; Yushkov, A.; Zamorano Garcia, B.; Zas, E.; Zavrtanik, D.; Zavrtanik, M.; Zaw, I.; Zepeda, A.; Zhou, J.; Zhu, Y.; Zimbres Silva, M.; Ziolkowski, M.

    2013-01-01

    A thorough search for large-scale anisotropies in the distribution of arrival directions of cosmic rays detected above 10^18 eV at the Pierre Auger Observatory is reported. For the first time, these large-scale anisotropy searches are performed as a function of both the right ascension and the declination and expressed in terms of dipole and quadrupole moments. Within the systematic uncertainties, no significant deviation from isotropy is revealed. Upper limits on dipole and quadrupole amplitudes are derived under the hypothesis that any cosmic ray anisotropy is dominated by such moments in this energy range. These upper limits provide constraints on the production of cosmic rays above 10^18 eV, since they allow us to challenge an origin from stationary galactic sources densely distributed in the galactic disk and emitting predominantly light particles in all directions.

  17. Large Scale Processes and Extreme Floods in Brazil

    NASA Astrophysics Data System (ADS)

    Ribeiro Lima, C. H.; AghaKouchak, A.; Lall, U.

    2016-12-01

    Persistent large scale anomalies in the atmospheric circulation and ocean state have been associated with heavy rainfall and extreme floods in water basins of different sizes across the world. Such studies have emerged in recent years as a new tool to improve the traditional, stationarity-based approach in flood frequency analysis and flood prediction. Here we seek to advance previous studies by evaluating the dominance of large scale processes (e.g. atmospheric rivers/moisture transport) over local processes (e.g. local convection) in producing floods. We consider flood-prone regions in Brazil as case studies, and the role of large scale climate processes in generating extreme floods in such regions is explored by means of observed streamflow, reanalysis data and machine learning methods. The dynamics of the large scale atmospheric circulation in the days prior to the flood events are evaluated based on the vertically integrated moisture flux and its divergence field, which are interpreted in a low-dimensional space as obtained by machine learning techniques, particularly supervised kernel principal component analysis. In this reduced-dimensional space, clusters are obtained in order to better understand the role of regional moisture recycling or teleconnected moisture in producing floods of a given magnitude. The convective available potential energy (CAPE) is also used as a measure of local convection activities. For individual sites, we investigate the exceedance probability at which large scale atmospheric fluxes dominate the flood process. Finally, we analyze regional patterns of floods and how the scaling law of floods with drainage area responds to changes in the climate forcing mechanisms (e.g. local vs large scale).

  18. Variable DAXX gene methylation is a common feature of placental trophoblast differentiation, preeclampsia, and response to hypoxia.

    PubMed

    Novakovic, Boris; Evain-Brion, Danièle; Murthi, Padma; Fournier, Thiery; Saffery, Richard

    2017-06-01

    Placental functioning relies on the appropriate differentiation of progenitor villous cytotrophoblasts (CTBs) into extravillous cytotrophoblasts (EVCTs), including invasive EVCTs, and the multinucleated syncytiotrophoblast (ST) layer. This is accompanied by a general move away from a proliferative, immature phenotype. Genome-scale expression studies have provided valuable insight into genes that are associated with the shift to both an invasive EVCT and ST phenotype, whereas genome-scale DNA methylation analysis has shown that differentiation to ST involves widespread methylation shifts, which are counteracted by low oxygen. In the current study, we sought to identify DNA methylation variation that is associated with transition from CTB to ST in vitro and from a noninvasive to invasive EVCT phenotype after culture on Matrigel. Of the several hundred differentially methylated regions that were identified in each comparison, the majority showed a loss of methylation with differentiation. This included a large differentially methylated region (DMR) in the gene body of death domain-associated protein 6 (DAXX), which lost methylation during both CTB syncytialization to ST and EVCT differentiation to invasive EVCT. Comparison to publicly available methylation array data identified the same DMR as among the most consistently differentially methylated genes in placental samples from preeclampsia pregnancies. Of interest, in vitro culture of CTB or ST in low oxygen increases methylation in the same region, which correlates with delayed differentiation. Analysis of combined epigenomics signatures confirmed DAXX DMR as a likely regulatory element, and direct gene expression analysis identified a positive association between methylation at this site and DAXX expression levels. The widespread dynamic nature of DAXX methylation in association with trophoblast differentiation and placenta-associated pathologies is consistent with an important role for this gene in proper placental development and function. Novakovic, B., Evain-Brion, D., Murthi, P., Fournier, T., Saffery, R. Variable DAXX gene methylation is a common feature of placental trophoblast differentiation, preeclampsia, and response to hypoxia. © FASEB.

  19. Mathematical modeling of gene expression: a guide for the perplexed biologist

    PubMed Central

    Ay, Ahmet; Arnosti, David N.

    2011-01-01

    The detailed analysis of transcriptional networks holds a key for understanding central biological processes, and interest in this field has exploded due to new large-scale data acquisition techniques. Mathematical modeling can provide essential insights, but the diversity of modeling approaches can be a daunting prospect to investigators new to this area. For those interested in beginning a transcriptional mathematical modeling project we provide here an overview of major types of models and their applications to transcriptional networks. In this discussion of recent literature on thermodynamic, Boolean and differential equation models we focus on considerations critical for choosing and validating a modeling approach that will be useful for quantitative understanding of biological systems. PMID:21417596

  20. Hydrometeorological variability on a large French catchment and its relation to large-scale circulation across temporal scales

    NASA Astrophysics Data System (ADS)

    Massei, Nicolas; Dieppois, Bastien; Fritier, Nicolas; Laignel, Benoit; Debret, Maxime; Lavers, David; Hannah, David

    2015-04-01

    In the present context of global changes, considerable efforts have been deployed by the hydrological scientific community to improve our understanding of the impacts of climate fluctuations on water resources. Both observational and modeling studies have been extensively employed to characterize hydrological changes and trends, assess the impact of climate variability or provide future scenarios of water resources. To better understand hydrological changes, it is crucial to determine how and to what extent trends and long-term oscillations detectable in hydrological variables are linked to global climate oscillations. In this work, we develop an approach associating large-scale/local-scale correlation, empirical statistical downscaling and wavelet multiresolution decomposition of monthly precipitation and streamflow over the Seine river watershed, and the North Atlantic sea level pressure (SLP) in order to gain additional insights on the atmospheric patterns associated with the regional hydrology. We hypothesized that: i) atmospheric patterns may change according to the different temporal wavelengths defining the variability of the signals; and ii) definition of those hydrological/circulation relationships for each temporal wavelength may improve the determination of large-scale predictors of local variations. The results showed that the large-scale/local-scale links were not necessarily constant across time-scales (i.e. for the different frequencies characterizing the signals), resulting in changing spatial patterns across scales. This was then taken into account by developing an empirical statistical downscaling (ESD) modeling approach which integrated discrete wavelet multiresolution analysis for reconstructing local hydrometeorological processes (predictand: precipitation and streamflow on the Seine river catchment) based on a large-scale predictor (SLP over the Euro-Atlantic sector) on a monthly time-step. This approach consisted of (1) decomposing both signals (SLP field and precipitation or streamflow) using discrete wavelet multiresolution analysis and synthesis, (2) generating one statistical downscaling model per time-scale, and (3) summing up all scale-dependent models in order to obtain a final reconstruction of the predictand. The results revealed a significant improvement of the reconstructions for both precipitation and streamflow when using the multiresolution ESD model instead of basic ESD; in addition, the scale-dependent spatial patterns associated with the model matched quite well those obtained from scale-dependent composite analysis. In particular, the multiresolution ESD model handled very well the significant changes in variance through time observed in either precipitation or streamflow. For instance, the post-1980 period, which had been characterized by particularly high amplitudes in interannual-to-interdecadal variability associated with flood and extremely low-flow/drought periods (e.g., winter 2001, summer 2003), could not be reconstructed without integrating wavelet multiresolution analysis into the model. Further investigations would be required to address the issue of the stationarity of the large-scale/local-scale relationships and to test the capability of the multiresolution ESD model for interannual-to-interdecadal forecasting. In terms of methodological approach, further investigations may concern a fully comprehensive sensitivity analysis of the modeling to the parameters of the multiresolution approach (different families of scaling and wavelet functions used, number of coefficients/degree of smoothness, etc.).
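
    The three steps described in this record (decompose both signals by scale, fit one downscaling model per scale, sum the per-scale reconstructions) can be illustrated with a minimal sketch. It is not the authors' implementation, and several parts are assumptions made for illustration: the abstract does not name a wavelet family, so Haar block averages are used here; the predictor is reduced to a single series rather than an SLP field; the per-scale model is a plain one-variable least-squares fit; and the function names haar_mra, fit_line and multiresolution_esd are hypothetical.

```python
# Illustrative sketch only. Assumptions: Haar block averages stand in for the
# (unspecified) wavelet family, and a one-predictor linear fit stands in for
# the per-scale statistical downscaling model.

def haar_mra(signal, levels):
    """Split `signal` into additive components: `levels` details + 1 smooth.

    The components sum back to the original signal. The length of `signal`
    must be divisible by 2**levels.
    """
    n = len(signal)
    if n % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    components = []
    previous = list(signal)
    for level in range(1, levels + 1):
        width = 2 ** level
        smooth = []
        for start in range(0, n, width):
            block = signal[start:start + width]
            mean = sum(block) / width
            smooth.extend([mean] * width)
        # Detail at this level: what the coarser smooth no longer captures.
        components.append([p - s for p, s in zip(previous, smooth)])
        previous = smooth
    components.append(previous)  # coarsest approximation
    return components


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y ~ slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    if var_x == 0.0:
        # Constant predictor component: fall back to the predictand mean.
        return 0.0, mean_y
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = cov / var_x
    return slope, mean_y - slope * mean_x


def multiresolution_esd(predictor, predictand, levels):
    """Fit one linear model per scale and sum the per-scale reconstructions."""
    x_parts = haar_mra(predictor, levels)
    y_parts = haar_mra(predictand, levels)
    reconstruction = [0.0] * len(predictand)
    for x_c, y_c in zip(x_parts, y_parts):
        slope, intercept = fit_line(x_c, y_c)
        for i, xi in enumerate(x_c):
            reconstruction[i] += slope * xi + intercept
    return reconstruction
```

    In this sketch, a call such as multiresolution_esd(slp_series, streamflow_series, levels=3) would return a reconstruction built from four scale components. Because each scale gets its own slope, the sketch can represent a predictor/predictand link that differs between short and long time-scales, which a single regression on the undecomposed series cannot.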
