Sample records for accurate gene models

  1. SnowyOwl: accurate prediction of fungal genes by using RNA-Seq and homology information to select among ab initio models

    PubMed Central

    2014-01-01

    Background Locating the protein-coding genes in novel genomes is essential to understanding and exploiting the genomic information but it is still difficult to accurately predict all the genes. The recent availability of detailed information about transcript structure from high-throughput sequencing of messenger RNA (RNA-Seq) delineates many expressed genes and promises increased accuracy in gene prediction. Computational gene predictors have been intensively developed for and tested in well-studied animal genomes. Hundreds of fungal genomes are now or will soon be sequenced. The differences of fungal genomes from animal genomes and the phylogenetic sparsity of well-studied fungi call for gene-prediction tools tailored to them. Results SnowyOwl is a new gene prediction pipeline that uses RNA-Seq data to train and provide hints for the generation of Hidden Markov Model (HMM)-based gene predictions and to evaluate the resulting models. The pipeline has been developed and streamlined by comparing its predictions to manually curated gene models in three fungal genomes and validated against the high-quality gene annotation of Neurospora crassa; SnowyOwl predicted N. crassa genes with 83% sensitivity and 65% specificity. SnowyOwl gains sensitivity by repeatedly running the HMM gene predictor Augustus with varied input parameters and selectivity by choosing the models with best homology to known proteins and best agreement with the RNA-Seq data. Conclusions SnowyOwl efficiently uses RNA-Seq data to produce accurate gene models in both well-studied and novel fungal genomes. The source code for the SnowyOwl pipeline (in Python) and a web interface (in PHP) is freely available from http://sourceforge.net/projects/snowyowl/. PMID:24980894
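    The sensitivity and specificity figures quoted above can be illustrated with a small sketch. It assumes the convention common in gene-finder evaluation, where sensitivity is the fraction of reference genes predicted correctly and specificity is the fraction of predictions that are correct; the abstract does not say which definitions SnowyOwl used. The function name, the tuple encoding of a gene model, and the toy coordinates are all invented for illustration.

```python
def gene_level_accuracy(predicted, reference):
    """Return (sensitivity, specificity) for two collections of gene models.

    Assumption: a gene model is any hashable description of its structure,
    here (contig, strand, exon coordinates), and a prediction counts as
    correct only if it matches a reference model exactly.
    """
    predicted, reference = set(predicted), set(reference)
    true_pos = len(predicted & reference)
    # Sensitivity: share of reference genes that were recovered.
    sensitivity = true_pos / len(reference) if reference else 0.0
    # Specificity (gene-finding convention): share of predictions that are right.
    specificity = true_pos / len(predicted) if predicted else 0.0
    return sensitivity, specificity


# Toy data, not taken from any genome.
reference = {
    ("ctg1", "+", ((100, 200), (300, 400))),
    ("ctg1", "-", ((900, 1200),)),
    ("ctg2", "+", ((50, 500),)),
}
predicted = {
    ("ctg1", "+", ((100, 200), (300, 400))),  # exact match
    ("ctg1", "-", ((900, 1200),)),            # exact match
    ("ctg2", "+", ((50, 480),)),              # wrong exon boundary
    ("ctg2", "-", ((700, 800),)),             # no reference counterpart
}
sn, sp = gene_level_accuracy(predicted, reference)
print(f"sensitivity={sn:.2f} specificity={sp:.2f}")
```

    Exact-match scoring at the gene level is the strictest choice; exon-level or nucleotide-level scoring would use the same two ratios over different units.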

  2. Modeling leaderless transcription and atypical genes results in more accurate gene prediction in prokaryotes.

    PubMed

    Lomsadze, Alexandre; Gemayel, Karl; Tang, Shiyuyun; Borodovsky, Mark

    2018-05-17

    In a conventional view of the prokaryotic genome organization, promoters precede operons and ribosome binding sites (RBSs) with Shine-Dalgarno consensus precede genes. However, recent experimental research suggesting a more diverse view motivated us to develop an algorithm with improved gene-finding accuracy. We describe GeneMarkS-2, an ab initio algorithm that uses a model derived by self-training for finding species-specific (native) genes, along with an array of precomputed "heuristic" models designed to identify harder-to-detect genes (likely horizontally transferred). Importantly, we designed GeneMarkS-2 to identify several types of distinct sequence patterns (signals) involved in gene expression control, among them the patterns characteristic for leaderless transcription as well as noncanonical RBS patterns. To assess the accuracy of GeneMarkS-2, we used genes validated by COG (Clusters of Orthologous Groups) annotation, proteomics experiments, and N-terminal protein sequencing. We observed that GeneMarkS-2 performed better on average in all accuracy measures when compared with the current state-of-the-art gene prediction tools. Furthermore, the screening of ∼5000 representative prokaryotic genomes made by GeneMarkS-2 predicted frequent leaderless transcription in both archaea and bacteria. We also observed that the RBS sites in some species with leadered transcription did not necessarily exhibit the Shine-Dalgarno consensus. The modeling of different types of sequence motifs regulating gene expression prompted a division of prokaryotic genomes into five categories with distinct sequence patterns around the gene starts. © 2018 Lomsadze et al.; Published by Cold Spring Harbor Laboratory Press.

  3. CodingQuarry: highly accurate hidden Markov model gene prediction in fungal genomes using RNA-seq transcripts.

    PubMed

    Testa, Alison C; Hane, James K; Ellwood, Simon R; Oliver, Richard P

    2015-03-11

    The impact of gene annotation quality on functional and comparative genomics makes gene prediction an important process, particularly in non-model species, including many fungi. Sets of homologous protein sequences are rarely complete with respect to the fungal species of interest and are often small or unreliable, especially when closely related species have not been sequenced or annotated in detail. In these cases, protein homology-based evidence fails to correctly annotate many genes, or significantly improve ab initio predictions. Generalised hidden Markov models (GHMM) have proven to be invaluable tools in gene annotation and, recently, RNA-seq has emerged as a cost-effective means to significantly improve the quality of automated gene annotation. As these methods do not require sets of homologous proteins, improving gene prediction from these resources is of benefit to fungal researchers. While many pipelines now incorporate RNA-seq data in training GHMMs, there has been relatively little investigation into additionally combining RNA-seq data at the point of prediction, and room for improvement in this area motivates this study. CodingQuarry is a highly accurate, self-training GHMM fungal gene predictor designed to work with assembled, aligned RNA-seq transcripts. RNA-seq data informs annotations both during gene-model training and in prediction. Our approach capitalises on the high quality of fungal transcript assemblies by incorporating predictions made directly from transcript sequences. Correct predictions are made despite transcript assembly problems, including those caused by overlap between the transcripts of adjacent gene loci. Stringent benchmarking against high-confidence annotation subsets showed CodingQuarry predicted 91.3% of Schizosaccharomyces pombe genes and 90.4% of Saccharomyces cerevisiae genes perfectly. These results are 4-5% better than those of AUGUSTUS, the next best performing RNA-seq driven gene predictor tested. 
Comparisons against

  4. RapGene: a fast and accurate strategy for synthetic gene assembly in Escherichia coli

    PubMed Central

    Zampini, Massimiliano; Stevens, Pauline Rees; Pachebat, Justin A.; Kingston-Smith, Alison; Mur, Luis A. J.; Hayes, Finbarr

    2015-01-01

    The ability to assemble DNA sequences de novo through efficient and powerful DNA fabrication methods is one of the foundational technologies of synthetic biology. Gene synthesis, in particular, has been considered the main driver for the emergence of this new scientific discipline. Here we describe RapGene, a rapid gene assembly technique which was successfully tested for the synthesis and cloning of both prokaryotic and eukaryotic genes through a ligation independent approach. The method developed in this study is a complete bacterial gene synthesis platform for the quick, accurate and cost effective fabrication and cloning of gene-length sequences that employ the widely used host Escherichia coli. PMID:26062748

  5. Reranking candidate gene models with cross-species comparison for improved gene prediction

    PubMed Central

    Liu, Qian; Crammer, Koby; Pereira, Fernando CN; Roos, David S

    2008-01-01

    Background Most gene finders score candidate gene models with state-based methods, typically HMMs, by combining local properties (coding potential, splice donor and acceptor patterns, etc). Competing models with similar state-based scores may be distinguishable with additional information. In particular, functional and comparative genomics datasets may help to select among competing models of comparable probability by exploiting features likely to be associated with the correct gene models, such as conserved exon/intron structure or protein sequence features. Results We have investigated the utility of a simple post-processing step for selecting among a set of alternative gene models, using global scoring rules to rerank competing models for more accurate prediction. For each gene locus, we first generate the K best candidate gene models using the gene finder Evigan, and then rerank these models using comparisons with putative orthologous genes from closely-related species. Candidate gene models with lower scores in the original gene finder may be selected if they exhibit strong similarity to probable orthologs in coding sequence, splice site location, or signal peptide occurrence. Experiments on Drosophila melanogaster demonstrate that reranking based on cross-species comparison outperforms the best gene models identified by Evigan alone, and also outperforms the comparative gene finders GeneWise and Augustus+. Conclusion Reranking gene models with cross-species comparison improves gene prediction accuracy. This straightforward method can be readily adapted to incorporate additional lines of evidence, as it requires only a ranked source of candidate gene models. PMID:18854050
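    The reranking idea described above can be sketched as a post-processing step over the K best candidates at one locus. This is only an illustration under stated assumptions: the feature names, the linear scoring rule, and the weights are hypothetical and are not the scoring rules used in the paper, and no Evigan output format is modelled.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    finder_score: float            # score assigned by the original gene finder
    ortholog_identity: float       # hypothetical 0..1 similarity to a putative ortholog
    splice_sites_conserved: float  # hypothetical 0..1 fraction of shared splice sites


def rerank(candidates, w_finder=1.0, w_identity=2.0, w_splice=1.0):
    """Sort candidates by a global score that adds cross-species evidence.

    The weights are arbitrary illustrative values; a real system would
    learn or tune them.
    """
    def global_score(c):
        return (w_finder * c.finder_score
                + w_identity * c.ortholog_identity
                + w_splice * c.splice_sites_conserved)

    return sorted(candidates, key=global_score, reverse=True)


# K = 3 candidate models for one locus, in the gene finder's original order.
k_best = [
    Candidate("A", finder_score=0.90, ortholog_identity=0.40, splice_sites_conserved=0.5),
    Candidate("B", finder_score=0.85, ortholog_identity=0.95, splice_sites_conserved=1.0),
    Candidate("C", finder_score=0.60, ortholog_identity=0.50, splice_sites_conserved=0.5),
]
reranked = [c.name for c in rerank(k_best)]
print(reranked)
```

    In this toy case candidate B, which scored slightly lower in the original gene finder, moves to the top because it agrees better with the putative ortholog.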

  6. Probability-based collaborative filtering model for predicting gene-disease associations.

    PubMed

    Zeng, Xiangxiang; Ding, Ningxiang; Rodríguez-Patón, Alfonso; Zou, Quan

    2017-12-28

    Accurately predicting pathogenic human genes has been challenging in recent research. Considering extensive gene-disease data verified by biological experiments, we can apply computational methods to perform accurate predictions with reduced time and expense. We propose a probability-based collaborative filtering model (PCFM) to predict pathogenic human genes. Several kinds of data sets, containing data from humans and from nonhuman species, are integrated in our model. Firstly, on the basis of a typical latent factorization model, we propose model I with an average heterogeneous regularization. Secondly, we develop modified model II with personal heterogeneous regularization to enhance the accuracy of the aforementioned models. In this model, vector space similarity or Pearson correlation coefficient metrics and data on related species are also used. We compared the results of PCFM with the results of four state-of-the-art approaches. The results show that PCFM performs better than other advanced approaches. The PCFM model can be leveraged for predictions of disease genes, especially for new human genes or diseases with no known relationships.

  7. Animal models for prenatal gene therapy: rodent models for prenatal gene therapy.

    PubMed

    Roybal, Jessica L; Endo, Masayuki; Buckley, Suzanne M K; Herbert, Bronwen R; Waddington, Simon N; Flake, Alan W

    2012-01-01

    Fetal gene transfer has been studied in various animal models, including rabbits, guinea pigs, cats, dogs, and nonhuman primates; however, the most common model is the rodent, particularly the mouse. There are numerous advantages to mouse models, including a short gestation time of around 20 days, large litter size usually of more than six pups, ease of colony maintenance due to the small physical size, and the relatively low expense of doing so. Moreover, the mouse genome is well defined, there are many transgenic models particularly of human monogenetic disorders, and mouse-specific biological reagents are readily available. One criticism has been that it is difficult to perform procedures on the fetal mouse with suitable accuracy. Over the past decade, accumulation of technical expertise and development of technology such as high-frequency ultrasound have permitted accurate vector delivery to organs and tissues. Here, we describe our experiences of gene transfer to the fetal mouse with and without ultrasound guidance from mid to late gestation. Depending upon the vector type, the route of delivery and the age of the fetus, specific or widespread gene transfer can be achieved, making fetal mice excellent models for exploratory biodistribution studies.

  8. A study of structural properties of gene network graphs for mathematical modeling of integrated mosaic gene networks.

    PubMed

    Petrovskaya, Olga V; Petrovskiy, Evgeny D; Lavrik, Inna N; Ivanisenko, Vladimir A

    2017-04-01

    Gene network modeling is one of the widely used approaches in systems biology. It allows for the study of the function of complex genetic systems, including so-called mosaic gene networks, which consist of functionally interacting subnetworks. We conducted a study of a method for modeling mosaic gene networks based on integration of models of gene subnetworks by linear control functionals. Automatic modeling of 10,000 synthetic mosaic gene regulatory networks was carried out using computer experiments on gene knockdowns/knockouts. Structural analysis of the graphs of the generated mosaic gene regulatory networks revealed that the most important factor for building accurate integrated mathematical models, among those analyzed in the study, is data on the expression of genes corresponding to vertices with high centrality.

  9. Evaluation of New Reference Genes in Papaya for Accurate Transcript Normalization under Different Experimental Conditions

    PubMed Central

    Chen, Weixin; Chen, Jianye; Lu, Wangjin; Chen, Lei; Fu, Danwen

    2012-01-01

    Real-time reverse transcription PCR (RT-qPCR) is a preferred method for rapid and accurate quantification of gene expression. Appropriate application of RT-qPCR requires accurate normalization through the use of reference genes. As no single reference gene is universally suitable for all experiments, validation of reference gene(s) under different experimental conditions is crucial for RT-qPCR analysis. To date, only a few studies on reference genes have been done in other plants but none in papaya. In the present work, we selected 21 candidate reference genes and evaluated their expression stability in 246 papaya fruit samples using three algorithms: geNorm, NormFinder and RefFinder. The samples consisted of 13 sets collected under different experimental conditions, including various tissues, different storage temperatures, different cultivars, developmental stages, postharvest ripening, modified atmosphere packaging, 1-methylcyclopropene (1-MCP) treatment, hot water treatment, biotic stress and hormone treatment. Our results demonstrated that expression stability varied greatly between reference genes and that different suitable reference gene(s) or combinations of reference genes for normalization should be validated according to the experimental conditions. In general, the internal reference genes EIF (Eukaryotic initiation factor 4A), TBP1 (TATA binding protein 1) and TBP2 (TATA binding protein 2) performed well under most experimental conditions, whereas the most widely used reference genes at present, ACTIN (Actin 2), 18S rRNA (18S ribosomal RNA) and GAPDH (Glyceraldehyde-3-phosphate dehydrogenase), were not suitable in many experimental conditions. In addition, two commonly used programs, geNorm and NormFinder, proved sufficient for the validation. This work provides the first systematic analysis for the selection of superior reference genes for accurate transcript normalization in papaya under different experimental conditions.

  10. Accurate prediction of secondary metabolite gene clusters in filamentous fungi.

    PubMed

    Andersen, Mikael R; Nielsen, Jakob B; Klitgaard, Andreas; Petersen, Lene M; Zachariasen, Mia; Hansen, Tilde J; Blicher, Lene H; Gotfredsen, Charlotte H; Larsen, Thomas O; Nielsen, Kristian F; Mortensen, Uffe H

    2013-01-02

    Biosynthetic pathways of secondary metabolites from fungi are currently subject to an intense effort to elucidate the genetic basis for these compounds due to their large potential within pharmaceutics and synthetic biochemistry. The preferred method is methodical gene deletions to identify supporting enzymes for key synthases one cluster at a time. In this study, we design and apply a DNA expression array for Aspergillus nidulans in combination with legacy data to form a comprehensive gene expression compendium. We apply a guilt-by-association-based analysis to predict the extent of the biosynthetic clusters for the 58 synthases active in our set of experimental conditions. A comparison with legacy data shows the method to be accurate in 13 of 16 known clusters and nearly accurate for the remaining 3 clusters. Furthermore, we apply a data clustering approach, which identifies cross-chemistry between physically separate gene clusters (superclusters), and validate this both with legacy data and experimentally by prediction and verification of a supercluster consisting of the synthase AN1242 and the prenyltransferase AN11080, as well as identification of the product compound nidulanin A. We have used A. nidulans for our method development and validation due to the wealth of available biochemical data, but the method can be applied to any fungus with a sequenced and assembled genome, thus supporting further secondary metabolite pathway elucidation in the fungal kingdom.

  11. Mining Gene Regulatory Networks by Neural Modeling of Expression Time-Series.

    PubMed

    Rubiolo, Mariano; Milone, Diego H; Stegmayer, Georgina

    2015-01-01

    Discovering gene regulatory networks from data has been one of the most studied topics in recent years. Neural networks can be successfully used to infer an underlying gene network by modeling expression profiles as time series. This work proposes a novel method based on a pool of neural networks for obtaining a gene regulatory network from a gene expression dataset. The networks are used to model each possible interaction between pairs of genes in the dataset, and a set of mining rules is applied to accurately detect the underlying relations among genes. The results obtained on artificial and real datasets confirm the method's effectiveness for discovering regulatory networks from a proper modeling of the temporal dynamics of gene expression profiles.

  12. A Simple Model of Hox Genes: Bone Morphology Demonstration

    ERIC Educational Resources Information Center

    Shmaefsky, Brian

    2008-01-01

    Visual demonstrations of abstract scientific concepts are effective strategies for enhancing content retention (Shmaefsky 2004). The concepts associated with gene regulation of growth and development are particularly complex and are well suited for teaching with visual models. This demonstration provides a simple and accurate model of Hox gene…

  13. Integrative structural annotation of de novo RNA-Seq provides an accurate reference gene set of the enormous genome of the onion (Allium cepa L.)

    PubMed Central

    Kim, Seungill; Kim, Myung-Shin; Kim, Yong-Min; Yeom, Seon-In; Cheong, Kyeongchae; Kim, Ki-Tae; Jeon, Jongbum; Kim, Sunggil; Kim, Do-Sun; Sohn, Seong-Han; Lee, Yong-Hwan; Choi, Doil

    2015-01-01

    The onion (Allium cepa L.) is one of the most widely cultivated and consumed vegetable crops in the world. Although a considerable amount of onion transcriptome data has been deposited into public databases, the sequences of the protein-coding genes are not accurate enough to be used, owing to non-coding sequences intermixed with the coding sequences. We generated a high-quality, annotated onion transcriptome from de novo sequence assembly and intensive structural annotation using the integrated structural gene annotation pipeline (ISGAP), which identified 54,165 protein-coding genes among 165,179 assembled transcripts totalling 203.0 Mb by eliminating the intron sequences. ISGAP performed reliable annotation, recognizing accurate gene structures based on reference proteins, and ab initio gene models of the assembled transcripts. Integrative functional annotation and gene-based SNP analysis revealed a whole biological repertoire of genes and transcriptomic variation in the onion. The method developed in this study provides a powerful tool for the construction of reference gene sets for organisms based solely on de novo transcriptome data. Furthermore, the reference genes and their variation described here for the onion represent essential tools for molecular breeding and gene cloning in Allium spp. PMID:25362073

  14. Mental models accurately predict emotion transitions.

    PubMed

    Thornton, Mark A; Tamir, Diana I

    2017-06-06

    Successful social interactions depend on people's ability to predict others' future actions and emotions. People possess many mechanisms for perceiving others' current emotional states, but how might they use this information to predict others' future states? We hypothesized that people might capitalize on an overlooked aspect of affective experience: current emotions predict future emotions. By attending to regularities in emotion transitions, perceivers might develop accurate mental models of others' emotional dynamics. People could then use these mental models of emotion transitions to predict others' future emotions from currently observable emotions. To test this hypothesis, studies 1-3 used data from three extant experience-sampling datasets to establish the actual rates of emotional transitions. We then collected three parallel datasets in which participants rated the transition likelihoods between the same set of emotions. Participants' ratings of emotion transitions predicted others' experienced transitional likelihoods with high accuracy. Study 4 demonstrated that four conceptual dimensions of mental state representation-valence, social impact, rationality, and human mind-inform participants' mental models. Study 5 used 2 million emotion reports on the Experience Project to replicate both of these findings: again people reported accurate models of emotion transitions, and these models were informed by the same four conceptual dimensions. Importantly, neither these conceptual dimensions nor holistic similarity could fully explain participants' accuracy, suggesting that their mental models contain accurate information about emotion dynamics above and beyond what might be predicted by static emotion knowledge alone.
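    Establishing "actual rates of emotional transitions" from experience-sampling data amounts to estimating a transition matrix from consecutive reports. The sketch below shows one simple way to do that, assuming each participant contributes an ordered list of emotion labels; the labels and sequences are invented, and the study's own estimation procedure may differ.

```python
from collections import Counter, defaultdict


def transition_probabilities(sequences):
    """Estimate P(next emotion | current emotion) from ordered reports.

    Assumption: each sequence lists one participant's reported emotions in
    time order, and only adjacent reports count as a transition.
    """
    counts = defaultdict(Counter)
    for seq in sequences:
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    # Normalise each row of counts so it sums to 1.
    return {
        current: {nxt: n / sum(row.values()) for nxt, n in row.items()}
        for current, row in counts.items()
    }


# Invented example reports from three participants.
reports = [
    ["calm", "happy", "happy", "tired"],
    ["anxious", "calm", "happy"],
    ["happy", "tired", "calm"],
]
probs = transition_probabilities(reports)
print(probs["happy"])
```

    Participants' rated transition likelihoods could then be compared with these empirical rates, for example by correlating the two sets of values.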

  15. Mental models accurately predict emotion transitions

    PubMed Central

    Thornton, Mark A.; Tamir, Diana I.

    2017-01-01

    Successful social interactions depend on people’s ability to predict others’ future actions and emotions. People possess many mechanisms for perceiving others’ current emotional states, but how might they use this information to predict others’ future states? We hypothesized that people might capitalize on an overlooked aspect of affective experience: current emotions predict future emotions. By attending to regularities in emotion transitions, perceivers might develop accurate mental models of others’ emotional dynamics. People could then use these mental models of emotion transitions to predict others’ future emotions from currently observable emotions. To test this hypothesis, studies 1–3 used data from three extant experience-sampling datasets to establish the actual rates of emotional transitions. We then collected three parallel datasets in which participants rated the transition likelihoods between the same set of emotions. Participants’ ratings of emotion transitions predicted others’ experienced transitional likelihoods with high accuracy. Study 4 demonstrated that four conceptual dimensions of mental state representation—valence, social impact, rationality, and human mind—inform participants’ mental models. Study 5 used 2 million emotion reports on the Experience Project to replicate both of these findings: again people reported accurate models of emotion transitions, and these models were informed by the same four conceptual dimensions. Importantly, neither these conceptual dimensions nor holistic similarity could fully explain participants’ accuracy, suggesting that their mental models contain accurate information about emotion dynamics above and beyond what might be predicted by static emotion knowledge alone. PMID:28533373

  16. Integrative structural annotation of de novo RNA-Seq provides an accurate reference gene set of the enormous genome of the onion (Allium cepa L.).

    PubMed

    Kim, Seungill; Kim, Myung-Shin; Kim, Yong-Min; Yeom, Seon-In; Cheong, Kyeongchae; Kim, Ki-Tae; Jeon, Jongbum; Kim, Sunggil; Kim, Do-Sun; Sohn, Seong-Han; Lee, Yong-Hwan; Choi, Doil

    2015-02-01

    The onion (Allium cepa L.) is one of the most widely cultivated and consumed vegetable crops in the world. Although a considerable amount of onion transcriptome data has been deposited into public databases, the sequences of the protein-coding genes are not accurate enough to be used, owing to non-coding sequences intermixed with the coding sequences. We generated a high-quality, annotated onion transcriptome from de novo sequence assembly and intensive structural annotation using the integrated structural gene annotation pipeline (ISGAP), which identified 54,165 protein-coding genes among 165,179 assembled transcripts totalling 203.0 Mb by eliminating the intron sequences. ISGAP performed reliable annotation, recognizing accurate gene structures based on reference proteins, and ab initio gene models of the assembled transcripts. Integrative functional annotation and gene-based SNP analysis revealed a whole biological repertoire of genes and transcriptomic variation in the onion. The method developed in this study provides a powerful tool for the construction of reference gene sets for organisms based solely on de novo transcriptome data. Furthermore, the reference genes and their variation described here for the onion represent essential tools for molecular breeding and gene cloning in Allium spp. © The Author 2014. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.

  17. Seqping: gene prediction pipeline for plant genomes using self-training gene models and transcriptomic data.

    PubMed

    Chan, Kuang-Lim; Rosli, Rozana; Tatarinova, Tatiana V; Hogan, Michael; Firdaus-Raih, Mohd; Low, Eng-Ti Leslie

    2017-01-27

    Gene prediction is one of the most important steps in the genome annotation process. A large number of software tools and pipelines developed by various computing techniques are available for gene prediction. However, these systems have yet to accurately predict all or even most of the protein-coding regions. Furthermore, none of the currently available gene-finders has a universal Hidden Markov Model (HMM) that can perform gene prediction for all organisms equally well in an automatic fashion. We present an automated gene prediction pipeline, Seqping, which uses self-training HMM models and transcriptomic data. The pipeline processes the genome and transcriptome sequences of the target species using the GlimmerHMM, SNAP, and AUGUSTUS pipelines, followed by the MAKER2 program to combine predictions from the three tools in association with the transcriptomic evidence. Seqping generates species-specific HMMs that are able to offer unbiased gene predictions. The pipeline was evaluated using the Oryza sativa and Arabidopsis thaliana genomes. Benchmarking Universal Single-Copy Orthologs (BUSCO) analysis showed that the pipeline was able to identify at least 95% of BUSCO's plantae dataset. Our evaluation shows that Seqping was able to generate better gene predictions compared to three HMM-based programs (MAKER2, GlimmerHMM and AUGUSTUS) using their respective available HMMs. Seqping had the highest accuracy in rice (0.5648 for CDS, 0.4468 for exon, and 0.6695 for nucleotide structure) and A. thaliana (0.5808 for CDS, 0.5955 for exon, and 0.8839 for nucleotide structure). Seqping provides researchers a seamless pipeline to train species-specific HMMs and predict genes in newly sequenced or less-studied genomes. We conclude that the Seqping pipeline predictions are more accurate than gene predictions using the other three approaches with the default or available HMMs.

  18. Validation of reference genes aiming accurate normalization of qRT-PCR data in Dendrocalamus latiflorus Munro.

    PubMed

    Liu, Mingying; Jiang, Jing; Han, Xiaojiao; Qiao, Guirong; Zhuo, Renying

    2014-01-01

    Dendrocalamus latiflorus Munro is widely distributed in subtropical areas and is a valuable natural resource. Transcriptome sequencing of D. latiflorus Munro has been performed, revealing numerous genes, especially ones predicted to be unique to D. latiflorus Munro. qRT-PCR has become a feasible approach for gene expression profiling, and the accuracy and reliability of the results depend upon the proper selection of stable reference genes for accurate normalization. Therefore, a set of suitable internal controls should be validated for D. latiflorus Munro. In this report, twelve candidate reference genes were selected and the assessment of gene expression stability was performed in ten tissue samples and four leaf samples from seedlings and anther-regenerated plants of different ploidy. The PCR amplification efficiency was estimated, and the candidate genes were ranked according to their expression stability using three software packages: geNorm, NormFinder and BestKeeper. GAPDH and EF1α were found to be the most stable genes among different tissues or in all the sample pools, while CYP showed low expression stability. RPL3 had the optimal performance among four leaf samples. The application of verified reference genes was illustrated by analyzing ferritin and laccase expression profiles among different experimental sets. The analysis revealed the biological variation in ferritin and laccase transcript expression among the tissues studied and the individual plants. geNorm, NormFinder, and BestKeeper analyses recommended different suitable reference gene(s) for normalization according to the experimental sets. GAPDH and EF1α had the highest expression stability across different tissues, and RPL3 was the most stable in the other sample set. This study emphasizes the importance of validating superior reference genes for qRT-PCR analysis to accurately normalize gene expression of D. latiflorus Munro.
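    Ranking candidate reference genes by expression stability can be illustrated with the stability measure M popularised by geNorm: for each gene, take the standard deviation of its log2 expression ratio to every other candidate across samples, then average those values; a lower M means more stable expression. The sketch below is an independent re-implementation of that idea, not the geNorm software, and the gene names and quantities are invented.

```python
import math
from statistics import mean, stdev


def stability_m(expression):
    """Compute a geNorm-style stability value M for each candidate gene.

    expression maps gene name -> relative quantities, one per sample,
    with samples in the same order for every gene.
    """
    genes = list(expression)
    m_values = {}
    for gene in genes:
        pairwise_variation = []
        for other in genes:
            if other == gene:
                continue
            log_ratios = [
                math.log2(a / b)
                for a, b in zip(expression[gene], expression[other])
            ]
            # Variation of the ratio between the two genes across samples.
            pairwise_variation.append(stdev(log_ratios))
        m_values[gene] = mean(pairwise_variation)
    return m_values


# Invented relative quantities for three hypothetical candidates in four samples.
quantities = {
    "refA": [1.0, 2.0, 4.0, 8.0],
    "refB": [2.0, 4.0, 8.0, 16.0],  # constant ratio to refA across samples
    "refC": [1.0, 4.0, 2.0, 16.0],  # ratio to the others fluctuates
}
m = stability_m(quantities)
ranking = sorted(m, key=m.get)  # most stable first
print(ranking)
```

    Here refA and refB track each other perfectly, so they receive the lowest M, while refC ranks last.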

  19. Protein and gene model inference based on statistical modeling in k-partite graphs.

    PubMed

    Gerster, Sarah; Qeli, Ermir; Ahrens, Christian H; Bühlmann, Peter

    2010-07-06

    One of the major goals of proteomics is the comprehensive and accurate description of a proteome. Shotgun proteomics, the method of choice for the analysis of complex protein mixtures, requires that experimentally observed peptides are mapped back to the proteins they were derived from. This process is also known as protein inference. We present Markovian Inference of Proteins and Gene Models (MIPGEM), a statistical model based on clearly stated assumptions to address the problem of protein and gene model inference for shotgun proteomics data. In particular, we are dealing with dependencies among peptides and proteins using a Markovian assumption on k-partite graphs. We are also addressing the problems of shared peptides and ambiguous proteins by scoring the encoding gene models. Empirical results on two control datasets with synthetic mixtures of proteins and on complex protein samples of Saccharomyces cerevisiae, Drosophila melanogaster, and Arabidopsis thaliana suggest that the results with MIPGEM are competitive with existing tools for protein inference.

  20. Validation of reference genes for quantitative gene expression analysis in experimental epilepsy.

    PubMed

    Sadangi, Chinmaya; Rosenow, Felix; Norwood, Braxton A

    2017-12-01

    To grasp the molecular mechanisms and pathophysiology underlying epilepsy development (epileptogenesis) and epilepsy itself, it is important to understand the gene expression changes that occur during these phases. Quantitative real-time polymerase chain reaction (qPCR) is a technique that rapidly and accurately determines gene expression changes. It is crucial, however, that stable reference genes are selected for each experimental condition to ensure that accurate values are obtained for genes of interest. If reference genes are unstably expressed, this can lead to inaccurate data and erroneous conclusions. To date, epilepsy studies have used mostly single, nonvalidated reference genes. This is the first study to systematically evaluate reference genes in male Sprague-Dawley rat models of epilepsy. We assessed 15 potential reference genes in hippocampal tissue obtained from 2 different models during epileptogenesis, 1 model during chronic epilepsy, and a model of noninjurious seizures. Reference gene ranking varied between models and also differed between epileptogenesis and chronic epilepsy time points. There was also some variance between the four mathematical models used to rank reference genes. Notably, we found novel reference genes to be more stably expressed than those most often used in experimental epilepsy studies. The consequence of these findings is that reference genes suitable for one epilepsy model may not be appropriate for others and that reference genes can change over time. It is, therefore, critically important to validate potential reference genes before using them as normalizing factors in expression analysis in order to ensure accurate, valid results. © 2017 Wiley Periodicals, Inc.

  1. How to perform RT-qPCR accurately in plant species? A case study on flower colour gene expression in an azalea (Rhododendron simsii hybrids) mapping population.

    PubMed

    De Keyser, Ellen; Desmet, Laurence; Van Bockstaele, Erik; De Riek, Jan

    2013-06-24

    Flower colour variation is one of the most crucial selection criteria in the breeding of a flowering pot plant, as is also the case for azalea (Rhododendron simsii hybrids). Flavonoid biosynthesis has been studied intensively in several species. In azalea, flower colour can be described by means of a 3-gene model. However, this model does not explain pink coloration. Over the last decade, gene expression studies have been widely used for studying flower colour. However, the methods used were often only semi-quantitative, or quantification was not done according to the MIQE guidelines. We aimed to develop an accurate protocol for RT-qPCR and to validate the protocol to study flower colour in an azalea mapping population. An accurate RT-qPCR protocol had to be established. RNA quality was evaluated in a combined approach by means of different techniques, e.g. SPUD assay and Experion analysis. We demonstrated the importance of testing noRT samples for all genes under study to detect contaminating DNA. In spite of the limited sequence information available, we prepared a set of 11 reference genes, which was validated in flower petals; a combination of three reference genes was optimal. Finally, we also used plasmids for the construction of standard curves. This allowed us to calculate gene-specific PCR efficiencies for every gene to ensure accurate quantification. The validity of the protocol was demonstrated by means of the study of six genes of the flavonoid biosynthesis pathway. No correlations were found between flower colour and the individual expression profiles. However, the combination of early pathway genes (CHS, F3H, F3'H and FLS) is clearly related to co-pigmentation with flavonols. The late pathway genes DFR and ANS are to a minor extent involved in differentiating between coloured and white flowers. Concerning pink coloration, we could demonstrate that the lower intensity in this type of flowers is correlated to the expression of F3'H.

  2. How to perform RT-qPCR accurately in plant species? A case study on flower colour gene expression in an azalea (Rhododendron simsii hybrids) mapping population

    PubMed Central

    2013-01-01

    Background Flower colour variation is one of the most crucial selection criteria in the breeding of a flowering pot plant, as is also the case for azalea (Rhododendron simsii hybrids). Flavonoid biosynthesis has been studied intensively in several species. In azalea, flower colour can be described by means of a 3-gene model. However, this model does not explain pink coloration. Over the last decade, gene expression studies have been widely used for studying flower colour. However, the methods used were often only semi-quantitative, or quantification was not done according to the MIQE guidelines. We aimed to develop an accurate protocol for RT-qPCR and to validate the protocol to study flower colour in an azalea mapping population. Results An accurate RT-qPCR protocol had to be established. RNA quality was evaluated in a combined approach by means of different techniques, e.g. SPUD assay and Experion analysis. We demonstrated the importance of testing noRT samples for all genes under study to detect contaminating DNA. In spite of the limited sequence information available, we prepared a set of 11 reference genes, which was validated in flower petals; a combination of three reference genes was optimal. Finally, we also used plasmids for the construction of standard curves. This allowed us to calculate gene-specific PCR efficiencies for every gene to ensure accurate quantification. The validity of the protocol was demonstrated by means of the study of six genes of the flavonoid biosynthesis pathway. No correlations were found between flower colour and the individual expression profiles. However, the combination of early pathway genes (CHS, F3H, F3'H and FLS) is clearly related to co-pigmentation with flavonols. The late pathway genes DFR and ANS are to a minor extent involved in differentiating between coloured and white flowers. Concerning pink coloration, we could demonstrate that the lower intensity in this type of flowers is correlated to the expression of F3'H
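    The abstract above mentions deriving gene-specific PCR efficiencies from plasmid standard curves. A minimal sketch of that calculation follows, assuming the commonly used relation E = 10^(-1/slope) - 1, where the slope comes from regressing Cq on log10 of the template quantity. The function names and the dilution-series values are hypothetical and are not taken from the paper.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def pcr_efficiency(copies, cq_values):
    """Amplification efficiency from a dilution series: E = 10**(-1/slope) - 1."""
    log_copies = [math.log10(c) for c in copies]
    slope, _ = fit_line(log_copies, cq_values)
    return 10 ** (-1.0 / slope) - 1.0


# Hypothetical 10-fold plasmid dilution series (made-up Cq values).
copies = [1e6, 1e5, 1e4, 1e3, 1e2]
cq = [15.0, 18.32, 21.64, 24.96, 28.28]
slope, intercept = fit_line([math.log10(c) for c in copies], cq)
efficiency = pcr_efficiency(copies, cq)
```

    An efficiency close to 1.0 corresponds to a doubling of product per cycle; a separate curve would be fitted for each gene.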

  3. Recommendations for Accurate Resolution of Gene and Isoform Allele-Specific Expression in RNA-Seq Data

    PubMed Central

    Wood, David L. A.; Nones, Katia; Steptoe, Anita; Christ, Angelika; Harliwong, Ivon; Newell, Felicity; Bruxner, Timothy J. C.; Miller, David; Cloonan, Nicole; Grimmond, Sean M.

    2015-01-01

    Genetic variation modulates gene expression transcriptionally or post-transcriptionally, and can profoundly alter an individual’s phenotype. Measuring allelic differential expression at heterozygous loci within an individual, a phenomenon called allele-specific expression (ASE), can assist in identifying such factors. Massively parallel DNA and RNA sequencing and advances in bioinformatic methodologies provide an outstanding opportunity to measure ASE genome-wide. In this study, matched DNA and RNA sequencing, genotyping arrays and computationally phased haplotypes were integrated to comprehensively and conservatively quantify ASE in a single human brain and liver tissue sample. We describe a methodological evaluation and assessment of common bioinformatic steps for ASE quantification, and recommend a robust approach to accurately measure SNP, gene and isoform ASE through the use of personalized haplotype genome alignment, strict alignment quality control and intragenic SNP aggregation. Our results indicate that accurate ASE quantification requires careful bioinformatic analyses and is adversely affected by sample specific alignment confounders and random sampling even at moderate sequence depths. We identified multiple known and several novel ASE genes in liver, including WDR72, DSP and UBD, as well as genes that contained ASE SNPs with imbalance direction discordant with haplotype phase, explainable by annotated transcript structure, suggesting isoform derived ASE. The methods evaluated in this study will be of use to researchers performing highly conservative quantification of ASE, and the genes and isoforms identified as ASE of interest to researchers studying those loci. PMID:25965996
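    The intragenic SNP aggregation recommended above can be illustrated with a short sketch. It assumes that read counts at phased heterozygous SNPs are summed per haplotype and then compared against a 50:50 expectation with an exact binomial test. The choice of test, the function names and the counts are assumptions for illustration and are not taken from the paper; reads spanning several SNPs would be counted more than once in this simplified form.

```python
from math import comb


def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value: total probability of outcomes
    that are no more likely than the observed count k."""
    probs = [comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = probs[k]
    # Small relative tolerance guards against floating-point ties.
    return min(1.0, sum(pr for pr in probs if pr <= observed * (1 + 1e-9)))


def gene_level_ase(snp_counts):
    """snp_counts: list of (haplotype1_reads, haplotype2_reads), one pair per
    phased heterozygous SNP in a gene. Returns (haplotype1 fraction, p-value)."""
    hap1 = sum(a for a, _ in snp_counts)
    hap2 = sum(b for _, b in snp_counts)
    total = hap1 + hap2
    return hap1 / total, binomial_two_sided_p(hap1, total)


# Made-up counts for three SNPs in one hypothetical gene.
ratio, p_value = gene_level_ase([(30, 10), (25, 15), (28, 12)])
```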

  4. Inferring gene regression networks with model trees

    PubMed Central

    2010-01-01

    Background Novel strategies are required in order to handle the huge amount of data produced by microarray technologies. To infer gene regulatory networks, the first step is to find direct regulatory relationships between genes, building the so-called gene co-expression networks. They are typically generated using correlation statistics as pairwise similarity measures. Correlation-based methods are very useful for determining whether two genes have a strong global similarity, but they do not detect local similarities. Results We propose model trees as a method to identify gene interaction networks. While correlation-based methods analyze each pair of genes, in our approach we generate a single regression tree for each gene from the remaining genes. Finally, a graph of all the relationships among output and input genes is built, taking into account whether each pair of genes is statistically significant. For this reason we apply a statistical procedure to control the false discovery rate. The performance of our approach, named REGNET, is experimentally tested on two well-known data sets: Saccharomyces cerevisiae and E. coli. First, the biological coherence of the results is tested. Second, the E. coli transcriptional network (in the Regulon database) is used as a control to compare the results to those of a correlation-based method. This experiment shows that REGNET performs more accurately at detecting true gene associations than the Pearson and Spearman zeroth and first-order correlation-based methods. Conclusions REGNET generates gene association networks from gene expression data, and differs from correlation-based methods in that the relationship between one gene and others is calculated simultaneously. Model trees are very useful techniques to estimate the numerical values for the target genes by linear regression functions. They are very often more precise than linear regression models because they can add just different linear regressions to separate
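    The abstract above states that a statistical procedure controls the false discovery rate when edges are added to the network, but does not name the procedure. The sketch below uses the Benjamini-Hochberg step-up procedure as one possible choice; this is an assumption, and the p-values are made up.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where the hypothesis is rejected
    while controlling the false discovery rate at level alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    # Reject every hypothesis ranked at or below max_k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            rejected[idx] = True
    return rejected


# Made-up p-values for eight candidate gene-gene relationships.
p_vals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
keep_edge = benjamini_hochberg(p_vals, alpha=0.05)
```

    In this toy input only the first two relationships would be kept as edges.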

  5. Accurate analytic solution of chemical master equations for gene regulation networks in a single cell

    NASA Astrophysics Data System (ADS)

    Huang, Guan-Rong; Saakian, David B.; Hu, Chin-Kun

    2018-01-01

    Studying gene regulation networks in a single cell is an important, interesting, and hot research topic in molecular biology. Such processes can be described by chemical master equations (CMEs). We propose a Hamilton-Jacobi equation method with finite-size corrections to solve such CMEs accurately in the intermediate region of switching, where the switching rate is comparable to the fast protein production rate. We applied this approach to a model of self-regulating proteins [H. Ge et al., Phys. Rev. Lett. 114, 078101 (2015), 10.1103/PhysRevLett.114.078101] and found that, as a parameter related to inducer concentration increases, the probability of protein production changes from unimodal to bimodal, then to unimodal, consistent with phenotype switching observed in a single cell.

  6. Reference Genes for Accurate Transcript Normalization in Citrus Genotypes under Different Experimental Conditions

    PubMed Central

    Mafra, Valéria; Kubo, Karen S.; Alves-Ferreira, Marcio; Ribeiro-Alves, Marcelo; Stuart, Rodrigo M.; Boava, Leonardo P.; Rodrigues, Carolina M.; Machado, Marcos A.

    2012-01-01

    Real-time reverse transcription PCR (RT-qPCR) has emerged as an accurate and widely used technique for expression profiling of selected genes. However, obtaining reliable measurements depends on the selection of appropriate reference genes for gene expression normalization. The aim of this work was to assess the expression stability of 15 candidate genes to determine which set of reference genes is best suited for transcript normalization in citrus in different tissues and organs and leaves challenged with five pathogens (Alternaria alternata, Phytophthora parasitica, Xylella fastidiosa and Candidatus Liberibacter asiaticus). We tested traditional genes used for transcript normalization in citrus and orthologs of Arabidopsis thaliana genes described as superior reference genes based on transcriptome data. geNorm and NormFinder algorithms were used to find the best reference genes to normalize all samples and conditions tested. Additionally, each biotic stress was individually analyzed by geNorm. In general, FBOX (encoding a member of the F-box family) and GAPC2 (GAPDH) was the most stable candidate gene set assessed under the different conditions and subsets tested, while CYP (cyclophilin), TUB (tubulin) and CtP (cathepsin) were the least stably expressed genes found. Validation of the best suitable reference genes for normalizing the expression level of the WRKY70 transcription factor in leaves infected with Candidatus Liberibacter asiaticus showed that arbitrary use of reference genes without previous testing could lead to misinterpretation of data. Our results revealed FBOX, SAND (a SAND family protein), GAPC2 and UPL7 (ubiquitin protein ligase 7) to be superior reference genes, and we recommend their use in studies of gene expression in citrus species and relatives. This work constitutes the first systematic analysis for the selection of superior reference genes for transcript normalization in different citrus organs and under biotic stress. PMID:22347455
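    The stability ranking mentioned above can be illustrated with a simplified version of the geNorm stability measure M. For each candidate gene, M is the mean, over all other candidates, of the standard deviation of the log2 expression ratio across samples; lower values indicate more stable expression. The sketch uses the sample standard deviation, omits the stepwise exclusion of the least stable gene, and uses hypothetical gene names and made-up quantities.

```python
import math
import statistics


def genorm_m_values(expression):
    """expression: dict mapping gene -> list of relative quantities
    (same sample order for every gene). Returns dict gene -> M value."""
    genes = list(expression)
    m_values = {}
    for gene in genes:
        pairwise_sd = []
        for other in genes:
            if other == gene:
                continue
            log_ratios = [
                math.log2(a / b)
                for a, b in zip(expression[gene], expression[other])
            ]
            pairwise_sd.append(statistics.stdev(log_ratios))
        m_values[gene] = sum(pairwise_sd) / len(pairwise_sd)
    return m_values


# Made-up relative quantities for three hypothetical candidates, four samples.
quantities = {
    "REF_A": [1.0, 2.0, 4.0, 8.0],
    "REF_B": [2.0, 4.0, 8.0, 16.0],  # constant ratio to REF_A
    "REF_C": [1.0, 4.0, 2.0, 16.0],  # ratio to the others fluctuates
}
m_values = genorm_m_values(quantities)
most_stable_first = sorted(m_values, key=m_values.get)
```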

  7. Accurate lithography simulation model based on convolutional neural networks

    NASA Astrophysics Data System (ADS)

    Watanabe, Yuki; Kimura, Taiki; Matsunawa, Tetsuaki; Nojima, Shigeki

    2017-07-01

    Lithography simulation is an essential technique for today's semiconductor manufacturing process. In order to calculate an entire chip in realistic time, a compact resist model is commonly used. The model is established for faster calculation. To obtain an accurate compact resist model, it is necessary to fix a complicated non-linear model function. However, it is difficult to decide on an appropriate function manually because there are many options. This paper proposes a new compact resist model using a CNN (convolutional neural network), which is one of the deep learning techniques. The CNN model makes it possible to determine an appropriate model function and achieve accurate simulation. Experimental results show that the CNN model can reduce CD prediction errors by 70% compared with the conventional model.

  8. Low-dimensional, morphologically accurate models of subthreshold membrane potential

    PubMed Central

    Kellems, Anthony R.; Roos, Derrick; Xiao, Nan; Cox, Steven J.

    2009-01-01

    The accurate simulation of a neuron’s ability to integrate distributed synaptic input typically requires the simultaneous solution of tens of thousands of ordinary differential equations. For, in order to understand how a cell distinguishes between input patterns we apparently need a model that is biophysically accurate down to the space scale of a single spine, i.e., 1 μm. We argue here that one can retain this highly detailed input structure while dramatically reducing the overall system dimension if one is content to accurately reproduce the associated membrane potential at a small number of places, e.g., at the site of action potential initiation, under subthreshold stimulation. The latter hypothesis permits us to approximate the active cell model with an associated quasi-active model, which in turn we reduce by both time-domain (Balanced Truncation) and frequency-domain (ℋ2 approximation of the transfer function) methods. We apply and contrast these methods on a suite of typical cells, achieving up to four orders of magnitude in dimension reduction and an associated speed-up in the simulation of dendritic democratization and resonance. We also append a threshold mechanism and indicate that this reduction has the potential to deliver an accurate quasi-integrate and fire model. PMID:19172386

  9. Improvement of experimental testing and network training conditions with genome-wide microarrays for more accurate predictions of drug gene targets

    PubMed Central

    2014-01-01

    Background Genome-wide microarrays have been useful for predicting chemical-genetic interactions at the gene level. However, interpreting genome-wide microarray results can be overwhelming due to the vast output of gene expression data combined with off-target transcriptional responses many times induced by a drug treatment. This study demonstrates how experimental and computational methods can interact with each other, to arrive at more accurate predictions of drug-induced perturbations. We present a two-stage strategy that links microarray experimental testing and network training conditions to predict gene perturbations for a drug with a known mechanism of action in a well-studied organism. Results S. cerevisiae cells were treated with the antifungal, fluconazole, and expression profiling was conducted under different biological conditions using Affymetrix genome-wide microarrays. Transcripts were filtered with a formal network-based method, sparse simultaneous equation models and Lasso regression (SSEM-Lasso), under different network training conditions. Gene expression results were evaluated using both gene set and single gene target analyses, and the drug’s transcriptional effects were narrowed first by pathway and then by individual genes. Variables included: (i) Testing conditions – exposure time and concentration and (ii) Network training conditions – training compendium modifications. Two analyses of SSEM-Lasso output – gene set and single gene – were conducted to gain a better understanding of how SSEM-Lasso predicts perturbation targets. Conclusions This study demonstrates that genome-wide microarrays can be optimized using a two-stage strategy for a more in-depth understanding of how a cell manifests biological reactions to a drug treatment at the transcription level. Additionally, a more detailed understanding of how the statistical model, SSEM-Lasso, propagates perturbations through a network of gene regulatory interactions is achieved

  10. An Accurate and Dynamic Computer Graphics Muscle Model

    NASA Technical Reports Server (NTRS)

    Levine, David Asher

    1997-01-01

    A computer based musculo-skeletal model was developed at the University in the departments of Mechanical and Biomedical Engineering. This model accurately represents human shoulder kinematics. The result of this model is the graphical display of bones moving through an appropriate range of motion based on inputs of EMGs and external forces. The need existed to incorporate a geometric muscle model in the larger musculo-skeletal model. Previous muscle models did not accurately represent muscle geometries, nor did they account for the kinematics of tendons. This thesis covers the creation of a new muscle model for use in the above musculo-skeletal model. This muscle model was based on anatomical data from the Visible Human Project (VHP) cadaver study. Two-dimensional digital images from the VHP were analyzed and reconstructed to recreate the three-dimensional muscle geometries. The recreated geometries were smoothed, reduced, and sliced to form data files defining the surfaces of each muscle. The muscle modeling function opened these files during run-time and recreated the muscle surface. The modeling function applied constant volume limitations to the muscle and constant geometry limitations to the tendons.

  11. Accurate, Rapid Taxonomic Classification of Fungal Large-Subunit rRNA Genes

    PubMed Central

    Liu, Kuan-Liang; Porras-Alfaro, Andrea; Eichorst, Stephanie A.

    2012-01-01

    Taxonomic and phylogenetic fingerprinting based on sequence analysis of gene fragments from the large-subunit rRNA (LSU) gene or the internal transcribed spacer (ITS) region is becoming an integral part of fungal classification. The lack of an accurate and robust classification tool trained by a validated sequence database for taxonomic placement of fungal LSU genes is a severe limitation in taxonomic analysis of fungal isolates or large data sets obtained from environmental surveys. Using a hand-curated set of 8,506 fungal LSU gene fragments, we determined the performance characteristics of a naïve Bayesian classifier across multiple taxonomic levels and compared the classifier performance to that of a sequence similarity-based (BLASTN) approach. The naïve Bayesian classifier was computationally more rapid (>460-fold with our system) than the BLASTN approach, and it provided equal or superior classification accuracy. Classifier accuracies were compared using sequence fragments of 100 bp and 400 bp and two different PCR primer anchor points to mimic sequence read lengths commonly obtained using current high-throughput sequencing technologies. Accuracy was higher with 400-bp sequence reads than with 100-bp reads. It was also significantly affected by sequence location across the 1,400-bp test region. The highest accuracy was obtained across either the D1 or D2 variable region. The naïve Bayesian classifier provides an effective and rapid means to classify fungal LSU sequences from large environmental surveys. The training set and tool are publicly available through the Ribosomal Database Project (http://rdp.cme.msu.edu/classifier/classifier.jsp). PMID:22194300
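    The word-based naïve Bayesian classification described above can be sketched in a few lines. This toy version assumes equal priors for all taxa, scores a query by the smoothed probability of each of its k-mers under each taxon, and uses a fixed pseudo-count of 0.5; the pseudo-count, the short word length, the taxon names and the sequences are all simplifications chosen for illustration and do not reproduce the published tool.

```python
import math
from collections import defaultdict


def kmers(seq, k):
    """Set of distinct overlapping words of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def train(labelled_seqs, k):
    """labelled_seqs: list of (taxon, sequence) pairs. Returns per-taxon
    counts of training sequences containing each word, and sequence totals."""
    word_counts = defaultdict(lambda: defaultdict(int))
    n_seqs = defaultdict(int)
    for taxon, seq in labelled_seqs:
        n_seqs[taxon] += 1
        for word in kmers(seq, k):
            word_counts[taxon][word] += 1
    return word_counts, n_seqs


def classify(query, word_counts, n_seqs, k):
    """Return the taxon with the highest summed log word probability."""
    best_taxon, best_score = None, -math.inf
    for taxon, n in n_seqs.items():
        counts = word_counts[taxon]
        # Smoothed P(word | taxon) = (count + 0.5) / (n + 1).
        score = sum(
            math.log((counts.get(word, 0) + 0.5) / (n + 1))
            for word in kmers(query, k)
        )
        if score > best_score:
            best_taxon, best_score = taxon, score
    return best_taxon


# Made-up training sequences for two hypothetical taxa.
training = [
    ("TaxonA", "ACGTACGTACGT"),
    ("TaxonA", "ACGTACGTTCGT"),
    ("TaxonB", "GGCCGGCCGGCC"),
    ("TaxonB", "GGCCGGCAGGCC"),
]
word_counts, n_seqs = train(training, k=4)
assigned = classify("ACGTACGA", word_counts, n_seqs, k=4)
```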

  12. A gene expression biomarker accurately predicts estrogen ...

    EPA Pesticide Factsheets

    The EPA’s vision for the Endocrine Disruptor Screening Program (EDSP) in the 21st Century (EDSP21) includes utilization of high-throughput screening (HTS) assays coupled with computational modeling to prioritize chemicals with the goal of eventually replacing current Tier 1 screening tests. The ToxCast program currently includes 18 HTS in vitro assays that evaluate the ability of chemicals to modulate estrogen receptor α (ERα), an important endocrine target. We propose microarray-based gene expression profiling as a complementary approach to predict ERα modulation and have developed computational methods to identify ERα modulators in an existing database of whole-genome microarray data. The ERα biomarker consisted of 46 ERα-regulated genes with consistent expression patterns across 7 known ER agonists and 3 known ER antagonists. The biomarker was evaluated as a predictive tool using the fold-change rank-based Running Fisher algorithm by comparison to annotated gene expression data sets from experiments in MCF-7 cells. Using 141 comparisons from chemical- and hormone-treated cells, the biomarker gave a balanced accuracy for prediction of ERα activation or suppression of 94% or 93%, respectively. The biomarker was able to correctly classify 18 out of 21 (86%) OECD ER reference chemicals including “very weak” agonists and replicated predictions based on 18 in vitro ER-associated HTS assays. For 114 chemicals present in both the HTS data and the MCF-7 c

  13. Identification of Importin 8 (IPO8) as the most accurate reference gene for the clinicopathological analysis of lung specimens

    PubMed Central

    Nguewa, Paul A; Agorreta, Jackeline; Blanco, David; Lozano, Maria Dolores; Gomez-Roman, Javier; Sanchez, Blas A; Valles, Iñaki; Pajares, Maria J; Pio, Ruben; Rodriguez, Maria Jose; Montuenga, Luis M; Calvo, Alfonso

    2008-01-01

    Background The accurate normalization of differentially expressed genes in lung cancer is essential for the identification of novel therapeutic targets and biomarkers by real time RT-PCR and microarrays. Although classical "housekeeping" genes, such as GAPDH, HPRT1, and beta-actin have been widely used in the past, their accuracy as reference genes for lung tissues has not been proven. Results We have conducted a thorough analysis of a panel of 16 candidate reference genes for lung specimens and lung cell lines. Gene expression was measured by quantitative real time RT-PCR, and expression stability was analyzed with the software packages GeNorm and NormFinder, the mean of |ΔCt| (= |Ct Normal-Ct tumor|) ± SEM, and correlation coefficients among genes. Systematic comparison between candidates led us to the identification of a subset of suitable reference genes for clinical samples: IPO8, ACTB, POLR2A, 18S, and PPIA. Further analysis showed that IPO8 had a very low mean of |ΔCt| (0.70 ± 0.09), with no statistically significant differences between normal and malignant samples and with excellent expression stability. Conclusion Our data show that IPO8 is the most accurate reference gene for clinical lung specimens. In addition, we demonstrate that the commonly used genes GAPDH and HPRT1 are inappropriate to normalize data derived from lung biopsies, although they are suitable as reference genes for lung cell lines. We thus propose IPO8 as a novel reference gene for lung cancer samples. PMID:19014639
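    The mean of |ΔCt| ± SEM used above as a stability criterion can be computed as in the following sketch. It assumes paired normal and tumor Ct values for one candidate gene and uses the sample standard deviation for the SEM; the Ct values are made up.

```python
import math
import statistics


def mean_abs_delta_ct(ct_normal, ct_tumor):
    """Mean and standard error of |Ct normal - Ct tumor| over paired samples."""
    deltas = [abs(n - t) for n, t in zip(ct_normal, ct_tumor)]
    mean = statistics.mean(deltas)
    sem = statistics.stdev(deltas) / math.sqrt(len(deltas))
    return mean, sem


# Made-up Ct values for one hypothetical gene in four paired specimens.
ct_normal = [24.1, 25.0, 23.8, 24.6]
ct_tumor = [24.9, 24.4, 24.5, 25.1]
mean_dct, sem_dct = mean_abs_delta_ct(ct_normal, ct_tumor)
```

    A gene whose mean |ΔCt| stays small across specimens changes little between normal and malignant tissue, which is the property sought in a reference gene.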

  14. Accurate modelling of unsteady flows in collapsible tubes.

    PubMed

    Marchandise, Emilie; Flaud, Patrice

    2010-01-01

    The context of this paper is the development of a general and efficient numerical haemodynamic tool to help clinicians and researchers understand physiological flow phenomena. We propose an accurate one-dimensional Runge-Kutta discontinuous Galerkin (RK-DG) method coupled with lumped parameter models for the boundary conditions. The suggested model has already been successfully applied to haemodynamics in arteries and is now extended to flow in collapsible tubes such as veins. The main difference from cardiovascular simulations is that the flow may become supercritical and elastic jumps may appear, with the numerical consequence that the scheme may not remain monotone if no limiting procedure is introduced. We show that our second-order RK-DG method equipped with an approximate Roe's Riemann solver and a slope-limiting procedure allows us to capture elastic jumps accurately. Moreover, this paper demonstrates that the complex physics associated with such flows is more accurately modelled than with traditional methods such as finite difference methods or finite volumes. We present various benchmark problems that show the flexibility and applicability of the numerical method. Our solutions are compared with analytical solutions when they are available and with solutions obtained using other numerical methods. Finally, to illustrate the clinical interest, we study the emptying process in a calf vein squeezed by contracting skeletal muscle in a normal and pathological subject. We compare our results with experimental simulations and discuss the sensitivity to parameters of our model.

  15. A gene regulatory network model for floral transition of the shoot apex in maize and its dynamic modeling.

    PubMed

    Dong, Zhanshan; Danilevskaya, Olga; Abadie, Tabare; Messina, Carlos; Coles, Nathan; Cooper, Mark

    2012-01-01

    The transition from the vegetative to reproductive development is a critical event in the plant life cycle. The accurate prediction of flowering time in elite germplasm is important for decisions in maize breeding programs and best agronomic practices. The understanding of the genetic control of flowering time in maize has significantly advanced in the past decade. Through comparative genomics, mutant analysis, genetic analysis and QTL cloning, and transgenic approaches, more than 30 flowering time candidate genes in maize have been revealed and the relationships among these genes have been partially uncovered. Based on the knowledge of the flowering time candidate genes, a conceptual gene regulatory network model for the genetic control of flowering time in maize is proposed. To demonstrate the potential of the proposed gene regulatory network model, a first attempt was made to develop a dynamic gene network model to predict flowering time of maize genotypes varying for specific genes. The dynamic gene network model is composed of four genes and was built on the basis of gene expression dynamics of the two late flowering id1 and dlf1 mutants, the early flowering landrace Gaspe Flint and the temperate inbred B73. The model was evaluated against the phenotypic data of the id1 dlf1 double mutant and the ZMM4 overexpressed transgenic lines. The model provides a working example that leverages knowledge from model organisms for the utilization of maize genomic information to predict a whole plant trait phenotype, flowering time, of maize genotypes.

  16. Moving Toward Integrating Gene Expression Profiling Into High-Throughput Testing: A Gene Expression Biomarker Accurately Predicts Estrogen Receptor α Modulation in a Microarray Compendium

    PubMed Central

    Ryan, Natalia; Chorley, Brian; Tice, Raymond R.; Judson, Richard; Corton, J. Christopher

    2016-01-01

    Microarray profiling of chemical-induced effects is being increasingly used in medium- and high-throughput formats. Computational methods are described here to identify molecular targets from whole-genome microarray data using as an example the estrogen receptor α (ERα), often modulated by potential endocrine disrupting chemicals. ERα biomarker genes were identified by their consistent expression after exposure to 7 structurally diverse ERα agonists and 3 ERα antagonists in ERα-positive MCF-7 cells. Most of the biomarker genes were shown to be directly regulated by ERα as determined by ESR1 gene knockdown using siRNA as well as through chromatin immunoprecipitation coupled with DNA sequencing analysis of ERα-DNA interactions. The biomarker was evaluated as a predictive tool using the fold-change rank-based Running Fisher algorithm by comparison to annotated gene expression datasets from experiments using MCF-7 cells, including those evaluating the transcriptional effects of hormones and chemicals. Using 141 comparisons from chemical- and hormone-treated cells, the biomarker gave a balanced accuracy for prediction of ERα activation or suppression of 94% and 93%, respectively. The biomarker was able to correctly classify 18 out of 21 (86%) ER reference chemicals including “very weak” agonists. Importantly, the biomarker predictions accurately replicated predictions based on 18 in vitro high-throughput screening assays that queried different steps in ERα signaling. For 114 chemicals, the balanced accuracies were 95% and 98% for activation or suppression, respectively. These results demonstrate that the ERα gene expression biomarker can accurately identify ERα modulators in large collections of microarray data derived from MCF-7 cells. PMID:26865669
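    Balanced accuracy, the metric reported above, is the mean of sensitivity and specificity. A minimal sketch follows, with labels coded as 1 (modulator) and 0 (non-modulator); the example labels are made up and unrelated to the study's data.

```python
def balanced_accuracy(true_labels, predicted_labels):
    """Mean of sensitivity and specificity for binary labels (1 = positive)."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fp += 1
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Made-up example: 4 true positives and 6 true negatives.
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
score = balanced_accuracy(truth, predicted)
```

    Unlike plain accuracy, this measure is not inflated when one class is much larger than the other.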

  17. Identification and evaluation of new reference genes in Gossypium hirsutum for accurate normalization of real-time quantitative RT-PCR data.

    PubMed

    Artico, Sinara; Nardeli, Sarah M; Brilhante, Osmundo; Grossi-de-Sa, Maria Fátima; Alves-Ferreira, Marcio

    2010-03-21

    Normalizing through reference genes, or housekeeping genes, can yield more accurate and reliable results from reverse transcription real-time quantitative polymerase chain reaction (qPCR). Recent studies have shown that no single housekeeping gene is universal for all experiments. Thus, selecting suitable reference genes should be the first step of any qPCR analysis. Only a few studies on the identification of housekeeping genes have been carried out on plants. Therefore, qPCR studies on important crops such as cotton have been hampered by the lack of suitable reference genes. By the use of two distinct algorithms, implemented by geNorm and NormFinder, we have assessed the gene expression of nine candidate reference genes in cotton: GhACT4, GhEF1alpha5, GhFBX6, GhPP2A1, GhMZA, GhPTB, GhGAPC2, GhbetaTUB3 and GhUBQ14. The candidate reference genes were evaluated in 23 experimental samples consisting of six distinct plant organs, eight stages of flower development, four stages of fruit development and in flower verticils. The expression of the GhPP2A1 and GhUBQ14 genes was the most stable across all samples and also when distinct plant organs are examined. GhACT4 and GhUBQ14 present more stable expression during flower development, GhACT4 and GhFBX6 in the floral verticils and GhMZA and GhPTB during fruit development. Our analysis provided the most suitable combination of reference genes for each experimental set tested as internal control for reliable qPCR data normalization. In addition, to illustrate the use of cotton reference genes we checked the expression of two cotton MADS-box genes in distinct plant and floral organs and also during flower development. We have tested the expression stabilities of nine candidate genes in a set of 23 tissue samples from cotton plants divided into five different experimental sets. As a result of this evaluation, we recommend the use of GhUBQ14 and GhPP2A1 housekeeping genes as superior references for normalization of gene expression measures in

  18. Identification and evaluation of new reference genes in Gossypium hirsutum for accurate normalization of real-time quantitative RT-PCR data

    PubMed Central

    2010-01-01

    Background Normalizing through reference genes, or housekeeping genes, can yield more accurate and reliable results from reverse transcription real-time quantitative polymerase chain reaction (qPCR). Recent studies have shown that no single housekeeping gene is universal for all experiments. Thus, selecting suitable reference genes should be the first step of any qPCR analysis. Only a few studies on the identification of housekeeping genes have been carried out on plants. Therefore, qPCR studies on important crops such as cotton have been hampered by the lack of suitable reference genes. Results By the use of two distinct algorithms, implemented by geNorm and NormFinder, we have assessed the gene expression of nine candidate reference genes in cotton: GhACT4, GhEF1α5, GhFBX6, GhPP2A1, GhMZA, GhPTB, GhGAPC2, GhβTUB3 and GhUBQ14. The candidate reference genes were evaluated in 23 experimental samples consisting of six distinct plant organs, eight stages of flower development, four stages of fruit development and in flower verticils. The expression of the GhPP2A1 and GhUBQ14 genes was the most stable across all samples and also when distinct plant organs are examined. GhACT4 and GhUBQ14 present more stable expression during flower development, GhACT4 and GhFBX6 in the floral verticils and GhMZA and GhPTB during fruit development. Our analysis provided the most suitable combination of reference genes for each experimental set tested as internal control for reliable qPCR data normalization. In addition, to illustrate the use of cotton reference genes we checked the expression of two cotton MADS-box genes in distinct plant and floral organs and also during flower development. Conclusion We have tested the expression stabilities of nine candidate genes in a set of 23 tissue samples from cotton plants divided into five different experimental sets. As a result of this evaluation, we recommend the use of GhUBQ14 and GhPP2A1 housekeeping genes as superior references for normalization of gene

  19. Accurate Modeling Method for Cu Interconnect

    NASA Astrophysics Data System (ADS)

    Yamada, Kenta; Kitahara, Hiroshi; Asai, Yoshihiko; Sakamoto, Hideo; Okada, Norio; Yasuda, Makoto; Oda, Noriaki; Sakurai, Michio; Hiroi, Masayuki; Takewaki, Toshiyuki; Ohnishi, Sadayuki; Iguchi, Manabu; Minda, Hiroyasu; Suzuki, Mieko

    This paper proposes an accurate modeling method for the copper interconnect cross-section in which the dependence of width and thickness on layout patterns and density caused by processes (CMP, etching, sputtering, lithography, and so on) is fully incorporated and universally expressed. In addition, we have developed specific test patterns for model parameter extraction, and an efficient extraction flow. We have extracted the model parameters for 0.15 μm CMOS using this method and confirmed that the 10% τpd error normally observed with conventional LPE (Layout Parameters Extraction) was completely eliminated. Moreover, it is verified that the model can be applied to more advanced technologies (90 nm, 65 nm and 55 nm CMOS). Since the interconnect delay variations due to the processes constitute a significant part of what has conventionally been treated as random variation, use of the proposed model could enable one to greatly narrow the guardbands required to guarantee a desired yield, thereby facilitating design closure.

  20. Dinucleotide controlled null models for comparative RNA gene prediction.

    PubMed

    Gesell, Tanja; Washietl, Stefan

    2008-05-27

    Comparative prediction of RNA structures can be used to identify functional noncoding RNAs in genomic screens. It was shown recently by Babak et al. [BMC Bioinformatics. 8:33] that RNA gene prediction programs can be biased by the genomic dinucleotide content, in particular those programs using a thermodynamic folding model including stacking energies. As a consequence, there is need for dinucleotide-preserving control strategies to assess the significance of such predictions. While there have been randomization algorithms for single sequences for many years, the problem has remained challenging for multiple alignments and there is currently no algorithm available. We present a program called SISSIz that simulates multiple alignments of a given average dinucleotide content. Meeting additional requirements of an accurate null model, the randomized alignments are on average of the same sequence diversity and preserve local conservation and gap patterns. We make use of a phylogenetic substitution model that includes overlapping dependencies and site-specific rates. Using fast heuristics and a distance based approach, a tree is estimated under this model which is used to guide the simulations. The new algorithm is tested on vertebrate genomic alignments and the effect on RNA structure predictions is studied. In addition, we directly combined the new null model with the RNAalifold consensus folding algorithm giving a new variant of a thermodynamic structure based RNA gene finding program that is not biased by the dinucleotide content. SISSIz implements an efficient algorithm to randomize multiple alignments preserving dinucleotide content. It can be used to get more accurate estimates of false positive rates of existing programs, to produce negative controls for the training of machine learning based programs, or as standalone RNA gene finding program. Other applications in comparative genomics that require randomization of multiple alignments can be considered. 

  1. Prognostic breast cancer signature identified from 3D culture model accurately predicts clinical outcome across independent datasets

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Martin, Katherine J.; Patrick, Denis R.; Bissell, Mina J.

    2008-10-20

    One of the major tenets in breast cancer research is that early detection is vital for patient survival by increasing treatment options. To that end, we have previously used a novel unsupervised approach to identify a set of genes whose expression predicts prognosis of breast cancer patients. The predictive genes were selected in a well-defined three-dimensional (3D) cell culture model of non-malignant human mammary epithelial cell morphogenesis as down-regulated during breast epithelial cell acinar formation and cell cycle arrest. Here we examine the ability of this gene signature (3D-signature) to predict prognosis in three independent breast cancer microarray datasets having 295, 286, and 118 samples, respectively. Our results show that the 3D-signature accurately predicts prognosis in three unrelated patient datasets. At 10 years, the probability of positive outcome was 52, 51, and 47 percent in the group with a poor-prognosis signature and 91, 75, and 71 percent in the group with a good-prognosis signature for the three datasets, respectively (Kaplan-Meier survival analysis, p<0.05). Hazard ratios for poor outcome were 5.5 (95% CI 3.0 to 12.2, p<0.0001), 2.4 (95% CI 1.6 to 3.6, p<0.0001) and 1.9 (95% CI 1.1 to 3.2, p = 0.016) and remained significant for the two larger datasets when corrected for estrogen receptor (ER) status. Hence the 3D-signature accurately predicts breast cancer outcome in both ER-positive and ER-negative tumors, though individual genes differed in their prognostic ability in the two subtypes. Genes that were prognostic in ER+ patients are AURKA, CEP55, RRM2, EPHA2, FGFBP1, and VRK1, while genes prognostic in ER- patients include ACTB, FOXM1 and SERPINE2 (Kaplan-Meier p<0.05). Multivariable Cox regression analysis in the largest dataset showed that the 3D-signature was a strong independent factor in predicting breast cancer outcome. The 3D-signature accurately predicts breast cancer outcome across multiple datasets and holds

  2. Gene expression models for prediction of longitudinal dispersion coefficient in streams

    NASA Astrophysics Data System (ADS)

    Sattar, Ahmed M. A.; Gharabaghi, Bahram

    2015-05-01

    Longitudinal dispersion is the key hydrologic process that governs transport of pollutants in natural streams. It is critical for spill action centers to be able to predict the pollutant travel time and break-through curves accurately following accidental spills in urban streams. This study presents a novel gene expression model for longitudinal dispersion developed using 150 published data sets of geometric and hydraulic parameters in natural streams in the United States, Canada, Europe, and New Zealand. The training and testing of the model were accomplished using randomly-selected 67% (100 data sets) and 33% (50 data sets) of the data sets, respectively. Gene expression programming (GEP) is used to develop empirical relations between the longitudinal dispersion coefficient and various control variables, including the Froude number which reflects the effect of reach slope, aspect ratio, and the bed material roughness on the dispersion coefficient. Two GEP models have been developed, and the prediction uncertainties of the developed GEP models are quantified and compared with those of existing models, showing improved prediction accuracy in favor of GEP models. Finally, a parametric analysis is performed for further verification of the developed GEP models. The main reason for the higher accuracy of the GEP models compared to the existing regression models is that exponents of the key variables (aspect ratio and bed material roughness) are not constants but a function of the Froude number. The proposed relations are both simple and accurate and can be effectively used to predict the longitudinal dispersion coefficients in natural streams.

  3. 3ARM: A Fast, Accurate Radiative Transfer Model for Use in Climate Models

    NASA Technical Reports Server (NTRS)

    Bergstrom, R. W.; Kinne, S.; Sokolik, I. N.; Toon, O. B.; Mlawer, E. J.; Clough, S. A.; Ackerman, T. P.; Mather, J.

    1996-01-01

    A new radiative transfer model combining the efforts of three groups of researchers is discussed. The model accurately computes radiative transfer in inhomogeneous absorbing, scattering and emitting atmospheres. As an illustration of the model, results are shown for the effects of dust on the thermal radiation.

  6. A random variance model for detection of differential gene expression in small microarray experiments.

    PubMed

    Wright, George W; Simon, Richard M

    2003-12-12

    Microarray techniques provide a valuable way of characterizing the molecular nature of disease. Unfortunately expense and limited specimen availability often lead to studies with small sample sizes. This makes accurate estimation of variability difficult, since variance estimates made on a gene by gene basis will have few degrees of freedom, and the assumption that all genes share equal variance is unlikely to be true. We propose a model by which the within gene variances are drawn from an inverse gamma distribution, whose parameters are estimated across all genes. This results in a test statistic that is a minor variation of those used in standard linear models. We demonstrate that the model assumptions are valid on experimental data, and that the model has more power than standard tests to pick up large changes in expression, while not increasing the rate of false positives. This method is incorporated into BRB-ArrayTools version 3.0 (http://linus.nci.nih.gov/BRB-ArrayTools.html). ftp://linus.nci.nih.gov/pub/techreport/RVM_supplement.pdf
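    The variance shrinkage described in this abstract can be sketched as follows. This is a minimal illustration, not the BRB-ArrayTools implementation: it assumes the precision 1/σ² of each gene follows a gamma prior with shape `a` and scale `b`, and that `a` and `b` have already been estimated across all genes (that estimation step is not shown). Under that assumption the standard conjugate-prior result is a weighted average of the per-gene variance and the prior, with the degrees of freedom increased by 2a. The function names and the example numbers are hypothetical.

```python
import math
import statistics


def shrunken_variance(s2, df, a, b):
    """Shrink a per-gene variance s2 (with df degrees of freedom) toward the prior.

    Assumes 1/sigma^2 ~ Gamma(shape=a, scale=b); a and b are taken as given.
    The result is the reciprocal of the posterior mean precision.
    """
    return (df * s2 + 2.0 / b) / (df + 2.0 * a)


def moderated_t(group1, group2, a, b):
    """Two-sample t-like statistic using the shrunken variance.

    Returns (t, degrees_of_freedom). Illustrative only.
    """
    n1, n2 = len(group1), len(group2)
    df = n1 + n2 - 2
    # Ordinary pooled within-group variance for this gene.
    pooled = ((n1 - 1) * statistics.variance(group1)
              + (n2 - 1) * statistics.variance(group2)) / df
    s2_tilde = shrunken_variance(pooled, df, a, b)
    diff = statistics.mean(group1) - statistics.mean(group2)
    t = diff / math.sqrt(s2_tilde * (1.0 / n1 + 1.0 / n2))
    return t, df + 2.0 * a


if __name__ == "__main__":
    # Hypothetical log-expression values for one gene in two classes.
    t, dof = moderated_t([1.0, 2.0, 3.0], [4.0, 5.0, 6.0], a=2.0, b=0.5)
    print(t, dof)
```

    With only a few samples per class, the added 2a degrees of freedom are what give a statistic of this form more power than a per-gene t-test, while genes with unusually small sample variances are pulled back toward the prior.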

  7. Can phenological models predict tree phenology accurately under climate change conditions?

    NASA Astrophysics Data System (ADS)

    Chuine, Isabelle; Bonhomme, Marc; Legave, Jean Michel; García de Cortázar-Atauri, Inaki; Charrier, Guillaume; Lacointe, André; Améglio, Thierry

    2014-05-01

    The onset of the growing season of trees has advanced globally by 2.3 days/decade during the last 50 years because of global warming, and this trend is predicted to continue according to climate forecasts. The effect of temperature on plant phenology is however not linear because temperature has a dual effect on bud development. On one hand, low temperatures are necessary to break bud dormancy, and on the other hand higher temperatures are necessary to promote bud cell growth afterwards. Increasing phenological changes in temperate woody species have strong impacts on forest tree distribution and productivity, as well as crop cultivation areas. Accurate predictions of tree phenology are therefore a prerequisite to understand and foresee the impacts of climate change on forests and agrosystems. Different process-based models have been developed in the last two decades to predict the date of budburst or flowering of woody species. There are two main families: (1) one-phase models which consider only the ecodormancy phase and make the assumption that endodormancy is always broken before adequate climatic conditions for cell growth occur; and (2) two-phase models which consider both the endodormancy and ecodormancy phases and predict a date of dormancy break which varies from year to year. So far, one-phase models have been able to accurately predict tree bud break and flowering under historical climate. However, because they do not consider what happens prior to ecodormancy, and especially the possible negative effect of winter temperature warming on dormancy break, it seems unlikely that they can provide accurate predictions in future climate conditions. It is indeed well known that a lack of low temperature results in abnormal patterns of bud break and development in temperate fruit trees. An accurate modelling of the dormancy break date has thus become a major issue in phenology modelling.
Two-phases phenological models predict that global warming should delay

  8. Moving Toward Integrating Gene Expression Profiling Into High-Throughput Testing: A Gene Expression Biomarker Accurately Predicts Estrogen Receptor α Modulation in a Microarray Compendium.

    PubMed

    Ryan, Natalia; Chorley, Brian; Tice, Raymond R; Judson, Richard; Corton, J Christopher

    2016-05-01

    Microarray profiling of chemical-induced effects is being increasingly used in medium- and high-throughput formats. Computational methods are described here to identify molecular targets from whole-genome microarray data using as an example the estrogen receptor α (ERα), often modulated by potential endocrine disrupting chemicals. ERα biomarker genes were identified by their consistent expression after exposure to 7 structurally diverse ERα agonists and 3 ERα antagonists in ERα-positive MCF-7 cells. Most of the biomarker genes were shown to be directly regulated by ERα as determined by ESR1 gene knockdown using siRNA as well as through chromatin immunoprecipitation coupled with DNA sequencing analysis of ERα-DNA interactions. The biomarker was evaluated as a predictive tool using the fold-change rank-based Running Fisher algorithm by comparison to annotated gene expression datasets from experiments using MCF-7 cells, including those evaluating the transcriptional effects of hormones and chemicals. Using 141 comparisons from chemical- and hormone-treated cells, the biomarker gave a balanced accuracy for prediction of ERα activation or suppression of 94% and 93%, respectively. The biomarker was able to correctly classify 18 out of 21 (86%) ER reference chemicals including "very weak" agonists. Importantly, the biomarker predictions accurately replicated predictions based on 18 in vitro high-throughput screening assays that queried different steps in ERα signaling. For 114 chemicals, the balanced accuracies were 95% and 98% for activation or suppression, respectively. These results demonstrate that the ERα gene expression biomarker can accurately identify ERα modulators in large collections of microarray data derived from MCF-7 cells. Published by Oxford University Press on behalf of the Society of Toxicology 2016. This work is written by US Government employees and is in the public domain in the US.
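    The balanced accuracy figures quoted in this abstract use a standard metric: the mean of sensitivity and specificity, which avoids rewarding a classifier for simply predicting the majority class. A minimal sketch follows; the confusion-matrix counts in the example are hypothetical and are not taken from the study.

```python
def balanced_accuracy(tp, fn, tn, fp):
    """Mean of sensitivity (true positive rate) and specificity (true negative rate)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2.0


if __name__ == "__main__":
    # Hypothetical counts: 20 true activators, 100 true non-activators.
    print(balanced_accuracy(tp=18, fn=2, tn=95, fp=5))
```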

  9. An Accurate Temperature Correction Model for Thermocouple Hygrometers

    PubMed Central

    Savage, Michael J.; Cass, Alfred; de Jager, James M.

    1982-01-01

    Numerous water relation studies have used thermocouple hygrometers routinely. However, the accurate temperature correction of hygrometer calibration curve slopes seems to have been largely neglected in both psychrometric and dewpoint techniques. In the case of thermocouple psychrometers, two temperature correction models are proposed, each based on measurement of the thermojunction radius and calculation of the theoretical voltage sensitivity to changes in water potential. The first model relies on calibration at a single temperature and the second at two temperatures. Both these models were more accurate than the temperature correction models currently in use for four psychrometers calibrated over a range of temperatures (15-38°C). The model based on calibration at two temperatures is superior to that based on only one calibration. The model proposed for dewpoint hygrometers is similar to that for psychrometers. It is based on the theoretical voltage sensitivity to changes in water potential. Comparison with empirical data from three dewpoint hygrometers calibrated at four different temperatures indicates that these instruments need only be calibrated at, e.g. 25°C, if the calibration slopes are corrected for temperature. PMID:16662241

  10. An accurate model for predicting high frequency noise of nanoscale NMOS SOI transistors

    NASA Astrophysics Data System (ADS)

    Shen, Yanfei; Cui, Jie; Mohammadi, Saeed

    2017-05-01

    A nonlinear and scalable model suitable for predicting high frequency noise of N-type Metal Oxide Semiconductor (NMOS) transistors is presented. The model is developed for a commercial 45 nm CMOS SOI technology and its accuracy is validated through comparison with measured performance of a microwave low noise amplifier. The model employs the virtual source nonlinear core and adds parasitic elements to accurately simulate the RF behavior of multi-finger NMOS transistors up to 40 GHz. For the first time, the traditional long-channel thermal noise model is supplemented with an injection noise model to accurately represent the noise behavior of these short-channel transistors up to 26 GHz. The developed model is simple and easy to extract, yet very accurate.

  11. A highly sensitive and accurate gene expression analysis by sequencing ("bead-seq") for a single cell.

    PubMed

    Matsunaga, Hiroko; Goto, Mari; Arikawa, Koji; Shirai, Masataka; Tsunoda, Hiroyuki; Huang, Huan; Kambara, Hideki

    2015-02-15

    Analyses of gene expressions in single cells are important for understanding detailed biological phenomena. Here, a highly sensitive and accurate method by sequencing (called "bead-seq") to obtain a whole gene expression profile for a single cell is proposed. A key feature of the method is to use a complementary DNA (cDNA) library on magnetic beads, which enables adding washing steps to remove residual reagents in a sample preparation process. By adding the washing steps, the next steps can be carried out under the optimal conditions without losing cDNAs. Error sources were carefully evaluated to conclude that the first several steps were the key steps. It is demonstrated that bead-seq is superior to the conventional methods for single-cell gene expression analyses in terms of reproducibility, quantitative accuracy, and biases caused during sample preparation and sequencing processes. Copyright © 2014 Elsevier Inc. All rights reserved.

  12. Modeling stochastic noise in gene regulatory systems

    PubMed Central

    Meister, Arwen; Du, Chao; Li, Ye Henry; Wong, Wing Hung

    2014-01-01

    The Master equation is considered the gold standard for modeling the stochastic mechanisms of gene regulation in molecular detail, but it is too complex to solve exactly in most cases, so approximation and simulation methods are essential. However, there is still a lack of consensus about the best way to carry these out. To help clarify the situation, we review Master equation models of gene regulation, theoretical approximations based on an expansion method due to N.G. van Kampen and R. Kubo, and simulation algorithms due to D.T. Gillespie and P. Langevin. Expansion of the Master equation shows that for systems with a single stable steady-state, the stochastic model reduces to a deterministic model in a first-order approximation. Additional theory, also due to van Kampen, describes the asymptotic behavior of multistable systems. To support and illustrate the theory and provide further insight into the complex behavior of multistable systems, we perform a detailed simulation study comparing the various approximation and simulation methods applied to synthetic gene regulatory systems with various qualitative characteristics. The simulation studies show that for large stochastic systems with a single steady-state, deterministic models are quite accurate, since the probability distribution of the solution has a single peak tracking the deterministic trajectory whose variance is inversely proportional to the system size. In multistable stochastic systems, large fluctuations can cause individual trajectories to escape from the domain of attraction of one steady-state and be attracted to another, so the system eventually reaches a multimodal probability distribution in which all stable steady-states are represented proportional to their relative stability. However, since the escape time scales exponentially with system size, this process can take a very long time in large systems. PMID:25632368
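    The Gillespie simulation approach mentioned in this abstract can be illustrated with a minimal sketch. The model below is an assumed toy system, not one of the synthetic networks studied in the paper: a single gene product is made at a constant rate `k_prod` and degraded at rate `k_deg` per molecule, so the deterministic steady state is `k_prod / k_deg`. All names and parameter values are hypothetical.

```python
import random


def gillespie_birth_death(k_prod, k_deg, n0, t_end, rng):
    """Exact stochastic simulation of production and first-order degradation.

    Returns the event times and the molecule count after each event.
    """
    t, n = 0.0, n0
    times, counts = [t], [n]
    while True:
        a_prod = k_prod          # propensity of the production reaction
        a_deg = k_deg * n        # propensity of the degradation reaction
        a_total = a_prod + a_deg
        if a_total == 0.0:
            break                # no reaction can fire
        t += rng.expovariate(a_total)   # waiting time to the next reaction
        if t > t_end:
            break
        # Pick which reaction fires, in proportion to its propensity.
        if rng.random() * a_total < a_prod:
            n += 1
        else:
            n -= 1
        times.append(t)
        counts.append(n)
    return times, counts


def time_average(times, counts, t_end):
    """Time-weighted mean molecule count over [times[0], t_end]."""
    total = 0.0
    for i, n in enumerate(counts):
        t_next = times[i + 1] if i + 1 < len(times) else t_end
        total += n * (t_next - times[i])
    return total / (t_end - times[0])


if __name__ == "__main__":
    rng = random.Random(0)
    times, counts = gillespie_birth_death(10.0, 0.1, 100, 2000.0, rng)
    print(time_average(times, counts, 2000.0))
```

    For a system like this with a single stable steady state, the time-averaged count should stay close to the deterministic value of `k_prod / k_deg`, which is the behavior the abstract describes for monostable systems.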

  13. In silico method for modelling metabolism and gene product expression at genome scale

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lerman, Joshua A.; Hyduke, Daniel R.; Latif, Haythem

    2012-07-03

    Transcription and translation use raw materials and energy generated metabolically to create the macromolecular machinery responsible for all cellular functions, including metabolism. A biochemically accurate model of molecular biology and metabolism will facilitate comprehensive and quantitative computations of an organism's molecular constitution as a function of genetic and environmental parameters. Here we formulate a model of metabolism and macromolecular expression. Prototyping it using the simple microorganism Thermotoga maritima, we show our model accurately simulates variations in cellular composition and gene expression. Moreover, through in silico comparative transcriptomics, the model allows the discovery of new regulons and improving the genome and transcription unit annotations. Our method presents a framework for investigating molecular biology and cellular physiology in silico and may allow quantitative interpretation of multi-omics data sets in the context of an integrated biochemical description of an organism.

  14. Selection and testing of reference genes for accurate RT-qPCR in rice seedlings under iron toxicity.

    PubMed

    Santos, Fabiane Igansi de Castro Dos; Marini, Naciele; Santos, Railson Schreinert Dos; Hoffman, Bianca Silva Fernandes; Alves-Ferreira, Marcio; de Oliveira, Antonio Costa

    2018-01-01

    Reverse Transcription quantitative PCR (RT-qPCR) is a technique for gene expression profiling with high sensitivity and reproducibility. However, to obtain accurate results, it depends on data normalization by using endogenous reference genes whose expression is constitutive or invariable. Although the technique is widely used in plant stress analyses, the stability of reference genes for iron toxicity in rice (Oryza sativa L.) has not been thoroughly investigated. Here, we tested a set of candidate reference genes for use in rice under this stressful condition. The test was performed using four distinct methods: NormFinder, BestKeeper, geNorm and the comparative ΔCt. To achieve reproducible and reliable results, Minimum Information for Publication of Quantitative Real-Time PCR Experiments (MIQE) guidelines were followed. Valid reference genes were found for shoot (P2, OsGAPDH and OsNABP), root (OsEF-1a, P8 and OsGAPDH) and root+shoot (OsNABP, OsGAPDH and P8), enabling us to perform further reliable studies of iron toxicity in both indica and japonica subspecies. The importance of studying genes other than the traditional endogenous genes for use as normalizers is also shown here.
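    One of the four methods named in this abstract, geNorm, ranks candidate reference genes by a stability measure M: for each gene, the average over all other candidates of the standard deviation of the log2 expression ratio across samples. A lower M means more stable expression. The sketch below is an independent illustration of that definition, not the geNorm software; it assumes the inputs are relative expression quantities (already converted from Ct values), uses the sample standard deviation, and the gene names and values are hypothetical.

```python
import math
import statistics


def genorm_m(expr):
    """Return {gene: M} for a dict mapping gene name to relative quantities.

    All genes must be measured in the same samples, in the same order.
    """
    genes = list(expr)
    m_values = {}
    for j in genes:
        pairwise_sd = []
        for k in genes:
            if k == j:
                continue
            # Log2 ratio of the two genes in every sample.
            log_ratios = [math.log2(a / b) for a, b in zip(expr[j], expr[k])]
            pairwise_sd.append(statistics.stdev(log_ratios))
        m_values[j] = sum(pairwise_sd) / len(pairwise_sd)
    return m_values


if __name__ == "__main__":
    # Hypothetical quantities in four samples; geneC varies relative to the others.
    data = {
        "geneA": [1.0, 2.0, 4.0, 8.0],
        "geneB": [2.0, 4.0, 8.0, 16.0],
        "geneC": [1.0, 1.0, 1.0, 1.0],
    }
    for gene, m in sorted(genorm_m(data).items(), key=lambda item: item[1]):
        print(gene, round(m, 3))
```

    In this toy data set geneA and geneB keep a constant ratio, so they score as the more stable pair, while geneC receives the highest M.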

  15. Selection and testing of reference genes for accurate RT-qPCR in rice seedlings under iron toxicity

    PubMed Central

    dos Santos, Fabiane Igansi de Castro; Marini, Naciele; dos Santos, Railson Schreinert; Hoffman, Bianca Silva Fernandes; Alves-Ferreira, Marcio

    2018-01-01

    Reverse Transcription quantitative PCR (RT-qPCR) is a technique for gene expression profiling with high sensitivity and reproducibility. However, to obtain accurate results, it depends on data normalization by using endogenous reference genes whose expression is constitutive or invariable. Although the technique is widely used in plant stress analyses, the stability of reference genes for iron toxicity in rice (Oryza sativa L.) has not been thoroughly investigated. Here, we tested a set of candidate reference genes for use in rice under this stressful condition. The test was performed using four distinct methods: NormFinder, BestKeeper, geNorm and the comparative ΔCt. To achieve reproducible and reliable results, Minimum Information for Publication of Quantitative Real-Time PCR Experiments (MIQE) guidelines were followed. Valid reference genes were found for shoot (P2, OsGAPDH and OsNABP), root (OsEF-1a, P8 and OsGAPDH) and root+shoot (OsNABP, OsGAPDH and P8), enabling us to perform further reliable studies of iron toxicity in both indica and japonica subspecies. The importance of studying genes other than the traditional endogenous genes for use as normalizers is also shown here. PMID:29494624

  16. Multiscale Modeling of Gene-Behavior Associations in an Artificial Neural Network Model of Cognitive Development.

    PubMed

    Thomas, Michael S C; Forrester, Neil A; Ronald, Angelica

    2016-01-01

    In the multidisciplinary field of developmental cognitive neuroscience, statistical associations between levels of description play an increasingly important role. One example of such associations is the observation of correlations between relatively common gene variants and individual differences in behavior. It is perhaps surprising that such associations can be detected despite the remoteness of these levels of description, and the fact that behavior is the outcome of an extended developmental process involving interaction of the whole organism with a variable environment. Given that they have been detected, how do such associations inform cognitive-level theories? To investigate this question, we employed a multiscale computational model of development, using a sample domain drawn from the field of language acquisition. The model comprised an artificial neural network model of past-tense acquisition trained using the backpropagation learning algorithm, extended to incorporate population modeling and genetic algorithms. It included five levels of description: four internal (genetic, network, neurocomputation, behavior) and one external (environment). Since the mechanistic assumptions of the model were known and its operation was relatively transparent, we could evaluate whether cross-level associations gave an accurate picture of causal processes. We established that associations could be detected between artificial genes and behavioral variation, even under polygenic assumptions of a many-to-one relationship between genes and neurocomputational parameters, and when an experience-dependent developmental process interceded between the action of genes and the emergence of behavior. We evaluated these associations with respect to their specificity (to different behaviors, to function vs. structure), to their developmental stability, and to their replicability, as well as considering issues of missing heritability and gene-environment interactions.
We argue that gene

  17. A Nonlinear Model for Gene-Based Gene-Environment Interaction.

    PubMed

    Sa, Jian; Liu, Xu; He, Tao; Liu, Guifen; Cui, Yuehua

    2016-06-04

    A vast amount of literature has confirmed the role of gene-environment (G×E) interaction in the etiology of complex human diseases. Traditional methods are predominantly focused on the analysis of interaction between a single nucleotide polymorphism (SNP) and an environmental variable. Given that genes are the functional units, it is crucial to understand how gene effects (rather than single SNP effects) are influenced by an environmental variable to affect disease risk. Motivated by the increasing awareness of the power of gene-based association analysis over single-variant approaches, in this work we propose a sparse principal component regression (sPCR) model to understand the gene-based G×E interaction effect on complex disease. We first extracted the sparse principal components for SNPs in a gene, then the effect of each principal component was modeled by a varying-coefficient (VC) model. The model can jointly model variants in a gene whose effects are nonlinearly influenced by an environmental variable. In addition, the varying-coefficient sPCR (VC-sPCR) model is readily interpretable, since the sparsity of the principal component loadings indicates the relative importance of the corresponding SNPs in each component. We applied our method to a human birth weight dataset from a Thai population. We analyzed 12,005 genes across 22 chromosomes and found one significant interaction effect using the Bonferroni correction method and one suggestive interaction. The model performance was further evaluated through simulation studies. Our model provides a systems approach to evaluating gene-based G×E interaction.

  18. Intra- and interspecies gene expression models for predicting drug response in canine osteosarcoma.

    PubMed

    Fowles, Jared S; Brown, Kristen C; Hess, Ann M; Duval, Dawn L; Gustafson, Daniel L

    2016-02-19

    Genomics-based predictors of drug response have the potential to improve outcomes associated with cancer therapy. Osteosarcoma (OS), the most common primary bone cancer in dogs, is commonly treated with adjuvant doxorubicin or carboplatin following amputation of the affected limb. We evaluated the use of gene-expression based models built in an intra- or interspecies manner to predict chemosensitivity and treatment outcome in canine OS. Models were built and evaluated using microarray gene expression and drug sensitivity data from human and canine cancer cell lines, and canine OS tumor datasets. The "COXEN" method was utilized to filter gene signatures between human and dog datasets based on strong co-expression patterns. Models were built using linear discriminant analysis via the misclassification penalized posterior algorithm. The best doxorubicin model involved genes identified in human lines that were co-expressed and trained on canine OS tumor data, which accurately predicted clinical outcome in 73% of dogs (p = 0.0262, binomial). The best carboplatin model utilized canine lines for gene identification and model training, with canine OS tumor data for co-expression. Dogs whose treatment matched our predictions had significantly better clinical outcomes than those whose treatment did not (p = 0.0006, Log Rank), and this predictor was significantly associated with longer disease-free intervals in a Cox multivariate analysis (hazard ratio = 0.3102, p = 0.0124). Our data show that intra- and interspecies gene expression models can successfully predict response in canine OS, which may improve outcome in dogs and serve as pre-clinical validation for similar methods in human cancer research.

  19. Fast and Accurate Circuit Design Automation through Hierarchical Model Switching.

    PubMed

    Huynh, Linh; Tagkopoulos, Ilias

    2015-08-21

    In computer-aided biological design, the trifecta of characterized part libraries, accurate models and optimal design parameters is crucial for producing reliable designs. As the number of parts and model complexity increase, however, it becomes exponentially more difficult for any optimization method to search the solution space, hence creating a trade-off that hampers efficient design. To address this issue, we present a hierarchical computer-aided design architecture that uses a two-step approach for biological design. First, a simple model of low computational complexity is used to predict circuit behavior and assess candidate circuit branches through branch-and-bound methods. Then, a complex, nonlinear circuit model is used for a fine-grained search of the reduced solution space, thus achieving more accurate results. Evaluation with a benchmark of 11 circuits and a library of 102 experimental designs with known characterization parameters demonstrates a speed-up of 3 orders of magnitude when compared to other design methods that provide optimality guarantees.

  20. A Machine Learned Classifier That Uses Gene Expression Data to Accurately Predict Estrogen Receptor Status

    PubMed Central

    Bastani, Meysam; Vos, Larissa; Asgarian, Nasimeh; Deschenes, Jean; Graham, Kathryn; Mackey, John; Greiner, Russell

    2013-01-01

    Background Selecting the appropriate treatment for breast cancer requires accurately determining the estrogen receptor (ER) status of the tumor. However, the standard for determining this status, immunohistochemical analysis of formalin-fixed paraffin embedded samples, suffers from numerous technical and reproducibility issues. Assessment of ER-status based on RNA expression can provide more objective, quantitative and reproducible test results. Methods To learn a parsimonious RNA-based classifier of hormone receptor status, we applied a machine learning tool to a training dataset of gene expression microarray data obtained from 176 frozen breast tumors, whose ER-status was determined by applying ASCO-CAP guidelines to standardized immunohistochemical testing of formalin fixed tumor. Results This produced a three-gene classifier that can predict the ER-status of a novel tumor, with a cross-validation accuracy of 93.17±2.44%. When applied to an independent validation set and to four other public databases, some on different platforms, this classifier obtained over 90% accuracy in each. In addition, we found that this prediction rule separated the patients' recurrence-free survival curves with a hazard ratio lower than the one based on the IHC analysis of ER-status. Conclusions Our efficient and parsimonious classifier lends itself to high throughput, highly accurate and low-cost RNA-based assessments of ER-status, suitable for routine high-throughput clinical use. This analytic method provides a proof-of-principle that may be applicable to developing effective RNA-based tests for other biomarkers and conditions. PMID:24312637

  1. A Simple and Accurate Rate-Driven Infiltration Model

    NASA Astrophysics Data System (ADS)

    Cui, G.; Zhu, J.

    2017-12-01

    In this study, we develop a novel Rate-Driven Infiltration Model (RDIMOD) for simulating infiltration into soils. Unlike traditional methods, RDIMOD avoids numerically solving the highly non-linear Richards equation or simply modeling with empirical parameters. RDIMOD employs the infiltration rate as model input to simulate the one-dimensional infiltration process by solving an ordinary differential equation. The model can simulate the evolution of the wetting front, infiltration rate, and cumulative infiltration on any surface slope, including vertical and horizontal directions. In comparisons with results from the Richards equation for both vertical and horizontal infiltration, RDIMOD simply and accurately predicts infiltration processes for any soil type and soil hydraulic model without numerical difficulty. Taking into account its accuracy, capability, and computational effectiveness and stability, RDIMOD can be used in large-scale hydrologic and land-atmosphere modeling.

  2. Accurate electromagnetic modeling of terahertz detectors

    NASA Technical Reports Server (NTRS)

    Focardi, Paolo; McGrath, William R.

    2004-01-01

    Twin slot antennas coupled to superconducting devices have been developed over the years as single pixel detectors in the terahertz (THz) frequency range for space-based and astronomy applications. Used either for mixing or direct detection, they have been the object of several investigations, and are currently being developed for several missions funded or co-funded by NASA. Although they have shown promising performance in terms of noise and sensitivity, so far they have usually also shown considerable disagreement between calculations and measurements, especially in center frequency and bandwidth. In this paper we present a thorough and accurate electromagnetic model of the complete detector and compare the results of calculations with measurements. Starting from a model of the embedding circuit, the effect of all the other elements in the detector on the coupled power has been analyzed. An extensive variety of measured and calculated data, as presented in this paper, demonstrates the effectiveness and reliability of the electromagnetic model at frequencies between 600 GHz and 2.5 THz.

  3. Beyond the Central Dogma: Model-Based Learning of How Genes Determine Phenotypes

    PubMed Central

    Reinagel, Adam; Bray Speth, Elena

    2016-01-01

    In an introductory biology course, we implemented a learner-centered, model-based pedagogy that frequently engaged students in building conceptual models to explain how genes determine phenotypes. Model-building tasks were incorporated within case studies and aimed at eliciting students’ understanding of 1) the origin of variation in a population and 2) how genes/alleles determine phenotypes. Guided by theory on hierarchical development of systems-thinking skills, we scaffolded instruction and assessment so that students would first focus on articulating isolated relationships between pairs of molecular genetics structures and then integrate these relationships into an explanatory network. We analyzed models students generated on two exams to assess whether students’ learning of molecular genetics progressed along the theoretical hierarchical sequence of systems-thinking skills acquisition. With repeated practice, peer discussion, and instructor feedback over the course of the semester, students’ models became more accurate, better contextualized, and more meaningful. At the end of the semester, however, more than 25% of students still struggled to describe phenotype as an output of protein function. We therefore recommend that 1) practices like modeling, which require connecting genes to phenotypes; and 2) well-developed case studies highlighting proteins and their functions, take center stage in molecular genetics instruction. PMID:26903496

  4. Accurate Modeling of Ionospheric Electromagnetic Fields Generated by a Low Altitude VLF Transmitter

    DTIC Science & Technology

    2009-03-31

    AFRL-RV-HA-TR-2009-1055 Accurate Modeling of Ionospheric Electromagnetic Fields Generated by a Low Altitude VLF Transmitter ...m (or even 500 m) at mid to high latitudes. At low latitudes, the FDTD model exhibits variations that make it difficult to determine a reliable...Scientific, Final 3. DATES COVERED (From - To) 02-08-2006 – 31-12-2008 4. TITLE AND SUBTITLE Accurate Modeling of Ionospheric Electromagnetic Fields

  5. Evaluating Gene Set Enrichment Analysis Via a Hybrid Data Model

    PubMed Central

    Hua, Jianping; Bittner, Michael L.; Dougherty, Edward R.

    2014-01-01

    Gene set enrichment analysis (GSA) methods have been widely adopted by biological labs to analyze data and generate hypotheses for validation. Most of the existing comparison studies focus on whether the existing GSA methods can produce accurate P-values; however, practitioners are often more concerned with the correct gene-set ranking generated by the methods. The ranking performance is closely related to two critical goals associated with GSA methods: the ability to reveal biological themes and ensuring reproducibility, especially for small-sample studies. We have conducted a comprehensive simulation study focusing on the ranking performance of seven representative GSA methods. We overcome the limitation on the availability of real data sets by creating hybrid data models from existing large data sets. To build the data model, we pick a master gene from the data set to form the ground truth and artificially generate the phenotype labels. Multiple hybrid data models can be constructed from one data set and multiple data sets of smaller sizes can be generated by resampling the original data set. This approach enables us to generate a large batch of data sets to check the ranking performance of GSA methods. Our simulation study reveals that for the proposed data model, the Q2 type GSA methods have in general better performance than other GSA methods and the global test has the most robust results. The properties of a data set play a critical role in the performance. For the data sets with highly connected genes, all GSA methods suffer significantly in performance. PMID:24558298
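
    The construction described above (pick a master gene, derive phenotype labels from it, then resample to get smaller data sets) can be sketched as follows. This is a minimal illustration, not the authors' procedure: the median cutoff used to turn the master gene into labels, the function names, and the toy expression values are all assumptions.

```python
import random
from statistics import median
from typing import Dict, List, Tuple

Expression = Dict[str, List[float]]  # gene name -> one value per sample


def make_hybrid_labels(expr: Expression, master_gene: str) -> List[int]:
    """Derive synthetic phenotype labels from one 'master' gene.

    Assumption: a sample is labelled 1 when the master gene's expression
    is above its median across samples, otherwise 0.
    """
    values = expr[master_gene]
    cutoff = median(values)
    return [1 if v > cutoff else 0 for v in values]


def resample(
    expr: Expression, labels: List[int], n_samples: int, rng: random.Random
) -> Tuple[Expression, List[int]]:
    """Draw a smaller data set by sampling columns without replacement."""
    idx = rng.sample(range(len(labels)), n_samples)
    sub_expr = {gene: [vals[i] for i in idx] for gene, vals in expr.items()}
    sub_labels = [labels[i] for i in idx]
    return sub_expr, sub_labels


# Toy data set (hypothetical values): two genes measured in six samples.
expr = {
    "geneA": [1.0, 5.0, 2.0, 8.0, 3.0, 9.0],
    "geneB": [4.0, 4.5, 3.9, 4.2, 4.1, 4.4],
}
labels = make_hybrid_labels(expr, "geneA")
small_expr, small_labels = resample(expr, labels, 4, random.Random(0))
```

    Because the labels are generated from a known gene, the gene sets containing that gene form the ground truth against which a method's ranking can be checked.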

  6. An accurate halo model for fitting non-linear cosmological power spectra and baryonic feedback models

    NASA Astrophysics Data System (ADS)

    Mead, A. J.; Peacock, J. A.; Heymans, C.; Joudaki, S.; Heavens, A. F.

    2015-12-01

    We present an optimized variant of the halo model, designed to produce accurate matter power spectra well into the non-linear regime for a wide range of cosmological models. To do this, we introduce physically motivated free parameters into the halo-model formalism and fit these to data from high-resolution N-body simulations. For a variety of Λ cold dark matter (ΛCDM) and wCDM models, the halo-model power is accurate to ≃ 5 per cent for k ≤ 10 h Mpc⁻¹ and z ≤ 2. An advantage of our new halo model is that it can be adapted to account for the effects of baryonic feedback on the power spectrum. We demonstrate this by fitting the halo model to power spectra from the OWLS (OverWhelmingly Large Simulations) hydrodynamical simulation suite via parameters that govern halo internal structure. We are able to fit all feedback models investigated at the 5 per cent level using only two free parameters, and we place limits on the range of these halo parameters for feedback models investigated by the OWLS simulations. Accurate predictions to high k are vital for weak-lensing surveys, and these halo parameters could be considered nuisance parameters to marginalize over in future analyses to mitigate uncertainty regarding the details of feedback. Finally, we investigate how lensing observables predicted by our model compare to those from simulations and from HALOFIT for a range of k-cuts and feedback models and quantify the angular scales at which these effects become important. Code to calculate power spectra from the model presented in this paper can be found at https://github.com/alexander-mead/hmcode.

  7. Allele-sharing models: LOD scores and accurate linkage tests.

    PubMed

    Kong, A; Cox, N J

    1997-11-01

    Starting with a test statistic for linkage analysis based on allele sharing, we propose an associated one-parameter model. Under general missing-data patterns, this model allows exact calculation of likelihood ratios and LOD scores and has been implemented by a simple modification of existing software. Most important, accurate linkage tests can be performed. Using an example, we show that some previously suggested approaches to handling less than perfectly informative data can be unacceptably conservative. Situations in which this model may not perform well are discussed, and an alternative model that requires additional computations is suggested.

  8. Allele-sharing models: LOD scores and accurate linkage tests.

    PubMed Central

    Kong, A; Cox, N J

    1997-01-01

    Starting with a test statistic for linkage analysis based on allele sharing, we propose an associated one-parameter model. Under general missing-data patterns, this model allows exact calculation of likelihood ratios and LOD scores and has been implemented by a simple modification of existing software. Most important, accurate linkage tests can be performed. Using an example, we show that some previously suggested approaches to handling less than perfectly informative data can be unacceptably conservative. Situations in which this model may not perform well are discussed, and an alternative model that requires additional computations is suggested. PMID:9345087

  9. Fast and accurate calculation of dilute quantum gas using Uehling–Uhlenbeck model equation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yano, Ryosuke, E-mail: ryosuke.yano@tokiorisk.co.jp

    The Uehling–Uhlenbeck (U–U) model equation is studied for the fast and accurate calculation of a dilute quantum gas. In particular, the direct simulation Monte Carlo (DSMC) method is used to solve the U–U model equation. DSMC analysis based on the U–U model equation is expected to enable the thermalization to be accurately obtained using a small number of sample particles and the dilute quantum gas dynamics to be calculated in a practical time. Finally, the applicability of DSMC analysis based on the U–U model equation to the fast and accurate calculation of a dilute quantum gas is confirmed by calculating the viscosity coefficient of a Bose gas on the basis of the Green–Kubo expression and the shock layer of a dilute Bose gas around a cylinder.

  10. Combining transcription factor binding affinities with open-chromatin data for accurate gene expression prediction

    PubMed Central

    Schmidt, Florian; Gasparoni, Nina; Gasparoni, Gilles; Gianmoena, Kathrin; Cadenas, Cristina; Polansky, Julia K.; Ebert, Peter; Nordström, Karl; Barann, Matthias; Sinha, Anupam; Fröhler, Sebastian; Xiong, Jieyi; Dehghani Amirabad, Azim; Behjati Ardakani, Fatemeh; Hutter, Barbara; Zipprich, Gideon; Felder, Bärbel; Eils, Jürgen; Brors, Benedikt; Chen, Wei; Hengstler, Jan G.; Hamann, Alf; Lengauer, Thomas; Rosenstiel, Philip; Walter, Jörn; Schulz, Marcel H.

    2017-01-01

    The binding and contribution of transcription factors (TF) to cell specific gene expression is often deduced from open-chromatin measurements to avoid costly TF ChIP-seq assays. Thus, it is important to develop computational methods for accurate TF binding prediction in open-chromatin regions (OCRs). Here, we report a novel segmentation-based method, TEPIC, to predict TF binding by combining sets of OCRs with position weight matrices. TEPIC can be applied to various open-chromatin data, e.g. DNaseI-seq and NOMe-seq. Additionally, Histone-Marks (HMs) can be used to identify candidate TF binding sites. TEPIC computes TF affinities and uses open-chromatin/HM signal intensity as quantitative measures of TF binding strength. Using machine learning, we find low affinity binding sites to improve our ability to explain gene expression variability compared to the standard presence/absence classification of binding sites. Further, we show that both footprints and peaks capture essential TF binding events and lead to a good prediction performance. In our application, gene-based scores computed by TEPIC with one open-chromatin assay nearly reach the quality of several TF ChIP-seq data sets. Finally, these scores correctly predict known transcriptional regulators as illustrated by the application to novel DNaseI-seq and NOMe-seq data for primary human hepatocytes and CD4+ T-cells, respectively. PMID:27899623

  11. Validation of reference genes for accurate normalization of gene expression for real time-quantitative PCR in strawberry fruits using different cultivars and osmotic stresses.

    PubMed

    Galli, Vanessa; Borowski, Joyce Moura; Perin, Ellen Cristina; Messias, Rafael da Silva; Labonde, Julia; Pereira, Ivan dos Santos; Silva, Sérgio Delmar Dos Anjos; Rombaldi, Cesar Valmor

    2015-01-10

    The increasing demand for strawberry (Fragaria×ananassa Duch) fruits is associated mainly with their sensorial characteristics and the content of antioxidant compounds. Nevertheless, strawberry production has been hampered by its sensitivity to abiotic stresses. Therefore, understanding the molecular mechanisms underlying the stress response is of great importance to enable genetic engineering approaches aiming to improve strawberry tolerance. However, the study of gene expression in strawberry requires the use of suitable reference genes. In the present study, seven traditional and novel candidate reference genes were evaluated for transcript normalization in fruits of ten strawberry cultivars and two abiotic stresses, using RefFinder, which integrates the four major currently available software programs: geNorm, NormFinder, BestKeeper and the comparative delta-Ct method. The results indicate that the expression stability is dependent on the experimental conditions. The candidate reference gene DBP (DNA binding protein) was considered the most suitable to normalize expression data in samples of strawberry cultivars and under drought stress condition, and the candidate reference gene HISTH4 (histone H4) was the most stable under osmotic stresses and salt stress. The traditional genes GAPDH (glyceraldehyde-3-phosphate dehydrogenase) and 18S (18S ribosomal RNA) were considered the most unstable genes in all conditions. The expression of the phenylalanine ammonia lyase (PAL) and 9-cis epoxycarotenoid dioxygenase (NCED1) genes was used to further confirm the validated candidate reference genes, showing that the use of an inappropriate reference gene may induce erroneous results. This study is the first survey on the stability of reference genes in strawberry cultivars and osmotic stresses and provides guidelines to obtain more accurate RT-qPCR results for future breeding efforts. Copyright © 2014 Elsevier B.V. All rights reserved.
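
    To make concrete why the choice of reference gene matters, the sketch below applies the standard comparative Ct calculation (fold change = 2^-ΔΔCt), which assumes roughly 100% amplification efficiency. It is not taken from the study: the function name and all Ct values are hypothetical, and it does not reproduce what RefFinder or its component programs compute.

```python
def fold_change(
    ct_target_treated: float,
    ct_ref_treated: float,
    ct_target_control: float,
    ct_ref_control: float,
) -> float:
    """Relative expression of a target gene by the comparative Ct method.

    delta-Ct normalizes the target to the reference gene within a sample;
    delta-delta-Ct compares the treated sample with the control sample.
    """
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** (-(delta_treated - delta_control))


# Stable reference gene: its Ct is 18 in both conditions (hypothetical values).
stable = fold_change(22.0, 18.0, 24.0, 18.0)

# Unstable reference gene: its Ct shifts from 18 to 20 under stress,
# which inflates the apparent induction of the same target gene.
unstable = fold_change(22.0, 20.0, 24.0, 18.0)
```

    With these made-up numbers the same target measurements give a 4-fold change with the stable reference and a 16-fold change with the unstable one, which is the kind of erroneous result the abstract warns about.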

  12. Accurate Modeling of Galaxy Clustering on Small Scales: Testing the Standard ΛCDM + Halo Model

    NASA Astrophysics Data System (ADS)

    Sinha, Manodeep; Berlind, Andreas A.; McBride, Cameron; Scoccimarro, Roman

    2015-01-01

    The large-scale distribution of galaxies can be explained fairly simply by assuming (i) a cosmological model, which determines the dark matter halo distribution, and (ii) a simple connection between galaxies and the halos they inhabit. This conceptually simple framework, called the halo model, has been remarkably successful at reproducing the clustering of galaxies on all scales, as observed in various galaxy redshift surveys. However, none of these previous studies have carefully modeled the systematics and thus truly tested the halo model in a statistically rigorous sense. We present a new accurate and fully numerical halo model framework and test it against clustering measurements from two luminosity samples of galaxies drawn from the SDSS DR7. We show that the simple ΛCDM cosmology + halo model is not able to simultaneously reproduce the galaxy projected correlation function and the group multiplicity function. In particular, the more luminous sample shows significant tension with theory. We discuss the implications of our findings and how this work paves the way for constraining galaxy formation by accurate simultaneous modeling of multiple galaxy clustering statistics.

  13. Accurate path integration in continuous attractor network models of grid cells.

    PubMed

    Burak, Yoram; Fiete, Ila R

    2009-02-01

    Grid cells in the rat entorhinal cortex display strikingly regular firing responses to the animal's position in 2-D space and have been hypothesized to form the neural substrate for dead-reckoning. However, errors accumulate rapidly when velocity inputs are integrated in existing models of grid cell activity. To produce grid-cell-like responses, these models would require frequent resets triggered by external sensory cues. Such inadequacies, shared by various models, cast doubt on the dead-reckoning potential of the grid cell system. Here we focus on the question of accurate path integration, specifically in continuous attractor models of grid cell activity. We show, in contrast to previous models, that continuous attractor models can generate regular triangular grid responses, based on inputs that encode only the rat's velocity and heading direction. We consider the role of the network boundary in the integration performance of the network and show that both periodic and aperiodic networks are capable of accurate path integration, despite important differences in their attractor manifolds. We quantify the rate at which errors in the velocity integration accumulate as a function of network size and intrinsic noise within the network. With a plausible range of parameters and the inclusion of spike variability, our model networks can accurately integrate velocity inputs over a maximum of approximately 10-100 meters and approximately 1-10 minutes. These findings form a proof-of-concept that continuous attractor dynamics may underlie velocity integration in the dorsolateral medial entorhinal cortex. The simulations also generate pertinent upper bounds on the accuracy of integration that may be achieved by continuous attractor dynamics in the grid cell network. We suggest experiments to test the continuous attractor model and differentiate it from models in which single cells establish their responses independently of each other.
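
    The error-accumulation problem that motivates this work can be seen even in a one-dimensional toy example of dead-reckoning with a noisy velocity input. The sketch below is not a continuous attractor network and does not reproduce the paper's model; the function name, time step, and noise level are assumptions chosen for illustration.

```python
import random
from typing import List, Sequence


def integrate_path(
    velocities: Sequence[float], dt: float, noise_sd: float, rng: random.Random
) -> List[float]:
    """Integrate velocity over time and return the absolute position error
    after each step, when each velocity reading carries Gaussian noise."""
    true_pos = 0.0
    est_pos = 0.0
    errors = []
    for v in velocities:
        true_pos += v * dt
        # The estimate only sees a noisy copy of the velocity.
        est_pos += (v + rng.gauss(0.0, noise_sd)) * dt
        errors.append(abs(est_pos - true_pos))
    return errors


velocities = [0.2] * 1000  # constant speed, hypothetical units
errors_clean = integrate_path(velocities, 0.01, 0.0, random.Random(1))
errors_noisy = integrate_path(velocities, 0.01, 0.5, random.Random(1))
```

    With zero noise the estimate tracks the true position exactly; with noise, the error performs a random walk, so without external resets it is expected to grow with the duration of the trajectory.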

  14. Creation of Anatomically Accurate Computer-Aided Design (CAD) Solid Models from Medical Images

    NASA Technical Reports Server (NTRS)

    Stewart, John E.; Graham, R. Scott; Samareh, Jamshid A.; Oberlander, Eric J.; Broaddus, William C.

    1999-01-01

    Most surgical instrumentation and implants used in the world today are designed with sophisticated Computer-Aided Design (CAD)/Computer-Aided Manufacturing (CAM) software. This software automates the mechanical development of a product from its conceptual design through manufacturing. CAD software also provides a means of manipulating solid models prior to Finite Element Modeling (FEM). Few surgical products are designed in conjunction with accurate CAD models of human anatomy because of the difficulty with which these models are created. We have developed a novel technique that creates anatomically accurate, patient specific CAD solids from medical images in a matter of minutes.

  15. Selection of low-variance expressed Malus x domestica (apple) genes for use as quantitative PCR reference genes (housekeepers)

    USDA-ARS's Scientific Manuscript database

    To accurately measure gene expression using PCR-based approaches, there is the need for reference genes that have low variance in expression (housekeeping genes) to normalise the data for RNA quantity and quality. For non-model species such as Malus x domestica (apples), previously, the selection of...

  16. Local Debonding and Fiber Breakage in Composite Materials Modeled Accurately

    NASA Technical Reports Server (NTRS)

    Bednarcyk, Brett A.; Arnold, Steven M.

    2001-01-01

    A prerequisite for full utilization of composite materials in aerospace components is accurate design and life prediction tools that enable the assessment of component performance and reliability. Such tools assist both structural analysts, who design and optimize structures composed of composite materials, and materials scientists who design and optimize the composite materials themselves. NASA Glenn Research Center's Micromechanics Analysis Code with Generalized Method of Cells (MAC/GMC) software package (http://www.grc.nasa.gov/WWW/LPB/mac) addresses this need for composite design and life prediction tools by providing a widely applicable and accurate approach to modeling composite materials. Furthermore, MAC/GMC serves as a platform for incorporating new local models and capabilities that are under development at NASA, thus enabling these new capabilities to progress rapidly to a stage in which they can be employed by the code's end users.

  17. Accurate HLA type inference using a weighted similarity graph.

    PubMed

    Xie, Minzhu; Li, Jing; Jiang, Tao

    2010-12-14

    The human leukocyte antigen system (HLA) contains many highly variable genes. HLA genes play an important role in the human immune system, and HLA gene matching is crucial for the success of human organ transplantations. Numerous studies have demonstrated that variation in HLA genes is associated with many autoimmune, inflammatory and infectious diseases. However, typing HLA genes by serology or PCR is time consuming and expensive, which limits large-scale studies involving HLA genes. Since it is much easier and cheaper to obtain single nucleotide polymorphism (SNP) genotype data, accurate computational algorithms to infer HLA gene types from SNP genotype data are in need. To infer HLA types from SNP genotypes, the first step is to infer SNP haplotypes from genotypes. However, for the same SNP genotype data set, the haplotype configurations inferred by different methods are usually inconsistent, and it is often difficult to decide which one is true. In this paper, we design an accurate HLA gene type inference algorithm by utilizing SNP genotype data from pedigrees, known HLA gene types of some individuals and the relationship between inferred SNP haplotypes and HLA gene types. Given a set of haplotypes inferred from the genotypes of a population consisting of many pedigrees, the algorithm first constructs a weighted similarity graph based on a new haplotype similarity measure and derives constraint edges from known HLA gene types. Based on the principle that different HLA gene alleles should have different background haplotypes, the algorithm searches for an optimal labeling of all the haplotypes with unknown HLA gene types such that the total weight among the same HLA gene types is maximized. To deal with ambiguous haplotype solutions, we use a genetic algorithm to select haplotype configurations that tend to maximize the same optimization criterion. Our experiments on a previously typed subset of the HapMap data show that the algorithm is highly accurate

  18. Accurate Encoding and Decoding by Single Cells: Amplitude Versus Frequency Modulation

    PubMed Central

    Micali, Gabriele; Aquino, Gerardo; Richards, David M.; Endres, Robert G.

    2015-01-01

    Cells sense external concentrations and, via biochemical signaling, respond by regulating the expression of target proteins. Both in signaling networks and gene regulation there are two main mechanisms by which the concentration can be encoded internally: amplitude modulation (AM), where the absolute concentration of an internal signaling molecule encodes the stimulus, and frequency modulation (FM), where the period between successive bursts represents the stimulus. Although both mechanisms have been observed in biological systems, the question of when it is beneficial for cells to use either AM or FM is largely unanswered. Here, we first consider a simple model for a single receptor (or ion channel), which can either signal continuously whenever a ligand is bound, or produce a burst in signaling molecule upon receptor binding. We find that bursty signaling is more accurate than continuous signaling only for sufficiently fast dynamics. This suggests that modulation based on bursts may be more common in signaling networks than in gene regulation. We then extend our model to multiple receptors, where continuous and bursty signaling are equivalent to AM and FM respectively, finding that AM is always more accurate. This implies that the reason some cells use FM is related to factors other than accuracy, such as the ability to coordinate expression of multiple genes or to implement threshold crossing mechanisms. PMID:26030820

  19. MADGiC: a model-based approach for identifying driver genes in cancer

    PubMed Central

    Korthauer, Keegan D.; Kendziorski, Christina

    2015-01-01

    Motivation: Identifying and prioritizing somatic mutations is an important and challenging area of cancer research that can provide new insights into gene function as well as new targets for drug development. Most methods for prioritizing mutations rely primarily on frequency-based criteria, where a gene is identified as having a driver mutation if it is altered in significantly more samples than expected according to a background model. Although useful, frequency-based methods are limited in that all mutations are treated equally. It is well known, however, that some mutations have no functional consequence, while others may have a major deleterious impact. The spatial pattern of mutations within a gene provides further insight into their functional consequence. Properly accounting for these factors improves both the power and accuracy of inference. Also important is an accurate background model. Results: Here, we develop a Model-based Approach for identifying Driver Genes in Cancer (termed MADGiC) that incorporates both frequency and functional impact criteria and accommodates a number of factors to improve the background model. Simulation studies demonstrate advantages of the approach, including a substantial increase in power over competing methods. Further advantages are illustrated in an analysis of ovarian and lung cancer data from The Cancer Genome Atlas (TCGA) project. Availability and implementation: R code to implement this method is available at http://www.biostat.wisc.edu/ kendzior/MADGiC/. Contact: kendzior@biostat.wisc.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25573922
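
    The frequency-based criterion that the abstract contrasts with MADGiC can be illustrated with a simple binomial tail probability. This sketch is not MADGiC and not its background model: it assumes a single, constant per-sample background mutation probability for the gene, and the sample counts and rate are hypothetical.

```python
from math import comb


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p): the chance of seeing the gene
    mutated in at least k of n samples under the background rate p."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical gene: mutated in 8 of 100 tumors, with an assumed
# background probability of 0.02 per sample.
p_value = binomial_upper_tail(8, 100, 0.02)
```

    Every mutation counts equally in this test, which is the limitation the abstract points out: it ignores functional impact and the spatial pattern of mutations within the gene.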

  20. Combining transcription factor binding affinities with open-chromatin data for accurate gene expression prediction.

    PubMed

    Schmidt, Florian; Gasparoni, Nina; Gasparoni, Gilles; Gianmoena, Kathrin; Cadenas, Cristina; Polansky, Julia K; Ebert, Peter; Nordström, Karl; Barann, Matthias; Sinha, Anupam; Fröhler, Sebastian; Xiong, Jieyi; Dehghani Amirabad, Azim; Behjati Ardakani, Fatemeh; Hutter, Barbara; Zipprich, Gideon; Felder, Bärbel; Eils, Jürgen; Brors, Benedikt; Chen, Wei; Hengstler, Jan G; Hamann, Alf; Lengauer, Thomas; Rosenstiel, Philip; Walter, Jörn; Schulz, Marcel H

    2017-01-09

    The binding and contribution of transcription factors (TF) to cell specific gene expression is often deduced from open-chromatin measurements to avoid costly TF ChIP-seq assays. Thus, it is important to develop computational methods for accurate TF binding prediction in open-chromatin regions (OCRs). Here, we report a novel segmentation-based method, TEPIC, to predict TF binding by combining sets of OCRs with position weight matrices. TEPIC can be applied to various open-chromatin data, e.g. DNaseI-seq and NOMe-seq. Additionally, Histone-Marks (HMs) can be used to identify candidate TF binding sites. TEPIC computes TF affinities and uses open-chromatin/HM signal intensity as quantitative measures of TF binding strength. Using machine learning, we find low affinity binding sites to improve our ability to explain gene expression variability compared to the standard presence/absence classification of binding sites. Further, we show that both footprints and peaks capture essential TF binding events and lead to a good prediction performance. In our application, gene-based scores computed by TEPIC with one open-chromatin assay nearly reach the quality of several TF ChIP-seq data sets. Finally, these scores correctly predict known transcriptional regulators as illustrated by the application to novel DNaseI-seq and NOMe-seq data for primary human hepatocytes and CD4+ T-cells, respectively. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  1. Accurate protein structure modeling using sparse NMR data and homologous structure information.

    PubMed

    Thompson, James M; Sgourakis, Nikolaos G; Liu, Gaohua; Rossi, Paolo; Tang, Yuefeng; Mills, Jeffrey L; Szyperski, Thomas; Montelione, Gaetano T; Baker, David

    2012-06-19

    While information from homologous structures plays a central role in X-ray structure determination by molecular replacement, such information is rarely used in NMR structure determination because it can be incorrect, both locally and globally, when evolutionary relationships are inferred incorrectly or there has been considerable evolutionary structural divergence. Here we describe a method that allows robust modeling of protein structures of up to 225 residues by combining ¹Hᴺ, ¹³C, and ¹⁵N backbone and ¹³Cβ chemical shift data, distance restraints derived from homologous structures, and a physically realistic all-atom energy function. Accurate models are distinguished from inaccurate models generated using incorrect sequence alignments by requiring that (i) the all-atom energies of models generated using the restraints are lower than models generated in unrestrained calculations and (ii) the low-energy structures converge to within 2.0 Å backbone rmsd over 75% of the protein. Benchmark calculations on known structures and blind targets show that the method can accurately model protein structures, even with very remote homology information, to a backbone rmsd of 1.2-1.9 Å relative to the conventionally determined NMR ensembles and of 0.9-1.6 Å relative to X-ray structures for well-defined regions of the protein structures. This approach facilitates the accurate modeling of protein structures using backbone chemical shift data without need for side-chain resonance assignments and extensive analysis of NOESY cross-peak assignments.

  2. Modeling Bi-modality Improves Characterization of Cell Cycle on Gene Expression in Single Cells

    PubMed Central

    Danaher, Patrick; Finak, Greg; Krouse, Michael; Wang, Alice; Webster, Philippa; Beechem, Joseph; Gottardo, Raphael

    2014-01-01

    Advances in high-throughput, single cell gene expression are allowing interrogation of cell heterogeneity. However, there is concern that the cell cycle phase of a cell might bias characterizations of gene expression at the single-cell level. We assess the effect of cell cycle phase on gene expression in single cells by measuring 333 genes in 930 cells across three phases and three cell lines. We determine each cell's phase non-invasively without chemical arrest and use it as a covariate in tests of differential expression. We observe bi-modal gene expression, a previously-described phenomenon, wherein the expression of otherwise abundant genes is either strongly positive, or undetectable within individual cells. This bi-modality is likely both biologically and technically driven. Irrespective of its source, we show that it should be modeled to draw accurate inferences from single cell expression experiments. To this end, we propose a semi-continuous modeling framework based on the generalized linear model, and use it to characterize genes with consistent cell cycle effects across three cell lines. Our new computational framework improves the detection of previously characterized cell-cycle genes compared to approaches that do not account for the bi-modality of single-cell data. We use our semi-continuous modelling framework to estimate single cell gene co-expression networks. These networks suggest that in addition to having phase-dependent shifts in expression (when averaged over many cells), some, but not all, canonical cell cycle genes tend to be co-expressed in groups in single cells. We estimate the amount of single cell expression variability attributable to the cell cycle. We find that the cell cycle explains only 5%–17% of expression variability, suggesting that the cell cycle will not tend to be a large nuisance factor in analysis of the single cell transcriptome. PMID:25032992

  3. Reverse transcription quantitative real-time polymerase chain reaction reference genes in the spared nerve injury model of neuropathic pain: validation and literature search.

    PubMed

    Piller, Nicolas; Decosterd, Isabelle; Suter, Marc R

    2013-07-10

    The reverse transcription quantitative real-time polymerase chain reaction (RT-qPCR) is a widely used, highly sensitive laboratory technique to rapidly and easily detect, identify and quantify gene expression. Reliable RT-qPCR data necessitates accurate normalization with validated control genes (reference genes) whose expression is constant in all studied conditions. This stability has to be demonstrated. We performed a literature search for studies using quantitative or semi-quantitative PCR in the rat spared nerve injury (SNI) model of neuropathic pain to verify whether any reference genes had previously been validated. We then analyzed the stability over time of 7 commonly used reference genes in the nervous system - specifically in the spinal cord dorsal horn and the dorsal root ganglion (DRG). These were: Actin beta (Actb), Glyceraldehyde-3-phosphate dehydrogenase (GAPDH), ribosomal proteins 18S (18S), L13a (RPL13a) and L29 (RPL29), hypoxanthine phosphoribosyltransferase 1 (HPRT1) and hydroxymethylbilane synthase (HMBS). We compared the candidate genes and established a stability ranking using the geNorm algorithm. Finally, we assessed the number of reference genes necessary for accurate normalization in this neuropathic pain model. We found GAPDH, HMBS, Actb, HPRT1 and 18S cited as reference genes in the literature on studies using the SNI model. Only HPRT1 and 18S had each previously been demonstrated as stable once, in RT-qPCR arrays. All the genes tested in this study, using the geNorm algorithm, presented gene stability values (M-value) acceptable enough for them to qualify as potential reference genes in both DRG and spinal cord. Using the coefficient of variation, 18S failed the 50% cut-off with a value of 61% in the DRG. The two most stable genes in the dorsal horn were RPL29 and RPL13a; in the DRG they were HPRT1 and Actb. Using a 0.15 cut-off for pairwise variations we found that any pair of stable reference genes was sufficient for the normalization process.
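
    The geNorm stability ranking mentioned above can be illustrated with a minimal sketch. It assumes the commonly described definition of the M-value (for each gene, the mean over all other genes of the standard deviation of the log2 expression ratio across samples); the function name `genorm_m`, the use of the sample standard deviation, and the toy input are assumptions made for illustration, not details taken from the record.

```python
import numpy as np

def genorm_m(quantities: np.ndarray) -> np.ndarray:
    """Return one stability value M per gene (lower means more stable).

    quantities: array of shape (samples, genes) holding positive relative
    expression quantities (not raw Cq values).
    """
    logq = np.log2(quantities)
    n_genes = logq.shape[1]
    m = np.zeros(n_genes)
    for j in range(n_genes):
        # Pairwise variation: SD across samples of the log2 ratio gene j / gene k.
        # ddof=1 (sample SD) is an assumption of this sketch.
        sds = [np.std(logq[:, j] - logq[:, k], ddof=1)
               for k in range(n_genes) if k != j]
        m[j] = np.mean(sds)
    return m

# Hypothetical toy data: genes 0 and 1 vary in lockstep, gene 2 does not.
toy = np.array([[1.0, 2.0, 1.0],
                [2.0, 4.0, 1.0],
                [4.0, 8.0, 8.0]])
m_values = genorm_m(toy)
```

    In this toy input the two co-varying genes receive the lower M-values, which is the behaviour a stability ranking relies on.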

  4. Efficient Exploration of the Space of Reconciled Gene Trees

    PubMed Central

    Szöllősi, Gergely J.; Rosikiewicz, Wojciech; Boussau, Bastien; Tannier, Eric; Daubin, Vincent

    2013-01-01

    Gene trees record the combination of gene-level events, such as duplication, transfer and loss (DTL), and species-level events, such as speciation and extinction. Gene tree–species tree reconciliation methods model these processes by drawing gene trees into the species tree using a series of gene and species-level events. The reconstruction of gene trees based on sequence alone almost always involves choosing between statistically equivalent or weakly distinguishable relationships that could be much better resolved based on a putative species tree. To exploit this potential for accurate reconstruction of gene trees, the space of reconciled gene trees must be explored according to a joint model of sequence evolution and gene tree–species tree reconciliation. Here we present amalgamated likelihood estimation (ALE), a probabilistic approach to exhaustively explore all reconciled gene trees that can be amalgamated as a combination of clades observed in a sample of gene trees. We implement the ALE approach in the context of a reconciliation model (Szöllősi et al. 2013), which allows for the DTL of genes. We use ALE to efficiently approximate the sum of the joint likelihood over amalgamations and to find the reconciled gene tree that maximizes the joint likelihood among all such trees. We demonstrate using simulations that gene trees reconstructed using the joint likelihood are substantially more accurate than those reconstructed using sequence alone. Using realistic gene tree topologies, branch lengths, and alignment sizes, we demonstrate that ALE produces more accurate gene trees even if the model of sequence evolution is greatly simplified. Finally, examining 1099 gene families from 36 cyanobacterial genomes we find that joint likelihood-based inference results in a striking reduction in apparent phylogenetic discord, with 24%, 59%, and 46% reductions, respectively, in the mean numbers of duplications, transfers, and losses per gene family. The open source

  5. Accurate modeling of the hose instability in plasma wakefield accelerators

    DOE PAGES

    Mehrling, T. J.; Benedetti, C.; Schroeder, C. B.; ...

    2018-05-20

    Hosing is a major challenge for the applicability of plasma wakefield accelerators and its modeling is therefore of fundamental importance to facilitate future stable and compact plasma-based particle accelerators. In this contribution, we present a new model for the evolution of the plasma centroid, which enables the accurate investigation of the hose instability in the nonlinear blowout regime. Lastly, it paves the road for more precise and comprehensive studies of hosing, e.g., with drive and witness beams, which were not possible with previous models.

  6. Accurate modeling of the hose instability in plasma wakefield accelerators

    NASA Astrophysics Data System (ADS)

    Mehrling, T. J.; Benedetti, C.; Schroeder, C. B.; Martinez de la Ossa, A.; Osterhoff, J.; Esarey, E.; Leemans, W. P.

    2018-05-01

    Hosing is a major challenge for the applicability of plasma wakefield accelerators and its modeling is therefore of fundamental importance to facilitate future stable and compact plasma-based particle accelerators. In this contribution, we present a new model for the evolution of the plasma centroid, which enables the accurate investigation of the hose instability in the nonlinear blowout regime. It paves the road for more precise and comprehensive studies of hosing, e.g., with drive and witness beams, which were not possible with previous models.

  7. Accurate modeling of the hose instability in plasma wakefield accelerators

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mehrling, T. J.; Benedetti, C.; Schroeder, C. B.

    Hosing is a major challenge for the applicability of plasma wakefield accelerators and its modeling is therefore of fundamental importance to facilitate future stable and compact plasma-based particle accelerators. In this contribution, we present a new model for the evolution of the plasma centroid, which enables the accurate investigation of the hose instability in the nonlinear blowout regime. Lastly, it paves the road for more precise and comprehensive studies of hosing, e.g., with drive and witness beams, which were not possible with previous models.

  8. Accurate analytical modeling of junctionless DG-MOSFET by Green's function approach

    NASA Astrophysics Data System (ADS)

    Nandi, Ashutosh; Pandey, Nilesh

    2017-11-01

    An accurate analytical model of the junctionless double gate MOSFET (JL-DG-MOSFET) in the subthreshold regime of operation is developed in this work using a Green's function approach. The approach considers 2-D mixed boundary conditions and multi-zone techniques to provide an exact analytical solution to the 2-D Poisson's equation. The Fourier coefficients are calculated to derive the potential equations that are further used to model the channel current and subthreshold slope of the device. The threshold voltage roll-off is computed from parallel shifts of Ids-Vgs curves between the long-channel and short-channel devices. It is observed that the Green's function approach of solving the 2-D Poisson's equation in both the oxide and silicon regions can accurately predict channel potential, subthreshold current (Isub), threshold voltage (Vt) roll-off and subthreshold slope (SS) of both long- and short-channel devices designed with different doping concentrations and higher as well as lower tsi/tox ratios. All the analytical model results are verified through comparisons with TCAD Sentaurus simulation results. It is observed that the model matches quite well with TCAD device simulations.

  9. A new accurate quadratic equation model for isothermal gas chromatography and its comparison with the linear model

    PubMed Central

    Wu, Liejun; Chen, Maoxue; Chen, Yongli; Li, Qing X.

    2013-01-01

    The gas holdup time (tM) is a dominant parameter in gas chromatographic retention models. The difference equation (DE) model proposed by Wu et al. (J. Chromatogr. A 2012, http://dx.doi.org/10.1016/j.chroma.2012.07.077) excluded tM. In the present paper, we propose that the relationship between the adjusted retention time tRZ′ and carbon number z of n-alkanes follows a quadratic equation (QE) when an accurate tM is obtained. This QE model is the same as or better than the DE model for an accurate expression of the retention behavior of n-alkanes and model applications. The QE model covers a larger range of n-alkanes with better curve fittings than the linear model. The accuracy of the QE model was approximately 2–6 times better than the DE model and 18–540 times better than the LE model. Standard deviations of the QE model were approximately 2–3 times smaller than those of the DE model. PMID:22989489

  10. Likelihood-Based Gene Annotations for Gap Filling and Quality Assessment in Genome-Scale Metabolic Models

    PubMed Central

    Benedict, Matthew N.; Mundy, Michael B.; Henry, Christopher S.; Chia, Nicholas; Price, Nathan D.

    2014-01-01

    Genome-scale metabolic models provide a powerful means to harness information from genomes to deepen biological insights. With exponentially increasing sequencing capacity, there is an enormous need for automated reconstruction techniques that can provide more accurate models in a short time frame. Current methods for automated metabolic network reconstruction rely on gene and reaction annotations to build draft metabolic networks and algorithms to fill gaps in these networks. However, automated reconstruction is hampered by database inconsistencies, incorrect annotations, and gap filling largely without considering genomic information. Here we develop an approach for applying genomic information to predict alternative functions for genes and estimate their likelihoods from sequence homology. We show that computed likelihood values were significantly higher for annotations found in manually curated metabolic networks than those that were not. We then apply these alternative functional predictions to estimate reaction likelihoods, which are used in a new gap filling approach called likelihood-based gap filling to predict more genomically consistent solutions. To validate the likelihood-based gap filling approach, we applied it to models where essential pathways were removed, finding that likelihood-based gap filling identified more biologically relevant solutions than parsimony-based gap filling approaches. We also demonstrate that models gap filled using likelihood-based gap filling provide greater coverage and genomic consistency with metabolic gene functions compared to parsimony-based approaches. Interestingly, despite these findings, we found that likelihoods did not significantly affect consistency of gap filled models with Biolog and knockout lethality data. This indicates that the phenotype data alone cannot necessarily be used to discriminate between alternative solutions for gap filling and therefore, that the use of other information is necessary to

  11. Inferring landscape effects on gene flow: A new model selection framework

    Treesearch

    A. J. Shirk; D. O. Wallin; S. A. Cushman; C. G. Rice; K. I. Warheit

    2010-01-01

    Populations in fragmented landscapes experience reduced gene flow, lose genetic diversity over time and ultimately face greater extinction risk. Improving connectivity in fragmented landscapes is now a major focus of conservation biology. Designing effective wildlife corridors for this purpose, however, requires an accurate understanding of how landscapes shape gene...

  12. Accurate monoenergetic electron parameters of laser wakefield in a bubble model

    NASA Astrophysics Data System (ADS)

    Raheli, A.; Rahmatallahpur, S. H.

    2012-11-01

    A reliable analytical expression for the potential of plasma waves with phase velocities near the speed of light is derived. The presented spheroid cavity model is more consistent than the previous spherical and ellipsoidal models and it explains the mono-energetic electron trajectory more accurately, especially in the relativistic region. As a result, the quasi-mono-energetic electron output beam interacting with the laser plasma can be more appropriately described with this model.

  13. Indexed variation graphs for efficient and accurate resistome profiling.

    PubMed

    Rowe, Will P M; Winn, Martyn D

    2018-05-14

    Antimicrobial resistance remains a major threat to global health. Profiling the collective antimicrobial resistance genes within a metagenome (the "resistome") facilitates greater understanding of antimicrobial resistance gene diversity and dynamics. In turn, this can allow for gene surveillance, individualised treatment of bacterial infections and more sustainable use of antimicrobials. However, resistome profiling can be complicated by high similarity between reference genes, as well as the sheer volume of sequencing data and the complexity of analysis workflows. We have developed an efficient and accurate method for resistome profiling that addresses these complications and improves upon currently available tools. Our method combines a variation graph representation of gene sets with an LSH Forest indexing scheme to allow for fast classification of metagenomic sequence reads using similarity-search queries. Subsequent hierarchical local alignment of classified reads against graph traversals enables accurate reconstruction of full-length gene sequences using a scoring scheme. We provide our implementation, GROOT, and show it to be both faster and more accurate than a current reference-dependent tool for resistome profiling. GROOT runs on a laptop and can process a typical 2 gigabyte metagenome in 2 minutes using a single CPU. Our method is not restricted to resistome profiling and has the potential to improve current metagenomic workflows. GROOT is written in Go and is available at https://github.com/will-rowe/groot (MIT license). will.rowe@stfc.ac.uk. Supplementary data are available at Bioinformatics online.
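
    The similarity-search idea behind classifying reads against an index can be illustrated with a much simpler stand-in: plain MinHash signatures over k-mers, compared pairwise. This is only a sketch of the general principle and not GROOT's implementation (which the abstract describes as a variation graph combined with an LSH Forest index); the function names, the k-mer size, the number of hash functions and the use of SHA-1 are all assumptions made for illustration.

```python
import hashlib

def kmers(seq: str, k: int = 7) -> set:
    """Set of all k-mers in seq (assumes len(seq) >= k)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def _hash(kmer: str, seed: int) -> int:
    """Deterministic 64-bit hash of a k-mer under a given seed."""
    digest = hashlib.sha1(f"{seed}:{kmer}".encode()).digest()
    return int.from_bytes(digest[:8], "little")

def minhash_signature(seq: str, k: int = 7, num_hashes: int = 64) -> list:
    """For each seeded hash function, keep the minimum hash over all k-mers."""
    ks = kmers(seq, k)
    return [min(_hash(km, seed) for km in ks) for seed in range(num_hashes)]

def estimated_jaccard(sig_a: list, sig_b: list) -> float:
    """Fraction of matching signature slots, an estimate of k-mer Jaccard similarity."""
    matches = sum(1 for a, b in zip(sig_a, sig_b) if a == b)
    return matches / len(sig_a)

# Hypothetical sequences for illustration only.
read = "ATGGCTAGCTAGGATCCGATCGATTACG"
same = minhash_signature(read)
other = minhash_signature("CCCCCCCCCCCCCCCCCCCCCCCCCCCC")
sim_same = estimated_jaccard(minhash_signature(read), same)
sim_other = estimated_jaccard(minhash_signature(read), other)
```

    An index such as an LSH Forest avoids the all-against-all comparison shown here by retrieving only candidate signatures that are likely to be similar to the query.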

  14. A pairwise maximum entropy model accurately describes resting-state human brain networks

    PubMed Central

    Watanabe, Takamitsu; Hirose, Satoshi; Wada, Hiroyuki; Imai, Yoshio; Machida, Toru; Shirouzu, Ichiro; Konishi, Seiki; Miyashita, Yasushi; Masuda, Naoki

    2013-01-01

    The resting-state human brain networks underlie fundamental cognitive functions and consist of complex interactions among brain regions. However, the level of complexity of the resting-state networks has not been quantified, which has prevented comprehensive descriptions of the brain activity as an integrative system. Here, we address this issue by demonstrating that a pairwise maximum entropy model, which takes into account region-specific activity rates and pairwise interactions, can be robustly and accurately fitted to resting-state human brain activities obtained by functional magnetic resonance imaging. Furthermore, to validate the approximation of the resting-state networks by the pairwise maximum entropy model, we show that the functional interactions estimated by the pairwise maximum entropy model reflect anatomical connexions more accurately than the conventional functional connectivity method. These findings indicate that a relatively simple statistical model not only captures the structure of the resting-state networks but also provides a possible method to derive physiological information about various large-scale brain networks. PMID:23340410
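
    For orientation, a pairwise maximum entropy model is usually written in the following general form. The symbols are assumptions introduced here and do not appear in the abstract: \sigma_i is the binarized activity of region i, h_i the region-specific activity parameter, and J_{ij} the pairwise interaction between regions i and j.

```latex
% General form of a pairwise maximum entropy (Ising-type) model over N regions.
% Symbols are illustrative assumptions, not notation taken from the record.
P(\sigma_1,\dots,\sigma_N)
  = \frac{\exp\!\left[-E(\sigma_1,\dots,\sigma_N)\right]}
         {\sum_{\sigma'} \exp\!\left[-E(\sigma'_1,\dots,\sigma'_N)\right]},
\qquad
E(\sigma_1,\dots,\sigma_N)
  = -\sum_{i=1}^{N} h_i \sigma_i
    - \frac{1}{2}\sum_{i \neq j} J_{ij}\,\sigma_i \sigma_j .
```

    Fitting such a model amounts to choosing h_i and J_{ij} so that the model reproduces the observed mean activity of each region and the observed pairwise co-activations.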

  15. Accurate, low-cost 3D-models of gullies

    NASA Astrophysics Data System (ADS)

    Onnen, Nils; Gronz, Oliver; Ries, Johannes B.; Brings, Christine

    2015-04-01

    Soil erosion is a widespread problem in arid and semi-arid areas. The most severe form is gully erosion. Gullies often cut into agricultural farmland and can make an area completely unproductive. To understand the development and processes inside and around gullies, we calculated detailed 3D-models of gullies in the Souss Valley in South Morocco. Near Taroudant, we had four study areas with five gullies differing in size, volume and activity. By using a Canon HF G30 Camcorder, we made varying series of Full HD videos at 25 fps. Afterwards, we used the method Structure from Motion (SfM) to create the models. To generate accurate models while maintaining feasible runtimes, it is necessary to select around 1500-1700 images from the video, while the overlap of neighboring images should be at least 80%. In addition, it is very important to avoid selecting photos that are blurry or out of focus. Nearby pixels of a blurry image tend to have similar color values. That is why we used a MATLAB script to compare the derivatives of the images. The higher the sum of the derivative, the sharper an image of similar objects. MATLAB subdivides the video into image intervals. From each interval, the image with the highest sum is selected. For example, a 20 min video at 25 fps equals 30,000 single images. The program inspects the first 20 images, saves the sharpest and moves on to the next 20 images, and so on. Using this algorithm, we selected 1500 images for our modeling. With VisualSFM, we calculated features and the matches between all images and produced a point cloud. MeshLab was then used to build a surface out of it using the Poisson surface reconstruction approach. Afterwards we are able to calculate the size and the volume of the gullies. It is also possible to determine soil erosion rates, if we compare the data with old recordings. The final step would be the combination of the terrestrial data with the data from our aerial photography. So far, the method works well and we
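
    The frame-selection step described above can be sketched as follows. The authors report using a MATLAB script; this Python version is only an illustration of the same idea under stated assumptions: frames are grayscale arrays, sharpness is taken as the sum of absolute first differences along both image axes, and the names `sharpness` and `select_sharpest` are hypothetical.

```python
import numpy as np

def sharpness(frame: np.ndarray) -> float:
    """Sum of absolute first differences along both image axes.

    Blurry frames have similar neighbouring pixel values, so this sum is small.
    """
    f = frame.astype(np.float64)
    dy = np.abs(np.diff(f, axis=0)).sum()
    dx = np.abs(np.diff(f, axis=1)).sum()
    return float(dx + dy)

def select_sharpest(frames, interval: int = 20) -> list:
    """Return the index of the sharpest frame in each consecutive interval."""
    picks = []
    for start in range(0, len(frames), interval):
        block = frames[start:start + interval]
        scores = [sharpness(f) for f in block]
        picks.append(start + int(np.argmax(scores)))
    return picks

# Arithmetic from the abstract: 20 min of video at 25 fps, one pick per 20 frames.
n_frames = 20 * 60 * 25
n_selected = n_frames // 20

# Toy check with synthetic frames: flat frames plus two high-contrast ones.
checker = np.indices((8, 8)).sum(axis=0) % 2
toy_frames = [np.zeros((8, 8)) for _ in range(40)]
toy_frames[7] = checker
toy_frames[33] = checker
toy_picks = select_sharpest(toy_frames, interval=20)
```

    With an interval of 20 frames, the 30,000 frames of a 20 min video reduce to the 1500 images the abstract reports selecting.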

  16. Proteogenomics produces comprehensive and highly accurate protein-coding gene annotation in a complete genome assembly of Malassezia sympodialis

    PubMed Central

    Tellgren-Roth, Christian; Baudo, Charles D.; Kennell, John C.; Sun, Sheng; Billmyre, R. Blake; Schröder, Markus S.; Andersson, Anna; Holm, Tina; Sigurgeirsson, Benjamin; Wu, Guangxi; Sankaranarayanan, Sundar Ram; Siddharthan, Rahul; Sanyal, Kaustuv; Lundeberg, Joakim; Nystedt, Björn; Boekhout, Teun; Dawson, Thomas L.; Heitman, Joseph

    2017-01-01

    Abstract Complete and accurate genome assembly and annotation is a crucial foundation for comparative and functional genomics. Despite this, few complete eukaryotic genomes are available, and genome annotation remains a major challenge. Here, we present a complete genome assembly of the skin commensal yeast Malassezia sympodialis and demonstrate how proteogenomics can substantially improve gene annotation. Through long-read DNA sequencing, we obtained a gap-free genome assembly for M. sympodialis (ATCC 42132), comprising eight nuclear and one mitochondrial chromosome. We also sequenced and assembled four M. sympodialis clinical isolates, and showed their value for understanding Malassezia reproduction by confirming four alternative allele combinations at the two mating-type loci. Importantly, we demonstrated how proteomics data could be readily integrated with transcriptomics data in standard annotation tools. This increased the number of annotated protein-coding genes by 14% (from 3612 to 4113), compared to using transcriptomics evidence alone. Manual curation further increased the number of protein-coding genes by 9% (to 4493). All of these genes have RNA-seq evidence and 87% were confirmed by proteomics. The M. sympodialis genome assembly and annotation presented here is at a quality yet achieved only for a few eukaryotic organisms, and constitutes an important reference for future host-microbe interaction studies. PMID:28100699

  17. An Accurate Absorption-Based Net Primary Production Model for the Global Ocean

    NASA Astrophysics Data System (ADS)

    Silsbe, G.; Westberry, T. K.; Behrenfeld, M. J.; Halsey, K.; Milligan, A.

    2016-02-01

    As a vital living link in the global carbon cycle, understanding how net primary production (NPP) varies through space, time, and across climatic oscillations (e.g. ENSO) is a key objective in oceanographic research. The continual improvement of ocean observing satellites and data analytics now presents greater opportunities for advanced understanding and characterization of the factors regulating NPP. In particular, the emergence of spectral inversion algorithms now permits accurate retrievals of the phytoplankton absorption coefficient (aΦ) from space. As NPP is the efficiency with which absorbed energy is converted into carbon biomass, aΦ measurements circumvent chlorophyll-based empirical approaches by permitting direct and accurate measurements of phytoplankton energy absorption. It has long been recognized, and perhaps underappreciated, that NPP and phytoplankton growth rates display muted variability when normalized to aΦ rather than chlorophyll. Here we present a novel absorption-based NPP model that parameterizes the underlying physiological mechanisms behind this muted variability, and apply this physiological model to the global ocean. Through a comparison against field data from the Hawaii and Bermuda Ocean Time Series, we demonstrate how this approach yields more accurate NPP measurements than other published NPP models. By normalizing NPP to satellite estimates of phytoplankton carbon biomass, this presentation also explores the seasonality of phytoplankton growth rates across several oceanic regions. Finally, we discuss how future advances in remote-sensing (e.g. hyperspectral satellites, LIDAR, autonomous profilers) can be exploited to further improve absorption-based NPP models.

  18. Generalized functional linear models for gene-based case-control association studies.

    PubMed

    Fan, Ruzong; Wang, Yifan; Mills, James L; Carter, Tonia C; Lobach, Iryna; Wilson, Alexander F; Bailey-Wilson, Joan E; Weeks, Daniel E; Xiong, Momiao

    2014-11-01

    By using functional data analysis techniques, we developed generalized functional linear models for testing association between a dichotomous trait and multiple genetic variants in a genetic region while adjusting for covariates. Both fixed and mixed effect models are developed and compared. Extensive simulations show that Rao's efficient score tests of the fixed effect models are very conservative since they generate lower type I errors than nominal levels, and global tests of the mixed effect models generate accurate type I errors. Furthermore, we found that the Rao's efficient score test statistics of the fixed effect models have higher power than the sequence kernel association test (SKAT) and its optimal unified version (SKAT-O) in most cases when the causal variants are both rare and common. When the causal variants are all rare (i.e., minor allele frequencies less than 0.03), the Rao's efficient score test statistics and the global tests have similar or slightly lower power than SKAT and SKAT-O. In practice, it is not known whether rare variants or common variants in a gene region are disease related. All we can assume is that a combination of rare and common variants influences disease susceptibility. Thus, the improved performance of our models when the causal variants are both rare and common shows that the proposed models can be very useful in dissecting complex traits. We compare the performance of our methods with SKAT and SKAT-O on real neural tube defects and Hirschsprung's disease datasets. The Rao's efficient score test statistics and the global tests are more sensitive than SKAT and SKAT-O in the real data analysis. Our methods can be used in either gene-disease genome-wide/exome-wide association studies or candidate gene analyses. © 2014 WILEY PERIODICALS, INC.

  19. Generalized Functional Linear Models for Gene-based Case-Control Association Studies

    PubMed Central

    Mills, James L.; Carter, Tonia C.; Lobach, Iryna; Wilson, Alexander F.; Bailey-Wilson, Joan E.; Weeks, Daniel E.; Xiong, Momiao

    2014-01-01

    By using functional data analysis techniques, we developed generalized functional linear models for testing association between a dichotomous trait and multiple genetic variants in a genetic region while adjusting for covariates. Both fixed and mixed effect models are developed and compared. Extensive simulations show that Rao's efficient score tests of the fixed effect models are very conservative since they generate lower type I errors than nominal levels, and global tests of the mixed effect models generate accurate type I errors. Furthermore, we found that the Rao's efficient score test statistics of the fixed effect models have higher power than the sequence kernel association test (SKAT) and its optimal unified version (SKAT-O) in most cases when the causal variants are both rare and common. When the causal variants are all rare (i.e., minor allele frequencies less than 0.03), the Rao's efficient score test statistics and the global tests have similar or slightly lower power than SKAT and SKAT-O. In practice, it is not known whether rare variants or common variants in a gene are disease-related. All we can assume is that a combination of rare and common variants influences disease susceptibility. Thus, the improved performance of our models when the causal variants are both rare and common shows that the proposed models can be very useful in dissecting complex traits. We compare the performance of our methods with SKAT and SKAT-O on real neural tube defects and Hirschsprung's disease data sets. The Rao's efficient score test statistics and the global tests are more sensitive than SKAT and SKAT-O in the real data analysis. Our methods can be used in either gene-disease genome-wide/exome-wide association studies or candidate gene analyses. PMID:25203683

  20. Accurate pressure gradient calculations in hydrostatic atmospheric models

    NASA Technical Reports Server (NTRS)

    Carroll, John J.; Mendez-Nunez, Luis R.; Tanrikulu, Saffet

    1987-01-01

    A method for the accurate calculation of the horizontal pressure gradient acceleration in hydrostatic atmospheric models is presented which is especially useful in situations where the isothermal surfaces are not parallel to the vertical coordinate surfaces. The present method is shown to be exact if the potential temperature lapse rate is constant between the vertical pressure integration limits. The technique is applied to both the integration of the hydrostatic equation and the computation of the slope correction term in the horizontal pressure gradient. A fixed vertical grid and a dynamic grid defined by the significant levels in the vertical temperature distribution are employed.

  1. Reverse engineering and analysis of large genome-scale gene networks

    PubMed Central

    Aluru, Maneesha; Zola, Jaroslaw; Nettleton, Dan; Aluru, Srinivas

    2013-01-01

    Reverse engineering the whole-genome networks of complex multicellular organisms remains a challenge. While simpler models easily scale to large numbers of genes and gene expression datasets, more accurate models are compute intensive, limiting their scale of applicability. To enable fast and accurate reconstruction of large networks, we developed Tool for Inferring Network of Genes (TINGe), a parallel mutual information (MI)-based program. The novel features of our approach include: (i) B-spline-based formulation for linear-time computation of MI, (ii) a novel algorithm for direct permutation testing and (iii) development of parallel algorithms to reduce run-time and facilitate construction of large networks. We assess the quality of our method by comparison with ARACNe (Algorithm for the Reconstruction of Accurate Cellular Networks) and GeneNet and demonstrate its unique capability by reverse engineering the whole-genome network of Arabidopsis thaliana from 3137 Affymetrix ATH1 GeneChips in just 9 min on a 1024-core cluster. We further report on the development of a new software tool, Gene Network Analyzer (GeNA), for extracting context-specific subnetworks from a given set of seed genes. Using TINGe and GeNA, we performed analysis of 241 Arabidopsis AraCyc 8.0 pathways, and the results are made available through the web. PMID:23042249

  2. An analytic model for accurate spring constant calibration of rectangular atomic force microscope cantilevers.

    PubMed

    Li, Rui; Ye, Hongfei; Zhang, Weisheng; Ma, Guojun; Su, Yewang

    2015-10-29

    Spring constant calibration of the atomic force microscope (AFM) cantilever is of fundamental importance for quantifying the force between the AFM cantilever tip and the sample. The calibration within the framework of thin plate theory undoubtedly has a higher accuracy and broader scope than that within the well-established beam theory. However, thin plate theory-based accurate analytic determination of the constant has been perceived as an extremely difficult issue. In this paper, we implement the thin plate theory-based analytic modeling for the static behavior of rectangular AFM cantilevers, which reveals that the three-dimensional effect and Poisson effect play important roles in accurate determination of the spring constants. A quantitative scaling law is found that the normalized spring constant depends only on the Poisson's ratio, normalized dimension and normalized load coordinate. Both the literature and our refined finite element model validate the present results. The developed model is expected to serve as the benchmark for accurate calibration of rectangular AFM cantilevers.

  3. A Multiscale Red Blood Cell Model with Accurate Mechanics, Rheology, and Dynamics

    PubMed Central

    Fedosov, Dmitry A.; Caswell, Bruce; Karniadakis, George Em

    2010-01-01

    Red blood cells (RBCs) have highly deformable viscoelastic membranes exhibiting complex rheological response and rich hydrodynamic behavior governed by special elastic and bending properties and by the external/internal fluid and membrane viscosities. We present a multiscale RBC model that is able to predict RBC mechanics, rheology, and dynamics in agreement with experiments. Based on an analytic theory, the modeled membrane properties can be uniquely related to the experimentally established RBC macroscopic properties without any adjustment of parameters. The RBC linear and nonlinear elastic deformations match those obtained in optical-tweezers experiments. The rheological properties of the membrane are compared with those obtained in optical magnetic twisting cytometry, membrane thermal fluctuations, and creep followed by cell recovery. The dynamics of RBCs in shear and Poiseuille flows is tested against experiments and theoretical predictions, and the applicability of the latter is discussed. Our findings clearly indicate that a purely elastic model for the membrane cannot accurately represent the RBC's rheological properties and its dynamics, and therefore accurate modeling of a viscoelastic membrane is necessary. PMID:20483330

  4. Modeling and validation of autoinducer-mediated bacterial gene expression in microfluidic environments

    PubMed Central

    Austin, Caitlin M.; Stoy, William; Su, Peter; Harber, Marie C.; Bardill, J. Patrick; Hammer, Brian K.; Forest, Craig R.

    2014-01-01

    Biosensors exploiting communication within genetically engineered bacteria are becoming increasingly important for monitoring environmental changes. Currently, there are a variety of mathematical models for understanding and predicting how genetically engineered bacteria respond to molecular stimuli in these environments, but as sensors have miniaturized towards microfluidics and are subjected to complex time-varying inputs, the shortcomings of these models have become apparent. The effects of microfluidic environments such as low oxygen concentration, increased biofilm encapsulation, diffusion limited molecular distribution, and higher population densities strongly affect rate constants for gene expression not accounted for in previous models. We report a mathematical model that accurately predicts the biological response of the autoinducer N-acyl homoserine lactone-mediated green fluorescent protein expression in reporter bacteria in microfluidic environments by accommodating these rate constants. This generalized mass action model considers a chain of biomolecular events from input autoinducer chemical to fluorescent protein expression through a series of six chemical species. We have validated this model against experimental data from our own apparatus as well as prior published experimental results. Results indicate accurate prediction of dynamics (e.g., 14% peak time error from a pulse input) and with reduced mean-squared error with pulse or step inputs for a range of concentrations (10 μM–30 μM). This model can help advance the design of genetically engineered bacteria sensors and molecular communication devices. PMID:25379076

  5. A dental vision system for accurate 3D tooth modeling.

    PubMed

    Zhang, Li; Alemzadeh, K

    2006-01-01

    This paper describes an active vision system based reverse engineering approach to extract the three-dimensional (3D) geometric information from dental teeth and transfer this information into Computer-Aided Design/Computer-Aided Manufacture (CAD/CAM) systems to improve the accuracy of 3D teeth models and at the same time improve the quality of the construction units to help patient care. The vision system involves the development of a dental vision rig, edge detection, boundary tracing and fast & accurate 3D modeling from a sequence of sliced silhouettes of physical models. The rig is designed using engineering design methods such as a concept selection matrix and weighted objectives evaluation chart. Reconstruction results and accuracy evaluation are presented on digitizing different teeth models.

  6. Towards Accurate Modelling of Galaxy Clustering on Small Scales: Testing the Standard ΛCDM + Halo Model

    NASA Astrophysics Data System (ADS)

    Sinha, Manodeep; Berlind, Andreas A.; McBride, Cameron K.; Scoccimarro, Roman; Piscionere, Jennifer A.; Wibking, Benjamin D.

    2018-04-01

    Interpreting the small-scale clustering of galaxies with halo models can elucidate the connection between galaxies and dark matter halos. Unfortunately, the modelling is typically not sufficiently accurate for ruling out models statistically. It is thus difficult to use the information encoded in small scales to test cosmological models or probe subtle features of the galaxy-halo connection. In this paper, we attempt to push halo modelling into the "accurate" regime with a fully numerical mock-based methodology and careful treatment of statistical and systematic errors. With our forward-modelling approach, we can incorporate clustering statistics beyond the traditional two-point statistics. We use this modelling methodology to test the standard ΛCDM + halo model against the clustering of SDSS DR7 galaxies. Specifically, we use the projected correlation function, group multiplicity function and galaxy number density as constraints. We find that while the model fits each statistic separately, it struggles to fit them simultaneously. Adding group statistics leads to a more stringent test of the model and significantly tighter constraints on model parameters. We explore the impact of varying the adopted halo definition and cosmological model and find that changing the cosmology makes a significant difference. The most successful model we tried (Planck cosmology with Mvir halos) matches the clustering of low luminosity galaxies, but exhibits a 2.3σ tension with the clustering of luminous galaxies, thus providing evidence that the "standard" halo model needs to be extended. This work opens the door to adding interesting freedom to the halo model and including additional clustering statistics as constraints.

  7. Models of stochastic gene expression

    NASA Astrophysics Data System (ADS)

    Paulsson, Johan

    2005-06-01

    Gene expression is an inherently stochastic process: Genes are activated and inactivated by random association and dissociation events, transcription is typically rare, and many proteins are present in low numbers per cell. The last few years have seen an explosion in the stochastic modeling of these processes, predicting protein fluctuations in terms of the frequencies of the probabilistic events. Here I discuss commonalities between theoretical descriptions, focusing on a gene-mRNA-protein model that includes most published studies as special cases. I also show how expression bursts can be explained as simplistic time-averaging, and how generic approximations can allow for concrete interpretations without requiring concrete assumptions. Measures and nomenclature are discussed to some extent and the modeling literature is briefly reviewed.
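
    A gene-mRNA-protein model of this kind can be simulated directly with the Gillespie algorithm. The sketch below is an illustration under stated assumptions, not the formulation used in the review: the gene switches between an off and an on state, mRNA is made only while the gene is on, protein is made from mRNA, and both decay by first-order kinetics. The function name and all rate constants are hypothetical.

```python
import random


def simulate_gene_mrna_protein(t_end, k_on=1.0, k_off=1.0, k_m=10.0,
                               g_m=1.0, k_p=5.0, g_p=0.5, seed=1):
    """Gillespie simulation; returns time-averaged (mRNA, protein) copy numbers."""
    rng = random.Random(seed)
    # Net change in (gene_on, mRNA, protein) for each of the six reactions.
    changes = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
               (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    t, on, m, p = 0.0, 0, 0, 0
    area_m = area_p = 0.0
    while True:
        # Propensities: activation, inactivation, transcription,
        # mRNA decay, translation, protein decay.
        rates = [k_on * (1 - on), k_off * on, k_m * on,
                 g_m * m, k_p * m, g_p * p]
        wait = rng.expovariate(sum(rates))  # time to the next event
        if t + wait >= t_end:
            area_m += m * (t_end - t)
            area_p += p * (t_end - t)
            break
        area_m += m * wait
        area_p += p * wait
        t += wait
        # Pick which reaction fires, with probability proportional to its rate.
        d_on, d_m, d_p = rng.choices(changes, weights=rates)[0]
        on, m, p = on + d_on, m + d_m, p + d_p
    return area_m / t_end, area_p / t_end
```

    With these illustrative rates the gene is on half the time, so the deterministic means are k_m * 0.5 / g_m = 5 mRNA and k_p * 5 / g_p = 50 proteins; a long run should give time averages near those values, while individual trajectories fluctuate widely around them.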

  8. Proteogenomics produces comprehensive and highly accurate protein-coding gene annotation in a complete genome assembly of Malassezia sympodialis.

    PubMed

    Zhu, Yafeng; Engström, Pär G; Tellgren-Roth, Christian; Baudo, Charles D; Kennell, John C; Sun, Sheng; Billmyre, R Blake; Schröder, Markus S; Andersson, Anna; Holm, Tina; Sigurgeirsson, Benjamin; Wu, Guangxi; Sankaranarayanan, Sundar Ram; Siddharthan, Rahul; Sanyal, Kaustuv; Lundeberg, Joakim; Nystedt, Björn; Boekhout, Teun; Dawson, Thomas L; Heitman, Joseph; Scheynius, Annika; Lehtiö, Janne

    2017-03-17

    Complete and accurate genome assembly and annotation is a crucial foundation for comparative and functional genomics. Despite this, few complete eukaryotic genomes are available, and genome annotation remains a major challenge. Here, we present a complete genome assembly of the skin commensal yeast Malassezia sympodialis and demonstrate how proteogenomics can substantially improve gene annotation. Through long-read DNA sequencing, we obtained a gap-free genome assembly for M. sympodialis (ATCC 42132), comprising eight nuclear and one mitochondrial chromosome. We also sequenced and assembled four M. sympodialis clinical isolates, and showed their value for understanding Malassezia reproduction by confirming four alternative allele combinations at the two mating-type loci. Importantly, we demonstrated how proteomics data could be readily integrated with transcriptomics data in standard annotation tools. This increased the number of annotated protein-coding genes by 14% (from 3612 to 4113), compared to using transcriptomics evidence alone. Manual curation further increased the number of protein-coding genes by 9% (to 4493). All of these genes have RNA-seq evidence and 87% were confirmed by proteomics. The M. sympodialis genome assembly and annotation presented here is at a quality yet achieved only for a few eukaryotic organisms, and constitutes an important reference for future host-microbe interaction studies. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  9. Modeling gene regulatory network motifs using statecharts

    PubMed Central

    2012-01-01

    Background Gene regulatory networks are widely used by biologists to describe the interactions among genes, proteins and other components at the intra-cellular level. Recently, a great effort has been devoted to give gene regulatory networks a formal semantics based on existing computational frameworks. For this purpose, we consider Statecharts, which are a modular, hierarchical and executable formal model widely used to represent software systems. We use Statecharts for modeling small and recurring patterns of interactions in gene regulatory networks, called motifs. Results We present an improved method for modeling gene regulatory network motifs using Statecharts and we describe the successful modeling of several motifs, including those which could not be modeled or whose models could not be distinguished using the method of a previous proposal. We model motifs in an easy and intuitive way by taking advantage of the visual features of Statecharts. Our modeling approach is able to simulate some interesting temporal properties of gene regulatory network motifs: the delay in the activation and the deactivation of the "output" gene in the coherent type-1 feedforward loop, the pulse in the incoherent type-1 feedforward loop, the bistability nature of double positive and double negative feedback loops, the oscillatory behavior of the negative feedback loop, and the "lock-in" effect of positive autoregulation. Conclusions We present a Statecharts-based approach for the modeling of gene regulatory network motifs in biological systems. The basic motifs used to build more complex networks (that is, simple regulation, reciprocal regulation, feedback loop, feedforward loop, and autoregulation) can be faithfully described and their temporal dynamics can be analyzed. PMID:22536967

  10. Accurate modeling of high-repetition rate ultrashort pulse amplification in optical fibers

    PubMed Central

    Lindberg, Robert; Zeil, Peter; Malmström, Mikael; Laurell, Fredrik; Pasiskevicius, Valdas

    2016-01-01

    A numerical model for amplification of ultrashort pulses with high repetition rates in fiber amplifiers is presented. The pulse propagation is modeled by jointly solving the steady-state rate equations and the generalized nonlinear Schrödinger equation, which allows accurate treatment of nonlinear and dispersive effects whilst considering arbitrary spatial and spectral gain dependencies. Results from the developed model are in good agreement with experimental data. PMID:27713496

  11. Phylogenetic modeling of lateral gene transfer reconstructs the pattern and relative timing of speciations

    PubMed Central

    Szöllősi, Gergely J.; Boussau, Bastien; Abby, Sophie S.; Tannier, Eric; Daubin, Vincent

    2012-01-01

    The timing of the evolution of microbial life has largely remained elusive due to the scarcity of prokaryotic fossil record and the confounding effects of the exchange of genes among possibly distant species. The history of gene transfer events, however, is not a series of individual oddities; it records which lineages were concurrent and thus provides information on the timing of species diversification. Here, we use a probabilistic model of genome evolution that accounts for differences between gene phylogenies and the species tree as series of duplication, transfer, and loss events to reconstruct chronologically ordered species phylogenies. Using simulations we show that we can robustly recover accurate chronologically ordered species phylogenies in the presence of gene tree reconstruction errors and realistic rates of duplication, transfer, and loss. Using genomic data we demonstrate that we can infer rooted species phylogenies using homologous gene families from complete genomes of 10 bacterial and archaeal groups. Focusing on cyanobacteria, distinguished among prokaryotes by a relative abundance of fossils, we infer the maximum likelihood chronologically ordered species phylogeny based on 36 genomes with 8,332 homologous gene families. We find the order of speciation events to be in full agreement with the fossil record and the inferred phylogeny of cyanobacteria to be consistent with the phylogeny recovered from established phylogenomics methods. Our results demonstrate that lateral gene transfers, detected by probabilistic models of genome evolution, can be used as a source of information on the timing of evolution, providing a valuable complement to the limited prokaryotic fossil record. PMID:23043116

  12. A Stationary Wavelet Entropy-Based Clustering Approach Accurately Predicts Gene Expression

    PubMed Central

    Nguyen, Nha; Vo, An; Choi, Inchan

    2015-01-01

    Studying epigenetic landscapes is important to understand the condition for gene regulation. Clustering is a useful approach to study epigenetic landscapes by grouping genes based on their epigenetic conditions. However, classical clustering approaches that often use a representative value of the signals in a fixed-sized window do not fully use the information written in the epigenetic landscapes. Clustering approaches to maximize the information of the epigenetic signals are necessary for better understanding gene regulatory environments. For effective clustering of multidimensional epigenetic signals, we developed a method called Dewer, which uses the entropy of stationary wavelet of epigenetic signals inside enriched regions for gene clustering. Interestingly, the gene expression levels were highly correlated with the entropy levels of epigenetic signals. Dewer separates genes better than a window-based approach in the assessment using gene expression and achieved a correlation coefficient above 0.9 without using any training procedure. Our results show that the changes of the epigenetic signals are useful to study gene regulation. PMID:25383910

  13. Accurate modeling and evaluation of microstructures in complex materials

    NASA Astrophysics Data System (ADS)

    Tahmasebi, Pejman

    2018-02-01

    Accurate characterization of heterogeneous materials is of great importance for different fields of science and engineering. Such a goal can be achieved through imaging. Acquiring three- or two-dimensional images under different conditions is not, however, always feasible. On the other hand, accurate characterization of complex and multiphase materials requires various digital images (I) under different conditions. An ensemble method is presented that can take one single (or a set of) I(s) and stochastically produce several similar models of the given disordered material. The method is based on successive calculation of a conditional probability by which the initial stochastic models are produced. Then, a graph formulation is utilized for removing unrealistic structures. A distance transform function for the Is with highly connected microstructure and long-range features is considered, which results in a new I that is more informative. Reproduction of the I is also considered through a histogram matching approach in an iterative framework. Such an iterative algorithm avoids reproduction of unrealistic structures. Furthermore, a multiscale approach, based on pyramid representation of the large Is, is presented that can produce materials with millions of pixels in a matter of seconds. Finally, the nonstationary systems (those for which the distribution of data varies spatially) are studied using two different methods. The method is tested on several complex and large examples of microstructures. The produced results are all in excellent agreement with the utilized Is and the similarities are quantified using various correlation functions.

  14. Towards accurate modelling of galaxy clustering on small scales: testing the standard ΛCDM + halo model

    NASA Astrophysics Data System (ADS)

    Sinha, Manodeep; Berlind, Andreas A.; McBride, Cameron K.; Scoccimarro, Roman; Piscionere, Jennifer A.; Wibking, Benjamin D.

    2018-07-01

    Interpreting the small-scale clustering of galaxies with halo models can elucidate the connection between galaxies and dark matter haloes. Unfortunately, the modelling is typically not sufficiently accurate for ruling out models statistically. It is thus difficult to use the information encoded in small scales to test cosmological models or probe subtle features of the galaxy-halo connection. In this paper, we attempt to push halo modelling into the `accurate' regime with a fully numerical mock-based methodology and careful treatment of statistical and systematic errors. With our forward-modelling approach, we can incorporate clustering statistics beyond the traditional two-point statistics. We use this modelling methodology to test the standard Λ cold dark matter (ΛCDM) + halo model against the clustering of Sloan Digital Sky Survey (SDSS) seventh data release (DR7) galaxies. Specifically, we use the projected correlation function, group multiplicity function, and galaxy number density as constraints. We find that while the model fits each statistic separately, it struggles to fit them simultaneously. Adding group statistics leads to a more stringent test of the model and significantly tighter constraints on model parameters. We explore the impact of varying the adopted halo definition and cosmological model and find that changing the cosmology makes a significant difference. The most successful model we tried (Planck cosmology with Mvir haloes) matches the clustering of low-luminosity galaxies, but exhibits a 2.3σ tension with the clustering of luminous galaxies, thus providing evidence that the `standard' halo model needs to be extended. This work opens the door to adding interesting freedom to the halo model and including additional clustering statistics as constraints.

  15. A multiscale red blood cell model with accurate mechanics, rheology, and dynamics.

    PubMed

    Fedosov, Dmitry A; Caswell, Bruce; Karniadakis, George Em

    2010-05-19

    Red blood cells (RBCs) have highly deformable viscoelastic membranes exhibiting complex rheological response and rich hydrodynamic behavior governed by special elastic and bending properties and by the external/internal fluid and membrane viscosities. We present a multiscale RBC model that is able to predict RBC mechanics, rheology, and dynamics in agreement with experiments. Based on an analytic theory, the modeled membrane properties can be uniquely related to the experimentally established RBC macroscopic properties without any adjustment of parameters. The RBC linear and nonlinear elastic deformations match those obtained in optical-tweezers experiments. The rheological properties of the membrane are compared with those obtained in optical magnetic twisting cytometry, membrane thermal fluctuations, and creep followed by cell recovery. The dynamics of RBCs in shear and Poiseuille flows is tested against experiments and theoretical predictions, and the applicability of the latter is discussed. Our findings clearly indicate that a purely elastic model for the membrane cannot accurately represent the RBC's rheological properties and its dynamics, and therefore accurate modeling of a viscoelastic membrane is necessary. Copyright 2010 Biophysical Society. Published by Elsevier Inc. All rights reserved.

  16. Accurate Energy Consumption Modeling of IEEE 802.15.4e TSCH Using Dual-Band OpenMote Hardware.

    PubMed

    Daneels, Glenn; Municio, Esteban; Van de Velde, Bruno; Ergeerts, Glenn; Weyn, Maarten; Latré, Steven; Famaey, Jeroen

    2018-02-02

    The Time-Slotted Channel Hopping (TSCH) mode of the IEEE 802.15.4e amendment aims to improve reliability and energy efficiency in industrial and other challenging Internet-of-Things (IoT) environments. This paper presents an accurate and up-to-date energy consumption model for devices using this IEEE 802.15.4e TSCH mode. The model identifies all network-related CPU and radio state changes, thus providing a precise representation of the device behavior and an accurate prediction of its energy consumption. Moreover, energy measurements were performed with a dual-band OpenMote device, running the OpenWSN firmware. This allows the model to be used for devices using 2.4 GHz, as well as 868 MHz. Using these measurements, several network simulations were conducted to observe the TSCH energy consumption effects in end-to-end communication for both frequency bands. Experimental verification of the model shows that it accurately models the consumption for all possible packet sizes and that the calculated consumption on average differs less than 3% from the measured consumption. This deviation includes measurement inaccuracies and the variations of the guard time. As such, the proposed model is very suitable for accurate energy consumption modeling of TSCH networks.
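
    A state-based energy model of this kind reduces to summing voltage × current × duration over the CPU and radio states visited in a timeslot. The sketch below only illustrates that bookkeeping; the state names, currents, durations and supply voltage are made-up placeholders, not the measurements reported in the paper.

```python
def slot_energy_joules(states, voltage=3.0):
    """Sum voltage * current * duration over the states visited in one timeslot."""
    return sum(voltage * current_a * duration_s
               for _, current_a, duration_s in states)


# Hypothetical state trace for one transmit slot:
# (state name, current draw in amperes, time spent in seconds).
tx_slot = [
    ("cpu_active", 0.010, 0.002),
    ("radio_tx", 0.025, 0.004),
    ("radio_rx_ack", 0.020, 0.001),
    ("sleep", 0.000002, 0.008),
]

tx_energy = slot_energy_joules(tx_slot)
```

    Summing such per-slot energies over a schedule (transmit, receive, idle-listen and sleep slots) gives the consumption of a node over time; supporting a second frequency band would amount to a second table of measured currents.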

  17. Accurate Energy Consumption Modeling of IEEE 802.15.4e TSCH Using Dual-Band OpenMote Hardware

    PubMed Central

    Municio, Esteban; Van de Velde, Bruno; Latré, Steven

    2018-01-01

    The Time-Slotted Channel Hopping (TSCH) mode of the IEEE 802.15.4e amendment aims to improve reliability and energy efficiency in industrial and other challenging Internet-of-Things (IoT) environments. This paper presents an accurate and up-to-date energy consumption model for devices using this IEEE 802.15.4e TSCH mode. The model identifies all network-related CPU and radio state changes, thus providing a precise representation of the device behavior and an accurate prediction of its energy consumption. Moreover, energy measurements were performed with a dual-band OpenMote device, running the OpenWSN firmware. This allows the model to be used for devices using 2.4 GHz, as well as 868 MHz. Using these measurements, several network simulations were conducted to observe the TSCH energy consumption effects in end-to-end communication for both frequency bands. Experimental verification of the model shows that it accurately models the consumption for all possible packet sizes and that the calculated consumption on average differs less than 3% from the measured consumption. This deviation includes measurement inaccuracies and the variations of the guard time. As such, the proposed model is very suitable for accurate energy consumption modeling of TSCH networks. PMID:29393900

  18. FlyBase: genes and gene models

    PubMed Central

    Drysdale, Rachel A.; Crosby, Madeline A.

    2005-01-01

    FlyBase (http://flybase.org) is the primary repository of genetic and molecular data of the insect family Drosophilidae. For the most extensively studied species, Drosophila melanogaster, a wide range of data are presented in integrated formats. Data types include mutant phenotypes, molecular characterization of mutant alleles and aberrations, cytological maps, wild-type expression patterns, anatomical images, transgenic constructs and insertions, sequence-level gene models and molecular classification of gene product functions. There is a growing body of data for other Drosophila species; this is expected to increase dramatically over the next year, with the completion of draft-quality genomic sequences of an additional 11 Drosophila species. PMID:15608223

  19. Simple Mathematical Models Do Not Accurately Predict Early SIV Dynamics

    PubMed Central

    Noecker, Cecilia; Schaefer, Krista; Zaccheo, Kelly; Yang, Yiding; Day, Judy; Ganusov, Vitaly V.

    2015-01-01

    Upon infection of a new host, human immunodeficiency virus (HIV) replicates in the mucosal tissues and is generally undetectable in circulation for 1–2 weeks post-infection. Several interventions against HIV including vaccines and antiretroviral prophylaxis target virus replication at this earliest stage of infection. Mathematical models have been used to understand how HIV spreads from mucosal tissues systemically and what impact vaccination and/or antiretroviral prophylaxis has on viral eradication. Because predictions of such models have been rarely compared to experimental data, it remains unclear which processes included in these models are critical for predicting early HIV dynamics. Here we modified the “standard” mathematical model of HIV infection to include two populations of infected cells: cells that are actively producing the virus and cells that are transitioning into virus production mode. We evaluated the effects of several poorly known parameters on infection outcomes in this model and compared model predictions to experimental data on infection of non-human primates with variable doses of simian immunodeficiency virus (SIV). First, we found that the mode of virus production by infected cells (budding vs. bursting) has a minimal impact on the early virus dynamics for a wide range of model parameters, as long as the parameters are constrained to provide the observed rate of SIV load increase in the blood of infected animals. Interestingly and in contrast with previous results, we found that the bursting mode of virus production generally results in a higher probability of viral extinction than the budding mode of virus production. Second, this mathematical model was not able to accurately describe the change in experimentally determined probability of host infection with increasing viral doses. Third and finally, the model was also unable to accurately explain the decline in the time to virus detection with increasing viral dose. These results

  20. Accurate Cold-Test Model of Helical TWT Slow-Wave Circuits

    NASA Technical Reports Server (NTRS)

    Kory, Carol L.; Dayton, James A., Jr.

    1997-01-01

    Recently, a method has been established to accurately calculate cold-test data for helical slow-wave structures using the three-dimensional electromagnetic computer code, MAFIA. Cold-test parameters have been calculated for several helical traveling-wave tube (TWT) slow-wave circuits possessing various support rod configurations, and results are presented here showing excellent agreement with experiment. The helical models include tape thickness, dielectric support shapes and material properties consistent with the actual circuits. The cold-test data from this helical model can be used as input into large-signal helical TWT interaction codes making it possible, for the first time, to design a complete TWT via computer simulation.

  1. Stochastic model of transcription factor-regulated gene expression

    NASA Astrophysics Data System (ADS)

    Karmakar, Rajesh; Bose, Indrani

    2006-09-01

    We consider a stochastic model of transcription factor (TF)-regulated gene expression. The model describes two genes, gene A and gene B, which synthesize the TFs and the target gene proteins, respectively. We show through analytic calculations that the TF fluctuations have a significant effect on the distribution of the target gene protein levels when the mean TF level falls in the highest sensitive region of the dose-response curve. We further study the effect of reducing the copy number of gene A from two to one. The enhanced TF fluctuations yield results different from those in the deterministic case. The probability that the target gene protein level exceeds a threshold value is calculated with the knowledge of the probability density functions associated with the TF and target gene protein levels. Numerical simulation results for a more detailed stochastic model are shown to be in agreement with those obtained through analytic calculations. The relevance of these results in the context of the genetic disorder haploinsufficiency is pointed out. Some experimental observations on the haploinsufficiency of the tumour suppressor gene, Nkx 3.1, are explained with the help of the stochastic model of TF-regulated gene expression.

  2. A Partial Least Square Approach for Modeling Gene-gene and Gene-environment Interactions When Multiple Markers Are Genotyped

    PubMed Central

    Wang, Tao; Ho, Gloria; Ye, Kenny; Strickler, Howard; Elston, Robert C.

    2008-01-01

    Genetic association studies achieve an unprecedented level of resolution in mapping disease genes by genotyping dense SNPs in a gene region. Meanwhile, these studies require new powerful statistical tools that can optimally handle a large amount of information provided by genotype data. A question that arises is how to model interactions between two genes. Simply modeling all possible interactions between the SNPs in two gene regions is not desirable because a greatly increased number of degrees of freedom can be involved in the test statistic. We introduce an approach to reduce the genotype dimension in modeling interactions. The genotype compression of this approach is built upon the information on both the trait and the cross-locus gametic disequilibrium between SNPs in two interacting genes, in such a way as to parsimoniously model the interactions without loss of useful information in the process of dimension reduction. As a result, it improves power to detect association in the presence of gene-gene interactions. This approach can be similarly applied for modeling gene-environment interactions. We compare this method with other approaches: the corresponding test without modeling any interaction, that based on a saturated interaction model, that based on principal component analysis, and that based on Tukey’s 1-df model. Our simulations suggest that this new approach has superior power to that of the other methods. In an application to endometrial cancer case-control data from the Women’s Health Initiative (WHI), this approach detected AKT1 and AKT2 as being significantly associated with endometrial cancer susceptibility by taking into account their interactions with BMI. PMID:18615621

  3. A partial least-square approach for modeling gene-gene and gene-environment interactions when multiple markers are genotyped.

    PubMed

    Wang, Tao; Ho, Gloria; Ye, Kenny; Strickler, Howard; Elston, Robert C

    2009-01-01

    Genetic association studies achieve an unprecedented level of resolution in mapping disease genes by genotyping dense single nucleotide polymorphisms (SNPs) in a gene region. Meanwhile, these studies require new powerful statistical tools that can optimally handle a large amount of information provided by genotype data. A question that arises is how to model interactions between two genes. Simply modeling all possible interactions between the SNPs in two gene regions is not desirable because a greatly increased number of degrees of freedom can be involved in the test statistic. We introduce an approach to reduce the genotype dimension in modeling interactions. The genotype compression of this approach is built upon the information on both the trait and the cross-locus gametic disequilibrium between SNPs in two interacting genes, in such a way as to parsimoniously model the interactions without loss of useful information in the process of dimension reduction. As a result, it improves power to detect association in the presence of gene-gene interactions. This approach can be similarly applied for modeling gene-environment interactions. We compare this method with other approaches: the corresponding test without modeling any interaction, that based on a saturated interaction model, that based on principal component analysis, and that based on Tukey's one-degree-of-freedom model. Our simulations suggest that this new approach has superior power to that of the other methods. In an application to endometrial cancer case-control data from the Women's Health Initiative, this approach detected AKT1 and AKT2 as being significantly associated with endometrial cancer susceptibility by taking into account their interactions with body mass index.

  4. Generating Facial Expressions Using an Anatomically Accurate Biomechanical Model.

    PubMed

    Wu, Tim; Hung, Alice; Mithraratne, Kumar

    2014-11-01

    This paper presents a computational framework for modelling the biomechanics of human facial expressions. A detailed high-order (Cubic-Hermite) finite element model of the human head was constructed using anatomical data segmented from magnetic resonance images. The model includes a superficial soft-tissue continuum consisting of skin, the subcutaneous layer and the superficial Musculo-Aponeurotic system. Embedded within this continuum mesh are 20 pairs of facial muscles which drive facial expressions. These muscles were treated as transversely isotropic and their anatomical geometries and fibre orientations were accurately depicted. In order to capture the relative composition of muscles and fat, material heterogeneity was also introduced into the model. Complex contact interactions between the lips, between the eyelids, and between the superficial soft tissue continuum and deep rigid skeletal bones were also computed. In addition, this paper investigates the impact of incorporating material heterogeneity and contact interactions, which are often neglected in similar studies. Four facial expressions were simulated using the developed model and the results were compared with surface data obtained from a 3D structured-light scanner. Predicted expressions showed good agreement with the experimental data.

  5. Gene Expression Ratios Lead to Accurate and Translatable Predictors of DR5 Agonism across Multiple Tumor Lineages.

    PubMed

    Reddy, Anupama; Growney, Joseph D; Wilson, Nick S; Emery, Caroline M; Johnson, Jennifer A; Ward, Rebecca; Monaco, Kelli A; Korn, Joshua; Monahan, John E; Stump, Mark D; Mapa, Felipa A; Wilson, Christopher J; Steiger, Janine; Ledell, Jebediah; Rickles, Richard J; Myer, Vic E; Ettenberg, Seth A; Schlegel, Robert; Sellers, William R; Huet, Heather A; Lehár, Joseph

    2015-01-01

    Death Receptor 5 (DR5) agonists demonstrate anti-tumor activity in preclinical models but have yet to demonstrate robust clinical responses. A key limitation may be the lack of patient selection strategies to identify those most likely to respond to treatment. To overcome this limitation, we screened a DR5 agonist Nanobody across >600 cell lines representing 21 tumor lineages and assessed molecular features associated with response. High expression of DR5 and Casp8 were significantly associated with sensitivity, but their expression thresholds were difficult to translate due to low dynamic ranges. To address the translational challenge of establishing thresholds of gene expression, we developed a classifier based on ratios of genes that predicted response across lineages. The ratio classifier outperformed the DR5+Casp8 classifier, as well as standard approaches for feature selection and classification using genes, instead of ratios. This classifier was independently validated using 11 primary patient-derived pancreatic xenograft models showing perfect predictions as well as a striking linearity between prediction probability and anti-tumor response. A network analysis of the genes in the ratio classifier captured important biological relationships mediating drug response, specifically identifying key positive and negative regulators of DR5 mediated apoptosis, including DR5, CASP8, BID, cFLIP, XIAP and PEA15. Importantly, the ratio classifier shows translatability across gene expression platforms (from Affymetrix microarrays to RNA-seq) and across model systems (in vitro to in vivo). Our approach of using gene expression ratios presents a robust and novel method for constructing translatable biomarkers of compound response, which can also probe the underlying biology of treatment response.
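
    One reason ratio features can translate across platforms is that a within-sample ratio cancels any sample-wide scale factor. The sketch below illustrates only that property; it is not the authors' classifier. The gene pairs, expression values, and the 7.5x scale factor are hypothetical, and the function name `log_ratio_features` is introduced here for illustration.

```python
import math

def log_ratio_features(expr, gene_pairs):
    """Return log2(expr[a] / expr[b]) for each (a, b) gene pair."""
    return [math.log2(expr[a] / expr[b]) for a, b in gene_pairs]

# Hypothetical expression values for one sample (arbitrary units).
sample = {"DR5": 120.0, "CASP8": 80.0, "XIAP": 40.0, "PEA15": 20.0}

# Hypothetical pairs; the abstract does not list the pairs actually used.
pairs = [("DR5", "XIAP"), ("CASP8", "PEA15")]

# The same sample as it might look on a platform with a different overall scale.
rescaled = {gene: 7.5 * value for gene, value in sample.items()}

features = log_ratio_features(sample, pairs)
features_rescaled = log_ratio_features(rescaled, pairs)
```

    Because the scale factor appears in both numerator and denominator, `features` and `features_rescaled` agree, whereas a threshold on a single gene's raw expression would need to be re-derived for each platform.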

  6. Gene Expression Ratios Lead to Accurate and Translatable Predictors of DR5 Agonism across Multiple Tumor Lineages

    PubMed Central

    Reddy, Anupama; Growney, Joseph D.; Wilson, Nick S.; Emery, Caroline M.; Johnson, Jennifer A.; Ward, Rebecca; Monaco, Kelli A.; Korn, Joshua; Monahan, John E.; Stump, Mark D.; Mapa, Felipa A.; Wilson, Christopher J.; Steiger, Janine; Ledell, Jebediah; Rickles, Richard J.; Myer, Vic E.; Ettenberg, Seth A.; Schlegel, Robert; Sellers, William R.

    2015-01-01

    Death Receptor 5 (DR5) agonists demonstrate anti-tumor activity in preclinical models but have yet to demonstrate robust clinical responses. A key limitation may be the lack of patient selection strategies to identify those most likely to respond to treatment. To overcome this limitation, we screened a DR5 agonist Nanobody across >600 cell lines representing 21 tumor lineages and assessed molecular features associated with response. High expression of DR5 and Casp8 were significantly associated with sensitivity, but their expression thresholds were difficult to translate due to low dynamic ranges. To address the translational challenge of establishing thresholds of gene expression, we developed a classifier based on ratios of genes that predicted response across lineages. The ratio classifier outperformed the DR5+Casp8 classifier, as well as standard approaches for feature selection and classification using genes, instead of ratios. This classifier was independently validated using 11 primary patient-derived pancreatic xenograft models showing perfect predictions as well as a striking linearity between prediction probability and anti-tumor response. A network analysis of the genes in the ratio classifier captured important biological relationships mediating drug response, specifically identifying key positive and negative regulators of DR5 mediated apoptosis, including DR5, CASP8, BID, cFLIP, XIAP and PEA15. Importantly, the ratio classifier shows translatability across gene expression platforms (from Affymetrix microarrays to RNA-seq) and across model systems (in vitro to in vivo). Our approach of using gene expression ratios presents a robust and novel method for constructing translatable biomarkers of compound response, which can also probe the underlying biology of treatment response. PMID:26378449

  7. Covariance Structure Models for Gene Expression Microarray Data

    ERIC Educational Resources Information Center

    Xie, Jun; Bentler, Peter M.

    2003-01-01

    Covariance structure models are applied to gene expression data using a factor model, a path model, and their combination. The factor model is based on a few factors that capture most of the expression information. A common factor of a group of genes may represent a common protein factor for the transcript of the co-expressed genes, and hence, it…

  8. Establishing gene models from the Pinus pinaster genome using gene capture and BAC sequencing.

    PubMed

    Seoane-Zonjic, Pedro; Cañas, Rafael A; Bautista, Rocío; Gómez-Maldonado, Josefa; Arrillaga, Isabel; Fernández-Pozo, Noé; Claros, M Gonzalo; Cánovas, Francisco M; Ávila, Concepción

    2016-02-27

    In the era of high-throughput DNA sequencing, assembling and understanding gymnosperm mega-genomes remains a challenge. Although drafts of three conifer genomes have recently been published, this number is too low to understand the full complexity of conifer genomes. Using techniques focused on specific genes, gene models can be established that can aid in the assembly of gene-rich regions, and this information can be used to compare genomes and understand functional evolution. In this study, gene capture technology combined with BAC isolation and sequencing was used as an experimental approach to establish de novo gene structures without a reference genome. Probes were designed for 866 maritime pine transcripts to sequence genes captured from genomic DNA. The gene models were constructed using GeneAssembler, a new bioinformatic pipeline, which reconstructed over 82% of the gene structures, and a high proportion (85%) of the captured gene models contained sequences from the promoter regulatory region. In a parallel experiment, the P. pinaster BAC library was screened to isolate clones containing genes whose cDNA sequences were already available. BAC clones containing the asparagine synthetase, sucrose synthase and xyloglucan endotransglycosylase gene sequences were isolated and used in this study. The gene models derived from the gene capture approach were compared with the genomic sequences derived from the BAC clones. This combined approach is a particularly efficient way to capture the genomic structures of gene families with a small number of members. The experimental approach used in this study is a valuable combined technique to study genomic gene structures in species for which a reference genome is unavailable. It can be used to establish exon/intron boundaries in unknown gene structures, to reconstruct incomplete genes and to obtain promoter sequences that can be used for transcriptional studies.
A bioinformatics algorithm (GeneAssembler) is also provided as a

  9. The Spike-and-Slab Lasso Generalized Linear Models for Prediction and Associated Genes Detection.

    PubMed

    Tang, Zaixiang; Shen, Yueping; Zhang, Xinyan; Yi, Nengjun

    2017-01-01

    Large-scale "omics" data have been increasingly used as an important resource for prognostic prediction of diseases and detection of associated genes. However, there are considerable challenges in analyzing high-dimensional molecular data, including the large number of potential molecular predictors, limited number of samples, and small effect of each predictor. We propose new Bayesian hierarchical generalized linear models, called spike-and-slab lasso GLMs, for prognostic prediction and detection of associated genes using large-scale molecular data. The proposed model employs a spike-and-slab mixture double-exponential prior for coefficients that can induce weak shrinkage on large coefficients, and strong shrinkage on irrelevant coefficients. We have developed a fast and stable algorithm to fit large-scale hierarchical GLMs by incorporating expectation-maximization (EM) steps into the fast cyclic coordinate descent algorithm. The proposed approach integrates nice features of two popular methods, i.e., penalized lasso and Bayesian spike-and-slab variable selection. The performance of the proposed method is assessed via extensive simulation studies. The results show that the proposed approach can provide not only more accurate estimates of the parameters, but also better prediction. We demonstrate the proposed procedure on two cancer data sets: a well-known breast cancer data set consisting of 295 tumors, and expression data of 4919 genes; and the ovarian cancer data set from TCGA with 362 tumors, and expression data of 5336 genes. Our analyses show that the proposed procedure can generate powerful models for predicting outcomes and detecting associated genes. The methods have been implemented in a freely available R package BhGLM (http://www.ssg.uab.edu/bhglm/).
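
    For orientation, a spike-and-slab mixture double-exponential prior of the kind described in this abstract is commonly written as below. This is a sketch in notation assumed here (the indicator \gamma_j, the scales s_0 and s_1, and the inclusion probability \theta are not given in the abstract), not a formula quoted from the paper.

```latex
\beta_j \mid \gamma_j \;\sim\; \mathrm{DE}(\beta_j \mid 0, S_j)
  = \frac{1}{2 S_j} \exp\!\left(-\frac{|\beta_j|}{S_j}\right),
\qquad
S_j = (1-\gamma_j)\, s_0 + \gamma_j\, s_1,
\qquad
0 < s_0 < s_1,
\qquad
\gamma_j \sim \mathrm{Bernoulli}(\theta).
```

    With \gamma_j = 0 the small scale s_0 gives a narrow "spike" that shrinks the coefficient strongly toward zero; with \gamma_j = 1 the larger scale s_1 gives a diffuse "slab" that shrinks only weakly, which matches the weak/strong shrinkage behaviour the abstract describes.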

  10. A chain reaction approach to modelling gene pathways.

    PubMed

    Cheng, Gary C; Chen, Dung-Tsa; Chen, James J; Soong, Seng-Jaw; Lamartiniere, Coral; Barnes, Stephen

    2012-08-01

    BACKGROUND: Of great interest in cancer prevention is how nutrient components affect gene pathways associated with the physiological events of puberty. Nutrient-gene interactions may cause changes in breast or prostate cells and, therefore, may result in cancer risk later in life. Analysis of gene pathways can lead to insights about nutrient-gene interactions and the development of more effective prevention approaches to reduce cancer risk. To date, researchers have relied heavily upon experimental assays (such as microarray analysis) to identify genes and their associated pathways that are affected by nutrients and diets. However, the vast number of genes and combinations of gene pathways, coupled with the expense of the experimental analyses, has delayed the progress of gene-pathway research. The development of an analytical approach based on available test data could greatly benefit the evaluation of gene pathways, and thus advance the study of nutrient-gene interactions in cancer prevention. In the present study, we have proposed a chain reaction model to simulate gene pathways, in which the gene expression changes through the pathway are represented by the species undergoing a set of chemical reactions. We have also developed a numerical tool to solve for the species changes due to the chain reactions over time. Through this approach we can examine the impact of nutrient-containing diets on the gene pathway; moreover, transformation of genes over time with a nutrient treatment can be observed numerically, which is very difficult to achieve experimentally. We apply this approach to microarray analysis data from an experiment which involved the effects of three polyphenols (nutrient treatments), epigallo-catechin-3-O-gallate (EGCG), genistein, and resveratrol, in a study of nutrient-gene interaction in the estrogen synthesis pathway during puberty.
RESULTS: In this preliminary study, the estrogen synthesis pathway was simulated by a chain reaction model. By

  11. A deep auto-encoder model for gene expression prediction.

    PubMed

    Xie, Rui; Wen, Jia; Quitadamo, Andrew; Cheng, Jianlin; Shi, Xinghua

    2017-11-17

    Gene expression is a key intermediate level through which genotypes lead to a particular trait. Gene expression is affected by various factors including genotypes of genetic variants. With the aim of delineating the genetic impact on gene expression, we build a deep auto-encoder model to assess how well genetic variants contribute to gene expression changes. This new deep learning model is a regression-based predictive model based on the MultiLayer Perceptron and Stacked Denoising Auto-encoder (MLP-SAE). The model is trained using a stacked denoising auto-encoder for feature selection and a multilayer perceptron framework for backpropagation. We further improve the model by introducing dropout to prevent overfitting and improve performance. To demonstrate the usage of this model, we apply MLP-SAE to a real genomic dataset with genotypes and gene expression profiles measured in yeast. Our results show that the MLP-SAE model with dropout outperforms other models including Lasso, Random Forests and the MLP-SAE model without dropout. Using the MLP-SAE model with dropout, we show that gene expression quantifications predicted by the model solely based on genotypes align well with true gene expression patterns. We provide a deep auto-encoder model for predicting gene expression from SNP genotypes. This study demonstrates that deep learning is appropriate for tackling another genomic problem, i.e., building predictive models to understand genotypes' contribution to gene expression. With the emerging availability of richer genomic data, we anticipate that deep learning models will play a bigger role in modeling and interpreting genomics.

  12. WegoLoc: accurate prediction of protein subcellular localization using weighted Gene Ontology terms.

    PubMed

    Chi, Sang-Mun; Nam, Dougu

    2012-04-01

    We present an accurate and fast web server, WegoLoc, for predicting subcellular localization of proteins based on sequence similarity and weighted Gene Ontology (GO) information. A term weighting method in the text categorization process is applied to GO terms for a support vector machine classifier. As a result, WegoLoc surpasses the state-of-the-art methods for previously used test datasets. WegoLoc supports three eukaryotic kingdoms (animals, fungi and plants) and provides human-specific analysis, and covers several sets of cellular locations. In addition, WegoLoc provides (i) multiple possible localizations of input protein(s) as well as their corresponding probability scores, (ii) weights of GO terms representing the contribution of each GO term in the prediction, and (iii) a BLAST E-value for the best hit with GO terms. If the similarity score does not meet a given threshold, an amino acid composition-based prediction is applied as a backup method. Availability: WegoLoc and User's guide are freely available at the website http://www.btool.org/WegoLoc. Contact: smchiks@ks.ac.kr; dougnam@unist.ac.kr. Supplementary data is available at http://www.btool.org/WegoLoc.

  13. Dynamics Modelling of Biolistic Gene Guns

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, M.; Tao, W.; Pianetta, P.A.

    2009-06-04

    The gene transfer process using biolistic gene guns is a highly dynamic process. To achieve good performance, the process needs to be well understood and controlled. Unfortunately, no dynamic model is available in the open literature for analysing and controlling the process. This paper proposes such a model. Relationships of the penetration depth with the helium pressure, the penetration depth with the acceleration distance, and the penetration depth with the micro-carrier radius are presented. Simulations have also been conducted. The results agree well with experimental results in the open literature. The contribution of this paper includes a dynamic model for improving and manipulating performance of the biolistic gene gun.

  14. Numerically accurate computational techniques for optimal estimator analyses of multi-parameter models

    NASA Astrophysics Data System (ADS)

    Berger, Lukas; Kleinheinz, Konstantin; Attili, Antonio; Bisetti, Fabrizio; Pitsch, Heinz; Mueller, Michael E.

    2018-05-01

    Modelling unclosed terms in partial differential equations typically involves two steps: First, a set of known quantities needs to be specified as input parameters for a model, and second, a specific functional form needs to be defined to model the unclosed terms by the input parameters. Both steps involve a certain modelling error, with the former known as the irreducible error and the latter referred to as the functional error. Typically, only the total modelling error, which is the sum of functional and irreducible error, is assessed, but the concept of the optimal estimator enables the separate analysis of the total and the irreducible errors, yielding a systematic modelling error decomposition. In this work, attention is paid to the techniques themselves required for the practical computation of irreducible errors. Typically, histograms are used for optimal estimator analyses, but this technique is found to add a non-negligible spurious contribution to the irreducible error if models with multiple input parameters are assessed. Thus, the error decomposition of an optimal estimator analysis becomes inaccurate, and misleading conclusions concerning modelling errors may be drawn. In this work, numerically accurate techniques for optimal estimator analyses are identified and a suitable evaluation of irreducible errors is presented. Four different computational techniques are considered: a histogram technique, artificial neural networks, multivariate adaptive regression splines, and an additive model based on a kernel method. For multiple input parameter models, only artificial neural networks and multivariate adaptive regression splines are found to yield satisfactorily accurate results. Beyond a certain number of input parameters, the assessment of models in an optimal estimator analysis even becomes practically infeasible if histograms are used. The optimal estimator analysis in this paper is applied to modelling the filtered soot intermittency in large eddy

  15. Evaluation of amplification refractory mutation system (ARMS) technique for quick and accurate prenatal gene diagnosis of CHM variant in choroideremia.

    PubMed

    Yang, Lisha; Ijaz, Iqra; Cheng, Jingliang; Wei, Chunli; Tan, Xiaojun; Khan, Md Asaduzzaman; Fu, Xiaodong; Fu, Junjiang

    2018-01-01

    Choroideremia is a rare X-linked recessive inherited disorder that causes chorioretinal dystrophy, leading to visual impairment in its early stages and finally to total blindness in the affected person. It is caused by mutations in the CHM gene. In this study, we have recruited a pedigree with choroideremia and detected a nonsense variant (c.C799T:p.R267X) in CHM of the proband (I:1). Different primer sets for amplification refractory mutation system (ARMS) were designed and PCR conditions were optimized. Then, we evaluated the sequence variant in the patient, carrier, and a fetus by using the ARMS technique to identify whether they inherited the pathogenic gene from the parental generation; we used amniotic fluid DNA for the diagnosis of the gene in the fetus. The primer pairs, WT2+C and MT+C, amplified highly specific products in different DNAs which were verified by Sanger sequencing. Based on our results, the ARMS technique is a fast, accurate, and reliable prenatal gene diagnostic tool to assess CHM variants. Taken together, our study indicates that the ARMS technique can be used as a potential molecular tool in the prenatal diagnosis of mutations for choroideremia as well as other genetic diseases in undeveloped and developing countries, where there might be a shortage of medical resources and supplies.

  16. Accurate and scalable social recommendation using mixed-membership stochastic block models.

    PubMed

    Godoy-Lorite, Antonia; Guimerà, Roger; Moore, Cristopher; Sales-Pardo, Marta

    2016-12-13

    With increasing amounts of information available, modeling and predicting user preferences (for books or articles, for example) are becoming more important. We present a collaborative filtering model, with an associated scalable algorithm, that makes accurate predictions of users' ratings. Like previous approaches, we assume that there are groups of users and of items and that the rating a user gives an item is determined by their respective group memberships. However, we allow each user and each item to belong simultaneously to mixtures of different groups and, unlike many popular approaches such as matrix factorization, we do not assume that users in each group prefer a single group of items. In particular, we do not assume that ratings depend linearly on a measure of similarity, but allow probability distributions of ratings to depend freely on the user's and item's groups. The resulting overlapping groups and predicted ratings can be inferred with an expectation-maximization algorithm whose running time scales linearly with the number of observed ratings. Our approach enables us to predict user preferences in large datasets and is considerably more accurate than the current algorithms for such large datasets.
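
    The mixed-membership idea in this abstract can be sketched as a rating distribution that mixes over user groups and item groups. The snippet below is a minimal illustration under assumed notation (`theta_u` for a user's group memberships, `eta_i` for an item's, and `p[k][l]` for the rating distribution of user group k with item group l); all numbers are hypothetical, and the expectation-maximization fit mentioned in the abstract is not shown.

```python
def rating_distribution(theta_u, eta_i, p):
    """Pr[rating = r] = sum over k, l of theta_u[k] * eta_i[l] * p[k][l][r]."""
    n_ratings = len(p[0][0])
    dist = [0.0] * n_ratings
    for k, user_weight in enumerate(theta_u):
        for l, item_weight in enumerate(eta_i):
            for r in range(n_ratings):
                dist[r] += user_weight * item_weight * p[k][l][r]
    return dist

# Hypothetical memberships: each vector sums to 1.
theta_u = [0.7, 0.3]   # user belongs 70% to user group 0, 30% to group 1
eta_i = [0.4, 0.6]     # item belongs 40% to item group 0, 60% to group 1

# Hypothetical per-group-pair distributions over three rating values;
# each innermost list sums to 1 and need not depend linearly on anything.
p = [
    [[0.1, 0.2, 0.7], [0.6, 0.3, 0.1]],
    [[0.3, 0.4, 0.3], [0.2, 0.2, 0.6]],
]

dist = rating_distribution(theta_u, eta_i, p)
most_likely_rating = max(range(len(dist)), key=lambda r: dist[r])
```

    Because every `p[k][l]` is a free distribution over ratings, nothing forces the predicted rating to grow with a similarity score, which is the contrast with matrix factorization drawn in the abstract.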

  17. Accurate and scalable social recommendation using mixed-membership stochastic block models

    PubMed Central

    Godoy-Lorite, Antonia; Moore, Cristopher

    2016-01-01

    With increasing amounts of information available, modeling and predicting user preferences—for books or articles, for example—are becoming more important. We present a collaborative filtering model, with an associated scalable algorithm, that makes accurate predictions of users’ ratings. Like previous approaches, we assume that there are groups of users and of items and that the rating a user gives an item is determined by their respective group memberships. However, we allow each user and each item to belong simultaneously to mixtures of different groups and, unlike many popular approaches such as matrix factorization, we do not assume that users in each group prefer a single group of items. In particular, we do not assume that ratings depend linearly on a measure of similarity, but allow probability distributions of ratings to depend freely on the user’s and item’s groups. The resulting overlapping groups and predicted ratings can be inferred with an expectation-maximization algorithm whose running time scales linearly with the number of observed ratings. Our approach enables us to predict user preferences in large datasets and is considerably more accurate than the current algorithms for such large datasets. PMID:27911773

  18. Likelihood-based gene annotations for gap filling and quality assessment in genome-scale metabolic models

    DOE PAGES

    Benedict, Matthew N.; Mundy, Michael B.; Henry, Christopher S.; ...

    2014-10-16

    Genome-scale metabolic models provide a powerful means to harness information from genomes to deepen biological insights. With exponentially increasing sequencing capacity, there is an enormous need for automated reconstruction techniques that can provide more accurate models in a short time frame. Current methods for automated metabolic network reconstruction rely on gene and reaction annotations to build draft metabolic networks and algorithms to fill gaps in these networks. However, automated reconstruction is hampered by database inconsistencies, incorrect annotations, and gap filling largely without considering genomic information. Here we develop an approach for applying genomic information to predict alternative functions for genes and estimate their likelihoods from sequence homology. We show that computed likelihood values were significantly higher for annotations found in manually curated metabolic networks than those that were not. We then apply these alternative functional predictions to estimate reaction likelihoods, which are used in a new gap filling approach called likelihood-based gap filling to predict more genomically consistent solutions. To validate the likelihood-based gap filling approach, we applied it to models where essential pathways were removed, finding that likelihood-based gap filling identified more biologically relevant solutions than parsimony-based gap filling approaches. We also demonstrate that models gap filled using likelihood-based gap filling provide greater coverage and genomic consistency with metabolic gene functions compared to parsimony-based approaches. Interestingly, despite these findings, we found that likelihoods did not significantly affect consistency of gap filled models with Biolog and knockout lethality data. This indicates that the phenotype data alone cannot necessarily be used to discriminate between alternative solutions for gap filling and therefore, that the use of other information is necessary

  19. Dynamic sensing model for accurate delectability of environmental phenomena using event wireless sensor network

    NASA Astrophysics Data System (ADS)

    Missif, Lial Raja; Kadhum, Mohammad M.

    2017-09-01

    Wireless Sensor Networks (WSNs) have been widely used for monitoring, where sensors are deployed to operate independently to sense abnormal phenomena. Most of the proposed environmental monitoring systems are designed based on a predetermined sensing range which does not reflect the sensor reliability, event characteristics, and the environment conditions. Measuring the capability of a sensor node to accurately detect an event within a sensing field is of great importance for monitoring applications. This paper presents an efficient mechanism for event detection based on a probabilistic sensing model. Different models have been presented theoretically in this paper to examine their adaptability and applicability to real environment applications. The numerical results of the experimental evaluation have shown that the probabilistic sensing model provides accurate observation and detectability of an event, and it can be utilized for different environment scenarios.

  20. Disease gene prioritization by integrating tissue-specific molecular networks using a robust multi-network model.

    PubMed

    Ni, Jingchao; Koyuturk, Mehmet; Tong, Hanghang; Haines, Jonathan; Xu, Rong; Zhang, Xiang

    2016-11-10

    recover true associations more accurately than other methods in terms of AUC values, and the performance differences are significant (with paired t-test p-values less than 0.05). This validates the importance of integrating tissue-specific molecular networks for studying disease gene prioritization and shows the superiority of our network models and ranking algorithms toward this purpose. The source code and datasets are available at http://nijingchao.github.io/CRstar/.

  1. Getting a Picture that Is Both Accurate and Stable: Situation Models and Epistemic Validation

    ERIC Educational Resources Information Center

    Schroeder, Sascha; Richter, Tobias; Hoever, Inga

    2008-01-01

    Text comprehension entails the construction of a situation model that prepares individuals for situated action. In order to meet this function, situation model representations are required to be both accurate and stable. We propose a framework according to which comprehenders rely on epistemic validation to prevent inaccurate information from…

  2. Accurate Treatment of Collision and Water-Delivery in Models of Terrestrial Planet Formation

    NASA Astrophysics Data System (ADS)

    Haghighipour, N.; Maindl, T. I.; Schaefer, C. M.; Wandel, O.

    2017-08-01

    We have developed a comprehensive approach in simulating collisions and growth of embryos to terrestrial planets where we use a combination of SPH and N-body codes to model collisions and the transfer of water and chemical compounds accurately.

  3. System-level insights into the cellular interactome of a non-model organism: inferring, modelling and analysing functional gene network of soybean (Glycine max).

    PubMed

    Xu, Yungang; Guo, Maozu; Zou, Quan; Liu, Xiaoyan; Wang, Chunyu; Liu, Yang

    2014-01-01

    Cellular interactome, in which genes and/or their products interact on several levels, forming transcriptional regulatory-, protein interaction-, metabolic-, signal transduction networks, etc., has attracted decades of research focuses. However, such a specific type of network alone can hardly explain the various interactive activities among genes. These networks characterize different interaction relationships, implying their unique intrinsic properties and defects, and covering different slices of biological information. Functional gene network (FGN), a consolidated interaction network that models fuzzy and more generalized notion of gene-gene relations, have been proposed to combine heterogeneous networks with the goal of identifying functional modules supported by multiple interaction types. There are yet no successful precedents of FGNs on sparsely studied non-model organisms, such as soybean (Glycine max), due to the absence of sufficient heterogeneous interaction data. We present an alternative solution for inferring the FGNs of soybean (SoyFGNs), in a pioneering study on the soybean interactome, which is also applicable to other organisms. SoyFGNs exhibit the typical characteristics of biological networks: scale-free, small-world architecture and modularization. Verified by co-expression and KEGG pathways, SoyFGNs are more extensive and accurate than an orthology network derived from Arabidopsis. As a case study, network-guided disease-resistance gene discovery indicates that SoyFGNs can provide system-level studies on gene functions and interactions. This work suggests that inferring and modelling the interactome of a non-model plant are feasible. It will speed up the discovery and definition of the functions and interactions of other genes that control important functions, such as nitrogen fixation and protein or lipid synthesis. 
The efforts of the study are the basis of our further comprehensive studies on the soybean functional interactome at the genome

  4. System-Level Insights into the Cellular Interactome of a Non-Model Organism: Inferring, Modelling and Analysing Functional Gene Network of Soybean (Glycine max)

    PubMed Central

    Xu, Yungang; Guo, Maozu; Zou, Quan; Liu, Xiaoyan; Wang, Chunyu; Liu, Yang

    2014-01-01

    The cellular interactome, in which genes and/or their products interact on several levels to form transcriptional regulatory, protein interaction, metabolic, and signal transduction networks, among others, has been a focus of research for decades. However, any single type of network alone can hardly explain the various interactive activities among genes. These networks characterize different interaction relationships, implying their unique intrinsic properties and defects, and cover different slices of biological information. Functional gene networks (FGNs), consolidated interaction networks that model a fuzzy and more generalized notion of gene-gene relations, have been proposed to combine heterogeneous networks with the goal of identifying functional modules supported by multiple interaction types. There are as yet no successful precedents of FGNs for sparsely studied non-model organisms, such as soybean (Glycine max), due to the absence of sufficient heterogeneous interaction data. We present an alternative solution for inferring the FGNs of soybean (SoyFGNs), in a pioneering study on the soybean interactome, which is also applicable to other organisms. SoyFGNs exhibit the typical characteristics of biological networks: scale-free, small-world architecture and modularization. Verified by co-expression and KEGG pathways, SoyFGNs are more extensive and accurate than an orthology network derived from Arabidopsis. As a case study, network-guided disease-resistance gene discovery indicates that SoyFGNs can support system-level studies on gene functions and interactions. This work suggests that inferring and modelling the interactome of a non-model plant are feasible. It will speed up the discovery and definition of the functions and interactions of other genes that control important functions, such as nitrogen fixation and protein or lipid synthesis. The efforts of the study are the basis of our further comprehensive studies on the soybean functional interactome at the genome

  5. Modeling of capacitor charging dynamics in an energy harvesting system considering accurate electromechanical coupling effects

    NASA Astrophysics Data System (ADS)

    Bagheri, Shahriar; Wu, Nan; Filizadeh, Shaahin

    2018-06-01

    This paper presents an iterative numerical method that accurately models an energy harvesting system charging a capacitor with piezoelectric patches. The constitutive relations of piezoelectric materials connected with an external charging circuit with a diode bridge and capacitors lead to the electromechanical coupling effect and the difficulty of deriving accurate transient mechanical response, as well as the charging progress. The proposed model is built upon the Euler-Bernoulli beam theory and takes into account the electromechanical coupling effects as well as the dynamic process of charging an external storage capacitor. The model is validated through experimental tests on a cantilever beam coated with piezoelectric patches. Several parametric studies are performed and the functionality of the model is verified. The efficiency of power harvesting system can be predicted and tuned considering variations in different design parameters. Such a model can be utilized to design robust and optimal energy harvesting system.

  6. Obtaining Accurate Probabilities Using Classifier Calibration

    ERIC Educational Resources Information Center

    Pakdaman Naeini, Mahdi

    2016-01-01

    Learning probabilistic classification and prediction models that generate accurate probabilities is essential in many prediction and decision-making tasks in machine learning and data mining. One way to achieve this goal is to post-process the output of classification models to obtain more accurate probabilities. These post-processing methods are…
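
    As an illustration of what such post-processing can look like, the sketch below implements histogram binning, one widely used calibration technique. The record above is truncated and does not say which methods the work covers, so the choice of technique, the function names, and the toy scores and labels are all assumptions made for this example.

```python
def fit_histogram_binning(scores, labels, n_bins=10):
    """Learn one calibrated probability per equal-width score bin.

    scores: uncalibrated classifier outputs in [0, 1] (held-out data)
    labels: matching 0/1 ground-truth labels
    """
    positives = [0] * n_bins
    counts = [0] * n_bins
    for s, y in zip(scores, labels):
        b = min(int(s * n_bins), n_bins - 1)  # a score of 1.0 goes in the last bin
        counts[b] += 1
        positives[b] += y
    # Empty bins fall back to the bin midpoint (an arbitrary choice for this sketch).
    return [positives[b] / counts[b] if counts[b] else (b + 0.5) / n_bins
            for b in range(n_bins)]


def calibrate(score, bin_probs):
    """Map a raw score to the observed positive rate of its bin."""
    n_bins = len(bin_probs)
    return bin_probs[min(int(score * n_bins), n_bins - 1)]


# Hypothetical held-out scores and labels, for illustration only.
held_out_scores = [0.10, 0.15, 0.80, 0.85, 0.90, 0.95]
held_out_labels = [0, 1, 1, 1, 0, 1]
bin_probs = fit_histogram_binning(held_out_scores, held_out_labels, n_bins=2)
print(bin_probs)                  # [0.5, 0.75]
print(calibrate(0.9, bin_probs))  # 0.75
```

    The raw score 0.9 is replaced by 0.75, the fraction of positives actually observed among held-out examples with scores in the same bin.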

  7. Validation of an Accurate Three-Dimensional Helical Slow-Wave Circuit Model

    NASA Technical Reports Server (NTRS)

    Kory, Carol L.

    1997-01-01

    The helical slow-wave circuit embodies a helical coil of rectangular tape supported in a metal barrel by dielectric support rods. Although the helix slow-wave circuit remains the mainstay of the traveling-wave tube (TWT) industry because of its exceptionally wide bandwidth, a full helical circuit, without significant dimensional approximations, has not been successfully modeled until now. Numerous attempts have been made to analyze the helical slow-wave circuit so that the performance could be accurately predicted without actually building it, but because of its complex geometry, many geometrical approximations became necessary rendering the previous models inaccurate. In the course of this research it has been demonstrated that using the simulation code, MAFIA, the helical structure can be modeled with actual tape width and thickness, dielectric support rod geometry and materials. To demonstrate the accuracy of the MAFIA model, the cold-test parameters including dispersion, on-axis interaction impedance and attenuation have been calculated for several helical TWT slow-wave circuits with a variety of support rod geometries including rectangular and T-shaped rods, as well as various support rod materials including isotropic, anisotropic and partially metal coated dielectrics. Compared with experimentally measured results, the agreement is excellent. With the accuracy of the MAFIA helical model validated, the code was used to investigate several conventional geometric approximations in an attempt to obtain the most computationally efficient model. Several simplifications were made to a standard model including replacing the helical tape with filaments, and replacing rectangular support rods with shapes conforming to the cylindrical coordinate system with effective permittivity. The approximate models are compared with the standard model in terms of cold-test characteristics and computational time. The model was also used to determine the sensitivity of various

  8. Fractionation and reconstitution of factors required for accurate transcription of mammalian ribosomal RNA genes: identification of a species-dependent initiation factor.

    PubMed Central

    Mishima, Y; Financsek, I; Kominami, R; Muramatsu, M

    1982-01-01

    Mouse and human cell extracts (S100) can support an accurate and efficient transcription initiation on homologous ribosomal RNA gene (rDNA) templates. The cell extracts were fractionated with the aid of a phosphocellulose column into four fractions (termed A, B, C and D), including one containing a major part of the RNA polymerase I activity. Various reconstitution experiments indicate that fraction D is an absolute requirement for the correct and efficient transcription initiation by RNA polymerase I on both mouse and human genes. Fraction B effectively suppresses random initiation on these templates. Fraction A appears to further enhance the transcription which takes place with fractions C and D. Although fractions A, B and C are interchangeable between mouse and human extracts, fraction D is not; i.e. initiation of transcription required the presence of a homologous fraction D for both templates. The factor(s) in fraction D, however, is not literally species-specific, since mouse D fraction is capable of supporting accurate transcription initiation on a rat rDNA template in the presence of all the other fractions from human cell extract under the conditions where human D fraction is unable to support it. We conclude from these experiments that a species-dependent factor in fraction D plays an important role in the initiation of rDNA transcription in each animal species. PMID:7177852

  9. PconsD: ultra rapid, accurate model quality assessment for protein structure prediction.

    PubMed

    Skwark, Marcin J; Elofsson, Arne

    2013-07-15

    Clustering methods are often needed for accurately assessing the quality of modeled protein structures. A recent blind evaluation of quality assessment methods in CASP10 showed that there is little difference between many different methods as far as ranking models and selecting the best model are concerned. When comparing many models, the computational cost of the model comparison can become significant. Here, we present PconsD, a fast, stream-computing method for distance-driven model quality assessment that runs on consumer hardware. PconsD is at least one order of magnitude faster than other methods of comparable accuracy. The source code for PconsD is freely available at http://d.pcons.net/. Supplementary benchmarking data are also available there. Contact: arne@bioinfo.se. Supplementary data are available at Bioinformatics online.

  10. Improved animal models for testing gene therapy for atherosclerosis.

    PubMed

    Du, Liang; Zhang, Jingwan; De Meyer, Guido R Y; Flynn, Rowan; Dichek, David A

    2014-04-01

    Gene therapy delivered to the blood vessel wall could augment current therapies for atherosclerosis, including systemic drug therapy and stenting. However, identification of clinically useful vectors and effective therapeutic transgenes remains at the preclinical stage. Identification of effective vectors and transgenes would be accelerated by availability of animal models that allow practical and expeditious testing of vessel-wall-directed gene therapy. Such models would include humanlike lesions that develop rapidly in vessels that are amenable to efficient gene delivery. Moreover, because human atherosclerosis develops in normal vessels, gene therapy that prevents atherosclerosis is most logically tested in relatively normal arteries. Similarly, gene therapy that causes atherosclerosis regression requires gene delivery to an existing lesion. Here we report development of three new rabbit models for testing vessel-wall-directed gene therapy that either prevents or reverses atherosclerosis. Carotid artery intimal lesions in these new models develop within 2-7 months after initiation of a high-fat diet and are 20-80 times larger than lesions in a model we described previously. Individual models allow generation of lesions that are relatively rich in either macrophages or smooth muscle cells, permitting testing of gene therapy strategies targeted at either cell type. Two of the models include gene delivery to essentially normal arteries and will be useful for identifying strategies that prevent lesion development. The third model generates lesions rapidly in vector-naïve animals and can be used for testing gene therapy that promotes lesion regression. These models are optimized for testing helper-dependent adenovirus (HDAd)-mediated gene therapy; however, they could be easily adapted for testing of other vectors or of different types of molecular therapies, delivered directly to the blood vessel wall. Our data also supports the promise of HDAd to deliver long

  11. Accurate, simple, and inexpensive assays to diagnose F8 gene inversion mutations in hemophilia A patients and carriers.

    PubMed

    Dutta, Debargh; Gunasekera, Devi; Ragni, Margaret V; Pratt, Kathleen P

    2016-12-27

    The most frequent mutations resulting in hemophilia A are an intron 22 or intron 1 gene inversion, which together cause ∼50% of severe hemophilia A cases. We report a simple and accurate RNA-based assay to detect these mutations in patients and heterozygous carriers. The assays do not require specialized equipment or expensive reagents; therefore, they may provide useful and economic protocols that could be standardized for central laboratory testing. RNA is purified from a blood sample, and reverse transcription nested polymerase chain reaction (RT-NPCR) reactions amplify DNA fragments with the F8 sequence spanning the exon 22 to 23 splice site (intron 22 inversion test) or the exon 1 to 2 splice site (intron 1 inversion test). These sequences will be amplified only from F8 RNA without an intron 22 or intron 1 inversion mutation, respectively. Additional RT-NPCR reactions are then carried out to amplify the inverted sequences extending from F8 exon 19 to the first in-frame stop codon within intron 22 or a chimeric transcript containing F8 exon 1 and the VBP1 gene. These latter 2 products are produced only by individuals with an intron 22 or intron 1 inversion mutation, respectively. The intron 22 inversion mutations may be further classified (eg, as type 1 or type 2, reflecting the specific homologous recombination sites) by the standard DNA-based "inverse-shifting" PCR assay if desired. Efficient Bcl I and T4 DNA ligase enzymes that cleave and ligate DNA in minutes were used, which is a substantial improvement over previous protocols that required overnight incubations. These protocols can accurately detect F8 inversion mutations via same-day testing of patient samples.

  12. Rapid and accurate synthesis of TALE genes from synthetic oligonucleotides.

    PubMed

    Wang, Fenghua; Zhang, Hefei; Gao, Jingxia; Chen, Fengjiao; Chen, Sijie; Zhang, Cuizhen; Peng, Gang

    2016-01-01

    Custom synthesis of transcription activator-like effector (TALE) genes has relied upon plasmid libraries of pre-fabricated TALE-repeat monomers or oligomers. Here we describe a novel synthesis method that directly incorporates annealed synthetic oligonucleotides into the TALE-repeat units. Our approach utilizes iterative sets of oligonucleotides and a translational frame check strategy to ensure the high efficiency and accuracy of TALE-gene synthesis. TALE arrays of more than 20 repeats can be constructed, and the majority of the synthesized constructs have perfect sequences. In addition, this novel oligonucleotide-based method can readily accommodate design changes to the TALE repeats. We demonstrated an increased gene targeting efficiency against a genomic site containing a potentially methylated cytosine by incorporating non-conventional repeat variable di-residue (RVD) sequences.

  13. A Dynamical Model Reveals Gene Co-Localizations in Nucleus

    PubMed Central

    Yao, Ye; Lin, Wei; Hennessy, Conor; Fraser, Peter; Feng, Jianfeng

    2011-01-01

    Co-localization of networks of genes in the nucleus is thought to play an important role in determining gene expression patterns. Based upon experimental data, we built a dynamical model to test whether pure diffusion could account for the observed co-localization of genes within a defined subnuclear region. A simple standard Brownian motion model in two and three dimensions shows that preferential co-localization is possible for co-regulated genes without any direct interaction, and suggests the occurrence may be due to a limitation in the number of available transcription factors. Experimental data on chromatin movements demonstrate that fractional rather than standard Brownian motion is more appropriate to model gene mobilizations, and we tested our dynamical model against recent static experimental data, using a sub-diffusion process by which the genes tend to colocalize more easily. Moreover, in order to compare our model with recently obtained experimental data, we studied the association level between genes and factors, and presented data supporting the validation of this dynamic model. As a further application, we tested our model against additional biological observations. We found that increasing transcription factor number, rather than factory number and nucleus size, might be the reason for decreasing gene co-localization. In the scenario of frequency- or amplitude-modulation of transcription factors, our model predicted that frequency-modulation may increase the co-localization between its targeted genes. PMID:21760760
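
    To make the "standard Brownian motion" baseline concrete, the sketch below simulates free two-dimensional diffusion and estimates the mean squared displacement (MSD), which for standard Brownian motion grows linearly in time as 2·d·D·t. This is not the authors' model: it has no nuclear boundary, transcription factors, or factories, and the diffusion coefficient, time step, and function name are assumptions chosen for illustration. Sub-diffusive (fractional) motion, which the abstract reports as the better description, would instead show an MSD growing as a power of t with exponent below 1, and is not implemented here.

```python
import random


def mean_squared_displacement(n_walkers, n_steps, diffusion=1.0, dt=0.01,
                              dims=2, seed=0):
    """Estimate the MSD of free standard Brownian motion after n_steps * dt.

    Each coordinate receives an independent Gaussian increment with
    variance 2 * diffusion * dt per step.
    """
    rng = random.Random(seed)
    sigma = (2.0 * diffusion * dt) ** 0.5
    total = 0.0
    for _ in range(n_walkers):
        pos = [0.0] * dims
        for _ in range(n_steps):
            for k in range(dims):
                pos[k] += rng.gauss(0.0, sigma)
        total += sum(x * x for x in pos)
    return total / n_walkers


# Hypothetical parameters: total time t = 100 * 0.01 = 1.0, so the
# theoretical MSD is 2 * dims * diffusion * t = 4.0.
msd = mean_squared_displacement(n_walkers=2000, n_steps=100)
print(msd)  # a Monte Carlo estimate that should lie close to 4.0
```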

  14. Accurately modeling Gaussian beam propagation in the context of Monte Carlo techniques

    NASA Astrophysics Data System (ADS)

    Hokr, Brett H.; Winblad, Aidan; Bixler, Joel N.; Elpers, Gabriel; Zollars, Byron; Scully, Marlan O.; Yakovlev, Vladislav V.; Thomas, Robert J.

    2016-03-01

    Monte Carlo simulations are widely considered to be the gold standard for studying the propagation of light in turbid media. However, traditional Monte Carlo methods fail to account for diffraction because they treat light as a particle. This results in converging beams focusing to a point instead of a diffraction-limited spot, greatly affecting the accuracy of Monte Carlo simulations near the focal plane. Here, we present a technique capable of simulating a focusing beam in accordance with the rules of Gaussian optics, resulting in a diffraction-limited focal spot. This technique can be easily implemented into any traditional Monte Carlo simulation, allowing existing models to be converted to include accurate focusing geometries with minimal effort. We will present results for a focusing beam in a layered tissue model, demonstrating that for different scenarios the region of highest intensity, and thus of greatest heating, can change from the surface to the focus. The ability to simulate accurate focusing geometries will greatly enhance the usefulness of Monte Carlo for countless applications, including studying laser tissue interactions in medical applications and light propagation through turbid media.
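
    The "rules of Gaussian optics" referred to above include the standard expression for the 1/e² beam radius, w(z) = w0·sqrt(1 + (z/zR)²) with Rayleigh range zR = π·w0²/λ, which gives a finite spot of radius w0 at the focus rather than a point. The sketch below only evaluates that textbook formula. It does not reproduce the authors' photon-launching technique, and the waist and wavelength values are assumptions picked for illustration.

```python
import math


def rayleigh_range(w0, wavelength):
    """Rayleigh range z_R = pi * w0**2 / wavelength (same length units as w0)."""
    return math.pi * w0 ** 2 / wavelength


def beam_radius(z, w0, wavelength):
    """1/e^2 radius of an ideal Gaussian beam at distance z from the waist."""
    z_r = rayleigh_range(w0, wavelength)
    return w0 * math.sqrt(1.0 + (z / z_r) ** 2)


# Hypothetical beam: 10 micrometre waist at a wavelength of 1064 nm.
w0 = 10e-6
wavelength = 1064e-9
z_r = rayleigh_range(w0, wavelength)

print(beam_radius(0.0, w0, wavelength))  # equals w0: the focal spot is not a point
print(beam_radius(z_r, w0, wavelength))  # sqrt(2) * w0 at one Rayleigh range
```

    A particle-only Monte Carlo launch that aims every photon at the geometric focus corresponds to w0 = 0, which is the failure mode the abstract describes.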

  15. Bottom-up coarse-grained models that accurately describe the structure, pressure, and compressibility of molecular liquids

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dunn, Nicholas J. H.; Noid, W. G.

    2015-12-28

    The present work investigates the capability of bottom-up coarse-graining (CG) methods for accurately modeling both structural and thermodynamic properties of all-atom (AA) models for molecular liquids. In particular, we consider 1-, 2-, and 3-site CG models for heptane, as well as 1- and 3-site CG models for toluene. For each model, we employ the multiscale coarse-graining method to determine interaction potentials that optimally approximate the configuration dependence of the many-body potential of mean force (PMF). We employ a previously developed “pressure-matching” variational principle to determine a volume-dependent contribution to the potential, U_V(V), that approximates the volume-dependence of the PMF. We demonstrate that the resulting CG models describe AA density fluctuations with qualitative, but not quantitative, accuracy. Accordingly, we develop a self-consistent approach for further optimizing U_V, such that the CG models accurately reproduce the equilibrium density, compressibility, and average pressure of the AA models, although the CG models still significantly underestimate the atomic pressure fluctuations. Additionally, by comparing this array of models that accurately describe the structure and thermodynamic pressure of heptane and toluene at a range of different resolutions, we investigate the impact of bottom-up coarse-graining upon thermodynamic properties. In particular, we demonstrate that U_V accounts for the reduced cohesion in the CG models. Finally, we observe that bottom-up coarse-graining introduces subtle correlations between the resolution, the cohesive energy density, and the “simplicity” of the model.

  16. Bayesian parameter estimation of a k-ε model for accurate jet-in-crossflow simulations

    DOE PAGES

    Ray, Jaideep; Lefantzi, Sophia; Arunajatesan, Srinivasan; ...

    2016-05-31

    Reynolds-averaged Navier–Stokes models are not very accurate for high-Reynolds-number compressible jet-in-crossflow interactions. The inaccuracy arises from the use of inappropriate model parameters and model-form errors in the Reynolds-averaged Navier–Stokes model. In this study, the hypothesis is pursued that Reynolds-averaged Navier–Stokes predictions can be significantly improved by using parameters inferred from experimental measurements of a supersonic jet interacting with a transonic crossflow.

  17. Markov State Models of gene regulatory networks.

    PubMed

    Chu, Brian K; Tse, Margaret J; Sato, Royce R; Read, Elizabeth L

    2017-02-06

    Gene regulatory networks with dynamics characterized by multiple stable states underlie cell fate-decisions. Quantitative models that can link molecular-level knowledge of gene regulation to a global understanding of network dynamics have the potential to guide cell-reprogramming strategies. Networks are often modeled by the stochastic Chemical Master Equation, but methods for systematic identification of key properties of the global dynamics are currently lacking. The Markov State Model approach identifies the number, phenotypes, and lifetimes of long-lived states for a set of common gene regulatory network models. Application of transition path theory to the constructed Markov State Model decomposes global dynamics into a set of dominant transition paths and associated relative probabilities for stochastic state-switching. In this proof-of-concept study, we found that the Markov State Model provides a general framework for analyzing and visualizing stochastic multistability and state-transitions in gene networks. Our results suggest that this framework, adopted from the field of atomistic Molecular Dynamics, can be a useful tool for quantitative Systems Biology at the network scale.
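
    As a toy illustration of the quantities a Markov State Model exposes, the sketch below takes a two-state discrete-time transition matrix and computes the stationary distribution and the mean lifetime of each state. The matrix entries are hypothetical and are not derived from any gene network or Chemical Master Equation; the paper's construction of the model and its use of transition path theory are not reproduced here.

```python
def stationary_distribution(P, n_iter=10_000):
    """Stationary distribution of a row-stochastic matrix by power iteration."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(n_iter):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


def mean_lifetimes(P):
    """Mean dwell time (in steps) of each state: 1 / (1 - P[i][i]).

    The dwell time in state i is geometric with exit probability 1 - P[i][i].
    """
    return [1.0 / (1.0 - P[i][i]) for i in range(len(P))]


# Hypothetical two-state model: state 0 = "gene low", state 1 = "gene high".
P = [[0.99, 0.01],
     [0.02, 0.98]]

pi = stationary_distribution(P)
lifetimes = mean_lifetimes(P)
print(pi)         # approximately [2/3, 1/3]
print(lifetimes)  # approximately [100, 50] steps
```

    Here the longer-lived state also carries more stationary probability, which is the kind of summary (number, occupancy, and lifetime of long-lived states) the abstract attributes to the method.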

  18. Unique attributes of cyanobacterial metabolism revealed by improved genome-scale metabolic modeling and essential gene analysis

    DOE PAGES

    Broddrick, Jared T.; Rubin, Benjamin E.; Welkie, David G.; ...

    2016-12-20

    The model cyanobacterium, Synechococcus elongatus PCC 7942, is a genetically tractable obligate phototroph that is being developed for the bioproduction of high-value chemicals. Genome-scale models (GEMs) have been successfully used to assess and engineer cellular metabolism; however, GEMs of phototrophic metabolism have been limited by the lack of experimental datasets for model validation and the challenges of incorporating photon uptake. In this paper, we develop a GEM of metabolism in S. elongatus using random barcode transposon site sequencing (RB-TnSeq) essential gene and physiological data specific to photoautotrophic metabolism. The model explicitly describes photon absorption and accounts for shading, resulting in the characteristic linear growth curve of photoautotrophs. GEM predictions of gene essentiality were compared with data obtained from recent dense-transposon mutagenesis experiments. This dataset allowed major improvements to the accuracy of the model. Furthermore, discrepancies between GEM predictions and the in vivo dataset revealed biological characteristics, such as the importance of a truncated, linear TCA pathway, low flux toward amino acid synthesis from photorespiration, and knowledge gaps within nucleotide metabolism. Finally, coupling of strong experimental support and photoautotrophic modeling methods thus resulted in a highly accurate model of S. elongatus metabolism that highlights previously unknown areas of S. elongatus biology.

  19. Unique attributes of cyanobacterial metabolism revealed by improved genome-scale metabolic modeling and essential gene analysis

    PubMed Central

    Broddrick, Jared T.; Rubin, Benjamin E.; Welkie, David G.; Du, Niu; Mih, Nathan; Diamond, Spencer; Lee, Jenny J.; Golden, Susan S.; Palsson, Bernhard O.

    2016-01-01

    The model cyanobacterium, Synechococcus elongatus PCC 7942, is a genetically tractable obligate phototroph that is being developed for the bioproduction of high-value chemicals. Genome-scale models (GEMs) have been successfully used to assess and engineer cellular metabolism; however, GEMs of phototrophic metabolism have been limited by the lack of experimental datasets for model validation and the challenges of incorporating photon uptake. Here, we develop a GEM of metabolism in S. elongatus using random barcode transposon site sequencing (RB-TnSeq) essential gene and physiological data specific to photoautotrophic metabolism. The model explicitly describes photon absorption and accounts for shading, resulting in the characteristic linear growth curve of photoautotrophs. GEM predictions of gene essentiality were compared with data obtained from recent dense-transposon mutagenesis experiments. This dataset allowed major improvements to the accuracy of the model. Furthermore, discrepancies between GEM predictions and the in vivo dataset revealed biological characteristics, such as the importance of a truncated, linear TCA pathway, low flux toward amino acid synthesis from photorespiration, and knowledge gaps within nucleotide metabolism. Coupling of strong experimental support and photoautotrophic modeling methods thus resulted in a highly accurate model of S. elongatus metabolism that highlights previously unknown areas of S. elongatus biology. PMID:27911809

  20. Unique attributes of cyanobacterial metabolism revealed by improved genome-scale metabolic modeling and essential gene analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Broddrick, Jared T.; Rubin, Benjamin E.; Welkie, David G.

    The model cyanobacterium, Synechococcus elongatus PCC 7942, is a genetically tractable obligate phototroph that is being developed for the bioproduction of high-value chemicals. Genome-scale models (GEMs) have been successfully used to assess and engineer cellular metabolism; however, GEMs of phototrophic metabolism have been limited by the lack of experimental datasets for model validation and the challenges of incorporating photon uptake. In this paper, we develop a GEM of metabolism in S. elongatus using random barcode transposon site sequencing (RB-TnSeq) essential gene and physiological data specific to photoautotrophic metabolism. The model explicitly describes photon absorption and accounts for shading, resulting in the characteristic linear growth curve of photoautotrophs. GEM predictions of gene essentiality were compared with data obtained from recent dense-transposon mutagenesis experiments. This dataset allowed major improvements to the accuracy of the model. Furthermore, discrepancies between GEM predictions and the in vivo dataset revealed biological characteristics, such as the importance of a truncated, linear TCA pathway, low flux toward amino acid synthesis from photorespiration, and knowledge gaps within nucleotide metabolism. Finally, coupling of strong experimental support and photoautotrophic modeling methods thus resulted in a highly accurate model of S. elongatus metabolism that highlights previously unknown areas of S. elongatus biology.

  1. An Accurate and Computationally Efficient Model for Membrane-Type Circular-Symmetric Micro-Hotplates

    PubMed Central

    Khan, Usman; Falconi, Christian

    2014-01-01

    Ideally, the design of high-performance micro-hotplates would require a large number of simulations because of the existence of many important design parameters as well as the possibly crucial effects of both spread and drift. However, the computational cost of FEM simulations, which are the only available tool for accurately predicting the temperature in micro-hotplates, is very high. As a result, micro-hotplate designers generally have no effective simulation tools for optimization. In order to circumvent these issues, we propose a model for practical circular-symmetric micro-hotplates which takes advantage of modified Bessel functions, a computationally efficient matrix approach for considering the relevant boundary conditions, Taylor linearization for modeling the Joule heating and radiation losses, and an external-region-segmentation strategy in order to accurately take into account radiation losses in the entire micro-hotplate. The proposed model is almost as accurate as FEM simulations and two to three orders of magnitude more computationally efficient (e.g., 45 s versus more than 8 h). The residual errors, which are mainly associated with the undesired heating in the electrical contacts, are small (e.g., a few degrees Celsius for an 800 °C operating temperature) and, for important analyses, almost constant. Therefore, we also introduce a computationally easy single-FEM-compensation strategy in order to reduce the residual errors to about 1 °C. As illustrative examples of the power of our approach, we report the systematic investigation of a spread in the membrane thermal conductivity and of combined variations of both ambient and bulk temperatures. Our model enables a much faster characterization of micro-hotplates and, thus, a much more effective optimization prior to fabrication. PMID:24763214

  2. Machine-learning approach identifies a pattern of gene expression in peripheral blood that can accurately detect ischaemic stroke

    PubMed Central

    O’Connell, Grant C; Petrone, Ashley B; Treadway, Madison B; Tennant, Connie S; Lucke-Wold, Noelle; Chantler, Paul D; Barr, Taura L

    2016-01-01

    Early and accurate diagnosis of stroke improves the probability of positive outcome. The objective of this study was to identify a pattern of gene expression in peripheral blood that could potentially be optimised to expedite the diagnosis of acute ischaemic stroke (AIS). A discovery cohort was recruited consisting of 39 AIS patients and 24 neurologically asymptomatic controls. Peripheral blood was sampled at emergency department admission, and genome-wide expression profiling was performed via microarray. A machine-learning technique known as genetic algorithm k-nearest neighbours (GA/kNN) was then used to identify a pattern of gene expression that could optimally discriminate between groups. This pattern of expression was then assessed via qRT-PCR in an independent validation cohort, where it was evaluated for its ability to discriminate between an additional 39 AIS patients and 30 neurologically asymptomatic controls, as well as 20 acute stroke mimics. GA/kNN identified 10 genes (ANTXR2, STK3, PDK4, CD163, MAL, GRAP, ID3, CTSZ, KIF1B and PLXDC2) whose coordinate pattern of expression was able to identify 98.4% of discovery cohort subjects correctly (97.4% sensitive, 100% specific). In the validation cohort, the expression levels of the same 10 genes were able to identify 95.6% of subjects correctly when comparing AIS patients to asymptomatic controls (92.3% sensitive, 100% specific), and 94.9% of subjects correctly when comparing AIS patients with stroke mimics (97.4% sensitive, 90.0% specific). The transcriptional pattern identified in this study shows strong diagnostic potential, and warrants further evaluation to determine its true clinical efficacy. PMID:29263821
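
    The reported percentages can be checked with a short calculation. The confusion-matrix counts used below are not stated in the abstract; they are inferred here on the assumption that 97.4% sensitivity in 39 AIS patients means 38 of 39 were detected and that 100% specificity in 24 controls means all 24 were classified correctly. The function name is likewise only illustrative.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # fraction of true cases detected
    specificity = tn / (tn + fp)  # fraction of non-cases correctly ruled out
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Discovery cohort, with counts inferred from the reported percentages (assumption).
sens, spec, acc = diagnostic_metrics(tp=38, fn=1, tn=24, fp=0)
print(f"{sens:.1%} {spec:.1%} {acc:.1%}")  # 97.4% 100.0% 98.4%
```

    Under these inferred counts, 62 of 63 subjects are classified correctly, which matches the 98.4% figure quoted for the discovery cohort.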

  3. Accurate modeling of defects in graphene transport calculations

    NASA Astrophysics Data System (ADS)

    Linhart, Lukas; Burgdörfer, Joachim; Libisch, Florian

    2018-01-01

    We present an approach for embedding defect structures modeled by density functional theory into large-scale tight-binding simulations. We extract local tight-binding parameters for the vicinity of the defect site using Wannier functions. In the transition region between the bulk lattice and the defect the tight-binding parameters are continuously adjusted to approach the bulk limit far away from the defect. This embedding approach allows for an accurate high-level treatment of the defect orbitals using as many as ten nearest neighbors while keeping a small number of nearest neighbors in the bulk to render the overall computational cost reasonable. As an example of our approach, we consider an extended graphene lattice decorated with Stone-Wales defects, flower defects, double vacancies, or silicon substitutes. We predict distinct scattering patterns mirroring the defect symmetries and magnitude that should be experimentally accessible.

  4. Adaptation of video game UVW mapping to 3D visualization of gene expression patterns

    NASA Astrophysics Data System (ADS)

    Vize, Peter D.; Gerth, Victor E.

    2007-01-01

    Analysis of gene expression patterns within an organism plays a critical role in associating genes with biological processes in both health and disease. During embryonic development the analysis and comparison of different gene expression patterns allows biologists to identify candidate genes that may regulate the formation of normal tissues and organs and to search for genes associated with congenital diseases. No two individual embryos, or organs, are exactly the same shape or size, so comparing spatial gene expression in one embryo to that in another is difficult. We will present our efforts in comparing gene expression data collected using both volumetric and projection approaches. Volumetric data is highly accurate but difficult to process and compare. Projection methods use UV mapping to align texture maps to standardized spatial frameworks. This approach is less accurate but is very rapid and requires very little processing. We have built a database of over 180 3D models depicting gene expression patterns mapped onto the surface of spline-based embryo models. Gene expression data in different models can easily be compared to determine common regions of activity. Visualization software, in both Java and OpenGL, optimized for viewing 3D gene expression data will also be demonstrated.

  5. Accurate Modelling of Surface Currents and Internal Tides in a Semi-enclosed Coastal Sea

    NASA Astrophysics Data System (ADS)

    Allen, S. E.; Soontiens, N. K.; Dunn, M. B. H.; Liu, J.; Olson, E.; Halverson, M. J.; Pawlowicz, R.

    2016-02-01

    The Strait of Georgia is a deep (400 m), strongly stratified, semi-enclosed coastal sea on the west coast of North America. We have configured a baroclinic model of the Strait of Georgia and surrounding coastal waters using the NEMO ocean community model. We run daily nowcasts and forecasts and publish our sea-surface results (including storm surge warnings) to the web (salishsea.eos.ubc.ca/storm-surge). Tides in the Strait of Georgia are mixed and large. The baroclinic model and previous barotropic models accurately represent tidal sea-level variations and depth mean currents. The baroclinic model accurately reproduces the diurnal but not the semi-diurnal baroclinic tidal currents. In the Southern Strait of Georgia, strong internal tidal currents at the semi-diurnal frequency are observed. Strong semi-diurnal tides are also produced in the model, but are almost 180 degrees out of phase with the observations. In the model, at the surface, the barotropic and baroclinic tides reinforce each other, whereas the observations show that at the surface the baroclinic tides oppose the barotropic tides. As such, the surface currents are very poorly modelled. Here we will present evidence of the internal tidal field from observations. We will discuss the generation regions of the tides, the necessary modifications to the model required to correct the phase, the resulting baroclinic tides and the improvements in the surface currents.

  6. Rapid and accurate identification of Mycobacterium tuberculosis complex and common non-tuberculous mycobacteria by multiplex real-time PCR targeting different housekeeping genes.

    PubMed

    Nasr Esfahani, Bahram; Rezaei Yazdi, Hadi; Moghim, Sharareh; Ghasemian Safaei, Hajieh; Zarkesh Esfahani, Hamid

    2012-11-01

    Rapid and accurate identification of mycobacterial isolates from primary culture is important for timely and appropriate antibiotic therapy. Conventional methods for identifying Mycobacterium species based on biochemical tests need several weeks and may remain inconclusive. In this study, a novel multiplex real-time PCR was developed for rapid identification of the Mycobacterium genus, Mycobacterium tuberculosis complex (MTC), and the most common non-tuberculous mycobacteria species, including M. abscessus, M. fortuitum, M. avium complex, M. kansasii, and M. gordonae, in three reaction tubes under the same PCR conditions. Genetic targets for primer design included the 16S rDNA gene, the dnaJ gene, the gyrB gene, and the internal transcribed spacer (ITS). Multiplex real-time PCR was set up with reference Mycobacterium strains and subsequently tested with 66 clinical isolates. Results of multiplex real-time PCR were analyzed with melting curves, and the melting temperatures (Tm) of the Mycobacterium genus, MTC, and each non-tuberculous Mycobacterium species were determined. Multiplex real-time PCR results were compared with amplification and sequencing of the 16S-23S rDNA ITS for identification of Mycobacterium species. Sensitivity and specificity of the designed primers were each 100% for MTC, M. abscessus, M. fortuitum, M. avium complex, M. kansasii, and M. gordonae. Sensitivity and specificity of the designed primers for the genus Mycobacterium were 96% and 100%, respectively. According to these results, we conclude that this multiplex real-time PCR with melting curve analysis and these novel primers can be used for rapid and accurate identification of the genus Mycobacterium, MTC, and the most common non-tuberculous Mycobacterium species.
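
    The sensitivity and specificity figures quoted above follow the usual definitions based on true/false positive and negative counts. A minimal sketch of that arithmetic is given below; the counts are hypothetical and chosen only for illustration, since the record does not report the underlying confusion matrix.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of truly positive samples that the assay calls positive."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of truly negative samples that the assay calls negative."""
    return tn / (tn + fp)


# Hypothetical counts (NOT taken from the study), for illustration only.
tp, fn, tn, fp = 24, 1, 10, 0

print(f"sensitivity = {sensitivity(tp, fn):.0%}")
print(f"specificity = {specificity(tn, fp):.0%}")
```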

  7. A Model-Based Joint Identification of Differentially Expressed Genes and Phenotype-Associated Genes

    PubMed Central

    Seo, Minseok; Shin, Su-kyung; Kwon, Eun-Young; Kim, Sung-Eun; Bae, Yun-Jung; Lee, Seungyeoun; Sung, Mi-Kyung; Choi, Myung-Sook; Park, Taesung

    2016-01-01

    Over the last decade, many analytical methods and tools have been developed for microarray data. The detection of differentially expressed genes (DEGs) among different treatment groups is often a primary purpose of microarray data analysis. In addition, association studies investigating the relationship between genes and a phenotype of interest such as survival time are also popular in microarray data analysis. Phenotype association analysis provides a list of phenotype-associated genes (PAGs). However, it is sometimes necessary to identify genes that are both DEGs and PAGs. We consider the joint identification of DEGs and PAGs in microarray data analyses. The first approach we used was a naïve approach that detects DEGs and PAGs separately and then identifies the genes in an intersection of the list of PAGs and DEGs. The second approach we considered was a hierarchical approach that detects DEGs first and then chooses PAGs from among the DEGs or vice versa. In this study, we propose a new model-based approach for the joint identification of DEGs and PAGs. Unlike the previous two-step approaches, the proposed method identifies genes simultaneously that are DEGs and PAGs. This method uses standard regression models but adopts different null hypothesis from ordinary regression models, which allows us to perform joint identification in one-step. The proposed model-based methods were evaluated using experimental data and simulation studies. The proposed methods were used to analyze a microarray experiment in which the main interest lies in detecting genes that are both DEGs and PAGs, where DEGs are identified between two diet groups and PAGs are associated with four phenotypes reflecting the expression of leptin, adiponectin, insulin-like growth factor 1, and insulin. Model-based approaches provided a larger number of genes, which are both DEGs and PAGs, than other methods. Simulation studies showed that they have more power than other methods. Through analysis of

  8. Accurate SHAPE-directed RNA secondary structure modeling, including pseudoknots.

    PubMed

    Hajdin, Christine E; Bellaousov, Stanislav; Huggins, Wayne; Leonard, Christopher W; Mathews, David H; Weeks, Kevin M

    2013-04-02

    A pseudoknot forms in an RNA when nucleotides in a loop pair with a region outside the helices that close the loop. Pseudoknots occur relatively rarely in RNA but are highly overrepresented in functionally critical motifs in large catalytic RNAs, in riboswitches, and in regulatory elements of viruses. Pseudoknots are usually excluded from RNA structure prediction algorithms. When included, these pairings are difficult to model accurately, especially in large RNAs, because allowing this structure dramatically increases the number of possible incorrect folds and because it is difficult to search the fold space for an optimal structure. We have developed a concise secondary structure modeling approach that combines SHAPE (selective 2'-hydroxyl acylation analyzed by primer extension) experimental chemical probing information and a simple, but robust, energy model for the entropic cost of single pseudoknot formation. Structures are predicted with iterative refinement, using a dynamic programming algorithm. This melded experimental and thermodynamic energy function predicted the secondary structures and the pseudoknots for a set of 21 challenging RNAs of known structure ranging in size from 34 to 530 nt. On average, 93% of known base pairs were predicted, and all pseudoknots in well-folded RNAs were identified.

  9. A novel essential domain perspective for exploring gene essentiality.

    PubMed

    Lu, Yao; Lu, Yulan; Deng, Jingyuan; Peng, Hai; Lu, Hui; Lu, Long Jason

    2015-09-15

    Genes with indispensable functions are identified as essential; however, the traditional gene-level studies of essentiality have several limitations. In this study, we characterized gene essentiality from a new perspective of protein domains, the independent structural or functional units of a polypeptide chain. To identify such essential domains, we have developed an Expectation-Maximization (EM) algorithm-based Essential Domain Prediction (EDP) Model. With simulated datasets, the model provided convergent results given different initial values and offered accurate predictions even with noise. We then applied the EDP model to six microbial species and predicted 1879 domains to be essential in at least one species, ranging 10-23% in each species. The predicted essential domains were more conserved than either non-essential domains or essential genes. Comparing essential domains in prokaryotes and eukaryotes revealed an evolutionary distance consistent with that inferred from ribosomal RNA. When utilizing these essential domains to reproduce the annotation of essential genes, we received accurate results that suggest protein domains are more basic units for the essentiality of genes. Furthermore, we presented several examples to illustrate how the combination of essential and non-essential domains can lead to genes with divergent essentiality. In summary, we have described the first systematic analysis on gene essentiality on the level of domains. Contact: huilu.bioinfo@gmail.com or Long.Lu@cchmc.org. Supplementary data are available at Bioinformatics online.

  10. Modeling T-cell activation using gene expression profiling and state-space models.

    PubMed

    Rangel, Claudia; Angus, John; Ghahramani, Zoubin; Lioumi, Maria; Sotheran, Elizabeth; Gaiba, Alessia; Wild, David L; Falciani, Francesco

    2004-06-12

    We have used state-space models to reverse engineer transcriptional networks from highly replicated gene expression profiling time series data obtained from a well-established model of T-cell activation. State space models are a class of dynamic Bayesian networks that assume that the observed measurements depend on some hidden state variables that evolve according to Markovian dynamics. These hidden variables can capture effects that cannot be measured in a gene expression profiling experiment, e.g. genes that have not been included in the microarray, levels of regulatory proteins, the effects of messenger RNA and protein degradation, etc. Bootstrap confidence intervals are developed for parameters representing 'gene-gene' interactions over time. Our models represent the dynamics of T-cell activation and provide a methodology for the development of rational and experimentally testable hypotheses. Supplementary data and Matlab computer source code will be made available on the web at the URL given below. http://public.kgi.edu/~wild/LDS/index.htm
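
    To make the idea of hidden state variables with Markovian dynamics concrete, here is a minimal sketch of a scalar linear-Gaussian state-space model. It is a generic textbook form, not the authors' model or their Matlab code; the function name `simulate` and all parameter names are assumptions introduced for illustration.

```python
import random


def simulate(a, c, q_sd, r_sd, x0, steps, seed=0):
    """Simulate a scalar linear-Gaussian state-space model.

    Hidden state:  x[t+1] = a * x[t] + w,  w ~ N(0, q_sd^2)
    Observation:   y[t]   = c * x[t] + v,  v ~ N(0, r_sd^2)
    """
    rng = random.Random(seed)
    x = x0
    states, observations = [], []
    for _ in range(steps):
        states.append(x)
        # The measurement sees the hidden state only through c and noise.
        observations.append(c * x + rng.gauss(0.0, r_sd))
        # Markovian dynamics: the next state depends only on the current one.
        x = a * x + rng.gauss(0.0, q_sd)
    return states, observations


hidden, measured = simulate(a=0.9, c=2.0, q_sd=0.1, r_sd=0.5, x0=1.0, steps=5)
print(measured)
```

    In a reverse-engineering setting the parameters would be estimated from the observed series rather than fixed in advance; this sketch covers only the generative side.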

  11. Accurate coarse-grained models for mixtures of colloids and linear polymers under good-solvent conditions

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    D’Adamo, Giuseppe, E-mail: giuseppe.dadamo@sissa.it; Pelissetto, Andrea, E-mail: andrea.pelissetto@roma1.infn.it; Pierleoni, Carlo, E-mail: carlo.pierleoni@aquila.infn.it

    2014-12-28

    A coarse-graining strategy, previously developed for polymer solutions, is extended here to mixtures of linear polymers and hard-sphere colloids. In this approach, groups of monomers are mapped onto a single pseudoatom (a blob) and the effective blob-blob interactions are obtained by requiring the model to reproduce some large-scale structural properties in the zero-density limit. We show that an accurate parametrization of the polymer-colloid interactions is obtained by simply introducing pair potentials between blobs and colloids. For the coarse-grained (CG) model in which polymers are modelled as four-blob chains (tetramers), the pair potentials are determined by means of the iterative Boltzmann inversion scheme, taking full-monomer (FM) pair correlation functions at zero-density as targets. For a larger number n of blobs, pair potentials are determined by using a simple transferability assumption based on the polymer self-similarity. We validate the model by comparing its predictions with full-monomer results for the interfacial properties of polymer solutions in the presence of a single colloid and for thermodynamic and structural properties in the homogeneous phase at finite polymer and colloid density. The tetramer model is quite accurate for q ≲ 1 ($q = \hat{R}_g / R_c$, where $\hat{R}_g$ is the zero-density polymer radius of gyration and $R_c$ is the colloid radius) and reasonably good also for q = 2. For q = 2, an accurate coarse-grained description is obtained by using the n = 10 blob model. We also compare our results with those obtained by using single-blob models with state-dependent potentials.
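
    For orientation, the iterative Boltzmann inversion scheme named above is commonly written as the following update rule. This is the standard textbook form, stated here as an assumption; the record does not give the authors' exact formulation. Here $V_k(r)$ is the pair potential at iteration $k$, $g_k(r)$ is the pair correlation function it produces in the coarse-grained model, and $g_{\mathrm{FM}}(r)$ is the full-monomer target.

```latex
V_{k+1}(r) = V_k(r) + k_B T \,\ln\!\frac{g_k(r)}{g_{\mathrm{FM}}(r)},
\qquad
V_0(r) = -k_B T \,\ln g_{\mathrm{FM}}(r)
```

    The iteration stops when $g_k(r)$ matches $g_{\mathrm{FM}}(r)$ to within the desired tolerance, at which point the correction term vanishes.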

  12. A Mouse Model for the Metabolic Effects of the Human Fat Mass and Obesity Associated FTO Gene

    PubMed Central

    Church, Chris; Deacon, Robert; Gerken, Thomas; Lee, Angela; Moir, Lee; Mecinović, Jasmin; Quwailid, Mohamed M.; Schofield, Christopher J.; Ashcroft, Frances M.; Cox, Roger D.

    2009-01-01

    Human FTO gene variants are associated with body mass index and type 2 diabetes. Because the obesity-associated SNPs are intronic, it is unclear whether changes in FTO expression or splicing are the cause of obesity or if regulatory elements within intron 1 influence upstream or downstream genes. We tested the idea that FTO itself is involved in obesity. We show that a dominant point mutation in the mouse Fto gene results in reduced fat mass, increased energy expenditure, and unchanged physical activity. Exposure to a high-fat diet enhances lean mass and lowers fat mass relative to control mice. Biochemical studies suggest the mutation occurs in a structurally novel domain and modifies FTO function, possibly by altering its dimerisation state. Gene expression profiling revealed increased expression of some fat and carbohydrate metabolism genes and an improved inflammatory profile in white adipose tissue of mutant mice. These data provide direct functional evidence that FTO is a causal gene underlying obesity. Compared to the reported mouse FTO knockout, our model more accurately reflects the effect of human FTO variants; we observe a heterozygous as well as homozygous phenotype, a smaller difference in weight and adiposity, and our mice do not show perinatal lethality or an age-related reduction in size and length. Our model suggests that a search for human coding mutations in FTO may be informative and that inhibition of FTO activity is a possible target for the treatment of morbid obesity. PMID:19680540

  13. Synchronous versus asynchronous modeling of gene regulatory networks.

    PubMed

    Garg, Abhishek; Di Cara, Alessandro; Xenarios, Ioannis; Mendoza, Luis; De Micheli, Giovanni

    2008-09-01

    In silico modeling of gene regulatory networks has gained some momentum recently due to increased interest in analyzing the dynamics of biological systems. This has been further facilitated by the increasing availability of experimental data on gene-gene, protein-protein and gene-protein interactions. The two dynamical properties that are often experimentally testable are perturbations and stable steady states. Although a lot of work has been done on the identification of steady states, not much work has been reported on in silico modeling of cellular differentiation processes. In this manuscript, we provide algorithms based on reduced ordered binary decision diagrams (ROBDDs) for Boolean modeling of gene regulatory networks. Algorithms for synchronous and asynchronous transition models have been proposed and their corresponding computational properties have been analyzed. These algorithms allow users to compute cyclic attractors of large networks that are currently not feasible using existing software. Hereby we provide a framework to analyze the effect of multiple gene perturbation protocols, and their effect on cell differentiation processes. These algorithms were validated on the T-helper model showing the correct steady state identification and Th1-Th2 cellular differentiation process. The software binaries for Windows and Linux platforms can be downloaded from http://si2.epfl.ch/~garg/genysis.html.
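
    The difference between synchronous and asynchronous Boolean updating can be illustrated with a brute-force sketch on a tiny network. It enumerates states explicitly rather than using ROBDDs, so it is not the authors' algorithm and would not scale to large networks. The two-gene mutual-repression network and all function names are hypothetical.

```python
from itertools import product


def step_sync(state, rules):
    """Synchronous update: every gene is updated at once."""
    return tuple(int(rule(state)) for rule in rules)


def successors_async(state, rules):
    """Asynchronous update: one gene changes at a time; return all next states."""
    nxt = set()
    for i, rule in enumerate(rules):
        s = list(state)
        s[i] = int(rule(state))
        nxt.add(tuple(s))
    return nxt


def sync_attractors(rules):
    """Follow every start state until it revisits a state; collect the cycles."""
    attractors = set()
    for start in product((0, 1), repeat=len(rules)):
        seen, s = [], start
        while s not in seen:
            seen.append(s)
            s = step_sync(s, rules)
        attractors.add(frozenset(seen[seen.index(s):]))
    return attractors


def async_steady_states(rules):
    """States that no single-gene update can leave."""
    return {s for s in product((0, 1), repeat=len(rules))
            if successors_async(s, rules) == {s}}


# Hypothetical toggle switch: gene A represses gene B and vice versa.
rules = [lambda s: not s[1], lambda s: not s[0]]

print(sync_attractors(rules))
print(async_steady_states(rules))
```

    In this toy network both schemes agree on the two steady states, while the synchronous scheme additionally reports a two-state cycle that is not stable under asynchronous updating. Finding cyclic attractors under asynchronous updating needs a reachability analysis that this sketch omits.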

  14. Development of an Anatomically Accurate Finite Element Human Ocular Globe Model for Blast-Related Fluid-Structure Interaction Studies

    DTIC Science & Technology

    2017-02-01

    ARL-TR-7945 ● FEB 2017. US Army Research Laboratory. Development of an Anatomically Accurate Finite Element Human Ocular Globe Model for Blast-Related Fluid-Structure Interaction Studies.

  15. Accurate, efficient, and (iso)geometrically flexible collocation methods for phase-field models

    NASA Astrophysics Data System (ADS)

    Gomez, Hector; Reali, Alessandro; Sangalli, Giancarlo

    2014-04-01

    We propose new collocation methods for phase-field models. Our algorithms are based on isogeometric analysis, a new technology that makes use of functions from computational geometry, such as, for example, Non-Uniform Rational B-Splines (NURBS). NURBS exhibit excellent approximability and controllable global smoothness, and can represent exactly most geometries encapsulated in Computer Aided Design (CAD) models. These attributes permitted us to derive accurate, efficient, and geometrically flexible collocation methods for phase-field models. The performance of our method is demonstrated by several numerical examples of phase separation modeled by the Cahn-Hilliard equation. We feel that our method successfully combines the geometrical flexibility of finite elements with the accuracy and simplicity of pseudo-spectral collocation methods, and is a viable alternative to classical collocation methods.

  16. Gene expression patterns in rainbow trout, Oncorhynchus mykiss, exposed to a suite of model toxicants

    PubMed Central

    Hook, Sharon E.; Skillman, Ann D.; Small, Jack A.; Schultz, Irvin R.

    2008-01-01

    steroidogenesis, p450 and estrogen responsive genes appear to be useful for selectively identifying toxicant mode of action in fish, suggesting a link between gene expression profile and mode of toxicity. Our array results showed good agreement with quantitative real time polymerase chain reaction (qRT PCR), which demonstrates that the arrays are an accurate measure of gene expression. The specificity of the gene expression profile in response to a model toxicant, the link between genes with altered expression and mode of toxic action, and the consistency between array and qRT PCR results all suggest that cDNA microarrays have the potential to screen environmental contaminants for biomarkers and mode of toxic action. PMID:16488489

  17. Gene expression patterns in rainbow trout, Oncorhynchus mykiss, exposed to a suite of model toxicants.

    PubMed

    Hook, Sharon E; Skillman, Ann D; Small, Jack A; Schultz, Irvin R

    2006-05-25

    , p450 and estrogen responsive genes appear to be useful for selectively identifying toxicant mode of action in fish, suggesting a link between gene expression profile and mode of toxicity. Our array results showed good agreement with quantitative real time polymerase chain reaction (qRT PCR), which demonstrates that the arrays are an accurate measure of gene expression. The specificity of the gene expression profile in response to a model toxicant, the link between genes with altered expression and mode of toxic action, and the consistency between array and qRT PCR results all suggest that cDNA microarrays have the potential to screen environmental contaminants for biomarkers and mode of toxic action.

  18. GeneTopics - interpretation of gene sets via literature-driven topic models

    PubMed Central

    2013-01-01

    Background Annotation of a set of genes is often accomplished through comparison to a library of labelled gene sets such as biological processes or canonical pathways. However, this approach might fail if the employed libraries are not up to date with the latest research, don't capture relevant biological themes or are curated at a different level of granularity than is required to appropriately analyze the input gene set. At the same time, the vast biomedical literature offers an unstructured repository of the latest research findings that can be tapped to provide thematic sub-groupings for any input gene set. Methods Our proposed method relies on a gene-specific text corpus and extracts commonalities between documents in an unsupervised manner using a topic model approach. We automatically determine the number of topics summarizing the corpus and calculate a gene relevancy score for each topic allowing us to eliminate non-specific topics. As a result we obtain a set of literature topics in which each topic is associated with a subset of the input genes providing directly interpretable keywords and corresponding documents for literature research. Results We validate our method based on labelled gene sets from the KEGG metabolic pathway collection and the genetic association database (GAD) and show that the approach is able to detect topics consistent with the labelled annotation. Furthermore, we discuss the results on three different types of experimentally derived gene sets, (1) differentially expressed genes from a cardiac hypertrophy experiment in mice, (2) altered transcript abundance in human pancreatic beta cells, and (3) genes implicated by GWA studies to be associated with metabolite levels in a healthy population. In all three cases, we are able to replicate findings from the original papers in a quick and semi-automated manner. Conclusions Our approach provides a novel way of automatically generating meaningful annotations for gene sets that are directly

  19. Modeling stochasticity and robustness in gene regulatory networks.

    PubMed

    Garg, Abhishek; Mohanram, Kartik; Di Cara, Alessandro; De Micheli, Giovanni; Xenarios, Ioannis

    2009-06-15

    Understanding gene regulation in biological processes and modeling the robustness of underlying regulatory networks is an important problem that is currently being addressed by computational systems biologists. Lately, there has been a renewed interest in Boolean modeling techniques for gene regulatory networks (GRNs). However, due to their deterministic nature, it is often difficult to identify whether these modeling approaches are robust to the addition of stochastic noise that is widespread in gene regulatory processes. Stochasticity in Boolean models of GRNs has been addressed relatively sparingly in the past, mainly by flipping the expression of genes between different expression levels with a predefined probability. This stochasticity in nodes (SIN) model leads to over representation of noise in GRNs and hence non-correspondence with biological observations. In this article, we introduce the stochasticity in functions (SIF) model for simulating stochasticity in Boolean models of GRNs. By providing biological motivation behind the use of the SIF model and applying it to the T-helper and T-cell activation networks, we show that the SIF model provides more biologically robust results than the existing SIN model of stochasticity in GRNs. Algorithms are made available under our Boolean modeling toolbox, GenYsis. The software binaries can be downloaded from http://si2.epfl.ch/~garg/genysis.html.

  20. BEYOND ELLIPSE(S): ACCURATELY MODELING THE ISOPHOTAL STRUCTURE OF GALAXIES WITH ISOFIT AND CMODEL

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ciambur, B. C., E-mail: bciambur@swin.edu.au

    2015-09-10

    This work introduces a new fitting formalism for isophotes that enables more accurate modeling of galaxies with non-elliptical shapes, such as disk galaxies viewed edge-on or galaxies with X-shaped/peanut bulges. Within this scheme, the angular parameter that defines quasi-elliptical isophotes is transformed from the commonly used, but inappropriate, polar coordinate to the “eccentric anomaly.” This provides a superior description of deviations from ellipticity, better capturing the true isophotal shape. Furthermore, this makes it possible to accurately recover both the surface brightness profile, using the correct azimuthally averaged isophote, and the two-dimensional model of any galaxy: the hitherto ubiquitous, but artificial, cross-like features in residual images are completely removed. The formalism has been implemented into the Image Reduction and Analysis Facility tasks Ellipse and Bmodel to create the new tasks “Isofit,” and “Cmodel.” The new tools are demonstrated here with application to five galaxies, chosen to be representative case-studies for several areas where this technique makes it possible to gain new scientific insight. Specifically: properly quantifying boxy/disky isophotes via the fourth harmonic order in edge-on galaxies, quantifying X-shaped/peanut bulges, higher-order Fourier moments for modeling bars in disks, and complex isophote shapes. Higher order (n > 4) harmonics now become meaningful and may correlate with structural properties, as boxyness/diskyness is known to do. This work also illustrates how the accurate construction, and subtraction, of a model from a galaxy image facilitates the identification and recovery of over-lapping sources such as globular clusters and the optical counterparts of X-ray sources.
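
    The change of angular variable described above can be sketched with the elementary geometry of an ellipse. A point at eccentric anomaly ψ lies at (a cos ψ, b sin ψ), so its polar angle φ satisfies tan φ = (b/a) tan ψ. The code below uses only that relation; it is not the Isofit implementation, and sign or orientation conventions in the actual task may differ. The function names are assumptions.

```python
import math


def eccentric_anomaly(phi: float, ellipticity: float) -> float:
    """Convert polar angle phi (measured from the major axis) to eccentric anomaly.

    ellipticity is e = 1 - b/a, so the axis ratio is b/a = 1 - e.
    """
    axis_ratio = 1.0 - ellipticity
    # atan2 keeps the result in the same quadrant as phi.
    return math.atan2(math.sin(phi), axis_ratio * math.cos(phi))


def polar_angle(psi: float, ellipticity: float) -> float:
    """Inverse mapping: eccentric anomaly psi back to polar angle."""
    axis_ratio = 1.0 - ellipticity
    return math.atan2(axis_ratio * math.sin(psi), math.cos(psi))


phi = math.radians(45.0)
psi = eccentric_anomaly(phi, ellipticity=0.5)
print(math.degrees(psi))
```

    For a circular isophote (e = 0) the two angles coincide. The more flattened the ellipse, the further they diverge away from the principal axes, which is where harmonic deviations such as boxy or disky terms are most sensitive to the choice of angle.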

  1. Specification, testing, and interpretation of gene-by-measured-environment interaction models in the presence of gene-environment correlation

    PubMed Central

    Rathouz, Paul J.; Van Hulle, Carol A.; Lee Rodgers, Joseph; Waldman, Irwin D.; Lahey, Benjamin B.

    2009-01-01

    Purcell (2002) proposed a bivariate biometric model for testing and quantifying the interaction between latent genetic influences and measured environments in the presence of gene-environment correlation. Purcell’s model extends the Cholesky model to include gene-environment interaction. We examine a number of closely-related alternative models that do not involve gene-environment interaction but which may fit the data as well as Purcell’s model. Because failure to consider these alternatives could lead to spurious detection of gene-environment interaction, we propose alternative models for testing gene-environment interaction in the presence of gene-environment correlation, including one based on the correlated factors model. In addition, we note mathematical errors in the calculation of effect size via variance components in Purcell’s model. We propose a statistical method for deriving and interpreting variance decompositions that are true to the fitted model. PMID:18293078

  2. Multiclass classification of microarray data samples with a reduced number of genes

    PubMed Central

    2011-01-01

    Background Multiclass classification of microarray data samples with a reduced number of genes is a rich and challenging problem in Bioinformatics research. The problem gets harder as the number of classes is increased. In addition, the performance of most classifiers is tightly linked to the effectiveness of mandatory gene selection methods. Critical to gene selection is the availability of estimates about the maximum number of genes that can be handled by any classification algorithm. Lack of such estimates may lead to either computationally demanding explorations of a search space with thousands of dimensions or classification models based on gene sets of unrestricted size. In the former case, unbiased but possibly overfitted classification models may arise. In the latter case, biased classification models unable to support statistically significant findings may be obtained. Results A novel bound on the maximum number of genes that can be handled by binary classifiers in binary mediated multiclass classification algorithms of microarray data samples is presented. The bound suggests that high-dimensional binary output domains might favor the existence of accurate and sparse binary mediated multiclass classifiers for microarray data samples. Conclusions A comprehensive experimental work shows that the bound is indeed useful to induce accurate and sparse multiclass classifiers for microarray data samples. PMID:21342522

  3. Production of Accurate Skeletal Models of Domestic Animals Using Three-Dimensional Scanning and Printing Technology

    ERIC Educational Resources Information Center

    Li, Fangzheng; Liu, Chunying; Song, Xuexiong; Huan, Yanjun; Gao, Shansong; Jiang, Zhongling

    2018-01-01

    Access to adequate anatomical specimens can be an important aspect in learning the anatomy of domestic animals. In this study, the authors utilized a structured light scanner and fused deposition modeling (FDM) printer to produce highly accurate animal skeletal models. First, various components of the bovine skeleton, including the femur, the…

  4. Suitable Reference Genes for Accurate Gene Expression Analysis in Parsley (Petroselinum crispum) for Abiotic Stresses and Hormone Stimuli

    PubMed Central

    Li, Meng-Yao; Song, Xiong; Wang, Feng; Xiong, Ai-Sheng

    2016-01-01

    Parsley, one of the most important vegetables in the Apiaceae family, is widely used in the food, medicinal, and cosmetic industries. Recent studies on parsley mainly focus on its chemical composition, and further research involving the analysis of the plant's gene functions and expressions is required. qPCR is a powerful method for detecting very low quantities of target transcript levels and is widely used to study gene expression. To ensure the accuracy of results, a suitable reference gene is necessary for expression normalization. In this study, four software tools, namely geNorm, NormFinder, BestKeeper, and RefFinder, were used to evaluate the expression stabilities of eight candidate reference genes of parsley (GAPDH, ACTIN, eIF-4α, SAND, UBC, TIP41, EF-1α, and TUB) under various conditions, including abiotic stresses (heat, cold, salt, and drought) and hormone stimuli treatments (GA, SA, MeJA, and ABA). Results showed that EF-1α and TUB were the most stable genes for abiotic stresses, whereas EF-1α, GAPDH, and TUB were the top three choices for hormone stimuli treatments. Moreover, EF-1α and TUB were the most stable reference genes among all tested samples, and UBC was the least stable one. Expression analysis of PcDREB1 and PcDREB2 further verified that the selected stable reference genes were suitable for gene expression normalization. This study can guide the selection of suitable reference genes in gene expression in parsley. PMID:27746803
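
    As an illustration of how reference-gene stability can be scored, the sketch below computes a geNorm-style stability measure M. For each gene it averages the standard deviation of log2 expression ratios against every other candidate, and a lower M means a more stable gene. This is a simplified reading of the published geNorm definition, not the geNorm software itself. The gene labels and expression values are invented for illustration and are not data from the study.

```python
import math
from statistics import mean, stdev


def genorm_m(expr):
    """Return a geNorm-style stability value M for each gene.

    expr maps gene name -> list of relative expression quantities,
    with samples in the same order for every gene.
    """
    m_values = {}
    for gene_j, values_j in expr.items():
        pairwise_sd = []
        for gene_k, values_k in expr.items():
            if gene_k == gene_j:
                continue
            log_ratios = [math.log2(x / y) for x, y in zip(values_j, values_k)]
            # Pairwise variation: spread of the log ratio across samples.
            pairwise_sd.append(stdev(log_ratios))
        m_values[gene_j] = mean(pairwise_sd)
    return m_values


# Invented values: geneA and geneB co-vary perfectly, geneC does not.
expression = {
    "geneA": [1.0, 2.0, 4.0, 8.0],
    "geneB": [2.0, 4.0, 8.0, 16.0],
    "geneC": [1.0, 1.0, 1.0, 1.0],
}

for gene, m in sorted(genorm_m(expression).items(), key=lambda kv: kv[1]):
    print(gene, round(m, 3))
```

    Because M rewards genes whose ratios to the other candidates stay constant, two co-regulated genes can look stable together even when both vary. That is one reason studies like this one combine several tools.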

  5. Suitable Reference Genes for Accurate Gene Expression Analysis in Parsley (Petroselinum crispum) for Abiotic Stresses and Hormone Stimuli.

    PubMed

    Li, Meng-Yao; Song, Xiong; Wang, Feng; Xiong, Ai-Sheng

    2016-01-01

    Parsley, one of the most important vegetables in the Apiaceae family, is widely used in the food, medicinal, and cosmetic industries. Recent studies on parsley mainly focus on its chemical composition, and further research involving the analysis of the plant's gene functions and expressions is required. qPCR is a powerful method for detecting very low quantities of target transcript levels and is widely used to study gene expression. To ensure the accuracy of results, a suitable reference gene is necessary for expression normalization. In this study, four software tools, namely geNorm, NormFinder, BestKeeper, and RefFinder, were used to evaluate the expression stabilities of eight candidate reference genes of parsley (GAPDH, ACTIN, eIF-4α, SAND, UBC, TIP41, EF-1α, and TUB) under various conditions, including abiotic stresses (heat, cold, salt, and drought) and hormone stimuli treatments (GA, SA, MeJA, and ABA). Results showed that EF-1α and TUB were the most stable genes for abiotic stresses, whereas EF-1α, GAPDH, and TUB were the top three choices for hormone stimuli treatments. Moreover, EF-1α and TUB were the most stable reference genes among all tested samples, and UBC was the least stable one. Expression analysis of PcDREB1 and PcDREB2 further verified that the selected stable reference genes were suitable for gene expression normalization. This study can guide the selection of suitable reference genes in gene expression in parsley.

  6. A model-updating procedure to simulate piezoelectric transducers accurately.

    PubMed

    Piranda, B; Ballandras, S; Steichen, W; Hecart, B

    2001-09-01

    The use of numerical calculations based on finite element methods (FEM) has yielded significant improvements in the simulation and design of piezoelectric transducers utilized in acoustic imaging. However, the ultimate precision of such models is directly controlled by the accuracy of material characterization. The present work is dedicated to the development of a model-updating technique adapted to the problem of piezoelectric transducers. The updating process is applied using the experimental admittance of a given structure for which a finite element analysis is performed. The mathematical developments are reported and then applied to update the entries of a FEM of a two-layer structure (a PbZrTi-PZT-ridge glued on a backing) for which measurements were available. The efficiency of the proposed approach is demonstrated, yielding the definition of a new set of constants well adapted to predict the structure response accurately. Improvement of the proposed approach, consisting of the updating of material coefficients not only on the admittance but also on the impedance data, is finally discussed.

  7. Accurate modeling of switched reluctance machine based on hybrid trained WNN

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Song, Shoujun; Ge, Lefei; Ma, Shaojie

    2014-04-15

    In view of the strongly nonlinear electromagnetic characteristics of the switched reluctance machine (SRM), a novel accurate modeling method is proposed based on a hybrid trained wavelet neural network (WNN), which combines an improved genetic algorithm (GA) with the gradient descent (GD) method to train the network. In the novel method, the WNN is trained by the GD method starting from the initial weights obtained by improved GA optimization, so that the global parallel searching capability of the stochastic algorithm and the local convergence speed of the deterministic algorithm are combined to enhance the training accuracy, stability and speed. Based on the measured electromagnetic characteristics of a 3-phase 12/8-pole SRM, the nonlinear simulation model is built by the hybrid trained WNN in Matlab. The phase current and mechanical characteristics from simulation under different working conditions agree well with those from experiments, which indicates the accuracy of the model for dynamic and static performance evaluation of the SRM and verifies the effectiveness of the proposed modeling method.
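
    The two-stage idea of a stochastic global search that supplies initial weights, followed by deterministic gradient-descent refinement, can be sketched on a toy problem. The example below is only an illustration under stated assumptions: a two-parameter linear model stands in for the wavelet neural network, the genetic algorithm is a plain truncation-selection variant rather than the authors' improved GA, and the data, function names, and hyperparameters are all made up.

```python
import random

XS = [0.0, 1.0, 2.0, 3.0]
YS = [1.0, 3.0, 5.0, 7.0]  # toy data generated from y = 1 + 2x


def loss(w):
    """Mean squared error of the linear model w[0] + w[1] * x."""
    return sum((w[0] + w[1] * x - y) ** 2 for x, y in zip(XS, YS)) / len(XS)


def ga_search(rng, pop_size=20, generations=10, sigma=0.5):
    """Stage 1: crude genetic search for good initial weights."""
    pop = [[rng.uniform(-5, 5), rng.uniform(-5, 5)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=loss)
        parents = pop[: pop_size // 2]  # keep the better half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Averaging crossover plus Gaussian mutation.
            children.append(
                [(ai + bi) / 2 + rng.gauss(0, sigma) for ai, bi in zip(a, b)]
            )
        pop = parents + children
    return min(pop, key=loss)


def gd_refine(w, lr=0.05, steps=2000):
    """Stage 2: gradient descent on the MSE, starting from the GA result."""
    w = list(w)
    n = len(XS)
    for _ in range(steps):
        errs = [w[0] + w[1] * x - y for x, y in zip(XS, YS)]
        grad0 = 2 * sum(errs) / n
        grad1 = 2 * sum(e * x for e, x in zip(errs, XS)) / n
        w[0] -= lr * grad0
        w[1] -= lr * grad1
    return w


rng = random.Random(0)  # fixed seed so the sketch is reproducible
w_ga = ga_search(rng)
w_gd = gd_refine(w_ga)
```

    On a convex toy loss like this one, gradient descent alone would also converge; the GA stage matters for non-convex losses such as a neural network's, where the starting point determines which local minimum is reached.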

  8. The identification of complete domains within protein sequences using accurate E-values for semi-global alignment

    PubMed Central

    Kann, Maricel G.; Sheetlin, Sergey L.; Park, Yonil; Bryant, Stephen H.; Spouge, John L.

    2007-01-01

    The sequencing of complete genomes has created a pressing need for automated annotation of gene function. Because domains are the basic units of protein function and evolution, a gene can be annotated from a domain database by aligning domains to the corresponding protein sequence. Ideally, complete domains are aligned to protein subsequences, in a ‘semi-global alignment’. Local alignment, which aligns pieces of domains to subsequences, is common in high-throughput annotation applications, however. It is a mature technique, with the heuristics and accurate E-values required for screening large databases and evaluating the screening results. Hidden Markov models (HMMs) provide an alternative theoretical framework for semi-global alignment, but their use is limited because they lack heuristic acceleration and accurate E-values. Our new tool, GLOBAL, overcomes some limitations of previous semi-global HMMs: it has accurate E-values and the possibility of the heuristic acceleration required for high-throughput applications. Moreover, according to a standard of truth based on protein structure, two semi-global HMM alignment tools (GLOBAL and HMMer) had comparable performance in identifying complete domains, but distinctly outperformed two tools based on local alignment. When searching for complete protein domains, therefore, GLOBAL avoids disadvantages commonly associated with HMMs, yet maintains their superior retrieval performance. PMID:17596268

  9. Sequence-based model of gap gene regulatory network.

    PubMed

    Kozlov, Konstantin; Gursky, Vitaly; Kulakovskiy, Ivan; Samsonova, Maria

    2014-01-01

    The detailed analysis of transcriptional regulation is crucially important for understanding biological processes. The gap gene network in Drosophila attracts large interest among researchers studying mechanisms of transcriptional regulation. It implements the most upstream regulatory layer of the segmentation gene network. The knowledge of molecular mechanisms involved in gap gene regulation is far less complete than that of genetics of the system. Mathematical modeling goes beyond insights gained by genetics and molecular approaches. It allows us to reconstruct wild-type gene expression patterns in silico, infer the underlying regulatory mechanism and prove its sufficiency. We developed a new model that provides a dynamical description of gap gene regulatory systems, using detailed DNA-based information, as well as spatial transcription factor concentration data at varying time points. We showed that this model correctly reproduces gap gene expression patterns in wild-type embryos and is able to predict gap expression patterns in Kr mutants and four reporter constructs. We used a four-fold cross-validation test and fitting to a random dataset to validate the model and prove its sufficiency in data description. The identifiability analysis showed that most model parameters are well identifiable. We reconstructed the gap gene network topology and studied the impact of individual transcription factor binding sites on the model output. We measured this impact by calculating the site regulatory weight as a normalized difference between the residual sum of squares error for the set of all annotated sites and for the set with the site of interest excluded. The reconstructed topology of the gap gene network is in agreement with previous modeling results and data from literature. We showed that 1) the regulatory weights of transcription factor binding sites show very weak correlation with their PWM score; 2) sites with low regulatory weight are important for the model output; 3
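
    The site regulatory weight described in this record can be illustrated with a few lines of code. The sketch below is a guess at one reasonable form: the abstract does not state the normalizer or the sign convention, so dividing by the full-model residual sum of squares and subtracting in the order shown are assumptions, as are the function names and the toy numbers.

```python
def rss(observed, predicted):
    """Residual sum of squares between data and model output."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))


def regulatory_weight(observed, pred_all_sites, pred_without_site):
    """Relative increase in RSS when one binding site is excluded.

    Assumption: normalized by the RSS of the model with all sites.
    """
    rss_all = rss(observed, pred_all_sites)
    rss_without = rss(observed, pred_without_site)
    return (rss_without - rss_all) / rss_all


# Made-up expression values at three positions.
observed = [1.0, 2.0, 3.0]
pred_all_sites = [1.0, 2.0, 2.0]     # RSS = 1.0
pred_without_site = [0.0, 2.0, 1.0]  # RSS = 5.0
weight = regulatory_weight(observed, pred_all_sites, pred_without_site)
```

    With these toy numbers the weight is 4.0, meaning that removing the site makes the fit four full-model RSS units worse.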

  10. Accurate Induction Energies for Small Organic Molecules. 2. Development and Testing of Distributed Polarizability Models against SAPT(DFT) Energies.

    PubMed

    Misquitta, Alston J; Stone, Anthony J; Price, Sarah L

    2008-01-01

    In part 1 of this two-part investigation we set out the theoretical basis for constructing accurate models of the induction energy of clusters of moderately sized organic molecules. In this paper we use these techniques to develop a variety of accurate distributed polarizability models for a set of representative molecules that include formamide, N-methyl propanamide, benzene, and 3-azabicyclo[3.3.1]nonane-2,4-dione. We have also explored damping, penetration, and basis set effects. In particular, we have provided a way to treat the damping of the induction expansion. Different approximations to the induction energy are evaluated against accurate SAPT(DFT) energies, and we demonstrate the accuracy of our induction models on the formamide-water dimer.

  11. Genotype-based association models of complex diseases to detect gene-gene and gene-environment interactions.

    PubMed

    Lobach, Iryna; Fan, Ruzong; Manga, Prashiela

    A central problem in genetic epidemiology is to identify and rank genetic markers involved in a disease. Complex diseases, such as cancer, hypertension, diabetes, are thought to be caused by an interaction of a panel of genetic factors, that can be identified by markers, which modulate environmental factors. Moreover, the effect of each genetic marker may be small. Hence, the association signal may be missed unless a large sample is considered, or a priori biomedical data are used. Recent advances generated a vast variety of a priori information, including linkage maps and information about gene regulatory dependence assembled into curated pathway databases. We propose a genotype-based approach that takes into account linkage disequilibrium (LD) information between genetic markers that are in moderate LD while modeling gene-gene and gene-environment interactions. A major advantage of our method is that the observed genetic information enters a model directly thus eliminating the need to estimate haplotype-phase. Our approach results in an algorithm that is inexpensive computationally and does not suffer from bias induced by haplotype-phase ambiguity. We investigated our model in a series of simulation experiments and demonstrated that the proposed approach results in estimates that are nearly unbiased and have small variability. We applied our method to the analysis of data from a melanoma case-control study and investigated interaction between a set of pigmentation genes and environmental factors defined by age and gender. Furthermore, an application of our method is demonstrated using a study of Alcohol Dependence.

  12. Genomic imprinting—an epigenetic gene-regulatory model

    PubMed Central

    Koerner, Martha V; Barlow, Denise P

    2010-01-01

    Epigenetic mechanisms (Box 1) are considered to play major gene-regulatory roles in development, differentiation and disease. However, the relative importance of epigenetics in defining the mammalian transcriptome in normal and disease states is unknown. The mammalian genome contains only a few model systems where epigenetic gene regulation has been shown to play a major role in transcriptional control. These model systems are important not only to investigate the biological function of known epigenetic modifications but also to identify new and unexpected epigenetic mechanisms in the mammalian genome. Here we review recent progress in understanding how epigenetic mechanisms control imprinted gene expression. PMID:20153958

  13. Digital gene expression for non-model organisms

    PubMed Central

    Hong, Lewis Z.; Li, Jun; Schmidt-Küntzel, Anne; Warren, Wesley C.; Barsh, Gregory S.

    2011-01-01

    Next-generation sequencing technologies offer new approaches for global measurements of gene expression but are mostly limited to organisms for which a high-quality assembled reference genome sequence is available. We present a method for gene expression profiling called EDGE, or EcoP15I-tagged Digital Gene Expression, based on ultra-high-throughput sequencing of 27-bp cDNA fragments that uniquely tag the corresponding gene, thereby allowing direct quantification of transcript abundance. We show that EDGE is capable of assaying for expression in >99% of genes in the genome and achieves saturation after 6–8 million reads. EDGE exhibits very little technical noise, reveals a large (10^6) dynamic range of gene expression, and is particularly suited for quantification of transcript abundance in non-model organisms where a high-quality annotated genome is not available. In a direct comparison with RNA-seq, both methods provide similar assessments of relative transcript abundance, but EDGE does better at detecting gene expression differences for poorly expressed genes and does not exhibit transcript length bias. Applying EDGE to laboratory mice, we show that a loss-of-function mutation in the melanocortin 1 receptor (Mc1r), recognized as a Mendelian determinant of yellow hair color in many different mammals, also causes reduced expression of genes involved in the interferon response. To illustrate the application of EDGE to a non-model organism, we examine skin biopsy samples from a cheetah (Acinonyx jubatus) and identify genes likely to control differences in the color of spotted versus non-spotted regions. PMID:21844123

  14. Helicopter flight dynamics simulation with a time-accurate free-vortex wake model

    NASA Astrophysics Data System (ADS)

    Ribera, Maria

    This dissertation describes the implementation and validation of a coupled rotor-fuselage simulation model with a time-accurate free-vortex wake model capable of capturing the response to maneuvers of arbitrary amplitude. The resulting model has been used to analyze different flight conditions, including both steady and transient maneuvers. The flight dynamics model is based on a system of coupled nonlinear rotor-fuselage differential equations in first-order, state-space form. The rotor model includes flexible blades, with coupled flap-lag-torsion dynamics and swept tips; the rigid body dynamics are modeled with the non-linear Euler equations. The free wake models the rotor flow field by tracking the vortices released at the blade tips. Their behavior is described by the equations of vorticity transport, which is approximated using finite differences, and solved using a time-accurate numerical scheme. The flight dynamics model can be solved as a system of non-linear algebraic trim equations to determine the steady state solution, or integrated in time in response to pilot-applied controls. This study also implements new approaches to reduce the prohibitive computational costs associated with such complex models without losing accuracy. The mathematical model was validated for trim conditions in level flight, turns, climbs and descents. The results obtained correlate well with flight test data, both in level flight as well as turning and climbing and descending flight. The swept tip model was also found to improve the trim predictions, particularly at high speed. The behavior of the rigid body and the rotor blade dynamics were also studied and related to the aerodynamic load distributions obtained with the free wake induced velocities. The model was also validated in a lateral maneuver from hover. The results show improvements in the on-axis prediction, and indicate a possible relation between the off-axis prediction and the lack of rotor-body interaction

  15. An accurate and efficient laser-envelope solver for the modeling of laser-plasma accelerators

    DOE PAGES

    Benedetti, C.; Schroeder, C. B.; Geddes, C. G. R.; ...

    2017-10-17

    Detailed and reliable numerical modeling of laser-plasma accelerators (LPAs), where a short and intense laser pulse interacts with an underdense plasma over distances of up to a meter, is a formidably challenging task. This is due to the great disparity among the length scales involved in the modeling, ranging from the micron scale of the laser wavelength to the meter scale of the total laser-plasma interaction length. The use of the time-averaged ponderomotive force approximation, where the laser pulse is described by means of its envelope, enables efficient modeling of LPAs by removing the need to model the details of electron motion at the laser wavelength scale. Furthermore, it allows simulations in cylindrical geometry which captures relevant 3D physics at 2D computational cost. A key element of any code based on the time-averaged ponderomotive force approximation is the laser envelope solver. In this paper we present the accurate and efficient envelope solver used in the code INF&RNO (INtegrated Fluid & paRticle simulatioN cOde). The features of the INF&RNO laser solver enable an accurate description of the laser pulse evolution deep into depletion even at a reasonably low resolution, resulting in significant computational speed-ups.

  16. An accurate and efficient laser-envelope solver for the modeling of laser-plasma accelerators

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Benedetti, C.; Schroeder, C. B.; Geddes, C. G. R.

    Detailed and reliable numerical modeling of laser-plasma accelerators (LPAs), where a short and intense laser pulse interacts with an underdense plasma over distances of up to a meter, is a formidably challenging task. This is due to the great disparity among the length scales involved in the modeling, ranging from the micron scale of the laser wavelength to the meter scale of the total laser-plasma interaction length. The use of the time-averaged ponderomotive force approximation, where the laser pulse is described by means of its envelope, enables efficient modeling of LPAs by removing the need to model the details of electron motion at the laser wavelength scale. Furthermore, it allows simulations in cylindrical geometry which captures relevant 3D physics at 2D computational cost. A key element of any code based on the time-averaged ponderomotive force approximation is the laser envelope solver. In this paper we present the accurate and efficient envelope solver used in the code INF&RNO (INtegrated Fluid & paRticle simulatioN cOde). The features of the INF&RNO laser solver enable an accurate description of the laser pulse evolution deep into depletion even at a reasonably low resolution, resulting in significant computational speed-ups.

  17. An accurate and efficient laser-envelope solver for the modeling of laser-plasma accelerators

    NASA Astrophysics Data System (ADS)

    Benedetti, C.; Schroeder, C. B.; Geddes, C. G. R.; Esarey, E.; Leemans, W. P.

    2018-01-01

    Detailed and reliable numerical modeling of laser-plasma accelerators (LPAs), where a short and intense laser pulse interacts with an underdense plasma over distances of up to a meter, is a formidably challenging task. This is due to the great disparity among the length scales involved in the modeling, ranging from the micron scale of the laser wavelength to the meter scale of the total laser-plasma interaction length. The use of the time-averaged ponderomotive force approximation, where the laser pulse is described by means of its envelope, enables efficient modeling of LPAs by removing the need to model the details of electron motion at the laser wavelength scale. Furthermore, it allows simulations in cylindrical geometry which captures relevant 3D physics at 2D computational cost. A key element of any code based on the time-averaged ponderomotive force approximation is the laser envelope solver. In this paper we present the accurate and efficient envelope solver used in the code INF&RNO (INtegrated Fluid & paRticle simulatioN cOde). The features of the INF&RNO laser solver enable an accurate description of the laser pulse evolution deep into depletion even at a reasonably low resolution, resulting in significant computational speed-ups.

  18. Probabilistic modeling of the evolution of gene synteny within reconciled phylogenies

    PubMed Central

    2015-01-01

    Background Most models of genome evolution concern either genetic sequences, gene content or gene order. They sometimes integrate two of the three levels, but rarely the three of them. Probabilistic models of gene order evolution usually have to assume constant gene content or adopt a presence/absence coding of gene neighborhoods which is blind to complex events modifying gene content. Results We propose a probabilistic evolutionary model for gene neighborhoods, allowing genes to be inserted, duplicated or lost. It uses reconciled phylogenies, which integrate sequence and gene content evolution. We are then able to optimize parameters such as phylogeny branch lengths, or probabilistic laws depicting the diversity of susceptibility of syntenic regions to rearrangements. We reconstruct a structure for ancestral genomes by optimizing a likelihood, keeping track of all evolutionary events at the level of gene content and gene synteny. Ancestral syntenies are associated with a probability of presence. We implemented the model with the restriction that at most one gene duplication separates two gene speciations in reconciled gene trees. We reconstruct ancestral syntenies on a set of 12 drosophila genomes, and compare the evolutionary rates along the branches and along the sites. We compare with a parsimony method and find a significant number of results not supported by the posterior probability. The model is implemented in the Bio++ library. It thus benefits from and enriches the classical models and methods for molecular evolution. PMID:26452018

  19. An accurate behavioral model for single-photon avalanche diode statistical performance simulation

    NASA Astrophysics Data System (ADS)

    Xu, Yue; Zhao, Tingchen; Li, Ding

    2018-01-01

    An accurate behavioral model is presented to simulate important statistical performance of single-photon avalanche diodes (SPADs), such as dark count and after-pulsing noise. The derived simulation model takes into account all important generation mechanisms of the two kinds of noise. For the first time, thermal agitation, trap-assisted tunneling and band-to-band tunneling mechanisms are simultaneously incorporated in the simulation model to evaluate the dark count behavior of SPADs fabricated in deep sub-micron CMOS technology. Meanwhile, a complete carrier trapping and de-trapping process is considered in the after-pulsing model and a simple analytical expression is derived to estimate after-pulsing probability. In particular, the key model parameters of avalanche triggering probability and electric field dependence of excess bias voltage are extracted from Geiger-mode TCAD simulation, and this behavioral simulation model does not include any empirical parameters. The developed SPAD model is implemented in the Verilog-A behavioral hardware description language and successfully operated on the commercial Cadence Spectre simulator, showing good universality and compatibility. The model simulation results are in good agreement with the test data, validating high simulation accuracy.

  20. Accurate atom-mapping computation for biochemical reactions.

    PubMed

    Latendresse, Mario; Malerich, Jeremiah P; Travers, Mike; Karp, Peter D

    2012-11-26

    The complete atom mapping of a chemical reaction is a bijection of the reactant atoms to the product atoms that specifies the terminus of each reactant atom. Atom mapping of biochemical reactions is useful for many applications of systems biology, in particular for metabolic engineering where synthesizing new biochemical pathways has to account for the number of carbon atoms from a source compound that are conserved in the synthesis of a target compound. Rapid, accurate computation of the atom mapping(s) of a biochemical reaction remains elusive despite significant work on this topic. In particular, past researchers did not validate the accuracy of mapping algorithms. We introduce a new method for computing atom mappings called the minimum weighted edit-distance (MWED) metric. The metric is based on bond propensity to react and computes biochemically valid atom mappings for a large percentage of biochemical reactions. MWED models can be formulated efficiently as Mixed-Integer Linear Programs (MILPs). We have demonstrated this approach on 7501 reactions of the MetaCyc database for which 87% of the models could be solved in less than 10 s. For 2.1% of the reactions, we found multiple optimal atom mappings. We show that the error rate is 0.9% (22 reactions) by comparing these atom mappings to 2446 atom mappings of the manually curated Kyoto Encyclopedia of Genes and Genomes (KEGG) RPAIR database. To our knowledge, our computational atom-mapping approach is the most accurate and among the fastest published to date. The atom-mapping data will be available in the MetaCyc database later in 2012; the atom-mapping software will be available within the Pathway Tools software later in 2012.

  1. Novel gene sets improve set-level classification of prokaryotic gene expression data.

    PubMed

    Holec, Matěj; Kuželka, Ondřej; Železný, Filip

    2015-10-28

    Set-level classification of gene expression data has received significant attention recently. In this setting, high-dimensional vectors of features corresponding to genes are converted into lower-dimensional vectors of features corresponding to biologically interpretable gene sets. The dimensionality reduction brings the promise of a decreased risk of overfitting, potentially resulting in improved accuracy of the learned classifiers. However, recent empirical research has not confirmed this expectation. Here we hypothesize that the reported unfavorable classification results in the set-level framework were due to the adoption of unsuitable gene sets defined typically on the basis of the Gene Ontology and the KEGG database of metabolic networks. We explore an alternative approach to defining gene sets, based on regulatory interactions, which we expect to collect genes with more correlated expression. We hypothesize that such more correlated gene sets will enable learning of more accurate classifiers. We define two families of gene sets using information on regulatory interactions, and evaluate them on phenotype-classification tasks using public prokaryotic gene expression data sets. From each of the two gene-set families, we first select the best-performing subtype. The two selected subtypes are then evaluated on independent (testing) data sets against state-of-the-art gene sets and against the conventional gene-level approach. The novel gene sets are indeed more correlated than the conventional ones, and lead to significantly more accurate classifiers. Novel gene sets defined on the basis of regulatory interactions improve set-level classification of gene expression data. The experimental scripts and other material needed to reproduce the experiments are available at http://ida.felk.cvut.cz/novelgenesets.tar.gz.

  2. Genome-Scale Metabolic Model for the Green Alga Chlorella vulgaris UTEX 395 Accurately Predicts Phenotypes under Autotrophic, Heterotrophic, and Mixotrophic Growth Conditions

    PubMed Central

    Zuñiga, Cristal; Li, Chien-Ting; Zielinski, Daniel C.; Guarnieri, Michael T.; Antoniewicz, Maciek R.; Zengler, Karsten

    2016-01-01

    The green microalga Chlorella vulgaris has been widely recognized as a promising candidate for biofuel production due to its ability to store high lipid content and its natural metabolic versatility. Compartmentalized genome-scale metabolic models constructed from genome sequences enable quantitative insight into the transport and metabolism of compounds within a target organism. These metabolic models have long been utilized to generate optimized design strategies for an improved production process. Here, we describe the reconstruction, validation, and application of a genome-scale metabolic model for C. vulgaris UTEX 395, iCZ843. The reconstruction represents the most comprehensive model for any eukaryotic photosynthetic organism to date, based on the genome size and number of genes in the reconstruction. The highly curated model accurately predicts phenotypes under photoautotrophic, heterotrophic, and mixotrophic conditions. The model was validated against experimental data and lays the foundation for model-driven strain design and medium alteration to improve yield. Calculated flux distributions under different trophic conditions show that a number of key pathways are affected by nitrogen starvation conditions, including central carbon metabolism and amino acid, nucleotide, and pigment biosynthetic pathways. Furthermore, model prediction of growth rates under various medium compositions and subsequent experimental validation showed an increased growth rate with the addition of tryptophan and methionine. PMID:27372244

  3. Double Cluster Heads Model for Secure and Accurate Data Fusion in Wireless Sensor Networks

    PubMed Central

    Fu, Jun-Song; Liu, Yun

    2015-01-01

    Secure and accurate data fusion is an important issue in wireless sensor networks (WSNs) and has been extensively researched in the literature. In this paper, by combining clustering techniques, reputation and trust systems, and data fusion algorithms, we propose a novel cluster-based data fusion model called Double Cluster Heads Model (DCHM) for secure and accurate data fusion in WSNs. Different from traditional clustering models in WSNs, two cluster heads are selected after clustering for each cluster based on the reputation and trust system and they perform data fusion independently of each other. Then, the results are sent to the base station where the dissimilarity coefficient is computed. If the dissimilarity coefficient of the two data fusion results exceeds the threshold preset by the users, the cluster heads will be added to a blacklist, and the cluster heads must be reelected by the sensor nodes in the cluster. Meanwhile, feedback is sent from the base station to the reputation and trust system, which can help us to identify and delete the compromised sensor nodes in time. Through a series of extensive simulations, we found that the DCHM performed very well in data fusion security and accuracy. PMID:25608211
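
    The base-station check described in this record can be sketched as follows. This is a minimal illustration, not the authors' algorithm: the abstract does not define the fusion rule or the dissimilarity coefficient, so averaging the readings and using a relative absolute difference are assumptions, and all names, readings, and the threshold are hypothetical.

```python
from statistics import mean


def fuse(readings):
    """Assumed fusion rule: each cluster head averages its readings."""
    return mean(readings)


def dissimilarity(a, b):
    """Assumed coefficient: absolute difference relative to the larger value."""
    denom = max(abs(a), abs(b))
    return 0.0 if denom == 0 else abs(a - b) / denom


def base_station_check(result_head1, result_head2, threshold):
    """Compare the two fusion results and flag the heads for re-election."""
    d = dissimilarity(result_head1, result_head2)
    return {"dissimilarity": d, "reelect_heads": d > threshold}


readings = [20.0, 21.0, 19.0, 20.0]  # made-up sensor values
honest = fuse(readings)              # 20.0
tampered = 30.0                      # value a compromised head might report

ok = base_station_check(honest, fuse(readings), threshold=0.1)
alarm = base_station_check(honest, tampered, threshold=0.1)
```

    When both heads report the same fused value the check passes; a tampered report pushes the coefficient above the threshold, which in the described scheme would trigger blacklisting and re-election.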

  4. Duchenne Muscular Dystrophy Gene Therapy in the Canine Model

    PubMed Central

    2015-01-01

    Duchenne muscular dystrophy (DMD) is an X-linked lethal muscle disease caused by dystrophin deficiency. Gene therapy has significantly improved the outcome of dystrophin-deficient mice. Yet, clinical translation has not resulted in the expected benefits in human patients. This translational gap is largely because of the insufficient modeling of DMD in mice. Specifically, mice lacking dystrophin show minimal dystrophic symptoms, and they do not respond to the gene therapy vector in the same way as human patients do. Further, a mouse is hundreds of times smaller than a boy, making it impossible to scale up gene therapy in a mouse model. None of these limitations exist in the canine DMD (cDMD) model. For this reason, cDMD dogs have been considered a highly valuable platform to test experimental DMD gene therapy. Over the last three decades, a variety of gene therapy approaches have been evaluated in cDMD dogs using a number of nonviral and viral vectors. These studies have provided critical insight for the development of an effective gene therapy protocol in human patients. This review discusses the history, current status, and future directions of DMD gene therapy in the canine model. PMID:25710459

  5. Learning Petri net models of non-linear gene interactions.

    PubMed

    Mayo, Michael

    2005-10-01

    Understanding how an individual's genetic make-up influences their risk of disease is a problem of paramount importance. Although machine-learning techniques are able to uncover the relationships between genotype and disease, the problem of automatically building the best biochemical model or "explanation" of the relationship has received less attention. In this paper, I describe a method based on random hill climbing that automatically builds Petri net models of non-linear (or multi-factorial) disease-causing gene-gene interactions. Petri nets are a suitable formalism for this problem, because they are used to model concurrent, dynamic processes analogous to biochemical reaction networks. I show that this method is routinely able to identify perfect Petri net models for three disease-causing gene-gene interactions recently reported in the literature.

  6. Gene therapy in animal models of autosomal dominant retinitis pigmentosa

    PubMed Central

    Rossmiller, Brian; Mao, Haoyu

    2012-01-01

    Gene therapy for dominantly inherited genetic disease is more difficult than gene-based therapy for recessive disorders, which can be treated with gene supplementation. Treatment of dominant disease may require gene supplementation partnered with suppression of the expression of the mutant gene either at the DNA level, by gene repair, or at the RNA level by RNA interference or transcriptional repression. In this review, we examine some of the gene delivery approaches used to treat animal models of autosomal dominant retinitis pigmentosa, focusing on those models associated with mutations in the gene for rhodopsin. We conclude that combinatorial approaches have the greatest promise for success. PMID:23077406

  7. Genome-Scale Metabolic Model for the Green Alga Chlorella vulgaris UTEX 395 Accurately Predicts Phenotypes under Autotrophic, Heterotrophic, and Mixotrophic Growth Conditions

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zuniga, Cristal; Li, Chien -Ting; Huelsman, Tyler

    The green microalga Chlorella vulgaris has been widely recognized as a promising candidate for biofuel production due to its ability to store high lipid content and its natural metabolic versatility. Compartmentalized genome-scale metabolic models constructed from genome sequences enable quantitative insight into the transport and metabolism of compounds within a target organism. These metabolic models have long been utilized to generate optimized design strategies for an improved production process. Here, we describe the reconstruction, validation, and application of a genome-scale metabolic model for C. vulgaris UTEX 395, iCZ843. The reconstruction represents the most comprehensive model for any eukaryotic photosynthetic organism to date, based on the genome size and number of genes in the reconstruction. The highly curated model accurately predicts phenotypes under photoautotrophic, heterotrophic, and mixotrophic conditions. The model was validated against experimental data and lays the foundation for model-driven strain design and medium alteration to improve yield. Calculated flux distributions under different trophic conditions show that a number of key pathways are affected by nitrogen starvation conditions, including central carbon metabolism and amino acid, nucleotide, and pigment biosynthetic pathways. Moreover, model prediction of growth rates under various medium compositions and subsequent experimental validation showed an increased growth rate with the addition of tryptophan and methionine.

  8. Genome-Scale Metabolic Model for the Green Alga Chlorella vulgaris UTEX 395 Accurately Predicts Phenotypes under Autotrophic, Heterotrophic, and Mixotrophic Growth Conditions

    DOE PAGES

    Zuniga, Cristal; Li, Chien-Ting; Huelsman, Tyler; ...

    2016-07-02

    The green microalga Chlorella vulgaris has been widely recognized as a promising candidate for biofuel production due to its ability to store high lipid content and its natural metabolic versatility. Compartmentalized genome-scale metabolic models constructed from genome sequences enable quantitative insight into the transport and metabolism of compounds within a target organism. These metabolic models have long been utilized to generate optimized design strategies for an improved production process. Here, we describe the reconstruction, validation, and application of a genome-scale metabolic model for C. vulgaris UTEX 395, iCZ843. The reconstruction represents the most comprehensive model for any eukaryotic photosynthetic organism to date, based on the genome size and number of genes in the reconstruction. The highly curated model accurately predicts phenotypes under photoautotrophic, heterotrophic, and mixotrophic conditions. The model was validated against experimental data and lays the foundation for model-driven strain design and medium alteration to improve yield. Calculated flux distributions under different trophic conditions show that a number of key pathways are affected by nitrogen starvation conditions, including central carbon metabolism and amino acid, nucleotide, and pigment biosynthetic pathways. Moreover, model prediction of growth rates under various medium compositions and subsequent experimental validation showed an increased growth rate with the addition of tryptophan and methionine.

  9. Genome-Scale Metabolic Model for the Green Alga Chlorella vulgaris UTEX 395 Accurately Predicts Phenotypes under Autotrophic, Heterotrophic, and Mixotrophic Growth Conditions.

    PubMed

    Zuñiga, Cristal; Li, Chien-Ting; Huelsman, Tyler; Levering, Jennifer; Zielinski, Daniel C; McConnell, Brian O; Long, Christopher P; Knoshaug, Eric P; Guarnieri, Michael T; Antoniewicz, Maciek R; Betenbaugh, Michael J; Zengler, Karsten

    2016-09-01

    The green microalga Chlorella vulgaris has been widely recognized as a promising candidate for biofuel production due to its ability to store high lipid content and its natural metabolic versatility. Compartmentalized genome-scale metabolic models constructed from genome sequences enable quantitative insight into the transport and metabolism of compounds within a target organism. These metabolic models have long been utilized to generate optimized design strategies for an improved production process. Here, we describe the reconstruction, validation, and application of a genome-scale metabolic model for C. vulgaris UTEX 395, iCZ843. The reconstruction represents the most comprehensive model for any eukaryotic photosynthetic organism to date, based on the genome size and number of genes in the reconstruction. The highly curated model accurately predicts phenotypes under photoautotrophic, heterotrophic, and mixotrophic conditions. The model was validated against experimental data and lays the foundation for model-driven strain design and medium alteration to improve yield. Calculated flux distributions under different trophic conditions show that a number of key pathways are affected by nitrogen starvation conditions, including central carbon metabolism and amino acid, nucleotide, and pigment biosynthetic pathways. Furthermore, model prediction of growth rates under various medium compositions and subsequent experimental validation showed an increased growth rate with the addition of tryptophan and methionine. © 2016 American Society of Plant Biologists. All rights reserved.

  10. A hamster model for Marburg virus infection accurately recapitulates Marburg hemorrhagic fever

    PubMed Central

    Marzi, Andrea; Banadyga, Logan; Haddock, Elaine; Thomas, Tina; Shen, Kui; Horne, Eva J.; Scott, Dana P.; Feldmann, Heinz; Ebihara, Hideki

    2016-01-01

    Marburg virus (MARV), a close relative of Ebola virus, is the causative agent of a severe human disease known as Marburg hemorrhagic fever (MHF). No licensed vaccine or therapeutic exists to treat MHF, and MARV is therefore classified as a Tier 1 select agent and a category A bioterrorism agent. In order to develop countermeasures against this severe disease, animal models that accurately recapitulate human disease are required. Here we describe the development of a novel, uniformly lethal Syrian golden hamster model of MHF using a hamster-adapted MARV variant Angola. Remarkably, this model displayed almost all of the clinical features of MHF seen in humans and non-human primates, including coagulation abnormalities, hemorrhagic manifestations, petechial rash, and a severely dysregulated immune response. This MHF hamster model represents a powerful tool for further dissecting MARV pathogenesis and accelerating the development of effective medical countermeasures against human MHF. PMID:27976688

  11. A hamster model for Marburg virus infection accurately recapitulates Marburg hemorrhagic fever.

    PubMed

    Marzi, Andrea; Banadyga, Logan; Haddock, Elaine; Thomas, Tina; Shen, Kui; Horne, Eva J; Scott, Dana P; Feldmann, Heinz; Ebihara, Hideki

    2016-12-15

    Marburg virus (MARV), a close relative of Ebola virus, is the causative agent of a severe human disease known as Marburg hemorrhagic fever (MHF). No licensed vaccine or therapeutic exists to treat MHF, and MARV is therefore classified as a Tier 1 select agent and a category A bioterrorism agent. In order to develop countermeasures against this severe disease, animal models that accurately recapitulate human disease are required. Here we describe the development of a novel, uniformly lethal Syrian golden hamster model of MHF using a hamster-adapted MARV variant Angola. Remarkably, this model displayed almost all of the clinical features of MHF seen in humans and non-human primates, including coagulation abnormalities, hemorrhagic manifestations, petechial rash, and a severely dysregulated immune response. This MHF hamster model represents a powerful tool for further dissecting MARV pathogenesis and accelerating the development of effective medical countermeasures against human MHF.

  12. Can phenological models predict tree phenology accurately in the future? The unrevealed hurdle of endodormancy break.

    PubMed

    Chuine, Isabelle; Bonhomme, Marc; Legave, Jean-Michel; García de Cortázar-Atauri, Iñaki; Charrier, Guillaume; Lacointe, André; Améglio, Thierry

    2016-10-01

    The onset of the growing season of trees has been earlier by 2.3 days per decade during the last 40 years in temperate Europe because of global warming. The effect of temperature on plant phenology is, however, not linear because temperature has a dual effect on bud development. On one hand, low temperatures are necessary to break bud endodormancy, and, on the other hand, higher temperatures are necessary to promote bud cell growth afterward. Different process-based models have been developed in the last decades to predict the date of budbreak of woody species. They predict that global warming should delay or compromise endodormancy break at the species equatorward range limits, leading to a delay or even impossibility to flower or set new leaves. These models are classically parameterized with flowering or budbreak dates only, with no information on the endodormancy break date because this information is very scarce. Here, we evaluated the efficiency of a set of phenological models to accurately predict the endodormancy break dates of three fruit trees. Our results show that models calibrated solely with budbreak dates usually do not accurately predict the endodormancy break date. Providing the endodormancy break date for the model parameterization results in much more accurate prediction of the latter, with, however, a higher error than that on budbreak dates. Most importantly, we show that models not calibrated with endodormancy break dates can generate large discrepancies in forecasted budbreak dates when using climate scenarios as compared to models calibrated with endodormancy break dates. This discrepancy increases with mean annual temperature and is therefore the strongest after 2050 in the southernmost regions. Our results call for urgent, large-scale measurements of endodormancy break dates in forest and fruit trees to yield more robust projections of phenological changes in the near future. © 2016 John Wiley & Sons Ltd.

  13. Genome-wide prediction and analysis of human tissue-selective genes using microarray expression data

    PubMed Central

    2013-01-01

    Background Understanding how genes are expressed specifically in particular tissues is a fundamental question in developmental biology. Many tissue-specific genes are involved in the pathogenesis of complex human diseases. However, experimental identification of tissue-specific genes is time consuming and difficult. Accurate prediction of tissue-specific gene targets could provide useful information for biomarker development and drug target identification. Results In this study, we have developed a machine learning approach for predicting human tissue-specific genes using microarray expression data. The lists of known tissue-specific genes for different tissues were collected from the UniProt database, and the expression data retrieved from the previously compiled dataset according to the lists were used for input vector encoding. Random Forests (RFs) and Support Vector Machines (SVMs) were used to construct accurate classifiers. The RF classifiers were found to outperform SVM models for tissue-specific gene prediction. The results suggest that the candidate genes for brain or liver specific expression can provide valuable information for further experimental studies. Our approach was also applied for identifying tissue-selective gene targets for different types of tissues. Conclusions A machine learning approach has been developed for accurately identifying the candidate genes for tissue specific/selective expression. The approach provides an efficient way to select some interesting genes for developing new biomedical markers and improve our knowledge of tissue-specific expression. PMID:23369200

  14. Multiscale Methods for Accurate, Efficient, and Scale-Aware Models of the Earth System

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Goldhaber, Steve; Holland, Marika

    The major goal of this project was to contribute improvements to the infrastructure of an Earth System Model in order to support research in the Multiscale Methods for Accurate, Efficient, and Scale-Aware models of the Earth System project. In support of this, the NCAR team accomplished two main tasks: improving input/output performance of the model and improving atmospheric model simulation quality. Improvement of the performance and scalability of data input and diagnostic output within the model required a new infrastructure which can efficiently handle the unstructured grids common in multiscale simulations. This allows for a more computationally efficient model, enabling more years of Earth System simulation. The quality of the model simulations was improved by reducing grid-point noise in the spectral element version of the Community Atmosphere Model (CAM-SE). This was achieved by running the physics of the model using grid-cell data on a finite-volume grid.

  15. Learning a weighted sequence model of the nucleosome core and linker yields more accurate predictions in Saccharomyces cerevisiae and Homo sapiens.

    PubMed

    Reynolds, Sheila M; Bilmes, Jeff A; Noble, William Stafford

    2010-07-08

    DNA in eukaryotes is packaged into a chromatin complex, the most basic element of which is the nucleosome. The precise positioning of the nucleosome cores allows for selective access to the DNA, and the mechanisms that control this positioning are important pieces of the gene expression puzzle. We describe a large-scale nucleosome pattern that jointly characterizes the nucleosome core and the adjacent linkers and is predominantly characterized by long-range oscillations in the mono-, di- and tri-nucleotide content of the DNA sequence, and we show that this pattern can be used to predict nucleosome positions in both Homo sapiens and Saccharomyces cerevisiae more accurately than previously published methods. Surprisingly, in both H. sapiens and S. cerevisiae, the most informative individual features are the mono-nucleotide patterns, although the inclusion of di- and tri-nucleotide features results in improved performance. Our approach combines a much longer pattern than has been previously used to predict nucleosome positioning from sequence (301 base pairs, centered at the position to be scored) with a novel discriminative classification approach that selectively weights the contributions from each of the input features. The resulting scores are relatively insensitive to local AT-content and can be used to accurately discriminate putative dyad positions from adjacent linker regions without requiring an additional dynamic programming step and without the attendant edge effects and assumptions about linker length modeling and overall nucleosome density. Our approach produces the best dyad-linker classification results published to date in H. sapiens, and outperforms two recently published models on a large set of S. cerevisiae nucleosome positions. Our results suggest that in both genomes, a comparable and relatively small fraction of nucleosomes are well-positioned and that these positions are predictable based on sequence alone. We believe that the bulk of the

  16. Learning a Weighted Sequence Model of the Nucleosome Core and Linker Yields More Accurate Predictions in Saccharomyces cerevisiae and Homo sapiens

    PubMed Central

    Reynolds, Sheila M.; Bilmes, Jeff A.; Noble, William Stafford

    2010-01-01

    DNA in eukaryotes is packaged into a chromatin complex, the most basic element of which is the nucleosome. The precise positioning of the nucleosome cores allows for selective access to the DNA, and the mechanisms that control this positioning are important pieces of the gene expression puzzle. We describe a large-scale nucleosome pattern that jointly characterizes the nucleosome core and the adjacent linkers and is predominantly characterized by long-range oscillations in the mono, di- and tri-nucleotide content of the DNA sequence, and we show that this pattern can be used to predict nucleosome positions in both Homo sapiens and Saccharomyces cerevisiae more accurately than previously published methods. Surprisingly, in both H. sapiens and S. cerevisiae, the most informative individual features are the mono-nucleotide patterns, although the inclusion of di- and tri-nucleotide features results in improved performance. Our approach combines a much longer pattern than has been previously used to predict nucleosome positioning from sequence—301 base pairs, centered at the position to be scored—with a novel discriminative classification approach that selectively weights the contributions from each of the input features. The resulting scores are relatively insensitive to local AT-content and can be used to accurately discriminate putative dyad positions from adjacent linker regions without requiring an additional dynamic programming step and without the attendant edge effects and assumptions about linker length modeling and overall nucleosome density. Our approach produces the best dyad-linker classification results published to date in H. sapiens, and outperforms two recently published models on a large set of S. cerevisiae nucleosome positions. Our results suggest that in both genomes, a comparable and relatively small fraction of nucleosomes are well-positioned and that these positions are predictable based on sequence alone. We believe that the bulk of the

  17. Fast and accurate focusing analysis of large photon sieve using pinhole ring diffraction model.

    PubMed

    Liu, Tao; Zhang, Xin; Wang, Lingjie; Wu, Yanxiong; Zhang, Jizhen; Qu, Hemeng

    2015-06-10

    In this paper, we develop a pinhole ring diffraction model for the focusing analysis of a large photon sieve. Instead of analyzing individual pinholes, we discuss the focusing of all of the pinholes in a single ring. An explicit equation for the diffracted field of an individual pinhole ring is proposed. We investigate the validity range of this generalized model and analytically describe the sufficient conditions for the validity of this pinhole ring diffraction model. A practical example and investigation reveal the high accuracy of the pinhole ring diffraction model. This simulation method could be used for fast and accurate focusing analysis of a large photon sieve.

  18. Branch and bound algorithm for accurate estimation of analytical isotropic bidirectional reflectance distribution function models.

    PubMed

    Yu, Chanki; Lee, Sang Wook

    2016-05-20

    We present a reliable and accurate global optimization framework for estimating parameters of isotropic analytical bidirectional reflectance distribution function (BRDF) models. This approach is based on a branch and bound strategy with linear programming and interval analysis. Conventional local optimization is often very inefficient for BRDF estimation since its fitting quality is highly dependent on initial guesses due to the nonlinearity of analytical BRDF models. The algorithm presented in this paper employs L1-norm error minimization to estimate BRDF parameters in a globally optimal way and interval arithmetic to derive our feasibility problem and lower bounding function. Our method is developed for the Cook-Torrance model but with several normal distribution functions such as the Beckmann, Berry, and GGX functions. Experiments have been carried out to validate the presented method using 100 isotropic materials from the MERL BRDF database, and our experimental results demonstrate that the L1-norm minimization provides a more accurate and reliable solution than the L2-norm minimization.
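
    As a rough illustration of the difference between L1-norm and L2-norm error minimization mentioned in this abstract, the sketch below fits a single constant to a handful of values. It is not the branch and bound BRDF method itself: the sample values are hypothetical, and the only fact relied on is that the mean minimizes the sum of squared residuals while the median minimizes the sum of absolute residuals.

```python
from statistics import mean, median

# Hypothetical measurements; the last value plays the role of an outlier.
samples = [0.50, 0.52, 0.49, 0.51, 5.00]

# L2-norm fit of a constant: the mean minimizes the sum of squared residuals.
l2_fit = mean(samples)

# L1-norm fit of a constant: the median minimizes the sum of absolute residuals.
l1_fit = median(samples)

print(f"L2 fit (mean):   {l2_fit:.3f}")
print(f"L1 fit (median): {l1_fit:.3f}")
```

    In this toy case the L1 fit stays near the bulk of the data while the L2 fit is pulled toward the outlier, which is one commonly cited motivation for L1-norm fitting; whether that is the mechanism behind the result reported above is not stated in the abstract.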

  19. Pre-Modeling Ensures Accurate Solid Models

    ERIC Educational Resources Information Center

    Gow, George

    2010-01-01

    Successful solid modeling requires a well-organized design tree. The design tree is a list of all the object's features and the sequential order in which they are modeled. The solid-modeling process is faster and less prone to modeling errors when the design tree is a simple and geometrically logical definition of the modeled object. Few high…

  20. A random set scoring model for prioritization of disease candidate genes using protein complexes and data-mining of GeneRIF, OMIM and PubMed records.

    PubMed

    Jiang, Li; Edwards, Stefan M; Thomsen, Bo; Workman, Christopher T; Guldbrandtsen, Bernt; Sørensen, Peter

    2014-09-24

    Prioritizing genetic variants is a challenge because disease susceptibility loci are often located in genes of unknown function or the relationship with the corresponding phenotype is unclear. A global data-mining exercise on the biomedical literature can establish the phenotypic profile of genes with respect to their connection to disease phenotypes. The importance of protein-protein interaction networks in the genetic heterogeneity of common diseases or complex traits is becoming increasingly recognized. Thus, the development of a network-based approach combined with phenotypic profiling would be useful for disease gene prioritization. We developed a random-set scoring model and implemented it to quantify phenotype relevance in a network-based disease gene-prioritization approach. We validated our approach based on different gene phenotypic profiles, which were generated from PubMed abstracts, OMIM, and GeneRIF records. We also investigated the validity of several vocabulary filters and different likelihood thresholds for predicted protein-protein interactions in terms of their effect on the network-based gene-prioritization approach, which relies on text-mining of the phenotype data. Our method demonstrated good precision and sensitivity compared with those of two alternative complex-based prioritization approaches. We then conducted a global ranking of all human genes according to their relevance to a range of human diseases. The resulting accurate ranking of known causal genes supported the reliability of our approach. Moreover, these data suggest many promising novel candidate genes for human disorders that have a complex mode of inheritance. We have implemented and validated a network-based approach to prioritize genes for human diseases based on their phenotypic profile. We have devised a powerful and transparent tool to identify and rank candidate genes. Our global gene prioritization provides a unique resource for the biological interpretation of data

  1. An automatic and accurate method of full heart segmentation from CT image based on linear gradient model

    NASA Astrophysics Data System (ADS)

    Yang, Zili

    2017-07-01

    Heart segmentation is an important auxiliary method in the diagnosis of many heart diseases, such as coronary heart disease and atrial fibrillation, and in the planning of tumor radiotherapy. Most of the existing methods for full heart segmentation treat the heart as a single whole and cannot accurately extract the bottom of the heart. In this paper, we propose a new method based on a linear gradient model to segment the whole heart from CT images automatically and accurately. Twelve cases were used to test this method; accurate segmentation results were achieved and confirmed by clinical experts. The results can provide reliable clinical support.

  2. Modeling gene expression measurement error: a quasi-likelihood approach

    PubMed Central

    Strimmer, Korbinian

    2003-01-01

    Background Using suitable error models for gene expression measurements is essential in the statistical analysis of microarray data. However, the true probabilistic model underlying gene expression intensity readings is generally not known. Instead, in currently used approaches some simple parametric model is assumed (usually a transformed normal distribution) or the empirical distribution is estimated. However, both these strategies may not be optimal for gene expression data, as the non-parametric approach ignores known structural information whereas the fully parametric models run the risk of misspecification. A further related problem is the choice of a suitable scale for the model (e.g. observed vs. log-scale). Results Here a simple semi-parametric model for gene expression measurement error is presented. In this approach inference is based on an approximate likelihood function (the extended quasi-likelihood). Only partial knowledge about the unknown true distribution is required to construct this function. In the case of gene expression this information is available in the form of the postulated (e.g. quadratic) variance structure of the data. As the quasi-likelihood behaves (almost) like a proper likelihood, it allows for the estimation of calibration and variance parameters, and it is also straightforward to obtain corresponding approximate confidence intervals. Unlike most other frameworks, it also allows analysis on any preferred scale, i.e. both on the original linear scale as well as on a transformed scale. It can also be employed in regression approaches to model systematic (e.g. array or dye) effects. Conclusions The quasi-likelihood framework provides a simple and versatile approach to analyze gene expression data that does not make any strong distributional assumptions about the underlying error model. For several simulated as well as real data sets it provides a better fit to the data than competing models. In an example it also improved the power of

  3. Deep developmental transcriptome sequencing uncovers numerous new genes and enhances gene annotation in the sponge Amphimedon queenslandica.

    PubMed

    Fernandez-Valverde, Selene L; Calcino, Andrew D; Degnan, Bernard M

    2015-05-15

    The demosponge Amphimedon queenslandica is amongst the few early-branching metazoans with an assembled and annotated draft genome, making it an important species in the study of the origin and early evolution of animals. Current gene models in this species are largely based on in silico predictions and low coverage expressed sequence tag (EST) evidence. Amphimedon queenslandica protein-coding gene models are improved using deep RNA-Seq data from four developmental stages and CEL-Seq data from 82 developmental samples. Over 86% of previously predicted genes are retained in the new gene models, although 24% have additional exons; there is also a marked increase in the total number of annotated 3' and 5' untranslated regions (UTRs). Importantly, these new developmental transcriptome data reveal numerous previously unannotated protein-coding genes in the Amphimedon genome, increasing the total gene number by 25%, from 30,060 to 40,122. In general, Amphimedon genes have introns that are markedly smaller than those in other animals and most of the alternatively spliced genes in Amphimedon undergo intron-retention; exon-skipping is the least common mode of alternative splicing. Finally, in addition to canonical polyadenylation signal sequences, Amphimedon genes are enriched in a number of unique AT-rich motifs in their 3' UTRs. The inclusion of developmental transcriptome data has substantially improved the structure and composition of protein-coding gene models in Amphimedon queenslandica, providing a more accurate and comprehensive set of genes for functional and comparative studies. These improvements reveal the Amphimedon genome is comprised of a remarkably high number of tightly packed genes. These genes have small introns and there is pervasive intron retention amongst alternatively spliced transcripts. These aspects of the sponge genome are more similar to unicellular opisthokont genomes than to other animal genomes.

  4. An accurate fatigue damage model for welded joints subjected to variable amplitude loading

    NASA Astrophysics Data System (ADS)

    Aeran, A.; Siriwardane, S. C.; Mikkelsen, O.; Langen, I.

    2017-12-01

    Researchers in the past have proposed several fatigue damage models to overcome the shortcomings of the commonly used Miner’s rule. However, requirements of material parameters or S-N curve modifications restrict their practical applications. Also, applications of most of these models under variable amplitude loading conditions have not been found. To overcome these restrictions, a new fatigue damage model is proposed in this paper. The proposed model can be applied by practicing engineers using only the S-N curve given in the standard codes of practice. The model is verified with experimentally derived damage evolution curves for C 45 and 16 Mn and gives better agreement compared to previous models. The model-predicted fatigue lives are also in better correlation with experimental results compared to previous models, as shown in earlier published work by the authors. The proposed model is applied to welded joints subjected to variable amplitude loadings in this paper. The model gives around 8% shorter fatigue lives compared to the Miner’s rule given in the Eurocode. This shows the importance of applying accurate fatigue damage models for welded joints.
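
    For readers unfamiliar with the baseline this abstract compares against, the sketch below shows Miner's rule, which sums the ratio of applied cycles to cycles-to-failure over all stress levels, D = sum(n_i / N_i), with failure conventionally predicted when D reaches 1. The single-slope S-N curve, its parameter values, and the load blocks are hypothetical placeholders chosen for illustration; this is not the new damage model proposed by the authors.

```python
def cycles_to_failure(stress_range, reference_stress=71.0, slope=3.0):
    """Hypothetical single-slope S-N curve: N = 2e6 * (reference_stress / stress_range) ** slope."""
    return 2.0e6 * (reference_stress / stress_range) ** slope


def miner_damage(load_blocks, sn_curve=cycles_to_failure):
    """Linear damage sum D = sum(n_i / N_i) over (stress_range, applied_cycles) blocks."""
    return sum(cycles / sn_curve(stress) for stress, cycles in load_blocks)


# Hypothetical variable amplitude history: (stress range in MPa, applied cycles).
blocks = [(71.0, 500_000), (142.0, 62_500)]
damage = miner_damage(blocks)
print(f"Miner damage sum: {damage:.2f}")
```

    Because the sum is linear, Miner's rule ignores the order in which load blocks are applied, which is one of the shortcomings that alternative fatigue damage models typically try to address.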

  5. Modeling the Activity of Single Genes

    NASA Technical Reports Server (NTRS)

    Mjolsness, Eric; Gibson, Michael

    1999-01-01

    The key questions in gene regulation are: What genes are expressed in a certain cell at a certain time? How does gene expression differ from cell to cell in a multicellular organism? Which proteins act as transcription factors, i.e., are important in regulating gene expression? From questions like these, we hope to understand which genes are important for various macroscopic processes. Nearly all of the cells of a multicellular organism contain the same DNA. Yet this same genetic information yields a large number of different cell types. The fundamental difference between a neuron and a liver cell, for example, is which genes are expressed. Thus understanding gene regulation is an important step in understanding development. Furthermore, understanding the usual genes that are expressed in cells may give important clues about various diseases. Some diseases, such as sickle cell anemia and cystic fibrosis, are caused by defects in single, non-regulatory genes; others, such as certain cancers, are caused when the cellular control circuitry malfunctions - an understanding of these diseases will involve pathways of multiple interacting gene products. There are numerous challenges in the area of understanding and modeling gene regulation. First and foremost, biologists would like to develop a deeper understanding of the processes involved, including which genes and families of genes are important, how they interact, etc. From a computational point of view, there has been embarrassingly little work done. In this chapter there are many areas in which we can phrase meaningful, non-trivial computational questions, but questions that have not been addressed. Some of these are purely computational (what is a good algorithm for dealing with a model of type X?) and others are more mathematical (given a system with certain characteristics, what sort of model can one use? How does one find biochemical parameters from system-level behavior using as few experiments as possible?). In

  6. Selection of reliable reference genes for quantitative real-time PCR gene expression analysis in Jute (Corchorus capsularis) under stress treatments

    PubMed Central

    Niu, Xiaoping; Qi, Jianmin; Zhang, Gaoyang; Xu, Jiantang; Tao, Aifen; Fang, Pingping; Su, Jianguang

    2015-01-01

    To accurately measure gene expression using quantitative reverse transcription PCR (qRT-PCR), reliable reference gene(s) are required for data normalization. Corchorus capsularis, an annual herbaceous fiber crop with predominant biodegradability and renewability, has not been investigated for the stability of reference genes with qRT-PCR. In this study, 11 candidate reference genes were selected and their expression levels were assessed using qRT-PCR. To account for the influence of experimental approach and tissue type, 22 different jute samples were selected from abiotic and biotic stress conditions as well as three different tissue types. The stability of the candidate reference genes was evaluated using the geNorm, NormFinder, and BestKeeper programs, and the comprehensive rankings of gene stability were generated by aggregate analysis. For the biotic stress and NaCl stress subsets, ACT7 and RAN were suitable as stable reference genes for gene expression normalization. For the PEG stress subset, UBC and DnaJ were sufficient for accurate normalization. For the tissues subset, four reference genes (TUBβ, UBI, EF1α, and RAN) were sufficient for accurate normalization. The selected genes were further validated by comparing expression profiles of WRKY15 in various samples, and two stable reference genes were recommended for accurate normalization of qRT-PCR data. Our results provide researchers with appropriate reference genes for qRT-PCR in C. capsularis, and will facilitate gene expression study under these conditions. PMID:26528312
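
    To make concrete why stable reference genes matter, the sketch below normalizes a target gene against two reference genes. It assumes 100% amplification efficiency (a doubling per cycle) and uses hypothetical Ct values; the gene names are borrowed from the abstract only as labels. Averaging the reference Ct values is equivalent to taking the geometric mean of the reference quantities, so any drift in a reference gene shifts the normalized result directly.

```python
from statistics import mean


def normalized_expression(target_ct, reference_cts, efficiency=2.0):
    """Target expression relative to the geometric mean of the reference genes.

    Assumes the same amplification efficiency for every gene (hypothetical
    default of 2.0, i.e. perfect doubling per cycle).
    """
    delta_ct = target_ct - mean(reference_cts)
    return efficiency ** (-delta_ct)


# Hypothetical Ct values for one sample.
ct_values = {"WRKY15": 24.0, "ACT7": 20.0, "RAN": 22.0}
value = normalized_expression(
    ct_values["WRKY15"], [ct_values["ACT7"], ct_values["RAN"]]
)
print(f"Normalized WRKY15 expression: {value:.3f}")
```

    If one reference gene's Ct moved by two cycles under a stress treatment, the normalized value here would change twofold even with no change in the target, which is the kind of error that stability rankings are meant to prevent.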

  7. Central nervous system gene expression changes in a transgenic mouse model for bovine spongiform encephalopathy

    PubMed Central

    2011-01-01

    Gene expression analysis has proven to be a very useful tool to gain knowledge of the factors involved in the pathogenesis of diseases, particularly in the initial or preclinical stages. With the aim of finding new data on the events occurring in the Central Nervous System in animals affected with Bovine Spongiform Encephalopathy, a comprehensive genome-wide gene expression study was conducted at different time points of the disease on mice genetically modified to model the bovine species brain in terms of cellular prion protein. An accurate analysis of the information generated by the microarray technique was the key point to assess the biological relevance of the data obtained in terms of Transmissible Spongiform Encephalopathy pathogenesis. Validation of the microarray technique was achieved by RT-PCR confirming the RNA change and immunohistochemistry techniques that verified that expression changes were translated into variable levels of protein for selected genes. Our study reveals changes in the expression of genes, some of them not previously associated with prion diseases, at early stages of the disease prior to the detection of the pathological prion protein, that might have a role in neuronal degeneration, and several transcriptional changes showing an important imbalance in the Central Nervous System homeostasis in advanced stages of the disease. Genes whose expression is altered at early stages of the disease should be considered as possible therapeutic targets and potential disease markers in preclinical diagnostic tool development. Genes not previously related to prion diseases should be taken into consideration for further investigations. PMID:22035425

  8. Genome-Wide Comparative Gene Family Classification

    PubMed Central

    Frech, Christian; Chen, Nansheng

    2010-01-01

    Correct classification of genes into gene families is important for understanding gene function and evolution. Although gene families of many species have been resolved both computationally and experimentally with high accuracy, gene family classification in most newly sequenced genomes has not been done with the same high standard. This project has been designed to develop a strategy to effectively and accurately classify gene families across genomes. We first examine and compare the performance of computer programs developed for automated gene family classification. We demonstrate that some programs, including the hierarchical average-linkage clustering algorithm MC-UPGMA and the popular Markov clustering algorithm TRIBE-MCL, can reconstruct manual curation of gene families accurately. However, their performance is highly sensitive to parameter setting, i.e. different gene families require different program parameters for correct resolution. To circumvent the problem of parameterization, we have developed a comparative strategy for gene family classification. This strategy takes advantage of existing curated gene families of reference species to find suitable parameters for classifying genes in related genomes. To demonstrate the effectiveness of this novel strategy, we use TRIBE-MCL to classify chemosensory and ABC transporter gene families in C. elegans and its four sister species. We conclude that fully automated programs can establish biologically accurate gene families if parameterized accordingly. Comparative gene family classification finds optimal parameters automatically, thus allowing rapid insights into gene families of newly sequenced species. PMID:20976221

  9. Using the gene ontology to scan multilevel gene sets for associations in genome wide association studies.

    PubMed

    Schaid, Daniel J; Sinnwell, Jason P; Jenkins, Gregory D; McDonnell, Shannon K; Ingle, James N; Kubo, Michiaki; Goss, Paul E; Costantino, Joseph P; Wickerham, D Lawrence; Weinshilboum, Richard M

    2012-01-01

    Gene-set analyses have been widely used in gene expression studies, and some of the developed methods have been extended to genome wide association studies (GWAS). Yet, complications due to linkage disequilibrium (LD) among single nucleotide polymorphisms (SNPs), and variable numbers of SNPs per gene and genes per gene-set, have plagued current approaches, often leading to ad hoc "fixes." To overcome some of the current limitations, we developed a general approach to scan GWAS SNP data for both gene-level and gene-set analyses, building on score statistics for generalized linear models, and taking advantage of the directed acyclic graph structure of the gene ontology when creating gene-sets. However, other types of gene-set structures can be used, such as the popular Kyoto Encyclopedia of Genes and Genomes (KEGG). Our approach combines SNPs into genes, and genes into gene-sets, but assures that positive and negative effects of genes on a trait do not cancel. To control for multiple testing of many gene-sets, we use an efficient computational strategy that accounts for LD and provides accurate step-down adjusted P-values for each gene-set. Application of our methods to two different GWAS provide guidance on the potential strengths and weaknesses of our proposed gene-set analyses. © 2011 Wiley Periodicals, Inc.

  10. A Two-Phase Space Resection Model for Accurate Topographic Reconstruction from Lunar Imagery with Pushbroom Scanners

    PubMed Central

    Xu, Xuemiao; Zhang, Huaidong; Han, Guoqiang; Kwan, Kin Chung; Pang, Wai-Man; Fang, Jiaming; Zhao, Gansen

    2016-01-01

    Estimation of exterior orientation parameters (EOPs) using space resection plays an important role in topographic reconstruction for push broom scanners. However, existing models of space resection are highly sensitive to errors in data. Unfortunately, for lunar imagery, the altitude data at the ground control points (GCPs) for space resection are error-prone. Thus, existing models fail to produce reliable EOPs. Motivated by the finding that, for push broom scanners, the angular rotations of EOPs can be estimated independently of the altitude data, using only the geographic data at the GCPs, which are already provided, we divide the modeling of space resection into two phases. First, we estimate the angular rotations based on the reliable geographic data using our proposed mathematical model. Then, with the accurate angular rotations, the collinear equations for space resection are simplified into a linear problem, and the global optimal solution for the spatial position of EOPs can always be achieved. Moreover, a certainty term is integrated to penalize the unreliable altitude data and so increase the error tolerance. Experimental results show that our model obtains more accurate EOPs and topographic maps than the existing space resection model, not only for simulated data but also for real data from Chang’E-1. PMID:27077855

  11. A Two-Phase Space Resection Model for Accurate Topographic Reconstruction from Lunar Imagery with Pushbroom Scanners.

    PubMed

    Xu, Xuemiao; Zhang, Huaidong; Han, Guoqiang; Kwan, Kin Chung; Pang, Wai-Man; Fang, Jiaming; Zhao, Gansen

    2016-04-11

    Estimation of exterior orientation parameters (EOPs) using space resection plays an important role in topographic reconstruction for push broom scanners. However, existing models of space resection are highly sensitive to errors in data. Unfortunately, for lunar imagery, the altitude data at the ground control points (GCPs) for space resection are error-prone. Thus, existing models fail to produce reliable EOPs. Motivated by the finding that, for push broom scanners, the angular rotations of EOPs can be estimated independently of the altitude data, using only the geographic data at the GCPs, which are already provided, we divide the modeling of space resection into two phases. First, we estimate the angular rotations based on the reliable geographic data using our proposed mathematical model. Then, with the accurate angular rotations, the collinear equations for space resection are simplified into a linear problem, and the global optimal solution for the spatial position of EOPs can always be achieved. Moreover, a certainty term is integrated to penalize the unreliable altitude data and so increase the error tolerance. Experimental results show that our model obtains more accurate EOPs and topographic maps than the existing space resection model, not only for simulated data but also for real data from Chang'E-1.

  12. Ab initio gene identification in metagenomic sequences

    PubMed Central

    Zhu, Wenhan; Lomsadze, Alexandre; Borodovsky, Mark

    2010-01-01

    We describe an algorithm for gene identification in DNA sequences derived from shotgun sequencing of microbial communities. Accurate ab initio gene prediction in a short nucleotide sequence of anonymous origin is hampered by uncertainty in model parameters. While several machine learning approaches could be proposed to bypass this difficulty, one effective method is to estimate parameters from dependencies, formed in evolution, between frequencies of oligonucleotides in protein-coding regions and genome nucleotide composition. Original version of the method was proposed in 1999 and has been used since for (i) reconstructing codon frequency vector needed for gene finding in viral genomes and (ii) initializing parameters of self-training gene finding algorithms. With advent of new prokaryotic genomes en masse it became possible to enhance the original approach by using direct polynomial and logistic approximations of oligonucleotide frequencies, as well as by separating models for bacteria and archaea. These advances have increased the accuracy of model reconstruction and, subsequently, gene prediction. We describe the refined method and assess its accuracy on known prokaryotic genomes split into short sequences. Also, we show that as a result of application of the new method, several thousands of new genes could be added to existing annotations of several human and mouse gut metagenomes. PMID:20403810

  13. Turning the gene tap off; implications of regulating gene expression for cancer therapeutics

    PubMed Central

    Curtin, James F.; Candolfi, Marianela; Xiong, Weidong; Lowenstein, Pedro R.; Castro, Maria G.

    2008-01-01

    Cancer poses a tremendous therapeutic challenge worldwide, highlighting the critical need for developing novel therapeutics. A promising cancer treatment modality is gene therapy, which is a form of molecular medicine designed to introduce into target cells genetic material with therapeutic intent. Anticancer gene therapy strategies currently used in preclinical models, and in some cases in the clinic, include proapoptotic genes, oncolytic/replicative vectors, conditional cytotoxic approaches, inhibition of angiogenesis, inhibition of growth factor signaling, inactivation of oncogenes, inhibition of tumor invasion and stimulation of the immune system. The translation of these novel therapeutic modalities from the preclinical setting to the clinic has been driven by encouraging preclinical efficacy data and advances in gene delivery technologies. One area of intense research involves the ability to accurately regulate the levels of therapeutic gene expression to achieve enhanced efficacy and provide the capability to switch gene expression off completely if adverse side effects should arise. This feature could also be implemented to switch gene expression off when a successful therapeutic outcome ensues. Here, we will review recent developments related to the engineering of transcriptional switches within gene delivery systems, which could be implemented in clinical gene therapy applications directed at the treatment of cancer. PMID:18347132

  14. ROCker: accurate detection and quantification of target genes in short-read metagenomic data sets by modeling sliding-window bitscores

    DOE PAGES

    Orellana, Luis H.; Rodriguez-R, Luis M.; Konstantinidis, Konstantinos T.

    2016-10-07

    Functional annotation of metagenomic and metatranscriptomic data sets relies on similarity searches based on e-value thresholds resulting in an unknown number of false positive and negative matches. To overcome these limitations, we introduce ROCker, aimed at identifying position-specific, most-discriminant thresholds in sliding windows along the sequence of a target protein, accounting for non-discriminative domains shared by unrelated proteins. ROCker employs the receiver operating characteristic (ROC) curve to minimize false discovery rate (FDR) and calculate the best thresholds based on how simulated shotgun metagenomic reads of known composition map onto well-curated reference protein sequences and thus, differs from HMM profiles and related methods. We showcase ROCker using ammonia monooxygenase (amoA) and nitrous oxide reductase (nosZ) genes, mediating oxidation of ammonia and the reduction of the potent greenhouse gas, N2O, to inert N2, respectively. ROCker typically showed 60-fold lower FDR when compared to the common practice of using fixed e-values. Previously uncounted ‘atypical’ nosZ genes were found to be two times more abundant, on average, than their typical counterparts in most soil metagenomes and the abundance of bacterial amoA was quantified against the highly-related particulate methane monooxygenase (pmoA). Therefore, ROCker can reliably detect and quantify target genes in short-read metagenomes.
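
    The idea of choosing a separate, most-discriminant bitscore cutoff per window can be sketched as follows. This is a minimal illustration and not ROCker's implementation: it assumes that simulated reads of known origin have already been scored, and it uses Youden's J (true-positive rate minus false-positive rate) as a stand-in selection criterion. The function name, the window indices and the scores are all hypothetical.

```python
def best_threshold(positive_scores, negative_scores):
    """Return (cutoff, J) for the cutoff that maximizes J = TPR - FPR.

    A read is called positive when its bitscore is >= cutoff.
    """
    candidates = sorted(set(positive_scores) | set(negative_scores), reverse=True)
    best = None
    for cutoff in candidates:
        tpr = sum(s >= cutoff for s in positive_scores) / len(positive_scores)
        fpr = sum(s >= cutoff for s in negative_scores) / len(negative_scores)
        j = tpr - fpr
        if best is None or j > best[1]:
            best = (cutoff, j)
    return best


# Hypothetical bitscores of simulated reads, grouped by alignment window:
# window index -> (scores of true target reads, scores of non-target reads)
windows = {
    0: ([60, 55, 50, 30], [45, 35, 20, 10]),
    1: ([40, 38], [20, 12]),
}
thresholds = {w: best_threshold(pos, neg) for w, (pos, neg) in windows.items()}
```

    In this toy input the two windows end up with different cutoffs (50 and 38), which is the behavior that a single fixed e-value threshold cannot reproduce.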

  15. ROCker: accurate detection and quantification of target genes in short-read metagenomic data sets by modeling sliding-window bitscores

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Orellana, Luis H.; Rodriguez-R, Luis M.; Konstantinidis, Konstantinos T.

    Functional annotation of metagenomic and metatranscriptomic data sets relies on similarity searches based on e-value thresholds resulting in an unknown number of false positive and negative matches. To overcome these limitations, we introduce ROCker, aimed at identifying position-specific, most-discriminant thresholds in sliding windows along the sequence of a target protein, accounting for non-discriminative domains shared by unrelated proteins. ROCker employs the receiver operating characteristic (ROC) curve to minimize false discovery rate (FDR) and calculate the best thresholds based on how simulated shotgun metagenomic reads of known composition map onto well-curated reference protein sequences and thus, differs from HMM profiles and related methods. We showcase ROCker using ammonia monooxygenase (amoA) and nitrous oxide reductase (nosZ) genes, mediating oxidation of ammonia and the reduction of the potent greenhouse gas, N2O, to inert N2, respectively. ROCker typically showed 60-fold lower FDR when compared to the common practice of using fixed e-values. Previously uncounted ‘atypical’ nosZ genes were found to be two times more abundant, on average, than their typical counterparts in most soil metagenomes and the abundance of bacterial amoA was quantified against the highly-related particulate methane monooxygenase (pmoA). Therefore, ROCker can reliably detect and quantify target genes in short-read metagenomes.

  16. ROCker: accurate detection and quantification of target genes in short-read metagenomic data sets by modeling sliding-window bitscores

    PubMed Central

    2017-01-01

    Functional annotation of metagenomic and metatranscriptomic data sets relies on similarity searches based on e-value thresholds resulting in an unknown number of false positive and negative matches. To overcome these limitations, we introduce ROCker, aimed at identifying position-specific, most-discriminant thresholds in sliding windows along the sequence of a target protein, accounting for non-discriminative domains shared by unrelated proteins. ROCker employs the receiver operating characteristic (ROC) curve to minimize false discovery rate (FDR) and calculate the best thresholds based on how simulated shotgun metagenomic reads of known composition map onto well-curated reference protein sequences and thus, differs from HMM profiles and related methods. We showcase ROCker using ammonia monooxygenase (amoA) and nitrous oxide reductase (nosZ) genes, mediating oxidation of ammonia and the reduction of the potent greenhouse gas, N2O, to inert N2, respectively. ROCker typically showed 60-fold lower FDR when compared to the common practice of using fixed e-values. Previously uncounted ‘atypical’ nosZ genes were found to be two times more abundant, on average, than their typical counterparts in most soil metagenomes and the abundance of bacterial amoA was quantified against the highly-related particulate methane monooxygenase (pmoA). Therefore, ROCker can reliably detect and quantify target genes in short-read metagenomes. PMID:28180325

  17. Genes and Junk in Plant Mitochondria—Repair Mechanisms and Selection

    PubMed Central

    Christensen, Alan C.

    2014-01-01

    Plant mitochondrial genomes have very low mutation rates. In contrast, they also rearrange and expand frequently. This is easily understood if DNA repair in genes is accomplished by accurate mechanisms, whereas less accurate mechanisms including nonhomologous end joining or break-induced replication are used in nongenes. An important question is how different mechanisms of repair predominate in coding and noncoding DNA, although one possible mechanism is transcription-coupled repair (TCR). This work tests the predictions of TCR and finds no support for it. Examination of the mutation spectra and rates in genes and junk reveals what DNA repair mechanisms are available to plant mitochondria, and what selective forces act on the repair products. A model is proposed that mismatches and other DNA damages are repaired by converting them into double-strand breaks (DSBs). These can then be repaired by any of the DSB repair mechanisms, both accurate and inaccurate. Natural selection will eliminate coding regions repaired by inaccurate mechanisms, accounting for the low mutation rates in genes, whereas mutations, rearrangements, and expansions generated by inaccurate repair in noncoding regions will persist. Support for this model includes the structure of the mitochondrial mutS homolog in plants, which is fused to a double-strand endonuclease. The model proposes that plant mitochondria do not distinguish a damaged or mismatched DNA strand from the undamaged strand, they simply cut both strands and perform homology-based DSB repair. This plant-specific strategy for protecting future generations from mitochondrial DNA damage has the side effect of genome expansions and rearrangements. PMID:24904012

  18. ACTG: novel peptide mapping onto gene models.

    PubMed

    Choi, Seunghyuk; Kim, Hyunwoo; Paek, Eunok

    2017-04-15

    In many proteogenomic applications, mapping peptide sequences onto genome sequences can be very useful, because it allows us to understand the origins of the gene products. Existing software tools either take the genomic position of a peptide start site as an input or assume that the peptide sequence exactly matches the coding sequence of a given gene model. In the case of novel peptides resulting from genomic variations, especially structural variations such as alternative splicing, these existing tools cannot be directly applied unless users supply information about the variant, either its genomic position or its transcription model. Mapping potentially novel peptides to genome sequences, while allowing certain genomic variations, requires introducing novel gene models when aligning peptide sequences to gene structures. We have developed a new tool called ACTG (Amino aCids To Genome), which maps peptides to the genome, assuming all possible single exon skipping, junction variation allowing three edit distances from the original splice sites, exon extension and frame shift. In addition, it can also consider SNVs (single nucleotide variations) during the mapping phase if a user provides a VCF (variant call format) file as an input. Available at http://prix.hanyang.ac.kr/ACTG/search.jsp. Contact: eunokpaek@hanyang.ac.kr. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com

  19. Combinatorial explosion in model gene networks

    NASA Astrophysics Data System (ADS)

    Edwards, R.; Glass, L.

    2000-09-01

    The explosive growth in knowledge of the genome of humans and other organisms leaves open the question of how the functioning of genes in interacting networks is coordinated for orderly activity. One approach to this problem is to study mathematical properties of abstract network models that capture the logical structures of gene networks. The principal issue is to understand how particular patterns of activity can result from particular network structures, and what types of behavior are possible. We study idealized models in which the logical structure of the network is explicitly represented by Boolean functions that can be represented by directed graphs on n-cubes, but which are continuous in time and described by differential equations, rather than being updated synchronously via a discrete clock. The equations are piecewise linear, which allows significant analysis and facilitates rapid integration along trajectories. We first give a combinatorial solution to the question of how many distinct logical structures exist for n-dimensional networks, showing that the number increases very rapidly with n. We then outline analytic methods that can be used to establish the existence, stability and periods of periodic orbits corresponding to particular cycles on the n-cube. We use these methods to confirm the existence of limit cycles discovered in a sample of a million randomly generated structures of networks of 4 genes. Even with only 4 genes, at least several hundred different patterns of stable periodic behavior are possible, many of them surprisingly complex. We discuss ways of further classifying these periodic behaviors, showing that small mutations (reversal of one or a few edges on the n-cube) need not destroy the stability of a limit cycle. Although these networks are very simple as models of gene networks, their mathematical transparency reveals relationships between structure and behavior, they suggest that the possibilities for orderly dynamics in such
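
    The class of models described here, in which Boolean logic drives piecewise-linear differential equations, can be sketched with a three-gene negative feedback loop. This is a minimal illustration under stated assumptions rather than the authors' code: thresholds are placed at 0, focal values at +1 and -1, decay rates are 1, and plain Euler steps are used instead of the exact integration that piecewise linearity permits. The function name and the wiring are hypothetical.

```python
def simulate_glass_loop(x0, dt=0.01, steps=2000):
    """Euler-integrate a 3-gene negative feedback loop of Glass type.

    dx_i/dt = -x_i + F_i(X), where X is the Boolean state (x_j > 0) and
    F_i takes the value +1 or -1. Returns the ordered list of Boolean
    states visited (consecutive repeats are collapsed).
    """
    x = list(x0)
    visited = []
    for _ in range(steps):
        state = tuple(v > 0 for v in x)
        if not visited or visited[-1] != state:
            visited.append(state)
        # Assumed logical structure: gene 3 represses gene 1,
        # gene 1 activates gene 2, gene 2 activates gene 3.
        focal = (
            -1.0 if state[2] else 1.0,
            1.0 if state[0] else -1.0,
            1.0 if state[1] else -1.0,
        )
        # Inside one orthant the equations are linear: relax toward focal.
        x = [v + dt * (-v + f) for v, f in zip(x, focal)]
    return visited


states = simulate_glass_loop((0.5, 0.4, 0.3))
```

    With this wiring the trajectory keeps cycling through six of the eight vertices of the 3-cube and never enters the two remaining ones, a small instance of a cycle on the n-cube of the kind the abstract refers to.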

  20. Combinatorial explosion in model gene networks.

    PubMed

    Edwards, R.; Glass, L.

    2000-09-01

    The explosive growth in knowledge of the genome of humans and other organisms leaves open the question of how the functioning of genes in interacting networks is coordinated for orderly activity. One approach to this problem is to study mathematical properties of abstract network models that capture the logical structures of gene networks. The principal issue is to understand how particular patterns of activity can result from particular network structures, and what types of behavior are possible. We study idealized models in which the logical structure of the network is explicitly represented by Boolean functions that can be represented by directed graphs on n-cubes, but which are continuous in time and described by differential equations, rather than being updated synchronously via a discrete clock. The equations are piecewise linear, which allows significant analysis and facilitates rapid integration along trajectories. We first give a combinatorial solution to the question of how many distinct logical structures exist for n-dimensional networks, showing that the number increases very rapidly with n. We then outline analytic methods that can be used to establish the existence, stability and periods of periodic orbits corresponding to particular cycles on the n-cube. We use these methods to confirm the existence of limit cycles discovered in a sample of a million randomly generated structures of networks of 4 genes. Even with only 4 genes, at least several hundred different patterns of stable periodic behavior are possible, many of them surprisingly complex. We discuss ways of further classifying these periodic behaviors, showing that small mutations (reversal of one or a few edges on the n-cube) need not destroy the stability of a limit cycle. Although these networks are very simple as models of gene networks, their mathematical transparency reveals relationships between structure and behavior, they suggest that the possibilities for orderly dynamics in such

  1. Synchrotron phase-contrast X-ray imaging reveals fluid dosing dynamics for gene transfer into mouse airways.

    PubMed

    Donnelley, M; Siu, K K W; Jamison, R A; Parsons, D W

    2012-01-01

    Although airway gene transfer research in mouse models relies on bolus fluid dosing into the nose or trachea, the dynamics and immediate fate of delivered gene transfer agents are poorly understood. In particular, this is because there are no in vivo methods able to accurately visualize the movement of fluid in small airways of intact animals. Using synchrotron phase-contrast X-ray imaging, we show that the fate of surrogate fluid doses delivered into live mouse airways can now be accurately and non-invasively monitored with high spatial and temporal resolution. This new imaging approach can help explain the non-homogenous distributions of gene expression observed in nasal airway gene transfer studies, suggests that substantial dose losses may occur at delivery into mouse trachea via immediate retrograde fluid motion, and shows the influence of the speed of bolus delivery on the relative targeting of conducting and deeper lung airways. These findings provide insight into some of the factors that can influence gene expression in vivo, and this method provides a new approach to documenting and analyzing dose delivery in small-animal models.

  2. An accurate, simple prognostic model consisting of age, JAK2, CALR, and MPL mutation status for patients with primary myelofibrosis.

    PubMed

    Rozovski, Uri; Verstovsek, Srdan; Manshouri, Taghi; Dembitz, Vilma; Bozinovic, Ksenija; Newberry, Kate; Zhang, Ying; Bove, Joseph E; Pierce, Sherry; Kantarjian, Hagop; Estrov, Zeev

    2017-01-01

    In most patients with primary myelofibrosis, one of three mutually exclusive somatic mutations is detected. In approximately 60% of patients, the Janus kinase 2 gene is mutated, in 20%, the calreticulin gene is mutated, and in 5%, the myeloproliferative leukemia virus gene is mutated. Although patients with mutated calreticulin or myeloproliferative leukemia genes have a favorable outcome, and those with none of these mutations have an unfavorable outcome, prognostication based on mutation status is challenging due to the heterogeneous survival of patients with mutated Janus kinase 2. To develop a prognostic model based on mutation status, we screened primary myelofibrosis patients seen at the MD Anderson Cancer Center, Houston, USA, between 2000 and 2013 for the presence of Janus kinase 2, calreticulin, and myeloproliferative leukemia mutations. Of 344 primary myelofibrosis patients, Janus kinase 2 V617F was detected in 226 (66%), calreticulin mutation in 43 (12%), and myeloproliferative leukemia mutation in 16 (5%); 59 patients (17%) were triple-negatives. A 50% cut-off dichotomized Janus kinase 2-mutated patients into those with high Janus kinase 2 V617F allele burden and favorable survival and those with low Janus kinase 2 V617F allele burden and unfavorable survival. Patients with a favorable mutation status (high Janus kinase 2 V617F allele burden/myeloproliferative leukemia/calreticulin mutation) and aged 65 years or under had a median survival of 126 months. Patients with one risk factor (low Janus kinase 2 V617F allele burden/triple-negative or age >65 years) had an intermediate survival duration, and patients aged over 65 years with an adverse mutation status (low Janus kinase 2 V617F allele burden or triple-negative) had a median survival of only 35 months. Our simple and easily applied age- and mutation status-based scoring system accurately predicted the survival of patients with primary myelofibrosis. Copyright© Ferrata Storti Foundation.
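
    The two-factor scoring described in this abstract can be written out as a short sketch. It is an illustration of the stated rules only, not a clinical tool: the function name and group labels are hypothetical, and treating an allele burden of exactly 50% as "high" is an assumption, since the abstract gives only the 50% cut-off.

```python
def risk_factor_count(age_years, mutation, jak2_allele_burden=None):
    """Count risk factors (0, 1 or 2) from age and driver mutation status.

    mutation: "JAK2", "CALR", "MPL" or "triple-negative".
    jak2_allele_burden: JAK2 V617F allele burden in percent (JAK2 only).
    """
    if mutation == "JAK2":
        if jak2_allele_burden is None:
            raise ValueError("allele burden is required for JAK2-mutated patients")
        # Assumption: a burden of exactly 50% counts as "high" (favorable).
        adverse_mutation = jak2_allele_burden < 50
    elif mutation in ("CALR", "MPL"):
        adverse_mutation = False
    elif mutation == "triple-negative":
        adverse_mutation = True
    else:
        raise ValueError(f"unknown mutation status: {mutation!r}")
    older = age_years > 65  # age over 65 years is the second risk factor
    return int(adverse_mutation) + int(older)


# Hypothetical labels for the three groups implied by the abstract
RISK_GROUPS = {0: "favorable", 1: "intermediate", 2: "adverse"}

group = RISK_GROUPS[risk_factor_count(70, "JAK2", jak2_allele_burden=30)]
```

    According to the abstract, patients with no risk factors had a median survival of 126 months and those with both had 35 months; the sketch only assigns the group and does not estimate survival.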

  3. DNA sequence requirements for the accurate transcription of a protein-coding plastid gene in a plastid in vitro system from mustard (Sinapis alba L.)

    PubMed Central

    Link, Gerhard

    1984-01-01

    A nuclease-treated plastid extract from mustard (Sinapis alba L.) allows efficient transcription of cloned plastid DNA templates. In this in vitro system, the major runoff transcript of the truncated gene for the 32 000 mol. wt. photosystem II protein was accurately initiated from a site close to or identical with the in vivo start site. By using plasmids with deletions in the 5'-flanking region of this gene as templates, a DNA region required for efficient and selective initiation was detected ~28-35 nucleotides upstream of the transcription start site. This region contains the sequence element TTGACA, which matches the consensus sequence for prokaryotic `−35' promoter elements. In the absence of this region, a region ~13-27 nucleotides upstream of the start site still enables a basic level of specific transcription. This second region contains the sequence element TATATAA, which matches the consensus sequence for the `TATA' box of genes transcribed by RNA polymerase II (or B). The region between the `TATA'-like element and the transcription start site is not sufficient but may be required for specific transcription of the plastid gene. This latter region contains the sequence element TATACT, which resembles the prokaryotic `−10' (Pribnow) box. Based on the structural and transcriptional features of the 5' upstream region, a `promoter switch' mechanism is proposed, which may account for the developmentally regulated expression of this plastid gene. PMID:16453540

  4. Communication: Accurate higher-order van der Waals coefficients between molecules from a model dynamic multipole polarizability

    DOE PAGES

    Tao, Jianmin; Rappe, Andrew M.

    2016-01-20

    Due to the absence of the long-range van der Waals (vdW) interaction, conventional density functional theory (DFT) often fails in the description of molecular complexes and solids. In recent years, considerable progress has been made in the development of the vdW correction. However, the vdW correction based on the leading-order coefficient C6 alone can only achieve limited accuracy, while accurate modeling of higher-order coefficients remains a formidable task, due to the strong non-additivity effect. Here, we apply a model dynamic multipole polarizability within a modified single-frequency approximation to calculate C8 and C10 between small molecules. We find that the higher-order vdW coefficients from this model can achieve remarkable accuracy, with mean absolute relative deviations of 5% for C8 and 7% for C10. As a result, inclusion of accurate higher-order contributions in the vdW correction will effectively enhance the predictive power of DFT in condensed matter physics and quantum chemistry.

  5. Gene-environment interactions and construct validity in preclinical models of psychiatric disorders.

    PubMed

    Burrows, Emma L; McOmish, Caitlin E; Hannan, Anthony J

    2011-08-01

    The contributions of genetic risk factors to susceptibility for brain disorders are often so closely intertwined with environmental factors that studying genes in isolation cannot provide the full picture of pathogenesis. With recent advances in our understanding of psychiatric genetics and environmental modifiers we are now in a position to develop more accurate animal models of psychiatric disorders which exemplify the complex interaction of genes and environment. Here, we consider some of the insights that have emerged from studying the relationship between defined genetic alterations and environmental factors in rodent models. A key issue in such animal models is the optimization of construct validity, at both genetic and environmental levels. Standard housing of laboratory mice and rats generally includes ad libitum food access and limited opportunity for physical exercise, leading to metabolic dysfunction under control conditions, and thus reducing validity of animal models with respect to clinical populations. A related issue, of specific relevance to neuroscientists, is that most standard-housed rodents have limited opportunity for sensory and cognitive stimulation, which in turn provides reduced incentive for complex motor activity. Decades of research using environmental enrichment has demonstrated beneficial effects on brain and behavior in both wild-type and genetically modified rodent models, relative to standard-housed littermate controls. One interpretation of such studies is that environmentally enriched animals more closely approximate average human levels of cognitive and sensorimotor stimulation, whereas the standard housing currently used in most laboratories models a more sedentary state of reduced mental and physical activity and abnormal stress levels. The use of such standard housing as a single environmental variable may limit the capacity for preclinical models to translate into successful clinical trials. Therefore, there is a need to

  6. Analyzing gene expression time-courses based on multi-resolution shape mixture model.

    PubMed

    Li, Ying; He, Ye; Zhang, Yu

    2016-11-01

    Biological processes are dynamic molecular processes that unfold over time. Time-course gene expression experiments provide opportunities to explore patterns of gene expression change over time and to understand the dynamic behavior of gene expression, which is crucial for the study of development and of disease progression. Analysis of gene expression time-course profiles has not been fully exploited so far and remains a challenging problem. We propose a novel shape-based mixture model clustering method for gene expression time-course profiles to explore significant gene groups. Based on multi-resolution fractal features and a mixture clustering model, we propose a multi-resolution shape mixture model algorithm. The multi-resolution fractal features are computed by wavelet decomposition, which explores patterns of gene expression change over time at different resolutions. The proposed algorithm is a probabilistic framework that offers a more natural and robust way of clustering time-course gene expression. We assessed its performance on yeast time-course gene expression profiles in comparison with several popular clustering methods for gene expression profiles. The gene groups identified by the different methods were evaluated by enrichment analysis of biological pathways and of known protein-protein interactions supported by experimental evidence. The gene groups identified by our algorithm have stronger biological significance. In summary, a novel multi-resolution shape mixture model algorithm based on multi-resolution fractal features is proposed. The model provides a new perspective and an alternative tool for the visualization and analysis of time-course gene expression profiles. The R and Matlab programs are available upon request. Copyright © 2016 Elsevier Inc. All rights reserved.

  7. Development and application of accurate analytical models for single active electron potentials

    NASA Astrophysics Data System (ADS)

    Miller, Michelle; Jaron-Becker, Agnieszka; Becker, Andreas

    2015-05-01

    The single active electron (SAE) approximation is a theoretical model frequently employed to study scenarios in which inner-shell electrons may productively be treated as frozen spectators to a physical process of interest, and accurate analytical approximations for these potentials are sought as a useful simulation tool. Density functional theory is often used to construct a SAE potential, requiring that a further approximation for the exchange-correlation functional be enacted. In this study, we employ the Krieger, Li, and Iafrate (KLI) modification to the optimized-effective-potential (OEP) method to reduce the complexity of the problem to the straightforward solution of a system of linear equations through simple arguments regarding the behavior of the exchange-correlation potential in regions where a single orbital dominates. We employ this method for the solution of atomic and molecular potentials, and use the resultant curve to devise a systematic construction for highly accurate and useful analytical approximations for several systems. Supported by the U.S. Department of Energy (Grant No. DE-FG02-09ER16103), and the U.S. National Science Foundation (Graduate Research Fellowship, Grants No. PHY-1125844 and No. PHY-1068706).

  8. A Critical Review for Developing Accurate and Dynamic Predictive Models Using Machine Learning Methods in Medicine and Health Care.

    PubMed

    Alanazi, Hamdan O; Abdullah, Abdul Hanan; Qureshi, Kashif Naseer

    2017-04-01

    Recently, Artificial Intelligence (AI) has been widely used in the medicine and health care sector. In machine learning, classification or prediction is a major field of AI. Today, the study of existing predictive models based on machine learning methods is extremely active. Doctors need accurate predictions of the outcomes of their patients' diseases. In addition, for accurate predictions, timing is another significant factor that influences treatment decisions. In this paper, existing predictive models in medicine and health care are critically reviewed. Furthermore, the best-known machine learning methods are explained, and the confusion between a statistical approach and machine learning is clarified. A review of related literature reveals that the predictions of existing predictive models differ even when the same dataset is used. Therefore, existing predictive models are essential, and current methods must be improved.

  9. Identification of an Efficient Gene Expression Panel for Glioblastoma Classification

    PubMed Central

    Zelaya, Ivette; Laks, Dan R.; Zhao, Yining; Kawaguchi, Riki; Gao, Fuying; Kornblum, Harley I.; Coppola, Giovanni

    2016-01-01

    We present here a novel genetic algorithm-based random forest (GARF) modeling technique that enables a reduction in the complexity of large gene disease signatures to highly accurate, greatly simplified gene panels. When applied to 803 glioblastoma multiforme samples, this method allowed the 840-gene Verhaak et al. gene panel (the standard in the field) to be reduced to a 48-gene classifier, while retaining 90.91% classification accuracy, and outperforming the best available alternative methods. Additionally, using this approach we produced a 32-gene panel which allows for better consistency between RNA-seq and microarray-based classifications, improving cross-platform classification retention from 69.67% to 86.07%. A webpage producing these classifications is available at http://simplegbm.semel.ucla.edu. PMID:27855170

  10. Cross hole GPR traveltime inversion using a fast and accurate neural network as a forward model

    NASA Astrophysics Data System (ADS)

    Mejer Hansen, Thomas

    2017-04-01

    Probabilistically formulated inverse problems can be solved using Monte Carlo-based sampling methods. In principle, both advanced prior information, such as that based on geostatistics, and complex non-linear forward physical models can be considered. However, in practice these methods can be associated with huge computational costs that limit their application. This is not least due to the computational requirements related to solving the forward problem, where the physical response of some earth model has to be evaluated. Here, it is suggested to replace a numerically complex evaluation of the forward problem with a trained neural network that can be evaluated very fast. This will introduce a modeling error that is quantified probabilistically such that it can be accounted for during inversion. This allows a very fast and efficient Monte Carlo sampling of the solution to an inverse problem. We demonstrate the methodology for first arrival travel time inversion of cross hole ground-penetrating radar (GPR) data. An accurate forward model, based on 2D full-waveform modeling followed by automatic travel time picking, is replaced by a fast neural network. This provides a sampling algorithm three orders of magnitude faster than using the full forward model, and considerably faster, and more accurate, than commonly used approximate forward models. The methodology has the potential to dramatically change the complexity of the types of inverse problems that can be solved using non-linear Monte Carlo sampling techniques.

  11. Improved image quality in pinhole SPECT by accurate modeling of the point spread function in low magnification systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pino, Francisco; Roé, Nuria; Aguiar, Pablo, E-mail: pablo.aguiar.fernandez@sergas.es

    2015-02-15

    Purpose: Single photon emission computed tomography (SPECT) has become an important noninvasive imaging technique in small-animal research. Due to the high resolution required in small-animal SPECT systems, the spatially variant system response needs to be included in the reconstruction algorithm. Accurate modeling of the system response should result in a major improvement in the quality of reconstructed images. The aim of this study was to quantitatively assess the impact that an accurate modeling of spatially variant collimator/detector response has on image-quality parameters, using a low magnification SPECT system equipped with a pinhole collimator and a small gamma camera. Methods: Three methods were used to model the point spread function (PSF). For the first, only the geometrical pinhole aperture was included in the PSF. For the second, the septal penetration through the pinhole collimator was added. In the third method, the measured intrinsic detector response was incorporated. Tomographic spatial resolution was evaluated and contrast, recovery coefficients, contrast-to-noise ratio, and noise were quantified using a custom-built NEMA NU 4–2008 image-quality phantom. Results: A high correlation was found between the experimental data corresponding to intrinsic detector response and the fitted values obtained by means of an asymmetric Gaussian distribution. For all PSF models, resolution improved as the distance from the point source to the center of the field of view increased and when the acquisition radius diminished. An improvement of resolution was observed after a minimum of five iterations when the PSF modeling included more corrections. Contrast, recovery coefficients, and contrast-to-noise ratio were better for the same level of noise in the image when more accurate models were included. Ring-type artifacts were observed when the number of iterations exceeded 12. Conclusions: Accurate modeling of the PSF improves resolution, contrast, and

  12. Do dual-route models accurately predict reading and spelling performance in individuals with acquired alexia and agraphia?

    PubMed

    Rapcsak, Steven Z; Henry, Maya L; Teague, Sommer L; Carnahan, Susan D; Beeson, Pélagie M

    2007-06-18

    Coltheart and co-workers [Castles, A., Bates, T. C., & Coltheart, M. (2006). John Marshall and the developmental dyslexias. Aphasiology, 20, 871-892; Coltheart, M., Rastle, K., Perry, C., Langdon, R., & Ziegler, J. (2001). DRC: A dual route cascaded model of visual word recognition and reading aloud. Psychological Review, 108, 204-256] have demonstrated that an equation derived from dual-route theory accurately predicts reading performance in young normal readers and in children with reading impairment due to developmental dyslexia or stroke. In this paper, we present evidence that the dual-route equation and a related multiple regression model also accurately predict both reading and spelling performance in adult neurological patients with acquired alexia and agraphia. These findings provide empirical support for dual-route theories of written language processing.

  13. Fast and accurate computation of system matrix for area integral model-based algebraic reconstruction technique

    NASA Astrophysics Data System (ADS)

    Zhang, Shunli; Zhang, Dinghua; Gong, Hao; Ghasemalizadeh, Omid; Wang, Ge; Cao, Guohua

    2014-11-01

    Iterative algorithms, such as the algebraic reconstruction technique (ART), are popular for image reconstruction. For iterative reconstruction, the area integral model (AIM) is more accurate for better reconstruction quality than the line integral model (LIM). However, the computation of the system matrix for AIM is more complex and time-consuming than that for LIM. Here, we propose a fast and accurate method to compute the system matrix for AIM. First, we calculate the intersection of each boundary line of a narrow fan-beam with pixels in a recursive and efficient manner. Then, by grouping the beam-pixel intersection area into six types according to the slopes of the two boundary lines, we analytically compute the intersection area of the narrow fan-beam with the pixels in a simple algebraic fashion. Overall, experimental results show that our method is about three times faster than the Siddon algorithm and about two times faster than the distance-driven model (DDM) in computation of the system matrix. The reconstruction speed of our AIM-based ART is also faster than the LIM-based ART that uses the Siddon algorithm and DDM-based ART, for one iteration. The fast reconstruction speed of our method was accomplished without compromising the image quality.
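
    The ART iteration that this abstract builds on can be sketched as a Kaczmarz-style row-action update. The following is a minimal illustration only, not the authors' implementation: the function name art_reconstruct, the toy 2x2 image, and the binary system matrix are assumptions made for the example. In an AIM-based reconstruction the matrix entries would instead be the beam-pixel intersection areas, which is the quantity the abstract's method computes.

```python
def art_reconstruct(system_matrix, projections, n_sweeps=200, relaxation=1.0):
    """Kaczmarz-style ART: correct the image estimate one ray equation at a time.

    system_matrix[i][j] is the (assumed) weight of pixel j in ray i;
    projections[i] is the measured value for ray i.
    """
    n_pixels = len(system_matrix[0])
    image = [0.0] * n_pixels
    for _ in range(n_sweeps):
        for row, measured in zip(system_matrix, projections):
            norm_sq = sum(w * w for w in row)
            if norm_sq == 0.0:
                continue  # ray misses every pixel
            predicted = sum(w * x for w, x in zip(row, image))
            step = relaxation * (measured - predicted) / norm_sq
            # Project the current estimate onto the hyperplane of this ray.
            image = [x + step * w for x, w in zip(image, row)]
    return image


# Toy 2x2 image (flattened) probed by two row sums, two column sums, one diagonal.
true_image = [1.0, 2.0, 3.0, 4.0]
system_matrix = [
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 0, 1],
]
projections = [sum(w * x for w, x in zip(row, true_image)) for row in system_matrix]
estimate = art_reconstruct(system_matrix, projections)
print([round(x, 4) for x in estimate])
```

    Because every sweep touches each row of the system matrix, the cost of computing those rows dominates the reconstruction time, which is why a faster system-matrix computation translates directly into faster iterations.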

  14. Modeling Autism by SHANK Gene Mutations in Mice

    PubMed Central

    Jiang, Yong-hui; Ehlers, Michael D.

    2013-01-01

    Summary Shank family proteins (Shank1, Shank2, and Shank3) are synaptic scaffolding proteins that organize an extensive protein complex at the postsynaptic density (PSD) of excitatory glutamatergic synapses. Recent human genetic studies indicate that SHANK family genes (SHANK1, SHANK2, and SHANK3) are causative genes for idiopathic autism spectrum disorders (ASD). Neurobiological studies of Shank mutations in mice support a general hypothesis of synaptic dysfunction in the pathophysiology of ASD. However, the molecular diversity of SHANK family gene products, as well as the heterogeneity in human and mouse phenotypes, pose challenges to modeling human SHANK mutations. Here, we review the molecular genetics of SHANK mutations in human ASD and discuss recent findings where such mutations have been modeled in mice. Conserved features of synaptic dysfunction and corresponding behaviors in Shank mouse mutants may help dissect the pathophysiology of ASD, but also highlight divergent phenotypes that arise from different mutations in the same gene. PMID:23583105

  15. A Weibull statistics-based lignocellulose saccharification model and a built-in parameter accurately predict lignocellulose hydrolysis performance.

    PubMed

    Wang, Mingyu; Han, Lijuan; Liu, Shasha; Zhao, Xuebing; Yang, Jinghua; Loh, Soh Kheang; Sun, Xiaomin; Zhang, Chenxi; Fang, Xu

    2015-09-01

    Renewable energy from lignocellulosic biomass has been deemed an alternative to depleting fossil fuels. In order to improve this technology, we aim to develop robust mathematical models for the enzymatic lignocellulose degradation process. By analyzing 96 groups of previously published and newly obtained lignocellulose saccharification results and fitting them to the Weibull distribution, we discovered that Weibull statistics can accurately predict lignocellulose saccharification data, regardless of the type of substrates, enzymes and saccharification conditions. A mathematical model for enzymatic lignocellulose degradation was subsequently constructed based on Weibull statistics. Further analysis of the mathematical structure of the model and experimental saccharification data showed the significance of the two parameters in this model. In particular, the λ value, defined as the characteristic time, represents the overall performance of the saccharification system. This suggestion was further supported by statistical analysis of experimental saccharification data and analysis of the glucose production levels when λ and n values change. In conclusion, the constructed Weibull statistics-based model can accurately predict lignocellulose hydrolysis behavior and we can use the λ parameter to assess the overall performance of enzymatic lignocellulose degradation. Advantages and potential applications of the model and the λ value in saccharification performance assessment were discussed. Copyright © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
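
    The role of λ as a characteristic time can be illustrated with the standard Weibull cumulative form. This is a sketch under an assumption: the abstract does not give the model equation, so the form 1 - exp(-(t/λ)^n) and the parameter values below are illustrative, not taken from the paper.

```python
import math


def weibull_conversion(t, lam, n):
    """Fraction converted by time t under the assumed Weibull cumulative form.

    lam is the characteristic time (same units as t); n is the shape parameter.
    """
    if t <= 0:
        return 0.0
    return 1.0 - math.exp(-((t / lam) ** n))


# Hypothetical parameters: characteristic time of 24 h, shape parameter 0.8.
lam, n = 24.0, 0.8
for hours in (6, 24, 72):
    print(hours, round(weibull_conversion(hours, lam, n), 3))
```

    Under this assumed form, the conversion at t = λ is always 1 - 1/e (about 63%) whatever the value of n, so a smaller λ means the system reaches that level sooner. That is one way to read the abstract's statement that λ reflects overall saccharification performance.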

  16. Accurate modeling and inversion of electrical resistivity data in the presence of metallic infrastructure with known location and dimension

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Johnson, Timothy C.; Wellman, Dawn M.

    2015-06-26

    Electrical resistivity tomography (ERT) has been widely used in environmental applications to study processes associated with subsurface contaminants and contaminant remediation. Anthropogenic alterations in subsurface electrical conductivity associated with contamination often originate from highly industrialized areas with significant amounts of buried metallic infrastructure. The deleterious influence of such infrastructure on imaging results generally limits the utility of ERT where it might otherwise prove useful for subsurface investigation and monitoring. In this manuscript we present a method of accurately modeling the effects of buried conductive infrastructure within the forward modeling algorithm, thereby removing them from the inversion results. The method is implemented in parallel using immersed interface boundary conditions, whereby the global solution is reconstructed from a series of well-conditioned partial solutions. Forward modeling accuracy is demonstrated by comparison with analytic solutions. Synthetic imaging examples are used to investigate imaging capabilities within a subsurface containing electrically conductive buried tanks, transfer piping, and well casing, using both well casings and vertical electrode arrays as current sources and potential measurement electrodes. Results show that, although accurate infrastructure modeling removes the dominating influence of buried metallic features, the presence of metallic infrastructure degrades imaging resolution compared to standard ERT imaging. However, accurate imaging results may be obtained if electrodes are appropriately located.

  17. Prediction of gene expression with cis-SNPs using mixed models and regularization methods.

    PubMed

    Zeng, Ping; Zhou, Xiang; Huang, Shuiping

    2017-05-11

    It has been shown that gene expression in human tissues is heritable, thus predicting gene expression using only SNPs becomes possible. The prediction of gene expression can offer important implications on the genetic architecture of individual functional associated SNPs and further interpretations of the molecular basis underlying human diseases. We compared three types of methods for predicting gene expression using only cis-SNPs, including the polygenic model, i.e. linear mixed model (LMM), two sparse models, i.e. Lasso and elastic net (ENET), and the hybrid of LMM and sparse model, i.e. Bayesian sparse linear mixed model (BSLMM). The three kinds of prediction methods have very different assumptions of underlying genetic architectures. These methods were evaluated using simulations under various scenarios, and were applied to the Geuvadis gene expression data. The simulations showed that these four prediction methods (i.e. Lasso, ENET, LMM and BSLMM) behaved best when their respective modeling assumptions were satisfied, but BSLMM had a robust performance across a range of scenarios. According to R² of these models in the Geuvadis data, the four methods performed quite similarly. We did not observe any clustering or enrichment of predictive genes (defined as genes with R² ≥ 0.05) across the chromosomes, and also did not see any clear relationship between the proportion of the predictive genes and the proportion of genes in each chromosome. However, an interesting finding in the Geuvadis data was that highly predictive genes (e.g. R² ≥ 0.30) may have sparse genetic architectures since Lasso, ENET and BSLMM outperformed LMM for these genes; and this observation was validated in another gene expression data set. We further showed that the predictive genes were enriched in approximately independent LD blocks. Gene expression can be predicted with only cis-SNPs using well-developed prediction models and these predictive genes were enriched in

  18. A Penalized Robust Method for Identifying Gene-Environment Interactions

    PubMed Central

    Shi, Xingjie; Liu, Jin; Huang, Jian; Zhou, Yong; Xie, Yang; Ma, Shuangge

    2015-01-01

    In high-throughput studies, an important objective is to identify gene-environment interactions associated with disease outcomes and phenotypes. Many commonly adopted methods assume specific parametric or semiparametric models, which may be subject to model mis-specification. In addition, they usually use significance level as the criterion for selecting important interactions. In this study, we adopt the rank-based estimation, which is much less sensitive to model specification than some of the existing methods and includes several commonly encountered data and models as special cases. Penalization is adopted for the identification of gene-environment interactions. It achieves simultaneous estimation and identification and does not rely on significance level. For computation feasibility, a smoothed rank estimation is further proposed. Simulation shows that under certain scenarios, for example with contaminated or heavy-tailed data, the proposed method can significantly outperform the existing alternatives with more accurate identification. We analyze a lung cancer prognosis study with gene expression measurements under the AFT (accelerated failure time) model. The proposed method identifies interactions different from those using the alternatives. Some of the identified genes have important implications. PMID:24616063

  19. Contemporary Animal Models For Human Gene Therapy Applications.

    PubMed

    Gopinath, Chitra; Nathar, Trupti Job; Ghosh, Arkasubhra; Hickstein, Dennis Durand; Nelson, Everette Jacob Remington

    2015-01-01

    Over the past three decades, gene therapy has been making considerable progress as an alternative strategy in the treatment of many diseases. Since 2009, several studies have been reported in humans on the successful treatment of various diseases. Animal models mimicking human disease conditions are essential at the preclinical stage before embarking on a clinical trial. In gene therapy, for instance, they are useful in the assessment of variables related to the use of viral vectors such as safety, efficacy, dosage and localization of transgene expression. However, choosing a suitable disease-specific model is of paramount importance for successful clinical translation. This review focuses on the animal models that are most commonly used in gene therapy studies, such as murine, canine, non-human primate, rabbit, and porcine models, and the more recently developed humanized mice. Though small and large animals both have their own pros and cons as disease-specific models, the choice is made largely based on the type and length of study performed. While small animals with a shorter life span could be well-suited for degenerative/aging studies, large animals with a longer life span could suit longitudinal studies and also help with dosage adjustments to maximize therapeutic benefit. Recently, humanized mice or mouse-human chimaeras have gained interest in the study of human tissues or cells, thereby providing a more reliable understanding of therapeutic interventions. Thus, animal models are of great importance with regard to testing new vector technologies in vivo for assessing safety and efficacy prior to a gene therapy clinical trial.

  20. Efficient and accurate approach to modeling the microstructure and defect properties of LaCoO3

    NASA Astrophysics Data System (ADS)

    Buckeridge, J.; Taylor, F. H.; Catlow, C. R. A.

    2016-04-01

    Complex perovskite oxides are promising materials for cathode layers in solid oxide fuel cells. Such materials have intricate electronic, magnetic, and crystalline structures that prove challenging to model accurately. We analyze a wide range of standard density functional theory approaches to modeling a highly promising system, the perovskite LaCoO3, focusing on optimizing the Hubbard U parameter to treat the self-interaction of the B-site cation's d states, in order to determine the most appropriate method to study defect formation and the effect of spin on local structure. By calculating structural and electronic properties for different magnetic states we determine that U = 4 eV for Co in LaCoO3 agrees best with available experiments. We demonstrate that the generalized gradient approximation (PBEsol+U) is most appropriate for studying structure versus spin state, while the local density approximation (LDA+U) is most appropriate for determining accurate energetics for defect properties.

  1. Kinetic models of gene expression including non-coding RNAs

    NASA Astrophysics Data System (ADS)

    Zhdanov, Vladimir P.

    2011-03-01

    In cells, genes are transcribed into mRNAs, and the latter are translated into proteins. Due to the feedbacks between these processes, the kinetics of gene expression may be complex even in the simplest genetic networks. The corresponding models have already been reviewed in the literature. A new avenue in this field is related to the recognition that the conventional scenario of gene expression is fully applicable only to prokaryotes whose genomes consist of tightly packed protein-coding sequences. In eukaryotic cells, in contrast, such sequences are relatively rare, and the rest of the genome includes numerous transcript units representing non-coding RNAs (ncRNAs). During the past decade, it has become clear that such RNAs play a crucial role in gene expression and accordingly influence a multitude of cellular processes both in the normal state and during diseases. The numerous biological functions of ncRNAs are based primarily on their abilities to silence genes via pairing with a target mRNA and subsequently preventing its translation or facilitating degradation of the mRNA-ncRNA complex. Many other abilities of ncRNAs have been discovered as well. Our review is focused on the available kinetic models describing the mRNA, ncRNA and protein interplay. In particular, we systematically present the simplest models without kinetic feedbacks, models containing feedbacks and predicting bistability and oscillations in simple genetic networks, and models describing the effect of ncRNAs on complex genetic networks. Mathematically, the presentation is based primarily on temporal mean-field kinetic equations. The stochastic and spatio-temporal effects are also briefly discussed.
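
    The simplest class of models mentioned here, temporal mean-field kinetics without feedbacks, can be sketched with three rate equations. This is an illustrative example, not the review's own formulation: the rate-constant names, their values, and the assumption that the mRNA-ncRNA complex is degraded immediately after pairing are all choices made for the sketch.

```python
def simulate(k_m=1.0, k_n=0.0, k_p=1.0, k_a=1.0,
             d_m=0.1, d_n=0.1, d_p=0.1, dt=0.01, n_steps=20000):
    """Explicit-Euler integration of assumed mean-field equations:

        dM/dt = k_m - d_m*M - k_a*M*N    (mRNA)
        dN/dt = k_n - d_n*N - k_a*M*N    (ncRNA)
        dP/dt = k_p*M - d_p*P            (protein)

    The k_a*M*N term models pairing of ncRNA with its target mRNA,
    after which the complex is assumed to be degraded.
    """
    m = n = p = 0.0
    for _ in range(n_steps):
        pairing = k_a * m * n
        dm = k_m - d_m * m - pairing
        dn = k_n - d_n * n - pairing
        dp = k_p * m - d_p * p
        m += dt * dm
        n += dt * dn
        p += dt * dp
    return m, n, p


_, _, protein_free = simulate(k_n=0.0)      # no ncRNA produced
_, _, protein_silenced = simulate(k_n=2.0)  # ncRNA produced faster than mRNA
print(round(protein_free, 2), round(protein_silenced, 2))
```

    With these hypothetical constants the protein level falls sharply once ncRNA synthesis exceeds mRNA synthesis, which is the silencing behavior the abstract describes. The models with feedbacks, bistability and oscillations would add regulatory terms to these equations.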

  2. Accurate Modeling of Dark-Field Scattering Spectra of Plasmonic Nanostructures.

    PubMed

    Jiang, Liyong; Yin, Tingting; Dong, Zhaogang; Liao, Mingyi; Tan, Shawn J; Goh, Xiao Ming; Allioux, David; Hu, Hailong; Li, Xiangyin; Yang, Joel K W; Shen, Zexiang

    2015-10-27

    Dark-field microscopy is a widely used tool for measuring the optical resonance of plasmonic nanostructures. However, current numerical methods for simulating the dark-field scattering spectra were carried out with plane wave illumination either at normal incidence or at an oblique angle from one direction. In actual experiments, light is focused onto the sample through an annular ring within a range of glancing angles. In this paper, we present a theoretical model capable of accurately simulating the dark-field light source with an annular ring. Simulations correctly reproduce a counterintuitive blue shift in the scattering spectra from gold nanodisks with a diameter beyond 140 nm. We believe that our proposed simulation method can be potentially applied as a general tool capable of simulating the dark-field scattering spectra of plasmonic nanostructures as well as other dielectric nanostructures with sizes beyond the quasi-static limit.

  3. Modeling the nitrogen cycle one gene at a time

    NASA Astrophysics Data System (ADS)

    Coles, V.; Stukel, M. R.; Hood, R. R.; Moran, M. A.; Paul, J. H.; Satinsky, B.; Zielinski, B.; Yager, P. L.

    2016-02-01

    Marine ecosystem models are lagging the revolution in microbial oceanography. As a result, modeling of the nitrogen cycle has largely failed to leverage new genomic information on nitrogen cycling pathways and the organisms that mediate them. We developed a nitrogen based ecosystem model whose community is determined by randomly assigning functional genes to build each organism's "DNA". Microbes are assigned a size that sets their baseline environmental responses using allometric response curves. These responses are modified by the costs and benefits conferred by each gene in an organism's genome. The microbes are embedded in a general circulation model where environmental conditions shape the emergent population. This model is used to explore whether organisms constructed from randomized combinations of metabolic capability alone can self-organize to create realistic oceanic biogeochemical gradients. Community size spectra and chlorophyll-a concentrations emerge in the model with reasonable fidelity to observations. The model is run repeatedly with randomly-generated microbial communities and each time realistic gradients in community size spectra, chlorophyll-a, and forms of nitrogen develop. This supports the hypothesis that the metabolic potential of a community rather than the realized species composition is the primary factor setting vertical and horizontal environmental gradients. Vertical distributions of nitrogen and transcripts for genes involved in nitrification are broadly consistent with observations. Modeled gene and transcript abundance for nitrogen cycling and processing of land-derived organic material match observations along the extreme gradients in the Amazon River plume, and they help to explain the factors controlling observed variability.

  4. Mixture models for detecting differentially expressed genes in microarrays.

    PubMed

    Jones, Liat Ben-Tovim; Bean, Richard; McLachlan, Geoffrey J; Zhu, Justin Xi

    2006-10-01

    An important and common problem in microarray experiments is the detection of genes that are differentially expressed in a given number of classes. As this problem concerns the selection of significant genes from a large pool of candidate genes, it needs to be carried out within the framework of multiple hypothesis testing. In this paper, we focus on the use of mixture models to handle the multiplicity issue. With this approach, a measure of the local FDR (false discovery rate) is provided for each gene. An attractive feature of the mixture model approach is that it provides a framework for the estimation of the prior probability that a gene is not differentially expressed, and this probability can subsequently be used in forming a decision rule. The rule can also be formed to take the false negative rate into account. We apply this approach to a well-known publicly available data set on breast cancer, and discuss our findings with reference to other approaches.
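
    The local FDR described in this abstract is the posterior probability that a gene belongs to the null (not differentially expressed) component of the mixture. The sketch below assumes a two-component normal mixture on a per-gene z-score with fixed, made-up parameters. In practice the prior probability and component densities would be estimated from the data, and nothing here reproduces the authors' fitting procedure.

```python
import math


def normal_pdf(z, mean=0.0, sd=1.0):
    """Density of a normal distribution at z."""
    u = (z - mean) / sd
    return math.exp(-0.5 * u * u) / (sd * math.sqrt(2.0 * math.pi))


def local_fdr(z, pi0=0.9, alt_mean=3.0, alt_sd=1.0):
    """Posterior probability that a gene with score z is NOT differentially expressed.

    pi0 is the (assumed) prior probability of the null component,
    the null density is N(0, 1), and the alternative is N(alt_mean, alt_sd).
    """
    null_part = pi0 * normal_pdf(z)
    alt_part = (1.0 - pi0) * normal_pdf(z, alt_mean, alt_sd)
    return null_part / (null_part + alt_part)


# Hypothetical decision rule: call a gene differentially expressed if its local FDR <= 0.2.
for z in (0.0, 2.0, 4.0):
    fdr = local_fdr(z)
    print(z, round(fdr, 4), fdr <= 0.2)
```

    The threshold of 0.2 is an arbitrary choice for the example. The abstract notes that the rule can also be formed to take the false negative rate into account.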

  5. Directional RNA-seq reveals highly complex condition-dependent transcriptomes in E. coli K12 through accurate full-length transcripts assembling.

    PubMed

    Li, Shan; Dong, Xia; Su, Zhengchang

    2013-07-30

    Although prokaryotic gene transcription has been studied over decades, many aspects of the process remain poorly understood. Particularly, recent studies have revealed that transcriptomes in many prokaryotes are far more complex than previously thought. Genes in an operon are often alternatively and dynamically transcribed under different conditions, and a large portion of genes and intergenic regions have antisense RNA (asRNA) and non-coding RNA (ncRNA) transcripts, respectively. Ironically, similar studies have not been conducted in the model bacterium E. coli K12, thus it is unknown whether or not the bacterium possesses similar complex transcriptomes. Furthermore, although RNA-seq has become the major method for analyzing the complexity of prokaryotic transcriptomes, it is still a challenging task to accurately assemble full-length transcripts using short RNA-seq reads. To fill these gaps, we have profiled the transcriptomes of E. coli K12 under different culture conditions and growth phases using a highly specific directional RNA-seq technique that can capture various types of transcripts in the bacterial cells, combined with a highly accurate and robust algorithm and tool TruHMM (http://bioinfolab.uncc.edu/TruHmm_package/) for assembling full-length transcripts. We found that 46.9 ~ 63.4% of expressed operons were utilized in their putative alternative forms, 72.23 ~ 89.54% genes had putative asRNA transcripts and 51.37 ~ 72.74% intergenic regions had putative ncRNA transcripts under different culture conditions and growth phases. As has been demonstrated in many other prokaryotes, E. coli K12 also has highly complex and dynamic transcriptomes under different culture conditions and growth phases. Such complex and dynamic transcriptomes might play important roles in the physiology of the bacterium. TruHMM is a highly accurate and robust algorithm for assembling full-length transcripts in prokaryotes using directional RNA-seq short reads.

  6. The Global Error Assessment (GEA) model for the selection of differentially expressed genes in microarray data.

    PubMed

    Mansourian, Robert; Mutch, David M; Antille, Nicolas; Aubert, Jerome; Fogel, Paul; Le Goff, Jean-Marc; Moulin, Julie; Petrov, Anton; Rytz, Andreas; Voegel, Johannes J; Roberts, Matthew-Alan

    2004-11-01

    Microarray technology has become a powerful research tool in many fields of study; however, the cost of microarrays often results in the use of a low number of replicates (k). Under circumstances where k is low, it becomes difficult to perform standard statistical tests to extract the most biologically significant experimental results. Other more advanced statistical tests have been developed; however, their use and interpretation often remain difficult to implement in routine biological research. The present work outlines a method that achieves sufficient statistical power for selecting differentially expressed genes under conditions of low k, while remaining as an intuitive and computationally efficient procedure. The present study describes a Global Error Assessment (GEA) methodology to select differentially expressed genes in microarray datasets, and was developed using an in vitro experiment that compared control and interferon-gamma treated skin cells. In this experiment, up to nine replicates were used to confidently estimate error, thereby enabling methods of different statistical power to be compared. Gene expression results of a similar absolute expression are binned, so as to enable a highly accurate local estimate of the mean squared error within conditions. The model then relates variability of gene expression in each bin to absolute expression levels and uses this in a test derived from the classical ANOVA. The GEA selection method is compared with both the classical and permutational ANOVA tests, and demonstrates an increased stability, robustness and confidence in gene selection. A subset of the selected genes were validated by real-time reverse transcription-polymerase chain reaction (RT-PCR). All these results suggest that GEA methodology is (i) suitable for selection of differentially expressed genes in microarray data, (ii) intuitive and computationally efficient and (iii) especially advantageous under conditions of low k. The GEA code for R

  7. A Novel Method for Accurate Operon Predictions in All Sequenced Prokaryotes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Price, Morgan N.; Huang, Katherine H.; Alm, Eric J.

    2004-12-01

    We combine comparative genomic measures and the distance separating adjacent genes to predict operons in 124 completely sequenced prokaryotic genomes. Our method automatically tailors itself to each genome using sequence information alone, and thus can be applied to any prokaryote. For Escherichia coli K12 and Bacillus subtilis, our method is 85 and 83% accurate, respectively, which is similar to the accuracy of methods that use the same features but are trained on experimentally characterized transcripts. In Halobacterium NRC-1 and in Helicobacter pylori, our method correctly infers that genes in operons are separated by shorter distances than they are in E. coli, and its predictions using distance alone are more accurate than distance-only predictions trained on a database of E. coli transcripts. We use microarray data from six phylogenetically diverse prokaryotes to show that combining intergenic distance with comparative genomic measures further improves accuracy and that our method is broadly effective. Finally, we survey operon structure across 124 genomes, and find several surprises: H. pylori has many operons, contrary to previous reports; Bacillus anthracis has an unusual number of pseudogenes within conserved operons; and Synechocystis PCC6803 has many operons even though it has unusually wide spacings between conserved adjacent genes.
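
    The distance feature described in this record can be illustrated with a short sketch. It lists adjacent gene pairs on the same strand together with their intergenic distance, then applies a fixed cutoff. The `Gene` record, the function names and the `max_gap` cutoff are hypothetical; the published method tailors itself to each genome and also uses comparative genomic measures, which this sketch does not attempt.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"


def same_strand_adjacent_pairs(genes):
    """Return (left, right, intergenic_distance) for adjacent same-strand genes."""
    ordered = sorted(genes, key=lambda g: g.start)
    pairs = []
    for left, right in zip(ordered, ordered[1:]):
        if left.strand == right.strand:
            # Number of bases strictly between the two genes
            # (negative when the genes overlap).
            distance = right.start - left.end - 1
            pairs.append((left.name, right.name, distance))
    return pairs


def distance_only_calls(genes, max_gap):
    """Call a pair 'same operon' when the gap is at most max_gap (hypothetical cutoff)."""
    return [(a, b) for a, b, d in same_strand_adjacent_pairs(genes) if d <= max_gap]


genes = [
    Gene("a", 1, 100, "+"),
    Gene("b", 111, 200, "+"),
    Gene("c", 500, 600, "-"),
]
pairs = same_strand_adjacent_pairs(genes)
calls = distance_only_calls(genes, max_gap=50)
```

    A fixed cutoff is the crudest possible choice. The abstract's point is that the appropriate distance distribution differs between genomes, so any real threshold would have to be estimated per genome.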

  8. Validation of Reference Genes for Gene Expression Studies in Virus-Infected Nicotiana benthamiana Using Quantitative Real-Time PCR

    PubMed Central

    Han, Chenggui; Yu, Jialin; Li, Dawei; Zhang, Yongliang

    2012-01-01

    Nicotiana benthamiana is the most widely-used experimental host in plant virology. The recent release of the draft genome sequence for N. benthamiana consolidates its role as a model for plant–pathogen interactions. Quantitative real-time PCR (qPCR) is commonly employed for quantitative gene expression analysis. For valid qPCR analysis, accurate normalisation of gene expression against an appropriate internal control is required. Yet there has been little systematic investigation of reference gene stability in N. benthamiana under conditions of viral infections. In this study, the expression profiles of 16 commonly used housekeeping genes (GAPDH, 18S, EF1α, SAMD, L23, UK, PP2A, APR, UBI3, SAND, ACT, TUB, GBP, F-BOX, PPR and TIP41) were determined in N. benthamiana and those with acceptable expression levels were further selected for transcript stability analysis by qPCR of complementary DNA prepared from N. benthamiana leaf tissue infected with one of five RNA plant viruses (Tobacco necrosis virus A, Beet black scorch virus, Beet necrotic yellow vein virus, Barley stripe mosaic virus and Potato virus X). Gene stability was analysed in parallel by three commonly-used dedicated algorithms: geNorm, NormFinder and BestKeeper. Statistical analysis revealed that the PP2A, F-BOX and L23 genes were the most stable overall, and that the combination of these three genes was sufficient for accurate normalisation. In addition, the suitability of PP2A, F-BOX and L23 as reference genes was illustrated by expression-level analysis of AGO2 and RdR6 in virus-infected N. benthamiana leaves. This is the first study to systematically examine and evaluate the stability of different reference genes in N. benthamiana. Our results not only provide researchers studying these viruses a shortlist of potential housekeeping genes to use as normalisers for qPCR experiments, but should also guide the selection of appropriate reference genes for gene expression studies of N. benthamiana under
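
    As an illustration of the kind of stability measure used by tools such as geNorm, the sketch below computes an M value for each candidate reference gene: the mean, over all other candidates, of the standard deviation of the log2 expression ratio across samples. This follows the commonly described definition; the function name, the use of the sample standard deviation and the toy data are assumptions, not the geNorm implementation or data from this study.

```python
import math
import statistics


def genorm_m(expression):
    """expression: dict mapping gene -> list of relative quantities, one per sample.

    Returns a dict mapping gene -> M value (lower means more stable).
    """
    genes = list(expression)
    m_values = {}
    for j in genes:
        variations = []
        for k in genes:
            if k == j:
                continue
            # Pairwise variation: spread of the log2 ratio of the two genes across samples.
            log_ratios = [
                math.log2(a / b) for a, b in zip(expression[j], expression[k])
            ]
            variations.append(statistics.stdev(log_ratios))
        m_values[j] = statistics.mean(variations)
    return m_values


# Toy data: g1 and g2 keep a constant ratio across samples, g3 does not.
toy = {"g1": [1, 2, 4], "g2": [2, 4, 8], "g3": [1, 4, 1]}
m = genorm_m(toy)
```

    In this toy input, `g3` receives the largest M value because its ratio to the other two genes varies most between samples.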

  9. Gene-Environment Interplay in Twin Models

    PubMed Central

    Hatemi, Peter K.

    2013-01-01

    In this article, we respond to Shultziner’s critique that argues that identical twins are more alike not because of genetic similarity, but because they select into more similar environments and respond to stimuli in comparable ways, and that these effects bias twin model estimates to such an extent that they are invalid. The essay further argues that the theory and methods that undergird twin models, as well as the empirical studies which rely upon them, are unaware of these potential biases. We correct this and other misunderstandings in the essay and find that gene-environment (GE) interplay is a well-articulated concept in behavior genetics and political science, operationalized as gene-environment correlation and gene-environment interaction. Both are incorporated into interpretations of the classical twin design (CTD) and estimated in numerous empirical studies through extensions of the CTD. We then conduct simulations to quantify the influence of GE interplay on estimates from the CTD. Due to the criticism’s mischaracterization of the CTD and GE interplay, combined with the absence of any empirical evidence to counter what is presented in the extant literature and this article, we conclude that the critique does not enhance our understanding of the processes that drive political traits, genetic or otherwise. PMID:24808718

  10. Technical Note: Using experimentally determined proton spot scanning timing parameters to accurately model beam delivery time.

    PubMed

    Shen, Jiajian; Tryggestad, Erik; Younkin, James E; Keole, Sameer R; Furutani, Keith M; Kang, Yixiu; Herman, Michael G; Bues, Martin

    2017-10-01

    To accurately model the beam delivery time (BDT) for a synchrotron-based proton spot scanning system using experimentally determined beam parameters. A model to simulate the proton spot delivery sequences was constructed, and BDT was calculated by summing times for layer switch, spot switch, and spot delivery. Test plans were designed to isolate and quantify the relevant beam parameters in the operation cycle of the proton beam therapy delivery system. These parameters included the layer switch time, magnet preparation and verification time, average beam scanning speeds in x- and y-directions, proton spill rate, and maximum charge and maximum extraction time for each spill. The experimentally determined parameters, as well as the nominal values initially provided by the vendor, served as inputs to the model to predict BDTs for 602 clinical proton beam deliveries. The calculated BDTs (T_BDT) were compared with the BDTs recorded in the treatment delivery log files (T_Log): ∆t = T_Log - T_BDT. The experimentally determined average layer switch time for all 97 energies was 1.91 s (ranging from 1.9 to 2.0 s for beam energies from 71.3 to 228.8 MeV), average magnet preparation and verification time was 1.93 ms, the average scanning speeds were 5.9 m/s in x-direction and 19.3 m/s in y-direction, the proton spill rate was 8.7 MU/s, and the maximum proton charge available for one acceleration was 2.0 ± 0.4 nC. Some of the measured parameters differed from the nominal values provided by the vendor. The calculated BDTs using experimentally determined parameters matched the recorded BDTs of 602 beam deliveries (∆t = -0.49 ± 1.44 s), which were significantly more accurate than BDTs calculated using nominal timing parameters (∆t = -7.48 ± 6.97 s). An accurate model for BDT prediction was achieved by using the experimentally determined proton beam therapy delivery parameters, which may be useful in modeling the interplay effect and patient throughput. The model may
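
    The summation described in this record (layer switch, spot switch and spot delivery times) can be sketched as follows, using the average values quoted in the abstract as constants. How the spot switch time is composed (slower scanning axis plus magnet preparation) is an assumption made for illustration, and the sketch ignores the per-spill charge and extraction-time limits, so it is not the authors' model.

```python
from dataclasses import dataclass


@dataclass
class Spot:
    x_mm: float
    y_mm: float
    mu: float  # monitor units delivered at this spot


# Average values quoted in the abstract.
LAYER_SWITCH_S = 1.91
MAGNET_PREP_S = 1.93e-3
SPEED_X_MM_PER_S = 5.9e3   # 5.9 m/s
SPEED_Y_MM_PER_S = 19.3e3  # 19.3 m/s
SPILL_RATE_MU_PER_S = 8.7


def spot_switch_time(a, b):
    """Assumed: both axes move at once, so the slower axis sets the travel time."""
    travel = max(
        abs(b.x_mm - a.x_mm) / SPEED_X_MM_PER_S,
        abs(b.y_mm - a.y_mm) / SPEED_Y_MM_PER_S,
    )
    return travel + MAGNET_PREP_S


def beam_delivery_time(layers):
    """layers: list of energy layers, each a list of Spot in delivery order."""
    total = 0.0
    for layer_index, layer in enumerate(layers):
        if layer_index > 0:
            total += LAYER_SWITCH_S  # energy change between layers
        for k, spot in enumerate(layer):
            if k > 0:
                total += spot_switch_time(layer[k - 1], spot)
            total += spot.mu / SPILL_RATE_MU_PER_S  # time to deliver the spot
    return total


two_layers = [[Spot(0.0, 0.0, 0.87)], [Spot(0.0, 0.0, 0.87)]]
one_layer = [[Spot(0.0, 0.0, 0.87), Spot(59.0, 0.0, 0.87)]]
t_two_layers = beam_delivery_time(two_layers)
t_one_layer = beam_delivery_time(one_layer)
```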

  11. Thermodynamics-based models of transcriptional regulation with gene sequence.

    PubMed

    Wang, Shuqiang; Shen, Yanyan; Hu, Jinxing

    2015-12-01

    Quantitative models of gene regulatory activity have the potential to improve our mechanistic understanding of transcriptional regulation. However, the few models available today have been based on simplistic assumptions about the sequences being modeled or heuristic approximations of the underlying regulatory mechanisms. In this work, we have developed a thermodynamics-based model to predict gene expression driven by any DNA sequence. The proposed model relies on a continuous time, differential equation description of transcriptional dynamics. The sequence features of the promoter are exploited to derive the binding affinity, based on statistical molecular thermodynamics. Experimental results show that the proposed model can effectively identify the activity levels of transcription factors and the regulatory parameters. Compared with previous models, the proposed model yields more biologically meaningful results.

  12. Ultradian hormone stimulation induces glucocorticoid receptor-mediated pulses of gene transcription.

    PubMed

    Stavreva, Diana A; Wiench, Malgorzata; John, Sam; Conway-Campbell, Becky L; McKenna, Mervyn A; Pooley, John R; Johnson, Thomas A; Voss, Ty C; Lightman, Stafford L; Hager, Gordon L

    2009-09-01

    Studies on glucocorticoid receptor (GR) action typically assess gene responses by long-term stimulation with synthetic hormones. As corticosteroids are released from adrenal glands in a circadian and high-frequency (ultradian) mode, such treatments may not provide an accurate assessment of physiological hormone action. Here we demonstrate that ultradian hormone stimulation induces cyclic GR-mediated transcriptional regulation, or gene pulsing, both in cultured cells and in animal models. Equilibrium receptor-occupancy of regulatory elements precisely tracks the ligand pulses. Nascent RNA transcripts from GR-regulated genes are released in distinct quanta, demonstrating a profound difference between the transcriptional programs induced by ultradian and constant stimulation. Gene pulsing is driven by rapid GR exchange with response elements and by GR recycling through the chaperone machinery, which promotes GR activation and reactivation in response to the ultradian hormone release, thus coupling promoter activity to the naturally occurring fluctuations in hormone levels. The GR signalling pathway has been optimized for a prompt and timely response to fluctuations in hormone levels, indicating that biologically accurate regulation of gene targets by GR requires an ultradian mode of hormone stimulation.

  13. Weighted functional linear regression models for gene-based association analysis.

    PubMed

    Belonogova, Nadezhda M; Svishcheva, Gulnara R; Wilson, James F; Campbell, Harry; Axenovich, Tatiana I

    2018-01-01

    Functional linear regression models are effectively used in gene-based association analysis of complex traits. These models combine information about individual genetic variants, taking into account their positions and reducing the influence of noise and/or observation errors. To increase the power of methods, where several differently informative components are combined, weights are introduced to give the advantage to more informative components. Allele-specific weights have been introduced to collapsing and kernel-based approaches to gene-based association analysis. Here we have for the first time introduced weights to functional linear regression models adapted for both independent and family samples. Using data simulated on the basis of GAW17 genotypes and weights defined by allele frequencies via the beta distribution, we demonstrated that type I errors correspond to declared values and that increasing the weights of causal variants allows the power of functional linear models to be increased. We applied the new method to real data on blood pressure from the ORCADES sample. Five of the six known genes with P < 0.1 in at least one analysis had lower P values with weighted models. Moreover, we found an association between diastolic blood pressure and the VMP1 gene (P = 8.18×10^-6), when we used a weighted functional model. For this gene, the unweighted functional and weighted kernel-based models had P = 0.004 and 0.006, respectively. The new method has been implemented in the program package FREGAT, which is freely available at https://cran.r-project.org/web/packages/FREGAT/index.html.
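
    The idea of weights "defined by allele frequencies via the beta distribution" can be sketched as below. The sketch assumes the weight is the Beta(1, 25) density evaluated at the minor allele frequency, a convention often used in kernel-based tests; the shape parameters and the function names are assumptions and are not taken from FREGAT.

```python
import math


def beta_pdf(x, a, b):
    """Density of the Beta(a, b) distribution at x, for 0 < x < 1."""
    norm = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return norm * x ** (a - 1) * (1 - x) ** (b - 1)


def beta_weight(maf, a=1.0, b=25.0):
    """Weight of a variant with minor allele frequency maf (assumed shape parameters)."""
    return beta_pdf(maf, a, b)


mafs = [0.001, 0.01, 0.05, 0.2]
weights = [beta_weight(m) for m in mafs]
```

    With these shape parameters the weight falls steeply as the allele frequency rises, so rare variants dominate the weighted statistic.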

  14. Model-based gene set analysis for Bioconductor.

    PubMed

    Bauer, Sebastian; Robinson, Peter N; Gagneur, Julien

    2011-07-01

    Gene Ontology and other forms of gene-category analysis play a major role in the evaluation of high-throughput experiments in molecular biology. Single-category enrichment analysis procedures such as Fisher's exact test tend to flag large numbers of redundant categories as significant, which can complicate interpretation. We have recently developed an approach called model-based gene set analysis (MGSA), that substantially reduces the number of redundant categories returned by the gene-category analysis. In this work, we present the Bioconductor package mgsa, which makes the MGSA algorithm available to users of the R language. Our package provides a simple and flexible application programming interface for applying the approach. The mgsa package has been made available as part of Bioconductor 2.8. It is released under the conditions of the Artistic license 2.0.
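
    For context, the single-category baseline that this record contrasts with MGSA can be sketched as a one-sided Fisher's exact test, which is the upper tail of a hypergeometric distribution. This is the baseline test only, not the MGSA algorithm or the mgsa package API; the function and argument names are hypothetical.

```python
from math import comb


def enrichment_p_value(n_population, n_category, n_study, n_overlap):
    """P(X >= n_overlap) when n_study genes are drawn without replacement
    from n_population genes, of which n_category belong to the category."""
    total = comb(n_population, n_study)
    upper = min(n_category, n_study)
    tail = sum(
        comb(n_category, k) * comb(n_population - n_category, n_study - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# 10 genes in total, 5 in the category, 5 in the study set, all 5 overlapping.
p_all_overlap = enrichment_p_value(10, 5, 5, 5)
# Requiring at least 0 overlapping genes is always satisfied.
p_trivial = enrichment_p_value(10, 5, 5, 0)
```

    Because each category is tested on its own, overlapping categories that share the same genes tend to be flagged together, which is the redundancy the abstract describes.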

  15. A new algebraic turbulence model for accurate description of airfoil flows

    NASA Astrophysics Data System (ADS)

    Xiao, Meng-Juan; She, Zhen-Su

    2017-11-01

    We report a new algebraic turbulence model (SED-SL) based on the SED theory, a symmetry-based approach to quantifying wall turbulence. The model specifies a multi-layer profile of a stress length (SL) function in both the streamwise and wall-normal directions, which thus defines the eddy viscosity in the RANS equation (e.g. a zero-equation model). After a successful simulation of flat plate flow (APS meeting, 2016), we report here further applications of the model to the flow around airfoils, with significant improvement of the prediction accuracy of the lift (CL) and drag (CD) coefficients compared to other popular models (e.g. BL, SA, etc.). Two airfoils, namely RAE2822 airfoil and NACA0012 airfoil, are computed for over 50 cases. The results are compared to experimental data from the AGARD report, which shows deviations of CL bounded within 2%, and CD within 2 counts (10^-4) for RAE2822 and 6 counts for NACA0012 respectively (under a systematic adjustment of the flow conditions). In all these calculations, only one parameter (proportional to the Kármán constant) shows slight variation with Mach number. The most remarkable outcome is, for the first time, the accurate prediction of the drag coefficient. The other interesting outcome is the physical interpretation of the multi-layer parameters: they specify the corresponding multi-layer structure of turbulent boundary layer; when used together with simulation data, the SED-SL enables one to extract physical information from empirical data, and to understand the variation of the turbulent boundary layer.

  16. Towards accurate modeling of noncovalent interactions for protein rigidity analysis.

    PubMed

    Fox, Naomi; Streinu, Ileana

    2013-01-01

    Protein rigidity analysis is an efficient computational method for extracting flexibility information from static, X-ray crystallography protein data. Atoms and bonds are modeled as a mechanical structure and analyzed with a fast graph-based algorithm, producing a decomposition of the flexible molecule into interconnected rigid clusters. The result depends critically on noncovalent atomic interactions, primarily on how hydrogen bonds and hydrophobic interactions are computed and modeled. Ongoing research points to the stringent need for benchmarking rigidity analysis software systems, towards the goal of increasing their accuracy and validating their results, either against each other or against biologically relevant (functional) parameters. We propose two new methods for modeling hydrogen bonds and hydrophobic interactions that more accurately reflect a mechanical model, without being computationally more intensive. We evaluate them using a novel scoring method, based on the B-cubed score from the information retrieval literature, which measures how well two cluster decompositions match. To evaluate the modeling accuracy of KINARI, our pebble-game rigidity analysis system, we use a benchmark data set of 20 proteins, each with multiple distinct conformations deposited in the Protein Data Bank. Cluster decompositions for them were previously determined with the RigidFinder method from Gerstein's lab and validated against experimental data. When KINARI's default tuning parameters are used, an improvement of the B-cubed score over a crude baseline is observed in 30% of this data. With our new modeling options, improvements were observed in over 70% of the proteins in this data set. We investigate the sensitivity of the cluster decomposition score with case studies on pyruvate phosphate dikinase and calmodulin. To substantially improve the accuracy of protein rigidity analysis systems, thorough benchmarking must be performed on all current systems and future
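
    The standard B-cubed score that this record builds on can be sketched as follows: for every item, precision is the fraction of its predicted cluster that also shares its reference cluster, and recall is the fraction of its reference cluster that also shares its predicted cluster; both are averaged over items. This is the textbook form only. The paper's scoring method is described as a novel variant based on it, and the function name and toy labels here are hypothetical.

```python
def b_cubed(predicted, reference):
    """predicted, reference: dicts mapping each item to a cluster label.

    Both dicts must cover the same items. Returns (precision, recall, f_score).
    """
    items = list(predicted)
    precision_sum = 0.0
    recall_sum = 0.0
    for i in items:
        same_predicted = {j for j in items if predicted[j] == predicted[i]}
        same_reference = {j for j in items if reference[j] == reference[i]}
        overlap = len(same_predicted & same_reference)
        precision_sum += overlap / len(same_predicted)
        recall_sum += overlap / len(same_reference)
    precision = precision_sum / len(items)
    recall = recall_sum / len(items)
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Toy example: the prediction merges two reference clusters into one.
predicted = {"a": 0, "b": 0, "c": 0, "d": 0}
reference = {"a": 0, "b": 0, "c": 1, "d": 1}
precision, recall, f_score = b_cubed(predicted, reference)
```

    Merging clusters costs precision but not recall, which is why the toy example scores a precision of 0.5 and a recall of 1.0.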

  17. Towards accurate modeling of noncovalent interactions for protein rigidity analysis

    PubMed Central

    2013-01-01

    Background Protein rigidity analysis is an efficient computational method for extracting flexibility information from static, X-ray crystallography protein data. Atoms and bonds are modeled as a mechanical structure and analyzed with a fast graph-based algorithm, producing a decomposition of the flexible molecule into interconnected rigid clusters. The result depends critically on noncovalent atomic interactions, primarily on how hydrogen bonds and hydrophobic interactions are computed and modeled. Ongoing research points to the stringent need for benchmarking rigidity analysis software systems, towards the goal of increasing their accuracy and validating their results, either against each other or against biologically relevant (functional) parameters. We propose two new methods for modeling hydrogen bonds and hydrophobic interactions that more accurately reflect a mechanical model, without being computationally more intensive. We evaluate them using a novel scoring method, based on the B-cubed score from the information retrieval literature, which measures how well two cluster decompositions match. Results To evaluate the modeling accuracy of KINARI, our pebble-game rigidity analysis system, we use a benchmark data set of 20 proteins, each with multiple distinct conformations deposited in the Protein Data Bank. Cluster decompositions for them were previously determined with the RigidFinder method from Gerstein's lab and validated against experimental data. When KINARI's default tuning parameters are used, an improvement of the B-cubed score over a crude baseline is observed in 30% of this data. With our new modeling options, improvements were observed in over 70% of the proteins in this data set. We investigate the sensitivity of the cluster decomposition score with case studies on pyruvate phosphate dikinase and calmodulin. Conclusion To substantially improve the accuracy of protein rigidity analysis systems, thorough benchmarking must be performed on all

  18. Identifying differentially expressed genes in cancer patients using a non-parameter Ising model.

    PubMed

    Li, Xumeng; Feltus, Frank A; Sun, Xiaoqian; Wang, James Z; Luo, Feng

    2011-10-01

    Identification of genes and pathways involved in diseases and physiological conditions is a major task in systems biology. In this study, we developed a novel non-parameter Ising model to integrate protein-protein interaction network and microarray data for identifying differentially expressed (DE) genes. We also proposed a simulated annealing algorithm to find the optimal configuration of the Ising model. The Ising model was applied to two breast cancer microarray data sets. The results showed that more cancer-related DE sub-networks and genes were identified by the Ising model than those by the Markov random field model. Furthermore, cross-validation experiments showed that DE genes identified by Ising model can improve classification performance compared with DE genes identified by Markov random field model. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  19. Diurnal Transcriptome and Gene Network Represented through Sparse Modeling in Brachypodium distachyon.

    PubMed

    Koda, Satoru; Onda, Yoshihiko; Matsui, Hidetoshi; Takahagi, Kotaro; Yamaguchi-Uehara, Yukiko; Shimizu, Minami; Inoue, Komaki; Yoshida, Takuhiro; Sakurai, Tetsuya; Honda, Hiroshi; Eguchi, Shinto; Nishii, Ryuei; Mochida, Keiichi

    2017-01-01

    We report the comprehensive identification of periodic genes and their network inference, based on a gene co-expression analysis and an Auto-Regressive eXogenous (ARX) model with a group smoothly clipped absolute deviation (SCAD) method using a time-series transcriptome dataset in a model grass, Brachypodium distachyon. To reveal the diurnal changes in the transcriptome in B. distachyon, we performed RNA-seq analysis of its leaves sampled through a diurnal cycle of over 48 h at 4 h intervals using three biological replications, and identified 3,621 periodic genes through our wavelet analysis. The expression data are feasible to infer network sparsity based on ARX models. We found that genes involved in biological processes such as transcriptional regulation, protein degradation, and post-transcriptional modification and photosynthesis are significantly enriched in the periodic genes, suggesting that these processes might be regulated by circadian rhythm in B. distachyon. On the basis of the time-series expression patterns of the periodic genes, we constructed a chronological gene co-expression network and identified putative transcription factors encoding genes that might be involved in the time-specific regulatory transcriptional network. Moreover, we inferred a transcriptional network composed of the periodic genes in B. distachyon, aiming to identify genes associated with other genes through variable selection by grouping time points for each gene. Based on the ARX model with the group SCAD regularization using our time-series expression datasets of the periodic genes, we constructed gene networks and found that the networks represent typical scale-free structure. Our findings demonstrate that the diurnal changes in the transcriptome in B. distachyon leaves have a sparse network structure, demonstrating the spatiotemporal gene regulatory network over the cyclic phase transitions in B. distachyon diurnal growth.

  20. Animal models of Duchenne muscular dystrophy: from basic mechanisms to gene therapy

    PubMed Central

    McGreevy, Joe W.; Hakim, Chady H.; McIntosh, Mark A.; Duan, Dongsheng

    2015-01-01

    Duchenne muscular dystrophy (DMD) is a progressive muscle-wasting disorder. It is caused by loss-of-function mutations in the dystrophin gene. Currently, there is no cure. A highly promising therapeutic strategy is to replace or repair the defective dystrophin gene by gene therapy. Numerous animal models of DMD have been developed over the last 30 years, ranging from invertebrate to large mammalian models. mdx mice are the most commonly employed models in DMD research and have been used to lay the groundwork for DMD gene therapy. After ~30 years of development, the field has reached the stage at which the results in mdx mice can be validated and scaled-up in symptomatic large animals. The canine DMD (cDMD) model will be excellent for these studies. In this article, we review the animal models for DMD, the pros and cons of each model system, and the history and progress of preclinical DMD gene therapy research in the animal models. We also discuss the current and emerging challenges in this field and ways to address these challenges using animal models, in particular cDMD dogs. PMID:25740330

  1. Detecting regulatory gene-environment interactions with unmeasured environmental factors.

    PubMed

    Fusi, Nicoló; Lippert, Christoph; Borgwardt, Karsten; Lawrence, Neil D; Stegle, Oliver

    2013-06-01

    Genomic studies have revealed a substantial heritable component of the transcriptional state of the cell. To fully understand the genetic regulation of gene expression variability, it is important to study the effect of genotype in the context of external factors such as alternative environmental conditions. In model systems, explicit environmental perturbations have been considered for this purpose, allowing to directly test for environment-specific genetic effects. However, such experiments are limited to species that can be profiled in controlled environments, hampering their use in important systems such as human. Moreover, even in seemingly tightly regulated experimental conditions, subtle environmental perturbations cannot be ruled out, and hence unknown environmental influences are frequent. Here, we propose a model-based approach to simultaneously infer unmeasured environmental factors from gene expression profiles and use them in genetic analyses, identifying environment-specific associations between polymorphic loci and individual gene expression traits. In extensive simulation studies, we show that our method is able to accurately reconstruct environmental factors and their interactions with genotype in a variety of settings. We further illustrate the use of our model in a real-world dataset in which one environmental factor has been explicitly experimentally controlled. Our method is able to accurately reconstruct the true underlying environmental factor even if it is not given as an input, allowing to detect genuine genotype-environment interactions. In addition to the known environmental factor, we find unmeasured factors involved in novel genotype-environment interactions. Our results suggest that interactions with both known and unknown environmental factors significantly contribute to gene expression variability. Availability and implementation: Software available at http://pmbio.github.io/envGPLVM/. Supplementary data are available at Bioinformatics online.

  2. Cumulative atomic multipole moments complement any atomic charge model to obtain more accurate electrostatic properties

    NASA Technical Reports Server (NTRS)

    Sokalski, W. A.; Shibata, M.; Ornstein, R. L.; Rein, R.

    1992-01-01

    The quality of several atomic charge models based on different definitions has been analyzed using cumulative atomic multipole moments (CAMM). This formalism can generate higher atomic moments starting from any atomic charges, while preserving the corresponding molecular moments. The atomic charge contribution to the higher molecular moments, as well as to the electrostatic potentials, has been examined for CO and HCN molecules at several different levels of theory. The results clearly show that the electrostatic potential obtained from CAMM expansion is convergent up to the R^-5 term for all atomic charge models used. This illustrates that higher atomic moments can be used to supplement any atomic charge model to obtain a more accurate description of electrostatic properties.

  3. Integrating high dimensional bi-directional parsing models for gene mention tagging.

    PubMed

    Hsu, Chun-Nan; Chang, Yu-Ming; Kuo, Cheng-Ju; Lin, Yu-Shi; Huang, Han-Shen; Chung, I-Fang

    2008-07-01

    Tagging gene and gene product mentions in scientific text is an important initial step of literature mining. In this article, we describe in detail our gene mention tagger, which participated in the BioCreative 2 challenge, and analyze what contributes to its good performance. Our tagger is based on the conditional random fields model (CRF), the most prevailing method for the gene mention tagging task in BioCreative 2. Our tagger is interesting because it accomplished the highest F-scores among CRF-based methods and second overall. Moreover, we obtained our results by mostly applying open source packages, making it easy to duplicate our results. We first describe in detail how we developed our CRF-based tagger. We designed a very high dimensional feature set that includes most of the information that may be relevant. We trained bi-directional CRF models with the same set of features, one applying forward parsing and the other backward, and integrated the two models based on the output scores and dictionary filtering. One of the most prominent factors that contributes to the good performance of our tagger is the integration of an additional backward parsing model. However, from the definition of CRF, it appears that a CRF model is symmetric and bi-directional parsing models will produce the same results. We show that due to different feature settings, a CRF model can be asymmetric and the feature setting for our tagger in BioCreative 2 not only produces different results but also gives backward parsing models a slight but constant advantage over forward parsing models. To fully explore the potential of integrating bi-directional parsing models, we applied different asymmetric feature settings to generate many bi-directional parsing models and integrate them based on the output scores. Experimental results show that this integrated model can achieve even higher F-score solely based on the training corpus for gene mention tagging. Data sets, programs and an on-line service of our gene

  4. An Integrated Tool to Study MHC Region: Accurate SNV Detection and HLA Genes Typing in Human MHC Region Using Targeted High-Throughput Sequencing

    PubMed Central

    Liu, Xiao; Xu, Yinyin; Liang, Dequan; Gao, Peng; Sun, Yepeng; Gifford, Benjamin; D’Ascenzo, Mark; Liu, Xiaomin; Tellier, Laurent C. A. M.; Yang, Fang; Tong, Xin; Chen, Dan; Zheng, Jing; Li, Weiyang; Richmond, Todd; Xu, Xun; Wang, Jun; Li, Yingrui

    2013-01-01

    The major histocompatibility complex (MHC) is one of the most variable and gene-dense regions of the human genome. Most studies of the MHC, and associated regions, focus on minor variants and HLA typing, many of which have been demonstrated to be associated with human disease susceptibility and metabolic pathways. However, the detection of variants in the MHC region, and diagnostic HLA typing, still lacks a coherent, standardized, cost effective and high coverage protocol of clinical quality and reliability. In this paper, we present such a method for the accurate detection of minor variants and HLA types in the human MHC region, using high-throughput, high-coverage sequencing of target regions. A probe set was designed to template upon the 8 annotated human MHC haplotypes, and to encompass the 5 megabases (Mb) of the extended MHC region. We deployed our probes upon three, genetically diverse human samples for probe set evaluation, and sequencing data show that ∼97% of the MHC region, and over 99% of the genes in MHC region, are covered with sufficient depth and good evenness. 98% of genotypes called by this capture sequencing prove consistent with established HapMap genotypes. We have concurrently developed a one-step pipeline for calling any HLA type referenced in the IMGT/HLA database from this target capture sequencing data, which shows over 96% typing accuracy when deployed at 4-digit resolution. This cost-effective and highly accurate approach for variant detection and HLA typing in the MHC region may lend further insight into immune-mediated disease studies, and may find clinical utility in transplantation medicine research. This one-step pipeline is released for general evaluation and use by the scientific community. PMID:23894464

  5. Selection of reference genes for quantitative gene expression normalization in flax (Linum usitatissimum L.).

    PubMed

    Huis, Rudy; Hawkins, Simon; Neutelings, Godfrey

    2010-04-19

    Quantitative real-time PCR (qRT-PCR) is currently the most accurate method for detecting differential gene expression. Such an approach depends on the identification of uniformly expressed 'housekeeping genes' (HKGs). Extensive transcriptomic data mining and experimental validation in different model plants have shown that the reliability of these endogenous controls can be influenced by the plant species, growth conditions and organs/tissues examined. It is therefore important to identify the best reference genes to use in each biological system before using qRT-PCR to investigate differential gene expression. In this paper we evaluate different candidate HKGs for developmental transcriptomic studies in the economically-important flax fiber- and oil-crop (Linum usitatissimum L.). Specific primers were designed in order to quantify the expression levels of 20 different potential housekeeping genes in flax roots, internal- and external-stem tissues, leaves and flowers at different developmental stages. After calculations of PCR efficiencies, 13 HKGs were retained and their expression stabilities evaluated by the computer algorithms geNorm and NormFinder. According to geNorm, 2 Transcriptional Elongation Factors (TEFs) and 1 Ubiquitin gene are necessary for normalizing gene expression when all studied samples are considered. However, only 2 TEFs are required for normalizing expression in stem tissues. In contrast, NormFinder identified glyceraldehyde-3-phosphate dehydrogenase (GAPDH) as the most stably expressed gene when all samples were grouped together, as well as when samples were classed into different sub-groups. qRT-PCR was then used to investigate the relative expression levels of two splice variants of the flax LuMYB1 gene (homologue of AtMYB59). LuMYB1-1 and LuMYB1-2 were highly expressed in the internal stem tissues as compared to outer stem tissues and other samples. This result was confirmed with both geNorm-designated- and Norm

  6. Estimation of Dynamic Systems for Gene Regulatory Networks from Dependent Time-Course Data.

    PubMed

    Kim, Yoonji; Kim, Jaejik

    2018-06-15

    A dynamic system of ordinary differential equations (ODEs) is a well-known tool for describing the dynamic nature of gene regulatory networks (GRNs), and the dynamic features of GRNs are usually captured through time-course gene expression data. Owing to high-throughput technologies, time-course gene expression data have complex structures such as heteroscedasticity, correlations between genes, and time dependence. Since gene experiments typically yield highly noisy data with small sample sizes, these complex structures should be taken into account in ODE models for a more accurate prediction of the dynamics. Hence, this study proposes an ODE model considering such data structures and a fast and stable estimation method for the ODE parameters based on the generalized profiling approach with data smoothing techniques. The proposed method also provides statistical inference for the ODE estimator and it is applied to a zebrafish retina cell network.

  7. ZCURVE 3.0: identify prokaryotic genes with higher accuracy as well as automatically and accurately select essential genes

    PubMed Central

    Hua, Zhi-Gang; Lin, Yan; Yuan, Ya-Zhou; Yang, De-Chang; Wei, Wen; Guo, Feng-Biao

    2015-01-01

    In 2003, we developed an ab initio program, ZCURVE 1.0, to find genes in bacterial and archaeal genomes. In this work, we present the updated version (i.e. ZCURVE 3.0). Using 422 prokaryotic genomes, the average accuracy was 93.7% with the updated version, compared with 88.7% with the original version. Such results also demonstrate that ZCURVE 3.0 is comparable with Glimmer 3.02 and may provide complementary predictions to it. In fact, the joint application of the two programs generated better results by correctly finding more annotated genes while also containing fewer false-positive predictions. As the exclusive function, ZCURVE 3.0 contains one post-processing program that can identify essential genes with high accuracy (generally >90%). We hope ZCURVE 3.0 will receive wide use with the web-based running mode. The updated ZCURVE can be freely accessed from http://cefg.uestc.edu.cn/zcurve/ or http://tubic.tju.edu.cn/zcurveb/ without any restrictions. PMID:25977299

  8. Discovering potential driver genes through an integrated model of somatic mutation profiles and gene functional information.

    PubMed

    Xi, Jianing; Wang, Minghui; Li, Ao

    2017-09-26

    The accumulating availability of next-generation sequencing data offers an opportunity to pinpoint driver genes that are causally implicated in oncogenesis through computational models. Despite previous efforts made regarding this challenging problem, there is still room for improvement in the driver gene identification accuracy. In this paper, we propose a novel integrated approach called IntDriver for prioritizing driver genes. Based on a matrix factorization framework, IntDriver can effectively incorporate functional information from both the interaction network and Gene Ontology similarity, and detect driver genes mutated in different sets of patients at the same time. When evaluated through known benchmarking driver genes, the top ranked genes of our result show highly significant enrichment for the known genes. Meanwhile, IntDriver also detects some known driver genes that are not found by the other competing approaches. When measured by precision, recall and F1 score, the performances of our approach are comparable or increased in comparison to the competing approaches.
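    The precision, recall and F1 evaluation of top-ranked genes against a benchmark set can be sketched as follows. This is a generic illustration, not IntDriver's code: the function name, the ranking and the benchmark list are all hypothetical.

```python
def precision_recall_f1(ranked_genes, benchmark, top_k):
    """Score the top_k ranked genes against a set of known benchmark genes."""
    predicted = set(ranked_genes[:top_k])
    true_positives = len(predicted & benchmark)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(benchmark) if benchmark else 0.0
    denominator = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# Hypothetical prioritized list and benchmark set (illustration only).
ranking = ["TP53", "KRAS", "GENE_X", "PIK3CA", "GENE_Y"]
known_drivers = {"TP53", "KRAS", "PIK3CA", "PTEN"}
precision, recall, f1 = precision_recall_f1(ranking, known_drivers, top_k=4)
```

    Here three of the four top-ranked genes are in the benchmark and three of the four benchmark genes are recovered, so all three scores equal 0.75.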

  9. Fitmunk: improving protein structures by accurate, automatic modeling of side-chain conformations.

    PubMed

    Porebski, Przemyslaw Jerzy; Cymborowski, Marcin; Pasenkiewicz-Gierula, Marta; Minor, Wladek

    2016-02-01

    Improvements in crystallographic hardware and software have allowed automated structure-solution pipelines to approach a near-`one-click' experience for the initial determination of macromolecular structures. However, in many cases the resulting initial model requires a laborious, iterative process of refinement and validation. A new method has been developed for the automatic modeling of side-chain conformations that takes advantage of rotamer-prediction methods in a crystallographic context. The algorithm, which is based on deterministic dead-end elimination (DEE) theory, uses new dense conformer libraries and a hybrid energy function derived from experimental data and prior information about rotamer frequencies to find the optimal conformation of each side chain. In contrast to existing methods, which incorporate the electron-density term into protein-modeling frameworks, the proposed algorithm is designed to take advantage of the highly discriminatory nature of electron-density maps. This method has been implemented in the program Fitmunk, which uses extensive conformational sampling. This improves the accuracy of the modeling and makes it a versatile tool for crystallographic model building, refinement and validation. Fitmunk was extensively tested on over 115 new structures, as well as a subset of 1100 structures from the PDB. It is demonstrated that the ability of Fitmunk to model more than 95% of side chains accurately is beneficial for improving the quality of crystallographic protein models, especially at medium and low resolutions. Fitmunk can be used for model validation of existing structures and as a tool to assess whether side chains are modeled optimally or could be better fitted into electron density. Fitmunk is available as a web service at http://kniahini.med.virginia.edu/fitmunk/server/ or at http://fitmunk.bitbucket.org/.
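    The dead-end elimination idea can be sketched with the original DEE criterion: a rotamer r at position i is discarded when its best-case energy, E(i_r) plus the sum over other positions of the minimum pair energy, still exceeds the worst-case energy of some alternative rotamer t at the same position. The energies below are hypothetical numbers; Fitmunk's hybrid energy function, conformer libraries and exact elimination criteria are not reproduced here.

```python
def dee_prune(self_energy, pair_energy):
    """Iteratively apply the original DEE criterion.

    self_energy[i][r]         -- energy of rotamer r at position i
    pair_energy[(i, j)][r][s] -- interaction energy of rotamer r at position i
                                 with rotamer s at position j, stored for i < j
    Returns a list with the set of surviving rotamer indices per position.
    """
    n = len(self_energy)
    alive = [set(range(len(energies))) for energies in self_energy]

    def pair(i, r, j, s):
        return pair_energy[(i, j)][r][s] if i < j else pair_energy[(j, i)][s][r]

    def bound(i, r, pick):
        # pick=min gives the best case for rotamer r, pick=max the worst case.
        return self_energy[i][r] + sum(
            pick(pair(i, r, j, s) for s in alive[j]) for j in range(n) if j != i
        )

    changed = True
    while changed:
        changed = False
        for i in range(n):
            for r in sorted(alive[i]):
                best_case = bound(i, r, min)
                if any(best_case > bound(i, t, max) for t in alive[i] if t != r):
                    alive[i].discard(r)  # r cannot be part of the global minimum
                    changed = True
    return alive


# Hypothetical two-residue problem with two rotamers each (illustration only).
self_e = [[0.0, 5.0], [1.0, 1.5]]
pair_e = {(0, 1): [[0.0, 0.2], [0.5, 0.1]]}
survivors = dee_prune(self_e, pair_e)
```

    In this toy case a single rotamer survives at each position, so the minimum-energy conformation is found without enumerating all combinations.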

  10. Gene Regulation Networks for Modeling Drosophila Development

    NASA Technical Reports Server (NTRS)

    Mjolsness, E.

    1999-01-01

    This chapter will very briefly introduce and review some computational experiments in using trainable gene regulation network models to simulate and understand selected episodes in the development of the fruit fly, Drosophila melanogaster.

  11. Reference gene identification for reliable normalisation of quantitative RT-PCR data in Setaria viridis.

    PubMed

    Nguyen, Duc Quan; Eamens, Andrew L; Grof, Christopher P L

    2018-01-01

    Quantitative real-time polymerase chain reaction (RT-qPCR) is the key platform for the quantitative analysis of gene expression in a wide range of experimental systems and conditions. However, the accuracy and reproducibility of gene expression quantification via RT-qPCR is entirely dependent on the identification of reliable reference genes for data normalisation. Green foxtail (Setaria viridis) has recently been proposed as a potential experimental model for the study of C4 photosynthesis and is closely related to many economically important crop species of the Panicoideae subfamily of grasses, including Zea mays (maize), Sorghum bicolor (sorghum) and Saccharum officinarum (sugarcane). Setaria viridis (Accession 10) possesses a number of key traits as an experimental model, namely: (i) a small sized, sequenced and well annotated genome; (ii) short stature and generation time; (iii) prolific seed production; and (iv) amenability to Agrobacterium tumefaciens-mediated transformation. There is currently, however, a lack of reference gene expression information for Setaria viridis (S. viridis). We therefore aimed to identify a cohort of suitable S. viridis reference genes for accurate and reliable normalisation of S. viridis RT-qPCR expression data. Eleven putative candidate reference genes were identified and examined across thirteen different S. viridis tissues. Of these, the geNorm and NormFinder analysis software identified SERINE/THREONINE-PROTEIN PHOSPHATASE 2A (PP2A), 5'-ADENYLYLSULFATE REDUCTASE 6 (ASPR6) and DUAL SPECIFICITY PHOSPHATASE (DUSP) as the most suitable combination of reference genes for the accurate and reliable normalisation of S. viridis RT-qPCR expression data. To demonstrate the suitability of the three selected reference genes, PP2A, ASPR6 and DUSP were used to normalise the expression of CINNAMYL ALCOHOL DEHYDROGENASE (CAD) genes across the same tissues. This approach readily demonstrated the suitability of the three

  12. Inferring gene dependency network specific to phenotypic alteration based on gene expression data and clinical information of breast cancer.

    PubMed

    Zhou, Xionghui; Liu, Juan

    2014-01-01

    Although many methods have been proposed to reconstruct gene regulatory networks, most of them, when applied to sample-based data, cannot reveal the gene regulatory relations underlying the phenotypic change (e.g. normal versus cancer). In this paper, we adopt phenotype as a variable when constructing the gene regulatory network, while earlier studies either neglected it or only used it to select the differentially expressed genes as the inputs to construct the gene regulatory network. To be specific, we integrate phenotype information with gene expression data to identify the gene dependency pairs by using the method of conditional mutual information. A gene dependency pair (A,B) means that the influence of gene A on the phenotype depends on gene B. All identified gene dependency pairs constitute a directed network underlying the phenotype, namely the gene dependency network. In this way, we have constructed a gene dependency network of breast cancer from gene expression data along with two different phenotype states (metastasis and non-metastasis). Moreover, we have found the network to be scale-free, indicating that its hub genes with high out-degrees may play critical roles in the network. After functional investigation, these hub genes are found to be biologically significant and specifically related to breast cancer, which suggests that our gene dependency network is meaningful. The validity has also been justified by literature investigation. From the network, we have selected 43 discriminative hubs as a signature to build the classification model for distinguishing the distant metastasis risks of breast cancer patients, and the result outperforms those classification models with published signatures. In conclusion, we have proposed a promising way to construct the gene regulatory network by using sample-based data, which has been shown to be effective and accurate in uncovering the hidden mechanism of the biological process and identifying the gene signature for
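    The quantity underlying a gene dependency pair, the conditional mutual information I(A; P | B) between gene A and phenotype P given gene B, can be estimated from discretized data as sketched below. The abstract does not state how expression values are discretized or how pairs are thresholded, so the binary inputs and the function name are assumptions for illustration only.

```python
from collections import Counter
from math import log2


def conditional_mutual_information(a, p, b):
    """Plug-in estimate of I(A; P | B) in bits for equal-length discrete sequences."""
    n = len(a)
    n_apb = Counter(zip(a, p, b))
    n_ab = Counter(zip(a, b))
    n_pb = Counter(zip(p, b))
    n_b = Counter(b)
    total = 0.0
    for (ai, pi, bi), count in n_apb.items():
        # p(a,p,b) * log2[ p(a,p,b) p(b) / (p(a,b) p(p,b)) ]; the 1/n factors cancel.
        total += (count / n) * log2(
            count * n_b[bi] / (n_ab[(ai, bi)] * n_pb[(pi, bi)])
        )
    return total


# Hypothetical binarized data: the phenotype is the XOR of genes A and B, so A
# alone carries no information about it, but A does once B is known.
gene_a = [0, 0, 1, 1]
gene_b = [0, 1, 0, 1]
phenotype = [0, 1, 1, 0]
cmi_given_b = conditional_mutual_information(gene_a, phenotype, gene_b)
cmi_no_context = conditional_mutual_information(gene_a, phenotype, [0, 0, 0, 0])
```

    Conditioning on a constant sequence reduces the estimate to the ordinary mutual information I(A; P), which is zero in this example, whereas conditioning on gene B yields one full bit.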

  13. Construction and analysis of gene-gene dynamics influence networks based on a Boolean model.

    PubMed

    Mazaya, Maulida; Trinh, Hung-Cuong; Kwon, Yung-Keun

    2017-12-21

    Identification of novel gene-gene relations is a crucial issue to understand system-level biological phenomena. To this end, many methods based on a correlation analysis of gene expressions or structural analysis of molecular interaction networks have been proposed. They have a limitation in identifying more complicated gene-gene dynamical relations, though. To overcome this limitation, we proposed a measure to quantify a gene-gene dynamical influence (GDI) using a Boolean network model and constructed a GDI network to indicate existence of a dynamical influence for every ordered pair of genes. It represents how much a state trajectory of a target gene is changed by a knockout mutation subject to a source gene in a gene-gene molecular interaction (GMI) network. Through a topological comparison between GDI and GMI networks, we observed that the former network is denser than the latter network, which implies that there exist many gene pairs of dynamically influencing but molecularly non-interacting relations. In addition, a larger number of hub genes were generated in the GDI network. On the other hand, there was a correlation between these networks such that the degree value of a node was positively correlated to each other. We further investigated the relationships of the GDI value with structural properties and found that there are negative and positive correlations with the length of a shortest path and the number of paths, respectively. In addition, a GDI network could predict a set of genes whose steady-state expression is affected in E. coli gene-knockout experiments. More interestingly, we found that the drug-targets with side-effects have a larger number of outgoing links than the other genes in the GDI network, which implies that they are more likely to influence the dynamics of other genes. 
    Finally, we found biological evidence showing that the gene pairs which are not molecularly interacting but dynamically influential can be considered for novel gene-gene

  14. Using Dual Fluorescence Reporting Genes to Establish an In Vivo Imaging Model of Orthotopic Lung Adenocarcinoma in Mice.

    PubMed

    Lai, Cheng-Wei; Chen, Hsiao-Ling; Yen, Chih-Ching; Wang, Jiun-Long; Yang, Shang-Hsun; Chen, Chuan-Mu

    2016-12-01

    Lung adenocarcinoma is characterized by a poor prognosis and high mortality worldwide. In this study, we aimed to use live imaging techniques and a reporter gene that generates highly penetrative near-infrared (NIR) fluorescence to establish a preclinical animal model that allows in vivo monitoring of lung cancer development and provides a non-invasive tool for the research on lung cancer pathogenesis and therapeutic efficacy. A human lung adenocarcinoma cell line (A549), which stably expressed the dual fluorescence reporting gene (pCAG-iRFP-2A-Venus), was used to generate subcutaneous or orthotopic lung cancer in nude mice. Cancer development was evaluated by live imaging via the NIR fluorescent signals from iRFP, and the signals were verified ex vivo by the green fluorescence of Venus from the gross lung. The tumor-bearing mice received miR-16 nucleic acid therapy by intranasal administration to demonstrate therapeutic efficacy in this live imaging system. For the subcutaneous xenografts, the detection of iRFP fluorescent signals revealed delicate changes occurring during tumor growth that are not distinguishable by conventional methods of tumor measurement. For the orthotopic xenografts, the positive correlation between the in vivo iRFP signal from mice chests and the ex vivo green fluorescent signal from gross lung tumors and the results of the suppressed tumorigenesis by miR-16 treatment indicated that lung tumor size can be accurately quantified by the emission of NIR fluorescence. In addition, orthotopic lung tumor localization can be accurately visualized using iRFP fluorescence tomography in vivo, thus revealing the trafficking of lung tumor cells. We introduced a novel dual fluorescence lung cancer model that provides a non-invasive option for preclinical research via the use of NIR fluorescence in live imaging of the lung.

  15. A recellularized human colon model identifies cancer driver genes

    PubMed Central

    Chen, Huanhuan Joyce; Wei, Zhubo; Sun, Jian; Bhattacharya, Asmita; Savage, David J; Serda, Rita; Mackeyev, Yuri; Curley, Steven A.; Bu, Pengcheng; Wang, Lihua; Chen, Shuibing; Cohen-Gould, Leona; Huang, Emina; Shen, Xiling; Lipkin, Steven M.; Copeland, Neal G.; Jenkins, Nancy A.; Shuler, Michael L.

    2016-01-01

    Refined cancer models are needed to bridge the gap between cell-line, animal and clinical research. Here we describe the engineering of an organotypic colon cancer model by recellularization of a native human matrix that contains cell-populated mucosa and an intact muscularis mucosa layer. This ex vivo system recapitulates the pathophysiological progression from APC-mutant neoplasia to submucosal invasive tumor. We used it to perform a Sleeping Beauty transposon mutagenesis screen to identify genes that cooperate with mutant APC in driving invasive neoplasia. 38 candidate invasion driver genes were identified, 17 of which have been previously implicated in colorectal cancer progression, including TCF7L2, TWIST2, MSH2, DCC and EPHB1/2. Six invasion driver genes that to our knowledge have not been previously described were validated in vitro using cell proliferation, migration and invasion assays, and ex vivo using recellularized human colon. These results demonstrate the utility of our organoid model for studying cancer biology. PMID:27398792

  16. funRiceGenes dataset for comprehensive understanding and application of rice functional genes.

    PubMed

    Yao, Wen; Li, Guangwei; Yu, Yiming; Ouyang, Yidan

    2018-01-01

    As a main staple food, rice is also a model plant for functional genomic studies of monocots. Decoding of every DNA element of the rice genome is essential for genetic improvement to address increasing food demands. The past 15 years have witnessed extraordinary advances in rice functional genomics. Systematic characterization and proper deposition of every rice gene are vital for both functional studies and crop genetic improvement. We built a comprehensive and accurate dataset of ∼2800 functionally characterized rice genes and ∼5000 members of different gene families by integrating data from available databases and reviewing every publication on rice functional genomic studies. The dataset accounts for 19.2% of the 39 045 annotated protein-coding rice genes, which provides the most exhaustive archive for investigating the functions of rice genes. We also constructed 214 gene interaction networks based on 1841 connections between 1310 genes. The largest network with 762 genes indicated that pleiotropic genes linked different biological pathways. Increasing degree of conservation of the flowering pathway was observed among more closely related plants, implying substantial value of rice genes for future dissection of flowering regulation in other crops. All data are deposited in the funRiceGenes database (https://funricegenes.github.io/). Functionality for advanced search and continuous updating of the database are provided by a Shiny application (http://funricegenes.ncpgr.cn/). The funRiceGenes dataset would enable further exploring of the crosslink between gene functions and natural variations in rice, which can also facilitate breeding design to improve target agronomic traits of rice. © The Authors 2017. Published by Oxford University Press.

  17. Evaluating methods of inferring gene regulatory networks highlights their lack of performance for single cell gene expression data.

    PubMed

    Chen, Shuonan; Mar, Jessica C

    2018-06-19

    A fundamental fact in biology states that genes do not operate in isolation, and yet, methods that infer regulatory networks for single cell gene expression data have been slow to emerge. With single cell sequencing methods now becoming accessible, general network inference algorithms that were initially developed for data collected from bulk samples may not be suitable for single cells. Meanwhile, although methods that are specific for single cell data are now emerging, whether they have improved performance over general methods is unknown. In this study, we evaluate the applicability of five general methods and three single cell methods for inferring gene regulatory networks from both experimental single cell gene expression data and in silico simulated data. Standard evaluation metrics using ROC curves and Precision-Recall curves against reference sets sourced from the literature demonstrated that most of the methods performed poorly when they were applied to either experimental single cell data, or simulated single cell data, which demonstrates their lack of performance for this task. Using default settings, network methods were applied to the same datasets. Comparisons of the learned networks highlighted the uniqueness of some predicted edges for each method. The fact that different methods infer networks that vary substantially reflects the underlying mathematical rationale and assumptions that distinguish network methods from each other. This study provides a comprehensive evaluation of network modeling algorithms applied to experimental single cell gene expression data and in silico simulated datasets where the network structure is known. Comparisons demonstrate that most of these assessed network methods are not able to predict network structures from single cell expression data accurately, even if they are specifically developed for single cell methods. Also, single cell methods, which usually depend on more elaborative algorithms, in general have less

  18. A gene network model accounting for development and evolution of mammalian teeth

    PubMed Central

    Salazar-Ciudad, Isaac; Jernvall, Jukka

    2002-01-01

    Generation of morphological diversity remains a challenge for evolutionary biologists because it is unclear how an ultimately finite number of genes involved in initial pattern formation integrates with morphogenesis. Ideally, models used to search for the simplest developmental principles on how genes produce form should account for both developmental process and evolutionary change. Here we present a model reproducing the morphology of mammalian teeth by integrating experimental data on gene interactions and growth into a morphodynamic mechanism in which developing morphology has a causal role in patterning. The model predicts the course of tooth-shape development in different mammalian species and also reproduces key transitions in evolution. Furthermore, we reproduce the known expression patterns of several genes involved in tooth development and their dynamics over developmental time. Large morphological effects frequently can be achieved by small changes, according to this model, and similar morphologies can be produced by different changes. This finding may be consistent with why predicting the morphological outcomes of molecular experiments is challenging. Nevertheless, models incorporating morphology and gene activity show promise for linking genotypes to phenotypes. PMID:12048258

  19. The GermOnline cross-species systems browser provides comprehensive information on genes and gene products relevant for sexual reproduction.

    PubMed

    Gattiker, Alexandre; Niederhauser-Wiederkehr, Christa; Moore, James; Hermida, Leandro; Primig, Michael

    2007-01-01

    We report a novel release of the GermOnline knowledgebase covering genes relevant for the cell cycle, gametogenesis and fertility. GermOnline was extended into a cross-species systems browser including information on DNA sequence annotation, gene expression and the function of gene products. The database covers eight model organisms and Homo sapiens, for which complete genome annotation data are available. The database is now built around a sophisticated genome browser (Ensembl), our own microarray information management and annotation system (MIMAS) used to extensively describe experimental data obtained with high-density oligonucleotide microarrays (GeneChips) and a comprehensive system for online editing of database entries (MediaWiki). The RNA data include results from classical microarrays as well as tiling arrays that yield information on RNA expression levels, transcript start sites and lengths as well as exon composition. Members of the research community are solicited to help GermOnline curators keep database entries on genes and gene products complete and accurate. The database is accessible at http://www.germonline.org/.

  20. Modeling Dynamic Regulatory Processes in Stroke.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    McDermott, Jason E.; Jarman, Kenneth D.; Taylor, Ronald C.

    2012-10-11

    The ability to examine in silico the behavior of biological systems can greatly accelerate the pace of discovery in disease pathologies, such as stroke, where in vivo experimentation is lengthy and costly. In this paper we describe an approach to in silico examination of blood genomic responses to neuroprotective agents and subsequent stroke through the development of dynamic models of the regulatory processes observed in the experimental gene expression data. First, we identified functional gene clusters from these data. Next, we derived ordinary differential equations (ODEs) relating regulators and functional clusters from the data. These ODEs were used to develop dynamic models that simulate the expression of regulated functional clusters using system dynamics as the modeling paradigm. The dynamic model has the considerable advantage of only requiring an initial starting state, and does not require measurement of regulatory influences at each time point in order to make accurate predictions. The manipulation of input model parameters, such as changing the magnitude of gene expression, made it possible to assess the behavior of the networks through time under varying conditions. We report that an optimized dynamic model can provide accurate predictions of overall system behavior under several different preconditioning paradigms.
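    The point that an ODE-based dynamic model needs only an initial state can be illustrated with a minimal forward-Euler simulation. The linear form dx/dt = W x, the two-variable system and all rate values are assumptions chosen for illustration; the abstract does not give the form of the authors' equations.

```python
def simulate(initial_state, rate_matrix, dt, steps):
    """Forward-Euler integration of dx/dt = rate_matrix * x from an initial state."""
    state = list(initial_state)
    trajectory = [tuple(state)]
    for _ in range(steps):
        derivative = [
            sum(w * x for w, x in zip(row, state)) for row in rate_matrix
        ]
        state = [x + dt * dx for x, dx in zip(state, derivative)]
        trajectory.append(tuple(state))
    return trajectory


# Hypothetical system: variable 0 is a regulator that decays over time and
# activates variable 1 (a functional cluster), which decays more slowly.
rates = [
    [-1.0, 0.0],
    [0.5, -0.2],
]
trajectory = simulate([1.0, 0.0], rates, dt=0.01, steps=500)
```

    Changing an entry of the initial state or of the rate matrix and re-running the simulation is the kind of parameter manipulation the abstract describes; no measurements at intermediate time points are needed.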

  1. Protein modelling of triterpene synthase genes from mangrove plants using Phyre2 and Swiss-model

    NASA Astrophysics Data System (ADS)

    Basyuni, M.; Wati, R.; Sulistiyono, N.; Hayati, R.; Sumardi; Oku, H.; Baba, S.; Sagami, H.

    2018-03-01

    Five oxidosqualene cyclase (OSC) genes from Bruguiera gymnorrhiza, Kandelia candel, and Rhizophora stylosa had previously been cloned and characterized, and encode mono- and multi-triterpene synthases. The present study analyzed protein modelling of triterpene synthase genes from mangrove using Phyre2 and Swiss-model. Diversity among the Phyre2 models of the triterpene synthases was noted in sequence identity (38-43%) and number of residues (696-703). RsM2 was distinguishable from the others in its template structure; it used lanosterol synthase as a template (PDB ID: w6j.1.A). By contrast, the other genes used human lanosterol synthase (1w6k.1.A). The predicted binding sites were correlated with the product of each triterpene synthase: the product of BgbAS was β-amyrin, while RsM1 contained a significant amount of β-amyrin. Similarly, the main product of both BgLUS and KcMS was lupeol, whereas that of RsM2 was taraxerol. Homology modelling revealed that 696 residues of BgbAS, BgLUS, RsM1, and RsM2 (91-92% of the amino acid sequence) had been modelled with 100% confidence by the single highest scoring template using Phyre2. This coverage was higher than that of Swiss-model (85-90%). The present study suggested that molecular cloning of triterpene genes provides useful tools for studying the protein modelling related regulation of isoprenoid biosynthesis in mangrove forests.

  2. ZCURVE 3.0: identify prokaryotic genes with higher accuracy as well as automatically and accurately select essential genes.

    PubMed

    Hua, Zhi-Gang; Lin, Yan; Yuan, Ya-Zhou; Yang, De-Chang; Wei, Wen; Guo, Feng-Biao

    2015-07-01

    In 2003, we developed an ab initio program, ZCURVE 1.0, to find genes in bacterial and archaeal genomes. In this work, we present the updated version (i.e. ZCURVE 3.0). Using 422 prokaryotic genomes, the average accuracy was 93.7% with the updated version, compared with 88.7% with the original version. Such results also demonstrate that ZCURVE 3.0 is comparable with Glimmer 3.02 and may provide complementary predictions to it. In fact, the joint application of the two programs generated better results by correctly finding more annotated genes while also containing fewer false-positive predictions. As the exclusive function, ZCURVE 3.0 contains one post-processing program that can identify essential genes with high accuracy (generally >90%). We hope ZCURVE 3.0 will receive wide use with the web-based running mode. The updated ZCURVE can be freely accessed from http://cefg.uestc.edu.cn/zcurve/ or http://tubic.tju.edu.cn/zcurveb/ without any restrictions. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  3. MED: a new non-supervised gene prediction algorithm for bacterial and archaeal genomes.

    PubMed

    Zhu, Huaiqiu; Hu, Gang-Qing; Yang, Yi-Fan; Wang, Jin; She, Zhen-Su

    2007-03-16

    Despite remarkable success in the computational prediction of genes in Bacteria and Archaea, a lack of comprehensive understanding of prokaryotic gene structures prevents further elucidation of differences among genomes. It continues to be interesting to develop new ab initio algorithms which not only accurately predict genes, but also facilitate comparative studies of prokaryotic genomes. This paper describes a new prokaryotic genefinding algorithm based on a comprehensive statistical model of protein coding Open Reading Frames (ORFs) and Translation Initiation Sites (TISs). The former is based on a linguistic "Entropy Density Profile" (EDP) model of coding DNA sequence and the latter comprises several relevant features related to the translation initiation. They are combined to form a so-called Multivariate Entropy Distance (MED) algorithm, MED 2.0, that incorporates several strategies in the iterative program. The iterations enable us to develop a non-supervised learning process and to obtain a set of genome-specific parameters for the gene structure, before making the prediction of genes. Results of extensive tests show that MED 2.0 achieves competitively high performance in gene prediction for both 5' and 3' end matches, compared to the current best prokaryotic gene finders. The advantage of MED 2.0 is particularly evident for GC-rich genomes and archaeal genomes. Furthermore, the genome-specific parameters given by MED 2.0 match the current understanding of prokaryotic genomes and may serve as tools for comparative genomic studies. In particular, MED 2.0 is shown to reveal divergent translation initiation mechanisms in archaeal genomes while making a more accurate prediction of TISs compared to the existing gene finders and the current GenBank annotation.

  4. Accurate and efficient modeling of the detector response in small animal multi-head PET systems.

    PubMed

    Cecchetti, Matteo; Moehrs, Sascha; Belcari, Nicola; Del Guerra, Alberto

    2013-10-07

    In fully three-dimensional PET imaging, iterative image reconstruction techniques usually outperform analytical algorithms in terms of image quality provided that an appropriate system model is used. In this study we concentrate on the calculation of an accurate system model for the YAP-(S)PET II small animal scanner, with the aim to obtain fully resolution- and contrast-recovered images at low levels of image roughness. For this purpose we calculate the system model by decomposing it into a product of five matrices: (1) a detector response component obtained via Monte Carlo simulations, (2) a geometric component which describes the scanner geometry and which is calculated via a multi-ray method, (3) a detector normalization component derived from the acquisition of a planar source, (4) a photon attenuation component calculated from x-ray computed tomography data, and finally, (5) a positron range component is formally included. This system model factorization allows the optimization of each component in terms of computation time, storage requirements and accuracy. The main contribution of this work is a new, efficient way to calculate the detector response component for rotating, planar detectors, that consists of a GEANT4 based simulation of a subset of lines of flight (LOFs) for a single detector head whereas the missing LOFs are obtained by using intrinsic detector symmetries. Additionally, we introduce and analyze a probability threshold for matrix elements of the detector component to optimize the trade-off between the matrix size in terms of non-zero elements and the resulting quality of the reconstructed images. 
    In order to evaluate our proposed system model we reconstructed various images of objects, acquired according to the NEMA NU 4-2008 standard, and we compared them to the images reconstructed with two other system models: a model that does not include any detector response component and a model that approximates analytically the depth of interaction

  5. Accurate and efficient modeling of the detector response in small animal multi-head PET systems

    NASA Astrophysics Data System (ADS)

    Cecchetti, Matteo; Moehrs, Sascha; Belcari, Nicola; Del Guerra, Alberto

    2013-10-01

    In fully three-dimensional PET imaging, iterative image reconstruction techniques usually outperform analytical algorithms in terms of image quality provided that an appropriate system model is used. In this study we concentrate on the calculation of an accurate system model for the YAP-(S)PET II small animal scanner, with the aim to obtain fully resolution- and contrast-recovered images at low levels of image roughness. For this purpose we calculate the system model by decomposing it into a product of five matrices: (1) a detector response component obtained via Monte Carlo simulations, (2) a geometric component which describes the scanner geometry and which is calculated via a multi-ray method, (3) a detector normalization component derived from the acquisition of a planar source, (4) a photon attenuation component calculated from x-ray computed tomography data, and finally, (5) a positron range component is formally included. This system model factorization allows the optimization of each component in terms of computation time, storage requirements and accuracy. The main contribution of this work is a new, efficient way to calculate the detector response component for rotating, planar detectors, that consists of a GEANT4 based simulation of a subset of lines of flight (LOFs) for a single detector head whereas the missing LOFs are obtained by using intrinsic detector symmetries. Additionally, we introduce and analyze a probability threshold for matrix elements of the detector component to optimize the trade-off between the matrix size in terms of non-zero elements and the resulting quality of the reconstructed images. 
    In order to evaluate our proposed system model we reconstructed various images of objects, acquired according to the NEMA NU 4-2008 standard, and we compared them to the images reconstructed with two other system models: a model that does not include any detector response component and a model that approximates analytically the depth of interaction

  6. Exploiting proteomic data for genome annotation and gene model validation in Aspergillus niger.

    PubMed

    Wright, James C; Sugden, Deana; Francis-McIntyre, Sue; Riba-Garcia, Isabel; Gaskell, Simon J; Grigoriev, Igor V; Baker, Scott E; Beynon, Robert J; Hubbard, Simon J

    2009-02-04

    Proteomic data is a potentially rich, but arguably unexploited, data source for genome annotation. Peptide identifications from tandem mass spectrometry provide prima facie evidence for gene predictions and can discriminate among a set of candidate gene models. Here we apply this to the recently sequenced Aspergillus niger fungal genome from the Joint Genome Institute (JGI) and to a predicted protein set from another A. niger sequence. Tandem mass spectra (MS/MS) were acquired from 1D gel electrophoresis bands and searched against all available gene models using Average Peptide Scoring (APS) and reverse database searching to produce confident identifications at an acceptable false discovery rate (FDR). 405 identified peptide sequences were mapped to 214 different A. niger genomic loci to which 4093 predicted gene models clustered, 2872 of which contained the mapped peptides. Interestingly, 13 (6%) of these loci either had no preferred predicted gene model or the genome annotators' chosen "best" model for that genomic locus was not found to be the most parsimonious match to the identified peptides. The peptides identified also boosted confidence in predicted gene structures spanning 54 introns from different gene models. This work highlights the potential of integrating experimental proteomics data into genomic annotation pipelines much as expressed sequence tag (EST) data has been. A comparison with the published genome from another strain of A. niger sequenced by DSM showed that a number of the gene models or proteins with proteomics evidence did not occur in both genomes, further highlighting the utility of the method.

  7. Exploiting proteomic data for genome annotation and gene model validation in Aspergillus niger

    PubMed Central

    Wright, James C; Sugden, Deana; Francis-McIntyre, Sue; Riba-Garcia, Isabel; Gaskell, Simon J; Grigoriev, Igor V; Baker, Scott E; Beynon, Robert J; Hubbard, Simon J

    2009-01-01

    Background Proteomic data is a potentially rich, but arguably unexploited, data source for genome annotation. Peptide identifications from tandem mass spectrometry provide prima facie evidence for gene predictions and can discriminate among a set of candidate gene models. Here we apply this to the recently sequenced Aspergillus niger fungal genome from the Joint Genome Institute (JGI) and to a predicted protein set from another A. niger sequence. Tandem mass spectra (MS/MS) were acquired from 1D gel electrophoresis bands and searched against all available gene models using Average Peptide Scoring (APS) and reverse database searching to produce confident identifications at an acceptable false discovery rate (FDR). Results 405 identified peptide sequences were mapped to 214 different A. niger genomic loci to which 4093 predicted gene models clustered, 2872 of which contained the mapped peptides. Interestingly, 13 (6%) of these loci either had no preferred predicted gene model or the genome annotators' chosen "best" model for that genomic locus was not found to be the most parsimonious match to the identified peptides. The peptides identified also boosted confidence in predicted gene structures spanning 54 introns from different gene models. Conclusion This work highlights the potential of integrating experimental proteomics data into genomic annotation pipelines much as expressed sequence tag (EST) data has been. A comparison with the published genome from another strain of A. niger sequenced by DSM showed that a number of the gene models or proteins with proteomics evidence did not occur in both genomes, further highlighting the utility of the method. PMID:19193216

  8. Subthalamic hGAD65 Gene Therapy and Striatum TH Gene Transfer in a Parkinson’s Disease Rat Model

    PubMed Central

    Zheng, Deyu; Jiang, Xiaohua; Zhao, Junpeng; Duan, Deyi; Zhao, Huanying; Xu, Qunyuan

    2013-01-01

    The aim of the present study is to identify a combined gene therapy approach for the treatment of Parkinson’s disease (PD). Here, a PD rat model is used for in vivo gene therapy with a recombinant adeno-associated virus (AAV2) containing a human glutamic acid decarboxylase 65 gene (rAAV2-hGAD65) delivered to the subthalamic nucleus (STN). This is combined with ex vivo gene delivery of tyrosine hydroxylase (TH) by fibroblasts injected into the striatum. After treatment, rotation behavior improved, with the greatest efficacy in the combination group. Immunohistochemistry showed that hGAD65 gene delivery by AAV2 successfully led to phenotypic changes of neurons in the STN. In addition, the levels of glutamic acid and GABA in the internal segment of the globus pallidus (GPi) and substantia nigra pars reticulata (SNr) were clearly lower than in the control groups. However, hGAD65 gene transfer did not effectively protect surviving dopaminergic neurons in the SNc and VTA. This study suggests that subthalamic hGAD65 gene therapy combined with TH gene therapy can alleviate symptoms in PD model rats, independent of protecting the DA neurons from death. PMID:23738148

  9. Selective pressures for accurate altruism targeting: evidence from digital evolution for difficult-to-test aspects of inclusive fitness theory.

    PubMed

    Clune, Jeff; Goldsby, Heather J; Ofria, Charles; Pennock, Robert T

    2011-03-07

    Inclusive fitness theory predicts that natural selection will favour altruist genes that are more accurate in targeting altruism only to copies of themselves. In this paper, we provide evidence from digital evolution in support of this prediction by competing multiple altruist-targeting mechanisms that vary in their accuracy in determining whether a potential target for altruism carries a copy of the altruist gene. We compete altruism-targeting mechanisms based on (i) kinship (kin targeting), (ii) genetic similarity at a level greater than that expected of kin (similarity targeting), and (iii) perfect knowledge of the presence of an altruist gene (green beard targeting). Natural selection always favoured the most accurate targeting mechanism available. Our investigations also revealed that evolution did not increase the altruism level when all green beard altruists used the same phenotypic marker. The green beard altruism levels stably increased only when mutations that changed the altruism level also changed the marker (e.g. beard colour), such that beard colour reliably indicated the altruism level. For kin- and similarity-targeting mechanisms, we found that evolution was able to stably adjust altruism levels. Our results confirm that natural selection favours altruist genes that are increasingly accurate in targeting altruism to only their copies. Our work also emphasizes that the concept of targeting accuracy must include both the presence of an altruist gene and the level of altruism it produces.

  10. Multi-fidelity machine learning models for accurate bandgap predictions of solids

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pilania, Ghanshyam; Gubernatis, James E.; Lookman, Turab

    Here, we present a multi-fidelity co-kriging statistical learning framework that combines variable-fidelity quantum mechanical calculations of bandgaps to generate a machine-learned model that enables low-cost accurate predictions of the bandgaps at the highest fidelity level. Additionally, the adopted Gaussian process regression formulation allows us to predict the underlying uncertainties as a measure of our confidence in the predictions. In using a set of 600 elpasolite compounds as an example dataset and using semi-local and hybrid exchange correlation functionals within density functional theory as two levels of fidelities, we demonstrate the excellent learning performance of the method against actual high fidelity quantum mechanical calculations of the bandgaps. The presented statistical learning method is not restricted to bandgaps or electronic structure methods and extends the utility of high throughput property predictions in a significant way.

  11. Multi-fidelity machine learning models for accurate bandgap predictions of solids

    DOE PAGES

    Pilania, Ghanshyam; Gubernatis, James E.; Lookman, Turab

    2016-12-28

    Here, we present a multi-fidelity co-kriging statistical learning framework that combines variable-fidelity quantum mechanical calculations of bandgaps to generate a machine-learned model that enables low-cost accurate predictions of the bandgaps at the highest fidelity level. Additionally, the adopted Gaussian process regression formulation allows us to predict the underlying uncertainties as a measure of our confidence in the predictions. In using a set of 600 elpasolite compounds as an example dataset and using semi-local and hybrid exchange correlation functionals within density functional theory as two levels of fidelities, we demonstrate the excellent learning performance of the method against actual high fidelity quantum mechanical calculations of the bandgaps. The presented statistical learning method is not restricted to bandgaps or electronic structure methods and extends the utility of high throughput property predictions in a significant way.

  12. Accurate and fast multiple-testing correction in eQTL studies.

    PubMed

    Sul, Jae Hoon; Raj, Towfique; de Jong, Simone; de Bakker, Paul I W; Raychaudhuri, Soumya; Ophoff, Roel A; Stranger, Barbara E; Eskin, Eleazar; Han, Buhm

    2015-06-04

    In studies of expression quantitative trait loci (eQTLs), it is of increasing interest to identify eGenes, the genes whose expression levels are associated with variation at a particular genetic variant. Detecting eGenes is important for follow-up analyses and prioritization because genes are the main entities in biological processes. To detect eGenes, one typically focuses on the genetic variant with the minimum p value among all variants in cis with a gene and corrects for multiple testing to obtain a gene-level p value. For performing multiple-testing correction, a permutation test is widely used. Because of growing sample sizes of eQTL studies, however, the permutation test has become a computational bottleneck in eQTL studies. In this paper, we propose an efficient approach for correcting for multiple testing and assess eGene p values by utilizing a multivariate normal distribution. Our approach properly takes into account the linkage-disequilibrium structure among variants, and its time complexity is independent of sample size. By applying our small-sample correction techniques, our method achieves high accuracy in both small and large studies. We have shown that our method consistently produces extremely accurate p values (accuracy > 98%) for three human eQTL datasets with different sample sizes and SNP densities: the Genotype-Tissue Expression pilot dataset, the multi-region brain dataset, and the HapMap 3 dataset. Copyright © 2015 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
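
    The idea in this abstract of replacing permutations with a multivariate normal distribution can be illustrated with a rough sketch. The code below estimates a gene-level p value by drawing correlated test statistics whose correlation matrix stands in for the linkage-disequilibrium structure among cis variants. It is a minimal sketch under stated assumptions, not the authors' method: the function names, the toy LD matrices, and the plain Monte Carlo estimator are hypothetical, and the small-sample correction mentioned in the abstract is not reproduced.

```python
import math
import random
from statistics import NormalDist


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a small positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def gene_level_p_value(min_p, ld, n_draws=20000, seed=0):
    """Estimate P(max |Z_i| >= observed threshold) for Z ~ N(0, ld).

    min_p is the smallest two-sided variant-level p value for the gene;
    ld is an assumed correlation (LD) matrix among the tested variants.
    """
    rng = random.Random(seed)
    # Convert the two-sided minimum p value into a |z| threshold.
    threshold = NormalDist().inv_cdf(1.0 - min_p / 2.0)
    lower = cholesky(ld)
    n = len(ld)
    hits = 0
    for _ in range(n_draws):
        noise = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Correlate the independent draws with the Cholesky factor.
        z = [sum(lower[i][k] * noise[k] for k in range(i + 1)) for i in range(n)]
        if max(abs(v) for v in z) >= threshold:
            hits += 1
    # Add-one estimator so the result is never exactly zero.
    return (hits + 1) / (n_draws + 1)


independent = [[1.0, 0.0], [0.0, 1.0]]
linked = [[1.0, 0.95], [0.95, 1.0]]
print(gene_level_p_value(0.05, independent))  # close to 1 - 0.95**2
print(gene_level_p_value(0.05, linked))       # smaller, since the tests overlap
```

    The cost of this sketch depends on the number of variants and draws but not on the number of samples in the study, which is the property the abstract highlights.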

  13. Globin gene structure in a reptile supports the transpositional model for amniote α- and β-globin gene evolution.

    PubMed

    Patel, Vidushi S; Ezaz, Tariq; Deakin, Janine E; Graves, Jennifer A Marshall

    2010-12-01

    The haemoglobin protein, required for oxygen transportation in the body, is encoded by α- and β-globin genes that are arranged in clusters. The transpositional model for the evolution of distinct α-globin and β-globin clusters in amniotes is much simpler than the previously proposed whole genome duplication model. According to this model, all jawed vertebrates share one ancient region containing α- and β-globin genes and several flanking genes in the order MPG-C16orf35-(α-β)-GBY-LUC7L that has been conserved for more than 410 million years, whereas amniotes evolved a distinct β-globin cluster by insertion of a transposed β-globin gene from this ancient region into a cluster of olfactory receptors flanked by CCKBR and RRM1. It could not be determined whether this organisation is conserved in all amniotes because of the paucity of information from non-avian reptiles. To fill in this gap, we examined globin gene organisation in a squamate reptile, the Australian bearded dragon lizard, Pogona vitticeps (Agamidae). We report here that the α-globin cluster (HBK, HBA) is flanked by C16orf35 and GBY and is located on a pair of microchromosomes, whereas the β-globin cluster is flanked by RRM1 on the 3' end and is located on the long arm of chromosome 3. However, the CCKBR gene that flanks the β-globin cluster on the 5' end in other amniotes is located on the short arm of chromosome 5 in P. vitticeps, indicating that a chromosomal break between the β-globin cluster and CCKBR occurred at least in the agamid lineage. Our data from a reptile species provide further evidence to support the transpositional model for the evolution of β-globin gene cluster in amniotes.

  14. Translating natural genetic variation to gene expression in a computational model of the Drosophila gap gene regulatory network

    PubMed Central

    Kozlov, Konstantin N.; Kulakovskiy, Ivan V.; Zubair, Asif; Marjoram, Paul; Lawrie, David S.; Nuzhdin, Sergey V.; Samsonova, Maria G.

    2017-01-01

    Annotating the genotype-phenotype relationship, and developing a proper quantitative description of the relationship, requires understanding the impact of natural genomic variation on gene expression. We apply a sequence-level model of gap gene expression in the early development of Drosophila to analyze single nucleotide polymorphisms (SNPs) in a panel of natural sequenced D. melanogaster lines. Using a thermodynamic modeling framework, we provide both analytical and computational descriptions of how single-nucleotide variants affect gene expression. The analysis reveals that the sequence variants increase (decrease) gene expression if located within binding sites of repressors (activators). We show that the sign of SNP influence (activation or repression) may change in time and space and elucidate the origin of this change in specific examples. The thermodynamic modeling approach predicts non-local and non-linear effects arising from SNPs, and combinations of SNPs, in individual fly genotypes. Simulation of individual fly genotypes using our model reveals that this non-linearity reduces to almost additive inputs from multiple SNPs. Further, we see signatures of the action of purifying selection in the gap gene regulatory regions. To infer the specific targets of purifying selection, we analyze the patterns of polymorphism in the data at two phenotypic levels: the strengths of binding and expression. We find that combinations of SNPs show evidence of being under selective pressure, while individual SNPs do not. The model predicts that SNPs appear to accumulate in the genotypes of the natural population in a way biased towards small increases in activating action on the expression pattern. Taken together, these results provide a systems-level view of how genetic variation translates to the level of gene regulatory networks via combinatorial SNP effects. PMID:28898266

  15. High-coverage methylation data of a gene model before and after DNA damage and homologous repair.

    PubMed

    Pezone, Antonio; Russo, Giusi; Tramontano, Alfonso; Florio, Ermanno; Scala, Giovanni; Landi, Rosaria; Zuchegna, Candida; Romano, Antonella; Chiariotti, Lorenzo; Muller, Mark T; Gottesman, Max E; Porcellini, Antonio; Avvedimento, Enrico V

    2017-04-11

    Genome-wide methylation analysis is limited by its low coverage and the inability to detect single variants below 10%. Quantitative analysis provides accurate information on the extent of methylation of single CpG dinucleotide, but it does not measure the actual polymorphism of the methylation profiles of single molecules. To understand the polymorphism of DNA methylation and to decode the methylation signatures before and after DNA damage and repair, we have deep sequenced in bisulfite-treated DNA a reporter gene undergoing site-specific DNA damage and homologous repair. In this paper, we provide information on the data generation, the rationale for the experiments and the type of assays used, such as cytofluorimetry and immunoblot data derived during a previous work published in Scientific Reports, describing the methylation and expression changes of a model gene (GFP) before and after formation of a double-strand break and repair by homologous-recombination or non-homologous-end-joining. These data provide: 1) a reference for the analysis of methylation polymorphism at selected loci in complex cell populations; 2) a platform and the tools to compare transcription and methylation profiles.

  16. High-coverage methylation data of a gene model before and after DNA damage and homologous repair

    PubMed Central

    Pezone, Antonio; Russo, Giusi; Tramontano, Alfonso; Florio, Ermanno; Scala, Giovanni; Landi, Rosaria; Zuchegna, Candida; Romano, Antonella; Chiariotti, Lorenzo; Muller, Mark T.; Gottesman, Max E.; Porcellini, Antonio; Avvedimento, Enrico V.

    2017-01-01

    Genome-wide methylation analysis is limited by its low coverage and the inability to detect single variants below 10%. Quantitative analysis provides accurate information on the extent of methylation of single CpG dinucleotide, but it does not measure the actual polymorphism of the methylation profiles of single molecules. To understand the polymorphism of DNA methylation and to decode the methylation signatures before and after DNA damage and repair, we have deep sequenced in bisulfite-treated DNA a reporter gene undergoing site-specific DNA damage and homologous repair. In this paper, we provide information on the data generation, the rationale for the experiments and the type of assays used, such as cytofluorimetry and immunoblot data derived during a previous work published in Scientific Reports, describing the methylation and expression changes of a model gene (GFP) before and after formation of a double-strand break and repair by homologous-recombination or non-homologous-end-joining. These data provide: 1) a reference for the analysis of methylation polymorphism at selected loci in complex cell populations; 2) a platform and the tools to compare transcription and methylation profiles. PMID:28398335

  17. A framework for scalable parameter estimation of gene circuit models using structural information.

    PubMed

    Kuwahara, Hiroyuki; Fan, Ming; Wang, Suojin; Gao, Xin

    2013-07-01

    Systematic and scalable parameter estimation is key to constructing complex gene regulatory models and ultimately to facilitating an integrative systems biology approach to quantitatively understanding the molecular mechanisms underpinning gene regulation. Here, we report a novel framework for efficient and scalable parameter estimation that focuses specifically on modeling of gene circuits. Exploiting the structure commonly found in gene circuit models, this framework decomposes a system of coupled rate equations into individual ones and efficiently integrates them separately to reconstruct the mean time evolution of the gene products. The accuracy of the parameter estimates is refined by iteratively increasing the accuracy of numerical integration using the model structure. As a case study, we applied our framework to four gene circuit models with complex dynamics based on three synthetic datasets and one time series microarray data set. We compared our framework to three state-of-the-art parameter estimation methods and found that our approach consistently and efficiently generated higher quality parameter solutions. Although many general-purpose parameter estimation methods have been applied to the modeling of gene circuits, our results suggest that more tailored approaches that exploit domain-specific information may be key to reverse engineering complex biological systems. Availability: http://sfb.kaust.edu.sa/Pages/Software.aspx. Supplementary data are available at Bioinformatics online.

  18. Gene prioritization and clustering by multi-view text mining

    PubMed Central

    2010-01-01

    Background Text mining has become a useful tool for biologists trying to understand the genetics of diseases. In particular, it can help identify the most interesting candidate genes for a disease for further experimental analysis. Many text mining approaches have been introduced, but the effectiveness of disease-gene identification varies across text mining models. Thus, incorporating more text mining models may help to obtain more refined and accurate knowledge. However, how to effectively combine these models remains a challenging question in machine learning. In particular, it is a non-trivial issue to guarantee that the integrated model performs better than the best individual model. Results We present a multi-view approach to retrieve biomedical knowledge using different controlled vocabularies. These controlled vocabularies are selected on the basis of nine well-known bio-ontologies and are applied to index the vast amounts of gene-based free-text information available in the MEDLINE repository. The text mining result specified by a vocabulary is considered as a view and the obtained multiple views are integrated by multi-source learning algorithms. We investigate the effect of integration in two fundamental computational disease gene identification tasks: gene prioritization and gene clustering. The performance of the proposed approach is systematically evaluated and compared on real benchmark data sets. In both tasks, the multi-view approach demonstrates significantly better performance than the other methods compared. Conclusions In practical research, the relevance of a specific vocabulary to the task is usually unknown. In such cases, multi-view text mining is a superior and promising strategy for text-based disease gene identification. PMID:20074336

  19. Optimization of neural network architecture using genetic programming improves detection and modeling of gene-gene interactions in studies of human diseases

    PubMed Central

    Ritchie, Marylyn D; White, Bill C; Parker, Joel S; Hahn, Lance W; Moore, Jason H

    2003-01-01

    Background Appropriate definition of neural network architecture prior to data analysis is crucial for successful data mining. This can be challenging when the underlying model of the data is unknown. The goal of this study was to determine whether optimizing neural network architecture using genetic programming as a machine learning strategy would improve the ability of neural networks to model and detect nonlinear interactions among genes in studies of common human diseases. Results Using simulated data, we show that a genetic programming optimized neural network approach is able to model gene-gene interactions as well as a traditional back propagation neural network. Furthermore, the genetic programming optimized neural network is better than the traditional back propagation neural network approach in terms of predictive ability and power to detect gene-gene interactions when non-functional polymorphisms are present. Conclusion This study suggests that a machine learning strategy for optimizing neural network architecture may be preferable to traditional trial-and-error approaches for the identification and characterization of gene-gene interactions in common, complex human diseases. PMID:12846935

  20. Directional RNA-seq reveals highly complex condition-dependent transcriptomes in E. coli K12 through accurate full-length transcripts assembling

    PubMed Central

    2013-01-01

    Background Although prokaryotic gene transcription has been studied for decades, many aspects of the process remain poorly understood. In particular, recent studies have revealed that transcriptomes in many prokaryotes are far more complex than previously thought. Genes in an operon are often alternatively and dynamically transcribed under different conditions, and a large portion of genes and intergenic regions have antisense RNA (asRNA) and non-coding RNA (ncRNA) transcripts, respectively. Ironically, similar studies have not been conducted in the model bacterium E. coli K12, thus it is unknown whether or not the bacterium possesses similarly complex transcriptomes. Furthermore, although RNA-seq has become the major method for analyzing the complexity of prokaryotic transcriptomes, it is still a challenging task to accurately assemble full-length transcripts using short RNA-seq reads. Results To fill these gaps, we have profiled the transcriptomes of E. coli K12 under different culture conditions and growth phases using a highly specific directional RNA-seq technique that can capture various types of transcripts in the bacterial cells, combined with a highly accurate and robust algorithm and tool TruHMM (http://bioinfolab.uncc.edu/TruHmm_package/) for assembling full-length transcripts. We found that 46.9 ~ 63.4% of expressed operons were utilized in their putative alternative forms, 72.23 ~ 89.54% of genes had putative asRNA transcripts and 51.37 ~ 72.74% of intergenic regions had putative ncRNA transcripts under different culture conditions and growth phases. Conclusions As has been demonstrated in many other prokaryotes, E. coli K12 also has highly complex and dynamic transcriptomes under different culture conditions and growth phases. Such complex and dynamic transcriptomes might play important roles in the physiology of the bacterium. TruHMM is a highly accurate and robust algorithm for assembling full-length transcripts in prokaryotes using directional RNA

  1. Reference gene selection for quantitative gene expression studies during biological invasions: A test on multiple genes and tissues in a model ascidian Ciona savignyi.

    PubMed

    Huang, Xuena; Gao, Yangchun; Jiang, Bei; Zhou, Zunchun; Zhan, Aibin

    2016-01-15

    As invasive species have successfully colonized a wide range of dramatically different local environments, they offer a good opportunity to study interactions between species and rapidly changing environments. Gene expression represents one of the primary and crucial mechanisms for rapid adaptation to local environments. Here, we aim to select reference genes for quantitative gene expression analysis based on quantitative Real-Time PCR (qRT-PCR) for a model invasive ascidian, Ciona savignyi. We analyzed the stability of ten candidate reference genes in three tissues (siphon, pharynx and intestine) under two key environmental stresses (temperature and salinity) in the marine realm based on three programs (geNorm, NormFinder and delta Ct method). Our results demonstrated only minor differences in stability rankings among the three methods. The use of different single reference genes might influence the data interpretation, while multiple reference genes could minimize possible errors. Therefore, reference gene combinations were recommended for different tissues - the optimal reference gene combination for siphon was RPS15 and RPL17 under temperature stress, and RPL17, UBQ and TubA under salinity treatment; for pharynx, TubB, TubA and RPL17 were the most stable genes under temperature stress, while TubB, TubA and UBQ were the best under salinity stress; for intestine, UBQ, RPS15 and RPL17 were the most reliable reference genes under both treatments. Our results suggest the necessity of selecting and testing reference genes for different tissues under varying environmental stresses. The results obtained here are expected to help reveal mechanisms of gene expression-mediated invasion success using C. savignyi as a model species. Copyright © 2015 Elsevier B.V. All rights reserved.
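
    Of the three stability measures named in this abstract, the comparative delta Ct approach is simple enough to sketch. In its usual formulation, each candidate gene is scored by the average standard deviation of its Ct differences against every other candidate across samples, and a lower score suggests a more stable reference gene. The sketch below is illustrative only: the gene labels and Ct values are invented, the function name is hypothetical, and it does not reproduce the geNorm or NormFinder calculations or any result from the study.

```python
from statistics import mean, stdev


def delta_ct_stability(ct_values):
    """Score each gene by the mean SD of its pairwise delta Ct values.

    ct_values maps a gene label to a list of Ct values, one per sample,
    with samples in the same order for every gene (at least two samples).
    """
    scores = {}
    for gene, cts in ct_values.items():
        pairwise_sds = []
        for other, other_cts in ct_values.items():
            if other == gene:
                continue
            # Delta Ct between the two genes in each sample.
            deltas = [a - b for a, b in zip(cts, other_cts)]
            pairwise_sds.append(stdev(deltas))
        scores[gene] = mean(pairwise_sds)
    return scores


# Invented Ct values for three hypothetical candidates across four samples.
example = {
    "geneA": [20.0, 21.0, 22.0, 23.0],
    "geneB": [18.0, 19.0, 20.0, 21.0],  # tracks geneA with a constant offset
    "geneC": [25.0, 22.0, 27.0, 21.0],  # varies independently of the others
}
scores = delta_ct_stability(example)
for gene in sorted(scores, key=scores.get):
    print(gene, round(scores[gene], 3))
```

    In this toy input geneA and geneB move together, so they receive the same low score, while geneC is ranked least stable.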

  2. Filtering Raw Terrestrial Laser Scanning Data for Efficient and Accurate Use in Geomorphologic Modeling

    NASA Astrophysics Data System (ADS)

    Gleason, M. J.; Pitlick, J.; Buttenfield, B. P.

    2011-12-01

    Terrestrial laser scanning (TLS) represents a new and particularly effective remote sensing technique for investigating geomorphologic processes. Unfortunately, TLS data are commonly characterized by extremely large volume, heterogeneous point distribution, and erroneous measurements, raising challenges for applied researchers. To facilitate efficient and accurate use of TLS in geomorphology, and to improve accessibility for TLS processing in commercial software environments, we are developing a filtering method for raw TLS data to: eliminate data redundancy; produce a more uniformly spaced dataset; remove erroneous measurements; and maintain the ability of the TLS dataset to accurately model terrain. Our method conducts local aggregation of raw TLS data using a 3-D search algorithm based on the geometrical expression of expected random errors in the data. This approach accounts for the estimated accuracy and precision limitations of the instruments and procedures used in data collection, thereby allowing for identification and removal of potential erroneous measurements prior to data aggregation. Initial tests of the proposed technique on a sample TLS point cloud required a modest processing time of approximately 100 minutes to reduce dataset volume over 90 percent (from 12,380,074 to 1,145,705 points). Preliminary analysis of the filtered point cloud revealed substantial improvement in homogeneity of point distribution and minimal degradation of derived terrain models. We will test the method on two independent TLS datasets collected in consecutive years along a non-vegetated reach of the North Fork Toutle River in Washington. We will evaluate the tool using various quantitative, qualitative, and statistical methods. The crux of this evaluation will include a bootstrapping analysis to test the ability of the filtered datasets to model the terrain at roughly the same accuracy as the raw datasets.

  3. Lung ultrasound accurately detects pneumothorax in a preterm newborn lamb model.

    PubMed

    Blank, Douglas A; Hooper, Stuart B; Binder-Heschl, Corinna; Kluckow, Martin; Gill, Andrew W; LaRosa, Domenic A; Inocencio, Ishmael M; Moxham, Alison; Rodgers, Karyn; Zahra, Valerie A; Davis, Peter G; Polglase, Graeme R

    2016-06-01

    Pneumothorax is a common emergency affecting extremely preterm infants. In adult studies, lung ultrasound has performed better than chest x-ray in the diagnosis of pneumothorax. The purpose of this study was to determine the efficacy of lung ultrasound (LUS) examination to detect pneumothorax using a preterm animal model. This was a prospective, observational study using newborn Border-Leicester lambs at gestational age = 126 days (equivalent to gestational age = 26 weeks in humans) receiving mechanical ventilation from birth to 2 h of life. At the conclusion of the experiment, LUS was performed; the lambs were then euthanised and a post-mortem exam was immediately performed. We used previously published ultrasound techniques to identify pneumothorax. Test characteristics of LUS to detect pneumothorax were calculated, using the post-mortem exam as the 'gold standard' test. Nine lambs (18 lungs) were examined. Four lambs had a unilateral pneumothorax, all of which were identified by LUS with no false positives. This was the first study to use post-mortem findings to test the efficacy of LUS to detect pneumothorax in a newborn animal model. Lung ultrasound accurately detected pneumothorax, verified by post-mortem exam, in premature, newborn lambs. © 2016 Paediatrics and Child Health Division (The Royal Australasian College of Physicians).

  4. Optimal Cluster Mill Pass Scheduling With an Accurate and Rapid New Strip Crown Model

    NASA Astrophysics Data System (ADS)

    Malik, Arif S.; Grandhi, Ramana V.; Zipf, Mark E.

    2007-05-01

    Besides the requirement to roll coiled sheet at high levels of productivity, the optimal pass scheduling of cluster-type reversing cold mills presents the added challenge of assigning mill parameters that facilitate the best possible strip flatness. The pressures of intense global competition, and the requirements for increasingly thinner, higher quality specialty sheet products that are more difficult to roll, continue to force metal producers to commission innovative flatness-control technologies. This means that during the on-line computerized set-up of rolling mills, the mathematical model should not only determine the minimum total number of passes and maximum rolling speed, it should simultaneously optimize the pass-schedule so that desired flatness is assured, either by manual or automated means. In many cases today, however, on-line prediction of strip crown and corresponding flatness for the complex cluster-type rolling mills is typically addressed either by trial and error, by approximate deflection models for equivalent vertical roll-stacks, or by non-physical pattern recognition style models. The abundance of the aforementioned methods is largely due to the complexity of cluster-type mill configurations and the lack of deflection models with sufficient accuracy and speed for on-line use. Without adequate assignment of the pass-schedule set-up parameters, it may be difficult or impossible to achieve the required strip flatness. In this paper, we demonstrate optimization of cluster mill pass-schedules using a new accurate and rapid strip crown model. This pass-schedule optimization includes computations of the predicted strip thickness profile to validate mathematical constraints. In contrast to many of the existing methods for on-line prediction of strip crown and flatness on cluster mills, the demonstrated method requires minimal prior tuning and no extensive training with collected mill data. To rapidly and accurately solve the multi-contact problem

  5. A Risk Stratification Model for Lung Cancer Based on Gene Coexpression Network and Deep Learning

    PubMed Central

    2018-01-01

    Risk stratification models for lung cancer based on gene expression profiles are of great interest. Instead of previous models based on individual prognostic genes, we aimed to develop a novel system-level risk stratification model for lung adenocarcinoma based on gene coexpression networks. Using multiple microarray datasets, gene coexpression network analysis was performed to identify survival-related networks. A deep learning based risk stratification model was constructed with representative genes of these networks. The model was validated in two test sets. Survival analysis was performed using the output of the model to evaluate whether it could predict patients' survival independent of clinicopathological variables. Five networks were significantly associated with patients' survival. Considering prognostic significance and representativeness, genes of the two survival-related networks were selected for input of the model. The output of the model was significantly associated with patients' survival in the training set and both test sets (p < 0.00001, p < 0.0001 and p = 0.02 for training and test sets 1 and 2, resp.). In multivariate analyses, the model was associated with patients' prognosis independent of other clinicopathological features. Our study presents a new perspective on incorporating gene coexpression networks into the gene expression signature and clinical application of deep learning in genomic data science for prognosis prediction. PMID:29581968

  6. The Mouse Solitary Odorant Receptor Gene Promoters as Models for the Study of Odorant Receptor Gene Choice

    PubMed Central

    Degl'Innocenti, Andrea

    2016-01-01

    Background In vertebrates, several anatomical regions located within the nasal cavity mediate olfaction. Among these, the main olfactory epithelium detects most conventional odorants. Olfactory sensory neurons, provided with cilia exposed to the air, detect volatile chemicals via an extremely large family of seven-transmembrane chemoreceptors named odorant receptors. Their genes are expressed in a monogenic and monoallelic fashion: a single allele of a single odorant receptor gene is transcribed in a given mature neuron, through a still uncharacterized molecular mechanism known as odorant receptor gene choice. Aim Odorant receptor genes are typically arranged in genomic clusters, but a few are isolated (we call them solitary) from the others within a region broader than 1 Mb upstream and downstream with respect to their transcript's coordinates. The study of clustered genes is problematic, because of redundancy and ambiguities in their regulatory elements: we propose to use the solitary genes as simplified models to understand odorant receptor gene choice. Procedures Here we define number and identity of the solitary genes in the mouse genome (C57BL/6J), and assess the conservation of the solitary status in some mammalian orthologs. Furthermore, we locate their putative promoters, predict their homeodomain binding sites (commonly present in the promoters of odorant receptor genes) and compare candidate promoter sequences with those of wild-caught mice. We also provide expression data from histological sections. Results In the mouse genome there are eight intact solitary genes: Olfr19 (M12), Olfr49, Olfr266, Olfr267, Olfr370, Olfr371, Olfr466, Olfr1402; five are conserved as solitary in rat. These genes are all expressed in the main olfactory epithelium of three-day-old mice. The C57BL/6J candidate promoter of Olfr370 has considerably varied compared to its wild-type counterpart. Within the putative promoter for Olfr266 a homeodomain binding site is predicted. 
As a

  7. The Mouse Solitary Odorant Receptor Gene Promoters as Models for the Study of Odorant Receptor Gene Choice.

    PubMed

    Degl'Innocenti, Andrea; Parrilla, Marta; Harr, Bettina; Teschke, Meike

    2016-01-01

    In vertebrates, several anatomical regions located within the nasal cavity mediate olfaction. Among these, the main olfactory epithelium detects most conventional odorants. Olfactory sensory neurons, provided with cilia exposed to the air, detect volatile chemicals via an extremely large family of seven-transmembrane chemoreceptors named odorant receptors. Their genes are expressed in a monogenic and monoallelic fashion: a single allele of a single odorant receptor gene is transcribed in a given mature neuron, through a still uncharacterized molecular mechanism known as odorant receptor gene choice. Odorant receptor genes are typically arranged in genomic clusters, but a few are isolated (we call them solitary) from the others within a region broader than 1 Mb upstream and downstream with respect to their transcript's coordinates. The study of clustered genes is problematic, because of redundancy and ambiguities in their regulatory elements: we propose to use the solitary genes as simplified models to understand odorant receptor gene choice. Here we define number and identity of the solitary genes in the mouse genome (C57BL/6J), and assess the conservation of the solitary status in some mammalian orthologs. Furthermore, we locate their putative promoters, predict their homeodomain binding sites (commonly present in the promoters of odorant receptor genes) and compare candidate promoter sequences with those of wild-caught mice. We also provide expression data from histological sections. In the mouse genome there are eight intact solitary genes: Olfr19 (M12), Olfr49, Olfr266, Olfr267, Olfr370, Olfr371, Olfr466, Olfr1402; five are conserved as solitary in rat. These genes are all expressed in the main olfactory epithelium of three-day-old mice. The C57BL/6J candidate promoter of Olfr370 has considerably varied compared to its wild-type counterpart. Within the putative promoter for Olfr266 a homeodomain binding site is predicted. 
As a whole, our findings favor Olfr266

  8. Accurate 3d Textured Models of Vessels for the Improvement of the Educational Tools of a Museum

    NASA Astrophysics Data System (ADS)

    Soile, S.; Adam, K.; Ioannidis, C.; Georgopoulos, A.

    2013-02-01

    Besides the demonstration of the findings, modern museums organize educational programs which aim at experience and knowledge sharing combined with entertainment rather than at pure learning. Toward that effort, 2D and 3D digital representations are gradually replacing the traditional recording of the findings through photos or drawings. The present paper refers to a project that aims to create 3D textured models of two lekythoi that are exhibited in the National Archaeological Museum of Athens in Greece; on the surfaces of these lekythoi scenes of the adventures of Odysseus are depicted. The project is expected to support the production of an educational movie and some other relevant interactive educational programs for the museum. The creation of accurate developments of the paintings and of accurate 3D models is the basis for the visualization of the adventures of the mythical hero. The data collection was made by using a structured light scanner consisting of two machine vision cameras that are used for the determination of geometry of the object, a high resolution camera for the recording of the texture, and a DLP projector. The creation of the final accurate 3D textured model is a complicated and tiring procedure which includes the collection of geometric data, the creation of the surface, the noise filtering, the merging of individual surfaces, the creation of a c-mesh, the creation of the UV map, the provision of the texture and, finally, the general processing of the 3D textured object. For a better result, a combination of commercial and in-house software made for the automation of various steps of the procedure was used. The results derived from the above procedure were especially satisfactory in terms of accuracy and quality of the model. However, the procedure proved to be time consuming, while the use of various software packages presumes the services of a specialist.

  9. Candidate innate immune system gene expression in the ecological model Daphnia

    PubMed Central

    Decaestecker, Ellen; Labbé, Pierrick; Ellegaard, Kirsten; Allen, Judith E.; Little, Tom J.

    2011-01-01

    The last ten years have witnessed increasing interest in host–pathogen interactions involving invertebrate hosts. The invertebrate innate immune system is now relatively well characterised, but in a limited range of genetic model organisms and under a limited number of conditions. Immune systems have been little studied under real-world scenarios of environmental variation and parasitism. Thus, we have investigated expression of candidate innate immune system genes in the water flea Daphnia, a model organism for ecological genetics, and whose capacity for clonal reproduction facilitates an exceptionally rigorous control of exposure dose or the study of responses at many time points. A unique characteristic of the particular Daphnia clones and pathogen strain combinations used presently is that they have been shown to be involved in specific host–pathogen coevolutionary interactions in the wild. We choose five genes, which are strong candidates to be involved in Daphnia–pathogen interactions, given that they have been shown to code for immune effectors in related organisms. Differential expression of these genes was quantified by qRT-PCR following exposure to the bacterial pathogen Pasteuria ramosa. Constitutive expression levels differed between host genotypes, and some genes appeared to show correlated expression. However, none of the genes appeared to show a major modification of expression level in response to Pasteuria exposure. By applying knowledge from related genetic model organisms (e.g. Drosophila) to models for the study of evolutionary ecology and coevolution (i.e. Daphnia), the candidate gene approach is temptingly efficient. However, our results show that detection of only weak patterns is likely if one chooses target genes for study based on previously identified genome sequences by comparison to homologues from other related organisms. Future work on the Daphnia–Pasteuria system will need to balance a candidate gene approach with more

  10. Candidate innate immune system gene expression in the ecological model Daphnia.

    PubMed

    Decaestecker, Ellen; Labbé, Pierrick; Ellegaard, Kirsten; Allen, Judith E; Little, Tom J

    2011-10-01

    The last ten years have witnessed increasing interest in host-pathogen interactions involving invertebrate hosts. The invertebrate innate immune system is now relatively well characterised, but in a limited range of genetic model organisms and under a limited number of conditions. Immune systems have been little studied under real-world scenarios of environmental variation and parasitism. Thus, we have investigated expression of candidate innate immune system genes in the water flea Daphnia, a model organism for ecological genetics, and whose capacity for clonal reproduction facilitates an exceptionally rigorous control of exposure dose or the study of responses at many time points. A unique characteristic of the particular Daphnia clones and pathogen strain combinations used presently is that they have been shown to be involved in specific host-pathogen coevolutionary interactions in the wild. We choose five genes, which are strong candidates to be involved in Daphnia-pathogen interactions, given that they have been shown to code for immune effectors in related organisms. Differential expression of these genes was quantified by qRT-PCR following exposure to the bacterial pathogen Pasteuria ramosa. Constitutive expression levels differed between host genotypes, and some genes appeared to show correlated expression. However, none of the genes appeared to show a major modification of expression level in response to Pasteuria exposure. By applying knowledge from related genetic model organisms (e.g. Drosophila) to models for the study of evolutionary ecology and coevolution (i.e. Daphnia), the candidate gene approach is temptingly efficient. However, our results show that detection of only weak patterns is likely if one chooses target genes for study based on previously identified genome sequences by comparison to homologues from other related organisms. 
Future work on the Daphnia-Pasteuria system will need to balance a candidate gene approach with more comprehensive

  11. Evaluating a common semi-mechanistic mathematical model of gene-regulatory networks

    PubMed Central

    2015-01-01

    Modeling and simulation of gene-regulatory networks (GRNs) has become an important aspect of modern systems biology investigations into mechanisms underlying gene regulation. A key challenge in this area is the automated inference (reverse-engineering) of dynamic, mechanistic GRN models from gene expression time-course data. Common mathematical formalisms for representing such models capture two aspects simultaneously within a single parameter: (1) Whether or not a gene is regulated, and if so, the type of regulator (activator or repressor), and (2) the strength of influence of the regulator (if any) on the target or effector gene. To accommodate both roles, "generous" boundaries or limits for possible values of this parameter are commonly allowed in the reverse-engineering process. This approach has several important drawbacks. First, in the absence of good guidelines, there is no consensus on what limits are reasonable. Second, because the limits may vary greatly among different reverse-engineering experiments, the concrete values obtained for the models may differ considerably, and thus it is difficult to compare models. Third, if high values are chosen as limits, the search space of the model inference process becomes very large, adding unnecessary computational load to the already complex reverse-engineering process. In this study, we demonstrate that restricting the limits to the [−1, +1] interval is sufficient to represent the essential features of GRN systems and offers a reduction of the search space without loss of quality in the resulting models. To show this, we have carried out reverse-engineering studies on data generated from artificial GRN systems and on data experimentally determined from real GRN systems. PMID:26356485

  12. Quantitative gene-gene and gene-environment mapping for leaf shape variation using tree-based models.

    PubMed

    Fu, Guifang; Dai, Xiaotian; Symanzik, Jürgen; Bushman, Shaun

    2017-01-01

    Leaf shape traits have long been a focus of many disciplines, but the complex genetic and environmental interactive mechanisms regulating leaf shape variation have not yet been investigated in detail. The question of the respective roles of genes and environment and how they interact to modulate leaf shape is a thorny evolutionary problem, and sophisticated methodology is needed to address it. In this study, we investigated a framework-level approach that inputs shape image photographs and genetic and environmental data, and then outputs the relative importance ranks of all variables after integrating shape feature extraction, dimension reduction, and tree-based statistical models. The power of the proposed framework was confirmed by simulation and a Populus szechuanica var. tibetica data set. This new methodology resulted in the detection of novel shape characteristics, and also confirmed some previous findings. The quantitative modeling of a combination of polygenetic, plastic, epistatic, and gene-environment interactive effects, as investigated in this study, will improve the discernment of quantitative leaf shape characteristics, and the methods are ready to be applied to other leaf morphology data sets. Unlike the majority of approaches in the quantitative leaf shape literature, this framework-level approach is data-driven, without assuming any pre-known shape attributes, landmarks, or model structures. © 2016 The Authors. New Phytologist © 2016 New Phytologist Trust.

  13. Drug-gene modeling in pediatric T-cell acute lymphoblastic leukemia highlights importance of 6-mercaptopurine for outcome.

    PubMed

    Beesley, Alex H; Firth, Martin J; Anderson, Denise; Samuels, Amy L; Ford, Jette; Kees, Ursula R

    2013-05-01

    Patients relapsing with T-cell acute lymphoblastic leukemia (T-ALL) face a dismal outcome. The aim of this study was to identify new markers of drug resistance and clinical response in T-ALL. We measured gene expression and drug sensitivity in 15 pediatric T-ALL cell lines to find signatures predictive of resistance to 10 agents used in therapy. These were used to generate a model for outcome prediction in patient cohorts using microarray data from diagnosis specimens. In three independent T-ALL cohorts, the 10-drug model was able to accurately identify patient outcome, indicating that the in vitro-derived drug-gene profiles were clinically relevant. Importantly, predictions of outcome within each cohort were linked to distinct drugs, suggesting that different mechanisms contribute to relapse. Sulfite oxidase (SUOX) expression and the drug-transporter ABCC1 (MRP1) were linked to thiopurine sensitivity, suggesting novel pathways for targeting resistance. This study advances our understanding of drug resistance in T-ALL and provides new markers for patient stratification. The results suggest potential benefit from the earlier use of 6-mercaptopurine in T-ALL therapy or the development of adjuvants that may sensitize blasts to this drug. The methodology developed in this study could be applied to other cancers to achieve patient stratification at the time of diagnosis.

  14. Evolution dynamics of a model for gene duplication under adaptive conflict

    NASA Astrophysics Data System (ADS)

    Ancliff, Mark; Park, Jeong-Man

    2014-06-01

    We present and solve the dynamics of a model for gene duplication showing escape from adaptive conflict. We use a Crow-Kimura quasispecies model of evolution where the fitness landscape is a function of Hamming distances from two reference sequences, which are assumed to optimize two different gene functions, to describe the dynamics of a mixed population of individuals with single and double copies of a pleiotropic gene. The evolution equations are solved through a spin coherent state path integral, and we find two phases: one is an escape from an adaptive conflict phase, where each copy of a duplicated gene evolves toward subfunctionalization, and the other is a duplication loss of function phase, where one copy maintains its pleiotropic form and the other copy undergoes neutral mutation. The phase is determined by a competition between the fitness benefits of subfunctionalization and the greater mutational load associated with maintaining two gene copies. In the escape phase, we find a dynamics of an initial population of single gene sequences only which escape adaptive conflict through gene duplication and find that there are two time regimes: until a time t* single gene sequences dominate, and after t* double gene sequences outgrow single gene sequences. The time t* is identified as the time necessary for subfunctionalization to evolve and spread throughout the double gene sequences, and we show that there is an optimum mutation rate which minimizes this time scale.
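
    The fitness landscape described above depends on Hamming distances from two reference sequences. The sketch below shows that ingredient only. The linear form of `toy_fitness` is a hypothetical choice made for illustration and is not the fitness function of the paper.

```python
def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def toy_fitness(seq, ref1, ref2):
    """Hypothetical fitness: mean fraction of sites matching the two references."""
    n = len(seq)
    return 1 - (hamming(seq, ref1) + hamming(seq, ref2)) / (2 * n)


ref1, ref2 = "0000", "1111"  # two references optimizing different functions
print(hamming("0011", ref1), hamming("0011", ref2))  # 2 2
print(toy_fitness("0011", ref1, ref2))  # 0.5
```

    In this toy setting a single sequence cannot be close to both references at once, which is the kind of conflict that two gene copies could escape by each approaching one reference.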

  15. A Simple Iterative Model Accurately Captures Complex Trapline Formation by Bumblebees Across Spatial Scales and Flower Arrangements

    PubMed Central

    Reynolds, Andrew M.; Lihoreau, Mathieu; Chittka, Lars

    2013-01-01

    Pollinating bees develop foraging circuits (traplines) to visit multiple flowers in a manner that minimizes overall travel distance, a task analogous to the travelling salesman problem. We report on an in-depth exploration of an iterative improvement heuristic model of bumblebee traplining previously found to accurately replicate the establishment of stable routes by bees between flowers distributed over several hectares. The critical test for a model is its predictive power for empirical data for which the model has not been specifically developed, and here the model is shown to be consistent with observations from different research groups made at several spatial scales and using multiple configurations of flowers. We refine the model to account for the spatial search strategy of bees exploring their environment, and test several previously unexplored predictions. We find that the model predicts accurately 1) the increasing propensity of bees to optimize their foraging routes with increasing spatial scale; 2) that bees cannot establish stable optimal traplines for all spatial configurations of rewarding flowers; 3) the observed trade-off between travel distance and prioritization of high-reward sites (with a slight modification of the model); 4) the temporal pattern with which bees acquire approximate solutions to travelling salesman-like problems over several dozen foraging bouts; 5) the instability of visitation schedules in some spatial configurations of flowers; 6) the observation that in some flower arrays, bees' visitation schedules are highly individually different; 7) the searching behaviour that leads to efficient location of flowers and routes between them. Our model constitutes a robust theoretical platform to generate novel hypotheses and refine our understanding about how small-brained insects develop a representation of space and use it to navigate in complex and dynamic environments. PMID:23505353
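
    A generic iterative improvement heuristic for a travelling-salesman-like circuit can be sketched as follows. This is not the authors' bumblebee model; the swap move, the number of bouts, and all names are assumptions chosen to keep the example short. Each "bout" tries one random change to the visiting order and keeps it only if the circuit gets shorter.

```python
import math
import random


def route_length(route, coords):
    """Length of the closed circuit visiting coords in the given order."""
    n = len(route)
    return sum(
        math.dist(coords[route[i]], coords[route[(i + 1) % n]]) for i in range(n)
    )


def improve_route(coords, bouts=200, seed=0):
    """Hypothetical heuristic: swap two flowers, keep the swap if shorter."""
    rng = random.Random(seed)
    route = list(range(len(coords)))  # index 0 is the nest and stays first
    best = route_length(route, coords)
    for _ in range(bouts):
        i, j = rng.sample(range(1, len(route)), 2)
        trial = route[:]
        trial[i], trial[j] = trial[j], trial[i]
        length = route_length(trial, coords)
        if length < best:  # accept only strict improvements
            route, best = trial, length
    return route, best


nest_and_flowers = [(0, 0), (1, 1), (1, 0), (0, 1)]
print(improve_route(nest_and_flowers))
```

    Because only improvements are accepted, the circuit length never increases across bouts, but the heuristic can stall in a locally optimal route.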

  16. Hindered rotor models with variable kinetic functions for accurate thermodynamic and kinetic predictions

    NASA Astrophysics Data System (ADS)

    Reinisch, Guillaume; Leyssale, Jean-Marc; Vignoles, Gérard L.

    2010-10-01

    We present an extension of some popular hindered rotor (HR) models, namely, the one-dimensional HR (1DHR) and the degenerated two-dimensional HR (d2DHR) models, allowing for a simple and accurate treatment of internal rotations. This extension, based on the use of a variable kinetic function in the Hamiltonian instead of a constant reduced moment of inertia, is particularly suitable in the case of rocking/wagging motions involved in dissociation or atom transfer reactions. The variable kinetic function is first introduced in the framework of a classical 1DHR model. Then, an effective temperature and potential dependent constant is proposed in the cases of quantum 1DHR and classical d2DHR models. These methods are finally applied to the atom transfer reaction SiCl3+BCl3→SiCl4+BCl2. We show, for this particular case, that a proper accounting of internal rotations greatly improves the accuracy of thermodynamic and kinetic predictions. Moreover, our results confirm (i) that using a suitably defined kinetic function appears to be well suited to such problems; (ii) that the separability assumption of independent rotations seems justified; and (iii) that a quantum mechanical treatment is not a substantial improvement with respect to a classical one.

  17. Parallel kinetic Monte Carlo simulation framework incorporating accurate models of adsorbate lateral interactions

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nielsen, Jens; D’Avezac, Mayeul; Hetherington, James

    2013-12-14

    Ab initio kinetic Monte Carlo (KMC) simulations have been successfully applied for over two decades to elucidate the underlying physico-chemical phenomena on the surfaces of heterogeneous catalysts. These simulations necessitate detailed knowledge of the kinetics of elementary reactions constituting the reaction mechanism, and the energetics of the species participating in the chemistry. The information about the energetics is encoded in the formation energies of gas and surface-bound species, and the lateral interactions between adsorbates on the catalytic surface, which can be modeled at different levels of detail. The majority of previous works accounted for only pairwise-additive first nearest-neighbor interactions. More recently, cluster-expansion Hamiltonians incorporating long-range interactions and many-body terms have been used for detailed estimations of catalytic rate [C. Wu, D. J. Schmidt, C. Wolverton, and W. F. Schneider, J. Catal. 286, 88 (2012)]. In view of the increasing interest in accurate predictions of catalytic performance, there is a need for general-purpose KMC approaches incorporating detailed cluster expansion models for the adlayer energetics. We have addressed this need by building on the previously introduced graph-theoretical KMC framework, and we have developed Zacros, a FORTRAN2003 KMC package for simulating catalytic chemistries. To tackle the high computational cost in the presence of long-range interactions we introduce parallelization with OpenMP. We further benchmark our framework by simulating a KMC analogue of the NO oxidation system established by Schneider and co-workers [J. Catal. 286, 88 (2012)]. We show that taking into account only first nearest-neighbor interactions may lead to large errors in the prediction of the catalytic rate, whereas for accurate estimates thereof, one needs to include long-range terms in the cluster expansion.

  18. Accurate Treatment of Collisions and Water-Delivery in Models of Terrestrial Planet Formation

    NASA Astrophysics Data System (ADS)

    Haghighipour, Nader; Maindl, Thomas; Schaefer, Christoph

    2017-10-01

    It is widely accepted that collisions among solid bodies, ignited by their interactions with planetary embryos, are the key process in the formation of terrestrial planets and the transport of volatiles and chemical compounds to their accretion zones. Unfortunately, due to computational complexities, these collisions are often treated in a rudimentary way. Impacts are considered to be perfectly inelastic and volatiles are considered to be fully transferred from one object to the other. This perfect-merging assumption has profound effects on the mass and composition of final planetary bodies as it grossly overestimates the masses of these objects and the amounts of volatiles and chemical elements transferred to them. It also entirely neglects collisional loss of volatiles (e.g., water) and draws an unrealistic connection between these properties and the chemical structure of the protoplanetary disk (i.e., the location of their original carriers). We have developed a new and comprehensive methodology to simulate growth of embryos to planetary bodies where we use a combination of SPH and N-body codes to accurately model collisions as well as the transport/transfer of chemical compounds. Our methodology accounts for the loss of volatiles (e.g., ice sublimation) during the orbital evolution of their carriers and accurately tracks their transfer from one body to another. Results of our simulations show that traditional N-body modeling of terrestrial planet formation overestimates the mass and water content of the final planets by over 60%, implying not only that the amount of water it suggests is far from realistic, but also that small planets such as Mars can form in these simulations when collisions are treated properly. We will present details of our methodology and discuss its implications for terrestrial planet formation and water delivery to Earth.

  19. Exchange-Hole Dipole Dispersion Model for Accurate Energy Ranking in Molecular Crystal Structure Prediction.

    PubMed

    Whittleton, Sarah R; Otero-de-la-Roza, A; Johnson, Erin R

    2017-02-14

    Accurate energy ranking is a key facet to the problem of first-principles crystal-structure prediction (CSP) of molecular crystals. This work presents a systematic assessment of B86bPBE-XDM, a semilocal density functional combined with the exchange-hole dipole moment (XDM) dispersion model, for energy ranking using 14 compounds from the first five CSP blind tests. Specifically, the set of crystals studied comprises 11 rigid, planar compounds and 3 co-crystals. The experimental structure was correctly identified as the lowest in lattice energy for 12 of the 14 total crystals. One of the exceptions is 4-hydroxythiophene-2-carbonitrile, for which the experimental structure was correctly identified once a quasi-harmonic estimate of the vibrational free-energy contribution was included, evidencing the occasional importance of thermal corrections for accurate energy ranking. The other exception is an organic salt, where charge-transfer error (also called delocalization error) is expected to cause the base density functional to be unreliable. Provided the choice of base density functional is appropriate and an estimate of temperature effects is used, XDM-corrected density-functional theory is highly reliable for the energetic ranking of competing crystal structures.

  20. Reverse engineering gene regulatory networks from measurement with missing values.

    PubMed

    Ogundijo, Oyetunji E; Elmas, Abdulkadir; Wang, Xiaodong

    2016-12-01

    Gene expression time series data are usually in the form of high-dimensional arrays. Unfortunately, the data may sometimes contain missing values: for either the expression values of some genes at some time points or the entire expression values of a single time point or some sets of consecutive time points. This significantly affects the performance of many algorithms for gene expression analysis that take as input the complete matrix of gene expression measurement. For instance, previous works have shown that gene regulatory interactions can be estimated from the complete matrix of gene expression measurement. Yet, to date, few algorithms have been proposed for the inference of gene regulatory networks from gene expression data with missing values. We describe a nonlinear dynamic stochastic model for the evolution of gene expression. The model captures the structural, dynamical, and nonlinear nature of the underlying biomolecular systems. We present point-based Gaussian approximation (PBGA) filters for joint state and parameter estimation of the system with one-step or two-step missing measurements. The PBGA filters use Gaussian approximation and various quadrature rules, such as the unscented transform (UT), the third-degree cubature rule and the central difference rule for computing the related posteriors. The proposed algorithm is evaluated with satisfying results for synthetic networks, in silico networks released as a part of the DREAM project, and the real biological network, the in vivo reverse engineering and modeling assessment (IRMA) network of yeast Saccharomyces cerevisiae. PBGA filters are proposed to elucidate the underlying gene regulatory network (GRN) from time series gene expression data that contain missing values. In our state-space model, we proposed a measurement model that incorporates the effect of the missing data points into the sequential algorithm.
This approach produces a better inference of the model parameters and hence

  1. Development of Gene Centric Modeling for Nutrient Cycling

    EPA Pesticide Factsheets

    opportunity to participate in the development of a gene-centric model to help predict potential changes in the biogeochemistry of aquatic ecosystems that may arise from anthropogenic stressors and management decisions

  2. The Zebrafish Model Organism Database: new support for human disease models, mutation details, gene expression phenotypes and searching

    PubMed Central

    Howe, Douglas G.; Bradford, Yvonne M.; Eagle, Anne; Fashena, David; Frazer, Ken; Kalita, Patrick; Mani, Prita; Martin, Ryan; Moxon, Sierra Taylor; Paddock, Holly; Pich, Christian; Ramachandran, Sridhar; Ruzicka, Leyla; Schaper, Kevin; Shao, Xiang; Singer, Amy; Toro, Sabrina; Van Slyke, Ceri; Westerfield, Monte

    2017-01-01

    The Zebrafish Model Organism Database (ZFIN; http://zfin.org) is the central resource for zebrafish (Danio rerio) genetic, genomic, phenotypic and developmental data. ZFIN curators provide expert manual curation and integration of comprehensive data involving zebrafish genes, mutants, transgenic constructs and lines, phenotypes, genotypes, gene expressions, morpholinos, TALENs, CRISPRs, antibodies, anatomical structures, models of human disease and publications. We integrate curated, directly submitted, and collaboratively generated data, making these available to the zebrafish research community. Among the vertebrate model organisms, zebrafish are superbly suited for rapid generation of sequence-targeted mutant lines, characterization of phenotypes including gene expression patterns, and generation of human disease models. The recent rapid adoption of zebrafish as human disease models is making management of these data particularly important to both the research and clinical communities. Here, we describe recent enhancements to ZFIN including use of the zebrafish experimental conditions ontology, ‘Fish’ records in the ZFIN database, support for gene expression phenotypes, models of human disease, mutation details at the DNA, RNA and protein levels, and updates to the ZFIN single box search. PMID:27899582

  3. A stochastic model for optimizing composite predictors based on gene expression profiles.

    PubMed

    Ramanathan, Murali

    2003-07-01

    This project was done to develop a mathematical model for optimizing composite predictors based on gene expression profiles from DNA arrays and proteomics. The problem was amenable to a formulation and solution analogous to the portfolio optimization problem in mathematical finance: it requires the optimization of a quadratic function subject to linear constraints. The performance of the approach was compared to that of neighborhood analysis using a data set containing cDNA array-derived gene expression profiles from 14 multiple sclerosis patients receiving intramuscular interferon-beta1a. The Markowitz portfolio model predicts that the covariance between genes can be exploited to construct an efficient composite. The model predicts that a composite is not needed for maximizing the mean value of a treatment effect: only a single gene is needed, but the usefulness of the effect measure may be compromised by high variability. The model optimized the composite to yield the highest mean for a given level of variability or the least variability for a given mean level. The choices that meet these optimization criteria lie on a curve in the plot of composite mean vs. composite variability, referred to as the "efficient frontier." When a composite is constructed using the model, it outperforms the composite constructed using the neighborhood analysis method. The Markowitz portfolio model may find potential applications in constructing composite biomarkers and in the pharmacogenomic modeling of treatment effects derived from gene expression endpoints.
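    The mean-versus-variability trade-off described in this abstract can be illustrated with a two-gene composite. The following is only a minimal sketch, not the author's implementation: the function names and all numbers (means, variances, covariance) are hypothetical, and the composite is assumed to be a weighted sum w*g1 + (1-w)*g2.

```python
def composite_stats(w, m1, m2, v1, v2, c12):
    """Mean and variance of the composite w*g1 + (1-w)*g2."""
    mean = w * m1 + (1.0 - w) * m2
    var = w * w * v1 + (1.0 - w) ** 2 * v2 + 2.0 * w * (1.0 - w) * c12
    return mean, var


def min_variance_weight(v1, v2, c12):
    """Weight on gene 1 that minimizes the composite variance."""
    return (v2 - c12) / (v1 + v2 - 2.0 * c12)


# Hypothetical treatment-effect statistics for two genes (illustrative only).
m1, m2 = 2.0, 1.0            # mean effects
v1, v2, c12 = 4.0, 1.0, -0.5  # variances and covariance

w_star = min_variance_weight(v1, v2, c12)
mean_star, var_star = composite_stats(w_star, m1, m2, v1, v2, c12)

# Gene 1 alone (w = 1) has the highest mean but also the highest variance.
mean_single, var_single = composite_stats(1.0, m1, m2, v1, v2, c12)
```

    With these made-up numbers the single gene gives the largest mean, while the negatively correlated pair gives a composite whose variance is below that of either gene, which is the covariance effect the abstract refers to.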

  4. Gene × Environment Interactions in Schizophrenia: Evidence from Genetic Mouse Models

    PubMed Central

    Marr, Julia; Bock, Gavin; Desbonnet, Lieve; Waddington, John

    2016-01-01

    The study of gene × environment, as well as epistatic interactions in schizophrenia, has provided important insight into the complex etiopathologic basis of schizophrenia. It has also increased our understanding of the role of susceptibility genes in the disorder and is an important consideration as we seek to translate genetic advances into novel antipsychotic treatment targets. This review summarises data arising from research involving the modelling of gene × environment interactions in schizophrenia using preclinical genetic models. Evidence for synergistic effects on the expression of schizophrenia-relevant endophenotypes will be discussed. It is proposed that valid and multifactorial preclinical models are important tools for identifying critical areas, as well as underlying mechanisms, of convergence of genetic and environmental risk factors, and their interaction in schizophrenia. PMID:27725886

  5. miRDeep2 accurately identifies known and hundreds of novel microRNA genes in seven animal clades.

    PubMed

    Friedländer, Marc R; Mackowiak, Sebastian D; Li, Na; Chen, Wei; Rajewsky, Nikolaus

    2012-01-01

    microRNAs (miRNAs) are a large class of small non-coding RNAs which post-transcriptionally regulate the expression of a large fraction of all animal genes and are important in a wide range of biological processes. Recent advances in high-throughput sequencing allow miRNA detection at unprecedented sensitivity, but the computational task of accurately identifying the miRNAs in the background of sequenced RNAs remains challenging. For this purpose, we have designed miRDeep2, a substantially improved algorithm which identifies canonical and non-canonical miRNAs such as those derived from transposable elements and informs on high-confidence candidates that are detected in multiple independent samples. Analyzing data from seven animal species representing the major animal clades, miRDeep2 identified miRNAs with an accuracy of 98.6-99.9% and reported hundreds of novel miRNAs. To test the accuracy of miRDeep2, we knocked down the miRNA biogenesis pathway in a human cell line and sequenced small RNAs before and after. The vast majority of the >100 novel miRNAs expressed in this cell line were indeed specifically downregulated, validating most miRDeep2 predictions. Last, a new miRNA expression profiling routine, low time and memory usage and user-friendly interactive graphic output can make miRDeep2 useful to a wide range of researchers.

  6. Gene editing tools: state-of-the-art and the road ahead for the model and non-model fishes.

    PubMed

    Barman, Hirak Kumar; Rasal, Kiran Dashrath; Chakrapani, Vemulawada; Ninawe, A S; Vengayil, Doyil T; Asrafuzzaman, Syed; Sundaray, Jitendra K; Jayasankar, Pallipuram

    2017-10-01

    Advancements in DNA sequencing technologies and computational biology have revolutionized genome/transcriptome sequencing of non-model fishes at an affordable cost. This has led to a paradigm shift with regard to our heightened understanding of structure-functional relationships of genes at a global level, from model animals/fishes to non-model large animals/fishes. Whole genome/transcriptome sequencing technologies were supplemented with a series of discoveries in gene editing tools, which are being used to modify genes at pre-determined positions using programmable nucleases to explore their respective in vivo functions. For a long time, targeted gene disruption experiments were mostly restricted to embryonic stem cells; advances in gene editing technologies such as zinc finger nucleases, transcription activator-like effector nucleases and CRISPR (clustered regularly interspaced short palindromic repeats)/CRISPR-associated nucleases have facilitated targeted genetic modifications beyond stem cells to a wide range of somatic cell lines across species from laboratory animals to farmed animals/fishes. In this review, we discuss the use of different gene editing tools and the strategic implications in fish species for basic and applied biology research.

  7. Mutual information and the fidelity of response of gene regulatory models

    NASA Astrophysics Data System (ADS)

    Tabbaa, Omar P.; Jayaprakash, C.

    2014-08-01

    We investigate cellular response to extracellular signals by using information theory techniques motivated by recent experiments. We present results for the steady state of the following gene regulatory models found in both prokaryotic and eukaryotic cells: a linear transcription-translation model and a positive or negative auto-regulatory model. We calculate both the information capacity and the mutual information exactly for simple models and approximately for the full model. We find that (1) small changes in mutual information can lead to potentially important changes in cellular response and (2) there are diminishing returns in the fidelity of response as the mutual information increases. We calculate the information capacity using Gillespie simulations of a model for the TNF-α-NF-κB network and find good agreement with the measured value for an experimental realization of this network. Our results provide a quantitative understanding of the differences in cellular response when comparing experimentally measured mutual information values of different gene regulatory models. Our calculations demonstrate that Gillespie simulations can be used to compute the mutual information of more complex gene regulatory models, providing a potentially useful tool in synthetic biology.
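    For readers unfamiliar with the quantity discussed in this abstract, the mutual information between a discrete signal X and a discrete response Y can be computed directly from their joint probability table. This is a minimal sketch of the textbook definition only; the function name and the example tables are illustrative assumptions and are not taken from the paper, which works with gene regulatory models and Gillespie simulations.

```python
import math


def mutual_information(joint):
    """I(X;Y) in bits for a joint probability table joint[x][y]."""
    px = [sum(row) for row in joint]          # marginal of the signal
    py = [sum(col) for col in zip(*joint)]    # marginal of the response
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0.0:  # zero-probability cells contribute nothing
                mi += p * math.log2(p / (px[i] * py[j]))
    return mi


# A response that copies a binary signal perfectly carries 1 bit.
perfect = mutual_information([[0.5, 0.0], [0.0, 0.5]])
# A response independent of the signal carries 0 bits.
independent = mutual_information([[0.25, 0.25], [0.25, 0.25]])
```

    In practice the joint table would be estimated from simulated or measured signal-response pairs, which introduces binning and sampling bias that this sketch does not address.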

  8. Creating and validating cis-regulatory maps of tissue-specific gene expression regulation

    PubMed Central

    O'Connor, Timothy R.; Bailey, Timothy L.

    2014-01-01

    Predicting which genomic regions control the transcription of a given gene is a challenge. We present a novel computational approach for creating and validating maps that associate genomic regions (cis-regulatory modules–CRMs) with genes. The method infers regulatory relationships that explain gene expression observed in a test tissue using widely available genomic data for ‘other’ tissues. To predict the regulatory targets of a CRM, we use cross-tissue correlation between histone modifications present at the CRM and expression at genes within 1 Mbp of it. To validate cis-regulatory maps, we show that they yield more accurate models of gene expression than carefully constructed control maps. These gene expression models predict observed gene expression from transcription factor binding in the CRMs linked to that gene. We show that our maps are able to identify long-range regulatory interactions and improve substantially over maps linking genes and CRMs based on either the control maps or a ‘nearest neighbor’ heuristic. Our results also show that it is essential to include CRMs predicted in multiple tissues during map-building, that H3K27ac is the most informative histone modification, and that CAGE is the most informative measure of gene expression for creating cis-regulatory maps. PMID:25200088

  9. Gene therapy in large animal models of human cardiovascular genetic disease.

    PubMed

    Sleeper, Meg M; Bish, Lawrence T; Sweeney, H Lee

    2009-01-01

    Several naturally occurring animal models for human genetic heart diseases offer an excellent opportunity to evaluate potential novel therapies, including gene therapy. Some of these diseases--especially those that result in a structural defect during development (e.g., patent ductus arteriosus, pulmonic stenosis)--would likely be difficult to treat with a therapeutic gene transfer approach. However, the ability to transduce a significant proportion of the myocardial cells should make the various forms of inherited cardiomyopathy amenable to a therapeutic gene transfer approach. Adeno-associated virus may be the ideal vector for cardiac gene therapy since its low immunogenicity allows for stable transgene expression, a crucial factor when considering treatment of a chronic disease. Cardiomyopathies are a major cause of morbidity and mortality in both children and adults, and large animal models are available for the major forms of inherited cardiomyopathy (dilated cardiomyopathy, hypertrophic cardiomyopathy, and arrhythmogenic right ventricular cardiomyopathy). One of these animal models, juvenile dilated cardiomyopathy of Portuguese water dogs, offers an effective means to assess the efficacy of therapeutic gene transfer to alter the course of cardiomyopathy and heart failure. Correction of the abnormal metabolic processes that occur with heart failure (e.g., calcium metabolism, apoptosis) could normalize diseased myocardial function. Gene therapy may offer a promising new approach for the treatment of cardiac disease in both veterinary and human clinical settings.

  10. Magnetic gaps in organic tri-radicals: From a simple model to accurate estimates.

    PubMed

    Barone, Vincenzo; Cacelli, Ivo; Ferretti, Alessandro; Prampolini, Giacomo

    2017-03-14

    The calculation of the energy gap between the magnetic states of organic poly-radicals still represents a challenging playground for quantum chemistry, and high-level techniques are required to obtain accurate estimates. On these grounds, the aim of the present study is twofold. On the one hand, it shows that, thanks to recent algorithmic and technical improvements, we are able to compute reliable quantum mechanical results for systems of current fundamental and technological interest. On the other hand, proper parameterization of a simple Hubbard Hamiltonian allows for a sound rationalization of magnetic gaps in terms of basic physical effects, unraveling the role played by electron delocalization, Coulomb repulsion, and effective exchange in tuning the magnetic character of the ground state. As case studies, we have chosen three prototypical organic tri-radicals, namely, 1,3,5-trimethylenebenzene, 1,3,5-tridehydrobenzene, and 1,2,3-tridehydrobenzene, which differ in either geometric or electronic structure. After discussing the differences among the three species and their consequences on the magnetic properties in terms of the simple model mentioned above, accurate and reliable values for the energy gap between the lowest quartet and doublet states are computed by means of the so-called difference dedicated configuration interaction (DDCI) technique, and the final results are discussed and compared to both available experimental and computational estimates.

  11. Anatomically accurate individual face modeling.

    PubMed

    Zhang, Yu; Prakash, Edmond C; Sung, Eric

    2003-01-01

    This paper presents a new 3D face model of a specific person constructed from the anatomical perspective. By exploiting the laser range data, a 3D facial mesh precisely representing the skin geometry is reconstructed. Based on the geometric facial mesh, we develop a deformable multi-layer skin model. It takes into account the nonlinear stress-strain relationship and dynamically simulates the non-homogenous behavior of the real skin. The face model also incorporates a set of anatomically-motivated facial muscle actuators and underlying skull structure. Lagrangian mechanics governs the facial motion dynamics, dictating the dynamic deformation of facial skin in response to the muscle contraction.

  12. Automated Protocol for Large-Scale Modeling of Gene Expression Data.

    PubMed

    Hall, Michelle Lynn; Calkins, David; Sherman, Woody

    2016-11-28

    With the continued rise of phenotypic- and genotypic-based screening projects, computational methods to analyze, process, and ultimately make predictions in this field take on growing importance. Here we show how automated machine learning workflows can produce models that are predictive of differential gene expression as a function of a compound structure using data from A673 cells as a proof of principle. In particular, we present predictive models with an average accuracy of greater than 70% across a highly diverse ∼1000 gene expression profile. In contrast to the usual in silico design paradigm, where one interrogates a particular target-based response, this work opens the opportunity for virtual screening and lead optimization for desired multitarget gene expression profiles.

  13. Fast and Accurate Prediction of Numerical Relativity Waveforms from Binary Black Hole Coalescences Using Surrogate Models

    NASA Astrophysics Data System (ADS)

    Blackman, Jonathan; Field, Scott E.; Galley, Chad R.; Szilágyi, Béla; Scheel, Mark A.; Tiglio, Manuel; Hemberger, Daniel A.

    2015-09-01

    Simulating a binary black hole coalescence by solving Einstein's equations is computationally expensive, requiring days to months of supercomputing time. Using reduced order modeling techniques, we construct an accurate surrogate model, which is evaluated in a millisecond to a second, for numerical relativity (NR) waveforms from nonspinning binary black hole coalescences with mass ratios in [1, 10] and durations corresponding to about 15 orbits before merger. We assess the model's uncertainty and show that our modeling strategy predicts NR waveforms not used for the surrogate's training with errors nearly as small as the numerical error of the NR code. Our model includes all spherical-harmonic _{-2}Y_{ℓm} waveform modes resolved by the NR code up to ℓ=8. We compare our surrogate model to effective one body waveforms from 50 M⊙ to 300 M⊙ for advanced LIGO detectors and find that the surrogate is always more faithful (by at least an order of magnitude in most cases).

  14. Fast and Accurate Prediction of Numerical Relativity Waveforms from Binary Black Hole Coalescences Using Surrogate Models.

    PubMed

    Blackman, Jonathan; Field, Scott E; Galley, Chad R; Szilágyi, Béla; Scheel, Mark A; Tiglio, Manuel; Hemberger, Daniel A

    2015-09-18

    Simulating a binary black hole coalescence by solving Einstein's equations is computationally expensive, requiring days to months of supercomputing time. Using reduced order modeling techniques, we construct an accurate surrogate model, which is evaluated in a millisecond to a second, for numerical relativity (NR) waveforms from nonspinning binary black hole coalescences with mass ratios in [1, 10] and durations corresponding to about 15 orbits before merger. We assess the model's uncertainty and show that our modeling strategy predicts NR waveforms not used for the surrogate's training with errors nearly as small as the numerical error of the NR code. Our model includes all spherical-harmonic _{-2}Y_{ℓm} waveform modes resolved by the NR code up to ℓ=8. We compare our surrogate model to effective one body waveforms from 50M_{⊙} to 300M_{⊙} for advanced LIGO detectors and find that the surrogate is always more faithful (by at least an order of magnitude in most cases).

  15. RNA degradation and models for post-transcriptional gene-silencing.

    PubMed

    Meins, F

    2000-06-01

    Post-transcriptional gene silencing (PTGS) is a form of stable but potentially reversible epigenetic modification, which frequently occurs in transgenic plants. The interaction in trans of genes with similar transcribed sequences results in sequence-specific degradation of RNAs derived from the genes involved. Highly expressed single-copy loci, transcribed inverted repeats, and poorly transcribed complex loci can act as sources of signals that trigger PTGS. In some cases, mobile, sequence-specific silencing signals can move from cell to cell or even over long distances in the plant. Several current models hold that silencing signals are 'aberrant' RNAs (aRNA), which differ in some way from normal mRNAs. The most likely candidates are small antisense RNAs (asRNA) and double-stranded RNAs (dsRNA). Direct evidence that these or other aRNAs found in silent tissues can induce PTGS is still lacking. Most current models assume that silencing signals interact with target RNAs in a sequence-specific fashion. This results in degradation, usually in the cytoplasm, by exonucleolytic as well as endonucleolytic pathways, which are not necessarily PTGS-specific. Biochemical-switch models hold that the silent state is maintained by a positive auto-regulatory loop. One possibility is that concentrations of hypothetical silencing signals above a critical threshold trigger their own production by self-replication, by degradation of target RNAs, or by a combination of both mechanisms. These models can account for the stability, reversibility and multiplicity of silent states; the strong influence of transcription rate of target genes on the incidence and stability of silencing, and the amplification and systemic propagation of motile silencing signals.

  16. Religion, fertility and genes: a dual inheritance model.

    PubMed

    Rowthorn, Robert

    2011-08-22

    Religious people nowadays have more children on average than their secular counterparts. This paper uses a simple model to explore the evolutionary implications of this difference. It assumes that fertility is determined entirely by culture, whereas subjective predisposition towards religion is influenced by genetic endowment. People who carry a certain 'religiosity' gene are more likely than average to become or remain religious. The paper considers the effect of religious defections and exogamy on the religious and genetic composition of society. Defections reduce the ultimate share of the population with religious allegiance and slow down the spread of the religiosity gene. However, provided the fertility differential persists, and people with a religious allegiance mate mainly with people like themselves, the religiosity gene will eventually predominate despite a high rate of defection. This is an example of 'cultural hitch-hiking', whereby a gene spreads because it is able to hitch a ride with a high-fitness cultural practice. The theoretical arguments are supported by numerical simulations.
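    The qualitative claims in this abstract can be reproduced with a very small recursion. The sketch below is not the paper's model: it assumes haploid inheritance, complete endogamy, one-way defection from religious to secular, and hypothetical parameter values. The state holds population shares of religious carriers, religious non-carriers, secular carriers and secular non-carriers.

```python
def step(state, f_r, f_s, d_g, d_n):
    """Advance one generation; returns normalized population shares."""
    r_g, r_n, s_g, s_n = state
    # Reproduction: children inherit the gene and the parental allegiance.
    r_g, r_n = f_r * r_g, f_r * r_n
    s_g, s_n = f_s * s_g, f_s * s_n
    # Defection: carriers leave religion less often than non-carriers.
    new = (r_g * (1.0 - d_g), r_n * (1.0 - d_n),
           s_g + r_g * d_g, s_n + r_n * d_n)
    total = sum(new)
    return tuple(x / total for x in new)


def simulate(state, generations, **params):
    for _ in range(generations):
        state = step(state, **params)
    return state


# Hypothetical parameters: religious fertility 1.5x secular, 20% of carriers
# and 40% of non-carriers defect each generation.
start = (0.01, 0.19, 0.01, 0.79)
final = simulate(start, 200, f_r=1.5, f_s=1.0, d_g=0.2, d_n=0.4)
gene_frequency = final[0] + final[2]
religious_share = final[0] + final[1]
```

    Under these assumptions the gene spreads to near fixation even though a large share of each religious generation defects, while the religious share of the population settles well below one, in line with the abstract's description of cultural hitch-hiking.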

  17. Utilizing Adjoint-Based Error Estimates for Surrogate Models to Accurately Predict Probabilities of Events

    DOE PAGES

    Butler, Troy; Wildey, Timothy

    2018-01-01

    In this study, we develop a procedure to utilize error estimates for samples of a surrogate model to compute robust upper and lower bounds on estimates of probabilities of events. We show that these error estimates can also be used in an adaptive algorithm to simultaneously reduce the computational cost and increase the accuracy in estimating probabilities of events using computationally expensive high-fidelity models. Specifically, we introduce the notion of reliability of a sample of a surrogate model, and we prove that utilizing the surrogate model for the reliable samples and the high-fidelity model for the unreliable samples gives precisely the same estimate of the probability of the output event as would be obtained by evaluation of the original model for each sample. The adaptive algorithm uses the additional evaluations of the high-fidelity model for the unreliable samples to locally improve the surrogate model near the limit state, which significantly reduces the number of high-fidelity model evaluations as the limit state is resolved. Numerical results based on a recently developed adjoint-based approach for estimating the error in samples of a surrogate are provided to demonstrate (1) the robustness of the bounds on the probability of an event, and (2) that the adaptive enhancement algorithm provides a more accurate estimate of the probability of the QoI event than standard response surface approximation methods at a lower computational cost.
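    The idea of mixing surrogate and high-fidelity evaluations can be sketched as follows. This is an illustration under stated assumptions, not the authors' algorithm: the event is taken to be "output exceeds a threshold", a sample is called reliable when the surrogate value is farther from the threshold than a guaranteed error bound, and the model, surrogate and bound used here are hypothetical toy functions. The adaptive surrogate refinement described in the abstract is not included.

```python
import math


def event_probability(samples, surrogate, error_bound, high_fidelity, threshold):
    """Estimate P(output > threshold), calling the expensive model only
    for samples whose surrogate value is within the error bound of the
    threshold (the "unreliable" samples in this sketch)."""
    hits = 0
    hf_calls = 0
    for x in samples:
        s = surrogate(x)
        if abs(s - threshold) > error_bound(x):
            inside = s > threshold          # the bound cannot flip this
        else:
            hf_calls += 1
            inside = high_fidelity(x) > threshold
        hits += inside
    return hits / len(samples), hf_calls


# Toy stand-ins: an "expensive" model and a surrogate with known error <= 0.05.
high_fidelity = lambda x: x * x
surrogate = lambda x: x * x + 0.05 * math.sin(10.0 * x)
error_bound = lambda x: 0.05

samples = [i / 100 for i in range(101)]
estimate, hf_calls = event_probability(
    samples, surrogate, error_bound, high_fidelity, threshold=0.5)
reference = sum(1 for x in samples if high_fidelity(x) > 0.5) / len(samples)
```

    Because the error bound is assumed to hold for every sample, the mixed estimate matches the all-high-fidelity reference exactly while calling the expensive model only near the limit state.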

  18. Utilizing Adjoint-Based Error Estimates for Surrogate Models to Accurately Predict Probabilities of Events

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Butler, Troy; Wildey, Timothy

    In this study, we develop a procedure to utilize error estimates for samples of a surrogate model to compute robust upper and lower bounds on estimates of probabilities of events. We show that these error estimates can also be used in an adaptive algorithm to simultaneously reduce the computational cost and increase the accuracy in estimating probabilities of events using computationally expensive high-fidelity models. Specifically, we introduce the notion of reliability of a sample of a surrogate model, and we prove that utilizing the surrogate model for the reliable samples and the high-fidelity model for the unreliable samples gives precisely the same estimate of the probability of the output event as would be obtained by evaluation of the original model for each sample. The adaptive algorithm uses the additional evaluations of the high-fidelity model for the unreliable samples to locally improve the surrogate model near the limit state, which significantly reduces the number of high-fidelity model evaluations as the limit state is resolved. Numerical results based on a recently developed adjoint-based approach for estimating the error in samples of a surrogate are provided to demonstrate (1) the robustness of the bounds on the probability of an event, and (2) that the adaptive enhancement algorithm provides a more accurate estimate of the probability of the QoI event than standard response surface approximation methods at a lower computational cost.

  19. Ferret and Pig Models of Cystic Fibrosis: Prospects and Promise for Gene Therapy

    PubMed Central

    Yan, Ziying; Stewart, Zoe A.; Sinn, Patrick L.; Olsen, John C.; Hu, Jim; McCray, Paul B.

    2015-01-01

    Large animal models of genetic diseases are rapidly becoming integral to biomedical research as technologies to manipulate the mammalian genome improve. The creation of cystic fibrosis (CF) ferrets and pigs is an example of such progress in animal modeling, with the disease phenotypes in the ferret and pig models more reflective of human CF disease than mouse models. The ferret and pig CF models also provide unique opportunities to develop and assess the effectiveness of gene and cell therapies to treat affected organs. In this review, we examine the organ disease phenotypes in these new CF models and the opportunities to test gene therapies at various stages of disease progression in affected organs. We then discuss the progress in developing recombinant replication-defective adenoviral, adeno-associated viral, and lentiviral vectors to target genes to the lung and pancreas in ferrets and pigs, the two most affected organs in CF. Through this review, we hope to convey the potential of these new animal models for developing CF gene and cell therapies. PMID:25675143

  20. Ferret and pig models of cystic fibrosis: prospects and promise for gene therapy.

    PubMed

    Yan, Ziying; Stewart, Zoe A; Sinn, Patrick L; Olsen, John C; Hu, Jim; McCray, Paul B; Engelhardt, John F

    2015-03-01

    Large animal models of genetic diseases are rapidly becoming integral to biomedical research as technologies to manipulate the mammalian genome improve. The creation of cystic fibrosis (CF) ferrets and pigs is an example of such progress in animal modeling, with the disease phenotypes in the ferret and pig models more reflective of human CF disease than mouse models. The ferret and pig CF models also provide unique opportunities to develop and assess the effectiveness of gene and cell therapies to treat affected organs. In this review, we examine the organ disease phenotypes in these new CF models and the opportunities to test gene therapies at various stages of disease progression in affected organs. We then discuss the progress in developing recombinant replication-defective adenoviral, adeno-associated viral, and lentiviral vectors to target genes to the lung and pancreas in ferrets and pigs, the two most affected organs in CF. Through this review, we hope to convey the potential of these new animal models for developing CF gene and cell therapies.

  1. Using protein-protein interactions for refining gene networks estimated from microarray data by Bayesian networks.

    PubMed

    Nariai, N; Kim, S; Imoto, S; Miyano, S

    2004-01-01

    We propose a statistical method to estimate gene networks from DNA microarray data and protein-protein interactions. Because physical interactions between proteins or multiprotein complexes are likely to regulate biological processes, using only mRNA expression data is not sufficient for estimating a gene network accurately. Our method adds knowledge about protein-protein interactions to the estimation method of gene networks under a Bayesian statistical framework. In the estimated gene network, a protein complex is modeled as a virtual node based on principal component analysis. We show the effectiveness of the proposed method through the analysis of Saccharomyces cerevisiae cell cycle data. The proposed method improves the accuracy of the estimated gene networks, and successfully identifies some biological facts.

  2. Myostatin propeptide gene delivery by gene gun ameliorates muscle atrophy in a rat model of botulinum toxin-induced nerve denervation.

    PubMed

    Tsai, Sen-Wei; Tung, Yu-Tang; Chen, Hsiao-Ling; Yang, Shang-Hsun; Liu, Chia-Yi; Lu, Michelle; Pai, Hui-Jing; Lin, Chi-Chen; Chen, Chuan-Mu

    2016-02-01

    Muscle atrophy is a common symptom after nerve denervation. Myostatin propeptide, a precursor of myostatin, has been documented to improve muscle growth. However, the mechanism underlying the muscle atrophy attenuation effects of myostatin propeptide in muscles and the changes in gene expression are not well established. We investigated the possible underlying mechanisms associated with myostatin propeptide gene delivery by gene gun in a rat denervation muscle atrophy model, and evaluated gene expression patterns. In a rat botulinum toxin-induced nerve denervation muscle atrophy model, we evaluated the effects of wild-type (MSPP) and mutant-type (MSPPD75A) myostatin propeptide gene delivery, and observed changes in gene activation associated with the neuromuscular junction, muscle and nerve. Muscle mass and muscle fiber size were moderately increased in myostatin propeptide treated muscles (p<0.05). Enhanced gene expression of the muscle regulatory factors, neurite outgrowth factors (IGF-1, GAP43) and acetylcholine receptors was also observed. Our results demonstrate that myostatin propeptide gene delivery, especially the mutant-type MSPPD75A, attenuates muscle atrophy through myogenic regulatory factors and acetylcholine receptor regulation. Our data suggest that myostatin propeptide gene therapy may be a promising treatment for nerve denervation induced muscle atrophy. Copyright © 2016 Elsevier Inc. All rights reserved.

  3. Induced Pluripotency and Gene Editing in Disease Modelling: Perspectives and Challenges

    PubMed Central

    Seah, Yu Fen Samantha; EL Farran, Chadi A.; Warrier, Tushar; Xu, Jian; Loh, Yuin-Han

    2015-01-01

    Embryonic stem cells (ESCs) are chiefly characterized by their ability to self-renew and to differentiate into any cell type derived from the three main germ layers. It was demonstrated that somatic cells could be reprogrammed to form induced pluripotent stem cells (iPSCs) via various strategies. Gene editing is a technique that can be used to make targeted changes in the genome, and the efficiency of this process has been significantly enhanced by recent advancements. The use of engineered endonucleases, such as homing endonucleases, zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs) and Cas9 of the CRISPR system, has significantly enhanced the efficiency of gene editing. The combination of somatic cell reprogramming with gene editing enables us to model human diseases in vitro, in a manner considered superior to animal disease models. In this review, we discuss the various strategies of reprogramming and gene targeting with an emphasis on the current advancements and challenges of using these techniques to model human diseases. PMID:26633382

  4. Modelling the influence of parental effects on gene-network evolution.

    PubMed

    Odorico, Andreas; Rünneburger, Estelle; Le Rouzic, Arnaud

    2018-05-01

    Understanding the importance of nongenetic heredity in the evolutionary process is a major topic in modern evolutionary biology. We modified a classical gene-network model by allowing parental transmission of gene expression and studied its evolutionary properties through individual-based simulations. We identified ontogenetic time (i.e. the time gene networks have to stabilize before being submitted to natural selection) as a crucial factor in determining the evolutionary impact of this phenotypic inheritance. Indeed, fast-developing organisms display enhanced adaptation and greater robustness to mutations when evolving in presence of nongenetic inheritance (NGI). In contrast, in our model, long development reduces the influence of the inherited state of the gene network. NGI thus had a negligible effect on the evolution of gene networks when the speed at which transcription levels reach equilibrium is not constrained. Nevertheless, simulations show that intergenerational transmission of the gene-network state negatively affects the evolution of robustness to environmental disturbances for either fast- or slow-developing organisms. Therefore, these results suggest that the evolutionary consequences of NGI might not be sought only in the way species respond to selection, but also on the evolution of emergent properties (such as environmental and genetic canalization) in complex genetic architectures. © 2018 European Society For Evolutionary Biology. Journal of Evolutionary Biology.

  5. Large Animal Models for Foamy Virus Vector Gene Therapy

    PubMed Central

    Trobridge, Grant D.; Horn, Peter A.; Beard, Brian C.; Kiem, Hans-Peter

    2012-01-01

    Foamy virus (FV) vectors have shown great promise for hematopoietic stem cell (HSC) gene therapy. Their ability to efficiently deliver transgenes to multi-lineage long-term repopulating cells in large animal models suggests they will be effective for several human hematopoietic diseases. Here, we review FV vector studies in large animal models, including the use of FV vectors with the mutant O6-methylguanine-DNA methyltransferase, MGMTP140K to increase the number of genetically modified cells after transplantation. In these studies, FV vectors have mediated efficient gene transfer to polyclonal repopulating cells using short ex vivo transduction protocols designed to minimize the negative effects of ex vivo culture on stem cell engraftment. In this regard, FV vectors appear superior to gammaretroviral vectors, which require longer ex vivo culture to effect efficient transduction. FV vectors have also compared favorably with lentiviral vectors when directly compared in the dog model. FV vectors have corrected leukocyte adhesion deficiency and pyruvate kinase deficiency in the dog large animal model. FV vectors also appear safer than gammaretroviral vectors based on a reduced frequency of integrants near promoters and also near proto-oncogenes in canine repopulating cells. Together, these studies suggest that FV vectors should be highly effective for several human hematopoietic diseases, including those that will require relatively high percentages of gene-modified cells to achieve clinical benefit. PMID:23223198

  6. Analysis of functional importance of binding sites in the Drosophila gap gene network model.

    PubMed

    Kozlov, Konstantin; Gursky, Vitaly V; Kulakovskiy, Ivan V; Dymova, Arina; Samsonova, Maria

    2015-01-01

    The statistical thermodynamics based approach provides a promising framework for construction of the genotype-phenotype map in many biological systems. Among important aspects of a good model connecting the DNA sequence information with that of a molecular phenotype (gene expression) is the selection of regulatory interactions and relevant transcription factor binding sites. As the model may predict different levels of the functional importance of specific binding sites in different genomic and regulatory contexts, it is essential to formulate and study such models under different modeling assumptions. We elaborate a two-layer model for the Drosophila gap gene network and include in the model a combined set of transcription factor binding sites and a concentration-dependent regulatory interaction between the gap genes hunchback and Kruppel. We show that the new variants of the model are more consistent in terms of gene expression predictions for various genetic constructs in comparison to previous work. We quantify the functional importance of binding sites by calculating their impact on gene expression in the model and calculate how these impacts correlate across all sites under different modeling assumptions. The assumption about the dual interaction between hb and Kr leads to the most consistent modeling results, but, on the other hand, may obscure the existence of indirect interactions between binding sites in regulatory regions of distinct genes. The analysis confirms the previously formulated regulation concept of many weak binding sites working in concert. The model predicts a more or less uniform distribution of functionally important binding sites over the sets of experimentally characterized regulatory modules and other open chromatin domains.

  7. BASIC: A Simple and Accurate Modular DNA Assembly Method.

    PubMed

    Storch, Marko; Casini, Arturo; Mackrow, Ben; Ellis, Tom; Baldwin, Geoff S

    2017-01-01

    Biopart Assembly Standard for Idempotent Cloning (BASIC) is a simple, accurate, and robust DNA assembly method. The method is based on linker-mediated DNA assembly and provides highly accurate DNA assembly with 99 % correct assemblies for four parts and 90 % correct assemblies for seven parts [1]. The BASIC standard defines a single entry vector for all parts flanked by the same prefix and suffix sequences and its idempotent nature means that the assembled construct is returned in the same format. Once a part has been adapted into the BASIC format it can be placed at any position within a BASIC assembly without the need for reformatting. This allows laboratories to grow comprehensive and universal part libraries and to share them efficiently. The modularity within the BASIC framework is further extended by the possibility of encoding ribosomal binding sites (RBS) and peptide linker sequences directly on the linkers used for assembly. This makes BASIC a highly versatile library construction method for combinatorial part assembly including the construction of promoter, RBS, gene variant, and protein-tag libraries. In comparison with other DNA assembly standards and methods, BASIC offers a simple robust protocol; it relies on a single entry vector, provides for easy hierarchical assembly, and is highly accurate for up to seven parts per assembly round [2].

  8. Discovery of rare protein-coding genes in model methylotroph Methylobacterium extorquens AM1.

    PubMed

    Kumar, Dhirendra; Mondal, Anupam Kumar; Yadav, Amit Kumar; Dash, Debasis

    2014-12-01

    Proteogenomics involves the use of MS to refine annotation of protein-coding genes and discover genes in a genome. We carried out comprehensive proteogenomic analysis of Methylobacterium extorquens AM1 (ME-AM1) from publicly available proteomics data with the aim of improving annotation for methylotrophs, organisms capable of surviving on reduced carbon compounds such as methanol. Besides identifying 2482 (50%) proteins, 29 new genes were discovered and 66 annotated gene models were revised in the ME-AM1 genome. One such novel gene is identified with 75 peptides, lacks a homolog in other methylobacteria but has glycosyl transferase and lipopolysaccharide biosynthesis protein domains, indicating its potential role in outer membrane synthesis. Many novel genes are present only in ME-AM1 among methylobacteria. Distant homologs of these genes in unrelated taxonomic classes and the low GC-content of a few genes suggest lateral gene transfer as a potential mode of their origin. Annotations of methylotrophy-related genes were also improved by the discovery of a short gene in the methylotrophy gene island and by redefining a gene important for pyrroloquinoline quinone synthesis, essential for methylotrophy. The combined use of proteogenomics and rigorous bioinformatics analysis greatly enhanced the annotation of protein-coding genes in the genome of the model methylotroph ME-AM1. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  9. Defining the optimal animal model for translational research using gene set enrichment analysis.

    PubMed

    Weidner, Christopher; Steinfath, Matthias; Opitz, Elisa; Oelgeschläger, Michael; Schönfelder, Gilbert

    2016-08-01

    The mouse is the main model organism used to study the functions of human genes because most biological processes in the mouse are highly conserved in humans. Recent reports that compared identical transcriptomic datasets of human inflammatory diseases with datasets from mouse models using traditional gene-to-gene comparison techniques resulted in contradictory conclusions regarding the relevance of animal models for translational research. To reduce susceptibility to biased interpretation, all genes of interest for the biological question under investigation should be considered. Thus, standardized approaches for systematic data analysis are needed. We analyzed the same datasets using gene set enrichment analysis focusing on pathways assigned to inflammatory processes in either humans or mice. The analyses revealed a moderate overlap between all human and mouse datasets, with average positive and negative predictive values of 48 and 57% significant correlations. Subgroups of the septic mouse models (i.e., Staphylococcus aureus injection) correlated very well with most human studies. These findings support the applicability of targeted strategies to identify the optimal animal model and protocol to improve the success of translational research. © 2016 The Authors. Published under the terms of the CC BY 4.0 license.

  10. Modelling gene expression profiles related to prostate tumor progression using binary states

    PubMed Central

    2013-01-01

    Background Cancer is a complex disease commonly characterized by the disrupted activity of several cancer-related genes such as oncogenes and tumor-suppressor genes. Previous studies suggest that the process of tumor progression to malignancy is dynamic and can be traced by changes in gene expression. Despite the enormous efforts made for differential expression detection and biomarker discovery, few methods have been designed to model the gene expression level to tumor stage during malignancy progression. Such models could help us understand the dynamics and simplify or reveal the complexity of tumor progression. Methods We have modeled an on-off state of gene activation per sample then per stage to select gene expression profiles associated to tumor progression. The selection is guided by statistical significance of profiles based on random permutated datasets. Results We show that our method identifies expected profiles corresponding to oncogenes and tumor suppressor genes in a prostate tumor progression dataset. Comparisons with other methods support our findings and indicate that a considerable proportion of significant profiles is not found by other statistical tests commonly used to detect differential expression between tumor stages nor found by other tailored methods. Ontology and pathway analysis concurred with these findings. Conclusions Results suggest that our methodology may be a valuable tool to study tumor malignancy progression, which might reveal novel cancer therapies. PMID:23721350

  11. Exact protein distributions for stochastic models of gene expression using partitioning of Poisson processes.

    PubMed

    Pendar, Hodjat; Platini, Thierry; Kulkarni, Rahul V

    2013-04-01

    Stochasticity in gene expression gives rise to fluctuations in protein levels across a population of genetically identical cells. Such fluctuations can lead to phenotypic variation in clonal populations; hence, there is considerable interest in quantifying noise in gene expression using stochastic models. However, obtaining exact analytical results for protein distributions has been an intractable task for all but the simplest models. Here, we invoke the partitioning property of Poisson processes to develop a mapping that significantly simplifies the analysis of stochastic models of gene expression. The mapping leads to exact protein distributions using results for mRNA distributions in models with promoter-based regulation. Using this approach, we derive exact analytical results for steady-state and time-dependent distributions for the basic two-stage model of gene expression. Furthermore, we show how the mapping leads to exact protein distributions for extensions of the basic model that include the effects of posttranscriptional and posttranslational regulation. The approach developed in this work is widely applicable and can contribute to a quantitative understanding of stochasticity in gene expression and its regulation.
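
    As a rough illustration of the basic two-stage model named in the abstract above, the following sketch simulates it with the standard Gillespie algorithm. It is not the authors' analytical mapping; the function name and the rate symbols (k_m, g_m, k_p, g_p) are assumptions made here for illustration. mRNA is assumed to be produced at a constant rate k_m and degraded at rate g_m per molecule, and protein to be translated at rate k_p per mRNA and degraded at rate g_p per molecule.

```python
import random


def simulate_two_stage(k_m, g_m, k_p, g_p, t_end, seed=0):
    """Gillespie simulation of a two-stage (mRNA -> protein) model.

    Returns the time-averaged mRNA and protein copy numbers over [0, t_end].
    All rate names are illustrative assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    t, m, p = 0.0, 0, 0
    m_area = p_area = 0.0  # integrals of m(t) and p(t) over time
    while True:
        # Propensities: mRNA birth, mRNA death, protein birth, protein death.
        rates = (k_m, g_m * m, k_p * m, g_p * p)
        total = sum(rates)
        dt = rng.expovariate(total)  # waiting time to the next reaction
        if t + dt > t_end:
            m_area += m * (t_end - t)
            p_area += p * (t_end - t)
            break
        m_area += m * dt
        p_area += p * dt
        t += dt
        # Pick which reaction fires, with probability proportional to its rate.
        u = rng.random() * total
        if u < rates[0]:
            m += 1
        elif u < rates[0] + rates[1]:
            m -= 1
        elif u < rates[0] + rates[1] + rates[2]:
            p += 1
        else:
            p -= 1
    return m_area / t_end, p_area / t_end


if __name__ == "__main__":
    mean_m, mean_p = simulate_two_stage(2.0, 1.0, 5.0, 0.5, t_end=2000.0, seed=1)
    print(mean_m, mean_p)
```

    For these linear reactions the steady-state means are k_m/g_m for mRNA and k_m*k_p/(g_m*g_p) for protein, so a long run should give time averages close to those values.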

  12. Exact protein distributions for stochastic models of gene expression using partitioning of Poisson processes

    NASA Astrophysics Data System (ADS)

    Pendar, Hodjat; Platini, Thierry; Kulkarni, Rahul V.

    2013-04-01

    Stochasticity in gene expression gives rise to fluctuations in protein levels across a population of genetically identical cells. Such fluctuations can lead to phenotypic variation in clonal populations; hence, there is considerable interest in quantifying noise in gene expression using stochastic models. However, obtaining exact analytical results for protein distributions has been an intractable task for all but the simplest models. Here, we invoke the partitioning property of Poisson processes to develop a mapping that significantly simplifies the analysis of stochastic models of gene expression. The mapping leads to exact protein distributions using results for mRNA distributions in models with promoter-based regulation. Using this approach, we derive exact analytical results for steady-state and time-dependent distributions for the basic two-stage model of gene expression. Furthermore, we show how the mapping leads to exact protein distributions for extensions of the basic model that include the effects of posttranscriptional and posttranslational regulation. The approach developed in this work is widely applicable and can contribute to a quantitative understanding of stochasticity in gene expression and its regulation.

  13. Three Approaches to Modeling Gene-Environment Interactions in Longitudinal Family Data: Gene-Smoking Interactions in Blood Pressure.

    PubMed

    Basson, Jacob; Sung, Yun Ju; de Las Fuentes, Lisa; Schwander, Karen L; Vazquez, Ana; Rao, Dabeeru C

    2016-01-01

    Blood pressure (BP) has been shown to be substantially heritable, yet identified genetic variants explain only a small fraction of the heritability. Gene-smoking interactions have detected novel BP loci in cross-sectional family data. Longitudinal family data are available and have additional promise to identify BP loci. However, this type of data presents unique analysis challenges. Although several methods for analyzing longitudinal family data are available, which method is the most appropriate and under what conditions has not been fully studied. Using data from three clinic visits from the Framingham Heart Study, we performed association analysis accounting for gene-smoking interactions in BP at 31,203 markers on chromosome 22. We evaluated three different modeling frameworks: generalized estimating equations (GEE), hierarchical linear modeling, and pedigree-based mixed modeling. The three models performed somewhat comparably, with multiple overlaps in the most strongly associated loci from each model. Loci with the greatest significance were more strongly supported in the longitudinal analyses than in any of the component single-visit analyses. The pedigree-based mixed model was more conservative, with less inflation in the variant main effect and greater deflation in the gene-smoking interactions. The GEE, but not the other two models, resulted in substantial inflation in the tail of the distribution when variants with minor allele frequency <1% were included in the analysis. The choice of analysis method should depend on the model and the structure and complexity of the familial and longitudinal data. © 2015 WILEY PERIODICALS, INC.

  14. THE EFFECTS OF VIDEO MODELING WITH VOICEOVER INSTRUCTION ON ACCURATE IMPLEMENTATION OF DISCRETE-TRIAL INSTRUCTION

    PubMed Central

    Vladescu, Jason C; Carroll, Regina; Paden, Amber; Kodak, Tiffany M

    2012-01-01

    The present study replicates and extends previous research on the use of video modeling (VM) with voiceover instruction to train staff to implement discrete-trial instruction (DTI). After staff trainees reached the mastery criterion when teaching an adult confederate with VM, they taught a child with a developmental disability using DTI. The results showed that the staff trainees' accurate implementation of DTI remained high, and both child participants acquired new skills. These findings provide additional support that VM may be an effective method to train staff members to conduct DTI. PMID:22844149

  15. The effects of video modeling with voiceover instruction on accurate implementation of discrete-trial instruction.

    PubMed

    Vladescu, Jason C; Carroll, Regina; Paden, Amber; Kodak, Tiffany M

    2012-01-01

    The present study replicates and extends previous research on the use of video modeling (VM) with voiceover instruction to train staff to implement discrete-trial instruction (DTI). After staff trainees reached the mastery criterion when teaching an adult confederate with VM, they taught a child with a developmental disability using DTI. The results showed that the staff trainees' accurate implementation of DTI remained high, and both child participants acquired new skills. These findings provide additional support that VM may be an effective method to train staff members to conduct DTI.

  16. Clustering of time-course gene expression profiles using normal mixture models with autoregressive random effects

    PubMed Central

    2012-01-01

    Background Time-course gene expression data such as yeast cell cycle data may be periodically expressed. To cluster such data, currently used Fourier series approximations of periodic gene expressions have been found not to be sufficiently adequate to model the complexity of the time-course data, partly due to their ignoring the dependence between the expression measurements over time and the correlation among gene expression profiles. We further investigate the advantages and limitations of available models in the literature and propose a new mixture model with autoregressive random effects of the first order for the clustering of time-course gene-expression profiles. Some simulations and real examples are given to demonstrate the usefulness of the proposed models. Results We illustrate the applicability of our new model using synthetic and real time-course datasets. We show that our model outperforms existing models to provide more reliable and robust clustering of time-course data. Our model provides superior results when genetic profiles are correlated. It also gives comparable results when the correlation between the gene profiles is weak. In the applications to real time-course data, relevant clusters of coregulated genes are obtained, which are supported by gene-function annotation databases. Conclusions Our new model under our extension of the EMMIX-WIRE procedure is more reliable and robust for clustering time-course data because it adopts a random effects model that allows for the correlation among observations at different time points. It postulates gene-specific random effects with an autocorrelation variance structure that models coregulation within the clusters. The developed R package is flexible in its specification of the random effects through user-input parameters that enables improved modelling and consequent clustering of time-course data. PMID:23151154
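
    To make the "autoregressive random effects of the first order" in the abstract above concrete, the sketch below builds the covariance matrix of a stationary AR(1) process across time points. The function name and parametrisation (rho for the autocorrelation, sigma2 for the innovation variance) are assumptions made here; the exact parametrisation used in the EMMIX-WIRE extension may differ.

```python
def ar1_covariance(n_times, rho, sigma2=1.0):
    """Covariance matrix of a stationary AR(1) process u_t = rho*u_{t-1} + e_t.

    e_t has variance sigma2 and |rho| < 1, so
    cov(u_s, u_t) = sigma2 / (1 - rho**2) * rho**|s - t|.
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    scale = sigma2 / (1.0 - rho ** 2)
    return [[scale * rho ** abs(i - j) for j in range(n_times)]
            for i in range(n_times)]


if __name__ == "__main__":
    for row in ar1_covariance(3, 0.5, 0.75):
        print(row)
```

    The correlation between two measurements decays geometrically with their separation in time, which is the kind of dependence that a Fourier-series fit with independent errors ignores.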

  17. Prediction of regulatory gene pairs using dynamic time warping and gene ontology.

    PubMed

    Yang, Andy C; Hsu, Hui-Huang; Lu, Ming-Da; Tseng, Vincent S; Shih, Timothy K

    2014-01-01

    Selecting informative genes is the most important task for data analysis on microarray gene expression data. In this work, we aim at identifying regulatory gene pairs from microarray gene expression data. However, microarray data often contain multiple missing expression values. Missing value imputation is thus needed before further processing for regulatory gene pairs becomes possible. We develop a novel approach to first impute missing values in microarray time series data by combining k-Nearest Neighbour (KNN), Dynamic Time Warping (DTW) and Gene Ontology (GO). After missing values are imputed, we then perform gene regulation prediction based on our proposed DTW-GO distance measurement of gene pairs. Experimental results show that our approach is more accurate when compared with existing missing value imputation methods on real microarray data sets. Furthermore, our approach can also discover more regulatory gene pairs that are known in the literature than other methods.
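
    The Dynamic Time Warping component mentioned in the abstract above can be sketched as the textbook dynamic-programming recurrence below. This is plain DTW with an absolute-difference local cost; it does not include the Gene Ontology term of the authors' DTW-GO distance or their KNN imputation step, and the function name is an assumption made here.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric sequences.

    cost[i][j] holds the minimal accumulated cost of aligning a[:i] with b[:j].
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # local cost of matching a[i-1] to b[j-1]
            cost[i][j] = d + min(cost[i - 1][j],      # a advances, b repeats
                                 cost[i][j - 1],      # b advances, a repeats
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]


if __name__ == "__main__":
    print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

    Because the alignment may repeat points, two expression profiles with the same shape but shifted or stretched in time obtain a small distance, which is why DTW suits time-series comparisons of gene pairs.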

  18. Generalization of the normal-exponential model: exploration of a more accurate parametrisation for the signal distribution on Illumina BeadArrays.

    PubMed

    Plancade, Sandra; Rozenholc, Yves; Lund, Eiliv

    2012-12-11

    Illumina BeadArray technology includes non-specific negative control features that allow a precise estimation of the background noise. As an alternative to the background subtraction proposed in BeadStudio, which leads to an important loss of information by generating negative values, a background correction method modeling the observed intensities as the sum of the exponentially distributed signal and normally distributed noise has been developed. Nevertheless, Wang and Ye (2012) display a kernel-based estimator of the signal distribution on Illumina BeadArrays and suggest that a gamma distribution would represent a better modeling of the signal density. Hence, the normal-exponential modeling may not be appropriate for Illumina data and background corrections derived from this model may lead to wrong estimation. We propose a more flexible modeling based on a gamma distributed signal and a normally distributed background noise and develop the associated background correction, implemented in the R-package NormalGamma. Our model proves to be markedly more accurate to model Illumina BeadArrays: on the one hand, it is shown on two types of Illumina BeadChips that this model offers a more correct fit of the observed intensities. On the other hand, the comparison of the operating characteristics of several background correction procedures on spike-in and on normal-gamma simulated data shows high similarities, reinforcing the validation of the normal-gamma modeling. The performances of the background corrections based on the normal-gamma and normal-exponential models are compared on two dilution data sets, through testing procedures which represent various experimental designs. Surprisingly, we observe that the implementation of a more accurate parametrisation in the model-based background correction does not increase the sensitivity. These results may be explained by the operating characteristics of the estimators: the normal-gamma background correction offers an improvement

  19. Improved methods and resources for paramecium genomics: transcription units, gene annotation and gene expression.

    PubMed

    Arnaiz, Olivier; Van Dijk, Erwin; Bétermier, Mireille; Lhuillier-Akakpo, Maoussi; de Vanssay, Augustin; Duharcourt, Sandra; Sallet, Erika; Gouzy, Jérôme; Sperling, Linda

    2017-06-26

    The 15 sibling species of the Paramecium aurelia cryptic species complex emerged after a whole genome duplication that occurred tens of millions of years ago. Given extensive knowledge of the genetics and epigenetics of Paramecium acquired over the last century, this species complex offers a uniquely powerful system to investigate the consequences of whole genome duplication in a unicellular eukaryote as well as the genetic and epigenetic mechanisms that drive speciation. High quality Paramecium gene models are important for research using this system. The major aim of the work reported here was to build an improved gene annotation pipeline for the Paramecium lineage. We generated oriented RNA-Seq transcriptome data across the sexual process of autogamy for the model species Paramecium tetraurelia. We determined, for the first time in a ciliate, candidate P. tetraurelia transcription start sites using an adapted Cap-Seq protocol. We developed TrUC, multi-threaded Perl software that in conjunction with TopHat mapping of RNA-Seq data to a reference genome, predicts transcription units for the annotation pipeline. We used EuGene software to combine annotation evidence. The high quality gene structural annotations obtained for P. tetraurelia were used as evidence to improve published annotations for 3 other Paramecium species. The RNA-Seq data were also used for differential gene expression analysis, providing a gene expression atlas that is more sensitive than the previously established microarray resource. We have developed a gene annotation pipeline tailored for the compact genomes and tiny introns of Paramecium species. A novel component of this pipeline, TrUC, predicts transcription units using Cap-Seq and oriented RNA-Seq data. TrUC could prove useful beyond Paramecium, especially in the case of high gene density. Accurate predictions of 3' and 5' UTR will be particularly valuable for studies of gene expression (e.g. nucleosome positioning, identification of cis

  20. A Comprehensive Strategy for Accurate Mutation Detection of the Highly Homologous PMS2.

    PubMed

    Li, Jianli; Dai, Hongzheng; Feng, Yanming; Tang, Jia; Chen, Stella; Tian, Xia; Gorman, Elizabeth; Schmitt, Eric S; Hansen, Terah A A; Wang, Jing; Plon, Sharon E; Zhang, Victor Wei; Wong, Lee-Jun C

    2015-09-01

    Germline mutations in the DNA mismatch repair gene PMS2 underlie the cancer susceptibility syndrome, Lynch syndrome. However, accurate molecular testing of PMS2 is complicated by a large number of highly homologous sequences. To establish a comprehensive approach for mutation detection of PMS2, we have designed a strategy combining targeted capture next-generation sequencing (NGS), multiplex ligation-dependent probe amplification, and long-range PCR followed by NGS to simultaneously detect point mutations and copy number changes of PMS2. Exonic deletions (E2 to E9, E5 to E9, E8, E10, E14, and E1 to E15), duplications (E11 to E12), and a nonsense mutation, p.S22*, were identified. Traditional multiplex ligation-dependent probe amplification and Sanger sequencing approaches cannot differentiate the origin of the exonic deletions in the 3' region when PMS2 and PMS2CL share identical sequences as a result of gene conversion. Our approach allows unambiguous identification of mutations in the active gene with a straightforward long-range-PCR/NGS method. Breakpoint analysis of multiple samples revealed that recurrent exon 14 deletions are mediated by homologous Alu sequences. Our comprehensive approach provides a reliable tool for accurate molecular analysis of genes containing multiple copies of highly homologous sequences and should improve PMS2 molecular analysis for patients with Lynch syndrome. Copyright © 2015 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.

  1. Challenges for modeling global gene regulatory networks during development: insights from Drosophila.

    PubMed

    Wilczynski, Bartek; Furlong, Eileen E M

    2010-04-15

    Development is regulated by dynamic patterns of gene expression, which are orchestrated through the action of complex gene regulatory networks (GRNs). Substantial progress has been made in modeling transcriptional regulation in recent years, ranging from qualitative "coarse-grain" models operating at the gene level to very "fine-grain" quantitative models operating at the biophysical "transcription factor-DNA level". Recent advances in genome-wide studies have revealed an enormous increase in the size and complexity of GRNs. Even relatively simple developmental processes can involve hundreds of regulatory molecules, with extensive interconnectivity and cooperative regulation. This leads to an explosion in the number of regulatory functions, effectively impeding Boolean-based qualitative modeling approaches. At the same time, the lack of information on the biophysical properties for the majority of transcription factors within a global network restricts quantitative approaches. In this review, we explore the current challenges in moving from modeling medium scale well-characterized networks to more poorly characterized global networks. We suggest integrating coarse- and fine-grain approaches to model gene regulatory networks in cis. We focus on two very well-studied examples from Drosophila, which likely represent typical developmental regulatory modules across metazoans. Copyright (c) 2009 Elsevier Inc. All rights reserved.

  2. Accurately Assessing the Risk of Schizophrenia Conferred by Rare Copy-Number Variation Affecting Genes with Brain Function

    PubMed Central

    Raychaudhuri, Soumya; Korn, Joshua M.; McCarroll, Steven A.; Altshuler, David; Sklar, Pamela; Purcell, Shaun; Daly, Mark J.

    2010-01-01

    Investigators have linked rare copy number variants (CNVs) to neuropsychiatric diseases, such as schizophrenia. One hypothesis is that CNV events cause disease by affecting genes with specific brain functions. Under these circumstances, we expect that CNV events in cases should impact brain-function genes more frequently than those events in controls. Previous publications have applied “pathway” analyses to genes within neuropsychiatric case CNVs to show enrichment for brain functions. While such analyses have been suggestive, they often have not rigorously compared the rates of CNVs impacting genes with brain function in cases to controls, and therefore do not address important confounders such as the large size of brain genes and overall differences in rates and sizes of CNVs. To demonstrate the potential impact of confounders, we genotyped rare CNV events in 2,415 unaffected controls with Affymetrix 6.0; we then applied standard pathway analyses using four sets of brain-function genes and observed an apparently highly significant enrichment for each set. The enrichment is simply driven by the large size of brain-function genes. Instead, we propose a case-control statistical test, cnv-enrichment-test, to compare the rate of CNVs impacting specific gene sets in cases versus controls. With simulations, we demonstrate that cnv-enrichment-test is robust to case-control differences in CNV size, CNV rate, and systematic differences in gene size. Finally, we apply cnv-enrichment-test to rare CNV events published by the International Schizophrenia Consortium (ISC). This approach reveals nominal evidence of case-association in the neuronal-activity and the learning gene sets, but not the other two examined gene sets. The neuronal-activity genes have been associated in a separate set of schizophrenia cases and controls; however, testing in independent samples is necessary to definitively confirm this association. Our method is implemented in the PLINK software package.

  3. Computational challenges in modeling gene regulatory events.

    PubMed

    Pataskar, Abhijeet; Tiwari, Vijay K

    2016-10-19

    Cellular transcriptional programs driven by genetic and epigenetic mechanisms could be better understood by integrating "omics" data and subsequently modeling the gene-regulatory events. Toward this end, computational biology should keep pace with evolving experimental procedures and data availability. This article gives an exemplified account of the current computational challenges in molecular biology.

  4. Accurate Time-Dependent Traveling-Wave Tube Model Developed for Computational Bit-Error-Rate Testing

    NASA Technical Reports Server (NTRS)

    Kory, Carol L.

    2001-01-01

    The phenomenal growth of the satellite communications industry has created a large demand for traveling-wave tubes (TWT's) operating with unprecedented specifications requiring the design and production of many novel devices in record time. To achieve this, the TWT industry heavily relies on computational modeling. However, the TWT industry's computational modeling capabilities need to be improved because there are often discrepancies between measured TWT data and that predicted by conventional two-dimensional helical TWT interaction codes. This limits the analysis and design of novel devices or TWT's with parameters differing from what is conventionally manufactured. In addition, the inaccuracy of current computational tools limits achievable TWT performance because optimized designs require highly accurate models. To address these concerns, a fully three-dimensional, time-dependent, helical TWT interaction model was developed using the electromagnetic particle-in-cell code MAFIA (Solution of MAxwell's equations by the Finite-Integration-Algorithm). The model includes a short section of helical slow-wave circuit with excitation fed by radiofrequency input/output couplers, and an electron beam contained by periodic permanent magnet focusing. A cutaway view of several turns of the three-dimensional helical slow-wave circuit with input/output couplers is shown. This has been shown to be more accurate than conventionally used two-dimensional models. The growth of the communications industry has also imposed a demand for increased data rates for the transmission of large volumes of data. To achieve increased data rates, complex modulation and multiple access techniques are employed requiring minimum distortion of the signal as it is passed through the TWT. Thus, intersymbol interference (ISI) becomes a major consideration, as well as suspected causes such as reflections within the TWT. To experimentally investigate effects of the physical TWT on ISI would be

  5. Molecular insight into the association between cartilage regeneration and ear wound healing in genetic mouse models: targeting new genes in regeneration.

    PubMed

    Rai, Muhammad Farooq; Schmidt, Eric J; McAlinden, Audrey; Cheverud, James M; Sandell, Linda J

    2013-11-06

    Tissue regeneration is a complex trait with few genetic models available. Mouse strains LG/J and MRL are exceptional healers. Using recombinant inbred strains from a large (LG/J, healer) and small (SM/J, nonhealer) intercross, we have previously shown a positive genetic correlation between ear wound healing, knee cartilage regeneration, and protection from osteoarthritis. We hypothesize that a common set of genes operates in tissue healing and articular cartilage regeneration. Taking advantage of archived histological sections from recombinant inbred strains, we analyzed expression of candidate genes through branched-chain DNA technology directly from tissue lysates. We determined broad-sense heritability of candidates, Pearson correlation of candidates with healing phenotypes, and Ward minimum variance cluster analysis for strains. A bioinformatic assessment of allelic polymorphisms within and near candidate genes was also performed. The expression of several candidates was significantly heritable among strains. Although several genes correlated with both ear wound healing and cartilage healing at a marginal level, the expression of four genes representing DNA repair (Xrcc2, Pcna) and Wnt signaling (Axin2, Wnt16) pathways was significantly positively correlated with both phenotypes. Cluster analysis accurately classified healers and nonhealers for seven out of eight strains based on gene expression. Specific sequence differences between LG/J and SM/J were identified as potential causal polymorphisms. Our study suggests a common genetic basis between tissue healing and osteoarthritis susceptibility. Mapping genetic variations causing differences in diverse healing responses in multiple tissues may reveal generic healing processes in pursuit of new therapeutic targets designed to induce or enhance regeneration and, potentially, protection from osteoarthritis.

  6. Enzymic colorimetry-based DNA chip: a rapid and accurate assay for detecting mutations for clarithromycin resistance in the 23S rRNA gene of Helicobacter pylori.

    PubMed

    Xuan, Shi-Hai; Zhou, Yu-Gui; Shao, Bo; Cui, Ya-Lin; Li, Jian; Yin, Hong-Bo; Song, Xiao-Ping; Cong, Hui; Jing, Feng-Xiang; Jin, Qing-Hui; Wang, Hui-Min; Zhou, Jie

    2009-11-01

    Macrolide drugs, such as clarithromycin (CAM), are a key component of many combination therapies used to eradicate Helicobacter pylori. However, resistance to CAM is increasing in H. pylori and is becoming a serious problem in H. pylori eradication therapy. CAM resistance in H. pylori is mostly due to point mutations (A2142G/C, A2143G) in the peptidyltransferase-encoding region of the 23S rRNA gene. In this study an enzymic colorimetry-based DNA chip was developed to analyse single-nucleotide polymorphisms of the 23S rRNA gene to determine the prevalence of mutations in CAM-related resistance in H. pylori-positive patients. The results of the colorimetric DNA chip were confirmed by direct DNA sequencing. In 63 samples, the incidence of the A2143G mutation was 17.46% (11/63). The results of the colorimetric DNA chip were concordant with DNA sequencing in 96.83% of results (61/63). The colorimetric DNA chip could detect wild-type and mutant signals at every site, even at a DNA concentration of 1.53 × 10^2 copies µl^-1. Thus, the colorimetric DNA chip is a reliable assay for rapid and accurate detection of mutations in the 23S rRNA gene of H. pylori that lead to CAM-related resistance, directly from gastric tissues.
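
    The two percentages quoted in this abstract follow directly from the reported counts (11 of 63 and 61 of 63). A minimal Python check is sketched below; the helper name percent is an illustrative assumption and not part of the study.

```python
def percent(count, total):
    """Return count/total as a percentage rounded to two decimals."""
    return round(100.0 * count / total, 2)

# Counts taken from the abstract: A2143G incidence and chip/sequencing concordance.
a2143g_incidence = percent(11, 63)
concordance = percent(61, 63)

print(a2143g_incidence)  # 17.46
print(concordance)       # 96.83
```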

  7. An accurate real-time model of maglev planar motor based on compound Simpson numerical integration

    NASA Astrophysics Data System (ADS)

    Kou, Baoquan; Xing, Feng; Zhang, Lu; Zhou, Yiheng; Liu, Jiaqi

    2017-05-01

    To realize high-speed and precise control of the maglev planar motor, this paper proposes a more accurate real-time electromagnetic model that considers the influence of the coil corners. Three coordinate systems are established, for the stator, the mover and the corner coil. To obtain a complete electromagnetic model, the coil is divided into two segments: the straight coil segment and the corner coil segment. When only the first harmonic of the flux density distribution of a Halbach magnet array is taken into account, the force and torque on the two segments can be obtained by integration according to the Lorentz force law. The force and torque formulas for the straight coil segment can be derived directly from the Newton-Leibniz formula; however, this is not applicable to the corner coil segment. Therefore, a compound Simpson numerical integration method is proposed to solve the corner segment. Validation by simulation and experiment shows that the proposed model has high accuracy and can readily be applied in practice.
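
    For readers unfamiliar with the compound (composite) Simpson rule named in this abstract, a generic sketch in Python follows. It integrates an arbitrary one-dimensional function and is not the authors' force and torque model; the function name composite_simpson and the sine test integrand are assumptions chosen only for illustration.

```python
import math

def composite_simpson(f, a, b, n):
    """Approximate the integral of f over [a, b] using n equal subintervals.

    n must be a positive even integer, because Simpson's rule fits one
    parabola to each pair of adjacent subintervals.
    """
    if n <= 0 or n % 2 != 0:
        raise ValueError("n must be a positive even integer")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Interior nodes alternate between weight 4 (odd index) and 2 (even index).
        weight = 4 if i % 2 == 1 else 2
        total += weight * f(a + i * h)
    return total * h / 3

# The integral of sin(x) over [0, pi] is exactly 2.
approx = composite_simpson(math.sin, 0.0, math.pi, 100)
print(approx)
```

    The rule is exact for polynomials up to degree three, and its fixed, small number of function evaluations is one plausible reason such a scheme suits real-time use.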

  8. Log-Linear Models for Gene Association

    PubMed Central

    Hu, Jianhua; Joshi, Adarsh; Johnson, Valen E.

    2009-01-01

    We describe a class of log-linear models for the detection of interactions in high-dimensional genomic data. This class of models leads to a Bayesian model selection algorithm that can be applied to data that have been reduced to contingency tables using ranks of observations within subjects, and discretization of these ranks within gene/network components. Many normalization issues associated with the analysis of genomic data are thereby avoided. A prior density based on Ewens’ sampling distribution is used to restrict the number of interacting components assigned high posterior probability, and the calculation of posterior model probabilities is expedited by approximations based on the likelihood ratio statistic. Simulation studies are used to evaluate the efficiency of the resulting algorithm for known interaction structures. Finally, the algorithm is validated in a microarray study for which it was possible to obtain biological confirmation of detected interactions. PMID:19655032

  9. ICG: a wiki-driven knowledgebase of internal control genes for RT-qPCR normalization.

    PubMed

    Sang, Jian; Wang, Zhennan; Li, Man; Cao, Jiabao; Niu, Guangyi; Xia, Lin; Zou, Dong; Wang, Fan; Xu, Xingjian; Han, Xiaojiao; Fan, Jinqi; Yang, Ye; Zuo, Wanzhu; Zhang, Yang; Zhao, Wenming; Bao, Yiming; Xiao, Jingfa; Hu, Songnian; Hao, Lili; Zhang, Zhang

    2018-01-04

    Real-time quantitative PCR (RT-qPCR) has become a widely used method for accurate expression profiling of targeted mRNA and ncRNA. Selection of appropriate internal control genes for RT-qPCR normalization is an elementary prerequisite for reliable expression measurement. Here, we present ICG (http://icg.big.ac.cn), a wiki-driven knowledgebase for community curation of experimentally validated internal control genes as well as their associated experimental conditions. Unlike extant related databases that focus on qPCR primers in model organisms (mainly human and mouse), ICG features harnessing collective intelligence in community integration of internal control genes for a variety of species. Specifically, it integrates a comprehensive collection of more than 750 internal control genes for 73 animals, 115 plants, 12 fungi and 9 bacteria, and incorporates detailed information on recommended application scenarios corresponding to specific experimental conditions, which, collectively, are of great help for researchers to adopt appropriate internal control genes for their own experiments. Taken together, ICG serves as a publicly editable and open-content encyclopaedia of internal control genes and accordingly bears broad utility for reliable RT-qPCR normalization and gene expression characterization in both model and non-model organisms. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  10. Religion, fertility and genes: a dual inheritance model

    PubMed Central

    Rowthorn, Robert

    2011-01-01

    Religious people nowadays have more children on average than their secular counterparts. This paper uses a simple model to explore the evolutionary implications of this difference. It assumes that fertility is determined entirely by culture, whereas subjective predisposition towards religion is influenced by genetic endowment. People who carry a certain ‘religiosity’ gene are more likely than average to become or remain religious. The paper considers the effect of religious defections and exogamy on the religious and genetic composition of society. Defections reduce the ultimate share of the population with religious allegiance and slow down the spread of the religiosity gene. However, provided the fertility differential persists, and people with a religious allegiance mate mainly with people like themselves, the religiosity gene will eventually predominate despite a high rate of defection. This is an example of ‘cultural hitch-hiking’, whereby a gene spreads because it is able to hitch a ride with a high-fitness cultural practice. The theoretical arguments are supported by numerical simulations. PMID:21227968

  11. SINA: accurate high-throughput multiple sequence alignment of ribosomal RNA genes.

    PubMed

    Pruesse, Elmar; Peplies, Jörg; Glöckner, Frank Oliver

    2012-07-15

    In the analysis of homologous sequences, computation of multiple sequence alignments (MSAs) has become a bottleneck. This is especially troublesome for marker genes like the ribosomal RNA (rRNA) where already millions of sequences are publicly available and individual studies can easily produce hundreds of thousands of new sequences. Methods have been developed to cope with such numbers, but further improvements are needed to meet accuracy requirements. In this study, we present the SILVA Incremental Aligner (SINA) used to align the rRNA gene databases provided by the SILVA ribosomal RNA project. SINA uses a combination of k-mer searching and partial order alignment (POA) to maintain very high alignment accuracy while satisfying high throughput performance demands. SINA was evaluated in comparison with the commonly used high throughput MSA programs PyNAST and mothur. The three BRAliBase III benchmark MSAs could be reproduced with 99.3, 97.6 and 96.1% accuracy. A larger benchmark MSA comprising 38 772 sequences could be reproduced with 98.9 and 99.3% accuracy using reference MSAs comprising 1000 and 5000 sequences. SINA was able to achieve higher accuracy than PyNAST and mothur in all performed benchmarks. Alignment of up to 500 sequences using the latest SILVA SSU/LSU Ref datasets as reference MSA is offered at http://www.arb-silva.de/aligner. This page also links to Linux binaries, user manual and tutorial. SINA is made available under a personal use license.

  12. A method to rapidly and accurately compare relative efficacies of non-invasive imaging reporter genes in a mouse model, and its application to luciferase reporters

    PubMed Central

    Gil, Jose S.; Machado, Hidevaldo B.; Herschman, Harvey R.

    2013-01-01

    Purpose Our goal is to develop a simple, quantitative, robust method to compare the efficacy of imaging reporter genes in culture and in vivo. We describe an adenoviral vector-liver transduction procedure, and compare the luciferase reporter efficacies. Procedures Alternative reporter genes are expressed in a common adenoviral vector. Vector amounts used in vivo are based on cell culture titrations, ensuring the same transduction efficacy is used for each vector. After imaging, in vivo and in vitro values are normalized to hepatic vector transduction using quantitative real-time PCR. Results We assayed standard firefly luciferase (FLuc), enhanced firefly luciferase (EFLuc), luciferase 2 (Luc2), humanized Renilla luciferase (hRLuc), Renilla luciferase 8.6-535 (RLuc8.6), and a membrane-bound Gaussia luciferase variant (extGLuc) in cell culture and in vivo. We observed a greater than 100-fold increase in bioluminescent signal for both EFLuc and Luc2 when compared to FLuc, and a greater than 106-fold increase for RLuc8.6 when compared to hRLuc. ExtGLuc was not detectable in liver. Conclusions Our findings contrast, in some cases, with conclusions drawn in prior comparisons of these reporter genes, and demonstrate the need for a standardized method to evaluate alternative reporter genes in vivo. Our procedure can be adapted for reporter genes that utilize alternative imaging modalities (fluorescence, bioluminescence, MRI, SPECT, PET). PMID:21850545

  13. Predicting features of breast cancer with gene expression patterns.

    PubMed

    Lu, Xuesong; Lu, Xin; Wang, Zhigang C; Iglehart, J Dirk; Zhang, Xuegong; Richardson, Andrea L

    2008-03-01

    Data from gene expression arrays hold an enormous amount of biological information. We sought to determine if global gene expression in primary breast cancers contained information about biologic, histologic, and anatomic features of the disease in individual patients. Microarray data from the tumors of 129 patients were analyzed for the ability to predict biomarkers [estrogen receptor (ER) and HER2], histologic features [grade and lymphatic-vascular invasion (LVI)], and stage parameters (tumor size and lymph node metastasis). Multiple statistical predictors were used and the prediction accuracy was determined by cross-validation error rate; multidimensional scaling (MDS) allowed visualization of the predicted states under study. Models built from gene expression data accurately predict ER and HER2 status, and divide tumor grade into high-grade and low-grade clusters; intermediate-grade tumors are not a unique group. In contrast, gene expression data is inaccurate at predicting tumor size, lymph node status or LVI. The best model for prediction of nodal status included tumor size, LVI status and pathologically defined tumor subtype (based on combinations of ER, HER2, and grade); the addition of microarray-based prediction to this model failed to improve the prediction accuracy. Global gene expression supports a binary division of ER, HER2, and grade, clearly separating tumors into two categories; intermediate values for these bio-indicators do not define intermediate tumor subsets. Results are consistent with a model of regional metastasis that depends on inherent biologic differences in metastatic propensity between breast cancer subtypes, upon which time and chance then operate.

  14. Reverse engineering highlights potential principles of large gene regulatory network design and learning.

    PubMed

    Carré, Clément; Mas, André; Krouk, Gabriel

    2017-01-01

    Inferring transcriptional gene regulatory networks from transcriptomic datasets is a key challenge of systems biology, with potential impacts ranging from medicine to agronomy. There are several techniques used presently to experimentally assay transcription factor-to-target relationships, defining important information about real gene regulatory network connections. These techniques include classical ChIP-seq, yeast one-hybrid, or more recently, DAP-seq or target technologies. These techniques are usually used to validate algorithm predictions. Here, we developed a reverse engineering approach based on mathematical and computer simulation to evaluate the impact that this prior knowledge on gene regulatory networks may have on training machine learning algorithms. First, we developed a gene regulatory network-simulating engine called FRANK (Fast Randomizing Algorithm for Network Knowledge) that is able to simulate large gene regulatory networks (containing 10^4 genes) with characteristics of gene regulatory networks observed in vivo. FRANK also generates stable or oscillatory gene expression directly produced by the simulated gene regulatory networks. The development of FRANK leads to important general conclusions concerning the design of large and stable gene regulatory networks harboring scale-free properties (built ex nihilo). In combination with a supervised (accepting prior knowledge) support vector machine algorithm, we (i) address biologically oriented questions concerning our capacity to accurately reconstruct gene regulatory networks, and in particular we demonstrate that prior-knowledge structure is crucial for accurate learning, and (ii) draw conclusions to inform experimental design to perform learning able to solve gene regulatory networks in the future. By demonstrating that our predictions concerning the influence of the prior-knowledge structure on support vector machine learning capacity hold true on real data (Escherichia coli K14 network

  15. The importance of accurate muscle modelling for biomechanical analyses: a case study with a lizard skull

    PubMed Central

    Gröning, Flora; Jones, Marc E. H.; Curtis, Neil; Herrel, Anthony; O'Higgins, Paul; Evans, Susan E.; Fagan, Michael J.

    2013-01-01

    Computer-based simulation techniques such as multi-body dynamics analysis are becoming increasingly popular in the field of skull mechanics. Multi-body models can be used for studying the relationships between skull architecture, muscle morphology and feeding performance. However, to be confident in the modelling results, models need to be validated against experimental data, and the effects of uncertainties or inaccuracies in the chosen model attributes need to be assessed with sensitivity analyses. Here, we compare the bite forces predicted by a multi-body model of a lizard (Tupinambis merianae) with in vivo measurements, using anatomical data collected from the same specimen. This subject-specific model predicts bite forces that are very close to the in vivo measurements and also shows a consistent increase in bite force as the bite position is moved posteriorly on the jaw. However, the model is very sensitive to changes in muscle attributes such as fibre length, intrinsic muscle strength and force orientation, with bite force predictions varying considerably when these three variables are altered. We conclude that accurate muscle measurements are crucial to building realistic multi-body models and that subject-specific data should be used whenever possible. PMID:23614944

  16. Computational challenges in modeling gene regulatory events

    PubMed Central

    Pataskar, Abhijeet; Tiwari, Vijay K.

    2016-01-01

    ABSTRACT Cellular transcriptional programs driven by genetic and epigenetic mechanisms could be better understood by integrating “omics” data and subsequently modeling the gene-regulatory events. Toward this end, computational biology should keep pace with evolving experimental procedures and data availability. This article gives an exemplified account of the current computational challenges in molecular biology. PMID:27390891

  17. Liver-Directed Lentiviral Gene Therapy in a Dog Model of Hemophilia B

    PubMed Central

    Bartholomae, Cynthia C.; Volpin, Monica; Della Valle, Patrizia; Sanvito, Francesca; Sergi Sergi, Lucia; Gallina, Pierangela; Benedicenti, Fabrizio; Bellinger, Dwight; Raymer, Robin; Merricks, Elizabeth; Bellintani, Francesca; Martin, Samia; Doglioni, Claudio; D’Angelo, Armando; VandenDriessche, Thierry; Chuah, Marinee K.; Schmidt, Manfred; Nichols, Timothy; Montini, Eugenio; Naldini, Luigi

    2017-01-01

    We investigated the safety and efficacy of liver-directed gene therapy using lentiviral vectors in a large animal model of hemophilia B, and evaluated the risk of insertional mutagenesis in tumor-prone mouse models. We show that gene therapy using lentiviral vectors targeting expression of a canine factor IX transgene to hepatocytes was well-tolerated and provided stable long-term production of coagulation factor IX in dogs with hemophilia B. By exploiting three different mouse models designed to amplify the consequences of insertional mutagenesis, we show that no genotoxicity was detected with these lentiviral vectors. Our findings suggest that lentiviral vectors may be an attractive candidate for gene therapy targeted to the liver and may be useful for the treatment of hemophilia. PMID:25739762

  18. A Survey of Statistical Models for Reverse Engineering Gene Regulatory Networks

    PubMed Central

    Huang, Yufei; Tienda-Luna, Isabel M.; Wang, Yufeng

    2009-01-01

    Statistical models for reverse engineering gene regulatory networks are surveyed in this article. To provide readers with a system-level view of the modeling issues in this research, a graphical modeling framework is proposed. This framework serves as the scaffolding on which the review of different models can be systematically assembled. Based on the framework, we review many existing models for many aspects of gene regulation; the pros and cons of each model are discussed. In addition, network inference algorithms are also surveyed under the graphical modeling framework by the categories of point solutions and probabilistic solutions and the connections and differences among the algorithms are provided. This survey has the potential to elucidate the development and future of reverse engineering GRNs and bring statistical signal processing closer to the core of this research. PMID:20046885

  19. Accurate De Novo Prediction of Protein Contact Map by Ultra-Deep Learning Model.

    PubMed

    Wang, Sheng; Sun, Siqi; Li, Zhen; Zhang, Renyu; Xu, Jinbo

    2017-01-01

    Protein contacts contain key information for the understanding of protein structure and function and thus, contact prediction from sequence is an important problem. Recently exciting progress has been made on this problem, but the predicted contacts for proteins without many sequence homologs is still of low quality and not very useful for de novo structure prediction. This paper presents a new deep learning method that predicts contacts by integrating both evolutionary coupling (EC) and sequence conservation information through an ultra-deep neural network formed by two deep residual neural networks. The first residual network conducts a series of 1-dimensional convolutional transformation of sequential features; the second residual network conducts a series of 2-dimensional convolutional transformation of pairwise information including output of the first residual network, EC information and pairwise potential. By using very deep residual networks, we can accurately model contact occurrence patterns and complex sequence-structure relationship and thus, obtain higher-quality contact prediction regardless of how many sequence homologs are available for proteins in question. Our method greatly outperforms existing methods and leads to much more accurate contact-assisted folding. Tested on 105 CASP11 targets, 76 past CAMEO hard targets, and 398 membrane proteins, the average top L long-range prediction accuracy obtained by our method, one representative EC method CCMpred and the CASP11 winner MetaPSICOV is 0.47, 0.21 and 0.30, respectively; the average top L/10 long-range accuracy of our method, CCMpred and MetaPSICOV is 0.77, 0.47 and 0.59, respectively. Ab initio folding using our predicted contacts as restraints but without any force fields can yield correct folds (i.e., TMscore>0.6) for 203 of the 579 test proteins, while that using MetaPSICOV- and CCMpred-predicted contacts can do so for only 79 and 62 of them, respectively. 
Our contact-assisted models also have

  20. Accurate De Novo Prediction of Protein Contact Map by Ultra-Deep Learning Model

    PubMed Central

    Li, Zhen; Zhang, Renyu

    2017-01-01

    Motivation Protein contacts contain key information for the understanding of protein structure and function and thus, contact prediction from sequence is an important problem. Recently exciting progress has been made on this problem, but the predicted contacts for proteins without many sequence homologs is still of low quality and not very useful for de novo structure prediction. Method This paper presents a new deep learning method that predicts contacts by integrating both evolutionary coupling (EC) and sequence conservation information through an ultra-deep neural network formed by two deep residual neural networks. The first residual network conducts a series of 1-dimensional convolutional transformation of sequential features; the second residual network conducts a series of 2-dimensional convolutional transformation of pairwise information including output of the first residual network, EC information and pairwise potential. By using very deep residual networks, we can accurately model contact occurrence patterns and complex sequence-structure relationship and thus, obtain higher-quality contact prediction regardless of how many sequence homologs are available for proteins in question. Results Our method greatly outperforms existing methods and leads to much more accurate contact-assisted folding. Tested on 105 CASP11 targets, 76 past CAMEO hard targets, and 398 membrane proteins, the average top L long-range prediction accuracy obtained by our method, one representative EC method CCMpred and the CASP11 winner MetaPSICOV is 0.47, 0.21 and 0.30, respectively; the average top L/10 long-range accuracy of our method, CCMpred and MetaPSICOV is 0.77, 0.47 and 0.59, respectively. Ab initio folding using our predicted contacts as restraints but without any force fields can yield correct folds (i.e., TMscore>0.6) for 203 of the 579 test proteins, while that using MetaPSICOV- and CCMpred-predicted contacts can do so for only 79 and 62 of them, respectively. 
Our contact
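
    The "top L" and "top L/10" long-range accuracies quoted in the two records above are not defined in the abstracts. The Python sketch below assumes the convention commonly used in contact-prediction benchmarks: keep only residue pairs separated by at least 24 positions, rank them by predicted probability, take the top L/k (L being the sequence length), and report the fraction that are true contacts. The function name, the 24-residue threshold and the toy data are all illustrative assumptions.

```python
def top_l_accuracy(scores, true_contacts, seq_len, k=1, min_sep=24):
    """Fraction of the top seq_len // k long-range predictions that are true contacts.

    scores: dict mapping (i, j) residue index pairs with i < j to predicted probability.
    true_contacts: set of (i, j) pairs that are in contact in the native structure.
    min_sep: minimum sequence separation for a pair to count as long-range (assumed 24).
    """
    long_range = [(pair, s) for pair, s in scores.items()
                  if pair[1] - pair[0] >= min_sep]
    long_range.sort(key=lambda item: item[1], reverse=True)
    n_top = max(1, seq_len // k)
    top = long_range[:n_top]
    if not top:
        return 0.0
    hits = sum(1 for pair, _ in top if pair in true_contacts)
    return hits / len(top)

# Toy example for a hypothetical 50-residue protein; all values are made up.
scores = {(0, 30): 0.9, (1, 40): 0.8, (2, 45): 0.7, (3, 35): 0.6,
          (5, 49): 0.5, (4, 44): 0.1, (10, 12): 0.99}
true_contacts = {(0, 30), (2, 45), (5, 49), (10, 12)}

# Top L/10 = 5 long-range pairs; the short-range pair (10, 12) is ignored.
accuracy = top_l_accuracy(scores, true_contacts, seq_len=50, k=10)
print(accuracy)  # 0.6
```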

  1. De-embedding technique for accurate modeling of compact 3D MMIC CPW transmission lines

    NASA Astrophysics Data System (ADS)

    Pohan, U. H.; KKyabaggu, P. B.; Sinulingga, E. P.

    2018-02-01

    Requirements for high-density and high-functionality microwave and millimeter-wave circuits have led to innovative circuit architectures such as three-dimensional multilayer MMICs. The major advantage of the multilayer techniques is that one can employ passive and active components based on CPW technology. In this work, MMIC coplanar waveguide (CPW) components such as transmission lines (TLs) are modeled in their 3D layouts. The main characteristics of the CPW TL, which suffer from the parasitic and resonant-frequency effects of the probe pads, have been studied. Based on an understanding of these parasitic effects, a novel de-embedding technique is developed to accurately predict the high-frequency characteristics of the designed MMICs. The technique is shown to be critical in significantly reducing the probe pad parasitics in the model. As a result, the high-frequency characteristics of the designed MMICs are presented with minimum parasitic effects from the probe pads. The de-embedding process optimises the determination of the main characteristics of compact 3D MMIC CPW transmission lines.

  2. Gene doping detection: evaluation of approach for direct detection of gene transfer using erythropoietin as a model system.

    PubMed

    Baoutina, A; Coldham, T; Bains, G S; Emslie, K R

    2010-08-01

    As clinical gene therapy has progressed toward realizing its potential, concern over misuse of the technology to enhance performance in athletes is growing. Although 'gene doping' is banned by the World Anti-Doping Agency, its detection remains a major challenge. In this study, we developed a methodology for direct detection of the transferred genetic material and evaluated its feasibility for gene doping detection in blood samples from athletes. Using erythropoietin (EPO) as a model gene and a simple in vitro system, we developed real-time PCR assays that target sequences within the transgene complementary DNA corresponding to exon/exon junctions. As these junctions are absent in the endogenous gene due to their interruption by introns, the approach allows detection of trace amounts of a transgene in a large background of the endogenous gene. Two developed assays and one commercial gene expression assay for EPO were validated. On the basis of ability of these assays to selectively amplify transgenic DNA and analysis of literature on testing of gene transfer in preclinical and clinical gene therapy, it is concluded that the developed approach would potentially be suitable to detect gene doping through gene transfer by analysis of small volumes of blood using regular out-of-competition testing.
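
    The exon/exon junction idea in this abstract can be illustrated with plain string matching: a target sequence that spans the junction exists in the intron-free transgene cDNA but is interrupted by the intron in the endogenous gene. The Python sketch below uses made-up toy sequences, not real EPO sequence, and substring search is only a stand-in for primer or probe hybridisation in a real-time PCR assay.

```python
# Toy sequences (hypothetical, for illustration only).
exon1 = "ATGGCCTTGA"
intron = "GTAAGTCCCTTTCAG"
exon2 = "CGGATTCAAG"

genomic = exon1 + intron + exon2   # endogenous gene: exons separated by an intron
cdna = exon1 + exon2               # transgene cDNA: exons joined directly

# A junction-spanning target: last 6 bases of exon 1 plus first 6 bases of exon 2.
junction_target = exon1[-6:] + exon2[:6]

in_transgene = junction_target in cdna
in_endogenous = junction_target in genomic

print(junction_target)   # CCTTGACGGATT
print(in_transgene)      # True
print(in_endogenous)     # False
```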

  3. Production of accurate skeletal models of domestic animals using three-dimensional scanning and printing technology.

    PubMed

    Li, Fangzheng; Liu, Chunying; Song, Xuexiong; Huan, Yanjun; Gao, Shansong; Jiang, Zhongling

    2018-01-01

    Access to adequate anatomical specimens can be an important aspect in learning the anatomy of domestic animals. In this study, the authors utilized a structured light scanner and fused deposition modeling (FDM) printer to produce highly accurate animal skeletal models. First, various components of the bovine skeleton, including the femur, the fifth rib, and the sixth cervical (C6) vertebra were used to produce digital models. These were then used to produce 1:1 scale physical models with the FDM printer. The anatomical features of the digital models and three-dimensional (3D) printed models were then compared with those of the original skeletal specimens. The results of this study demonstrated that both digital and physical scale models of animal skeletal components could be rapidly produced using 3D printing technology. In terms of accuracy between models and original specimens, the standard deviations of the femur and the fifth rib measurements were 0.0351 and 0.0572, respectively. All of the features except the nutrient foramina on the original bone specimens could be identified in the digital and 3D printed models. Moreover, the 3D printed models could serve as a viable alternative to original bone specimens when used in anatomy education, as determined from student surveys. This study demonstrated an important example of reproducing bone models to be used in anatomy education and veterinary clinical training. Anat Sci Educ 11: 73-80. © 2017 American Association of Anatomists.

  4. Ensemble predictive model for more accurate soil organic carbon spectroscopic estimation

    NASA Astrophysics Data System (ADS)

    Vašát, Radim; Kodešová, Radka; Borůvka, Luboš

    2017-07-01

    A myriad of signal pre-processing strategies and multivariate calibration techniques have been explored over the last few decades in an attempt to improve the spectroscopic prediction of soil organic carbon (SOC). Coming up with a novel, more powerful, and more accurate predictive approach has therefore become a challenging task. One possible way forward is to combine several individual predictions into a single final one (according to ensemble learning theory). As this approach performs best when combining inherently different predictive algorithms that are calibrated with structurally different predictor variables, we tested predictors of two different kinds: 1) reflectance values (or transforms) at each wavelength and 2) absorption feature parameters. We then applied four different calibration techniques, two for each type of predictor: a) partial least squares regression and support vector machines for type 1, and b) multiple linear regression and random forest for type 2. The weights assigned to the individual predictions within the ensemble model (constructed as a weighted average) were determined by an automated procedure that ensured the best solution among all possible ones was selected. The approach was tested on soil samples taken from the surface horizon of four sites differing in the prevailing soil units. By employing the ensemble predictive model, the prediction accuracy of SOC improved at all four sites. The coefficient of determination in cross-validation (R2cv) increased from 0.849, 0.611, 0.811 and 0.644 (the best individual predictions) to 0.864, 0.650, 0.824 and 0.698 for Sites 1, 2, 3 and 4, respectively. Generally, the ensemble model affected the final prediction so that the maximal deviations of predicted vs. observed values of the individual predictions were reduced, and thus the correlation cloud became thinner, as desired.
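
    The weighted-average ensemble described here can be sketched in a few lines of Python. The abstract does not say how the automated weight search works, so the exhaustive grid over weights in steps of 0.1 is an assumption, as are the toy numbers and the function names; the score below is a plain in-sample R2, not the cross-validated R2cv reported by the authors.

```python
from itertools import product


def r_squared(observed, predicted):
    """Coefficient of determination of predictions against observations."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def weight_grid(n_models, steps):
    """Yield every weight vector with entries k/steps that sums to 1."""
    for combo in product(range(steps + 1), repeat=n_models):
        if sum(combo) == steps:
            yield tuple(k / steps for k in combo)


def ensemble(predictions, weights):
    """Weighted average of the individual model predictions."""
    n_samples = len(predictions[0])
    return [sum(w * p[i] for w, p in zip(weights, predictions))
            for i in range(n_samples)]


def best_weights(observed, predictions, steps=10):
    """Exhaustive search (assumed procedure) for the highest-R2 weights."""
    return max(weight_grid(len(predictions), steps),
               key=lambda w: r_squared(observed, ensemble(predictions, w)))


# Hypothetical toy data: one model reads slightly high, the other slightly low.
observed = [1.0, 2.0, 3.0, 4.0]
model_a = [1.2, 2.2, 3.2, 4.2]
model_b = [0.8, 1.8, 2.8, 3.8]

weights = best_weights(observed, [model_a, model_b])
combined = ensemble([model_a, model_b], weights)
print("weights:", weights)
print("R2 model_a:", r_squared(observed, model_a))
print("R2 ensemble:", r_squared(observed, combined))
```

    Since the grid includes the corner cases where one model receives all the weight, the selected ensemble can never score below the best individual model on the data used for the search.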

  5. [Gene therapy for inherited retinal dystrophies].

    PubMed

    Côco, Monique; Han, Sang Won; Sallum, Juliana Maria Ferraz

    2009-01-01

    The inherited retinal dystrophies comprise a large number of disorders characterized by a slow and progressive retinal degeneration. They are the result of mutations in genes expressed in either the photoreceptor cells or the retinal pigment epithelium. The mode of inheritance can be autosomal dominant, autosomal recessive, X-linked recessive, digenic or mitochondrial. At the moment, there is no treatment for these conditions and the patients can expect a progressive loss of vision. Accurate genetic counseling and support for rehabilitation are indicated. Research into the molecular and genetic basis of disease is continually expanding and improving the prospects for rational treatments. In this context, gene therapy, defined as the introduction of exogenous genetic material into human cells for therapeutic purposes, may ultimately offer the best treatment for the inherited retinal dystrophies. The eye is an attractive target for gene therapy because of its accessibility, immune privilege and translucent media. A number of retinal diseases have known gene defects. In addition, well-characterized animal models exist for many of these conditions. Proposals for clinical trials of gene therapy for inherited retinal degenerations owing to defects in the gene RPE65 have recently received ethical approval, and the preliminary results obtained offer strong prospects for improving patients' quality of life.

  6. Finite Element Modelling of a Field-Sensed Magnetic Suspended System for Accurate Proximity Measurement Based on a Sensor Fusion Algorithm with Unscented Kalman Filter

    PubMed Central

    Chowdhury, Amor; Sarjaš, Andrej

    2016-01-01

    The presented paper describes accurate distance measurement for a field-sensed magnetic suspension system. The proximity measurement is based on a Hall effect sensor. The proximity sensor is installed directly on the lower surface of the electro-magnet, which means that it is very sensitive to external magnetic influences and disturbances. External disturbances interfere with the information signal and reduce the usability and reliability of the proximity measurements and, consequently, the whole application operation. A sensor fusion algorithm is deployed for the aforementioned reasons. The sensor fusion algorithm is based on the Unscented Kalman Filter, where a nonlinear dynamic model was derived with the Finite Element Modelling approach. The advantage of such modelling is a more accurate dynamic model parameter estimation, especially in the case when the real structure, materials and dimensions of the real-time application are known. The novelty of the paper is the design of a compact electro-magnetic actuator with a built-in low cost proximity sensor for accurate proximity measurement of the magnetic object. The paper successively presents a modelling procedure with the finite element method, design and parameter settings of a sensor fusion algorithm with Unscented Kalman Filter and, finally, the implementation procedure and results of real-time operation. PMID:27649197

  7. Finite Element Modelling of a Field-Sensed Magnetic Suspended System for Accurate Proximity Measurement Based on a Sensor Fusion Algorithm with Unscented Kalman Filter.

    PubMed

    Chowdhury, Amor; Sarjaš, Andrej

    2016-09-15

    The presented paper describes accurate distance measurement for a field-sensed magnetic suspension system. The proximity measurement is based on a Hall effect sensor. The proximity sensor is installed directly on the lower surface of the electro-magnet, which means that it is very sensitive to external magnetic influences and disturbances. External disturbances interfere with the information signal and reduce the usability and reliability of the proximity measurements and, consequently, the whole application operation. A sensor fusion algorithm is deployed for the aforementioned reasons. The sensor fusion algorithm is based on the Unscented Kalman Filter, where a nonlinear dynamic model was derived with the Finite Element Modelling approach. The advantage of such modelling is a more accurate dynamic model parameter estimation, especially in the case when the real structure, materials and dimensions of the real-time application are known. The novelty of the paper is the design of a compact electro-magnetic actuator with a built-in low cost proximity sensor for accurate proximity measurement of the magnetic object. The paper successively presents a modelling procedure with the finite element method, design and parameter settings of a sensor fusion algorithm with Unscented Kalman Filter and, finally, the implementation procedure and results of real-time operation.

  8. Simultaneous gene finding in multiple genomes.

    PubMed

    König, Stefanie; Romoth, Lars W; Gerischer, Lizzy; Stanke, Mario

    2016-11-15

    As the tree of life is populated with sequenced genomes ever more densely, the new challenge is the accurate and consistent annotation of entire clades of genomes. We address this problem with a new approach to comparative gene finding that takes a multiple genome alignment of closely related species and simultaneously predicts the location and structure of protein-coding genes in all input genomes, thereby exploiting negative selection and sequence conservation. The model prefers potential gene structures in the different genomes that are in agreement with each other, or, if not, where the exon gains and losses are plausible given the species tree. We formulate the multi-species gene finding problem as a binary labeling problem on a graph. The resulting optimization problem is NP hard, but can be efficiently approximated using a subgradient-based dual decomposition approach. The proposed method was tested on whole-genome alignments of 12 vertebrate and 12 Drosophila species. The accuracy was evaluated for human, mouse and Drosophila melanogaster and compared to competing methods. Results suggest that our method is well-suited for annotation of (a large number of) genomes of closely related species within a clade, in particular, when RNA-Seq data are available for many of the genomes. The transfer of existing annotations from one genome to another via the genome alignment is more accurate than previous approaches that are based on protein-spliced alignments, when the genomes are at close to medium distances. The method is implemented in C++ as part of Augustus and available open source at http://bioinf.uni-greifswald.de/augustus/. Contact: stefaniekoenig@ymail.com or mario.stanke@uni-greifswald.de. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  9. Augmenting Microarray Data with Literature-Based Knowledge to Enhance Gene Regulatory Network Inference

    PubMed Central

    Kilicoglu, Halil; Shin, Dongwook; Rindflesch, Thomas C.

    2014-01-01

    Gene regulatory networks are a crucial aspect of systems biology in describing molecular mechanisms of the cell. Various computational models rely on random gene selection to infer such networks from microarray data. While incorporation of prior knowledge into data analysis has been deemed important, in practice, it has generally been limited to referencing genes in probe sets and using curated knowledge bases. We investigate the impact of augmenting microarray data with semantic relations automatically extracted from the literature, with the view that relations encoding gene/protein interactions eliminate the need for random selection of components in non-exhaustive approaches, producing a more accurate model of cellular behavior. A genetic algorithm is then used to optimize the strength of interactions using microarray data and an artificial neural network fitness function. The result is a directed and weighted network providing the individual contribution of each gene to its target. For testing, we used invasive ductile carcinoma of the breast to query the literature and a microarray set containing gene expression changes in these cells over several time points. Our model demonstrates significantly better fitness than the state-of-the-art model, which relies on an initial random selection of genes. Comparison to the component pathways of the KEGG Pathways in Cancer map reveals that the resulting networks contain both known and novel relationships. The p53 pathway results were manually validated in the literature. 60% of non-KEGG relationships were supported (74% for highly weighted interactions). The method was then applied to yeast data and our model again outperformed the comparison model. Our results demonstrate the advantage of combining gene interactions extracted from the literature in the form of semantic relations with microarray analysis in generating contribution-weighted gene regulatory networks. 
This methodology can make a significant contribution to

  10. Augmenting microarray data with literature-based knowledge to enhance gene regulatory network inference.

    PubMed

    Chen, Guocai; Cairelli, Michael J; Kilicoglu, Halil; Shin, Dongwook; Rindflesch, Thomas C

    2014-06-01

    Gene regulatory networks are a crucial aspect of systems biology in describing molecular mechanisms of the cell. Various computational models rely on random gene selection to infer such networks from microarray data. While incorporation of prior knowledge into data analysis has been deemed important, in practice, it has generally been limited to referencing genes in probe sets and using curated knowledge bases. We investigate the impact of augmenting microarray data with semantic relations automatically extracted from the literature, with the view that relations encoding gene/protein interactions eliminate the need for random selection of components in non-exhaustive approaches, producing a more accurate model of cellular behavior. A genetic algorithm is then used to optimize the strength of interactions using microarray data and an artificial neural network fitness function. The result is a directed and weighted network providing the individual contribution of each gene to its target. For testing, we used invasive ductile carcinoma of the breast to query the literature and a microarray set containing gene expression changes in these cells over several time points. Our model demonstrates significantly better fitness than the state-of-the-art model, which relies on an initial random selection of genes. Comparison to the component pathways of the KEGG Pathways in Cancer map reveals that the resulting networks contain both known and novel relationships. The p53 pathway results were manually validated in the literature. 60% of non-KEGG relationships were supported (74% for highly weighted interactions). The method was then applied to yeast data and our model again outperformed the comparison model. Our results demonstrate the advantage of combining gene interactions extracted from the literature in the form of semantic relations with microarray analysis in generating contribution-weighted gene regulatory networks. 
This methodology can make a significant contribution to

  11. Improved kinetic model of Escherichia coli central carbon metabolism in batch and continuous cultures.

    PubMed

    Kurata, Hiroyuki; Sugimoto, Yurie

    2018-02-01

    Many kinetic models of Escherichia coli central metabolism have been built, but few models have accurately reproduced the dynamic behaviors of wild type and multiple genetic mutants. In 2016, our latest kinetic model addressed problems of existing models, reproducing the cell growth and glucose uptake of wild type, ΔpykA:pykF and Δpgi in a batch culture, but it overestimated the glucose uptake and cell growth rates of Δppc and hardly captured the typical characteristics of the glyoxylate and TCA cycle fluxes for Δpgi and Δppc. Such discrepancies between the simulated and experimental data suggested biological complexity. In this study, we overcame these problems by assuming critical mechanisms regarding the OAA-regulated isocitrate dehydrogenase activity, aceBAK gene regulation and growth suppression. The present model accurately predicts the extracellular and intracellular dynamics of wild type and many gene knockout mutants in batch and continuous cultures. It is now the most accurate, detailed kinetic model of E. coli central carbon metabolism and will contribute to advances in mathematical modeling of cell factories. Copyright © 2017 The Society for Biotechnology, Japan. Published by Elsevier B.V. All rights reserved.

  12. A multi-Poisson dynamic mixture model to cluster developmental patterns of gene expression by RNA-seq.

    PubMed

    Ye, Meixia; Wang, Zhong; Wang, Yaqun; Wu, Rongling

    2015-03-01

    Dynamic changes of gene expression reflect an intrinsic mechanism of how an organism responds to developmental and environmental signals. With the increasing availability of expression data across a time-space scale by RNA-seq, the classification of genes as per their biological function using RNA-seq data has become one of the most significant challenges in contemporary biology. Here we develop a clustering mixture model to discover distinct groups of genes expressed during a period of organ development. By integrating the density function of the multivariate Poisson distribution, the model accommodates the discrete property of read counts characteristic of RNA-seq data. The temporal dependence of gene expression is modeled by a first-order autoregressive process. The model is implemented with the Expectation-Maximization algorithm and model selection to determine the optimal number of gene clusters and obtain the estimates of Poisson parameters that describe the pattern of time-dependent expression of genes from each cluster. The model has been demonstrated by analyzing a real data set from an experiment aimed at linking the pattern of gene expression to catkin development in white poplar. The usefulness of the model has been validated through computer simulation. The model provides a valuable tool for clustering RNA-seq data, facilitating our global view of expression dynamics and understanding of gene regulation mechanisms. © The Author 2014. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.

  13. SIFTER search: a web server for accurate phylogeny-based protein function prediction

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sahraeian, Sayed M.; Luo, Kevin R.; Brenner, Steven E.

    We are awash in proteins discovered through high-throughput sequencing projects. As only a minuscule fraction of these have been experimentally characterized, computational methods are widely used for automated annotation. Here, we introduce a user-friendly web interface for accurate protein function prediction using the SIFTER algorithm. SIFTER is a state-of-the-art sequence-based gene molecular function prediction algorithm that uses a statistical model of function evolution to incorporate annotations throughout the phylogenetic tree. Due to the resources needed by the SIFTER algorithm, running SIFTER locally is not trivial for most users, especially for large-scale problems. The SIFTER web server thus provides access to precomputed predictions on 16 863 537 proteins from 232 403 species. Users can explore SIFTER predictions with queries for proteins, species, functions, and homologs of sequences not in the precomputed prediction set. Lastly, the SIFTER web server is accessible at http://sifter.berkeley.edu/ and the source code can be downloaded.

  14. SIFTER search: a web server for accurate phylogeny-based protein function prediction

    DOE PAGES

    Sahraeian, Sayed M.; Luo, Kevin R.; Brenner, Steven E.

    2015-05-15

    We are awash in proteins discovered through high-throughput sequencing projects. As only a minuscule fraction of these have been experimentally characterized, computational methods are widely used for automated annotation. Here, we introduce a user-friendly web interface for accurate protein function prediction using the SIFTER algorithm. SIFTER is a state-of-the-art sequence-based gene molecular function prediction algorithm that uses a statistical model of function evolution to incorporate annotations throughout the phylogenetic tree. Due to the resources needed by the SIFTER algorithm, running SIFTER locally is not trivial for most users, especially for large-scale problems. The SIFTER web server thus provides access to precomputed predictions on 16 863 537 proteins from 232 403 species. Users can explore SIFTER predictions with queries for proteins, species, functions, and homologs of sequences not in the precomputed prediction set. Lastly, the SIFTER web server is accessible at http://sifter.berkeley.edu/ and the source code can be downloaded.

  15. snoU6 and 5S RNAs are not reliable miRNA reference genes in neuronal differentiation.

    PubMed

    Lim, Q E; Zhou, L; Ho, Y K; Wan, G; Too, H P

    2011-12-29

    Accurate profiling of microRNAs (miRNAs) is an essential step for understanding the functional significance of these small RNAs in both physiological and pathological processes. Quantitative real-time PCR (qPCR) has gained acceptance as a robust and reliable transcriptomic method to profile subtle changes in miRNA levels and requires reference genes for accurate normalization of gene expression. 5S and snoU6 RNAs are commonly used as reference genes in microRNA quantification. It is currently unknown if these small RNAs are stably expressed during neuronal differentiation. Panels of miRNAs have been suggested as alternative reference genes to 5S and snoU6 in various physiological contexts. To test the hypothesis that miRNAs may serve as stable references during neuronal differentiation, the expressions of eight miRNAs, 5S and snoU6 RNAs in five differentiating neuronal cell types were analyzed using qPCR. The stabilities of the expressions were evaluated using two complementary statistical approaches (geNorm and Normfinder). Expressions of 5S and snoU6 RNAs were stable under some but not all conditions of neuronal differentiation and thus are not suitable reference genes. In contrast, a combination of three miRNAs (miR-103, miR-106b and miR-26b) allowed accurate expression normalization across different models of neuronal differentiation. Copyright © 2011 IBRO. Published by Elsevier Ltd. All rights reserved.
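
    The kind of stability ranking used in this study can be sketched with a geNorm-style measure. The Python below assumes the commonly described definition, in which a gene's M value is the mean, over all other candidates, of the standard deviation of the log2 expression ratios across samples; the gene names and quantities are hypothetical and are not data from the paper.

```python
import math
from statistics import stdev


def genorm_m(expression):
    """Return a geNorm-style stability value M for each candidate gene.

    expression maps gene name -> relative quantities per sample (linear scale).
    Lower M means the gene varies less relative to the other candidates.
    """
    genes = list(expression)
    m_values = {}
    for gene_j in genes:
        pairwise_sd = []
        for gene_k in genes:
            if gene_k == gene_j:
                continue
            log_ratios = [math.log2(a / b)
                          for a, b in zip(expression[gene_j], expression[gene_k])]
            pairwise_sd.append(stdev(log_ratios))
        m_values[gene_j] = sum(pairwise_sd) / len(pairwise_sd)
    return m_values


# Hypothetical relative quantities in three samples.
toy = {
    "gene_a": [1.0, 2.0, 4.0],
    "gene_b": [2.0, 4.0, 8.0],   # constant ratio to gene_a
    "gene_c": [1.0, 8.0, 2.0],   # varies independently of the other two
}

m = genorm_m(toy)
for gene, value in sorted(m.items(), key=lambda item: item[1]):
    print(gene, round(value, 3))
```

    In this toy set gene_a and gene_b track each other perfectly, so gene_c receives the highest M and would be the first candidate to discard as a reference.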

  16. Integrative strategies to identify candidate genes in rodent models of human alcoholism.

    PubMed

    Treadwell, Julie A

    2006-01-01

    The search for genes underlying alcohol-related behaviours in rodent models of human alcoholism has been ongoing for many years with only limited success. Recently, new strategies that integrate several of the traditional approaches have provided new insights into the molecular mechanisms underlying ethanol's actions in the brain. We have used alcohol-preferring C57BL/6J (B6) and alcohol-avoiding DBA/2J (D2) genetic strains of mice in an integrative strategy combining high-throughput gene expression screening, genetic segregation analysis, and mapping to previously published quantitative trait loci to uncover candidate genes for the ethanol-preference phenotype. In our study, 2 genes, retinaldehyde binding protein 1 (Rlbp1) and syntaxin 12 (Stx12), were found to be strong candidates for ethanol preference. Such experimental approaches have the power and the potential to greatly speed up the laborious process of identifying candidate genes for the animal models of human alcoholism.

  17. A Protocol for Using Gene Set Enrichment Analysis to Identify the Appropriate Animal Model for Translational Research.

    PubMed

    Weidner, Christopher; Steinfath, Matthias; Wistorf, Elisa; Oelgeschläger, Michael; Schneider, Marlon R; Schönfelder, Gilbert

    2017-08-16

    Recent studies that compared transcriptomic datasets of human diseases with datasets from mouse models using traditional gene-to-gene comparison techniques resulted in contradictory conclusions regarding the relevance of animal models for translational research. A major reason for the discrepancies between different gene expression analyses is the arbitrary filtering of differentially expressed genes. Furthermore, the comparison of single genes between different species and platforms often is limited by technical variance, leading to misinterpretation of the con/discordance between data from human and animal models. Thus, standardized approaches for systematic data analysis are needed. To overcome subjective gene filtering and ineffective gene-to-gene comparisons, we recently demonstrated that gene set enrichment analysis (GSEA) has the potential to avoid these problems. Therefore, we developed a standardized protocol for the use of GSEA to distinguish between appropriate and inappropriate animal models for translational research. This protocol is not suitable to predict how to design new model systems a-priori, as it requires existing experimental omics data. However, the protocol describes how to interpret existing data in a standardized manner in order to select the most suitable animal model, thus avoiding unnecessary animal experiments and misleading translational studies.

  18. Mapping lupus susceptibility genes in the NZM2410 mouse model.

    PubMed

    Morel, Laurence

    2012-01-01

    Considerable efforts have been deployed over the years to decipher the genetic basis of systemic lupus erythematosus (SLE). The NZM2410 strain is the murine model in which the genetic analysis of SLE is the most advanced. NZM2410 studies have shown that, as in SLE patients, lupus susceptibility is achieved by the coexpression of many susceptibility alleles, each of which makes a small contribution to the overall disease phenotype. This mouse model has also revealed the critical role played by gene-gene interactions, which are believed to make an essential contribution to human SLE heritability, although this contribution has been much more difficult to characterize. We have now reached a phase in which NZM2410 susceptibility genes have been identified, all of them novel in their association with lupus or even with immune functions. Ongoing studies geared at understanding how these genes impact immune tolerance and interact with each other in the mouse, and their impact on the human immune system or target organs, will undoubtedly lead to important discoveries for a better understanding of the disease and the potential identification of therapeutic targets. Copyright © 2012 Elsevier Inc. All rights reserved.

  19. Inferring evolution of gene duplicates using probabilistic models and nonparametric belief propagation.

    PubMed

    Zeng, Jia; Hannenhalli, Sridhar

    2013-01-01

    Gene duplication, followed by functional evolution of duplicate genes, is a primary engine of evolutionary innovation. In turn, gene expression evolution is a critical component of overall functional evolution of paralogs. Inferring evolutionary history of gene expression among paralogs is therefore a problem of considerable interest. It also represents significant challenges. The standard approaches of evolutionary reconstruction assume that at an internal node of the duplication tree, the two duplicates evolve independently. However, because of various selection pressures functional evolution of the two paralogs may be coupled. The coupling of paralog evolution corresponds to three major fates of gene duplicates: subfunctionalization (SF), conserved function (CF) or neofunctionalization (NF). Quantitative analysis of these fates is of great interest and clearly influences evolutionary inference of expression. These two interrelated problems of inferring gene expression and evolutionary fates of gene duplicates have not been studied together previously and motivate the present study. Here we propose a novel probabilistic framework and algorithm to simultaneously infer (i) ancestral gene expression and (ii) the likely fate (SF, NF, CF) at each duplication event during the evolution of gene family. Using tissue-specific gene expression data, we develop a nonparametric belief propagation (NBP) algorithm to predict the ancestral expression level as a proxy for function, and describe a novel probabilistic model that relates the predicted and known expression levels to the possible evolutionary fates. We validate our model using simulation and then apply it to a genome-wide set of gene duplicates in human. Our results suggest that SF tends to be more frequent at the earlier stage of gene family expansion, while NF occurs more frequently later on.

  20. Petri net modelling of gene regulation of the Duchenne muscular dystrophy.

    PubMed

    Grunwald, Stefanie; Speer, Astrid; Ackermann, Jörg; Koch, Ina

    2008-05-01

    In the search for therapeutic strategies for Duchenne muscular dystrophy, it is of great interest to understand completely the responsible molecular pathways downstream of dystrophin. For this reason we have performed real-time PCR experiments to compare mRNA expression levels of relevant genes in tissues of affected patients and controls. To put experimental data in the context of the underlying pathways, theoretical models are needed. Modelling of biological processes in the cell at higher description levels is still an open problem in the field of systems biology. In this paper, a new application of Petri net theory is presented to model gene regulatory processes of Duchenne muscular dystrophy. We have developed a Petri net model, which is based mainly on our own experimental data and on literature data. We distinguish between up- and down-regulated states of gene expression. The analysis of the model comprises the computation of structural and dynamic properties with a focus on a thorough T-invariant analysis, including clustering techniques and the decomposition of the network into maximal common transition sets (MCT-sets), which can be interpreted as functionally related building blocks. All possible pathways, which reflect the complex net behaviour depending on different gene expression patterns, are discussed. We introduce Mauritius maps of T-invariants, which enable, for example, theoretical knockout analysis. The resulting model serves as a basis for a better understanding of pathological processes, and thereby for planning the next experimental steps in the search for new therapeutic possibilities. The Petri net editor and animator Snoopy and the clustering tool PInA are freely available via http://www-dssz.informatik.tu-cottbus.de/~wwwdssz/. The Petri net models used can be accessed via http://www.tfh-berlin.de/bi/duchenne/.
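
    For readers unfamiliar with T-invariants, a minimal Python sketch may help. A T-invariant is a non-zero, non-negative vector of transition firing counts x with C·x = 0, where C is the place-by-transition incidence matrix, so firing the transitions that many times returns the net to its starting marking. The two-place net below (a gene switching between down- and up-regulated states) is a hypothetical toy, not the authors' model.

```python
def is_t_invariant(incidence, firing_counts):
    """Check whether firing_counts is a T-invariant of the given net.

    incidence[p][t] is the net token change in place p when transition t fires.
    """
    if any(count < 0 for count in firing_counts) or not any(firing_counts):
        return False  # must be non-negative and not all zero
    return all(
        sum(row[t] * firing_counts[t] for t in range(len(firing_counts))) == 0
        for row in incidence
    )


# Hypothetical toy net. Places: gene_down, gene_up.
# Transitions: activate (down -> up), repress (up -> down).
INCIDENCE = [
    [-1, +1],  # gene_down: consumed by activate, produced by repress
    [+1, -1],  # gene_up:   produced by activate, consumed by repress
]

print(is_t_invariant(INCIDENCE, [1, 1]))  # one activation plus one repression
print(is_t_invariant(INCIDENCE, [1, 0]))  # activation alone changes the marking
```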

  1. Gene Expression Profiling in Rodent Models for Schizophrenia

    PubMed Central

    Schijndel, Jessica E. Van; Martens, Gerard J.M

    2010-01-01

    The complex neurodevelopmental disorder schizophrenia is thought to be induced by an interaction between predisposing genes and environmental stressors. In order to get a better insight into the aetiology of this complex disorder, animal models have been developed. In this review, we summarize mRNA expression profiling studies on neurodevelopmental, pharmacological and genetic animal models for schizophrenia. We discuss parallels and contradictions among these studies, and propose strategies for future research. PMID:21629445

  2. Reverse engineering model structures for soil and ecosystem respiration: the potential of gene expression programming

    NASA Astrophysics Data System (ADS)

    Ilie, Iulia; Dittrich, Peter; Carvalhais, Nuno; Jung, Martin; Heinemeyer, Andreas; Migliavacca, Mirco; Morison, James I. L.; Sippel, Sebastian; Subke, Jens-Arne; Wilkinson, Matthew; Mahecha, Miguel D.

    2017-09-01

    Accurate model representation of land-atmosphere carbon fluxes is essential for climate projections. However, the exact responses of carbon cycle processes to climatic drivers often remain uncertain. Presently, knowledge derived from experiments, complemented by a steadily evolving body of mechanistic theory, provides the main basis for developing such models. The strongly increasing availability of measurements may facilitate new ways of identifying suitable model structures using machine learning. Here, we explore the potential of gene expression programming (GEP) to derive relevant model formulations based solely on the signals present in data by automatically applying various mathematical transformations to potential predictors and repeatedly evolving the resulting model structures. In contrast to most other machine learning regression techniques, the GEP approach generates readable models that allow for prediction and possibly for interpretation. Our study is based on two cases: artificially generated data and real observations. Simulations based on artificial data show that GEP is successful in identifying prescribed functions, with the prediction capacity of the models comparable to four state-of-the-art machine learning methods (random forests, support vector machines, artificial neural networks, and kernel ridge regressions). Based on real observations we explore the responses of the different components of terrestrial respiration at an oak forest in south-eastern England. We find that the GEP-retrieved models are often better in prediction than some established respiration models. Based on their structures, we find previously unconsidered exponential dependencies of respiration on seasonal ecosystem carbon assimilation and water dynamics. We noticed that the GEP models are only partly portable across respiration components, the identification of a general terrestrial respiration model possibly prevented by equifinality issues. Overall, GEP

  3. Liver-directed lentiviral gene therapy in a dog model of hemophilia B.

    PubMed

    Cantore, Alessio; Ranzani, Marco; Bartholomae, Cynthia C; Volpin, Monica; Valle, Patrizia Della; Sanvito, Francesca; Sergi, Lucia Sergi; Gallina, Pierangela; Benedicenti, Fabrizio; Bellinger, Dwight; Raymer, Robin; Merricks, Elizabeth; Bellintani, Francesca; Martin, Samia; Doglioni, Claudio; D'Angelo, Armando; VandenDriessche, Thierry; Chuah, Marinee K; Schmidt, Manfred; Nichols, Timothy; Montini, Eugenio; Naldini, Luigi

    2015-03-04

    We investigated the efficacy of liver-directed gene therapy using lentiviral vectors in a large animal model of hemophilia B and evaluated the risk of insertional mutagenesis in tumor-prone mouse models. We showed that gene therapy using lentiviral vectors targeting the expression of a canine factor IX transgene in hepatocytes was well tolerated and provided a stable long-term production of coagulation factor IX in dogs with hemophilia B. By exploiting three different mouse models designed to amplify the consequences of insertional mutagenesis, we showed that no genotoxicity was detected with these lentiviral vectors. Our findings suggest that lentiviral vectors may be an attractive candidate for gene therapy targeted to the liver and may be potentially useful for the treatment of hemophilia. Copyright © 2015, American Association for the Advancement of Science.

  4. Rice-arsenate interactions in hydroponics: a three-gene model for tolerance.

    PubMed

    Norton, Gareth J; Nigar, Meher; Williams, Paul N; Dasgupta, Tapash; Meharg, Andrew A; Price, Adam H

    2008-01-01

    In this study, the genetic mapping of the tolerance of root growth to 13.3 μM arsenate [As(V)] using the Bala×Azucena population is improved, and candidate genes for further study are identified. A remarkable three-gene model of tolerance is advanced, which appears to involve epistatic interaction between three major genes, two on chromosome 6 and one on chromosome 10. Any combination of two of these genes inherited from the tolerant parent leads to the plant having tolerance. Lists of potential positional candidate genes are presented. These are then refined using whole genome transcriptomics data and bioinformatics. Physiological evidence is also provided that genes related to phosphate transport are unlikely to be behind the genetic loci conferring tolerance. These results offer testable hypotheses for genes related to As(V) tolerance that might offer strategies for mitigating arsenic (As) accumulation in consumed rice.

  5. Rice–arsenate interactions in hydroponics: a three-gene model for tolerance

    PubMed Central

    Norton, Gareth J.; Nigar, Meher; Dasgupta, Tapash; Meharg, Andrew A.; Price, Adam H.

    2008-01-01

    In this study, the genetic mapping of the tolerance of root growth to 13.3 μM arsenate [As(V)] using the Bala×Azucena population is improved, and candidate genes for further study are identified. A remarkable three-gene model of tolerance is advanced, which appears to involve epistatic interaction between three major genes, two on chromosome 6 and one on chromosome 10. Any combination of two of these genes inherited from the tolerant parent leads to the plant having tolerance. Lists of potential positional candidate genes are presented. These are then refined using whole genome transcriptomics data and bioinformatics. Physiological evidence is also provided that genes related to phosphate transport are unlikely to be behind the genetic loci conferring tolerance. These results offer testable hypotheses for genes related to As(V) tolerance that might offer strategies for mitigating arsenic (As) accumulation in consumed rice. PMID:18453529
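
    The inheritance rule stated in the abstract (any two of the three major genes inherited from the tolerant parent confer tolerance) can be written as a small predicate. This is only an illustrative sketch: the function name is hypothetical, and treating three tolerant alleles as tolerant is an assumption, since the abstract only mentions combinations of two.

```python
def is_tolerant(tolerant_parent_alleles):
    """Return True if at least two of the three major genes carry the
    allele inherited from the tolerant parent.

    tolerant_parent_alleles: three booleans, one per major gene
    (two genes on chromosome 6, one on chromosome 10).
    Counting all three as tolerant is an assumption, not a reported result.
    """
    if len(tolerant_parent_alleles) != 3:
        raise ValueError("expected exactly three genes")
    return sum(bool(a) for a in tolerant_parent_alleles) >= 2


print(is_tolerant((True, True, False)))   # two tolerant-parent alleles
print(is_tolerant((True, False, False)))  # only one
```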

  6. Maximal gene number maintainable by stochastic correction - The second error threshold.

    PubMed

    Hubai, András G; Kun, Ádám

    2016-09-21

    There is still no general solution to Eigen's Paradox, the chicken-or-egg problem of the origin of life: neither accurate copying, nor long genomes could have evolved without one another being established beforehand. But an array of small, individually replicating genes might offer a workaround, provided that multilevel selection assists the survival of the ensemble. There are two key difficulties that such a system has to overcome: the non-synchronous replication of genes, and their random assortment into daughter cells (the units of higher-level selection) upon fission. Here we find, using the Stochastic Corrector Model framework, that a large number (τ≥90) of genes can coexist. Furthermore, the system can tolerate about 10% replication rate asymmetry (competition) among the genes. On this basis, we put forward a plausible (and testable!) scenario for how novel genes could have been incorporated into early living systems: a route to complex metabolism. Copyright © 2016 Elsevier Ltd. All rights reserved.

  7. Fast and Accurate Hybrid Stream PCRTM-SOLAR Radiative Transfer Model for Reflected Solar Spectrum Simulation in the Cloudy Atmosphere

    NASA Technical Reports Server (NTRS)

    Yang, Qiguang; Liu, Xu; Wu, Wan; Kizer, Susan; Baize, Rosemary R.

    2016-01-01

    A hybrid stream PCRTM-SOLAR model has been proposed for fast and accurate radiative transfer simulation. It first calculates the reflected solar (RS) radiances in a fast, coarse way and then, with the help of a pre-saved matrix, transforms the results to obtain the desired highly accurate RS spectrum. The methodology has been demonstrated with the hybrid stream discrete ordinate (HSDO) radiative transfer (RT) model. The HSDO method calculates the monochromatic radiances using a 4-stream discrete ordinate method, where only a small number of monochromatic radiances are simulated with both the 4-stream and a larger N-stream (N = 16) discrete ordinate RT algorithm. The accuracy of the obtained channel radiance is comparable to the result from N-stream moderate resolution atmospheric transmission version 5 (MODTRAN5). The root-mean-square errors are usually less than 5×10⁻⁴ mW/cm²/sr/cm. The computational speed is three to four orders of magnitude faster than the medium-speed correlated-k option of MODTRAN5. This method is very efficient for simulating thousands of RS spectra under multi-layer clouds/aerosols and solar radiation conditions for climate change studies and numerical weather prediction applications.

  8. Efficient and accurate causal inference with hidden confounders from genome-transcriptome variation data

    PubMed Central

    2017-01-01

    Mapping gene expression as a quantitative trait using whole genome-sequencing and transcriptome analysis allows the discovery of the functional consequences of genetic variation. We developed a novel method and ultra-fast software, Findr, for highly accurate causal inference between gene expression traits using cis-regulatory DNA variations as causal anchors, which improves on current methods by taking into consideration hidden confounders and weak regulations. Findr outperformed existing methods on the DREAM5 Systems Genetics challenge and on the prediction of microRNA and transcription factor targets in human lymphoblastoid cells, while being nearly a million times faster. Findr is publicly available at https://github.com/lingfeiwang/findr. PMID:28821014

  9. Integrated Translatome and Proteome: Approach for Accurate Portraying of Widespread Multifunctional Aspects of Trichoderma

    PubMed Central

    Sharma, Vivek; Salwan, Richa; Sharma, P. N.; Gulati, Arvind

    2017-01-01

    Genome-wide studies of transcript expression help in systematic monitoring of genes and allow targeting of candidate genes for future research. In contrast to relatively stable genomic data, the expression of genes is dynamic and regulated in both time and space at different levels. The variation in the rate of translation is specific for each protein. Both the inherent nature of an mRNA molecule to be translated and the external environmental stimuli can affect the efficiency of the translation process. In biocontrol agents (BCAs), the molecular response at the translational level may represent a noise-like response of absolute transcript level and an adaptive response to physiological and pathological situations, representing the subset of the mRNA population actively translated in a cell. The molecular responses of biocontrol are complex and involve multistage regulation of a number of genes. The use of high-throughput techniques has led to a rapid increase in the volume of transcriptomics data of Trichoderma. In general, almost half of the variations in transcriptome and protein levels are due to translational control. Thus, studies are required to integrate raw information from different “omics” approaches for accurate depiction of the translational response of BCAs in interaction with plants and plant pathogens. Studies on the translational status of only active mRNAs, bridged with proteome data, will help in accurate characterization of only the subset of mRNAs actively engaged in translation. This review highlights the associated bottlenecks and the use of state-of-the-art procedures in addressing the gap to accelerate future accomplishment of biocontrol mechanisms. PMID:28900417

  10. Gene expression during blow fly development: improving the precision of age estimates in forensic entomology.

    PubMed

    Tarone, Aaron M; Foran, David R

    2011-01-01

    Forensic entomologists use size and developmental stage to estimate blow fly age, and from those, a postmortem interval. Since such estimates are generally accurate but often lack precision, particularly in the older developmental stages, alternative aging methods would be advantageous. Presented here is a means of incorporating developmentally regulated gene expression levels into traditional stage and size data, with a goal of more precisely estimating developmental age of immature Lucilia sericata. Generalized additive models of development showed improved statistical support compared to models that did not include gene expression data, resulting in an increase in estimate precision, especially for postfeeding third instars and pupae. The models were then used to make blind estimates of development for 86 immature L. sericata raised on rat carcasses. Overall, inclusion of gene expression data resulted in increased precision in aging blow flies. © 2010 American Academy of Forensic Sciences.

  11. An accurate model for the computation of the dose of protons in water.

    PubMed

    Embriaco, A; Bellinzona, V E; Fontana, A; Rotondi, A

    2017-06-01

    The accurate and fast calculation of the dose in proton radiation therapy is an essential ingredient for successful treatments. We propose a novel approach with a minimal number of parameters. The approach is based on the exact calculation of the electromagnetic part of the interaction, namely the Molière theory of the multiple Coulomb scattering for the transversal 1D projection and the Bethe-Bloch formula for the longitudinal stopping power profile, including a gaussian energy straggling. To this e.m. contribution the nuclear proton-nucleus interaction is added with a simple two-parameter model. Then, the non gaussian lateral profile is used to calculate the radial dose distribution with a method that assumes the cylindrical symmetry of the distribution. The results, obtained with a fast C++ based computational code called MONET (MOdel of ioN dosE for Therapy), are in very good agreement with the FLUKA MC code, within a few percent in the worst case. This study provides a new tool for fast dose calculation or verification, possibly for clinical use. Copyright © 2017 Associazione Italiana di Fisica Medica. Published by Elsevier Ltd. All rights reserved.

  12. Robust variable selection method for nonparametric differential equation models with application to nonlinear dynamic gene regulatory network analysis.

    PubMed

    Lu, Tao

    2016-01-01

    The gene regulation network (GRN) evaluates the interactions between genes and looks for models to describe the gene expression behavior. These models have many applications; for instance, by characterizing the gene expression mechanisms that cause certain disorders, it would be possible to target those genes to block the progress of the disease. Many biological processes are driven by nonlinear dynamic GRNs. In this article, we propose a nonparametric ordinary differential equation (ODE) to model the nonlinear dynamic GRN. Specifically, we address the following questions simultaneously: (i) extract information from noisy time course gene expression data; (ii) model the nonlinear ODE through a nonparametric smoothing function; (iii) identify the important regulatory gene(s) through a group smoothly clipped absolute deviation (SCAD) approach; (iv) test the robustness of the model against possible shortening of experimental duration. We illustrate the usefulness of the model and associated statistical methods through a simulation and a real application example.
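
    For readers unfamiliar with the SCAD penalty named in step (iii), the sketch below evaluates the standard scalar form of the penalty. It is not the author's group-SCAD implementation: a group version would typically apply the same function to the norm of a coefficient group, which is stated here as an assumption. The function name and the default a = 3.7 (a commonly used value) are illustrative choices.

```python
def scad_penalty(theta, lam, a=3.7):
    """Scalar smoothly clipped absolute deviation (SCAD) penalty.

    theta: coefficient (or, for a group penalty, the norm of a group)
    lam:   regularization parameter (> 0)
    a:     shape parameter (> 2)
    """
    t = abs(theta)
    if t <= lam:
        # Lasso-like linear region for small coefficients
        return lam * t
    if t <= a * lam:
        # Quadratic transition region
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Constant region: large coefficients are not penalized further
    return lam * lam * (a + 1) / 2


for value in (0.5, 2.0, 10.0):
    print(value, scad_penalty(value, lam=1.0))
```

    The constant region is what distinguishes SCAD from the lasso: once a coefficient is large enough, increasing it further adds no extra penalty.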

  13. The evolution of the protein synthesis system. I - A model of a primitive protein synthesis system

    NASA Technical Reports Server (NTRS)

    Mizutani, H.; Ponnamperuma, C.

    1977-01-01

    A model is developed to describe the evolution of the protein synthesis system. The model is comprised of two independent autocatalytic systems, one including one gene (A-gene) and two activated amino acid polymerases (O and A-polymerases), and the other including the addition of another gene (N-gene) and a nucleotide polymerase. Simulation results have suggested that even a small enzymic activity and polymerase specificity could lead the system to the most accurate protein synthesis, as far as permitted by transitions to systems with higher accuracy.

  14. Identification and evaluation of reference genes for accurate gene expression normalization of fresh and frozen-thawed spermatozoa of water buffalo (Bubalus bubalis).

    PubMed

    Ashish, Shende; Bhure, S K; Harikrishna, Pillai; Ramteke, S S; Muhammed Kutty, V H; Shruthi, N; Ravi Kumar, G V P P S; Manish, Mahawar; Ghosh, S K; Mihir, Sarkar

    2017-04-01

    Quantitative real-time PCR (qRT-PCR) has become an important tool for gene-expression analysis of a selected number of genes in life science. Although the dynamic range, sensitivity and reproducibility of qRT-PCR are good, its reliability depends largely on the selection of proper reference genes (RGs) for normalization. RG expression has been reported to vary considerably within the same cell type under different experimental treatments, yet no systematic study has been conducted to identify and evaluate appropriate RGs in spermatozoa of domestic animals. Therefore, this study was conducted to identify suitable stable RGs in fresh and frozen-thawed spermatozoa. We assessed 13 candidate RGs (BACT, RPS18s, RPS15A, ATP5F1, HMBS, ATP2B4, RPL13, EEF2, TBP, EIF2B2, MDH1, B2M and GLUT5) of different functions and pathways using five algorithms. Regardless of the approach, the ranking of the most and the least stable candidate RGs remained almost the same. The comprehensive ranking by RefFinder showed GLUT5 and ATP2B4 as the two most stable RGs and B2M and MDH1 as the two least stable. The expression levels of four heat shock protein (HSP) genes were used as targets to evaluate the efficiency of the RGs for normalization. The results demonstrated an exponential difference in expression levels of the four HSP genes upon normalization of the data with the most stable and the least stable RGs. Our study provides convenient RGs for normalization of gene expression of key metabolic pathways affected during freezing and thawing of spermatozoa of buffalo and other closely related bovines. Copyright © 2017 Elsevier Inc. All rights reserved.
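
    To illustrate why the choice of reference genes matters, the sketch below normalizes a target gene with the widely used 2^-ΔΔCt calculation. The abstract does not state which normalization formula the authors applied, so this is an assumed, generic example; it also assumes 100% amplification efficiency, and all Ct values are invented for illustration.

```python
from statistics import mean


def delta_ct(target_ct, reference_cts):
    """Ct of the target gene minus the mean Ct of the reference genes."""
    return target_ct - mean(reference_cts)


def relative_expression(target_treated, refs_treated, target_control, refs_control):
    """Fold change of the target gene (treated vs. control) by 2^-ddCt.

    Assumes perfect doubling per cycle (100% PCR efficiency).
    """
    ddct = delta_ct(target_treated, refs_treated) - delta_ct(target_control, refs_control)
    return 2.0 ** (-ddct)


# Hypothetical Ct values: the same target data, normalized with stable
# reference genes and with reference genes that shift between conditions.
stable = relative_expression(24.0, [20.0, 22.0], 26.0, [20.0, 22.0])
unstable = relative_expression(24.0, [22.0, 24.0], 26.0, [20.0, 22.0])
print(stable, unstable)
```

    Because Ct values enter through an exponent, a shift of a few cycles in the reference genes changes the reported fold change multiplicatively, which is consistent with the exponential differences the abstract describes.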

  15. Integrated modeling of protein-coding genes in the Manduca sexta genome using RNA-Seq data from the biochemical model insect

    PubMed Central

    Cao, Xiaolong; Jiang, Haobo

    2015-01-01

    The genome sequence of Manduca sexta was recently determined using 454 technology. Cufflinks and MAKER2 were used to establish gene models in the genome assembly based on the RNA-Seq data and other species' sequences. Aided by the extensive RNA-Seq data from 50 tissue samples at various life stages, annotators over the world (including the present authors) have manually confirmed and improved a small percentage of the models after spending months of effort. While such collaborative efforts are highly commendable, many of the predicted genes still have problems which may hamper future research on this insect species. As a biochemical model representing lepidopteran pests, M. sexta has been used extensively to study insect physiological processes for over five decades. In this work, we assembled Manduca datasets Cufflinks 3.0, Trinity 4.0, and Oases 4.0 to assist the manual annotation efforts and development of Official Gene Set (OGS) 2.0. To further improve annotation quality, we developed methods to evaluate gene models in the MAKER2, Cufflinks, Oases and Trinity assemblies and selected the best ones to constitute MCOT 1.0 after thorough crosschecking. MCOT 1.0 has 18,089 genes encoding 31,666 proteins: 32.8% match OGS 2.0 models perfectly or near perfectly, 11,747 differ considerably, and 29.5% are absent in OGS 2.0. Future automation of this process is anticipated to greatly reduce human efforts in generating comprehensive, reliable models of structural genes in other genome projects where extensive RNA-Seq data are available. PMID:25612938

  16. Integrated modeling of protein-coding genes in the Manduca sexta genome using RNA-Seq data from the biochemical model insect.

    PubMed

    Cao, Xiaolong; Jiang, Haobo

    2015-07-01

    The genome sequence of Manduca sexta was recently determined using 454 technology. Cufflinks and MAKER2 were used to establish gene models in the genome assembly based on the RNA-Seq data and other species' sequences. Aided by the extensive RNA-Seq data from 50 tissue samples at various life stages, annotators over the world (including the present authors) have manually confirmed and improved a small percentage of the models after spending months of effort. While such collaborative efforts are highly commendable, many of the predicted genes still have problems which may hamper future research on this insect species. As a biochemical model representing lepidopteran pests, M. sexta has been used extensively to study insect physiological processes for over five decades. In this work, we assembled Manduca datasets Cufflinks 3.0, Trinity 4.0, and Oases 4.0 to assist the manual annotation efforts and development of Official Gene Set (OGS) 2.0. To further improve annotation quality, we developed methods to evaluate gene models in the MAKER2, Cufflinks, Oases and Trinity assemblies and selected the best ones to constitute MCOT 1.0 after thorough crosschecking. MCOT 1.0 has 18,089 genes encoding 31,666 proteins: 32.8% match OGS 2.0 models perfectly or near perfectly, 11,747 differ considerably, and 29.5% are absent in OGS 2.0. Future automation of this process is anticipated to greatly reduce human efforts in generating comprehensive, reliable models of structural genes in other genome projects where extensive RNA-Seq data are available. Copyright © 2015 Elsevier Ltd. All rights reserved.

  17. Accurate, Streamlined Analysis of mRNA Translation by Sucrose Gradient Fractionation

    PubMed Central

    Aboulhouda, Soufiane; Di Santo, Rachael; Therizols, Gabriel; Weinberg, David

    2017-01-01

    The efficiency with which proteins are produced from mRNA molecules can vary widely across transcripts, cell types, and cellular states. Methods that accurately assay the translational efficiency of mRNAs are critical to gaining a mechanistic understanding of post-transcriptional gene regulation. One way to measure translational efficiency is to determine the number of ribosomes associated with an mRNA molecule, normalized to the length of the coding sequence. The primary method for this analysis of individual mRNAs is sucrose gradient fractionation, which physically separates mRNAs based on the number of bound ribosomes. Here, we describe a streamlined protocol for accurate analysis of mRNA association with ribosomes. Compared to previous protocols, our method incorporates internal controls and improved buffer conditions that together reduce artifacts caused by non-specific mRNA–ribosome interactions. Moreover, our direct-from-fraction qRT-PCR protocol eliminates the need for RNA purification from gradient fractions, which greatly reduces the amount of hands-on time required and facilitates parallel analysis of multiple conditions or gene targets. Additionally, no phenol waste is generated during the procedure. We initially developed the protocol to investigate the translationally repressed state of the HAC1 mRNA in S. cerevisiae, but we also detail adapted procedures for mammalian cell lines and tissues. PMID:29170751
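
    The measure of translational efficiency described above (number of ribosomes associated with an mRNA, normalized to coding-sequence length) can be sketched as follows. The representation of gradient data as a mapping from ribosome number to relative mRNA abundance, the function names and the per-kilobase scaling are all assumptions made for illustration; they are not taken from the protocol.

```python
def mean_ribosomes(fraction_abundance):
    """Abundance-weighted mean number of ribosomes per mRNA.

    fraction_abundance: dict mapping ribosome count of a gradient
    fraction to the relative amount of the mRNA found in that fraction
    (e.g. from qRT-PCR of each fraction).
    """
    total = sum(fraction_abundance.values())
    if total <= 0:
        raise ValueError("total mRNA abundance must be positive")
    return sum(n * x for n, x in fraction_abundance.items()) / total


def ribosome_density(fraction_abundance, cds_length_nt):
    """Mean ribosomes per kilobase of coding sequence."""
    if cds_length_nt <= 0:
        raise ValueError("CDS length must be positive")
    return mean_ribosomes(fraction_abundance) / (cds_length_nt / 1000.0)


# Hypothetical mRNA: half of the molecules ribosome-free, half with two ribosomes
print(ribosome_density({0: 0.5, 2: 0.5}, cds_length_nt=500))
```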

  18. Can AERONET data be used to accurately model the monochromatic beam and circumsolar irradiances under cloud-free conditions in desert environment?

    NASA Astrophysics Data System (ADS)

    Eissa, Y.; Blanc, P.; Wald, L.; Ghedira, H.

    2015-07-01

    Routine measurements of the beam irradiance at normal incidence (DNI) include the irradiance originating from within the extent of the solar disc only (DNIS) whose angular extent is 0.266° ± 1.7 %, and that from a larger circumsolar region, called the circumsolar normal irradiance (CSNI). This study investigates if the spectral aerosol optical properties of the AERONET stations are sufficient for an accurate modelling of the monochromatic DNIS and CSNI under cloud-free conditions in a desert environment. The data from an AERONET station in Abu Dhabi, United Arab Emirates, and a collocated Sun and Aureole Measurement (SAM) instrument which offers reference measurements of the monochromatic profile of solar radiance, were exploited. Using the AERONET data both the radiative transfer models libRadtran and SMARTS offer an accurate estimate of the monochromatic DNIS, with a relative root mean square error (RMSE) of 5 %, a relative bias of +1 % and a coefficient of determination greater than 0.97. After testing two configurations in SMARTS and three in libRadtran for modelling the monochromatic CSNI, libRadtran exhibits the most accurate results when the AERONET aerosol phase function is presented as a Two Term Henyey-Greenstein phase function. In this case libRadtran exhibited a relative RMSE of 22 %, a bias of -19 % and a coefficient of determination of 0.89. The results are promising and pave the way towards reporting the contribution of the broadband circumsolar irradiance to standard DNI measurements.
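
    The three validation statistics quoted above can be computed as in the sketch below. The exact definitions used in the study are not given in the abstract, so two assumptions are made and labelled in the code: errors are expressed relative to the mean of the reference measurements, and the coefficient of determination is taken as the squared Pearson correlation. The sample values are invented.

```python
from math import sqrt


def validation_stats(modelled, measured):
    """Relative bias, relative RMSE and coefficient of determination.

    Assumptions: 'relative' means divided by the mean of the measured
    values, and R^2 is the squared Pearson correlation coefficient.
    """
    n = len(measured)
    if n != len(modelled) or n < 2:
        raise ValueError("need two equally long series with at least 2 points")
    mean_obs = sum(measured) / n
    mean_mod = sum(modelled) / n
    errors = [m - o for m, o in zip(modelled, measured)]
    bias = sum(errors) / n
    rmse = sqrt(sum(e * e for e in errors) / n)
    cov = sum((m - mean_mod) * (o - mean_obs) for m, o in zip(modelled, measured))
    var_mod = sum((m - mean_mod) ** 2 for m in modelled)
    var_obs = sum((o - mean_obs) ** 2 for o in measured)
    return {
        "rel_bias": bias / mean_obs,
        "rel_rmse": rmse / mean_obs,
        "r2": cov * cov / (var_mod * var_obs),
    }


# Hypothetical example: the model overestimates every measurement by 10 %
print(validation_stats([1.1, 2.2, 3.3], [1.0, 2.0, 3.0]))
```

    The example shows why the statistics are reported together: a model can correlate perfectly with the measurements and still carry a systematic bias.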

  19. Beyond mean-field approximations for accurate and computationally efficient models of on-lattice chemical kinetics

    NASA Astrophysics Data System (ADS)

    Pineda, M.; Stamatakis, M.

    2017-07-01

    Modeling the kinetics of surface catalyzed reactions is essential for the design of reactors and chemical processes. The majority of microkinetic models employ mean-field approximations, which lead to an approximate description of catalytic kinetics by assuming spatially uncorrelated adsorbates. On the other hand, kinetic Monte Carlo (KMC) methods provide a discrete-space continuous-time stochastic formulation that enables an accurate treatment of spatial correlations in the adlayer, but at a significant computation cost. In this work, we use the so-called cluster mean-field approach to develop higher order approximations that systematically increase the accuracy of kinetic models by treating spatial correlations at a progressively higher level of detail. We further demonstrate our approach on a reduced model for NO oxidation incorporating first nearest-neighbor lateral interactions and construct a sequence of approximations of increasingly higher accuracy, which we compare with KMC and mean-field. The latter is found to perform rather poorly, overestimating the turnover frequency by several orders of magnitude for this system. On the other hand, our approximations, while more computationally intense than the traditional mean-field treatment, still achieve tremendous computational savings compared to KMC simulations, thereby opening the way for employing them in multiscale modeling frameworks.

  20. Interrogating the topological robustness of gene regulatory circuits by randomization

    PubMed Central

    Levine, Herbert; Onuchic, Jose N.

    2017-01-01

    One of the most important roles of cells is performing their cellular tasks properly for survival. Cells usually achieve robust functionality, for example, cell-fate decision-making and signal transduction, through multiple layers of regulation involving many genes. Despite the combinatorial complexity of gene regulation, its quantitative behavior has been typically studied on the basis of experimentally verified core gene regulatory circuitry, composed of a small set of important elements. It is still unclear how such a core circuit operates in the presence of many other regulatory molecules and in a crowded and noisy cellular environment. Here we report a new computational method, named random circuit perturbation (RACIPE), for interrogating the robust dynamical behavior of a gene regulatory circuit even without accurate measurements of circuit kinetic parameters. RACIPE generates an ensemble of random kinetic models corresponding to a fixed circuit topology, and utilizes statistical tools to identify generic properties of the circuit. By applying RACIPE to simple toggle-switch-like motifs, we observed that the stable states of all models converge to experimentally observed gene state clusters even when the parameters are strongly perturbed. RACIPE was further applied to a proposed 22-gene network of the Epithelial-to-Mesenchymal Transition (EMT), from which we identified four experimentally observed gene states, including the states that are associated with two different types of hybrid Epithelial/Mesenchymal phenotypes. Our results suggest that dynamics of a gene circuit is mainly determined by its topology, not by detailed circuit parameters. Our work provides a theoretical foundation for circuit-based systems biology modeling. We anticipate RACIPE to be a powerful tool to predict and decode circuit design principles in an unbiased manner, and to quantitatively evaluate the robustness and heterogeneity of gene expression. PMID:28362798
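
    The idea of generating an ensemble of random kinetic models for a fixed circuit topology can be illustrated with a toy toggle switch (two mutually repressing genes). This is a heavily simplified sketch and not the RACIPE software: the rate equations, parameter ranges, explicit Euler integration and the crude way of counting distinct end states are all assumptions chosen for brevity.

```python
import random


def hill_repression(x, threshold, n):
    """Inhibitory Hill function: 1 at x = 0, falling towards 0 as x grows."""
    return 1.0 / (1.0 + (x / threshold) ** n)


def random_parameters(rng):
    """Draw one random kinetic model for the fixed topology A -| B, B -| A."""
    return {
        "gA": rng.uniform(1.0, 20.0),    # maximal production rates
        "gB": rng.uniform(1.0, 20.0),
        "kA": rng.uniform(0.2, 1.0),     # degradation rates
        "kB": rng.uniform(0.2, 1.0),
        "thrA": rng.uniform(1.0, 10.0),  # repression thresholds
        "thrB": rng.uniform(1.0, 10.0),
        "n": rng.randint(1, 4),          # Hill coefficient
    }


def simulate(p, a, b, dt=0.02, steps=5000):
    """Integrate the two rate equations with explicit Euler; return the end point."""
    for _ in range(steps):
        da = p["gA"] * hill_repression(b, p["thrB"], p["n"]) - p["kA"] * a
        db = p["gB"] * hill_repression(a, p["thrA"], p["n"]) - p["kB"] * b
        a, b = a + dt * da, b + dt * db
    return a, b


def distinct_states(p, initial_conditions, tol=1e-3):
    """Collect end points that differ by more than tol in either coordinate.

    Runs that have not fully converged may be counted as separate states.
    """
    states = []
    for a0, b0 in initial_conditions:
        a, b = simulate(p, a0, b0)
        if not any(abs(a - sa) < tol and abs(b - sb) < tol for sa, sb in states):
            states.append((a, b))
    return states


def ensemble_state_counts(n_models=20, n_starts=10, seed=0):
    """Number of distinct end states found for each random model."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n_models):
        p = random_parameters(rng)
        starts = [(rng.uniform(0.0, p["gA"] / p["kA"]),
                   rng.uniform(0.0, p["gB"] / p["kB"])) for _ in range(n_starts)]
        counts.append(len(distinct_states(p, starts)))
    return counts


if __name__ == "__main__":
    print(ensemble_state_counts())
```

    Collecting the end states over many random parameter sets, rather than fitting one parameter set, is what lets topology-level properties such as mono- or bistability be examined statistically.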

  1. A versatile strategy for gene trapping and trap conversion in emerging model organisms.

    PubMed

    Kontarakis, Zacharias; Pavlopoulos, Anastasios; Kiupakis, Alexandros; Konstantinides, Nikolaos; Douris, Vassilis; Averof, Michalis

    2011-06-01

    Genetic model organisms such as Drosophila, C. elegans and the mouse provide formidable tools for studying mechanisms of development, physiology and behaviour. Established models alone, however, allow us to survey only a tiny fraction of the morphological and functional diversity present in the animal kingdom. Here, we present iTRAC, a versatile gene-trapping approach that combines the implementation of unbiased genetic screens with the generation of sophisticated genetic tools both in established and emerging model organisms. The approach utilises an exon-trapping transposon vector that carries an integrase docking site, allowing the targeted integration of new constructs into trapped loci. We provide proof of principle for iTRAC in the emerging model crustacean Parhyale hawaiensis: we generate traps that allow specific developmental and physiological processes to be visualised in unparalleled detail, we show that trapped genes can be easily cloned from an unsequenced genome, and we demonstrate targeting of new constructs into a trapped locus. Using this approach, gene traps can serve as platforms for generating diverse reporters, drivers for tissue-specific expression, gene knockdown and other genetic tools not yet imagined.

  2. Investigating the Effects of Imputation Methods for Modelling Gene Networks Using a Dynamic Bayesian Network from Gene Expression Data

    PubMed Central

    CHAI, Lian En; LAW, Chow Kuan; MOHAMAD, Mohd Saberi; CHONG, Chuii Khim; CHOON, Yee Wen; DERIS, Safaai; ILLIAS, Rosli Md

    2014-01-01

    Background: Gene expression data often contain missing expression values. Therefore, several imputation methods have been applied to solve the missing values, which include k-nearest neighbour (kNN), local least squares (LLS), and Bayesian principal component analysis (BPCA). However, the effects of these imputation methods on the modelling of gene regulatory networks from gene expression data have rarely been investigated and analysed using a dynamic Bayesian network (DBN). Methods: In the present study, we separately imputed datasets of the Escherichia coli S.O.S. DNA repair pathway and the Saccharomyces cerevisiae cell cycle pathway with kNN, LLS, and BPCA, and subsequently used these to generate gene regulatory networks (GRNs) using a discrete DBN. We made comparisons on the basis of previous studies in order to select the gene network with the least error. Results: We found that BPCA and LLS performed better on larger networks (based on the S. cerevisiae dataset), whereas kNN performed better on smaller networks (based on the E. coli dataset). Conclusion: The results suggest that the performance of each imputation method is dependent on the size of the dataset, and this subsequently affects the modelling of the resultant GRNs using a DBN. In addition, on the basis of these results, a DBN has the capacity to discover potential edges, as well as display interactions, between genes. PMID:24876803
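
    As an illustration of the simplest of the three imputation methods compared above, the sketch below fills missing expression values with the mean of the k nearest genes. It is a generic k-nearest-neighbour scheme written for this listing, not the implementation used in the study; the distance measure (root mean squared difference over jointly observed samples) and the use of None for missing values are assumptions.

```python
from math import sqrt


def _distance(row_a, row_b):
    """Root mean squared difference over samples observed in both genes."""
    shared = [(x, y) for x, y in zip(row_a, row_b) if x is not None and y is not None]
    if not shared:
        return float("inf")
    return sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))


def knn_impute(matrix, k=2):
    """Return a copy of matrix (genes x samples) with None entries imputed.

    Each missing value is replaced by the mean of that sample's value in
    the k most similar genes that have the value observed.
    """
    imputed = [list(row) for row in matrix]
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = sorted(
                (_distance(row, other), other[j])
                for m, other in enumerate(matrix)
                if m != i and other[j] is not None
            )
            neighbours = [v for d, v in candidates[:k] if d != float("inf")]
            if neighbours:  # leave the gap if no usable neighbour exists
                imputed[i][j] = sum(neighbours) / len(neighbours)
    return imputed


# Hypothetical expression matrix: the first gene is missing its third sample
data = [
    [1.0, 2.0, None],
    [1.0, 2.0, 3.0],
    [1.1, 2.1, 3.2],
    [9.0, 9.0, 9.0],
]
print(knn_impute(data, k=2)[0])
```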

  3. Robust diagnosis of non-Hodgkin lymphoma phenotypes validated on gene expression data from different laboratories.

    PubMed

    Bhanot, Gyan; Alexe, Gabriela; Levine, Arnold J; Stolovitzky, Gustavo

    2005-01-01

    A major challenge in cancer diagnosis from microarray data is the need for robust, accurate, classification models which are independent of the analysis techniques used and can combine data from different laboratories. We propose such a classification scheme originally developed for phenotype identification from mass spectrometry data. The method uses a robust multivariate gene selection procedure and combines the results of several machine learning tools trained on raw and pattern data to produce an accurate meta-classifier. We illustrate and validate our method by applying it to gene expression datasets: the oligonucleotide HuGeneFL microarray dataset of Shipp et al. (www.genome.wi.mit.du/MPR/lymphoma) and the Hu95Av2 Affymetrix dataset (DallaFavera's laboratory, Columbia University). Our pattern-based meta-classification technique achieves higher predictive accuracies than each of the individual classifiers, is robust against data perturbations and provides subsets of related predictive genes. Our techniques predict that combinations of some genes in the p53 pathway are highly predictive of phenotype. In particular, we find that in 80% of DLBCL cases the mRNA level of at least one of the three genes p53, PLK1 and CDK2 is elevated, while in 80% of FL cases, the mRNA level of at most one of them is elevated.
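
    One simple way to combine the outputs of several classifiers into a meta-classifier is a majority vote, sketched below. The abstract does not say how the individual results are combined, so majority voting is an assumption used purely for illustration, and the function name is hypothetical.

```python
from collections import Counter


def meta_classify(predictions):
    """Combine class labels from several classifiers by majority vote.

    Returns the most frequent label, or None when the top two labels tie.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    ranked = Counter(predictions).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # no majority; the caller decides how to break ties
    return ranked[0][0]


# Hypothetical outputs of three individual classifiers for one sample
print(meta_classify(["DLBCL", "FL", "DLBCL"]))
```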

  4. Yeast Phenomics: An Experimental Approach for Modeling Gene Interaction Networks that Buffer Disease

    PubMed Central

    Hartman, John L.; Stisher, Chandler; Outlaw, Darryl A.; Guo, Jingyu; Shah, Najaf A.; Tian, Dehua; Santos, Sean M.; Rodgers, John W.; White, Richard A.

    2015-01-01

    The genome project increased appreciation of genetic complexity underlying disease phenotypes: many genes contribute to each phenotype and each gene contributes to multiple phenotypes. The aspiration of predicting common disease in individuals has evolved from seeking primary loci to marginal risk assignments based on many genes. Genetic interaction, defined as contributions to a phenotype that are dependent upon particular digenic allele combinations, could improve prediction of phenotype from complex genotype, but it is difficult to study in human populations. High throughput, systematic analysis of S. cerevisiae gene knockouts or knockdowns in the context of disease-relevant phenotypic perturbations provides a tractable experimental approach to derive gene interaction networks, in order to deduce by cross-species gene homology how phenotype is buffered against disease-risk genotypes. Yeast gene interaction network analysis to date has revealed biology more complex than previously imagined. This has motivated the development of more powerful yeast cell array phenotyping methods to globally model the role of gene interaction networks in modulating phenotypes (which we call yeast phenomic analysis). The article illustrates yeast phenomic technology, which is applied here to quantify gene × media interaction at higher resolution and supports use of human-like media for future applications of yeast phenomics for modeling human disease. PMID:25668739

  5. A model of gene expression based on random dynamical systems reveals modularity properties of gene regulatory networks.

    PubMed

    Antoneli, Fernando; Ferreira, Renata C; Briones, Marcelo R S

    2016-06-01

    Here we propose a new approach to modeling gene expression based on the theory of random dynamical systems (RDS) that provides a general coupling prescription between the nodes of any given regulatory network, provided the dynamics of each node is modeled by an RDS. The main virtues of this approach are the following: (i) it provides a natural way to obtain arbitrarily large networks by coupling together simple basic pieces, thus revealing the modularity of regulatory networks; (ii) the assumptions about the stochastic processes used in the modeling are fairly general, in the sense that the only requirement is stationarity; (iii) there is a well developed mathematical theory, which is a blend of smooth dynamical systems theory, ergodic theory and stochastic analysis that allows one to extract relevant dynamical and statistical information without solving the system; (iv) one may obtain the classical rate equations from the corresponding stochastic version by averaging the dynamic random variables (small noise limit). It is important to emphasize that unlike the deterministic case, where coupling two equations is a trivial matter, coupling two RDS is non-trivial, especially in our case, where the coupling is performed between a state variable of one gene and the switching stochastic process of another gene and, hence, it is not a priori true that the resulting coupled system will satisfy the definition of a random dynamical system. We shall provide the necessary arguments that ensure that our coupling prescription does indeed furnish a coupled regulatory network of random dynamical systems. Finally, the fact that classical rate equations are the small noise limit of our stochastic model ensures that any validation or prediction made on the basis of the classical theory is also a validation or prediction of our model. We illustrate our framework with some simple examples of single-gene systems and network motifs. Copyright © 2016 Elsevier Inc. All rights reserved.

  6. Selection of suitable reference genes for gene expression studies in Staphylococcus capitis during growth under erythromycin stress.

    PubMed

    Cui, Bintao; Smooker, Peter M; Rouch, Duncan A; Deighton, Margaret A

    2016-08-01

    Accurate and reproducible measurement of gene transcription requires appropriate reference genes, which are stably expressed under different experimental conditions to provide normalization. Staphylococcus capitis is a human pathogen that produces biofilm under stress, such as imposed by antimicrobial agents. In this study, a set of five commonly used staphylococcal reference genes (gyrB, sodA, recA, tuf and rpoB) were systematically evaluated in two clinical isolates of Staphylococcus capitis (S. capitis subspecies urealyticus and capitis, respectively) under erythromycin stress in mid-log and stationary phases. Two public software programs (geNorm and NormFinder) and two manual calculation methods, reference residue normalization (RRN) and relative quantitative (RQ), were applied. The potential reference genes selected by the four algorithms were further validated by comparing the expression of a well-studied biofilm gene (icaA) with phenotypic biofilm formation in S. capitis under four different experimental conditions. The four methods differed considerably in their ability to predict the most suitable reference gene or gene combination for comparing icaA expression under different conditions. Under the conditions used here, the RQ method provided better selection of reference genes than the other three algorithms; however, this finding needs to be confirmed with a larger number of isolates. This study reinforces the need to assess the stability of reference genes for analysis of target gene expression under different conditions and the use of more than one algorithm in such studies. Although this work was conducted using a specific human pathogen, it emphasizes the importance of selecting suitable reference genes for accurate normalization of gene expression more generally.

  7. The Joint Effects of Background Selection and Genetic Recombination on Local Gene Genealogies

    PubMed Central

    Zeng, Kai; Charlesworth, Brian

    2011-01-01

    Background selection, the effects of the continual removal of deleterious mutations by natural selection on variability at linked sites, is potentially a major determinant of DNA sequence variability. However, the joint effects of background selection and genetic recombination on the shape of the neutral gene genealogy have proved hard to study analytically. The only existing formula concerns the mean coalescent time for a pair of alleles, making it difficult to assess the importance of background selection from genome-wide data on sequence polymorphism. Here we develop a structured coalescent model of background selection with recombination and implement it in a computer program that efficiently generates neutral gene genealogies for an arbitrary sample size. We check the validity of the structured coalescent model against forward-in-time simulations and show that it accurately captures the effects of background selection. The model produces more accurate predictions of the mean coalescent time than the existing formula and supports the conclusion that the effect of background selection is greater in the interior of a deleterious region than at its boundaries. The level of linkage disequilibrium between sites is elevated by background selection, to an extent that is well summarized by a change in effective population size. The structured coalescent model is readily extendable to more realistic situations and should prove useful for analyzing genome-wide polymorphism data. PMID:21705759

  8. The joint effects of background selection and genetic recombination on local gene genealogies.

    PubMed

    Zeng, Kai; Charlesworth, Brian

    2011-09-01

    Background selection, the effects of the continual removal of deleterious mutations by natural selection on variability at linked sites, is potentially a major determinant of DNA sequence variability. However, the joint effects of background selection and genetic recombination on the shape of the neutral gene genealogy have proved hard to study analytically. The only existing formula concerns the mean coalescent time for a pair of alleles, making it difficult to assess the importance of background selection from genome-wide data on sequence polymorphism. Here we develop a structured coalescent model of background selection with recombination and implement it in a computer program that efficiently generates neutral gene genealogies for an arbitrary sample size. We check the validity of the structured coalescent model against forward-in-time simulations and show that it accurately captures the effects of background selection. The model produces more accurate predictions of the mean coalescent time than the existing formula and supports the conclusion that the effect of background selection is greater in the interior of a deleterious region than at its boundaries. The level of linkage disequilibrium between sites is elevated by background selection, to an extent that is well summarized by a change in effective population size. The structured coalescent model is readily extendable to more realistic situations and should prove useful for analyzing genome-wide polymorphism data.

  9. Lung tumor diagnosis and subtype discovery by gene expression profiling.

    PubMed

    Wang, Lu-yong; Tu, Zhuowen

    2006-01-01

    The optimal treatment of patients with complex diseases, such as cancers, depends on accurate diagnosis using a combination of clinical and histopathological data. In many scenarios, this becomes tremendously difficult because of the limitations in clinical presentation and histopathology. To accurately diagnose complex diseases, molecular classification based on gene or protein expression profiles is indispensable for modern medicine. Moreover, many heterogeneous diseases consist of various potential subtypes on a molecular basis and differ remarkably in their response to therapies. It is critical to accurately predict subgroups from disease gene expression profiles. More fundamental knowledge of the molecular basis and classification of disease could aid in the prediction of patient outcome, the informed selection of therapies, and identification of novel molecular targets for therapy. In this paper, we propose a new disease diagnostic method, the probabilistic boosting tree (PB tree) method, applied to gene expression profiles of lung tumors. It enables accurate disease classification and subtype discovery in disease. It automatically constructs a tree in which each node combines a number of weak classifiers into a strong classifier. Also, subtype discovery is naturally embedded in the learning process. Our algorithm achieves excellent diagnostic performance, and meanwhile it is capable of detecting the disease subtype based on gene expression profile.

  10. Analyzing gene perturbation screens with nested effects models in R and bioconductor.

    PubMed

    Fröhlich, Holger; Beissbarth, Tim; Tresch, Achim; Kostka, Dennis; Jacob, Juby; Spang, Rainer; Markowetz, F

    2008-11-01

    Nested effects models (NEMs) are a class of probabilistic models introduced to analyze the effects of gene perturbation screens visible in high-dimensional phenotypes like microarrays or cell morphology. NEMs reverse engineer upstream/downstream relations of cellular signaling cascades. NEMs take as input a set of candidate pathway genes and phenotypic profiles of perturbing these genes. NEMs return a pathway structure explaining the observed perturbation effects. Here, we describe the package nem, an open-source software to efficiently infer NEMs from data. Our software implements several search algorithms for model fitting and is applicable to a wide range of different data types and representations. The methods we present summarize the current state-of-the-art in NEMs. Our software is written in the R language and freely available via the Bioconductor project at http://www.bioconductor.org.

  11. Modeling methodology for the accurate and prompt prediction of symptomatic events in chronic diseases.

    PubMed

    Pagán, Josué; Risco-Martín, José L; Moya, José M; Ayala, José L

    2016-08-01

    Prediction of symptomatic crises in chronic diseases allows decisions to be taken before the symptoms occur, such as the intake of drugs to avoid the symptoms or the activation of medical alarms. The prediction horizon is in this case an important parameter in order to accommodate the pharmacokinetics of medications, or the response time of medical services. This paper presents a study about the prediction limits of a chronic disease with symptomatic crises: the migraine. For that purpose, this work develops a methodology to build predictive migraine models and to improve these predictions beyond the limits of the initial models. The maximum prediction horizon is analyzed, and its dependency on the selected features is studied. A strategy for model selection is proposed to tackle the trade-off between conservative but robust predictive models and less accurate predictions with longer horizons. The obtained results show a prediction horizon close to 40 min, which is in the time range of the drug pharmacokinetics. Experiments have been performed in a realistic scenario where input data have been acquired in an ambulatory clinical study by the deployment of a non-intrusive Wireless Body Sensor Network. Our results provide an effective methodology for the selection of the future horizon in the development of prediction algorithms for diseases experiencing symptomatic crises. Copyright © 2016 Elsevier Inc. All rights reserved.

  12. Can AERONET data be used to accurately model the monochromatic beam and circumsolar irradiances under cloud-free conditions in desert environment?

    NASA Astrophysics Data System (ADS)

    Eissa, Y.; Blanc, P.; Wald, L.; Ghedira, H.

    2015-12-01

    Routine measurements of the beam irradiance at normal incidence include the irradiance originating from within the extent of the solar disc only (DNIS), whose angular extent is 0.266° ± 1.7 %, and from a larger circumsolar region, called the circumsolar normal irradiance (CSNI). This study investigates whether the spectral aerosol optical properties of the AERONET stations are sufficient for an accurate modelling of the monochromatic DNIS and CSNI under cloud-free conditions in a desert environment. The data from an AERONET station in Abu Dhabi, United Arab Emirates, and the collocated Sun and Aureole Measurement instrument which offers reference measurements of the monochromatic profile of solar radiance were exploited. Using the AERONET data both the radiative transfer models libRadtran and SMARTS offer an accurate estimate of the monochromatic DNIS, with a relative root mean square error (RMSE) of 6 % and a coefficient of determination greater than 0.96. The observed relative bias obtained with libRadtran is +2 %, while that obtained with SMARTS is -1 %. After testing two configurations in SMARTS and three in libRadtran for modelling the monochromatic CSNI, libRadtran exhibits the most accurate results when the AERONET aerosol phase function is presented as a two-term Henyey-Greenstein phase function. In this case libRadtran exhibited a relative RMSE and a bias of respectively 27 and -24 % and a coefficient of determination of 0.882. Therefore, AERONET data may very well be used to model the monochromatic DNIS and the monochromatic CSNI. The results are promising and pave the way towards reporting the contribution of the broadband circumsolar irradiance to standard measurements of the beam irradiance.

  13. Definition of Historical Models of Gene Function and Their Relation to Students' Understanding of Genetics

    ERIC Educational Resources Information Center

    Gericke, Niklas Markus; Hagberg, Mariana

    2007-01-01

    Models are often used when teaching science. In this paper historical models and students' ideas about genetics are compared. The historical development of the scientific idea of the gene and its function is described and categorized into five historical models of gene function. Differences and similarities between these historical models are made…

  14. Case Report: Application of whole exome sequencing for accurate diagnosis of rare syndromes of mineralocorticoid excess

    PubMed Central

    Narayanan, Ranjit; Karuthedath Vellarikkal, Shamsudheen; Jayarajan, Rijith; Verma, Ankit; Dixit, Vishal; Scaria, Vinod; Sivasubbu, Sridhar

    2017-01-01

    Syndromes of mineralocorticoid excess (SME) are closely related clinical manifestations occurring within a specific set of diseases. Overlapping clinical manifestations of such syndromes often create a dilemma in accurate diagnosis, which is crucial for disease surveillance and management especially in rare genetic disorders. Here we demonstrate the use of whole exome sequencing (WES) for accurate diagnosis of rare SME and report that p.R337C variation in the HSD11B2 gene causes progressive apparent mineralocorticoid excess (AME) syndrome in a South Indian family of Mappila origin. PMID:29067160

  15. Electroporation-mediated Delivery of Genes in Rodent Models of Lung Contusion

    PubMed Central

    Machado-Aranda, David; Raghavendran, Krishnan

    2015-01-01

    Several of the biological processes involved in the pathogenesis of acute lung injury and acute respiratory distress syndrome after lung contusion are regulated at a genetic and epigenetic level. Thus, strategies to manipulate gene expression in this context are highly desirable not only to elucidate the mechanisms involved but also to look for potential therapies. In the present chapter, we describe mouse and rat models of inducing blunt thoracic injury followed by electroporation-mediated gene delivery to the lung. Electroporation is a highly efficient and easily reproducible technique that allows circumvention of several of lung gene delivery challenges and safety issues present with other forms of lung gene therapy. PMID:24510825

  16. Selection and Validation of Reference Genes for Accurate RT-qPCR Data Normalization in Coffea spp. under a Climate Changes Context of Interacting Elevated [CO2] and Temperature

    PubMed Central

    Martins, Madlles Q.; Fortunato, Ana S.; Rodrigues, Weverton P.; Partelli, Fábio L.; Campostrini, Eliemar; Lidon, Fernando C.; DaMatta, Fábio M.; Ramalho, José C.; Ribeiro-Barros, Ana I.

    2017-01-01

    /oxygenase (RLS), results from the in silico aggregation and experimental validation of the best number of reference genes showed that two reference genes are adequate to normalize RT-qPCR data. Altogether, this work highlights the importance of an adequate selection of reference genes for each single or combined experimental condition and constitutes the basis to accurately study molecular responses of Coffea spp. in a context of climate changes and global warming. PMID:28326094

  17. Selection and Validation of Reference Genes for Accurate RT-qPCR Data Normalization in Coffea spp. under a Climate Changes Context of Interacting Elevated [CO2] and Temperature.

    PubMed

    Martins, Madlles Q; Fortunato, Ana S; Rodrigues, Weverton P; Partelli, Fábio L; Campostrini, Eliemar; Lidon, Fernando C; DaMatta, Fábio M; Ramalho, José C; Ribeiro-Barros, Ana I

    2017-01-01

    /oxygenase (RLS), results from the in silico aggregation and experimental validation of the best number of reference genes showed that two reference genes are adequate to normalize RT-qPCR data. Altogether, this work highlights the importance of an adequate selection of reference genes for each single or combined experimental condition and constitutes the basis to accurately study molecular responses of Coffea spp. in a context of climate changes and global warming.

  18. Breast cancer prognosis by combinatorial analysis of gene expression data.

    PubMed

    Alexe, Gabriela; Alexe, Sorin; Axelrod, David E; Bonates, Tibérius O; Lozina, Irina I; Reiss, Michael; Hammer, Peter L

    2006-01-01

    The potential of applying data analysis tools to microarray data for diagnosis and prognosis is illustrated on the recent breast cancer dataset of van 't Veer and coworkers. We re-examine that dataset using the novel technique of logical analysis of data (LAD), with the double objective of discovering patterns characteristic for cases with good or poor outcome, using them for accurate and justifiable predictions; and deriving novel information about the role of genes, the existence of special classes of cases, and other factors. Data were analyzed using the combinatorics and optimization-based method of LAD, recently shown to provide highly accurate diagnostic and prognostic systems in cardiology, cancer proteomics, hematology, pulmonology, and other disciplines. LAD identified a subset of 17 of the 25,000 genes, capable of fully distinguishing between patients with poor, respectively good prognoses. An extensive list of 'patterns' or 'combinatorial biomarkers' (that is, combinations of genes and limitations on their expression levels) was generated, and 40 patterns were used to create a prognostic system, shown to have 100% and 92.9% weighted accuracy on the training and test sets, respectively. The prognostic system uses fewer genes than other methods, and has similar or better accuracy than those reported in other studies. Out of the 17 genes identified by LAD, three (respectively, five) were shown to play a significant role in determining poor (respectively, good) prognosis. Two new classes of patients (described by similar sets of covering patterns, gene expression ranges, and clinical features) were discovered. As a by-product of the study, it is shown that the training and the test sets of van 't Veer have differing characteristics. The study shows that LAD provides an accurate and fully explanatory prognostic system for breast cancer using genomic data (that is, a system that, in addition to predicting good or poor prognosis, provides an individualized

  19. Genetic background effects in quantitative genetics: gene-by-system interactions.

    PubMed

    Sardi, Maria; Gasch, Audrey P

    2018-04-11

    Proper cell function depends on networks of proteins that interact physically and functionally to carry out physiological processes. Thus, it seems logical that the impact of sequence variation in one protein could be significantly influenced by genetic variants at other loci in a genome. Nonetheless, the importance of such genetic interactions, known as epistasis, in explaining phenotypic variation remains a matter of debate in genetics. Recent work from our lab revealed that genes implicated from an association study of toxin tolerance in Saccharomyces cerevisiae show extensive interactions with the genetic background: most implicated genes, regardless of allele, are important for toxin tolerance in only one of two tested strains. The prevalence of background effects in our study adds to other reports of widespread genetic-background interactions in model organisms. We suggest that these effects represent many-way interactions with myriad features of the cellular system that vary across classes of individuals. Such gene-by-system interactions may influence diverse traits and require new modeling approaches to accurately represent genotype-phenotype relationships across individuals.

  20. Identification of gene regulation models from single-cell data

    NASA Astrophysics Data System (ADS)

    Weber, Lisa; Raymond, William; Munsky, Brian

    2018-09-01

    In quantitative analyses of biological processes, one may use many different scales of models (e.g. spatial or non-spatial, deterministic or stochastic, time-varying or at steady-state) or many different approaches to match models to experimental data (e.g. model fitting or parameter uncertainty/sloppiness quantification with different experiment designs). These different analyses can lead to surprisingly different results, even when applied to the same data and the same model. We use a simplified gene regulation model to illustrate many of these concerns, especially for ODE analyses of deterministic processes, chemical master equation and finite state projection analyses of heterogeneous processes, and stochastic simulations. For each analysis, we employ MATLAB and PYTHON software to consider a time-dependent input signal (e.g. a kinase nuclear translocation) and several model hypotheses, along with simulated single-cell data. We illustrate different approaches (e.g. deterministic and stochastic) to identify the mechanisms and parameters of the same model from the same simulated data. For each approach, we explore how uncertainty in parameter space varies with respect to the chosen analysis approach or specific experiment design. We conclude with a discussion of how our simulated results relate to the integration of experimental and computational investigations to explore signal-activated gene expression models in yeast (Neuert et al 2013 Science 339 584–7) and human cells (Senecal et al 2014 Cell Rep. 8 75–83).

  1. Defining Aggressive Prostate Cancer Using a 12-Gene Model

    PubMed Central

    Riva, Alberto; Kim, Robert; Varambally, Sooryanarayana; He, Le; Kutok, Jeff; Aster, Jonathan C; Tang, Jeffery; Kuefer, Rainer; Hofer, Matthias D; Febbo, Phillip G; Chinnaiyan, Arul M; Rubin, Mark A

    2006-01-01

    The critical clinical question in prostate cancer research is: How do we develop means of distinguishing aggressive disease from indolent disease? Using a combination of proteomic and expression array data, we identified a set of 36 genes with concordant dysregulation of protein products that could be evaluated in situ by quantitative immunohistochemistry. Another five prostate cancer biomarkers were included. Using linear discriminant analysis, we determined that the optimal model used to predict prostate cancer progression consisted of 12 proteins. Using a separate patient population, transcriptional levels of the 12 genes encoding for these proteins predicted prostate-specific antigen failure in 79 men following surgery for clinically localized prostate cancer (P = .0015). This study demonstrates that cross-platform models can lead to predictive models with the possible advantage of being more robust through this selection process. PMID:16533427

  2. Stability evaluation of reference genes for gene expression analysis by RT-qPCR in soybean under different conditions.

    PubMed

    Wan, Qiao; Chen, Shuilian; Shan, Zhihui; Yang, Zhonglu; Chen, Limiao; Zhang, Chanjuan; Yuan, Songli; Hao, Qinnan; Zhang, Xiaojuan; Qiu, Dezhen; Chen, Haifeng; Zhou, Xinan

    2017-01-01

    Real-time quantitative reverse transcription PCR is a sensitive and widely used technique to quantify gene expression. To achieve a reliable result, appropriate reference genes are required for normalization of transcripts in different samples. In this study, 9 previously published reference genes (60S, Fbox, ELF1A, ELF1B, ACT11, TUA5, UBC4, G6PD, CYP2) of soybean [Glycine max (L.) Merr.] were selected. The expression stability of the 9 genes was evaluated under biotic stress caused by infection with soybean mosaic virus, under nitrogen stress, and across different cultivars and developmental stages. ΔCt and geNorm algorithms were used to evaluate and rank the expression stability of the 9 reference genes. Results obtained from the two algorithms showed high consistency. Moreover, results of pairwise variation showed that two reference genes were sufficient to normalize the expression levels of target genes under each experimental setting. For virus infection, ELF1A and ELF1B were the most stable reference genes for accurate normalization. For different developmental stages, Fbox and G6PD had the highest expression stability between two soybean cultivars (Tanlong No. 1 and Tanlong No. 2). ELF1B and ACT11 were identified as the most stably expressed reference genes both under nitrogen stress and among different cultivars. The results showed that none of the candidate reference genes were uniformly expressed under different conditions, and that selecting appropriate reference genes is pivotal for gene expression studies under a particular condition and in a particular tissue. The most stable combination of genes identified in this study will help to achieve more accurate and reliable results in a wide variety of samples in soybean.
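
    The pairwise stability ranking described in this abstract can be illustrated with a minimal sketch. It is not the authors' code and not the geNorm program itself: it assumes Ct values are already collected per gene and per sample, assumes roughly 100% amplification efficiency so that a Ct difference stands in for a log2 expression ratio, and scores each gene by the mean standard deviation of its ΔCt against every other candidate. The function name `stability_scores` and the toy Ct values are hypothetical.

```python
from statistics import mean, stdev

def stability_scores(ct):
    """Score each candidate reference gene; lower means more stable.

    ct maps a gene name to its Ct values, one per sample, in the same
    sample order for every gene (an assumption of this sketch).
    """
    scores = {}
    for gene, values in ct.items():
        pairwise_sds = []
        for other_gene, other_values in ct.items():
            if other_gene == gene:
                continue
            # Delta-Ct of the pair in each sample; a stable pair keeps
            # this difference nearly constant across samples.
            deltas = [a - b for a, b in zip(values, other_values)]
            pairwise_sds.append(stdev(deltas))
        scores[gene] = mean(pairwise_sds)
    return scores

# Hypothetical Ct values for three candidate genes in three samples.
toy_ct = {
    "geneA": [20, 21, 22],
    "geneB": [25, 26, 27],
    "geneC": [18, 22, 19],
}
scores = stability_scores(toy_ct)
ranking = sorted(scores, key=scores.get)  # most stable first
```

    In this toy input geneA and geneB move in lockstep, so they share the lowest score, while geneC varies independently and ranks last.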

  3. Gene Expression Analysis to Assess the Relevance of Rodent Models to Human Lung Injury.

    PubMed

    Sweeney, Timothy E; Lofgren, Shane; Khatri, Purvesh; Rogers, Angela J

    2017-08-01

    The relevance of animal models to human diseases is an area of intense scientific debate. The degree to which mouse models of lung injury recapitulate human lung injury has never been assessed. Integrating data from both human and animal expression studies allows for increased statistical power and identification of conserved differential gene expression across organisms and conditions. We sought comprehensive integration of gene expression data in experimental acute lung injury (ALI) in rodents compared with humans. We performed two separate gene expression multicohort analyses to determine differential gene expression in experimental animal and human lung injury. We used correlational and pathway analyses combined with external in vitro gene expression data to identify both potential drivers of underlying inflammation and therapeutic drug candidates. We identified 21 animal lung tissue datasets and three human lung injury bronchoalveolar lavage datasets. We show that the metasignatures of animal and human experimental ALI are significantly correlated despite these widely varying experimental conditions. The gene expression changes among mice and rats across diverse injury models (ozone, ventilator-induced lung injury, LPS) are significantly correlated with human models of lung injury (Pearson r = 0.33-0.45, P < 1E-16). Neutrophil signatures are enriched in both animal and human lung injury. Predicted therapeutic targets, peptide ligand signatures, and pathway analyses are also all highly overlapping. Gene expression changes are similar in animal and human experimental ALI, and provide several physiologic and therapeutic insights to the disease.

  4. Use of DAVID algorithms for gene functional classification in a non-model organism, rainbow trout

    USDA-ARS's Scientific Manuscript database

    Gene functional clustering is essential in transcriptome data analysis but software programs are not always suitable for use with non-model species. The DAVID Gene Functional Classification Tool has been widely used for soft clustering in model species, but requires adaptations for use in non-model ...

  5. Accurate Modeling of the Terrestrial Gamma-Ray Background for Homeland Security Applications

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sandness, Gerald A.; Schweppe, John E.; Hensley, Walter K.

    2009-10-24

    The Pacific Northwest National Laboratory has developed computer models to simulate the use of radiation portal monitors to screen vehicles and cargo for the presence of illicit radioactive material. The gamma radiation emitted by the vehicles or cargo containers must often be measured in the presence of a relatively large gamma-ray background mainly due to the presence of potassium, uranium, and thorium (and progeny isotopes) in the soil and surrounding building materials. This large background is often a significant limit to the detection sensitivity for items of interest and must be modeled accurately for analyzing homeland security situations. Calculations of the expected gamma-ray emission from a disk of soil and asphalt were made using the Monte Carlo transport code MCNP and were compared to measurements made at a seaport with a high-purity germanium detector. Analysis revealed that the energy spectrum of the measured background could not be reproduced unless the model included gamma rays coming from the ground out to distances of at least 300 m. The contribution from beyond about 50 m was primarily due to gamma rays that scattered in the air before entering the detectors rather than passing directly from the ground to the detectors. These skyshine gamma rays contribute tens of percent to the total gamma-ray spectrum, primarily at energies below a few hundred keV. The techniques that were developed to efficiently calculate the contributions from a large soil disk and a large air volume in a Monte Carlo simulation are described and the implications of skyshine in portal monitoring applications are discussed.

  6. Differential gene expression detection and sample classification using penalized linear regression models.

    PubMed

    Wu, Baolin

    2006-02-15

    Differential gene expression detection and sample classification using microarray data have received much research interest recently. Owing to the large number of genes p and small number of samples n (p > n), microarray data analysis poses big challenges for statistical analysis. An obvious problem owing to the 'large p small n' is over-fitting. Just by chance, we are likely to find some non-differentially expressed genes that can classify the samples very well. The idea of shrinkage is to regularize the model parameters to reduce the effects of noise and produce reliable inferences. Shrinkage has been successfully applied in microarray data analysis. The SAM statistics proposed by Tusher et al. and the 'nearest shrunken centroid' proposed by Tibshirani et al. are ad hoc shrinkage methods. Both methods are simple, intuitive and prove to be useful in empirical studies. Recently Wu proposed the penalized t/F-statistics with shrinkage by formally using the L1 penalized linear regression models for two-class microarray data, showing good performance. In this paper we systematically discuss the use of penalized regression models for analyzing microarray data. We generalize the two-class penalized t/F-statistics proposed by Wu to multi-class microarray data. We formally derive the ad hoc shrunken centroid used by Tibshirani et al. using the L1 penalized regression models. And we show that the penalized linear regression models provide a rigorous and unified statistical framework for sample classification and differential gene expression detection.
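
    The shrinkage idea in this abstract can be sketched with the soft-thresholding operator, which is the closed-form minimizer of 0.5*(d - b)^2 + lam*|b| over b. The sketch below is only an illustration of L1 shrinkage applied to per-gene mean differences between two classes; it is not the penalized t/F-statistic from the paper, and the names `soft_threshold`, `shrunken_differences` and the penalty value `lam` are hypothetical.

```python
def soft_threshold(x, lam):
    """Shrink x toward zero by lam; values in [-lam, lam] become exactly 0."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0

def shrunken_differences(class_a, class_b, lam):
    """L1-shrunken mean difference (class A minus class B) for each gene.

    class_a and class_b are lists with one entry per gene; each entry is
    the list of expression values for that gene in that class.
    """
    result = []
    for a_values, b_values in zip(class_a, class_b):
        diff = sum(a_values) / len(a_values) - sum(b_values) / len(b_values)
        result.append(soft_threshold(diff, lam))
    return result

# Hypothetical two-gene example: gene 0 does not differ, gene 1 does.
class_a = [[1, 2, 3], [5, 5, 5]]
class_b = [[1, 2, 3], [1, 1, 1]]
shrunk = shrunken_differences(class_a, class_b, lam=1.0)
```

    Genes whose raw difference is smaller than the penalty are set exactly to zero, which is how an L1 penalty performs gene selection and limits the over-fitting that 'large p small n' data invite.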

  7. Predictive models for mutations in mismatch repair genes: implication for genetic counseling in developing countries.

    PubMed

    Monteiro Santos, Erika Maria; Valentin, Mev Dominguez; Carneiro, Felipe; de Oliveira, Ligia Petrolini; de Oliveira Ferreira, Fabio; Junior, Samuel Aguiar; Nakagawa, Wilson Toshihiko; Gomy, Israel; de Faria Ferraz, Victor Evangelista; da Silva Junior, Wilson Araujo; Carraro, Dirce Maria; Rossi, Benedito Mauro

    2012-02-09

    Lynch syndrome (LS) is the most common form of inherited predisposition to colorectal cancer (CRC), accounting for 2-5% of all CRC. LS is an autosomal dominant disease characterized by mutations in the mismatch repair genes mutL homolog 1 (MLH1), mutS homolog 2 (MSH2), postmeiotic segregation increased 1 (PMS1), post-meiotic segregation increased 2 (PMS2) and mutS homolog 6 (MSH6). Mutation risk prediction models can be incorporated into clinical practice, facilitating the decision-making process and identifying individuals for molecular investigation. This is extremely important in countries with limited economic resources. This study aims to evaluate sensitivity and specificity of five predictive models for germline mutations in repair genes in a sample of individuals with suspected Lynch syndrome. Blood samples from 88 patients were analyzed through sequencing MLH1, MSH2 and MSH6 genes. The probability of detecting a mutation was calculated using the PREMM, Barnetson, MMRpro, Wijnen and Myriad models. To evaluate the sensitivity and specificity of the models, receiver operating characteristic curves were constructed. Of the 88 patients included in this analysis, 31 mutations were identified: 16 were found in the MSH2 gene, 15 in the MLH1 gene and no pathogenic mutations were identified in the MSH6 gene. It was observed that the AUC for the PREMM (0.846), Barnetson (0.850), MMRpro (0.821) and Wijnen (0.807) models did not present significant statistical difference. The Myriad model presented lower AUC (0.704) than the four other models evaluated. Considering thresholds of ≥ 5%, the models' sensitivity varied between 1 (Myriad) and 0.87 (Wijnen) and specificity ranged from 0 (Myriad) to 0.38 (Barnetson). The Barnetson, PREMM, MMRpro and Wijnen models present similar AUC. The AUC of the Myriad model is statistically inferior to the four other models.
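
    The evaluation measures named in this abstract (sensitivity and specificity at a probability threshold, and the area under the ROC curve) can be sketched as follows. This is a generic illustration, not the study's analysis code; the scores and labels are made-up and the function names are hypothetical. It assumes both classes are present in the data.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Call score >= threshold positive; labels are 1 (mutation found) or 0."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if s >= threshold and y == 1)
    fn = sum(1 for s, y in pairs if s < threshold and y == 1)
    tn = sum(1 for s, y in pairs if s < threshold and y == 0)
    fp = sum(1 for s, y in pairs if s >= threshold and y == 0)
    return tp / (tp + fn), tn / (tn + fp)


def auc(scores, labels):
    """AUC as the chance that a random positive outscores a random negative.

    Ties count one half; this equals the area under the ROC curve.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up predicted mutation probabilities and sequencing outcomes.
scores = [0.90, 0.40, 0.08, 0.06, 0.03, 0.02]
labels = [1, 1, 0, 1, 0, 0]

sens, spec = sensitivity_specificity(scores, labels, threshold=0.05)
area = auc(scores, labels)
```

    A low threshold such as 5% tends to push sensitivity up and specificity down, which matches the pattern the abstract reports; the AUC summarizes performance across all thresholds.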

  8. Stochastic models for inferring genetic regulation from microarray gene expression data.

    PubMed

    Tian, Tianhai

    2010-03-01

    Microarray expression profiles are inherently noisy and many different sources of variation exist in microarray experiments. It is still a significant challenge to develop stochastic models to realize noise in microarray expression profiles, which has profound influence on the reverse engineering of genetic regulation. Using the target genes of the tumour suppressor gene p53 as the test problem, we developed stochastic differential equation models and established the relationship between the noise strength of stochastic models and parameters of an error model for describing the distribution of the microarray measurements. Numerical results indicate that the simulated variance from stochastic models with a stochastic degradation process can be represented by a monomial in terms of the hybridization intensity and the order of the monomial depends on the type of stochastic process. The developed stochastic models with multiple stochastic processes generated simulations whose variance is consistent with the prediction of the error model. This work also established a general method to develop stochastic models from experimental information. 2009 Elsevier Ireland Ltd. All rights reserved.
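
    A stochastic differential equation with a noisy degradation term can be simulated with the Euler-Maruyama scheme. The sketch below assumes the simple form dx = (k - d*x) dt + sigma*x dW, which is an illustrative choice and not the model from the paper; the parameter values, the function name, and the non-negativity clamp are all assumptions.

```python
import math
import random


def simulate_expression(k, d, sigma, x0, dt, n_steps, rng):
    """Euler-Maruyama path of dx = (k - d*x) dt + sigma*x dW."""
    x = x0
    path = [x]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # Wiener increment, N(0, dt)
        x = x + (k - d * x) * dt + sigma * x * dw
        x = max(x, 0.0)  # assumption: clamp, since expression cannot be negative
        path.append(x)
    return path


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Final expression level of 200 independent simulated paths.
finals = [
    simulate_expression(k=10.0, d=1.0, sigma=0.2, x0=0.0,
                        dt=0.01, n_steps=1000, rng=rng)[-1]
    for _ in range(200)
]
mean = sum(finals) / len(finals)
var = sum((x - mean) ** 2 for x in finals) / (len(finals) - 1)
```

    Because the noise term is proportional to x, the simulated variance grows with the expression level, which is the kind of intensity-dependent variance the abstract relates to an error model.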

  9. Biased Gene Fractionation and Dominant Gene Expression among the Subgenomes of Brassica rapa

    PubMed Central

    Cheng, Feng; Wu, Jian; Fang, Lu; Sun, Silong; Liu, Bo; Lin, Ke; Bonnema, Guusje; Wang, Xiaowu

    2012-01-01

    Polyploidization, both ancient and recent, is frequent among plants. A "two-step theory" was proposed to explain the meso-triplication of the Brassica "A" genome: Brassica rapa. By accurately partitioning this genome, we observed that genes in the less fractioned subgenome (LF) were dominantly expressed over the genes in more fractioned subgenomes (MFs: MF1 and MF2), while the genes in MF1 were slightly dominantly expressed over the genes in MF2. The results indicated that the dominantly expressed genes tended to be resistant against gene fractionation. By re-sequencing two B. rapa accessions: a vegetable turnip (VT117) and a Rapid Cycling line (L144), we found that genes in LF had fewer non-synonymous or frameshift mutations than genes in MFs; however mutation rates were not significantly different between MF1 and MF2. The differences in gene expression patterns and on-going gene death among the three subgenomes suggest that "two-step" genome triplication and differential subgenome methylation played important roles in the genome evolution of B. rapa. PMID:22567157

  10. Biased gene fractionation and dominant gene expression among the subgenomes of Brassica rapa.

    PubMed

    Cheng, Feng; Wu, Jian; Fang, Lu; Sun, Silong; Liu, Bo; Lin, Ke; Bonnema, Guusje; Wang, Xiaowu

    2012-01-01

    Polyploidization, both ancient and recent, is frequent among plants. A "two-step theory" was proposed to explain the meso-triplication of the Brassica "A" genome: Brassica rapa. By accurately partitioning this genome, we observed that genes in the less fractioned subgenome (LF) were dominantly expressed over the genes in more fractioned subgenomes (MFs: MF1 and MF2), while the genes in MF1 were slightly dominantly expressed over the genes in MF2. The results indicated that the dominantly expressed genes tended to be resistant against gene fractionation. By re-sequencing two B. rapa accessions: a vegetable turnip (VT117) and a Rapid Cycling line (L144), we found that genes in LF had fewer non-synonymous or frameshift mutations than genes in MFs; however mutation rates were not significantly different between MF1 and MF2. The differences in gene expression patterns and on-going gene death among the three subgenomes suggest that "two-step" genome triplication and differential subgenome methylation played important roles in the genome evolution of B. rapa.

  11. Herbicide targets and detoxification proteins in sugarcane: from gene assembly to structure modelling.

    PubMed

    Lloyd Evans, Dyfed; Joshi, Shailesh Vinay

    2017-07-01

    In a genome context, sugarcane is a classic orphan crop, in that no genome and only very few genes have been assembled. We have devised a novel exome assembly methodology that has allowed us to assemble and characterize 49 genes that serve as herbicide targets, safener interacting proteins, and members of herbicide detoxification pathways within the sugarcane genome. We have structurally modelled the products of each of these genes, as well as determining allelic, genomic, and RNA-Seq based polymorphisms for each gene. This study provides the largest collection of sugarcane structures modelled to date. We demonstrate that sugarcane genes are highly polymorphic, revealing that each genotype is evolving both uniquely and independently. In addition, we present an exome assembly system for orphan crops that can be executed on commodity infrastructure, making exome assembly practical for any group. In terms of knowledge about herbicide modes of action and detoxification, we have advanced sugarcane from a crop where no information about any herbicide-associated gene was available to the situation where sugarcane is now a species with the single largest collection of known and annotated herbicide-associated genes.

  12. Generating Accurate 3d Models of Architectural Heritage Structures Using Low-Cost Camera and Open Source Algorithms

    NASA Astrophysics Data System (ADS)

    Zacharek, M.; Delis, P.; Kedzierski, M.; Fryskowska, A.

    2017-05-01

    These studies have been conducted using a non-metric digital camera and dense image matching algorithms as non-contact methods of creating monument documentation. In order to process the imagery, a few open-source software packages and algorithms for generating a dense point cloud from images have been executed. In the research, the OSM Bundler, VisualSFM software, and web application ARC3D were used. Images obtained for each of the investigated objects were processed using those applications, and then dense point clouds and textured 3D models were created. As a result of post-processing, obtained models were filtered and scaled. The research showed that even using the open-source software it is possible to obtain accurate 3D models of structures (with an accuracy of a few centimeters), but for the purpose of documentation and conservation of cultural and historical heritage, such accuracy can be insufficient.

  13. Gene Circuit Analysis of the Terminal Gap Gene huckebein

    PubMed Central

    Ashyraliyev, Maksat; Siggens, Ken; Janssens, Hilde; Blom, Joke; Akam, Michael; Jaeger, Johannes

    2009-01-01

    The early embryo of Drosophila melanogaster provides a powerful model system to study the role of genes in pattern formation. The gap gene network constitutes the first zygotic regulatory tier in the hierarchy of the segmentation genes involved in specifying the position of body segments. Here, we use an integrative, systems-level approach to investigate the regulatory effect of the terminal gap gene huckebein (hkb) on gap gene expression. We present quantitative expression data for the Hkb protein, which enable us to include hkb in gap gene circuit models. Gap gene circuits are mathematical models of gene networks used as computational tools to extract regulatory information from spatial expression data. This is achieved by fitting the model to gap gene expression patterns, in order to obtain estimates for regulatory parameters which predict a specific network topology. We show how considering variability in the data combined with analysis of parameter determinability significantly improves the biological relevance and consistency of the approach. Our models are in agreement with earlier results, which they extend in two important respects: First, we show that Hkb is involved in the regulation of the posterior hunchback (hb) domain, but does not have any other essential function. Specifically, Hkb is required for the anterior shift in the posterior border of this domain, which is now reproduced correctly in our models. Second, gap gene circuits presented here are able to reproduce mutants of terminal gap genes, while previously published models were unable to reproduce any null mutants correctly. As a consequence, our models now capture the expression dynamics of all posterior gap genes and some variational properties of the system correctly. This is an important step towards a better, quantitative understanding of the developmental and evolutionary dynamics of the gap gene network. PMID:19876378

  14. Gene circuit analysis of the terminal gap gene huckebein.

    PubMed

    Ashyraliyev, Maksat; Siggens, Ken; Janssens, Hilde; Blom, Joke; Akam, Michael; Jaeger, Johannes

    2009-10-01

    The early embryo of Drosophila melanogaster provides a powerful model system to study the role of genes in pattern formation. The gap gene network constitutes the first zygotic regulatory tier in the hierarchy of the segmentation genes involved in specifying the position of body segments. Here, we use an integrative, systems-level approach to investigate the regulatory effect of the terminal gap gene huckebein (hkb) on gap gene expression. We present quantitative expression data for the Hkb protein, which enable us to include hkb in gap gene circuit models. Gap gene circuits are mathematical models of gene networks used as computational tools to extract regulatory information from spatial expression data. This is achieved by fitting the model to gap gene expression patterns, in order to obtain estimates for regulatory parameters which predict a specific network topology. We show how considering variability in the data combined with analysis of parameter determinability significantly improves the biological relevance and consistency of the approach. Our models are in agreement with earlier results, which they extend in two important respects: First, we show that Hkb is involved in the regulation of the posterior hunchback (hb) domain, but does not have any other essential function. Specifically, Hkb is required for the anterior shift in the posterior border of this domain, which is now reproduced correctly in our models. Second, gap gene circuits presented here are able to reproduce mutants of terminal gap genes, while previously published models were unable to reproduce any null mutants correctly. As a consequence, our models now capture the expression dynamics of all posterior gap genes and some variational properties of the system correctly. This is an important step towards a better, quantitative understanding of the developmental and evolutionary dynamics of the gap gene network.

  15. [Analysis of genetic models and gene effects on main agronomy characters in rapeseed].

    PubMed

    Li, J; Qiu, J; Tang, Z; Shen, L

    1992-01-01

    According to four different genetic models, the genetic patterns of 8 agronomy traits were analysed by using the data of 24 generations which included positive and negative cross of 81008 x Tower, both of the varieties are of good quality. The results showed that none of 8 characters could fit in with additive-dominance models. Epistasis was found in all of these characters, and it has significant effect on generation means. Seed weight/plant and some other main yield characters are controlled by duplicate interaction genes. The interaction between triple genes or multiple genes needs to be utilized in yield heterosis.

  16. Accurate upwind methods for the Euler equations

    NASA Technical Reports Server (NTRS)

    Huynh, Hung T.

    1993-01-01

    A new class of piecewise linear methods for the numerical solution of the one-dimensional Euler equations of gas dynamics is presented. These methods are uniformly second-order accurate, and can be considered as extensions of Godunov's scheme. With an appropriate definition of monotonicity preservation for the case of linear convection, it can be shown that they preserve monotonicity. Similar to Van Leer's MUSCL scheme, they consist of two key steps: a reconstruction step followed by an upwind step. For the reconstruction step, a monotonicity constraint that preserves uniform second-order accuracy is introduced. Computational efficiency is enhanced by devising a criterion that detects the 'smooth' part of the data where the constraint is redundant. The concept and coding of the constraint are simplified by the use of the median function. A slope steepening technique, which has no effect at smooth regions and can resolve a contact discontinuity in four cells, is described. As for the upwind step, existing and new methods are applied in a manner slightly different from those in the literature. These methods are derived by approximating the Euler equations via linearization and diagonalization. At a 'smooth' interface, Harten, Lax, and Van Leer's one intermediate state model is employed. A modification for this model that can resolve contact discontinuities is presented. Near a discontinuity, either this modified model or a more accurate one, namely, Roe's flux-difference splitting, is used. The current presentation of Roe's method, via the conceptually simple flux-vector splitting, not only establishes a connection between the two splittings, but also leads to an admissibility correction with no conditional statement, and an efficient approximation to Osher's approximate Riemann solver. These reconstruction and upwind steps result in schemes that are uniformly second-order accurate and economical at smooth regions, and yield high resolution at discontinuities.
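
    The remark that the median function simplifies the coding of a monotonicity constraint can be illustrated with the standard minmod limiter, which equals the median of its two arguments and zero. The sketch below shows a generic MUSCL-type piecewise-linear reconstruction with minmod-limited slopes; it is not the specific constraint introduced in the report, and the function names and cell data are hypothetical.

```python
def median(a, b, c):
    """Middle value of three numbers."""
    return sorted((a, b, c))[1]


def minmod(a, b):
    """Smaller-magnitude argument if a and b share a sign, otherwise 0."""
    return median(a, b, 0.0)


def limited_slopes(u):
    """Minmod-limited slope for each interior cell of the cell averages u."""
    return [minmod(u[i] - u[i - 1], u[i + 1] - u[i])
            for i in range(1, len(u) - 1)]


def interface_values(u):
    """Left and right edge values (u_i - s_i/2, u_i + s_i/2) of interior cells."""
    slopes = limited_slopes(u)
    return [(u[i + 1] - 0.5 * s, u[i + 1] + 0.5 * s)
            for i, s in enumerate(slopes)]


# Cell averages with a jump between the third and fourth cells (made-up data).
u = [0.0, 1.0, 2.0, 5.0, 5.0]
slopes = limited_slopes(u)
edges = interface_values(u)
```

    Writing the limiter as a median removes the explicit sign tests: the slope falls to zero at an extremum and is capped next to the jump, so the reconstructed edge values never overshoot the neighbouring cell averages.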

  17. Developing Pedagogical Tools to Improve Teaching Multiple Models of the Gene in High School

    ERIC Educational Resources Information Center

    Auckaraaree, Nantaya

    2013-01-01

    Multiple models of the gene are used to explore genetic phenomena in scientific practices and in the classroom. In genetics curricula, the classical and molecular models are presented in disconnected domains. Research demonstrates that, without explicit connections, students have difficulty developing an understanding of the gene that spans…

  18. Accurate diode behavioral model with reverse recovery

    NASA Astrophysics Data System (ADS)

    Banáš, Stanislav; Divín, Jan; Dobeš, Josef; Paňko, Václav

    2018-01-01

    This paper deals with a comprehensive behavioral model of the p-n junction diode that includes the reverse recovery effect and is applicable to all standard SPICE simulators supporting the Verilog-A language. The model has been successfully used in several production designs, which require its full complexity, robustness and a set of tuning parameters comparable with the standard compact SPICE diode model. Like the standard compact model, the model is scalable with area and temperature and can be used as a stand-alone diode or as part of a more complex device macro-model, e.g. LDMOS, JFET, bipolar transistor. The paper briefly presents the state of the art, followed by a chapter describing the model development and the achieved solutions. During precise model verification some of them were found non-robust or poorly converging and were replaced by more robust solutions, demonstrated in the paper. Measurement results for different technologies and devices, compared with simulations using the new behavioral model, are presented as the model validation. The comparison of model validation in time and frequency domains demonstrates that the implemented reverse recovery effect with correctly extracted parameters improves the model simulation results not only in switching from ON to OFF state, which is often published, but also its impedance/admittance frequency dependency in the GHz range. Finally, the model parameter extraction and the comparison with SPICE compact models containing the reverse recovery effect are presented.

  19. Integrating mitosis, toxicity, and transgene expression in a telecommunications packet-switched network model of lipoplex-mediated gene delivery.

    PubMed

    Martin, Timothy M; Wysocki, Beata J; Beyersdorf, Jared P; Wysocki, Tadeusz A; Pannier, Angela K

    2014-08-01

    Gene delivery systems transport exogenous genetic information to cells or biological systems with the potential to directly alter endogenous gene expression and behavior with applications in functional genomics, tissue engineering, medical devices, and gene therapy. Nonviral systems offer advantages over viral systems because of their low immunogenicity, inexpensive synthesis, and easy modification but suffer from lower transfection levels. The representation of gene transfer using models offers perspective and interpretation of complex cellular mechanisms, including nonviral gene delivery where exact mechanisms are unknown. Here, we introduce a novel telecommunications model of the nonviral gene delivery process in which the delivery of the gene to a cell is synonymous with delivery of a packet of information to a destination computer within a packet-switched computer network. Such a model uses nodes and layers to simplify the complexity of modeling the transfection process and to overcome several challenges of existing models. These challenges include a limited scope and limited time frame, which often does not incorporate biological effects known to affect transfection. The telecommunication model was constructed in MATLAB to model lipoplex delivery of the gene encoding the green fluorescent protein to HeLa cells. Mitosis and toxicity events were included in the model resulting in simulation outputs of nuclear internalization and transfection efficiency that correlated with experimental data. A priori predictions based on model sensitivity analysis suggest that increasing endosomal escape and decreasing lysosomal degradation, protein degradation, and GFP-induced toxicity can improve transfection efficiency by three-fold. Application of the telecommunications model to nonviral gene delivery offers insight into the development of new gene delivery systems with therapeutically relevant transfection levels.

  20. An animal model for Norrie disease (ND): gene targeting of the mouse ND gene.

    PubMed

    Berger, W; van de Pol, D; Bächner, D; Oerlemans, F; Winkens, H; Hameister, H; Wieringa, B; Hendriks, W; Ropers, H H

    1996-01-01

    In order to elucidate the cellular and molecular processes which are involved in Norrie disease (ND), we have used gene targeting technology to generate ND mutant mice. The murine homologue of the ND gene was cloned and shown to encode a polypeptide that shares 94% of the amino acid sequence with its human counterpart. RNA in situ hybridization revealed expression in retina, brain and the olfactory bulb and epithelium of 2 week old mice. Hemizygous mice carrying a replacement mutation in exon 2 of the ND gene developed retrolental structures in the vitreous body and showed an overall disorganization of the retinal ganglion cell layer. The outer plexiform layer disappears occasionally, resulting in a juxtaposed inner and outer nuclear layer. At the same regions, the outer segments of the photoreceptor cell layer are no longer present. These ocular findings are consistent with observations in ND patients and the generated mouse line provides a faithful model for study of early pathogenic events in this severe X-linked recessive neurological disorder.

  1. Gene Therapy for Fracture Repair

    DTIC Science & Technology

    2005-12-01

    therapeutic benefits. We have identified a murine leukemia virus (MLV) vector that provides robust transgene expression in fracture tissues, and applied it to...During the second year of funding, we used the surgical technique to apply the murine leukemia virus (MLV)-based vector to the fracture tissues and...trochanter. ii ) Fracture Injection The therapeutic gene chosen was the BMP-2/4 hybrid gene. To most accurately establish the expression of the

  2. An Accurate Fire-Spread Algorithm in the Weather Research and Forecasting Model Using the Level-Set Method

    NASA Astrophysics Data System (ADS)

    Muñoz-Esparza, Domingo; Kosović, Branko; Jiménez, Pedro A.; Coen, Janice L.

    2018-04-01

    The level-set method is typically used to track and propagate the fire perimeter in wildland fire models. Herein, a high-order level-set method using fifth-order WENO scheme for the discretization of spatial derivatives and third-order explicit Runge-Kutta temporal integration is implemented within the Weather Research and Forecasting model wildland fire physics package, WRF-Fire. The algorithm includes solution of an additional partial differential equation for level-set reinitialization. The accuracy of the fire-front shape and rate of spread in uncoupled simulations is systematically analyzed. It is demonstrated that the common implementation used by level-set-based wildfire models yields rate-of-spread errors in the range 10-35% for typical grid sizes (Δ = 12.5-100 m) and considerably underestimates fire area. Moreover, the amplitude of fire-front gradients in the presence of explicitly resolved turbulence features is systematically underestimated. In contrast, the new WRF-Fire algorithm results in rate-of-spread errors that are lower than 1% and that become nearly grid independent. Also, the underestimation of fire area at the sharp transition between the fire front and the lateral flanks is found to be reduced by a factor of ≈7. A hybrid-order level-set method with locally reduced artificial viscosity is proposed, which substantially alleviates the computational cost associated with high-order discretizations while preserving accuracy. Simulations of the Last Chance wildfire demonstrate additional benefits of high-order accurate level-set algorithms when dealing with complex fuel heterogeneities, enabling propagation across narrow fuel gaps and more accurate fire backing over the lee side of no fuel clusters.
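
    The structure of a level-set front tracker with third-order Runge-Kutta time stepping can be sketched in one dimension. To stay short, the sketch swaps in first-order Godunov upwinding for the fifth-order WENO scheme named in the abstract, and it assumes the Shu-Osher strong-stability-preserving form of the third-order Runge-Kutta method, which the abstract does not specify. It omits reinitialization, and the grid, spread rate, and function names are hypothetical; this is not WRF-Fire code.

```python
def rhs(phi, dx, speed):
    """-speed * |dphi/dx| with first-order Godunov upwinding (speed >= 0).

    End cells reuse the neighbouring one-sided difference.
    """
    n = len(phi)
    out = []
    for i in range(n):
        d_minus = (phi[i] - phi[i - 1]) / dx if i > 0 else (phi[1] - phi[0]) / dx
        d_plus = (phi[i + 1] - phi[i]) / dx if i < n - 1 else (phi[-1] - phi[-2]) / dx
        grad = max(max(d_minus, 0.0), -min(d_plus, 0.0))
        out.append(-speed * grad)
    return out


def ssp_rk3_step(phi, dt, dx, speed):
    """One third-order SSP Runge-Kutta step built from three Euler stages."""
    def euler(state):
        return [p + dt * r for p, r in zip(state, rhs(state, dx, speed))]

    s1 = euler(phi)
    s2 = [0.75 * p + 0.25 * q for p, q in zip(phi, euler(s1))]
    return [p / 3.0 + 2.0 * q / 3.0 for p, q in zip(phi, euler(s2))]


def front_position(phi, dx):
    """Location of the zero crossing (burnt side phi <= 0), linearly interpolated."""
    for i in range(len(phi) - 1):
        if phi[i] <= 0.0 < phi[i + 1]:
            return dx * (i + phi[i] / (phi[i] - phi[i + 1]))
    return None


dx, dt, speed = 1.0, 0.5, 0.5               # hypothetical grid, step, spread rate
phi = [i * dx - 20.0 for i in range(101)]   # signed distance, front at x = 20
for _ in range(30):
    phi = ssp_rk3_step(phi, dt, dx, speed)
front = front_position(phi, dx)
```

    With a signed-distance initial condition the front advances by speed * time, so after 30 steps of 0.5 it should sit near x = 27.5. Curved or kinked fronts are where a low-order spatial scheme loses accuracy, which is the problem the abstract addresses.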

  3. Bayesian Variable Selection for Hierarchical Gene-Environment and Gene-Gene Interactions

    PubMed Central

    Liu, Changlu; Ma, Jianzhong; Amos, Christopher I.

    2014-01-01

    We propose a Bayesian hierarchical mixture model framework that allows us to investigate the genetic and environmental effects, gene by gene interactions and gene by environment interactions in the same model. Our approach incorporates the natural hierarchical structure between the main effects and interaction effects into a mixture model, such that our methods tend to remove the irrelevant interaction effects more effectively, resulting in more robust and parsimonious models. We consider both strong and weak hierarchical models. For a strong hierarchical model, both of the main effects between interacting factors must be present for the interactions to be considered in the model development, while for a weak hierarchical model, only one of the two main effects is required to be present for the interaction to be evaluated. Our simulation results show that the proposed strong and weak hierarchical mixture models work well in controlling false positive rates and provide a powerful approach for identifying the predisposing effects and interactions in gene-environment interaction studies, in comparison with the naive model that does not impose this hierarchical constraint in most of the scenarios simulated. We illustrated our approach using data for lung cancer and cutaneous melanoma. PMID:25154630
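
    The difference between the strong and weak hierarchy described above can be written as a simple rule for when an interaction term is eligible. The sketch below shows only that deterministic rule; the paper embeds the constraint in a Bayesian mixture prior, which is not reproduced here. The factor names and function names are hypothetical.

```python
def interaction_allowed(main_a_in, main_b_in, hierarchy):
    """Decide whether an A-by-B interaction may be considered.

    "strong" requires both main effects; "weak" requires at least one.
    """
    if hierarchy == "strong":
        return main_a_in and main_b_in
    if hierarchy == "weak":
        return main_a_in or main_b_in
    raise ValueError("hierarchy must be 'strong' or 'weak'")


def candidate_interactions(included_main, all_factors, hierarchy):
    """List factor pairs whose interaction satisfies the hierarchy rule."""
    pairs = []
    for i, a in enumerate(all_factors):
        for b in all_factors[i + 1:]:
            if interaction_allowed(a in included_main, b in included_main,
                                   hierarchy):
                pairs.append((a, b))
    return pairs


factors = ["G1", "G2", "E"]   # hypothetical genes and one exposure
included = {"G1", "E"}        # main effects currently in the model

strong = candidate_interactions(included, factors, "strong")
weak = candidate_interactions(included, factors, "weak")
```

    Under the strong rule only the G1-by-E interaction is eligible, while the weak rule admits all three pairs because each involves at least one included main effect.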

  4. Accurate quantification of fluorescent targets within turbid media based on a decoupled fluorescence Monte Carlo model.

    PubMed

    Deng, Yong; Luo, Zhaoyang; Jiang, Xu; Xie, Wenhao; Luo, Qingming

    2015-07-01

    We propose a method based on a decoupled fluorescence Monte Carlo model for constructing fluorescence Jacobians to enable accurate quantification of fluorescence targets within turbid media. The effectiveness of the proposed method is validated using two cylindrical phantoms enclosing fluorescent targets within homogeneous and heterogeneous background media. The results demonstrate that our method can recover relative concentrations of the fluorescent targets with higher accuracy than the perturbation fluorescence Monte Carlo method. This suggests that our method is suitable for quantitative fluorescence diffuse optical tomography, especially for in vivo imaging of fluorophore targets for diagnosis of different diseases and abnormalities.

  5. Sensitivity analysis of gene ranking methods in phenotype prediction.

    PubMed

    deAndrés-Galiana, Enrique J; Fernández-Martínez, Juan L; Sonis, Stephen T

    2016-12-01

    It has become clear that noise generated during the assay and analytical processes has the ability to disrupt accurate interpretation of genomic studies. Not only does such noise impact the scientific validity and costs of studies, but when assessed in the context of clinically translatable indications such as phenotype prediction, it can lead to inaccurate conclusions that could ultimately impact patients. We applied a sequence of ranking methods to damp noise associated with microarray outputs, and then tested the utility of the approach in three disease indications using publicly available datasets. This study was performed in three phases. We first theoretically analyzed the effect of noise in phenotype prediction problems showing that it can be expressed as a modeling error that partially falsifies the pathways. Secondly, via synthetic modeling, we performed the sensitivity analysis for the main gene ranking methods to different types of noise. Finally, we studied the predictive accuracy of the gene lists provided by these ranking methods in synthetic data and in three different datasets related to cancer, rare and neurodegenerative diseases to better understand the translational aspects of our findings. In the case of synthetic modeling, we showed that Fisher's Ratio (FR) was the most robust gene ranking method in terms of precision for all the types of noise at different levels. Significance Analysis of Microarrays (SAM) provided slightly lower performance and the rest of the methods (fold change, entropy and maximum percentile distance) were much less precise and accurate. The predictive accuracy of the smallest set of high discriminatory probes was similar for all the methods in the case of Gaussian and Log-Gaussian noise. In the case of class assignment noise, the predictive accuracy of SAM and FR is higher. Finally, for real datasets (Chronic Lymphocytic Leukemia, Inclusion Body Myositis and Amyotrophic Lateral Sclerosis) we found that FR and SAM
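
    Ranking genes by Fisher's ratio can be sketched as follows. The sketch assumes the common two-class definition, the squared difference of class means divided by the sum of class variances, which may differ in detail from the definition used in the paper. The gene names and expression values are made-up, and the code assumes at least two samples per class and non-zero total variance.

```python
from statistics import mean, variance


def fisher_ratio(class1, class2):
    """(mean1 - mean2)^2 / (var1 + var2) for one gene across two classes."""
    m1, m2 = mean(class1), mean(class2)
    v1, v2 = variance(class1), variance(class2)  # sample variances
    return (m1 - m2) ** 2 / (v1 + v2)


def rank_genes(expr1, expr2):
    """Order genes from most to least discriminatory.

    expr1 and expr2 map each gene name to its expression values per class.
    """
    scores = {g: fisher_ratio(expr1[g], expr2[g]) for g in expr1}
    return sorted(scores, key=scores.get, reverse=True), scores


# Made-up expression values for three genes in two phenotype classes.
expr1 = {"geneA": [1.0, 2.0, 3.0], "geneB": [1.0, 5.0, 9.0], "geneC": [4.0, 5.0, 6.0]}
expr2 = {"geneA": [7.0, 8.0, 9.0], "geneB": [2.0, 6.0, 10.0], "geneC": [5.0, 6.0, 7.0]}

ranking, scores = rank_genes(expr1, expr2)
```

    A gene scores highly only when the class means are well separated relative to the within-class spread, so a noisy gene with a similar mean shift (geneB here) ranks below a quiet one (geneC).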

  6. Ensemble MD simulations restrained via crystallographic data: Accurate structure leads to accurate dynamics

    PubMed Central

    Xue, Yi; Skrynnikov, Nikolai R

    2014-01-01

    Currently, the best existing molecular dynamics (MD) force fields cannot accurately reproduce the global free-energy minimum which realizes the experimental protein structure. As a result, long MD trajectories tend to drift away from the starting coordinates (e.g., crystallographic structures). To address this problem, we have devised a new simulation strategy aimed at protein crystals. An MD simulation of protein crystal is essentially an ensemble simulation involving multiple protein molecules in a crystal unit cell (or a block of unit cells). To ensure that average protein coordinates remain correct during the simulation, we introduced crystallography-based restraints into the MD protocol. Because these restraints are aimed at the ensemble-average structure, they have only minimal impact on conformational dynamics of the individual protein molecules. So long as the average structure remains reasonable, the proteins move in a native-like fashion as dictated by the original force field. To validate this approach, we have used the data from solid-state NMR spectroscopy, which is the orthogonal experimental technique uniquely sensitive to protein local dynamics. The new method has been tested on the well-established model protein, ubiquitin. The ensemble-restrained MD simulations produced lower crystallographic R factors than conventional simulations; they also led to more accurate predictions for crystallographic temperature factors, solid-state chemical shifts, and backbone order parameters. The predictions for 15N R1 relaxation rates are at least as accurate as those obtained from conventional simulations. Taken together, these results suggest that the presented trajectories may be among the most realistic protein MD simulations ever reported. In this context, the ensemble restraints based on high-resolution crystallographic data can be viewed as protein-specific empirical corrections to the standard force fields. PMID:24452989

  7. HEMATOPOIETIC STEM CELL GENE THERAPY: ASSESSING THE RELEVANCE OF PRE-CLINICAL MODELS

    PubMed Central

    Larochelle, Andre; Dunbar, Cynthia E.

    2013-01-01

    The modern laboratory mouse has become a central tool for biomedical research with a notable influence in the field of hematopoiesis. Application of retroviral-based gene transfer approaches to mouse hematopoietic stem cells (HSCs) has led to a sophisticated understanding of the hematopoietic hierarchy in this model. However, the assumption that gene transfer methodologies developed in the mouse could be similarly applied to human HSCs for the treatment of human diseases left the field of gene therapy in a decade-long quandary. It was not until more relevant humanized xenograft mouse models and phylogenetically related large animal species were used to optimize gene transfer methodologies that unequivocal clinical successes were achieved. However, the subsequent reporting of severe adverse events in these clinical trials cast doubt on the predictive value of conventional pre-clinical testing, and encouraged the development of new assays for assessing the relative genotoxicity of various vector designs. PMID:24014892

  8. Neural model of gene regulatory network: a survey on supportive meta-heuristics.

    PubMed

    Biswas, Surama; Acharyya, Sriyankar

    2016-06-01

    A gene regulatory network (GRN) is produced as a result of regulatory interactions between different genes through their coded proteins in a cellular context. Having immense importance in disease detection and drug finding, GRNs have been modelled through various mathematical and computational schemes and reported in survey articles. Neural and neuro-fuzzy models have been a focus of attention in bioinformatics. The predominant use of meta-heuristic algorithms in training neural models has proved its excellence. Considering these facts, this paper is organized to survey neural modelling schemes of GRN and the efficacy of meta-heuristic algorithms towards parameter learning (i.e. weighting connections) within the model. This survey paper presents two structure-related approaches to inferring GRNs: the global structure approach and the substructure approach. It also describes two neural modelling schemes, namely artificial neural network/recurrent neural network based modelling and neuro-fuzzy modelling. The meta-heuristic algorithms applied so far to learn the structure and parameters of neurally modelled GRN have been reviewed here.

  9. Identification of HMX1 target genes: A predictive promoter model approach

    PubMed Central

    Boulling, Arnaud; Wicht, Linda

    2013-01-01

    Purpose A homozygous mutation in the H6 family homeobox 1 (HMX1) gene is responsible for a new oculoauricular defect leading to eye and auricular developmental abnormalities as well as early retinal degeneration (MIM 612109). However, the HMX1 pathway remains poorly understood, and in the first approach to better understand the pathway’s function, we sought to identify the target genes. Methods We developed a predictive promoter model (PPM) approach using a comparative transcriptomic analysis in the retina at P15 of a mouse model lacking functional Hmx1 (dmbo mouse) and its respective wild-type. This PPM was based on the hypothesis that HMX1 binding site (HMX1-BS) clusters should be more represented in promoters of HMX1 target genes. The most differentially expressed genes in the microarray experiment that contained HMX1-BS clusters were used to generate the PPM, which was then statistically validated. Finally, we developed two genome-wide target prediction methods: one that focused on conserving PPM features in human and mouse and one that was based on the co-occurrence of HMX1-BS pairs fitting the PPM, in human or in mouse, independently. Results The PPM construction revealed that sarcoglycan, gamma (35kDa dystrophin-associated glycoprotein) (Sgcg), teashirt zinc finger homeobox 2 (Tshz2), and solute carrier family 6 (neurotransmitter transporter, glycine) (Slc6a9) genes represented Hmx1 targets in the mouse retina at P15. Moreover, the genome-wide target prediction revealed that mouse genes belonging to the retinal axon guidance pathway were targeted by Hmx1. Expression of these three genes was experimentally validated using a quantitative reverse transcription PCR approach. The inhibitory activity of Hmx1 on Sgcg, as well as protein tyrosine phosphatase, receptor type, O (Ptpro) and Sema3f, two targets identified by the PPM, were validated with luciferase assay. Conclusions Gene expression analysis between wild-type and dmbo mice allowed us to develop a PPM

  10. Evidence-based gene models for structural and functional annotations of the oil palm genome.

    PubMed

    Chan, Kuang-Lim; Tatarinova, Tatiana V; Rosli, Rozana; Amiruddin, Nadzirah; Azizi, Norazah; Halim, Mohd Amin Ab; Sanusi, Nik Shazana Nik Mohd; Jayanthi, Nagappan; Ponomarenko, Petr; Triska, Martin; Solovyev, Victor; Firdaus-Raih, Mohd; Sambanthamurthi, Ravigadevi; Murphy, Denis; Low, Eng-Ti Leslie

    2017-09-08

    Oil palm is an important source of edible oil. The importance of the crop, as well as its long breeding cycle (10-12 years), led to the sequencing of its genome in 2013 to pave the way for genomics-guided breeding. Nevertheless, the first set of gene predictions, although useful, had many fragmented genes. Classification and characterization of genes associated with traits of interest, such as those for fatty acid biosynthesis and disease resistance, were also limited. Lipid-, especially fatty acid (FA)-related genes are of particular interest for the oil palm as they specify oil yields and quality. This paper presents the characterization of the oil palm genome using different gene prediction methods and comparative genomics analysis, identification of FA biosynthesis and disease resistance genes, and the development of an annotation database and bioinformatics tools. Using two independent gene-prediction pipelines, Fgenesh++ and Seqping, 26,059 oil palm genes with transcriptome and RefSeq support were identified from the oil palm genome. These coding regions of the genome have a characteristic broad distribution of GC3 (fraction of cytosine and guanine in the third position of a codon) with over half the GC3-rich genes (GC3 ≥ 0.75286) being intronless. In comparison, only one-seventh of the oil palm genes identified are intronless. Using comparative genomics analysis, characterization of conserved domains and active sites, and expression analysis, 42 key genes involved in FA biosynthesis in oil palm were identified. For three of them, namely EgFABF, EgFABH and EgFAD3, segmental duplication events were detected. Our analysis also identified 210 candidate resistance genes in six classes, grouped by their protein domain structures. We present an accurate and comprehensive annotation of the oil palm genome, focusing on analysis of important categories of genes (GC3-rich and intronless), as well as those associated with important functions, such as FA
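
    The GC3 measure defined in the abstract above (fraction of cytosine and guanine in the third codon position) can be illustrated with a minimal sketch. The function names are hypothetical, and the sketch assumes an in-frame coding sequence with every codon counted as given; whether the authors excluded stop codons or ambiguous bases is not stated.

```python
GC3_RICH_THRESHOLD = 0.75286  # cutoff quoted in the abstract


def gc3_fraction(cds):
    """Fraction of G or C among the third-position bases of an in-frame CDS."""
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3   # ignore a trailing partial codon
    third = seq[2:usable:3]            # bases at codon position 3
    if not third:
        raise ValueError("sequence is shorter than one codon")
    return sum(base in "GC" for base in third) / len(third)


def is_gc3_rich(cds, threshold=GC3_RICH_THRESHOLD):
    """True when the sequence meets the GC3-rich cutoff (assumed inclusive)."""
    return gc3_fraction(cds) >= threshold
```

    For example, `gc3_fraction("ATGGCCAAG")` looks only at the third bases G, C and G of the codons ATG, GCC and AAG.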

  11. Application of thin plate splines for accurate regional ionosphere modeling with multi-GNSS data

    NASA Astrophysics Data System (ADS)

    Krypiak-Gregorczyk, Anna; Wielgosz, Pawel; Borkowski, Andrzej

    2016-04-01

    GNSS-derived regional ionosphere models are widely used in precise positioning and in ionosphere and space weather studies. However, their accuracy is often not sufficient to support precise positioning, RTK in particular. In this paper, we present a new approach that uses solely carrier phase multi-GNSS observables and thin plate splines (TPS) for accurate ionospheric TEC modeling. TPS is a closed solution of a variational problem minimizing both the sum of squared second derivatives of a smoothing function and the deviation between data points and this function. This approach is used in the UWM-rt1 regional ionosphere model developed at UWM in Olsztyn. The model provides ionospheric TEC maps with high spatial and temporal resolutions: 0.2x0.2 degrees and 2.5 minutes, respectively. For TEC estimation, EPN and EUPOS reference station data are used. The maps are available with a delay of 15-60 minutes. In this paper we compare the performance of the UWM-rt1 model with IGS global and CODE regional ionosphere maps during the ionospheric storm that took place on March 17th, 2015. During this storm, the TEC level over Europe doubled compared to earlier quiet days. The performance of the UWM-rt1 model was validated by (a) comparison to reference double-differenced ionospheric corrections over selected baselines, and (b) analysis of post-fit residuals to calibrated carrier phase geometry-free observational arcs at selected test stations. The results show a very good performance of the UWM-rt1 model. The post-fit residuals obtained with the UWM maps are lower by one order of magnitude than those for the IGS maps. The accuracy of UWM-rt1-derived TEC maps is estimated at 0.5 TECU. This may be directly translated to the user positioning domain.
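
    The variational problem mentioned in the abstract above can be written out in its standard textbook form. The symbols are assumptions made for illustration and are not taken from the paper: $(x_i, y_i)$ are the data locations, $z_i$ the observed values (here, TEC), $f$ the smoothing function, and $\lambda \ge 0$ a weight balancing fidelity against smoothness.

```latex
% Penalized least-squares functional minimized by a thin plate spline
E(f) = \sum_{i=1}^{n} \bigl( z_i - f(x_i, y_i) \bigr)^2
     + \lambda \iint \left( f_{xx}^2 + 2 f_{xy}^2 + f_{yy}^2 \right) \, dx \, dy

% Closed-form minimizer: an affine part plus radial basis terms
f(x, y) = a_0 + a_1 x + a_2 y + \sum_{i=1}^{n} w_i \, \phi\bigl( \lVert (x, y) - (x_i, y_i) \rVert \bigr),
\qquad \phi(r) = r^2 \ln r
```

    The first term is the deviation between data points and the function, and the integral is the sum of squared second derivatives; how the UWM-rt1 model chooses the weighting and the coordinate system is not described in the abstract.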

  12. A literature search tool for intelligent extraction of disease-associated genes.

    PubMed

    Jung, Jae-Yoon; DeLuca, Todd F; Nelson, Tristan H; Wall, Dennis P

    2014-01-01

    To extract disorder-associated genes from the scientific literature in PubMed with greater sensitivity for literature-based support than existing methods. We developed a PubMed query to retrieve disorder-related, original research articles. Then we applied a rule-based text-mining algorithm with keyword matching to extract target disorders, genes with significant results, and the type of study described by the article. We compared our resulting candidate disorder genes and supporting references with existing databases. We demonstrated that our candidate gene set covers nearly all genes in manually curated databases, and that the references supporting the disorder-gene link are more extensive and accurate than other general purpose gene-to-disorder association databases. We implemented a novel publication search tool to find target articles, specifically focused on links between disorders and genotypes. Through comparison against gold-standard manually updated gene-disorder databases and comparison with automated databases of similar functionality we show that our tool can search through the entirety of PubMed to extract the main gene findings for human diseases rapidly and accurately.

  13. Modeling Gene-Environment Interactions With Quasi-Natural Experiments.

    PubMed

    Schmitz, Lauren; Conley, Dalton

    2017-02-01

    This overview develops new empirical models that can effectively document Gene × Environment (G×E) interactions in observational data. Current G×E studies are often unable to support causal inference because they use endogenous measures of the environment or fail to adequately address the nonrandom distribution of genes across environments, confounding estimates. Comprehensive measures of genetic variation are incorporated into quasi-natural experimental designs to exploit exogenous environmental shocks or isolate variation in environmental exposure to avoid potential confounders. In addition, we offer insights from population genetics that improve upon extant approaches to address problems from population stratification. Together, these tools offer a powerful way forward for G×E research on the origin and development of social inequality across the life course. © 2015 Wiley Periodicals, Inc.

  14. Predicting Gene Structure Changes Resulting from Genetic Variants via Exon Definition Features.

    PubMed

    Majoros, William H; Holt, Carson; Campbell, Michael S; Ware, Doreen; Yandell, Mark; Reddy, Timothy E

    2018-04-25

    Genetic variation that disrupts gene function by altering gene splicing between individuals can substantially influence traits and disease. In those cases, accurately predicting the effects of genetic variation on splicing can be highly valuable for investigating the mechanisms underlying those traits and diseases. While methods have been developed to generate high quality computational predictions of gene structures in reference genomes, the same methods perform poorly when used to predict the potentially deleterious effects of genetic changes that alter gene splicing between individuals. Underlying that discrepancy in predictive ability are the common assumptions by reference gene finding algorithms that genes are conserved, well-formed, and produce functional proteins. We describe a probabilistic approach for predicting recent changes to gene structure that may or may not conserve function. The model is applicable to both coding and noncoding genes, and can be trained on existing gene annotations without requiring curated examples of aberrant splicing. We apply this model to the problem of predicting altered splicing patterns in the genomes of individual humans, and we demonstrate that performing gene-structure prediction without relying on conserved coding features is feasible. The model predicts an unexpected abundance of variants that create de novo splice sites, an observation supported by both simulations and empirical data from RNA-seq experiments. While these de novo splice variants are commonly misinterpreted by other tools as coding or noncoding variants of little or no effect, we find that in some cases they can have large effects on splicing activity and protein products, and we propose that they may commonly act as cryptic factors in disease. The software is available from geneprediction.org/SGRF. Contact: bmajoros@duke.edu. Supplementary information is available at Bioinformatics online.

  15. Mutation databases for inherited renal disease: are they complete, accurate, clinically relevant, and freely available?

    PubMed

    Savige, Judy; Dagher, Hayat; Povey, Sue

    2014-07-01

    This study examined whether gene-specific DNA variant databases for inherited diseases of the kidney fulfilled the Human Variome Project recommendations of being complete, accurate, clinically relevant and freely available. A recent review identified 60 inherited renal diseases caused by mutations in 132 genes. The disease name, MIM number, gene name, together with "mutation" or "database," were used to identify web-based databases. Fifty-nine diseases (98%) due to mutations in 128 genes had a variant database. Altogether there were 349 databases (a median of 3 per gene, range 0-6), but no gene had two databases with the same number of variants, and 165 (50%) databases included fewer than 10 variants. About half the databases (180, 54%) had been updated in the previous year. Few (77, 23%) were curated by "experts" but these included nine of the 11 with the most variants. Even fewer databases (41, 12%) included clinical features apart from the name of the associated disease. Most (223, 67%) could be accessed without charge, including those for 50 genes (40%) with the maximum number of variants. Future efforts should focus on encouraging experts to collaborate on a single database for each gene affected in inherited renal disease, including both unpublished variants, and clinical phenotypes. © 2014 WILEY PERIODICALS, INC.

  16. Accurate phylogenetic classification of DNA fragments based on sequence composition

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    McHardy, Alice C.; Garcia Martin, Hector; Tsirigos, Aristotelis

    2006-05-01

    Metagenome studies have retrieved vast amounts of sequence out of a variety of environments, leading to novel discoveries and great insights into the uncultured microbial world. Except for very simple communities, diversity makes sequence assembly and analysis a very challenging problem. To understand the structure and function of microbial communities, a taxonomic characterization of the obtained sequence fragments is highly desirable, yet currently limited mostly to those sequences that contain phylogenetic marker genes. We show that for clades at the rank of domain down to genus, sequence composition allows the very accurate phylogenetic characterization of genomic sequence. We developed a composition-based classifier, PhyloPythia, for de novo phylogenetic sequence characterization and have trained it on a data set of 340 genomes. By extensive evaluation experiments we show that the method is accurate across all taxonomic ranks considered, even for sequences that originate from novel organisms and are as short as 1 kb. Application to two metagenome datasets obtained from samples of phosphorus-removing sludge showed that the method allows the accurate classification at genus level of most sequence fragments from the dominant populations, while at the same time correctly characterizing even larger parts of the samples at higher taxonomic levels.
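
    As an illustration of what "sequence composition" can mean as classifier input, the sketch below turns a DNA fragment into a normalized k-mer frequency vector. This is only a generic feature-extraction example: the function name, the choice of k = 4, and the handling of ambiguous bases are assumptions, and the abstract does not specify the features or the classifier that PhyloPythia actually uses.

```python
from collections import Counter
from itertools import product


def kmer_profile(fragment, k=4):
    """Relative frequencies of all 4**k DNA k-mers, in lexicographic ACGT order.

    Windows containing characters other than A, C, G, T (e.g. N) are skipped.
    """
    seq = fragment.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in kmers)
    if total == 0:
        raise ValueError("fragment contains no unambiguous k-mer of length k")
    return [counts[m] / total for m in kmers]
```

    A vector of this kind, computed per fragment, could be passed to any standard supervised classifier trained on fragments of known taxonomic origin.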

  17. A Biomechanical Model of the Scapulothoracic Joint to Accurately Capture Scapular Kinematics during Shoulder Movements

    PubMed Central

    Seth, Ajay; Matias, Ricardo; Veloso, António P.; Delp, Scott L.

    2016-01-01

    The complexity of shoulder mechanics combined with the movement of skin relative to the scapula makes it difficult to measure shoulder kinematics with sufficient accuracy to distinguish between symptomatic and asymptomatic individuals. Multibody skeletal models can improve motion capture accuracy by reducing the space of possible joint movements, and models are used widely to improve measurement of lower limb kinematics. In this study, we developed a rigid-body model of a scapulothoracic joint to describe the kinematics of the scapula relative to the thorax. This model describes scapular kinematics with four degrees of freedom: 1) elevation and 2) abduction of the scapula on an ellipsoidal thoracic surface, 3) upward rotation of the scapula normal to the thoracic surface, and 4) internal rotation of the scapula to lift the medial border of the scapula off the surface of the thorax. The surface dimensions and joint axes can be customized to match an individual’s anthropometry. We compared the model to “gold standard” bone-pin kinematics collected during three shoulder tasks and found modeled scapular kinematics to be accurate to within 2 mm root-mean-squared error for individual bone-pin markers across all markers and movement tasks. As an additional test, we added random and systematic noise to the bone-pin marker data and found that the model reduced kinematic variability due to noise by 65% compared to Euler angles computed without the model. Our scapulothoracic joint model can be used for inverse and forward dynamics analyses and to compute joint reaction loads. The computational performance of the scapulothoracic joint model is well suited for real-time applications; it is freely available for use with OpenSim 3.2, and is customizable and usable with other OpenSim models. PMID:26734761
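
    The root-mean-squared marker error quoted above can be illustrated with a short sketch. It assumes the error is the Euclidean distance between modeled and measured 3D marker positions, averaged over frames; the abstract does not give the exact definition, and the function name is hypothetical.

```python
import math


def marker_rmse(modeled, measured):
    """RMS of the 3D distance between paired modeled and measured positions.

    Both arguments are equal-length sequences of (x, y, z) tuples, one per
    frame, in the same units (e.g. mm).
    """
    if len(modeled) != len(measured) or len(modeled) == 0:
        raise ValueError("need two non-empty trajectories of equal length")
    squared = [
        sum((a - b) ** 2 for a, b in zip(p, q))   # squared distance per frame
        for p, q in zip(modeled, measured)
    ]
    return math.sqrt(sum(squared) / len(squared))
```

    Applied per marker and per task, a value below 2 would correspond to the accuracy bound reported in the abstract when positions are expressed in millimetres.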

  18. A Biomechanical Model of the Scapulothoracic Joint to Accurately Capture Scapular Kinematics during Shoulder Movements.

    PubMed

    Seth, Ajay; Matias, Ricardo; Veloso, António P; Delp, Scott L

    2016-01-01

    The complexity of shoulder mechanics combined with the movement of skin relative to the scapula makes it difficult to measure shoulder kinematics with sufficient accuracy to distinguish between symptomatic and asymptomatic individuals. Multibody skeletal models can improve motion capture accuracy by reducing the space of possible joint movements, and models are used widely to improve measurement of lower limb kinematics. In this study, we developed a rigid-body model of a scapulothoracic joint to describe the kinematics of the scapula relative to the thorax. This model describes scapular kinematics with four degrees of freedom: 1) elevation and 2) abduction of the scapula on an ellipsoidal thoracic surface, 3) upward rotation of the scapula normal to the thoracic surface, and 4) internal rotation of the scapula to lift the medial border of the scapula off the surface of the thorax. The surface dimensions and joint axes can be customized to match an individual's anthropometry. We compared the model to "gold standard" bone-pin kinematics collected during three shoulder tasks and found modeled scapular kinematics to be accurate to within 2 mm root-mean-squared error for individual bone-pin markers across all markers and movement tasks. As an additional test, we added random and systematic noise to the bone-pin marker data and found that the model reduced kinematic variability due to noise by 65% compared to Euler angles computed without the model. Our scapulothoracic joint model can be used for inverse and forward dynamics analyses and to compute joint reaction loads. The computational performance of the scapulothoracic joint model is well suited for real-time applications; it is freely available for use with OpenSim 3.2, and is customizable and usable with other OpenSim models.

  19. Antenna modeling considerations for accurate SAR calculations in human phantoms in close proximity to GSM cellular base station antennas.

    PubMed

    van Wyk, Marnus J; Bingle, Marianne; Meyer, Frans J C

    2005-09-01

    International bodies such as the International Commission on Non-Ionizing Radiation Protection (ICNIRP) and the Institute of Electrical and Electronics Engineers (IEEE) make provision for human exposure assessment based on SAR calculations (or measurements) and basic restrictions. In the case of base station exposure this is mostly applicable to occupational exposure scenarios in the very near field of these antennas where the conservative reference level criteria could be unnecessarily restrictive. This study presents a variety of critical aspects that need to be considered when calculating SAR in a human body close to a mobile phone base station antenna. A hybrid FEM/MoM technique is proposed as a suitable numerical method to obtain accurate results. The verification of the FEM/MoM implementation has been presented in a previous publication; the focus of this study is an investigation into the detail that must be included in a numerical model of the antenna, to accurately represent the real-world scenario. This is accomplished by comparing numerical results to measurements for a generic GSM base station antenna and appropriate, representative canonical and human phantoms. The results show that it is critical to take the disturbance effect of the human phantom (a large conductive body) on the base station antenna into account when the antenna-phantom spacing is less than 300 mm. For these small spacings, the antenna structure must be modeled in detail. The conclusion is that it is feasible to calculate, using the proposed techniques and methodology, accurate occupational compliance zones around base station antennas based on a SAR profile and basic restriction guidelines. (c) 2005 Wiley-Liss, Inc.

  20. High-Dimensional Sparse Factor Modeling: Applications in Gene Expression Genomics

    PubMed Central

    Carvalho, Carlos M.; Chang, Jeffrey; Lucas, Joseph E.; Nevins, Joseph R.; Wang, Quanli; West, Mike

    2010-01-01

    We describe studies in molecular profiling and biological pathway analysis that use sparse latent factor and regression models for microarray gene expression data. We discuss breast cancer applications and key aspects of the modeling and computational methodology. Our case studies aim to investigate and characterize heterogeneity of structure related to specific oncogenic pathways, as well as links between aggregate patterns in gene expression profiles and clinical biomarkers. Based on the metaphor of statistically derived “factors” as representing biological “subpathway” structure, we explore the decomposition of fitted sparse factor models into pathway subcomponents and investigate how these components overlay multiple aspects of known biological activity. Our methodology is based on sparsity modeling of multivariate regression, ANOVA, and latent factor models, as well as a class of models that combines all components. Hierarchical sparsity priors address questions of dimension reduction and multiple comparisons, as well as scalability of the methodology. The models include practically relevant non-Gaussian/nonparametric components for latent structure, underlying often quite complex non-Gaussianity in multivariate expression patterns. Model search and fitting are addressed through stochastic simulation and evolutionary stochastic search methods that are exemplified in the oncogenic pathway studies. Supplementary supporting material provides more details of the applications, as well as examples of the use of freely available software tools for implementing the methodology. PMID:21218139